A robust approach to model-based classification based on trimming and constraints. Semi-supervised learning in presence of outliers and label noise

DOI 10.1007/S11634-019-00371-W

MaRDI QID Q2201323

Authors Andrea Cappozzo, Francesca Greselin, Thomas Brendan Murphy

Publication date 29 September 2020

Published in Advances in Data Analysis and Classification (ADAC)

Full work available at URL https://arxiv.org/abs/1904.06136

Keywords robust estimation, model-based classification, outliers detection, label noise, impartial trimming, eigenvalues restrictions

Classification and discrimination; cluster analysis (statistical aspects) (62H30)
Learning and adaptive systems in artificial intelligence (68T05)
Robustness and adaptive procedures (parametric inference) (62F35)
Compound decision problems in statistical decision theory (62C25)

Abstract: In a standard classification framework a set of trustworthy learning data are employed to build a decision rule, with the final aim of classifying unlabelled units belonging to the test set. Therefore, unreliable labelled observations, namely outliers and data with incorrect labels, can strongly undermine the classifier performance, especially if the training size is small. The present work introduces a robust modification to the Model-Based Classification framework, employing impartial trimming and constraints on the ratio between the maximum and the minimum eigenvalue of the group scatter matrices. The proposed method effectively handles noise presence in both response and exploratory variables, providing reliable classification even when dealing with contaminated datasets. A robust information criterion is proposed for model selection. Experiments on real and simulated data, artificially adulterated, are provided to underline the benefits of the proposed method.
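
The two ingredients named in the abstract, impartial trimming and a bound on the ratio between the largest and smallest eigenvalue of the group scatter matrices, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it assumes univariate Gaussian groups with known parameters, takes the eigenvalues as given, and clips them to a fixed interval instead of searching for an optimal one. The function names (`impartial_trim`, `eigenvalue_ratio`, `truncate_eigenvalues`) and the parameters `alpha`, `m` and `c` are hypothetical, chosen only for this illustration.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log-density of a univariate Gaussian (illustrative stand-in for a group model)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def impartial_trim(values, labels, params, alpha):
    """Return the sorted indices kept after discarding the ceil(alpha * n)
    labelled units that are least plausible under their own group's density.

    `params` maps each label to an assumed (mean, variance) pair.
    """
    scores = [gaussian_logpdf(x, *params[g]) for x, g in zip(values, labels)]
    n_trim = math.ceil(alpha * len(values))
    # Sort indices from least to most plausible and drop the first n_trim.
    order = sorted(range(len(values)), key=lambda i: scores[i])
    return sorted(order[n_trim:])


def eigenvalue_ratio(eigs_by_group):
    """Ratio between the largest and smallest eigenvalue across all groups."""
    all_eigs = [e for eigs in eigs_by_group for e in eigs]
    return max(all_eigs) / min(all_eigs)


def truncate_eigenvalues(eigs_by_group, m, c):
    """Clip every eigenvalue to [m, c * m] so the ratio cannot exceed c.

    The threshold m is taken as given here, which is a simplification.
    """
    return [[min(max(e, m), c * m) for e in eigs] for eigs in eigs_by_group]


# Toy data: the last unit carries label 1 but lies near group 0 (label noise).
values = [0.1, -0.2, 0.3, 0.0, 5.0, 4.8, 5.2, 0.05]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
params = {0: (0.0, 1.0), 1: (5.0, 1.0)}
kept = impartial_trim(values, labels, params, alpha=0.125)

# Toy eigenvalues of two group scatter matrices, before and after clipping.
eigs = [[4.0, 1.0], [100.0, 0.5]]
constrained = truncate_eigenvalues(eigs, m=1.0, c=10.0)
```

In this toy setting the trimming step drops the single unit whose label disagrees with its position, and the clipping step brings the eigenvalue ratio down to the bound `c`. In an actual estimation procedure both steps would be repeated as the group parameters are re-estimated.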

Cited in (7)
