Robust machine learning by median-of-means: theory and practice
Publication:2196199
Abstract: We introduce new estimators for robust machine learning based on median-of-means (MOM) estimators of the mean of real-valued random variables. These estimators achieve optimal rates of convergence under minimal assumptions on the dataset, which may moreover have been corrupted by outliers on which no assumption is made. We also analyze these new estimators with standard tools from robust statistics. In particular, we revisit the concept of breakdown point: we modify the original definition by studying the number of outliers that a dataset can contain without deteriorating the estimation properties of a given estimator. This new notion of breakdown number, which takes into account the statistical performance of the estimators, is non-asymptotic in nature and adapted to machine learning purposes. We prove that the breakdown number of our estimator is of the order of (number of observations) \(\times\) (rate of convergence). For instance, for the estimation of a \(d\)-dimensional vector with noise variance \(\sigma^2\), the breakdown number of our estimators is of order \(\sigma^2 d\), and it becomes \(\sigma^2 s \log(d/s)\) when this vector has only \(s\) non-zero components. Beyond this breakdown point, we prove that the rate of convergence achieved by our estimator is (number of outliers) divided by (number of observations). Besides these theoretical guarantees, the major improvement brought by these new estimators is that they are easily computable in practice: essentially any algorithm used to approximate the standard Empirical Risk Minimizer (or its regularized versions) has a robust version approximating our estimators. As a proof of concept, we study many algorithms for the classical LASSO estimator. A byproduct of the MOM algorithms is a measure of depth of data that can be used to detect outliers.
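The abstract's basic primitive is the median-of-means estimator of a real-valued mean: split the sample into disjoint blocks, average within each block, and report the median of the block means. The sketch below is a minimal, illustrative Python implementation, not the paper's procedure; the function name, block count, and toy data are assumptions made for the example. It only shows why the median of block means survives a handful of arbitrary outliers (fewer than about half the blocks can be corrupted), whereas the empirical mean does not.

```python
import numpy as np

def median_of_means(x, n_blocks, seed=0):
    """Median-of-means estimate of the mean of a 1-D sample.

    The sample is randomly partitioned into n_blocks disjoint blocks of
    roughly equal size; the estimate is the median of the block-wise means.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x))              # random partition into blocks
    blocks = np.array_split(x[perm], n_blocks)
    block_means = [b.mean() for b in blocks]
    return np.median(block_means)

# Toy check: heavy-tailed data plus a few gross outliers (true mean is 0).
rng = np.random.default_rng(1)
sample = rng.standard_t(df=2, size=1000)        # heavy-tailed sample
sample[:10] = 1e6                               # 10 adversarial outliers
print(np.mean(sample))                          # empirical mean is ruined
print(median_of_means(sample, n_blocks=30))     # still close to 0
```

With 30 blocks, the 10 outliers can contaminate at most 10 block means, so the median of the block means remains a sound estimate; this is the intuition behind the breakdown-number analysis described in the abstract.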
Recommendations
- Robust classification via MOM minimization
- Learning from MOM's principles: Le Cam's approach
- Robust statistical learning with Lipschitz and convex loss functions
- Regularization, sparse recovery, and median-of-means tournaments
- Mean estimation and regression under heavy-tailed distributions: A survey
Cites work
- scientific article; zbMATH DE number 6388313
- scientific article; zbMATH DE number 5957408
- scientific article; zbMATH DE number 3790208
- scientific article; zbMATH DE number 1332320
- scientific article; zbMATH DE number 845714
- scientific article; zbMATH DE number 3320125
- scientific article; zbMATH DE number 3336465
- A General Qualitative Definition of Robustness
- A new method for estimation and model selection: \(\rho\)-estimation
- A survey of cross-validation procedures for model selection
- A unified framework for high-dimensional analysis of \(M\)-estimators with decomposable regularizers
- Asymptotic methods in statistical decision theory
- Bounding the smallest singular value of a random matrix without concentration
- Challenging the empirical mean and empirical variance: a deviation study
- Choice of V for V-Fold Cross-Validation in Least-Squares Density Estimation
- Confidence intervals for low dimensional parameters in high dimensional linear models
- Confidence sets in sparse regression
- Convergence of estimates under dimensionality restrictions
- Distributed statistical estimation and rates of convergence in normal approximation
- Estimation bounds and sharp oracle inequalities of regularized procedures with Lipschitz loss functions
- Estimation of High Dimensional Mean Regression in the Absence of Symmetry and Light Tail Assumptions
- Estimator selection with respect to Hellinger-type risks
- Geometric median and robust estimation in Banach spaces
- High-dimensional generalized linear models and the lasso
- High-dimensional graphs and variable selection with the Lasso
- Lasso-type recovery of sparse representations for high-dimensional data
- Learning from MOM's principles: Le Cam's approach
- Learning without concentration
- Model selection via testing: an alternative to (penalized) maximum likelihood estimators.
- On asymptotically optimal confidence regions and tests for high-dimensional models
- On optimality of empirical risk minimization in linear aggregation
- Oracle inequalities in empirical risk minimization and sparse recovery problems. École d'Été de Probabilités de Saint-Flour XXXVIII-2008.
- Random generation of combinatorial structures from a uniform distribution
- Regularization and the small-ball method. I: Sparse recovery
- Regularization and the small-ball method. II: Complexity dependent error rates
- Regularization, sparse recovery, and median-of-means tournaments
- Rho-estimators for shape restricted density estimation
- Risk bounds for statistical learning
- Risk minimization by median-of-means tournaments
- Robust Estimation of a Location Parameter
- Robust Statistics
- Robust linear least squares regression
- Robust low-rank matrix estimation
- SLOPE is adaptive to unknown sparsity and asymptotically minimax
- SLOPE-adaptive variable selection via convex optimization
- Segmentation of the mean of heteroscedastic data via cross-validation
- Simultaneous analysis of Lasso and Dantzig selector
- Slope meets Lasso: improved oracle bounds and optimality
- Stabilité et instabilité du risque minimax pour des variables indépendantes équidistribuées
- Statistical performance of support vector machines
- Statistics for high-dimensional data. Methods, theory and applications.
- Sub-Gaussian mean estimators
- Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators
- The Future of Data Analysis
- The Influence Curve and Its Role in Robust Estimation
- The space complexity of approximating the frequency moments
- Weakly decomposable regularization penalties and structured sparsity
Cited in (43)
- Robust Bregman clustering
- The main contributions of robust statistics to statistical science and a new challenge
- Super-polynomial accuracy of multidimensional randomized nets using the median-of-means
- Outlier detection in networks with missing links
- On least squares estimation under heteroscedastic and heavy-tailed errors
- Robust and tuning-free sparse linear regression via square-root slope
- Super-polynomial accuracy of one dimensional randomized nets using the median of means
- All-in-one robust estimator of the Gaussian mean
- On the robustness to adversarial corruption and to heavy-tailed data of the Stahel–Donoho median of means
- Robust subgaussian estimation with VC-dimension
- ERM and RERM are optimal estimators for regression problems when malicious outliers corrupt the labels
- Distribution-free robust linear regression
- DeepMoM: Robust Deep Learning With Median-of-Means
- Data perturbations in stochastic generalized equations: statistical robustness in static and sample average approximated models
- Byzantine-robust and efficient distributed sparsity learning: a surrogate composite quantile regression approach
- scientific article; zbMATH DE number 7370601
- Byzantine-robust distributed sparse learning for \(M\)-estimation
- Robust \(k\)-means clustering for distributions with two moments
- High-dimensional \(M\)-estimation for Byzantine-robust decentralized learning
- A MOM-based ensemble method for robustness, subsampling and hyperparameter tuning
- Iteratively reweighted \(\ell_1\)-penalized robust regression
- Optimal robust mean and location estimation via convex programs with respect to any pseudo-norms
- Robust statistical learning with Lipschitz and convex loss functions
- Mean estimation and regression under heavy-tailed distributions: A survey
- Exponential concentration for geometric-median-of-means in non-positive curvature spaces
- Regularization, sparse recovery, and median-of-means tournaments
- Robust partially linear trend filtering for regression estimation and structure discovery
- Non-asymptotic analysis and inference for an outlyingness induced winsorized mean
- Core-elements for large-scale least squares estimation
- Robust classification via MOM minimization
- Parallel and robust empirical risk minimization via the median trick
- A robust estimator of the proportional hazard transform for massive data
- Robust multivariate mean estimation: the optimality of trimmed mean
- K-bMOM: A robust Lloyd-type clustering algorithm based on bootstrap median-of-means
- Scale calibration for high-dimensional robust regression
- Learning in repeated auctions
- Classifier selection using the predicate depth
- Robust supervised learning with coordinate gradient descent
- Learning from MOM's principles: Le Cam's approach
- Topics in robust statistical learning
- Risk minimization by median-of-means tournaments
- scientific article; zbMATH DE number 7306878
- Robust sub-Gaussian estimation of a mean vector in nearly linear time