Robust estimators in high-dimensions without the computational intractability
From MaRDI portal
Publication:4634036
Abstract: We study high-dimensional distribution learning in an agnostic setting where an adversary is allowed to arbitrarily corrupt an \(\varepsilon\)-fraction of the samples. Such questions have a rich history spanning statistics, machine learning, and theoretical computer science. Even in the most basic settings, the only known approaches are either computationally inefficient or lose dimension-dependent factors in their error guarantees. This raises the following question: Is high-dimensional agnostic distribution learning even possible, algorithmically? In this work, we obtain the first computationally efficient algorithms with dimension-independent error guarantees for agnostically learning several fundamental classes of high-dimensional distributions: (1) a single Gaussian, (2) a product distribution on the hypercube, (3) mixtures of two product distributions (under a natural balancedness condition), and (4) mixtures of spherical Gaussians. Our algorithms achieve error that is independent of the dimension, and in many cases scales nearly linearly with the fraction of adversarially corrupted samples. Moreover, we develop a general recipe for detecting and correcting corruptions in high dimensions that may be applicable to many other problems.
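The "detect and correct" recipe the abstract alludes to can be illustrated, for robust mean estimation of an isotropic Gaussian, by an iterative spectral filter: if the empirical covariance has an abnormally large top eigenvalue, the corruptions must be responsible, so points that project far along the top eigenvector are discarded and the process repeats. The sketch below is a simplified illustration of this idea, not the paper's exact algorithm; the function name `filter_mean` and the threshold and quantile rules are illustrative choices.

```python
import numpy as np

def filter_mean(X, eps, threshold_mult=4.0, max_iters=50):
    """Simplified spectral-filter sketch for robust mean estimation.

    Assumes the inliers are drawn from an (approximately) isotropic
    Gaussian; an eps-fraction of rows of X may be arbitrary.
    """
    X = np.asarray(X, dtype=float)
    for _ in range(max_iters):
        mu = X.mean(axis=0)
        centered = X - mu
        cov = centered.T @ centered / len(X)
        eigvals, eigvecs = np.linalg.eigh(cov)
        top_val, top_vec = eigvals[-1], eigvecs[:, -1]
        # For an uncorrupted isotropic Gaussian the top eigenvalue is
        # close to 1; a much larger spectral norm certifies that
        # corruptions still dominate some direction.
        if top_val <= 1.0 + threshold_mult * eps:
            return mu
        # Score points by their squared projection on the top
        # eigenvector and drop the most extreme eps-fraction
        # (a crude stand-in for the paper's tail-bound filter).
        scores = (centered @ top_vec) ** 2
        cutoff = np.quantile(scores, 1.0 - eps)
        X = X[scores <= cutoff]
    return X.mean(axis=0)
```

The key point, reflected in the stopping rule, is that a small top eigenvalue of the empirical covariance certifies that the remaining corruptions can only shift the mean by a dimension-independent amount.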
Recommendations
- Robustly learning a Gaussian: getting optimal error, efficiently
- High-dimensional robust mean estimation in nearly-linear time
- List-decodable robust mean estimation and learning mixtures of spherical Gaussians
- Efficient algorithms and lower bounds for robust linear regression
- All-in-one robust estimator of the Gaussian mean
Cites work
- scientific article; zbMATH DE number 1804119
- scientific article; zbMATH DE number 3870398
- scientific article; zbMATH DE number 3954047
- scientific article; zbMATH DE number 3541764
- scientific article; zbMATH DE number 4001209
- scientific article; zbMATH DE number 837911
- scientific article; zbMATH DE number 3336465
- scientific article; zbMATH DE number 3390139
- DOI: 10.1162/153244304773936072
- A novel M-estimator for robust PCA
- A size-free CLT for Poisson multinomials and its applications
- A universally acceptable smoothing factor for kernel density estimates
- Adaptive estimation of a quadratic functional by model selection.
- Combinatorial methods in density estimation
- Decision theoretic generalizations of the PAC model for neural net and other learning applications
- Distributional and \(L^q\) norm inequalities for polynomials over convex bodies in \(\mathbb{R}^n\)
- Efficient density estimation via piecewise polynomial approximation
- Efficiently learning mixtures of two Gaussians
- Elements of Information Theory
- Evolutionary trees can be learned in polynomial time in the two-state general Markov model
- Geometric algorithms and combinatorial optimization
- Geometric median in nearly linear time
- Learning Halfspaces with Malicious Noise
- Learning Mixtures of Product Distributions over Discrete Domains
- Learning \(k\)-modal distributions via testing
- Learning discrete distributions from untrusted batches
- Learning from satisfying assignments
- Learning from untrusted data
- Learning geometric concepts with nasty noise
- Learning in the Presence of Malicious Errors
- Learning mixtures of arbitrary Gaussians
- Learning mixtures of spherical Gaussians: moment methods and spectral decompositions (extended abstract)
- Learning mixtures of structured distributions over discrete domains
- List-decodable robust mean estimation and learning mixtures of spherical Gaussians
- Maximum Likelihood Estimation of Misspecified Models
- Mixture models, robustness, and sum of squares proofs
- Nonasymptotic universal smoothing factors, kernel complexity and Yatracos classes
- On estimation of the diagonal elements of a sparse precision matrix
- On the learnability of discrete distributions
- Provable ICA with unknown Gaussian noise, and implications for Gaussian mixtures and autoencoders
- Rates of convergence of minimum distance estimators and Kolmogorov's entropy
- Resilience: a criterion for learning in the presence of arbitrary outliers
- Robust Statistics
- Robust computation of linear models by convex relaxation
- Robust estimators in high-dimensions without the computational intractability
- Robust moment estimation and improved clustering via sum of squares
- Robust principal component analysis?
- Sample-optimal density estimation in nearly-linear time
- The Fourier transform of Poisson multinomial distributions and its algorithmic applications
- The densest hemisphere problem
- Tight bounds for learning a mixture of two Gaussians (extended abstract)
- Toward efficient agnostic learning
Cited in (37)
- A shrinkage principle for heavy-tailed data: high-dimensional robust low-rank matrix recovery
- Robust regression via multivariate regression depth
- Low rank approximation in the presence of outliers
- Notes on computational hardness of hypothesis testing: predictions using the low-degree likelihood ratio
- Robust high dimensional expectation maximization algorithm via trimmed hard thresholding
- Maximum selection and sorting with adversarial comparators
- Mean estimation with sub-Gaussian rates in polynomial time
- Robustly learning a Gaussian: getting optimal error, efficiently
- On robustness and local differential privacy
- Robust estimators in high-dimensions without the computational intractability
- Robustness and Tractability for Non-convex M-estimators
- Stronger data poisoning attacks break data sanitization defenses
- All-in-one robust estimator of the Gaussian mean
- Confidence regions and minimax rates in outlier-robust estimation on the probability simplex
- ERM and RERM are optimal estimators for regression problems when malicious outliers corrupt the labels
- Robust estimation of covariance matrices: adversarial contamination and beyond
- Robust inference and local algorithms
- Resilience: a criterion for learning in the presence of arbitrary outliers
- High-dimensional robust mean estimation in nearly-linear time
- Nearly optimal robust mean estimation via empirical characteristic function
- Efficient parameter estimation of truncated Boolean product distributions
- Mean estimation and regression under heavy-tailed distributions: A survey
- Robust classification via MOM minimization
- Graph powering and spectral robustness
- Multidimensional linear functional estimation in sparse Gaussian models and robust estimation of the mean
- Improved covariance estimation: optimal robustness and sub-Gaussian guarantees under heavy tails
- Robust multivariate mean estimation: the optimality of trimmed mean
- List-decodable robust mean estimation and learning mixtures of spherical Gaussians
- Efficient algorithms and lower bounds for robust linear regression
- Learning from untrusted data
- Finite sample properties of parametric MMD estimation: robustness to misspecification and dependence
- scientific article; zbMATH DE number 7306864
- Corruption-tolerant bandit learning
- Robust supervised learning with coordinate gradient descent
- Learning under \(p\)-tampering poisoning attacks
- Generalized resilience and robust statistics
- Robust sub-Gaussian estimation of a mean vector in nearly linear time