Mixture models, robustness, and sum of squares proofs

From MaRDI portal
Publication:5230359

DOI10.1145/3188745.3188748zbMATH Open1428.68240arXiv1711.07454OpenAlexW2962820675MaRDI QIDQ5230359FDOQ5230359


Authors: Samuel B. Hopkins, Jerry Li Edit this on Wikidata


Publication date: 22 August 2019

Published in: Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (Search for Journal in Brave)

Abstract: We use the Sum of Squares method to develop new efficient algorithms for learning well-separated mixtures of Gaussians and robust mean estimation, both in high dimensions, that substantially improve upon the statistical guarantees achieved by previous efficient algorithms. Firstly, we study mixtures of k distributions in d dimensions, where the means of every pair of distributions are separated by at least kvarepsilon. In the special case of spherical Gaussian mixtures, we give a (dk)O(1/varepsilon2)-time algorithm that learns the means assuming separation at least kvarepsilon, for any varepsilon>0. This is the first algorithm to improve on greedy ("single-linkage") and spectral clustering, breaking a long-standing barrier for efficient algorithms at separation k1/4. We also study robust estimation. When an unknown (1varepsilon)-fraction of X1,ldots,Xn are chosen from a sub-Gaussian distribution with mean mu but the remaining points are chosen adversarially, we give an algorithm recovering mu to error varepsilon11/t in time dO(t2), so long as sub-Gaussian-ness up to O(t) moments can be certified by a Sum of Squares proof. This is the first polynomial-time algorithm with guarantees approaching the information-theoretic limit for non-Gaussian distributions. Previous algorithms could not achieve error better than varepsilon1/2. Both of these results are based on a unified technique. Inspired by recent algorithms of Diakonikolas et al. in robust statistics, we devise an SDP based on the Sum of Squares method for the following setting: given X1,ldots,XninmathbbRd for large d and n=poly(d) with the promise that a subset of X1,ldots,Xn were sampled from a probability distribution with bounded moments, recover some information about that distribution.


Full work available at URL: https://arxiv.org/abs/1711.07454




Recommendations





Cited In (20)





This page was built for publication: Mixture models, robustness, and sum of squares proofs

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q5230359)