Robust k-means clustering for distributions with two moments

From MaRDI portal
Publication:2054489

DOI10.1214/20-AOS2033zbMATH Open1487.62070arXiv2002.02339OpenAlexW3204688258MaRDI QIDQ2054489FDOQ2054489


Authors: Yegor Klochkov, Alexey Kroshnin, Nikita Zhivotovskiy Edit this on Wikidata


Publication date: 3 December 2021

Published in: The Annals of Statistics (Search for Journal in Brave)

Abstract: We consider the robust algorithms for the k-means clustering problem where a quantizer is constructed based on N independent observations. Our main results are median of means based non-asymptotic excess distortion bounds that hold under the two bounded moments assumption in a general separable Hilbert space. In particular, our results extend the renowned asymptotic result of Pollard, 1981 who showed that the existence of two moments is sufficient for strong consistency of an empirically optimal quantizer in mathbbRd. In a special case of clustering in mathbbRd, under two bounded moments, we prove matching (up to constant factors) non-asymptotic upper and lower bounds on the excess distortion, which depend on the probability mass of the lightest cluster of an optimal quantizer. Our bounds have the sub-Gaussian form, and the proofs are based on the versions of uniform bounds for robust mean estimators.


Full work available at URL: https://arxiv.org/abs/2002.02339




Recommendations




Cites Work


Cited In (7)





This page was built for publication: Robust \(k\)-means clustering for distributions with two moments

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q2054489)