A central limit theorem for k-means clustering

DOI: 10.1214/aop/1176993713; zbMath: 0502.62055; OpenAlex: W2050744541; MaRDI QID: Q1172905

Publication date: 1982

Published in: The Annals of Probability

Full work available at URL: https://doi.org/10.1214/aop/1176993713

zbMATH Keywords

asymptotic normality; empirical processes; k-means clustering; differentiability in quadratic mean; Donsker classes of functions; minimized within cluster sum of squares

Mathematics Subject Classification ID

Random fields (60G60)
Classification and discrimination; cluster analysis (statistical aspects) (62H30)
Probability measures on topological spaces (60B05)
Central limit and other weak theorems (60F05)
Functional limit theorems; invariance principles (60F17)

Related Items

On the added value of bootstrap analysis for $K$-means clustering
Clusterwise functional linear regression models
On the minimum of the mean-squared error in 2-means clustering
Numerical studies of MacQueen's $k$-means algorithm for computing the centroidal Voronoi tessellations
Optimal stratification and clustering on the line using the $L_1$-norm
Convergence rate of optimal quantization grids and application to empirical measure
Convergence of the $k$-Means Minimization Problem using $\Gamma$-Convergence
Probabilistic models in cluster analysis
Asymptotics of $k$-mean clustering under non-i.i.d. sampling
Trimmed $k$-means: An attempt to robustify quantizers
Fast rates for empirical vector quantization
Nonparametric K-means algorithm with applications in economic and functional data
Optimal clustering on the real line
Strong Consistency of Reduced K-means Clustering
Representative points for location-biased datasets
Statistical learning guarantees for compressive clustering and compressive mixture modeling
ON ASYMPTOTIC NORMALITY OF A CLASS OF FUZZY C-MEANS CLUSTERING PROCEDURES
A statistical view of clustering performance through the theory of $U$-processes
Empirical risk minimization for heavy-tailed losses
Central limit theorems for semi-discrete Wasserstein distances
Detecting communities in attributed networks through bi-direction penalized clustering and its application
Testing for Unobserved Heterogeneity via k-means Clustering
Asymptotics of a clustering criterion for smooth distributions
$L_1$-quantization and clustering in Banach spaces
Stability and model selection in $k$-means clustering
Unnamed Item
Dimensionality-Dependent Generalization Bounds for k-Dimensional Coding Schemes
Large-sample results for optimization-based clustering methods
How to speed up the quantization tree algorithm with an application to swing options
Robust recovery of multiple subspaces by geometric $l_{p}$ minimization
Bayesian Fourier clustering of gene expression data
Initializing $k$-means clustering by bootstrap and data depth
Weak limit theorems for univariate $k$-mean clustering under a nonregular condition
Impact of Contamination on Training and Test Error Rates in Statistical Clustering
Q-convergence with interquartile ranges
Mixed-rates asymptotics
Chi-squared tests for evaluation and comparison of asset pricing models
Using combinatorial optimization in model-based trimmed clustering with cardinality constraints
An asymptotic result on principal points for univariate distributions
Average Competitive Learning Vector Quantization
Infinite Dirichlet mixture models learning via expectation propagation
On the asymptotics of trimmed best $k$-nets
Quantization-based clustering algorithm
A parametric $k$-means algorithm
METHODS FOR ESTIMATING PRINCIPAL POINTS
SOM's mathematics
Convergence rate of estimators of clustered panel models with misclassification
Bandwidth selection in kernel empirical risk minimization via the gradient
A Robust Maximal F-Ratio Statistic to Detect Clusters Structure
Asymptotics of the empirical cross-over function
The next‐generation K‐means algorithm
Pointwise Convergence of the Lloyd I Algorithm in Higher Dimension
A central limit theorem for multivariate generalized trimmed $k$-means
Principal point classification: applications to differentiating drug and placebo responses in longitudinal studies
Asymptotics for trimmed $k$-means and associated tolerance zones.
Empirical geometry of multivariate data: a deconvolution approach.
A FAST IMPLEMENTATION OF THE ISODATA CLUSTERING ALGORITHM
Nonasymptotic bounds for vector quantization in Hilbert spaces
Statistical theory in clustering
Consistency of Archetypal Analysis
On minimizing sequences for $k$-centres
On some significance tests in cluster analysis