Recovery guarantees for exemplar-based clustering
Abstract: For a certain class of distributions, we prove that the linear programming relaxation of \(k\)-medoids clustering---a variant of \(k\)-means clustering where means are replaced by exemplars from within the dataset---distinguishes points drawn from nonoverlapping balls with high probability once the number of points drawn and the separation distance between any two balls are sufficiently large. Our results hold in the nontrivial regime where the separation distance is small enough that points drawn from different balls may be closer to each other than points drawn from the same ball; in this case, clustering by thresholding pairwise distances between points can fail. We also exhibit numerical evidence of high-probability recovery in a substantially more permissive regime.
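The LP relaxation described in the abstract can be sketched as follows: minimize the total assignment cost \(\sum_{ij} D_{ij} z_{ij}\) subject to each point being fully assigned, assignments only to open exemplars (\(z_{ij} \le y_j\)), and exactly \(k\) open exemplars, with the 0/1 integrality constraints relaxed to \([0,1]\). This is a minimal illustrative sketch, not the paper's exact experimental setup; the data model (two well-separated square neighborhoods standing in for balls) and all parameter values are assumptions for demonstration.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
k = 2
# Assumed toy data: two clusters of 10 points each, separated far more
# than their diameter, so LP recovery is expected to succeed.
X = np.vstack([rng.uniform(-1, 1, (10, 2)),
               rng.uniform(-1, 1, (10, 2)) + [10.0, 0.0]])
n = len(X)
D = np.linalg.norm(X[:, None] - X[None, :], axis=2)  # pairwise distances

# LP relaxation of k-medoids:
#   minimize   sum_{ij} D[i,j] * z[i,j]
#   subject to sum_j z[i,j] = 1  for each i   (every point assigned)
#              z[i,j] <= y[j]                 (assign only to open exemplars)
#              sum_j y[j] = k                 (exactly k exemplars)
#              0 <= z, y <= 1                 (integrality relaxed)
# Variable vector is [z flattened row-major (n*n entries), y (n entries)].
c = np.concatenate([D.ravel(), np.zeros(n)])

A_eq = np.zeros((n + 1, n * n + n))
b_eq = np.zeros(n + 1)
for i in range(n):
    A_eq[i, i * n:(i + 1) * n] = 1.0  # sum_j z[i,j] = 1
    b_eq[i] = 1.0
A_eq[n, n * n:] = 1.0                 # sum_j y[j] = k
b_eq[n] = k

A_ub = np.zeros((n * n, n * n + n))   # z[i,j] - y[j] <= 0
for i in range(n):
    for j in range(n):
        A_ub[i * n + j, i * n + j] = 1.0
        A_ub[i * n + j, n * n + j] = -1.0
b_ub = np.zeros(n * n)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=(0, 1), method="highs")
z = res.x[:n * n].reshape(n, n)
labels = z.argmax(axis=1)  # with an integral LP solution, each row of z is an indicator
```

With separation this large, the LP solution is expected to be integral, and `labels` groups each half of the dataset around a single exemplar; the paper's contribution is proving this for separations small enough that simple distance thresholding fails.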
Cites work
- scientific article; zbMATH DE number 1303535
- scientific article; zbMATH DE number 1303608
- scientific article; zbMATH DE number 1559542
- scientific article; zbMATH DE number 2086926
- scientific article; zbMATH DE number 1833407
- A 1.488 Approximation Algorithm for the Uncapacitated Facility Location Problem
- A Binary Variable Model for Affinity Propagation
- A new greedy approach for facility location problems
- A new partitioning around medoids algorithm
- A spectral algorithm for learning mixture models
- Algorithm AS 136: A K-Means Clustering Algorithm
- Algorithms for graph partitioning on the planted partition model
- An Optimal Bifactor Approximation Algorithm for the Metric Uncapacitated Facility Location Problem
- Approximating k-median via pseudo-approximation
- Approximation Algorithms for Metric Facility Location Problems
- Clustering by passing messages between data points
- Convex optimization for the planted \(k\)-disjoint-clique problem
- Correlation clustering
- Efficiently learning mixtures of two Gaussians
- Greedy facility location algorithms analyzed using dual fitting with factor-revealing LP
- Guaranteed clustering and biclustering via semidefinite programming
- Improved Approximation Algorithms for the Uncapacitated Facility Location Problem
- Learning Theory
- Learning Theory
- Learning mixtures of Gaussians in high dimensions
- Learning mixtures of arbitrary Gaussians
- Least squares quantization in PCM
- Local Search Heuristics for k-Median and Facility Location Problems
- NP-hardness of Euclidean sum-of-squares clustering
- On the Complexity of Some Common Geometric Location Problems
- PAC Learning Axis-Aligned Mixtures of Gaussians with No Separation Assumption
- Random Projection Trees for Vector Quantization
- Robust PCA and clustering in noisy mixtures
- Worst-Case and Probabilistic Analysis of a Geometric Location Problem
Cited in (12)
- Discrete facility location in machine learning
- The ratio-cut polytope and K-means clustering
- Size matters: cardinality-constrained clustering and outlier detection via conic optimization
- Local versions of sum-of-norms clustering
- Hidden Integrality and Semirandom Robustness of SDP Relaxation for Sub-Gaussian Mixture Model
- Probably certifiably correct k-means clustering
- Clustering subgaussian mixtures by semidefinite programming
- Relax, no need to round: integrality of clustering formulations
- Partial recovery bounds for clustering with the relaxed \(K\)-means
- Convex optimization for the densest subgraph and densest submatrix problems
- Exact clustering of weighted graphs via semidefinite programming
- \(k\)-median: exact recovery in the extended stochastic ball model
MaRDI item: Q897656