Estimating the number of clusters via a corrected clustering instability
From MaRDI portal
Publication:2228237
Abstract: We improve current instability-based methods for the selection of the number of clusters $k$ in cluster analysis by developing a normalized cluster instability measure that corrects for the distribution of cluster sizes, a previously unaccounted driver of cluster instability. We show that our normalized instability measure outperforms current instability-based measures across the whole sequence of possible $k$ and especially overcomes limitations in the context of large $k$. We also compare, for the first time, model-based and model-free approaches to determine cluster instability and find their performance to be comparable. We make our method available in the R-package cstab.
Cites work
- scientific article; zbMATH DE number 3579840
- A Sober Look at Clustering Stability
- A non-parametric method to estimate the number of clusters
- Cluster-wise assessment of cluster stability
- Consistent estimation of a mixing distribution
- Consistent selection of the number of clusters via crossvalidation
- Estimating the dimension of a model
- Estimating the number of clusters in a data set via the gap statistic
- Finding the Number of Clusters in a Dataset
- Learning Eigenfunctions Links Spectral Embedding and Kernel PCA
- Selection of the number of clusters via the bootstrap method
- Silhouettes: a graphical aid to the interpretation and validation of cluster analysis
- The elements of statistical learning. Data mining, inference, and prediction
Cited in
(5)