SMLSOM: the shrinking maximum likelihood self-organizing map
From MaRDI portal
Publication:6168922
Abstract: Determining the number of clusters in a dataset is a fundamental issue in data clustering. Many methods have been proposed for selecting the number of clusters, treating it as a model selection problem. This paper proposes an efficient algorithm that automatically selects a suitable number of clusters within a probability distribution model framework. The algorithm has two components. First, a generalization of Kohonen's self-organizing map (SOM) is introduced. In Kohonen's SOM, clusters are modeled as mean vectors. In the generalized SOM, each cluster is modeled as a probability distribution and is constructed from the samples assigned to it on the basis of likelihood. Second, a method for dynamically updating the SOM structure is introduced. In Kohonen's SOM, each cluster is tied to a node of a fixed two-dimensional lattice and is learned using neighborhood relations between nodes based on Euclidean distance. The extended SOM defines a graph with clusters as vertices and neighborhood relations as links, and updates the graph structure by cutting weak links and deleting unnecessary vertices. The weakness of a link is measured using the Kullback-Leibler divergence, and the redundancy of a vertex is measured using the minimum description length. These extensions make it efficient to determine an appropriate number of clusters. Compared with existing methods, the proposed method is computationally efficient and selects the number of clusters accurately.
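The two ingredients named in the abstract, likelihood-based assignment and a Kullback-Leibler measure between clusters, can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration, not the paper's method: the clusters are taken to be diagonal Gaussians, and the function names `log_likelihood`, `assign`, and `kl_divergence` are hypothetical. The abstract does not say which distribution family is used, nor how the divergence is thresholded to decide that a link is weak.

```python
import math

# Assumption: each cluster is a diagonal Gaussian stored as (mean, var),
# two equal-length lists of floats. The paper's actual model may differ.

def log_likelihood(x, mean, var):
    """Log-density of sample x under a diagonal Gaussian (mean, var)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def assign(x, clusters):
    """Index of the cluster under which x has the highest likelihood."""
    return max(range(len(clusters)),
               key=lambda k: log_likelihood(x, *clusters[k]))

def kl_divergence(p, q):
    """KL(p || q) between two diagonal Gaussians p and q."""
    (mean_p, var_p), (mean_q, var_q) = p, q
    # Closed form per dimension: 0.5 * (ln(b/a) + (a + (m1 - m2)^2) / b - 1)
    return sum(
        0.5 * (math.log(b / a) + (a + (m1 - m2) ** 2) / b - 1.0)
        for m1, a, m2, b in zip(mean_p, var_p, mean_q, var_q)
    )
```

For example, with `clusters = [([0.0, 0.0], [1.0, 1.0]), ([5.0, 5.0], [1.0, 1.0])]`, the call `assign([4.5, 5.2], clusters)` picks the second cluster, and `kl_divergence(clusters[0], clusters[1])` is large because the two distributions barely overlap. A large divergence between two linked clusters is the kind of signal that could mark their link as weak.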
Recommendations
- scientific article; zbMATH DE number 1860775
- Advances in Intelligent Data Analysis VI
- Clustering of the self-organizing map using a clustering validity index based on inter-cluster and intra-cluster density.
- On the generative probability density model in the self-organizing map
- Kernel-based self-organizing map clustering
Cites work
- scientific article; zbMATH DE number 1694979
- scientific article; zbMATH DE number 4066122
- scientific article; zbMATH DE number 3567782
- scientific article; zbMATH DE number 1085980
- 10.1162/153244303321897735
- A Stochastic Approximation Method
- A new R package for Bayesian estimation of multivariate normal mixtures allowing for selection of the number of components and interval-censored data
- A new look at the statistical model identification
- An Information Measure for Classification
- Clustering: a neural network approach
- Dealing with overdispersion in multivariate count data
- Degrees of freedom and model selection for \(k\)-means clustering
- Estimating the dimension of a model
- Mixture models for standard \(p\)-dimensional Euclidean data
- Model Selection and the Principle of Minimum Description Length
- Model-Based Clustering, Discriminant Analysis, and Density Estimation
- Model-based clustering and classification for data science. With applications in R
- Modeling by shortest data description
- On Information and Sufficiency
- Reversible jump Markov chain Monte Carlo computation and Bayesian model determination
- Self-organized formation of topologically correct feature maps
- Self-organizing maps.
- The dip test of unimodality