Improved model-based clustering performance using Bayesian initialization averaging
Abstract: The Expectation-Maximization (EM) algorithm is a commonly used method for finding maximum likelihood estimates of the parameters of a mixture model via coordinate ascent. A serious pitfall is that, when the likelihood function is multimodal, the algorithm can become trapped at a local maximum. This problem often occurs when sub-optimal starting values are used to initialize the algorithm. Bayesian initialization averaging (BIA) is proposed as an ensemble method for generating high-quality starting values for the EM algorithm. Competing sets of trial starting values are combined as a weighted average, which is then used as the starting position for a full EM run. The method can also be extended to variational Bayes (VB) methods, a class of algorithms similar to EM that is based on an approximation of the model posterior. The BIA method is demonstrated on real continuous, categorical and network data sets, and the convergent log-likelihoods and associated clustering solutions are presented. These compare favorably with the output of competing initialization methods such as random starts, hierarchical clustering and deterministic annealing: the highest available maximum likelihood estimates are obtained in a higher percentage of cases, at reasonable computational cost. The implications of the different clustering solutions obtained at local maxima are also discussed.
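The idea of combining trial starting values as a weighted average can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the weights are computed or how component labels are matched across trial runs. The sketch assumes a one-dimensional Gaussian mixture, weights proportional to the exponentiated log-likelihood reached after a short EM run, and label alignment by sorting components by their means. All function names (`run_em`, `bia_start`, and so on) are hypothetical.

```python
import numpy as np

def log_gauss(x, mu, var):
    # Log density of N(mu_k, var_k) at every point; result has shape (n, K).
    return -0.5 * (np.log(2 * np.pi * var) + (x[:, None] - mu) ** 2 / var)

def m_step(x, z):
    # Update mixing proportions, means and variances from responsibilities z.
    nk = z.sum(axis=0) + 1e-10              # guard against empty components
    pi = nk / nk.sum()
    mu = (z * x[:, None]).sum(axis=0) / nk
    var = (z * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6  # variance floor
    return pi, mu, var

def e_step(x, pi, mu, var):
    # Compute responsibilities and the observed-data log-likelihood.
    log_p = np.log(pi) + log_gauss(x, mu, var)
    m = log_p.max(axis=1, keepdims=True)
    log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
    return np.exp(log_p - log_norm), float(log_norm.sum())

def run_em(x, z, n_iter):
    # Run n_iter >= 1 EM iterations starting from responsibilities z.
    for _ in range(n_iter):
        pi, mu, var = m_step(x, z)
        z, loglik = e_step(x, pi, mu, var)
    return z, loglik, mu

def bia_start(x, k, n_starts=10, short_iters=5, rng=None):
    # Average the responsibilities of several short EM runs.
    # ASSUMPTION: weights are a softmax of the short-run log-likelihoods.
    rng = np.random.default_rng(rng)
    zs, logliks = [], []
    for _ in range(n_starts):
        # Trial start: k random data points as means, pooled variance.
        mu0 = rng.choice(x, size=k, replace=False)
        z0, _ = e_step(x, np.full(k, 1.0 / k), mu0, np.full(k, x.var()))
        z, loglik, mu = run_em(x, z0, short_iters)
        # ASSUMPTION: align labels across runs by ordering components by mean.
        zs.append(z[:, np.argsort(mu)])
        logliks.append(loglik)
    logliks = np.array(logliks)
    w = np.exp(logliks - logliks.max())
    w /= w.sum()
    z_bar = np.tensordot(w, np.array(zs), axes=1)   # weighted average, shape (n, K)
    return z_bar, w

# Illustrative synthetic data: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-5, 1, 100), rng.normal(5, 1, 100)])
z_start, weights = bia_start(x, k=2, rng=rng)       # averaged starting position
z_final, loglik, mu = run_em(x, z_start, 200)       # full EM run from that start
```

Because each trial run returns a row-stochastic responsibility matrix and the weights sum to one, the averaged matrix is itself a valid set of responsibilities and can be passed directly to the full EM run. Sorting by mean is only adequate for label matching in one dimension; a multivariate or categorical model would need a different alignment step.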
Recommendations
- A Gaussian mixture model based \(k\)-means to initialize the EM algorithm
- Improved initialisation of model-based clustering using Gaussian hierarchical partitions
- EM for mixtures
- Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models
- An experimental comparison of model-based clustering methods
Cites work
- scientific article; zbMATH DE number 1817585
- scientific article; zbMATH DE number 3986503
- scientific article; zbMATH DE number 3567782
- scientific article; zbMATH DE number 1222290
- scientific article; zbMATH DE number 1059776
- scientific article; zbMATH DE number 1168330
- scientific article; zbMATH DE number 2124691
- scientific article; zbMATH DE number 805043
- A Limited Memory Algorithm for Bound Constrained Optimization
- A classification EM algorithm for clustering and two stochastic versions
- Algorithm 778: L-BFGS-B
- Bayesian Model Averaging in Proportional Hazard Models: Assessing the Risk of a Stroke
- Bayesian model averaging: A tutorial. (with comments and a rejoinder).
- Choosing initial values for the EM algorithm for finite mixtures
- Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models
- Computational aspects of fitting mixture models via the expectation-maximization algorithm
- Consistent estimation of the order of mixture models.
- Data Mining and Knowledge Discovery Handbook
- EM for mixtures
- Enhancing the selection of a model-based clustering with external categorical variables
- Estimating the dimension of a model
- Estimation and prediction for stochastic blockmodels for graphs with latent block structure
- Exploratory latent structure analysis using both identifiable and unidentifiable models
- Finite mixture models
- Finite mixtures of multivariate skew \(t\)-distributions: some recent and new results
- Information-based clustering
- MCLUST: Software for model-based cluster analysis
- Maximum likelihood estimation via the ECM algorithm: A general framework
- Mixture Densities, Maximum Likelihood and the EM Algorithm
- Model-based clustering, classification, and discriminant analysis via mixtures of multivariate \(t\)-distributions
- On the Bumpy Road to the Dominant Mode
- Variational Bayes approach for model aggregation in unsupervised classification with Markovian dependency
Cited in
(4)