Singularity, misspecification and the convergence rate of EM
From MaRDI portal
Publication:1996764
Abstract: A line of recent work has analyzed the behavior of the Expectation-Maximization (EM) algorithm in the well-specified setting, in which the population likelihood is locally strongly concave around its maximizing argument. Examples include suitably separated Gaussian mixture models and mixtures of linear regressions. We consider over-specified settings in which the number of fitted components is larger than the number of components in the true distribution. Such misspecified settings can lead to singularity in the Fisher information matrix, and moreover, the maximum likelihood estimator based on \(n\) i.i.d. samples in \(d\) dimensions can have a non-standard rate of convergence. Focusing on the simple setting of two-component mixtures fit to a \(d\)-dimensional Gaussian distribution, we study the behavior of the EM algorithm both when the mixture weights are different (unbalanced case) and when they are equal (balanced case). Our analysis reveals a sharp distinction between these two cases: in the former, the EM algorithm converges geometrically to a point at Euclidean distance of order \((d/n)^{1/2}\) from the true parameter, whereas in the latter case, the convergence rate is exponentially slower, and the fixed point has the much lower accuracy of order \((d/n)^{1/4}\). Analysis of this singular case requires the introduction of some novel techniques: in particular, we make use of a careful form of localization in the associated empirical process, and develop a recursive argument to progressively sharpen the statistical rate.
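The balanced case in the abstract can be sketched in a few lines of code (an illustrative simulation, not the paper's implementation): fitting a symmetric mixture \(\tfrac12 N(\theta,1)+\tfrac12 N(-\theta,1)\) to data drawn from a single standard Gaussian (so the true parameter is \(\theta^\ast = 0\)), the EM update with known unit variance reduces to \(\theta' = \frac1n\sum_i x_i \tanh(x_i\theta)\). The iterates then crawl toward zero rather than contracting geometrically.

```python
# Illustrative sketch of the over-specified, balanced case from the abstract
# (assumed setup, not code from the paper): data come from N(0, 1), but we
# fit the two-component mixture 0.5*N(theta, 1) + 0.5*N(-theta, 1) with EM.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.standard_normal(n)   # true model: one Gaussian, so theta* = 0

theta = 0.5                  # arbitrary nonzero initialization
for _ in range(500):
    # E-step: the responsibility of component +theta is sigmoid(2*x*theta),
    # so 2*w - 1 = tanh(x*theta); M-step: the responsibility-weighted mean.
    theta = float(np.mean(x * np.tanh(x * theta)))

print(theta)                 # small but nonzero: the singular fixed point
```

Because the population map behaves like \(\theta \mapsto \theta(1-\theta^2)\) near zero, each step removes only a cubic correction, which is the exponentially slower convergence the abstract describes; the sampling noise in the update then stalls the iterates at a fixed point of magnitude roughly \(n^{-1/4}\) rather than the parametric \(n^{-1/2}\).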
Recommendations
- Statistical convergence of the EM algorithm on Gaussian mixture models
- Randomly initialized EM algorithm for two-component Gaussian mixture achieves near optimality in \(O(\sqrt{n})\) iterations
- A probabilistic analysis of EM for mixtures of separated, spherical Gaussians
- Improved convergence guarantees for learning Gaussian mixture models by EM and gradient EM
- Statistical guarantees for the EM algorithm: from population to sample-based analysis
Cites work
- scientific article; zbMATH DE number 5654889
- scientific article; zbMATH DE number 3567782
- scientific article; zbMATH DE number 1085980
- Asymptotic Convergence Properties of the EM Algorithm for Mixture of Experts
- Asymptotic behaviour of the posterior distribution in overfitted mixture models
- Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions
- CHIME: clustering of high-dimensional Gaussian mixtures with EM algorithm and its optimality
- Convergence of latent mixing measures in finite and infinite mixture models
- Dealing With Label Switching in Mixture Models
- Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities
- Estimating the Coefficients of a Mixture of Two Linear Regressions by Expectation Maximization
- High-dimensional statistics. A non-asymptotic viewpoint
- Hypothesis test for normal mixture models: the EM approach
- Local Rademacher complexities
- Local Rademacher complexities and oracle inequalities in risk minimization. (2004 IMS Medallion Lecture). (With discussions and rejoinder)
- Mixture Densities, Maximum Likelihood and the EM Algorithm
- Non-finite Fisher information and homogeneity: an EM approach
- On the convergence properties of the EM algorithm
- Optimal rate of convergence for finite mixture models
- Simultaneous Clustering and Estimation of Heterogeneous Graphical Models
- Statistical guarantees for the EM algorithm: from population to sample-based analysis
- Strong identifiability and optimal minimax rates for finite mixture estimation
Cited in (10)
- Supermix: sparse regularization for mixtures
- Sequential estimation for mixture of regression models for heterogeneous population
- Improved convergence guarantees for learning Gaussian mixture models by EM and gradient EM
- A diffusion process perspective on posterior contraction rates for parameters
- A Doubly Enhanced EM Algorithm for Model-Based Tensor Clustering
- Sharp global convergence guarantees for iterative nonconvex optimization with random data
- Statistical convergence of the EM algorithm on Gaussian mixture models
- Network Gradient Descent Algorithm for Decentralized Federated Learning
- Randomly initialized EM algorithm for two-component Gaussian mixture achieves near optimality in \(O(\sqrt{n})\) iterations
- Iterative algorithm for discrete structure recovery