Randomly initialized EM algorithm for two-component Gaussian mixture achieves near optimality in \(O(\sqrt{n})\) iterations


DOI: 10.4171/MSL/29
zbMATH Open: 1493.62350
arXiv: 1908.10935
MaRDI QID: Q2113265
FDO: Q2113265


Authors: Yihong Wu, Harrison H. Zhou


Publication date: 11 March 2022

Published in: Mathematical Statistics and Learning

Abstract: We analyze the classical EM algorithm for parameter estimation in symmetric two-component Gaussian mixtures in \(d\) dimensions. We show that, even in the absence of any separation between components, provided that the sample size satisfies \(n = \Omega(d \log^3 d)\), the randomly initialized EM algorithm converges to an estimate in at most \(O(\sqrt{n})\) iterations with high probability, which is at most \(O\big((\frac{d \log^3 n}{n})^{1/4}\big)\) in Euclidean distance from the true parameter and within logarithmic factors of the minimax rate of \((\frac{d}{n})^{1/4}\). Both the nonparametric statistical rate and the sublinear convergence rate are direct consequences of the zero Fisher information in the worst case. Refined pointwise guarantees beyond worst-case analysis and convergence to the MLE are also shown under mild conditions. This improves the previous result of Balakrishnan et al. [BWY17], which requires strong conditions on both the separation of the components and the quality of the initialization, and that of Daskalakis et al. [DTZ17], which requires sample splitting and restarting the EM iteration.


Full work available at URL: https://arxiv.org/abs/1908.10935
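
For the symmetric mixture \(\frac{1}{2}N(\theta, I_d) + \frac{1}{2}N(-\theta, I_d)\) with identity covariance, the EM update analyzed in this setting takes the closed form \(\theta^{t+1} = \frac{1}{n}\sum_{i=1}^n \tanh(\langle X_i, \theta^t\rangle)\, X_i\). The Python sketch below illustrates this iteration run for \(O(\sqrt{n})\) steps from a random start; the function name, the choice of random initializer, and the synthetic-data setup are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def em_symmetric_2gmm(X, n_iter=None, seed=None):
    """Randomly initialized EM for (1/2)N(theta, I_d) + (1/2)N(-theta, I_d).

    Minimal sketch: with unit covariance and equal weights, the EM update
    collapses to theta <- (1/n) * sum_i tanh(<X_i, theta>) X_i.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    if n_iter is None:
        # O(sqrt(n)) iterations suffice per the worst-case guarantee
        n_iter = int(np.ceil(np.sqrt(n)))
    theta = rng.standard_normal(d) / np.sqrt(d)  # random initialization
    for _ in range(n_iter):
        w = np.tanh(X @ theta)                   # E-step: posterior mean of labels
        theta = (w[:, None] * X).mean(axis=0)    # M-step: weighted sample mean
    return theta

# Usage on synthetic data; theta is identifiable only up to sign.
rng = np.random.default_rng(0)
d, n = 10, 20000
theta_star = np.ones(d) / np.sqrt(d)
labels = rng.choice([-1.0, 1.0], size=n)
X = labels[:, None] * theta_star + rng.standard_normal((n, d))
theta_hat = em_symmetric_2gmm(X, seed=1)
err = min(np.linalg.norm(theta_hat - theta_star),
          np.linalg.norm(theta_hat + theta_star))
```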








