Number of relevant directions in Principal Component Analysis and Wishart random matrices

Publication:6229869

arXiv: 1112.5391; MaRDI QID: Q6229869; FDO: Q6229869

Satya N. Majumdar, P. Vivo

Publication date: 22 December 2011

Abstract: We compute analytically, for large $N$, the probability $\mathcal{P}(N_+,N)$ that an $N\times N$ Wishart random matrix has $N_+$ eigenvalues exceeding a threshold $N\zeta$, including its large deviation tails. This probability plays a benchmark role when performing the Principal Component Analysis of a large empirical dataset. We find that $\mathcal{P}(N_+,N)\approx\exp\left(-\beta N^2\psi_\zeta(N_+/N)\right)$, where $\beta$ is the Dyson index of the ensemble and $\psi_\zeta(\kappa)$ is a rate function that we compute explicitly in the full range $0\leq\kappa\leq 1$ and for any $\zeta$. The rate function $\psi_\zeta(\kappa)$ displays a quadratic behavior modulated by a logarithmic singularity close to its minimum $\kappa^\star(\zeta)$. This is shown to be a consequence of a phase transition in an associated Coulomb gas problem. The variance $\Delta(N)$ of the number of relevant components is also shown to grow universally (independent of $\zeta$) as $\Delta(N)\sim(\beta\pi^2)^{-1}\ln N$ for large $N$.
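
The quantity described in the abstract can be explored numerically. The following is a minimal Monte Carlo sketch, not the authors' analytical method: it assumes the real case (Dyson index 1), builds the Wishart matrix as X^T X from an N x N matrix X of independent standard normal entries, and counts the eigenvalues above N*zeta. The function names, the seed, and the parameter values are illustrative assumptions.

```python
import numpy as np


def count_above_threshold(n, zeta, rng):
    """Count eigenvalues of one n x n real Wishart matrix that exceed n * zeta."""
    # Assumption: W = X^T X with X an n x n matrix of i.i.d. N(0, 1) entries.
    x = rng.standard_normal((n, n))
    w = x.T @ x
    # W is symmetric, so eigvalsh returns its real eigenvalues.
    eigenvalues = np.linalg.eigvalsh(w)
    return int(np.sum(eigenvalues > n * zeta))


def sample_counts(n, zeta, trials, seed=0):
    """Draw `trials` independent Wishart matrices and return the counts N_+."""
    rng = np.random.default_rng(seed)
    return np.array([count_above_threshold(n, zeta, rng) for _ in range(trials)])


if __name__ == "__main__":
    n, zeta = 50, 1.0
    counts = sample_counts(n, zeta, trials=200)
    # Empirical fraction of eigenvalues above the threshold and variance of N_+.
    print("mean fraction N_+/N:", counts.mean() / n)
    print("variance of N_+:", counts.var())
```

Sampling of this kind only probes typical fluctuations of N_+ around its mean. The large deviation tails discussed in the abstract are far too rare to reach by direct simulation at these sizes.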