On the principal components of sample covariance matrices (Q2634905)

Consider an \(M\times M\) sample covariance matrix \(\mathcal{Q}= \frac{1}{N} AA^{*}\), where the sample matrix \(A\) is an \(M\times N\) random matrix with real entries. By the law of large numbers, if \(M\) is fixed and \(N\) tends to infinity, the sample covariance matrix converges almost surely to the population covariance matrix \(\Sigma= E (\mathcal{Q})\). In many applications, the population size \(M\) is very large and, consequently, obtaining samples is costly. It is therefore natural to consider the cases where \(M\) is of the same order as \(N\) or even larger. In this regime the behavior of \(\mathcal{Q}\) changes substantially and the problem becomes more difficult. In principal component analysis, one seeks to understand the correlations by considering the top eigenvalues and eigenvectors of \(\mathcal{Q}\), the so-called principal components. These provide an effective low-dimensional projection of the high-dimensional data set \(A\), in which the significant trends and correlations are revealed by discarding superfluous data. The basic question is how the principal components of \(\Sigma\) are related to those of \(\mathcal{Q}\). The model presented in the paper under review uses the rescaled sample covariance matrix \(Q= \phi^{-1/2} \mathcal{Q}\), where \(\phi= \frac{M}{N}\); this rescaling ensures that the bulk spectrum of \(Q\) has asymptotically a fixed diameter, 4, for arbitrary \(N\) and \(M\). It is also assumed that \(M\) and \(N\) satisfy the bounds \(N^{1/C} \leq M \leq N^{C}\) for some positive constant \(C\). The paper focuses on covariance matrices \(Q= T X X^{*} T^{*}\), where \(X\) is an \((M+r) \times N\) random matrix and \(T\) is an \(M \times (M+r)\) deterministic matrix. 
Since \(TX\) is an \(M\times N\) matrix, \(Q\) has \(K= M \wedge N\) non-zero eigenvalues. In this setting, the population covariance matrix is defined as \(\Sigma= TT^{*}= I_{M}+ \phi^{1/2}\sum_{i=1}^{M} d_{i} \mathbf{v}_{i}\mathbf{v}^{*}_{i}\), and it is assumed that \(\Sigma\) is positive definite and that \(\Sigma- I_{M}\) has bounded rank. The pairs \((d_{i}, \mathbf{v}_{i})\) with \(d_{i}\neq 0\) are called the spikes of \(\Sigma\). The entries of \(X\) are independent random variables with \(E (X_{i,j})=0\) and \(E (X^{2}_{i,j})= (NM)^{-1/2}\), and the random variables \((NM)^{1/4}X_{i,j}\) have a uniformly bounded \(p\)-th moment for every positive integer \(p\). The spectrum of \(Q\) consists of a bulk spectrum and of outliers (eigenvalues separated from the bulk). The bulk contains an order of \(K\) eigenvalues distributed on large scales according to the Marchenko-Pastur law (see [\textit{V. A. Marchenko} and \textit{L. A. Pastur}, Math. USSR, Sb. 1 (1967), 457--483 (1968; Zbl 0162.22501)]). In addition, if \(\phi >1\), there are \(M-K\) trivial eigenvalues at zero. The paper analyzes the eigenvalues and eigenvectors of \(Q\), with a focus on large deviation bounds and asymptotic laws. For the eigenvalues, large deviation bounds on the locations of the outliers are deduced. Next, the authors prove eigenvalue sticking for the non-outliers: each non-outlier ``sticks'' with high probability and very accurately to an eigenvalue of a related covariance matrix satisfying \(\Sigma= I_{M}\), whose top eigenvalues exhibit universality. As a consequence, the top non-outlier eigenvalue of \(Q\) has asymptotically the Tracy-Widom-\(1\) distribution. For the eigenvectors \(\xi_{i}\) of \(Q\), if \(\mathbf{w}\) is a deterministic vector in \(\mathbb{R}^{M}\), then large deviation bounds on the generalized components \(\langle \mathbf{w} , \xi_{i} \rangle\) of outlier eigenvectors are deduced. 
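
The model described above can be simulated numerically. The following is a minimal sketch under stated assumptions, not the authors' construction: it takes \(r=0\), Gaussian entries, a single spike \(d=3\) along the first coordinate axis, and \(T\) equal to the symmetric square root of \(\Sigma\). The bulk edge positions \(\phi^{1/2}+\phi^{-1/2}\pm 2\) are the standard Marchenko-Pastur edges after the rescaling by \(\phi^{-1/2}\); they are not stated in the review. The function names `bulk_edges` and `sample_spectrum` are hypothetical.

```python
import numpy as np


def bulk_edges(phi):
    """Edges of the rescaled Marchenko-Pastur bulk; their distance is 4."""
    center = np.sqrt(phi) + 1.0 / np.sqrt(phi)
    return center - 2.0, center + 2.0


def sample_spectrum(M=500, N=2000, d=3.0, seed=0):
    """Eigenvalues (descending) of Q = T X X^T T^T with a single spike d."""
    rng = np.random.default_rng(seed)
    phi = M / N
    # Independent entries with mean 0 and variance (N*M)**(-1/2).
    X = rng.standard_normal((M, N)) * (N * M) ** (-0.25)
    # Assumed spike direction: the first coordinate axis (illustrative choice).
    v = np.zeros(M)
    v[0] = 1.0
    # T is the symmetric square root of Sigma = I + sqrt(phi) * d * v v^T.
    sigma = 1.0 + np.sqrt(phi) * d
    T = np.eye(M) + (np.sqrt(sigma) - 1.0) * np.outer(v, v)
    Y = T @ X
    return np.linalg.eigvalsh(Y @ Y.T)[::-1], phi


eigs, phi = sample_spectrum()
lo, hi = bulk_edges(phi)
print("bulk edges:", lo, hi)
print("top two eigenvalues:", eigs[:2])
print("smallest eigenvalue:", eigs[-1])
```

With these parameters one would expect a single eigenvalue clearly above the upper bulk edge (the outlier) while the remaining eigenvalues stay close to the interval between the two edges.
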
An outlier eigenvector is shown to concentrate on a cone around the corresponding spike direction and to be completely delocalized in any direction orthogonal to it, provided the outlier is well separated from the bulk spectrum and from the other outliers. If the outlier approaches the bulk spectrum or another outlier, the cone concentration becomes less accurate. For two nearby outlier eigenvalues, the cone concentration of the eigenvectors breaks down when the distributions of the outlier eigenvalues have a nontrivial overlap. The authors also establish delocalization bounds for the generalized components of non-outlier eigenvectors. Furthermore, the non-outlier eigenvectors away from the edge are completely delocalized in all directions. Finally, the asymptotic law of the generalized component of a non-outlier eigenvector is deduced: it is asymptotically Gaussian, with a variance predicted by the delocalization bounds previously obtained. The proofs of universality of the non-outlier eigenvalues and eigenvectors require the universality of \(Q\) for the uncorrelated case \(\Sigma= I_{M}\) as an input. This universality result establishes the joint, fixed-index universality of the eigenvalues and eigenvectors of \(Q\) and, as a special case, the quantum unique ergodicity of the eigenvectors of \(Q\). For the uncorrelated case \(\Sigma= I_{M}\) and Gaussian \(X\) with fixed \(\phi\), the top eigenvalue, suitably rescaled, is asymptotically distributed according to the Tracy-Widom law (see [\textit{K. Johansson}, Commun. Math. Phys. 209, No. 2, 437--476 (2000; Zbl 0969.15008)] for the complex case and \textit{I. M. Johnstone} [Ann. Stat. 29, No. 2, 295--327 (2001; Zbl 1016.62078)] for the real case). The study of covariance matrices with nontrivial population covariance matrix \(\Sigma\neq I_{M}\) goes back to the above paper by Johnstone, where the spiked model was introduced. 
The so-called BBP phase transition for complex Gaussian \(X\), fixed rank of \(\Sigma-I_{M}\) and fixed \(\phi\) was established in [\textit{J. Baik} et al., Ann. Probab. 33, No. 5, 1643--1697 (2005; Zbl 1086.15022)].
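
The phase transition mentioned above can be illustrated with a small experiment. The sketch below rests on several assumptions that are not taken from the review: it uses real Gaussian entries (the cited result concerns the complex Gaussian case), a single spike with \(r=0\), and the classical rank-one predictions written in the normalization used here, namely a threshold at \(d=1\), an outlier near \(\phi^{1/2}+\phi^{-1/2}+d+d^{-1}\) for \(d>1\), and a squared overlap \(\langle \mathbf{v}, \xi_{1}\rangle^{2}\) near \((1-d^{-2})/(1+\phi^{1/2}/d)\). The function names are hypothetical.

```python
import numpy as np


def top_pair(d, M=500, N=2000, seed=0):
    """Top eigenvalue of Q and squared overlap of its eigenvector with v."""
    rng = np.random.default_rng(seed)
    phi = M / N
    # Independent entries with mean 0 and variance (N*M)**(-1/2).
    X = rng.standard_normal((M, N)) * (N * M) ** (-0.25)
    # Assumed spike direction: the first coordinate axis (illustrative choice).
    v = np.zeros(M)
    v[0] = 1.0
    # T is the symmetric square root of Sigma = I + sqrt(phi) * d * v v^T.
    sigma = 1.0 + np.sqrt(phi) * d
    T = np.eye(M) + (np.sqrt(sigma) - 1.0) * np.outer(v, v)
    Y = T @ X
    vals, vecs = np.linalg.eigh(Y @ Y.T)  # eigenvalues in ascending order
    return vals[-1], float(np.dot(v, vecs[:, -1]) ** 2), phi


def predicted_outlier(phi, d):
    """Classical outlier location for d > 1 (assumed normalization)."""
    return np.sqrt(phi) + 1.0 / np.sqrt(phi) + d + 1.0 / d


def predicted_overlap(phi, d):
    """Classical squared overlap with the spike for d > 1 (assumed)."""
    return (1.0 - d ** -2) / (1.0 + np.sqrt(phi) / d)


for d in (0.5, 1.5, 3.0):
    top, overlap, phi = top_pair(d)
    edge = np.sqrt(phi) + 1.0 / np.sqrt(phi) + 2.0
    print(f"d={d}: top eigenvalue {top:.3f}, bulk edge {edge:.3f}, "
          f"squared overlap {overlap:.3f}")
```

For a subcritical spike the top eigenvalue should stay near the bulk edge and the overlap with the spike direction should be small, whereas for a supercritical spike both quantities should move toward the predicted values, up to finite-size fluctuations.
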

zbMATH Keywords

covariance matrices

principal components

outlier and non-outlier eigenvectors

spikes

level repulsion
