Spiked separable covariance matrices and principal components (Q2039807)

In this paper, the spiked separable covariance matrix model is introduced. Let \(A\) and \(B\) be deterministic nonnegative definite symmetric (or Hermitian) matrices of sizes \(p\times p\) and \(n\times n\), respectively, and let \(X= (x_{i,j})\) be a \(p\times n\) random matrix whose entries are independent real (or complex) random variables such that \(\mathbb{E} (x_{i,j})=0\) and \(\mathbb{E} |x_{i,j}|^2= 1/n.\) Consider the separable sample covariance matrices \(Q_{1}= A^{1/2} X B X^{*} A^{1/2}\) and \(Q_{2}= B^{1/2} X^{*} A X B^{1/2}\). Notice that \(Q_{1}\) and \(Q_{2}\) share the same nonzero eigenvalues. It is assumed that there exists a small constant \(0<\tau<1\) such that \(\tau \leq p/n \leq \tau^{-1}\) for all \(n\), that the operator norms of \(A\) and \(B\) are bounded by \(\tau^{-1}\), and that the spectra of \(A\) and \(B\) do not concentrate at zero. Under these assumptions, \(Q_{1}\) and \(Q_{2}\) serve as the reference model without spikes. The corresponding spiked separable sample covariance matrices are defined by \(\tilde{Q}_{1}= \tilde{A}^{1/2} X \tilde{B} X^{*} \tilde{A}^{1/2}\) and \(\tilde{Q}_{2}= \tilde{B}^{1/2} X^{*}\tilde{A} X \tilde{B}^{1/2}.\) Here, \(\tilde{A}= A+ \Delta_{A}\) and \(\tilde{B}= B+ \Delta_{B},\) where \(\Delta_{A}\) and \(\Delta_{B}\) are finite rank matrices. The matrix \(A\) is called the nonspiked component of \(\tilde{A}\), and \(A\) and \(\tilde{A}\) share the same eigenvectors. This model allows for a more general covariance structure and is suitable for spatio-temporal data analysis with spikes in both space and time. The authors study the principal components of the above spiked separable covariance matrix models by means of their resolvents (Green's functions).
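The construction of \(Q_{1}\) and \(Q_{2}\) can be illustrated numerically. The following is a minimal sketch, not taken from the paper: the dimensions, the Gaussian entries, the diagonal choices of \(A\) and \(B\), and the helper name `psd_sqrt` are all assumptions made for illustration. It builds both matrices and compares their nonzero eigenvalues.

```python
import numpy as np

def psd_sqrt(M):
    """Square root of a symmetric nonnegative definite matrix (hypothetical helper)."""
    w, V = np.linalg.eigh(M)
    w = np.clip(w, 0.0, None)          # guard against tiny negative round-off
    return (V * np.sqrt(w)) @ V.conj().T

rng = np.random.default_rng(0)
p, n = 20, 40                          # assumed sizes; p/n stays bounded

# Assumed example: real Gaussian entries with mean 0 and variance 1/n.
X = rng.standard_normal((p, n)) / np.sqrt(n)

# Assumed example: diagonal A (p x p) and B (n x n) with spectra away from zero.
A = np.diag(np.linspace(0.5, 1.5, p))
B = np.diag(np.linspace(0.5, 1.5, n))

A_half, B_half = psd_sqrt(A), psd_sqrt(B)
Q1 = A_half @ X @ B @ X.T @ A_half     # p x p
Q2 = B_half @ X.T @ A @ X @ B_half     # n x n

ev1 = np.linalg.eigvalsh(Q1)           # ascending order
ev2 = np.linalg.eigvalsh(Q2)

# The p largest eigenvalues of Q2 coincide with the spectrum of Q1;
# the remaining n - p eigenvalues of Q2 vanish up to round-off.
same_nonzero_spectrum = np.allclose(ev1, ev2[-p:], atol=1e-8)
```

In this sketch \(p<n\), so \(Q_{2}\) carries \(n-p\) additional zero eigenvalues, while the rest of its spectrum matches that of \(Q_{1}\).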
Here, the ``outliers'' are outlying eigenvalues of the spiked sample covariance matrices \(\tilde{Q}_{1}\) and \(\tilde{Q}_{2}\), while the ``spikes'' are eigenvalues of the population matrices \(\tilde{A}\) and \(\tilde{B}.\) For both supercritical and subcritical spikes, the first order limits of the corresponding eigenvalue outliers and of the generalized components of the corresponding eigenvectors are obtained, together with a precise rate of convergence. In addition, large deviation bounds for the non-outlier eigenvalues and eigenvectors are deduced. In particular, the non-outlier eigenvalues stick to those of the reference matrix, and the non-outlier eigenvectors near the spectral edge are biased in the direction of the population eigenvectors of the subcritical spikes. Finally, some estimates of the number of spikes of \(\tilde{A}\) and \(\tilde{B}\) are given. More precisely, the authors show that the eigenvectors are a key element for separating the outliers arising from the spikes of \(\tilde{A}\) from those arising from the spikes of \(\tilde{B}.\) An optimal shrinkage of the eigenvalues that is adaptive to the data matrix only is also obtained. The number of spikes is important in applications. For instance, it represents the number of signals in signal processing (see [\textit{B. Nadler}, IEEE Trans. Signal Process. 58, No. 5, 2746--2756 (2010; Zbl 1392.94642)]). This problem has been studied for spiked covariance matrices in [\textit{D. Passemier} and \textit{J. Yao}, J. Multivariate Anal. 127, 173--183 (2014; Zbl 1293.62044)].
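The relation between spikes and outliers can be seen in a small simulation. The sketch below is an illustration under assumed parameters and is not the authors' estimation procedure: it takes diagonal \(A\) and \(B\), adds one strong rank-one spike of assumed strength 50 to each along a coordinate eigenvector, and counts the eigenvalues of \(\tilde{Q}_{1}\) that exceed the largest eigenvalue of the reference matrix \(Q_{1}\) built from the same \(X\). The function name `q1_eigenvalues` is hypothetical.

```python
import numpy as np

def q1_eigenvalues(a_diag, b_diag, X):
    """Eigenvalues (ascending) of A^{1/2} X B X^T A^{1/2} for diagonal A and B."""
    Y = np.sqrt(a_diag)[:, None] * X   # A^{1/2} X
    Q1 = (Y * b_diag) @ Y.T            # A^{1/2} X B X^T A^{1/2}
    return np.linalg.eigvalsh(Q1)

rng = np.random.default_rng(1)
p, n = 200, 400                        # assumed sizes
X = rng.standard_normal((p, n)) / np.sqrt(n)

# Assumed nonspiked components: diagonal spectra bounded away from zero.
a = np.linspace(0.5, 1.5, p)
b = np.linspace(0.5, 1.5, n)

# Assumed spikes: one rank-one perturbation in A and one in B, each along
# an existing eigenvector, so eigenvectors of A and B are unchanged.
a_spiked = a.copy()
a_spiked[0] += 50.0
b_spiked = b.copy()
b_spiked[0] += 50.0

ref = q1_eigenvalues(a, b, X)                 # reference (nonspiked) spectrum
spiked = q1_eigenvalues(a_spiked, b_spiked, X)

# Count eigenvalues of the spiked matrix above the top reference eigenvalue.
num_outliers = int(np.sum(spiked > ref[-1]))
print("outliers above the reference edge:", num_outliers)
```

Because each spike is a rank-one nonnegative perturbation, at most two eigenvalues can move above the top reference eigenvalue here. The count alone does not reveal which outlier comes from \(\tilde{A}\) and which from \(\tilde{B}\), which is where the eigenvector information discussed in the review enters.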

zbMATH Keywords

BBP transition

local laws

principal components

spiked separable covariance matrices

reviewed by

Francisco Marcellán