Consistency of AIC and BIC in estimating the number of significant components in high-dimensional principal component analysis (Q1650069)
scientific article
Statements
Consistency of AIC and BIC in estimating the number of significant components in high-dimensional principal component analysis (English)
29 June 2018
This paper addresses the problem of estimating the number of significant components in principal component analysis (PCA), known as the dimensionality in PCA. Specifically, let \(y_{1},\dots,y_{n}\) be a random sample of size \(n\) from a \(p\)-dimensional population with mean \(\mu\) and covariance matrix \(\Sigma\). Estimating the dimensionality is treated as the problem of selecting an appropriate model from the set \(\{M_{0}, M_{1},\dots,M_{p-1}\}\), where \[ M_{k}:\ \lambda_{k}>\lambda_{k+1}=\dots=\lambda_{p}=\lambda, \] with \(\lambda_{1}\geqq\dots\geqq\lambda_{p}\) the population eigenvalues of the covariance matrix \(\Sigma\). In this context, the authors consider two estimation criteria, AIC [\textit{H. Akaike}, in: 2nd International Symposium on Information Theory, Tsahkadsor 1971, 267--281 (1973; Zbl 0283.62006)] and BIC [\textit{G. Schwarz}, Ann. Stat. 6, 461--464 (1978; Zbl 0379.62005)], and examine their consistency under a high-dimensional framework where \(p,n\rightarrow \infty\) such that \(p/n\rightarrow c>0\). It is assumed that the number of significant components, say \(k\), is fixed, that the number of candidate models is greater than \(k\), and that the fourth population moment is finite. Both cases \(p<n\) (\(0<c<1\)) and \(p>n\) (\(c>1\)) are discussed. In the latter case, the modified AIC and BIC criteria given on p.~1060 are considered. The main results of the paper are obtained by techniques from random matrix theory and are summarized as follows: {\parindent=6mm \begin{itemize}\item[a)] For \(0<c<1\), if \(\lambda_1\) is bounded, then under the so-called gap condition (C3) given on p.~1057 of the paper, AIC is strongly consistent, but BIC is not. Furthermore, if \(\lambda_k\rightarrow \infty\), then AIC is always strongly consistent regardless of whether the gap condition holds, while BIC is strongly consistent if \(\lambda_k/\log n\rightarrow \infty\).
\item[b)] For \(c>1\), if \(\lambda_1\) is bounded, then under the so-called modified gap condition (C5) given on p.~1060 of the paper, the modified AIC is strongly consistent, but the modified BIC is not. Furthermore, if \(\lambda_k\rightarrow \infty\), then the modified AIC is always strongly consistent regardless of whether the modified gap condition holds, while the modified BIC is strongly consistent if \(\lambda_k/\log n\rightarrow \infty\). \end{itemize}} Finally, simulation studies show that the sufficient conditions given are essential.
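The selection rule described in the review can be illustrated with a minimal Python sketch for the case \(p<n\). It assumes the classical Gaussian-likelihood form of the criteria for the model \(M_{j}\), namely \(n\bigl[\sum_{i\le j}\log l_{i}+(p-j)\log \bar l_{j}\bigr]\) plus a penalty of \(2d_{j}\) (AIC) or \(d_{j}\log n\) (BIC), where \(l_{1}\ge\dots\ge l_{p}\) are the sample eigenvalues, \(\bar l_{j}\) is the mean of the \(p-j\) smallest ones, and \(d_{j}=(j+1)(p+1-j/2)\) counts the free parameters. The review does not reproduce the paper's definitions, so this form, the function names, and the example eigenvalues are assumptions made for illustration; the modified criteria for \(p>n\) are not covered.

```python
import numpy as np


def n_params(j, p):
    """Free parameters under M_j: p for the mean, j + 1 distinct eigenvalues,
    and p*j - j*(j + 1)/2 for the leading eigenvectors."""
    return (j + 1) * (p + 1 - j / 2)


def information_criteria(sample_eigs, n):
    """Return (aic, bic) arrays over the candidate models M_0, ..., M_{p-1}.

    Assumes all sample eigenvalues are positive (the p < n case).
    """
    l = np.sort(np.asarray(sample_eigs, dtype=float))[::-1]  # descending order
    p = l.size
    aic = np.empty(p)
    bic = np.empty(p)
    for j in range(p):
        tail_mean = l[j:].mean()  # common value fitted to the p - j smallest
        # -2 * maximized Gaussian log-likelihood, up to a constant shared by all j
        fit = n * (np.log(l[:j]).sum() + (p - j) * np.log(tail_mean))
        d = n_params(j, p)
        aic[j] = fit + 2 * d
        bic[j] = fit + np.log(n) * d
    return aic, bic


def estimate_dimensionality(sample_eigs, n):
    """Return the minimizers (k_aic, k_bic) of the two criteria."""
    aic, bic = information_criteria(sample_eigs, n)
    return int(np.argmin(aic)), int(np.argmin(bic))
```

For instance, `estimate_dimensionality([10, 5, 1, 1, 1, 1], n=100)` returns `(2, 2)`: the fit term is identical for every \(j\ge 2\) because the four smallest eigenvalues coincide, so the penalty selects the smallest such model. With real data, the sample eigenvalues could be obtained from `np.linalg.eigvalsh(np.cov(Y, rowvar=False, bias=True))` for an \(n\times p\) data matrix `Y`.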
principal component analysis
dimensionality
AIC
BIC
consistency
high-dimensional asymptotic framework