Convergence and prediction of principal component scores in high-dimensional settings
Abstract: A number of settings arise in which it is of interest to predict Principal Component (PC) scores for new observations using data from an initial sample. In this paper, we demonstrate that naive approaches to PC score prediction can be substantially biased toward 0 in the analysis of large matrices. This phenomenon is largely related to known inconsistency results for sample eigenvalues and eigenvectors as both dimensions of the matrix increase. For the spiked eigenvalue model for random matrices, we expand the generality of these results and propose bias-adjusted PC score prediction. In addition, we compute the asymptotic correlation coefficient between PC scores from sample and population eigenvectors. Simulation and real data examples from the genetics literature show the reduced bias and improved numerical properties of our estimators.
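To make the shrinkage phenomenon concrete, the sketch below simulates a single-spike covariance model, predicts the leading PC score of a new observation by projecting onto the sample eigenvector, and applies an asymptotic bias adjustment. The correction inverts the known eigenvalue bias map from spiked-model asymptotics (Baik-Silverstein) and uses the limiting angle between sample and population eigenvectors (Paul 2007), both of which the abstract refers to as known inconsistency results. This is a minimal illustration of the kind of adjustment the paper proposes, not the authors' exact estimator; the dimensions, spike size, and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 200, 2000           # n samples, p variables (p >> n)
gamma = p / n
lam = 25.0                 # spike eigenvalue; remaining eigenvalues are 1
                           # (above the phase transition lam > 1 + sqrt(gamma))

# Population leading eigenvector: a fixed unit direction.
v = np.zeros(p)
v[0] = 1.0

def sample(m):
    """Draw m rows from the single-spike model: cov = I + (lam - 1) v v'."""
    z = rng.standard_normal((m, p))
    u = rng.standard_normal(m)
    return z + np.sqrt(lam - 1.0) * np.outer(u, v)

# Training sample and its leading sample eigenpair via SVD.
X = sample(n)
mu = X.mean(axis=0)
_, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
d1 = s[0] ** 2 / n         # leading sample eigenvalue (upward biased)
v_hat = Vt[0]              # leading sample eigenvector
if v_hat @ v < 0:          # resolve the sign ambiguity for the comparison
    v_hat = -v_hat

# Naive PC score for a new observation: project onto the sample eigenvector.
x_new = sample(1)[0] - mu
naive_score = v_hat @ x_new
true_score = v @ x_new     # score under the population eigenvector

# Invert the asymptotic eigenvalue bias map d = lam * (1 + gamma / (lam - 1)).
b = d1 + 1.0 - gamma
lam_hat = (b + np.sqrt(max(b * b - 4.0 * d1, 0.0))) / 2.0

# Limiting cos^2 of the angle between sample and population eigenvectors;
# naive scores for new observations shrink toward 0 by roughly this cosine.
cos2 = (1.0 - gamma / (lam_hat - 1.0) ** 2) / (1.0 + gamma / (lam_hat - 1.0))
adjusted_score = naive_score / np.sqrt(max(cos2, 1e-12))

print(f"true score     : {true_score: .3f}")
print(f"naive score    : {naive_score: .3f}  (shrunk toward 0)")
print(f"adjusted score : {adjusted_score: .3f}")
```

With these settings the limiting cosine is about 0.83, so the naive score loses roughly a sixth of its magnitude; dividing by the estimated cosine undoes this shrinkage on average.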
Recommendations
- Convergence of sample eigenvalues, eigenvectors, and principal component scores for ultra-high dimensional data
- A note on the prediction error of principal component regression in high dimensions
- Principal component analysis in very high-dimensional spaces
- On consistency and sparsity for principal components analysis in high dimensions
- Estimating common principal components in high dimensions
- On the number of principal components in high dimensions
- Convergence of algorithms used for principal component analysis
- Geometric consistency of principal component scores for high-dimensional mixture models and its application
- Principal regression for high dimensional covariance matrices
- Concordance-based estimation approaches for the optimal sufficient dimension reduction score
Cites work
- scientific article; zbMATH DE number 47926
- scientific article; zbMATH DE number 49702
- scientific article; zbMATH DE number 1347881
- scientific article; zbMATH DE number 1964693
- Additive Risk Models for Survival Data with High‐Dimensional Covariates
- Asymptotic Theory for Principal Component Analysis
- Asymptotics of sample eigenstructure for a large dimensional spiked covariance model
- Central limit theorems for eigenvalues in a spiked population model
- Distribution of eigenvalues for some sets of random matrices
- Eigenvalues of large sample covariance matrices of spiked population models
- Finite sample approximation results for principal component analysis: A matrix perturbation approach
- Geometric Representation of High Dimension, Low Sample Size Data
- On consistency and sparsity for principal components analysis in high dimensions
- On the Sampling Theory of Roots of Determinantal Equations
- On the distribution of the largest eigenvalue in principal components analysis
- PCA consistency in high dimension, low sample size context
- Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices
- Prediction by Supervised Principal Components
- Principal Components
- Principal component analysis
- Spectrum estimation for large dimensional covariance matrices using random matrix theory
- Statistical eigen-inference from large Wishart matrices
- The high-dimension, low-sample-size geometric representation holds under mild conditions
Cited in (19)
- Reconstruction of a low-rank matrix in the presence of Gaussian noise
- Adjusting systematic bias in high dimensional principal component scores
- Convergence of sample eigenvectors of spiked population model
- Computation of ancestry scores with mixed families and unrelated individuals
- Boundary behavior in high dimension, low sample size asymptotics of PCA
- Principal components in linear mixed models with general bulk
- ePCA: high dimensional exponential family PCA
- Detection, reconstruction, and characterization algorithms from noisy data in multistatic wave imaging
- Quantile regression estimation of partially linear additive models
- Asymptotic properties of principal component analysis and shrinkage-bias adjustment under the generalized spiked population model
- Modelling functional additive quantile regression using support vector machines approach
- Targeted random projection for prediction from high-dimensional features
- A note on cyclic shift permutation testing for large eigenvalues
- Isolation-by-distance-and-time in a stepping-stone model
- Effective PCA for high-dimension, low-sample-size data with noise reduction via geometric representations
- On the number of principal components in high dimensions
- Convergence of algorithms used for principal component analysis
- Statistical inference for high-dimension, low-sample-size data
- Multidimensional scaling of noisy high dimensional data