Cross-Validated Loss-based Covariance Matrix Estimator Selection in High Dimensions
From MaRDI portal
Publication:6094089
Abstract: The covariance matrix plays a fundamental role in many modern exploratory and inferential statistical procedures, including dimensionality reduction, hypothesis testing, and regression. In low-dimensional regimes, where the number of observations far exceeds the number of variables, the optimality of the sample covariance matrix as an estimator of this parameter is well-established. High-dimensional regimes do not admit such a convenience, however. As such, a variety of estimators have been derived to overcome the shortcomings of the sample covariance matrix in these settings. Yet, the question of selecting an optimal estimator from among the plethora available remains largely unaddressed. Using the framework of cross-validated loss-based estimation, we develop the theoretical underpinnings of just such an estimator selection procedure. In particular, we propose a general class of loss functions for covariance matrix estimation and establish finite-sample risk bounds and conditions for the asymptotic optimality of the cross-validated estimator selector with respect to these loss functions. We evaluate our proposed approach via a comprehensive set of simulation experiments and demonstrate its practical benefits by application in the exploratory analysis of two single-cell transcriptome sequencing datasets. A free and open-source software implementation of the proposed methodology, the cvCovEst R package, is briefly introduced.
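The selection procedure the abstract describes can be illustrated with a minimal sketch, assuming a Frobenius-type loss and two illustrative candidate estimators (the sample covariance and a fixed-weight linear shrinkage toward a scaled identity); this is not the cvCovEst implementation, and the function names, the candidate set, and the fold-wise scoring rule are assumptions chosen for illustration.

```python
import numpy as np

def sample_cov(X):
    # Unbiased sample covariance of an (n, p) data matrix.
    return np.cov(X, rowvar=False)

def linear_shrinkage(X, alpha=0.5):
    # Shrink the sample covariance toward a scaled identity target
    # (Ledoit-Wolf-style; the fixed weight alpha is illustrative).
    S = sample_cov(X)
    target = (np.trace(S) / S.shape[0]) * np.eye(S.shape[0])
    return (1 - alpha) * S + alpha * target

def cv_risk(estimator, X, n_folds=5, seed=0):
    # Cross-validated Frobenius-type risk: fit each candidate on the
    # training folds, then score it against the sample covariance of
    # the held-out fold, which serves as an unbiased loss target.
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    risks = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        est = estimator(X[train])
        risks.append(np.linalg.norm(est - sample_cov(X[test]), "fro") ** 2)
    return float(np.mean(risks))

def select_estimator(candidates, X):
    # Return the name of the candidate minimizing CV risk, plus all risks.
    risks = {name: cv_risk(f, X) for name, f in candidates.items()}
    return min(risks, key=risks.get), risks
```

In a high-dimensional regime (here, p close to n), the cross-validated risks would typically favor the regularized candidate over the ill-conditioned sample covariance, which is the behavior the selection framework is designed to detect.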
Cites work
- scientific article; zbMATH DE number 1964693
- scientific article; zbMATH DE number 6122810
- scientific article; zbMATH DE number 961607
- A distribution-free theory of nonparametric regression
- A well-conditioned estimator for large-dimensional covariance matrices
- Adaptive thresholding for sparse covariance matrix estimation
- Asymptotics of cross-validated risk estimation in estimator selection and performance assessment
- Asymptotics of the principal components estimator of large factor models with weakly influential factors
- Covariance regularization by thresholding
- Covariance, subspace, and intrinsic Cramér-Rao bounds
- Distribution of eigenvalues for some sets of random matrices
- Forecasting Using Principal Components From a Large Number of Predictors
- Generalized thresholding of large covariance matrices
- High dimensional covariance matrix estimation using a factor model
- Inferential Theory for Factor Models of Large Dimensions
- Large covariance estimation by thresholding principal orthogonal complements. With discussion and authors' reply
- Nonlinear shrinkage estimation of large-dimensional covariance matrices
- On consistency and sparsity for principal components analysis in high dimensions
- On the distribution of the largest eigenvalue in principal components analysis
- Optimal rates of convergence for covariance matrix estimation
- Oracle inequalities for multi-fold cross validation
- Probability Inequalities for the Sum of Independent Random Variables
- Projected principal component analysis in factor models
- Regularized estimation of large covariance matrices
- Robust covariance estimation for approximate factor models
- Sparsistency and rates of convergence in large covariance matrix estimation
- Spectrum estimation: a unified framework for covariance matrix estimation and PCA in large dimensions
- The Empirical Bayes Approach to Statistical Decision Problems
- The elements of statistical learning. Data mining, inference, and prediction
- Tuning-parameter selection in regularized estimations of large covariance matrices
Cited in (3)