How close is the sample covariance matrix to the actual covariance matrix?
Publication: 715740
Identifiers: DOI: 10.1007/S10959-010-0338-Z; zbMATH Open: 1365.62208; arXiv: 1004.3484; OpenAlex: W2094644779; MaRDI QID: Q715740; FDO: Q715740
Authors: Roman Vershynin
Publication date: 1 November 2012
Published in: Journal of Theoretical Probability
Abstract: Given a probability distribution in \(\mathbb{R}^n\) with general (non-white) covariance, a classical estimator of the covariance matrix is the sample covariance matrix obtained from a sample of \(N\) independent points. What is the optimal sample size \(N = N(n)\) that guarantees estimation with a fixed accuracy in the operator norm? Suppose the distribution is supported in a centered Euclidean ball of radius \(\sqrt{n}\). We conjecture that the optimal sample size is \(N = O(n)\) for all distributions with finite fourth moment, and we prove this up to an iterated logarithmic factor. This problem is motivated by the optimal theorem of Rudelson, which states that \(N = O(n \log n)\) for distributions with finite second moment, and a recent result of Adamczak, Litvak, Pajor and Tomczak-Jaegermann, which guarantees that \(N = O(n)\) for sub-exponential distributions.
Full work available at URL: https://arxiv.org/abs/1004.3484
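The question in the abstract can be explored numerically. The following is a minimal sketch, not the paper's method: it assumes a hypothetical mean-zero test distribution with independent uniform coordinates (chosen so that every point lies in the centered ball of radius \(\sqrt{n}\)), and it estimates the operator norm with plain power iteration so that only the Python standard library is needed. All function names and the half-widths are illustrative assumptions.

```python
import math
import random


def sample_covariance(points):
    """Return (1/N) * sum of x x^T for mean-zero points given as lists."""
    n = len(points[0])
    cov = [[0.0] * n for _ in range(n)]
    for x in points:
        for i in range(n):
            xi = x[i]
            for j in range(n):
                cov[i][j] += xi * x[j]
    count = len(points)
    return [[entry / count for entry in row] for row in cov]


def operator_norm_symmetric(matrix, iterations=500):
    """Estimate the operator norm of a symmetric matrix by power iteration.

    The all-ones start vector is an assumption; it fails only if it is
    orthogonal to every top eigenvector, which is not checked here.
    """
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    norm = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            return 0.0
        v = [c / norm for c in w]
    return norm


def estimation_error(count, half_widths, rng):
    """Operator-norm distance between sample and true covariance.

    Coordinate i is uniform on [-a_i, a_i], so the true covariance is
    diagonal with entries a_i^2 / 3 (hypothetical test distribution).
    """
    n = len(half_widths)
    points = [[rng.uniform(-a, a) for a in half_widths] for _ in range(count)]
    sample_cov = sample_covariance(points)
    diff = [
        [sample_cov[i][j] - (half_widths[i] ** 2 / 3.0 if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return operator_norm_symmetric(diff)


if __name__ == "__main__":
    rng = random.Random(0)
    # Squared half-widths sum to 4.72 <= n = 5, so the support lies inside
    # the centered Euclidean ball of radius sqrt(n).
    widths = [0.4, 0.6, 0.8, 1.0, 1.6]
    for count in (20, 200, 2000, 20000):
        print(count, estimation_error(count, widths, rng))
```

Running the script prints the estimation error for increasing \(N\) at fixed \(n = 5\); the error should shrink as \(N\) grows, though a single run at small \(n\) says nothing about the optimal dependence of \(N\) on \(n\) discussed in the abstract.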
Recommendations
- Sample size determination in estimating a covariance matrix
- Sample covariance matrix for random vectors with heavy tails
- Covariance Matrix Estimation From Linearly-Correlated Gaussian Samples
- On the covariance between the sample mean and variance
- Comparison between two types of large sample covariance matrices
- Sample Covariance Matrices of Heavy-Tailed Distributions
- Estimating covariance matrices
- Estimating the covariance of random matrices
Cites Work
- Weak convergence and empirical processes. With applications to statistics
- Title not available
- Some estimates of norms of random matrices
- Generalized thresholding of large covariance matrices
- Limit of the smallest eigenvalue of a large dimensional sample covariance matrix
- Quantitative estimates of the convergence of the empirical covariance matrix in log-concave ensembles
- Concentration of mass on convex bodies
- Asymptotic theory of finite dimensional normed spaces. With an appendix by M. Gromov: Isoperimetric inequalities in Riemannian manifolds
- Random walks and an \(O^*(n^5)\) volume algorithm for convex bodies
- Title not available
- Spectral norm of products of random and deterministic matrices
- Random vectors in the isotropic position
- Title not available
- Non-asymptotic theory of random matrices: extreme singular values
- Sampling convex bodies: a random matrix approach
- The Expected Norm of Random Matrices
- Sharp bounds on the rate of convergence of the empirical covariance matrix
- Partial estimation of covariance matrices
- Random points in isotropic unconditional convex bodies
- Euclidean structure in finite dimensional normed spaces
- Optimization of a convex program with a polynomial perturbation
- Frame expansions with erasures: an approach through the non-commutative operator theory
- Approximating the moments of marginals of high-dimensional distributions
Cited In (57)
- Optimal modeling of nonlinear systems: method of variable injections
- Fast random vector transforms in terms of pseudo-inverse within the Wiener filtering paradigm
- Ridge estimation of covariance matrix from data in two classes.
- Covariance estimation under missing observations and \(L_4 - L_2\) moment equivalence
- Asymptotic geometric analysis: achievements and perspective
- Principal component analysis of hybrid functional and vector data
- The famous American economist H. Markowitz and mathematical overview of his portfolio selection theory
- Streaming principal component analysis from incomplete data
- Multiscale geometric methods for data sets. I: Multiscale SVD, noise and curvature.
- Covariance estimation under one-bit quantization
- Bootstrap consistency for quadratic forms of sample averages with increasing dimension
- Sub-Gaussian estimators of the mean of a random vector
- Uniform-in-submodel bounds for linear regression in a model-free framework
- The power of adaptivity in source identification with time queries on the path
- On the interval of fluctuation of the singular values of random matrices
- Preconditioning filter bank decomposition using structured normalized tight frames
- Robust long-term aircraft heavy maintenance check scheduling optimization under uncertainty
- Estimating covariance and precision matrices along subspaces
- Covariance estimation for distributions with \({2+\varepsilon}\) moments
- Convergence and finite sample approximations of entropic regularized Wasserstein distances in Gaussian and RKHS settings
- Folded concave penalized sparse linear regression: sparsity, statistical performance, and algorithmic theory for local solutions
- The method of perpendiculars of finding estimates from below for minimal singular eigenvalues of random matrices
- Distributed estimation in heterogeneous reduced rank regression: with application to order determination in sufficient dimension reduction
- Row products of random matrices
- Exponential-Family Embedding With Application to Cell Developmental Trajectories for Single-Cell RNA-Seq Data
- Estimation of a multiplicative correlation structure in the large dimensional case
- Portfolio construction by mitigating error amplification: the bounded-noise portfolio
- Bernstein-von Mises theorems for functionals of the covariance matrix
- Identification of alterations in the Jacobian of biochemical reaction networks from steady state covariance data at two conditions
- Fast convergence on blind and semi-blind channel estimation for MIMO-OFDM systems
- Affine invariant integrated rank-weighted statistical depth: properties and finite sample analysis
- Robust high-dimensional factor models with applications to statistical machine learning
- Modeling High-Dimensional Time Series: A Factor Model With Dynamically Dependent Factors and Diverging Eigenvalues
- Factorisable multitask quantile regression
- Linear system identifiability from single-cell data
- On the finite-sample analysis of \(\Theta\)-estimators
- Multivariate factorizable expectile regression with application to fMRI data
- Marcinkiewicz-type discretization of \(L^p\)-norms under the Nikolskii-type inequality assumption
- Exploring the toolkit of Jean Bourgain
- On generic chaining and the smallest singular value of random matrices with heavy tails
- Likelihood ratio tests for a large directed acyclic graph
- Mahalanobis metric based clustering for fixed effects model
- On the finite-sample analysis of \(\Theta\)-estimators
- Quantitative estimates of the convergence of the empirical covariance matrix in log-concave ensembles
- Convergence-enhanced subspace channel estimation for MIMO-OFDM systems with virtual carriers
- On the predictive risk in misspecified quantile regression
- Generalized canonical correlation analysis for classification
- Sampling discretization and related problems
- Bayesian beta regression for bounded responses with unknown supports
- From low- to high-dimensional moments without magic
- Multilevel maximum likelihood estimation with application to covariance matrices
- Partial estimation of covariance matrices
- What Should Be Done When an Estimated Between-Group Covariance Matrix Is Not Nonnegative Definite?
- A simple tool for bounding the deviation of random matrices on geometric sets
- A time-distance trade-off for GDD with preprocessing: instantiating the DLW heuristic
- Restricted isometry property for random matrices with heavy-tailed columns
- Optimal variable selection in multi-group sparse discriminant analysis