On principal components regression, random projections, and column subsampling
Abstract: Principal Components Regression (PCR) is a traditional tool for dimension reduction in linear regression that has been both criticized and defended. One concern about PCR is that obtaining the leading principal components tends to be computationally demanding for large data sets. While random projections do not possess the optimality properties of the leading principal subspace, they are computationally appealing and hence have become increasingly popular in recent years. In this paper, we present an analysis showing that for random projections satisfying a Johnson-Lindenstrauss embedding property, the prediction error in subsequent regression is close to that of PCR, at the expense of requiring a slightly larger number of random projections than principal components. Column subsampling constitutes an even cheaper way of randomized dimension reduction outside the class of Johnson-Lindenstrauss transforms. We provide numerical results based on synthetic and real data as well as basic theory revealing differences and commonalities in terms of statistical performance.
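The three dimension-reduction schemes compared in the abstract can be made concrete with a small numerical sketch. The Python snippet below (illustrative only, not the authors' code; the dimensions n, p, k and the noise level are arbitrary choices) fits least squares after reducing the design matrix via the leading principal components, a Gaussian Johnson-Lindenstrauss projection, and uniform column subsampling, respectively.

```python
# Minimal sketch: least squares after three randomized/deterministic
# column reductions of the design matrix. Illustrative only; all
# problem sizes are assumptions, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 500, 200, 20          # samples, features, reduced dimension

# Synthetic design with decaying column scales and a linear signal.
X = rng.standard_normal((n, p)) * (1.0 / np.arange(1, p + 1))
beta = rng.standard_normal(p)
y = X @ beta + 0.5 * rng.standard_normal(n)

def reduced_ls_mse(X, y, W):
    """Least squares after the column reduction X -> X @ W."""
    Z = X @ W
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return np.mean((y - Z @ coef) ** 2)

# PCR: project onto the k leading right singular vectors of X.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
W_pcr = Vt[:k].T

# Random projection: i.i.d. Gaussian matrix, a standard
# Johnson-Lindenstrauss transform.
W_rp = rng.standard_normal((p, k)) / np.sqrt(k)

# Column subsampling: keep k columns chosen uniformly at random.
cols = rng.choice(p, size=k, replace=False)
W_cs = np.zeros((p, k))
W_cs[cols, np.arange(k)] = 1.0

for name, W in [("PCR", W_pcr), ("Random projection", W_rp),
                ("Column subsampling", W_cs)]:
    print(f"{name:20s} in-sample MSE: {reduced_ls_mse(X, y, W):.4f}")
```

The sketch reports in-sample error for brevity; the paper's theory concerns prediction error, which would be estimated here with a held-out test set, and its comparison uses slightly more random projections than principal components.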
Recommendations
- Principal component regression revisited
- Projection-pursuit based principal component analysis: a large sample theory
- Sparse Principal Component Analysis via Axis-Aligned Random Projections
- scientific article; zbMATH DE number 775112
- Random Projections for Large-Scale Regression
- Principal Components Regression by Using Generalized Principal Components Analysis
- On principal subspace analysis
- The principal problem with principal components regression
- Regularized principal component analysis
Cites work
- scientific article; zbMATH DE number 1857652
- scientific article; zbMATH DE number 1391247
- scientific article; zbMATH DE number 6438182
- scientific article; zbMATH DE number 961607
- A Random Matrix-Theoretic Approach to Handling Singular Covariance Estimates
- A risk comparison of ordinary least squares vs ridge regression
- A simple proof of the restricted isometry property for random matrices
- A statistical perspective on randomized sketching for ordinary least-squares
- A tail inequality for quadratic forms of subgaussian random vectors
- Adaptive estimation of a quadratic functional by model selection
- An almost optimal unrestricted fast Johnson-Lindenstrauss transform
- Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform
- Bagging predictors
- Compressed and Privacy-Sensitive Sparse Regression
- Database-friendly random projections: Johnson-Lindenstrauss with binary coins
- Extensions of Lipschitz mappings into a Hilbert space
- Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions
- Improved analysis of the subsampled randomized Hadamard transform
- Kernel ridge vs. principal component regression: minimax bounds and the qualification of regularization operators
- Nearest-neighbor-preserving embeddings
- New and Improved Johnson–Lindenstrauss Embeddings via the Restricted Isometry Property
- Normal Multivariate Analysis and the Orthogonal Group
- On \(b\)-bit min-wise hashing for large-scale regression and classification with sparse data
- On principal components and regression: a statistical explanation of a natural phenomenon
- On regularization algorithms in learning theory
- On variants of the Johnson–Lindenstrauss lemma
- Optimal selection of reduced rank estimators of high-dimensional matrices
- Optimization methods for large-scale machine learning
- Random Projections for Large-Scale Regression
- Random-projection ensemble classification (with discussion)
- Randomized Sketches of Convex Programs With Sharp Guarantees
- Sketched ridge regression: optimization perspective, statistical perspective, and model averaging
- Sketching as a tool for numerical linear algebra
- Statistics for high-dimensional data: methods, theory and applications
Cited in (9)
- Principal component projection with low-degree polynomials
- scientific article; zbMATH DE number 7307477
- Reduced rank regression with matrix projections for high-dimensional multivariate linear regression model
- Sketching for principal component regression
- Partial projective resampling method for dimension reduction: with applications to partially linear models
- Projective resampling estimation of informative predictor subspace for multivariate regression
- Dimensionality Reduction, Regularization, and Generalization in Overparameterized Regressions
- Thin-shell theory for rotationally invariant random simplices
- High-dimensional clustering via random projections