Sparse equisigned PCA: algorithms and performance bounds in the noisy rank-1 setting
From MaRDI portal
Publication:2286372
Parametric hypothesis testing (62F03) Factor analysis and principal components; correspondence analysis (62H25) Hypothesis testing in multivariate analysis (62H15) Computational methods for sparse matrices (65F50) Eigenvalues, singular values, and eigenvectors (15A18) Signal detection and filtering (aspects of stochastic processes) (60G35)
Abstract: Singular value decomposition (SVD) based principal component analysis (PCA) breaks down in the high-dimensional and limited sample size regime below a certain critical eigen-SNR that depends on the dimensionality of the system and the number of samples. Below this critical eigen-SNR, the estimates returned by the SVD are asymptotically uncorrelated with the latent principal components. We consider a setting where the left singular vector of the underlying rank one signal matrix is assumed to be sparse and the right singular vector is assumed to be equisigned, that is, having either only nonnegative or only nonpositive entries. We consider six different algorithms for estimating the sparse principal component based on different statistical criteria and prove that by exploiting sparsity, we recover consistent estimates in the low eigen-SNR regime where the SVD fails. Our analysis reveals conditions under which a coordinate selection scheme based on a extit{sum-type decision statistic} outperforms schemes that utilize the and norm-based statistics. We derive lower bounds on the size of detectable coordinates of the principal left singular vector and utilize these lower bounds to derive lower bounds on the worst-case risk. Finally, we verify our findings with numerical simulations and illustrate the performance with a video data example, where the interest is in identifying objects.
Recommendations
- Sparse PCA via covariance thresholding
- Minimax bounds for sparse PCA with noisy high-dimensional data
- Sparse principal component analysis and iterative thresholding
- High-dimensional analysis of semidefinite relaxations for sparse principal components
- Sparse principal component analysis via regularized low rank matrix approximation
Cites work
- scientific article; zbMATH DE number 1975220 (Why is no real title available?)
- A Direct Formulation for Sparse PCA Using Semidefinite Programming
- Concentration inequalities for order statistics
- Concentration inequalities. A nonasymptotic theory of independence
- Global testing under sparse alternatives: ANOVA, multiple comparisons and the higher criticism
- High-dimensional Ising model selection using \(\ell _{1}\)-regularized logistic regression
- Higher criticism for detecting sparse heterogeneous mixtures.
- Higher criticism for large-scale inference, especially for rare and weak effects
- Improved rectangular matrix multiplication using powers of the Coppersmith-Winograd tensor
- Innovated higher criticism for detecting sparse signals in correlated noise
- Minimax bounds for sparse PCA with noisy high-dimensional data
- Non-Negative Matrix Factorization Revisited: Uniqueness and Algorithm for Symmetric Decomposition
- Non-Negative Principal Component Analysis: Message Passing Algorithms and Sharp Asymptotics
- Non-negative matrix factorization with sparseness constraints
- On estimation of the noise variance in high dimensional probabilistic principal component analysis
- Optimal detection of sparse principal components in high dimension
- Optimality and sub-optimality of PCA. I: Spiked random matrix models
- Order statistics and concentration of \(l_r\) norms for log-concave vectors
- Rejoinder
- Robust Estimation of Noise Standard Deviation in Presence of Signals With Unknown Distributions and Occurrences
- Sparse principal component analysis and iterative thresholding
- The singular values and vectors of low rank perturbations of large rectangular random matrices
- Truncated power method for sparse eigenvalue problems
- Variable selection with Hamming loss
This page was built for publication: Sparse equisigned PCA: algorithms and performance bounds in the noisy rank-1 setting
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q2286372)