Optimal discriminant analysis in high-dimensional latent factor models
Publication: 6136589
DOI: 10.1214/23-AOS2289
arXiv: 2210.12862
MaRDI QID: Q6136589
Publication date: 31 August 2023
Published in: The Annals of Statistics
Abstract: In high-dimensional classification problems, a commonly used approach is to first project the high-dimensional features into a lower-dimensional space and then base the classification on the resulting lower-dimensional projections. In this paper, we formulate a latent-variable model with a hidden low-dimensional structure to justify this two-step procedure and to guide the choice of projection. We propose a computationally efficient classifier that takes certain principal components (PCs) of the observed features as projections, with the number of retained PCs selected in a data-driven way. A general theory is established for analyzing such two-step classifiers based on any projections. We derive explicit rates of convergence of the excess risk of the proposed PC-based classifier. The obtained rates are further shown to be optimal up to logarithmic factors in the minimax sense. Our theory allows the lower dimension to grow with the sample size and remains valid even when the feature dimension (greatly) exceeds the sample size. Extensive simulations corroborate our theoretical findings. The proposed method also performs favorably relative to other existing discriminant methods on three real data examples.
Full work available at URL: https://arxiv.org/abs/2210.12862
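The abstract describes a two-step procedure: project the high-dimensional features onto leading principal components, then run a linear discriminant classifier on the projections, with the number of retained PCs chosen from the data. A minimal sketch of this idea follows, assuming synthetic latent-factor data; cross-validation stands in for the paper's own data-driven PC-selection rule, so this illustrates the generic two-step pipeline rather than the authors' exact estimator.

```python
# Illustrative sketch of the two-step PC-based classification idea:
# (1) project observed features onto leading principal components,
# (2) apply linear discriminant analysis on the projections.
# The number of retained PCs is chosen by cross-validation here, as a
# stand-in for the paper's data-driven selection rule (an assumption).
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)

# Synthetic latent-factor model: p-dimensional features driven by K factors.
n, p, K = 200, 500, 3
A = rng.normal(size=(p, K))                  # factor loading matrix
y = rng.integers(0, 2, size=n)               # binary class labels
# Class-dependent factor means: shift the K latent coordinates by +/-1.
mu = np.where(y[:, None] == 1, 1.0, -1.0) * np.ones((n, K))
Z = mu + rng.normal(size=(n, K))             # latent factors
X = Z @ A.T + rng.normal(size=(n, p))        # observed features, dim p >> K

# Two-step classifier: PCA projection followed by LDA, with the number of
# retained PCs selected by 5-fold cross-validation.
pipe = Pipeline([("pca", PCA()), ("lda", LinearDiscriminantAnalysis())])
search = GridSearchCV(pipe, {"pca__n_components": [1, 2, 3, 5, 10]}, cv=5)
search.fit(X, y)
print(search.best_params_)
```

Since the features are driven by only K = 3 latent factors, a small number of leading PCs typically captures the discriminative structure even though p greatly exceeds n.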
Recommendations
- High-Dimensional Discriminant Analysis
- Optimal Feature Selection in High-Dimensional Discriminant Analysis
- High Dimensional Linear Discriminant Analysis: Optimality, Adaptive Algorithm and Missing Data
- High-dimensional linear discriminant analysis using nonparametric methods
- Factorial discriminant analysis and probabilistic models
- The Dantzig discriminant analysis with high dimensional data
- High dimensional discrimination analysis via a semiparametric model
- Bayesian discriminant analysis using a high dimensional predictor
- Dynamic linear discriminant analysis in high dimensional space
Keywords: dimension reduction; discriminant analysis; optimal rate of convergence; latent factor model; principal component regression; high-dimensional classification
Cites Work
- Penalized Classification using Fisher’s Linear Discriminant
- Multiclass Sparse Discriminant Analysis
- Forecasting Using Principal Components From a Large Number of Predictors
- Forecasting economic time series using targeted predictors
- Functional Classification in Hilbert Spaces
- Double/debiased machine learning for treatment and structural parameters
- Large Covariance Estimation by Thresholding Principal Orthogonal Complements
- Prediction by Supervised Principal Components
- Sufficient forecasting using factor models
- High-dimensional classification using features annealed independence rules
- Sparse models and methods for optimal instruments with an application to eminent domain
- A direct approach to sparse discriminant analysis in ultra-high dimensions
- A Direct Estimation Approach to Sparse Linear Discriminant Analysis
- Modern Multivariate Statistical Techniques
- Sparse linear discriminant analysis by thresholding for high dimensional data
- A convex optimization approach to high-dimensional sparse quadratic discriminant analysis
- Statistical analysis of factor models of high dimension
- Dimension reduction strategies for analyzing global gene expression data with a response
- Title not available
- PLS Dimension Reduction for Classification with Microarray Data
- Optimal aggregation of classifiers in statistical learning.
- Minimax sparse principal subspace estimation in high dimensions
- Classifiers of support vector machine type with \(\ell_1\) complexity regularization
- Support vector machines with a reject option
- Dimension Reduction for Classification with Gene Expression Microarray Data
- Supervised principal component analysis: visualization, classification and regression on subspaces and submanifolds
- High Dimensional Linear Discriminant Analysis: Optimality, Adaptive Algorithm and Missing Data
- Adaptive estimation in structured factor models with applications to overlapping clustering
- Classification with many classes: challenges and pluses
- Adaptive estimation of the rank of the coefficient matrix in high-dimensional multivariate response regression models
- Title not available
- Partial Factor Modeling: Predictor-Dependent Shrinkage for Linear Regression
- Optimal discriminant analysis in high-dimensional latent factor models
Cited In (3)