Sure independence screening for ultrahigh dimensional feature space. With discussion and authors' reply
From MaRDI portal
Publication: 4632602
Abstract: Variable selection plays an important role in high dimensional statistical modeling, which nowadays appears in many areas and is key to various scientific discoveries. For problems of large scale or dimensionality, estimation accuracy and computational cost are two top concerns. In a recent paper, Candes and Tao (2007) propose the Dantzig selector using \(L_1\)-regularization and show that it achieves the ideal risk up to a logarithmic factor \(\log p\). Their innovative procedure and remarkable result are challenged when the dimensionality is ultrahigh, as the factor \(\log p\) can be large and their uniform uncertainty principle can fail. Motivated by these concerns, we introduce the concept of sure screening and propose a sure screening method based on correlation learning, called Sure Independence Screening (SIS), to reduce dimensionality from high to a moderate scale that is below the sample size. In a fairly general asymptotic framework, correlation learning is shown to have the sure screening property even for exponentially growing dimensionality. As a methodological extension, an iterative SIS (ISIS) is also proposed to enhance its finite sample performance. With dimensionality reduced accurately from high to below the sample size, variable selection can be improved in both speed and accuracy, and can then be accomplished by a well-developed method such as SCAD, the Dantzig selector, the Lasso, or the adaptive Lasso. The connections between these penalized least-squares methods are also elucidated.
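The abstract describes the screening step only in words; the following is a minimal Python sketch of marginal correlation screening followed by a penalized fit, assuming a standard linear model with standardized columns. The function name `sis`, the toy data, and the Lasso tuning value are illustrative assumptions, not artifacts of the paper; the submodel size \(d = [n/\log n]\) is one choice the authors discuss.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sis(X, y, d):
    """Rank features by the magnitude of their marginal correlation with y
    and return the indices of the top d (a minimal SIS sketch)."""
    # Standardize columns so X^T y / n gives componentwise sample correlations.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    omega = Xs.T @ ys / len(y)                    # marginal correlations omega_j
    return np.argsort(np.abs(omega))[-d:][::-1]   # d largest |omega_j|, descending

# Illustrative toy data: n = 100 observations, p = 5000 features,
# with only the first 5 coefficients nonzero.
rng = np.random.default_rng(0)
n, p = 100, 5000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0
y = X @ beta + rng.standard_normal(n)

# Screen down to d = n/log(n) features, then apply a penalized method
# (here the Lasso) to the survivors, as the abstract suggests.
keep = sis(X, y, d=int(n / np.log(n)))
fit = Lasso(alpha=0.1).fit(X[:, keep], y)        # alpha is an illustrative tuning value
```

The iterative extension (ISIS) mentioned in the abstract would, roughly, refit after screening and reapply the screen to what the selected variables fail to explain; that refinement is omitted from this sketch.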
Recommendations
- Sure independence screening in generalized linear models with NP-dimensionality
- Ultrahigh dimensional feature selection: beyond the linear model
- Factor profiled sure independence screening
- High-dimensional variable selection
- The Dantzig selector: statistical estimation when \(p\) is much larger than \(n\) (with discussions and rejoinder)
Cites work
- scientific article; zbMATH DE number 5957408
- scientific article; zbMATH DE number 3988038
- scientific article; zbMATH DE number 51418
- scientific article; zbMATH DE number 1347881
- scientific article; zbMATH DE number 1034042
- scientific article; zbMATH DE number 845714
- A Statistical View of Some Chemometrics Regression Tools
- A decision-theoretic generalization of on-line learning and an application to boosting
- A limit theorem for the norm of random matrices
- Approximation and learning by greedy algorithms
- Asymptotic properties of bridge estimators in sparse high-dimensional regression models
- Asymptotics for Lasso-type estimators
- Best subset selection, persistence in high-dimensional statistical learning and optimization under \(l_1\) constraint
- Better Subset Regression Using the Nonnegative Garrote
- Comments on: ``Wavelets in statistics: a review'' by A. Antoniadis
- Deviation Inequalities on Largest Eigenvalues
- Geometric Representation of High Dimension, Low Sample Size Data
- Heuristics of instability and stabilization in model selection
- High-dimensional classification using features annealed independence rules
- High-dimensional graphs and variable selection with the Lasso
- Ideal spatial adaptation by wavelet shrinkage
- Least angle regression (with discussion)
- Limit of the smallest eigenvalue of a large dimensional sample covariance matrix
- Local Strong Homogeneity of a Regularized Estimator
- Nonconcave penalized likelihood with a diverging number of parameters
- On the distribution of the largest eigenvalue in principal components analysis
- Optimally sparse representation in general (nonorthogonal) dictionaries via \(\ell_1\) minimization
- Pathwise coordinate optimization
- Persistence in high-dimensional linear predictor-selection and the virtue of overparametrization
- Regularization of Wavelet Approximations
- Regularized estimation of large covariance matrices
- Rejoinder: One-step sparse estimates in nonconcave penalized likelihood models
- Relaxed Lasso
- Simultaneous analysis of Lasso and Dantzig selector
- Some theory for Fisher's linear discriminant function, `naive Bayes', and some alternatives when there are many more variables than observations
- Sparsistency and rates of convergence in large covariance matrix estimation
- Statistical challenges with high dimensionality: feature selection in knowledge discovery
- Statistical significance for genomewide studies
- Statistics on special manifolds
- The Adaptive Lasso and Its Oracle Properties
- The Dantzig selector: statistical estimation when \(p\) is much larger than \(n\) (with discussions and rejoinder)
- The Group Lasso for Logistic Regression
- The concentration of measure phenomenon
- The smallest eigenvalue of a large dimensional Wishart matrix
- The sparsity and bias of the LASSO selection in high-dimensional linear regression
- Uncertainty principles and ideal atomic decomposition
- Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
- Variable selection for Cox's proportional hazards model and frailty model
- Variable selection using MM algorithms
- Weak convergence and empirical processes. With applications to statistics
- ``Preconditioning'' for feature selection and regression in high-dimensional problems
Cited in
(showing the first 100 items)
- Feature Screening for Massive Data Analysis by Subsampling
- Efficient Sparse Estimate of Sufficient Dimension Reduction in High Dimension
- Goodness-of-fit testing-based selection for large-\(p\)-small-\(n\) problems: a two-stage ranking approach
- Regression with outlier shrinkage
- Discussion of: ``Grouping strategies and thresholding for high dimension linear models''
- Tests for high-dimensional single-index models
- Variable selection for high-dimensional quadratic Cox model with application to Alzheimer's disease
- Grouped variable screening for ultra-high dimensional data for linear model
- Sure independence screening for analyzing supersaturated designs
- Bayesian bridge quantile regression
- A shrinkage principle for heavy-tailed data: high-dimensional robust low-rank matrix recovery
- Conditional distance correlation screening for sparse ultrahigh-dimensional models
- An attention algorithm for solving large scale structured \(l_0\)-norm penalty estimation problems
- Forward variable selection for sparse ultra-high-dimensional generalized varying coefficient models
- Fitting sparse linear models under the sufficient and necessary condition for model identification
- Maximum-type tests for high-dimensional regression coefficients using Wilcoxon scores
- On consistency and sparsity for sliced inverse regression in high dimensions
- Feature screening in ultrahigh-dimensional varying-coefficient Cox model
- Derandomizing Knockoffs
- Orthogonal one step greedy procedure for heteroscedastic linear models
- Non-marginal feature screening for additive hazard model with ultrahigh-dimensional covariates
- High-dimensional inference: confidence intervals, \(p\)-values and R-software \texttt{hdi}
- One-step sparse estimates in the reverse penalty for high-dimensional correlated data
- Robust sure independence screening for nonpolynomial dimensional generalized linear models
- Penalized \(M\)-estimation based on standard error adjusted adaptive elastic-net
- Optimal Treatment Regimes: A Review and Empirical Comparison
- A selective overview of feature screening for ultrahigh-dimensional data
- Are discoveries spurious? Distributions of maximum spurious correlations and their applications
- A model-averaging method for high-dimensional regression with missing responses at random
- Feature screening for ultrahigh-dimensional censored data with varying coefficient single-index model
- Variable selection for high dimensional Gaussian copula regression model: an adaptive hypothesis testing procedure
- Variable screening for ultrahigh dimensional censored quantile regression
- Model selection for high-dimensional quadratic regression via regularization
- Stock return predictability: a factor-augmented predictive regression system with shrinkage method
- Adjusted feature screening for ultra-high dimensional missing response
- On the Use of Minimum Penalties in Statistical Learning
- Conditional characteristic feature screening for massive imbalanced data
- Support vector machine in ultrahigh-dimensional feature space
- Nonparametric augmented probability weighting with sparsity
- High-dimensional model averaging for quantile regression
- Ranking-based variable selection for high-dimensional data
- A new approach for ultrahigh dimensional precision matrix estimation
- Feature screening via concordance indices for left-truncated and right-censored survival data
- An iterative approach to distance correlation-based sure independence screening
- Lassoing the determinants of retirement
- A note of feature screening via a rank-based coefficient of correlation
- A simple model-free survival conditional feature screening
- Optimal directional statistic for general regression
- Model selection using mass-nonlocal prior
- Covariance-insured screening
- Feature screening in ultrahigh-dimensional partially linear models with missing responses at random
- High-dimensional causal mediation analysis based on partial linear structural equation models
- Regression adjustment for treatment effect with multicollinearity in high dimensions
- Forward variable selection for ultra-high dimensional quantile regression models
- High-Dimensional Interaction Detection With False Sign Rate Control
- The Kendall interaction filter for variable interaction screening in high dimensional classification problems
- A method for selecting the relevant dimensions for high-dimensional classification in singular vector spaces
- Operator-induced structural variable selection for identifying materials genes
- Regularized zero-variance control variates
- Partial correlation screening for varying coefficient models
- UPS delivers optimal phase diagram in high-dimensional variable selection
- A model-free conditional screening approach via sufficient dimension reduction
- Model-free global likelihood subsampling for massive data
- A nonparametric empirical Bayes approach to large-scale multivariate regression
- Hybrid safe-strong rules for efficient optimization in Lasso-type problems
- Partition-based feature screening for categorical data via RKHS embeddings
- Model-free variable selection for conditional mean in regression
- A scalable surrogate \(L_0\) sparse regression method for generalized linear models with applications to large scale data
- Group orthogonal greedy algorithm for change-point estimation of multivariate time series
- Model selection for high-dimensional linear regression with dependent observations
- A modified mean-variance feature-screening procedure for ultrahigh-dimensional discriminant analysis
- Change-point detection in multinomial data with a large number of categories
- Covariate assisted screening and estimation
- Asymptotics of AIC, BIC and \(C_p\) model selection rules in high-dimensional regression
- Sure independence screening for real medical Poisson data
- Two-sample spatial rank test using projection
- Testing covariates in high dimension linear regression with latent factors
- A general framework for penalized mixed-effects multitask learning with applications on DNA methylation surrogate biomarkers creation
- An Approximated Collapsed Variational Bayes Approach to Variable Selection in Linear Regression
- A stepwise regression algorithm for high-dimensional variable selection
- Fast robust feature screening for ultrahigh-dimensional varying coefficient models
- ARGONAUT: algorithms for global optimization of constrained grey-box computational problems
- Group feature screening via the F statistic
- Partial sufficient variable screening with categorical controls
- Are Latent Factor Regression and Sparse Regression Adequate?
- Confidence intervals for low dimensional parameters in high dimensional linear models
- Random projections as regularizers: learning a linear discriminant from fewer observations than dimensions
- Modified SCAD penalty for constrained variable selection problems
- Nonconvex penalized ridge estimations for partially linear additive models in ultrahigh dimension
- Feature elimination in kernel machines in moderately high dimensions
- An RKHS model for variable selection in functional linear regression
- Integrating Multisource Block-Wise Missing Data in Model Selection
- Block-diagonal precision matrix regularization for ultra-high dimensional data
- Model-free conditional independence feature screening for ultrahigh dimensional data
- Model-free feature screening for ultrahigh dimensional censored regression
- Revisiting feature selection for linear models with FDR and power guarantees
- Asymptotic properties of high-dimensional random forests
- Sufficient variable screening with high-dimensional controls
- Dynamic tilted current correlation for high dimensional variable screening
- Feature Screening with Conditional Rank Utility for Big-Data Classification