"Preconditioning" for feature selection and regression in high-dimensional problems

DOI 10.1214/009053607000000578

MaRDI QID Q939656

Authors Debashis Paul, Eric Bair, Robert Tibshirani, Trevor Hastie

Publication date 28 August 2008

Published in The Annals of Statistics

Full work available at URL https://arxiv.org/abs/math/0703858

Nonparametric regression and quantile regression (62G08)
Asymptotic properties of nonparametric inference (62G20)
Factor analysis and principal components; correspondence analysis (62H25)
Ridge regression; shrinkage estimators (Lasso) (62J07)

Abstract: We consider regression problems where the number of predictors greatly exceeds the number of observations. We propose a method for variable selection that first estimates the regression function, yielding a "pre-conditioned" response variable. The primary method used for this initial regression is supervised principal components. Then we apply a standard procedure such as forward stepwise selection or the LASSO to the pre-conditioned response variable. In a number of simulated and real data examples, this two-step procedure outperforms forward stepwise selection or the usual LASSO (applied directly to the raw outcome). We also show that under a certain Gaussian latent variable model, application of the LASSO to the pre-conditioned response variable is consistent as the number of predictors and observations increases. Moreover, when the observational noise is rather large, the suggested procedure can give a more accurate estimate than LASSO. We illustrate our method on some real problems, including survival analysis with microarray data.
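The two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the function names (`precondition`, `lasso`), the parameters (`n_keep`, `lam`), the toy latent-variable data, and the choice to keep a fixed number of top-scoring features are all assumptions made here, and the tuning parameters are fixed by hand rather than chosen from data.

```python
import math
import random

def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def first_pc_scores(cols, iters=200):
    """Scores of the leading principal component of centered columns (power iteration)."""
    n, k = len(cols[0]), len(cols)
    project = lambda u: [sum(u[j] * cols[j][i] for j in range(k)) for i in range(n)]
    u = [1.0 / math.sqrt(k)] * k                    # initial loading vector
    for _ in range(iters):
        s = project(u)
        u = [dot(c, s) for c in cols]               # apply X^T X to the loadings
        norm = math.sqrt(dot(u, u)) or 1.0
        u = [a / norm for a in u]
    return project(u)

def precondition(cols, y, n_keep):
    """Step 1 (assumed form): screen features by univariate association with y,
    take the first principal component of the survivors, regress y on it."""
    yc = center(y)
    scores = [abs(dot(c, yc)) / math.sqrt(dot(c, c)) for c in cols]
    keep = sorted(range(len(cols)), key=lambda j: -scores[j])[:n_keep]
    pc = first_pc_scores([cols[j] for j in keep])
    slope = dot(pc, yc) / dot(pc, pc)
    y_mean = sum(y) / len(y)
    return [y_mean + slope * a for a in pc], keep   # fitted values = preconditioned response

def soft(z, g):
    return math.copysign(max(abs(z) - g, 0.0), z)

def lasso(cols, y, lam, sweeps=200):
    """Step 2: coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1 (centered data)."""
    n = len(y)
    beta = [0.0] * len(cols)
    resid = center(y)
    for _ in range(sweeps):
        for j, c in enumerate(cols):
            cc = dot(c, c)
            if cc == 0.0:
                continue                            # skip constant columns
            rho = dot(c, resid) + cc * beta[j]      # column j against its partial residual
            new = soft(rho / n, lam) / (cc / n)
            resid = [r - (new - beta[j]) * a for r, a in zip(resid, c)]
            beta[j] = new
    return beta

# Toy data (hypothetical): one latent factor drives the first 5 of 60 features and y.
random.seed(0)
n, p = 30, 60
latent = [random.gauss(0, 1) for _ in range(n)]
cols = [center([(1.0 if j < 5 else 0.0) * latent[i] + random.gauss(0, 1)
                for i in range(n)]) for j in range(p)]
y = [2.0 * latent[i] + random.gauss(0, 2) for i in range(n)]

y_hat, kept = precondition(cols, y, n_keep=10)      # step 1: preconditioned response
beta = lasso(cols, y_hat, lam=0.1)                  # step 2: lasso on y_hat, all features
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("kept by screening:", sorted(kept))
print("selected by lasso:", selected)
```

Note that the second step runs the LASSO over all predictors with the preconditioned response `y_hat` in place of the raw outcome `y`; the screening in the first step only shapes the response and does not restrict which features the LASSO may pick.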

