High-dimensional regression and variable selection using CAR scores
From MaRDI portal
Abstract: Variable selection is a difficult problem that is particularly challenging in the analysis of high-dimensional genomic data. Here, we introduce the CAR score, a novel and highly effective criterion for variable ranking in linear regression based on Mahalanobis decorrelation of the explanatory variables. The CAR score provides a canonical ordering that encourages grouping of correlated predictors and down-weights antagonistic variables. It decomposes the proportion of variance explained and is intermediate between the marginal correlation and the standardized regression coefficient. Because the CAR score is a population quantity, any preferred inference scheme can be applied to estimate it. Using simulations, we demonstrate that variable selection by CAR scores is very effective and yields prediction errors and true and false positive rates that compare favorably with modern regression techniques such as the elastic net and boosting. We illustrate our approach by analyzing data on diabetes progression and on the effect of aging on gene expression in the human brain. The R package "care" implementing CAR score regression is available from CRAN.
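The two properties stated in the abstract (decorrelation of the predictors and decomposition of the explained variance) can be illustrated with a minimal sketch. It assumes the usual definition from the CAR-score literature, omega = P^(-1/2) P_Xy, where P is the correlation matrix of the predictors and P_Xy the vector of marginal correlations with the response. That formula is not spelled out on this page. The sketch uses plain empirical correlations, which is only sensible when there are more samples than variables, and it does not reproduce the estimators used by the "care" package. The function name `car_scores` and the simulated data are hypothetical.

```python
import numpy as np

def car_scores(X, y):
    """Empirical CAR scores, assuming omega = P^(-1/2) @ P_Xy."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    # Standardize predictors and response to mean 0, variance 1.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    P = Xs.T @ Xs / n       # correlation matrix of the predictors
    p_xy = Xs.T @ ys / n    # marginal correlations with the response
    # Inverse matrix square root of P via its eigendecomposition.
    evals, evecs = np.linalg.eigh(P)
    P_inv_sqrt = (evecs / np.sqrt(evals)) @ evecs.T
    return P_inv_sqrt @ p_xy

# Simulated data (illustrative only): predictors 0 and 1 are correlated.
rng = np.random.default_rng(0)
n, p = 200, 5
Z = rng.standard_normal((n, p))
X = Z.copy()
X[:, 1] = 0.8 * Z[:, 0] + 0.6 * Z[:, 1]
y = 2.0 * X[:, 0] - X[:, 2] + rng.standard_normal(n)

omega = car_scores(X, y)
ranking = np.argsort(-omega**2)   # variables ordered by squared CAR score
r2_car = float(np.sum(omega**2))  # squared CAR scores sum to R^2

# Cross-check against R^2 from ordinary least squares with an intercept.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ coef
centered = y - y.mean()
r2_ols = 1.0 - (resid @ resid) / (centered @ centered)

print("squared CAR scores:", np.round(omega**2, 3))
print("ranking:", ranking)
print("R^2 (CAR sum) vs R^2 (OLS):", r2_car, r2_ols)
```

Under the same assumed definition, replacing the exponent -1/2 by 0 gives the marginal correlations and replacing it by -1 gives the standardized regression coefficients, which is the sense in which the CAR score lies between the two.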
Recommendations
- High-dimensional variable selection
- Variable selection methods in high-dimensional regression -- a simulation study
- High-dimensional regression in practice: an empirical study of finite-sample prediction, variable selection and ranking
- Selection of variables and dimension reduction in high-dimensional non-parametric regression
- Variable selection in high dimensional data analysis with applications
- Variable selection for high dimensional multivariate outcomes
- Variable selection in high-dimensional partially linear models
- Variable selection in high-dimensional sparse multiresponse linear regression models
- Consistent variable selection in high dimensional regression via multiple testing
- Variable selection in multivariate linear models with high-dimensional covariance matrix estimation
Cited in (10)
- Extensions of the absolute standardized hazard ratio and connections with measures of explained variation and variable importance
- Variable importance in regression models
- Finding causative genes from high-dimensional data: an appraisal of statistical and machine learning approaches
- Random subspace method for high-dimensional regression with the R package regRSM
- care
- Correlation-adjusted regression survival scores for high-dimensional variable selection
- Variable selection for censored data using modified correlation adjusted correlation (MCAR) scores
- Extending DFA-based multiple linear regression inference: application to acoustic impedance models
- High-dimensional linear model selection motivated by multiple testing
- Optimal Whitening and Decorrelation