Probing for sparse and fast variable selection with model-based boosting
From MaRDI portal
Abstract: We present a new variable selection method based on model-based gradient boosting and randomly permuted variables. Model-based boosting is a tool that fits a statistical model while performing variable selection at the same time. A drawback of the fitting procedure is the need for multiple model fits on slightly altered data (e.g. via cross-validation or the bootstrap) to find the optimal number of boosting iterations and prevent overfitting. In our proposed approach, we augment the data set with randomly permuted versions of the true variables, so-called shadow variables, and stop the step-wise fitting as soon as such a variable would be added to the model. This allows variable selection in a single model fit without any further parameter tuning. We show that our probing approach can compete with state-of-the-art selection methods such as stability selection in a high-dimensional classification benchmark, and we apply it to gene expression data for the estimation of riboflavin production by Bacillus subtilis.
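The probing idea from the abstract can be sketched in a few lines of code. The following is an illustrative Python sketch (not the authors' R/mboost implementation) of component-wise L2 boosting with simple least-squares base-learners: every column is duplicated as a randomly permuted shadow variable, and fitting stops the first time a shadow column would be selected. All function and variable names here are illustrative choices.

```python
import numpy as np

def probing_boosting(X, y, nu=0.1, max_iter=1000, seed=0):
    """Component-wise L2 boosting with permuted "shadow" variables.

    Illustrative sketch of the probing idea: each column of X is
    duplicated as a randomly permuted shadow variable, and fitting
    stops the first time a shadow column would be selected.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    shadows = rng.permuted(X, axis=0)      # permute each column independently
    Z = np.hstack([X, shadows])            # columns p..2p-1 are shadows
    Z = Z - Z.mean(axis=0)                 # center for intercept-free base-learners
    resid = y - y.mean()
    coef = np.zeros(p)
    for _ in range(max_iter):
        # least-squares fit of the residual on each column separately
        betas = (Z.T @ resid) / (Z ** 2).sum(axis=0)
        sse = ((resid[:, None] - Z * betas) ** 2).sum(axis=0)
        j = int(np.argmin(sse))            # best-fitting base-learner
        if j >= p:                         # a shadow variable won: stop
            break
        coef[j] += nu * betas[j]
        resid = resid - nu * betas[j] * Z[:, j]
    selected = np.flatnonzero(coef)
    return selected, coef
```

On data with a few strong signal variables, the loop absorbs the true effects first; once only noise remains, a shadow column is as likely to win as any real noise column, which triggers the stopping rule in a single model fit.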
Cites work
- scientific article; zbMATH DE number 845714 (no title available)
- doi:10.1162/153244303322753616
- doi:10.1162/153244303322753643
- A note on the Lasso and related procedures in model selection
- Additive logistic regression: a statistical view of boosting. (With discussion and a rejoinder by the authors)
- Boosting algorithms: regularization, prediction and model fitting
- Controlling Variable Selection by the Addition of Pseudovariables
- Generalized Additive Modeling with Implicit Variable Selection by Likelihood‐Based Boosting
- High-dimensional graphs and variable selection with the Lasso
- Least angle regression. (With discussion)
- Regularization and Variable Selection Via the Elastic Net
- Significance analysis of microarrays applied to the ionizing radiation response
- Stability Selection
- The asymptotic theory of permutation statistics.
- The elements of statistical learning. Data mining, inference, and prediction
- Variable Selection with Error Control: Another Look at Stability Selection
Cited in (13)
- Feature genes selection using supervised locally linear embedding and correlation coefficient for microarray classification
- An update on statistical boosting in biomedicine
- General sparse boosting: improving feature selection of \(L_{2}\) boosting by correlation-based penalty family
- Boosting Distributional Copula Regression
- Prediction-based variable selection for component-wise gradient boosting
- Significance tests for boosted location and scale models with linear base-learners
- Corrigendum to: ``Probing for sparse and fast variable selection with model-based boosting''
- A new variable selection approach and application based-on EBT model
- Component-wisely sparse boosting
- Boosting multivariate structured additive distributional regression models
- PBoostGA: pseudo-boosting genetic algorithm for variable ranking and selection
- Accelerated Componentwise Gradient Boosting Using Efficient Data Representation and Momentum-Based Optimization
- EEBoost: a general method for prediction and variable selection based on estimating equations