Variable selection for BART: an application to gene regulation
Abstract: We consider the task of discovering gene regulatory networks, which are defined as sets of genes and the corresponding transcription factors which regulate their expression levels. This can be viewed as a variable selection problem, potentially with high dimensionality. Variable selection is especially challenging in high-dimensional settings, where it is difficult to detect subtle individual effects and interactions between predictors. Bayesian Additive Regression Trees [BART, Ann. Appl. Stat. 4 (2010) 266-298] provides a novel nonparametric alternative to parametric regression approaches, such as the lasso or stepwise regression, especially when the number of relevant predictors is sparse relative to the total number of available predictors and the fundamental relationships are nonlinear. We develop a principled permutation-based inferential approach for determining when the effect of a selected predictor is likely to be real. Going further, we adapt the BART procedure to incorporate informed prior information about variable importance. We present simulations demonstrating that our method compares favorably to existing parametric and nonparametric procedures in a variety of data settings. To demonstrate the potential of our approach in a biological context, we apply it to the task of inferring the gene regulatory network in yeast (Saccharomyces cerevisiae). We find that our BART-based procedure is best able to recover the subset of covariates with the largest signal compared to other variable selection methods. The methods developed in this work are readily available in the R package bartMachine.
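The permutation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use bartMachine: as a loudly stated assumption, BART's variable inclusion proportions are replaced by a simple stand-in (squared Pearson correlations normalized to sum to 1), and all function names and the simulated data are hypothetical. The sketch permutes the response to build a null distribution of each predictor's importance, then keeps predictors whose observed importance exceeds roughly the (1 - alpha) quantile of their own null distribution.

```python
import random
from statistics import mean


def importance_proportions(X, y):
    """Stand-in importance measure (assumption, NOT BART):
    squared Pearson correlation of each column with y, normalized to sum to 1."""
    y_bar = mean(y)
    y_dev = [v - y_bar for v in y]
    y_ss = sum(d * d for d in y_dev)
    scores = []
    for col in zip(*X):  # iterate over predictor columns
        c_bar = mean(col)
        c_dev = [v - c_bar for v in col]
        c_ss = sum(d * d for d in c_dev)
        cov = sum(a * b for a, b in zip(c_dev, y_dev))
        scores.append(cov * cov / (c_ss * y_ss) if c_ss > 0 and y_ss > 0 else 0.0)
    total = sum(scores)
    return [s / total if total > 0 else 0.0 for s in scores]


def permutation_select(X, y, n_perm=200, alpha=0.05, seed=0):
    """Select predictors whose observed importance exceeds the approximate
    (1 - alpha) empirical quantile of their own permutation null distribution."""
    rng = random.Random(seed)
    observed = importance_proportions(X, y)
    null = [[] for _ in observed]
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # permuting y breaks any link between X and y
        for j, p in enumerate(importance_proportions(X, y_perm)):
            null[j].append(p)
    selected = []
    for j, obs in enumerate(observed):
        draws = sorted(null[j])
        cutoff = draws[min(len(draws) - 1, int((1 - alpha) * len(draws)))]
        if obs > cutoff:
            selected.append(j)
    return selected, observed


# Hypothetical simulated data: 20 predictors, only columns 0 and 3 carry signal.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(20)] for _ in range(200)]
y = [2 * row[0] + 2 * row[3] + data_rng.gauss(0, 1) for row in X]

selected, observed = permutation_select(X, y)
print("selected predictors:", selected)
```

In a BART setting, the call to the stand-in measure would be replaced by refitting the tree ensemble on each permuted response and recording how often each predictor is used in splitting rules, which is considerably more expensive per permutation.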
Recommendations
- BART: Bayesian additive regression trees
- Bayesian variable selection and data integration for biological regulatory networks
- Bayesian nonlinear model selection for gene regulatory networks
- Bayesian regression trees for high-dimensional prediction and variable selection
- A Bayesian nonparametric mixture model for selecting genes and gene subnetworks
Cites work
- scientific article; zbMATH DE number 1906319
- scientific article; zbMATH DE number 845714
- A Biometrics Invited Paper. The Analysis and Selection of Variables in Linear Regression
- An algorithm for information structuring and retrieval
- BART: Bayesian additive regression trees
- Bayesian Variable Selection in Linear Regression
- Bayesian lasso regression
- Bayesian variable selection and data integration for biological regulatory networks
- Bayesian variable selection in structured high-dimensional covariate spaces with applications in genomics
- Dynamic trees for learning and design
- EMVS: the EM approach to Bayesian variable selection
- Evolutionary stochastic search for Bayesian model exploration
- Monte Carlo sampling methods using Markov chains and their applications
- Multivariate adaptive regression splines
- Prediction with missing data via Bayesian additive regression trees
- Random forests
- Regularization and Variable Selection Via the Elastic Net
- Shotgun Stochastic Search for “Large p” Regression
- Spike and slab variable selection: frequentist and Bayesian strategies
- Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images
- Stochastic gradient boosting
- The Bayesian Lasso
- Variable selection and sensitivity analysis using dynamic trees, with an application to computer code performance tuning
- Variable selection for BART: an application to gene regulation
Cited in (28)
- Nonparametric failure time: time-to-event machine learning with heteroskedastic Bayesian additive regression trees and low information omnibus Dirichlet process mixtures
- BART-based inference for Poisson processes
- A new method for clustered survival data: estimation of treatment effect heterogeneity and variable selection
- Bayesian CART models for insurance claims frequency
- CR-Lasso: robust cellwise regularized sparse regression
- Gibbs Priors for Bayesian Nonparametric Variable Selection with Weak Learners
- An integrated Bayesian framework for multi-omics prediction and classification
- Log-linear Bayesian additive regression trees for multinomial logistic and count regression models
- Group variable selection via group sparse neural network
- Bayesian additive regression trees using Bayesian model averaging
- Prediction with missing data via Bayesian additive regression trees
- mBART: multidimensional monotone BART
- Variable selection for BART: an application to gene regulation
- Posterior concentration for Bayesian regression trees and forests
- BART: Bayesian additive regression trees
- Variable selection using Bayesian additive regression trees
- Bayesian neural networks for selection of drug sensitive genes
- bartMachine
- Nonlinear Variable Selection via Deep Neural Networks
- Variable Selection Via Thompson Sampling
- Bayesian regression trees for high-dimensional prediction and variable selection
- Uncertainty quantification for Bayesian CART
- Nowcasting in a pandemic using non-parametric mixed frequency VARs
- Operator-induced structural variable selection for identifying materials genes
- Bayesian phase II clinical trial design with noncompliance
- Performance of variable and function selection methods for estimating the nonlinear health effects of correlated chemical mixtures: a simulation study
- Nonparametric machine learning for precision medicine with longitudinal clinical trials and Bayesian additive regression trees with mixed models
- Bayesian neural tree models for nonparametric regression