Abstract: We develop a Bayesian "sum-of-trees" model where each tree is constrained by a regularization prior to be a weak learner, and fitting and inference are accomplished via an iterative Bayesian backfitting MCMC algorithm that generates samples from a posterior. Effectively, BART is a nonparametric Bayesian regression approach which uses dimensionally adaptive random basis elements. Motivated by ensemble methods in general, and boosting algorithms in particular, BART is defined by a statistical model: a prior and a likelihood. This approach enables full posterior inference including point and interval estimates of the unknown regression function as well as the marginal effects of potential predictors. By keeping track of predictor inclusion frequencies, BART can also be used for model-free variable selection. BART's many features are illustrated with a bake-off against competing methods on 42 different data sets, with a simulation experiment and on a drug discovery classification problem.
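As a sketch of the model the abstract describes, the sum-of-trees form can be written as follows. The notation is an assumption introduced here and does not appear on this page: \(m\) is the number of trees, \(T_j\) is the structure of the \(j\)-th tree, \(M_j\) is its set of leaf values, and \(g(x; T_j, M_j)\) is the leaf value that tree \(j\) assigns to \(x\).
\[
Y = \sum_{j=1}^{m} g(x; T_j, M_j) + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2).
\]
Under the same assumed notation, each backfitting step redraws one tree conditional on the partial residual left by the others:
\[
R_j = y - \sum_{k \neq j} g(x; T_k, M_k).
\]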
Recommendations
- Bayesian additive regression trees with model trees
- Log-linear Bayesian additive regression trees for multinomial logistic and count regression models
- Bayesian regression trees for high-dimensional prediction and variable selection
- Variable selection for BART: an application to gene regulation
- Bayesian treed models
Cites work
- scientific article; zbMATH DE number 527274
- A Bayesian CART algorithm
- A decision-theoretic generalization of on-line learning and an application to boosting
- A spatially-adjusted Bayesian additive regression tree model to merge two datasets
- Alternative models for stock price dynamics.
- An Efficient Method of Estimating Seemingly Unrelated Regressions and Tests for Aggregation Bias
- Bagging predictors
- Bayesian Analysis of Binary and Polychotomous Response Data
- Bayesian Inference on Network Traffic Using Link Count Data
- Bayesian backfitting. (With comments and a rejoinder).
- Greedy function approximation: A gradient boosting machine.
- Least angle regression. (With discussion)
- Multivariate adaptive regression splines
- Reversible jump Markov chain Monte Carlo computation and Bayesian model determination
- Statistical Analysis of Financial Data in S-Plus
- The Bayesian additive classification tree applied to credit risk modelling
Cited in
- Nonparametric machine learning for precision medicine with longitudinal clinical trials and Bayesian additive regression trees with mixed models
- Using BART to Perform Pareto Optimization and Quantify its Uncertainties
- Local Linear Forests
- Learning algorithms to evaluate forensic glass evidence
- A Generalized Estimating Equation Approach to Multivariate Adaptive Regression Splines
- Model guided adaptive design and analysis in computer experiment
- Bayesian additive machine: classification with a semiparametric discriminant function
- Estimating Individual Treatment Effect in Observational Data Using Random Forest Methods
- A High-Fidelity Model to Predict Length of Stay in the Neonatal Intensive Care Unit
- Influential Observations in Bayesian Regression Tree Models
- BART with targeted smoothing: an analysis of patient-specific stillbirth risk
- Spike-and-slab priors for function selection in structured additive regression models
- Cluster-specific variable selection for product partition models
- Bayesian nonparametric adjustment of confounding
- Operator-induced structural variable selection for identifying materials genes
- Robust discrete choice models with \(t\)-distributed kernel errors
- Nowcasting in a pandemic using non-parametric mixed frequency VARs
- Discussion of PENCOMP
- A semiparametric modeling approach using Bayesian additive regression trees with an application to evaluate heterogeneous treatment effects
- Subgroup causal effect identification and estimation via matching tree
- Variance prior forms for high-dimensional Bayesian variable selection
- Comment: Contributions of model features to BART causal inference performance using ACIC 2016 competition data
- Estimating population average causal effects in the presence of non-overlap: the effect of natural gas compressor station exposure on cancer mortality
- Minimax optimal rates for Mondrian trees and forests
- Fair and Efficient Allocation of Scarce Resources Based on Predicted Outcomes: Implications for Homeless Service Delivery
- A penalized complexity prior for deep Bayesian transfer learning with application to materials informatics
- Adaptive Bayesian Sum of Trees Model for Covariate-Dependent Spectral Analysis
- Heteroscedastic BART via Multiplicative Regression Trees
- A simple approach for local and global variable importance in nonlinear regression models
- Bayesian synthesis: combining subjective analyses, with an application to ozone data
- Bayesian regression trees for high-dimensional prediction and variable selection
- Comment
- Targeted smooth Bayesian causal forests: an analysis of heterogeneous treatment effects for simultaneous vs. interval medical abortion regimens over gestation
- Matching on-the-fly: sequential allocation with higher power and efficiency
- Entity resolution with empirically motivated priors
- Variable prioritization in nonlinear black box methods: a genetic association case study
- Smoothing and adaptation of shifted Pólya tree ensembles
- Comments on: ``A random forest guided tour''
- Balancing vs modeling approaches to weighting in practice
- Analyzing stochastic computer models: a review with opportunities
- The dependent Dirichlet process and related models
- Learning certifiably optimal rule lists for categorical data
- Modeling threshold interaction effects through the logistic classification trunk
- Preserving data utility via BART
- Type I Tobit Bayesian additive regression trees for censored outcome regression
- Heterogeneous treatment effect-based random forest: HTERF
- Variable selection using Bayesian additive regression trees
- Estimating a causal exposure response function with a continuous error-prone exposure: a study of fine particulate matter and all-cause mortality
- An evolutionary estimation procedure for generalized semilinear regression trees
- Optimization on Manifolds via Graph Gaussian Processes
- Bayesian CART models for insurance claims frequency
- A Mass-Shifting Phenomenon of Truncated Multivariate Normal Priors
- Conformal Sensitivity Analysis for Individual Treatment Effects
- Heterogeneous Distributed Lag Models to Estimate Personalized Effects of Maternal Exposures to Air Pollution
- Deep learning for ranking response surfaces with applications to optimal stopping problems
- Monotonic effects of characteristics on returns
- Bayesian nonparametric regression with varying residual density
- An adaptive sampling scheme guided by BART -- with an application to predict processor performance
- Adaptive-modal Bayesian nonparametric regression
- Bayesian regression tree models for causal inference: regularization, confounding, and heterogeneous effects (with discussion)
- Bayesian Deep Net GLM and GLMM
- Energy bagging tree
- Biomarker-driven adaptive design
- Sequential design for ranking response surfaces
- Sequential design for computer experiments with a flexible Bayesian additive model
- Simultaneous dimension reduction and variable selection in modeling high dimensional data
- Dynamic treatment regimes with interference
- Do German economic research institutes publish efficient growth and inflation forecasts? A Bayesian analysis
- Tree models of Bayesian regression: a case study
- Improved inference for doubly robust estimators of heterogeneous treatment effects
- Bayesian neural networks for selection of drug sensitive genes
- Landmark-Warped Emulators for Models with Misaligned Functional Response
- Comparing the performance of statistical methods that generalize effect estimates from randomized controlled trials to much larger target populations
- Minimax-optimal nonparametric regression in high dimensions
- Estimating heterogeneous treatment effects versus building individualized treatment rules: connection and disconnection
- Variable selection via a multi-stage strategy
- Nonparametric Bayesian analysis of the compound Poisson prior for support boundary recovery
- Bayesian multiple response kernel regression model for high dimensional data and its practical applications in near infrared spectroscopy
- Model-guided adaptive sampling for Bayesian model selection
- Nonparametric failure time: time-to-event machine learning with heteroskedastic Bayesian additive regression trees and low information omnibus Dirichlet process mixtures
- Prior and posterior checking of implicit causal assumptions
- Hierarchical Bayesian bootstrap for heterogeneous treatment effect estimation
- Evaluation of the health impacts of the 1990 Clean Air Act Amendments using causal inference and machine learning
- Comparing emulation methods for a high-resolution storm surge model
- Quantifying uncertainty in online regression forests
- Posterior concentration for Bayesian regression trees and forests
- BART-based inference for Poisson processes
- Semiparametric analysis of clustered interval‐censored survival data using soft Bayesian additive regression trees (SBART)
- Inflection points in community-level homeless rates
- Functional horseshoe smoothing for functional trend estimation
- Climate, agriculture, and hunger: statistical prediction of undernourishment using nonlinear regression and data-mining techniques
- A Bayesian nonparametric model for zero‐inflated outcomes: Prediction, clustering, and causal estimation
- Bayesian hierarchical modeling of the HIV evolutionary response to therapy
- A shared latent process model to correct for preferential sampling in disease surveillance systems
- The Temporal Overfitting Problem with Applications in Wind Power Curve Modeling
- Scalable Bayesian nonparametric regression via a Plackett-Luce model for conditional ranks
- Clustering and Prediction With Variable Dimension Covariates
- Bayesian computation: a summary of the current state, and samples backwards and forwards
- Understanding the effect of contextual factors and decision making on team performance in Twenty20 cricket: an interpretable machine learning approach
- The how and why of Bayesian nonparametric causal inference
This page was built for publication: BART: Bayesian additive regression trees