Optimization of tree ensembles
From MaRDI portal
Abstract: Tree ensemble models such as random forests and boosted trees are among the most widely used and practically successful predictive models in applied machine learning and business analytics. Although such models have been used to make predictions based on exogenous, uncontrollable independent variables, they are increasingly being used to make predictions where the independent variables are controllable and are also decision variables. In this paper, we study the problem of tree ensemble optimization: given a tree ensemble that predicts some dependent variable using controllable independent variables, how should we set these variables so as to maximize the predicted value? We formulate the problem as a mixed-integer optimization problem. We theoretically examine the strength of our formulation, provide a hierarchy of approximate formulations with bounds on approximation quality, and exploit the structure of the problem to develop two large-scale solution methods, one based on Benders decomposition and one based on iteratively generating tree split constraints. We test our methodology on real data sets, including two case studies in drug design and customized pricing, and show that our methodology can efficiently solve large-scale instances to near or full optimality and outperforms solutions obtained by heuristic approaches. In our drug design case, we show how our approach can identify compounds that efficiently trade off predicted performance and novelty with respect to existing, known compounds. In our customized pricing case, we show how our approach can efficiently determine optimal store-level prices under a random forest model that delivers excellent predictive accuracy.
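To make the problem statement in the abstract concrete, here is a minimal sketch in Python. It is not the paper's mixed-integer formulation or either of its decomposition methods; it is plain exhaustive enumeration, which is only workable for toy instances because the number of candidate points grows exponentially with the number of variables. The tree encoding (nested tuples), the function names, and the two example trees are all assumptions made for illustration. The sketch relies on one fact: an ensemble of axis-aligned trees is piecewise constant, so it is enough to test one point per cell induced by the split thresholds.

```python
from itertools import product

# Hypothetical encoding: a tree is either a leaf (a float) or a split
# (var_index, threshold, left_subtree, right_subtree).
# A point goes left when x[var_index] <= threshold.

def predict_tree(tree, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        var, thr, left, right = tree
        tree = left if x[var] <= thr else right
    return tree

def predict_ensemble(trees, weights, x):
    """Weighted sum of the individual tree predictions."""
    return sum(w * predict_tree(t, x) for t, w in zip(trees, weights))

def collect_thresholds(tree, acc):
    """Gather every split threshold in the tree, grouped by variable."""
    if isinstance(tree, tuple):
        var, thr, left, right = tree
        acc.setdefault(var, set()).add(thr)
        collect_thresholds(left, acc)
        collect_thresholds(right, acc)
    return acc

def candidate_values(trees, n_vars):
    """One representative value per interval between consecutive thresholds."""
    acc = {}
    for t in trees:
        collect_thresholds(t, acc)
    cands = []
    for v in range(n_vars):
        thr = sorted(acc.get(v, []))
        if thr:
            # Each threshold covers the interval ending at it ("<=" side);
            # one extra point covers everything above the largest threshold.
            cands.append(thr + [thr[-1] + 1.0])
        else:
            cands.append([0.0])  # variable never used in a split
    return cands

def optimize_by_enumeration(trees, weights, n_vars):
    """Return (best_x, best_value) by checking every cell of the partition."""
    best_x, best_val = None, float("-inf")
    for x in product(*candidate_values(trees, n_vars)):
        val = predict_ensemble(trees, weights, x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Two made-up trees over two controllable variables x[0] and x[1].
trees = [
    (0, 2.0, 1.0, (1, 5.0, 4.0, 2.0)),
    (1, 3.0, (0, 4.0, 0.0, 3.0), 5.0),
]
weights = [0.5, 0.5]

best_x, best_val = optimize_by_enumeration(trees, weights, n_vars=2)
print(best_x, best_val)
```

Because the enumeration visits every combination of per-variable candidates, it becomes impractical quickly as variables and splits are added, which is the kind of scale at which an exact optimization formulation, as the abstract describes, becomes relevant.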
Recommendations
- Optimal classification trees
- Mathematical optimization in classification and regression trees
- Optimal decision trees for categorical data via integer programming
- Ensemble learning from model based trees with application to differential price sensitivity assessment
- Embedding decision trees and random forests in constraint programming
Cites work
- A random forest guided tour
- An integer optimization approach to large-scale air traffic flow management
- Bagging predictors
- Boosting. Foundations and algorithms.
- Computing in operations research using Julia
- Concave extensions for nonlinear 0-1 maximization problems
- Consistency of random forests
- Consistency of random forests and other averaging classifiers
- Do we need hundreds of classifiers to solve real world classification problems?
- Exact first-choice product line optimization
- Mixed integer linear programming formulation techniques
- OR forum: An algorithmic approach to linear regression
- Optimal classification trees
- Random forests
- The impact of linear optimization on promotion planning
Cited in (22)
- Piecewise linear trees as surrogate models for system design and planning under high-frequency temporal variability
- Gradient boosting for convex cone predict and optimize problems
- Mathematical optimization in classification and regression trees
- Computer Aided Verification
- Tight mixed-integer optimization formulations for prescriptive trees
- On enhancing the explainability and fairness of tree ensembles
- Constraint learning approaches to improve the approximation of the capacity consumption function in lot-sizing models
- A two-stage exact algorithm for optimization of neural network ensemble
- Injecting domain knowledge in neural networks: a controlled experiment on a constrained problem
- Embedding decision trees and random forests in constraint programming
- Ensemble learning from model based trees with application to differential price sensitivity assessment
- The role of optimization in some recent advances in data-driven decision-making
- Piecewise polyhedral relaxations of multilinear optimization
- On clustering and interpreting with rules by means of mathematical optimization
- Big data driven order-up-to level model: application of machine learning
- Optimizing over an Ensemble of Trained Neural Networks
- On optimizing ensemble models using column generation
- Global optimization: a machine learning approach
- Assortment optimization: a systematic literature review
- Toward Efficient Ensemble Learning with Structure Constraints: Convergent Algorithms and Applications
- Integration of machine constraint learning methods within optimization models
- Optimization over decision trees: a case study for the design of stable direct-current electricity networks
This page was built for publication: Optimization of tree ensembles (MaRDI item Q5144785)