Sparsity in optimal randomized classification trees
Abstract: Decision trees are popular classification and regression tools and, when small, are easy to interpret. Traditionally, trees have been built with a greedy approach, which makes training very fast; however, controlling sparsity (a proxy for interpretability) is challenging. In recent studies, optimal decision trees, in which all decisions are optimized simultaneously, have shown better learning performance, especially when oblique cuts are used. In this paper, we propose a continuous optimization approach to build sparse optimal classification trees based on oblique cuts, with the aim of using fewer predictor variables in each cut as well as across the whole tree. Both types of sparsity, namely local and global, are modeled by means of regularizations with polyhedral norms. The reported computational experience supports the usefulness of our methodology. In all our data sets, local and global sparsity can be improved without harming classification accuracy. We also show that, unlike greedy approaches, our method can easily trade some classification accuracy for a gain in global sparsity.
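To make the distinction between local and global sparsity concrete, the following is a minimal sketch of two polyhedral-norm penalties on the coefficients of oblique cuts. The abstract does not say which polyhedral norms are used, so the choices here are assumptions: an l1 norm over all coefficients for local sparsity, and an l-infinity norm taken per predictor across all branch nodes for global sparsity. The matrix `A`, the function names, and the tolerance `tol` are hypothetical and serve only as illustration.

```python
import numpy as np

# Hypothetical layout: A[j, t] is the coefficient of predictor j
# in the oblique cut at branch node t.

def local_penalty(A):
    """Assumed local regularizer: l1 norm of all cut coefficients.
    Pushes individual coefficients to zero, so each cut uses few predictors."""
    return float(np.abs(A).sum())

def global_penalty(A):
    """Assumed global regularizer: for each predictor, the l-infinity norm
    of its coefficients across all branch nodes, summed over predictors.
    A predictor contributes zero only if it is unused in every cut."""
    return float(np.abs(A).max(axis=1).sum())

def predictors_used(A, tol=1e-8):
    """Number of predictors with a nonzero coefficient somewhere in the tree."""
    return int(np.any(np.abs(A) > tol, axis=1).sum())

# Three predictors, three branch nodes (illustrative values only).
A = np.array([
    [0.5, -0.2, 0.0],
    [0.0,  0.0, 0.0],   # never used: contributes nothing to either penalty
    [-1.0, 0.0, 0.3],
])

print("local penalty:", local_penalty(A))
print("global penalty:", global_penalty(A))
print("predictors used in the tree:", predictors_used(A))
```

In a training objective of this kind, each penalty would be added to the classification loss with its own weight, and increasing the weight on the global term is what would trade accuracy for fewer predictors across the whole tree.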
Cites work
- scientific article; zbMATH DE number 3860199
- scientific article; zbMATH DE number 6438182
- A random forest guided tour
- Auction optimization using regression trees and linear models as integer programs
- Comprehensible credit scoring models using rule extraction from support vector machines
- Constructing optimal binary decision trees is NP-complete
- Do we need hundreds of classifiers to solve real world classification problems?
- Learning decision trees with flexible constraints and objectives using integer optimization
- On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming
- Operations research and data mining
- Optimal classification trees
- Optimal decision trees for categorical data via integer programming
- Optimal randomized classification trees
- Optimization approaches to supervised classification
- Pyomo -- optimization modeling in Python
- Random forests
- Supersparse linear integer models for optimized medical scoring systems
- Supervised classification and mathematical optimization
- The \(F_{\infty}\)-norm support vector machine
- Using neural network rule extraction and decision tables for credit-risk evaluation
Cited in (23)
- Oblique decision tree induction by cross-entropy optimization based on the von Mises-Fisher distribution
- Loss-optimal classification trees: a generalized framework and the logistic case
- On optimal regression trees to detect critical intervals for multivariate functional data
- On multivariate randomized classification trees: \(l_0\)-based sparsity, VC dimension and decomposition methods
- Treelets -- an adaptive multi-scale basis for sparse unordered data
- The backbone method for ultra-high dimensional sparse machine learning
- Mathematical optimization in classification and regression trees
- Margin optimal classification trees
- Optimal randomized classification trees
- Sparse projection oblique randomer forests
- Discussion of: Treelets -- an adaptive multi-scale basis for sparse unordered data
- Convergence Rates of Oblique Regression Trees for Flexible Function Libraries
- On sparse optimal regression trees
- Optimal multivariate decision trees
- On mathematical optimization for clustering categories in contingency tables
- Optimization problems for machine learning: a survey
- Proximal variable metric method with spectral diagonal update for large scale sparse optimization
- Interpretable clustering via soft clustering trees
- A pivot-based simulated annealing algorithm to determine oblique splits for decision tree induction
- On constrained smoothing and out-of-range prediction using \(P\)-splines: a conic optimization approach
- Sparse regression over clusters: SparClur
- On sparse ensemble methods: an application to short-term predictions of the evolution of COVID-19
- scientific article; zbMATH DE number 7625179
This page was built for publication: Sparsity in optimal randomized classification trees (MaRDI item Q2301963)