Node harvest
From MaRDI portal
Publication:542973
DOI: 10.1214/10-AOAS367 · zbMATH Open: 1220.62084 · arXiv: 0910.2145 · OpenAlex: W3037729706 · MaRDI QID: Q542973
Authors: Nicolai Meinshausen
Publication date: 20 June 2011
Published in: The Annals of Applied Statistics
Abstract: When choosing a suitable technique for regression and classification with multivariate predictor variables, one is often faced with a tradeoff between interpretability and high predictive accuracy. To give a classical example, classification and regression trees are easy to understand and interpret, while tree ensembles like Random Forests usually provide more accurate predictions. Yet tree ensembles are also more difficult to analyze than single trees and are often criticized, perhaps unfairly, as `black box' predictors. Node harvest tries to reconcile the two aims of interpretability and predictive accuracy by combining positive aspects of trees and tree ensembles. Results are very sparse and interpretable, and predictive accuracy is extremely competitive, especially for low signal-to-noise data. The procedure is simple: an initial set of a few thousand nodes is generated randomly. If a new observation falls into just a single node, its prediction is the mean response of all training observations within this node, identical to a tree-like prediction. A new observation typically falls into several nodes, however, and its prediction is then the weighted average of the mean responses across all these nodes. The only role of node harvest is to `pick' the right nodes from the initial large ensemble by choosing node weights, which amounts in the proposed algorithm to a quadratic programming problem with linear inequality constraints. The solution is sparse in the sense that only very few nodes are selected with a nonzero weight, and this sparsity is not explicitly enforced. Perhaps surprisingly, it is not necessary to select a tuning parameter for optimal predictive accuracy. Node harvest can handle mixed data and missing values and is shown to be simple to interpret and competitive in predictive accuracy on a variety of data sets.
Full work available at URL: https://arxiv.org/abs/0910.2145
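The prediction rule described in the abstract can be sketched in a few lines. This is a minimal illustration only: the random node generation and the quadratic program that selects the weights are omitted, and the `Node` class, `predict` function, and toy nodes below are assumptions for demonstration, not the author's implementation.

```python
# Sketch of the node-harvest prediction rule: a new observation is
# predicted by the weighted average of the mean responses of all
# (nonzero-weight) nodes it falls into. Names are illustrative.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    contains: Callable[[List[float]], bool]  # rectangular membership test
    mean: float    # mean response of training observations in the node
    weight: float  # weight chosen by the QP (sparse: mostly zero)


def predict(x: List[float], nodes: List[Node]) -> Optional[float]:
    """Weighted average of node means over the nodes containing x."""
    active = [n for n in nodes if n.weight > 0 and n.contains(x)]
    den = sum(n.weight for n in active)
    if den == 0:
        return None  # x falls into no selected node
    return sum(n.weight * n.mean for n in active) / den


# Toy example: two overlapping axis-aligned nodes in one dimension.
nodes = [
    Node(contains=lambda x: x[0] < 5.0, mean=1.0, weight=0.5),
    Node(contains=lambda x: x[0] >= 3.0, mean=3.0, weight=0.5),
]
print(predict([4.0], nodes))   # falls in both nodes -> 2.0
print(predict([1.0], nodes))   # only the first node -> 1.0
print(predict([10.0], nodes))  # only the second node -> 3.0
```

An observation in a single node recovers the tree-like prediction (the node mean); overlapping nodes blend their means, which is what the quadratic program's weight selection controls in the actual method.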
Classification (MSC):
- Classification and discrimination; cluster analysis (statistical aspects) (62H30)
- Quadratic programming (90C20)
Cites Work
- The elements of statistical learning. Data mining, inference, and prediction
- Predictive learning via rule ensembles
- Random survival forests
- Least angle regression. (With discussion)
- A numerically stable dual method for solving strictly convex quadratic programs
- Random forests
- Bagging predictors
- Random Forests and Adaptive Nearest Neighbors
- Additive logistic regression: a statistical view of boosting. (With discussion and a rejoinder by the authors)
- Boosting with the \(L_2\) loss
- Probabilistic Sensitivity Analysis of Complex Models: A Bayesian Approach
- Convexity, Classification, and Risk Bounds
- Atomic decomposition by basis pursuit
- Breast Cancer Diagnosis and Prognosis Via Linear Programming
- Forest Garrote
- Stacked regressions
Cited In (17)
- Mathematical optimization in classification and regression trees
- The Delaunay triangulation learner and its ensembles
- Supervised classification and mathematical optimization
- SIRUS: stable and interpretable RUle set for classification
- Selective harvesting over networks
- SUBiNN: a stacked uni- and bivariate \(k\)NN sparse ensemble
- On optimal regression trees to detect critical intervals for multivariate functional data
- Consistent regression using data-dependent coverings
- Ensemble of optimal trees, random forest and random projection ensemble classification
- Conclusive local interpretation rules for random forests
- Model transparency and interpretability: survey and application to the insurance industry
- Neural-symbolic temporal decision trees for multivariate time series classification
- Disjunctive Rule Lists
- Imputation Scores
- A decision-theoretic approach for model interpretability in Bayesian framework
- Learning customized and optimized lists of rules with mathematical programming
- Interpretable classifiers using rules and Bayesian analysis: building a better stroke prediction model
This page was built for publication: Node harvest