Measuring the algorithmic convergence of randomized ensembles: the regression setting
From MaRDI portal
Publication:5037548
Abstract: When randomized ensemble methods such as bagging and random forests are implemented, a basic question arises: is the ensemble large enough? In particular, the practitioner wants a rigorous guarantee that a given ensemble will perform nearly as well as an ideal infinite ensemble trained on the same data. The purpose of the current paper is to develop a bootstrap method for solving this problem in the context of regression, complementing our companion paper on classification (Lopes 2019). In contrast to the classification setting, the current paper shows that theoretical guarantees for the proposed bootstrap can be established under much weaker assumptions. In addition, we illustrate the flexibility of the method by showing how it can be adapted to measure algorithmic convergence for variable selection. Finally, we provide numerical results demonstrating that the method works well in a range of situations.
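To make the idea concrete, the following is a minimal sketch (not the paper's exact algorithm) of bootstrapping over base learners to estimate the algorithmic variance of a size-t ensemble: given the predictions of B independently randomized regressors, resample t of them with replacement many times and measure how much the resulting ensemble mean fluctuates. The function name, the toy data, and all parameter choices are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def bootstrap_algorithmic_variance(preds, t, n_boot=200, seed=None):
    """Bootstrap estimate of the algorithmic variance of a t-member ensemble.

    preds  : (B, n) array; row b holds the predictions of the b-th
             independently randomized base learner at n test points.
    t      : ensemble size whose variance we want to assess.
    Returns the bootstrap variance of the size-t ensemble mean,
    averaged over the n test points.
    """
    rng = np.random.default_rng(seed)
    B, n = preds.shape
    boot_means = np.empty((n_boot, n))
    for b in range(n_boot):
        # resample t base learners (with replacement) from the B available
        idx = rng.integers(0, B, size=t)
        boot_means[b] = preds[idx].mean(axis=0)
    # variance over bootstrap replicates, averaged across test points
    return boot_means.var(axis=0, ddof=1).mean()

# toy demo: "base learners" are noisy copies of a target function
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
preds = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.5, size=(500, 50))

v_small = bootstrap_algorithmic_variance(preds, t=10, seed=1)
v_large = bootstrap_algorithmic_variance(preds, t=200, seed=1)
print(v_small, v_large)
```

In this toy setup the algorithmic variance shrinks roughly like 1/t, so the estimate for t=200 comes out much smaller than for t=10; monitoring such a bootstrap estimate as trees are added is one way to decide, in the spirit of the paper, whether an ensemble is "large enough."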
Recommendations
- Estimating the algorithmic variance of randomized ensembles via the bootstrap
- Estimating a sharp convergence bound for randomized ensembles
- Standard errors for bagged and random forest estimators
- Quantifying uncertainty in random forests via confidence intervals and hypothesis tests
- How large should ensembles of classifiers be?
Cites work
- scientific article; zbMATH DE number 1726664 (title unavailable)
- scientific article; zbMATH DE number 6378123 (title unavailable)
- scientific article; zbMATH DE number 3860199 (title unavailable)
- scientific article; zbMATH DE number 718142 (title unavailable)
- A bootstrap method for error estimation in randomized matrix multiplication
- Analysis of a random forests model
- Analyzing bagging
- BART: Bayesian additive regression trees
- Bagging predictors
- Bootstrapping max statistics in high dimensions: near-parametric rates under weak variance decay and application to functional and multinomial data
- Comments on: "A random forest guided tour"
- Consistency of random forests
- Consistency of random forests and other averaging classifiers
- Correlation and variable importance in random forests
- Estimating the algorithmic variance of randomized ensembles via the bootstrap
- Extrapolation and the bootstrap
- Extrapolation methods theory and practice
- Extrapolation of subsampling distribution estimators: The i.i.d. and strong mixing cases
- How large should ensembles of classifiers be?
- Isoperimetry and integrability of the sum of independent Banach-space valued random variables
- On the asymptotics of random forests
- Online bootstrap confidence intervals for the stochastic gradient descent estimator
- Optimal weighted nearest neighbour classifiers
- Practical Extrapolation Methods
- Properties of Bagged Nearest Neighbour Classifiers
- Quantifying uncertainty in random forests via confidence intervals and hypothesis tests
- Random Forests and Adaptive Nearest Neighbors
- Random Forests and Kernel Methods
- Random forests
- Random rotation ensembles
- Random-projection ensemble classification. (With discussion).
- Richardson Extrapolation and the Bootstrap
- Scalable statistical inference for averaged implicit stochastic gradient descent
- Second-order properties of an extrapolated bootstrap without replacement under weak assumptions
- Standard errors for bagged and random forest estimators
- The elements of statistical learning. Data mining, inference, and prediction
- To tune or not to tune the number of trees in random forest
- Variable importance in binary regression trees and forests
- ggplot2. Elegant graphics for data analysis. With contributions by Carson Sievert
Cited in (5)
- Toward Efficient Ensemble Learning with Structure Constraints: Convergent Algorithms and Applications
- Estimating a sharp convergence bound for randomized ensembles
- Estimating the algorithmic variance of randomized ensembles via the bootstrap
- Learning with mitigating random consistency from the accuracy measure
- On a method for constructing ensembles of regression models