Measuring the algorithmic convergence of randomized ensembles: the regression setting
From MaRDI portal
Publication:5037548
Abstract: When randomized ensemble methods such as bagging and random forests are implemented, a basic question arises: is the ensemble large enough? In particular, the practitioner wants a rigorous guarantee that a given ensemble will perform nearly as well as an ideal infinite ensemble trained on the same data. The purpose of the current paper is to develop a bootstrap method for solving this problem in the context of regression, complementing our companion paper on classification (Lopes 2019). In contrast to the classification setting, the current paper shows that theoretical guarantees for the proposed bootstrap can be established under much weaker assumptions. In addition, we illustrate the flexibility of the method by showing how it can be adapted to measure algorithmic convergence for variable selection. Finally, we provide numerical results demonstrating that the method works well in a range of situations.
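To make the idea concrete, the following is a minimal sketch (not the paper's exact algorithm) of bootstrapping over base learners to estimate the algorithmic variance of a size-t ensemble: given the predictions of B independently randomized regressors, resample t of them with replacement many times and measure how much the resulting ensemble mean fluctuates. The function name, the toy data, and all parameter choices are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def bootstrap_algorithmic_variance(preds, t, n_boot=200, seed=None):
    """Bootstrap estimate of the algorithmic variance of a t-member ensemble.

    preds  : (B, n) array; row b holds the predictions of the b-th
             independently randomized base learner at n test points.
    t      : ensemble size whose variance we want to assess.
    Returns the bootstrap variance of the size-t ensemble mean,
    averaged over the n test points.
    """
    rng = np.random.default_rng(seed)
    B, n = preds.shape
    boot_means = np.empty((n_boot, n))
    for b in range(n_boot):
        # resample t base learners (with replacement) from the B available
        idx = rng.integers(0, B, size=t)
        boot_means[b] = preds[idx].mean(axis=0)
    # variance over bootstrap replicates, averaged across test points
    return boot_means.var(axis=0, ddof=1).mean()

# toy demo: "base learners" are noisy copies of a target function
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
preds = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.5, size=(500, 50))

v_small = bootstrap_algorithmic_variance(preds, t=10, seed=1)
v_large = bootstrap_algorithmic_variance(preds, t=200, seed=1)
print(v_small, v_large)
```

In this toy setup the algorithmic variance shrinks roughly like 1/t, so the estimate for t=200 comes out much smaller than for t=10; monitoring such a bootstrap estimate as trees are added is one way to decide, in the spirit of the paper, whether an ensemble is "large enough."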
Recommendations
- Estimating the algorithmic variance of randomized ensembles via the bootstrap
- Estimating a sharp convergence bound for randomized ensembles
- Standard errors for bagged and random forest estimators
- Quantifying uncertainty in random forests via confidence intervals and hypothesis tests
- How large should ensembles of classifiers be?
Cites work
- scientific article; zbMATH DE number 1726664 (title unavailable)
- scientific article; zbMATH DE number 6378123 (title unavailable)
- scientific article; zbMATH DE number 3860199 (title unavailable)
- scientific article; zbMATH DE number 718142 (title unavailable)
- A bootstrap method for error estimation in randomized matrix multiplication
- Analysis of a random forests model
- Analyzing bagging
- BART: Bayesian additive regression trees
- Bagging predictors
- Bootstrapping max statistics in high dimensions: near-parametric rates under weak variance decay and application to functional and multinomial data
- Comments on: "A random forest guided tour"
- Consistency of random forests
- Consistency of random forests and other averaging classifiers
- Correlation and variable importance in random forests
- Estimating the algorithmic variance of randomized ensembles via the bootstrap
- Extrapolation and the bootstrap
- Extrapolation methods theory and practice
- Extrapolation of subsampling distribution estimators: The i.i.d. and strong mixing cases
- How large should ensembles of classifiers be?
- Isoperimetry and integrability of the sum of independent Banach-space valued random variables
- On the asymptotics of random forests
- Online bootstrap confidence intervals for the stochastic gradient descent estimator
- Optimal weighted nearest neighbour classifiers
- Practical Extrapolation Methods
- Properties of Bagged Nearest Neighbour Classifiers
- Quantifying uncertainty in random forests via confidence intervals and hypothesis tests
- Random Forests and Adaptive Nearest Neighbors
- Random Forests and Kernel Methods
- Random forests
- Random rotation ensembles
- Random-projection ensemble classification. (With discussion).
- Richardson Extrapolation and the Bootstrap
- Scalable statistical inference for averaged implicit stochastic gradient descent
- Second-order properties of an extrapolated bootstrap without replacement under weak assumptions
- Standard errors for bagged and random forest estimators
- The elements of statistical learning. Data mining, inference, and prediction
- To tune or not to tune the number of trees in random forest
- Variable importance in binary regression trees and forests
- ggplot2. Elegant graphics for data analysis. With contributions by Carson Sievert
Cited in (5)
- Toward Efficient Ensemble Learning with Structure Constraints: Convergent Algorithms and Applications
- Estimating a sharp convergence bound for randomized ensembles
- Estimating the algorithmic variance of randomized ensembles via the bootstrap
- Learning with mitigating random consistency from the accuracy measure
- On a method for constructing ensembles of regression models