On the asymptotics of random forests
Publication:268730
DOI: 10.1016/J.JMVA.2015.06.009; zbMATH Open: 1337.62063; arXiv: 1409.2090; OpenAlex: W2220051020; MaRDI QID: Q268730; FDO: Q268730
Authors: Erwan Scornet
Publication date: 15 April 2016
Published in: Journal of Multivariate Analysis
Abstract: The last decade has witnessed a growing interest in random forest models, which are recognized to exhibit good practical performance, especially in high-dimensional settings. On the theoretical side, however, their predictive power remains largely unexplained, thereby creating a gap between theory and practice. The aim of this paper is twofold. Firstly, we provide theoretical guarantees to link finite forests used in practice (with a finite number \(M\) of trees) to their asymptotic counterparts. Using empirical process theory, we prove a uniform central limit theorem for a large class of random forest estimates, which holds in particular for Breiman's original forests. Secondly, we show that infinite forest consistency implies finite forest consistency, and thus we state the consistency of several infinite forests. In particular, we prove that \(q\)-quantile forests (close in spirit to Breiman's forests but easier to study) are able to combine inconsistent trees to obtain a final consistent prediction, thus highlighting the benefits of random forests compared to single trees.
Full work available at URL: https://arxiv.org/abs/1409.2090
Keywords: random forests; consistency; randomization; central limit theorem; empirical process; \(q\)-quantile; number of trees
Cites Work
- Random survival forests
- Functional data analysis
- Weak convergence and empirical processes. With applications to statistics
- Nonparametric functional data analysis. Theory and practice.
- Consistency of random forests
- Consistency of random forests and other averaging classifiers
- Title not available
- Inference for functional data with applications
- Random forests
- Bagging predictors
- Consistent nonparametric regression. Discussion
- Concentration inequalities. A nonasymptotic theory of independence
- Title not available
- Quantile regression forests
- Title not available
- Ranking forests
- A distribution-free theory of nonparametric regression
- Extremely randomized trees
- Contributions in infinite-dimensional statistics and related topics. Selected papers from the 3rd international workshop on functional and operatorial statistics (IWFOS'2014), Stresa, Italy, June 19--21, 2014
- Analysis of a random forests model
- An empirical comparison of ensemble methods based on classification trees
- Consistency of random survival forests
Cited In (41)
- Modeling of time series using random forests: theoretical developments
- Random forest estimation of conditional distribution functions and conditional quantiles
- Rates of convergence for random forests via generalized U-statistics
- Tuning parameters in random forests
- Mathematical optimization in classification and regression trees
- Minimax optimal rates for Mondrian trees and forests
- Smoothing and adaptation of shifted Pólya tree ensembles
- Bootstrap bias corrections for ensemble methods
- Quantile regression forests
- Consistency of random forests
- Model-Assisted Estimation Through Random Forests in Finite Population Sampling
- Consistency of random survival forests
- Quantifying uncertainty in random forests via confidence intervals and hypothesis tests
- A random forest guided tour
- Predictive Distribution Modeling Using Transformation Forests
- On PAC-Bayesian bounds for random forests
- Improved convergence rates for some kernel random forest algorithms
- Comments on: ``A random forest guided tour''
- Impact of subsampling and tree depth on random forests
- Nonunitarizable Representations and Random Forests
- Variance reduction in purely random forests
- Analysis of a random forests model
- Limit Distributions of the Height of a Random Forest
- Title not available
- A Random Forest Approach for Bounded Outcome Variables
- Medoid splits for efficient random forests in metric spaces
- On the limiting distribution of the metric dimension for random forests
- Consistent estimation of residual variance with random forest out-of-bag errors
- Random forests for time-dependent processes
- An introduction to recent advances in high/infinite dimensional statistics
- Generalized random forests
- A Study of Strength and Correlation in Random Forests
- Coalescent random forests
- A version of the random directed forest and its convergence to the Brownian web
- Critical random forests
- Title not available
- Measuring the algorithmic convergence of randomized ensembles: the regression setting
- Estimating the algorithmic variance of randomized ensembles via the bootstrap
- Towards convergence rate analysis of random forests for classification
- Models under which random forests perform badly; consequences for applications
- Consistency of random forests and other averaging classifiers