Conditional predictive inference for stable algorithms

From MaRDI portal

DOI: 10.1214/22-AOS2250
arXiv: 1809.01412
OpenAlex: W4360879309
MaRDI QID: Q6042347


Authors: Lukas Steinberger, Hannes Leeb


Publication date: 10 May 2023

Published in: The Annals of Statistics

Abstract: We investigate generically applicable and intuitively appealing prediction intervals based on K-fold cross-validation. We focus on the conditional coverage probability of the proposed intervals, given the observations in the training sample (hence, training-conditional validity), and show that it is close to the nominal level, in an appropriate sense, provided that the underlying algorithm used for computing point predictions is sufficiently stable when feature-response pairs are omitted. Our results are based on a finite-sample analysis of the empirical distribution function of K-fold cross-validation residuals and hold in nonparametric settings with only minimal assumptions on the error distribution. To illustrate our results, we also apply them to high-dimensional linear predictors, where we obtain uniform asymptotic training-conditional validity as both sample size and dimension tend to infinity at the same rate and consistent parameter estimation typically fails. These results show that, despite the serious problems of resampling procedures for inference on the unknown parameters (cf. Bickel and Freedman, 1983; El Karoui and Purdom, 2018; Mammen, 1996), cross-validation methods can be successfully applied to obtain reliable predictive inference, even in high dimensions and conditionally on the training data.


Full work available at URL: https://arxiv.org/abs/1809.01412








