Conditional predictive inference for stable algorithms
From MaRDI portal
Publication: Q6042347
Abstract: We investigate generically applicable and intuitively appealing prediction intervals based on $K$-fold cross validation. We focus on the conditional coverage probability of the proposed intervals, given the observations in the training sample (hence, training conditional validity), and show that it is close to the nominal level, in an appropriate sense, provided that the underlying algorithm used for computing point predictions is sufficiently stable when feature-response pairs are omitted. Our results are based on a finite sample analysis of the empirical distribution function of $K$-fold cross validation residuals and hold in non-parametric settings with only minimal assumptions on the error distribution. To illustrate our results, we also apply them to high-dimensional linear predictors, where we obtain uniform asymptotic training conditional validity as both sample size and dimension tend to infinity at the same rate and consistent parameter estimation typically fails. These results show that despite the serious problems of resampling procedures for inference on the unknown parameters (cf. Bickel and Freedman, 1983; El Karoui and Purdom, 2018; Mammen, 1996), cross validation methods can be successfully applied to obtain reliable predictive inference even in high dimensions and conditionally on the training data.
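The generic construction described in the abstract can be sketched as follows: compute hold-out residuals across the $K$ folds, then center a prediction interval at the full-sample point prediction using empirical quantiles of those residuals. This is a minimal illustrative sketch, not the authors' exact procedure; it assumes ordinary least squares as the (stable) point-prediction algorithm, while the paper's theory covers general stable algorithms.

```python
import numpy as np

def cv_prediction_interval(X, y, x_new, K=5, alpha=0.1, seed=0):
    """Prediction interval at x_new from K-fold cross-validation residuals.

    Sketch only: OLS stands in for a generic stable prediction algorithm.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), K)
    residuals = []
    for fold in folds:
        train = np.setdiff1d(np.arange(n), fold)
        # Fit the algorithm on the other K-1 folds only.
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        # Hold-out residuals approximate the prediction-error distribution.
        residuals.append(y[fold] - X[fold] @ beta)
    residuals = np.concatenate(residuals)
    # Refit on the full sample for the point prediction at x_new.
    beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = x_new @ beta_full
    lo, hi = np.quantile(residuals, [alpha / 2, 1 - alpha / 2])
    return y_hat + lo, y_hat + hi

# Example: well-specified linear model with Gaussian noise.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
y = X @ np.ones(5) + rng.standard_normal(200)
lo, hi = cv_prediction_interval(X, y, X[0], K=5, alpha=0.1)
```

The interval's conditional coverage, given the training sample, is what the paper shows to be close to the nominal level $1-\alpha$ when the underlying algorithm is stable under omission of feature-response pairs.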
Cites work
- scientific article (zbMATH DE number 3841086; no title available)
- scientific article (zbMATH DE number 3522963; no title available)
- scientific article (zbMATH DE number 2168212; no title available)
- scientific article (zbMATH DE number 3336465; no title available)
- DOI 10.1162/153244302760200704 (no title available)
- A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity
- A distribution-free theory of nonparametric regression
- An Extension of Wilks' Method for Setting Tolerance Limits
- Asymptotically Minimal Multivariate Tolerance Sets
- Asymptotically Valid Prediction Intervals for Linear Models
- Bootstrap Prediction Intervals for Regression
- Can we trust the bootstrap in high-dimensions? The case of linear models
- Conditional validity of inductive conformal predictors
- Consistency of random forests
- Distribution-Free Prediction Sets
- Distribution-free Prediction Bands for Non-parametric Regression
- Distribution-free inequalities for the deleted and holdout error estimates
- Distribution-free predictive inference for regression
- Empirical process of residuals for high-dimensional linear models
- Fast exact conformalization of the Lasso using piecewise linear homotopy
- Maximum likelihood estimation in misspecified generalized linear models
- Model-free model-fitting and predictive distributions
- Multivariate spacings based on data depth. I: Construction of nonparametric multivariate tolerance regions
- Non-Parametric Estimation II. Statistically Equivalent Blocks and Tolerance Regions--The Continuous Case
- Nonparametric regression using deep neural networks with ReLU activation function
- On robust regression with high-dimensional predictors
- On the impact of predictor geometry on the performance of high-dimensional ridge-regularized generalized robust regression estimators
- On-line predictive linear regression
- Optimal equivariant prediction for high-dimensional linear models with arbitrary predictor covariance
- Prediction intervals for regression models
- Predictive Intervals Based on Reuse of the Sample
- Predictive inference with the jackknife+
- Shrinkage estimators for prediction out-of-sample: conditional performance
- Smallest nonparametric tolerance regions
- Statistical Prediction with Special Reference to the Problem of Tolerance Limits
- Statistical Tolerance Regions: Theory, Applications, and Computation
- Statistics for high-dimensional data. Methods, theory and applications
- The limits of distribution-free conditional predictive inference
- The spectrum of kernel random matrices
- Using Least Squares to Approximate Unknown Regression Functions
Cited in (6)
- scientific article (zbMATH DE number 6670747; no title available)
- Conformal prediction: a unified review of theory and new challenges
- Post-selection inference via algorithmic stability
- The limits of distribution-free conditional predictive inference
- Training-conditional coverage for distribution-free predictive inference
- Conditional validity of inductive conformal predictors