Conditional predictive inference post model selection

DOI: 10.1214/08-AOS660
MaRDI QID: Q834366

Publication date 19 August 2009

Published in: The Annals of Statistics

Full work available at URL https://arxiv.org/abs/0908.3615

Keywords: approximately honest and short prediction intervals; conditional coverage probability; finite sample analysis; predictive inference post model selection; regression with random design

Mathematics Subject Classification

Nonparametric regression and quantile regression (62G08)
Linear regression; mixed models (62J05)
Nonparametric tolerance and confidence regions (62G15)
Estimation in multivariate analysis (62H12)
Ridge regression; shrinkage estimators (Lasso) (62J07)
Inequalities; stochastic orderings (60E15)

Abstract: We give a finite-sample analysis of predictive inference procedures after model selection in regression with random design. The analysis is focused on a statistically challenging scenario where the number of potentially important explanatory variables can be infinite, where no regularity conditions are imposed on unknown parameters, where the number of explanatory variables in a "good" model can be of the same order as sample size and where the number of candidate models can be of larger order than sample size. The performance of inference procedures is evaluated conditional on the training sample. Under weak conditions on only the number of candidate models and on their complexity, and uniformly over all data-generating processes under consideration, we show that a certain prediction interval is approximately valid and short with high probability in finite samples, in the sense that its actual coverage probability is close to the nominal one and in the sense that its length is close to the length of an infeasible interval that is constructed by actually knowing the "best" candidate model. Similar results are shown to hold for predictive inference procedures other than prediction intervals like, for example, tests of whether a future response will lie above or below a given threshold.
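To make the notion of coverage "conditional on the training sample" concrete, the following minimal sketch simulates it. It is not the paper's procedure: the nested candidate models, the selection criterion (residual variance inflated by (n - 1)/(n - m - 1)), the plug-in Gaussian interval, and all names and parameter values (`select_model_and_interval`, `conditional_coverage`, `run_demo`, `theta`, `n`, `p`) are assumptions chosen only for illustration. The training sample is drawn once and held fixed, and coverage is then estimated over fresh draws of the future observation.

```python
import numpy as np
from statistics import NormalDist


def select_model_and_interval(X, y, candidate_sizes, alpha=0.05):
    """Pick a nested submodel (first m columns) and return (m, beta_hat, half_width)."""
    n = len(y)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal quantile
    best = None
    for m in candidate_sizes:
        beta_hat, *_ = np.linalg.lstsq(X[:, :m], y, rcond=None)
        rss = float(np.sum((y - X[:, :m] @ beta_hat) ** 2))
        # Assumed criterion for this sketch: residual variance inflated by
        # (n - 1) / (n - m - 1) as a rough estimate of the out-of-sample
        # prediction error variance. Requires m < n - 1.
        rho2_hat = rss / (n - m) * (n - 1) / (n - m - 1)
        if best is None or rho2_hat < best[0]:
            best = (rho2_hat, m, beta_hat)
    rho2_hat, m, beta_hat = best
    return m, beta_hat, float(z * np.sqrt(rho2_hat))


def conditional_coverage(m, beta_hat, half_width, theta, sigma, rng, n_new=50_000):
    """Monte Carlo coverage over fresh (x0, y0), with the fitted model held fixed."""
    X_new = rng.standard_normal((n_new, len(theta)))
    y_new = X_new @ theta + sigma * rng.standard_normal(n_new)
    pred = X_new[:, :m] @ beta_hat
    return float(np.mean(np.abs(y_new - pred) <= half_width))


def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    n, p, sigma = 200, 50, 1.0
    theta = 1.0 / (1.0 + np.arange(p))  # hypothetical slowly decaying coefficients
    # One training sample with random Gaussian design, drawn once and then fixed.
    X = rng.standard_normal((n, p))
    y = X @ theta + sigma * rng.standard_normal(n)
    m, beta_hat, half_width = select_model_and_interval(X, y, range(1, 41))
    coverage = conditional_coverage(m, beta_hat, half_width, theta, sigma, rng)
    return m, half_width, coverage


if __name__ == "__main__":
    m, half_width, coverage = run_demo()
    print(f"selected size m={m}, half-width={half_width:.3f}, "
          f"conditional coverage={coverage:.3f} (nominal 0.95)")
```

Rerunning `run_demo` with different seeds changes the training sample and therefore the conditional coverage. The abstract's claim concerns how close that quantity stays to the nominal level with high probability over training samples.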

Cited in (17)
