Prediction error after model search
From MaRDI portal
Publication:2196193
Abstract: Estimation of the prediction error of a linear estimation rule is difficult if the data analyst also uses the data to select a set of variables and construct the estimation rule using only the selected variables. In this work, we propose an asymptotically unbiased estimator for the prediction error after model search. Under some additional mild assumptions, we show that our estimator converges to the true prediction error in \(L^2\) at the rate of \(O(n^{-1/2})\), with \(n\) being the number of data points. Our estimator applies to general selection procedures, not requiring analytical forms for the selection. The number of variables to select from can grow as an exponential factor of \(n\), allowing applications in high-dimensional data. It also allows model misspecifications, not requiring the underlying model to be linear. One application of our method is that it provides an estimator for the degrees of freedom of many discontinuous estimation rules such as best subset selection or relaxed Lasso. A connection to Stein's Unbiased Risk Estimator is discussed. We consider in-sample prediction errors in this work, with some extension to out-of-sample errors in low-dimensional, linear models. Examples such as best subset selection and relaxed Lasso are considered in simulations, where our estimator outperforms both \(C_p\) and cross validation in various settings.
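The phenomenon the abstract addresses can be seen in a small simulation. The sketch below (not the paper's estimator; selection by marginal correlation and the Monte Carlo reference are illustrative choices) shows that after a data-driven selection step, the naive training error understates the in-sample prediction error, which is exactly the gap an unbiased post-selection estimator must correct:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 100, 50, 5          # sample size, candidate variables, selected variables
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = 2.0                # sparse true signal
sigma = 1.0
y = X @ beta + sigma * rng.standard_normal(n)

# Selection step: keep the k columns most correlated with y (data-dependent!)
sel = np.argsort(-np.abs(X.T @ y))[:k]

# Estimation step: ordinary least squares on the selected columns only
Xs = X[:, sel]
coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
fitted = Xs @ coef

# Naive training error: optimistic, because the same y drove the selection
train_err = np.mean((y - fitted) ** 2)

# Monte Carlo in-sample prediction error: fresh noise on the same design,
# scored against the fitted values produced by the selected model
mc = [np.mean((X @ beta + sigma * rng.standard_normal(n) - fitted) ** 2)
      for _ in range(2000)]
pred_err = np.mean(mc)

print(f"training MSE:            {train_err:.3f}")
print(f"in-sample pred. error:   {pred_err:.3f}")
```

On this run the in-sample prediction error exceeds the training error, reflecting the selection-induced optimism that would also bias a naive degrees-of-freedom count for discontinuous rules like best subset selection.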
Recommendations
- Model imperfection and predicting predictability
- Estimation and accuracy after model selection
- Model selection and error estimation
- Publication:5750147
- Model error propagation from experimental to prediction configuration
- Decomposition of Prediction Error
- scientific article; zbMATH DE number 522884
- Relative-error prediction
Cites work
- scientific article; zbMATH DE number 845714
- A note on an inequality involving the normal distribution
- A study of error variance estimation in Lasso regression
- A unified framework for high-dimensional analysis of \(M\)-estimators with decomposable regularizers
- Adaptive Model Selection
- Adaptive estimation of a quadratic functional by model selection.
- An inequality for the multivariate normal distribution
- Asymptotics of selective inference
- Can the strengths of AIC and BIC be shared? A conflict between model identification and regression estimation
- Degrees of freedom and model search
- Degrees of freedom for piecewise Lipschitz estimators
- Degrees of freedom in lasso problems
- Differential Privacy: A Survey of Results
- Estimation of the mean of a multivariate normal distribution
- Honest confidence regions for nonparametric regression
- Least angle regression. (With discussion)
- On Measuring and Correcting the Effects of Data Mining and Model Selection
- On the ``degrees of freedom'' of the lasso
- Relaxed Lasso
- Scaled sparse linear regression
- Selective inference with a randomized response
- Selective inference with unknown variance via the square-root Lasso
- Simultaneous analysis of Lasso and Dantzig selector
- Some Comments on \(C_p\)
- The Estimation of Prediction Error
Cited in (4)
This page was built for publication: Prediction error after model search