How Biased is the Apparent Error Rate of a Prediction Rule?

Publication: 3757198

DOI: 10.2307/2289236
zbMath: 0621.62073
OpenAlex: W4249991467
MaRDI QID: Q3757198

Efron, Bradley

Publication date: 1986

Full work available at URL: https://doi.org/10.2307/2289236
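
The paper asks how far the apparent (resubstitution) error rate of a fitted prediction rule falls below its true error rate. The sketch below is a minimal illustration of that downward bias, assuming Python with NumPy and scikit-learn and using simulated data; it only demonstrates the quantity named in the title (the "optimism" of the apparent error), not Efron's proposed correction.

# Minimal sketch (not Efron's estimator): the apparent error rate of a fitted
# classification rule, evaluated on its own training data, is biased downward
# relative to the true error rate on new data from the same distribution.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 50, 10                      # small n relative to p exaggerates the bias

def simulate(size):
    X = rng.normal(size=(size, p))
    logits = X @ np.r_[1.0, -1.0, np.zeros(p - 2)]   # two informative features
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))
    return X, y

apparent, true = [], []
for _ in range(200):                                     # Monte Carlo over training sets
    X, y = simulate(n)
    fit = LogisticRegression(max_iter=1000).fit(X, y)
    apparent.append(np.mean(fit.predict(X) != y))        # error on the training data
    X_new, y_new = simulate(5000)                        # large independent sample
    true.append(np.mean(fit.predict(X_new) != y_new))    # approximates the true error

print(f"mean apparent error: {np.mean(apparent):.3f}")
print(f"mean true error:     {np.mean(true):.3f}")
print(f"average optimism:    {np.mean(true) - np.mean(apparent):.3f}")

Running this shows the apparent error averaging well below the true error; the gap (the optimism) is what the paper quantifies and corrects.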



Related Items

Estimating the Kullback–Leibler risk based on multifold cross‐validation
Three distributions in the extended occupancy problem
Optimal Simulator Selection
Criterion constrained Bayesian hierarchical models
Robust estimation in regression and classification methods for large dimensional data
Model selection with Pearson's correlation, concentration and Lorenz curves under autocalibration
The asymptotic distribution of the proportion of correct classifications for a holdout sample in logistic regression
A regression model selection criterion based on bootstrap bumping for use with resistant fitting
Statistical significance of the Netflix challenge
Recent developments in bootstrap methodology
Discussion: “A significance test for the lasso”
Discussion: “A significance test for the lasso”
Discussion: “A significance test for the lasso”
Discussion: “A significance test for the lasso”
Discussion: “A significance test for the lasso”
High-Dimensional Spatial Quantile Function-on-Scalar Regression
P-splines with an \(\ell_1\) penalty for repeated measures
SURE-tuned tapering estimation of large covariance matrices
Cross-Validation for Correlated Data
Tuning parameter selection in sparse regression modeling
Least angle regression. (With discussion)
Combining neural networks for function approximation under conditions of sparse data: the biased regression approach
Degrees of freedom for off-the-grid sparse estimation
Asymptotic properties of a double penalized maximum likelihood estimator in logistic regression
Data-based interval estimation of classification error rates
Model selection for factorial Gaussian graphical models with an application to dynamic regulatory networks
Un critère de choix de variables en analyse en composantes principales fondé sur des modèles graphiques gaussiens particuliers
Evaluating the impact of exploratory procedures in regression prediction: A pseudosample approach
Model selection criteria based on cross-validatory concordance statistics
Extending AIC to best subset regression
Using specially designed exponential families for density estimation
A Pliable Lasso
Density Deconvolution With Additive Measurement Errors Using Quadratic Programming
Modelling of insurers' rating determinants. An application of machine learning techniques and statistical models
Prediction Error Estimation Under Bregman Divergence for Non‐Parametric Regression and Classification
A lasso for hierarchical interactions
Bootstrap estimation and model selection for multivariate normal mixtures using parallel computing with graphics processing units
The degrees of freedom of partly smooth regularizers
A note on the generalized degrees of freedom under the \(L_{1}\) loss function
On the association between a random parameter and an observable
On the estimation of prediction errors in logistic regression models
Statistical properties of convex clustering
Variable selection for generalized linear mixed models by \(L_1\)-penalized estimation
Degrees of freedom and model selection for \(k\)-means clustering
Quantifying the Predictive Performance of Prognostic Models for Censored Survival Data with Time-Dependent Covariates
Efficient regularized isotonic regression with application to gene-gene interaction search
Assessing the performance of data assimilation algorithms which employ linear error feedback
Bootstrap-based model selection criteria for beta regressions
Are ordinal models useful for classification? A revised analysis
Model selection by resampling penalization
The negative correlations between data-determined bandwidths and the optimal bandwidth
Degrees of freedom in lasso problems
Is \(C_{p}\) an empirical Bayes method for smoothing parameter choice?
Measuring the prediction error. A comparison of cross-validation, bootstrap and covariance penalty methods
Modeling strategies in longitudinal data analysis: covariate, variance function and correlation structure selection
Asymptotic optimality of full cross-validation for selecting linear regression models
A large-sample model selection criterion based on Kullback's symmetric divergence
Multiple group linear discriminant analysis: robustness and error rate
Evaluation of generalized degrees of freedom for sparse estimation by replica method
Determination of the best significance level in forward stepwise logistic regression
From Fixed-X to Random-X Regression: Bias-Variance Decompositions, Covariance Penalties, and Prediction Error Estimation
Discussion of “From Fixed-X to Random-X Regression: Bias-Variance Decompositions, Covariance Penalties, and Prediction Error Estimation”
An assumption for the development of bootstrap variants of the Akaike information criterion in mixed models
Variable Selection in Canonical Discriminant Analysis for Family Studies
Subspace Information Criterion for Model Selection
A significance test for the lasso
Discussion: “A significance test for the lasso”
Rejoinder: “A significance test for the lasso”
Degrees of freedom in low rank matrix estimation
Local behavior of sparse analysis regularization: applications to risk estimation
Prediction Using Partly Conditional Time‐Varying Coefficients Regression Models
A survey of cross-validation procedures for model selection
Maximizing proportions of correct classifications in binary logistic regression
Efficient Computation and Model Selection for the Support Vector Regression
Cross validation model selection criteria for linear regression based on the Kullback-Leibler discrepancy
Adapting to unknown sparsity by controlling the false discovery rate
Estimating the accuracy of (local) cross-validation via randomised GCV choices in kernel or smoothing spline regression
Nearly unbiased variable selection under minimax concave penalty
Ideal point discriminant analysis
Bayesian nonparametric model selection and model testing
Distance-based linear discriminant analysis for interval-valued data
Additive models with trend filtering
Determination of the Selection Statistics and Best Significance Level in Backward Stepwise Logistic Regression
Assessing prediction error at interpolation and extrapolation points
Low Complexity Regularization of Linear Inverse Problems
Reluctant generalized additive modeling
A multistage algorithm for best-subset model selection based on the Kullback-Leibler discrepancy
Model evaluation, discrepancy function estimation, and social choice theory
Flexible and Interpretable Models for Survival Data
A study on tuning parameter selection for the high-dimensional lasso
Compressed covariance estimation with automated dimension learning
Automated data-driven selection of the hyperparameters for total-variation-based texture segmentation
Regular, median and Huber cross‐validation: A computational comparison
New aspects of Bregman divergence in regression and classification with parametric and nonparametric estimation
Compressed and Penalized Linear Regression
Bootstrap variants of the Akaike information criterion for mixed model selection
Estimation of the conditional risk in classification: the swapping method
Excess Optimism: How Biased is the Apparent Error of an Estimator Tuned by SURE?
Prediction-Based Structured Variable Selection through the Receiver Operating Characteristic Curves
Optimality of training/test size and resampling effectiveness in cross-validation
Inference after variable selection using restricted permutation methods
A model search procedure for hierarchical models
On the predictive risk in misspecified quantile regression
Bayesian comparison of latent variable models: conditional versus marginal likelihoods
Asymptotic bootstrap corrections of AIC for linear regression models
On model selection via stochastic complexity in robust linear regression
Smoothing spline ANOVA models for large data sets with Bernoulli observations and the randomized GACV
Selection criteria for scatterplot smoothers
Determination of different types of fixed effects in three-dimensional panels
Reconceptualizing the p-value from a likelihood ratio test: a probabilistic pairwise comparison of models based on Kullback-Leibler discrepancy measures
A non-convex regularization approach for stable estimation of loss development factors
Appropriate penalties in the final prediction error criterion: A decision theoretic approach
Comparing and selecting spatial predictors using local criteria
Sparse estimation via nonconcave penalized likelihood in factor analysis model