Consistency of cross validation for comparing regression procedures
Publication:2473071
DOI: 10.1214/009053607000000514; zbMATH Open: 1129.62039; arXiv: 0803.2963; OpenAlex: W2018471882; MaRDI QID: Q2473071; FDO: Q2473071
Authors: Yuhong Yang
Publication date: 26 February 2008
Published in: The Annals of Statistics
Abstract: Theoretical developments on cross validation (CV) have mainly focused on selecting one among a list of finite-dimensional models (e.g., subset or order selection in linear regression) or selecting a smoothing parameter (e.g., bandwidth for kernel smoothing). However, little is known about consistency of cross validation when applied to compare between parametric and nonparametric methods or within nonparametric methods. We show that under some conditions, with an appropriate choice of data splitting ratio, cross validation is consistent in the sense of selecting the better procedure with probability approaching 1. Our results reveal interesting behavior of cross validation. When comparing two models (procedures) converging at the same nonparametric rate, in contrast to the parametric case, it turns out that the proportion of data used for evaluation in CV does not need to be dominating in size. Furthermore, it can even be of a smaller order than the proportion for estimation while not affecting the consistency property.
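The data-splitting comparison described in the abstract can be illustrated with a minimal sketch: each procedure is fitted on an estimation part of the data, its squared prediction error is measured on the held-out evaluation part, and the procedure with the smaller error is selected. Everything specific here is an assumption chosen for illustration and is not taken from the paper: the two procedures (a straight-line least-squares fit and a Nadaraya-Watson kernel estimator), the simulated data, the bandwidth, the split sizes, the averaging over random splits, and the function names `fit_linear`, `fit_kernel` and `cv_select`. The sketch does not check the paper's conditions on how the splitting ratio should scale with the sample size.

```python
import numpy as np

def fit_linear(x_train, y_train):
    """Straight-line least squares; returns a prediction function."""
    design = np.column_stack([np.ones_like(x_train), x_train])
    coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)
    return lambda x: coef[0] + coef[1] * x

def fit_kernel(x_train, y_train, bandwidth=0.1):
    """Nadaraya-Watson estimator with a Gaussian kernel (bandwidth is an assumed value)."""
    def predict(x):
        # weights[i, j] is the kernel weight of training point j for query point i
        diffs = (x[:, None] - x_train[None, :]) / bandwidth
        weights = np.exp(-0.5 * diffs**2)
        return weights @ y_train / weights.sum(axis=1)
    return predict

def cv_select(x, y, procedures, n_eval, n_splits=20, seed=None):
    """Pick the procedure with the smallest hold-out squared error summed over random splits."""
    rng = np.random.default_rng(seed)
    total_error = {name: 0.0 for name in procedures}
    for _ in range(n_splits):
        perm = rng.permutation(len(x))
        eval_idx, est_idx = perm[:n_eval], perm[n_eval:]
        for name, fit in procedures.items():
            predict = fit(x[est_idx], y[est_idx])        # estimation part
            residuals = y[eval_idx] - predict(x[eval_idx])  # evaluation part
            total_error[name] += np.mean(residuals**2)
    best = min(total_error, key=total_error.get)
    return best, total_error

# Simulated example: a smooth nonlinear regression function plus noise.
data_rng = np.random.default_rng(0)
x = data_rng.uniform(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + data_rng.normal(0.0, 0.3, 200)

best, errors = cv_select(
    x, y, {"linear": fit_linear, "kernel": fit_kernel}, n_eval=100, seed=1
)
print(best, errors)
```

Changing `n_eval` changes the splitting ratio between estimation and evaluation, which is the quantity the abstract's consistency results are about.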
Full work available at URL: https://arxiv.org/abs/0803.2963
Recommendations
- Cross-validation for comparing multiple density estimation procedures
- scientific article; zbMATH DE number 5056254
- Cross-validation for selecting a model selection procedure
- A survey of cross-validation procedures for model selection
- On the consistency of cross-validation in kernel nonparametric regression
Nonparametric estimation (62G05); Nonparametric regression and quantile regression (62G08); Asymptotic properties of nonparametric inference (62G20)
Cites Work
- Applied Linear Regression
- Title not available
- A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods
- Title not available
- Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation
- Smoothing methods in statistics
- Title not available
- Title not available
- Optimal rates of convergence for nonparametric estimators
- Nonparametric smoothing and lack-of-fit tests
- Model selection via multifold cross validation
- Title not available
- Title not available
- Linear Model Selection by Cross-Validation
- Convergence of stochastic processes
- Title not available
- The Predictive Sample Reuse Method with Applications
- Optimal global rates of convergence for nonparametric regression
- Asymptotic optimality for \(C_ p\), \(C_ L\), cross-validation and generalized cross-validation: Discrete index set
- Model selection in nonparametric regression
- Nonparametric regression with correlated errors.
- A distribution-free theory of nonparametric regression
- Adaptive Regression by Mixing
- The Relationship between Variable Selection and Data Augmentation and a Method for Prediction
- On the consistency of cross-validation in kernel nonparametric regression
- Minimax estimation via wavelet shrinkage
- How Far Are Automatically Chosen Regression Smoothing Parameters From Their Optimum?
- Oracle inequalities for multi-fold cross validation
- The cross-validated adaptive epsilon-net estimator
- Spline smoothing and optimal rates of convergence in nonparametric regression models
- Title not available
- Consistency for cross-validated nearest neighbor estimates in nonparametric regression
- Title not available
- Title not available
Cited In (41)
- Sparsity oriented importance learning for high-dimensional linear regression
- A cross-validation based estimation of the proportion of true null hypotheses
- A survey of Bayesian predictive methods for model assessment, selection and comparison
- Double-slicing assisted sufficient dimension reduction for high-dimensional censored data
- Parametric or nonparametric? A parametricness index for model selection
- Cross-Validation: What Does It Estimate and How Well Does It Do It?
- Targeted cross-validation
- Estimation of prediction error by using \(K\)-fold cross-validation
- Model selection via standard error adjusted adaptive Lasso
- Catching up Faster by Switching Sooner: A Predictive Approach to Adaptive Estimation with an Application to the AIC–BIC Dilemma
- Model selection by resampling penalization
- Estimating the Kullback–Leibler risk based on multifold cross‐validation
- Cross-validation with confidence
- Cross-validation for change-point regression: pitfalls and solutions
- Penalized cluster analysis with applications to family data
- Cross-validation for comparing multiple density estimation procedures
- Multiple predicting \(K\)-fold cross-validation for model selection
- Consistency of empirical Bayes and kernel flow for hierarchical parameter estimation
- Determining the number of factors in approximate factor models by twice K-fold cross validation
- Estimating and forecasting dynamic correlation matrices: a nonlinear common factor approach
- Efficient, adaptive cross-validation for tuning and comparing models, with application to drug discovery
- A Note on Cross-Validation for Lasso Under Measurement Errors
- On consistent statistical procedures in regression
- Mixing partially linear regression models
- Consistent selection of the number of change-points via sample-splitting
- Degrees of freedom in submodular regularization: a computational perspective of Stein's unbiased risk estimate
- Robustness by reweighting for kernel estimators: an overview
- Regression in Tensor Product Spaces by the Method of Sieves
- Theoretical analysis of cross-validation for estimating the risk of the \(k\)-nearest neighbor classifier
- Cross-Validation, Risk Estimation, and Model Selection: Comment on a Paper by Rosset and Tibshirani
- Segmentation of the mean of heteroscedastic data via cross-validation
- A survey of cross-validation procedures for model selection
- Performance Assessment of High-dimensional Variable Identification
- Bayes shrinkage estimation for high-dimensional VAR models with scale mixture of normal distributions for noise
- Asymptotics of K-fold cross validation
- Risk consistency of cross-validation with Lasso-type procedures
- Variable selection in convex quantile regression: \(\mathcal{L}_1\)-norm or \(\mathcal{L}_0\)-norm regularization?
- Consistent estimation of the number of communities in stochastic block models using cross-validation
- The art of transfer learning: an adaptive and robust pipeline
- Equivalence of regression calibration methods in main study/external validation study designs
- Cross-validation for selecting a model selection procedure