On Measuring and Correcting the Effects of Data Mining and Model Selection

From MaRDI portal
Publication:3839585


DOI: 10.2307/2669609, zbMath: 0920.62056, MaRDI QID: Q3839585

Jianming Ye

Publication date: 9 August 1998

Full work available at URL: https://doi.org/10.2307/2669609


62G07: Density estimation

62H12: Estimation in multivariate analysis

62J05: Linear regression; mixed models


Related Items

Model Selection for Generalized Estimating Equations Accommodating Dropout Missingness
A likelihood‐based comparison of temporal models for physical processes
Bayesian P-spline estimation in hierarchical models specified by systems of affine differential equations
Small area mean estimation after effect clustering
False Discovery Rates to Detect Signals from Incomplete Spatially Aggregated Data
Improving Reliability Estimation for Individual Numeric Predictions: A Machine Learning Approach
Computing AIC for black-box models using generalized degrees of freedom: A comparison with cross-validation
Model selection in regression based on pre-smoothing
Autoregressive model selection based on a prediction perspective
A method for choosing the smoothing parameter in a semi-parametric model for detecting change-points in blood flow
Adaptive order selection for autoregressive models
Excess Optimism: How Biased is the Apparent Error of an Estimator Tuned by SURE?
CLEAR: Covariant LEAst-Square Refitting with Applications to Image Restoration
PARSIMONIOUS PARAMETERIZATION OF AGE-PERIOD-COHORT MODELS BY BAYESIAN SHRINKAGE
Statistical significance of the Netflix challenge
Prediction errors for penalized regressions based on generalized approximate message passing
A discussion of prior-based Bayesian information criterion (PBIC)
The truth about the effective dimension
Optimal Simulator Selection
Criterion constrained Bayesian hierarchical models
A Generalization Gap Estimation for Overparameterized Models via the Langevin Functional Variance
Fence methods for mixed model selection
Model selection in linear mixed models
Sparse estimation via nonconcave penalized likelihood in factor analysis model
Greedy algorithms for prediction
Estimation of Lyapunov spectrum and model selection for a chaotic time series
Estimation of nonlinear differential equation model for glucose-insulin dynamics in type I diabetic patients using generalized smoothing
Combining models in longitudinal data analysis
Generalized degrees of freedom and adaptive model selection in linear mixed-effects models
Conditional Akaike information criterion for generalized linear mixed models
Effective degrees of freedom and its application to conditional AIC for linear mixed-effects models with correlated error structures
Degrees of freedom in low rank matrix estimation
A note on the generalized degrees of freedom under the \(L_{1}\) loss function
Model selection for two-sample problems with right-censored data: an application of Cox model
Conditional and unconditional methods for selecting variables in linear mixed models
Using simulated annealing to optimize the feature selection problem in marketing applications
Testing conditional mean through regression model sequence using Yanai's generalized coefficient of determination
Component selection and smoothing in multivariate nonparametric regression
On the association between a random parameter and an observable
Reducing over-dispersion by generalized degree of freedom and propensity score
Detecting and handling outlying trajectories in irregularly sampled functional datasets
A new adaptive local linear prediction method and its application in hydrological time series
An algebraic characterization of the optimum of regularized kernel methods
Estimation of the conditional risk in classification: the swapping method
An improved model averaging scheme for logistic regression
Tuning parameter selection in sparse regression modeling
Estimation of an oblique structure via penalized likelihood factor analysis
Extending AIC to best subset regression
Degrees of freedom for piecewise Lipschitz estimators
The dual and degrees of freedom of linearly constrained generalized Lasso
Feasible generalized least squares using support vector regression
On the choice of difference sequence in a unified framework for variance estimation in nonparametric regression
A flexible shrinkage operator for fussy grouped variable selection
Smoothing spline ANOVA models for large data sets with Bernoulli observations and the randomized GACV.
Least angle regression. (With discussion)
Resampling-based information criteria for best-subset regression
Selection of model selection criteria for multivariate ridge regression
An introduction to model selection
On the degrees of freedom of mixed matrix regression
Generalized \(\ell_1\)-penalized quantile regression with linear constraints
Degrees of freedom for regularized regression with Huber loss and linear constraints
Model averaging for linear mixed models via augmented Lagrangian
Information criteria bias correction for group selection
Computing the degrees of freedom of rank-regularized estimators and cousins
Automatic identification of curve shapes with applications to ultrasonic vocalization
Prediction error after model search
Local behavior of sparse analysis regularization: applications to risk estimation
Markov chain estimation for test theory without an answer key
The generalized degrees of freedom of multilinear principal component analysis
A fast algorithm for optimizing ridge parameters in a generalized ridge regression by minimizing a model selection criterion
On the predictive risk in misspecified quantile regression
Generalized cross validation in variable selection with and without shrinkage
Data enriched linear regression
Efficient regularized isotonic regression with application to gene-gene interaction search
Optimal variance estimation without estimating the mean function
Measuring the prediction error. A comparison of cross-validation, bootstrap and covariance penalty methods
A new approach for selecting the number of factors
On the “degrees of freedom” of the lasso
On generalized degrees of freedom with application in linear mixed models selection
On improved loss estimation for shrinkage estimators
Selection model for domains across time: application to labour force survey by economic activities
Geometrically designed variable knot splines in generalized (non-)linear models
Model selection uncertainty and stability in beta regression models: a study of bootstrap-based model averaging with an empirical application to clickstream data
Selection Strategy for Covariance Structure of Random Effects in Linear Mixed-effects Models
Low Complexity Regularization of Linear Inverse Problems
Discussion of “From Fixed-X to Random-X Regression: Bias-Variance Decompositions, Covariance Penalties, and Prediction Error Estimation”
Combining Multiple Biomarker Models in Logistic Regression

