Abstract: It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid "post-selection inference" by reducing the problem to one of simultaneous inference and hence suitably widening conventional confidence and retention intervals. Simultaneity is required for all linear functions that arise as coefficient estimates in all submodels. By purchasing "simultaneity insurance" for all possible submodels, the resulting post-selection inference is rendered universally valid under all possible model selection procedures. This inference is therefore generally conservative for particular selection procedures, but it is always less conservative than full Scheffé protection. Importantly, it does not depend on the truth of the selected submodel, and hence it produces valid inference even in wrong models. We describe the structure of the simultaneous inference problem and give some asymptotic results.
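The reduction to simultaneous inference described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a known error variance (so normal rather than t quantiles apply), a small number of predictors so that all submodels can be enumerated, and a Monte Carlo approximation of the quantiles. The function names `posi_directions` and `posi_constants` are hypothetical. Each coefficient estimate in each submodel is a linear function of the response, so the widened interval multiplier is the upper quantile of the maximum absolute standardized statistic over all such linear functions. The Scheffé multiplier is computed from the same draws for comparison.

```python
import itertools

import numpy as np


def posi_directions(X):
    """Unit vectors l such that l @ y / sigma is the z-statistic of one
    coefficient in one submodel, for every nonempty submodel M and j in M."""
    n, p = X.shape
    dirs = []
    for size in range(1, p + 1):
        for M in itertools.combinations(range(p), size):
            # Rows of the pseudo-inverse map y to the OLS coefficients of
            # submodel M; normalizing a row gives the statistic's direction.
            H = np.linalg.pinv(X[:, list(M)])
            for row in H:
                dirs.append(row / np.linalg.norm(row))
    return np.array(dirs)


def posi_constants(X, alpha=0.05, n_draws=20000, seed=0):
    """Monte Carlo estimates of the simultaneous multiplier over all
    submodel coefficients and of the Scheffe multiplier (sigma known)."""
    rng = np.random.default_rng(seed)
    L = posi_directions(X)
    Z = rng.standard_normal((n_draws, X.shape[0]))
    # Maximum absolute standardized statistic over all (j, M) pairs.
    max_stat = np.abs(Z @ L.T).max(axis=1)
    # Scheffe: supremum over ALL directions in the column space of X,
    # which equals the norm of the projection of Z onto that space.
    Q, _ = np.linalg.qr(X)
    scheffe_stat = np.linalg.norm(Z @ Q, axis=1)
    return (np.quantile(max_stat, 1 - alpha),
            np.quantile(scheffe_stat, 1 - alpha))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((30, 4))  # illustrative random design
    k_sim, k_scheffe = posi_constants(X)
    print(f"simultaneous multiplier: {k_sim:.3f}")
    print(f"Scheffe multiplier:      {k_scheffe:.3f}")
```

Because every direction lies in the column space of the design, the maximum statistic never exceeds the Scheffé statistic on any draw, which is why the simultaneous multiplier cannot exceed the Scheffé one. It is in turn larger than the conventional single-coefficient multiplier, reflecting the widening of the intervals. The enumeration grows as p·2^(p-1) directions, so this brute-force sketch is only practical for small p.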
Recommendations
- Exact post-selection inference, with application to the Lasso
- Valid post-selection inference in model-free linear regression
- MODEL SELECTION AND INFERENCE: FACTS AND FICTION
- Selective inference after likelihood- or test-based model selection in linear models
- Conditional predictive inference post model selection
Cites work
- scientific article; zbMATH DE number 3141625
- scientific article; zbMATH DE number 4100415
- scientific article; zbMATH DE number 46309
- A Note on Quantiles in Large Samples
- Asymptotic properties of maximum likelihood estimators based on conditional specification
- CAN ONE ESTIMATE THE UNCONDITIONAL DISTRIBUTION OF POST-MODEL-SELECTION ESTIMATORS?
- Can one estimate the conditional distribution of post-model-selection estimators?
- Confidence sets based on penalized maximum likelihood estimators in Gaussian regression
- Distributional results for thresholding estimators in high-dimensional Gaussian regression models
- Frequentist Model Average Estimators
- MODEL SELECTION AND INFERENCE: FACTS AND FICTION
- Mostly harmless econometrics. An empiricist's companion.
- Note on a Conditional Property of Student's $t$
- On model uncertainty and its statistical implications. Proceedings of a workshop, held in Groningen, Netherlands, September 25-26, 1986
- On preliminary test and shrinkage M-estimation in linear models
- On the Large-Sample Minimal Coverage Probability of Confidence Intervals After Model Selection
- On the distribution of penalized maximum likelihood estimators: the LASSO, SCAD, and thresholding
- On the distribution of the adaptive LASSO estimator
- PERFORMANCE LIMITS FOR ESTIMATORS OF THE RISK OR DISTRIBUTION OF SHRINKAGE-TYPE ESTIMATORS, AND SOME GENERAL LOWER RISK-BOUND RESULTS
- Random Packings and Coverings of the Unit n-Sphere
- Sparse estimators and the oracle property, or the return of Hodges' estimator
- THE FINITE-SAMPLE DISTRIBUTION OF POST-MODEL-SELECTION ESTIMATORS AND UNIFORM VERSUS NONUNIFORM APPROXIMATIONS
- The Conditional Level of Student's $t$ Test
- The Conditional Level of the F-Test
- The Focused Information Criterion
- The distribution of a linear predictor after model selection: unconditional finite-sample distributions and asymptotic approximations
- The distribution of model averaging estimators and an impossibility result regarding its estimation
- Valid post-selection inference
Cited in
(only showing first 100 items)
- A bootstrap Lasso+partial ridge method to construct confidence intervals for parameters in high-dimensional sparse linear models
- Kernel Ordinary Differential Equations
- SLOPE-adaptive variable selection via convex optimization
- On various confidence intervals post-model-selection
- High-dimensional inference: confidence intervals, \(p\)-values and R-software \texttt{hdi}
- Heterogeneous heterogeneity by default: Testing categorical moderators in mixed‐effects meta‐analysis
- Inference for low‐ and high‐dimensional inhomogeneous Gibbs point processes
- Inference After Model Selection
- False Discovery Rate Control via Data Splitting
- Post-selection inference of generalized linear models based on the lasso and the elastic net
- Forward-selected panel data approach for program evaluation
- Selection of mixed copula for association modeling with tied observations
- Exploration of the variability of variable selection based on distances between bootstrap sample results
- Confidently Comparing Estimates with the c-value
- Selective inference for latent block models
- On Hodges' superefficiency and merits of oracle property in model selection
- Simultaneous high-probability bounds on the false discovery proportion in structured, regression and online settings
- Distribution-free predictive inference for regression
- Distributionally robust and generalizable inference
- FANOK: knockoffs in linear time
- Uniformly valid confidence intervals post-model-selection
- Markov Neighborhood Regression for High-Dimensional Inference
- The costs and benefits of uniformly valid causal inference with high-dimensional nuisance parameters
- Asymptotics of selective inference
- Post hoc confidence bounds on false positives using reference families
- An evolutionary estimation procedure for generalized semilinear regression trees
- Penalized estimation of a class of single‐index varying‐coefficient models for integrative genomic analysis
- Bootstrapping and sample splitting for high-dimensional, assumption-lean inference
- Penalized likelihood and multiple testing
- Valid post-selection inference in high-dimensional approximately sparse quantile regression models
- A knockoff filter for high-dimensional selective inference
- Projection-based Inference for High-dimensional Linear Models
- scientific article; zbMATH DE number 7750673
- Exact post-selection inference for the generalized Lasso path
- Inferactive data analysis
- Inference for High-Dimensional Censored Quantile Regression
- Exact post-selection inference for adjusted R squared selection
- Optimal configurations of lines and a statistical application
- scientific article; zbMATH DE number 7750675
- On the post selection inference constant under restricted isometry properties
- Optimal model averaging for divergent-dimensional Poisson regressions
- The Perils of Balance Testing in Experimental Design: Messy Analyses of Clean Data
- Selective inference for additive and linear mixed models
- Ensuring valid inference for Cox hazard ratios after variable selection
- Sparse estimation in semiparametric finite mixture of varying coefficient regression models
- Logistic regression: from art to science
- Targeted Inference Involving High-Dimensional Data Using Nuisance Penalized Regression
- On the length of post-model-selection confidence intervals conditional on polyhedral constraints
- Variable Selection for Global Fréchet Regression
- Post-selection inference for \(\ell_1\)-penalized likelihood models
- Focused model selection for linear mixed models with an application to whale ecology
- High-dimensional statistical inference via DATE
- High-dimensional CLT: improvements, non-uniform extensions and large deviations
- Post-selection inference following aggregate level hypothesis testing in large-scale genomic data
- Post-model-selection inference in linear regression models: an integrated review
- Selective inference with unknown variance via the square-root Lasso
- Trade-off between predictive performance and FDR control for high-dimensional Gaussian model selection
- On the impact of model selection on predictor identification and parameter inference
- Selective inference after likelihood- or test-based model selection in linear models
- Uniformly valid confidence sets based on the Lasso
- Projection-based techniques for high-dimensional optimal transport problems
- Integrative analysis of '-omics' data using penalty functions
- Locally simultaneous inference
- On the least-squares model averaging interval estimator
- Integrative Bayesian Models Using Post-Selective Inference: A Case Study in Radiogenomics
- Empirical likelihood based tests for detecting the presence of significant predictors in marginal quantile regression
- Post-selection inference via algorithmic stability
- Cellwise outlier detection with false discovery rate control
- A structured brain‐wide and genome‐wide association study using ADNI PET images
- Statistical learning and selective inference
- A multi-resolution theory for approximating infinite-\(p\)-zero-\(n\): transitional inference, individualized predictions, and a world without bias-variance tradeoff
- Post-selection estimation and testing following aggregate association tests
- Carving model-free inference
- Approximate Selective Inference via Maximum Likelihood
- Unlucky Number 13? Manipulating Evidence Subject to Snooping
- Markov neighborhood regression for statistical inference of high-dimensional generalized linear models
- Inference after variable selection in linear regression models
- Scalable methods for Bayesian selective inference
- Estimation of selected parameters
- Only closed testing procedures are admissible for controlling false discovery proportions
- Multicarving for high-dimensional post-selection inference
- Conformal Prediction Credibility Intervals
- A comparison of full model specification and backward elimination of potential confounders when estimating marginal and conditional causal effects on binary outcomes from observational data
- Asymptotically Uniform Tests After Consistent Model Selection in the Linear Regression Model
- Efficient interaction selection for clustered data via stagewise generalized estimating equations
- Confounder selection strategies targeting stable treatment effect estimators
- Post-selection point and interval estimation of signal sizes in Gaussian samples
- Conditional predictive inference post model selection
- Models as approximations. II. A model-free theory of parametric regression
- Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
- Uniform asymptotic inference and the bootstrap after model selection
- Uniformly valid inference based on the Lasso in linear mixed models
- Valid post-selection inference in model-free linear regression
- Sparse estimation of Cox proportional hazards models via approximated information criteria
- Comments on “Unobservable Selection and Coefficient Stability: Theory and Evidence” and “Poorly Measured Confounders are More Useful on the Left Than on the Right”
- Controlling False Discovery Rate Using Gaussian Mirrors
- Excess optimism: how biased is the apparent error of an estimator tuned by SURE?
- Score Tests With Incomplete Covariates and High-Dimensional Auxiliary Variables
- Testing for Neglected Nonlinearity Using Regularized Artificial Neural Networks
- Confidence intervals for high-dimensional inverse covariance estimation
This page was built for publication: Valid post-selection inference
MaRDI item Q355109