Valid post-selection inference
Abstract: It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid ``post-selection inference'' by reducing the problem to one of simultaneous inference and hence suitably widening conventional confidence and retention intervals. Simultaneity is required for all linear functions that arise as coefficient estimates in all submodels. By purchasing ``simultaneity insurance'' for all possible submodels, the resulting post-selection inference is rendered universally valid under all possible model selection procedures. This inference is therefore generally conservative for particular selection procedures, but it is always less conservative than full Scheffé protection. Importantly, it does not depend on the truth of the selected submodel, and hence it produces valid inference even in wrong models. We describe the structure of the simultaneous inference problem and give some asymptotic results.
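The widening described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a known error variance (so normal rather than t-type statistics), a small full-rank design, and brute-force enumeration of all submodels. The function names `posi_directions` and `simulate_constants` are hypothetical. It estimates three interval multipliers from the same random draws: the unadjusted one for a single fixed coefficient, the simultaneous one over all coefficients in all submodels, and the Scheffé one over all linear functions in the column space.

```python
import itertools
import numpy as np


def posi_directions(X):
    """Unit vectors whose inner product with the standardized noise gives the
    t-type statistic of coefficient j in submodel M, for every pair (j, M).
    Vectors are expressed in an orthonormal basis of the column space of X."""
    n, p = X.shape
    _, R = np.linalg.qr(X)  # X = Q R, so column j of R holds the coordinates of X_j
    rows = []
    for k in range(1, p + 1):
        for M in itertools.combinations(range(p), k):
            G = np.linalg.pinv(R[:, list(M)])  # row i yields the i-th coefficient estimate in M
            G = G / np.linalg.norm(G, axis=1, keepdims=True)
            rows.append(G)
    return np.vstack(rows)


def simulate_constants(X, alpha=0.05, n_draws=20000, seed=0):
    """Monte Carlo estimates of the (unadjusted, simultaneous, Scheffe) multipliers,
    assuming a known error variance (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    L = posi_directions(X)
    Z = rng.standard_normal((n_draws, X.shape[1]))  # noise projected on col(X)
    t_all = np.abs(Z @ L.T)
    k_unadjusted = np.quantile(t_all[:, 0], 1 - alpha)      # one fixed coefficient
    k_simultaneous = np.quantile(t_all.max(axis=1), 1 - alpha)  # max over all (j, M)
    k_scheffe = np.quantile(np.linalg.norm(Z, axis=1), 1 - alpha)  # all directions
    return k_unadjusted, k_simultaneous, k_scheffe


if __name__ == "__main__":
    X = np.random.default_rng(1).standard_normal((50, 5))
    print(simulate_constants(X))
```

Because every direction is a unit vector, each draw satisfies the ordering single statistic ≤ maximum over submodels ≤ norm of the draw, so the three estimated multipliers are ordered the same way. This mirrors the abstract's claim that the simultaneous intervals are wider than conventional ones but never wider than Scheffé intervals. Enumerating all submodels costs on the order of p·2^(p-1) directions, so this brute-force approach is only practical for small p.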
Recommendations
- Exact post-selection inference, with application to the Lasso
- Valid post-selection inference in model-free linear regression
- MODEL SELECTION AND INFERENCE: FACTS AND FICTION
- Selective inference after likelihood- or test-based model selection in linear models
- Conditional predictive inference post model selection
Cites work
- scientific article; zbMATH DE number 3141625
- scientific article; zbMATH DE number 4100415
- scientific article; zbMATH DE number 46309
- A Note on Quantiles in Large Samples
- Asymptotic properties of maximum likelihood estimators based on conditional specification
- CAN ONE ESTIMATE THE UNCONDITIONAL DISTRIBUTION OF POST-MODEL-SELECTION ESTIMATORS?
- Can one estimate the conditional distribution of post-model-selection estimators?
- Confidence sets based on penalized maximum likelihood estimators in Gaussian regression
- Distributional results for thresholding estimators in high-dimensional Gaussian regression models
- Frequentist Model Average Estimators
- MODEL SELECTION AND INFERENCE: FACTS AND FICTION
- Mostly harmless econometrics. An empiricist's companion.
- Note on a Conditional Property of Student's $t$
- On model uncertainty and its statistical implications. Proceedings of a workshop, held in Groningen, Netherlands, September 25-26, 1986
- On preliminary test and shrinkage M-estimation in linear models
- On the Large-Sample Minimal Coverage Probability of Confidence Intervals After Model Selection
- On the distribution of penalized maximum likelihood estimators: the LASSO, SCAD, and thresholding
- On the distribution of the adaptive LASSO estimator
- PERFORMANCE LIMITS FOR ESTIMATORS OF THE RISK OR DISTRIBUTION OF SHRINKAGE-TYPE ESTIMATORS, AND SOME GENERAL LOWER RISK-BOUND RESULTS
- Random Packings and Coverings of the Unit n-Sphere
- Sparse estimators and the oracle property, or the return of Hodges' estimator
- THE FINITE-SAMPLE DISTRIBUTION OF POST-MODEL-SELECTION ESTIMATORS AND UNIFORM VERSUS NONUNIFORM APPROXIMATIONS
- The Conditional Level of Student's $t$ Test
- The Conditional Level of the F-Test
- The Focused Information Criterion
- The distribution of a linear predictor after model selection: unconditional finite-sample distributions and asymptotic approximations
- The distribution of model averaging estimators and an impossibility result regarding its estimation
- Valid post-selection inference
Cited in
(only showing first 100 items)
- Rejoinder on: ``Hierarchical inference for genome-wide association studies: a view on methodology with software''
- Only closed testing procedures are admissible for controlling false discovery proportions
- In defense of the indefensible: a very naïve approach to high-dimensional inference
- Multicarving for high-dimensional post-selection inference
- Log-linear Bayesian additive regression trees for multinomial logistic and count regression models
- Optimal configurations of lines and a statistical application
- Bayesian Inference Is Unaffected by Selection: Fact or Fiction?
- Targeted Inference Involving High-Dimensional Data Using Nuisance Penalized Regression
- Assumption Lean Regression
- Excess optimism: how biased is the apparent error of an estimator tuned by SURE?
- On the least-squares model averaging interval estimator
- Score Tests With Incomplete Covariates and High-Dimensional Auxiliary Variables
- Conditional selective inference for robust regression and outlier detection using piecewise-linear homotopy continuation
- Regularized projection score estimation of treatment effects in high-dimensional quantile regression
- Larry Brown's contributions to parametric inference, decision theory and foundations: a survey
- Statistical theory powering data science
- Inferactive data analysis
- Spatially relaxed inference on high-dimensional linear models
- Post-model-selection inference in linear regression models: an integrated review
- Frequentist model averaging in structural equation modelling
- Post-selection inference via algorithmic stability
- Selection of mixed copula for association modeling with tied observations
- A multi-resolution theory for approximating infinite-\(p\)-zero-\(n\): transitional inference, individualized predictions, and a world without bias-variance tradeoff
- The costs and benefits of uniformly valid causal inference with high-dimensional nuisance parameters
- Exploration of the variability of variable selection based on distances between bootstrap sample results
- Penalized likelihood and multiple testing
- Selective inference after feature selection via multiscale bootstrap
- scientific article; zbMATH DE number 7750675
- Informative goodness-of-fit for multivariate distributions
- Forward-selected panel data approach for program evaluation
- Inference for \(L_2\)-boosting
- Confidence intervals for parameters in high-dimensional sparse vector autoregression
- Some perspectives on inference in high dimensions
- On the impact of model selection on predictor identification and parameter inference
- Selection-corrected statistical inference for region detection with high-throughput assays
- Post hoc confidence bounds on false positives using reference families
- Constraints versus priors
- Models as approximations. I. Consequences illustrated with linear regression
- Robust Q-learning
- FANOK: knockoffs in linear time
- On the length of post-model-selection confidence intervals conditional on polyhedral constraints
- False Discovery Rate Control via Data Splitting
- On Hodges' superefficiency and merits of oracle property in model selection
- Optimal finite sample post-selection confidence distributions in generalized linear models
- A nonparametric sequential learning procedure for estimating the pure premium
- Neighborhood-based cross fitting approach to treatment effects with high-dimensional data
- Principled statistical inference in data science
- scientific article; zbMATH DE number 7626707
- Bayesian semiparametric functional mixed models for serially correlated functional data, with application to glaucoma data
- Statistical proof? The problem of irreproducibility
- Post-selection inference following aggregate level hypothesis testing in large-scale genomic data
- SLOPE-adaptive variable selection via convex optimization
- Exploratory inference: localizing relevant effects with confidence
- On asymptotically optimal confidence regions and tests for high-dimensional models
- Asymptotics of selective inference
- Uniform asymptotic inference and the bootstrap after model selection
- Valid post-selection inference in model-free linear regression
- A bootstrap Lasso+partial ridge method to construct confidence intervals for parameters in high-dimensional sparse linear models
- Uniformly valid confidence sets based on the Lasso
- Confidence sets based on thresholding estimators in high-dimensional Gaussian regression models
- Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
- Robust inference on average treatment effects with possibly more covariates than observations
- MODEL SELECTION AND INFERENCE: FACTS AND FICTION
- Scalable methods for Bayesian selective inference
- Estimation of selected parameters
- Statistical learning and selective inference
- Selective inference for additive and linear mixed models
- Rejoinder on: ``High-dimensional simultaneous inference with the bootstrap''
- Inference for High-Dimensional Censored Quantile Regression
- On the post selection inference constant under restricted isometry properties
- Mixed-effect models with trees
- Uniformly valid inference based on the Lasso in linear mixed models
- A knockoff filter for high-dimensional selective inference
- An automated approach towards sparse single-equation cointegration modelling
- Inference for low‐ and high‐dimensional inhomogeneous Gibbs point processes
- Models as approximations. II. A model-free theory of parametric regression
- Post-selection inference for \(\ell_1\)-penalized likelihood models
- Conditional predictive inference post model selection
- Distribution-free predictive inference for regression
- Valid post-selection inference
- Post-selection point and interval estimation of signal sizes in Gaussian samples
- High-dimensional inference: confidence intervals, \(p\)-values and R-software \texttt{hdi}
- Rates of convergence of the adaptive LASSO estimators to the oracle distribution and higher order refinements by the bootstrap
- Lasso Inference for High-Dimensional Time Series
- Inference after variable selection in linear regression models
- Sparse estimation of Cox proportional hazards models via approximated information criteria
- Testing for Neglected Nonlinearity Using Regularized Artificial Neural Networks
- On various confidence intervals post-model-selection
- Controlling the false discovery rate via knockoffs
- A simulation based method for assessing the statistical significance of logistic regression models after common variable selection procedures
- Weighted-average least squares estimation of generalized linear models
- The robust desparsified lasso and the focused information criterion for high-dimensional generalized linear models
- Asymptotically honest confidence regions for high dimensional parameters by the desparsified conservative Lasso
- Likelihood ratio test in multivariate linear regression: from low to high dimension
- Exact post-selection inference for the generalized Lasso path
- Valid post-selection inference in high-dimensional approximately sparse quantile regression models
- scientific article; zbMATH DE number 7750673
- Spatial variable selection and an application to Virginia Lyme disease emergence
- Selective inference for latent block models
- Least-Square Approximation for a Distributed System