Hypothesis Testing in High-Dimensional Regression Under the Gaussian Random Design Model: Asymptotic Theory
From MaRDI portal
Publication:2986116
Abstract: We consider linear regression in the high-dimensional regime where the number of observations \(n\) is smaller than the number of parameters \(p\). A very successful approach in this setting uses \(\ell_1\)-penalized least squares (a.k.a. the Lasso) to search for a subset of parameters that best explain the data, while setting the other parameters to zero. A considerable amount of work has been devoted to characterizing the estimation and model selection problems within this approach. In this paper we consider instead the fundamental, but far less understood, question of \emph{statistical significance}. More precisely, we address the problem of computing p-values for single regression coefficients. On one hand, we develop a general upper bound on the minimax power of tests with a given significance level. On the other, we prove that this upper bound is (nearly) achievable through a practical procedure in the case of random design matrices with independent entries. Our approach is based on a debiasing of the Lasso estimator. The analysis builds on a rigorous characterization of the asymptotic distribution of the Lasso estimator and its debiased version. Our results hold for optimal sample size, i.e., when \(n\) is at least on the order of \(s_0 \log p\), with \(s_0\) the number of nonzero coefficients. We generalize our approach to random design matrices with i.i.d. Gaussian rows \(x_i \sim \mathsf{N}(0,\Sigma)\). In this case we prove that a similar distributional characterization (termed `standard distributional limit') holds for \(n\) much larger than \(s_0 (\log p)^2\). Finally, we show that for optimal sample size, \(n\) being at least of order \(s_0 \log p\), the standard distributional limit for general Gaussian designs can be derived from the replica heuristics in statistical physics.
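The debiasing idea described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the paper's exact procedure: it assumes a design matrix with i.i.d. standard Gaussian entries (so the population precision matrix is the identity and the debiasing matrix can be taken as \(M = I\)), a simple plug-in residual estimate of the noise level, and hypothetical parameter choices (`n`, `p`, `s0`, `sigma`, the tuning parameter `lam`) chosen only for demonstration.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, s0 = 400, 600, 10    # high-dimensional regime: n < p, s0 nonzero coefficients
sigma = 0.5                # noise standard deviation

# Design with i.i.d. standard Gaussian entries -- the setting in which
# the identity debiasing matrix M = I is adequate.
X = rng.standard_normal((n, p))
theta = np.zeros(p)
theta[:s0] = 2.0
y = X @ theta + sigma * rng.standard_normal(n)

# Lasso with the usual tuning rate lambda ~ sigma * sqrt(2 log p / n).
# sklearn minimizes (1/2n)||y - X theta||^2 + alpha ||theta||_1.
lam = sigma * np.sqrt(2 * np.log(p) / n)
theta_hat = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_

# Debiasing step: add back the residual correlation, theta_hat + X^T r / n.
# This removes the shrinkage bias of the Lasso coordinate-wise.
theta_d = theta_hat + X.T @ (y - X @ theta_hat) / n

# Plug-in noise estimate, z-scores, and two-sided p-values per coefficient:
# asymptotically, sqrt(n) * (theta_d - theta) / sigma is approximately N(0, 1).
sigma_hat = np.linalg.norm(y - X @ theta_hat) / np.sqrt(n)
z = np.sqrt(n) * theta_d / sigma_hat
pvals = 2 * stats.norm.sf(np.abs(z))
```

In this synthetic run the p-values for the \(s_0\) active coefficients are essentially zero, while those for the null coefficients are roughly uniform on \([0,1]\), which is what makes significance testing at a prescribed level possible.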
Cited in (56)
- SLOPE-adaptive variable selection via convex optimization
- Asymptotic normality of robust M-estimators with convex penalty
- LASSO risk and phase transition under dependence
- Generalized matrix decomposition regression: estimation and inference for two-way structured data
- Asymptotic risk and phase transition of \(l_1\)-penalized robust estimator
- Discussion: ``A significance test for the lasso''
- Discussion: ``A significance test for the lasso''
- Discussion: ``A significance test for the lasso''
- Discussion: ``A significance test for the lasso''
- Large-Scale Two-Sample Comparison of Support Sets
- Online Debiasing for Adaptively Collected High-Dimensional Data With Applications to Time Series Analysis
- Ill-posed estimation in high-dimensional models with instrumental variables
- A significance test for the lasso
- On the asymptotic variance of the debiased Lasso
- Uniformly valid post-regularization confidence regions for many functional parameters in z-estimation framework
- Debiasing the Lasso: optimal sample size for Gaussian designs
- Linear hypothesis testing in dense high-dimensional linear models
- Statistical Inference, Learning and Models in Big Data
- Online rules for control of false discovery rate and false discovery exceedance
- Discussion: ``A significance test for the lasso''
- Constructing confidence intervals for the signals in sparse phase retrieval
- Post-model-selection inference in linear regression models: an integrated review
- Efficient estimation of smooth functionals in Gaussian shift models
- Universality of regularized regression estimators in high dimensions
- Estimating structured high-dimensional covariance and precision matrices: optimal rates and adaptive estimation
- Asymptotically efficient estimation of smooth functionals of covariance operators
- Statistical Inference for High-Dimensional Generalized Linear Models With Binary Outcomes
- StarTrek: combinatorial variable selection with false discovery rate control
- One-step regularized estimator for high-dimensional regression models
- A unifying framework of high-dimensional sparse estimation with difference-of-convex (DC) regularizations
- Gene set priorization guided by regulatory networks with p-values through kernel mixed model
- The distribution of the Lasso: uniform control over sparse balls and adaptive parameter tuning
- Asymptotic normality and optimalities in estimation of large Gaussian graphical models
- Gaussian graphical model estimation with false discovery rate control
- Rejoinder: ``A significance test for the lasso''
- In defense of the indefensible: a very naïve approach to high-dimensional inference
- On asymptotically optimal confidence regions and tests for high-dimensional models
- The benefit of group sparsity in group inference with de-biased scaled group Lasso
- Asymptotically honest confidence regions for high dimensional parameters by the desparsified conservative Lasso
- Scalable inference for high-dimensional precision matrix
- Enmsp: an elastic-net multi-step screening procedure for high-dimensional regression
- Inference for high-dimensional varying-coefficient quantile regression
- Worst possible sub-directions in high-dimensional models
- Discussion: ``A significance test for the lasso''
- Debiasing convex regularized estimators and interval estimation in linear models
- Detangling robustness in high dimensions: composite versus model-averaged estimation
- Flexible and Interpretable Models for Survival Data
- Lasso-driven inference in time and space
- De-biasing the Lasso with degrees-of-freedom adjustment
- Significance testing in non-sparse high-dimensional linear models
- Global and Simultaneous Hypothesis Testing for High-Dimensional Logistic Regression Models
- Generalized M-estimators for high-dimensional Tobit I models
- Additive model selection
- Semiparametric efficiency bounds for high-dimensional models
- The Lasso with general Gaussian designs with applications to hypothesis testing
- Semi-analytic resampling in Lasso