A power analysis for Model-X knockoffs with \(\ell_p\)-regularized statistics
From MaRDI portal
Publication: 6136579
Abstract: Variable selection properties of procedures utilizing penalized-likelihood estimates are a central topic in the study of high-dimensional linear regression problems. The existing literature emphasizes the quality of the ranking of variables produced by such procedures, as reflected in the receiver operating characteristic curve or in prediction performance. Specifically, recent works have harnessed the modern theory of approximate message passing (AMP) to obtain, in a particular setting, exact asymptotic predictions of the type I/type II error tradeoff for selection procedures that rely on \(\ell_p\)-regularized estimators. In practice, effective ranking by itself is often not sufficient, because some calibration of the type I error is required. In this work we study theoretically the power of selection procedures that similarly rank the features by the size of an \(\ell_p\)-regularized estimator, but further use Model-X knockoffs to control the false discovery rate in the realistic situation where no prior information about the signal is available. In analyzing the power of the resulting procedure, we extend existing results in AMP theory to handle the pairing between original variables and their knockoffs; this is used to derive exact asymptotic predictions for power. We apply the general results to compare the power of the knockoff versions of Lasso and thresholded-Lasso selection, and demonstrate that in the i.i.d. covariate setting under consideration, tuning by cross-validation on the augmented design matrix is nearly optimal. We further demonstrate how the techniques also allow one to analyze the type S error, and a corresponding notion of power, when selections are supplemented with a decision on the sign of the coefficient.
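The selection procedure analyzed in the abstract can be sketched concretely. The following is a minimal illustration, not the paper's analysis: it assumes i.i.d. N(0,1) covariates (for which fresh independent Gaussian draws are valid Model-X knockoffs), specializes \(\ell_p\) to the Lasso (\(p=1\)) with the coefficient-difference statistic \(W_j = |\hat b_j| - |\hat b_{j+p}|\), and applies the knockoff+ threshold for FDR control at level \(q\). The solver `lasso_cd` is a bare-bones coordinate-descent routine included only for self-containment.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-coordinate curvature
    r = y.copy()                        # residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]                       # partial residual
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]
    return b

def knockoff_select(X, y, lam=0.1, q=0.2, rng=None):
    """Knockoff+ selection with Lasso coefficient-difference statistics."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    # For i.i.d. N(0,1) features, independent draws are valid knockoffs.
    Xk = rng.standard_normal((n, p))
    b = lasso_cd(np.hstack([X, Xk]), y, lam)
    W = np.abs(b[:p]) - np.abs(b[p:])
    # Knockoff+ threshold: smallest t with (1 + #{W <= -t}) / #{W >= t} <= q.
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return np.flatnonzero(W >= t)
    return np.array([], dtype=int)
```

Augmenting the design with knockoffs before fitting is exactly the setting in which the abstract reports that cross-validated tuning on the augmented matrix is nearly optimal; the paper's AMP machinery tracks the joint behavior of each \((\hat b_j, \hat b_{j+p})\) pair to predict the power of this filter exactly in the high-dimensional limit.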
Recommendations
- The \(k\)th power expectile regression
- Likelihood-based inference for the power regression model
- A high-dimensional power analysis of the conditional randomization test and knockoffs
- A unified approach to power calculation and sample size determination for random regression models
- Improved kth power expectile regression with nonignorable dropouts
- The exact power function of an exact test of a regression model against multiple separate alternatives
- Explicit estimators of an unknown parameter in a power regression problem
- On improvement of statistical estimators in a power regression problem
Cites work
- scientific article; zbMATH DE number 5957408 (no title available)
- A high-dimensional power analysis of the conditional randomization test and knockoffs
- A knockoff filter for high-dimensional selective inference
- A necessary and sufficient condition for exact sparse recovery by \(\ell_1\) minimization
- A power analysis for Model-X knockoffs with \(\ell_p\)-regularized statistics
- Adaptive false discovery rate control under independence and dependence
- Adaptive linear step-up procedures that control the false discovery rate
- Conclusions vs Decisions
- Controlling the false discovery rate via knockoffs
- False discoveries occur early on the Lasso path
- High-dimensional graphs and variable selection with the Lasso
- Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing
- On the sign recovery by least absolute shrinkage and selection operator, thresholded least absolute shrinkage and selection operator, and thresholded basis pursuit denoising
- Overcoming the limitations of phase transition by higher order analysis of regularization techniques
- Panning for Gold: ‘Model-X’ Knockoffs for High Dimensional Controlled Variable Selection
- Rate minimaxity of the Lasso and Dantzig selector for the \(l_{q}\) loss in \(l_{r}\) balls
- SLOPE-adaptive variable selection via convex optimization
- Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using $\ell _{1}$-Constrained Quadratic Programming (Lasso)
- Statistics for high-dimensional data. Methods, theory and applications.
- The Adaptive Lasso and Its Oracle Properties
- The Dynamics of Message Passing on Dense Graphs, with Applications to Compressed Sensing
- The LASSO Risk for Gaussian Matrices
- The adaptive and the thresholded Lasso for potentially misspecified models (and a lower bound for the Lasso)
- Type S error for classical and Bayesian single and multiple comparison procedures
- Which bridge estimator is the best for variable selection?
Cited in (3)
This page was built for publication: A power analysis for Model-X knockoffs with \(\ell_p\)-regularized statistics