Derandomizing Knockoffs
From MaRDI portal
Abstract: Model-X knockoffs is a general procedure that can leverage any feature importance measure to produce a variable selection algorithm, which discovers true effects while rigorously controlling the number or fraction of false positives. Model-X knockoffs is a randomized procedure which relies on the one-time construction of synthetic (random) variables. This paper introduces a derandomization method by aggregating the selection results across multiple runs of the knockoffs algorithm. The derandomization step is designed to be flexible and can be adapted to any variable selection base procedure to yield stable decisions without compromising statistical power. When applied to the base procedure of Janson et al. (2016), we prove that derandomized knockoffs controls both the per family error rate (PFER) and the k family-wise error rate (k-FWER). Further, we carry out extensive numerical studies demonstrating tight type-I error control and markedly enhanced power when compared with alternative variable selection algorithms. Finally, we apply our approach to multi-stage genome-wide association studies of prostate cancer and report locations on the genome that are significantly associated with the disease. When cross-referenced with other studies, we find that the reported associations have been replicated.
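The aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes that results are combined by selection frequency: the base procedure is run several times and a feature is kept if it is selected in at least a fraction `eta` of the runs. The names `aggregate_runs`, `toy_base`, `n_runs`, and `eta` are hypothetical and not taken from the paper. `toy_base` is a placeholder that only mimics a randomized selector; it is not a knockoffs implementation. The PFER and k-FWER guarantees mentioned in the abstract depend on the specific base procedure and parameter choices analyzed in the paper, which this sketch does not reproduce.

```python
import random
from collections import Counter
from typing import Callable, Dict, Set, Tuple


def aggregate_runs(
    run_base: Callable[[random.Random], Set[int]],
    n_runs: int = 30,
    eta: float = 0.5,
    seed: int = 0,
) -> Tuple[Set[int], Dict[int, float]]:
    """Run a randomized selection procedure n_runs times and keep the
    features whose selection frequency is at least eta.

    Assumption: run_base(rng) returns the set of selected feature indices
    for one randomized run (hypothetical interface).
    """
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    if not 0 < eta <= 1:
        raise ValueError("eta must lie in (0, 1]")

    rng = random.Random(seed)
    counts: Counter = Counter()
    for _ in range(n_runs):
        # set() guards against a base procedure that returns duplicates
        counts.update(set(run_base(rng)))

    # Fraction of runs in which each feature was selected
    freq = {j: c / n_runs for j, c in counts.items()}
    selected = {j for j, f in freq.items() if f >= eta}
    return selected, freq


def toy_base(rng: random.Random) -> Set[int]:
    """Placeholder selector, NOT knockoffs: features 0 and 1 are always
    selected; each of features 2..9 is selected with probability 0.1."""
    picks = {0, 1}
    picks.update(j for j in range(2, 10) if rng.random() < 0.1)
    return picks


selected, freq = aggregate_runs(toy_base, n_runs=50, eta=0.5, seed=1)
print(sorted(selected))
```

In this toy setting, features that appear in every run are retained, while features that appear only sporadically fall below the threshold, which is the stabilizing effect the abstract refers to.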
Recommendations
- Pseudorandomness when the odds are against you
- Pseudorandom Generators and Typically-Correct Derandomization
- Randomizing without randomness
- scientific article; zbMATH DE number 2156272
- scientific article; zbMATH DE number 1820025
- Some results on derandomization
- scientific article; zbMATH DE number 1962815
- Pseudorandomness
- Pseudorandomness
- scientific article; zbMATH DE number 1670863
Cites work
- scientific article; zbMATH DE number 3624650
- 10.1162/153244303321897735
- A knockoff filter for high-dimensional selective inference
- A sharper Bonferroni procedure for multiple tests of significance
- Analyzing bagging
- Bagging predictors
- Balanced control of generalized error rates
- Classes of orderings of measures and related correlation inequalities. I. Multivariate totally positive distributions
- Consensus clustering: A resampling-based method for class discovery and visualization of gene expression microarray data
- Controlling the false discovery rate via knockoffs
- Discussion of "Multiple testing for exploratory research" by J. J. Goeman and A. Solari
- Ensemble machine learning. Methods and applications
- Exact post-selection inference, with application to the Lasso
- Familywise error rate control via knockoffs
- Further results on controlling the false discovery proportion
- Gene hunting with hidden Markov model knockoffs
- Graph estimation with joint additive models
- High-dimensional variable selection
- Inference on treatment effects after selection among high-dimensional controls
- Knockoffs with side information
- Multiple Comparisons Among Means
- Panning for Gold: ‘Model-X’ Knockoffs for High Dimensional Controlled Variable Selection
- Powerful knockoffs via minimizing reconstructability
- Random forests
- Selective inference with unknown variance via the square-root Lasso
- Stability selection. With discussion and authors' reply
- Sure independence screening for ultrahigh dimensional feature space. With discussion and authors' reply
- The control of the false discovery rate in multiple testing under dependency
- Using iterated bagging to debias regressions
- Variable Selection with Error Control: Another Look at Stability Selection
Cited in (10)
- False Discovery Rate Control via Data Splitting
- A Scale-Free Approach for False Discovery Rate Control in Generalized Linear Models
- Revisiting feature selection for linear models with FDR and power guarantees
- ARK: robust knockoffs inference with coupling
- Robust inference with knockoffs
- Variable selection in latent variable models via knockoffs: an application to international large-scale assessment in education
- Adaptive Selection for False Discovery Rate Control Leveraging Symmetry
- Overview of research advance for knockoff methods
- Inference for the Dimension of a Regression Relationship Using Pseudo-Covariates
- Kernel Knockoffs Selection for Nonparametric Additive Models