Feature selection by higher criticism thresholding achieves the optimal phase diagram
Publication:3559955
DOI: 10.1098/rsta.2009.0129; zbMath: 1185.62113; arXiv: 0812.2263; OpenAlex: W3100205528; Wikidata: Q51787285; Scholia: Q51787285; MaRDI QID: Q3559955
Publication date: 8 May 2010
Published in: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
Full work available at URL: https://arxiv.org/abs/0812.2263
phase diagram, false discovery rate, linear classification, asymptotic rare/weak model, feature selection by thresholding, Fisher's separation measure
Classification and discrimination; cluster analysis (statistical aspects) (62H30)
Order statistics; empirical distribution functions (62G30)
Related Items
- Goodness of fit tests in terms of local levels with special emphasis on higher criticism tests
- Adaptive threshold-based classification of sparse high-dimensional data
- Higher criticism to compare two large frequency tables, with sensitivity to possible rare and weak differences
- High dimensional classifiers in the imbalanced case
- Sparse microwave imaging: principles and applications
- The intermediates take it all: Asymptotics of higher criticism statistics and a powerful alternative based on equal local levels
- Feature selection when there are many influential features
- Signal localization: a new approach in signal discovery
- Identifying the support of rectangular signals in Gaussian noise
- Estimating the amount of sparsity in two-point mixture models
- Detection boundary in sparse regression
- Optimal classification in sparse Gaussian graphic model
- Asymptotics of goodness-of-fit tests based on minimum p-value statistics
- Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing
- Classification of sparse high-dimensional vectors
- Fast rate of convergence in high-dimensional linear discriminant analysis
- Classification with many classes: challenges and pluses
- Innovated higher criticism for detecting sparse signals in correlated noise
- Two-group classification with high-dimensional correlated data: a factor model approach
- Higher Criticism for Discriminating Word-Frequency Tables and Testing Authorship
- Goodness-of-Fit Tests Based on Sup-Functionals of Weighted Empirical Processes
- Tight conditions for consistency of variable selection in the context of high dimensionality
- Optimal Detection of Heterogeneous and Heteroscedastic Mixtures
- Signal detection via Phi-divergences for general mixtures
- Higher criticism for large-scale inference, especially for rare and weak effects
- Using visual statistical inference to better understand random class separations in high dimension, low sample size data
Cites Work
- High-dimensional classification using features annealed independence rules
- Some theory for Fisher's linear discriminant function, `naive Bayes', and some alternatives when there are many more variables than observations
- Higher criticism for detecting sparse heterogeneous mixtures
- Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences
- Asymptotic minimaxity of false discovery rate thresholding for sparse exponential data
- Goodness-of-fit tests via phi-divergences
- Estimation and confidence sets for sparse normal mixtures
- Properties of higher criticism under strong dependence
- Adapting to unknown sparsity by controlling the false discovery rate
- Higher criticism thresholding: Optimal feature selection when useful features are rare and weak
- Impossibility of successful classification when useful features are rare and weak
- Classification of sparse high-dimensional vectors