Feature selection when there are many influential features
From MaRDI portal
Abstract: Recent discussion of the success of feature selection methods has argued that focusing on a relatively small number of features has been counterproductive. Instead, it is suggested, the number of significant features can be in the thousands or tens of thousands, rather than (as is commonly supposed at present) approximately in the range from five to fifty. This orders-of-magnitude change in the number of influential features necessitates alterations both to the way in which we choose features and to the manner in which the success of feature selection is assessed. In this paper, we suggest a general approach suited to cases where the number of relevant features is very large, and we consider particular versions of the approach in detail. We propose ways of measuring performance, and we study both theoretical and numerical properties of the proposed methodology.
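To make the setting concrete, the following is a minimal sketch (not the paper's own procedure) of marginal screening by per-feature two-sample t-type statistics, the kind of baseline that must be rethought when influential features number in the thousands; the function name `marginal_screen` and the fixed threshold are illustrative assumptions.

```python
import numpy as np

def marginal_screen(X, y, threshold=2.0):
    """Keep features whose absolute pooled-variance t-statistic exceeds
    `threshold`. Illustrative only: with thousands of influential
    features, the threshold choice itself becomes the central problem."""
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    n0, n1 = len(X0), len(X1)
    # per-feature difference of class means
    diff = X1.mean(axis=0) - X0.mean(axis=0)
    # pooled variance across the two classes, feature by feature
    pooled_var = ((n0 - 1) * X0.var(axis=0, ddof=1)
                  + (n1 - 1) * X1.var(axis=0, ddof=1)) / (n0 + n1 - 2)
    t = diff / np.sqrt(pooled_var * (1.0 / n0 + 1.0 / n1))
    return np.flatnonzero(np.abs(t) >= threshold)
```

On synthetic data with 50 shifted features among 1000, such a screen recovers the influential block but also admits null features at roughly the nominal rate, which is why the paper's regime calls for data-driven thresholds and different performance measures.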
Recommendations
- 10.1162/153244303322753616
- On selecting interacting features from high-dimensional data
- High-dimensional classification when useful information comes from many, perhaps all features
- Ultrahigh dimensional feature selection: beyond the linear model
- Model selection for classification with a large number of classes
Cites work
- scientific article; zbMATH DE number 720689
- scientific article; zbMATH DE number 1048663
- scientific article; zbMATH DE number 1487502
- scientific article; zbMATH DE number 1485432
- scientific article; zbMATH DE number 845714
- A comparison of the Lasso and marginal regression
- Adapting to unknown sparsity by controlling the false discovery rate
- Atomic Decomposition by Basis Pursuit
- Better Subset Regression Using the Nonnegative Garrote
- Feature selection by higher criticism thresholding achieves the optimal phase diagram
- For most large underdetermined systems of equations, the minimal ℓ1-norm near-solution approximates the sparsest near-solution
- For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution
- Heuristics of instability and stabilization in model selection
- High-dimensional classification using features annealed independence rules
- Higher criticism thresholding: Optimal feature selection when useful features are rare and weak
- Impossibility of successful classification when useful features are rare and weak
- Inference for change point and post change means after a CUSUM test
- Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization
- Pattern classification
- Recovery of Short, Complex Linear Combinations via ℓ1 Minimization
- Some theory for Fisher's linear discriminant function, `naive Bayes', and some alternatives when there are many more variables than observations
- Statistical challenges with high dimensionality: feature selection in knowledge discovery
- Strong approximations of level exceedences related to multiple hypothesis testing
- Sure independence screening in generalized linear models with NP-dimensionality
- The Dantzig selector: statistical estimation when p is much larger than n (with discussions and rejoinder)
- The elements of statistical learning. Data mining, inference, and prediction
- Uncertainty principles and ideal atomic decomposition
- Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
Cited in (11)
- Impossibility of successful classification when useful features are rare and weak
- Effect of heavy tails on ultra high dimensional variable ranking methods
- Extending greedy feature selection algorithms to multiple solutions
- Efficient selection of feature sets possessing high coefficients of determination based on incremental determinations
- Linear hypothesis testing in dense high-dimensional linear models
- scientific article; zbMATH DE number 2090243
- Streamwise feature selection
- Risk of selection of irrelevant features from high-dimensional data with small sample size
- Rare feature selection in high dimensions
- Detection boundary and higher criticism approach for rare and weak genetic effects
- Model selection for classification with a large number of classes