Feature selection when there are many influential features
From MaRDI portal
Abstract: Recent discussion of the success of feature selection methods has argued that focusing on a relatively small number of features has been counterproductive. Instead, it is suggested, the number of significant features can be in the thousands or tens of thousands, rather than (as is commonly supposed at present) approximately in the range from five to fifty. This orders-of-magnitude change in the number of influential features necessitates alterations both to the way in which we choose features and to the manner in which the success of feature selection is assessed. In this paper, we suggest a general approach suited to cases where the number of relevant features is very large, and we consider particular versions of the approach in detail. We propose ways of measuring performance, and we study both theoretical and numerical properties of the proposed methodology.
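To make the setting concrete, the following is a minimal sketch (not the paper's own procedure) of marginal screening by per-feature two-sample t-type statistics, the kind of baseline that must be rethought when influential features number in the thousands; the function name `marginal_screen` and the fixed threshold are illustrative assumptions.

```python
import numpy as np

def marginal_screen(X, y, threshold=2.0):
    """Keep features whose absolute pooled-variance t-statistic exceeds
    `threshold`. Illustrative only: with thousands of influential
    features, the threshold choice itself becomes the central problem."""
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    n0, n1 = len(X0), len(X1)
    # per-feature difference of class means
    diff = X1.mean(axis=0) - X0.mean(axis=0)
    # pooled variance across the two classes, feature by feature
    pooled_var = ((n0 - 1) * X0.var(axis=0, ddof=1)
                  + (n1 - 1) * X1.var(axis=0, ddof=1)) / (n0 + n1 - 2)
    t = diff / np.sqrt(pooled_var * (1.0 / n0 + 1.0 / n1))
    return np.flatnonzero(np.abs(t) >= threshold)
```

On synthetic data with 50 shifted features among 1000, such a screen recovers the influential block but also admits null features at roughly the nominal rate, which is why the paper's regime calls for data-driven thresholds and different performance measures.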
Recommendations
- 10.1162/153244303322753616
- On selecting interacting features from high-dimensional data
- High-dimensional classification when useful information comes from many, perhaps all features
- Ultrahigh dimensional feature selection: beyond the linear model
- Model selection for classification with a large number of classes
Cites work
- scientific article; zbMATH DE number 720689
- scientific article; zbMATH DE number 1048663
- scientific article; zbMATH DE number 1487502
- scientific article; zbMATH DE number 1485432
- scientific article; zbMATH DE number 845714
- A comparison of the Lasso and marginal regression
- Adapting to unknown sparsity by controlling the false discovery rate
- Atomic Decomposition by Basis Pursuit
- Better Subset Regression Using the Nonnegative Garrote
- Feature selection by higher criticism thresholding achieves the optimal phase diagram
- For most large underdetermined systems of equations, the minimal ℓ1-norm near-solution approximates the sparsest near-solution
- For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution
- Heuristics of instability and stabilization in model selection
- High-dimensional classification using features annealed independence rules
- Higher criticism thresholding: Optimal feature selection when useful features are rare and weak
- Impossibility of successful classification when useful features are rare and weak
- Inference for change point and post change means after a CUSUM test
- Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization
- Pattern classification
- Recovery of Short, Complex Linear Combinations via ℓ1 Minimization
- Some theory for Fisher's linear discriminant function, `naive Bayes', and some alternatives when there are many more variables than observations
- Statistical challenges with high dimensionality: feature selection in knowledge discovery
- Strong approximations of level exceedences related to multiple hypothesis testing
- Sure independence screening in generalized linear models with NP-dimensionality
- The Dantzig selector: statistical estimation when p is much larger than n (with discussions and rejoinder)
- The elements of statistical learning. Data mining, inference, and prediction
- Uncertainty principles and ideal atomic decomposition
- Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
Cited in (11)
- Impossibility of successful classification when useful features are rare and weak
- Effect of heavy tails on ultra high dimensional variable ranking methods
- Extending greedy feature selection algorithms to multiple solutions
- Efficient selection of feature sets possessing high coefficients of determination based on incremental determinations
- Linear hypothesis testing in dense high-dimensional linear models
- scientific article; zbMATH DE number 2090243
- Streamwise feature selection
- Risk of selection of irrelevant features from high-dimensional data with small sample size
- Rare feature selection in high dimensions
- Detection boundary and higher criticism approach for rare and weak genetic effects
- Model selection for classification with a large number of classes