Robust Variable and Interaction Selection for Logistic Regression and General Index Models
From MaRDI portal
Publication:5229910
Abstract: We propose Stepwise cOnditional likelihood variable selection for Discriminant Analysis (SODA) to detect both main and quadratic interaction effects in logistic regression and quadratic discriminant analysis (QDA) models. In the forward stage, SODA adds important predictors evaluated by their overall contributions; in the backward stage, it removes unimportant terms so as to optimize the extended Bayesian Information Criterion (EBIC). Compared with existing methods for QDA variable selection, SODA can handle high-dimensional data with the number of predictors much larger than the sample size and does not require joint normality of the predictors, which substantially enhances its robustness. We further extend SODA to conduct variable selection and model fitting for multiple index models. Compared with existing variable selection methods based on Sliced Inverse Regression (SIR) (Li 1991), SODA requires neither the linearity nor the constant-variance condition and is much more robust. Our theoretical analyses establish the variable-selection consistency of SODA in high-dimensional settings, and our simulation studies as well as real-data applications demonstrate the superior performance of SODA with non-Gaussian design matrices in both classification problems and multiple index models.
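The forward-addition/backward-elimination idea described in the abstract can be illustrated with a minimal sketch: greedily add main-effect and pairwise-interaction terms to a logistic model, then prune them, with both stages driven by the EBIC. This is not the authors' SODA implementation; the function names, the candidate set (main effects plus all pairwise products), and the EBIC form with parameter `gamma` are illustrative assumptions.

```python
# Hypothetical sketch of EBIC-driven forward-backward selection for a
# logistic model with main-effect and pairwise-interaction terms.
# NOT the authors' SODA code; gamma and the candidate set are assumptions.
import itertools
import numpy as np
from sklearn.linear_model import LogisticRegression

def design(X, terms):
    """Build a design matrix: int -> main effect, (i, j) -> product term."""
    cols = [X[:, t] if isinstance(t, int) else X[:, t[0]] * X[:, t[1]]
            for t in terms]
    return np.column_stack(cols) if cols else np.empty((X.shape[0], 0))

def ebic_score(X, y, terms, n_candidates, gamma=0.5):
    """EBIC = -2 loglik + df log(n) + 2 gamma df log(P)."""
    n = X.shape[0]
    if terms:
        Z = design(X, terms)
        # A large C approximates unpenalized maximum likelihood.
        p = LogisticRegression(C=1e6, max_iter=2000).fit(Z, y).predict_proba(Z)[:, 1]
    else:
        p = np.full(n, y.mean())  # intercept-only model
    eps = 1e-12
    loglik = np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    df = len(terms) + 1  # selected terms plus intercept
    return -2 * loglik + df * np.log(n) + 2 * gamma * df * np.log(n_candidates)

def stepwise_ebic(X, y, gamma=0.5):
    d = X.shape[1]
    candidates = list(range(d)) + list(itertools.combinations(range(d), 2))
    P = len(candidates)
    selected = []
    best = ebic_score(X, y, selected, P, gamma)
    # Forward stage: repeatedly add the single term that most lowers EBIC.
    while True:
        trial = min((t for t in candidates if t not in selected),
                    key=lambda t: ebic_score(X, y, selected + [t], P, gamma),
                    default=None)
        if trial is None:
            break
        s = ebic_score(X, y, selected + [trial], P, gamma)
        if s >= best:
            break
        selected.append(trial)
        best = s
    # Backward stage: drop any term whose removal lowers EBIC.
    while True:
        trial = min(selected,
                    key=lambda t: ebic_score(X, y, [u for u in selected if u != t],
                                             P, gamma),
                    default=None)
        if trial is None:
            break
        s = ebic_score(X, y, [u for u in selected if u != trial], P, gamma)
        if s >= best:
            break
        selected.remove(trial)
        best = s
    return selected
```

On simulated data whose log-odds depend on a main effect and an interaction (e.g. `2*x0 + 2*x0*x1`), the sketch recovers both the main-effect index `0` and the interaction pair `(0, 1)`; the extra `2*gamma*df*log(P)` penalty is what lets EBIC control false selections when the candidate set is much larger than the sample size.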
Recommendations
- Variable selection for general index models via sliced inverse regression
- Robust variable selection in the logistic regression model
- Robust direction identification and variable selection in high dimensional general single-index models
- scientific article; zbMATH DE number 6468204
- Model-Free Variable Selection
Cites work
- scientific article; zbMATH DE number 5957408
- scientific article; zbMATH DE number 3136275
- scientific article; zbMATH DE number 845714
- A Model Selection Approach for the Identification of Quantitative Trait Loci in Experimental Crosses
- A Permutation Approach to Testing Interactions for Binary Response by Comparing Correlations Between Classes
- A direct approach to sparse discriminant analysis in ultra-high dimensions
- A direct estimation approach to sparse linear discriminant analysis
- A lasso for hierarchical interactions
- CODA: high dimensional copula discriminant analysis
- Correlation Pursuit: Forward Stepwise Variable Selection for Index Models
- Empirical Bayes estimates for large-scale prediction problems
- Estimating the dimension of a model
- Extended BIC for small-\(n\)-large-\(P\) sparse GLM
- Extended Bayesian information criteria for model selection with large model spaces
- Feature screening via distance correlation learning
- Fisher lecture: Dimension reduction in regression
- Forward regression for ultra-high dimensional variable screening
- High-dimensional Ising model selection using \(\ell _{1}\)-regularized logistic regression
- High-dimensional classification using features annealed independence rules
- High-dimensional variable selection
- Innovated interaction screening for high-dimensional nonlinear classification
- Large-scale inference. Empirical Bayes methods for estimation, testing, and prediction
- Model-Free Variable Selection
- On BIC's selection consistency for discriminant analysis
- On consistency and sparsity for sliced inverse regression in high dimensions
- On model selection consistency of the elastic net when \(p \gg n\)
- Optimal classification in sparse Gaussian graphic model
- Penalized classification using Fisher's linear discriminant
- Random forests
- Regularization and Variable Selection Via the Elastic Net
- Regularized linear discriminant analysis and its application in microarrays
- Sliced Inverse Regression for Dimension Reduction
- Sparse linear discriminant analysis by thresholding for high dimensional data
- Sparse sufficient dimension reduction
- The sliced inverse regression algorithm as a maximum likelihood procedure
- Variable selection and updating in model-based discriminant analysis for high dimensional data with food authenticity applications
- Variable selection for general index models via sliced inverse regression
- Variable selection in model-based discriminant analysis
Cited in (12)
- Structure learning via unstructured kernel-based M-estimation
- Robust variable selection in the logistic regression model
- Sparse Learning and Structure Identification for Ultrahigh-Dimensional Image-on-Scalar Regression
- scientific article; zbMATH DE number 7370562
- An efficient model-free approach to interaction screening for high dimensional data
- Interaction screening for high-dimensional heterogeneous data via robust hybrid metrics
- RaSE: A Variable Screening Framework via Random Subspace Ensembles
- Unified model-free interaction screening via CV-entropy filter
- The Kendall interaction filter for variable interaction screening in high dimensional classification problems
- Robust variable selection with application to quality of life research
- Interaction screening via Kendall's rank correlation for imbalanced multi-class classification
- BOLT-SSI: A Statistical Approach to Screening Interaction Effects for Ultra-High Dimensional Data