Projective inference in high-dimensional problems: prediction and feature selection
From MaRDI portal
Publication:2188473
Abstract: This paper discusses predictive inference and feature selection for generalized linear models with scarce but high-dimensional data. We argue that in many cases one can benefit from a decision-theoretically justified two-stage approach: first, construct a possibly non-sparse model that predicts well, and then find a minimal subset of features that characterize those predictions. The model built in the first step is referred to as the "reference model" and the operation during the latter step as predictive "projection". The key characteristic of this approach is that it finds an excellent tradeoff between sparsity and predictive accuracy, and the gain comes from utilizing all available information, including the prior and the information carried by the left-out features. We review several methods that follow this principle and provide novel methodological contributions. We present a new projection technique that unifies two existing techniques and is both accurate and fast to compute. We also propose a way of evaluating the feature selection process using fast leave-one-out cross-validation that allows for easy and intuitive model size selection. Furthermore, we prove a theorem that helps to understand the conditions under which the projective approach can be beneficial. The benefits are illustrated via several simulated and real-world examples.
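The two-stage idea in the abstract (fit a rich reference model, then project its predictions onto small feature subsets) can be illustrated with a minimal sketch. This is not the authors' implementation (which is Bayesian and uses KL projection for generalized linear models); here a ridge fit stands in for the reference model, a least-squares fit to the reference predictions stands in for the projection, and forward search picks the subset. The data, sizes, and variable names are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic sparse-truth data: only the first 3 of 40 features matter.
rng = np.random.default_rng(0)
n, p = 60, 40
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + rng.normal(scale=0.5, size=n)

# Stage 1: possibly non-sparse reference model that predicts well
# (ridge shrinkage stands in for a Bayesian shrinkage prior).
ref = Ridge(alpha=1.0).fit(X, y)
mu_ref = ref.predict(X)  # reference predictions to be characterized

# Stage 2: forward search. At each step, add the feature whose submodel,
# when fit to the *reference predictions* rather than to y, reproduces
# mu_ref best -- a least-squares surrogate for the KL projection.
selected, remaining = [], list(range(p))
for _ in range(5):
    best_j, best_err = None, np.inf
    for j in remaining:
        cols = selected + [j]
        sub = LinearRegression().fit(X[:, cols], mu_ref)
        err = np.mean((sub.predict(X[:, cols]) - mu_ref) ** 2)
        if err < best_err:
            best_j, best_err = j, err
    selected.append(best_j)
    remaining.remove(best_j)

print("selection path:", selected)
```

In the paper the submodel size would then be chosen by fast leave-one-out cross-validation along this selection path, rather than fixed in advance as here.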
Recommendations
- Projection-based Inference for High-dimensional Linear Models
- Targeted random projection for prediction from high-dimensional features
- Ultrahigh dimensional feature screening via projection
- A Computational Perspective on Projection Pursuit in High Dimensions: Feasible or Infeasible Feature Extraction
- scientific article; zbMATH DE number 2042277
- Some perspectives on inference in high dimensions
- On the conditional distributions of low-dimensional projections from high-dimensional data
- Projective resampling estimation of informative predictor subspace for multivariate regression
- scientific article; zbMATH DE number 2043421
- High-dimensional variable selection with sparse random projections: measurement sparsity and statistical efficiency
Cites work
- scientific article; zbMATH DE number 47310
- scientific article; zbMATH DE number 578421
- scientific article; zbMATH DE number 845714
- scientific article; zbMATH DE number 6438182
- scientific article; zbMATH DE number 3249515
- 10.1162/153244303322753715
- A decision-theoretic approach for model interpretability in Bayesian framework
- A study of error variance estimation in Lasso regression
- A survey of Bayesian predictive methods for model assessment, selection and comparison
- Bayesian Model Averaging for Linear Regression Models
- Bayesian data analysis.
- Bayesian model selection in high-dimensional settings
- Bayesian projection approaches to variable selection in generalized linear models
- Bayesian variable selection with shrinking and diffusing priors
- Better Subset Regression Using the Nonnegative Garrote
- Comparison of Bayesian predictive methods for model selection
- Decoupling shrinkage and selection in Bayesian linear models: a posterior summary perspective
- Dirichlet-Laplace priors for optimal shrinkage
- Implicitly adaptive importance sampling
- Large-scale inference. Empirical Bayes methods for estimation, testing, and prediction
- Least angle regression. (With discussion)
- Model choice in generalised linear models: a Bayesian approach via Kullback-Leibler projections
- Needles and straw in a haystack: posterior concentration for possibly sparse sequences
- Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences
- On over-fitting in model selection and subsequent selection bias in performance evaluation
- Optimal predictive model selection.
- Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC
- Prediction by Supervised Principal Components
- Regression modeling strategies. With applications to linear models, logistic regression, and survival analysis
- Regularization and Variable Selection Via the Elastic Net
- Relaxed Lasso
- Scalable Importance Tempering and Bayesian Variable Selection
- Selection bias in gene extraction on the basis of microarray gene-expression data
- Sparsity information and regularization in the horseshoe and other shrinkage priors
- Spike and slab variable selection: frequentist and Bayesian strategies
- Sure independence screening for ultrahigh dimensional feature space. With discussion and authors' reply
- The Adaptive Lasso and Its Oracle Properties
- The Dantzig selector: statistical estimation when \(p\) is much larger than \(n\). (With discussions and rejoinder).
- The horseshoe estimator for sparse signals
- The horseshoe estimator: posterior concentration around nearly black vectors
- The horseshoe+ estimator of ultra-sparse signals
- The predictive Lasso
- Using stacking to average Bayesian predictive distributions (with discussion)
- Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
- Variable selection in qualitative models via an entropic explanatory power
- "Preconditioning" for feature selection and regression in high-dimensional problems
Cited in (10)
- Structured Shrinkage Priors
- Efficient estimation and correction of selection-induced bias with order statistics
- Fast, Optimal, and Targeted Predictions Using Parameterized Decision Analysis
- Using reference models in variable selection
- A fully Bayesian sparse polynomial chaos expansion approach with joint priors on the coefficients and global selection of terms
- Targeted random projection for prediction from high-dimensional features
- Bayesian estimation of subset threshold autoregressions: short-term forecasting of traffic occupancy
- Projective resampling estimation of informative predictor subspace for multivariate regression
- Intuitive joint priors for Bayesian linear multilevel models: the R2D2M2 prior
- A decision-theoretic approach for model interpretability in Bayesian framework
This page was built for publication: Projective inference in high-dimensional problems: prediction and feature selection