Interaction pursuit in high-dimensional multi-response regression via distance correlation

Publication:2012211

DOI: 10.1214/16-AOS1474; zbMATH Open: 1368.62140; arXiv: 1605.03315; OpenAlex: W2964150627; MaRDI QID: Q2012211; FDO: Q2012211

Jinchi Lv, Daoji Li, Yinfei Kong, Yingying Fan

Publication date: 28 July 2017

Published in: The Annals of Statistics

Abstract: Feature interactions can contribute to a large proportion of variation in many prediction models. In the era of big data, the coexistence of high dimensionality in both responses and covariates poses unprecedented challenges in identifying important interactions. In this paper, we suggest a two-stage interaction identification method, called the interaction pursuit via distance correlation (IPDC), in the setting of high-dimensional multi-response interaction models that exploits feature screening applied to transformed variables with distance correlation followed by feature selection. Such a procedure is computationally efficient, generally applicable beyond the heredity assumption, and effective even when the number of responses diverges with the sample size. Under mild regularity conditions, we show that this method enjoys nice theoretical properties including the sure screening property, support union recovery, and oracle inequalities in prediction and estimation for both interactions and main effects. The advantages of our method are supported by several simulation studies and real data analysis.
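Illustration (not part of the publication record): the screening stage described in the abstract ranks covariates by their distance correlation with the responses after a transformation. The sketch below is a minimal NumPy version of that idea, not the authors' implementation. The function names `distance_correlation` and `screen_by_dcor`, the `keep` parameter, and the choice of squaring as the default transformation are assumptions made for illustration; the abstract does not specify the transformation or any tuning details, and the feature selection stage that follows screening is omitted.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation (biased V-statistic form) between x and y.

    x and y hold n observations each; either may be a vector or an
    (n, d) matrix, so a multi-response y is handled as one block.
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)

    # Pairwise Euclidean distance matrices, each of shape (n, n).
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=2)

    # Double centering: subtract row and column means, add the grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()

    dcov2 = (A * B).mean()   # squared sample distance covariance
    dvar_x = (A * A).mean()  # squared sample distance variance of x
    dvar_y = (B * B).mean()  # squared sample distance variance of y

    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0:
        # A constant sample has zero distance variance; return 0 by convention.
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def screen_by_dcor(X, Y, keep, transform=np.square):
    """Rank the columns of X by distance correlation with Y after `transform`.

    Squaring both sides is an assumed transformation, used here only as
    an example. Returns the indices of the `keep` highest-scoring columns
    and the full score vector.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    scores = np.array(
        [distance_correlation(transform(X[:, k]), transform(Y)) for k in range(X.shape[1])]
    )
    top = np.argsort(scores)[::-1][:keep]
    return top, scores
```

With an `(n, p)` covariate matrix `X` and an `(n, q)` response matrix `Y`, a call such as `screen_by_dcor(X, Y, keep=10)` would return ten candidate column indices. Passing `transform=lambda v: v` gives the corresponding ranking on untransformed variables. Building the full distance matrices costs O(n^2) memory per call, which is acceptable for a sketch but may need a more careful implementation for large samples.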


Full work available at URL: https://arxiv.org/abs/1605.03315




Cited In (36)




