Grouped variable importance with random forests and application to multiple functional data analysis

Publication:1663198

DOI: 10.1016/J.CSDA.2015.04.002; zbMATH Open: 1468.62069; arXiv: 1411.4170; OpenAlex: W2097296149; MaRDI QID: Q1663198; FDO: Q1663198

Baptiste Gregorutti, Philippe Saint-Pierre, Bertrand Michel

Publication date: 21 August 2018

Published in: Computational Statistics and Data Analysis

Abstract: The selection of grouped variables using the random forest algorithm is considered. First, a new importance measure adapted for groups of variables is proposed. Theoretical insights into this criterion are given for additive regression models. Second, an original method for selecting functional variables based on the grouped variable importance measure is developed. Using a wavelet basis, it is proposed to regroup all of the wavelet coefficients for a given functional variable and use a wrapper selection algorithm with these groups. Various other groupings, which take advantage of the frequency and time localization of the wavelet basis, are proposed. An extensive simulation study is performed to illustrate the use of the grouped importance measure in this context. The method is applied to a real-life problem from aviation safety.
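
The idea of a grouped importance measure can be illustrated with a minimal, model-agnostic sketch: all columns of a group are permuted jointly (with the same row permutation) and the resulting increase in prediction error is recorded. This is an illustration only, not the authors' implementation. The abstract does not spell out the estimator; a random forest version would typically use out-of-bag samples per tree, which is omitted here. The names `grouped_permutation_importance`, `mse`, and `toy_model` are hypothetical, and `toy_model` is a stand-in for a fitted forest.

```python
import random


def mse(predict, X, y):
    """Mean squared error of `predict` over the rows of X."""
    return sum((predict(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def grouped_permutation_importance(predict, X, y, group, n_repeats=20, seed=0):
    """Mean increase in MSE when the columns in `group` are permuted jointly.

    Simplified sketch: the error is evaluated on the supplied data,
    not on per-tree out-of-bag samples.
    """
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    n = len(X)
    increases = []
    for _ in range(n_repeats):
        perm = list(range(n))
        rng.shuffle(perm)
        X_perm = []
        for i, row in enumerate(X):
            new_row = list(row)
            for j in group:
                # The same row permutation is applied to every column of the
                # group, so dependence within the group is preserved.
                new_row[j] = X[perm[i]][j]
            X_perm.append(new_row)
        increases.append(mse(predict, X_perm, y) - baseline)
    return sum(increases) / n_repeats


# Toy data: the response depends on columns 0 and 1 only.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [2 * row[0] + row[1] for row in X]


def toy_model(row):
    # Hypothetical stand-in for a fitted random forest's predict function.
    return 2 * row[0] + row[1]


imp_signal = grouped_permutation_importance(toy_model, X, y, group=[0, 1])
imp_noise = grouped_permutation_importance(toy_model, X, y, group=[2, 3])
print("group {0, 1}:", imp_signal)  # positive: the model relies on this group
print("group {2, 3}:", imp_noise)   # zero: the model ignores this group
```

In the functional setting described in the abstract, a group would hold the indices of all wavelet coefficients of one functional variable, so that the variable is scored as a whole.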


Full work available at URL: https://arxiv.org/abs/1411.4170




Cited In (19)
