Split Regression Modeling
From MaRDI portal
Publication: 119097
DOI: 10.48550/ARXIV.1812.05678
arXiv: 1812.05678
MaRDI QID: Q119097
FDO: Q119097
Authors: Anthony Christidis, Stefan Van Aelst, Ruben Zamar
Publication date: 13 December 2018
Abstract: Sparse methods are the standard approach to obtain interpretable models with high prediction accuracy. Alternatively, algorithmic ensemble methods can achieve higher prediction accuracy at the cost of interpretability. However, the use of black-box methods has been heavily criticized for high-stakes decisions, and it has been argued that there does not have to be a trade-off between accuracy and interpretability. To combine high accuracy with interpretability, we generalize best subset selection to best split selection. Best split selection constructs a small number of sparse models, learned jointly from the data, which are then combined in an ensemble. Best split selection determines the models by splitting the available predictor variables among the different models when fitting the data. The proposed methodology results in an ensemble of sparse and diverse models that each provide a possible explanation for the relationship between the predictors and the response. The high computational cost of best split selection motivates the need for computationally tractable approximations. We evaluate a method developed by Christidis et al. (2020) which can be seen as a multi-convex relaxation of best split selection.
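The idea described in the abstract — assigning each predictor to at most one of several sparse models, fitting the models, and averaging their predictions into an ensemble — can be sketched by brute force on a toy problem. This is a minimal illustration under assumed choices (least-squares fits, averaged predictions, a validation split as the selection criterion, and synthetic data); it is not the authors' implementation, and the exhaustive search is exactly the combinatorial cost that motivates their relaxation.

```python
# Hedged sketch of "best split selection": exhaustively assign each
# predictor to one of G models (or to no model, giving sparsity), fit
# each model by ordinary least squares on its own predictors, average
# the model predictions into an ensemble, and keep the assignment with
# the lowest validation error. Data and settings are illustrative.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, p, G = 80, 4, 2

# Toy data: y depends on x0, x1, x2; x3 is pure noise.
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5 * X[:, 2] + 0.3 * rng.standard_normal(n)
X_tr, y_tr, X_va, y_va = X[:60], y[:60], X[60:], y[60:]

def ensemble_mse(assign):
    """Fit one OLS model per group and score the averaged prediction."""
    preds = []
    for g in range(1, G + 1):
        cols = [j for j in range(p) if assign[j] == g]
        if not cols:                      # an empty group predicts zero
            preds.append(np.zeros(len(y_va)))
            continue
        beta, *_ = np.linalg.lstsq(X_tr[:, cols], y_tr, rcond=None)
        preds.append(X_va[:, cols] @ beta)
    return np.mean((y_va - np.mean(preds, axis=0)) ** 2)

# Assignment value 0 means "used by no model"; there are (G+1)^p splits.
best = min(itertools.product(range(G + 1), repeat=p), key=ensemble_mse)
print("best split:", best, "validation MSE:", round(ensemble_mse(best), 3))
```

Even here the search space is (G+1)^p assignments, which grows exponentially in the number of predictors; the multi-convex relaxation evaluated in the paper avoids this enumeration.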
Cited In (2)
This page was built for publication: Split Regression Modeling