Abstract: We study random design linear regression with no assumptions on the distribution of the covariates and with a heavy-tailed response variable. In this distribution-free regression setting, we show that boundedness of the conditional second moment of the response given the covariates is a necessary and sufficient condition for achieving nontrivial guarantees. As a starting point, we prove an optimal version of the classical in-expectation bound for the truncated least squares estimator due to Györfi, Kohler, Krzyżak, and Walk. However, we show that this procedure fails with constant probability for some distributions despite its optimal in-expectation performance. Then, combining the ideas of truncated least squares, median-of-means procedures, and aggregation theory, we construct a non-linear estimator achieving excess risk of order \(d/n\) with an optimal sub-exponential tail. While existing approaches to linear regression for heavy-tailed distributions focus on proper estimators that return linear functions, we highlight that the improperness of our procedure is necessary for attaining nontrivial guarantees in the distribution-free setting.
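The two building blocks named in the abstract, truncated least squares and median-of-means, can be illustrated with a minimal sketch. This is not the paper's actual estimator (which combines these ideas with aggregation theory); the function names, the truncation level `beta`, and the block count `k` are illustrative assumptions.

```python
import numpy as np

def truncated_least_squares(X, y, beta):
    """Fit ordinary least squares, then clip predictions to [-beta, beta].

    Truncation makes the predictor bounded, which is what lets one control
    the risk under only a second-moment assumption on the response.
    """
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return lambda X_new: np.clip(X_new @ w, -beta, beta)

def median_of_means(values, k, rng=None):
    """Split the sample into k blocks and return the median of block means.

    Robust to heavy tails: a few outliers corrupt only a few block means,
    leaving the median essentially unaffected.
    """
    rng = np.random.default_rng(rng)
    blocks = np.array_split(rng.permutation(values), k)
    return np.median([b.mean() for b in blocks])
```

For example, fitting on data with heavy-tailed (Student-t) noise yields a predictor whose outputs never exceed the truncation level, while `median_of_means` gives a stable estimate of the mean loss even when the empirical mean is distorted by outliers.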
Recommendations
- Robust linear least squares regression
- Loss minimization and parameter estimation with heavy tails
- Mean estimation and regression under heavy-tailed distributions: A survey
- Efficient algorithms and lower bounds for robust linear regression
- Iteratively reweighted \(\ell_1\)-penalized robust regression
Cites work
- scientific article; zbMATH DE number 3954047
- scientific article; zbMATH DE number 3790208
- scientific article; zbMATH DE number 1522808
- scientific article; zbMATH DE number 194744
- scientific article; zbMATH DE number 3446442
- scientific article; zbMATH DE number 1420699
- scientific article; zbMATH DE number 3320125
- A theory of the learnable
- Adaptive importance sampling in least-squares Monte Carlo algorithms for backward stochastic differential equations
- Aggregation via empirical risk minimization
- An unrestricted learning procedure
- Challenging the empirical mean and empirical variance: a deviation study
- Combining different procedures for adaptive regression
- Competitive On-line Statistics
- Conditions d'intégrabilité pour les multiplicateurs dans le TLC banachique. (Integrability conditions for multiplicators in the central limit theorem in Banach spaces)
- Convergence of the Robbins-Monro method for linear problems in a Banach space
- Convergence rates of least squares regression estimators with heavy-tailed errors
- Empirical entropy, minimax regret and minimax risk
- Empirical risk minimization for heavy-tailed losses
- Extending the scope of the small-ball method
- Fast learning rates in statistical inference through aggregation
- Geometric median and robust estimation in Banach spaces
- How Many Variables Should be Entered in a Regression Equation?
- Improving the sample complexity using global data
- Learning Bounded Subsets of Lₚ
- Learning Theory and Kernel Machines
- Learning by mirror averaging
- Learning without concentration
- Local Rademacher complexities
- Loss minimization and parameter estimation with heavy tails
- Mean estimation and regression under heavy-tailed distributions: A survey
- Mixing strategies for density estimation
- Near-optimal mean estimators with respect to general norms
- Nonparametric stochastic approximation with large step-sizes
- On aggregation for heavy-tailed classes
- On optimality of empirical risk minimization in linear aggregation
- On the stability and accuracy of least squares approximations
- On the strong universal consistency of a series type regression estimate
- Optimal rates for the regularized least-squares algorithm
- Performance of empirical risk minimization in linear aggregation
- Predicting \(\{ 0,1\}\)-functions on randomly drawn points
- Quantitative error estimates for a least-squares Monte Carlo algorithm for American option pricing
- Random generation of combinatorial structures from a uniform distribution
- Regression function estimation on non compact support in an heteroscedastic model
- Relative expected instantaneous loss bounds
- Relative loss bounds for on-line density estimation with the exponential family of distributions
- Risk minimization by median-of-means tournaments
- Robust Statistics
- Robust \(k\)-means clustering for distributions with two moments
- Robust covariance estimation under \(L_4\)-\(L_2\) norm equivalence
- Robust linear least squares regression
- Robust machine learning by median-of-means: theory and practice
- Robust multivariate mean estimation: the optimality of trimmed mean
- Robust statistical learning with Lipschitz and convex loss functions
- Sub-Gaussian estimators of the mean of a random vector
- Sub-Gaussian mean estimators
- The lower tail of random quadratic forms with applications to ordinary least squares
- The sample complexity of learning linear predictors with the squared loss
- The space complexity of approximating the frequency moments
- Weak convergence and empirical processes. With applications to statistics
Cited in (6)
- Robust linear regression with broad distributions of errors
- Suboptimality of constrained least squares and improvements via non-linear predictors
- Nonexact oracle inequalities, \(r\)-learnability, and fast rates
- An elementary analysis of ridge regression with random design
- Least squares regression under weak moment conditions
- Robust linear least squares regression
This page was built for publication: Distribution-free robust linear regression