High-dimensional regression with noisy and missing data: provable guarantees with nonconvexity

DOI 10.1214/12-AOS1018; MaRDI QID Q693741

Authors Po-Ling Loh, Martin J. Wainwright

Publication date 10 December 2012

Published in The Annals of Statistics

Full work available at URL https://arxiv.org/abs/1109.3714, https://projecteuclid.org/euclid.aos/1346850068

M-estimation, regularization, sparse linear regression

Asymptotic properties of parametric estimators (62F12)
Linear regression; mixed models (62J05)
Numerical optimization and variational techniques (65K10)
Estimation in multivariate analysis (62H12)
Approximation algorithms (68W25)

Abstract: Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependence, as well. We study these issues in the context of high-dimensional sparse linear regression, and propose novel estimators for the cases of noisy, missing and/or dependent data. Many standard approaches to noisy or missing data, such as those using the EM algorithm, lead to optimization problems that are inherently nonconvex, and it is difficult to establish theoretical guarantees on practical algorithms. While our approach also involves optimizing nonconvex programs, we are able to both analyze the statistical error associated with any global optimum, and more surprisingly, to prove that a simple algorithm based on projected gradient descent will converge in polynomial time to a small neighborhood of the set of all global minimizers. On the statistical side, we provide nonasymptotic bounds that hold with high probability for the cases of noisy, missing and/or dependent data. On the computational side, we prove that under the same types of conditions required for statistical consistency, the projected gradient descent algorithm is guaranteed to converge at a geometric rate to a near-global minimizer. We illustrate these theoretical predictions with simulations, showing close agreement with the predicted scalings.
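The abstract does not spell out the estimator, so the following is only a minimal sketch of the kind of procedure it describes, under assumptions that are not stated on this page: the covariates are observed with additive noise, Z = X + W, with a known noise covariance `sigma_w`; the objective is a quadratic built from the bias-corrected surrogate `Z.T @ Z / n - sigma_w`, which can be indefinite when p > n and so makes the program nonconvex; and the constraint is an l1 ball of radius `radius`. The function names, step size, and simulation settings are illustrative choices, not taken from the paper.

```python
import numpy as np


def project_l1_ball(v, radius):
    """Euclidean projection of v onto {b : ||b||_1 <= radius}, radius > 0."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]           # magnitudes, descending
    cssv = np.cumsum(u)
    ks = np.arange(1, v.size + 1)
    k = np.nonzero(u * ks > cssv - radius)[0][-1]
    theta = (cssv[k] - radius) / (k + 1)   # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)


def corrected_pgd(Z, y, sigma_w, radius, step, n_iter=500):
    """Projected gradient descent on 0.5 * b' Gamma b - gamma' b over the l1 ball.

    Assumption: additive covariate noise with known covariance sigma_w.
    """
    n, p = Z.shape
    Gamma_hat = Z.T @ Z / n - sigma_w      # bias-corrected; may be indefinite
    gamma_hat = Z.T @ y / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = Gamma_hat @ beta - gamma_hat
        beta = project_l1_ball(beta - step * grad, radius)
    return beta


# Illustrative simulation (all settings below are hypothetical).
rng = np.random.default_rng(0)
n, p, s = 200, 400, 5
beta_star = np.zeros(p)
beta_star[:s] = 1.0
X = rng.standard_normal((n, p))            # clean covariates, never observed
y = X @ beta_star + 0.1 * rng.standard_normal(n)
noise_sd = 0.2
Z = X + noise_sd * rng.standard_normal((n, p))   # noisy covariates
sigma_w = noise_sd**2 * np.eye(p)

radius = np.abs(beta_star).sum()           # oracle radius, for illustration only
beta_hat = corrected_pgd(Z, y, sigma_w, radius, step=0.1)
print("l2 error:", np.linalg.norm(beta_hat - beta_star))
```

In practice the radius would not be known and would have to be tuned, and the missing-data case needs a different correction to the surrogate. Both are outside this sketch.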

Uses Software

PDCO