Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression

From MaRDI portal
Publication: Q4637017

zbMATH Open: 1441.62215 · arXiv: 1602.05419 · MaRDI QID: Q4637017 · FDO: Q4637017


Authors: Aymeric Dieuleveut, Nicolas Flammarion, Francis Bach


Publication date: 17 April 2018

Abstract: We consider the optimization of a quadratic objective function whose gradients are only accessible through a stochastic oracle that returns the gradient at any given point plus a zero-mean, finite-variance random error. We present the first algorithm that jointly achieves the optimal prediction error rates for least-squares regression, both in terms of forgetting of initial conditions, in O(1/n^2), and in terms of dependence on the noise and the dimension d of the problem, as O(d/n). Our new algorithm is based on averaged accelerated regularized gradient descent, and may also be analyzed through finer assumptions on the initial conditions and the Hessian matrix, leading to dimension-free quantities that may still be small while the "optimal" terms above are large. In order to characterize the tightness of these new bounds, we consider an application to non-parametric regression and use the known lower bounds on the statistical performance (without computational limits), which happen to match our bounds obtained from a single pass over the data, thus showing the optimality of our algorithm in a wide variety of particular trade-offs between bias and variance.


Full work available at URL: https://arxiv.org/abs/1602.05419
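The abstract describes a single-pass estimator built on averaged accelerated (regularized) stochastic gradient descent for least-squares. As a rough illustration of that idea, the following is a minimal Python/NumPy sketch of one pass of averaged accelerated SGD on a quadratic objective; the function name, step-size and momentum schedule, regularization handling, and toy data are assumptions for illustration only, not the exact algorithmic constants analyzed in the paper.

```python
import numpy as np

def averaged_accelerated_sgd(X, y, gamma=None, reg=0.0):
    """One-pass averaged accelerated SGD for least-squares (illustrative sketch).

    The momentum/step-size schedule below is a common Nesterov-style choice,
    assumed for illustration; it is not claimed to match the constants
    analyzed by Dieuleveut, Flammarion and Bach.
    """
    n, d = X.shape
    if gamma is None:
        # crude step size ~ 1 / (mean squared feature norm); an assumption
        gamma = 1.0 / np.mean(np.sum(X ** 2, axis=1))
    theta = np.zeros(d)      # current iterate
    eta = np.zeros(d)        # extrapolated (momentum) point
    theta_bar = np.zeros(d)  # running average of iterates (returned estimator)
    for t in range(n):
        x_t, y_t = X[t], y[t]
        # stochastic gradient of 0.5*(x^T eta - y)^2 + 0.5*reg*||eta||^2 at eta
        grad = (x_t @ eta - y_t) * x_t + reg * eta
        theta_new = eta - gamma * grad
        # Nesterov-style momentum coefficient t/(t+3) (assumed schedule)
        eta = theta_new + (t / (t + 3.0)) * (theta_new - theta)
        theta = theta_new
        # online average: theta_bar is the mean of the first t+1 iterates
        theta_bar += (theta - theta_bar) / (t + 1)
    return theta_bar

# toy usage on synthetic data
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
w_true = rng.standard_normal(5)
y = X @ w_true + 0.1 * rng.standard_normal(1000)
print(np.linalg.norm(averaged_accelerated_sgd(X, y) - w_true))
```

Returning the average of the iterates, rather than the last iterate, is what allows the combination of fast forgetting of initial conditions with the O(d/n) dependence on the noise discussed in the abstract.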




Recommendations




Cites Work


Cited In (18)

Uses Software





