A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)
Publication:5136291
Abstract: This work provides a simplified proof of the statistical minimax optimality of (iterate-averaged) stochastic gradient descent (SGD) for the special case of least squares. The result is obtained by analyzing SGD as a stochastic process and by sharply characterizing the stationary covariance matrix of this process. The finite-rate optimality characterization captures constant factors and addresses model misspecification.
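To make the setting concrete, the following is a minimal sketch (not taken from the paper) of constant-step-size SGD with Polyak-Ruppert iterate averaging on a synthetic least-squares problem. The dimensions, step size, noise level, and Gaussian data model are illustrative assumptions, not choices from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic well-specified least-squares problem: y = <x, w_star> + noise.
# d, n_steps, step, and the noise scale are illustrative assumptions.
d, n_steps, step = 10, 50_000, 0.05
w_star = rng.normal(size=d)

w = np.zeros(d)      # current SGD iterate
w_bar = np.zeros(d)  # running average of iterates (Polyak-Ruppert)

for t in range(1, n_steps + 1):
    x = rng.normal(size=d)               # fresh sample each step (one-pass SGD)
    y = x @ w_star + 0.1 * rng.normal()  # additive observation noise
    grad = (x @ w - y) * x               # stochastic gradient of 0.5 * (x.w - y)^2
    w -= step * grad                     # constant step size: w_t forms a Markov chain
    w_bar += (w - w_bar) / t             # incremental update of the iterate average

print("last-iterate error:", np.linalg.norm(w - w_star))
print("averaged error    :", np.linalg.norm(w_bar - w_star))
```

With a constant step size the last iterate fluctuates around the optimum rather than converging, which is why the abstract studies its stationary covariance as a Markov chain; the averaged iterate smooths out these fluctuations and is the quantity shown to be minimax optimal.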
Recommendations
- Bridging the gap between constant step size stochastic gradient descent and Markov chains
- Finite-Time Analysis of Markov Gradient Descent
- On optimal probabilities in stochastic coordinate descent methods
- The minimax stochastic optimization model of the Gauss-Markov nonlinear model's coefficients under the quadratic loss function
- The method of generalized stochastic gradient for solving minimax problems with constrained variables
- Optimal survey schemes for stochastic gradient descent with applications to \(M\)-estimation
- Minimizing finite sums with the stochastic average gradient
- Stochastic Minimization with Constant Step-Size: Asymptotic Laws
Cites work
- Acceleration of Stochastic Approximation by Averaging
- Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression
- Nonparametric stochastic approximation with large step-sizes
- Stochastic approximation methods for constrained and unconstrained systems
Cited in (10)
- Making the last iterate of SGD information theoretically optimal
- Bridging the gap between constant step size stochastic gradient descent and Markov chains
- Stability and optimization error of stochastic gradient descent for pairwise learning
- On the regularization effect of stochastic gradient descent applied to least-squares
- On the regularizing property of stochastic gradient descent
- A Markovian Incremental Stochastic Subgradient Algorithm
- Parallelizing stochastic gradient descent for least squares regression: mini-batching, averaging, and model misspecification
- High-dimensional limit of one-pass SGD on least squares
- Optimal rates for multi-pass stochastic gradient methods
- Lower error bounds for the stochastic gradient descent optimization algorithm: sharp convergence rates for slowly and fast decaying learning rates