A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)

From MaRDI portal
Publication:5136291

DOI: 10.4230/LIPICS.FSTTCS.2017.2
zbMATH Open: 1496.62140
arXiv: 1710.09430
MaRDI QID: Q5136291
FDO: Q5136291


Authors: Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Venkata Krishna Pillutla, Aaron Sidford


Publication date: 25 November 2020

Abstract: This work provides a simplified proof of the statistical minimax optimality of (iterate averaged) stochastic gradient descent (SGD), for the special case of least squares. This result is obtained by analyzing SGD as a stochastic process and by sharply characterizing the stationary covariance matrix of this process. The finite rate optimality characterization captures the constant factors and addresses model mis-specification.
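The abstract's central object, iterate-averaged (Polyak-Ruppert) SGD for least squares, can be illustrated with a minimal sketch. This is not the paper's analysis; the problem dimensions, step size, and noise level below are illustrative choices, and the averaging is the standard running mean of iterates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic well-specified least-squares problem: y = <w_star, x> + noise.
d, n = 5, 20000
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_star + 0.1 * rng.normal(size=n)

# Constant-step-size SGD with iterate averaging (one pass over the stream).
step = 0.01          # illustrative constant step size
w = np.zeros(d)      # current SGD iterate
w_avg = np.zeros(d)  # running average of iterates

for t in range(n):
    x_t, y_t = X[t], y[t]
    grad = (x_t @ w - y_t) * x_t      # stochastic gradient of 0.5 * (x'w - y)^2
    w = w - step * grad
    w_avg += (w - w_avg) / (t + 1)    # online mean of the iterates

err_last = np.linalg.norm(w - w_star)
err_avg = np.linalg.norm(w_avg - w_star)
print(f"last iterate error: {err_last:.4f}, averaged iterate error: {err_avg:.4f}")
```

The last iterate fluctuates in a stationary neighborhood of the optimum whose size scales with the step size and noise variance, while averaging damps this fluctuation; the paper's contribution is a sharp, minimax-optimal characterization of this behavior via the stationary covariance of the SGD process.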


Full work available at URL: https://arxiv.org/abs/1710.09430






Cited In (9)





