A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)
DOI: 10.4230/LIPIcs.FSTTCS.2017.2
zbMATH Open: 1496.62140
arXiv: 1710.09430
MaRDI QID: Q5136291
Authors: Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Venkata Krishna Pillutla, Aaron Sidford
Publication date: 25 November 2020
Full work available at URL: https://arxiv.org/abs/1710.09430
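The paper analyzes constant step size stochastic gradient descent with iterate (tail) averaging for the streaming least squares problem, characterizing when it attains minimax-optimal rates. As a minimal illustrative sketch only, not the authors' implementation: the function name, step size, and burn-in fraction below are assumptions chosen for demonstration.

```python
import numpy as np

def averaged_sgd_least_squares(stream, w0, step_size, burn_in):
    """Constant-step-size SGD for streaming least squares,
    returning the tail-averaged iterate.

    stream    -- iterable of (x, y) samples with y ~ <w*, x> + noise
    w0        -- initial parameter vector
    step_size -- constant learning rate (must be small enough for stability)
    burn_in   -- number of initial iterates excluded from the average
    """
    w = w0.copy()
    avg = np.zeros_like(w0)
    n_avg = 0
    for t, (x, y) in enumerate(stream):
        # stochastic gradient of the per-sample loss 0.5 * (<w, x> - y)^2
        grad = (w @ x - y) * x
        w = w - step_size * grad
        if t >= burn_in:
            n_avg += 1
            avg += (w - avg) / n_avg  # running mean of the tail iterates
    return avg if n_avg > 0 else w

# Toy usage (hypothetical data): d-dimensional linear model, Gaussian noise.
rng = np.random.default_rng(0)
d, n = 10, 20000
w_star = rng.normal(size=d)

def make_stream():
    for _ in range(n):
        x = rng.normal(size=d)
        yield x, x @ w_star + 0.1 * rng.normal()

w_hat = averaged_sgd_least_squares(make_stream(), np.zeros(d), 0.05, n // 2)
print("parameter error:", np.linalg.norm(w_hat - w_star))
```

Averaging only the second half of the iterates is one common tail-averaging choice; the paper's analysis concerns how such averaging interacts with the constant step size to yield minimax-optimal error for least squares.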
Recommendations
- Bridging the gap between constant step size stochastic gradient descent and Markov chains
- Finite-Time Analysis of Markov Gradient Descent
- On optimal probabilities in stochastic coordinate descent methods
- The minimax stochastic optimization model of the Gauss-Markov nonlinear model's coefficients under the quadratic loss function
- The method of generalized stochastic gradient for solving minimax problems with constrained variables
- Optimal survey schemes for stochastic gradient descent with applications to \(M\)-estimation
- Minimizing finite sums with the stochastic average gradient
- Stochastic Minimization with Constant Step-Size: Asymptotic Laws
MSC Classifications
- Asymptotic properties of parametric estimators (62F12)
- Quadratic programming (90C20)
- Stochastic approximation (62L20)
- Stochastic programming (90C15)
Cites Work
- Stochastic approximation methods for constrained and unconstrained systems
- Nonparametric stochastic approximation with large step-sizes
- Acceleration of Stochastic Approximation by Averaging
- Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression
Cited In (9)
- Lower error bounds for the stochastic gradient descent optimization algorithm: sharp convergence rates for slowly and fast decaying learning rates
- A Markovian Incremental Stochastic Subgradient Algorithm
- On the regularizing property of stochastic gradient descent
- Optimal rates for multi-pass stochastic gradient methods
- Stability and optimization error of stochastic gradient descent for pairwise learning
- On the regularization effect of stochastic gradient descent applied to least-squares
- Making the last iterate of SGD information theoretically optimal
- Bridging the gap between constant step size stochastic gradient descent and Markov chains
- Parallelizing stochastic gradient descent for least squares regression: mini-batching, averaging, and model misspecification