Bridging the gap between constant step size stochastic gradient descent and Markov chains (Q2196224)

From MaRDI portal

Cites work

    High Order Numerical Approximation of the Invariant Measure of Ergodic SDEs
    On a Perturbation Approach for the Analysis of Stochastic Tracking Algorithms
    Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression
    A Dynamical System Approach to Stochastic Approximations
    Q3997575
    Q3151174
    About the multidimensional competitive learning vector quantization algorithm with constant gain
    Statistical inference for model parameters in stochastic gradient descent
    Theoretical Guarantees for Approximate Sampling from Smooth and Log-Concave Densities
    Nonparametric stochastic approximation with large step-sizes
    Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression
    Nonasymptotic convergence analysis for the unadjusted Langevin algorithm
    Convergence of stochastic algorithms: from the Kushner–Clark theorem to the Lyapounov functional method
    Asymptotic Behavior of a Markovian Stochastic Algorithm with Constant Step
    Q4225410
    A Liapounov bound for solutions of the Poisson equation
    Ordinary Differential Equations
    Q4558562
    Stochastic approximation methods for constrained and unconstrained systems
    An optimal method for stochastic composite optimization
    Analysis of recursive stochastic algorithms
    Q4003497
    Q4637063
    Ergodicity for SDEs and approximations: locally Lipschitz vector fields and degenerate noise
    Applications of a Kushner and Clark lemma to general classes of stochastic algorithms
    Théorèmes de convergence presque sûre pour une classe d'algorithmes stochastiques à pas décroissant
    Markov Chains and Stochastic Stability
    On recursive estimation for time varying autoregressive processes
    Q2752037
    Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
    Robust Stochastic Approximation Approach to Stochastic Programming
    Q3967358
    Confidence level solutions for stochastic programming
    Stochastic Minimization with Constant Step-Size: Asymptotic Laws
    Acceleration of Stochastic Approximation by Averaging
    A remark on the stability of the l.m.s. tracking algorithm
    A Stochastic Approximation Method
    Pegasos: primal estimated sub-gradient solver for SVM
    Q4779819
    Asymptotic bias of stochastic gradient search
    Expansion of the global error for numerical schemes solving stochastic differential equations
    Optimal Transport
    Co-Coercivity and Its Role in the Convergence of Iterative Schemes for Solving Variational Inequalities


scientific article

Language: English
Label: Bridging the gap between constant step size stochastic gradient descent and Markov chains
Description: scientific article

    Statements

    Bridging the gap between constant step size stochastic gradient descent and Markov chains (English)
    28 August 2020
    The paper deals with the minimization of a strongly convex objective function, given access to unbiased estimates of its gradient, by \textit{stochastic gradient descent} (SGD), also known as the Robbins-Monro algorithm [\textit{H. Robbins} and \textit{S. Monro}, Ann. Math. Stat. 22, 400--407 (1951; Zbl 0054.05901)], run with a constant step size. Since a detailed analysis had previously been carried out only for quadratic objective functions, the authors provide an explicit asymptotic expansion of the moments of the averaged SGD iterates that makes explicit the dependence on the initial conditions, the effect of the noise and the step size, as well as the lack of convergence to the minimizer in the general (nonquadratic) case: for a fixed step size the iterates form a homogeneous Markov chain whose stationary distribution has its mean at distance of the order of the step size from the optimum. The analysis brings tools from Markov chain theory into the study of stochastic gradient methods. It is further observed that Richardson-Romberg extrapolation [\textit{G. Pagès}, Monte Carlo Methods Appl. 13, No. 1, 37--70 (2007; Zbl 1119.65004); \textit{N. Frikha} and \textit{L. Huang}, Stochastic Processes Appl. 125, No. 11, 4066--4101 (2015; Zbl 1336.60137)], which combines averaged iterates obtained with two different step sizes, may be used to get closer to the global optimum. Empirical improvements from the new extrapolation scheme are demonstrated. This methodological problem is of interest in various practical tasks arising in large-scale machine learning, optimization and stochastic approximation.
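    The extrapolation step admits a compact illustration. The sketch below is not from the paper: the well-specified logistic-regression model, the step size $\gamma$, the iteration count, and the helper names (\textit{stochastic\_gradient}, \textit{averaged\_sgd}) are all assumptions chosen for the example. It runs averaged constant-step-size SGD with step sizes $\gamma$ and $2\gamma$ and forms the combination $2\bar{\theta}_\gamma - \bar{\theta}_{2\gamma}$, which cancels the first-order term of the bias expansion.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative synthetic problem (an assumption, not from the paper):
# well-specified logistic regression, so the minimizer of the expected
# logistic loss is theta_star itself.
d = 5
theta_star = rng.normal(size=d)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def stochastic_gradient(theta):
    """Unbiased estimate of the expected logistic-loss gradient at theta,
    computed from one fresh sample (x, y)."""
    x = rng.normal(size=d)
    y = float(rng.random() < sigmoid(x @ theta_star))
    return (sigmoid(x @ theta) - y) * x

def averaged_sgd(gamma, n_iters):
    """Constant-step-size SGD started at 0; returns the Polyak-Ruppert
    average of the iterates, the quantity whose expansion the paper studies."""
    theta = np.zeros(d)
    avg = np.zeros(d)
    for k in range(n_iters):
        theta = theta - gamma * stochastic_gradient(theta)
        avg += (theta - avg) / (k + 1)  # running mean of the iterates
    return avg

gamma, n_iters = 0.1, 100_000

# Averaged iterates with step sizes gamma and 2*gamma: each carries an
# O(gamma) bias, theta_bar(gamma) = theta_star + gamma * Delta + O(gamma^2).
avg_1 = averaged_sgd(gamma, n_iters)
avg_2 = averaged_sgd(2 * gamma, n_iters)

# Richardson-Romberg extrapolation cancels the first-order bias term.
theta_rr = 2 * avg_1 - avg_2

print("error, averaged SGD:", np.linalg.norm(avg_1 - theta_star))
print("error, extrapolated:", np.linalg.norm(theta_rr - theta_star))

    With the expansion $\bar{\theta}_\gamma = \theta^* + \gamma\Delta + O(\gamma^2)$, the extrapolated point satisfies $2\bar{\theta}_\gamma - \bar{\theta}_{2\gamma} = \theta^* + O(\gamma^2)$; the pair $(\gamma, 2\gamma)$ is the standard two-point Richardson-Romberg choice, and other pairs would cancel the first-order term as well.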
    Keywords: stochastic gradient descent; Markov chains

    Identifiers

    arXiv ID: 1707.06386