Asymptotic bias of stochastic gradient search (Q1704136): Difference between revisions

The paper investigates the asymptotic behavior of biased stochastic gradient search. The algorithm analyzed is \(\theta_{n+1} = \theta_{n} - \alpha_{n}(\nabla f(\theta_{n}) + \xi_{n})\), \(n \geq 0\). Under a set of assumptions on the step-size sequence \(\{\alpha_{n}\}\), the noise \(\{\xi_{n}\}\) and the objective function \(f\), it is proved that the algorithm iterates converge to a neighborhood of the set of minima. Upper bounds on the radius of this neighborhood are obtained. The results are local: they hold only when the algorithm is stable. The proofs rely on chain-recurrence, the Yomdin theorem and the Lojasiewicz inequalities. Further, the results are applied to stochastic gradient algorithms with Markovian dynamics and to the asymptotic analysis of a policy-gradient search algorithm for average-cost Markov decision problems. Global versions of the results are presented in Appendices A and B of the paper. An extended version of the article is available at \url{arXiv:1709.00291}.
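The recursion above can be illustrated with a small simulation. The following is a minimal sketch and not the setting of the paper: it assumes a quadratic objective \(f(\theta) = \tfrac{1}{2}\|\theta\|^{2}\), step sizes \(\alpha_{n} = (n+1)^{-0.75}\), and noise \(\xi_{n}\) made of a constant bias plus zero-mean Gaussian perturbations. The names `biased_sgd`, `grad_quadratic` and `norm` are hypothetical helpers introduced only for this illustration. For this toy objective the iterates should settle at a distance from the minimum roughly equal to the size of the bias, which is the kind of neighborhood the review describes.

```python
import random


def biased_sgd(grad, theta0, bias, noise_std, n_steps, seed=0):
    """Run theta_{n+1} = theta_n - alpha_n * (grad f(theta_n) + xi_n).

    Assumptions made for this sketch only:
      - alpha_n = (n + 1) ** -0.75 (step sizes sum to infinity, squares are summable)
      - xi_n = bias + zero-mean Gaussian noise with standard deviation noise_std
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    theta = list(theta0)
    for n in range(n_steps):
        alpha = 1.0 / (n + 1) ** 0.75
        g = grad(theta)
        theta = [
            t - alpha * (gi + bi + rng.gauss(0.0, noise_std))
            for t, gi, bi in zip(theta, g, bias)
        ]
    return theta


def grad_quadratic(theta):
    """Gradient of the toy objective f(theta) = 0.5 * ||theta||^2 (minimum at 0)."""
    return list(theta)


def norm(v):
    """Euclidean norm of a list of floats."""
    return sum(x * x for x in v) ** 0.5


if __name__ == "__main__":
    start = [1.0, -1.0]
    unbiased = biased_sgd(grad_quadratic, start, [0.0, 0.0], 0.5, 20000)
    biased = biased_sgd(grad_quadratic, start, [0.1, 0.0], 0.5, 20000)
    print("distance to minimum, zero bias:", norm(unbiased))
    print("distance to minimum, bias of norm 0.1:", norm(biased))
```

In this toy case the limit point can be computed by hand: with constant bias \(b\) the mean dynamics have their equilibrium at \(-b\), so the asymptotic error equals \(\|b\|\). The paper's bounds concern far more general objectives and noise, which this sketch does not attempt to reproduce.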

0 references

reviewed by

Claudia Simionescu-Badea

0 references

zbMATH Keywords

stochastic gradient search

0 references

biased gradient estimation

0 references

chain-recurrence

0 references

Yomdin theorem

0 references

Lojasiewicz inequalities

0 references

MaRDI profile type

MaRDI publication profile

0 references

cites work

Q3324260

0 references

Q4533362

0 references

A Dynamical System Approach to Stochastic Approximations

0 references

Q4938927

0 references

Stochastic Approximations and Differential Inclusions

0 references

Perturbations of set-valued dynamical systems, with applications to game theory

0 references

Q3997575

0 references

Q4257216

0 references

Gradient Convergence in Gradient methods with Errors

0 references

Semianalytic and subanalytic sets

0 references

Stochastic approximation. A dynamical systems viewpoint.

0 references

The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning

0 references

Inference in hidden Markov models.

0 references

Stochastic approximation and its applications

0 references

Robustness analysis for stochastic approximation algorithms

0 references

Convergence and robustness of the Robbins-Monro algorithm truncated at randomly varying bounds

0 references

Q2871232

0 references

Chain recurrence, semiflows, and gradients

0 references

Q2771497

0 references

On Actor-Critic Algorithms

0 references

On gradients of functions definable in o-minimal structures

0 references

Q4421713

0 references

Sur le problème de la division

0 references

On semi- and subanalytic geometry

0 references

Applications of a Kushner and Clark lemma to general classes of stochastic algorithms

0 references

Markov Chains and Stochastic Stability

0 references

Q4335417

0 references

Approximate Dynamic Programming

0 references

Particle approximations of the score and observed information matrix in state space models with application to parameter estimation

0 references

Introduction to Stochastic Search and Optimization

0 references

Analyticity, Convergence, and Convergence Rate of Recursive Maximum-Likelihood Estimation in Hidden Markov Models

0 references

The geometry of critical and near-critical values of differentiable mappings

0 references

Identifiers

zbMATH Open document ID

1387.49044

0 references

DOI

10.1214/16-AAP1272

0 references

Mathematics Subject Classification ID

0 references

0 references

0 references

0 references

0 references

0 references

0 references

0 references

Sitelinks

Mathematics (1 entry)

mardi Publication:1704136

@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / arXiv ID @@
+1709.00291
@@ Property / arXiv ID: 1709.00291 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3324260
@@ Property / cites work: Q3324260 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4533362
@@ Property / cites work: Q4533362 / rank @@
+Normal rank
@@ Property / cites work @@
+A Dynamical System Approach to Stochastic Approximations
+Normal rank
@@ Property / cites work @@
+Q4938927
@@ Property / cites work: Q4938927 / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic Approximations and Differential Inclusions
+Normal rank
@@ Property / cites work @@
+Perturbations of set-valued dynamical systems, with applications to game theory
+Normal rank
@@ Property / cites work @@
+Q3997575
@@ Property / cites work: Q3997575 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4257216
@@ Property / cites work: Q4257216 / rank @@
+Normal rank
@@ Property / cites work @@
+Gradient Convergence in Gradient methods with Errors
+Normal rank
@@ Property / cites work @@
+Semianalytic and subanalytic sets
@@ Property / cites work: Semianalytic and subanalytic sets / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic approximation. A dynamical systems viewpoint.
+Normal rank
@@ Property / cites work @@
+The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
+Normal rank
@@ Property / cites work @@
+Inference in hidden Markov models.
@@ Property / cites work: Inference in hidden Markov models. / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic approximation and its applications
@@ Property / cites work: Stochastic approximation and its applications / rank @@
+Normal rank
@@ Property / cites work @@
+Robustness analysis for stochastic approximation algorithms
+Normal rank
@@ Property / cites work @@
+Convergence and robustness of the Robbins-Monro algorithm truncated at randomly varying bounds
+Normal rank
@@ Property / cites work @@
+Q2871232
@@ Property / cites work: Q2871232 / rank @@
+Normal rank
@@ Property / cites work @@
+Chain recurrence, semiflows, and gradients
@@ Property / cites work: Chain recurrence, semiflows, and gradients / rank @@
+Normal rank
@@ Property / cites work @@
+Q2771497
@@ Property / cites work: Q2771497 / rank @@
+Normal rank
@@ Property / cites work @@
+On Actor-Critic Algorithms
@@ Property / cites work: On Actor-Critic Algorithms / rank @@
+Normal rank
@@ Property / cites work @@
+On gradients of functions definable in o-minimal structures
+Normal rank
@@ Property / cites work @@
+Q4421713
@@ Property / cites work: Q4421713 / rank @@
+Normal rank
@@ Property / cites work @@
+Sur le problème de la division
@@ Property / cites work: Sur le problème de la division / rank @@
+Normal rank
@@ Property / cites work @@
+On semi- and subanalytic geometry
@@ Property / cites work: On semi- and subanalytic geometry / rank @@
+Normal rank
@@ Property / cites work @@
+Applications of a Kushner and Clark lemma to general classes of stochastic algorithms
+Normal rank
@@ Property / cites work @@
+Markov Chains and Stochastic Stability
@@ Property / cites work: Markov Chains and Stochastic Stability / rank @@
+Normal rank
@@ Property / cites work @@
+Q4335417
@@ Property / cites work: Q4335417 / rank @@
+Normal rank
@@ Property / cites work @@
+Approximate Dynamic Programming
@@ Property / cites work: Approximate Dynamic Programming / rank @@
+Normal rank
@@ Property / cites work @@
+Particle approximations of the score and observed information matrix in state space models with application to parameter estimation
+Normal rank
@@ Property / cites work @@
+Introduction to Stochastic Search and Optimization
+Normal rank
@@ Property / cites work @@
+Analyticity, Convergence, and Convergence Rate of Recursive Maximum-Likelihood Estimation in Hidden Markov Models
+Normal rank
@@ Property / cites work @@
+The geometry of critical and near-critical values of differentiable mappings
+Normal rank
@@ Property / OpenAlex ID @@
+W2591423585
@@ Property / OpenAlex ID: W2591423585 / rank @@
+Normal rank
@@ links / mardi / name / links / mardi / name @@
+Publication:1704136