Convergence rate of linear two-time-scale stochastic approximation.
From MaRDI portal
Publication:1879892
DOI: 10.1214/105051604000000116
zbMATH Open: 1094.62103
arXiv: math/0405287
OpenAlex: W1985291828
MaRDI QID: Q1879892
John N. Tsitsiklis, Vijay R. Konda
Publication date: 15 September 2004
Published in: The Annals of Applied Probability
Abstract: We study the rate of convergence of linear two-time-scale stochastic approximation methods. We consider two-time-scale linear iterations driven by i.i.d. noise, prove some results on their asymptotic covariance and establish asymptotic normality. The well-known result [Polyak, B. T. (1990). Automat. Remote Contr. 51 937-946; Ruppert, D. (1988). Technical Report 781, Cornell Univ.] on the optimality of Polyak-Ruppert averaging techniques specialized to linear stochastic approximation is established as a consequence of the general results in this paper.
Full work available at URL: https://arxiv.org/abs/math/0405287
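The setting described in the abstract, coupled linear iterations driven by i.i.d. noise with a fast and a slow step size, plus Polyak-Ruppert averaging of the slow iterate, can be sketched in a minimal form. The target value `mu`, the step-size exponents, and the particular coupling below are illustrative assumptions for a toy one-dimensional instance, not the paper's exact recursion.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 1.0            # illustrative target the slow iterate should find
x, y = 0.0, 0.0     # fast iterate x, slow iterate y
y_avg, n = 0.0, 0   # Polyak-Ruppert running average of the slow iterate

for k in range(1, 200_001):
    a_k = 1.0 / k**0.6   # fast step size
    b_k = 1.0 / k        # slow step size; b_k / a_k -> 0 (two time scales)
    # Fast iterate tracks the (quasi-static) slow iterate through i.i.d. noise.
    x += a_k * (y - x + rng.standard_normal())
    # Slow iterate drifts toward mu, but only observes the fast iterate.
    y += b_k * (mu - x + rng.standard_normal())
    # Polyak-Ruppert averaging of the slow iterate.
    n += 1
    y_avg += (y - y_avg) / n

print(f"slow iterate y = {y:.3f}, averaged iterate = {y_avg:.3f}")
```

Because `b_k / a_k -> 0`, the fast iterate effectively equilibrates between slow updates, which is the decoupling that the paper's asymptotic covariance and normality results make precise.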
Recommendations
- Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms
- A generalization of the averaging procedure: the use of two-time-scale algorithms
- Stochastic Approximation with Averaging of the Iterates: Optimal Asymptotic Rate of Convergence for General Processes
- Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning
- Convergence rates and decoupling in linear stochastic approximation algorithms
Cites Work
- Stochastic approximation methods for constrained and unconstrained systems
- Title not available
- Acceleration of Stochastic Approximation by Averaging
- Title not available
- Title not available
- Title not available
- Convergence and convergence rate of iterative stochastic algorithms. I: General case
- On Actor-Critic Algorithms
- Actor-Critic-Type Learning Algorithms for Markov Decision Processes
- Stochastic approximation with two time scales
- Applications of Singular Perturbation Techniques to Control Problems
- New method of stochastic approximation type
- Stochastic Approximation with Averaging of the Iterates: Optimal Asymptotic Rate of Convergence for General Processes
Cited In (30)
- Generative adversarial networks are special cases of artificial curiosity (1990) and also closely related to predictability minimization (1991)
- Computing VaR and CVaR using stochastic approximation and adaptive unconstrained importance sampling
- GADE: a generative adversarial approach to density estimation and its applications
- A Two-Timescale Stochastic Algorithm Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic
- Asymptotic behavior of multiscale stochastic partial differential equations with Hölder coefficients
- Sequential online subsampling for thinning experimental designs
- Risk-Sensitive Reinforcement Learning via Policy Gradient Search
- Two-time-scale nonparametric recursive regression estimator for independent functional data
- Online calibrated forecasts: memory efficiency versus universality for learning in games
- Variance-constrained actor-critic algorithms for discounted and average reward MDPs
- Non asymptotic controls on a recursive superquantile approximation
- DIMIX: Diminishing Mixing for Sloppy Agents
- Recursive regression estimation based on the two-time-scale stochastic approximation method and Bernstein polynomials
- Gradient-Based Adaptive Stochastic Search for Simulation Optimization Over Continuous Space
- Networks of reinforced stochastic processes: asymptotics for the empirical means
- Finite-Time Analysis and Restarting Scheme for Linear Two-Time-Scale Stochastic Approximation
- A stability criterion for two timescale stochastic approximation schemes
- Change-point monitoring for online stochastic approximations
- Weak convergence of dynamical systems in two timescales
- Stochastic approximation algorithms for superquantiles estimation
- Actor-Critic Algorithms with Online Feature Adaptation
- A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning
- Towards multi‐agent reinforcement learning‐driven over‐the‐counter market simulations
- Empirical dynamic programming
- Geometrical Insights for Implicit Generative Modeling
- Two-timescale stochastic gradient descent in continuous time with applications to joint online parameter estimation and optimal sensor placement
- Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions
- Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms
- Averaging principle and normal deviations for multiscale stochastic systems
- Fundamental design principles for reinforcement learning algorithms
This page was built for publication: Convergence rate of linear two-time-scale stochastic approximation.