Smoothed functional-based gradient algorithms for off-policy reinforcement learning: a non-asymptotic viewpoint
Publication: 2242923
DOI: 10.1016/j.sysconle.2021.104988
OpenAlex: W3185667776
Wikidata: Q115036591 (Scholia: Q115036591)
MaRDI QID: Q2242923
L. A. Prashanth, Nithia Vijayan
Publication date: 10 November 2021
Published in: Systems & Control Letters
Full work available at URL: https://arxiv.org/abs/2101.02137
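The title refers to smoothed functional (SF) gradient estimation, a zeroth-order method that estimates gradients from function evaluations alone by perturbing the parameter along random smoothing directions. The following is a minimal, self-contained sketch of the generic two-sided Gaussian-smoothing SF estimator; all names and parameters here are hypothetical illustrations, and this is not the paper's specific off-policy algorithm.

```python
import numpy as np

def sf_gradient_estimate(f, theta, beta=0.1, n_samples=10, rng=None):
    """Two-sided smoothed functional (Gaussian smoothing) gradient estimate.

    Approximates the gradient of f at theta using only function
    evaluations: perturb theta along random Gaussian directions and
    average the resulting central differences. Illustrative sketch of
    the generic SF technique, not the paper's estimator.
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(theta, dtype=float)
    for _ in range(n_samples):
        u = rng.standard_normal(theta.shape)  # random smoothing direction
        # Central difference along u; in expectation this equals the
        # gradient of the beta-smoothed version of f.
        grad += (f(theta + beta * u) - f(theta - beta * u)) / (2.0 * beta) * u
    return grad / n_samples

# Usage: gradient ascent on a toy concave objective, maximized at theta = 1.
f = lambda th: -np.sum((th - 1.0) ** 2)
theta = np.zeros(3)
for _ in range(200):
    theta += 0.05 * sf_gradient_estimate(f, theta, beta=0.05, n_samples=20)
print(theta)  # close to [1., 1., 1.]
```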
Cites Work
- Stochastic recursive algorithms for optimization. Simultaneous perturbation methods
- An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
- Global versus local asymptotic theories of finite-dimensional normed spaces
- Introductory lectures on convex optimization. A basic course.
- On the information-adaptive variants of the ADMM: an iteration complexity perspective
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Random gradient-free minimization of convex functions
- Convergence of a class of random search algorithms
- Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
- Simulation-based optimization of Markov reward processes
- Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
- A Simultaneous Perturbation Stochastic Approximation-Based Actor–Critic Algorithm for Markov Decision Processes
- An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback
- Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming