New algorithms of the Q-learning type
Publication:2440701
DOI: 10.1016/j.automatica.2007.09.009
zbMath: 1283.93328
OpenAlex: W2118458590
MaRDI QID: Q2440701
Shalabh Bhatnagar, K. Mohan Babu
Publication date: 19 March 2014
Published in: Automatica
Full work available at URL: https://doi.org/10.1016/j.automatica.2007.09.009
Learning and adaptive systems in artificial intelligence (68T05)
Stochastic learning and adaptive control (93E35)
Related Items (3)
A constrained optimization perspective on actor-critic algorithms and application to network routing ⋮ Multiscale Q-learning with linear function approximation ⋮ Approximate stochastic annealing for online control of infinite horizon Markov decision processes
Cites Work
- Unnamed Item
- A one-measurement form of simultaneous perturbation stochastic approximation
- Asynchronous stochastic approximation and Q-learning
- Stochastic approximation with two time scales
- \({\mathcal Q}\)-learning
- Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
- Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences
- Actor-Critic-Type Learning Algorithms for Markov Decision Processes