Actor-critic algorithms for hierarchical Markov decision processes
From MaRDI portal
Publication: 856510
DOI: 10.1016/j.automatica.2005.12.010 · zbMath: 1102.93043 · OpenAlex: W2027220971 · MaRDI QID: Q856510
J. Ranjan Panigrahi, Shalabh Bhatnagar
Publication date: 7 December 2006
Published in: Automatica
Full work available at URL: https://doi.org/10.1016/j.automatica.2005.12.010
Keywords: optimal control; learning algorithms; Markov decision processes; stochastic approximation; hierarchical decision making
MSC classifications: Optimal stochastic control (93E20) · Stochastic learning and adaptive control (93E35) · Stochastic systems in control theory (general) (93E03)
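This record does not reproduce the paper's algorithm. As illustrative background for the title and keywords, a minimal tabular actor-critic update (the general technique, not the paper's hierarchical, SPSA-based variant) might be sketched as follows; the 2-state MDP, step sizes, and discount factor are all assumptions chosen for the example.

```python
import numpy as np

# Minimal tabular actor-critic sketch on a toy 2-state, 2-action MDP.
# All quantities (dynamics, rewards, step sizes) are illustrative
# assumptions, not taken from the cited paper.

rng = np.random.default_rng(0)
n_states, n_actions = 2, 2

# Toy deterministic dynamics: P[s, a] is the next state, R[s, a] the reward.
P = np.array([[0, 1], [1, 0]])
R = np.array([[0.0, 1.0], [1.0, 0.0]])

theta = np.zeros((n_states, n_actions))   # actor: policy preferences
V = np.zeros(n_states)                    # critic: state-value estimates
gamma, alpha_c, alpha_a = 0.9, 0.1, 0.01  # discount; critic on the faster timescale

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

s = 0
for _ in range(5000):
    probs = softmax(theta[s])
    a = rng.choice(n_actions, p=probs)
    s_next, r = P[s, a], R[s, a]
    # The TD error drives both the critic and the actor updates.
    delta = r + gamma * V[s_next] - V[s]
    V[s] += alpha_c * delta
    # Policy-gradient step: grad of log pi(a|s) w.r.t. theta[s] is e_a - probs.
    grad = -probs
    grad[a] += 1.0
    theta[s] += alpha_a * delta * grad
    s = s_next
```

In this toy MDP, action 1 in state 0 and action 0 in state 1 earn reward 1, so after training the learned policy should prefer those actions.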
Cites Work
- A one-measurement form of simultaneous perturbation stochastic approximation
- Asynchronous stochastic approximation and Q-learning
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
- Multilayer control of large Markov chains
- On Actor-Critic Algorithms
- Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences
- Actor-Critic-Type Learning Algorithms for Markov Decision Processes
- Multitime scale Markov decision processes
- A Simultaneous Perturbation Stochastic Approximation-Based Actor–Critic Algorithm for Markov Decision Processes