An actor-critic algorithm for constrained Markov decision processes
From MaRDI portal
Publication: Q2504518 (MaRDI QID)
DOI: 10.1016/j.sysconle.2004.08.007
zbMath: 1129.90322
Publication date: 25 September 2006
Published in: Systems & Control Letters
Full work available at URL: https://doi.org/10.1016/j.sysconle.2004.08.007
stochastic approximation; reinforcement learning; envelope theorem; actor-critic algorithms; constrained Markov decision processes
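The keywords above point at the paper's main ingredients: a Lagrangian relaxation of the constrained MDP solved by an actor-critic scheme running on coupled stochastic-approximation timescales (critic fastest, actor slower, Lagrange multiplier slowest). A minimal illustrative sketch of that general idea, not the paper's exact algorithm — the toy MDP, costs, constraint bound, and step-size schedules below are all assumptions chosen for illustration — might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-state, 2-action MDP (numbers are illustrative only).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.1, 0.9]]])   # P[s, a, s'] transition kernel
c = np.array([[1.0, 0.2], [0.5, 0.1]])     # running cost to minimise
d = np.array([[0.0, 1.0], [0.0, 1.0]])     # constraint cost; want avg <= alpha
alpha = 0.4                                 # assumed constraint bound
gamma = 0.95                                # discount factor

theta = np.zeros((2, 2))   # actor: softmax policy parameters
V = np.zeros(2)            # critic: value estimates of the Lagrangian cost
lam = 0.0                  # Lagrange multiplier for the constraint

def policy(s):
    """Softmax policy over the two actions in state s."""
    z = np.exp(theta[s] - theta[s].max())
    return z / z.sum()

s = 0
for t in range(1, 100001):
    # Three step-size schedules: critic decays slowest (fastest timescale),
    # then the actor, then the multiplier (slowest timescale).
    a_cr, a_th, a_lm = (1 + t) ** -0.55, (1 + t) ** -0.8, (1 + t) ** -1.0
    pi = policy(s)
    a = rng.choice(2, p=pi)
    s2 = rng.choice(2, p=P[s, a])
    cost = c[s, a] + lam * d[s, a]          # Lagrangian one-step cost
    delta = cost + gamma * V[s2] - V[s]     # TD error on the Lagrangian
    V[s] += a_cr * delta                    # critic update (fast)
    grad = -pi
    grad[a] += 1.0                          # grad of log pi(a|s) for softmax
    theta[s] -= a_th * delta * grad         # actor descent (minimising cost)
    lam = max(0.0, lam + a_lm * (d[s, a] - alpha))  # multiplier ascent, projected
    s = s2
```

Here the unconstrained optimum would always pick the cheap action (which incurs constraint cost 1), so the multiplier rises until a mixed policy roughly meets the bound; the timescale separation is what lets each update treat the slower quantities as quasi-static.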
Related Items
- Optimal Distributed Uplink Channel Allocation: A Constrained MDP Formulation
- An online actor-critic algorithm with function approximation for constrained Markov decision processes
- Quasi-Newton smoothed functional algorithms for unconstrained and constrained simulation optimization
- An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
- A new learning algorithm for optimal stopping
- A note on linear function approximation using random projections
- Opportunistic Transmission over Randomly Varying Channels
Cites Work
- Stochastic approximation with two time scales
- An analysis of temporal-difference learning with function approximation
- On Actor-Critic Algorithms
- Actor-Critic-Type Learning Algorithms for Markov Decision Processes
- Envelope Theorems for Arbitrary Choice Sets
- Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations