An actor-critic algorithm for constrained Markov decision processes
Publication: 2504518
DOI: 10.1016/J.SYSCONLE.2004.08.007
zbMath: 1129.90322
OpenAlex: W2070570138
MaRDI QID: Q2504518
Publication date: 25 September 2006
Published in: Systems & Control Letters
Full work available at URL: https://doi.org/10.1016/j.sysconle.2004.08.007
Keywords: stochastic approximation; reinforcement learning; envelope theorem; actor-critic algorithms; constrained Markov decision processes
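The keywords point to the paper's main ingredients: multi-timescale stochastic approximation, an actor-critic architecture, and Lagrangian handling of constraints. The sketch below is a minimal tabular illustration of that general idea only, not the paper's algorithm; the toy MDP, step-size exponents, and constraint bound are all assumptions chosen for illustration.

```python
"""Illustrative sketch: tabular actor-critic for a constrained MDP via
Lagrangian relaxation with separated timescales. All problem data are
assumptions; this is not the algorithm analyzed in the paper."""
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-state, 2-action MDP: transitions, reward, and constraint cost.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.1, 0.9]]])      # P[s, a, s']
r = np.array([[1.0, 0.0], [0.0, 2.0]])        # reward r(s, a)
d = np.array([[0.0, 1.0], [1.0, 0.5]])        # constraint cost d(s, a)
alpha = 0.8                                   # bound: long-run avg cost <= alpha

theta = np.zeros((2, 2))   # actor: softmax policy parameters
V = np.zeros(2)            # critic: differential value estimates
rho = 0.0                  # average of the Lagrangian payoff
lam = 0.0                  # Lagrange multiplier for the constraint

def policy(s):
    p = np.exp(theta[s] - theta[s].max())
    return p / p.sum()

s = 0
for t in range(1, 50_000):
    # Separated timescales: critic fastest, actor slower, multiplier slowest.
    a_c, a_a, a_l = 1.0 / (1 + t) ** 0.6, 1.0 / (1 + t) ** 0.8, 1.0 / (1 + t)
    pi = policy(s)
    a = rng.choice(2, p=pi)
    s2 = rng.choice(2, p=P[s, a])

    # Lagrangian payoff: reward minus lam times constraint violation.
    g = r[s, a] - lam * (d[s, a] - alpha)

    # Critic: average-reward TD(0) update.
    delta = g - rho + V[s2] - V[s]
    rho += a_c * (g - rho)
    V[s] += a_c * delta

    # Actor: policy-gradient step driven by the TD error.
    grad_log = -pi
    grad_log[a] += 1.0
    theta[s] += a_a * delta * grad_log

    # Multiplier: ascend on constraint violation, projected onto [0, inf).
    lam = max(0.0, lam + a_l * (d[s, a] - alpha))
    s = s2

print("policy:", np.round([policy(0), policy(1)], 3), "lambda:", round(lam, 3))
```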
Related Items (16)
- A new learning algorithm for optimal stopping
- An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
- Dimension reduction based adaptive dynamic programming for optimal control of discrete-time nonlinear control-affine systems
- Risk-Sensitive Reinforcement Learning via Policy Gradient Search
- Variance-constrained actor-critic algorithms for discounted and average reward MDPs
- Safety-constrained reinforcement learning with a distributional safety critic
- An online actor-critic algorithm with function approximation for constrained Markov decision processes
- Approachability in Stackelberg stochastic games with vector costs
- Delay-aware online service scheduling in high-speed railway communication systems
- Quasi-Newton smoothed functional algorithms for unconstrained and constrained simulation optimization
- Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
- Optimal Distributed Uplink Channel Allocation: A Constrained MDP Formulation
- Opportunistic Transmission over Randomly Varying Channels
- A note on linear function approximation using random projections
- Finite-Time Analysis and Restarting Scheme for Linear Two-Time-Scale Stochastic Approximation
- Whittle index based Q-learning for restless bandits with average reward
Cites Work
- Stochastic approximation with two time scales
- An analysis of temporal-difference learning with function approximation
- On Actor-Critic Algorithms
- Actor-Critic-Type Learning Algorithms for Markov Decision Processes
- Envelope Theorems for Arbitrary Choice Sets
- Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations