An actor-critic algorithm for constrained Markov decision processes
From MaRDI portal
Publication:2504518
DOI: 10.1016/j.sysconle.2004.08.007; zbMath: 1129.90322; OpenAlex: W2070570138; MaRDI QID: Q2504518
Publication date: 25 September 2006
Published in: Systems & Control Letters
Full work available at URL: https://doi.org/10.1016/j.sysconle.2004.08.007
stochastic approximation, reinforcement learning, envelope theorem, actor-critic algorithms, constrained Markov decision processes
Related Items
A new learning algorithm for optimal stopping, An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes, Dimension reduction based adaptive dynamic programming for optimal control of discrete-time nonlinear control-affine systems, Risk-Sensitive Reinforcement Learning via Policy Gradient Search, Variance-constrained actor-critic algorithms for discounted and average reward MDPs, Safety-constrained reinforcement learning with a distributional safety critic, An online actor-critic algorithm with function approximation for constrained Markov decision processes, Approachability in Stackelberg stochastic games with vector costs, Delay-aware online service scheduling in high-speed railway communication systems, Quasi-Newton smoothed functional algorithms for unconstrained and constrained simulation optimization, Risk-Constrained Reinforcement Learning with Percentile Risk Criteria, Optimal Distributed Uplink Channel Allocation: A Constrained MDP Formulation, Opportunistic Transmission over Randomly Varying Channels, A note on linear function approximation using random projections, Finite-Time Analysis and Restarting Scheme for Linear Two-Time-Scale Stochastic Approximation, Whittle index based Q-learning for restless bandits with average reward
Cites Work
- Stochastic approximation with two time scales
- An analysis of temporal-difference learning with function approximation
- On Actor-Critic Algorithms
- Actor-Critic-Type Learning Algorithms for Markov Decision Processes
- Envelope Theorems for Arbitrary Choice Sets
- Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations
- Unnamed Item