An online actor-critic algorithm with function approximation for constrained Markov decision processes
DOI: 10.1007/s10957-012-9989-5 · zbMATH Open: 1262.90189 · OpenAlex: W2073314543 · MaRDI QID: Q438776
Authors: Shalabh Bhatnagar, K. Lakshmanan
Publication date: 31 July 2012
Published in: Journal of Optimization Theory and Applications
Full work available at URL: https://doi.org/10.1007/s10957-012-9989-5
Recommendations
- An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
- An actor-critic algorithm for constrained Markov decision processes
- A constrained optimization perspective on actor-critic algorithms and application to network routing
- Actor-critic algorithms with online feature adaptation
- Learning algorithms for finite horizon constrained Markov decision processes
Keywords: function approximation; actor-critic algorithm; constrained Markov decision process; long-run average cost criterion
MSC: Markov chains (discrete-time Markov processes on discrete state spaces) (60J10); Markov and semi-Markov decision processes (90C40)
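The keywords above describe a constrained MDP solved by an actor-critic method with function approximation under the long-run average cost criterion. As a rough illustration of what such a method looks like (this is a generic Lagrangian actor-critic sketch on a toy problem, not the algorithm from the paper; the feature dimensions, step-size schedules, and cost matrices are all illustrative assumptions):

```python
import numpy as np

# Hedged sketch: a Lagrangian actor-critic step with linear function
# approximation on a toy 2-state MDP with uniform random transitions.
# Illustrative only -- not the paper's algorithm or its convergence setup.

rng = np.random.default_rng(0)
n_states, n_actions, n_features = 2, 2, 3

phi = rng.standard_normal((n_states, n_features))  # critic features: V(s) ~ phi[s] @ v
theta = np.zeros((n_states, n_actions))            # actor: softmax policy parameters
v = np.zeros(n_features)                           # critic weights
rho = 0.0                                          # running average-cost estimate
lam = 0.0                                          # Lagrange multiplier for the constraint

cost = rng.uniform(0, 1, (n_states, n_actions))             # single-stage cost c(s, a)
constraint_cost = rng.uniform(0, 1, (n_states, n_actions))  # g(s, a), want long-run avg <= alpha
alpha = 0.5

def policy(s):
    p = np.exp(theta[s] - theta[s].max())
    return p / p.sum()

s = 0
for t in range(1, 5001):
    p = policy(s)
    a = rng.choice(n_actions, p=p)
    s_next = rng.integers(n_states)  # toy dynamics: uniform transitions

    # Lagrangian single-stage cost: c(s,a) + lam * (g(s,a) - alpha)
    c = cost[s, a] + lam * (constraint_cost[s, a] - alpha)

    # TD error for the average-cost criterion
    delta = c - rho + phi[s_next] @ v - phi[s] @ v

    # multi-timescale step sizes: critic fastest, actor slower, multiplier slowest
    a_t, b_t, c_t = 1.0 / t**0.55, 1.0 / t**0.8, 1.0 / t

    rho += a_t * (c - rho)           # average-cost estimate
    v += a_t * delta * phi[s]        # critic update
    grad_log = -p
    grad_log[a] += 1.0               # grad of log softmax policy at (s, a)
    theta[s] -= b_t * delta * grad_log  # actor: descent, since we minimize cost
    lam = max(0.0, lam + c_t * (constraint_cost[s, a] - alpha))  # projected ascent

    s = s_next
```

The three step-size schedules sketch the usual multi-timescale structure of such algorithms: the critic tracks the current policy, the actor moves more slowly, and the Lagrange multiplier adapts on the slowest timescale.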
Cites Work
- Title not available
- Title not available
- Title not available
- Perturbation theory and finite Markov chains
- Natural actor-critic algorithms
- Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
- Title not available
- On Actor-Critic Algorithms
- Simulation-based optimization of Markov reward processes
- Title not available
- Average cost temporal-difference learning
- Asynchronous Stochastic Approximations
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
- An actor-critic algorithm for constrained Markov decision processes
- Optimal flow control of a class of queueing networks in equilibrium
- The Borkar-Meyn theorem for asynchronous stochastic approximations
Cited In (12)
- Queueing Network Controls via Deep Reinforcement Learning
- Event-based optimization approach for solving stochastic decision problems with probabilistic constraint
- Variance-constrained actor-critic algorithms for discounted and average reward MDPs
- On the sample complexity of actor-critic method for reinforcement learning with function approximation
- Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
- Optimal deterministic controller synthesis from steady-state distributions
- Multiscale Q-learning with linear function approximation
- An actor-critic algorithm for constrained Markov decision processes
- Learning algorithms for finite horizon constrained Markov decision processes
- An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
- Suboptimal control for nonlinear systems with disturbance via integral sliding mode control and policy iteration
- An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions