Approximate stochastic annealing for online control of infinite horizon Markov decision processes
Publication:1937498
DOI: 10.1016/j.automatica.2012.06.010
zbMath: 1257.93113
MaRDI QID: Q1937498
Publication date: 1 March 2013
Published in: Automatica
Full work available at URL: https://doi.org/10.1016/j.automatica.2012.06.010
60J10: Markov chains (discrete-time Markov processes on discrete state spaces)
93E20: Optimal stochastic control
Related Items
Cites Work
- A survey of some simulation-based algorithms for Markov decision processes
- Stochastic approximation methods for constrained and unconstrained systems
- Asynchronous stochastic approximation and Q-learning
- Convergence results for single-step on-policy reinforcement-learning algorithms
- Conditions for the uniqueness of optimal policies of discounted Markov decision processes
- New algorithms of the Q-learning type
- Reinforcement Learning: A Tutorial Survey and Recent Advances
- On the almost sure convergence of a general stochastic approximation procedure
- Cooling Schedules for Optimal Annealing
- Introduction to Stochastic Search and Optimization
- On Actor-Critic Algorithms
- A Simultaneous Perturbation Stochastic Approximation-Based Actor–Critic Algorithm for Markov Decision Processes
- An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming
- Recursive Learning Automata Approach to Markov Decision Processes
- An Adaptive Sampling Algorithm for Solving Markov Decision Processes
- Probability Inequalities for Sums of Bounded Random Variables
- A Stochastic Approximation Method