Q-learning and policy iteration algorithms for stochastic shortest path problems
From MaRDI portal
Recommendations
- On boundedness of Q-learning iterates for stochastic shortest path problems
- Q-learning and enhanced policy iteration in discounted dynamic programming
- Learning algorithms for Markov decision processes with average cost
- On the Speed of Convergence of Value Iteration on Stochastic Shortest-Path Problems
- New algorithms of the Q-learning type
Cites work
- scientific article; zbMATH DE number 3924501
- scientific article; zbMATH DE number 51132
- scientific article; zbMATH DE number 1321699
- scientific article; zbMATH DE number 700091
- (Approximate) iterated successive approximations algorithm for sequential decision processes
- A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
- An Analysis of Stochastic Shortest Path Problems
- Asynchronous Iterative Methods for Multiprocessors
- Asynchronous stochastic approximation and Q-learning
- Discrete Dynamic Programming with Sensitive Discount Optimality Criteria
- Distributed asynchronous computation of fixed points
- Distributed dynamic programming
- Feature-based methods for large scale dynamic programming
- Finite state Markovian decision processes
- Neuro-Dynamic Programming: An Overview and Recent Results
- On Stationary Strategies in Borel Dynamic Programming
- On boundedness of Q-learning iterates for stochastic shortest path problems
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives
- Projected equation methods for approximate solution of large linear systems
- Q-learning and enhanced policy iteration in discounted dynamic programming
Cited in (13)
- Learning algorithms for Markov decision processes with average cost
- A mixed value and policy iteration method for stochastic control with universally measurable policies
- Q-learning and enhanced policy iteration in discounted dynamic programming
- Error bounds for constant step-size \(Q\)-learning
- Robust shortest path planning and semicontractive dynamic programming
- On boundedness of Q-learning iterates for stochastic shortest path problems
- scientific article; zbMATH DE number 5141522
- Stochastic Approximation for Nonexpansive Maps: Application to Q-Learning Algorithms
- Utilizing distributed learning automata to solve stochastic shortest path problems
- On the Speed of Convergence of Value Iteration on Stochastic Shortest-Path Problems
- New algorithms of the Q-learning type
- Proximal algorithms and temporal difference methods for solving fixed point problems
- Fundamental design principles for reinforcement learning algorithms