Uniform convergence of value iteration policies for discounted Markov decision processes
From MaRDI portal
Publication: Q2467010
zbMATH Open: 1136.90042
MaRDI QID: Q2467010
Authors: Daniel Cruz-Suárez, Raúl Montes-de-Oca
Publication date: 18 January 2008
Published in: Boletín de la Sociedad Matemática Mexicana. Third Series
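The publication concerns value iteration for discounted Markov decision processes. As background, a minimal sketch of the standard value-iteration algorithm follows; the two-state MDP below is a hypothetical example for illustration only and is not taken from the paper.

```python
# Minimal value-iteration sketch for a discounted MDP (illustrative only).
# P[s][a][t] = probability of moving from state s to state t under action a.
# R[s][a]    = expected one-step reward for action a in state s.

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    n_states = len(P)
    n_actions = len(P[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman optimality backup: V_new(s) = max_a [ R(s,a) + gamma * E V ].
        V_new = [
            max(
                R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        # The backup is a sup-norm contraction with modulus gamma, so
        # successive iterates converging implies closeness to the optimum.
        if max(abs(V_new[s] - V[s]) for s in range(n_states)) < tol:
            V = V_new
            break
        V = V_new
    # Greedy policy with respect to the (approximately) optimal values.
    policy = [
        max(
            range(n_actions),
            key=lambda a: R[s][a]
            + gamma * sum(P[s][a][t] * V[t] for t in range(n_states)),
        )
        for s in range(n_states)
    ]
    return V, policy

# Hypothetical two-state, two-action MDP.
P = [
    [[0.8, 0.2], [0.1, 0.9]],  # transitions from state 0 under actions 0, 1
    [[0.5, 0.5], [0.0, 1.0]],  # transitions from state 1 under actions 0, 1
]
R = [
    [1.0, 0.0],  # rewards in state 0
    [0.0, 2.0],  # rewards in state 1
]
V, policy = value_iteration(P, R)
```

Here action 1 in state 1 yields reward 2 and keeps the chain in state 1, so its value approaches 2/(1 - 0.9) = 20; the greedy policies produced by successive iterates stabilize long before the values themselves converge, which is the phenomenon the paper's uniform-convergence results address.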
Recommendations
- The convergence of value iteration in discounted Markov decision processes
- On the Convergence of Policy Iteration in Finite State Undiscounted Markov Decision Processes: The Unichain Case
- On convergence of value iteration for a class of total cost Markov decision processes
- Pointwise approximations of discounted Markov decision processes to optimal policies
- The value iteration method for countable state Markov decision processes
Cited In (14)
- Simulation‐based Uniform Value Function Estimates of Markov Decision Processes
- A stopping rule for discounted Markov decision processes with finite action sets
- Nonuniqueness versus uniqueness of optimal policies in convex discounted Markov decision processes
- Pointwise approximations of discounted Markov decision processes to optimal policies
- Identification of optimal policies in Markov decision processes
- An empirical study of policy convergence in Markov decision process value iteration
- On the Convergence of Policy Iteration in Finite State Undiscounted Markov Decision Processes: The Unichain Case
- Suboptimality of the value iteration policies in discounted linear-quadratic models
- The convergence of value iteration in discounted Markov decision processes
- A Lyapunov-based version of the value iteration algorithm formulated as a discrete-time switched affine system
- Convergence Properties of Policy Iteration
- Title not available
- Convergence in unconstrained discrete-time differential dynamic programming
- A Note on the Convergence of Policy Iteration in Markov Decision Processes with Compact Action Spaces