An improved algorithm for solving communicating average reward Markov decision processes
Publication: 2638957
DOI: 10.1007/BF02055583
zbMath: 0717.90084
OpenAlex: W2002035237
MaRDI QID: Q2638957
Moshe Haviv, Martin L. Puterman
Publication date: 1991
Published in: Annals of Operations Research
Full work available at URL: https://doi.org/10.1007/bf02055583
Keywords: average reward criterion; policy iteration algorithm; communicating Markov decision processes; multichain policies; unichain policies
MSC classification: Markov and semi-Markov decision processes (90C40); Computational methods for problems pertaining to operations research and mathematical programming (90-08)
Related Items (4)
- An effective numerical method for controlled routing in large trunk line networks
- On some algorithms for limiting average Markov decision processes
- Exact decomposition approaches for Markov decision processes: a survey
- A decomposition algorithm for limiting average Markov decision problems
Cites Work
- Communicating MDPs: Equivalence and LP properties
- On the Convergence of Policy Iteration in Finite State Undiscounted Markov Decision Processes: The Unichain Case
- Optimal decision procedures for finite Markov chains. Part II: Communicating systems
- Computing Optimal Policies for Controlled Tandem Queueing Systems
- Discrete Dynamic Programming
- Denumerable State Markovian Decision Processes-Average Cost Criterion
- An algorithm for identifying the ergodic subchains and transient states of a stochastic matrix