The convergence of value iteration in average cost Markov decision chains
DOI: 10.1016/0167-6377(96)00018-1
zbMath: 0865.90134
MaRDI QID: Q2564235
Publication date: 7 January 1997
Published in: Operations Research Letters
Full work available at URL: https://doi.org/10.1016/0167-6377(96)00018-1
Keywords: stochastic dynamic programming; value iteration; countable state space; Markov decision chain; minimum long-run expected average cost
90C40: Markov and semi-Markov decision processes
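For orientation, the method at issue is value iteration under the long-run average cost criterion. The sketch below shows relative value iteration on a small finite MDP; it is purely illustrative and not the paper's algorithm, which concerns countable state spaces under additional conditions. All model data (state and action counts, costs, transition matrices) are hypothetical.

```python
import numpy as np

# Relative value iteration for the long-run average cost criterion on a
# small *finite* MDP. Illustrative only: the paper treats countable state
# spaces, and all model data below are made up.

# cost[s, a]: one-stage cost of taking action a in state s.
cost = np.array([[1.0, 2.0],
                 [0.5, 1.5],
                 [2.0, 0.2]])

# P[a, s, t]: probability of moving from state s to state t under action a.
P = np.array([[[0.6, 0.3, 0.1],
               [0.2, 0.5, 0.3],
               [0.1, 0.2, 0.7]],
              [[0.1, 0.8, 0.1],
               [0.3, 0.3, 0.4],
               [0.5, 0.4, 0.1]]])

v = np.zeros(cost.shape[0])  # relative value function, pinned to 0 at state 0
for _ in range(10_000):
    # Bellman update: Q[s, a] = c(s, a) + sum_t P(t | s, a) v(t)
    Q = cost + np.einsum("ast,t->sa", P, v)
    Tv = Q.min(axis=1)
    diff = Tv - v
    if diff.max() - diff.min() < 1e-10:  # span-seminorm stopping rule
        break
    v = Tv - Tv[0]  # renormalize at the reference state to keep iterates bounded

print("estimated optimal average cost:", Tv[0])
print("greedy stationary policy:", Q.argmin(axis=1))
```

Plain value iteration need not converge for periodic chains; a standard remedy is an aperiodicity transformation (damping each transition matrix toward the identity). Extending convergence to countable state spaces with possibly unbounded costs requires conditions of the kind discussed in the works cited below.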
Cites Work
- Another set of conditions for average optimality in Markov control processes
- On strong average optimality of Markov decision processes with unbounded costs
- Comparing recent assumptions for the existence of average optimal stationary policies
- Optimal control of diffusion processes with reflection
- On Minimum Cost Per Unit Time Control of Markov Chains
- Control of Markov Chains with Long-Run Average Cost Criterion: The Dynamic Programming Equations
- Average Cost Optimal Stationary Policies in Infinite State Markov Decision Processes with Unbounded Costs
- Linear Programming and Average Optimality of Markov Control Processes on Borel Spaces—Unbounded Costs
- Discrete-Time Controlled Markov Processes with Average Cost Criterion: A Survey