A New Value Iteration method for the Average Cost Dynamic Programming Problem
From MaRDI portal
Publication:4388932
DOI: 10.1137/S0363012995291609
zbMath: 0909.90269
OpenAlex: W2103406407
MaRDI QID: Q4388932
Publication date: 10 May 1998
Published in: SIAM Journal on Control and Optimization
Full work available at URL: https://doi.org/10.1137/s0363012995291609
Mathematics Subject Classification:
Programming involving graphs or networks (90C35)
Dynamic programming in optimal control and differential games (49L20)
Markov and semi-Markov decision processes (90C40)
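The publication concerns value iteration for the average-cost dynamic programming (Markov decision process) problem. As context, the classical baseline in this setting is relative value iteration; the sketch below shows that standard textbook algorithm on a hypothetical two-state MDP. It is not the new method proposed in the paper, and the transition matrices and costs are illustrative assumptions only.

```python
import numpy as np

# Classical relative value iteration for an average-cost MDP.
# Standard textbook baseline, NOT the paper's new method.
# The two-state, two-action MDP below is a hypothetical example.

# P[a] is the transition matrix under action a; c[a] the cost vector.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),   # action 0
     np.array([[0.5, 0.5], [0.7, 0.3]])]   # action 1
c = [np.array([1.0, 3.0]),                 # per-state cost of action 0
     np.array([2.0, 0.5])]                 # per-state cost of action 1

def relative_value_iteration(P, c, ref_state=0, tol=1e-10, max_iter=10_000):
    """Return (average-cost gain estimate, relative value vector h)."""
    n = P[0].shape[0]
    h = np.zeros(n)
    gain = 0.0
    for _ in range(max_iter):
        # Bellman operator: minimize one-step cost plus expected future value.
        Th = np.min([c[a] + P[a] @ h for a in range(len(P))], axis=0)
        gain = Th[ref_state]       # gain estimate, since h[ref_state] == 0
        h_new = Th - gain          # subtract gain to keep iterates bounded
        if np.max(np.abs(h_new - h)) < tol:
            return gain, h_new
        h = h_new
    return gain, h

g, h = relative_value_iteration(P, c)
```

Subtracting the value at a fixed reference state each sweep keeps the iterates bounded; without this normalization, plain value iteration diverges linearly at the rate of the optimal average cost.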
Related Items
Policy iteration type algorithms for recurrent state Markov decision processes
An empirical study of policy convergence in Markov decision process value iteration
Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm
Convex Relaxations for Permutation Problems
Model-based average reward reinforcement learning
Inertial Newton algorithms avoiding strict saddle points
Analyzing anonymity attacks through noisy channels
A unified approach to time-aggregated Markov decision processes