Potential-Based Online Policy Iteration Algorithms for Markov Decision Processes
From MaRDI portal
Publication:5273720
DOI: 10.1109/TAC.2004.825647
zbMath: 1365.90259
Wikidata: Q114985635
Scholia: Q114985635
MaRDI QID: Q5273720
Publication date: 12 July 2017
Published in: IEEE Transactions on Automatic Control
Markov renewal processes, semi-Markov processes (60K15) ⋮ Markov and semi-Markov decision processes (90C40) ⋮ Applications of Markov renewal processes (reliability, queueing networks, etc.) (60K20)
Related Items
Finding optimal memoryless policies of POMDPs under the expected average reward criterion ⋮ Temporal difference-based policy iteration for optimal control of stochastic systems ⋮ Basic ideas for event-based optimization of Markov systems ⋮ Early developments of control theory in China ⋮ A performance gradient perspective on gradient‐based policy iteration and a modified value iteration ⋮ Look-ahead control of conveyor-serviced production station by using potential-based online policy iteration