Value iteration for long-run average reward in Markov decision processes
From MaRDI portal
Publication:2151247
DOI: 10.1007/978-3-319-63387-9_10 ⋮ zbMath: 1494.68126 ⋮ arXiv: 1705.02326 ⋮ OpenAlex: W2612788245 ⋮ MaRDI QID: Q2151247
Pranav Ashok, Tobias Meggendorfer, Jan Křetínský, Krishnendu Chatterjee, Przemysław Daca
Publication date: 1 July 2022
Full work available at URL: https://arxiv.org/abs/1705.02326
Formal languages and automata (68Q45) ⋮ Markov and semi-Markov decision processes (90C40) ⋮ Probability in computer science (algorithm analysis, random structures, phase transitions, etc.) (68Q87)
Related Items (6)
Markov automata with multiple objectives ⋮ Value iteration for simple stochastic games: stopping criterion and learning algorithm ⋮ Unnamed Item ⋮ Unnamed Item ⋮ Economic design of memory-type control charts: the fallacy of the formula proposed by Lorenzen and Vance (1986) ⋮ Multi-objective optimization of long-run average and total rewards