Value iteration for long-run average reward in Markov decision processes

From MaRDI portal

Publication:2151247

DOI: 10.1007/978-3-319-63387-9_10 ⋮ zbMath: 1494.68126 ⋮ arXiv: 1705.02326 ⋮ OpenAlex: W2612788245 ⋮ MaRDI QID: Q2151247

Pranav Ashok, Tobias Meggendorfer, Jan Křetínský, Krishnendu Chatterjee, Przemysław Daca

Publication date: 1 July 2022

Full work available at URL: https://arxiv.org/abs/1705.02326

Mathematics Subject Classification ID

Formal languages and automata (68Q45) ⋮ Markov and semi-Markov decision processes (90C40) ⋮ Probability in computer science (algorithm analysis, random structures, phase transitions, etc.) (68Q87)

Related Items (6)

Markov automata with multiple objectives ⋮ Value iteration for simple stochastic games: stopping criterion and learning algorithm ⋮ Unnamed Item ⋮ Unnamed Item ⋮ Economic design of memory-type control charts: the fallacy of the formula proposed by Lorenzen and Vance (1986) ⋮ Multi-objective optimization of long-run average and total rewards
