Action-dependent stopping times and Markov decision process with unbounded rewards


Publication:1158111


DOI: 10.1007/BF01783952, zbMath: 0471.90094, OpenAlex: W2058997544, MaRDI QID: Q1158111

J. A. E. E. van Nunen, Shaler Stidham, Jr.

Publication date: 1981

Published in: OR Spektrum

Full work available at URL: https://doi.org/10.1007/bf01783952

zbMATH Keywords

algorithm; upper bounds; lower bounds; unbounded rewards; actions-dependent stopping time; elimination of non-optimal actions; equal-row-sum property; semi-Markov decision processes; successive-approximation method

Mathematics Subject Classification ID

Stopping times; optimal stopping problems; gambling theory (60G40)
Markov renewal processes, semi-Markov processes (60K15)
Markov and semi-Markov decision processes (90C40)

Related Items

On theory and algorithms for Markov decision problems with the total reward criterion
Solving linear systems by methods based on a probabilistic interpretation

