The method of value oriented successive approximations for the average reward Markov decision process
From MaRDI portal
Publication:1144501
DOI: 10.1007/BF01719500 · zbMath: 0443.90109 · OpenAlex: W2123114919 · MaRDI QID: Q1144501
Publication date: 1980
Published in: OR Spektrum
Full work available at URL: https://doi.org/10.1007/bf01719500
Keywords: convergence; average reward; finite action space; finite state space; almost optimal solutions; value oriented successive approximations
Numerical mathematical programming methods (65K05); Markov and semi-Markov decision processes (90C40)
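The record classifies this paper under undiscounted (average-reward) Markov decision processes solved by successive approximations. Since only the metadata is available here, the following is a minimal, hedged sketch of the general technique named in the title — undiscounted value iteration with span-based stopping and gain bounds — not a reconstruction of the paper's exact "value oriented" variant. All function names and the toy transition data are invented for illustration.

```python
import numpy as np

def value_iteration_average_reward(P, r, tol=1e-8, max_iter=10_000):
    """Sketch of successive approximations for a finite average-reward MDP.

    P[a]: n x n transition matrix for action a; r[a]: length-n reward vector.
    Returns an estimate of the optimal gain and a greedy policy.
    """
    n = P[0].shape[0]
    v = np.zeros(n)
    for _ in range(max_iter):
        # One-step lookahead (Bellman operator) over all actions.
        q = np.array([r[a] + P[a] @ v for a in range(len(P))])
        v_new = q.max(axis=0)
        diff = v_new - v
        # The span seminorm of successive differences brackets the gain
        # (Odoni-type bounds) and serves as the stopping criterion.
        span = diff.max() - diff.min()
        # Subtract a constant so the iterates stay bounded in the
        # undiscounted case (relative value iteration).
        v = v_new - v_new.min()
        if span < tol:
            break
    gain = (diff.max() + diff.min()) / 2  # midpoint of the gain bounds
    policy = q.argmax(axis=0)             # greedy (almost optimal) policy
    return gain, policy

# Two-state, two-action toy example (invented data).
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.5, 0.5], [0.6, 0.4]])]
r = [np.array([1.0, 0.0]), np.array([0.5, 2.0])]
gain, policy = value_iteration_average_reward(P, r)
```

Geometric convergence of the span, and hence of the bounds on the maximal gain, holds under aperiodicity/unichain-type conditions of the kind studied in the works cited below.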
Related Items
- The numerical exploitation of periodicity in Markov decision processes
- A value iteration method for undiscounted multichain Markov decision processes
- MARKOV DECISION PROCESSES
Cites Work
- A successive approximation algorithm for an undiscounted Markov decision process
- Dynamic programming, Markov chains, and the method of successive approximations
- Iterative solution of the functional equations of undiscounted Markov renewal programming
- Discounting, Ergodicity and Convergence for Markov Decision Processes
- A set of successive approximation methods for discounted Markovian decision problems
- Technical Note—Improved Conditions for Convergence in Undiscounted Markov Renewal Programming
- Geometric convergence of value-iteration in multichain Markov decision problems
- The Asymptotic Behavior of Undiscounted Value Iteration in Markov Decision Problems
- Optimal decision procedures for finite Markov chains. Part II: Communicating systems
- Technical Note—The Method of Successive Approximations and Markovian Decision Problems
- On Finding the Maximal Gain for Markov Decision Processes
- Technical Note—Bounds on the Gain of a Markov Decision Process
- Technical Note—Undiscounted Markov Renewal Programming Via Modified Successive Approximations
- Some Bounds for Discounted Sequential Decision Processes