Discounted and average Markov decision processes with unbounded rewards: New conditions (Q1206951)
scientific article
Statements
Discounted and average Markov decision processes with unbounded rewards: New conditions (English)
1 April 1993
For Markov decision processes (MDPs) with discrete time and a countable state space, the paper gives conditions under which the known optimality equation has a unique solution in the discounted case and a bounded solution in the average case; the conditions also cover unbounded rewards. They are weaker than those given by J. M. Harrison [Ann. Math. Statist. 43, No. 2, 636-644 (1972; Zbl 0262.90064)] for the discounted case and by S. M. Ross [Ann. Math. Statist. 39, No. 2, 412-423 (1968; Zbl 0157.505)] for the average case. Because these new conditions are difficult to verify, further sufficient conditions, which operate recursively, are given. In the last part of the paper, continuous-time MDPs are investigated by the same methods and reduced to discrete-time ones. The paper is rich in material, so only the essentially new lemmas and theorems are proved, and not all of the symbols used are explicitly defined. However, many remarks help the reader connect the results to previously known facts.
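For orientation, the optimality equations referred to in the review are usually written in the following textbook form. This is only a sketch under assumed notation, not necessarily the notation of the paper: the countable state space S, the admissible action sets A(i), the one-step reward r(i,a), the transition probabilities p(j | i,a), the discount factor β in (0,1), and the unknowns v, g and h are all symbols introduced here for illustration.

```latex
% Discounted case: v is the unknown value function (assumed notation).
v(i) = \sup_{a \in A(i)} \Bigl[ r(i,a) + \beta \sum_{j \in S} p(j \mid i,a)\, v(j) \Bigr],
\qquad i \in S.

% Average case: g is the unknown constant gain, h the unknown relative value function.
g + h(i) = \sup_{a \in A(i)} \Bigl[ r(i,a) + \sum_{j \in S} p(j \mid i,a)\, h(j) \Bigr],
\qquad i \in S.
```

With unbounded r, the sums on the right-hand side need not be finite without further assumptions, which is why conditions of the kind described in the review are needed.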
discounted case
discrete time
countable state space
average case