Strong 1-optimal stationary policies in denumerable Markov decision processes (Q1108940)
scientific article
1988
Consider a Markov decision process with countable state space \(S\), compact action sets and bounded rewards. Let \(V_{\alpha}(\pi,i)\) denote the expected \(\alpha\)-discounted reward under policy \(\pi\), starting in state \(i\). A policy \(\pi^*\) is called a strong 1-optimal policy (SOP) if, for each \(i\in S\), \(\lim_{\alpha \to 1}[V_{\alpha}(\pi^*,i)-\sup_{\pi}V_{\alpha}(\pi,i)]=0\). Under a standard set of assumptions (including the simultaneous Doeblin condition) guaranteeing the existence of a stationary average optimal policy, the author proves that (i) a stationary SOP exists, and (ii) any limit point, as \(\alpha \to 1\), of stationary \(\alpha\)-discounted optimal policies is a (stationary) SOP.
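The results are existence statements and the paper gives no algorithm; purely as an illustrative sketch (not taken from the source), the Python snippet below checks the defining limit of strong 1-optimality on a small hypothetical two-state, two-action MDP. The transition probabilities, rewards, and the choice of \(\alpha=0.999\) for selecting a candidate \(\pi^*\) are invented for this example; a finite model stands in for the countable-state setting of the paper.

```python
import itertools
import numpy as np

# Hypothetical toy MDP (illustration only, not from the paper):
# states S = {0, 1}, two actions per state, bounded rewards.
# P[a][i, j] = transition probability under action a, r[a][i] = one-step reward.
S, A = 2, 2
P = np.array([
    [[0.9, 0.1],    # action 0
     [0.2, 0.8]],
    [[0.5, 0.5],    # action 1
     [0.7, 0.3]],
])
r = np.array([
    [1.0, 0.0],     # rewards for action 0 in states 0, 1
    [0.8, 0.5],     # rewards for action 1 in states 0, 1
])

def value(policy, alpha):
    """Exact alpha-discounted value of a stationary deterministic policy:
    V = (I - alpha * P_pi)^{-1} r_pi."""
    P_pi = np.array([P[policy[i], i] for i in range(S)])
    r_pi = np.array([r[policy[i], i] for i in range(S)])
    return np.linalg.solve(np.eye(S) - alpha * P_pi, r_pi)

def optimal_value(alpha):
    """sup_pi V_alpha(pi, .): for a finite MDP it suffices to enumerate the
    stationary deterministic policies."""
    return np.max([value(pi, alpha) for pi in itertools.product(range(A), repeat=S)], axis=0)

# Candidate pi*: stands in for a limit point of alpha-discount-optimal policies
# as alpha -> 1 (here simply the discount-optimal policy at alpha = 0.999).
alpha_ref = 0.999
policies = list(itertools.product(range(A), repeat=S))
pi_star = max(policies, key=lambda pi: value(pi, alpha_ref).sum())

# Strong 1-optimality: V_alpha(pi*, i) - sup_pi V_alpha(pi, i) -> 0 as alpha -> 1.
for alpha in [0.9, 0.99, 0.999, 0.9999]:
    gap = value(pi_star, alpha) - optimal_value(alpha)
    print(f"alpha = {alpha}: gap = {gap}")
```

For this toy model the gap is nonpositive and tends to 0 as \(\alpha \to 1\), which is exactly the statewise limit in the definition of an SOP above.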
Markov decision process
countable state space
compact action sets
bounded rewards
\(\alpha\)-discounted reward
strong 1-optimal policy
simultaneous Doeblin condition
stationary average optimal policy