Continuous time Markov decision programming with average reward criterion and unbounded reward rate (Q1179405)

Continuous-time Markov decision problems with unbounded reward rates are studied for countable state sets and compact metric action sets. The transition law is described by a controlled conservative transition rate matrix. For these problems the average expected reward is to be maximized over a class of (time-dependent) deterministic Markov strategies for which the resulting transition probabilities are continuous in time. Additional assumptions are given under which stationary optimal policies exist. The essential arguments are based on an embedded finite-state Markov decision chain with bounded rewards.
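
To make the terms "conservative transition rate matrix" and "average expected reward" concrete, here is a minimal sketch for a toy finite-state model with bounded rewards. It does not reproduce the paper's construction for countable state sets and unbounded reward rates. It swaps in a standard technique (uniformization followed by relative value iteration), and the names Q, R, is_conservative and solve_average_reward, as well as all numbers, are illustrative assumptions.

```python
# Toy continuous-time Markov decision problem (all data hypothetical).
# Q[i][a][j] is the transition rate from state i to state j under action a.
# R[i][a] is the reward rate earned in state i under action a.
Q = [
    [[-1.0, 1.0], [-3.0, 3.0]],   # state 0: action 0, action 1
    [[1.0, -1.0], [2.0, -2.0]],   # state 1: action 0, action 1
]
R = [
    [1.0, 0.0],
    [4.0, 4.5],
]


def is_conservative(q, tol=1e-12):
    """Check that off-diagonal rates are nonnegative and every row sums to zero."""
    for i, actions in enumerate(q):
        for rates in actions:
            if any(rate < 0 for j, rate in enumerate(rates) if j != i):
                return False
            if abs(sum(rates)) > tol:
                return False
    return True


def solve_average_reward(q, r, tol=1e-10, max_iter=100_000):
    """Uniformize the rate matrix, then run relative value iteration.

    Returns (gain per unit time, greedy stationary policy).
    Assumes a finite model in which every stationary policy is unichain.
    """
    n = len(q)
    # Uniformization constant strictly above the largest exit rate,
    # so every state keeps a self-loop and the chain is aperiodic.
    lam = 1.0 + max(-q[i][a][i] for i in range(n) for a in range(len(q[i])))
    h = [0.0] * n
    for _ in range(max_iter):
        w, policy = [], []
        for i in range(n):
            best_val, best_a = None, None
            for a in range(len(q[i])):
                # One step of the uniformized chain: P = I + Q / lam.
                val = r[i][a] / lam + sum(
                    ((1.0 if j == i else 0.0) + q[i][a][j] / lam) * h[j]
                    for j in range(n)
                )
                if best_val is None or val > best_val:
                    best_val, best_a = val, a
            w.append(best_val)
            policy.append(best_a)
        diff = [w[i] - h[i] for i in range(n)]
        h = [x - w[0] for x in w]  # renormalize so the values stay bounded
        if max(diff) - min(diff) < tol:
            # Per-step gain of the uniformized chain, rescaled to unit time.
            return lam * 0.5 * (max(diff) + min(diff)), policy
    raise RuntimeError("relative value iteration did not converge")


if __name__ == "__main__":
    print(is_conservative(Q))
    print(solve_average_reward(Q, R))
```

In this toy model the stationary policy that picks action 1 in state 0 and action 0 in state 1 attains a long-run average reward of about 3 per unit time, which can be checked by hand from the stationary distribution (1/4, 3/4) of the induced two-state chain.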

zbMATH Keywords

continuous time

unbounded reward

countable state sets

compact metric action sets

average expected reward
