Continuous time Markov decision programming with average reward criterion and unbounded reward rate (Q1179405)
From MaRDI portal
Full work available at URL: https://doi.org/10.1007/bf02080199
OpenAlex ID: W1990651907
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Continuous time Markov decision programming with average reward criterion and unbounded reward rate | scientific article | |
Statements
Continuous time Markov decision programming with average reward criterion and unbounded reward rate (English)
26 June 1992
Markov decision problems in continuous time with unbounded reward rates are studied for countable state sets and compact metric action sets. The transition law is described by a controlled conservative transition rate matrix. The objective is to maximize the average expected reward over the class of (time-dependent) deterministic Markov strategies under which the resulting transition probabilities are continuous in time. Additional assumptions are given that yield the existence of stationary optimal policies. The essential arguments are based on an embedded finite-state Markov decision chain with bounded rewards.
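The review describes the model only verbally; for orientation, a standard formalization of the average expected reward criterion it refers to reads as follows (the paper's exact notation may differ):

```latex
% Average expected reward of a strategy \pi starting from state i,
% with reward rate r(i,a) and a conservative transition rate matrix,
% i.e. q_{ij}(a) \ge 0 for j \ne i and \sum_{j} q_{ij}(a) = 0:
\[
  J(\pi, i) \;=\; \liminf_{T \to \infty} \frac{1}{T}\,
  \mathbb{E}^{\pi}_{i}\!\left[\int_{0}^{T} r(x_{t}, a_{t})\, dt\right].
\]
```

The embedding argument mentioned at the end of the review can be illustrated by uniformization, which turns a continuous-time chain with bounded rates into a discrete-time one. The sketch below is a hypothetical toy instance only (two states, two actions, bounded rates and rewards, all numbers invented), solved by relative value iteration; the paper itself treats countable state sets and unbounded reward rates, which require the extra assumptions the review alludes to.

```python
import numpy as np

# Toy data (assumed for illustration): q[a, i, j] is a conservative
# transition rate matrix for each action a (rows sum to zero), and
# r[a, i] is the reward rate earned while the process is in state i.
q = np.array([
    [[-1.0, 1.0], [2.0, -2.0]],   # rates under action 0
    [[-3.0, 3.0], [0.5, -0.5]],   # rates under action 1
])
r = np.array([
    [1.0, 4.0],                   # reward rates under action 0
    [3.0, 2.0],                   # reward rates under action 1
])

# Uniformization: pick Lam > max_{i,a} |q_ii(a)| and embed the CTMDP
# into a discrete-time MDP with kernels P(a) = I + q(a)/Lam and
# per-step rewards r/Lam; taking Lam strictly above the largest exit
# rate keeps the diagonals positive, so the embedded chain is aperiodic.
Lam = 1.0 + np.max(-np.diagonal(q, axis1=1, axis2=2))
P = np.eye(2)[None, :, :] + q / Lam

# Relative value iteration for the average reward of the embedded chain:
# h <- max_a [ r(., a)/Lam + P(a) h ], re-centered at state 0.
h = np.zeros(2)
for _ in range(2000):
    Q_vals = r / Lam + P @ h      # shape (actions, states)
    h_new = Q_vals.max(axis=0)
    g = h_new[0]                  # per-step gain estimate
    h = h_new - g                 # re-center to keep h bounded

policy = Q_vals.argmax(axis=0)    # stationary deterministic policy
print("average reward per unit time:", Lam * g)   # approx. 3.6
print("optimal stationary policy:", policy)       # [1 0]
```

The factor Lam converts the per-step gain of the embedded discrete-time chain back into a reward rate per unit of continuous time, since each embedded step lasts 1/Lam on average.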
Keywords: continuous time; unbounded reward; countable state sets; compact metric action sets; average expected reward