Necessary and sufficient conditions for a bounded solution to the optimality equation in average reward Markov decision chains (Q1103532)

From MaRDI portal
 
Cites

    Q4606219
    Two competing queues with linear costs and geometric service requirements: the <i>μc</i>-rule is often optimal
    Necessary conditions for the optimality equation in average-reward Markov decision processes
    Existence of optimal stationary policies in average reward Markov decision processes with a recurrent state
    A note on simultaneous recurrence conditions on a set of denumerable stochastic matrices
    Q5599448
    Q4771778
    Q4131338
    Q5615108
    A new condition for the existence of optimal stationary policies in average cost Markov decision processes


Language: English
Label: Necessary and sufficient conditions for a bounded solution to the optimality equation in average reward Markov decision chains
Description: scientific article

    Statements

    Necessary and sufficient conditions for a bounded solution to the optimality equation in average reward Markov decision chains (English)
    1988
    Consider a discrete time Markov decision process with countable state space S. In addition to the standard assumptions of compact action sets and continuous transition probabilities, suppose that the Markov chain determined by each stationary policy f has a single positive recurrent class R(f), which is entered with probability one and which contains at least one member of a fixed, finite subset G of S. The main theorem gives, under these assumptions, five necessary and sufficient conditions (including a simultaneous Doeblin condition with set G) for the average reward optimality equation to have a bounded measurable solution for an arbitrary bounded measurable reward function. The establishment of necessity is an uncommon feature; sufficient conditions are discussed in \textit{L. C. Thomas} [``Connectedness conditions for denumerable state Markov decision processes'', in: Recent developments in Markov decision processes, R. Hartley, L. C. Thomas, D. J. White (eds.), Academic Press (1980; Zbl 0547.90064)].
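    For context, the average reward optimality equation referred to in the review can be displayed in its standard form; the notation below is the customary one for such models and is an assumption on our part, not taken from the paper under review:

    ```latex
    % Average reward optimality equation (standard form):
    %   g    : the optimal average reward (a constant)
    %   h    : a bounded measurable relative value function on S
    %   A(s) : the compact action set at state s
    %   r    : a bounded measurable reward function
    \[
      g + h(s) \;=\; \sup_{a \in A(s)} \Big[\, r(s,a) + \sum_{s' \in S} p(s' \mid s, a)\, h(s') \,\Big],
      \qquad s \in S .
    \]
    ```

    A bounded solution \((g, h)\) certifies \(g\) as the optimal average reward, and a stationary policy choosing a maximizing action in the bracket at each state is then average optimal.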
    optimal stationary policies
    discrete time Markov decision process
    countable state space
    simultaneous Doeblin condition
    average reward optimality equation
    bounded measurable reward function

    Identifiers