Average cost Markov decision processes: Optimality conditions (Q1176301): Difference between revisions

From MaRDI portal
Property / author: Onésimo Hernández-Lerma (Normal rank)
Property / author: Jean Claude Hennet (Normal rank)
Property / author: Jean-Bernard Lasserre (Normal rank)
Property / reviewed by: Yoshio Ohtsubo (Normal rank)

Revision as of 09:05, 10 February 2024

Language: English
Label: Average cost Markov decision processes: Optimality conditions
Description: scientific article

    Statements

    Average cost Markov decision processes: Optimality conditions (English)
    25 June 1992
    The authors consider discrete-time Markov decision processes under the long-run expected average cost criterion. The state space \(X\) and the action set \(A\) are Borel spaces (i.e. Borel subsets of complete separable metric spaces), and for each state \(x\in X\) the set \(A(x)\subset A\) of actions admissible in state \(x\) is nonempty, measurable and compact. The transition law \(q(\cdot\mid\cdot,\cdot)\) is a stochastic kernel on \(X\) given \(X\times A\) such that, for each \(x\in X\) and every bounded measurable function \(v\) on \(X\), \(\int_X v(y)\,q(dy\mid x,a)\) is lower semi-continuous in \(a\in A(x)\). The one-stage cost function \(c\) is bounded and measurable on \(X\times A\), and lower semi-continuous in \(a\in A(x)\) for each \(x\in X\). The authors give ergodicity conditions on the transition law \(q\) under which a duality theorem holds: the existence of an optimal solution to the primal problem (equivalently, of a solution to the optimality equation for the Markov decision model with the average cost criterion) yields an optimal solution to the dual problem or the deterministic version, and conversely; moreover, the corresponding optimal values of the problems are equal. This result extends those of \textit{K. Yamada} [J. Math. Anal. Appl. 50, 579-595 (1975; Zbl 0323.90053)] and \textit{J. A. Filar} and \textit{T. A. Schultz} [Oper. Res. Lett. 7, 303-307 (1988; Zbl 0659.90095)] to the model with general Borel spaces. In addition, using the concept of opportunity cost introduced by \textit{J. Flynn} [J. Math. Anal. Appl. 76, 202-208 (1980; Zbl 0438.90100); ibid. 144, 586-594 (1989; Zbl 0679.90084)], they show that a stationary policy determined from the optimality equation is strong average optimal.
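    As a rough illustration of what "a stationary policy determined from the optimality equation" means, the sketch below solves the average cost optimality equation in a finite toy model. Everything here is an assumption made for illustration: the review concerns general Borel spaces, whereas the sketch uses finite state and action sets; the equation is written in its standard form j + h(x) = min_a [c(x,a) + sum_y q(y|x,a) h(y)]; the solver is relative value iteration, a standard textbook scheme that the review does not attribute to the authors; and the function name, the symbols j and h, and the numbers in the example are hypothetical.

```python
import numpy as np


def relative_value_iteration(c, q, tol=1e-10, max_iter=10_000):
    """Approximately solve j + h(x) = min_a [c(x,a) + sum_y q(y|x,a) h(y)].

    c[x, a]    : one-stage cost (finite model, an assumption of this sketch)
    q[x, a, y] : transition probabilities, each q[x, a, :] sums to 1
    Returns (j, h, policy) with the normalization h[0] = 0.
    """
    n_states, _ = c.shape
    h = np.zeros(n_states)
    j = 0.0
    for _ in range(max_iter):
        # values[x, a] = c(x, a) + sum_y q(y | x, a) h(y)
        values = c + q @ h
        t_h = values.min(axis=1)
        j = t_h[0]            # state 0 serves as the reference state
        h_new = t_h - j       # keeps h[0] = 0 at every iteration
        converged = np.max(np.abs(h_new - h)) < tol
        h = h_new
        if converged:
            break
    # Stationary policy: a minimizing action in each state.
    policy = (c + q @ h).argmin(axis=1)
    return j, h, policy


# Hypothetical 2-state, 2-action model. All transition probabilities are
# positive, a simple ergodicity condition under which the iteration converges.
c = np.array([[1.0, 2.0],
              [3.0, 0.5]])
q = np.array([[[0.5, 0.5], [0.9, 0.1]],
              [[0.5, 0.5], [0.2, 0.8]]])

j, h, policy = relative_value_iteration(c, q)
# Residual of the optimality equation at the computed pair (j, h).
residual = np.max(np.abs(j + h - (c + q @ h).min(axis=1)))
print("average cost:", j, "policy:", policy, "residual:", residual)
```

    In this finite setting the returned policy picks, in each state, an action attaining the minimum on the right-hand side of the equation; the printed residual indicates how closely the computed pair satisfies it.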
    duality theorem
    strong average optimality
    long run expected average cost criterion
    ergodicity conditions
    opportunity cost

    Identifiers