Average cost Markov decision processes: Optimality conditions (Q1176301)

From MaRDI portal
scientific article

    Statements

    Average cost Markov decision processes: Optimality conditions (English)
    25 June 1992
    The authors consider discrete-time Markov decision processes under the long-run expected average cost criterion. Both the state space \(X\) and the action set \(A\) are Borel sets (i.e. Borel subsets of complete separable metric spaces), and for each state \(x\in X\) the set \(A(x)\) of actions admissible in state \(x\) is a nonempty, measurable, compact subset of \(A\). The transition law \(q(\cdot\mid\cdot,\cdot)\) is a stochastic kernel on \(X\) given \(X\times A\) such that \(\int_X v(y)\,q(dy\mid x,a)\) is lower semi-continuous in \(a\in A(x)\) for each \(x\in X\) and every bounded measurable function \(v\) on \(X\). The one-stage cost function \(c\) is bounded and measurable on \(X\times A\) and lower semi-continuous in \(a\in A(x)\) for each \(x\in X\). The authors give ergodicity conditions on the transition law \(q\) under which a duality theorem holds: the existence of an optimal solution to the primal problem, which is equivalently a solution to the optimality equation for the Markov decision model with the average cost criterion, yields an optimal solution to the dual problem or the deterministic version, and conversely; furthermore, the optimal values of the two problems are equal. This result extends those of \textit{K. Yamada} [J. Math. Anal. Appl. 50, 579-595 (1975; Zbl 0323.90053)] and \textit{J. A. Filar} and \textit{T. A. Schultz} [Oper. Res. Lett. 7, 303-307 (1988; Zbl 0659.90095)] to the model with general Borel spaces. In addition, using the concept of opportunity cost introduced by \textit{J. Flynn} [J. Math. Anal. Appl. 76, 202-208 (1980; Zbl 0438.90100); ibid. 144, 586-594 (1989; Zbl 0679.90084)], they show that a stationary policy determined from the optimality equation is strong average optimal.
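    For orientation, the optimality equation referred to in the review is usually written as \(\rho + h(x) = \min_{a\in A(x)}\bigl[c(x,a) + \int_X h(y)\,q(dy\mid x,a)\bigr]\) for all \(x\in X\), where the constant \(\rho\) (optimal average cost) and the function \(h\) (relative value function) are notation assumed here rather than taken from the review. The sketch below illustrates this equation on a toy model with two states and two actions, which is a drastic simplification of the Borel setting. It uses relative value iteration, a standard numerical scheme and not the authors' method; the transition matrices and costs are made up for illustration.

```python
import numpy as np


def relative_value_iteration(P, c, ref_state=0, tol=1e-10, max_iter=10_000):
    """Approximately solve rho + h(x) = min_a [c(x,a) + sum_y P(y|x,a) h(y)].

    P[a, x, y] is the transition probability from x to y under action a,
    c[x, a] is the one-stage cost. Returns (rho, h, policy), with h
    normalised so that h[ref_state] == 0.
    """
    n_states = P.shape[1]
    h = np.zeros(n_states)
    for _ in range(max_iter):
        # Q[x, a] = c(x, a) + sum_y P(y | x, a) * h(y)
        Q = c + np.einsum("axy,y->xa", P, h)
        Th = Q.min(axis=1)
        h_new = Th - Th[ref_state]  # subtract the value at the reference state
        converged = np.max(np.abs(h_new - h)) < tol
        h = h_new
        if converged:
            break
    # Read off the gain and a minimising stationary policy from the final h.
    Q = c + np.einsum("axy,y->xa", P, h)
    rho = Q.min(axis=1)[ref_state] - h[ref_state]
    policy = Q.argmin(axis=1)
    return rho, h, policy


# Hypothetical data: all transition probabilities are positive, which is a
# simple (strong) ergodicity condition ensuring convergence of the iteration.
P = np.array([
    [[0.9, 0.1], [0.5, 0.5]],  # action 0
    [[0.2, 0.8], [0.1, 0.9]],  # action 1
])
c = np.array([
    [1.0, 2.0],  # costs in state 0 for actions 0 and 1
    [3.0, 0.5],  # costs in state 1 for actions 0 and 1
])

rho, h, policy = relative_value_iteration(P, c)
print("average cost:", rho)
print("relative values:", h)
print("stationary policy:", policy)
```

    In this finite setting, the stationary policy that attains the minimum in the equation has long-run average cost equal to \(\rho\), which mirrors the role the optimality equation plays in the review.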
    duality theorem
    strong average optimality
    long run expected average cost criterion
    ergodicity conditions
    opportunity cost

    Identifiers