Average cost Markov decision processes: Optimality conditions (Q1176301)

From MaRDI portal
MaRDI profile type: MaRDI publication profile
Cites work:
    Communicating MDPs: Equivalence and LP properties
    On optimality criteria for dynamic programs with long finite horizons
    Optimal steady states, excessive functions, and deterministic dynamic programs
    Q4172731
    Adaptive Markov control processes
    A forecast horizon and a stopping rule for general Markov decision processes
    Recurrence conditions for Markov decision processes with Borel state space: A survey
    Q5599448
    The Existence of a Minimum Pair of State and Policy for Markov Decision Processes under the Hypothesis of Doeblin
    Q3966881
    Duality theorem in Markovian decision problems


scientific article

Language: English
Label: Average cost Markov decision processes: Optimality conditions
Description: scientific article

    Statements

    Average cost Markov decision processes: Optimality conditions (English)
    25 June 1992
    The authors consider the following discrete-time Markov decision process with the long-run expected average cost criterion: both the state space \(X\) and the action set \(A\) are Borel sets (i.e. Borel subsets of complete separable metric spaces) and, for each state \(x\in X\), the set \(A(x)\subset A\) of admissible actions in state \(x\) is a nonempty, compact, measurable subset of \(A\). The transition law \(q(\cdot\mid\cdot,\cdot)\) is a stochastic kernel on \(X\) given \(X\times A\) such that \(\int_X v(y)\,q(dy\mid x,a)\) is lower semi-continuous in \(a\in A(x)\) for each \(x\in X\) and every bounded measurable function \(v\) on \(X\). The one-stage cost function \(c\) is bounded and measurable on \(X\times A\) and lower semi-continuous in \(a\in A(x)\) for each \(x\in X\).
    The authors give ergodicity conditions on the transition law \(q\) under which a duality theorem holds: an optimal solution to the primal problem (equivalently, a solution to the optimality equation for the Markov decision model with the average cost criterion) yields an optimal solution to the dual problem (the deterministic version), and conversely; moreover, the optimal values of the two problems coincide. This result extends those of \textit{K. Yamada} [J. Math. Anal. Appl. 50, 579-595 (1975; Zbl 0323.90053)] and \textit{J. A. Filar} and \textit{T. A. Schultz} [Oper. Res. Lett. 7, 303-307 (1988; Zbl 0659.90095)] to models with general Borel state and action spaces. Also, using the concept of opportunity cost introduced by \textit{J. Flynn} [J. Math. Anal. Appl. 76, 202-208 (1980; Zbl 0438.90100); ibid. 144, 586-594 (1989; Zbl 0679.90084)], they show that a stationary policy determined from the optimality equation is strong average optimal.
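    For orientation, the criterion and the optimality equation mentioned in the review can be written in standard notation; the following is a sketch of the usual formulation (the symbols \(J\), \(J_n\), \(\rho\), \(h\) and \(E_x^{\pi}\) are not taken from the paper itself), and the precise hypotheses are those stated in the article. For a policy \(\pi\) and initial state \(x\in X\), the long-run expected average cost is
    \[
    J(\pi,x)=\limsup_{n\to\infty}\frac{1}{n}\,E_x^{\pi}\Big[\sum_{t=0}^{n-1}c(x_t,a_t)\Big],
    \]
    and the average cost optimality equation asks for a constant \(\rho\) and a bounded measurable function \(h\) on \(X\) with
    \[
    \rho+h(x)=\min_{a\in A(x)}\Big\{c(x,a)+\int_X h(y)\,q(dy\mid x,a)\Big\},\qquad x\in X.
    \]
    A stationary policy that selects a minimizing action on the right-hand side for every \(x\) is the "policy determined from the optimality equation" referred to above. In one common formulation of Flynn's notion, such a policy \(\pi^*\) is strong average optimal if \(\frac{1}{n}\big[J_n(\pi^*,x)-\inf_{\pi}J_n(\pi,x)\big]\to 0\) for every \(x\), where \(J_n\) denotes the expected \(n\)-stage cost.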
    Keywords:
    duality theorem
    strong average optimality
    long run expected average cost criterion
    ergodicity conditions
    opportunity cost

    Identifiers