Average cost Markov decision processes: Optimality conditions (Q1176301)

From MaRDI portal
MaRDI profile type: MaRDI publication profile
Cites work:
    Communicating MDPs: Equivalence and LP properties
    On optimality criteria for dynamic programs with long finite horizons
    Optimal steady states, excessive functions, and deterministic dynamic programs
    Q4172731
    Adaptive Markov control processes
    A forecast horizon and a stopping rule for general Markov decision processes
    Recurrence conditions for Markov decision processes with Borel state space: A survey
    Q5599448
    The Existence of a Minimum Pair of State and Policy for Markov Decision Processes under the Hypothesis of Doeblin
    Q3966881
    Duality theorem in Markovian decision problems


scientific article

Language: English
Label: Average cost Markov decision processes: Optimality conditions
Description: scientific article

    Statements

    Average cost Markov decision processes: Optimality conditions (English)
    25 June 1992
    The authors consider the following discrete-time Markov decision process with the long-run expected average cost criterion: both the state space \(X\) and the action set \(A\) are Borel sets (i.e. Borel subsets of complete separable metric spaces) and, for each state \(x\in X\), the set \(A(x)\subset A\) of admissible actions in state \(x\) is a nonempty, compact, measurable subset of \(A\). The transition law \(q(\cdot\mid\cdot,\cdot)\) is a stochastic kernel on \(X\) given \(X\times A\) such that \(\int_X v(y)\,q(dy\mid x,a)\) is lower semi-continuous in \(a\in A(x)\) for each \(x\in X\) and every bounded measurable function \(v\) on \(X\). The one-stage cost function \(c\) is bounded and measurable on \(X\times A\) and lower semi-continuous in \(a\in A(x)\) for each \(x\in X\).
    The authors give ergodicity conditions on the transition law \(q\) under which a duality theorem holds: an optimal solution to the primal problem (equivalently, a solution to the optimality equation for the Markov decision model with the average cost criterion) yields an optimal solution to the dual problem (the deterministic version), and conversely; moreover, the optimal values of the two problems coincide. This result extends those of \textit{K. Yamada} [J. Math. Anal. Appl. 50, 579-595 (1975; Zbl 0323.90053)] and \textit{J. A. Filar} and \textit{T. A. Schultz} [Oper. Res. Lett. 7, 303-307 (1988; Zbl 0659.90095)] to models with general Borel state and action spaces. Also, using the concept of opportunity cost introduced by \textit{J. Flynn} [J. Math. Anal. Appl. 76, 202-208 (1980; Zbl 0438.90100); ibid. 144, 586-594 (1989; Zbl 0679.90084)], they show that a stationary policy determined from the optimality equation is strong average optimal.
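    For orientation, the criterion and the optimality equation mentioned in the review can be written in standard notation; the following is a sketch of the usual formulation (the symbols \(J\), \(J_n\), \(\rho\), \(h\) and \(E_x^{\pi}\) are not taken from the paper itself), and the precise hypotheses are those stated in the article. For a policy \(\pi\) and initial state \(x\in X\), the long-run expected average cost is
    \[
    J(\pi,x)=\limsup_{n\to\infty}\frac{1}{n}\,E_x^{\pi}\Big[\sum_{t=0}^{n-1}c(x_t,a_t)\Big],
    \]
    and the average cost optimality equation asks for a constant \(\rho\) and a bounded measurable function \(h\) on \(X\) with
    \[
    \rho+h(x)=\min_{a\in A(x)}\Big\{c(x,a)+\int_X h(y)\,q(dy\mid x,a)\Big\},\qquad x\in X.
    \]
    A stationary policy that selects a minimizing action on the right-hand side for every \(x\) is the "policy determined from the optimality equation" referred to above. In one common formulation of Flynn's notion, such a policy \(\pi^*\) is strong average optimal if \(\frac{1}{n}\big[J_n(\pi^*,x)-\inf_{\pi}J_n(\pi,x)\big]\to 0\) for every \(x\), where \(J_n\) denotes the expected \(n\)-stage cost.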
    Keywords:
    duality theorem
    strong average optimality
    long run expected average cost criterion
    ergodicity conditions
    opportunity cost

    Identifiers