Separable value functions for infinite horizon average reward Markov decision processes (Q908861)

scientific article

Language	Label	Description	Also known as
English	Separable value functions for infinite horizon average reward Markov decision processes	scientific article

Statements

instance of

scholarly article

0 references

title

Separable value functions for infinite horizon average reward Markov decision processes (English)

0 references

published in

Journal of Mathematical Analysis and Applications

0 references

publication date

1989

0 references

review text

Consider the following decision problem: (a) The state and control spaces \(S\), \(C\) are of the form \(S=\times_{j=1}^{p}S_j\), \(C=\times_{j=1}^{p}C_j\), with \(S_j\subset \mathbb{R}^{n_j}\), \(C_j\subset \mathbb{R}^{m_j}\) for some integers \(n_j\), \(m_j\), \(1\leq j\leq p\), and finite \(C_j\), \(S_j\); (b) for any given state \(x\), the action set is \(A(x)=\times_{j=1}^{p}w_j(x_j)\), \(x_j\in S_j\), \(w_j(x_j)\subset C_j\); (c) for each \(i\in \{1,2,\dots,p\}\) there is a set \(I_i\subset \{1,2,\dots,p\}\) such that \(I_i\cap I_j=\emptyset\) if \(i\neq j\), and \(\cup_{i=1}^{p}I_i\subset \{1,2,\dots,p\}\); (d) the reward in the current period is of the form \(r(x,y)=\sum_{j=1}^{p}r_j(x_j,y_j)\), \(x_j\in S_j\), \(y_j\in w_j(x_j)\); (e) the state in the next period (given the state \(x\in S\) and the action \(y\in A(x)\)) is described by a random variable \(D\) (taking values in a finite set, likewise denoted \(D\)) and a function \(g\) of the form \(g(x,y,d)=\times_{j=1}^{p}g_j(x_{i(j)},y_{i(j)},d)\), where \(i(j)=i\) for \(j\in I_i\), \(1\leq i\leq p\), \(d\in D\). The problem is to find a policy that maximizes the long-run expected reward per period. A method for finding a solution is given. The optimality equation is investigated, and its relation to the separated optimality equations is established. An elementary inventory problem within this framework is treated. The paper extends the results of \textit{W. S. Lovejoy} [Oper. Res. 34, 630-637 (1986; Zbl 0632.90088)].

0 references
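
The separable structure described in the review text can be illustrated numerically. The following is a minimal sketch, not the method of the paper: it assumes the simplest case \(I_i=\{i\}\) (each component's next state depends only on its own state, its own action, and the common demand \(d\)), and it uses plain relative value iteration on a toy two-item inventory model. The capacity, demand distribution, prices, and costs are hypothetical values chosen for illustration, and all function names are made up for this example. Under these assumptions the optimal average reward of the product problem should equal the sum of the optimal average rewards of the two components.

```python
import itertools

CAP = 2                              # hypothetical storage capacity per item
DEMAND = {0: 0.3, 1: 0.5, 2: 0.2}    # hypothetical distribution of the common demand d


def make_component(price, cost, hold):
    """Build one inventory component (S_j, w_j, g_j, r_j) with made-up parameters."""
    states = list(range(CAP + 1))                 # stock level x_j
    actions = lambda x: range(CAP - x + 1)        # admissible order quantities w_j(x_j)
    step = lambda x, y, d: max(x + y - d, 0)      # next stock g_j(x_j, y_j, d)

    def reward(x, y):
        # Expected one-period reward: sales revenue minus ordering and holding cost.
        stock = x + y
        sales = sum(p * min(stock, d) for d, p in DEMAND.items())
        return price * sales - cost * y - hold * stock

    return states, actions, step, reward


def product(c1, c2):
    """Combine two components into the product problem with additive reward."""
    s1, a1, g1, r1 = c1
    s2, a2, g2, r2 = c2
    states = list(itertools.product(s1, s2))
    actions = lambda x: itertools.product(a1(x[0]), a2(x[1]))
    step = lambda x, y, d: (g1(x[0], y[0], d), g2(x[1], y[1], d))  # same d for both
    reward = lambda x, y: r1(x[0], y[0]) + r2(x[1], y[1])
    return states, actions, step, reward


def relative_value_iteration(states, actions, step, reward, iters=300):
    """Approximate the gain and a relative value function of an average-reward MDP."""
    v = {x: 0.0 for x in states}
    ref, gain = states[0], 0.0
    for _ in range(iters):
        new = {
            x: max(
                reward(x, y) + sum(p * v[step(x, y, d)] for d, p in DEMAND.items())
                for y in actions(x)
            )
            for x in states
        }
        gain = new[ref]                           # v[ref] is 0 after normalization
        v = {x: new[x] - gain for x in states}    # keep values bounded
    return gain, v


comp1 = make_component(price=5.0, cost=2.0, hold=0.5)
comp2 = make_component(price=3.0, cost=1.0, hold=0.2)
gain1, v1 = relative_value_iteration(*comp1)
gain2, v2 = relative_value_iteration(*comp2)
gain12, v12 = relative_value_iteration(*product(comp1, comp2))
print("sum of component gains:", gain1 + gain2)
print("gain of product problem:", gain12)
```

In this toy setting the iteration converges because the empty-stock state is reached with positive probability from every state under every action. The component problems have 3 states each, while the product problem has 9, which hints at why solving separated optimality equations is attractive when \(p\) is large.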

zbMATH Keywords

Markov decision problem

0 references

infinite horizon

0 references

average reward

0 references

expected reward

0 references

optimality equation

0 references