Functional equations in the theory of dynamic programming. XI: Limit theorems (Q773497)

Let \(p\in P\) be the state vector of a discrete process and \(q\in Q\) a decision variable that transforms \(p\) into \(T(p,q)\in P\). Each transformation yields a ``return'' \(b(p, q)\ge 0\). Given \(p_1\), one must choose \(q_1, \ldots, q_N\) so as to maximize \(R_N = \sum_{i=1}^N b(p_i, q_i)\), where \(p_{i+1} = T(p_i, q_i)\). The author proves that if, for any \(p_a, p_b\in P\), there exists a \(q\in Q\) such that \(T(p_a, q) = p_b\), then for all \(p_1\in P\) we have \(\max_q R_N \sim Na\) as \(N\to\infty\), where \(a\) is independent of \(p_1\). It is mentioned that the existence of an asymptotic policy (i.e. choices of \(q\)) has not been proved. (For part X, written together with \textit{S. Lehman}, see [Duke Math. J. 27, 55--69 (1960; Zbl 0096.14502)].)
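
The limit statement can be illustrated numerically on a small finite instance. The sketch below is not taken from the paper. It uses the standard dynamic programming recursion \(f_N(p) = \max_q [b(p,q) + f_{N-1}(T(p,q))]\) with \(f_0 = 0\), and every concrete choice in it is an assumption made for illustration: the three-element sets `STATES` and `DECISIONS`, the table `RETURNS`, and the transformation \(T(p,q) = q\), which trivially satisfies the reachability hypothesis.

```python
from typing import Callable, Dict, Sequence


def max_total_return(
    states: Sequence[int],
    decisions: Sequence[int],
    T: Callable[[int, int], int],
    b: Callable[[int, int], float],
    N: int,
) -> Dict[int, float]:
    """Return {p: f_N(p)}, the maximal N-stage return starting from p.

    Uses f_0(p) = 0 and f_n(p) = max_q [b(p, q) + f_{n-1}(T(p, q))].
    """
    f = {p: 0.0 for p in states}
    for _ in range(N):
        # One backward-induction step: best immediate return plus best continuation.
        f = {p: max(b(p, q) + f[T(p, q)] for q in decisions) for p in states}
    return f


# Hypothetical toy instance (not from the paper).
STATES = [0, 1, 2]
DECISIONS = [0, 1, 2]
RETURNS = [
    [1, 4, 0],  # b(0, q) for q = 0, 1, 2
    [2, 0, 3],  # b(1, q)
    [5, 1, 0],  # b(2, q)
]


def T(p: int, q: int) -> int:
    # Assumption: the decision names the next state, so any state is reachable in one step.
    return q


def b(p: int, q: int) -> float:
    return float(RETURNS[p][q])


if __name__ == "__main__":
    for N in (1, 10, 100, 1000):
        f = max_total_return(STATES, DECISIONS, T, b, N)
        print(N, {p: f[p] / N for p in STATES})
```

In this toy instance the ratios \(f_N(p)/N\) settle near the same value for every starting state, namely 4, the mean return of the cycle \(0\to 1\to 2\to 0\). This is consistent with \(a\) being independent of \(p_1\), but it is only a check on one example, not a proof.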

Mathematics Subject Classification ID

90C39
