Finite-stage reward functions having the Markov adequacy property (Q1802324)

The setting is a countable state space \(X\) and a bounded reward function \(g\) defined on \(X^n\), where \(n\) is a specified positive integer. A strategy \(\sigma\) first assigns a probability measure on \(X\), which generates \(X_1=x_1\). Then, inductively, given \(X_k=x_k\), \(1\leq k\leq m-1\), it assigns a probability measure on \(X\), generating \(X_m=x_m\), with \(1\leq m\leq n\). A randomized Markov strategy \(\widehat\sigma\), given \(\sigma\), assigns the relevant probability measures as specified mixtures of probability measures based upon \(\widehat\sigma\) which depend, at stage \(k\), \(1\leq k\leq n-1\), only on the value of \(x_k\). The reward function \(g\) is said to have the Markov adequacy property if every strategy \(\sigma\) has a corresponding randomized Markov strategy \(\widehat\sigma\) such that the expected value of \(g\) is the same under \(\widehat\sigma\) and \(\sigma\). Two key properties of \(g\) are defined, namely linear sections and permutation invariance. The main theorems, 2.6 and 2.7, show that the linear sections property implies Markov adequacy and, if \(|X|\geq 3\), that permutation invariance and Markov adequacy together imply the linear sections property. Illustrations of the properties are given. The proofs use a linear Markov mixability property, which is not easy to verify directly. Theorems 5.4 and 4.1 establish that linear Markov mixability is necessary and sufficient for Markov adequacy. Theorems 3.2 and 3.3 \((|X|\leq 3)\) give the equivalence of linear Markov mixability and the linear sections property, the latter being easier to verify.
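The definition of Markov adequacy can be made concrete with a small numerical sketch. Everything in it is an illustrative assumption rather than material from the paper under review: the two-point state space, the choice \(n=3\), the hypothetical strategy `sigma`, and the rewards `g_pairs` and `g_ends`. The Markov strategy built by `markov_match` simply reuses the conditional law of \(X_{k+1}\) given \(X_k\) under `sigma`; it is one natural candidate, not the mixture construction of the paper.

```python
from itertools import product

X = (0, 1)  # toy state space (assumption; the review allows any countable X)
N = 3       # number of stages n (assumption)

def sigma(history):
    """Hypothetical history-dependent strategy: law of the next state."""
    if len(history) == 0:
        return {0: 0.5, 1: 0.5}
    if len(history) == 1:
        return {0: 0.7, 1: 0.3} if history[0] == 0 else {0: 0.2, 1: 0.8}
    # Stage 3 looks at x_1 as well as x_2, so sigma is not Markov.
    return {0: 0.9, 1: 0.1} if history[0] == history[1] else {0: 0.4, 1: 0.6}

def path_law(strategy):
    """Probability of every path (x_1, ..., x_N) under a strategy."""
    law = {}
    for path in product(X, repeat=N):
        p = 1.0
        for k in range(N):
            p *= strategy(path[:k])[path[k]]
        law[path] = p
    return law

def markov_match(law):
    """Markov strategy that copies the law of X_{k+1} given X_k alone."""
    def strategy(history):
        k = len(history)
        weights = {
            y: sum(p for path, p in law.items()
                   if path[k] == y and (k == 0 or path[k - 1] == history[-1]))
            for y in X
        }
        total = sum(weights.values())
        if total == 0:  # state unreachable under the original strategy
            return {y: 1.0 / len(X) for y in X}
        return {y: w / total for y, w in weights.items()}
    return strategy

def expected(g, law):
    """Expected reward of g under a path law."""
    return sum(p * g(*path) for path, p in law.items())

def g_pairs(x1, x2, x3):
    # Sum of terms that each involve only two consecutive states.
    return 2.0 * (x1 == x2) + 3.0 * x2 * x3

def g_ends(x1, x2, x3):
    # Couples the first and last states directly.
    return 1.0 if x1 == x3 else 0.0

law_sigma = path_law(sigma)
law_markov = path_law(markov_match(law_sigma))

for g in (g_pairs, g_ends):
    print(g.__name__, expected(g, law_sigma), expected(g, law_markov))
```

For `g_pairs` the two expectations agree, because the matched Markov strategy preserves the joint law of each consecutive pair \((X_k, X_{k+1})\). For `g_ends` they differ (roughly 0.475 against 0.506). This shows only that this particular Markov strategy fails to reproduce the expectation; it does not by itself decide whether `g_ends` has the Markov adequacy property, since the definition allows any randomized Markov strategy.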

zbMATH Keywords

countable state space

bounded reward

randomized Markov strategy

reviewed by

Douglas J. White
