Policy iteration for average cost Markov control processes on Borel spaces (Q1357514)

Howard's policy iteration algorithm (PIA) is studied for the average cost problem of discrete-time Markov control processes (MCPs) with Borel state and action spaces and possibly unbounded costs. Two classes of MCPs on Borel spaces are presented for which the PIA converges: (i) those with unbounded costs of restricted growth, compact control constraint sets, and strong ergodicity; and (ii) those with strictly unbounded costs and non-compact control constraint sets. Conditions are given under which the PIA converges to a solution of the average cost optimality equation, thus yielding the optimal cost and an optimal stationary control policy. An example illustrates the result.
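For orientation, the following is a minimal sketch of policy iteration for the average cost criterion in the much simpler finite setting. It is not the construction from the reviewed paper, which works on Borel spaces with unbounded costs. The sketch assumes finitely many states and actions and that every stationary policy induces a chain with a single recurrent class, so that the evaluation step has a unique solution once h(0) = 0 is imposed. The function name, the arrays `P` and `c`, and the two-state example data are all hypothetical. At termination the pair (g, h) satisfies the average cost optimality equation g + h(x) = min_a [c(x, a) + sum_y P(y | x, a) h(y)].

```python
import numpy as np


def policy_iteration_average_cost(P, c, max_iter=100, tol=1e-10):
    """Howard-style policy iteration for a finite average cost problem.

    P has shape (m, n, n): P[a, x, y] is the probability of moving x -> y under action a.
    c has shape (n, m): c[x, a] is the one-step cost of action a in state x.
    Assumes every stationary policy has a single recurrent class (illustrative assumption).
    """
    n, m = c.shape
    states = np.arange(n)
    policy = np.zeros(n, dtype=int)  # start from the policy that always picks action 0
    for _ in range(max_iter):
        # Policy evaluation: solve g + h(x) - sum_y P_f(x, y) h(y) = c_f(x), with h(0) = 0.
        P_f = P[policy, states, :]
        c_f = c[states, policy]
        A = np.zeros((n + 1, n + 1))
        A[:n, 0] = 1.0                 # coefficient of the average cost g
        A[:n, 1:] = np.eye(n) - P_f    # coefficients of the relative values h
        A[n, 1] = 1.0                  # normalisation h(0) = 0
        b = np.append(c_f, 0.0)
        sol = np.linalg.solve(A, b)
        g, h = sol[0], sol[1:]

        # Policy improvement: switch only where an action is strictly better.
        Q = c + np.einsum("axy,y->xa", P, h)
        better = Q[states, policy] > Q.min(axis=1) + tol
        if not better.any():
            return g, h, policy
        policy = np.where(better, Q.argmin(axis=1), policy)
    return g, h, policy


# Hypothetical two-state, two-action example with strictly positive transition rows.
P = np.array([
    [[0.9, 0.1], [0.5, 0.5]],  # action 0
    [[0.2, 0.8], [0.1, 0.9]],  # action 1
])
c = np.array([
    [1.0, 3.0],  # costs in state 0 for actions 0 and 1
    [2.0, 0.5],  # costs in state 1 for actions 0 and 1
])
g, h, policy = policy_iteration_average_cost(P, c)
```

Keeping the current action unless another one is strictly better is a common safeguard against cycling between equally good policies in the finite case.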


reviewed by

Michael Kohlmann

