An axiomatic approach to Markov decision processes

DOI: 10.1007/S00186-022-00806-9 | arXiv: 1701.02879 | MaRDI QID: Q6281786 | FDO: Q6281786

Publication date: 11 January 2017

Abstract: This paper presents an axiomatic approach to finite Markov decision processes where the discount rate is zero. One of the principal difficulties in the no-discounting case is that, even if attention is restricted to stationary policies, a strong overtaking optimal policy need not exist. We provide preference foundations for two criteria that do admit optimal policies: 0-discount optimality and average overtaking optimality. As a corollary of our results, we obtain conditions on a decision maker's preferences which ensure that an optimal policy exists. These results have implications for disciplines where stochastic dynamic programming problems arise, including automatic control, dynamic games, and economic development.
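To illustrate why a strong overtaking optimal policy may fail to exist while the two weaker criteria still apply, the following is a minimal sketch and is not taken from the paper. It assumes two hypothetical policies A and B whose per-period reward difference is the deterministic stream 1, -2, 2, -2, 2, ..., and it uses the standard textbook readings of the criteria: overtaking compares cumulative rewards, average overtaking compares the running mean of those cumulative gaps, and 0-discount optimality compares discounted sums as the discount factor `beta` tends to 1. All names here (`diff`, `gap`, `avg_gap`, `discounted_gap`) are made up for the example.

```python
from itertools import accumulate

HORIZON = 2000  # truncation length for the illustration (an assumption)

# Hypothetical per-period reward difference r_A(t) - r_B(t): 1, -2, 2, -2, ...
diff = [1] + [(-2 if t % 2 == 1 else 2) for t in range(1, HORIZON)]

# Cumulative gap after each period: 1, -1, 1, -1, ...
# It changes sign forever, so neither policy overtakes the other.
gap = list(accumulate(diff))

# Running mean of the cumulative gaps (the average overtaking comparison).
# It tends to 0, so under this criterion neither policy beats the other.
avg_gap = [s / (n + 1) for n, s in enumerate(accumulate(gap))]


def discounted_gap(beta: float, horizon: int = HORIZON) -> float:
    """Truncated discounted sum of the reward difference for factor beta."""
    return sum(d * beta**t for t, d in enumerate(diff[:horizon]))


# For the infinite stream the discounted gap is (1 - beta) / (1 + beta),
# which tends to 0 as beta -> 1 (the 0-discount comparison).
closed_form = (1 - 0.9) / (1 + 0.9)

print(min(gap), max(gap))
print(avg_gap[-1])
print(discounted_gap(0.9), closed_form)
```

In this toy stream the cumulative gap oscillates between 1 and -1 indefinitely, while both the averaged gap and the vanishing-discount gap go to zero, so the two policies are ranked as equivalent by the weaker criteria.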

Mathematics Subject Classification ID

Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20)
Statistical decision theory (62C99)
Dynamic programming (90C39)

