Reductive MDPs: A Perspective Beyond Temporal Horizons

From MaRDI portal

Publication: Q6399211

arXiv: 2205.07338 · MaRDI QID: Q6399211


Authors: Rui Silva, Joshua Lockhart, Jason Long


Publication date: 15 May 2022

Abstract: Solving general Markov decision processes (MDPs) is a computationally hard problem. Solving finite-horizon MDPs, on the other hand, is highly tractable, with well-known polynomial-time algorithms. What drives this extreme disparity, and do problems exist that lie between these diametrically opposed complexities? In this paper we identify and analyse a sub-class of stochastic shortest path problems (SSPs) for general state-action spaces whose dynamics satisfy a particular drift condition. This construction generalises the traditional, temporal notion of a horizon via decreasing reachability: a property called reductivity. It is shown that optimal policies can be recovered in polynomial time for reductive SSPs -- via an extension of backwards induction -- with an efficient analogue in reductive MDPs. The practical considerations of the proposed approach are discussed, and numerical verification is provided on a canonical optimal liquidation problem.
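The abstract contrasts hard general MDPs with tractable finite-horizon ones solved by backwards induction, the classical algorithm the paper generalises to reductive SSPs. As context, here is a minimal sketch of finite-horizon backwards induction on a hypothetical 2-state, 2-action MDP (the transition matrices, rewards, and horizon below are illustrative assumptions, not taken from the paper):

```python
import numpy as np

# Finite-horizon backwards induction: the classical polynomial-time algorithm
# referenced in the abstract. The MDP below is a hypothetical example.
n_states, n_actions, horizon = 2, 2, 5

# P[a][s, s'] = probability of moving s -> s' under action a; R[s, a] = reward.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),   # action 0
     np.array([[0.5, 0.5], [0.6, 0.4]])]   # action 1
R = np.array([[1.0, 0.0], [0.0, 2.0]])

V = np.zeros(n_states)                      # terminal values V_H = 0
policy = np.zeros((horizon, n_states), dtype=int)

for t in reversed(range(horizon)):          # sweep t = H-1, ..., 0
    # Q[s, a] = immediate reward plus expected value of the next stage.
    Q = np.stack([R[:, a] + P[a] @ V for a in range(n_actions)], axis=1)
    policy[t] = Q.argmax(axis=1)            # greedy action at stage t
    V = Q.max(axis=1)                       # stage-t optimal values

print(V)          # optimal values at t = 0
print(policy[0])  # optimal first-stage actions
```

Each sweep costs O(|S|^2 |A|) and there are H sweeps, which is the polynomial-time tractability the abstract attributes to finite-horizon MDPs; the paper's contribution is extending this induction beyond a temporal horizon, ordering states by decreasing reachability instead of by time step.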













