On the asymptotic optimality of finite approximations to Markov decision processes with Borel spaces

DOI: 10.1287/MOOR.2016.0832; zbMATH Open: 1417.93337; arXiv: 1503.02244; OpenAlex: W2964034900; MaRDI QID: Q4595952; FDO: Q4595952

Authors: Naci Saldi, Serdar Yüksel, Tamás Linder

Publication date: 7 December 2017

Published in: Mathematics of Operations Research

Abstract: Calculating optimal policies is known to be computationally difficult for Markov decision processes (MDPs) with Borel state and action spaces. This paper studies finite-state approximations of discrete-time Markov decision processes with Borel state and action spaces, for both discounted and average cost criteria. The stationary policies thus obtained are shown to approximate the optimal stationary policy with arbitrary precision under quite general conditions for discounted cost and more restrictive conditions for average cost. For compact-state MDPs, we obtain explicit rate-of-convergence bounds quantifying how the approximation improves as the size of the approximating finite state space increases. Using information-theoretic arguments, the order optimality of the obtained convergence rates is established for a large class of problems. We also show that, as a pre-processing step, the action space can also be finitely approximated with a sufficiently large number of points; thereby, well-known algorithms, such as value or policy iteration, Q-learning, etc., can be used to calculate near-optimal policies.

Full work available at URL: https://arxiv.org/abs/1503.02244

zbMATH Keywords

quantization; stochastic control; Markov decision processes; finite state approximation

Mathematics Subject Classification ID

Dynamic programming (90C39); Markov and semi-Markov decision processes (90C40); Optimal stochastic control (93E20)

Cited In (22)
