Verification of Markov decision processes using learning algorithms

DOI: 10.1007/978-3-319-11936-6_8; MaRDI QID: Q3457782

Authors

Publication date 17 December 2015

Published in Automated Technology for Verification and Analysis

Full work available at URL https://arxiv.org/abs/1402.2967

Learning and adaptive systems in artificial intelligence (68T05); Probability in computer science (algorithm analysis, random structures, phase transitions, etc.) (68Q87); Specification and verification (program logics, model checking, etc.) (68Q60)

Abstract: We present a general framework for applying machine-learning algorithms to the verification of Markov decision processes (MDPs). The primary goal of these techniques is to improve performance by avoiding an exhaustive exploration of the state space. Our framework focuses on probabilistic reachability, which is a core property for verification, and is illustrated through two distinct instantiations. The first assumes that full knowledge of the MDP is available, and performs a heuristic-driven partial exploration of the model, yielding precise lower and upper bounds on the required probability. The second tackles the case where we may only sample the MDP, and yields probabilistic guarantees, again in terms of both the lower and upper bounds, which provide efficient stopping criteria for the approximation. The latter is the first extension of statistical model-checking to unbounded properties in MDPs. In contrast with other related approaches, we do not restrict our attention to time-bounded (finite-horizon) or discounted properties, nor assume any particular properties of the MDP. We also show how our techniques extend to LTL objectives. We present experimental results showing the performance of our framework on several examples.
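
Illustrative sketch (not the authors' implementation): the idea behind the first instantiation, keeping a lower and an upper bound on the maximal reachability probability and refining them only along explored paths, can be shown on a toy model. Everything named here is a hypothetical choice made for illustration: the function bounded_reachability, the model TOY_MDP, and the successor heuristic (follow the action with the best upper bound, then the successor with the largest weighted gap between bounds). The sketch also assumes an MDP whose only cycles are the absorbing goal and sink states; in general, the upper bounds of such a simple update need not converge without extra handling of end components.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: state -> action -> list of (probability, successor).
Transitions = Dict[str, Dict[str, List[Tuple[float, str]]]]

# Toy model assumed for illustration; "goal" and "sink" are absorbing.
TOY_MDP: Transitions = {
    "s0": {
        "a": [(0.5, "s1"), (0.5, "sink")],
        "b": [(0.4, "goal"), (0.6, "sink")],
    },
    "s1": {"c": [(0.9, "goal"), (0.1, "sink")]},
}


def bounded_reachability(
    mdp: Transitions,
    init: str,
    target: str,
    sink: str,
    eps: float = 1e-6,
    max_episodes: int = 10_000,
    max_steps: int = 1_000,
) -> Tuple[float, float]:
    """Return (lower, upper) bounds on the max probability of reaching target."""
    lower = {target: 1.0}  # unexplored states default to lower bound 0
    upper = {sink: 0.0}    # unexplored states default to upper bound 1

    def lo(s: str) -> float:
        return lower.get(s, 0.0)

    def up(s: str) -> float:
        return upper.get(s, 1.0)

    def q(bound, s: str, a: str) -> float:
        # Expected value of a bound after taking action a in state s.
        return sum(p * bound(t) for p, t in mdp[s][a])

    for _ in range(max_episodes):
        if up(init) - lo(init) < eps:
            break  # bounds are tight enough: stop exploring
        # Explore one path, guided by the current bounds.
        path: List[str] = []
        s = init
        while s not in (target, sink) and len(path) < max_steps:
            path.append(s)
            best = max(mdp[s], key=lambda a: q(up, s, a))
            _, s = max(mdp[s][best], key=lambda pt: pt[0] * (up(pt[1]) - lo(pt[1])))
        # Back up both bounds along the explored path only.
        for v in reversed(path):
            upper[v] = max(q(up, v, a) for a in mdp[v])
            lower[v] = max(q(lo, v, a) for a in mdp[v])
    return lo(init), up(init)


low, high = bounded_reachability(TOY_MDP, "s0", "goal", "sink")
print(f"max reachability from s0 lies in [{low:.4f}, {high:.4f}]")
```

On this toy model the best strategy is action a followed by c, with probability 0.5 * 0.9 = 0.45, and both bounds meet at that value after a single explored path. States that are never visited keep their trivial bounds of 0 and 1, which is what lets this style of algorithm avoid exhaustive exploration.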

Cited in (45)