Certified reinforcement learning with logic guidance

DOI 10.1016/J.ARTINT.2023.103949, MaRDI QID Q6136089

Authors Hosein Hasanbeig, Daniel Kroening, Alessandro Abate

Publication date 28 August 2023

Published in Artificial Intelligence

Full work available at URL https://arxiv.org/abs/1902.00778

automata, control synthesis, formal methods, Markov decision processes, reinforcement learning, temporal logics, policy synthesis

Mathematics Subject Classification ID

Artificial intelligence (68Txx)

Abstract: Reinforcement Learning (RL) is a widely employed machine learning architecture that has been applied to a variety of control problems. However, applications in safety-critical domains require a systematic and formal approach to specifying requirements as tasks or goals. We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs). The given LTL property is translated into a Limit-Deterministic Generalised Büchi Automaton (LDGBA), which is then used to shape a synchronous reward function on-the-fly. Under certain assumptions, the algorithm is guaranteed to synthesise a control policy whose traces satisfy the LTL specification with maximal probability.
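
The idea of running an automaton in step with the MDP and drawing the reward from the automaton alone can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a finite five-cell toy MDP instead of a continuous one, a hand-written three-state automaton for "eventually reach goal and never visit unsafe" instead of an LDGBA obtained by translation, and plain tabular Q-learning. All names (`LABELS`, `automaton_step`, `product_step`, `q_learning`, `greedy_action`) and all parameter values are hypothetical.

```python
import random

# Toy labelled MDP (assumed for illustration): cells 0..4 on a line.
LABELS = {0: "unsafe", 4: "goal"}   # every other cell carries the empty label
ACTIONS = (-1, +1)                  # move left / move right
START = 2

def mdp_step(pos, action):
    """Deterministic move, clipped to the ends of the line."""
    return min(4, max(0, pos + action))

# Hand-written automaton for "eventually goal, and never unsafe".
# q0: goal not yet seen, q1: goal seen (accepting), trap: unsafe seen.
ACCEPTING = {"q1"}

def automaton_step(q, label):
    if q == "trap" or label == "unsafe":
        return "trap"
    if q == "q1" or label == "goal":
        return "q1"
    return "q0"

def product_step(pos, q, action):
    """Advance MDP and automaton together; the reward depends on the automaton only."""
    next_pos = mdp_step(pos, action)
    next_q = automaton_step(q, LABELS.get(next_pos, ""))
    reward = 1.0 if next_q in ACCEPTING else 0.0
    return next_pos, next_q, reward

def q_learning(episodes=2000, horizon=20, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning over (cell, automaton state) pairs."""
    rng = random.Random(seed)
    Q = {}  # (pos, automaton_state, action) -> value, missing entries count as 0
    for _ in range(episodes):
        pos, q = START, "q0"
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)          # explore
            else:
                a = greedy_action(Q, pos, q)     # exploit
            next_pos, next_q, r = product_step(pos, q, a)
            best_next = max(Q.get((next_pos, next_q, b), 0.0) for b in ACTIONS)
            old = Q.get((pos, q, a), 0.0)
            Q[(pos, q, a)] = old + alpha * (r + gamma * best_next - old)
            pos, q = next_pos, next_q
            if q == "trap":
                break  # the specification can no longer be satisfied
    return Q

def greedy_action(Q, pos, q):
    """Action with the highest learned value in the given product state."""
    return max(ACTIONS, key=lambda b: Q.get((pos, q, b), 0.0))
```

Calling `q_learning()` and then `greedy_action(Q, START, "q0")` gives the learned first move from the start cell. The learner never sees a hand-designed reward for cells; it is rewarded only when the automaton is in its accepting state, which is the sense in which the specification shapes the reward in this sketch.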

Cited in (4)
