Certified reinforcement learning with logic guidance
Publication:6136089
DOI: 10.1016/J.ARTINT.2023.103949; arXiv: 1902.00778; OpenAlex: W2914702425; MaRDI QID: Q6136089; FDO: Q6136089
Alessandro Abate, Daniel Kroening, Hosein Hasanbeig
Publication date: 28 August 2023
Published in: Artificial Intelligence
Abstract: Reinforcement Learning (RL) is a widely employed machine learning architecture that has been applied to a variety of control problems. However, applications in safety-critical domains require a systematic and formal approach to specifying requirements as tasks or goals. We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs). The given LTL property is translated into a Limit-Deterministic Generalised Büchi Automaton (LDGBA), which is then used to shape a synchronous reward function on-the-fly. Under certain assumptions, the algorithm is guaranteed to synthesise a control policy whose traces satisfy the LTL specification with maximal probability.
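Illustration (not part of the publication record): the automaton-guided reward idea described in the abstract can be sketched on a toy problem. The following is a minimal sketch under strong simplifying assumptions, not the authors' algorithm: the MDP is a hypothetical five-cell corridor with a finite state space, the property is the simple reachability formula "eventually goal", its two-state automaton is written by hand instead of being produced by an LTL-to-LDGBA translator, and plain tabular Q-learning replaces the continuous-state method of the paper. All names (`train`, `greedy`, `automaton_step`, `SLIP`, and so on) are made up for this example.

```python
import random

# Toy MDP (assumption for illustration): a 5-cell corridor; cell 4 is labelled "goal".
N_CELLS, START, ACTIONS, SLIP = 5, 2, (-1, +1), 0.1


def label(cell):
    """Atomic proposition that holds in a cell."""
    return "goal" if cell == N_CELLS - 1 else "none"


def automaton_step(q, lab):
    """Hand-written automaton for 'eventually goal': q0 --goal--> q1, q1 loops."""
    return 1 if (q == 1 or lab == "goal") else 0


ACCEPTING = {1}  # accepting automaton states


def env_step(cell, action, rng):
    """Move left/right; with probability SLIP the move is reversed. Walls clamp."""
    if rng.random() < SLIP:
        action = -action
    return min(max(cell + action, 0), N_CELLS - 1)


def train(episodes=2000, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning over product states (cell, automaton state)."""
    rng = random.Random(seed)
    Q = {}  # maps ((cell, q), action) -> estimated value
    for _ in range(episodes):
        cell, q = START, 0
        for _ in range(50):  # step limit per episode
            state = (cell, q)
            if rng.random() < eps:  # epsilon-greedy exploration
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda b: Q.get((state, b), 0.0))
            nxt_cell = env_step(cell, a, rng)
            # The automaton is advanced in lockstep with the MDP, on the fly.
            nxt_q = automaton_step(q, label(nxt_cell))
            done = nxt_q in ACCEPTING
            # The reward comes only from the automaton, not from the environment.
            reward = 1.0 if done else 0.0
            best_next = 0.0 if done else max(
                Q.get(((nxt_cell, nxt_q), b), 0.0) for b in ACTIONS
            )
            old = Q.get((state, a), 0.0)
            Q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            if done:  # simplification: stop once the reachability goal is met
                break
            cell, q = nxt_cell, nxt_q
    return Q


def greedy(Q, cell, q=0):
    """Greedy action in product state (cell, q)."""
    return max(ACTIONS, key=lambda b: Q.get(((cell, q), b), 0.0))


if __name__ == "__main__":
    Q = train()
    print({cell: greedy(Q, cell) for cell in range(N_CELLS - 1)})
```

The learner never sees a hand-designed environment reward; the only signal is whether the automaton, stepped alongside the MDP, has entered an accepting state. Ending the episode on acceptance is a shortcut that suits reachability only; general Büchi conditions require accepting states to be visited infinitely often.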
Full work available at URL: https://arxiv.org/abs/1902.00778
Keywords: automata; control synthesis; formal methods; Markov decision processes; reinforcement learning; temporal logics; policy synthesis
Cites Work
- \textsf{AMYTISS}: parallelized automated controller synthesis for large-scale stochastic systems
- StocHy - automated verification and synthesis of stochastic processes
- \({\mathcal Q}\)-learning
- Differential dynamic logic for hybrid systems
- Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems
- Title not available
- Title not available
- Multilayer feedforward networks are universal approximators
- A stochastic approximation method for reachability computations
- Title not available
- Convergence of discretization procedures in dynamic programming
- Title not available
- Approximation of Markov decision processes with general state space
- Discounting the distant future: How much do uncertain rates increase valuations?
- Variable resolution discretization in optimal control
- Continuous state dynamic programming via nonexpansive approximation
- Model checking of safety properties
- Deterministic generators and games for LTL fragments
- Markov decision processes with state-dependent discount factors and unbounded rewards/costs
- Near-optimal reinforcement learning in polynomial time
- Analysis of a Numerical Dynamic Programming Algorithm Applied to Economic Models
- Title not available
- Quantitative automata-based controller synthesis for non-autonomous stochastic hybrid systems
- Quantitative model-checking of controlled discrete-time Markov processes
- Automated verification and synthesis of stochastic hybrid systems: a survey
- Risk-sensitive and minimax control of discrete-time, finite-state Markov decision processes
- Safe Exploration of State and Action Spaces in Reinforcement Learning
- Deep reinforcement learning with temporal logics
- Verification of Markov Decision Processes Using Learning Algorithms
- Title not available
- Limit-Deterministic Büchi Automata for Linear Temporal Logic
- Explorations in Monte Carlo Methods
- Optimal Translation of LTL to Limit Deterministic Automata
- Title not available
- Certified reinforcement learning with logic guidance
- Complementing semi-deterministic Büchi automata
- Verification of General Markov Decision Processes by Approximate Similarity Relations and Policy Refinement
- Statistical Verification of Probabilistic Properties with Unbounded Until
- Learning-Based Probabilistic LTL Motion Planning With Environment and Motion Uncertainties
- Limit deterministic and probabilistic automata for \(\mathrm{LTL}\backslash GU\)
- Cost‐efficient numerical algorithm for solving the linear inverse problem of finding a variable magnetization
Cited In (4)