Joint learning of reward machines and policies in environments with partially known semantics
Publication:6579299
DOI: 10.1016/J.ARTINT.2024.104146
zbMATH Open: 1543.6832
MaRDI QID: Q6579299
FDO: Q6579299
Authors: Christos K. Verginis, Cevahir Koprulu, Sandeep Chinchali, Ufuk Topcu
Publication date: 25 July 2024
Published in: Artificial Intelligence
Learning and adaptive systems in artificial intelligence (68T05); Formal languages and automata (68Q45); Computational learning theory (68Q32)
Cites Work
- PySAT: a Python toolkit for prototyping with SAT oracles
- \({\mathcal Q}\)-learning
- Title not available
- Complexity of automaton identification from given data
- Exact DFA Identification Using SAT Solvers
- Omega-Regular Objectives in Model-Free Reinforcement Learning
- Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning