Joint learning of reward machines and policies in environments with partially known semantics
From MaRDI portal
Publication:6579299
Recommendations
- Learning reward machines: a study in partially observable reinforcement learning
- Reward machines: exploiting reward function structure in reinforcement learning
- Reinforcement learning with limited reinforcement: using Bayes risk for active learning in POMDPs
- Inverse reinforcement learning in partially observable environments
- Learning partially observable non-deterministic action models
Cites work
- scientific article; zbMATH DE number 3128787
- Complexity of automaton identification from given data
- Exact DFA Identification Using SAT Solvers
- Omega-Regular Objectives in Model-Free Reinforcement Learning
- PySAT: a Python toolkit for prototyping with SAT oracles
- Reward machines: exploiting reward function structure in reinforcement learning
- \({\mathcal Q}\)-learning