Reliable off-policy evaluation for reinforcement learning
Publication: 6579655
DOI: 10.1287/OPRE.2022.2382
MaRDI QID: Q6579655
FDO: Q6579655
Authors: Jie Wang, Rui Gao, Hongyuan Zha
Publication date: 25 July 2024
Published in: Operations Research
Recommendations
- Proximal reinforcement learning: efficient off-policy evaluation in partially observed Markov decision processes
- Double reinforcement learning for efficient off-policy evaluation in Markov decision processes
- Projected state-action balancing weights for offline reinforcement learning
- Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
- Off-policy evaluation in partially observed Markov decision processes under sequential ignorability
Cited In (4)
- Pessimistic value iteration for multi-task data sharing in offline reinforcement learning
- Proximal reinforcement learning: efficient off-policy evaluation in partially observed Markov decision processes
- Projected state-action balancing weights for offline reinforcement learning
- Off-policy evaluation for tabular reinforcement learning with synthetic trajectories