Reliable off-policy evaluation for reinforcement learning
From MaRDI portal
Publication:6579655
Recommendations
- Proximal reinforcement learning: efficient off-policy evaluation in partially observed Markov decision processes
- Double reinforcement learning for efficient off-policy evaluation in Markov decision processes
- Projected state-action balancing weights for offline reinforcement learning
- Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
- Off-policy evaluation in partially observed Markov decision processes under sequential ignorability
Cited in (4)
- Pessimistic value iteration for multi-task data sharing in offline reinforcement learning
- Proximal reinforcement learning: efficient off-policy evaluation in partially observed Markov decision processes
- Projected state-action balancing weights for offline reinforcement learning
- Off-policy evaluation for tabular reinforcement learning with synthetic trajectories