Double reinforcement learning for efficient off-policy evaluation in Markov decision processes

From MaRDI portal
Publication:5148951






