Sample-efficient reinforcement learning from human feedback via information-directed sampling
This page was built for publication: Sample-efficient reinforcement learning from human feedback via information-directed sampling
MaRDI item: Q6921518