Mechanizing soundness of off-policy evaluation

Publication:6572575

DOI: 10.4230/LIPICS.ITP.2022.32
MaRDI QID: Q6572575
FDO: Q6572575

Authors: Jared Yeager, J. Eliot B. Moss, Michael Norrish, Philip S. Thomas

Publication date: 15 July 2024

zbMATH Keywords

concentration inequality; formal methods; reinforcement learning; Hoeffding; HOL4; off-policy evaluation

Mathematics Subject Classification ID

Theorem proving (automated and interactive theorem provers, deduction, resolution, etc.) (68V15)
