Optimistic Posterior Sampling for Reinforcement Learning: Worst-Case Regret Bounds
From MaRDI portal
Publication:6199245
DOI: 10.1287/MOOR.2022.1266
arXiv: 1705.07041
OpenAlex: W2769648743
MaRDI QID: Q6199245
Publication date: 23 February 2024
Published in: Mathematics of Operations Research
Full work available at URL: https://arxiv.org/abs/1705.07041