Bayesian policy reuse
From MaRDI portal
Publication:1689554
DOI: 10.1007/S10994-016-5547-Y
zbMath: 1454.68129
arXiv: 1505.00284
OpenAlex: W778742492
MaRDI QID: Q1689554
Majd Hawasly, Benjamin Rosman, Subramanian Ramamoorthy
Publication date: 12 January 2018
Published in: Machine Learning
Full work available at URL: https://arxiv.org/abs/1505.00284
Keywords: online learning, reinforcement learning, Bayesian decision theory, transfer learning, Bayesian optimisation, online bandits, policy reuse
Bayesian problems; characterization of Bayes procedures (62C10)
Learning and adaptive systems in artificial intelligence (68T05)
Online algorithms; streaming algorithms (68W27)
Cites Work
- Asymptotically efficient adaptive allocation rules
- Computing a Classic Index for Finite-Horizon Bandits
- A Structured Multiarmed Bandit Problem and the Greedy Policy
- Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting
- Finite-time analysis of the multiarmed bandit problem