A Structured Multiarmed Bandit Problem and the Greedy Policy
Publication:4974829
DOI: 10.1109/TAC.2009.2031725 · zbMATH: 1367.90115 · OpenAlex: W2115293355 · MaRDI QID: Q4974829
Paat Rusmevichientong, Adam J. Mersereau, John N. Tsitsiklis
Publication date: 8 August 2017
Published in: IEEE Transactions on Automatic Control
Full work available at URL: https://doi.org/10.1109/tac.2009.2031725
Related Items (5)
A linear response bandit problem ⋮ Response-adaptive designs for clinical trials: simultaneous learning from multiple patients ⋮ Bayesian policy reuse ⋮ Learning in Combinatorial Optimization: What and How to Explore ⋮ Multi-objective multi-armed bandit with lexicographically ordered and satisficing objectives