An Asymptotically Optimal Index Policy for Finite-Horizon Restless Bandits
From MaRDI portal
Publication: Q6288528
arXiv: 1707.00205
MaRDI QID: Q6288528
FDO: Q6288528
Authors: W. Hu, Peter I. Frazier
Publication date: 1 July 2017
Abstract: We consider a restless multi-armed bandit (RMAB) problem with a finite horizon and multiple pulls per period. Leveraging a Lagrangian relaxation, we approximate the problem with a collection of single-arm problems. We then propose an index-based policy that uses optimal solutions of the single-arm problems to index individual arms, and we prove that it is asymptotically optimal as the number of arms tends to infinity. Simulations show that this index-based policy outperforms state-of-the-art heuristics in a variety of problem settings.
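The selection step of such an index-based policy can be sketched as follows. This is a minimal illustrative sketch, not code from the paper: the paper derives each arm's index from the optimal solution of its Lagrangian-relaxed single-arm problem, whereas here the index values are hypothetical inputs and only the "pull the top-m arms by index" step is shown.

```python
import numpy as np

def index_policy_pull(indices, num_pulls):
    """Given one index value per arm, return the set of arms to pull:
    the num_pulls arms with the largest indices (ties broken arbitrarily)."""
    # argsort ascending; the last num_pulls positions hold the largest indices
    order = np.argsort(indices)
    return set(order[-num_pulls:].tolist())

# Hypothetical index values for 5 arms at some period and state
indices = np.array([0.3, 0.9, 0.1, 0.7, 0.5])
pulled = index_policy_pull(indices, num_pulls=2)  # -> {1, 3}
```

In a full implementation the indices would be recomputed each period from the current state of every arm, so the pulled set changes over the horizon.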