Finite-time lower bounds for the two-armed bandit problem
DOI: 10.1109/9.847107; zbMATH Open: 0991.62059; OpenAlex: W2021240652; MaRDI QID: Q4507101; FDO: Q4507101
Authors: Sanjeev R. Kulkarni, Gábor Lugosi
Publication date: 17 October 2000
Published in: IEEE Transactions on Automatic Control
Full work available at URL: https://doi.org/10.1109/9.847107
Recommendations
- scientific article; zbMATH DE number 3846699
- Lower bounds on the sample complexity of exploration in the multi-armed bandit problem.
- Nonparametric bandit methods
- Sample mean based index policies by O(log n) regret for the multi-armed bandit problem
- The sample complexity of exploration in the multi-armed bandit problem
Cited In (12)
- Finite-time analysis of the multiarmed bandit problem
- Non-asymptotic analysis of a new bandit algorithm for semi-bounded rewards
- On ergodic two-armed bandits
- Arbitrary side observations in bandit problems
- Explore first, exploit next: the true shape of regret in bandit problems
- On optimal prior learning time in the two-armed bandit problem
- Title not available
- Title not available
- A Note on Performance Limitations in Bandit Problems With Side Information
- Title not available
- Adaptive policies for sequential sampling under incomplete information and a cost constraint
- Bounded Regret for Finitely Parameterized Multi-Armed Bandits