Finite-time lower bounds for the two-armed bandit problem
From MaRDI portal
Recommendations
- scientific article; zbMATH DE number 3846699
- Lower bounds on the sample complexity of exploration in the multi-armed bandit problem.
- Nonparametric bandit methods
- Sample mean based index policies by O(log n) regret for the multi-armed bandit problem
- The sample complexity of exploration in the multi-armed bandit problem
Cited in (12)
- Bounded Regret for Finitely Parameterized Multi-Armed Bandits
- On ergodic two-armed bandits
- Finite-time analysis of the multiarmed bandit problem
- Non-asymptotic analysis of a new bandit algorithm for semi-bounded rewards
- Arbitrary side observations in bandit problems
- Explore first, exploit next: the true shape of regret in bandit problems
- On optimal prior learning time in the two-armed bandit problem
- scientific article; zbMATH DE number 3967731
- scientific article; zbMATH DE number 3846699
- A Note on Performance Limitations in Bandit Problems With Side Information
- scientific article; zbMATH DE number 6982311
- Adaptive policies for sequential sampling under incomplete information and a cost constraint