Batched bandit problems
From MaRDI portal
Abstract: Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy, and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.
Recommendations
- Bandit algorithms
- scientific article; zbMATH DE number 3854141
- Multistage bandit problems
- scientific article; zbMATH DE number 4078557
- Generalized Bandit Problems
- Combinatorial bandits
- scientific article; zbMATH DE number 4059270
- The Nonstochastic Multiarmed Bandit Problem
- Bandit problems with Lévy processes
Cites work
- scientific article; zbMATH DE number 3168190 (Why is no real title available?)
- scientific article; zbMATH DE number 4078557 (Why is no real title available?)
- scientific article; zbMATH DE number 3746280 (Why is no real title available?)
- scientific article; zbMATH DE number 46732 (Why is no real title available?)
- scientific article; zbMATH DE number 1348603 (Why is no real title available?)
- scientific article; zbMATH DE number 5251051 (Why is no real title available?)
- A Learning Approach for Interactive Marketing to a Customer Segment
- A Sequential Design for the Two Armed Bandit
- A Two-Sample Test for a Linear Hypothesis Whose Power is Independent of the Variance
- An Asymptotic Minimax Theorem for the Two Armed Bandit Problem
- Asymptotically efficient adaptive allocation rules
- Asymptotically optimal multistage tests of simple hypotheses
- Batched bandit problems
- Finite-time analysis of the multiarmed bandit problem
- Introduction to nonparametric estimation
- Kullback-Leibler upper confidence bounds for optimal sequential allocation
- Multistage bandit problems
- On the Non-Existence of Tests of "Student's" Hypothesis Having Power Functions Independent of $\sigma$
- Optimal few-stage designs
- Regret Minimization for Reserve Prices in Second-Price Auctions
- Regret bounds and minimax policies under partial monitoring
- SOME PROBLEMS OF OPTIMUM SAMPLING
- Sequential experimentation in clinical trials. Design and analysis
- Some Remarks on the Two-Armed Bandit
- Some aspects of the sequential design of experiments
- TWO-STAGE PROCEDURES FOR ESTIMATING THE DIFFERENCE BETWEEN MEANS
- The multi-armed bandit problem with covariates
- UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
Cited in
(18)- Functional Sequential Treatment Allocation
- Learning the distribution with largest mean: two bandit frameworks
- scientific article; zbMATH DE number 7596797 (Why is no real title available?)
- Rejoinder
- scientific article; zbMATH DE number 6253908 (Why is no real title available?)
- Batched bandit problems
- Online learning of energy consumption for navigation of electric vehicles
- Customization of J. Bather's UCB strategy for a Gaussian multiarmed bandit
- Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits Under Realizability
- Gaussian two-armed bandit: limiting description
- scientific article; zbMATH DE number 7380836 (Why is no real title available?)
- UCB strategies and optimization of batch processing in a one-armed bandit problem
- Optimization of two-alternative batch processing with parameter estimation based on data inside batches
- Bayesian adaptive bandit-based designs using the Gittins index for multi-armed trials with normally distributed endpoints
- scientific article; zbMATH DE number 6542809 (Why is no real title available?)
- Greedy Algorithm Almost Dominates in Smoothed Contextual Bandits
- Learning unknown service rates in queues: a multiarmed bandit approach
- Invariant description of control in a Gaussian one-armed bandit problem
This page was built for publication: Batched bandit problems
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q282463)