Batched bandit problems

DOI 10.1214/15-AOS1381

MaRDI QID Q282463

Authors Vianney Perchet, Philippe Rigollet, Sylvain Chassang, Erik Snowberg

Publication date 12 May 2016

Published in The Annals of Statistics

Full work available at URL https://arxiv.org/abs/1505.00369, https://projecteuclid.org/euclid.aos/1458245731

sample size determination; batches; grouped clinical trials; multi-armed bandit problems; multi-phase allocation; regret bounds; switching cost

Mathematics Subject Classification ID

Applications of statistics to biology and medical sciences; meta analysis (62P10)
Sequential statistical design (62L05)
Minimax procedures in statistical decision theory (62C20)

Abstract: Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy, and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.

