Batched bandit problems

DOI: 10.1214/15-AOS1381; zbMATH Open: 1338.62180; arXiv: 1505.00369; OpenAlex: W1958090791; MaRDI QID: Q282463; FDO: Q282463

Authors: Vianney Perchet, Philippe Rigollet, Sylvain Chassang, Erik Snowberg

Publication date: 12 May 2016

Published in: The Annals of Statistics

Abstract: Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy, and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.

Full work available at URL: https://arxiv.org/abs/1505.00369

zbMATH Keywords

sample size determination; batches; grouped clinical trials; multi-armed bandit problems; multi-phase allocation; regret bounds; switching cost

Mathematics Subject Classification ID

Applications of statistics to biology and medical sciences; meta analysis (62P10)
Sequential statistical design (62L05)
Minimax procedures in statistical decision theory (62C20)

Cited In (18)
