Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models

From MaRDI portal
Publication:4987192

DOI10.1080/07474946.2021.1847965zbMATH Open1464.62360arXiv1905.03495OpenAlexW3138493539MaRDI QIDQ4987192FDOQ4987192


Authors: Emilie Kaufmann, Aurélien Garivier Edit this on Wikidata


Publication date: 29 April 2021

Published in: Sequential Analysis (Search for Journal in Brave)

Abstract: In this paper, we study sequential testing problems with emph{overlapping} hypotheses. We first focus on the simple problem of assessing if the mean mu of a Gaussian distribution is smaller or larger than a fixed epsilon>0; if muin(epsilon,epsilon), both answers are considered to be correct. Then, we consider PAC-best arm identification in a bandit model: given K probability distributions on mathbbR with means mu1,dots,muK, we derive the asymptotic complexity of identifying, with risk at most delta, an index Iin1,dots,K such that muIgeqmaximuiepsilon. We provide non-asymptotic bounds on the error of a parallel General Likelihood Ratio Test, which can also be used for more general testing problems. We further propose lower bound on the number of observation needed to identify a correct hypothesis. Those lower bounds rely on information-theoretic arguments, and specifically on two versions of a change of measure lemma (a high-level form, and a low-level form) whose relative merits are discussed.


Full work available at URL: https://arxiv.org/abs/1905.03495




Recommendations




Cites Work


Cited In (3)





This page was built for publication: Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q4987192)