Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models

From MaRDI portal
Publication:4987192




Abstract: In this paper, we study sequential testing problems with emph{overlapping} hypotheses. We first focus on the simple problem of assessing if the mean mu of a Gaussian distribution is smaller or larger than a fixed epsilon>0; if muin(epsilon,epsilon), both answers are considered to be correct. Then, we consider PAC-best arm identification in a bandit model: given K probability distributions on mathbbR with means mu1,dots,muK, we derive the asymptotic complexity of identifying, with risk at most delta, an index Iin1,dots,K such that muIgeqmaximuiepsilon. We provide non-asymptotic bounds on the error of a parallel General Likelihood Ratio Test, which can also be used for more general testing problems. We further propose lower bound on the number of observation needed to identify a correct hypothesis. Those lower bounds rely on information-theoretic arguments, and specifically on two versions of a change of measure lemma (a high-level form, and a low-level form) whose relative merits are discussed.









This page was built for publication: Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q4987192)