Boundary crossing probabilities for general exponential families
From MaRDI portal
(Redirected from Publication:722599)
Abstract: We consider parametric exponential families of dimension on the real line. We study a variant of extit{boundary crossing probabilities} coming from the multi-armed bandit literature, in the case when the real-valued distributions form an exponential family of dimension . Formally, our result is a concentration inequality that bounds the probability that , where is the parameter of an unknown target distribution, is the empirical parameter estimate built from observations, is the log-partition function of the exponential family and is the corresponding Bregman divergence. From the perspective of stochastic multi-armed bandits, we pay special attention to the case when the boundary function is logarithmic, as it is enables to analyze the regret of the state-of-the-art KLUCB and KLUCBp strategies, whose analysis was left open in such generality. Indeed, previous results only hold for the case when , while we provide results for arbitrary finite dimension , thus considerably extending the existing results. Perhaps surprisingly, we highlight that the proof techniques to achieve these strong results already existed three decades ago in the work of T.L. Lai, and were apparently forgotten in the bandit community. We provide a modern rewriting of these beautiful techniques that we believe are useful beyond the application to stochastic multi-armed bandits.
Recommendations
- Boundary crossing for general exponential families
- Boundary crossing probabilities and statistical applications
- scientific article; zbMATH DE number 1218890
- Boundary crossing probability for Brownian motion and general boundaries
- On boundary crossing probabilities for diffusion processes
- Approximations for the boundary crossing probabilities of moving sums of random variables
- Boundary crossing probabilities for locally Poisson processes
Cites work
- scientific article; zbMATH DE number 193660 (Why is no real title available?)
- scientific article; zbMATH DE number 3638998 (Why is no real title available?)
- scientific article; zbMATH DE number 3296905 (Why is no real title available?)
- Adaptive treatment allocation and the multi-armed bandit problem
- Asymptotically efficient adaptive allocation rules
- Boundary crossing problems for sample means
- Dominant measures and Sanov's theorem
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Finite-time analysis of the multiarmed bandit problem
- Kullback-Leibler upper confidence bounds for optimal sequential allocation
- On a Criterion for the Rejection of Observations and the Distribution of the Ratio of Deviation to Sample Standard Deviation
- Optimal Adaptive Policies for Markov Decision Processes
- Regret analysis of stochastic and nonstochastic multi-armed bandit problems
- Sample mean based index policies by O(log n) regret for the multi-armed bandit problem
- Sequential Tests of Statistical Hypotheses
- Some aspects of the sequential design of experiments
Cited in
(3)
This page was built for publication: Boundary crossing probabilities for general exponential families
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q722599)