UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
Publication: 653803
DOI: 10.1007/s10998-010-3055-6
zbMath: 1240.68164
OpenAlex: W1975779216
MaRDI QID: Q653803
Publication date: 19 December 2011
Published in: Periodica Mathematica Hungarica
Full work available at URL: https://doi.org/10.1007/s10998-010-3055-6
MSC classification:
- Markov processes: estimation; hidden Markov models (62M05)
- Learning and adaptive systems in artificial intelligence (68T05)
- Probabilistic games; gambling (91A60)
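The paper analyzes an improved variant of the UCB index policy for the stochastic multi-armed bandit problem. As background, here is a minimal sketch of the classic UCB1 policy from the cited work "Finite-time analysis of the multiarmed bandit problem" (Auer, Cesa-Bianchi, Fischer), not the improved algorithm of this paper; the function names and the toy Bernoulli bandit are illustrative assumptions.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Classic UCB1 index policy (Auer et al.): play each arm once,
    then pull the arm maximizing empirical mean + sqrt(2 ln t / n_a).

    pull(arm) must return a reward in [0, 1]. Returns per-arm pull
    counts and empirical mean rewards after `horizon` rounds.
    """
    counts = [0] * n_arms   # number of times each arm was pulled
    means = [0.0] * n_arms  # empirical mean reward of each arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1     # initialization: play every arm once
        else:
            # UCB index: exploitation term + exploration bonus
            arm = max(range(n_arms),
                      key=lambda a: means[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        r = pull(arm)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # running mean
    return counts, means

# Toy two-armed Bernoulli bandit (illustrative): arm 1 is better.
rng = random.Random(42)
probs = [0.1, 0.9]
counts, means = ucb1(lambda a: 1.0 if rng.random() < probs[a] else 0.0,
                     n_arms=2, horizon=2000)
```

With a reward gap this large, UCB1's logarithmic regret bound implies the suboptimal arm is pulled only O(log n) times, so nearly all pulls concentrate on arm 1.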
Related Items (15)
- Batched bandit problems
- Modification of improved upper confidence bounds for regulating exploration in Monte-Carlo tree search
- The multi-armed bandit problem with covariates
- Unnamed Item
- Asymptotically optimal multi-armed bandit policies under a cost constraint
- Transfer learning for contextual multi-armed bandits
- Ballooning multi-armed bandits
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Explore first, exploit next: the true shape of regret in bandit problems
- Approximations of the restless bandit problem
- Unnamed Item
- Trading utility and uncertainty: applying the value of information to resolve the exploration-exploitation dilemma in reinforcement learning
- A bandit-learning approach to multifidelity approximation
Cites Work
- Unnamed Item
- Unnamed Item
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Asymptotically efficient adaptive allocation rules
- The Nonstochastic Multiarmed Bandit Problem
- Sample mean based index policies by O(log n) regret for the multi-armed bandit problem
- Probability Inequalities for Sums of Bounded Random Variables
- Finite-time analysis of the multiarmed bandit problem