Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
Publication:5396763
DOI: 10.1561/2200000024; zbMath: 1281.91051; DBLP: journals/ftml/BubeckC12; arXiv: 1204.5721; OpenAlex: W2950929549; Wikidata: Q59538563; Scholia: Q59538563; MaRDI QID: Q5396763
Sébastien Bubeck, Nicolò Cesa-Bianchi
Publication date: 3 February 2014
Published in: Foundations and Trends® in Machine Learning
Full work available at URL: https://arxiv.org/abs/1204.5721
optimization; online learning; reinforcement learning; game-theoretic learning; learning and statistical methods
Minimax procedures in statistical decision theory (62C20); Learning and adaptive systems in artificial intelligence (68T05); Research exposition (monographs, survey articles) pertaining to game theory, economics, and finance (91-02); Markov and semi-Markov decision processes (90C40); Sequential statistical design (62L05); Rationality and learning in game theory (91A26); Probabilistic games; gambling (91A60); Decision theory for games (91A35)