Robustness of stochastic bandit policies

Publication:391739

DOI: 10.1016/j.tcs.2013.09.019, zbMath: 1371.68239, arXiv: 1107.4506, OpenAlex: W1985558253, MaRDI QID: Q391739

Antoine Salomon, Jean-Yves Audibert

Publication date: 13 January 2014

Published in: Theoretical Computer Science

Full work available at URL: https://arxiv.org/abs/1107.4506

zbMATH Keywords

exploration-exploitation tradeoff, multi-armed stochastic bandit, regret deviations/risk

Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)
Sequential statistical analysis (62L10)
Probabilistic games; gambling (91A60)
General considerations in statistical decision theory (62C05)
