Good arm identification via bandit feedback
DOI: 10.1007/S10994-019-05784-4 · zbMATH Open: 1491.68160 · arXiv: 1710.06360 · OpenAlex: W2962902250 · Wikidata: Q128264264 · Scholia: Q128264264 · MaRDI QID: Q2425222 · FDO: Q2425222
Atsuyoshi Nakamura, Masashi Sugiyama, Kentaro Matsuura, Kentaro Sakamaki, Hideaki Kano, Junya Honda
Publication date: 26 June 2019
Published in: Machine Learning
Full work available at URL: https://arxiv.org/abs/1710.06360
Learning and adaptive systems in artificial intelligence (68T05)
Stopping times; optimal stopping problems; gambling theory (60G40)
Cites Work
- Title not available
- Title not available
- On the complexity of best-arm identification in multi-armed bandit models
- Asymptotically efficient adaptive allocation rules
- Finite-time analysis of the multiarmed bandit problem
- Kullback-Leibler upper confidence bounds for optimal sequential allocation
- A procedure for selecting a subset of size m containing the l best of k independent normal populations, with applications to simulation
Cited In (3)