Sample mean based index policies by O(log n) regret for the multi-armed bandit problem

Publication:4862097

DOI: 10.2307/1427934
zbMATH Open: 0840.90129
OpenAlex: W2000080679
MaRDI QID: Q4862097
FDO: Q4862097


Authors: Rajeev Agrawal


Publication date: 9 July 1996

Published in: Advances in Applied Probability

Full work available at URL: https://doi.org/10.2307/1427934









Cited In (39)





