EXPLORATION–EXPLOITATION POLICIES WITH ALMOST SURE, ARBITRARILY SLOW GROWING ASYMPTOTIC REGRET

From MaRDI portal

Publication:5070864

DOI: 10.1017/S0269964818000529; zbMath: 1484.62039; arXiv: 1505.02865; OpenAlex: W2914435863; MaRDI QID: Q5070864

Michael N. Katehakis, Wesley Cowan

Publication date: 14 April 2022

Published in: Probability in the Engineering and Informational Sciences

Full work available at URL: https://arxiv.org/abs/1505.02865

zbMATH Keywords

online learning; upper confidence bounds; sequential allocation; bandits; inflated sample means; forcing actions; multi-armed

Mathematics Subject Classification ID

Asymptotic properties of nonparametric inference (62G20); Nonparametric estimation (62G05); Sequential statistical analysis (62L10)
