Exploration-exploitation policies with almost sure, arbitrarily slow growing asymptotic regret
DOI: 10.1017/S0269964818000529
zbMATH Open: 1484.62039
arXiv: 1505.02865
OpenAlex: W2914435863
Wikidata: Q128505023 (Scholia: Q128505023)
MaRDI QID: Q5070864
FDO: Q5070864
Authors: Wesley Cowan, Michael N. Katehakis
Publication date: 14 April 2022
Published in: Probability in the Engineering and Informational Sciences
Full work available at URL: https://arxiv.org/abs/1505.02865
Recommendations
- Finite-time analysis of the multiarmed bandit problem
- Sample mean based index policies by O(log n) regret for the multi-armed bandit problem
- Explore first, exploit next: the true shape of regret in bandit problems
- On Bayesian index policies for sequential resource allocation
- Pure exploration in multi-armed bandits problems
Keywords: online learning; upper confidence bounds; sequential allocation; bandits; inflated sample means; forcing actions; multi-armed
MSC classifications: Nonparametric estimation (62G05); Asymptotic properties of nonparametric inference (62G20); Sequential statistical analysis (62L10)
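The keywords above refer to index-style bandit policies built from inflated sample means. As an illustration only, here is a minimal sketch of the classic UCB1 rule from "Finite-time analysis of the multiarmed bandit problem" (listed under Cites Work below), not the arbitrarily-slow-regret policy of this publication; the Bernoulli arm probabilities and horizon are assumed example values.

```python
import math
import random

def ucb1(arms, horizon, seed=0):
    """Run the UCB1 index policy on Bernoulli arms with the given
    success probabilities; return total reward and pull counts."""
    rng = random.Random(seed)
    k = len(arms)
    counts = [0] * k          # number of pulls per arm
    sums = [0.0] * k          # cumulative reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            i = t - 1         # pull each arm once to initialize
        else:
            # inflated sample mean: empirical mean plus a confidence bonus
            i = max(range(k), key=lambda a: sums[a] / counts[a]
                    + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < arms[i] else 0.0
        counts[i] += 1
        sums[i] += reward
        total += reward
    return total, counts

reward, pulls = ucb1([0.3, 0.5, 0.8], horizon=5000)
# over time, the best arm (index 2) receives the bulk of the pulls
```

Policies of the kind studied in this publication modify such index rules (e.g. via forcing actions) so that the regret grows almost surely at an arbitrarily slow rate.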
Cites Work
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Asymptotically efficient adaptive allocation rules
- Some aspects of the sequential design of experiments
- Finite-time analysis of the multiarmed bandit problem
- Regret analysis of stochastic and nonstochastic multi-armed bandit problems
- An asymptotically optimal policy for finite support models in the multiarmed bandit problem
- Optimal adaptive policies for sequential allocation problems
- Regret bounds for reinforcement learning via Markov chain concentration
- Explore first, exploit next: the true shape of regret in bandit problems
- Multi-armed bandits under general depreciation and commitment
- Normal bandits of unknown means and variances
- Title not available
Cited In (1)
This page was built for publication: Exploration-exploitation policies with almost sure, arbitrarily slow growing asymptotic regret