Regret bounds for Narendra-Shapiro bandit algorithms
DOI: 10.1080/17442508.2018.1457675
zbMATH Open: 1498.60303
arXiv: 1502.04874
OpenAlex: W1693596837
MaRDI QID: Q5086451
FDO: Q5086451
Authors: Sébastien Gadat, Fabien Panloup, Sofiane Saadane
Publication date: 5 July 2022
Published in: Stochastics
Full work available at URL: https://arxiv.org/abs/1502.04874
Recommendations
- Bounded Regret for Finitely Parameterized Multi-Armed Bandits
- Regret and Convergence Bounds for a Class of Continuum-Armed Bandit Problems
- Regret lower bound and optimal algorithm for high-dimensional contextual linear bandit
- Regret bounds for restless Markov bandits
- Regret Bounds for Restless Markov Bandits
- Regret analysis of stochastic and nonstochastic multi-armed bandit problems
- Near-optimal regret bounds for reinforcement learning
- Near-optimal regret bounds for Thompson sampling
- Deviations of stochastic bandit regret
- Constrained regret minimization for multi-criterion multi-armed bandits
- Sequential statistical analysis (62L10)
- Markov chains (discrete-time Markov processes on discrete state spaces) (60J10)
- Continuous-time Markov processes on general state spaces (60J25)
Cites Work
- Stochastic approximation methods for constrained and unconstrained systems
- Title not available
- Title not available
- Title not available
- The Nonstochastic Multiarmed Bandit Problem
- Some aspects of the sequential design of experiments
- DOI: 10.1162/153244303321897663
- Kullback-Leibler upper confidence bounds for optimal sequential allocation
- Total variation estimates for the TCP process
- When can the two-armed bandit algorithm be trusted?
- On the linear model with two absorbing barriers
- Use of Stochastic Automata for Parameter Self-Optimization with Multimodal Performance Criteria
- A penalized bandit algorithm
- How Fast Is the Bandit?
- Long time behavior of Markov processes and beyond
Cited In (3)