Regret bounds for restless Markov bandits
From MaRDI portal
Publication: 465253
DOI: 10.1016/j.tcs.2014.09.026
zbMath: 1360.60090
arXiv: 1209.2693
OpenAlex: W2178643644
MaRDI QID: Q465253
Daniil Ryabko, Rémi Munos, Peter Auer, Ronald Ortner
Publication date: 31 October 2014
Published in: Theoretical Computer Science
Full work available at URL: https://arxiv.org/abs/1209.2693
MSC classification:
- Stopping times; optimal stopping problems; gambling theory (60G40)
- Markov and semi-Markov decision processes (90C40)
- Probabilistic games; gambling (91A60)
Related Items (5)
- Multi-armed bandit problem with online clustering as side information
- Optimal Exploration–Exploitation in a Multi-armed Bandit Problem with Non-stationary Rewards
- An online algorithm for the risk-aware restless bandit
- Approximations of the Restless Bandit Problem
- Learning Unknown Service Rates in Queues: A Multiarmed Bandit Approach
Cites Work
- Equivalence notions and model minimization in Markov decision processes
- On the possibility of learning in reactive environments with arbitrary dependence
- Asymptotically efficient adaptive allocation rules
- Threshold limits for cover times
- Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
- Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards
- On Chebyshev-Type Inequalities for Primes
- The Nonstochastic Multiarmed Bandit Problem
- Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
- Finite-time analysis of the multiarmed bandit problem