Abstract: We consider the restless Markov bandit problem, in which the state of each arm evolves according to a Markov process independently of the learner's actions. We suggest an algorithm that after T steps achieves Õ(√T) regret with respect to the best policy that knows the distributions of all arms. No assumptions on the Markov chains are made except that they are irreducible. In addition, we show that index-based policies are necessarily suboptimal for the considered problem.
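To make the "restless" property concrete, here is a minimal toy simulation, assuming hypothetical 2-state arms: every arm's state advances at every step, whether or not that arm is played, and the learner only observes the reward of the arm it pulls. The transition probabilities, rewards, and round-robin learner below are illustrative placeholders, not the paper's algorithm.

```python
import random

def step_state(state, p_stay):
    """One Markov step for a 2-state chain: stay with prob p_stay, else switch."""
    return state if random.random() < p_stay else 1 - state

def run(T, seed=0):
    random.seed(seed)
    p_stay = [0.9, 0.6]                # per-arm self-transition probabilities (illustrative)
    reward = [(0.0, 1.0), (0.2, 0.5)]  # reward of arm i in state 0 / state 1 (illustrative)
    states = [0, 0]
    total = 0.0
    for t in range(T):
        arm = t % 2                    # naive round-robin learner, a placeholder policy
        total += reward[arm][states[arm]]
        # restless: BOTH arms' states evolve, not just the pulled one
        states = [step_state(s, p_stay[i]) for i, s in enumerate(states)]
    return total / T

avg = run(10_000)
```

The regret the abstract refers to compares such a learner's cumulative reward over T steps against the best policy that knows all arms' transition distributions in advance.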
Cites work
- scientific article; zbMATH DE number 4087408
- scientific article; zbMATH DE number 3638998
- scientific article; zbMATH DE number 700091
- scientific article; zbMATH DE number 2086977
- Asymptotically efficient adaptive allocation rules
- Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards
- Equivalence notions and model minimization in Markov decision processes
- Finite-time analysis of the multiarmed bandit problem
- Near-optimal regret bounds for reinforcement learning
- On Chebyshev-Type Inequalities for Primes
- On the possibility of learning in reactive environments with arbitrary dependence
- Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
- Regret analysis of stochastic and nonstochastic multi-armed bandit problems
- The Nonstochastic Multiarmed Bandit Problem
- Threshold limits for cover times
Cited in (12)
- Bounded Regret for Finitely Parameterized Multi-Armed Bandits
- Regret bounds for Narendra-Shapiro bandit algorithms
- Whittle index based Q-learning for restless bandits with average reward
- Approximations of the restless bandit problem
- An online algorithm for the risk-aware restless bandit
- Regret Bounds for Restless Markov Bandits
- Learning unknown service rates in queues: a multiarmed bandit approach
- Multi-armed bandit problem with online clustering as side information
- Learning in structured MDPs with convex cost functions: improved regret bounds for inventory management
- A new bandit setting balancing information from state evolution and corrupted context
- Approximation algorithms for restless bandit problems
- Optimal exploration-exploitation in a multi-armed bandit problem with non-stationary rewards
This page was built for publication: Regret bounds for restless Markov bandits
MaRDI item Q465253