Partial Monitoring—Classification, Regret Bounds, and Algorithms

Publication:5247607

DOI: 10.1287/moor.2014.0663 ⋮ zbMath: 1310.91028 ⋮ OpenAlex: W1987292194 ⋮ MaRDI QID: Q5247607

Gábor Bartók, Alexander Rakhlin, Csaba Szepesvári, Dean P. Foster, Dávid Pál

Publication date: 24 April 2015

Published in: Mathematics of Operations Research

Full work available at URL: https://doi.org/10.1287/moor.2014.0663

zbMATH Keywords

repeated games ⋮ imperfect information ⋮ partial monitoring ⋮ regret analysis

Mathematics Subject Classification ID

Decision theory (91B06) ⋮ Learning and adaptive systems in artificial intelligence (68T05) ⋮ Applications of game theory (91A80) ⋮ Multistage and repeated games (91A20)

Related Items (10)

Best Arm Identification for Contaminated Bandits ⋮ A general internal regret-free strategy ⋮ Improving multi-armed bandit algorithms in online pricing settings ⋮ Learning in Structured MDPs with Convex Cost Functions: Improved Regret Bounds for Inventory Management ⋮ Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback ⋮ Unnamed Item ⋮ Learning to Optimize via Information-Directed Sampling ⋮ Bayesian Incentive-Compatible Bandit Exploration ⋮ Unnamed Item ⋮ Robust pricing for airlines with partial information
