Partial Monitoring—Classification, Regret Bounds, and Algorithms
Publication:5247607
DOI: 10.1287/moor.2014.0663
zbMath: 1310.91028
OpenAlex: W1987292194
MaRDI QID: Q5247607
Gábor Bartók, Alexander Rakhlin, Csaba Szepesvári, Dean P. Foster, Dávid Pál
Publication date: 24 April 2015
Published in: Mathematics of Operations Research
Full work available at URL: https://doi.org/10.1287/moor.2014.0663
Decision theory (91B06) ⋮ Learning and adaptive systems in artificial intelligence (68T05) ⋮ Applications of game theory (91A80) ⋮ Multistage and repeated games (91A20)
Related Items (10)
Best Arm Identification for Contaminated Bandits ⋮ A general internal regret-free strategy ⋮ Improving multi-armed bandit algorithms in online pricing settings ⋮ Learning in Structured MDPs with Convex Cost Functions: Improved Regret Bounds for Inventory Management ⋮ Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback ⋮ Unnamed Item ⋮ Learning to Optimize via Information-Directed Sampling ⋮ Bayesian Incentive-Compatible Bandit Exploration ⋮ Unnamed Item ⋮ Robust pricing for airlines with partial information
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- The weighted majority algorithm
- Calibrated learning and correlated equilibrium
- Minimizing regret: The general case
- Toward a classification of finite partial-monitoring games
- Strategies for Prediction Under Imperfect Monitoring
- Minimizing Regret With Label Efficient Prediction
- The Nonstochastic Multiarmed Bandit Problem
- Regret Minimization Under Partial Monitoring
- Internal Regret with Partial Monitoring. Calibration-Based Optimal Algorithms
- Prediction, Learning, and Games