Publication:4998863
Weiyu Yan, Cynthia Rudin, S. Tracà
Publication date: 9 July 2021
Full work available at URL: https://arxiv.org/abs/1505.05629
Keywords: multi-armed bandit; regret bounds; retail management; exploration-exploitation trade-off; online applications; incorporating time-series into bandits
68T05: Learning and adaptive systems in artificial intelligence
Related Items
Cites Work
- The multi-armed bandit problem with covariates
- UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
- Asymptotically efficient adaptive allocation rules
- New approaches to statistical learning theory
- Learning in a Changing World: Restless Multiarmed Bandit With Unknown Dynamics
- Multi‐Armed Bandit Allocation Indices
- On Upper-Confidence Bound Policies for Switching Bandit Problems
- Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis
- Bandits with Knapsacks
- Optimal Exploration–Exploitation in a Multi-armed Bandit Problem with Non-stationary Rewards
- Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
- Prediction, Learning, and Games
- Finite-time analysis of the multiarmed bandit problem