Bounded Regret for Finitely Parameterized Multi-Armed Bandits
From MaRDI portal
Publication:5050096
DOI: 10.1007/978-3-030-98519-6_17, OpenAlex: W4226365821, MaRDI QID: Q5050096
Kishan Panaganti, Dileep Kalathil, Pravin P. Varaiya
Publication date: 15 November 2022
Published in: Stochastic Analysis, Filtering, and Stochastic Optimization
Full work available at URL: https://arxiv.org/abs/2003.01328
Applications of statistics (62P99); Neural nets and related approaches to inference from stochastic processes (62M45)
Cites Work
- Combinatorial bandits
- Asymptotically efficient adaptive allocation rules
- Decentralized Learning for Multiplayer Multiarmed Bandits
- Asymptotically efficient adaptive allocation schemes for controlled i.i.d. processes: finite parameter space
- High-Dimensional Probability
- Bandit problems with side observations
- Finite-time analysis of the multiarmed bandit problem