PAC-Bayesian lifelong learning for multi-armed bandits
Publication:2134066
DOI: 10.1007/s10618-022-00825-4, zbMath: 1494.68214, arXiv: 2203.03303, OpenAlex: W4220868778, MaRDI QID: Q2134066
Melih Kandemir, Hamish Flynn, David Reeb, Jan Peters
Publication date: 5 May 2022
Published in: Data Mining and Knowledge Discovery
Full work available at URL: https://arxiv.org/abs/2203.03303
Bayesian inference (62F15); Learning and adaptive systems in artificial intelligence (68T05); Sequential statistical analysis (62L10); Optimal stopping in statistics (62L15)
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Doubly robust policy evaluation and optimization
- PAC-Bayesian inequalities of some random variables sequences
- Statistical learning theory and stochastic optimization. Ecole d'Eté de Probabilités de Saint-Flour XXXI -- 2001.
- A theory of learning from different domains
- Some PAC-Bayesian theorems
- Weighted sums of certain dependent random variables
- PAC-Bayesian Inequalities for Martingales
- Asymptotic evaluation of certain Markov process expectations for large time, I
- Prediction, Learning, and Games