A dynamic programming strategy to balance exploration and exploitation in the bandit problem


Publication:647433


DOI: 10.1007/s10472-010-9190-1 | zbMath: 1226.68079 | OpenAlex: W2052471706 | MaRDI QID: Q647433

Olivier Caelen, Gianluca Bontempi

Publication date: 23 November 2011

Published in: Annals of Mathematics and Artificial Intelligence

Full work available at URL: https://doi.org/10.1007/s10472-010-9190-1

zbMATH Keywords

estimation; greedy; multi-armed bandit problem

Mathematics Subject Classification ID

Estimation in multivariate analysis (62H12)
Learning and adaptive systems in artificial intelligence (68T05)
Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.) (68T20)


