Publication:5396640

From MaRDI portal

zbMath: 1280.91039; MaRDI QID: Q5396640

Satyen Kale, Elad Hazan

Publication date: 3 February 2014

Full work available at URL: http://www.jmlr.org/papers/v12/hazan11a.html

zbMATH Keywords

online learning; multi-armed bandit; regret minimization

Mathematics Subject Classification ID

91B06: Decision theory

68T05: Learning and adaptive systems in artificial intelligence

90C40: Markov and semi-Markov decision processes

91A60: Probabilistic games; gambling

Related Items

Unnamed Item

Optimal Exploration–Exploitation in a Multi-armed Bandit Problem with Non-stationary Rewards

AN ONLINE PORTFOLIO SELECTION ALGORITHM WITH REGRET LOGARITHMIC IN PRICE VARIATION

Doubly robust policy evaluation and optimization

Extracting certainty from uncertainty: regret bounded by variation in costs

Stochastic continuum-armed bandits with additive models: minimax regrets and adaptive algorithm

Truthful Mechanisms with Implicit Payment Computation
