From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning
DOI: 10.1561/2200000038 | zbMath: 1296.91086 | OpenAlex: W2073107347 | MaRDI QID: Q5168384
Publication date: 4 July 2014
Published in: Foundations and Trends® in Machine Learning
Full work available at URL: https://doi.org/10.1561/2200000038
optimization; stochastic optimization; Markov decision processes; online learning; operations research; algorithmic game theory; game theoretic learning
Decision theory (91B06)
Monte Carlo methods (65C05)
Large-scale problems in mathematical programming (90C06)
Learning and adaptive systems in artificial intelligence (68T05)
Stochastic programming (90C15)
Management decision making, including multiple objectives (90B50)
Combinatorial optimization (90C27)
Stopping times; optimal stopping problems; gambling theory (60G40)
Markov and semi-Markov decision processes (90C40)