Pure exploration in finitely-armed and continuous-armed bandits
Publication: 2431430
DOI: 10.1016/j.tcs.2010.12.059
zbMath: 1214.62082
OpenAlex: W2108794978
MaRDI QID: Q2431430
Rémi Munos, Sébastien Bubeck, Gilles Stoltz
Publication date: 14 April 2011
Published in: Theoretical Computer Science
Full work available at URL: https://doi.org/10.1016/j.tcs.2010.12.059
Related Items (14)
- Bandit Theory: Applications to Learning Healthcare Systems and Clinical Trials
- Robust Learning of Consumer Preferences
- Intrinsically motivated model learning for developing curious robots
- Information theory for ranking and selection
- Simple and cumulative regret for continuous noisy optimization
- Learning the distribution with largest mean: two bandit frameworks
- Gaussian process bandits with adaptive discretization
- Deep learning for ranking response surfaces with applications to optimal stopping problems
- A bad arm existence checking problem: how to utilize asymmetric problem structure?
- Bayesian Incentive-Compatible Bandit Exploration
- A PAC algorithm in relative precision for bandit problem with costly sampling
- An asymptotically optimal strategy for constrained multi-armed bandit problems
- Trading utility and uncertainty: applying the value of information to resolve the exploration-exploitation dilemma in reinforcement learning
- Sequential Design for Ranking Response Surfaces
Cites Work
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Asymptotically efficient adaptive allocation rules
- Learning Theory
- The Nonstochastic Multiarmed Bandit Problem
- Probability Inequalities for Sums of Bounded Random Variables
- Some aspects of the sequential design of experiments
- Combinatorial methods in density estimation
- Finite-time analysis of the multiarmed bandit problem