Pure Exploration in Multi-armed Bandits Problems
Publication:3648740
DOI: 10.1007/978-3-642-04414-4_7 ⋮ zbMath: 1262.68061 ⋮ OpenAlex: W1881419322 ⋮ MaRDI QID: Q3648740
Sébastien Bubeck, Gilles Stoltz, Rémi Munos
Publication date: 1 December 2009
Published in: Lecture Notes in Computer Science
Full work available at URL: https://doi.org/10.1007/978-3-642-04414-4_7
Computational learning theory (68Q32) ⋮ Learning and adaptive systems in artificial intelligence (68T05) ⋮ Probabilistic games; gambling (91A60)
Related Items (19)
Algorithm portfolios for noisy optimization ⋮ Sequential estimation of quantiles with applications to A/B testing and best-arm identification ⋮ Modification of improved upper confidence bounds for regulating exploration in Monte-Carlo tree search ⋮ Smoothness-Adaptive Contextual Bandits ⋮ Always Valid Inference: Continuous Monitoring of A/B Tests ⋮ Adaptive-treed bandits ⋮ Variable Selection Via Thompson Sampling ⋮ Convergence rate analysis for optimal computing budget allocation algorithms ⋮ Constrained regret minimization for multi-criterion multi-armed bandits ⋮ Treatment recommendation with distributional targets ⋮ Tractable Sampling Strategies for Ordinal Optimization ⋮ Simple Bayesian Algorithms for Best-Arm Identification ⋮ Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization ⋮ Efficient crowdsourcing of unknown experts using bounded multi-armed bandits ⋮ Unnamed Item ⋮ Pure Exploration in Multi-armed Bandits Problems ⋮ Multi-armed bandits with episode context ⋮ A Bandit-Learning Approach to Multifidelity Approximation ⋮ Optimal Policy for Dynamic Assortment Planning Under Multinomial Logit Models