Adaptive-treed bandits
From MaRDI portal
Publication:888482
Abstract: We describe a novel algorithm for noisy global optimisation and continuum-armed bandits, with good convergence properties over any continuous reward function having finitely many polynomial maxima. Over such functions, our algorithm achieves square-root regret in bandits, and inverse-square-root error in optimisation, without prior information. Our algorithm works by reducing these problems to tree-armed bandits, and we also provide new results in this setting. We show it is possible to adaptively combine multiple trees so as to minimise the regret, and also give near-matching lower bounds on the regret in terms of the zooming dimension.
Recommendations
Cites work
- scientific article; zbMATH DE number 3911487 (Why is no real title available?)
- scientific article; zbMATH DE number 5485582 (Why is no real title available?)
- Finite-time analysis of the multiarmed bandit problem
- Improved Rates for the Stochastic Continuum-Armed Bandit Problem
- Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting
- Lipschitz bandits without the Lipschitz constant
- Pure exploration in multi-armed bandits problems
- Recent approaches to global optimization problems through particle Swarm optimization
- Regret analysis of stochastic and nonstochastic multi-armed bandit problems
- Regret and Convergence Bounds for a Class of Continuum-Armed Bandit Problems
- Stochastic Estimation of the Maximum of a Regression Function
- The Continuum-Armed Bandit Problem
- The knowledge-gradient policy for correlated normal beliefs
- \(X\)-armed bandits
Cited in
(9)- Lipschitz bandits without the Lipschitz constant
- From bandits to Monte-Carlo tree search: the optimistic principle applied to optimization and planning
- Contextual bandits with continuous actions: smoothing, zooming, and adapting
- The Continuum-Armed Bandit Problem
- \(X\)-armed bandits
- Gaussian process bandits with adaptive discretization
- Filtered Poisson process bandit on a continuum
- Convergence rate of a simulated annealing algorithm with noisy observations
- Stochastic continuum-armed bandits with additive models: minimax regrets and adaptive algorithm
This page was built for publication: Adaptive-treed bandits
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q888482)