Stochastic convex optimization with bandit feedback
From MaRDI portal
Publication:5300524
Abstract: This paper addresses the problem of minimizing a convex, Lipschitz function over a convex, compact set under a stochastic bandit feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value at any query point. The quantity of interest is the regret of the algorithm, which is the sum of the function values at the algorithm's query points minus the optimal function value. We demonstrate a generalization of the ellipsoid algorithm that incurs \(\tilde{O}(\mathrm{poly}(d)\sqrt{T})\) regret. Since any algorithm has regret at least \(\Omega(\sqrt{T})\) on this problem, our algorithm is optimal in terms of the scaling with \(T\).
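The feedback model and regret notion described in the abstract can be illustrated with a toy one-point zeroth-order method in one dimension (a Flaxman–Kalai–McMahan-style sketch, not the ellipsoid-based algorithm the paper analyzes); the objective \(f(x) = (x - 0.3)^2\) and all step-size and noise parameters below are illustrative assumptions.

```python
import random

def bandit_query(x, noise=0.1):
    # Noisy zeroth-order oracle for the illustrative objective f(x) = (x - 0.3)^2:
    # the learner sees only this scalar, never a gradient.
    return (x - 0.3) ** 2 + random.gauss(0.0, noise)

def one_point_bandit_gd(T=2000, delta=0.1, eta=0.01, lo=0.0, hi=1.0):
    # One-point gradient-estimate sketch (Flaxman-Kalai-McMahan style),
    # NOT the paper's ellipsoid-based method. Domain is [lo, hi].
    random.seed(0)
    x = 0.9                    # illustrative starting point
    regret = 0.0
    f_star = 0.0               # min of (x - 0.3)^2, attained at x* = 0.3
    for _ in range(T):
        u = random.choice([-1.0, 1.0])                     # random direction (1-D sphere)
        y = bandit_query(min(hi, max(lo, x + delta * u)))  # one noisy function value
        regret += (x - 0.3) ** 2 - f_star                  # noise-free instantaneous regret
        g = y * u / delta                                  # one-point gradient estimate
        x = min(hi, max(lo, x - eta * g))                  # projected gradient step
    return x, regret

x_T, R_T = one_point_bandit_gd()
```

The cumulative regret here is exactly the quantity the abstract studies: the sum over rounds of \(f(x_t) - f(x^\star)\). A method that merely converged at the end could still pay large regret early on, which is why bandit algorithms must balance exploration (perturbed queries) against exploitation throughout.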
Recommendations
- An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback
- Online convex optimization in the bandit setting: gradient descent without a gradient
- Kernel-based Methods for Bandit Convex Optimization
- Stochastic online optimization. Single-point and multi-point non-linear multi-armed bandits. Convex and strongly-convex case
- Kernel-based methods for bandit convex optimization
Cited in (24)
- Zeroth-order feedback optimization for cooperative multi-agent systems
- A new one-point residual-feedback oracle for black-box learning and control
- Online bandit convex optimisation with stochastic constraints via two-point feedback
- Non-stationary stochastic optimization
- Improved regret for zeroth-order adversarial bandit convex optimisation
- An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback
- Derivative-free optimization methods
- Kernel-based methods for bandit convex optimization
- Technical note: Nonstationary stochastic optimization under \(L_{p,q}\)-variation measures
- Interior-Point Methods for Full-Information and Bandit Online Learning
- Operations research applications of dichotomous search
- Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback
- Algorithmic connections between active learning and stochastic convex optimization
- Technical note: Finite-time regret analysis of Kiefer-Wolfowitz stochastic approximation algorithm and nonparametric multi-product dynamic pricing with unknown demand
- Optimal policy for dynamic assortment planning under multinomial logit models
- Kernel-based Methods for Bandit Convex Optimization
- Exploratory distributions for convex functions
- An accelerated method for derivative-free smooth stochastic convex optimization
- Stochastic online optimization. Single-point and multi-point non-linear multi-armed bandits. Convex and strongly-convex case
- Scientific article (zbMATH DE number 7626799; no title available)
- Random gradient-free minimization of convex functions
- Robust randomized optimization with \(k\) nearest neighbors
- An Efficient Algorithm for Learning with Semi-bandit Feedback
- Stochastic continuum-armed bandits with additive models: minimax regrets and adaptive algorithm