scientific article
Publication:2896165
zbMath: 1242.91034 ⋮ MaRDI QID: Q2896165
Jean-Yves Audibert, Sébastien Bubeck
Publication date: 13 July 2012
Full work available at URL: http://www.jmlr.org/papers/v11/audibert10a.html
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
online learning ⋮ regret bound ⋮ minimax rate ⋮ bandits (adversarial and stochastic) ⋮ label efficient ⋮ prediction with limited feedback ⋮ upper confidence bound (UCB) policy
Computational learning theory (68Q32) ⋮ Markov processes: estimation; hidden Markov models (62M05) ⋮ Probabilistic games; gambling (91A60)
Related Items (20)
Batched bandit problems ⋮ Optimistic Gittins Indices ⋮ Setting Reserve Prices in Second-Price Auctions with Unobserved Bids ⋮ The multi-armed bandit problem with covariates ⋮ Kullback-Leibler upper confidence bounds for optimal sequential allocation ⋮ Unifying mirror descent and dual averaging ⋮ Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback ⋮ Unnamed Item ⋮ Relaxing the i.i.d. assumption: adaptively minimax optimal regret via root-entropic regularization ⋮ A confirmation of a conjecture on Feldman’s two-armed bandit problem ⋮ The \(K\)-armed dueling bandits problem ⋮ Data-Driven Decisions for Problems with an Unspecified Objective Function ⋮ Ballooning multi-armed bandits ⋮ On Bayesian index policies for sequential resource allocation ⋮ Truthful Mechanisms with Implicit Payment Computation ⋮ Bayesian Incentive-Compatible Bandit Exploration ⋮ Online Learning over a Finite Action Set with Limited Switching ⋮ Stochastic continuum-armed bandits with additive models: minimax regrets and adaptive algorithm ⋮ Small-Loss Bounds for Online Learning with Partial Information ⋮ On two continuum armed bandit problems in high dimensions