scientific article

From MaRDI portal

Publication:2896165

zbMath: 1242.91034 ⋮ MaRDI QID: Q2896165

Jean-Yves Audibert, Sébastien Bubeck

Publication date: 13 July 2012

Full work available at URL: http://www.jmlr.org/papers/v11/audibert10a.html

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.

zbMATH Keywords

online learning ⋮ regret bound ⋮ minimax rate ⋮ bandits (adversarial and stochastic) ⋮ label efficient ⋮ prediction with limited feedback ⋮ upper confidence bound (UCB) policy

Mathematics Subject Classification ID

Computational learning theory (68Q32) ⋮ Markov processes: estimation; hidden Markov models (62M05) ⋮ Probabilistic games; gambling (91A60)

Related Items (20)

Batched bandit problems ⋮ Optimistic Gittins Indices ⋮ Setting Reserve Prices in Second-Price Auctions with Unobserved Bids ⋮ The multi-armed bandit problem with covariates ⋮ Kullback-Leibler upper confidence bounds for optimal sequential allocation ⋮ Unifying mirror descent and dual averaging ⋮ Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback ⋮ Unnamed Item ⋮ Relaxing the i.i.d. assumption: adaptively minimax optimal regret via root-entropic regularization ⋮ A confirmation of a conjecture on Feldman’s two-armed bandit problem ⋮ The \(K\)-armed dueling bandits problem ⋮ Data-Driven Decisions for Problems with an Unspecified Objective Function ⋮ Ballooning multi-armed bandits ⋮ On Bayesian index policies for sequential resource allocation ⋮ Truthful Mechanisms with Implicit Payment Computation ⋮ Bayesian Incentive-Compatible Bandit Exploration ⋮ Online Learning over a Finite Action Set with Limited Switching ⋮ Stochastic continuum-armed bandits with additive models: minimax regrets and adaptive algorithm ⋮ Small-Loss Bounds for Online Learning with Partial Information ⋮ On two continuum armed bandit problems in high dimensions
