scientific article; zbMATH DE number 6982311
From MaRDI portal
Publication:4558161
zbMATH Open: 1445.62015; MaRDI QID: Q4558161; FDO: Q4558161
Authors: Tor Lattimore
Publication date: 21 November 2018
Full work available at URL: http://jmlr.csail.mit.edu/papers/v19/17-513.html
Title of this publication is not available
Recommendations
- Infinite Arms Bandit: Optimality via Confidence Bounds
- 10.1162/153244303321897663
- scientific article; zbMATH DE number 6276176
- Upper-Confidence-Bound Algorithms for Active Learning in Multi-armed Bandits
- On upper-confidence bound policies for switching bandit problems
- Pure exploration in infinitely-armed bandit models with fixed-confidence
- Confidence sets for discrete stochastic optimization
- Optimal learning and experimentation in bandit problems.
Nonparametric tolerance and confidence regions (62G15); Minimax procedures in statistical decision theory (62C20); Sequential estimation (62L12)
Cites Work
- Pure exploration in multi-armed bandits problems
- Introduction to nonparametric estimation
- Concentration inequalities. A nonasymptotic theory of independence
- Asymptotically efficient adaptive allocation rules
- Multi-armed bandit allocation indices. With a foreword by Peter Whittle.
- Finite-time analysis of the multiarmed bandit problem
- An Asymptotic Minimax Theorem for the Two Armed Bandit Problem
- Title not available
- Sample mean based index policies by O(log n) regret for the multi-armed bandit problem
- The multi-armed bandit problem with covariates
- Kullback-Leibler upper confidence bounds for optimal sequential allocation
- UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
- Regret analysis of stochastic and nonstochastic multi-armed bandit problems
- Lemma 1
- Adaptive treatment allocation and the multi-armed bandit problem
- An asymptotically optimal policy for finite support models in the multiarmed bandit problem
- Optimal adaptive policies for sequential allocation problems
- Boundary crossing of Brownian motion. Its relation to the law of the iterated logarithm and to sequential analysis
- Tuning Bandit Algorithms in Stochastic Environments
- Title not available
- Finite-time lower bounds for the two-armed bandit problem
- Non-asymptotic analysis of a new bandit algorithm for semi-bounded rewards
- Asymptotically efficient adaptive allocation schemes for controlled i.i.d. processes: finite parameter space
- Near-optimal regret bounds for Thompson sampling
- On Bayesian index policies for sequential resource allocation
- Normal bandits of unknown means and variances
- Title not available
- Title not available
- Lower bounds and selectivity of weak-consistent policies in stochastic multi-armed bandit problem
Cited In (4)