Woodroofe's one-armed bandit problem revisited
Publication:835072
DOI: 10.1214/08-AAP589, zbMath: 1168.62071, arXiv: 0909.0119, OpenAlex: W3106279838, MaRDI QID: Q835072
Alexander Goldenshluger, Assaf J. Zeevi
Publication date: 27 August 2009
Published in: The Annals of Applied Probability
Full work available at URL: https://arxiv.org/abs/0909.0119
minimax, estimation, online learning, regret, bandit problems, sequential allocation, inferior sampling rate, rate-optimal policy
Minimax procedures in statistical decision theory (62C20)
Stopping times; optimal stopping problems; gambling theory (60G40)
Sequential statistical design (62L05)
Related Items
- A non-parametric solution to the multi-armed bandit problem with covariates
- Bandit and covariate processes, with finite or non-denumerable set of arms
- A linear response bandit problem
- Smooth Contextual Bandits: Bridging the Parametric and Nondifferentiable Regret Regimes
- MULTI-ARMED BANDITS WITH COVARIATES: THEORY AND APPLICATIONS
- The multi-armed bandit problem with covariates
- One-armed bandit process with a covariate
- Unnamed Item
- Transfer learning for contextual multi-armed bandits
- Randomized allocation with arm elimination in a bandit problem with covariates
- Regret lower bound and optimal algorithm for high-dimensional contextual linear bandit
- Nonparametric Pricing Analytics with Customer Covariates
Cites Work
- Unnamed Item (6 entries)
- Information inequalities for the Bayes risk
- Pseudo-maximization and self-normalized processes
- Asymptotically efficient adaptive allocation rules
- One-armed bandit problems with covariates
- Smooth discrimination analysis
- Randomized allocation with nonparametric estimation for a multi-armed bandit problem with covariates
- Self-normalized processes: exponential inequalities, moment bounds and iterated logarithm laws
- Optimal aggregation of classifiers in statistical learning
- Applications of the van Trees inequality: A Bayesian Cramér-Rao bound
- Deviation probability bound for martingales with applications to statistical estimation
- Arbitrary side observations in bandit problems
- Covariate models for Bernoulli bandits
- A One-Armed Bandit Problem with a Concomitant Variable
- A Note on Performance Limitations in Bandit Problems With Side Information
- Prediction, Learning, and Games
- Some aspects of the sequential design of experiments