Adaptive treatment allocation and the multi-armed bandit problem

Publication:1102059


DOI: 10.1214/aos/1176350495
zbMath: 0643.62054
MaRDI QID: Q1102059

Tze Leung Lai

Publication date: 1987

Published in: The Annals of Statistics

Full work available at URL: https://doi.org/10.1214/aos/1176350495


60G40: Stopping times; optimal stopping problems; gambling theory

62L05: Sequential statistical design

62L12: Sequential estimation


Related Items

Efficient Adaptive Randomization and Stopping Rules in Multi-arm Clinical Trials for Testing a New Treatment
Infinite Arms Bandit: Optimality via Confidence Bounds
Learning to Optimize via Information-Directed Sampling
The Valuator’s Curse: Decision Analysis of Overvaluation and Disappointment in Acquisition
Optimistic Gittins Indices
MULTI-ARMED BANDITS WITH COVARIATES: THEORY AND APPLICATIONS
Optimal Online Learning for Nonlinear Belief Models Using Discrete Priors
An Approximation Approach for Response-Adaptive Clinical Trial Design
Learning to Optimize via Posterior Sampling
A linear response bandit problem
Sequential Generalized Likelihood Ratios and Adaptive Treatment Allocation for Optimal Sequential Selection
Encounters with Martingales in Statistics and Stochastic Optimization
Reinforcement Learning, Bit by Bit
Asymptotic optimality theory for active quickest detection with unknown postchange parameters
Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems
Poissonian two-armed bandit: a new approach
A confirmation of a conjecture on Feldman’s two-armed bandit problem
Infomax strategies for an optimal balance between exploration and exploitation
Optimal Bayesian strategies for the infinite-armed Bernoulli bandit
Boundary crossing probabilities for general exponential families
Nonparametric bandit methods
A non-parametric solution to the multi-armed bandit problem with covariates
An analysis of model-based interval estimation for Markov decision processes
Small-sample performance of Bernoulli two-armed bandit Bayesian strategies
Optimal learning and experimentation in bandit problems
On Bayesian index policies for sequential resource allocation
Optimal stopping for Brownian motion with applications to sequential analysis and option pricing
An online algorithm for the risk-aware restless bandit
Stochastic approximation: from statistical origin to big-data, multidisciplinary applications
Matrices -- compensating the loss of anschauung
Bandit and covariate processes, with finite or non-denumerable set of arms
The multi-armed bandit problem: an efficient nonparametric solution
Undiscounted bandit games
On the optimal amount of experimentation in sequential decision problems
Asymptotically optimal algorithms for budgeted multiple play bandits
Optimal strategies for a class of sequential control problems with precedence relations
Customization of J. Bather's UCB strategy for a Gaussian multiarmed bandit