Adaptive treatment allocation and the multi-armed bandit problem
Publication:1102059
DOI: 10.1214/aos/1176350495
zbMath: 0643.62054
MaRDI QID: Q1102059
Publication date: 1987
Published in: The Annals of Statistics
Full work available at URL: https://doi.org/10.1214/aos/1176350495
Keywords: adaptive control; boundary crossing; simulation study; dynamic allocation; asymptotic optimality; upper confidence bounds; multi-armed bandit problem; adaptive treatment allocation
60G40: Stopping times; optimal stopping problems; gambling theory
62L05: Sequential statistical design
62L12: Sequential estimation
Related Items
Efficient Adaptive Randomization and Stopping Rules in Multi-arm Clinical Trials for Testing a New Treatment
Infinite Arms Bandit: Optimality via Confidence Bounds
Learning to Optimize via Information-Directed Sampling
The Valuator’s Curse: Decision Analysis of Overvaluation and Disappointment in Acquisition
Optimistic Gittins Indices
MULTI-ARMED BANDITS WITH COVARIATES: THEORY AND APPLICATIONS
Optimal Online Learning for Nonlinear Belief Models Using Discrete Priors
An Approximation Approach for Response-Adaptive Clinical Trial Design
Learning to Optimize via Posterior Sampling
A linear response bandit problem
Sequential Generalized Likelihood Ratios and Adaptive Treatment Allocation for Optimal Sequential Selection
Encounters with Martingales in Statistics and Stochastic Optimization
Reinforcement Learning, Bit by Bit
Asymptotic optimality theory for active quickest detection with unknown postchange parameters
Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems
Poissonian two-armed bandit: a new approach
A confirmation of a conjecture on Feldman’s two-armed bandit problem
Infomax strategies for an optimal balance between exploration and exploitation
Optimal Bayesian strategies for the infinite-armed Bernoulli bandit
Boundary crossing probabilities for general exponential families
Nonparametric bandit methods
A non-parametric solution to the multi-armed bandit problem with covariates
An analysis of model-based interval estimation for Markov decision processes
Small-sample performance of Bernoulli two-armed bandit Bayesian strategies
Optimal learning and experimentation in bandit problems
On Bayesian index policies for sequential resource allocation
Optimal stopping for Brownian motion with applications to sequential analysis and option pricing
An online algorithm for the risk-aware restless bandit
Stochastic approximation: from statistical origin to big-data, multidisciplinary applications
Matrices -- compensating the loss of anschauung
Bandit and covariate processes, with finite or non-denumerable set of arms
The multi-armed bandit problem: an efficient nonparametric solution
Undiscounted bandit games
On the optimal amount of experimentation in sequential decision problems
Asymptotically optimal algorithms for budgeted multiple play bandits
Optimal strategies for a class of sequential control problems with precedence relations
Customization of J. Bather's UCB strategy for a Gaussian multiarmed bandit