Optimal learning and experimentation in bandit problems.
From MaRDI portal
Publication:1614793
DOI: 10.1016/S0165-1889(01)00028-8, zbMath: 1168.91324, MaRDI QID: Q1614793
Publication date: 9 September 2002
Published in: Journal of Economic Dynamics & Control
Related Items
- Optimistic Gittins Indices
- The multi-armed bandit problem: an efficient nonparametric solution
- Optimal anytime regret with two experts
- ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS
- Online Regret Bounds for Markov Decision Processes with Deterministic Transitions
- Algorithms for recursive delegation
- A comparative study of ad hoc techniques and evolutionary methods for multi-armed bandit problems
- Optimal stopping for Brownian motion with applications to sequential analysis and option pricing
- INDEXABILITY OF BANDIT PROBLEMS WITH RESPONSE DELAYS
- Corrected random walk approximations to free boundary problems in optimal stopping
- Online regret bounds for Markov decision processes with deterministic transitions
- Response adaptive designs that incorporate switching costs and constraints
- Optimal learning with non-Gaussian rewards
- On Incomplete Learning and Certainty-Equivalence Control
- The Local Time Method for Targeting and Selection
- Variance Regularization in Sequential Bayesian Optimization
- Sequential Generalized Likelihood Ratios and Adaptive Treatment Allocation for Optimal Sequential Selection
- Gittins' theorem under uncertainty
- A Bayesian analysis of human decision-making on bandit problems
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Asymptotically efficient adaptive allocation rules
- Adaptive treatment allocation and the multi-armed bandit problem
- A Survey of Some Results in Stochastic Adaptive Control
- Numerical Solutions for Bayes Sequential Decision Problems
- Optimal stopping and dynamic allocation
- Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost
- Contributions to the "Two-Armed Bandit" Problem
- Denumerable-Armed Bandits
- Sequential Tests Involving Two Populations
- Switching Costs and the Gittins Index
- Incomplete Learning from Endogenous Data in Dynamic Allocation
- Some Remarks on the Two-Armed Bandit
- A Bernoulli Two-armed Bandit