Optimal learning and experimentation in bandit problems.


Publication:1614793


DOI: 10.1016/S0165-1889(01)00028-8, zbMath: 1168.91324, MaRDI QID: Q1614793

Monica Brezzi, Tze Leung Lai

Publication date: 9 September 2002

Published in: Journal of Economic Dynamics & Control

zbMATH Keywords

Optimal stopping, Multi-armed bandits, Switching costs, Corrected binomial algorithm, Incomplete learning

Mathematics Subject Classification ID

Rationality and learning in game theory (91A26)

Related Items

Optimistic Gittins Indices, The multi-armed bandit problem: an efficient nonparametric solution, Optimal anytime regret with two experts, ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS, Online Regret Bounds for Markov Decision Processes with Deterministic Transitions, Algorithms for recursive delegation, A comparative study of ad hoc techniques and evolutionary methods for multi-armed bandit problems, Optimal stopping for Brownian motion with applications to sequential analysis and option pricing, INDEXABILITY OF BANDIT PROBLEMS WITH RESPONSE DELAYS, Corrected random walk approximations to free boundary problems in optimal stopping, Online regret bounds for Markov decision processes with deterministic transitions, Response adaptive designs that incorporate switching costs and constraints, Optimal learning with non-Gaussian rewards, On Incomplete Learning and Certainty-Equivalence Control, The Local Time Method for Targeting and Selection, Variance Regularization in Sequential Bayesian Optimization, Sequential Generalized Likelihood Ratios and Adaptive Treatment Allocation for Optimal Sequential Selection, Gittins' theorem under uncertainty, A Bayesian analysis of human decision-making on bandit problems

