A class of bandit problems yielding myopic optimal strategies
From MaRDI portal
Publication:4021719
DOI: 10.2307/3214899 ⋮ zbMath: 0766.90081 ⋮ OpenAlex: W2044356570 ⋮ MaRDI QID: Q4021719
Jeffrey S. Banks, Rangarajan K. Sundaram
Publication date: 16 January 1993
Published in: Journal of Applied Probability
Full work available at URL: https://resolver.caltech.edu/CaltechAUTHORS:20160525-080809749
random walk ⋮ bandit problems ⋮ multiarmed bandit problems ⋮ dynamic allocation index ⋮ myopic strategies ⋮ Bernoulli reward distributions
Stopping times; optimal stopping problems; gambling theory (60G40) ⋮ Markov and semi-Markov decision processes (90C40)
Related Items (9)
Keeping your options open ⋮ A central limit theorem, loss aversion and multi-armed bandits ⋮ A confirmation of a conjecture on Feldman’s two-armed bandit problem ⋮ Parametric continuity in dynamic programming problems ⋮ Dynamic survival bias in optimal stopping problems ⋮ Parametric continuity in dynamic programming problems ⋮ Social learning in a common interest voting game ⋮ Generalized Bandit Problems ⋮ Endogenous growth model with Bayesian learning and technology selection