A class of bandit problems yielding myopic optimal strategies

From MaRDI portal

Publication:4021719

DOI: 10.2307/3214899 ⋮ zbMath: 0766.90081 ⋮ OpenAlex: W2044356570 ⋮ MaRDI QID: Q4021719

Jeffrey S. Banks, Rangarajan K. Sundaram

Publication date: 16 January 1993

Published in: Journal of Applied Probability

Full work available at URL: https://resolver.caltech.edu/CaltechAUTHORS:20160525-080809749

zbMATH Keywords

random walk ⋮ bandit problems ⋮ multiarmed bandit problems ⋮ dynamic allocation index ⋮ myopic strategies ⋮ Bernoulli reward distributions

Mathematics Subject Classification ID

Stopping times; optimal stopping problems; gambling theory (60G40) ⋮ Markov and semi-Markov decision processes (90C40)

Related Items (9)

Keeping your options open ⋮ A central limit theorem, loss aversion and multi-armed bandits ⋮ A confirmation of a conjecture on Feldman’s two-armed bandit problem ⋮ Parametric continuity in dynamic programming problems ⋮ Dynamic survival bias in optimal stopping problems ⋮ Parametric continuity in dynamic programming problems ⋮ Social learning in a common interest voting game ⋮ Generalized Bandit Problems ⋮ Endogenous growth model with Bayesian learning and technology selection
