On the optimal solution of the one-armed bandit adaptive control problem

From MaRDI portal

Publication:3931046


DOI: 10.1109/TAC.1981.1102790
zbMath: 0475.90087
OpenAlex: W2164360744
MaRDI QID: Q3931046

P. R. Kumar, Thomas I. Seidman

Publication date: 1981

Published in: IEEE Transactions on Automatic Control

Full work available at URL: https://doi.org/10.1109/tac.1981.1102790

zbMATH Keywords

optimal strategy; optimal solution; sequential design of experiments; designs for clinical trials; good approximations to boundary function; one-armed bandit adaptive control problem; sequential adaptive control; two slot machines

Mathematics Subject Classification ID

Bayesian problems; characterization of Bayes procedures (62C10)
Optimal stochastic control (93E20)
Stopping times; optimal stopping problems; gambling theory (60G40)
Markov and semi-Markov decision processes (90C40)
Sequential statistical design (62L05)

Related Items

On the computation of the optimal cost function for discrete time Markov models with partial observations ⋮ On the improvement of allocation rules for multi-armed bandit problem ⋮ Optimal cost and policy for a Markovian replacement problem
