A learning algorithm for the finite-time two-armed bandit problem

From MaRDI portal

Publication:3342234

DOI: 10.1109/TSMC.1984.6313253 · zbMath: 0549.90092 · MaRDI QID: Q3342234

Hiroshi Takeda, Mitsuo Sato, Ken-Ichi Abe

Publication date: 1984

Published in: IEEE Transactions on Systems, Man, and Cybernetics

zbMATH Keywords

learning algorithm; controlling process; estimating process; finite-time two-armed bandit problem

Mathematics Subject Classification ID

Markov and semi-Markov decision processes (90C40)
