Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost

From MaRDI portal

Publication:3835405

DOI: 10.1109/9.7243, zbMath: 0678.93073, OpenAlex: W2128421120, MaRDI QID: Q3835405

Demosthenis Teneketzis, Rajeev Agrawal, Manjunath V. Hegde

Publication date: 1988

Published in: IEEE Transactions on Automatic Control

Full work available at URL: https://doi.org/10.1109/9.7243

zbMATH Keywords

switching costs; asymptotic performance; resource allocation problems; multiarmed bandit problems

Mathematics Subject Classification ID

Optimal stochastic control (93E20); Stochastic games, stochastic differential games (91A15); Applications of queueing theory (congestion, allocation, storage, traffic, etc.) (60K30); Probabilistic games; gambling (91A60)

Related Items

Optimal learning and experimentation in bandit problems
A perpetual search for talents across overlapping generations: a learning process
Online Regret Bounds for Markov Decision Processes with Deterministic Transitions
Some indexable families of restless bandit problems
Arbitrary side observations in bandit problems
Online regret bounds for Markov decision processes with deterministic transitions
Certainty equivalence control with forcing: Revisited
Generalized Bandit Problems
Gittins Index for Simple Family of Markov Bandit Processes with Switching Cost and No Discounting
