Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost
Publication:3835405
DOI: 10.1109/9.7243
zbMath: 0678.93073
OpenAlex: W2128421120
MaRDI QID: Q3835405
Demosthenis Teneketzis, Rajeev Agrawal, Manjunath V. Hegde
Publication date: 1988
Published in: IEEE Transactions on Automatic Control
Full work available at URL: https://doi.org/10.1109/9.7243
Optimal stochastic control (93E20)
Stochastic games, stochastic differential games (91A15)
Applications of queueing theory (congestion, allocation, storage, traffic, etc.) (60K30)
Probabilistic games; gambling (91A60)
Related Items
Optimal learning and experimentation in bandit problems.
A perpetual search for talents across overlapping generations: a learning process
Online Regret Bounds for Markov Decision Processes with Deterministic Transitions
Some indexable families of restless bandit problems
Arbitrary side observations in bandit problems
Online regret bounds for Markov decision processes with deterministic transitions
Certainty equivalence control with forcing: Revisited
Generalized Bandit Problems
Gittins Index for Simple Family of Markov Bandit Processes with Switching Cost and No Discounting