Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards

From MaRDI portal
Publication:3780858