Policies without Memory for the Infinite-Armed Bernoulli Bandit under the Average-Reward Criterion

Publication:5485352