Policies without Memory for the Infinite-Armed Bernoulli Bandit under the Average-Reward Criterion
From MaRDI portal
Publication:5485352
Cited in (7)
- Optimal Bayesian strategies for the infinite-armed Bernoulli bandit
- Finding optimal memoryless policies of POMDPs under the expected average reward criterion
- A note on infinite-armed Bernoulli bandit problems with generalized beta prior distributions
- Some memoryless bandit policies
- A note on strategies for bandit problems with infinitely many arms
- Bandit and covariate processes, with finite or non-denumerable set of arms
- Topp-Leone distribution with an application to binomial sampling