Adaptive policies for sequential sampling under incomplete information and a cost constraint
Publication:5261007
Abstract: We consider the problem of sequential sampling from a finite number of independent statistical populations to maximize the expected infinite horizon average outcome per period, under a constraint that the expected average sampling cost does not exceed an upper bound. The outcome distributions are not known. We construct a class of consistent adaptive policies, under which the average outcome converges with probability 1 to the true value under complete information for all distributions with finite means. We also compare the rate of convergence for various policies in this class using simulation.
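Illustration (not part of the record): the "true value under complete information" mentioned in the abstract can be read as the best long-run average outcome achievable when the population means are known. The sketch below assumes one plausible formalization, namely a linear program over long-run sampling frequencies with a single average-cost constraint. Because such a program has only two constraints, an optimal solution mixes at most two populations. The function name `complete_information_value` and its parameters `means`, `costs`, and `budget` are hypothetical, and this is not claimed to be the authors' construction.

```python
from itertools import combinations


def complete_information_value(means, costs, budget):
    """Best average outcome when the means are known (assumed LP formulation).

    Maximizes sum(x[i] * means[i]) subject to sum(x[i] * costs[i]) <= budget,
    sum(x) == 1 and x >= 0, by enumerating the vertices of the feasible set.
    Returns (value, mix), where mix[i] is the sampling frequency of population i.
    """
    n = len(means)
    best_value, best_mix = None, None

    # Vertices with one active population: its cost alone must fit the budget.
    for i in range(n):
        if costs[i] <= budget and (best_value is None or means[i] > best_value):
            best_value = means[i]
            best_mix = [0.0] * n
            best_mix[i] = 1.0

    # Vertices with two active populations: costs straddle the budget and the
    # cost constraint holds with equality.
    for i, j in combinations(range(n), 2):
        lo, hi = (i, j) if costs[i] < costs[j] else (j, i)
        if costs[lo] < budget < costs[hi]:
            w = (costs[hi] - budget) / (costs[hi] - costs[lo])  # weight on the cheaper one
            value = w * means[lo] + (1.0 - w) * means[hi]
            if best_value is None or value > best_value:
                best_value = value
                best_mix = [0.0] * n
                best_mix[lo], best_mix[hi] = w, 1.0 - w

    if best_value is None:
        raise ValueError("no population is affordable within the budget")
    return best_value, best_mix


# Hypothetical numbers, chosen only for illustration.
value, mix = complete_information_value([1.0, 2.0, 3.0], [1.0, 2.0, 4.0], 2.5)
print(value, mix)
```

Under this reading, a consistent adaptive policy is one whose realized average outcome approaches this benchmark even though the means must be estimated from the samples.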
Recommendations
- Optimal adaptive policies for sequential allocation problems
- Obtaining the best value for money in adaptive sequential estimation
- Asymptotically optimal multi-armed bandit policies under a cost constraint
- Consistency of sequential Bayesian sampling policies
- Optimal sequential sampling from two populations.
Cites work
- Asymptotically efficient adaptive allocation rules
- Finite-time analysis of the multiarmed bandit problem
- Finite-time lower bounds for the two-armed bandit problem
- Gittins indices and constrained allocation in clinical trials
- Learning Theory
- Optimal Adaptive Policies for Markov Decision Processes
- Optimal adaptive policies for sequential allocation problems
- Some aspects of the sequential design of experiments
Cited in (3)