Finite-Time Analysis for the Knowledge-Gradient Policy
From MaRDI portal
Publication:4610155
DOI: 10.1137/16M1073388; zbMath: 1387.62029; arXiv: 1606.04624; OpenAlex: W2963389017; Wikidata: Q130050586; Scholia: Q130050586; MaRDI QID: Q4610155
Yingfei Wang, Warren B. Powell
Publication date: 5 April 2018
Published in: SIAM Journal on Control and Optimization
Full work available at URL: https://arxiv.org/abs/1606.04624
Bayesian problems; characterization of Bayes procedures (62C10)
Learning and adaptive systems in artificial intelligence (68T05)
Sequential statistical analysis (62L10)
Statistical ranking and selection procedures (62F07)
Cites Work
- Kullback-Leibler upper confidence bounds for optimal sequential allocation
- Optimal learning for sequential sampling with non-parametric beliefs
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Asymptotically efficient adaptive allocation rules
- Efficient global optimization of expensive black-box functions
- Simulation budget allocation for further enhancing the efficiency of ordinal optimization
- Bayesian look ahead one-stage sampling allocations for selection of the best population
- Regret bounds for sleeping experts and bandits
- Global optimization of stochastic black-box systems via sequential kriging meta-models
- The Knowledge-Gradient Policy for Correlated Normal Beliefs
- The Knowledge-Gradient Algorithm for Sequencing Experiments in Drug Discovery
- On Upper-Confidence Bound Policies for Switching Bandit Problems
- Selecting a Selection Procedure
- A Knowledge-Gradient Policy for Sequential Information Collection
- An analysis of approximations for maximizing submodular set functions—I
- Sample mean based index policies by O(log n) regret for the multi-armed bandit problem
- The Data-Correcting Algorithm for the Minimization of Supermodular Functions
- Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting
- Efficient Dynamic Simulation Allocation in Ordinal Optimization
- A Bayesian Approach to Some Best Population Problems
- Bandits With Heavy Tail
- Finite-time analysis of the multiarmed bandit problem