The Knowledge Gradient Algorithm for a General Class of Online Learning Problems

From MaRDI portal

Publication:2892224

DOI: 10.1287/OPRE.1110.0999
zbMath: 1241.90201
OpenAlex: W2069034916
Wikidata: Q131617233
Scholia: Q131617233
MaRDI QID: Q2892224

Ilya O. Ryzhov, Peter I. Frazier, Warren B. Powell

Publication date: 18 June 2012

Published in: Operations Research

Full work available at URL: https://doi.org/10.1287/opre.1110.0999




Related Items (24)

Predictive stochastic programming
Learning Manipulation Through Information Dissemination
Perspectives of approximate dynamic programming
Bandit Theory: Applications to Learning Healthcare Systems and Clinical Trials
Convergence rate analysis for optimal computing budget allocation algorithms
On the Convergence Rates of Expected Improvement Methods
Reinforcement Learning, Bit by Bit
ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS
A Knowledge Gradient Policy for Sequencing Experiments to Identify the Structure of RNA Molecules Using a Sparse Additive Belief Model
Nonstationary Bandits with Habituation and Recovery Dynamics
Optimal Online Learning for Nonlinear Belief Models Using Discrete Priors
Learning in Combinatorial Optimization: What and How to Explore
Simple Bayesian Algorithms for Best-Arm Identification
Managing mobile production-inventory systems influenced by a modulation process
Choosing a good toolkit. II: Bayes-rule based heuristics
Optimal learning with non-Gaussian rewards
Optimal learning for sequential sampling with non-parametric beliefs
Optimal learning with a local parametric belief model
Learning to Optimize via Information-Directed Sampling
Bayesian Exploration for Approximate Dynamic Programming
Variance Regularization in Sequential Bayesian Optimization
Learning to Optimize via Posterior Sampling
Satisficing in Time-Sensitive Bandit Learning
Dynamic decision making for graphical models applied to oil exploration






