Pages that link to "Item:Q2892224"
From MaRDI portal
The following pages link to The Knowledge Gradient Algorithm for a General Class of Online Learning Problems (Q2892224):
Displaying 23 items.
- Perspectives of approximate dynamic programming (Q333093)
- Optimal learning for sequential sampling with non-parametric beliefs (Q742143)
- Optimal learning with a local parametric belief model (Q746825)
- Predictive stochastic programming (Q2127363)
- Managing mobile production-inventory systems influenced by a modulation process (Q2241563)
- Choosing a good toolkit. II: Bayes-rule based heuristics (Q2291802)
- Dynamic decision making for graphical models applied to oil exploration (Q2356038)
- Optimal learning with non-Gaussian rewards (Q2806349)
- On the Convergence Rates of Expected Improvement Methods (Q2957473)
- Variance Regularization in Sequential Bayesian Optimization (Q3387910)
- Learning to Optimize via Information-Directed Sampling (Q4969321)
- Bayesian Exploration for Approximate Dynamic Programming (Q4971589)
- Learning Manipulation Through Information Dissemination (Q5060519)
- Bandit Theory: Applications to Learning Healthcare Systems and Clinical Trials (Q5072150)
- A Knowledge Gradient Policy for Sequencing Experiments to Identify the Structure of RNA Molecules Using a Sparse Additive Belief Model (Q5137960)
- Nonstationary Bandits with Habituation and Recovery Dynamics (Q5144777)
- Optimal Online Learning for Nonlinear Belief Models Using Discrete Priors (Q5144779)
- Learning in Combinatorial Optimization: What and How to Explore (Q5144784)
- Simple Bayesian Algorithms for Best-Arm Identification (Q5144786)
- Learning to Optimize via Posterior Sampling (Q5247618)
- ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS (Q5358114)
- Satisficing in Time-Sensitive Bandit Learning (Q5870357)
- Convergence rate analysis for optimal computing budget allocation algorithms (Q6110297)