The following pages link to (Q3046711):
Displaying 21 items.
- Approximation algorithms for stochastic combinatorial optimization problems (Q290321)
- An asymptotically optimal policy for finite support models in the multiarmed bandit problem (Q415624)
- An analysis of model-based interval estimation for Markov decision processes (Q959899)
- A PAC algorithm in relative precision for bandit problem with costly sampling (Q2084297)
- Trading utility and uncertainty: applying the value of information to resolve the exploration-exploitation dilemma in reinforcement learning (Q2094051)
- Sequential estimation of quantiles with applications to A/B testing and best-arm identification (Q2137037)
- Pure exploration in finitely-armed and continuous-armed bandits (Q2431430)
- Efficient PAC learning for episodic tasks with acyclic state spaces (Q2465672)
- Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm (Q2514758)
- Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback (Q3386400)
- Bayesian Incentive-Compatible Bandit Exploration (Q3387959)
- Adaptive Incentive-Compatible Sponsored Search Auction (Q3599081)
- Pure Exploration in Multi-armed Bandits Problems (Q3648740)
- (Q4998871)
- Knockout-Tournament Procedures for Large-Scale Ranking and Selection in Parallel Computing Environments (Q5031020)
- Solving Large-Scale Fixed-Budget Ranking and Selection Problems (Q5060777)
- Always Valid Inference: Continuous Monitoring of A/B Tests (Q5095177)
- Amplification and Derandomization without Slowdown (Q5129234)
- Tractable Sampling Strategies for Ordinal Optimization (Q5131546)
- Simple Bayesian Algorithms for Best-Arm Identification (Q5144786)
- Best Arm Identification for Contaminated Bandits (Q5214178)