The following pages link to (Q3093383):
Displayed 33 items.
- Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization (Q72746) (← links)
- The multi-armed bandit problem with covariates (Q355096) (← links)
- Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model (Q399890) (← links)
- The \(K\)-armed dueling bandits problem (Q440003) (← links)
- UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem (Q653803) (← links)
- Multi-armed bandits with episode context (Q766259) (← links)
- An analysis of model-based interval estimation for Markov decision processes (Q959899) (← links)
- A quality assuring, cost optimal multi-armed bandit mechanism for expertsourcing (Q1690964) (← links)
- Adaptive racing ranking-based immune optimization approach solving multi-objective expected value programming (Q1797827) (← links)
- Rollout sampling approximate policy iteration (Q2036256) (← links)
- Choosing the best arm with guaranteed confidence (Q2096406) (← links)
- The pure exploration problem with general reward functions depending on full distributions (Q2102381) (← links)
- Dynamic pricing with finite price sets: a non-parametric approach (Q2238754) (← links)
- Active ranking from pairwise comparisons and when parametric assumptions do not help (Q2284367) (← links)
- A bad arm existence checking problem: how to utilize asymmetric problem structure? (Q2303673) (← links)
- Good arm identification via bandit feedback (Q2425222) (← links)
- Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback (Q3386400) (← links)
- Bayesian Incentive-Compatible Bandit Exploration (Q3387959) (← links)
- Learning the distribution with largest mean: two bandit frameworks (Q4606431) (← links)
- (Q4633073) (← links)
- Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models (Q4987192) (← links)
- (Q4993317) (← links)
- (Q4998871) (← links)
- (Q5053268) (← links)
- Posterior-Based Stopping Rules for Bayesian Ranking-and-Selection Procedures (Q5087734) (← links)
- Tractable Sampling Strategies for Ordinal Optimization (Q5131546) (← links)
- Best Arm Identification for Contaminated Bandits (Q5214178) (← links)
- Randomized allocation with arm elimination in a bandit problem with covariates (Q5965323) (← links)
- Learning approximately optimal contracts (Q6069846) (← links)
- Variable Selection Via Thompson Sampling (Q6107208) (← links)
- Learning approximately optimal contracts (Q6109528) (← links)
- Exploratory machine learning with unknown unknowns (Q6152674) (← links)
- Optimal Scheduling of Entropy Regularizer for Continuous-Time Linear-Quadratic Reinforcement Learning (Q6180253) (← links)