Pages that link to "Item:Q5219722"
From MaRDI portal
The following pages link to Explore First, Exploit Next: The True Shape of Regret in Bandit Problems (Q5219722):
- Multi-objective multi-armed bandit with lexicographically ordered and satisficing objectives (Q2051318)
- Fano's inequality for random variables (Q2218038)
- Asymptotically optimal algorithms for budgeted multiple play bandits (Q2331676)
- Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models (Q4987192)
- (Q4998881)
- EXPLORATION–EXPLOITATION POLICIES WITH ALMOST SURE, ARBITRARILY SLOW GROWING ASYMPTOTIC REGRET (Q5070864)