Pages that link to "Item:Q653803"
From MaRDI portal
The following pages link to UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem (Q653803):
Displaying 16 items.
- Batched bandit problems (Q282463)
- Modification of improved upper confidence bounds for regulating exploration in Monte-Carlo tree search (Q307787)
- The multi-armed bandit problem with covariates (Q355096)
- Trading utility and uncertainty: applying the value of information to resolve the exploration-exploitation dilemma in reinforcement learning (Q2094051)
- Ballooning multi-armed bandits (Q2238588)
- (Q4558161)
- (Q4558474)
- (Q4558552)
- Approximations of the Restless Bandit Problem (Q4633023)
- (Q4998863)
- A Bandit-Learning Approach to Multifidelity Approximation (Q5022495)
- (Q5149015)
- Explore First, Exploit Next: The True Shape of Regret in Bandit Problems (Q5219722)
- ASYMPTOTICALLY OPTIMAL MULTI-ARMED BANDIT POLICIES UNDER A COST CONSTRAINT (Q5358116)
- Transfer learning for contextual multi-armed bandits (Q6192325)
- Logarithmic regret bounds for continuous-time average-reward Markov decision processes (Q6608781)