Pages that link to "Item:Q4862097"

The following pages link to Sample mean based index policies by \(O(\log n)\) regret for the multi-armed bandit problem (Q4862097):

Displaying 33 items.

Geiringer theorems: from population genetics to computational intelligence, memory evolutive systems and Hebbian learning (Q269771)
Wisdom of crowds versus groupthink: learning in groups and in isolation (Q361811)
Kullback-Leibler upper confidence bounds for optimal sequential allocation (Q366995)
Exploration and exploitation of scratch games (Q374139)
Robustness of stochastic bandit policies (Q391739)
An asymptotically optimal policy for finite support models in the multiarmed bandit problem (Q415624)
UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem (Q653803)
Boundary crossing probabilities for general exponential families (Q722599)
A non-parametric solution to the multi-armed bandit problem with covariates (Q826996)
Exploration-exploitation tradeoff using variance estimates in multi-armed bandits (Q1017665)
On Bayesian index policies for sequential resource allocation (Q1750289)
Efficient crowdsourcing of unknown experts using bounded multi-armed bandits (Q2014933)
An online algorithm for the risk-aware restless bandit (Q2029383)
A revised approach for risk-averse multi-armed bandits under CVaR criterion (Q2060576)
Gittins' theorem under uncertainty (Q2076662)
Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040)
The multi-armed bandit problem: an efficient nonparametric solution (Q2176624)
How fragile are information cascades? (Q2240475)
Tuning Bandit Algorithms in Stochastic Environments (Q3520056)
(Q4558161)
Learning the distribution with largest mean: two bandit frameworks (Q4606431)
Finite-Time Analysis for the Knowledge-Gradient Policy (Q4610155)
Nonasymptotic Analysis of Monte Carlo Tree Search (Q5060499)
Optimistic Gittins Indices (Q5060515)
Infinite Arms Bandit: Optimality via Confidence Bounds (Q5089465)
Continuous Assortment Optimization with Logit Choice Probabilities and Incomplete Information (Q5095163)
Derivative-free optimization methods (Q5230522)
Functional Sequential Treatment Allocation (Q5881136)
Dealing with expert bias in collective decision-making (Q6103665)
Convergence rate analysis for optimal computing budget allocation algorithms (Q6110297)
Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems (Q6167036)
A confirmation of a conjecture on Feldman’s two-armed bandit problem (Q6198964)
Factorial Designs for Online Experiments (Q6631856)