Pages that link to "Item:Q4862097"

From MaRDI portal



The following pages link to Sample mean based index policies by <i>O</i>(log <i>n</i>) regret for the multi-armed bandit problem (Q4862097):

Displayed 15 items.

Geiringer theorems: from population genetics to computational intelligence, memory evolutive systems and Hebbian learning (Q269771)
Wisdom of crowds versus groupthink: learning in groups and in isolation (Q361811)
Kullback-Leibler upper confidence bounds for optimal sequential allocation (Q366995)
Exploration and exploitation of scratch games (Q374139)
Robustness of stochastic bandit policies (Q391739)
An asymptotically optimal policy for finite support models in the multiarmed bandit problem (Q415624)
UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem (Q653803)
Boundary crossing probabilities for general exponential families (Q722599)
Exploration-exploitation tradeoff using variance estimates in multi-armed bandits (Q1017665)
On Bayesian index policies for sequential resource allocation (Q1750289)
Efficient crowdsourcing of unknown experts using bounded multi-armed bandits (Q2014933)
Tuning Bandit Algorithms in Stochastic Environments (Q3520056)
(Q4558161)
Learning the distribution with largest mean: two bandit frameworks (Q4606431)
Finite-Time Analysis for the Knowledge-Gradient Policy (Q4610155)
