Pages that link to "Item:Q4862097"
From MaRDI portal
The following pages link to Sample mean based index policies by \(O(\log n)\) regret for the multi-armed bandit problem (Q4862097):
Displaying 32 items.
- Geiringer theorems: from population genetics to computational intelligence, memory evolutive systems and Hebbian learning (Q269771) (← links)
- Wisdom of crowds versus groupthink: learning in groups and in isolation (Q361811) (← links)
- Kullback-Leibler upper confidence bounds for optimal sequential allocation (Q366995) (← links)
- Exploration and exploitation of scratch games (Q374139) (← links)
- Robustness of stochastic bandit policies (Q391739) (← links)
- An asymptotically optimal policy for finite support models in the multiarmed bandit problem (Q415624) (← links)
- UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem (Q653803) (← links)
- Boundary crossing probabilities for general exponential families (Q722599) (← links)
- A non-parametric solution to the multi-armed bandit problem with covariates (Q826996) (← links)
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits (Q1017665) (← links)
- On Bayesian index policies for sequential resource allocation (Q1750289) (← links)
- Efficient crowdsourcing of unknown experts using bounded multi-armed bandits (Q2014933) (← links)
- An online algorithm for the risk-aware restless bandit (Q2029383) (← links)
- A revised approach for risk-averse multi-armed bandits under CVaR criterion (Q2060576) (← links)
- Gittins' theorem under uncertainty (Q2076662) (← links)
- Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040) (← links)
- The multi-armed bandit problem: an efficient nonparametric solution (Q2176624) (← links)
- How fragile are information cascades? (Q2240475) (← links)
- Tuning Bandit Algorithms in Stochastic Environments (Q3520056) (← links)
- (Q4558161) (← links)
- Learning the distribution with largest mean: two bandit frameworks (Q4606431) (← links)
- Finite-Time Analysis for the Knowledge-Gradient Policy (Q4610155) (← links)
- Nonasymptotic Analysis of Monte Carlo Tree Search (Q5060499) (← links)
- Optimistic Gittins Indices (Q5060515) (← links)
- Infinite Arms Bandit: Optimality via Confidence Bounds (Q5089465) (← links)
- Continuous Assortment Optimization with Logit Choice Probabilities and Incomplete Information (Q5095163) (← links)
- Derivative-free optimization methods (Q5230522) (← links)
- Functional Sequential Treatment Allocation (Q5881136) (← links)
- Dealing with expert bias in collective decision-making (Q6103665) (← links)
- Convergence rate analysis for optimal computing budget allocation algorithms (Q6110297) (← links)
- Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems (Q6167036) (← links)
- A confirmation of a conjecture on Feldman’s two-armed bandit problem (Q6198964) (← links)