scientific article

From MaRDI portal

Publication:2810878

zbMath: 1360.62030
arXiv: 1403.5341
MaRDI QID: Q2810878

Daniel J. Russo, Benjamin van Roy

Publication date: 6 June 2016

Full work available at URL: https://arxiv.org/abs/1403.5341

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.



Related Items (23)

Generalizations of maximal inequalities to arbitrary selection rules
Bandit Theory: Applications to Learning Healthcare Systems and Clinical Trials
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning
Probabilistic bisection with spatial metamodels
A Bayesian approach to (online) transfer learning: theory and algorithms
Information theory for ranking and selection
Reward Maximization Through Discrete Active Inference
Reinforcement Learning, Bit by Bit
Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection
Foraging decisions as multi-armed bandit problems: applying reinforcement learning algorithms to foraging data
Nonstationary Bandits with Habituation and Recovery Dynamics
Exploratory distributions for convex functions
Multi-Armed Bandit for Species Discovery: A Bayesian Nonparametric Approach
Improved regret for zeroth-order adversarial bandit convex optimisation
Adaptive policies for perimeter surveillance problems
Learning to Optimize via Information-Directed Sampling
Efficient Simulation of High Dimensional Gaussian Vectors
Derivative-free optimization methods
On the Prior Sensitivity of Thompson Sampling
Matching While Learning
Dismemberment and design for controlling the replication variance of regret for the multi-armed bandit
Satisficing in Time-Sensitive Bandit Learning
Entropy Regularization for Mean Field Games with Learning





