Exploration-exploitation tradeoff using variance estimates in multi-armed bandits (Q1017665): Difference between revisions

@@ Property / full work available at URL @@
+https://doi.org/10.1016/j.tcs.2009.01.016
+Normal rank
@@ Property / OpenAlex ID @@
+W2142971854
@@ Property / OpenAlex ID: W2142971854 / rank @@
+Normal rank
@@ Property / cites work @@
+Sample mean based index policies by O(log n) regret for the multi-armed bandit problem
+Normal rank
@@ Property / cites work @@
+Finite-time analysis of the multiarmed bandit problem
+Normal rank
@@ Property / cites work @@
+On tail probabilities for martingales
@@ Property / cites work: On tail probabilities for martingales / rank @@
+Normal rank
@@ Property / cites work @@
+Q4692329
@@ Property / cites work: Q4692329 / rank @@
+Normal rank
@@ Property / cites work @@
+Probability Inequalities for Sums of Bounded Random Variables
+Normal rank
@@ Property / cites work @@
+Asymptotically efficient adaptive allocation rules
+Normal rank
@@ Property / cites work @@
+Machine learning and nonparametric bandit theory
@@ Property / cites work: Machine learning and nonparametric bandit theory / rank @@
+Normal rank
@@ Property / cites work @@
+Some aspects of the sequential design of experiments
+Normal rank