Learning to Optimize via Posterior Sampling (Q5247618): Difference between revisions

@@ Property / OpenAlex ID @@
+W2149721706
@@ Property / OpenAlex ID: W2149721706 / rank @@
+Normal rank
@@ Property / arXiv ID @@
+1301.2609
@@ Property / arXiv ID: 1301.2609 / rank @@
+Normal rank
@@ Property / cites work @@
+Near-Optimal Regret Bounds for Thompson Sampling
@@ Property / cites work: Near-Optimal Regret Bounds for Thompson Sampling / rank @@
+Normal rank
@@ Property / cites work @@
+Finite-time analysis of the multiarmed bandit problem
@@ Property / cites work: Finite-time analysis of the multiarmed bandit problem / rank @@
+Normal rank
@@ Property / cites work @@
+Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
@@ Property / cites work: Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems / rank @@
+Normal rank
@@ Property / cites work @@
+Q5396654
@@ Property / cites work: Q5396654 / rank @@
+Normal rank
@@ Property / cites work @@
+Kullback-Leibler upper confidence bounds for optimal sequential allocation
@@ Property / cites work: Kullback-Leibler upper confidence bounds for optimal sequential allocation / rank @@
+Normal rank
@@ Property / cites work @@
+Q5302093
@@ Property / cites work: Q5302093 / rank @@
+Normal rank
@@ Property / cites work @@
+Adaptive treatment allocation and the multi-armed bandit problem
@@ Property / cites work: Adaptive treatment allocation and the multi-armed bandit problem / rank @@
+Normal rank
@@ Property / cites work @@
+Asymptotically efficient adaptive allocation rules
@@ Property / cites work: Asymptotically efficient adaptive allocation rules / rank @@
+Normal rank
@@ Property / cites work @@
+Q5405185
@@ Property / cites work: Q5405185 / rank @@
+Normal rank
@@ Property / cites work @@
+Linearly Parameterized Bandits
@@ Property / cites work: Linearly Parameterized Bandits / rank @@
+Normal rank
@@ Property / cites work @@
+The Knowledge Gradient Algorithm for a General Class of Online Learning Problems
@@ Property / cites work: The Knowledge Gradient Algorithm for a General Class of Online Learning Problems / rank @@
+Normal rank
@@ Property / cites work @@
+Computationally Related Problems
@@ Property / cites work: Computationally Related Problems / rank @@
+Normal rank
@@ Property / cites work @@
+Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting
@@ Property / cites work: Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting / rank @@
+Normal rank