Multi-armed bandits based on a variant of simulated annealing (Q2520136): Difference between revisions

@@ Property / cites work @@
+Finite-time analysis of the multiarmed bandit problem
+Normal rank
@@ Property / cites work @@
+The Nonstochastic Multiarmed Bandit Problem
@@ Property / cites work: The Nonstochastic Multiarmed Bandit Problem / rank @@
+Normal rank
@@ Property / cites work @@
+The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
+Normal rank
@@ Property / cites work @@
+Stochastic approximation. A dynamical systems viewpoint.
+Normal rank
@@ Property / cites work @@
+Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
+Normal rank
@@ Property / cites work @@
+An Adaptive Sampling Algorithm for Solving Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming
+Normal rank
@@ Property / cites work @@
+The Irrevocable Multiarmed Bandit Problem
@@ Property / cites work: The Irrevocable Multiarmed Bandit Problem / rank @@
+Normal rank
@@ Property / cites work @@
+Adaptive game playing using multiplicative weights
+Normal rank