Multi-armed bandits based on a variant of simulated annealing (Q2520136): Difference between revisions

@@ Property / full work available at URL @@
+https://doi.org/10.1007/s13226-016-0184-5
+Normal rank
@@ Property / OpenAlex ID @@
+W2475275076
@@ Property / OpenAlex ID: W2475275076 / rank @@
+Normal rank
@@ Property / cites work @@
+Finite-time analysis of the multiarmed bandit problem
+Normal rank
@@ Property / cites work @@
+The Nonstochastic Multiarmed Bandit Problem
@@ Property / cites work: The Nonstochastic Multiarmed Bandit Problem / rank @@
+Normal rank
@@ Property / cites work @@
+The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
+Normal rank
@@ Property / cites work @@
+Stochastic approximation. A dynamical systems viewpoint.
+Normal rank
@@ Property / cites work @@
+Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
+Normal rank
@@ Property / cites work @@
+An Adaptive Sampling Algorithm for Solving Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming
+Normal rank
@@ Property / cites work @@
+The Irrevocable Multiarmed Bandit Problem
@@ Property / cites work: The Irrevocable Multiarmed Bandit Problem / rank @@
+Normal rank
@@ Property / cites work @@
+Adaptive game playing using multiplicative weights
+Normal rank