New algorithms of the Q-learning type (Q2440701): Difference between revisions

@@ Property / cites work @@
+Q4257216
@@ Property / cites work: Q4257216 / rank @@
+Normal rank
@@ Property / cites work @@
+Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences
+Normal rank
@@ Property / cites work @@
+Stochastic approximation with two time scales
@@ Property / cites work: Stochastic approximation with two time scales / rank @@
+Normal rank
@@ Property / cites work @@
+Actor-Critic--Type Learning Algorithms for Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
+Normal rank
@@ Property / cites work @@
+A one-measurement form of simultaneous perturbation stochastic approximation
+Normal rank
@@ Property / cites work @@
+Asynchronous stochastic approximation and Q-learning
+Normal rank
@@ Property / cites work @@
+\({\mathcal Q}\)-learning
@@ Property / cites work: \({\mathcal Q}\)-learning / rank @@
+Normal rank