Multiscale Q-learning with linear function approximation (Q312650): Difference between revisions

@@ Property / cites work @@
+Reinforcement learning based algorithms for average cost Markov decision processes
+Normal rank
@@ Property / cites work @@
+Learning Algorithms for Markov Decision Processes with Average Cost
+Normal rank
@@ Property / cites work @@
+Q3324260
@@ Property / cites work: Q3324260 / rank @@
+Normal rank
@@ Property / cites work @@
+A simple dynamic routing problem
@@ Property / cites work: A simple dynamic routing problem / rank @@
+Normal rank
@@ Property / cites work @@
+Some Pathological Traps for Stochastic Approximation
+Normal rank
@@ Property / cites work @@
+Stochastic Approximations and Differential Inclusions
+Normal rank
@@ Property / cites work @@
+Stochastic Approximations and Differential Inclusions, Part II: Applications
+Normal rank
@@ Property / cites work @@
+Q3376698
@@ Property / cites work: Q3376698 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4257216
@@ Property / cites work: Q4257216 / rank @@
+Normal rank
@@ Property / cites work @@
+New algorithms of the Q-learning type
@@ Property / cites work: New algorithms of the Q-learning type / rank @@
+Normal rank
@@ Property / cites work @@
+Multiscale Stochastic Approximation for Parametric Optimization of Hidden Markov Models
+Normal rank
@@ Property / cites work @@
+Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences
+Normal rank
@@ Property / cites work @@
+A Simultaneous Perturbation Stochastic Approximation-Based Actor–Critic Algorithm for Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+Adaptive multivariate three-timescale stochastic approximation algorithms for simulation based optimization
+Normal rank
@@ Property / cites work @@
+Adaptive Newton-based multivariate smoothed functional algorithms for simulation optimization
+Normal rank
@@ Property / cites work @@
+Stochastic recursive algorithms for optimization. Simultaneous perturbation methods
+Normal rank
@@ Property / cites work @@
+Natural actor-critic algorithms
@@ Property / cites work: Natural actor-critic algorithms / rank @@
+Normal rank
@@ Property / cites work @@
+An online actor-critic algorithm with function approximation for constrained Markov decision processes
+Normal rank
@@ Property / cites work @@
+Q4858374
@@ Property / cites work: Q4858374 / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic approximation with two time scales
@@ Property / cites work: Stochastic approximation with two time scales / rank @@
+Normal rank
@@ Property / cites work @@
+Q3527701
@@ Property / cites work: Q3527701 / rank @@
+Normal rank
@@ Property / cites work @@
+The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
+Normal rank
@@ Property / cites work @@
+Recursive Stochastic Algorithms for Global Optimization in $\mathbb{R}^d$
+Normal rank
@@ Property / cites work @@
+Actor-Critic–Type Learning Algorithms for Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+On Actor-Critic Algorithms
@@ Property / cites work: On Actor-Critic Algorithms / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic approximation methods for constrained and unconstrained systems
+Normal rank
@@ Property / cites work @@
+Q4346705
@@ Property / cites work: Q4346705 / rank @@
+Normal rank
@@ Property / cites work @@
+Q-Learning with Linear Function Approximation
@@ Property / cites work: Q-Learning with Linear Function Approximation / rank @@
+Normal rank
@@ Property / cites work @@
+Nonconvergence to unstable points in urn models and stochastic approximations
+Normal rank
@@ Property / cites work @@
+Q4315289
@@ Property / cites work: Q4315289 / rank @@
+Normal rank
@@ Property / cites work @@
+Perturbation theory and finite Markov chains
@@ Property / cites work: Perturbation theory and finite Markov chains / rank @@
+Normal rank
@@ Property / cites work @@
+Average cost temporal-difference learning
@@ Property / cites work: Average cost temporal-difference learning / rank @@
+Normal rank
@@ Property / cites work @@
+Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
+Normal rank
@@ Property / cites work @@
+A one-measurement form of simultaneous perturbation stochastic approximation
+Normal rank
@@ Property / cites work @@
+Asynchronous stochastic approximation and Q-learning
+Normal rank
@@ Property / cites work @@
+An analysis of temporal-difference learning with function approximation
+Normal rank
@@ Property / cites work @@
+Q4714399
@@ Property / cites work: Q4714399 / rank @@
+Normal rank
@@ Property / cites work @@
+\({\mathcal Q}\)-learning
@@ Property / cites work: \({\mathcal Q}\)-learning / rank @@
+Normal rank
@@ Property / cites work @@
+On the optimal assignment of customers to parallel servers
+Normal rank