Natural actor-critic algorithms (Q1049136): Difference between revisions

@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1016/j.automatica.2009.07.008
+Normal rank
@@ Property / OpenAlex ID @@
+W2094387729
@@ Property / OpenAlex ID: W2094387729 / rank @@
+Normal rank
@@ Property / cites work @@
+Reinforcement learning based algorithms for average cost Markov decision processes
+Normal rank
@@ Property / cites work @@
+Learning Algorithms for Markov Decision Processes with Average Cost
+Normal rank
@@ Property / cites work @@
+Q4533362
@@ Property / cites work: Q4533362 / rank @@
+Normal rank
@@ Property / cites work @@
+Functional Approximations and Dynamic Programming
@@ Property / cites work: Functional Approximations and Dynamic Programming / rank @@
+Normal rank
@@ Property / cites work @@
+Q3997575
@@ Property / cites work: Q3997575 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4209222
@@ Property / cites work: Q4209222 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4001523
@@ Property / cites work: Q4001523 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4257216
@@ Property / cites work: Q4257216 / rank @@
+Normal rank
@@ Property / cites work @@
+A Simultaneous Perturbation Stochastic Approximation-Based Actor–Critic Algorithm for Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+Adaptive multivariate three-timescale stochastic approximation algorithms for simulation based optimization
+Normal rank
@@ Property / cites work @@
+Adaptive Newton-based multivariate smoothed functional algorithms for simulation optimization
+Normal rank
@@ Property / cites work @@
+Natural actor-critic algorithms
@@ Property / cites work: Natural actor-critic algorithms / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic approximation with two time scales
@@ Property / cites work: Stochastic approximation with two time scales / rank @@
+Normal rank
@@ Property / cites work @@
+The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
+Normal rank
@@ Property / cites work @@
+Some Pathological Traps for Stochastic Approximation
+Normal rank
@@ Property / cites work @@
+Q5477859
@@ Property / cites work: Q5477859 / rank @@
+Normal rank
@@ Property / cites work @@
+Perturbation realization, potentials, and sensitivity analysis of Markov processes
+Normal rank
@@ Property / cites work @@
+Elevator group control using multiple reinforcement learning agents
+Normal rank
@@ Property / cites work @@
+Q2810874
@@ Property / cites work: Q2810874 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3093234
@@ Property / cites work: Q3093234 / rank @@
+Normal rank
@@ Property / cites work @@
+Actor-Critic--Type Learning Algorithms for Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+On Actor-Critic Algorithms
@@ Property / cites work: On Actor-Critic Algorithms / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic approximation methods for constrained and unconstrained systems
+Normal rank
@@ Property / cites work @@
+Q4346705
@@ Property / cites work: Q4346705 / rank @@
+Normal rank
@@ Property / cites work @@
+.1162/1532443041827907
@@ Property / cites work: 10.1162/1532443041827907 / rank @@
+Normal rank
@@ Property / cites work @@
+Simulation-based optimization of Markov reward processes
+Normal rank
@@ Property / cites work @@
+Nonconvergence to unstable points in urn models and stochastic approximations
+Normal rank
@@ Property / cites work @@
+Q4315289
@@ Property / cites work: Q4315289 / rank @@
+Normal rank
@@ Property / cites work @@
+On the convergence of temporal-difference learning with linear function approximation
+Normal rank
@@ Property / cites work @@
+Asynchronous stochastic approximation and Q-learning
+Normal rank
@@ Property / cites work @@
+An analysis of temporal-difference learning with function approximation
+Normal rank
@@ Property / cites work @@
+Average cost temporal-difference learning
@@ Property / cites work: Average cost temporal-difference learning / rank @@
+Normal rank
@@ Property / cites work @@
+A Survey of Applications of Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+Q3724211
@@ Property / cites work: Q3724211 / rank @@
+Normal rank
@@ Property / cites work @@
+Simple statistical gradient-following algorithms for connectionist reinforcement learning
+Normal rank