Reinforcement learning algorithms with function approximation: recent advances and applications (Q903601): Difference between revisions

@@ Property / DOI @@
-10.1016/j.ins.2013.08.037
@@ Property / DOI: 10.1016/j.ins.2013.08.037 / rank @@
-Normal rank
@@ Property / cites work @@
+Model-free \(Q\)-learning designs for linear discrete-time zero-sum games with application to \(H^\infty\) control
+Normal rank
@@ Property / cites work @@
+10.1162/153244303768966085
@@ Property / cites work: 10.1162/153244303768966085 / rank @@
+Normal rank
@@ Property / cites work @@
+Adaptive-critic-based neural networks for aircraft optimal control
+Normal rank
@@ Property / cites work @@
+Recent advances in hierarchical reinforcement learning
+Normal rank
@@ Property / cites work @@
+Q4533362
@@ Property / cites work: Q4533362 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4257216
@@ Property / cites work: Q4257216 / rank @@
+Normal rank
@@ Property / cites work @@
+Natural actor-critic algorithms
@@ Property / cites work: Natural actor-critic algorithms / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic approximation with two time scales
@@ Property / cites work: Stochastic approximation with two time scales / rank @@
+Normal rank
@@ Property / cites work @@
+Q3527701
@@ Property / cites work: Q3527701 / rank @@
+Normal rank
@@ Property / cites work @@
+Technical update: Least-squares temporal difference learning
+Normal rank
@@ Property / cites work @@
+Q5477859
@@ Property / cites work: Q5477859 / rank @@
+Normal rank
@@ Property / cites work @@
+Elevator group control using multiple reinforcement learning agents
+Normal rank
@@ Property / cites work @@
+The convergence of \(TD(\lambda)\) for general \(\lambda\)
+Normal rank
@@ Property / cites work @@
+Q4527272
@@ Property / cites work: Q4527272 / rank @@
+Normal rank
@@ Property / cites work @@
+Integrating guidance into relational reinforcement learning
+Normal rank
@@ Property / cites work @@
+Q4797054
@@ Property / cites work: Q4797054 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3093261
@@ Property / cites work: Q3093261 / rank @@
+Normal rank
@@ Property / cites work @@
+Model selection in reinforcement learning
@@ Property / cites work: Model selection in reinforcement learning / rank @@
+Normal rank
@@ Property / cites work @@
+Graph kernels and Gaussian processes for relational reinforcement learning
+Normal rank
@@ Property / cites work @@
+Learning Theory and Kernel Machines
@@ Property / cites work: Learning Theory and Kernel Machines / rank @@
+Normal rank
@@ Property / cites work @@
+Reinforcement learning for long-run average cost.
@@ Property / cites work: Reinforcement learning for long-run average cost. / rank @@
+Normal rank
@@ Property / cites work @@
+Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming
+Normal rank
@@ Property / cites work @@
+Q3174169
@@ Property / cites work: Q3174169 / rank @@
+Normal rank
@@ Property / cites work @@
+Markov decision processes with their applications
@@ Property / cites work: Markov decision processes with their applications / rank @@
+Normal rank
@@ Property / cites work @@
+On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
+Normal rank
@@ Property / cites work @@
+Hybrid least-squares algorithms for approximate policy evaluation
+Normal rank
@@ Property / cites work @@
+Policy search for motor primitives in robotics
@@ Property / cites work: Policy search for motor primitives in robotics / rank @@
+Normal rank
@@ Property / cites work @@
+On Actor-Critic Algorithms
@@ Property / cites work: On Actor-Critic Algorithms / rank @@
+Normal rank
@@ Property / cites work @@
+10.1162/1532443041827907
@@ Property / cites work: 10.1162/1532443041827907 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3174155
@@ Property / cites work: Q3174155 / rank @@
+Normal rank
@@ Property / cites work @@
+Least squares policy evaluation algorithms with linear function approximation
+Normal rank
@@ Property / cites work @@
+Adaptive stock trading with dynamic asset allocation using reinforcement learning
+Normal rank
@@ Property / cites work @@
+Kernel-based reinforcement learning
@@ Property / cites work: Kernel-based reinforcement learning / rank @@
+Normal rank
@@ Property / cites work @@
+Approximate Dynamic Programming
@@ Property / cites work: Approximate Dynamic Programming / rank @@
+Normal rank
@@ Property / cites work @@
+Convergence results for single-step on-policy reinforcement-learning algorithms
+Normal rank
@@ Property / cites work @@
+An upper bound on the loss from approximate optimal-value functions
+Normal rank
@@ Property / cites work @@
+Instrumental variable methods for system identification
+Normal rank
@@ Property / cites work @@
+Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
+Normal rank
@@ Property / cites work @@
+Algorithms for Reinforcement Learning
@@ Property / cites work: Algorithms for Reinforcement Learning / rank @@
+Normal rank
@@ Property / cites work @@
+Asynchronous stochastic approximation and Q-learning
+Normal rank
@@ Property / cites work @@
+An analysis of temporal-difference learning with function approximation
+Normal rank
@@ Property / cites work @@
+Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
+Normal rank
@@ Property / cites work @@
+Multi-player non-zero-sum games: online adaptive learning solution of coupled Hamilton-Jacobi equations
+Normal rank
@@ Property / cites work @@
+Q4261789
@@ Property / cites work: Q4261789 / rank @@
+Normal rank
@@ Property / cites work @@
+Adaptive optimal control for continuous-time linear systems based on policy iteration
+Normal rank
@@ Property / cites work @@
+Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
+Normal rank
@@ Property / cites work @@
+\({\mathcal Q}\)-learning
@@ Property / cites work: \({\mathcal Q}\)-learning / rank @@
+Normal rank
@@ Property / cites work @@
+Punish/Reward: Learning with a Critic in Adaptive Threshold Systems
+Normal rank
@@ Property / cites work @@
+Q4533350
@@ Property / cites work: Q4533350 / rank @@
+Normal rank
@@ Property / cites work @@
+Robot learning with GA-based fuzzy reinforcement learning agents
+Normal rank
@@ Property / DOI @@
+10.1016/J.INS.2013.08.037
@@ Property / DOI: 10.1016/J.INS.2013.08.037 / rank @@
+Normal rank