On the existence of fixed points for approximate value iteration and temporal-difference learning (Q1586803): Difference between revisions

@@ Property / author @@
+Benjamin van Roy
@@ Property / author: Benjamin van Roy / rank @@
+Normal rank
@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / cites work @@
+Functional Approximations and Dynamic Programming
@@ Property / cites work: Functional Approximations and Dynamic Programming / rank @@
+Normal rank
@@ Property / cites work @@
+An analysis of temporal-difference learning with function approximation
+Normal rank
@@ Property / cites work @@
+The convergence of \(TD(\lambda)\) for general \(\lambda\)
+Normal rank
@@ Property / cites work @@
+Q4886156
@@ Property / cites work: Q4886156 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3997575
@@ Property / cites work: Q3997575 / rank @@
+Normal rank