Batch mode reinforcement learning based on the synthesis of artificial trajectories (Q378762): Difference between revisions

@@ Property / author @@
-Susan A. Murphy
@@ Property / author: Susan A. Murphy / rank @@
-Normal rank
@@ Property / author @@
+Susan A. Murphy
@@ Property / author: Susan A. Murphy / rank @@
+Normal rank
@@ Property / describes a project that uses @@
+Approxrl
@@ Property / describes a project that uses: Approxrl / rank @@
+Normal rank
@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / OpenAlex ID @@
+W2134689794
@@ Property / OpenAlex ID: W2134689794 / rank @@
+Normal rank
@@ Property / Wikidata QID @@
+Q42258641
@@ Property / Wikidata QID: Q42258641 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3241581
@@ Property / cites work: Q3241581 / rank @@
+Normal rank
@@ Property / cites work @@
+Technical update: Least-squares temporal difference learning
@@ Property / cites work: Technical update: Least-squares temporal difference learning / rank @@
+Normal rank
@@ Property / cites work @@
+Q5477859
@@ Property / cites work: Q5477859 / rank @@
+Normal rank
@@ Property / cites work @@
+Machine Learning: ECML 2003
@@ Property / cites work: Machine Learning: ECML 2003 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3093261
@@ Property / cites work: Q3093261 / rank @@
+Normal rank
@@ Property / cites work @@
+Towards Min Max Generalization in Reinforcement Learning
@@ Property / cites work: Towards Min Max Generalization in Reinforcement Learning / rank @@
+Normal rank
@@ Property / cites work @@
+.1162/1532443041827907
@@ Property / cites work: 10.1162/1532443041827907 / rank @@
+Normal rank
@@ Property / cites work @@
+Q5405216
@@ Property / cites work: Q5405216 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3096132
@@ Property / cites work: Q3096132 / rank @@
+Normal rank
@@ Property / cites work @@
+Optimal Dynamic Treatment Regimes
@@ Property / cites work: Optimal Dynamic Treatment Regimes / rank @@
+Normal rank
@@ Property / cites work @@
+Marginal Mean Models for Dynamic Regimes
@@ Property / cites work: Marginal Mean Models for Dynamic Regimes / rank @@
+Normal rank
@@ Property / cites work @@
+Least squares policy evaluation algorithms with linear function approximation
@@ Property / cites work: Least squares policy evaluation algorithms with linear function approximation / rank @@
+Normal rank
@@ Property / cites work @@
+Kernel-based reinforcement learning
@@ Property / cites work: Kernel-based reinforcement learning / rank @@
+Normal rank
@@ Property / cites work @@
+A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect
@@ Property / cites work: A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect / rank @@
+Normal rank