Convergence results for single-step on-policy reinforcement-learning algorithms (Q1568533): Difference between revisions

@@ Property / author @@
+Satinder Pal Singh
@@ Property / author: Satinder Pal Singh / rank @@
+Normal rank
@@ Property / author @@
+Tommi S. Jaakkola
@@ Property / author: Tommi S. Jaakkola / rank @@
+Normal rank
@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1023/a:1007678930559
@@ Property / full work available at URL: https://doi.org/10.1023/a:1007678930559 / rank @@
+Normal rank
@@ Property / OpenAlex ID @@
+W2150339816
@@ Property / OpenAlex ID: W2150339816 / rank @@
+Normal rank