Error bounds for constant step-size \(Q\)-learning (Q1932736): Difference between revisions

@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1016/j.sysconle.2012.08.014
@@ Property / full work available at URL: https://doi.org/10.1016/j.sysconle.2012.08.014 / rank @@
+Normal rank
@@ Property / OpenAlex ID @@
+W1999254175
@@ Property / OpenAlex ID: W1999254175 / rank @@
+Normal rank
@@ Property / cites work @@
+\({\mathcal Q}\)-learning
@@ Property / cites work: \({\mathcal Q}\)-learning / rank @@
+Normal rank
@@ Property / cites work @@
+Asynchronous stochastic approximation and Q-learning
@@ Property / cites work: Asynchronous stochastic approximation and Q-learning / rank @@
+Normal rank
@@ Property / cites work @@
+On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
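@@ Property / cites work: On the Convergence of Stochastic Iterative Dynamic Programming Algorithms / rank @@
+Normal rank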
@@ Property / cites work @@
+Q3093180
@@ Property / cites work: Q3093180 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4346705
@@ Property / cites work: Q4346705 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3527701
@@ Property / cites work: Q3527701 / rank @@
+Normal rank
@@ Property / cites work @@
+The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
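@@ Property / cites work: The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning / rank @@
+Normal rank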
@@ Property / cites work @@
+Boundedness of iterates in \(Q\)-learning
@@ Property / cites work: Boundedness of iterates in \(Q\)-learning / rank @@
+Normal rank
@@ Property / cites work @@
+Q-learning and policy iteration algorithms for stochastic shortest path problems
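@@ Property / cites work: Q-learning and policy iteration algorithms for stochastic shortest path problems / rank @@
+Normal rank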
@@ Property / cites work @@
+Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming
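@@ Property / cites work: Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming / rank @@
+Normal rank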