Q-learning and policy iteration algorithms for stochastic shortest path problems (Q378731): Difference between revisions

@@ Property / DOI @@
-10.1007/s10479-012-1128-z
@@ Property / DOI: 10.1007/s10479-012-1128-z / rank @@
-Normal rank
@@ Property / Mathematics Subject Classification ID @@
+90C40
@@ Property / Mathematics Subject Classification ID: 90C40 / rank @@
+Normal rank
@@ Property / Mathematics Subject Classification ID @@
+90C39
@@ Property / Mathematics Subject Classification ID: 90C39 / rank @@
+Normal rank
@@ Property / zbMATH DE Number @@
+6225970
@@ Property / zbMATH DE Number: 6225970 / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+Markov decision processes
@@ Property / zbMATH Keywords: Markov decision processes / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+Q-learning
@@ Property / zbMATH Keywords: Q-learning / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+approximate dynamic programming
@@ Property / zbMATH Keywords: approximate dynamic programming / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+value iteration
@@ Property / zbMATH Keywords: value iteration / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+policy iteration
@@ Property / zbMATH Keywords: policy iteration / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+stochastic shortest paths
@@ Property / zbMATH Keywords: stochastic shortest paths / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+stochastic approximation
@@ Property / zbMATH Keywords: stochastic approximation / rank @@
+Normal rank
@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1007/s10479-012-1128-z
+Normal rank
@@ Property / OpenAlex ID @@
+W2027855416
@@ Property / OpenAlex ID: W2027855416 / rank @@
+Normal rank
@@ Property / Wikidata QID @@
+Q115147448
@@ Property / Wikidata QID: Q115147448 / rank @@
+Normal rank
@@ Property / cites work @@
+Asynchronous Iterative Methods for Multiprocessors
+Normal rank
@@ Property / cites work @@
+Distributed dynamic programming
@@ Property / cites work: Distributed dynamic programming / rank @@
+Normal rank
@@ Property / cites work @@
+Distributed asynchronous computation of fixed points
+Normal rank
@@ Property / cites work @@
+Neuro-Dynamic Programming: An Overview and Recent Results
+Normal rank
@@ Property / cites work @@
+Q4001523
@@ Property / cites work: Q4001523 / rank @@
+Normal rank
@@ Property / cites work @@
+An Analysis of Stochastic Shortest Path Problems
@@ Property / cites work: An Analysis of Stochastic Shortest Path Problems / rank @@
+Normal rank
@@ Property / cites work @@
+Q4257216
@@ Property / cites work: Q4257216 / rank @@
+Normal rank
@@ Property / cites work @@
+Projected equation methods for approximate solution of large linear systems
+Normal rank
@@ Property / cites work @@
+Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming
+Normal rank
@@ Property / cites work @@
+(Approximate) iterated successive approximations algorithm for sequential decision processes
+Normal rank
@@ Property / cites work @@
+A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
+Normal rank
@@ Property / cites work @@
+Finite state Markovian decision processes
@@ Property / cites work: Finite state Markovian decision processes / rank @@
+Normal rank
@@ Property / cites work @@
+On Stationary Strategies in Borel Dynamic Programming
+Normal rank
@@ Property / cites work @@
+On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
+Normal rank
@@ Property / cites work @@
+Q4315289
@@ Property / cites work: Q4315289 / rank @@
+Normal rank
@@ Property / cites work @@
+Asynchronous stochastic approximation and Q-learning
+Normal rank
@@ Property / cites work @@
+Q5477860
@@ Property / cites work: Q5477860 / rank @@
+Normal rank
@@ Property / cites work @@
+Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives
+Normal rank
@@ Property / cites work @@
+Discrete Dynamic Programming with Sensitive Discount Optimality Criteria
+Normal rank
@@ Property / cites work @@
+Q3698635
@@ Property / cites work: Q3698635 / rank @@
+Normal rank
@@ Property / cites work @@
+On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems
+Normal rank
@@ Property / DOI @@
+10.1007/S10479-012-1128-Z
@@ Property / DOI: 10.1007/S10479-012-1128-Z / rank @@
+Normal rank
@@ links / mardi / name / links / mardi / name @@
+Publication:378731