Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm (Q300040): Difference between revisions

@@ Property / cites work @@
+Time aggregated Markov decision processes via standard dynamic programming
+Normal rank
@@ Property / cites work @@
+Approximate dynamic programming via direct search in the space of value function approximations
+Normal rank
@@ Property / cites work @@
+Stability and optimality of a multi-product production and storage system under demand uncertainty
+Normal rank
@@ Property / cites work @@
+Accelerating the convergence of value iteration by using partial transition functions
+Normal rank
@@ Property / cites work @@
+A New Value Iteration method for the Average Cost Dynamic Programming Problem
+Normal rank
@@ Property / cites work @@
+Q2925454
@@ Property / cites work: Q2925454 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4257216
@@ Property / cites work: Q4257216 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4256521
@@ Property / cites work: Q4256521 / rank @@
+Normal rank
@@ Property / cites work @@
+Approximate dynamic programming with a fuzzy parameterization
+Normal rank
@@ Property / cites work @@
+A time aggregation approach to Markov decision processes
+Normal rank
@@ Property / cites work @@
+Simulation-based algorithms for Markov decision processes.
+Normal rank
@@ Property / cites work @@
+Sufficient Classes of Strategies in Discrete Dynamic Programming I: Decomposition of Randomized Strategies and Embedded Models
+Normal rank
@@ Property / cites work @@
+LAO*: A heuristic search algorithm that finds solutions with loops
+Normal rank
@@ Property / cites work @@
+Probabilistic Relational Planning with First Order Decision Diagrams
+Normal rank
@@ Property / cites work @@
+Exact finite approximations of average-cost countable Markov decision processes
+Normal rank
@@ Property / cites work @@
+Reducing reinforcement learning to KWIK online regression
+Normal rank
@@ Property / cites work @@
+Kernel-based reinforcement learning
@@ Property / cites work: Kernel-based reinforcement learning / rank @@
+Normal rank
@@ Property / cites work @@
+A Distributed Actor-Critic Algorithm and Applications to Mobile Sensor Network Coordination Problems
+Normal rank
@@ Property / cites work @@
+Approximate Dynamic Programming
@@ Property / cites work: Approximate Dynamic Programming / rank @@
+Normal rank
@@ Property / cites work @@
+Q4315289
@@ Property / cites work: Q4315289 / rank @@
+Normal rank
@@ Property / cites work @@
+Markov decision Processes with fractional costs
@@ Property / cites work: Markov decision Processes with fractional costs / rank @@
+Normal rank
@@ Property / cites work @@
+Incremental Value Iteration for Time-Aggregated Markov-Decision Processes
+Normal rank
@@ Property / cites work @@
+An analysis of temporal-difference learning with function approximation
+Normal rank
@@ Property / cites work @@
+Lebesgue-Sampling-Based Optimal Control Problems With Time Aggregation
+Normal rank
@@ Property / cites work @@
+Performance gradient estimation for the very large finite Markov chains
+Normal rank