Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm (Q300040): Difference between revisions

@@ Property / DOI @@
-10.1016/j.ejor.2014.08.023
@@ Property / DOI: 10.1016/j.ejor.2014.08.023 / rank @@
-Normal rank
@@ Property / author @@
-Edilson F. Arruda
@@ Property / author: Edilson F. Arruda / rank @@
-Normal rank
@@ Property / author @@
-Marcelo Dutra Fragoso
@@ Property / author: Marcelo Dutra Fragoso / rank @@
-Normal rank
@@ Property / Mathematics Subject Classification ID @@
+90C40
@@ Property / Mathematics Subject Classification ID: 90C40 / rank @@
+Normal rank
@@ Property / Mathematics Subject Classification ID @@
+90C39
@@ Property / Mathematics Subject Classification ID: 90C39 / rank @@
+Normal rank
@@ Property / zbMATH DE Number @@
+6597117
@@ Property / zbMATH DE Number: 6597117 / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+dynamic programming
@@ Property / zbMATH Keywords: dynamic programming / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+Markov decision processes
@@ Property / zbMATH Keywords: Markov decision processes / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+embedding
@@ Property / zbMATH Keywords: embedding / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+time aggregation
@@ Property / zbMATH Keywords: time aggregation / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+stochastic optimal control
@@ Property / zbMATH Keywords: stochastic optimal control / rank @@
+Normal rank
@@ Property / author @@
+Edilson F. Arruda
@@ Property / author: Edilson F. Arruda / rank @@
+Normal rank
@@ Property / author @@
+Marcelo Dutra Fragoso
@@ Property / author: Marcelo Dutra Fragoso / rank @@
+Normal rank
@@ Property / describes a project that uses @@
+FODD-Planner
@@ Property / describes a project that uses: FODD-Planner / rank @@
+Normal rank
@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1016/j.ejor.2014.08.023
+Normal rank
@@ Property / OpenAlex ID @@
+W2072370473
@@ Property / OpenAlex ID: W2072370473 / rank @@
+Normal rank
@@ Property / cites work @@
+Time aggregated Markov decision processes via standard dynamic programming
+Normal rank
@@ Property / cites work @@
+Approximate dynamic programming via direct search in the space of value function approximations
+Normal rank
@@ Property / cites work @@
+Stability and optimality of a multi-product production and storage system under demand uncertainty
+Normal rank
@@ Property / cites work @@
+Accelerating the convergence of value iteration by using partial transition functions
+Normal rank
@@ Property / cites work @@
+A New Value Iteration method for the Average Cost Dynamic Programming Problem
+Normal rank
@@ Property / cites work @@
+Q2925454
@@ Property / cites work: Q2925454 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4257216
@@ Property / cites work: Q4257216 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4256521
@@ Property / cites work: Q4256521 / rank @@
+Normal rank
@@ Property / cites work @@
+Approximate dynamic programming with a fuzzy parameterization
+Normal rank
@@ Property / cites work @@
+A time aggregation approach to Markov decision processes
+Normal rank
@@ Property / cites work @@
+Simulation-based algorithms for Markov decision processes.
+Normal rank
@@ Property / cites work @@
+Sufficient Classes of Strategies in Discrete Dynamic Programming I: Decomposition of Randomized Strategies and Embedded Models
+Normal rank
@@ Property / cites work @@
+LAO*: A heuristic search algorithm that finds solutions with loops
+Normal rank
@@ Property / cites work @@
+Probabilistic Relational Planning with First Order Decision Diagrams
+Normal rank
@@ Property / cites work @@
+Exact finite approximations of average-cost countable Markov decision processes
+Normal rank
@@ Property / cites work @@
+Reducing reinforcement learning to KWIK online regression
+Normal rank
@@ Property / cites work @@
+Kernel-based reinforcement learning
@@ Property / cites work: Kernel-based reinforcement learning / rank @@
+Normal rank
@@ Property / cites work @@
+A Distributed Actor-Critic Algorithm and Applications to Mobile Sensor Network Coordination Problems
+Normal rank
@@ Property / cites work @@
+Approximate Dynamic Programming
@@ Property / cites work: Approximate Dynamic Programming / rank @@
+Normal rank
@@ Property / cites work @@
+Q4315289
@@ Property / cites work: Q4315289 / rank @@
+Normal rank
@@ Property / cites work @@
+Markov decision Processes with fractional costs
@@ Property / cites work: Markov decision Processes with fractional costs / rank @@
+Normal rank
@@ Property / cites work @@
+Incremental Value Iteration for Time-Aggregated Markov-Decision Processes
+Normal rank
@@ Property / cites work @@
+An analysis of temporal-difference learning with function approximation
+Normal rank
@@ Property / cites work @@
+Lebesgue-Sampling-Based Optimal Control Problems With Time Aggregation
+Normal rank
@@ Property / cites work @@
+Performance gradient estimation for the very large finite Markov chains
+Normal rank
@@ Property / DOI @@
+10.1016/J.EJOR.2014.08.023
@@ Property / DOI: 10.1016/J.EJOR.2014.08.023 / rank @@
+Normal rank
@@ links / mardi / name / links / mardi / name @@
+Publication:300040