Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm
From MaRDI portal
Publication:300040
DOI: 10.1016/J.EJOR.2014.08.023; zbMATH Open: 1338.90443; OpenAlex: W2072370473; MaRDI QID: Q300040; FDO: Q300040
Marcelo Dutra Fragoso, E. F. Arruda
Publication date: 23 June 2016
Published in: European Journal of Operational Research
Full work available at URL: https://doi.org/10.1016/j.ejor.2014.08.023
Recommendations
- A time aggregation approach to Markov decision processes
- Reinforcement learning based algorithms for average cost Markov decision processes
- A unified approach to time-aggregated Markov decision processes
- Time aggregated Markov decision processes via standard dynamic programming
- The control of a two-level Markov decision process by time aggregation
Cites Work
- Probabilistic Relational Planning with First Order Decision Diagrams
- Title not available
- Title not available
- A time aggregation approach to Markov decision processes
- Performance gradient estimation for the very large finite Markov chains
- A New Value Iteration method for the Average Cost Dynamic Programming Problem
- Markov decision processes with fractional costs
- Incremental Value Iteration for Time-Aggregated Markov-Decision Processes
- Time aggregated Markov decision processes via standard dynamic programming
- Approximate Dynamic Programming
- Approximate dynamic programming with a fuzzy parameterization
- An analysis of temporal-difference learning with function approximation
- Kernel-based reinforcement learning
- Accelerating the convergence of value iteration by using partial transition functions
- Exact finite approximations of average-cost countable Markov decision processes
- Stability and optimality of a multi-product production and storage system under demand uncertainty
- Dynamic programming and optimal control. Vol. 2
- Sufficient Classes of Strategies in Discrete Dynamic Programming I: Decomposition of Randomized Strategies and Embedded Models
- Title not available
- A Distributed Actor-Critic Algorithm and Applications to Mobile Sensor Network Coordination Problems
- Lebesgue-Sampling-Based Optimal Control Problems With Time Aggregation
- LAO*: A heuristic search algorithm that finds solutions with loops
- Reducing reinforcement learning to KWIK online regression
- Approximate dynamic programming via direct search in the space of value function approximations
- Simulation-based algorithms for Markov decision processes
Cited In (2)