Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm
Publication:300040
DOI: 10.1016/j.ejor.2014.08.023; zbMath: 1338.90443; OpenAlex: W2072370473; MaRDI QID: Q300040
Marcelo Dutra Fragoso, Edilson F. Arruda
Publication date: 23 June 2016
Published in: European Journal of Operational Research
Full work available at URL: https://doi.org/10.1016/j.ejor.2014.08.023
Cites Work
- Reducing reinforcement learning to KWIK online regression
- Time aggregated Markov decision processes via standard dynamic programming
- Approximate dynamic programming via direct search in the space of value function approximations
- Simulation-based algorithms for Markov decision processes
- Approximate dynamic programming with a fuzzy parameterization
- Kernel-based reinforcement learning
- A time aggregation approach to Markov decision processes
- Accelerating the convergence of value iteration by using partial transition functions
- Exact finite approximations of average-cost countable Markov decision processes
- Stability and optimality of a multi-product production and storage system under demand uncertainty
- Probabilistic Relational Planning with First Order Decision Diagrams
- Sufficient Classes of Strategies in Discrete Dynamic Programming I: Decomposition of Randomized Strategies and Embedded Models
- Performance gradient estimation for the very large finite Markov chains
- An analysis of temporal-difference learning with function approximation
- A New Value Iteration method for the Average Cost Dynamic Programming Problem
- A Distributed Actor-Critic Algorithm and Applications to Mobile Sensor Network Coordination Problems
- Markov decision processes with fractional costs
- Incremental Value Iteration for Time-Aggregated Markov-Decision Processes
- Approximate Dynamic Programming
- Lebesgue-Sampling-Based Optimal Control Problems With Time Aggregation
- LAO*: A heuristic search algorithm that finds solutions with loops