Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm

Recommendations

A time aggregation approach to Markov decision processes
Reinforcement learning based algorithms for average cost Markov decision processes
A unified approach to time-aggregated Markov decision processes
Time aggregated Markov decision processes via standard dynamic programming
The control of a two-level Markov decision process by time aggregation

Cites work

scientific article; zbMATH DE number 1315585
scientific article; zbMATH DE number 1321699
scientific article; zbMATH DE number 700091
A Distributed Actor-Critic Algorithm and Applications to Mobile Sensor Network Coordination Problems
A New Value Iteration method for the Average Cost Dynamic Programming Problem
A time aggregation approach to Markov decision processes
Accelerating the convergence of value iteration by using partial transition functions
An analysis of temporal-difference learning with function approximation
Approximate Dynamic Programming
Approximate dynamic programming via direct search in the space of value function approximations
Approximate dynamic programming with a fuzzy parameterization
Dynamic programming and optimal control. Vol. 2
Exact finite approximations of average-cost countable Markov decision processes
Incremental Value Iteration for Time-Aggregated Markov-Decision Processes
Kernel-based reinforcement learning
LAO*: A heuristic search algorithm that finds solutions with loops
Lebesgue-Sampling-Based Optimal Control Problems With Time Aggregation
Markov decision Processes with fractional costs
Performance gradient estimation for the very large finite Markov chains
Probabilistic relational planning with first order decision diagrams
Reducing reinforcement learning to KWIK online regression
Simulation-based algorithms for Markov decision processes
Stability and optimality of a multi-product production and storage system under demand uncertainty
Sufficient Classes of Strategies in Discrete Dynamic Programming I: Decomposition of Randomized Strategies and Embedded Models
Time aggregated Markov decision processes via standard dynamic programming

Cited in

A unified approach to time-aggregated Markov decision processes
Time aggregated Markov decision processes via standard dynamic programming
A time aggregation approach to Markov decision processes
The control of a two-level Markov decision process by time aggregation

Uses Software

FODD-Planner

MaRDI item Q300040