Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm
Recommendations
- A time aggregation approach to Markov decision processes
- Reinforcement learning based algorithms for average cost Markov decision processes
- A unified approach to time-aggregated Markov decision processes
- Time aggregated Markov decision processes via standard dynamic programming
- The control of a two-level Markov decision process by time aggregation
Cites work
- scientific article; zbMATH DE number 1315585
- scientific article; zbMATH DE number 1321699
- scientific article; zbMATH DE number 700091
- A Distributed Actor-Critic Algorithm and Applications to Mobile Sensor Network Coordination Problems
- A New Value Iteration method for the Average Cost Dynamic Programming Problem
- A time aggregation approach to Markov decision processes
- Accelerating the convergence of value iteration by using partial transition functions
- An analysis of temporal-difference learning with function approximation
- Approximate Dynamic Programming
- Approximate dynamic programming via direct search in the space of value function approximations
- Approximate dynamic programming with a fuzzy parameterization
- Dynamic programming and optimal control. Vol. 2
- Exact finite approximations of average-cost countable Markov decision processes
- Incremental Value Iteration for Time-Aggregated Markov-Decision Processes
- Kernel-based reinforcement learning
- LAO*: A heuristic search algorithm that finds solutions with loops
- Lebesgue-Sampling-Based Optimal Control Problems With Time Aggregation
- Markov decision Processes with fractional costs
- Performance gradient estimation for the very large finite Markov chains
- Probabilistic relational planning with first order decision diagrams
- Reducing reinforcement learning to KWIK online regression
- Simulation-based algorithms for Markov decision processes
- Stability and optimality of a multi-product production and storage system under demand uncertainty
- Sufficient Classes of Strategies in Discrete Dynamic Programming I: Decomposition of Randomized Strategies and Embedded Models
- Time aggregated Markov decision processes via standard dynamic programming
Cited in (4)