A unified approach to time-aggregated Markov decision processes
Recommendations
- A time aggregation approach to Markov decision processes
- Time aggregated Markov decision processes via standard dynamic programming
- A unified approach to adaptive control of average reward Markov decision processes
- Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm
- A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: multichain cases
Cites work
- scientific article; zbMATH DE number 700091
- scientific article; zbMATH DE number 1093829
- scientific article; zbMATH DE number 2107836
- scientific article; zbMATH DE number 3361678
- A New Value Iteration method for the Average Cost Dynamic Programming Problem
- A basic formula for performance gradient estimation of semi-Markov decision processes
- A time aggregation approach to Markov decision processes
- Continuous-time Markov decision processes. Theory and applications
- Incremental Value Iteration for Time-Aggregated Markov-Decision Processes
- Markov decision Processes with fractional costs
- Performance gradient estimation for the very large finite Markov chains
- Perturbation realization, potentials, and sensitivity analysis of Markov processes
- Recent advances in hierarchical reinforcement learning
- Semi-markov decision problems and performance sensitivity analysis
- Stochastic learning and optimization. A sensitivity-based approach.
- Time aggregated Markov decision processes via standard dynamic programming
Cited in (6)
- A time aggregation approach to Markov decision processes
- Time aggregated Markov decision processes via standard dynamic programming
- A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: multichain cases
- Coupling based estimation approaches for the average reward performance potential in Markov chains
- Sliding mode control for semi-Markovian jump systems via output feedback
- Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm
This page was built for publication: A unified approach to time-aggregated Markov decision processes (MaRDI item Q259403)