Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
From MaRDI portal
Publication:3520073
Recommendations
- Bisimulation metrics for continuous Markov decision processes
- Adaptive aggregation for reinforcement learning in average reward Markov decision processes
- A time aggregation approach to Markov decision processes
- Algorithms for aggregated limiting average Markov decision problems
- Relative value iteration algorithm with soft state aggregation
Cites work
- scientific article; zbMATH DE number 19232 (no title available)
- scientific article; zbMATH DE number 3240812 (no title available)
- A new analysis of quasianalysis
- Approximate equivalence of Markov decision processes
- Bisimulation metrics for continuous Markov decision processes
- Comparison of perturbation bounds for the stationary distribution of a Markov chain
- Equivalence notions and model minimization in Markov decision processes
- Linear dependence of stationary distributions in ergodic Markov decision processes
- Mixing times with applications to perturbed Markov chains
- Pattern recognition
Cited in (7 documents)
- Approximating a Behavioural Pseudometric Without Discount for Probabilistic Systems
- Regret bounds for restless Markov bandits
- Extreme state aggregation beyond Markov decision processes
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
- On State Aggregation to Approximate Complex Value Functions in Large-Scale Markov Decision Processes
- Online Regret Bounds for Markov Decision Processes with Deterministic Transitions
- Adaptive aggregation for reinforcement learning in average reward Markov decision processes