Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
Publication:3520073
DOI: 10.1007/978-3-540-75225-7_30
zbMATH Open: 1142.68403
OpenAlex: W1509780496
MaRDI QID: Q3520073
FDO: Q3520073
Authors: Ronald Ortner
Publication date: 19 August 2008
Published in: Lecture Notes in Computer Science
Full work available at URL: https://doi.org/10.1007/978-3-540-75225-7_30
Recommendations
- Bisimulation metrics for continuous Markov decision processes
- Adaptive aggregation for reinforcement learning in average reward Markov decision processes
- A time aggregation approach to Markov decision processes
- Algorithms for aggregated limiting average Markov decision problems
- Relative value iteration algorithm with soft state aggregation
Mathematics Subject Classification: Learning and adaptive systems in artificial intelligence (68T05); Computational learning theory (68Q32); Markov and semi-Markov decision processes (90C40)
Cites Work
- Title not available
- Mixing times with applications to perturbed Markov chains
- Equivalence notions and model minimization in Markov decision processes
- Pattern recognition
- Approximate equivalence of Markov decision processes
- Title not available
- Bisimulation metrics for continuous Markov decision processes
- A new analysis of quasianalysis
- Comparison of perturbation bounds for the stationary distribution of a Markov chain
- Linear dependence of stationary distributions in ergodic Markov decision processes
Cited In (7)
- Approximating a Behavioural Pseudometric Without Discount for Probabilistic Systems
- Regret bounds for restless Markov bandits
- Extreme state aggregation beyond Markov decision processes
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
- On State Aggregation to Approximate Complex Value Functions in Large-Scale Markov Decision Processes
- Online Regret Bounds for Markov Decision Processes with Deterministic Transitions
- Adaptive aggregation for reinforcement learning in average reward Markov decision processes