Adaptive aggregation for reinforcement learning in average reward Markov decision processes
Publication:378753
DOI: 10.1007/S10479-012-1064-Y
zbMATH Open: 1274.90476
OpenAlex: W2071815241
MaRDI QID: Q378753
FDO: Q378753
Authors: Ronald Ortner
Publication date: 12 November 2013
Published in: Annals of Operations Research
Full work available at URL: https://doi.org/10.1007/s10479-012-1064-y
Cites Work
- Asymptotically efficient adaptive allocation rules
- Finite-time analysis of the multiarmed bandit problem
- Equivalence notions and model minimization in Markov decision processes
- Simulation-based algorithms for Markov decision processes
- Optimal Adaptive Policies for Markov Decision Processes
- Optimal adaptive policies for sequential allocation problems
- Near-optimal regret bounds for reinforcement learning
- Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
- Approximate equivalence of Markov decision processes
- Performance Loss Bounds for Approximate Value Iteration with State Aggregation
- Bisimulation metrics for continuous Markov decision processes
- Bounded-parameter Markov decision processes
- Adaptive aggregation methods for infinite horizon dynamic programming
- Knows what it knows: a framework for self-aware learning
- An Adaptive Sampling Algorithm for Solving Markov Decision Processes
- Bounded Parameter Markov Decision Processes with Average Reward Criterion
- An analysis of model-based interval estimation for Markov decision processes
Cited In (10)
- Selecting near-optimal approximate state representations in reinforcement learning
- Optimal learning with Q-aggregation
- Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
- Unsupervised basis function adaptation for reinforcement learning
- Relative value iteration algorithm with soft state aggregation
- Clustering in block Markov chains
- Extreme state aggregation beyond Markov decision processes
- Adaptive Discretization in Online Reinforcement Learning
- Extreme state aggregation beyond MDPs
- Markov decision processes with arbitrary reward processes