Adaptive aggregation for reinforcement learning in average reward Markov decision processes
Cites work
- scientific article; zbMATH DE number 700091
- Adaptive aggregation methods for infinite horizon dynamic programming
- An Adaptive Sampling Algorithm for Solving Markov Decision Processes
- An analysis of model-based interval estimation for Markov decision processes
- Approximate equivalence of Markov decision processes.
- Asymptotically efficient adaptive allocation rules
- Bisimulation metrics for continuous Markov decision processes
- Bounded Parameter Markov Decision Processes with Average Reward Criterion
- Bounded-parameter Markov decision processes
- Equivalence notions and model minimization in Markov decision processes
- Finite-time analysis of the multiarmed bandit problem
- Knows what it knows: a framework for self-aware learning
- Near-optimal regret bounds for reinforcement learning
- Optimal Adaptive Policies for Markov Decision Processes
- Optimal adaptive policies for sequential allocation problems
- Performance Loss Bounds for Approximate Value Iteration with State Aggregation
- Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
- Simulation-based algorithms for Markov decision processes.
Cited in (10)
- Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
- Unsupervised basis function adaptation for reinforcement learning
- Selecting near-optimal approximate state representations in reinforcement learning
- Relative value iteration algorithm with soft state aggregation
- Adaptive Discretization in Online Reinforcement Learning
- Extreme state aggregation beyond MDPs
- Markov decision processes with arbitrary reward processes
- Extreme state aggregation beyond Markov decision processes
- Optimal learning with Q-aggregation
- Clustering in block Markov chains