Cites work
- An analysis of model-based interval estimation for Markov decision processes
- Asymptotically efficient adaptive allocation rules
- Bayesian Reinforcement Learning with Exploration
- Concentration Inequalities and Martingale Inequalities: A Survey
- Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model
- Near-optimal regret bounds for reinforcement learning
- PAC Bounds for Discounted MDPs
- Reinforcement learning in finite MDPs: PAC analysis
- The sample complexity of exploration in the multi-armed bandit problem
- The variance of discounted Markov decision processes
Cited in (10)
- Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model
- Reinforcement learning in finite MDPs: PAC analysis
- Near-optimal regret bounds for reinforcement learning
- Complexity bounds for approximately solving discounted MDPs by value iterations
- scientific article; zbMATH DE number 2089367
- Near-optimal reinforcement learning in polynomial time
- Extreme state aggregation beyond Markov decision processes
- Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
- Optimistic Posterior Sampling for Reinforcement Learning: Worst-Case Regret Bounds
- PAC Bounds for Discounted MDPs
This page was built for publication: Near-optimal PAC bounds for discounted MDPs