Reinforcement learning in finite MDPs: PAC analysis
Publication:2880979
zbMATH Open: 1235.68193, MaRDI QID: Q2880979, FDO: Q2880979
Lihong Li, Alexander Strehl, Michael L. Littman
Publication date: 17 April 2012
Published in: Journal of Machine Learning Research (JMLR)
Full work available at URL: http://www.jmlr.org/papers/v10/strehl09a.html
Recommendations
- PAC Reinforcement Learning Algorithm for General-Sum Markov Games
- PAC Bounds for Discounted MDPs
- Near-optimal PAC bounds for discounted MDPs
- From perturbation analysis to Markov decision processes and reinforcement learning
- Reinforcement learning with approximation spaces
- Finite-sample analysis of least-squares policy iteration
- Learning algorithms for finite horizon constrained Markov decision processes
General nonlinear regression (62J02); Learning and adaptive systems in artificial intelligence (68T05)
Cited In (26)
- Controlling estimation error in reinforcement learning via reinforced operation
- Hybrid answer set programming
- Efficient PAC learning for episodic tasks with acyclic state spaces
- An analysis of model-based interval estimation for Markov decision processes
- Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model
- Learning to play using low-complexity rule-based policies: illustrations through Ms. Pac-Man
- Recent advances in reinforcement learning in finance
- 10.1162/153244303765208377
- A Small Gain Analysis of Single Timescale Actor Critic
- Title not available
- Reducing reinforcement learning to KWIK online regression
- Reinforcement learning with immediate rewards and linear hypotheses
- Title not available
- Title not available
- Extreme state aggregation beyond Markov decision processes
- Near-optimal PAC bounds for discounted MDPs
- Solving for Best Responses and Equilibria in Extensive-Form Games with Reinforcement Learning Methods
- PAC Bounds for Discounted MDPs
- Title not available
- Title not available
- Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes
- Title not available
- 10.1162/153244303768966148
- Knows what it knows: a framework for self-aware learning
- Title not available
- Identity concealment games: how I learned to stop revealing and love the coincidences