Error bounds for constant step-size Q-learning
DOI: 10.1016/J.SYSCONLE.2012.08.014
zbMATH Open: 1255.93129
OpenAlex: W1999254175
MaRDI QID: Q1932736
FDO: Q1932736
Authors: Yanyan Li
Publication date: 21 January 2013
Published in: Systems & Control Letters
Full work available at URL: https://doi.org/10.1016/j.sysconle.2012.08.014
Mathematics Subject Classification
- Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20)
- Learning and adaptive systems in artificial intelligence (68T05)
- Stochastic systems in control theory (general) (93E03)
Cites Work
- Title not available
- \({\mathcal Q}\)-learning
- Title not available
- Asynchronous stochastic approximation and Q-learning
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- Q-learning and enhanced policy iteration in discounted dynamic programming
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- Q-learning and policy iteration algorithms for stochastic shortest path problems
- Title not available
- Boundedness of iterates in \(Q\)-learning
Cited In (13)
- Some limit properties of Markov chains induced by recursive stochastic algorithms
- Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning
- Recent advances in reinforcement learning in finance
- Data-driven approximate Q-learning stabilization with optimality error bound analysis
- A generalization error for Q-learning
- Advances in Artificial Intelligence
- Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
- Settling the sample complexity of model-based offline reinforcement learning
- Convergence of Recursive Stochastic Algorithms Using Wasserstein Divergence
- Boundedness of iterates in \(Q\)-learning
- A Discrete-Time Switching System Analysis of Q-Learning
- Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach
- Title not available