Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
From MaRDI portal
Publication: 5920615
DOI: 10.1007/s10994-006-5835-z
zbMath: 1102.68753
OpenAlex: W2054567935
MaRDI QID: Q5920615
Publication date: 14 August 2006
Published in: Machine Learning
Full work available at URL: https://doi.org/10.1007/s10994-006-5835-z
Keywords: Markov chains; Stochastic approximation; Reinforcement learning; Temporal-difference learning; Neuro-dynamic programming
Cites Work
- Markov chains and stochastic stability
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- Least squares policy evaluation algorithms with linear function approximation
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- An analysis of temporal-difference learning with function approximation
- On Actor-Critic Algorithms
- On stability of nonlinear AR processes with Markov switching
- On the convergence of temporal-difference learning with linear function approximation