Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
DOI: 10.1007/S10994-006-5835-Z
zbMATH Open: 1470.68185
OpenAlex: W2054567935
MaRDI QID: Q5898263
FDO: Q5898263
Authors: Vladislav B. Tadić
Publication date: 22 November 2006
Published in: Machine Learning
Full work available at URL: https://doi.org/10.1007/s10994-006-5835-z
Recommendations
- Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
- On the convergence of temporal-difference learning with linear function approximation
- An analysis of temporal-difference learning with function approximation
- Linear least-squares algorithms for temporal difference learning
- Analytical mean squared error curves for temporal difference learning
Keywords: stochastic approximation; Markov chains; reinforcement learning; temporal-difference learning; neuro-dynamic programming
Mathematics Subject Classification
- Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20)
- Time series, auto-correlation, regression, etc. in statistics (GARCH) (62M10)
- Learning and adaptive systems in artificial intelligence (68T05)
- Analysis of algorithms (68W40)
- Queueing theory (aspects of probability theory) (60K25)
Cites Work
- Markov chains and stochastic stability
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- Least squares policy evaluation algorithms with linear function approximation
- On Actor-Critic Algorithms
- An analysis of temporal-difference learning with function approximation
- On the convergence of temporal-difference learning with linear function approximation
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- On stability of nonlinear AR processes with Markov switching
Cited In (4)