Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
From MaRDI portal
Publication:5898263
stochastic approximation; Markov chains; reinforcement learning; temporal-difference learning; neuro-dynamic programming
- Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20)
- Time series, auto-correlation, regression, etc. in statistics (GARCH) (62M10)
- Learning and adaptive systems in artificial intelligence (68T05)
- Analysis of algorithms (68W40)
- Queueing theory (aspects of probability theory) (60K25)
Recommendations
- Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
- On the convergence of temporal-difference learning with linear function approximation
- An analysis of temporal-difference learning with function approximation
- Linear least-squares algorithms for temporal difference learning
- Analytical mean squared error curves for temporal difference learning
Cites work
- scientific article; zbMATH DE number 4013703
- scientific article; zbMATH DE number 4066707
- scientific article; zbMATH DE number 48727
- scientific article; zbMATH DE number 1321699
- scientific article; zbMATH DE number 976356
- scientific article; zbMATH DE number 1005357
- scientific article; zbMATH DE number 1043533
- An analysis of temporal-difference learning with function approximation
- Least squares policy evaluation algorithms with linear function approximation
- Markov chains and stochastic stability
- On stability of nonlinear AR processes with Markov switching
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- On the convergence of temporal-difference learning with linear function approximation
- On Actor-Critic Algorithms
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
Cited in (4)