The convergence of \(TD(\lambda)\) for general \(\lambda\)
From MaRDI portal
Recommendations
- Linear least-squares algorithms for temporal difference learning
- On the worst-case analysis of temporal-difference learning algorithms
- On the convergence of temporal-difference learning with linear function approximation
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
Cites work
- scientific article; zbMATH DE number 3174053
- scientific article; zbMATH DE number 3954793
- scientific article; zbMATH DE number 4078838
- scientific article; zbMATH DE number 46507
- scientific article; zbMATH DE number 3215568
- scientific article; zbMATH DE number 3338156
- A Boolean complete neural model of adaptive behavior
- A New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC)
- An adaptive optimal controller for discrete-time Markov environments
Cited in (35)
- A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
- An information-theoretic analysis of return maximization in reinforcement learning
- TD(λ) learning without eligibility traces: a theoretical analysis
- Neural Temporal Difference and Q Learning Provably Converge to Global Optima
- Adaptive learning via selectionism and Bayesianism. II: The sequential case
- Weak convergence properties of constrained emphatic temporal-difference learning with constant and slowly diminishing stepsize
- Aspects regarding the existence of fixed points of the iterates of Stancu operators
- Premium control with reinforcement learning
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- On the worst-case analysis of temporal-difference learning algorithms
- Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
- Chaotic dynamics and convergence analysis of temporal difference algorithms with bang-bang control
- Practical issues in temporal difference learning
- Reinforcement learning with replacing eligibility traces
- A simulation-based approach to stochastic dynamic programming
- The asymptotic equipartition property in reinforcement learning and its relation to return maximization
- On the worst-case analysis of temporal-difference learning algorithms
- Least squares temporal difference methods: An analysis under general conditions
- A \(Sarsa(\lambda)\) algorithm based on double-layer fuzzy reasoning
- Feature-based methods for large scale dynamic programming
- Linear least-squares algorithms for temporal difference learning
- Reinforcement distribution in fuzzy Q-learning
- Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
- Eligibility traces and forgetting factor in recursive least-squares-based temporal difference
- Iterates of Stancu operators (via fixed point principles) revisited
- scientific article; zbMATH DE number 1931849
- On the existence of fixed points for approximate value iteration and temporal-difference learning
- scientific article; zbMATH DE number 7370555
- Linear least-squares algorithms for temporal difference learning
- Mathematical properties of neuronal TD-rules and differential Hebbian learning: a comparison
- Extension of \(\lambda\)-PIR for weakly contractive operators via fixed point theory
- Finite-time error bounds for distributed linear stochastic approximation
- Positivity and strict contractivity of functions of operators
- Finite-time performance of distributed temporal-difference learning with linear function approximation
- Reinforcement learning algorithms with function approximation: recent advances and applications
This page was built for publication: The convergence of \(TD(\lambda)\) for general \(\lambda\)
MaRDI item: Q1812934