The convergence of \(TD(\lambda)\) for general \(\lambda\)

From MaRDI portal

DOI: 10.1007/BF00992701 · zbMath: 0773.68060 · MaRDI QID: Q1812934

Peter Dayan

Publication date: 11 August 1992

Published in: Machine Learning

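For context, the \(TD(\lambda)\) algorithm whose convergence this paper analyzes maintains a value estimate per state and distributes each temporal-difference error backwards along an eligibility trace. A minimal tabular sketch (the random-walk chain, step size, and trace parameters below are illustrative assumptions, not taken from the paper):

```python
import random

def td_lambda(episodes, n_states=5, alpha=0.1, gamma=1.0, lam=0.8, seed=0):
    """Tabular TD(lambda) with accumulating eligibility traces on a simple
    random-walk chain: states 0..n_states-1, absorbing at both ends,
    reward +1 only on exiting at the right end (illustrative setup)."""
    rng = random.Random(seed)
    V = [0.0] * n_states
    for _ in range(episodes):
        e = [0.0] * n_states          # eligibility traces, reset each episode
        s = n_states // 2             # start in the middle of the chain
        while True:
            s_next = s + (1 if rng.random() < 0.5 else -1)
            if s_next < 0:            # exit left: no reward, episode ends
                r, done = 0.0, True
            elif s_next >= n_states:  # exit right: reward 1, episode ends
                r, done = 1.0, True
            else:
                r, done = 0.0, False
            v_next = 0.0 if done else V[s_next]
            delta = r + gamma * v_next - V[s]    # TD error for this step
            e[s] += 1.0                          # accumulate trace for current state
            for i in range(n_states):
                V[i] += alpha * delta * e[i]     # propagate error along all traces
                e[i] *= gamma * lam              # decay every trace
            if done:
                break
            s = s_next
    return V

values = td_lambda(episodes=2000)
```

With \(\lambda = 0\) this reduces to the one-step TD(0) rule, and as \(\lambda \to 1\) it approaches a Monte Carlo update; the paper extends Sutton's TD(0) convergence result to this full range of \(\lambda\).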
Related Items

Adaptive learning via selectionism and Bayesianism. II: The sequential case
An information-theoretic analysis of return maximization in reinforcement learning
A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
Linear least-squares algorithms for temporal difference learning
Feature-based methods for large scale dynamic programming
On the worst-case analysis of temporal-difference learning algorithms
Reinforcement learning with replacing eligibility traces
Positivity and strict contractivity of functions of operators
Premium control with reinforcement learning
Reinforcement learning algorithms with function approximation: recent advances and applications
A \(Sarsa(\lambda)\) algorithm based on double-layer fuzzy reasoning
Mathematical properties of neuronal TD-rules and differential Hebbian learning: a comparison
Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
The asymptotic equipartition property in reinforcement learning and its relation to return maximization
Chaotic dynamics and convergence analysis of temporal difference algorithms with bang-bang control
Practical issues in temporal difference learning
Reinforcement distribution in fuzzy Q-learning
Finite-Time Performance of Distributed Temporal-Difference Learning with Linear Function Approximation
On the existence of fixed points for approximate value iteration and temporal-difference learning
A simulation-based approach to stochastic dynamic programming
