The convergence of \(TD(\lambda)\) for general \(\lambda\)


Publication:1812934

DOI: 10.1007/BF00992701
zbMath: 0773.68060
MaRDI QID: Q1812934

Peter Dayan

Publication date: 11 August 1992

Published in: Machine Learning




Related Items (22)

Adaptive learning via selectionism and Bayesianism. II: The sequential case
An information-theoretic analysis of return maximization in reinforcement learning
A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
Linear least-squares algorithms for temporal difference learning
Feature-based methods for large scale dynamic programming
On the worst-case analysis of temporal-difference learning algorithms
Reinforcement learning with replacing eligibility traces
Positivity and strict contractivity of functions of operators
Unnamed Item
Premium control with reinforcement learning
Reinforcement learning algorithms with function approximation: recent advances and applications
A \(Sarsa(\lambda)\) algorithm based on double-layer fuzzy reasoning
Mathematical properties of neuronal TD-rules and differential Hebbian learning: a comparison
Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
The asymptotic equipartition property in reinforcement learning and its relation to return maximization
Chaotic dynamics and convergence analysis of temporal difference algorithms with bang-bang control
Practical issues in temporal difference learning
Reinforcement distribution in fuzzy Q-learning
Finite-Time Performance of Distributed Temporal-Difference Learning with Linear Function Approximation
On the existence of fixed points for approximate value iteration and temporal-difference learning
A simulation-based approach to stochastic dynamic programming






