Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
From MaRDI portal
Publication: 5920615
DOI: 10.1007/s10994-006-5835-z
zbMath: 1102.68753
OpenAlex: W2054567935
MaRDI QID: Q5920615
Publication date: 14 August 2006
Published in: Machine Learning
Full work available at URL: https://doi.org/10.1007/s10994-006-5835-z
Keywords: Markov chains; Stochastic approximation; Reinforcement learning; Temporal-difference learning; Neuro-dynamic programming
Cites Work
- Markov chains and stochastic stability
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- Least squares policy evaluation algorithms with linear function approximation
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- An analysis of temporal-difference learning with function approximation
- On Actor-Critic Algorithms
- On stability of nonlinear AR processes with Markov switching
- On the convergence of temporal-difference learning with linear function approximation