Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes

Publication:5898263

DOI 10.1007/s10994-006-5835-z

MaRDI QID Q5898263

Authors Vladislav B. Tadić

Publication date 22 November 2006

Published in Machine Learning

Full work available at URL https://doi.org/10.1007/s10994-006-5835-z

zbMATH Keywords

stochastic approximation; Markov chains; reinforcement learning; temporal-difference learning; neuro-dynamic programming

Mathematics Subject Classification ID

Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20)
Time series, auto-correlation, regression, etc. in statistics (GARCH) (62M10)
Learning and adaptive systems in artificial intelligence (68T05)
Analysis of algorithms (68W40)
Queueing theory (aspects of probability theory) (60K25)

Cited in (4)
