Error bounds for constant step-size Q-learning

From MaRDI portal

Publication:1932736


DOI: 10.1016/J.SYSCONLE.2012.08.014 | MaRDI QID: Q1932736

Authors Yanyan Li

Publication date 21 January 2013

Published in Systems & Control Letters

Full work available at URL https://doi.org/10.1016/j.sysconle.2012.08.014

zbMATH Keywords

stochastic approximation; \(Q\)-learning; Markov decision processes

Mathematics Subject Classification ID

Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20)
Learning and adaptive systems in artificial intelligence (68T05)
Stochastic systems in control theory (general) (93E03)

Cited in

(13)

Uses Software

Approxrl

