A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
From MaRDI portal
Publication: 859737
DOI: 10.1007/s10626-006-8134-8 ⋮ zbMath: 1104.93054 ⋮ OpenAlex: W2062541405 ⋮ MaRDI QID: Q859737
Publication date: 18 January 2007
Published in: Discrete Event Dynamic Systems
Full work available at URL: https://doi.org/10.1007/s10626-006-8134-8
Keywords: Dynamic programming ⋮ Kalman filter ⋮ Optimal stopping ⋮ Queueing ⋮ Reinforcement learning ⋮ Recursive least-squares ⋮ Temporal-difference learning
Filtering in stochastic control theory (93E11) Least squares and related methods for stochastic control systems (93E24) Stochastic learning and adaptive control (93E35)
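The keywords above place this work in the recursive least-squares temporal-difference family. As background only, a minimal batch LSTD(0) sketch is given below; it is a standard baseline from this literature (cf. the cited "Technical update: Least-squares temporal difference learning"), not the paper's generalized Kalman filter algorithm. The tiny two-state chain and tabular features are illustrative assumptions.

```python
import numpy as np

def lstd(transitions, phi, gamma):
    """Batch LSTD(0): solve A theta = b with
    A = sum phi(s) (phi(s) - gamma phi(s'))^T and b = sum r phi(s)."""
    k = phi(transitions[0][0]).shape[0]
    A = np.zeros((k, k))
    b = np.zeros(k)
    for s, r, s_next in transitions:
        f, f_next = phi(s), phi(s_next)
        A += np.outer(f, f - gamma * f_next)  # accumulate A
        b += r * f                            # accumulate b
    return np.linalg.solve(A, b)              # theta = A^{-1} b

# Illustrative example: deterministic chain 0 -> 1 -> 0, reward 1 on leaving
# state 0, tabular (one-hot) features, discount 0.9.
phi = lambda s: np.eye(2)[s]
data = [(0, 1.0, 1), (1, 0.0, 0)] * 50
theta = lstd(data, phi, gamma=0.9)
# Bellman check: V(0) = 1 + 0.9 V(1), V(1) = 0.9 V(0)  =>  V(0) = 1/0.19
```

With tabular features LSTD recovers the exact value function of the chain; the paper's contribution concerns the more general fixed-point approximation setting, where such least-squares recursions are unified under a Kalman-filter view.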
Related Items (6)
Approximate policy iteration: a survey and some new methods ⋮ A new learning algorithm for optimal stopping ⋮ Q-learning and policy iteration algorithms for stochastic shortest path problems ⋮ On regression-based stopping times ⋮ Projected equation methods for approximate solution of large linear systems ⋮ Fundamental design principles for reinforcement learning algorithms
Cites Work
- On the existence of fixed points for approximate value iteration and temporal-difference learning
- Technical update: Least-squares temporal difference learning
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- Functional Approximations and Dynamic Programming
- Extensions of the multiarmed bandit problem: The discounted case
- An analysis of temporal-difference learning with function approximation
- Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives
- 10.1162/1532443041827907
- On the convergence of temporal-difference learning with linear function approximation