Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path

Publication:5307594

DOI: 10.1007/11776420_42, zbMath: 1143.68516, MaRDI QID: Q5307594

Csaba Szepesvári, András Antos, Rémi Munos

Publication date: 14 September 2007

Published in: Learning Theory

Full work available at URL: https://doi.org/10.1007/11776420_42

Mathematics Subject Classification ID

68T05: Learning and adaptive systems in artificial intelligence

Related Items

Approximation of Markov decision processes with general state space
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
An incremental off-policy search in a model-free Markov decision process using a single sample path
Multi-agent reinforcement learning: a selective overview of theories and algorithms
A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning
