Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path
Publication:5307594
DOI: 10.1007/11776420_42
zbMath: 1143.68516
MaRDI QID: Q5307594
Csaba Szepesvári, András Antos, Rémi Munos
Publication date: 14 September 2007
Published in: Learning Theory
Full work available at URL: https://doi.org/10.1007/11776420_42
68T05: Learning and adaptive systems in artificial intelligence
Related Items
Approximation of Markov decision processes with general state space
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
An incremental off-policy search in a model-free Markov decision process using a single sample path
Multi-agent reinforcement learning: a selective overview of theories and algorithms
A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning