An emphatic approach to the problem of off-policy temporal-difference learning

Publication:2810885

zbMATH Open: 1360.68712; arXiv: 1503.04269; MaRDI QID: Q2810885; FDO: Q2810885


Authors: Richard S. Sutton, A. Rupam Mahmood, Martha White


Publication date: 6 June 2016

Published in: Journal of Machine Learning Research (JMLR)

Full work available at URL: https://arxiv.org/abs/1503.04269




Cited In (12)




