An incremental off-policy search in a model-free Markov decision process using a single sample path



DOI: 10.1007/s10994-018-5697-1, zbMath: 1465.90116, arXiv: 1801.10287, MaRDI QID: Q1621868

Ajin George Joseph, Shalabh Bhatnagar

Publication date: 12 November 2018

Published in: Machine Learning

Full work available at URL: https://arxiv.org/abs/1801.10287


90C40: Markov and semi-Markov decision processes


