Variance-constrained actor-critic algorithms for discounted and average reward MDPs

From MaRDI portal
Publication:1689603

DOI: 10.1007/s10994-016-5569-5, zbMath: 1432.90158, arXiv: 1403.6530, OpenAlex: W2963856199, MaRDI QID: Q1689603

L. A. Prashanth, Mohammad Ghavamzadeh

Publication date: 12 January 2018

Published in: Machine Learning

Full work available at URL: https://arxiv.org/abs/1403.6530