Efficient Sample Reuse in Policy Gradients with Parameter-Based Exploration
Publication:5378202
DOI: 10.1162/NECO_a_00452; zbMath: 1414.68090; arXiv: 1301.3966; Wikidata: Q47904761; Scholia: Q47904761; MaRDI QID: Q5378202
Masashi Sugiyama, Tingting Zhao, Hirotaka Hachiya, Voot Tangkaratt, Jun Morimoto
Publication date: 12 June 2019
Published in: Neural Computation
Full work available at URL: https://arxiv.org/abs/1301.3966
68T05: Learning and adaptive systems in artificial intelligence
68T40: Artificial intelligence for robotics