Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation
From MaRDI portal
Publication:889297
DOI: 10.1016/j.neunet.2014.06.006
zbMath: 1325.68200
arXiv: 1307.5118
Wikidata: Q39164602 (Scholia: Q39164602)
MaRDI QID: Q889297
Syogo Mori, Voot Tangkaratt, Jun Morimoto, Masashi Sugiyama, Tingting Zhao
Publication date: 6 November 2015
Published in: Neural Networks
Full work available at URL: https://arxiv.org/abs/1307.5118
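The paper's setting is policy gradients with parameter-based exploration (PGPE), in which exploration happens by sampling the policy parameters themselves from a tunable distribution rather than by injecting noise into actions. As a rough illustration of that exploration idea only (not the authors' model-based method with least-squares conditional density estimation), the following is a minimal model-free PGPE-style sketch on a toy objective; the objective `rollout_return`, the step sizes, and all other names are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout_return(theta):
    # Toy stand-in for the return of one episode under policy
    # parameters theta; maximized at theta = [1, -1]. Hypothetical.
    return -np.sum((theta - np.array([1.0, -1.0])) ** 2)

# Hyper-parameters of the Gaussian distribution over policy parameters.
eta = np.zeros(2)    # mean of the parameter distribution
sigma = np.ones(2)   # per-dimension standard deviation
alpha = 0.05         # learning rate

for _ in range(300):
    # Sample a batch of policies from N(eta, diag(sigma^2)) and roll them out.
    thetas = eta + sigma * rng.standard_normal((20, 2))
    returns = np.array([rollout_return(t) for t in thetas])
    adv = returns - returns.mean()  # baseline-subtracted returns

    # Likelihood-ratio gradients of the Gaussian hyper-parameters:
    # d/d_eta  log N = (theta - eta) / sigma^2
    # d/d_sigma log N = ((theta - eta)^2 - sigma^2) / sigma^3
    grad_eta = ((thetas - eta) / sigma**2 * adv[:, None]).mean(axis=0)
    grad_sigma = (((thetas - eta) ** 2 - sigma**2) / sigma**3
                  * adv[:, None]).mean(axis=0)

    eta += alpha * grad_eta
    sigma = np.maximum(sigma + alpha * grad_sigma, 1e-3)  # keep sigma positive
```

After the loop, `eta` should sit near the toy optimum `[1, -1]`, with `sigma` shrunk toward its floor as exploration becomes unnecessary. The paper's contribution, by contrast, is to make such gradient estimates more data-efficient by learning a transition model via least-squares conditional density estimation.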
Related Items
Uses Software
Cites Work
- Policy search for motor primitives in robotics
- Statistical analysis of kernel-based least-squares density-ratio estimation
- Analysis and improvement of policy gradient estimation
- Adaptive importance sampling for value function approximation in off-policy reinforcement learning
- Efficient exploration through active learning for value function approximation in reinforcement learning
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Density-ratio matching under the Bregman divergence: a unified framework of density-ratio estimation
- Computational complexity of kernel-based density-ratio estimation: a condition number analysis
- Model-based contextual policy search for data-efficient generalization of robot skills
- Using Expectation-Maximization for Reinforcement Learning
- DOI: 10.1162/1532443041827907
- Sufficient Dimension Reduction via Squared-Loss Mutual Information Estimation
- Efficient Sample Reuse in Policy Gradients with Parameter-Based Exploration