The following pages link to Hirotaka Hachiya (Q448293):
Displaying 8 items.
- Analysis and improvement of policy gradient estimation (Q448295)
- Adaptive importance sampling for value function approximation in off-policy reinforcement learning (Q1784527)
- Efficient exploration through active learning for value function approximation in reinforcement learning (Q1784573)
- Reward-Weighted Regression with Sample Reuse for Direct Policy Search in Reinforcement Learning (Q2887009)
- Efficient Sample Reuse in Policy Gradients with Parameter-Based Exploration (Q5378202)
- Relative Density-Ratio Estimation for Robust Distribution Comparison (Q5378219)
- Information-Maximization Clustering Based on Squared-Loss Mutual Information (Q5378312)
- Multistream-Based Marked Point Process With Decomposed Cumulative Hazard Functions (Q6176624)