Adaptive importance sampling for value function approximation in off-policy reinforcement learning
Publication:1784527
DOI: 10.1016/j.neunet.2009.01.002 · zbMATH: 1396.68091 · OpenAlex: W2002748013 · Wikidata: Q50110342 · Scholia: Q50110342 · MaRDI QID: Q1784527
Masashi Sugiyama, Hirotaka Hachiya, Takayuki Akiyama, Jan Peters
Publication date: 27 September 2018
Published in: Neural Networks
Full work available at URL: https://doi.org/10.1016/j.neunet.2009.01.002
Keywords: policy iteration; value function approximation; adaptive importance sampling; off-policy reinforcement learning; efficient sample reuse; importance-weighted cross-validation
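As the title and keywords indicate, the paper concerns off-policy value estimation with adaptive importance weighting, where the strength of the importance correction is tuned (via importance-weighted cross-validation) to balance bias against variance. The sketch below is an illustrative reconstruction of that core idea, not the authors' code: per-trajectory importance weights are flattened by an exponent nu in [0, 1], with nu = 0 giving the plain (biased, low-variance) estimate and nu = 1 the fully importance-weighted one. The toy environment, policies, discount factor, and the grid over nu are assumptions made for the example; in the paper nu would be selected by importance-weighted cross-validation rather than inspected by hand.

```python
# Minimal sketch of adaptive (flattened) importance weighting for off-policy
# return estimation. All concrete values (policies, horizon, rewards, nu grid)
# are illustrative assumptions, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

def sample_trajectory(policy, T=20, n_actions=2):
    """Roll out a toy single-state chain; return the action and reward sequences."""
    actions = rng.choice(n_actions, size=T, p=policy)
    rewards = (actions == 1).astype(float) + 0.1 * rng.standard_normal(T)
    return actions, rewards

def flattened_return_estimate(trajs, pi, b, gamma=0.95, nu=0.5):
    """Self-normalized importance-weighted estimate of the target policy's return,
    with the importance weight flattened by the exponent nu in [0, 1]."""
    estimates, weights = [], []
    for actions, rewards in trajs:
        ratio = np.prod(pi[actions] / b[actions])   # full trajectory importance weight
        w = ratio ** nu                              # flattening: nu trades bias vs. variance
        G = np.sum(rewards * gamma ** np.arange(len(rewards)))
        estimates.append(w * G)
        weights.append(w)
    return np.sum(estimates) / np.sum(weights)

behavior = np.array([0.7, 0.3])   # data-collecting policy b (assumed)
target   = np.array([0.2, 0.8])   # evaluation policy pi (assumed)

trajs = [sample_trajectory(behavior) for _ in range(200)]
for nu in (0.0, 0.5, 1.0):
    print(f"nu = {nu}: estimated return = {flattened_return_estimate(trajs, target, behavior, nu=nu):.3f}")
```

With nu = 0 the estimate ignores the policy mismatch entirely, while nu = 1 corrects it fully at the cost of higher variance; intermediate values interpolate between the two, which is the trade-off the paper's adaptive scheme is designed to resolve automatically.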
Related Items (8)
- Reward-Weighted Regression with Sample Reuse for Direct Policy Search in Reinforcement Learning
- Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation
- Least-squares two-sample test
- Direct density-ratio estimation with dimensionality reduction via least-squares hetero-distributional subspace search
- Semi-supervised speaker identification under covariate shift
- Dimensionality reduction for density ratio estimation in high-dimensional spaces
- Efficient exploration through active learning for value function approximation in reinforcement learning
- Multivariate error modeling and uncertainty quantification using importance (re-)weighting for Monte Carlo simulations in particle transport
Cites Work
- Improving predictive inference under covariate shift by weighting the log-likelihood function
- 10.1162/1532443041827907
- Trading Variance Reduction with Unbiasedness: The Regularized Subspace Information Criterion for Robust Model Selection in Kernel Regression
- Linear Statistical Inference and its Applications
- The elements of statistical learning. Data mining, inference, and prediction