Efficient exploration through active learning for value function approximation in reinforcement learning
From MaRDI portal
Publication:1784573
Recommendations
- doi:10.1162/1532443041827907
- An active exploration method for data efficient reinforcement learning
- Rollout sampling approximate policy iteration
- Reward-weighted regression with sample reuse for direct policy search in reinforcement learning
- Efficient sample reuse in policy gradients with parameter-based exploration
Cites work
- scientific article; zbMATH DE number 5957325 (no title available)
- scientific article; zbMATH DE number 3551792 (no title available)
- scientific article; zbMATH DE number 1149423 (no title available)
- doi:10.1162/153244303765208377
- doi:10.1162/1532443041827907
- Active learning algorithm using the maximum weighted log-likelihood estimator
- Adaptive importance sampling for value function approximation in off-policy reinforcement learning
- Near-optimal reinforcement learning in polynomial time
- Pattern recognition and machine learning
- Pool-based active learning in approximate linear regression
- Regularization algorithms for learning that are equivalent to multilayer networks
- Ridge Regression: Biased Estimation for Nonorthogonal Problems
- Robust weights and designs for biased regression models: Least squares and generalized \(M\)-estimation
Cited in (13)
- Direct density-ratio estimation with dimensionality reduction via least-squares hetero-distributional subspace search
- Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation
- Rollout sampling approximate policy iteration
- Active policy learning for robot planning and exploration under uncertainty
- Adaptive importance sampling for value function approximation in off-policy reinforcement learning
- Learning under nonstationarity: covariate shift and class-balance change
- Deep exploration via randomized value functions
- Improving importance estimation in pool-based batch active learning for approximate linear regression
- A parallel scheduling algorithm for reinforcement learning in large state space
- Exploiting best-match equations for efficient reinforcement learning
- An active exploration method for data efficient reinforcement learning
- Active contextual policy search
- Reward-weighted regression with sample reuse for direct policy search in reinforcement learning