Expected policy gradients for reinforcement learning
From MaRDI portal
Publication:4969098
Cites work
- scientific article; zbMATH DE number 3626409
- scientific article; zbMATH DE number 700091
- scientific article; zbMATH DE number 6982305
- 10.1162/1532443041827907
- Approximate Newton methods for policy search in Markov decision processes
- Bayesian policy gradient and actor-critic algorithms
- Multi-objective reinforcement learning through continuous Pareto manifold approximation
- Natural actor-critic algorithms
- Optimal Estimation of Dynamic Systems
- Policy gradient in Lipschitz Markov decision processes
- Reinforcement learning. An introduction
- Some Relations Between Extended and Unscented Kalman Filters
Cited in (18)
- Reward-weighted regression with sample reuse for direct policy search in reinforcement learning
- The factored policy-gradient planner
- Estimation and approximation bounds for gradient-based reinforcement learning
- Hessian matrix distribution for Bayesian policy gradient reinforcement learning
- A stochastic trust-region framework for policy optimization
- scientific article; zbMATH DE number 7306857
- DSMC evaluation stages: fostering robust and safe behavior in deep reinforcement learning -- extended version
- Rejoinder: New Objectives for Policy Learning
- Analysis and improvement of policy gradient estimation
- Importance sampling in reinforcement learning with an estimated behavior policy
- Accelerating reinforcement learning with a directional-Gaussian-smoothing evolution strategy
- Bayesian policy gradient and actor-critic algorithms
- scientific article; zbMATH DE number 1753153
- Efficient sample reuse in policy gradients with parameter-based exploration
- Smoothing policies and safe policy gradients
- Multi-agent reinforcement learning: a selective overview of theories and algorithms
- Compatible natural gradient policy search
- Jointly learning environments and control policies with projected stochastic gradient ascent