Expected policy gradients for reinforcement learning
Publication:4969098
zbMATH Open: 1498.68229; arXiv: 1801.03326; MaRDI QID: Q4969098; FDO: Q4969098
Authors: Kamil Ciosek, Shimon Whiteson
Publication date: 5 October 2020
Full work available at URL: https://arxiv.org/abs/1801.03326
Mathematics Subject Classification
- Markov processes: estimation; hidden Markov models (62M05)
- Learning and adaptive systems in artificial intelligence (68T05)
- Markov and semi-Markov decision processes (90C40)
Cites Work
- Title not available
- Title not available
- Natural actor-critic algorithms
- 10.1162/1532443041827907
- Reinforcement learning. An introduction
- Multi-objective reinforcement learning through continuous Pareto manifold approximation
- Optimal Estimation of Dynamic Systems
- Some Relations Between Extended and Unscented Kalman Filters
- Bayesian policy gradient and actor-critic algorithms
- Policy gradient in Lipschitz Markov decision processes
- Title not available
- Approximate Newton methods for policy search in Markov decision processes
Cited In (18)
- Reward-weighted regression with sample reuse for direct policy search in reinforcement learning
- The factored policy-gradient planner
- Estimation and approximation bounds for gradient-based reinforcement learning
- Hessian matrix distribution for Bayesian policy gradient reinforcement learning
- A stochastic trust-region framework for policy optimization
- Title not available
- DSMC evaluation stages: fostering robust and safe behavior in deep reinforcement learning -- extended version
- Rejoinder: New Objectives for Policy Learning
- Analysis and improvement of policy gradient estimation
- Importance sampling in reinforcement learning with an estimated behavior policy
- Accelerating reinforcement learning with a directional-Gaussian-smoothing evolution strategy
- Bayesian policy gradient and actor-critic algorithms
- Title not available
- Efficient sample reuse in policy gradients with parameter-based exploration
- Smoothing policies and safe policy gradients
- Multi-agent reinforcement learning: a selective overview of theories and algorithms
- Compatible natural gradient policy search
- Jointly learning environments and control policies with projected stochastic gradient ascent