A stochastic trust-region framework for policy optimization
From MaRDI portal
Publication:5096136
Recommendations
- Expected policy gradients for reinforcement learning
- Compatible natural gradient policy search
- Approximate Newton Policy Gradient Algorithms
- Stochastic trust-region methods with trust-region radius depending on probabilistic models
- A generalized path integral control approach to reinforcement learning
Cites work
- scientific article; zbMATH DE number 5060482
- Approximate Newton methods for policy search in Markov decision processes
- Convergence of trust-region methods based on probabilistic models
- Global convergence of policy gradient methods to (almost) locally optimal policies
- Optimization theory and methods. Nonlinear programming
- Stochastic optimization using a trust-region method and random models
Cited in (2)