Parallel Nonstationary Direct Policy Search for Risk-Averse Stochastic Optimization
From MaRDI portal
Publication:5364280
Recommendations
- Parallel sequential Monte Carlo for stochastic gradient-free nonconvex optimization
- Risk-averse policy optimization via risk-neutral policy optimization
- Nonconvex policy search using variational inequalities
- Parallel Algorithms for Stochastic Dynamic Programming with Continuous State and Control Variables
- Risk-Sensitive Reinforcement Learning via Policy Gradient Search
- Global Convergence of Policy Gradient Primal–Dual Methods for Risk-Constrained LQRs
- On parallelization of a stochastic dynamic programming algorithm for solving large-scale mixed \(0-1\) problems under uncertainty
- Approximate gradient methods in policy-space optimization of Markov reward processes
- Policy iteration accelerated with Krylov methods
Cited in (4)
- Zeroth-order stochastic compositional algorithms for risk-aware learning
- scientific article; zbMATH DE number 6433488
- Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage
- A unified framework for stochastic optimization