A stochastic trust-region framework for policy optimization (Q5096136)

scientific article; zbMATH DE number 7571710

Language	Label	Description	Also known as
English	A stochastic trust-region framework for policy optimization	scientific article; zbMATH DE number 7571710

Statements

instance of

scholarly article

0 references

title

A Stochastic Trust-Region Framework for Policy Optimization (English)

0 references

published in

Journal of Computational Mathematics

0 references

publication date

15 August 2022

0 references

full work available at URL

https://arxiv.org/abs/1911.11640

0 references

zbMATH Keywords

deep reinforcement learning

0 references

stochastic trust region method

0 references

policy optimization

0 references

global convergence

0 references

entropy control

0 references

describes a project that uses

OpenAI Gym

0 references

MaRDI profile type

MaRDI publication profile

0 references

cites work

Convergence of trust-region methods based on probabilistic models

0 references

Stochastic optimization using a trust-region method and random models

0 references

Approximate Newton methods for policy search in Markov decision processes

0 references

Q5491447

0 references

Optimization theory and methods. Nonlinear programming

0 references

Global convergence of policy gradient methods to (almost) locally optimal policies

0 references

Identifiers

Mathematics Subject Classification ID

0 references

DOI

10.4208/JCM.2104-M2021-0007

0 references

Sitelinks

Mathematics (1 entry)

mardi Publication:5096136