Softmax policy gradient methods can take exponential time to converge (Q6110457)
From MaRDI portal
scientific article; zbMATH DE number 7720818
Language | Label | Description | Also known as |
---|---|---|---|
English | Softmax policy gradient methods can take exponential time to converge | scientific article; zbMATH DE number 7720818 | |
Statements
Softmax policy gradient methods can take exponential time to converge (English)
1 August 2023
policy gradient methods
exponential lower bounds
softmax parameterization
discounted infinite-horizon Markov decision processes