Softmax policy gradient methods can take exponential time to converge (Q6110457)

scientific article; zbMATH DE number 7720818

Language	Label	Description	Also known as
English	Softmax policy gradient methods can take exponential time to converge	scientific article; zbMATH DE number 7720818

Statements

instance of

scholarly article

0 references

title

Softmax policy gradient methods can take exponential time to converge (English)

0 references

published in

Mathematical Programming. Series A. Series B

0 references

publication date

1 August 2023

0 references

full work available at URL

https://arxiv.org/abs/2102.11270

0 references

zbMATH Keywords

policy gradient methods

0 references

exponential lower bounds

0 references

softmax parameterization

0 references

discounted infinite-horizon Markov decision processes

0 references

MaRDI profile type

MaRDI publication profile

0 references

cites work

Q4999029

0 references

Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model

0 references

First-Order Methods in Optimization

0 references

Fast Global Convergence of Natural Policy Gradient Methods with Entropy Regularization

0 references

Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis

0 references

Finite-Sample Analysis of Two-Time-Scale Natural Actor–Critic Algorithm

0 references

On Actor-Critic Algorithms

0 references

Policy mirror descent for reinforcement learning: linear convergence, new sampling complexity, and generalized problem classes

0 references

Instance-Dependent ℓ∞-Bounds for Policy Evaluation in Tabular Reinforcement Learning

0 references

Simple statistical gradient-following algorithms for connectionist reinforcement learning

0 references

Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence

0 references

Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies

0 references

Identifiers

zbMATH Open document ID

1522.90263

0 references

arXiv ID

2102.11270

0 references

Mathematics Subject Classification ID

0 references

DOI

10.1007/S10107-022-01920-6

0 references

Sitelinks

Mathematics (1 entry)

mardi Publication:6110457