Neural Temporal Difference and Q Learning Provably Converge to Global Optima (Q6149409)
From MaRDI portal
scientific article; zbMATH DE number 7812937
Language | Label | Description | Also known as |
---|---|---|---|
English | Neural Temporal Difference and Q Learning Provably Converge to Global Optima | scientific article; zbMATH DE number 7812937 | |
Statements
Neural Temporal Difference and Q Learning Provably Converge to Global Optima (English)
5 March 2024
reinforcement learning
temporal difference learning
overparameterized neural network