On-policy concurrent reinforcement learning
From MaRDI portal
Publication:4670596
Recommendations
- Multiagent learning using a variable learning rate
- A new \(Q\) learning algorithm for multi-agent systems
- AWESOME: a general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents
- scientific article; zbMATH DE number 5957383
- Individual Q-Learning in Normal Form Games
Cites work
- An analysis of temporal-difference learning with function approximation
- Convergence results for single-step on-policy reinforcement-learning algorithms
- Fast online \(Q(\lambda)\)
- Multiagent learning using a variable learning rate
- Non-cooperative games
- Two-person nonzero-sum games and quadratic programming
- \({\mathcal Q}\)-learning