Multiagent reinforcement learning with regret matching for robot soccer (Q474272)

From MaRDI portal
scientific article

    Statements

    Multiagent reinforcement learning with regret matching for robot soccer (English)
    24 November 2014
    Summary: This paper proposes a novel multiagent reinforcement learning (MARL) algorithm, Nash-\(Q\) learning with regret matching, in which regret matching is used to speed up the well-known MARL algorithm Nash-\(Q\) learning. Choosing an action-selection strategy that balances exploration and exploitation is critical to the online learning ability of Nash-\(Q\) learning. In a Markov game, the joint action of agents that adopt the regret matching algorithm can converge to a set of no-regret points, which can be viewed as the set of coarse correlated equilibria and which in essence includes the Nash equilibria. It can therefore be inferred that regret matching can guide exploration of the state-action space and thereby increase the convergence rate of the Nash-\(Q\) learning algorithm. Simulation results on robot soccer confirm that, compared with the original Nash-\(Q\) learning algorithm, using regret matching during the learning phase gives strong online learning ability and significantly better performance in terms of scores, average reward, and policy convergence.
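
    The summary does not spell out the regret matching rule itself. As a rough illustration only, the generic rule can be sketched as follows: each action is chosen with probability proportional to its positive cumulative regret, and uniformly at random when no action has positive regret. The function names and the use of per-action value estimates (`action_values`, which in a Nash-\(Q\) setting might be derived from the current \(Q\)-table) are assumptions made for this sketch; they are not taken from the paper and need not match its exact algorithm.

```python
import random


def regret_matching_strategy(cumulative_regret):
    """Return action probabilities proportional to positive cumulative regret.

    Falls back to a uniform distribution when no action has positive regret.
    """
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    n = len(cumulative_regret)
    if total <= 0.0:
        return [1.0 / n] * n
    return [p / total for p in positive]


def update_regret(cumulative_regret, action_values, chosen_action):
    """Accumulate, for each action, how much better it would have been
    than the action actually chosen (hypothetical value estimates)."""
    baseline = action_values[chosen_action]
    return [r + (v - baseline) for r, v in zip(cumulative_regret, action_values)]


def select_action(cumulative_regret, rng=random):
    """Sample an action index from the regret matching distribution."""
    probs = regret_matching_strategy(cumulative_regret)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Illustrative usage with made-up value estimates for three actions.
regret = [0.0, 0.0, 0.0]
regret = update_regret(regret, action_values=[1.0, 3.0, 2.0], chosen_action=0)
next_action = select_action(regret)
```

    In this sketch, actions that would have paid more than the chosen one gain probability mass, which is one way exploration could be steered toward promising parts of the state-action space.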