Best-response dynamics in zero-sum stochastic games (Q2211487)
From MaRDI portal
scientific article; zbMATH DE number 7272929
| Language | Label | Description | Also known as |
|---|---|---|---|
| default for all languages | No label defined | | |
| English | Best-response dynamics in zero-sum stochastic games | scientific article; zbMATH DE number 7272929 | |
Statements
Best-response dynamics in zero-sum stochastic games (English)
11 November 2020
The authors introduce three learning dynamics for two-player zero-sum discounted-payoff stochastic games. First, a continuous-time best-response dynamic in mixed strategies is proved to converge to the set of stationary Nash equilibrium strategies. Second, as an extension, they introduce a fictitious-play-like process in a continuous-time embedding of a zero-sum stochastic game, which is again shown to converge to the set of Nash equilibrium strategies. Finally, they present a modified \(\delta\)-converging best-response dynamic, in which the discount rate converges to 1 and the learned value converges to the asymptotic value of the zero-sum stochastic game. The critical feature of all three dynamics is a separation of adaptation rates: beliefs about the values of states adapt more slowly than the strategies do, and in the \(\delta\)-converging dynamic the discount rate adapts more slowly than everything else.
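The separation of adaptation rates described above can be illustrated with a small discrete-time sketch. This is not the authors' construction (the article works in continuous time); it is a simple two-timescale iteration written under stated assumptions. The function name `best_response_sketch`, the step sizes `fast` and `slow`, and the data layout (`reward[s][a][b]`, `trans[s][a][b][s2]`) are all hypothetical choices made for illustration, and no convergence guarantee is claimed for this discretisation.

```python
import math


def best_response_sketch(reward, trans, delta=0.9, steps=2000):
    """Two-timescale best-response iteration (illustrative sketch only).

    reward[s][a][b]    : stage payoff to the maximising (row) player in state s
    trans[s][a][b][s2] : probability of moving from s to s2 under actions (a, b)
    delta              : discount factor, assumed to lie in [0, 1)
    """
    n_states = len(reward)
    # Start from uniform mixed strategies and zero value beliefs.
    x = [[1.0 / len(reward[s])] * len(reward[s]) for s in range(n_states)]
    y = [[1.0 / len(reward[s][0])] * len(reward[s][0]) for s in range(n_states)]
    u = [0.0] * n_states

    for n in range(1, steps + 1):
        fast = 1.0 / (n + 1)                    # strategy step size
        slow = fast / (1.0 + math.log(n + 1))   # value step size; slow / fast -> 0
        new_u = list(u)
        for s in range(n_states):
            rows, cols = range(len(x[s])), range(len(y[s]))
            # Auxiliary matrix game: stage payoff plus discounted believed value.
            aux = [[reward[s][a][b]
                    + delta * sum(p * v for p, v in zip(trans[s][a][b], u))
                    for b in cols] for a in rows]
            row_pay = [sum(aux[a][b] * y[s][b] for b in cols) for a in rows]
            col_pay = [sum(aux[a][b] * x[s][a] for a in rows) for b in cols]
            current = sum(x[s][a] * row_pay[a] for a in rows)
            # Pure best responses: the row player maximises, the column player minimises.
            br_row = max(rows, key=lambda a: row_pay[a])
            br_col = min(cols, key=lambda b: col_pay[b])
            # Fast timescale: strategies move towards the best responses.
            x[s] = [(1 - fast) * x[s][a] + fast * (1.0 if a == br_row else 0.0)
                    for a in rows]
            y[s] = [(1 - fast) * y[s][b] + fast * (1.0 if b == br_col else 0.0)
                    for b in cols]
            # Slow timescale: value beliefs move towards the current auxiliary payoff.
            new_u[s] = (1 - slow) * u[s] + slow * current
        u = new_u
    return x, y, u
```

Because each update is a convex combination, the strategies stay on the probability simplex and the value beliefs stay within the range of attainable discounted payoffs. The choice `slow / fast -> 0` is only one way to make the value beliefs adapt more slowly than the strategies.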
stochastic games
best-response dynamics
zero-sum games
convergence