Decentralized Q-Learning for Stochastic Teams and Games
Publication: Q5282408
DOI: 10.1109/TAC.2016.2598476
zbMATH Open: 1366.91030
arXiv: 1506.07924
OpenAlex: W2962990479
MaRDI QID: Q5282408
Gürdal Arslan, Serdar Yüksel
Publication date: 27 July 2017
Published in: IEEE Transactions on Automatic Control
Abstract: There are only a few learning algorithms applicable to stochastic dynamic teams and games, which generalize Markov decision processes to decentralized stochastic control problems involving possibly self-interested decision makers. Learning in games is generally difficult because of the non-stationary environment in which each decision maker aims to learn its optimal decisions with minimal information in the presence of other decision makers who are also learning. In stochastic dynamic games, learning is more challenging because, while learning, the decision makers alter the state of the system and hence the future cost. In this paper, we present decentralized Q-learning algorithms for stochastic games and study their convergence in the weakly acyclic case, which includes team problems as an important special case. The algorithms are decentralized in that each decision maker has access only to its local information, the state information, and its local cost realizations; furthermore, each decision maker is completely oblivious to the presence of the other decision makers. We show that these algorithms converge to equilibrium policies almost surely in large classes of stochastic games.
Full work available at URL: https://arxiv.org/abs/1506.07924
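To make the decentralized setting in the abstract concrete, here is a minimal Python sketch: each decision maker runs ordinary epsilon-greedy Q-learning over (state, own-action) pairs, driven only by its own cost realizations, and is oblivious to the other agent. This is an illustrative simplification, not the paper's exact algorithm: it omits the fixed-length exploration phases and inertial policy updates that underpin the almost-sure convergence result for weakly acyclic games. The class name, parameter values, and the two-state team environment below are all assumptions made for the example.

```python
import random
from collections import defaultdict

# Toy 2-state, 2-agent team problem (an illustrative assumption, not from
# the paper): both agents pay the same cost, which is zero only when their
# actions coincide, and the joint action drives the state transition.
def env_step(state, joint_action):
    cost = 0.0 if joint_action[0] == joint_action[1] else 1.0
    next_state = (state + sum(joint_action)) % 2
    return cost, next_state

class DecentralizedQAgent:
    """One decision maker: it observes the global state, its own action,
    and its own cost realization, and is oblivious to the other agents."""

    def __init__(self, actions, alpha=0.1, beta=0.95, epsilon=0.1):
        self.actions = actions        # this agent's local action set
        self.alpha = alpha            # step size
        self.beta = beta              # discount factor
        self.epsilon = epsilon        # exploration probability
        self.Q = defaultdict(float)   # Q[(state, own_action)]; costs are minimized

    def act(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return min(self.actions, key=lambda u: self.Q[(state, u)])

    def update(self, state, action, cost, next_state):
        best_next = min(self.Q[(next_state, u)] for u in self.actions)
        td_target = cost + self.beta * best_next
        self.Q[(state, action)] += self.alpha * (td_target - self.Q[(state, action)])

# Each agent learns from its own (state, own-action, cost) stream only.
agents = [DecentralizedQAgent(actions=[0, 1]) for _ in range(2)]
state = 0
for t in range(20_000):
    joint = [agent.act(state) for agent in agents]
    cost, next_state = env_step(state, joint)
    for agent, u in zip(agents, joint):
        agent.update(state, u, cost, next_state)
    state = next_state

for i, agent in enumerate(agents):
    print(f"agent {i} greedy policy:",
          {x: min(agent.actions, key=lambda u: agent.Q[(x, u)]) for x in (0, 1)})
```

Note the gap this sketch leaves open: because every agent updates its greedy policy at every step, each learner faces a non-stationary environment and no convergence is guaranteed. The paper's remedy is to hold policies fixed over long exploration phases, so the environment looks stationary to each Q-learner, and to switch policies only between phases, with inertia.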
MSC classification: Stochastic games, stochastic differential games (91A15); Rationality and learning in game theory (91A26)
Cited In (13)
- Q-learning-based target selection for bearings-only autonomous navigation
- Multi-sensor transmission power control for remote estimation through a SINR-based communication channel
- Separation of learning and control for cyber-physical systems
- Provably efficient reinforcement learning in decentralized general-sum Markov games
- Two-phase selective decentralization to improve reinforcement learning systems with MDP
- Independent learning in stochastic games
- Value iteration for simple stochastic games: stopping criterion and learning algorithm
- Evaluation and learning in two-player symmetric games via best and better responses
- Combining learning and control in linear systems
- Formalization of methods for the development of autonomous artificial intelligence systems
- Multi-agent reinforcement learning: a selective overview of theories and algorithms
- Fictitious Play in Zero-Sum Stochastic Games
- Decentralized inertial best-response with voluntary and limited communication in random communication networks
Recommendations
- Decentralized Learning for Optimality in Stochastic Dynamic Teams and Games With Local Control and Global State Information
- Partially decentralized reinforcement learning in finite, multi-agent Markov decision processes
- Provably efficient reinforcement learning in decentralized general-sum Markov games
- Decentralized strategies for finite population linear-quadratic-Gaussian games and teams
- Decentralized Learning for Multiplayer Multiarmed Bandits
- Deep Teams: Decentralized Decision Making With Finite and Infinite Number of Agents
- Decentralized learning of Nash equilibria in multi-person stochastic games with incomplete information
- Multiagent Fully Decentralized Value Function Learning With Linear Convergence Rates
- Decentralized MDPs with sparse interactions
- Reinforcement learning for distributed control and multi-player games