A theoretical analysis of temporal difference learning in the iterated prisoner's dilemma game
DOI: 10.1007/S11538-009-9424-8; zbMATH Open: 1182.91048; OpenAlex: W2073008835; Wikidata: Q39975754; Scholia: Q39975754; MaRDI QID: Q1048261; FDO: Q1048261
Authors: Naoki Masuda, Hisashi Ohtsuki
Publication date: 11 January 2010
Published in: Bulletin of Mathematical Biology
Full work available at URL: https://doi.org/10.1007/s11538-009-9424-8
Recommendations
- scientific article; zbMATH DE number 952980
- Numerical analysis of a reinforcement learning model with the dynamic aspiration level in the iterated prisoner's dilemma
- A dynamic analysis of the repeated prisoner's dilemma game
- Transient and asymptotic dynamics of reinforcement learning in games
- scientific article; zbMATH DE number 741099
- Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term
- Modeling and Using Context
- An analysis of experience replay in temporal difference learning
- Bounded rationality in differential games: a reinforcement learning-based approach
- A learning-based model of repeated games with incomplete information
Models of societies, social and urban evolution (91D10); Cooperative games (91A12); Memory and learning in psychology (91E40); Rationality and learning in game theory (91A26)
Cites Work
- \({\mathcal Q}\)-learning
- Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term
- Experience-weighted Attraction Learning in Normal Form Games
- Individual learning in normal form games: Some laboratory results
- Title not available
- Title not available
- Learning dynamics in social dilemmas
- The evolution of stochastic strategies in the prisoner's dilemma
- Learning to cooperate with Pavlov and adaptive strategy for the iterated prisoner's dilemma with noise
- Game-dynamical aspects of the prisoner's dilemma
- Automata, repeated games and noise
- Convergence results for single-step on-policy reinforcement-learning algorithms
- Learning behavior in an experimental matching pennies game
- Practical issues in temporal difference learning
- Chaos in learning a simple two-person game
- Dynamics of internal models in game players
Cited In (7)
- Immediate return preference emerged from a synaptic learning rule for return maximization
- Evolution of cooperation facilitated by reinforcement learning with adaptive aspiration levels
- The independent localisations of interaction and learning in the repeated prisoner's dilemma
- Interaction state Q-learning promotes cooperation in the spatial prisoner's dilemma game
- Reinforcement learning in a prisoner's dilemma
- Global migration can lead to stronger spatial selection than local migration
- Numerical analysis of a reinforcement learning model with the dynamic aspiration level in the iterated prisoner's dilemma