Representation and Timing in Theories of the Dopamine System
DOI: 10.1162/neco.2006.18.7.1637
zbMath: 1092.92006
Wikidata: Q40317703 · Scholia: Q40317703
MaRDI QID: Q5476688
David S. Touretzky, Aaron Courville, Nathaniel D. Daw
Publication date: 17 July 2006
Published in: Neural Computation
Full work available at URL: https://doi.org/10.1162/neco.2006.18.7.1637
MSC classification:
- 92C20: Neural biology
- 60K20: Applications of Markov renewal processes (reliability, queueing networks, etc.)
Related Items
- Dopamine Ramps Are a Consequence of Reward Prediction Errors
- Dopamine, Inference, and Uncertainty
- Context Learning in the Rodent Hippocampus
- Immediate return preference emerged from a synaptic learning rule for return maximization
- Reinforcement learning in the brain
- Planning and navigation as active inference
- Multiple model-based reinforcement learning explains dopamine neuronal activity
- Internal-Time Temporal Difference Model for Neural Value-Based Decision Making
- Stimulus Representation and the Timing of Reward-Prediction Errors in Models of the Dopamine System
Cites Work
- Planning and acting in partially observable stochastic domains
- On average versus discounted reward temporal-difference learning
- Solving Semi-Markov Decision Problems Using Average Reward Reinforcement Learning
- Long-Term Reward Prediction in TD Models of the Dopamine System
- Kalman Filter Control Embedded into the Reinforcement Learning Framework
- A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains
- A Computational Model of the Functional Role of the Ventral-Striatal D2 Receptor in the Expression of Previously Acquired Behaviors
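The cited works revolve around temporal-difference (TD) models of the dopamine system, in which phasic dopamine activity is interpreted as a reward-prediction error. As a minimal illustration of that idea (not the specific model of this paper), here is a TD(0) sketch on a toy cue-then-reward chain; all states, rewards, and learning parameters are illustrative assumptions:

```python
# Minimal TD(0) sketch of the reward-prediction error that TD models
# of the dopamine system center on. Toy values throughout.

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One TD(0) step: delta = r + gamma*V(s') - V(s); V(s) += alpha*delta."""
    delta = r + gamma * V[s_next] - V[s]
    V[s] += alpha * delta
    return delta

# Toy chain: cue -> delay -> terminal, with reward 1.0 on the final transition.
V = {"cue": 0.0, "delay": 0.0, "terminal": 0.0}
episode = [("cue", 0.0, "delay"), ("delay", 1.0, "terminal")]

for _ in range(200):  # repeat the episode until values settle
    for s, r, s_next in episode:
        td0_update(V, s, r, s_next)

# After learning, the prediction error at the (now-predicted) reward shrinks,
# mirroring the classic finding that phasic dopamine responses transfer
# from the reward itself to the predictive cue.
print(round(V["cue"], 3), round(V["delay"], 3))
```

With these toy parameters the delay-state value converges toward the reward (1.0) and the cue value toward its discounted prediction (gamma times that), which is the pattern the timing and representation questions in this literature probe further.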