Multiple Model-Based Reinforcement Learning

From MaRDI portal

Publication:4542426


DOI: 10.1162/089976602753712972; zbMath: 0997.93037; Wikidata: Q51956885; Scholia: Q51956885; MaRDI QID: Q4542426

Kenji Doya, Mitsuo Kawato, Kazuyuki Samejima, Ken-Ichi Katagiri

Publication date: 4 November 2002

Published in: Neural Computation

Full work available at URL: https://doi.org/10.1162/089976602753712972

zbMATH Keywords

competition; cooperation; multiple linear quadratic controller; multiple model-based reinforcement learning; multiple prediction models

Mathematics Subject Classification ID

68T05: Learning and adaptive systems in artificial intelligence

93B51: Design techniques (robust design, computer-aided design, etc.)

49N10: Linear-quadratic optimal control problems

Related Items

Model-Free Robust Optimal Feedback Mechanisms of Biological Motor Control

Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation

Model-based Reinforcement Learning: A Survey

Reward prediction errors, not sensory prediction errors, play a major role in model selection in human reinforcement learning

Learning to grasp and extract affordances: the integrated learning of grasps and affordances (ILGA) model

Modeling of autonomous problem solving process by dynamic construction of task models in multiple tasks environment

Incremental acquisition of multiple nonlinear forward models based on differentiation process of schema model

From internal models toward metacognitive AI

Challenges of real-world reinforcement learning: definitions, benchmarks and analysis

Kernel dynamic policy programming: applicable reinforcement learning to robot systems with high dimensional states

Data-driven adaptive optimal control of linear uncertain systems with unknown jumping dynamics

Multiple model-based reinforcement learning explains dopamine neuronal activity

Scalable transfer learning in heterogeneous, dynamic environments

Extraction of primitive representation from captured human movements and measured ground reaction force to generate physically consistent imitated behaviors

MOSAIC for Multiple-Reward Environments

The Neuronal Replicator Hypothesis

Internal-Time Temporal Difference Model for Neural Value-Based Decision Making

An Internal Model for Acquisition and Retention of Motor Learning During Arm Reaching

