R-MAX - MaRDI portal

Cited in

(37)

An analysis of model-based interval estimation for Markov decision processes
Reinforcement learning in finite MDPs: PAC analysis
Near-optimal regret bounds for reinforcement learning
Belief and truth in hypothesised behaviours
Learning to compete, coordinate, and cooperate in repeated games using reinforcement learning
Model selection in reinforcement learning
Adaptive representations for reinforcement learning.
Uncertainty Propagation for Efficient Exploration in Reinforcement Learning
scientific article; zbMATH DE number 2089367
A Monte-Carlo AIXI approximation
Learning Theory
Multi-agent reinforcement learning in common interest and fixed sum stochastic games: an experimental study
Reducing reinforcement learning to KWIK online regression
scientific article; zbMATH DE number 2090946
DeepStack
Pluribus
ProMP
AWESOME
Perspectives on multiagent learning
Guiding exploration by pre-existing knowledge without modifying reward
On the possibility of learning in reactive environments with arbitrary dependence
A minimum relative entropy principle for learning and acting
Reinforcement learning agents
Algorithms for reinforcement learning.
Provably efficient learning with typed parametric models
scientific article; zbMATH DE number 7626804
Machine Learning: ECML 2004
Multi-agent reinforcement learning: a selective overview of theories and algorithms
scientific article; zbMATH DE number 2000817
scientific article; zbMATH DE number 5957207
Markov decision processes with arbitrary reward processes
Cooperative learning with joint state value approximation for multi-agent systems
Knows what it knows: a framework for self-aware learning
If multi-agent learning is the answer, what is the question?
Bounded Parameter Markov Decision Processes with Average Reward Criterion
Asymptotic Learnability of Reinforcement Problems with Arbitrary Dependence
Efficient learning equilibrium

This page was built for software: R-MAX