Bayesian learning of noisy Markov decision processes
From MaRDI portal
Publication:4635207
Abstract: We consider the inverse reinforcement learning problem: learning from, and then predicting or mimicking, a controller on the basis of state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how the latent variables of the model can be estimated and how predictions about actions can be made within a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. The sampler includes a parameter-expansion step, which is shown to be essential for its good convergence properties. As an illustration, the method is applied to learning a human controller.
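The abstract's inference idea — posit a noisy (stochastic) policy for the observed state/action data and sample its latent parameters from the posterior by MCMC — can be illustrated with a minimal sketch. This is not the paper's model or sampler: the softmax policy, the linear parameterization `Q(s, a) = theta * phi[s][a]`, the feature table `PHI`, and the plain random-walk Metropolis step (with no parameter expansion) are all illustrative assumptions.

```python
import math
import random

random.seed(0)

# Hypothetical toy problem (not from the paper): 3 states, 2 actions.
# The controller is assumed to act via a softmax ("noisy") policy over
# action values that are linear in an unknown weight theta.
PHI = [[1.0, 0.2], [0.5, 0.8], [0.1, 1.0]]  # assumed known features

def log_policy(theta, s, a):
    """Log-probability of action a in state s under the softmax policy."""
    logits = [theta * q for q in PHI[s]]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return logits[a] - log_z

def log_posterior(theta, data):
    """Standard-normal prior on theta plus the action likelihood."""
    lp = -0.5 * theta * theta
    return lp + sum(log_policy(theta, s, a) for s, a in data)

def metropolis(data, n_iter=2000, step=0.5):
    """Random-walk Metropolis sampler over theta; returns all draws."""
    theta = 0.0
    lp = log_posterior(theta, data)
    samples = []
    for _ in range(n_iter):
        prop = theta + random.gauss(0.0, step)
        lp_prop = log_posterior(prop, data)
        if math.log(random.random()) < lp_prop - lp:  # accept/reject
            theta, lp = prop, lp_prop
        samples.append(theta)
    return samples

# Simulate demonstrations from a controller with true theta = 2.0,
# then recover theta from the state/action pairs alone.
true_theta = 2.0
data = []
for _ in range(200):
    s = random.randrange(3)
    p0 = math.exp(log_policy(true_theta, s, 0))
    data.append((s, 0 if random.random() < p0 else 1))

samples = metropolis(data)
post_mean = sum(samples[500:]) / len(samples[500:])  # discard burn-in
print(round(post_mean, 2))
```

Given the sampled `theta` draws, predictions about the controller's next action in a state `s` follow by averaging `exp(log_policy(theta, s, a))` over the posterior samples, which is the "unified framework" for estimation and prediction the abstract refers to.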
Recommendations
- Inverse reinforcement learning from summary data
- A Bayesian approach for learning and planning in partially observable Markov decision processes
- Inverse reinforcement learning in partially observable environments
- Learning parametric policies and transition probability models of Markov decision processes from data
- Markov decision processes with arbitrary reward processes
Cited in (7)
- Learning Variable-Length Markov Models of Behavior
- Learning parametric policies and transition probability models of Markov decision processes from data
- Modular inverse reinforcement learning for visuomotor behavior
- Playing to train your video game avatar
- scientific article (zbMATH DE number 5530066; no title available)
- Trajectory modeling via random utility inverse reinforcement learning
- Bayesian Representation of Stochastic Processes under Learning: de Finetti Revisited