Bayesian Learning of Noisy Markov Decision Processes

Publication: Q4635207

DOI: 10.1145/2414416.2414420
zbMATH Open: 1384.62104
arXiv: 1211.5901
OpenAlex: W2001339644
MaRDI QID: Q4635207
FDO: Q4635207

Sumeetpal S. Singh, Nicolas Chopin, Nick Whiteley

Publication date: 16 April 2018

Published in: ACM Transactions on Modeling and Computer Simulation

Abstract: We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller on the basis of state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how the latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. The sampler includes a parameter expansion step, which is shown to be essential for its good convergence properties. As an illustration, the method is applied to learning a human controller.
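The general approach described in the abstract can be illustrated with a minimal sketch: a small finite MDP whose reward weights are latent, a softmax ("noisy") policy linking Q-values to observed actions, and a random-walk Metropolis sampler targeting the posterior over the reward weights. All names and modelling choices below (the state/action sizes, the Boltzmann likelihood, the Gaussian prior) are illustrative assumptions, not the paper's actual model; in particular, the paper's sampler additionally uses a parameter expansion move that is not reproduced here.

```python
import numpy as np

# Toy setup (assumed, not from the paper): a small MDP with known random
# dynamics; only the per-state reward weights theta are latent.
rng = np.random.default_rng(0)
nS, nA, gamma = 4, 2, 0.9

# Fixed transition kernel: P[s, a] is a distribution over next states.
P = rng.dirichlet(np.ones(nS), size=(nS, nA))

def q_values(theta, iters=200):
    """Value iteration for Q under reward r(s) = theta[s]."""
    Q = np.zeros((nS, nA))
    for _ in range(iters):
        V = Q.max(axis=1)
        Q = theta[:, None] + gamma * P @ V
    return Q

def log_likelihood(theta, states, actions, beta=2.0):
    """Log-probability of the observed actions under a softmax policy."""
    Q = beta * q_values(theta)
    logZ = np.logaddexp.reduce(Q, axis=1)
    return np.sum(Q[states, actions] - logZ[states])

# Simulate demonstration data from a "true" noisy controller.
theta_true = np.array([1.0, -1.0, 0.5, 0.0])
Q_true = 2.0 * q_values(theta_true)
states = rng.integers(nS, size=300)
probs = np.exp(Q_true - np.logaddexp.reduce(Q_true, axis=1, keepdims=True))
actions = np.array([rng.choice(nA, p=probs[s]) for s in states])

def mh_sampler(n_iter=500, step=0.2):
    """Random-walk Metropolis over theta with a standard normal prior."""
    theta = np.zeros(nS)
    logp = log_likelihood(theta, states, actions) - 0.5 * theta @ theta
    samples = []
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal(nS)
        logp_prop = log_likelihood(prop, states, actions) - 0.5 * prop @ prop
        # Accept or reject the proposal (Metropolis ratio on the log scale).
        if np.log(rng.random()) < logp_prop - logp:
            theta, logp = prop, logp_prop
        samples.append(theta.copy())
    return np.array(samples)

samples = mh_sampler()
posterior_mean = samples[len(samples) // 2:].mean(axis=0)
```

Each likelihood evaluation re-solves the MDP for the proposed reward, which is what makes naive samplers expensive here and motivates the careful sampler design the abstract refers to.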


Full work available at URL: https://arxiv.org/abs/1211.5901

Cited In (4)

This page was built for publication: Bayesian Learning of Noisy Markov Decision Processes
