Bayesian Learning of Noisy Markov Decision Processes

Publication: Q4635207

DOI: 10.1145/2414416.2414420
zbMATH Open: 1384.62104
arXiv: 1211.5901
OpenAlex: W2001339644
MaRDI QID: Q4635207
FDO: Q4635207

Sumeetpal S. Singh, Nicolas Chopin, Nick Whiteley

Publication date: 16 April 2018

Published in: ACM Transactions on Modeling and Computer Simulation

Abstract: We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller on the basis of state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how the latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. The sampler includes a parameter expansion step, which is shown to be essential for its good convergence properties. As an illustration, the method is applied to learning a human controller.
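The general approach described in the abstract can be illustrated with a minimal sketch: a small finite MDP whose reward weights are latent, a softmax ("noisy") policy linking Q-values to observed actions, and a random-walk Metropolis sampler targeting the posterior over the reward weights. All names and modelling choices below (the state/action sizes, the Boltzmann likelihood, the Gaussian prior) are illustrative assumptions, not the paper's actual model; in particular, the paper's sampler additionally uses a parameter expansion move that is not reproduced here.

```python
import numpy as np

# Toy setup (assumed, not from the paper): a small MDP with known random
# dynamics; only the per-state reward weights theta are latent.
rng = np.random.default_rng(0)
nS, nA, gamma = 4, 2, 0.9

# Fixed transition kernel: P[s, a] is a distribution over next states.
P = rng.dirichlet(np.ones(nS), size=(nS, nA))

def q_values(theta, iters=200):
    """Value iteration for Q under reward r(s) = theta[s]."""
    Q = np.zeros((nS, nA))
    for _ in range(iters):
        V = Q.max(axis=1)
        Q = theta[:, None] + gamma * P @ V
    return Q

def log_likelihood(theta, states, actions, beta=2.0):
    """Log-probability of the observed actions under a softmax policy."""
    Q = beta * q_values(theta)
    logZ = np.logaddexp.reduce(Q, axis=1)
    return np.sum(Q[states, actions] - logZ[states])

# Simulate demonstration data from a "true" noisy controller.
theta_true = np.array([1.0, -1.0, 0.5, 0.0])
Q_true = 2.0 * q_values(theta_true)
states = rng.integers(nS, size=300)
probs = np.exp(Q_true - np.logaddexp.reduce(Q_true, axis=1, keepdims=True))
actions = np.array([rng.choice(nA, p=probs[s]) for s in states])

def mh_sampler(n_iter=500, step=0.2):
    """Random-walk Metropolis over theta with a standard normal prior."""
    theta = np.zeros(nS)
    logp = log_likelihood(theta, states, actions) - 0.5 * theta @ theta
    samples = []
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal(nS)
        logp_prop = log_likelihood(prop, states, actions) - 0.5 * prop @ prop
        # Accept or reject the proposal (Metropolis ratio on the log scale).
        if np.log(rng.random()) < logp_prop - logp:
            theta, logp = prop, logp_prop
        samples.append(theta.copy())
    return np.array(samples)

samples = mh_sampler()
posterior_mean = samples[len(samples) // 2:].mean(axis=0)
```

Each likelihood evaluation re-solves the MDP for the proposed reward, which is what makes naive samplers expensive here and motivates the careful sampler design the abstract refers to.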


Full work available at URL: https://arxiv.org/abs/1211.5901

Cited In (4)

This page was built for publication: Bayesian Learning of Noisy Markov Decision Processes
