scientific article
Publication:2933988
zbMath: 1317.68158; arXiv: 1304.3999; MaRDI QID: Q2933988
Bruno Scherrer, Matthieu Geist
Publication date: 8 December 2014
Full work available at URL: https://arxiv.org/abs/1304.3999
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Markov processes: estimation; hidden Markov models (62M05); Learning and adaptive systems in artificial intelligence (68T05); Markov and semi-Markov decision processes (90C40)
Related Items
Learning‐based T‐sHDP() for optimal control of a class of nonlinear discrete‐time systems
Multi-agent off-policy actor-critic algorithm for distributed multi-task reinforcement learning
Distributed consensus-based multi-agent temporal-difference learning
On Generalized Bellman Equations and Temporal-Difference Learning
Off-policy temporal difference learning with distribution adaptation in fast mixing chains