On generalized Bellman equations and temporal-difference learning
Markov chain, Markov decision process, reinforcement learning, randomized stopping time, generalized Bellman equation, approximate policy evaluation, temporal-difference method
Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20); Learning and adaptive systems in artificial intelligence (68T05); Dynamic programming (90C39); Markov and semi-Markov decision processes (90C40)