Kernel-based reinforcement learning in average-cost problems
Publication:5267044
DOI: 10.1109/TAC.2002.803530
zbMATH Open: 1364.90349
OpenAlex: W2148024708
MaRDI QID: Q5267044
FDO: Q5267044
Publication date: 20 June 2017
Published in: IEEE Transactions on Automatic Control
Full work available at URL: https://doi.org/10.1109/tac.2002.803530
- Markov and semi-Markov decision processes (90C40)
- Least squares and related methods for stochastic control systems (93E24)
Cited In (10)
- On Hoeffding and Bernstein type inequalities for sums of random variables in non-additive measure spaces and complete convergence
- Algorithms for Optimal Control of Stochastic Switching Systems
- Approximated multi-agent fitted Q iteration
- An algorithmic approach to optimal asset liquidation problems
- Mean-Field Controls with Q-Learning for Cooperative MARL: Convergence and Complexity Analysis
- Hoeffding's inequality for uniformly ergodic Markov chains
- Hoeffding's inequality for non-irreducible Markov models
- Efficient algorithms of pathwise dynamic programming for decision optimization in mining operations
- Kernel dynamic policy programming: applicable reinforcement learning to robot systems with high dimensional states
- Average cost temporal-difference learning