Exploiting the structural properties of the underlying Markov decision problem in the Q-learning algorithm
Publication:2901012
Recommendations
- Convergence of a Q-learning variant for continuous states and actions
- Q-Learning with Linear Function Approximation
- Asynchronous stochastic approximation and Q-learning
- \({\mathcal Q}\)-learning
- A stochastic approximation method with max-norm projections and its applications to the Q-learning algorithm
Cited in (6)
- Shape constraints in economics and operations research
- A stochastic approximation method with max-norm projections and its applications to the Q-learning algorithm
- A Machine Learning–Enabled Partially Observable Markov Decision Process Framework for Early Sepsis Prediction
- Counterexample explanation by learning small strategies in Markov decision processes
- Q-learning for Markov decision processes with a satisfiability criterion
- Q-learning algorithms with random truncation bounds and applications to effective parallel computing