Exploiting the structural properties of the underlying Markov decision problem in the Q-learning algorithm
Publication:2901012
DOI: 10.1287/IJOC.1070.0240; zbMATH Open: 1243.90235; OpenAlex: W2102195169; MaRDI QID: Q2901012; FDO: Q2901012
Authors: Sumit Kunnumkal, Huseyin Topaloglu
Publication date: 28 July 2012
Published in: INFORMS Journal on Computing
Full work available at URL: https://doi.org/10.1287/ijoc.1070.0240
Recommendations
- Convergence of a Q-learning variant for continuous states and actions
- Q-Learning with Linear Function Approximation
- Asynchronous stochastic approximation and Q-learning
- \({\mathcal Q}\)-learning
- A stochastic approximation method with max-norm projections and its applications to the Q-learning algorithm
Cited In (6)
- Shape constraints in economics and operations research
- A stochastic approximation method with max-norm projections and its applications to the Q-learning algorithm
- A Machine Learning–Enabled Partially Observable Markov Decision Process Framework for Early Sepsis Prediction
- Counterexample explanation by learning small strategies in Markov decision processes
- Q-learning for Markov decision processes with a satisfiability criterion
- Q-learning algorithms with random truncation bounds and applications to effective parallel computing