Q-learning for Markov decision processes with a satisfiability criterion
From MaRDI portal
Recommendations
- Exploiting the structural properties of the underlying Markov decision problem in the Q-learning algorithm
- Q-learning for distributionally robust Markov decision processes
- Learning algorithms for finite horizon constrained Markov decision processes
- Learning algorithms for Markov decision processes
- Actor-Critic--Type Learning Algorithms for Markov Decision Processes
- Q-learning and enhanced policy iteration in discounted dynamic programming
- \({\mathcal Q}\)-learning
- Reinforcement learning in robust Markov decision processes
Cites work
- scientific article; zbMATH DE number 5869530
- scientific article; zbMATH DE number 3855514
- scientific article; zbMATH DE number 5348356
- An analog of the minimax theorem for vector payoffs
- Approachability in Stackelberg stochastic games with vector costs
- Approachable sets of vector payoffs in stochastic games
- Asynchronous Stochastic Approximations
- Asynchronous stochastic approximation with differential inclusions
- Dynamical systems and variational inequalities
- Estimation and control in discounted stochastic dynamic programming
- Evolutionary Games and Population Dynamics
- Guaranteed performance regions in Markovian systems with competing decision makers
- Learning algorithms for Markov decision processes with average cost
- Multiplicative updates outperform generic no-regret learning in congestion games (extended abstract)
- Stochastic Approximations and Differential Inclusions
- Stochastic Approximations and Differential Inclusions, Part II: Applications
- Stochastic approximation with two time scales
- Structural Properties of Optimal Transmission Policies Over a Randomly Varying Channel
- Survey of Measurable Selection Theorems
- The Borkar-Meyn theorem for asynchronous stochastic approximations
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- The multiplicative weights update method: a meta-algorithm and applications
- The projection dynamic and the geometry of population games
Cited in
- \(L^\ast\)-based learning of Markov decision processes (extended version)
- scientific article; zbMATH DE number 5670432
- Prospect-theoretic Q-learning
- Revisiting SIR in the age of COVID-19: explicit solutions and control problems
- Counterexample explanation by learning small strategies in Markov decision processes
MaRDI item Q1749413