Approximate Q Learning for Controlled Diffusion Processes and Its Near Optimality

DOI: 10.1137/22m1484201
zbMath: 1521.93214
arXiv: 2203.07499
OpenAlex: W4385162439
MaRDI QID: Q6136230

Publication date: 29 August 2023

Published in: SIAM Journal on Mathematics of Data Science

Full work available at URL: https://arxiv.org/abs/2203.07499

zbMATH Keywords

diffusion processes
reinforcement learning
MDP (Markov decision process)

Mathematics Subject Classification ID

Optimal stochastic control (93E20)
Stochastic learning and adaptive control (93E35)
Diffusion processes (60J60)
Markov and semi-Markov decision processes (90C40)

Related Items (2)

Continuity of cost in Borkar control topology and implications on discrete space and time approximations for controlled diffusions under several criteria
Approximate Q Learning for Controlled Diffusion Processes and Its Near Optimality

Cites Work

Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design
Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach
Markov chains and stochastic stability
Approximating value functions for controlled degenerate diffusion processes by using piece-wise constant policies
Asynchronous stochastic approximation and Q-learning
On the rate of convergence of finite-difference approximations for Bellman's equations with variable coefficients
\(\mathcal{Q}\)-learning
Neural networks-based backward scheme for fully nonlinear PDEs
Improved order 1/4 convergence for piecewise constant policy approximation of stochastic control problems
Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
Policy iterations for reinforcement learning problems in continuous time and space -- fundamental theory and methods
Variational estimation of the drift for stochastic differential equations from the empirical density
Algorithms for Reinforcement Learning
On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
An analysis of temporal-difference learning with function approximation
On the convergence rate of approximation schemes for Hamilton-Jacobi-Bellman Equations
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
Error Bounds for Monotone Approximation Schemes for Hamilton-Jacobi-Bellman Equations
Continuous‐time mean–variance portfolio selection: A reinforcement learning framework
Approximate Q Learning for Controlled Diffusion Processes and Its Near Optimality