Deep Q‐learning: A robust control approach
From MaRDI portal
Publication:6136628
Abstract: In this paper, we place deep Q-learning into a control-oriented perspective and study its learning dynamics with well-established techniques from robust control. We formulate an uncertain linear time-invariant model by means of the neural tangent kernel to describe learning. We show the instability of learning and analyze the agent's behavior in the frequency domain. Then, we ensure convergence via robust controllers acting as dynamical rewards in the loss function. We synthesize three controllers: a state-feedback gain-scheduling \(H_2\) controller, a dynamic \(H_\infty\) controller, and a constant-gain \(H_\infty\) controller. Setting up the learning agent with a control-oriented tuning methodology is more transparent and rests on well-established literature, compared to the heuristics of reinforcement learning. In addition, our approach uses neither a target network nor a randomized replay memory. The role of the target network is taken over by the control input, which also exploits the temporal dependency of samples (as opposed to a randomized memory buffer). Numerical simulations in different OpenAI Gym environments suggest that the \(H_\infty\)-controlled learning performs slightly better than double deep Q-learning.
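For orientation, the baseline the paper departs from is the standard semi-gradient Q-learning update, whose instability the authors analyze before replacing the target network with a robust control input. A minimal sketch of that baseline update, using a linear Q-function for simplicity (all names here are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

# Generic semi-gradient Q-learning step on a linear Q-function
# Q(s, a) = w[a] @ phi(s). Illustrative only: the paper analyzes this
# kind of update and stabilizes it with a robust controller; the names
# and linear parameterization below are assumptions for the sketch.

def td_step(w, phi_s, a, r, phi_s_next, gamma=0.99, lr=0.1, done=False):
    """One temporal-difference update of the weights w (n_actions x n_features).

    Returns the updated weights and the TD error driving the update.
    """
    # Bootstrapped target: max over next-state action values (no target network)
    q_next = 0.0 if done else np.max(w @ phi_s_next)
    td_error = r + gamma * q_next - w[a] @ phi_s
    w = w.copy()
    w[a] += lr * td_error * phi_s  # semi-gradient step toward the TD target
    return w, td_error
```

Because the bootstrapped target moves with the same weights being updated, iterating this step can diverge; the paper's contribution is to model that learning dynamics as an uncertain LTI system (via the neural tangent kernel) and stabilize it with synthesized \(H_2\)/\(H_\infty\) controllers instead of a target network.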
Recommendations
- Robust reinforcement learning control with static and dynamic stability
- Model-free LQR design by Q-function learning
- Robust control under worst-case uncertainty for unknown nonlinear systems using modified reinforcement learning
- Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach
- Continuous-time reinforcement learning for robust control under worst-case uncertainty
Cites work
- scientific article; zbMATH DE number 3435336 (no title available)
- Beyond singular values and loop shapes
- Linear systems theory
- Reinforcement learning. An introduction
- Robust control under worst-case uncertainty for unknown nonlinear systems using modified reinforcement learning
- Robust reinforcement learning control with static and dynamic stability
- Wide neural networks of any depth evolve as linear models under gradient descent
- \(H_\infty\) tracking control for linear discrete-time systems via reinforcement learning
Cited in (6)
- Proximal policy optimization‐based controller for chaotic systems
- Complementary reward function based learning enhancement for deep reinforcement learning
- Reinforcement learning in the task of spherical robot motion control
- A comparative analysis of reinforcement learning and adaptive control techniques for linear uncertain systems
- Deep ensemble reinforcement learning with multiple deep deterministic policy gradient algorithm
- Safe reinforcement learning-based control using deep deterministic policy gradient algorithm and slime mould algorithm with experimental tower crane system validation