A projected primal-dual gradient optimal control method for deep reinforcement learning
neural networks; optimal control; necessary optimality conditions; reinforcement learning; Markov decision process
Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20)
Optimality conditions for problems involving ordinary differential equations (49K15)
Markov and semi-Markov decision processes (90C40)
Stochastic learning and adaptive control (93E35)
Networks and circuits as models of computation; circuit complexity (68Q06)
- Deep learning as optimal control problems: models and numerical methods
- Dynamical Systems and Optimal Control Approach to Deep Learning
- A mean-field optimal control formulation of deep learning
- Deep neural networks algorithms for stochastic control problems on finite horizon: convergence analysis
- A generalized path integral control approach to reinforcement learning
- scientific article; zbMATH DE number 1807508
- scientific article; zbMATH DE number 3167340
- scientific article; zbMATH DE number 4080789
- scientific article; zbMATH DE number 3222422
- Deep learning as optimal control problems: models and numerical methods
- Handbook of Markov decision processes. Methods and applications
- Optimal control of ODEs and DAEs
- Ordinary differential equations. An introduction from the dynamical systems perspective
- Reinforcement Learning Applied to a Human Arm Model
- Reinforcement learning. An introduction
- Robust optimization
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- \({\mathcal Q}\)-learning
- A mean-field optimal control formulation of deep learning
- Value-Gradient Based Formulation of Optimal Control Problem and Machine Learning Algorithm
- Primal-Dual Q-Learning Framework for LQR Design
- Jointly learning environments and control policies with projected stochastic gradient ascent
- Pretty darn good control: when are approximate solutions better than approximate models