Bellman's principle of optimality and deep reinforcement learning for time-varying tasks
From MaRDI portal
Publication:5043501
Recommendations
- Reinforcement Learning and Stochastic Optimization
- The Bellman's principle of optimality in the discounted dynamic programming
- From reinforcement learning to optimal control: a unified framework for sequential decisions
- Time-varying policy rule under learning
- Approximate dynamic programming via iterated Bellman inequalities
- Policy iterations for reinforcement learning problems in continuous time and space -- fundamental theory and methods
Cites work
- scientific article (no title available); zbMATH DE number 700091
- Adaptive dynamic programming for discrete-time linear quadratic regulation based on multirate generalised policy iteration
- Allocating resources via price management systems: a dynamic programming-based approach
- An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games
- Dynamic programming and optimal control. Vol. 1.
- Modelling and solving resource allocation problems via a dynamic programming approach
- Multi-objective reinforcement learning using sets of Pareto dominating policies
- Reinforcement learning. An introduction
Cited in (2)