A convex optimization approach to dynamic programming in continuous state and action spaces
From MaRDI portal
Abstract: In this paper, a convex optimization-based method is proposed for numerically solving dynamic programs in continuous state and action spaces. The key idea is to approximate the output of the Bellman operator at a particular state by the optimal value of a convex program. The approximate Bellman operator has a computational advantage because it involves a convex optimization problem in the case of control-affine systems and convex costs. Using this feature, we propose a simple dynamic programming algorithm that evaluates the approximate value function at pre-specified grid points by solving convex optimization problems in each iteration. We show that the proposed method approximates the optimal value function with a uniform convergence property in the case of convex optimal value functions. We also propose an interpolation-free design method for a control policy whose performance converges uniformly to the optimum as the grid resolution becomes finer. When a nonlinear control-affine system is considered, the convex optimization approach provides an approximate policy with a provable suboptimality bound. For general cases, the proposed convex formulation of dynamic programming operators can be modified into a nonconvex bi-level program, in which the inner problem is a linear program, without losing the uniform convergence properties.
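The grid-based procedure described in the abstract can be illustrated with a minimal sketch: at each grid point, the Bellman update is computed by minimizing a convex objective over the action, and the value function is interpolated between grid points. The system (a scalar control-affine dynamics x' = a·x + b·u with quadratic cost), the grid bounds, and the discount factor below are all hypothetical choices for illustration, not the paper's specific setup.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical 1-D control-affine system x' = a*x + b*u with convex stage cost.
a, b, gamma = 0.9, 0.5, 0.95
def cost(x, u):
    return x**2 + u**2

xs = np.linspace(-2.0, 2.0, 41)   # pre-specified grid points
V = np.zeros_like(xs)             # initial value-function estimate

for _ in range(200):              # approximate value iteration
    V_new = np.empty_like(V)
    for i, x in enumerate(xs):
        # For fixed x the Bellman objective is convex in u
        # (convex cost + linearly interpolated value along an affine map).
        def obj(u):
            xn = np.clip(a * x + b * u, xs[0], xs[-1])
            return cost(x, u) + gamma * np.interp(xn, xs, V)
        res = minimize_scalar(obj, bounds=(-1.0, 1.0), method="bounded")
        V_new[i] = res.fun
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new
```

The resulting array `V` approximates the optimal value function on the grid; for this symmetric quadratic example it is nonnegative and minimized near x = 0, consistent with the uniform-convergence guarantee the paper establishes for convex optimal value functions.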
Recommendations
- Continuous state dynamic programming via nonexpansive approximation
- A discrete approximate iteration method for the continuing convex dynamic programming
- A polyhedral approximation approach to concave numerical dynamic programming
- A new algorithm for the continuous dynamic programming
- Quadratic approximate dynamic programming for input-affine systems
Cites work
- scientific article (zbMATH DE number 3722445)
- scientific article (zbMATH DE number 1240224)
- scientific article (zbMATH DE number 1734433)
- scientific article (zbMATH DE number 2107836)
- A Universal Empirical Dynamic Programming Algorithm for Continuous State MDPs
- A dynamic game approach to distributionally robust safety specifications for stochastic systems
- Algorithms for reinforcement learning.
- An efficient DP algorithm on a tree-structure for finite horizon optimal control problems
- An optimal one-way multigrid algorithm for discrete-time stochastic control
- Approximation of Markov decision processes with general state space
- Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities
- Approximations of Dynamic Programs, I
- Convergence of Dynamic Programming Models
- Convergence of discretization procedures in dynamic programming
- Convex optimization algorithms
- Discretization procedures for adaptive Markov control processes
- Finite linear programming approximations of constrained discounted Markov decision processes
- Finite-time bounds for fitted value iteration
- Lectures on convex optimization
- Numerical Solution of Continuous-State Dynamic Programs Using Linear and Spline Interpolation
- On Curves of Minimal Length with a Constraint on Average Curvature, and with Prescribed Initial and Terminal Positions and Tangents
- On the asymptotic optimality of finite approximations to Markov decision processes with Borel spaces
- Probabilistic error analysis for some approximation schemes to optimal control problems
- Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems
- Reach set computation and control synthesis for discrete-time dynamical systems with disturbances
- Reinforcement learning. An introduction
- Semi-Lagrangian schemes for Hamilton-Jacobi equations, discrete representation formulae and Godunov methods
- Semidefinite Approximations of Reachable Sets for Discrete-time Polynomial Systems
- Using Randomization to Break the Curse of Dimensionality
Cited in (8)
- A new algorithm for multidimensional continuing dynamic programming
- A discrete approximate iteration method for the continuing convex dynamic programming
- Quadratic approximate dynamic programming for input-affine systems
- Using Convex Switching Techniques for Partially Observable Decision Processes
- Multitime dynamic programming for curvilinear integral actions
- scientific article (zbMATH DE number 2221684)
- Dynamic programming with convexity, concavity and sparsity
- Maximizing the set of recurrent states of an MDP subject to convex constraints