A convex optimization approach to dynamic programming in continuous state and action spaces
From MaRDI portal
Publication: 831365
DOI: 10.1007/S10957-020-01747-1 · zbMATH Open: 1467.90085 · arXiv: 1810.03847 · OpenAlex: W3104081151 · MaRDI QID: Q831365
Authors: Insoon Yang
Publication date: 11 May 2021
Published in: Journal of Optimization Theory and Applications
Abstract: In this paper, a convex optimization-based method is proposed for numerically solving dynamic programs in continuous state and action spaces. The key idea is to approximate the output of the Bellman operator at a particular state by the optimal value of a convex program. The approximate Bellman operator has a computational advantage because it involves a convex optimization problem in the case of control-affine systems and convex costs. Using this feature, we propose a simple dynamic programming algorithm to evaluate the approximate value function at pre-specified grid points by solving convex optimization problems in each iteration. We show that the proposed method approximates the optimal value function with a uniform convergence property in the case of convex optimal value functions. We also propose an interpolation-free design method for a control policy, whose performance converges uniformly to the optimum as the grid resolution becomes finer. When a nonlinear control-affine system is considered, the convex optimization approach provides an approximate policy with a provable suboptimality bound. For general cases, the proposed convex formulation of dynamic programming operators can be modified into a nonconvex bi-level program, in which the inner problem is a linear program, without losing uniform convergence properties.
Full work available at URL: https://arxiv.org/abs/1810.03847
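To make the abstract's key idea concrete, the following is a minimal illustrative sketch, not the paper's actual algorithm: it runs value iteration on a fixed grid, fits a convex max-of-affine approximation of the value function, and evaluates the approximate Bellman operator at each grid point by solving a convex program with cvxpy. The one-dimensional control-affine system, stage cost, discount factor, grid, and fitting step are all assumed placeholders chosen for illustration.

```python
# Illustrative sketch only: grid-based value iteration where each Bellman update
# is the optimal value of a convex program (control-affine dynamics, convex cost).
import numpy as np
import cvxpy as cp

# Hypothetical 1-D control-affine system x' = f(x) + B(x)*u and quadratic cost.
f = lambda x: 0.9 * x        # drift term f(x)
B = lambda x: 1.0            # input matrix B(x)
gamma = 0.95                 # discount factor
u_max = 1.0                  # box constraint on the input

grid = np.linspace(-2.0, 2.0, 21)   # pre-specified grid points
V = np.zeros_like(grid)             # value estimates at the grid points

def bellman_at(x, slopes, intercepts):
    """Approximate Bellman operator at state x as a convex program in u.
    The next-state value is modeled as max_i (slopes[i]*x' + intercepts[i]),
    handled via an epigraph variable t, so the problem stays convex."""
    u = cp.Variable()
    t = cp.Variable()
    x_next = f(x) + B(x) * u                       # affine in u
    cons = [t >= slopes[i] * x_next + intercepts[i] for i in range(len(slopes))]
    cons.append(cp.abs(u) <= u_max)
    prob = cp.Problem(cp.Minimize(x**2 + cp.square(u) + gamma * t), cons)
    prob.solve()
    return prob.value

for _ in range(50):
    # Crude convex max-of-affine fit from the grid samples (supporting lines
    # via finite differences); valid only when the value estimates are convex.
    slopes = np.gradient(V, grid)
    intercepts = V - slopes * grid
    V = np.array([bellman_at(x, slopes, intercepts) for x in grid])
```

Each iteration solves one small convex program per grid point, which mirrors the computational structure described in the abstract; the paper itself establishes uniform convergence guarantees and policy suboptimality bounds that this toy sketch does not attempt to reproduce.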
Recommendations
- Continuous state dynamic programming via nonexpansive approximation
- A discrete approximate iteration method for the continuing convex dynamic programming
- A polyhedral approximation approach to concave numerical dynamic programming
- A new algorithm for the continuous dynamic programming
- Quadratic approximate dynamic programming for input-affine systems
Mathematics Subject Classification:
- Convex programming (90C25)
- Dynamic programming (90C39)
- Dynamic programming in optimal control and differential games (49L20)
Cites Work
- Title not available
- Semi-Lagrangian schemes for Hamilton-Jacobi equations, discrete representation formulae and Godunov methods
- Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems
- Title not available
- On Curves of Minimal Length with a Constraint on Average Curvature, and with Prescribed Initial and Terminal Positions and Tangents
- Reinforcement learning. An introduction
- Finite linear programming approximations of constrained discounted Markov decision processes
- Algorithms for reinforcement learning.
- Title not available
- Convergence of Dynamic Programming Models
- An optimal one-way multigrid algorithm for discrete-time stochastic control
- Convergence of discretization procedures in dynamic programming
- Approximations of Dynamic Programs, I
- Using Randomization to Break the Curse of Dimensionality
- Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities
- Finite-time bounds for fitted value iteration
- Numerical Solution of Continuous-State Dynamic Programs Using Linear and Spline Interpolation
- Approximation of Markov decision processes with general state space
- Title not available
- Reach set computation and control synthesis for discrete-time dynamical systems with disturbances
- On the asymptotic optimality of finite approximations to Markov decision processes with Borel spaces
- Convex optimization algorithms
- Lectures on convex optimization
- An efficient DP algorithm on a tree-structure for finite horizon optimal control problems
- Discretization procedures for adaptive Markov control processes
- A dynamic game approach to distributionally robust safety specifications for stochastic systems
- Probabilistic error analysis for some approximation schemes to optimal control problems
- A Universal Empirical Dynamic Programming Algorithm for Continuous State MDPs
- Semidefinite Approximations of Reachable Sets for Discrete-time Polynomial Systems
Cited In (8)
- A new algorithm for multidimensional continuing dynamic programming
- A discrete approximate iteration method for the continuing convex dynamic programming
- Multitime dynamic programming for curvilinear integral actions
- Quadratic approximate dynamic programming for input-affine systems
- Title not available
- Dynamic programming with convexity, concavity and sparsity
- Maximizing the set of recurrent states of an MDP subject to convex constraints
- Using Convex Switching Techniques for Partially Observable Decision Processes
This page was built for publication: A convex optimization approach to dynamic programming in continuous state and action spaces