Adaptive dynamic programming for control. Algorithms and stability (Q694558)

Adaptive Dynamic Programming (ADP) is a recursive method (algorithm) that aims to circumvent the classical computational and theoretical difficulties in solving the Hamilton-Jacobi-Bellman equation for an optimal control problem. This book is devoted to that subject.

From Chapter 1. ``A solution to the ADP formulation is obtained through a neural-network-based adaptive critic approach.'' ``Heuristic Dynamic Programming (HDP) is the most basic and widely applied structure of ADP. In HDP, the critic network will give an estimation of the cost function \(J\)''. ``In (the presented) HDP structure, there are two critic networks. During the ADP algorithm based on HDP, there are two iteration loops, i.e., an outer iteration loop and an inner iteration loop. The weights of the first critic network \(1\) are updated at each outer loop iteration step, and the weights of the second critic network are updated at each inner loop iteration step. During the inner loop iteration, the weights of the first critic network are kept unchanged. Once the whole inner loop iteration process is finished, the weights of the second critic network are transferred to the first critic network. The output of the second critic network is \(\hat J\), which is the estimate of \(J\).''

From the Preface. ``In the recent years, ADP algorithms have gained much attention from researchers in control fields. However, with the development of ADP algorithms, more and more people want to know the answers to the following questions: {\parindent=6mm \begin{itemize}
\item[(1)] Are ADP algorithms convergent?
\item[(2)] Can the algorithm stabilize a nonlinear plant?
\item[(3)] Can the algorithm be run on-line?
\item[(4)] Can the algorithm be implemented in a finite time horizon?
\item[(5)] If the answer to the first question is positive, the subsequent questions are where the algorithm converges to, and how large the error is.
\end{itemize}} Before ADP algorithms can be applied to real plants, these questions need to be answered first. Throughout this book, we will study these questions and give specific answers to each question''.

``Why this book?'' ``First, the types of system involved in this monograph are rather extensive. From the point of view of models, one can find affine nonlinear systems, non-affine nonlinear systems, switched nonlinear systems, singularly perturbed systems and time-delay nonlinear systems''. ``Second, since the monograph is a summary of recent research works of the authors, the methods presented here for stabilizing, tracking, and games ... are more advanced than those appearing in introductory books''. ``Last but not least, some rather unique contributions are included in this monograph. One notable feature is the implementation of finite horizon optimal control for discrete-time nonlinear system ... another notable feature is that a pair of mixed optimal policies is developed to solve nonlinear games for the first time when the saddle point does not exist.''

Contents. 2 Optimal State Feedback Control for Discrete-Time Systems. 3 Optimal Tracking Control for Discrete-Time Systems. 4 Optimal State Feedback Control of Nonlinear Systems with Time Delays. 5 Optimal Tracking Control of Nonlinear Systems with Time Delays. 6 Optimal Feedback Control for Continuous-Time Systems via ADP. 7 Several Special Optimal Feedback Control Designs Based on ADP. 8 Zero-Sum Games for Discrete-Time Systems Based on Model-Free ADP. 9 Nonlinear Games for a Class of Continuous-Time Systems Based on ADP. 10 Other Applications of ADP. A list of references is given at the end of each chapter.
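
The two-loop HDP structure quoted from Chapter 1 can be illustrated with a toy sketch. This is not the book's algorithm or its neural-network implementation: it assumes a scalar linear plant \(x_{k+1} = a x_k + b u_k\), stage cost \(x^2 + u^2\), and a one-parameter quadratic critic \(\hat J(x) = w x^2\) in place of a network. The function name, sample states, and step size are all hypothetical choices made for illustration.

```python
import math


def hdp_scalar_lqr(a=1.0, b=1.0, outer_steps=50, inner_steps=200, lr=0.5):
    """Toy two-critic HDP-style iteration (illustrative assumptions only).

    Plant: x_next = a*x + b*u.  Stage cost: x^2 + u^2.
    Critic model: J_hat(x) = w * x^2 (a single weight, not a neural network).
    """
    samples = [-1.0, -0.5, 0.5, 1.0]  # hypothetical training states
    w1 = 0.0  # first critic: updated once per outer iteration
    w2 = 0.0  # second critic: updated at every inner iteration

    for _ in range(outer_steps):
        # Greedy control with respect to the first critic:
        # u = argmin_u [x^2 + u^2 + w1 * (a*x + b*u)^2] = -gain * x
        gain = w1 * a * b / (1.0 + w1 * b * b)

        # Targets are built from the first critic, which stays fixed
        # during the whole inner loop.
        targets = []
        for x in samples:
            u = -gain * x
            x_next = a * x + b * u
            targets.append(x * x + u * u + w1 * x_next * x_next)

        # Inner loop: gradient descent on the second critic's weight,
        # minimising the mean squared error to the targets.
        for _ in range(inner_steps):
            grad = sum((w2 * x * x - t) * x * x
                       for x, t in zip(samples, targets)) / len(samples)
            w2 -= lr * grad

        # Inner loop finished: transfer the second critic's weight
        # to the first critic.
        w1 = w2

    return w1


if __name__ == "__main__":
    w = hdp_scalar_lqr()
    # For a = b = 1 the discrete-time Riccati equation reduces to
    # p^2 - p - 1 = 0, whose positive root is the golden ratio.
    print(w, (1.0 + math.sqrt(5.0)) / 2.0)
```

In this linear-quadratic special case the outer loop reduces to value iteration on the scalar Riccati equation, so the learned weight can be checked against its known positive solution. The nonlinear, neural-network setting treated in the book does not admit such a closed-form check.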

reviewed by

Fabio Bagagiolo

zbMATH Keywords

adaptive dynamic programming

optimal control

Hamilton-Jacobi-Bellman equations

discrete-time