Finite approximations in discrete-time stochastic control. Quantized models and asymptotic optimality (Q1746073)
Property / full work available at URL: https://doi.org/10.1007/978-3-319-79033-6
Property / OpenAlex ID: W4253349640
Language | Label | Description | Also known as
---|---|---|---
English | Finite approximations in discrete-time stochastic control. Quantized models and asymptotic optimality | scientific article |
Statements
Finite approximations in discrete-time stochastic control. Quantized models and asymptotic optimality (English)
23 April 2018
The monograph constitutes an extremely valuable addition to the literature on approximate dynamic programming. It expounds, in a unified framework, numerous original results that the three authors have obtained on the approximation of centralized and decentralized discrete-time stochastic control problems with uncountable state, measurement, and action spaces by means of quantization methods. The work reported therein, originally scattered over journal and conference papers, is extensive: it ranges from different optimality criteria (discounted-cost and average-cost problems) to cost-constrained models, partially observed processes, and decentralized stochastic control problems.

The authors adopt quite an abstract setting based on Borel state and action spaces, but this does not mean that the monograph can be of interest only to researchers concerned with advanced stochastic control theory. This generality is an invaluable asset, as the framework can be applied in many nonstandard settings encountered in theory and applications; the authors exploit it themselves to develop approximation methods for partially observed Markov decision processes. The book is very well written, with a focus on clarity that reflects the authors' great mathematical erudition. On the one hand, no algorithms are included (only some simple numerical examples complement the presented theoretical results), but these can be found in other contributions. On the other hand, the book provides a rigorous and flexible theoretical basis for the presented discretization methods and the attendant convergence characterizations. The authors show that quantization, which reduces a system with Borel spaces to one with finite state, measurement, and action spaces, constitutes a highly constructive method that is essentially independent of the system under consideration. What is more, they derive bounds on the approximation performance. Compared with the bulk of the existing literature, the presented analysis requires quite relaxed regularity conditions, such as weak continuity of the controlled stochastic kernel. Whenever an existence result is established, the appropriate approximation result follows, typically with no additional assumptions.

The monograph is organized into two parts. The first (Chapters 2 to 6) concerns classical stochastic control problems with a single decision maker. The second (Chapters 7 to 9) focuses on multi-agent versions of the approximation problems emerging from decentralized stochastic control problems and their applications. Chapter 2 constitutes a prelude to all the subsequent chapters, introducing the stochastic control problems considered; specifically, Markov decision processes (MDPs), partially observed Markov decision processes (POMDPs), and constrained MDPs are reviewed, along with characterizations of optimal stationary policies in Markov decision theory. Chapter 3 presents results on finite-action approximation of stationary policies for an MDP under strong and weak continuity assumptions on the transition probability. The authors demonstrate that quantized policies can approximate optimal deterministic stationary policies with arbitrary precision; explicit bounds on the approximation error are additionally derived in terms of the number of points used to discretize the action space (a toy sketch of this kind of action quantization is given below).
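To give the action-quantization idea of Chapter 3 a concrete shape, here is a minimal Python sketch, assuming a one-dimensional action space [0, 1], a uniform grid, and a nearest-neighbour quantizer; all names and parameter choices are the reviewer's illustrative assumptions, not constructions taken from the book.

```python
import numpy as np

def uniform_grid(n_points, low=0.0, high=1.0):
    """Finite action set: n_points uniformly spaced levels in [low, high]."""
    return np.linspace(low, high, n_points)

def quantize_policy(policy, grid):
    """Map a stationary policy x -> a onto the finite action set via a
    nearest-neighbour quantizer.  Under continuity assumptions of the kind
    used in Chapter 3, the cost of the quantized policy converges to that
    of the original policy as the grid is refined."""
    def quantized(x):
        action = policy(x)
        return grid[np.argmin(np.abs(grid - action))]
    return quantized

# Illustrative use: a linear policy on [0, 1] quantized to 16 action levels.
grid = uniform_grid(16)
pi_q = quantize_policy(lambda x: 0.5 * x + 0.25, grid)
print(pi_q(0.3))  # nearest grid level to the continuous action 0.4
```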
In turn, Chapter 4 investigates finite-state approximation of MDPs. Under continuity conditions imposed on the one-stage cost function and the transition probability, it is demonstrated that a stationary policy obtained from the finite model, constructed by quantizing the state space of the original system on a finite grid, can approximate the optimal stationary policy with arbitrary precision. For MDPs with compact state space, explicit convergence rates are derived, which quantify how the approximation improves as the number of grid points increases.

In Chapter 5, finite-model approximations of POMDPs under the discounted cost criterion are studied. The original partially observed stochastic control problem is converted to a fully observed one on the belief space, and finite models are then obtained through uniform quantization of the state and action spaces of the belief-space MDP. The policies obtained from these finite models are nearly optimal for the belief-space MDP and, consequently, for the original partially observed problem. Chapter 6 deals with finite-state approximation of a discrete-time constrained MDP with compact state space, for both the discounted and average cost criteria. Using a linear programming formulation of the constrained discounted problem, convergence of the optimal value function of the finite-state model to the optimal value function of the original model is proven. Under additional continuity conditions on the transition probability of the original model, a method to compute approximate optimal policies is proposed (a toy value-iteration sketch on such a finite model follows this review).

Chapter 7 provides preliminaries on decentralized stochastic control. Chapter 8 presents results on finite-model approximation of a multi-agent stochastic control problem (the team decision problem). The strategies obtained from finite models are shown to approximate the optimal cost with arbitrary precision; in particular, quantized team policies are shown to be asymptotically optimal. Finally, Chapter 9 presents and discusses the results of applying the proposed methodology to Witsenhausen's counterexample and the Gaussian relay channel problem.

Overall, although the material of this monograph is rather advanced, the presentation style is very clear, compact, and relatively easy to follow, while at the same time mathematically rigorous. The monograph is a good piece of work on a subject that attracts considerable attention. Both researchers and professionals in applied mathematics will find this book very useful. It can also be recommended as a valuable reference text on approximate dynamic programming.
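As promised above, here is a minimal sketch of how one might solve the kind of finite model produced by state-space quantization (Chapters 4 and 6): value iteration for the discounted criterion. The data layout, the discount factor, and all names are illustrative assumptions by the reviewer; the book itself proves convergence properties of such finite models rather than prescribing algorithms.

```python
import numpy as np

def value_iteration(P, c, beta=0.9, tol=1e-8, max_iter=10_000):
    """Discounted value iteration on a finite model.

    P    -- list of |A| row-stochastic |S| x |S| transition matrices,
            one per quantized action
    c    -- |S| x |A| one-stage cost matrix on the quantized grid
    beta -- discount factor in (0, 1)

    Returns the optimal value function and a deterministic stationary
    policy of the finite model; extended back to the original state space
    (e.g. piecewise constantly over the quantization cells), such a policy
    is near-optimal under continuity conditions of the kind the book uses.
    """
    n_states, n_actions = c.shape
    J = np.zeros(n_states)
    for _ in range(max_iter):
        # Q[i, a] = c[i, a] + beta * E[J(next state) | state i, action a]
        Q = c + beta * np.stack([P[a] @ J for a in range(n_actions)], axis=1)
        J_next = Q.min(axis=1)
        if np.max(np.abs(J_next - J)) < tol:
            return J_next, Q.argmin(axis=1)
        J = J_next
    return J, Q.argmin(axis=1)

# Illustrative 2-state, 2-action finite model.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.5, 0.5], [0.6, 0.4]])]
c = np.array([[1.0, 0.5], [2.0, 1.5]])
J, policy = value_iteration(P, c)
print(J, policy)
```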
discrete-time stochastic control
asymptotic optimality
decentralized stochastic control
Markov decision process