Simulation-based algorithms for Markov decision processes (Q1946768)
From MaRDI portal
Property / author: Steven I. Marcus
Property / full work available at URL: https://doi.org/10.1007/978-1-4471-5022-0
Property / OpenAlex ID: W4302608015
scientific article
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Simulation-based algorithms for Markov decision processes | scientific article | |
Statements
Simulation-based algorithms for Markov decision processes (English)
16 April 2013
The monograph is devoted to Markov decision process (MDP) models, which are widely used for modeling sequential decision-making problems arising in engineering, economics, computer science, and the social sciences. It is the second, extended edition of the book first published over six years ago, and it presents the latest developments in the theory and the relevant algorithms developed by the authors in the MDP field.

The book consists of five chapters. Chapter 1 gives a formal description of the discounted-reward MDP framework, covering both the finite- and infinite-horizon settings and summarizing the associated optimality equations. Chapter 2 presents simulation-based algorithms for estimating the optimal value function in finite-horizon MDPs with large (possibly uncountable) state spaces, where the usual techniques of policy iteration and value iteration are computationally impractical or infeasible to implement. Chapter 3 is devoted to infinite-horizon problems and evolutionary approaches for finding an optimal policy. Chapter 4 presents a global optimization approach called Model Reference Adaptive Search (MRAS), which provides a broad framework for updating a probability distribution over the solution space in a way that ensures convergence to an optimal solution. In Chapter 5 the authors consider the approximate solution of MDPs with large state/action spaces in an online manner, combining a rolling-horizon approach with simulation.

This well-written book is addressed to researchers in MDPs and applied modeling with an interest in numerical computation, but it is also accessible to graduate students in operations research, computer science, and economics. The authors give pseudocode for many algorithms, along with numerical examples, convergence analyses, and bibliographical notes, all of which help readers understand the ideas presented in the book and perform experiments on their own.
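For context, the classical dynamic-programming baseline that the book's simulation-based methods aim to replace solves the Bellman optimality equation V(s) = max_a [R(s,a) + γ Σ_{s'} P(s'|s,a) V(s')] by value iteration, which is practical only for small state spaces. A minimal illustrative sketch follows; the two-state MDP and all numbers are hypothetical, not taken from the book:

```python
# Value-iteration sketch for a discounted-reward MDP (illustrative only).
# States 0, 1; actions 0, 1. P[s][a] = list of (next_state, prob); R[s][a] = reward.
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 1.0)]},
    1: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
}
R = {0: {0: 1.0, 1: 0.0}, 1: {0: 2.0, 1: 0.5}}
gamma = 0.9  # discount factor

V = {s: 0.0 for s in P}
for _ in range(1000):  # iterate the Bellman optimality operator to a fixed point
    V_new = {
        s: max(R[s][a] + gamma * sum(p * V[t] for t, p in P[s][a]) for a in P[s])
        for s in P
    }
    if max(abs(V_new[s] - V[s]) for s in P) < 1e-10:
        V = V_new
        break
    V = V_new

# Greedy policy with respect to the (near-)optimal value function.
policy = {
    s: max(P[s], key=lambda a: R[s][a] + gamma * sum(p * V[t] for t, p in P[s][a]))
    for s in P
}
```

Each sweep costs O(|S|² |A|), which is exactly what becomes prohibitive for the large or uncountable state spaces treated in Chapter 2, motivating sampling-based estimation of the value function instead.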
Markov decision process
multi-stage adaptive sampling
population-based evolutionary method
model reference adaptive search
simulation
optimal policy