Simulation-based algorithms for Markov decision processes (Q1946768)

From MaRDI portal
scientific article

    Statements

    Simulation-based algorithms for Markov decision processes (English)
    16 April 2013
    The monograph is devoted to Markov decision process (MDP) models, which are widely used for modeling sequential decision-making problems. Such problems arise in engineering, economics, computer science and the social sciences. The monograph is the second, extended edition of a book first published over six years ago. It presents the latest developments in the theory and the relevant algorithms developed by the authors in the MDP field. The book consists of five chapters. Chapter 1 gives a formal description of the discounted-reward MDP framework, covering both the finite- and infinite-horizon settings, and summarizes the associated optimality equations. Chapter 2 presents simulation-based algorithms for estimating the optimal value function in finite-horizon MDPs with large (possibly uncountable) state spaces, where the usual techniques of policy iteration and value iteration are either computationally impractical or infeasible to implement. Chapter 3 is devoted to infinite-horizon problems and evolutionary approaches for finding an optimal policy. Chapter 4 presents a global optimization approach called Model Reference Adaptive Search (MRAS), which provides a broad framework for updating a probability distribution over the solution space in a way that ensures convergence to an optimal solution. Chapter 5 considers approximate rolling-horizon control of MDPs with large state/action spaces, carried out online by simulation. This well-written book is addressed to researchers in MDPs and applied modeling with an interest in numerical computation, but it is also accessible to graduate students in operations research, computer science, and economics. The authors give many pseudocode listings of algorithms, numerical examples, convergence analyses of the algorithms, and bibliographical notes, which can help readers understand the ideas presented in the book and perform experiments on their own.
    Markov decision process
    multi-stage adaptive sampling
    population-based evolutionary method
    model reference adaptive search
    simulation
    optimal policy