Markov control process with the expected total cost criterion: Optimality, stability, and transient models (Q1973302)

From MaRDI portal





scientific article; zbMATH DE number 1436944

      Statements

      Markov control process with the expected total cost criterion: Optimality, stability, and transient models (English)
      29 August 2000
      The authors study discrete-time Markov control processes (MCPs) on Borel spaces under the expected total cost (ETC) criterion \[ V(\pi, x)= E^\pi_x\Biggl[ \sum^\infty_{t=0} c(x_t,a_t)\Biggr], \] where \(c(x_t, a_t)\) is the cost-per-stage function, which may be unbounded [for the basic concepts and notation of MCPs, cf. \textit{O. Hernández-Lerma} and \textit{J. B. Lasserre}, Discrete-time Markov control processes: Basic optimality criteria, Springer-Verlag, New York (1995; Zbl 0840.93001)]. Several optimality questions are answered affirmatively. The authors give conditions for a control policy to be ETC-optimal and conditions for the ETC value function to be a solution to the dynamic programming equation. They also show that finiteness of the ETC function may lead to two kinds of stability: Lagrange stability and stability with probability one. In addition, transient control models [cf. \textit{S. R. Pliska}, Dynamic programming and its applications, Proc. Int. Conf., Vancouver 1977, 335-349 (1978; Zbl 0458.90082)] are fully analyzed. Together with the authors' new results, the paper provides a fairly complete, up-to-date, survey-like presentation of the ETC criterion for MCPs.
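To make the ETC criterion and the dynamic programming equation concrete, the following is a minimal sketch on a toy model and is not the authors' method. It assumes a finite state and action space with nonnegative costs, whereas the paper works on Borel spaces with possibly unbounded costs. In this finite setting the dynamic programming equation reads \(V^*(x) = \min_a \bigl[ c(x,a) + \sum_y Q(y \mid x,a)\, V^*(y) \bigr]\), and the sketch approximates \(V^*\) by value iteration started from zero. The names `COST`, `TRANS`, `q_values`, and `value_iteration`, and all numbers, are hypothetical and chosen only for illustration.

```python
# Toy finite model (all numbers are illustrative assumptions, not from the paper).
# States 0 and 1 are transient; state 2 is absorbing and cost-free, so the
# expected total cost is finite under every policy.
# COST[x][a] is the one-stage cost c(x, a); TRANS[x][a][y] is Q(y | x, a).
COST = [
    [1.0, 4.0],
    [2.0, 5.0],
    [0.0, 0.0],
]
TRANS = [
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    [[0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],
    [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],
]


def q_values(cost, trans, v):
    """Return q[x][a] = c(x, a) + sum_y Q(y | x, a) * v[y]."""
    return [
        [
            c + sum(p * v[y] for y, p in enumerate(row))
            for c, row in zip(cost[x], trans[x])
        ]
        for x in range(len(cost))
    ]


def value_iteration(cost, trans, tol=1e-10, max_iter=10_000):
    """Iterate the dynamic programming operator starting from v = 0."""
    v = [0.0] * len(cost)
    for _ in range(max_iter):
        new_v = [min(q_x) for q_x in q_values(cost, trans, v)]
        gap = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if gap < tol:
            break
    # Greedy (minimizing) action in each state with respect to the final iterate.
    q = q_values(cost, trans, v)
    policy = [min(range(len(q_x)), key=q_x.__getitem__) for q_x in q]
    return v, policy


values, policy = value_iteration(COST, TRANS)
print(values, policy)
```

The result can be checked by hand. In state 1, action 0 satisfies \(V(1) = 2 + 0.5\,V(1)\), so \(V(1) = 4\), which beats the cost 5 of action 1. In state 0, action 1 costs 4, while action 0 costs \(1 + V(1) = 5\). The iterates therefore approach the values 4, 4, and 0.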
      policy iteration
      discrete-time Markov control processes
      expected total cost
      dynamic programming
      Lagrange stability
      stability with probability one
      transient control models

      Identifiers