Semi-infinite Markov decision processes (Q1974026)

From MaRDI portal





scientific article; zbMATH DE number 1441494
Language Label Description Also known as
default for all languages
No label defined
    English
    Semi-infinite Markov decision processes
    scientific article; zbMATH DE number 1441494

      Statements

      Semi-infinite Markov decision processes (English)
      0 references
      0 references
      0 references
      0 references
      7 April 2002
      0 references
      This paper studies Markov Decision Processes (MDPs) with finite state and countable action sets. It deals with two major criteria for MDPs: the total discounted rewards and average rewards per unit time. For the total discounted criteria, the authors study the linear programming approach. In particular, they develop a semi-infinite linear program for the MDP and show that it can be approximated by finite linear programs associated with finite state and action MDPs, which are truncated versions of the original MDPs. The authors also investigate the question when the supremum of the objective criterion over the class of deterministic (or stationary, according to another terminology) policies is equal to the supremum over the class of all policies. Referee's remark: (i) Theorem 3.1, which states that the supremum of the total discounted rewards over all policies is equal to the supremum over deterministic policies, is a particular case of Theorem 5 (b) in \textit{D. Blackwell} [Ann. Math. Stat. 36, 226-235 (1965; Zbl 0133.42805)] where the finiteness of the state space was not assumed. (ii) Theorem 6.1, which states that in a communicating MDP the supremum of the average reward per unit time over all policies, is equal the supremum over randomized stationary policies is a particular case of Corollary 4.5 (C) in \textit{E. Feinberg} and \textit{H. Park} [Stochastic Processes Appl. 49, 159-177 (1994; Zbl 0787.60085)] where it was proved for a smaller class of deterministic policies and without additional Assumption 6.1.
      0 references
      semi-infinite Markov decision processes
      0 references
      optimal strategy
      0 references
      semi-infinite linear program
      0 references

      Identifiers