Semi-infinite Markov decision processes (Q1974026)

From MaRDI portal
Full work available at URL: https://doi.org/10.1007/s001860050006
OpenAlex ID: W2073519119

Language: English
Label: Semi-infinite Markov decision processes
Description: scientific article

    Statements

    Semi-infinite Markov decision processes (English)
    7 April 2002
    This paper studies Markov decision processes (MDPs) with finite state and countable action sets. It deals with two major criteria for MDPs: total discounted rewards and average rewards per unit time. For the total discounted criterion, the authors study the linear programming approach. In particular, they develop a semi-infinite linear program for the MDP and show that it can be approximated by finite linear programs associated with finite state and action MDPs, which are truncated versions of the original MDP. The authors also investigate the question of when the supremum of the objective criterion over the class of deterministic (or stationary, according to another terminology) policies is equal to the supremum over the class of all policies.
    Referee's remark: (i) Theorem 3.1, which states that the supremum of the total discounted rewards over all policies is equal to the supremum over deterministic policies, is a particular case of Theorem 5(b) in \textit{D. Blackwell} [Ann. Math. Stat. 36, 226-235 (1965; Zbl 0133.42805)], where the finiteness of the state space was not assumed. (ii) Theorem 6.1, which states that in a communicating MDP the supremum of the average reward per unit time over all policies is equal to the supremum over randomized stationary policies, is a particular case of Corollary 4.5(C) in \textit{E. Feinberg} and \textit{H. Park} [Stochastic Processes Appl. 49, 159-177 (1994; Zbl 0787.60085)], where it was proved for a smaller class of deterministic policies and without the additional Assumption 6.1.
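    For orientation, a minimal sketch of the standard linear program behind the discounted criterion; the notation ($S$ for the finite state space, $A(s)$ for the countable action set in state $s$, $r(s,a)$ for bounded rewards, $p(s'\mid s,a)$ for transition probabilities, $\beta\in(0,1)$ for the discount factor, and weights $\mu(s)>0$) is generic and not taken from the paper under review:
    \[
    \begin{aligned}
    \text{minimize}\quad & \sum_{s\in S}\mu(s)\,v(s)\\
    \text{subject to}\quad & v(s)\;\ge\; r(s,a)+\beta\sum_{s'\in S}p(s'\mid s,a)\,v(s')\quad\text{for all } s\in S,\ a\in A(s).
    \end{aligned}
    \]
    With $S$ finite and each $A(s)$ countable, this program has finitely many variables $(v(s))_{s\in S}$ but countably many constraints (one per state-action pair), i.e. it is a semi-infinite linear program; replacing each $A(s)$ by finite subsets $A_n(s)\uparrow A(s)$ yields finite linear programs of truncated finite state and action MDPs, which corresponds to the approximation scheme described in the review.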
    semi-infinite Markov decision processes
    optimal strategy
    semi-infinite linear program