Monotone value iteration for discounted finite Markov decision processes (Q1076618)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Monotone value iteration for discounted finite Markov decision processes | scientific article | |
Statements
Monotone value iteration for discounted finite Markov decision processes (English)
1985
This paper studies two modifications of the value iteration scheme for a finite-state, finite-action Markov decision process under the total expected discounted reward criterion. The first variant replaces the discount factor in the value iteration update with a quantity that depends on the results of the previous iterations. The second variant adds a perturbation to the value vector at each iteration. The author shows that, for suitable choices of their parameters, both methods converge monotonically to the optimal solution. The author also presents examples in which these modifications perform better than the usual value iteration scheme.
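For orientation, the baseline that both variants modify can be sketched as follows. This is a minimal illustration of the usual value iteration scheme only, not of the paper's two variants, whose update rules and parameter choices are not given in this review. The function names, the two-state example MDP, and the starting vector are all assumptions made for the sketch. The start vector min r / (1 - beta) is chosen because it lies below its own Bellman update, which is a standard sufficient condition for the plain scheme to increase monotonically toward the optimal value vector.

```python
def bellman_update(v, rewards, transitions, beta):
    """One standard value-iteration step:
    (Tv)(s) = max_a [ r(s,a) + beta * sum_t P(t | s,a) * v(t) ]."""
    new_v = []
    for s in range(len(v)):
        best = max(
            rewards[s][a]
            + beta * sum(p * v[t] for t, p in enumerate(transitions[s][a]))
            for a in range(len(rewards[s]))
        )
        new_v.append(best)
    return new_v


def value_iteration(rewards, transitions, beta, tol=1e-10, max_iter=10_000):
    """Usual value iteration; returns the list of all iterates.

    Assumption of this sketch: start from the constant vector
    min_r / (1 - beta), which satisfies v0 <= T v0, so the iterates
    are non-decreasing in every component.
    """
    r_min = min(min(row) for row in rewards)
    v = [r_min / (1.0 - beta)] * len(rewards)
    history = [v]
    for _ in range(max_iter):
        new_v = bellman_update(v, rewards, transitions, beta)
        history.append(new_v)
        # Stop when successive iterates agree to within tol (sup norm).
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            break
        v = new_v
    return history


# Hypothetical 2-state, 2-action MDP, invented for illustration only.
# REWARDS[s][a] is the one-step reward; TRANSITIONS[s][a][t] = P(t | s, a).
REWARDS = [[1.0, 0.0], [0.0, 2.0]]
TRANSITIONS = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]
BETA = 0.9

HISTORY = value_iteration(REWARDS, TRANSITIONS, BETA)
```

Each entry of `HISTORY` is one iterate of the value vector, so monotonicity can be checked by comparing consecutive entries componentwise, and the last entry approximates the fixed point of the Bellman update.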
value iteration
finite state, finite actions Markovian decision process
total expected discounted reward criterion