Nonstationary value-iteration and adaptive control of discounted semi-Markov processes (Q1068732)
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Nonstationary value-iteration and adaptive control of discounted semi-Markov processes | scientific article | |
Statements
Nonstationary value-iteration and adaptive control of discounted semi-Markov processes (English)
1985
This paper considers discounted-reward semi-Markov decision processes with a denumerable state space that depend on unknown parameters. Given that the true parameter value is unknown, the two problems of interest are to (I) give an iterative scheme for determining the maximal total discounted reward, and (II) find an asymptotically discount optimal (adaptive) policy. The solutions are inspired by the nonstationary value iteration (NVI) scheme of A. Federgruen and P. J. Schweitzer [J. Optimization Theory Appl. 34, 207-241 (1981; Zbl 0426.90091)], combined with the ideas of M. Schäl [in: Optimization, theory and algorithms, Conf. Confolant/France 1981, Lect. Notes Pure Appl. Math. 86, 239-253 (1983; Zbl 0525.93071)] concerning the "principle of estimation and control" for the adaptive control of semi-Markov processes.
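The general idea of a nonstationary value iteration can be illustrated with a small sketch. It is not taken from the paper: the two-state model, the exponential sojourn times with unknown rate `theta`, the reward rates, the names `backup` and `nvi`, and the deterministic sequence of estimates are all hypothetical choices made only for illustration. At step n the Bellman backup is computed with the current estimate of the parameter rather than its true value, and the maximizing actions at that step give the policy used by an estimate-then-control rule.

```python
ALPHA = 1.0  # continuous-time discount rate (assumed)

# Hypothetical toy model: 2 states, 2 actions.
RATE = [[1.0, 2.0], [0.5, 3.0]]        # RATE[s][a]: reward rate in state s under action a
P = [[[0.9, 0.1], [0.2, 0.8]],         # P[s][a][t]: embedded transition probabilities
     [[0.5, 0.5], [0.7, 0.3]]]


def backup(v, theta):
    """One Bellman backup, assuming Exp(theta) sojourn times in every state."""
    beta = theta / (theta + ALPHA)     # E[exp(-ALPHA * sojourn time)]
    new_v, policy = [], []
    for s in range(len(v)):
        # Expected discounted reward during the sojourn is RATE / (theta + ALPHA).
        q = [RATE[s][a] / (theta + ALPHA)
             + beta * sum(p * v[t] for t, p in enumerate(P[s][a]))
             for a in range(len(P[s]))]
        best = max(range(len(q)), key=q.__getitem__)
        new_v.append(q[best])
        policy.append(best)
    return new_v, policy


def nvi(estimates):
    """Nonstationary value iteration: step n uses the n-th parameter estimate."""
    v = [0.0] * len(P)
    policy = [0] * len(P)
    for theta_n in estimates:
        v, policy = backup(v, theta_n)
    return v, policy


THETA_TRUE = 2.0
# Stand-in for a consistent estimator: a deterministic sequence converging to THETA_TRUE.
estimates = [THETA_TRUE * (1 + 0.5 ** n) for n in range(200)]

v_adaptive, pi_adaptive = nvi(estimates)          # true parameter never used directly
v_true, pi_true = nvi([THETA_TRUE] * 200)         # ordinary value iteration, for comparison
```

In this toy setting the iterates driven by the estimates approach the same values and greedy policy as ordinary value iteration run with the true parameter. In an actual adaptive scheme the estimates would be computed from observed transitions and sojourn times rather than fixed in advance.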
discounted-reward
denumerable state space
semi-Markov decision processes
unknown parameters
nonstationary value iteration
principle of estimation and control
adaptive control of semi-Markov processes