Nonstationary value-iteration and adaptive control of discounted semi-Markov processes (Q1068732): Difference between revisions

We consider in this paper discounted-reward, denumerable state space, semi-Markov decision processes which depend on unknown parameters. The problems we are interested in are: Given that the true parameter value is unknown, (I) give an iterative scheme to determine the total maximal discounted reward, and (II) find an asymptotically discount optimal (adaptive) policy. Our solutions are inspired by the nonstationary value iteration (NVI) scheme of \textit{A. Federgruen} and \textit{P. J. Schweitzer} [J. Optimization Theory Appl. 34, 207-241 (1981; Zbl 0426.90091)] combined with the ideas of \textit{M. Schäl} [in: Optimization, theory and algorithms, Conf. Confolant/France 1981, Lect. Notes Pure Appl. Math. 86, 239-253 (1983; Zbl 0525.93071)] concerning the "principle of estimation and control" for the adaptive control of semi-Markov processes.
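The idea behind problem (I) can be illustrated with a small numerical sketch. It is not the scheme from the paper: it assumes a finite two-state, two-action model with exponential sojourn times, a single unknown transition probability `theta`, and a made-up sequence of estimates that converges to the true value. All names and numbers (`ALPHA`, `RATES`, `REWARDS`, `transition_probs`, and so on) are hypothetical. Each value-iteration step uses the current estimate in place of the unknown parameter, and the iterates still approach the optimal discounted reward of the true model.

```python
import numpy as np

ALPHA = 0.1                                    # continuous-time discount rate (assumed)
RATES = np.array([[1.0, 2.0], [0.5, 1.5]])     # exponential sojourn rates lambda(i, a) (assumed)
REWARDS = np.array([[1.0, 0.5], [0.0, 2.0]])   # lump-sum rewards r(i, a) (assumed)


def transition_probs(theta):
    """Hypothetical model p[i, a, j]: under action 0 the next state is 0
    with probability theta; under action 1 both states are equally likely."""
    p = np.empty((2, 2, 2))
    p[:, 0, 0] = theta
    p[:, 0, 1] = 1.0 - theta
    p[:, 1, :] = 0.5
    return p


def discounted_kernel(theta):
    """Discounted transition kernel beta[i, a, j] = E[exp(-ALPHA * tau)] * p[i, a, j].
    For an exponential sojourn time tau with rate lam, the expectation is lam / (lam + ALPHA)."""
    factor = RATES / (RATES + ALPHA)
    return factor[:, :, None] * transition_probs(theta)


def bellman_step(v, theta):
    """One dynamic-programming step using the model indexed by theta."""
    q = REWARDS + discounted_kernel(theta) @ v   # q[i, a]
    return q.max(axis=1), q.argmax(axis=1)


def nonstationary_value_iteration(theta_estimates):
    """Value iteration in which step n uses the n-th parameter estimate."""
    v = np.zeros(2)
    policy = np.zeros(2, dtype=int)
    for theta_n in theta_estimates:
        v, policy = bellman_step(v, theta_n)
    return v, policy


THETA_TRUE = 0.3
# Made-up estimates converging to THETA_TRUE; a real estimator would be data-driven.
ESTIMATES = [THETA_TRUE + 0.5 * 0.9 ** n for n in range(600)]

v_nvi, policy_nvi = nonstationary_value_iteration(ESTIMATES)
v_ref, policy_ref = nonstationary_value_iteration([THETA_TRUE] * 600)
print("NVI value:      ", v_nvi)
print("true-model value:", v_ref)
```

The second call runs ordinary value iteration with the true parameter and serves only as a reference. In this toy model the two value vectors agree to high precision because the estimation error vanishes and every Bellman step is a contraction with modulus at most 2.0 / 2.1.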

zbMATH Keywords

discounted-reward

denumerable state space

semi-Markov decision processes

unknown parameters

nonstationary value iteration

principle of estimation and control

adaptive control of semi-Markov processes

MaRDI profile type

MaRDI publication profile

cites work

Dynamic programming and stochastic control

Nonstationary Markov decision problems with converging parameters

Q4150452

Adaptive control of service in queueing systems

Adaptive control of discounted Markov decision chains

Q5599448

The average-optimal adaptive control of a Markov renewal model in presence of an unknown parameter

Strongly consistent estimation in a controlled Markov renewal model

Q3875736

Q5649557

On Dynamic Programming with Unbounded Rewards

Estimation and control in Markov chains

Q5615108

Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal

Estimation and control in discounted stochastic dynamic programming

Q4772533

Identifiers

zbMATH Open document ID

0581.90096

DOI

10.1016/0022-247X(85)90253-7

Mathematics Subject Classification ID

90C40

zbMATH DE Number

3930759

Sitelinks

Mathematics (1 entry)

mardi Publication:1068732

@@ Property / cites work @@
+Dynamic programming and stochastic control
@@ Property / cites work: Dynamic programming and stochastic control / rank @@
+Normal rank
@@ Property / cites work @@
+Nonstationary Markov decision problems with converging parameters
@@ Property / cites work: Nonstationary Markov decision problems with converging parameters / rank @@
+Normal rank
@@ Property / cites work @@
+Q4150452
@@ Property / cites work: Q4150452 / rank @@
+Normal rank
@@ Property / cites work @@
+Adaptive control of service in queueing systems
@@ Property / cites work: Adaptive control of service in queueing systems / rank @@
+Normal rank
@@ Property / cites work @@
+Adaptive control of discounted Markov decision chains
@@ Property / cites work: Adaptive control of discounted Markov decision chains / rank @@
+Normal rank
@@ Property / cites work @@
+Q5599448
@@ Property / cites work: Q5599448 / rank @@
+Normal rank
@@ Property / cites work @@
+The average-optimal adaptive control of a Markov renewal model in presence of an unknown parameter
@@ Property / cites work: The average-optimal adaptive control of a Markov renewal model in presence of an unknown parameter / rank @@
+Normal rank
@@ Property / cites work @@
+Strongly consistent estimation in a controlled Markov renewal model
@@ Property / cites work: Strongly consistent estimation in a controlled Markov renewal model / rank @@
+Normal rank
@@ Property / cites work @@
+Q3875736
@@ Property / cites work: Q3875736 / rank @@
+Normal rank
@@ Property / cites work @@
+Q5649557
@@ Property / cites work: Q5649557 / rank @@
+Normal rank
@@ Property / cites work @@
+On Dynamic Programming with Unbounded Rewards
@@ Property / cites work: On Dynamic Programming with Unbounded Rewards / rank @@
+Normal rank
@@ Property / cites work @@
+Estimation and control in Markov chains
@@ Property / cites work: Estimation and control in Markov chains / rank @@
+Normal rank
@@ Property / cites work @@
+Q5615108
@@ Property / cites work: Q5615108 / rank @@
+Normal rank
@@ Property / cites work @@
+Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal
@@ Property / cites work: Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal / rank @@
+Normal rank
@@ Property / cites work @@
+Estimation and control in discounted stochastic dynamic programming
@@ Property / cites work: Estimation and control in discounted stochastic dynamic programming / rank @@
+Normal rank
@@ Property / cites work @@
+Q4772533
@@ Property / cites work: Q4772533 / rank @@
+Normal rank