Adaptive control of discounted Markov decision chains (Q796461)

We consider discounted-reward finite-state Markov decision processes that depend on unknown parameters. An adaptive policy inspired by the nonstationary value iteration scheme of A. Federgruen and P. J. Schweitzer [ibid. 34, 207-241 (1981; Zbl 0426.90091)] is proposed. This policy is briefly compared with the principle of estimation and control recently obtained by M. Schäl [Lect. Notes Pure Appl. Math. 86, 239-253 (1983; Zbl 0525.93071)].
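
The idea summarized above can be illustrated with a minimal sketch. It is not the scheme from the paper itself: it assumes that a sequence of parameter estimates is supplied externally (the paper's estimation procedure is not reproduced here), and the names nvi_step, adaptive_policy, model_for and the two-state toy model are hypothetical, chosen only for illustration. At each step, one value-iteration update is performed with the model implied by the current estimate, and the greedy action for that update is taken as the adaptive decision.

```python
def nvi_step(v, rewards, transitions, beta):
    """One value-iteration update under the currently estimated model.

    rewards[s][a] is the one-step reward, transitions[s][a][s2] the
    estimated transition probability, beta the discount factor.
    Returns the updated value vector and a greedy action per state.
    """
    new_v, policy = [], []
    for s in range(len(v)):
        q = [
            rewards[s][a]
            + beta * sum(p * v[s2] for s2, p in enumerate(transitions[s][a]))
            for a in range(len(rewards[s]))
        ]
        best = max(range(len(q)), key=q.__getitem__)  # ties go to the lowest index
        new_v.append(q[best])
        policy.append(best)
    return new_v, policy


def adaptive_policy(rewards, model_for, estimates, beta):
    """Apply one update per parameter estimate; record the greedy policy each time."""
    v = [0.0] * len(rewards)
    history = []
    for theta in estimates:
        v, policy = nvi_step(v, rewards, model_for(theta), beta)
        history.append(policy)
    return v, history


# Hypothetical toy model: action 0 stays put, action 1 switches state
# with (unknown) probability theta. Being in state 1 pays reward 1.
def model_for(theta):
    return [
        [[1.0, 0.0], [1.0 - theta, theta]],  # transitions from state 0
        [[0.0, 1.0], [theta, 1.0 - theta]],  # transitions from state 1
    ]


rewards = [[0.0, 0.0], [1.0, 1.0]]
estimates = [0.5, 0.7] + [0.8] * 48  # assumed estimates settling at 0.8
v, history = adaptive_policy(rewards, model_for, estimates, beta=0.9)
print(history[-1])  # greedy policy after the last update
```

In this toy run the greedy policy settles on switching out of state 0 and staying in state 1. The update never waits for the estimates to converge before iterating, which is the feature that distinguishes a nonstationary scheme from solving a fresh dynamic program for each estimate.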

zbMATH Keywords

discounted-reward finite-state Markov decision processes
adaptive policy
nonstationary value iteration

MaRDI profile type

MaRDI publication profile

cites work

Nonstationary Markov decision problems with converging parameters
Dynamic programming and stochastic control
Q5615108
The average-optimal adaptive control of a Markov renewal model in presence of an unknown parameter
Q5599448
Q4150452
Q5649557
Estimation and control in Markov chains
Strongly consistent estimation in a controlled Markov renewal model
Adaptive control of service in queueing systems
Optimal adaptive control of priority assignment in queueing systems
Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal
Q3881672
Q3313754
Convergence analysis of parametric identification methods

Identifiers

zbMATH Open document ID: 0543.90093
DOI: 10.1007/BF00938426
Mathematics Subject Classification ID: 90C40
zbMATH DE Number: 3865009

Sitelinks

Mathematics (1 entry)

mardi Publication:796461
