A unified approach to adaptive control of average reward Markov decision processes (Q1095048)

From MaRDI portal

scientific article

Language	Label	Description	Also known as
English	A unified approach to adaptive control of average reward Markov decision processes	scientific article

Statements

scholarly article

A unified approach to adaptive control of average reward Markov decision processes (English)

zbMATH Open document ID

10.1007/BF01740510

Gerhard Hübner

publication date

1988

The paper presents a general optimization method for adaptive average reward Markov decision problems. After each observation of the state and each estimate of the unknown parameter, optimal decisions are determined by applying a policy improvement step to an auxiliary value function that converges over time to the true relative value. This method includes the classical procedure of estimation and control [cf. \textit{M. Kurano}, J. Oper. Res. Soc. Japan 15, 67-76 (1972; Zbl 0238.90006), and \textit{P. Mandl}, Adv. Appl. Probab. 6, 40-60 (1974; Zbl 0281.60070)], the nonstationary value iteration [cf. \textit{A. Federgruen} and \textit{P. J. Schweitzer}, J. Optimization Theory Appl. 34, 207-241 (1981; Zbl 0457.90083), \textit{R. S. Acosta-Abreu} and \textit{O. Hernandez-Lerma}, Control Cybern. 14, 313-322 (1985; Zbl 0606.90130), and \textit{M. Kurano}, J. Appl. Probab. 24, 270-276 (1987)], as well as many new procedures.
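The loop described in the review (observe the state, re-estimate the unknown parameter, apply one policy improvement step to an auxiliary value function) can be illustrated with a minimal Python sketch. It is not the paper's construction: it assumes a finite state and action space, takes the unknown parameter to be the transition matrix itself, estimates it by smoothed empirical frequencies, and uses a relative value iteration step as the auxiliary value update, which is only one of the special cases the review mentions. The names `greedy_step` and `adaptive_control` are hypothetical, and the convergence conditions of the paper are not checked here.

```python
import random


def greedy_step(r, p_hat, v):
    """One policy improvement step against the auxiliary values v.

    For each state s, pick the action a maximizing
    r[s][a] + sum_t p_hat[s][a][t] * v[t].
    Returns the greedy policy and the updated auxiliary values,
    normalized so that the value of state 0 is zero.
    """
    policy, q_max = [], []
    for s in range(len(r)):
        q = [
            r[s][a] + sum(p * v[t] for t, p in enumerate(p_hat[s][a]))
            for a in range(len(r[s]))
        ]
        best = max(range(len(q)), key=q.__getitem__)
        policy.append(best)
        q_max.append(q[best])
    # Subtracting the reference-state value keeps the iterates bounded
    # (relative value iteration normalization; an assumption of this sketch).
    return policy, [x - q_max[0] for x in q_max]


def adaptive_control(r, p_true, steps, seed=0):
    """Run the estimate-then-improve loop for a fixed number of steps.

    r[s][a] is the known reward, p_true[s][a] the transition law that the
    controller does not know and only observes through sampled transitions.
    Returns the last policy, the last auxiliary values, and the average reward.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(r), len(r[0])
    # Transition counts start at 1 (Laplace smoothing) so estimates are defined.
    counts = [[[1.0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    v = [0.0] * n_states
    policy = [0] * n_states
    state, total = 0, 0.0
    for _ in range(steps):
        # Current parameter estimate: normalized transition counts.
        p_hat = [[[c / sum(row) for c in row] for row in counts[s]]
                 for s in range(n_states)]
        # One improvement step on the auxiliary value function.
        policy, v = greedy_step(r, p_hat, v)
        action = policy[state]
        total += r[state][action]
        # Observe the next state drawn from the true (unknown) law.
        nxt = rng.choices(range(n_states), weights=p_true[state][action])[0]
        counts[state][action][nxt] += 1.0
        state = nxt
    return policy, v, total / steps
```

A call such as `adaptive_control(r, p_true, steps=2000)` with nested lists `r` and `p_true` returns the final greedy policy, the auxiliary values, and the empirical average reward. Purely greedy action selection, as used here, may leave some state-action pairs unvisited, so their estimates need not converge; how parameter consistency is ensured is exactly the kind of condition the sketch leaves out.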

Mathematics Subject Classification ID

zbMATH DE Number

zbMATH Keywords

adaptive control

adaptive average reward Markov decision

policy improvement

nonstationary value iteration

MaRDI profile type

MaRDI publication profile

The optimality equation in average cost denumerable state semi-Markov decision problems, recurrency conditions and algorithms

Nonstationary Markov decision problems with converging parameters

Contraction mappings underlying undiscounted Markov decision problems

Adaptive control of discounted Markov decision chains

Bounds and good policies in stationary finite-stage Markovian decision problems

Adaptive Policies in Markov Decision Processes with Uncertain Transition Matrices

Learning algorithms for Markov decision processes

Estimation and control in Markov chains
