Adaptive policies for time-varying stochastic systems under discounted criterion (Q1397033): Difference between revisions

The authors consider a discrete-time controlled Markov system whose evolution is described by the equation \(x_{n+1}= G_n(x_n, a_n,\xi_n)\), \(n= 0,1,\dots\), where the system states \(x_n\) and controls \(a_n\) are elements of Borel spaces and \(\{\xi_n\}\) is a sequence of observable i.i.d. random vectors with unknown distribution. Assuming that \(G_n\) converges and using estimates of the unknown density of \(\xi_n\), the authors construct an asymptotically optimal control policy for the limit control system.
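
The adaptive loop described in the review can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the scalar dynamics `dynamics` and `limit_dynamics`, the Gaussian disturbance, and the use of a running empirical mean as a crude stand-in for density estimation. The control rule is a simple certainty-equivalence choice, not the authors' policy.

```python
import random


def limit_dynamics(x, a, xi):
    """Hypothetical limit system G(x, a, xi); chosen for illustration only."""
    return 0.5 * x + a + xi


def dynamics(n, x, a, xi):
    """Hypothetical time-varying G_n; the extra term vanishes as n grows."""
    return limit_dynamics(x, a, xi) + x / (n + 1) ** 2


def simulate(steps=2000, seed=0):
    """Run x_{n+1} = G_n(x_n, a_n, xi_n) while learning about xi from observations."""
    rng = random.Random(seed)
    x = 1.0
    total, count = 0.0, 0  # running sum and number of observed disturbances
    for n in range(steps):
        # Stand-in for density estimation: empirical mean of past disturbances.
        mean_est = total / count if count else 0.0
        # Certainty-equivalence control: cancel the estimated mean disturbance.
        a = -mean_est
        # True disturbance law (assumed Gaussian here), unknown to the controller.
        xi = rng.gauss(0.3, 0.1)
        x = dynamics(n, x, a, xi)
        # xi_n is observable, so it can be used to refine the estimate.
        total += xi
        count += 1
    return x, total / count
```

Calling `final_state, mean_estimate = simulate()` returns the last state and the learned mean of the disturbance. In this toy setting the estimate settles near the assumed true mean of 0.3 as more disturbances are observed, which mirrors the idea that the policy is built from increasingly accurate estimates while \(G_n\) approaches its limit.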

zbMATH Keywords

non-homogeneous Markov control processes

discrete-time stochastic systems

discounted cost criterion

optimal adaptive policy

Identifiers

zbMATH Open document ID

1042.93065

DOI

10.1007/s001860100170

Mathematics Subject Classification ID

Sitelinks

Mathematics (1 entry)

mardi Publication:1397033

Revision as of 17:00, 31 January 2024 by Import240129110113 (Bots, 7,163,963 edits): Added link to MaRDI item.

Revision as of 07:32, 10 February 2024 by RedirectionBot (Bots, 2,880,369 edits): Removed claim: reviewed by (P1447): Item:Q174462
Property / reviewed by
	~~H. Pragarauskas~~
Property / reviewed by: H. Pragarauskas / rank
	~~Normal rank~~