Singularly perturbed Markov decision processes with inclusion of transient states. (Q5943682)
scientific article; zbMATH DE number 1652518
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Singularly perturbed Markov decision processes with inclusion of transient states. | scientific article; zbMATH DE number 1652518 | |
Statements
Singularly perturbed Markov decision processes with inclusion of transient states. (English)
30 July 2002
The authors consider continuous-time Markov decision processes (MDPs) with weak and strong interactions, as follows. Let \(x^\varepsilon (\cdot)=\{x^\varepsilon (t):t\geq 0\}\) be a real-valued MDP with finite state space \({\mathcal M}=\{1,2, \dots,m\}\), and let \(u(\cdot)= \{u(t)= u(x^\varepsilon(t)) :t\geq 0\}\) be a feedback control such that \(u(t)\) takes values in a compact subset \(\Gamma\) of a Euclidean space. Let \(Q^\varepsilon (u(t))\) be the generator of \(x^\varepsilon(\cdot)\), having the form \(Q^\varepsilon (u)=\widetilde Q(u)/\varepsilon+\widehat Q(u)\), \(u\in\Gamma\), where \(\widetilde Q(u)\) and \(\widehat Q(u)\) are generators and \(\varepsilon>0\) is a small parameter. For the initial state \(i=x^\varepsilon(0)\), the cost-to-go function \(G(x,u)\) and the discount factor \(\rho> 0\), the cost functional is \[ J^\varepsilon \bigl(i,u(\cdot) \bigr)=E \int^\infty_0 e^{-\rho t}G\bigl( x^\varepsilon(t), u(x^\varepsilon (t))\bigr)\,dt, \] and the objective is to find a feedback control \(u(\cdot)\) that minimizes \(J^\varepsilon(i,u (\cdot))\). The authors formulate a singularly perturbed MDP by decomposing the state space into several groups of recurrent states and a group of transient states, and they derive the corresponding limit problem. Using the optimal solution of the limit problem, they construct asymptotically optimal controls for the original problem, and they obtain the convergence rate and an error bound for this approximate control. Furthermore, they treat the related MDP with long-run average costs.
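The review describes the two-time-scale structure only in words; the following display is a sketch of the standard form such a decomposition takes in the weak-and-strong-interaction literature, filled in here for orientation and not quoted from the paper. Writing the state space as \({\mathcal M}={\mathcal M}_1\cup\cdots\cup{\mathcal M}_l\cup{\mathcal M}_*\), with \({\mathcal M}_1,\dots,{\mathcal M}_l\) the groups of recurrent states and \({\mathcal M}_*\) the group of transient states, the fast part of the generator is typically assumed to have the block form \[ \widetilde Q(u)=\begin{pmatrix} \widetilde Q^{1}(u) & & & 0\\ & \ddots & & \vdots\\ & & \widetilde Q^{l}(u) & 0\\ \widetilde Q^{*,1}(u) & \cdots & \widetilde Q^{*,l}(u) & \widetilde Q^{*}(u) \end{pmatrix}, \] where each \(\widetilde Q^{k}(u)\) is an irreducible generator on \({\mathcal M}_k\) and the last block row governs the transient states (with \(\widetilde Q^{*}(u)\) stable). Under such a structure, each recurrent group \({\mathcal M}_k\) aggregates into a single state of the limit problem, which is an MDP on \(\{1,\dots,l\}\) with suitably averaged generator and cost; the transient states do not appear in the limit.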
asymptotically optimal control