Bounds on the fixed point of a monotone contraction operator (Q579144)
From MaRDI portal
Property / Mathematics Subject Classification ID: 90C40
Property / zbMATH DE Number: 4014481
Property / zbMATH Keywords: upper and lower variational bounds
Property / zbMATH Keywords: fixed point
Property / zbMATH Keywords: monotone contraction operator
Property / zbMATH Keywords: infinite horizon Markovian decision processes
Property / author: Paul J. Schweitzer
Property / reviewed by: Douglas J. White
Property / MaRDI profile type: MaRDI publication profile
Property / cites work: Contraction Mappings in the Theory Underlying Dynamic Programming
Property / cites work: Q3206684
Property / cites work: Variational characterizations in Markov decision processes
Property / cites work: Q3266141
Property / cites work: Q5335719
Property / cites work: Markov-Renewal Programming. I: Formulation, Finite Return Models
Property / cites work: A modified dynamic programming method for Markovian decision problems
Property / cites work: A UNIFIED APPROACH TO ALGORITHMS WITH A SUBOPTIMALITY TEST IN DISCOUNTED SEMI-MARKOV DECISION PROCESSES
Property / cites work: Some Bounds for Discounted Sequential Decision Processes
Property / cites work: Perturbation Theory and Undiscounted Markov Renewal Programming
Property / cites work: Iterative Bounds on the Equilibrium Distribution of a Finite Markov Chain
Property / cites work: Discrete Dynamic Programming with Sensitive Discount Optimality Criteria
Property / cites work: Markov programming by successive approximations with respect to weighted supremum norms
scientific article
Language | Label | Description | Also known as
---|---|---|---
English | Bounds on the fixed point of a monotone contraction operator | scientific article |
Statements
Bounds on the fixed point of a monotone contraction operator (English)
1987
This paper considers the solution of the following fixed point equation governing infinite horizon Markovian decision processes: \[ x^*_i=\max_{k\in A(i)}\Big[a^k_i+\sum^{N}_{t=1}M^k_{it}x^*_t\Big]=f(x^*)_i,\quad 1\leq i\leq N. \] The condition generally imposed is that \(\rho([M^{k(i)}_{ij}])<1\) for each of the probability transition matrices obtainable from a stationary pure Markov decision rule, where \(\rho\) denotes the spectral radius. The main results are collected in Theorem 1 and include the following. (i) The set \[ Y=\{y\in E^N:\ y_i>\sum^{N}_{t=1}M^k_{it}y_t\ \ \forall k\in A(i),\ 1\leq i\leq N\} \] is nonempty, and \(y>0\) for all \(y\in Y\). (ii) \(f\) is a contraction mapping with modulus \(\lambda(y)\) with respect to each norm \(\|u\|_y\equiv\max_{1\leq i\leq N}[|u_i|/y_i]\), where \[ \lambda(y)\equiv\max_{i,k}\Big[\sum_t M^k_{it}y_t/y_i\Big]<1. \] (iii) If \(x\in E^N\) and \(y\in Y\), then \[ y_i\min_{1\leq j\leq N}\max_{k\in A(j)}\frac{P(j,k)}{Q(j,k)}\leq x^*_i-x_i\leq y_i\max_{1\leq j\leq N}\max_{k\in A(j)}\frac{P(j,k)}{Q(j,k)}, \] where \[ P(j,k)=a^k_j+\sum^{N}_{t=1}M^k_{jt}x_t-x_j\quad\text{and}\quad Q(j,k)=y_j-\sum^{N}_{t=1}M^k_{jt}y_t. \] The paper also introduces subgradient ideas: \(y\) is a subgradient for \(f\) if there exist numbers \(\{c_{ti}\}\), \(0\leq c_{ti}<1\) for all \(t\), \(i\), such that for all \(w,z\in E^N\) \[ f(w+z)_i\leq f(w)_i+c_{ti}y_i\max_{1\leq k\leq N}[z_k/y_k],\quad 1\leq i\leq N, \] where \(t=1\) if \(\max_k[z_k]\geq 0\) and \(t=2\) if \(\max_k[z_k]<0\). The paper then shows that if \(x\in E^N\), then \[ y_i\min_{1\leq j\leq N}\min_{t=1,2}\frac{R(j,t)}{S(j,t)}\leq x^*_i-x_i\leq y_i\max_{1\leq j\leq N}\max_{t=1,2}\frac{R(j,t)}{S(j,t)}, \] where \[ R(j,t)=f(x)_j-x_j\quad\text{and}\quad S(j,t)=y_j(1-c_{tj}). \] Special cases of \(\{c_{tj}\}\) are given in terms of \(y\in Y\). The remainder of the paper gives some specializations, additional properties, a treatment of semi-Markov decision processes, and nonsymmetric bounds.
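The bounds in Theorem 1(iii) are easy to check numerically. Below is a minimal sketch in Python/NumPy, assuming a two-state discounted MDP whose data (discount factor, transition matrices, rewards, and the trial vectors \(x\) and \(y\)) are arbitrary illustrative choices rather than anything taken from the paper; it evaluates \(P(j,k)\) and \(Q(j,k)\) and confirms that \(x^*_i-x_i\) lies between \(y_i\min_j\max_k P/Q\) and \(y_i\max_j\max_k P/Q\). With \(y=(1,\dots,1)\) and a constant discount factor these reduce to the familiar MacQueen-type value-iteration bounds.
```python
import numpy as np

# Minimal numerical sketch of the bounds in Theorem 1(iii) for a small
# discounted MDP (N = 2 states, 2 actions per state).  All data below
# (discount factor, transition matrices, rewards, trial vectors) are
# arbitrary illustrative choices, not taken from the paper.
N = 2
beta = 0.9                                     # discount factor, so rho(M^k) = beta < 1
P_raw = [np.array([[0.7, 0.3], [0.4, 0.6]]),   # transition matrix for action k = 0
         np.array([[0.2, 0.8], [0.5, 0.5]])]   # transition matrix for action k = 1
M = [beta * P for P in P_raw]                  # M^k_{it} = beta * Pr(t | i, k)
a = np.array([[1.0, 2.0],                      # a[i, k]: one-step reward in state i, action k
              [0.5, 1.5]])

def f(x):
    """Operator f(x)_i = max_{k in A(i)} [a^k_i + sum_t M^k_{it} x_t]."""
    return np.max(np.stack([a[:, k] + M[k] @ x for k in range(2)], axis=1), axis=1)

# Fixed point x* via value iteration (f is a contraction).
x_star = np.zeros(N)
for _ in range(500):
    x_star = f(x_star)

# A vector y in Y: y = (1, 1) works because every row of M^k sums to beta < 1.
y = np.ones(N)

# Bounds of Theorem 1(iii) for an arbitrary trial vector x.
x = np.array([0.0, 1.0])
ratio = np.empty(N)                            # ratio[j] = max_{k in A(j)} P(j,k)/Q(j,k)
for j in range(N):
    vals = []
    for k in range(2):
        Pjk = a[j, k] + M[k][j] @ x - x[j]     # P(j,k) = a^k_j + sum_t M^k_{jt} x_t - x_j
        Qjk = y[j] - M[k][j] @ y               # Q(j,k) = y_j - sum_t M^k_{jt} y_t
        vals.append(Pjk / Qjk)
    ratio[j] = max(vals)

lower = y * ratio.min()                        # y_i * min_j max_k P/Q
upper = y * ratio.max()                        # y_i * max_j max_k P/Q

print("x* - x      :", x_star - x)
print("lower bound :", lower)
print("upper bound :", upper)
assert np.all(lower <= x_star - x + 1e-6) and np.all(x_star - x <= upper + 1e-6)
```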
upper and lower variational bounds
fixed point
monotone contraction operator
infinite horizon Markovian decision processes