Sensitivity of constrained Markov decision processes (Q1176864): Difference between revisions

From MaRDI portal
RedirectionBot (talk | contribs)
Removed claim: reviewed by (P1447): Item:Q181183
Created claim: Wikidata QID (P12): Q59313626, #quickstatements; #temporary_batch_1722355380754
 
(3 intermediate revisions by 3 users not shown)
Property / reviewed by
 
Property / reviewed by: Douglas J. White / rank
 
Normal rank
Property / MaRDI profile type
 
Property / MaRDI profile type: MaRDI publication profile / rank
 
Normal rank
Property / cites work
 
Property / cites work: Markov Decision Problems and State-Action Frequencies / rank
 
Normal rank
Property / cites work
 
Property / cites work: Adaptive control of constrained Markov chains / rank
 
Normal rank
Property / cites work
 
Property / cites work: Adaptive control of constrained Markov chains: Criteria and policies / rank
 
Normal rank
Property / cites work
 
Property / cites work: A convex analytic approach to Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Controlled Markov chains with constraints. / rank
 
Normal rank
Property / cites work
 
Property / cites work: On the continuity of the minimum set of a continuous function / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3703677 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Finite state Markovian decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Some Remarks on Finite Horizon Markovian Decision Models / rank
 
Normal rank
Property / cites work
 
Property / cites work: Solving stochastic dynamic programming problems by linear programming — An annotated bibliography / rank
 
Normal rank
Property / cites work
 
Property / cites work: Constrained Undiscounted Stochastic Dynamic Programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4739658 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Linear Programming and Sequential Decisions / rank
 
Normal rank
Property / cites work
 
Property / cites work: Optimal priority assignment with hard constraint / rank
 
Normal rank
Property / cites work
 
Property / cites work: Randomized and Past-Dependent Policies for Markov Decision Processes with Multiple Constraints / rank
 
Normal rank
Property / cites work
 
Property / cites work: Optimal scheduling of interactive and noninteractive traffic in telecommunication systems / rank
 
Normal rank
Property / cites work
 
Property / cites work: Estimation and control in discounted stochastic dynamic programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3725837 / rank
 
Normal rank
Property / Wikidata QID
 
Property / Wikidata QID: Q59313626 / rank
 
Normal rank

Latest revision as of 18:03, 30 July 2024

scientific article
Language: English
Label: Sensitivity of constrained Markov decision processes
Description: scientific article
Also known as: (none listed)

    Statements

    Sensitivity of constrained Markov decision processes (English)
    25 June 1992
    The paper considers a discrete-time, finite-state, finite-action Markov decision process with stationary parameters, where \(X_t\in X\) and \(A_t\in A\) are, respectively, the random state and action at time \(t\); \(x\) is the initial state; \(\beta\), \(0<\beta\leq 1\), is a discount factor; \(S\) is the set of stationary Markov policies; \(c(y,a)\) and \(\{d^k(y,a)\}\), \(1\leq k\leq K\), are cost functions; and \(\{V_k\}\) are preset constraint levels. If \(E^u_x\) is the expectation operator given the initial state \(x\) and policy \(u\in S\), the main problem addressed is \(\text{COP}_\beta(x)\): \[ \text{minimize }\biggl[C_\beta(x,u):=(1-\beta)E^u_x\Bigl[\sum^\infty_{s=0}\beta^s c(X_s,A_s)\Bigr]\biggr] \] \[ \text{subject to }D^k_\beta(x,u):=(1-\beta)E^u_x\Bigl[\sum^\infty_{s=0}\beta^s d^k(X_s,A_s)\Bigr]\leq V_k, \quad 1\leq k\leq K,\;u\in S. \]
    With \(\{P_{yav}\}\) being the transition probabilities and \(\delta_v(y)\) the Kronecker delta, the associated linear program is \(\text{LP}_\beta(x)\): \[ \text{minimize }\Bigl[C(z):=\sum_{y,a}c(y,a)z(y,a)\Bigr] \] \[ \text{subject to }\sum_{y,a}z(y,a)(\delta_v(y)-\beta P_{yav})=(1-\beta)\delta_x(v),\quad v\in X, \] \[ \text{and }D^k(z):=\sum_{y,a}d^k(y,a)z(y,a)\leq V_k,\quad 1\leq k\leq K. \] Under a positive recurrent state assumption, it is, in effect, shown that \(\text{COP}_\beta(x)\) and \(\text{LP}_\beta(x)\) are equivalent problems.
    The main purpose of the paper is to study the continuity properties of \(\text{COP}_\beta(x)\) in terms of the parameters \(\{\beta, P_{yav}, c(y,a), \{d^k(y,a)\}\}\). These are replaced by sequences \(\{\beta_n, P^n_{yav}, c_n(y,a), \{d^k_n(y,a)\}\}\) with specified convergence properties. \(\text{LP}_\beta(x)\) is then generalized to \(\{\text{LP}^n_\beta(x)\}\), and it is shown that various limiting properties of \(\{\text{LP}^n_\beta(x)\}\) hold in relation to \(\text{LP}_\beta(x)\). This is used to establish continuity results. Some consideration is also given to limiting finite-horizon problems and to adaptive problems.
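The link between \(\text{COP}_\beta(x)\) and \(\text{LP}_\beta(x)\) can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes a hypothetical 2-state, 2-action model (all numbers, and the names `P`, `c`, `d`, `u`, `occupation_measure`, `flow_residuals`, `linear_cost` are illustrative assumptions) and a discount factor \(\beta<1\). For a fixed stationary policy \(u\) it approximates the normalized discounted state-action frequencies \(z(y,a)=(1-\beta)\sum_s \beta^s P^u_x(X_s=y, A_s=a)\) by truncating the series, then checks that \(z\) satisfies the equality constraints of the linear program, so that \(C(z)\) and \(D^k(z)\) are linear in \(z\).

```python
# Illustrative assumptions only: a 2-state, 2-action MDP with made-up numbers.
BETA = 0.9
STATES = [0, 1]
ACTIONS = [0, 1]

# P[y][a][v]: probability of moving from state y to state v under action a.
P = [
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.5, 0.5], [0.1, 0.9]],
]
# c[y][a]: cost to minimize; d[y][a]: a single constraint cost (K = 1).
c = [[1.0, 2.0], [0.5, 3.0]]
d = [[0.0, 1.0], [1.0, 0.0]]
# u[y][a]: stationary randomized policy (probability of action a in state y).
u = [[0.6, 0.4], [0.25, 0.75]]
X0 = 0  # initial state x


def occupation_measure(policy, x, beta, horizon=2000):
    """Approximate z(y,a) = (1-beta) * sum_s beta^s * P(X_s=y, A_s=a)
    by truncating the series after `horizon` terms."""
    dist = [1.0 if y == x else 0.0 for y in STATES]  # law of X_s
    z = [[0.0 for _ in ACTIONS] for _ in STATES]
    weight = 1.0 - beta  # (1-beta) * beta^s
    for _ in range(horizon):
        for y in STATES:
            for a in ACTIONS:
                z[y][a] += weight * dist[y] * policy[y][a]
        # Propagate the state distribution one step under the policy.
        dist = [
            sum(dist[y] * policy[y][a] * P[y][a][v]
                for y in STATES for a in ACTIONS)
            for v in STATES
        ]
        weight *= beta
    return z


def flow_residuals(z, x, beta):
    """Left-hand side minus right-hand side of the LP equality constraints,
    one entry per state v."""
    residuals = []
    for v in STATES:
        lhs = sum(
            z[y][a] * ((1.0 if y == v else 0.0) - beta * P[y][a][v])
            for y in STATES for a in ACTIONS
        )
        rhs = (1.0 - beta) * (1.0 if v == x else 0.0)
        residuals.append(lhs - rhs)
    return residuals


def linear_cost(cost, z):
    """Linear functional sum_{y,a} cost(y,a) * z(y,a)."""
    return sum(cost[y][a] * z[y][a] for y in STATES for a in ACTIONS)


z = occupation_measure(u, X0, BETA)
print("z =", z)
print("flow residuals =", flow_residuals(z, X0, BETA))
print("C(z) =", linear_cost(c, z), " D^1(z) =", linear_cost(d, z))
```

In this sketch the residuals vanish up to truncation and rounding error, and the entries of `z` sum to one because of the \((1-\beta)\) normalization. Minimizing `linear_cost(c, z)` over all feasible `z`, subject to `linear_cost(d, z) <= V_1`, would require an LP solver, which is left out here.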
    Markov decision process
    stationary Markov policies
    continuity properties
    finite horizon problems
    adaptive problems