Time-inconsistent risk-sensitive equilibrium for countable-stated Markov decision processes (Q2232770)

scientific article

    Statements

    Time-inconsistent risk-sensitive equilibrium for countable-stated Markov decision processes (English)
    8 October 2021
    Consider an integer-valued Markov decision process \(X_t\), \(t=1,2,\dots,T\), whose dynamics are described by the transition probabilities \[\Pr(X_{t+1}=j \mid X_t=i,u_t(i))=q_t^{\varepsilon}(j;i,u_t(i)),\] where \(q_t^{\varepsilon}(j;i,u_t(i))\geq 0\) and \(\sum_{j \in \mathbb{Z}}q_t^{\varepsilon}(j;i,u_t(i))=1\). Here \(u_t(i) \in U\) denotes the action chosen at time \(t\) in state \(i\), and \(U\) is a complete metric space. Given a strategy \(\pi_t=\{u_s(X_s)\}_{s=t}^T\) and the initial condition \(X_t=x\), the time-inconsistent \(\varepsilon\)-risk-sensitive cost functional is defined as \[ J^{\varepsilon}_{\tau,t}(x,\pi_t)=\varepsilon \log \mathbf{E}_{t,x}^{\varepsilon,\pi_t} \left[\exp\left(\varepsilon^{-1}\left(\sum_{s=t}^T f_{\tau,s}(X_s,u_s(X_s))+g_\tau(X_{T+1})\right)\right)\right] \] for each \(t \in \{1,\dots,T\}\) and \(\tau \in \{1,\dots,T\}\), where \(f_{\tau,s}(\cdot)\) and \(g_\tau(\cdot)\) are cost functions, and \(J_{\tau,t}(x,\pi_t)=\limsup_{\varepsilon \to +0} J^{\varepsilon}_{\tau,t}(x,\pi_t)\). The corresponding value functions are \(J^{\varepsilon}_{t,t}(x,\pi_t)\) and \(J_{t,t}(x,\pi_t)\). The article establishes a time-inconsistent \(\varepsilon\)-risk-sensitive equilibrium, which satisfies a step-optimality condition for the control strategy \(\pi_t\) with respect to the cost functional \(J^{\varepsilon}_{t,t}(x,\pi_t)\). Convergence of the \(\varepsilon\)-risk-sensitive equilibrium and of the corresponding value functions as \(\varepsilon \to +0\) is proved. Some illustrative examples are given.
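To make the cost functional concrete, the following Python sketch evaluates \(J^{\varepsilon}_{\tau,t}(x,\pi_t)\) for a fixed strategy by backward recursion on the log of the exponential moment. It is an illustration only, not the article's method: it assumes a finite truncation of the state space and transition probabilities that do not depend on \(\varepsilon\), and every name in it (`risk_sensitive_cost`, `q_demo`, `f_demo`, `g_demo`, `policy_demo`) is hypothetical.

```python
import math


def risk_sensitive_cost(eps, tau, t, x, T, states, q, f, g, policy):
    """Return eps * log E[exp((sum_{s=t}^T f + g) / eps) | X_t = x].

    Assumptions (not from the article): `states` is a finite list, and
    q(s, j, i, u) is the probability of moving from i to j at time s
    under action u, independent of eps.
    """
    # log_m[i] = log E[exp(cost-to-go / eps) | X_s = i], starting at s = T + 1.
    log_m = {i: g(tau, i) / eps for i in states}
    for s in range(T, t - 1, -1):
        new_log_m = {}
        for i in states:
            u = policy(s, i)
            terms = []
            for j in states:
                p = q(s, j, i, u)
                if p > 0:
                    terms.append(math.log(p) + log_m[j])
            # Log-sum-exp keeps the recursion stable for small eps.
            top = max(terms)
            lse = top + math.log(sum(math.exp(a - top) for a in terms))
            new_log_m[i] = f(tau, s, i, u) / eps + lse
        log_m = new_log_m
    return eps * log_m[x]


# Hypothetical two-state demo: the next state is 0 or 1 with probability 1/2.
def q_demo(s, j, i, u):
    return 0.5


def f_demo(tau, s, i, u):
    return float(i)  # running cost equals the state; tau is ignored here


def g_demo(tau, i):
    return 0.0  # no terminal cost


def policy_demo(s, i):
    return 0  # single dummy action


if __name__ == "__main__":
    for eps in (1.0, 0.1, 0.01):
        value = risk_sensitive_cost(
            eps, 1, 1, 0, 2, [0, 1], q_demo, f_demo, g_demo, policy_demo
        )
        print(eps, value)
```

In this demo the accumulated cost from \(X_1=0\) is just \(X_2\), so the functional reduces to \(\varepsilon \log\left(\tfrac12+\tfrac12 e^{1/\varepsilon}\right)\), which increases towards the worst-case cost 1 as \(\varepsilon \to +0\). Under the article's setting the transition probabilities \(q_t^{\varepsilon}\) themselves depend on \(\varepsilon\), so the limit there is generally different.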
    Markov decision processes
    risk-sensitive control problem
    large deviation principle
    time-inconsistent equilibrium
    Bellman principle of optimality

    Identifiers