Exponential filter stability via Dobrushin's coefficient (Q2201531)

The authors deal with exponential filter stability, a classical problem in the study of partially observed Markov processes (POMP), also known as hidden Markov models (HMM). A sequence of hidden random variables \(X_n,n=0,1,\dots\) taking values in a measurable space \((\mathcal X,\mathcal B(\mathcal X))\) is initialized by a state \(x_0\in\mathcal X\) drawn from a prior measure \(\mu\) on \((\mathcal X,\mathcal B(\mathcal X))\). However, the state is not available to the observer; instead, the observer sees a sequence of random variables \(Y_n,n=0,1,\dots\) taking values in \((\mathcal Y,\mathcal B(\mathcal Y))\), where \(Y_n\sim Q(dy\vert X_n)\) is a noisy measurement of the hidden random variable \(X_n\) through the measurement channel \(Q:\mathcal X\to \mathcal P(\mathcal Y)\). Here the state space \(\mathcal X\) and the measurement space \(\mathcal Y\) are Polish (that is, complete, separable, metric) spaces equipped with the Borel sigma algebras \(\mathcal B(\mathcal X )\) and \(\mathcal B(\mathcal Y)\), and \(\mathcal P(\mathcal X )\) and \(\mathcal P(\mathcal Y)\) denote the sets of probability measures on them. Then for any set \(A\in\mathcal B(\mathcal X \times \mathcal Y)\) we have \(P \left((X_0, Y_0)\in A\right)= \int_A Q(dy\vert x)\mu(dx)\), and the POMP updates via the transition kernel \(T:\mathcal X\to \mathcal P(\mathcal X)\): \[ P \left((X_n, Y_n) \in A\vert (X_k, Y_k ) = (x_k, y_k),k=0,1,\dots,n-1\right) = \int_A Q(dy\vert x_n)T (dx_n\vert x_{n-1}). \] The sequence of random variables \(\{(X_n, Y_n),n=0,1,\dots\}\) is itself a Markov chain with the probability measure \(P^{\mu}\) on \(\Omega =\mathcal X^{\mathbb Z_+}\times \mathcal Y^{\mathbb Z_+}\), endowed with the product topology. The filter is defined as the sequence of conditional probability measures \[ \pi^{\mu}_n(A) = P^{\mu}(X_n\in A\vert Y_k,k=0,1,\dots,n). \] The filter realizations can be computed in a recursive manner.
That is, given the previous filter realization \( \pi^{\mu}_n\in \mathcal P( \mathcal X)\) and a new observation \(y_{n+1}\in\mathcal Y\), we can compute the next filter realization \( \pi^{\mu}_{n+1}\) via the filter update function \(\varphi:\mathcal P(\mathcal X ) \times\mathcal Y\to\mathcal P(\mathcal X )\). Since the filter update is a recursive process, it is sensitive to the initial distribution of \(X_0\), which is the starting point of the recursion. Suppose that an observer computes the non-linear filter assuming that the initial prior is \(\nu\), when in reality the prior distribution is \(\mu\). The observer receives the measurements and computes the filter \( \pi^{\nu}_{n}\), but the measurement process is generated according to the true measure \(\mu\). The problem of filter stability may be described as follows: given two different initial probability measures \(\mu\) and \(\nu\), when do the filter processes \( \pi^{\mu}_{n}\) and \( \pi^{\nu}_{n}\) merge in some appropriate sense as \(n\to\infty\)? A POMP is said to be exponentially stable in total variation in expectation if there exists a coefficient \(0 < \alpha < 1\) such that for any \(\mu\ll \nu\) we have \[ E^{\mu}[\|\pi^{\mu}_{n+1}- \pi^{\nu}_{n+1}\|_{T V} ]\leq \alpha E^{\mu}[\|\pi^{\mu}_{n}- \pi^{\nu}_{n}\|_{T V} ],\quad n=0,1,\dots, \] where the total variation norm for two probability measures \(P\) and \(Q\) is defined as \[ \|P-Q\|_{T V} =\sup_{\|f\|_{\infty}\leq 1}\left|\int f\,dP-\int f\,dQ\right|, \] with the supremum taken over measurable functions \(f\) bounded in the supremum norm by 1. Note that most exponential stability results in the literature rely on a mixing condition, which may be prohibitive for many applications; as noted in [\textit{O. Cappé} et al., Inference in hidden Markov models. New York, NY: Springer (2005; Zbl 1080.62065), Section 4.3.6], this is not a desirable approach to filter stability.
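For finite state and observation spaces, the filter update \(\varphi\) and the total variation norm described above reduce to a few lines of arithmetic. The following is a minimal sketch under that finiteness assumption; the names `filter_update` and `tv_norm`, the matrices `T` and `Q`, the priors and the observation path are all hypothetical and chosen only for illustration. For simplicity the recursion is started directly from the prior, without a separate update on \(Y_0\).

```python
def tv_norm(p, q):
    """Total variation norm in the convention sup over |f| <= 1, i.e. the L1 distance."""
    return sum(abs(a - b) for a, b in zip(p, q))


def filter_update(pi, y, T, Q):
    """One recursion step: map (pi_n, y_{n+1}) to pi_{n+1} on a finite state space.

    T[i][j] = P(X_{n+1} = j | X_n = i), Q[j][y] = P(Y = y | X = j).
    """
    states = range(len(T))
    # Prediction: push the current filter through the transition kernel.
    predicted = [sum(pi[i] * T[i][j] for i in states) for j in states]
    # Correction: reweight by the likelihood of the new observation, then normalize.
    unnormalized = [predicted[j] * Q[j][y] for j in states]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the predicted law")
    return [u / total for u in unnormalized]


# Hypothetical two-state, two-observation model (illustrative numbers only).
T = [[0.7, 0.3], [0.4, 0.6]]
Q = [[0.9, 0.1], [0.2, 0.8]]
mu = [0.5, 0.5]  # true prior
nu = [0.9, 0.1]  # incorrect prior assumed by the observer (mu << nu holds)

pi_mu, pi_nu = mu, nu
for y in [0, 1, 1, 0]:  # one fixed, made-up observation path
    pi_mu = filter_update(pi_mu, y, T, Q)
    pi_nu = filter_update(pi_nu, y, T, Q)
    print(y, round(tv_norm(pi_mu, pi_nu), 4))
```

The printed distances follow a single observation path, whereas the stability notion in the review is stated in expectation under \(P^{\mu}\), so this run only illustrates the mechanics of the recursion.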
Instead of imposing a strong mixing condition, the authors of this article introduce a new approach based on a joint contraction property of the Bayesian filter update and measurement update steps through the Dobrushin coefficient, which leads to more relaxed characterizations because the measurement updates are taken into account. The main result of the article is the following statement. Assume that \(\mu\ll\nu\) and that the measurement channel \(Q\) is dominated. Then we have \[ E^{\mu}[\|\pi^{\mu}_{n+1}- \pi^{\nu}_{n+1}\|_{T V} ]\leq (1-\delta(T))(2-\delta(Q)) E^{\mu}[\|\pi^{\mu}_{n}- \pi^{\nu}_{n}\|_{T V} ], \] where \(\delta(T)\) and \(\delta(Q)\) are the Dobrushin coefficients of the transition and the measurement kernels, respectively. As a corollary we have the following statement. Assume that \(\mu\ll\nu\) and that the measurement channel \(Q\) is dominated. If \(\alpha=(1-\delta(T))(2-\delta(Q)) < 1\), then the filter is exponentially stable in total variation in expectation with coefficient \(\alpha\) and \[ E^{\mu}[\|\pi^{\mu}_{n}- \pi^{\nu}_{n}\|_{T V} ]\leq (2-\delta(Q))\, \alpha^n\, \|\mu- \nu\|_{T V}. \] Furthermore, if \(\delta(T) > 1/ 2\), then \(\alpha < 1\) because \(2-\delta(Q)\leq 2\), and the POMP is exponentially stable regardless of the measurement kernel \(Q\). The Dobrushin coefficient for a kernel operator \(K: S_1\to\mathcal P(S_2)\) is defined as \[ \delta(K) = \inf\sum_{i=1}^{n}\min(K(x, A_i), K(y, A_i)), \] where the infimum is over all \(x, y\in S_1\) and all partitions \(\{A_i\}_{i=1}^n\) of \(S_2\) (for more details see [\textit{R. L. Dobrushin}, Teor. Veroyatn. Primen. 1, 72--89 (1956; Zbl 0093.15001)]). The Dobrushin coefficient is conceptually a measure of how similar the conditional measures \(K(ds_2\vert s_1)\) and \(K(ds_2\vert s'_1) \) are for different \(s_1\) and \(s'_1\). If the measures are similar, the coefficient is close to 1, and if they are different, it is close to 0.
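On finite spaces the Dobrushin coefficient and the contraction factor \((1-\delta(T))(2-\delta(Q))\) from the main result can be evaluated directly. The sketch below assumes kernels given as row-stochastic matrices; the function names `dobrushin` and `contraction_factor` and the example matrices are hypothetical. It relies on the observation that refining a partition can only lower the sum of minima, so in the finite case the infimum over partitions is attained at the partition into singletons.

```python
from itertools import combinations


def dobrushin(K):
    """Dobrushin coefficient of a finite stochastic matrix K (row x is K(x, .)).

    Uses the singleton partition, which attains the infimum over partitions
    when the target space is finite.
    """
    if len(K) < 2:
        return 1.0  # a single row has nothing to differ from
    return min(
        sum(min(a, b) for a, b in zip(K[x], K[y]))
        for x, y in combinations(range(len(K)), 2)
    )


def contraction_factor(T, Q):
    """The factor (1 - delta(T)) * (2 - delta(Q)) appearing in the main result."""
    return (1.0 - dobrushin(T)) * (2.0 - dobrushin(Q))


# Hypothetical transition and measurement kernels (illustrative numbers only).
T = [[0.7, 0.3], [0.4, 0.6]]
Q = [[0.9, 0.1], [0.2, 0.8]]

alpha = contraction_factor(T, Q)
# Approximately 0.7, 0.3 and 0.51 for these matrices.
print(dobrushin(T), dobrushin(Q), alpha)
print("sufficient condition alpha < 1 holds:", alpha < 1.0)
```

In this example \(\delta(T) > 1/2\), so the factor stays below 1 whatever measurement matrix is substituted for `Q`.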
These simple explicit (sufficient) conditions on filter stability can be applied to more general system models, including controlled stochastic models.

zbMATH Keywords

non-linear filtering

filter stability

Dobrushin coefficient

geometric convergence

reviewed by

M. P. Moklyachuk