Matrix versions of the Hellinger distance (Q2318010)

From MaRDI portal





scientific article

      Statements

      Matrix versions of the Hellinger distance (English)
      13 August 2019
      Let \((p_{1},p_{2},\dots,p_{n})\) and \((q_{1},q_{2},\dots,q_{n})\) be two probability distributions. The Hellinger distance between them is defined to be \(\left\{ \sum_{i}\left(\frac{1}{2}(p_{i}+q_{i})-\sqrt{p_{i}q_{i}}\right)\right\} ^{1/2}\). In terms of the diagonal matrices \(P:=\mathrm{diag}(p_{1},p_{2},\dots,p_{n})\) and \(Q:=\mathrm{diag}(q_{1},q_{2},\dots,q_{n})\), this can be written \[ d_{H}(P,Q):=\sqrt{\operatorname{tr}\mathcal{A}(P,Q)-\operatorname{tr}\mathcal{G}(P,Q)}, \] where \(\mathcal{A}\) and \(\mathcal{G}\) denote the arithmetic and geometric means of \(P\) and \(Q\), respectively. The goal of the paper is to examine some extensions of this definition to general \(n\times n\) complex positive semidefinite matrices. Although there is a natural, unique way to define \(\mathcal{A}\), there is more than one way to define the square root of a product of positive semidefinite matrices and hence more than one way to define \(\mathcal{G}\). Let \(A\) and \(B\) be arbitrary positive semidefinite matrices and let \(A^{1/2}\) and \(B^{1/2}\) be their (unique) positive semidefinite square roots. Write \(\left\Vert \ \right\Vert _{2}\) for the Frobenius norm and \(\mathbb{P}\) for the set of \(n\times n\) positive definite matrices. The following functions are considered: \(d_{1}(A,B):=\left\Vert A^{1/2}-B^{1/2}\right\Vert _{2}=\left\{ \operatorname{tr}(A+B)-2\operatorname{tr}A^{1/2}B^{1/2}\right\} ^{1/2}\); \(d_{2}(A,B):=\left\{ \operatorname{tr}(A+B)-2\operatorname{tr}(A^{1/2}BA^{1/2})^{1/2}\right\} ^{1/2}\); \(d_{3}(A,B):=\left\{ \operatorname{tr}(A+B)-2\operatorname{tr}A\#B\right\} ^{1/2}\), where \(A\#B:=A^{1/2}(A^{-1/2}BA^{-1/2})^{1/2}A^{1/2}\) (a formula that requires \(A\) to be invertible); and \(d_{4}(A,B):=\left\{ \operatorname{tr}(A+B)-2\operatorname{tr}\mathcal{L}(A,B)\right\} ^{1/2}\), where \(\mathcal{L}(A,B):=\exp\left( \frac{1}{2}(\log A+\log B)\right) \) (defined only for strictly positive definite matrices).
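As a quick check on the normalization (a worked step added for orientation, not taken from the paper), take \(A=P\) and \(B=Q\) diagonal in \(d_{1}\): \[ d_{1}(P,Q)^{2}=\sum_{i}\left(\sqrt{p_{i}}-\sqrt{q_{i}}\right)^{2}=\sum_{i}(p_{i}+q_{i})-2\sum_{i}\sqrt{p_{i}q_{i}}=2\,d_{H}(P,Q)^{2}. \] So, with the formulas as written, \(d_{1}\) restricted to diagonal matrices equals \(\sqrt{2}\,d_{H}\); the two differ only by this constant factor.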
The functions \(d_{1}\) and \(d_{2}\) define metrics (\(d_{2}\) is sometimes called the Bures distance or Wasserstein metric), but \(d_{3}\) and \(d_{4}\) fail to satisfy the triangle inequality and so do not define metrics. The main results of the paper concern the functions \(\Phi_{k}(A,B):=d_{k}(A,B)^{2}\) for \(k=3\) and \(4\). In particular, it is shown that \(\Phi_{3}\) and \(\Phi_{4}\) are divergence functions (see [\textit{S.-i. Amari}, Information geometry and its applications. Tokyo: Springer (2016; Zbl 1350.94001)]) and have useful convexity properties such as (Theorem 8): for each \(A\in\mathbb{P}\) the function \(X\longmapsto\Phi_{4}(A,X)\) is strictly convex on \(\mathbb{P}\).
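
The four functions can be evaluated numerically. The following is a minimal sketch in Python with NumPy, not code from the paper. The helper names (`herm_fun`, `psd_sqrt`, `d1` to `d4`) are illustrative assumptions. It assumes Hermitian inputs that are positive definite wherever an inverse or logarithm is needed, and it uses the form of \(d_{2}\) with the factor 2 in front of the trace, matching \(d_{1}\), \(d_{3}\) and \(d_{4}\).

```python
import numpy as np


def herm_fun(M, f):
    """Apply a scalar function f to the eigenvalues of a Hermitian matrix M."""
    M = (M + M.conj().T) / 2.0          # remove rounding asymmetry
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.conj().T      # V diag(f(w)) V^*


def psd_sqrt(M):
    """Positive semidefinite square root (tiny negative eigenvalues clipped)."""
    return herm_fun(M, lambda w: np.sqrt(np.clip(w, 0.0, None)))


def _dist(A, B, mean):
    """{tr(A+B) - 2 tr(mean)}^{1/2}, guarding against rounding below zero."""
    val = np.trace(A + B).real - 2.0 * np.trace(mean).real
    return np.sqrt(max(val, 0.0))


def d1(A, B):
    # "mean" is A^{1/2} B^{1/2}; d1 equals the Frobenius norm of A^{1/2} - B^{1/2}
    return _dist(A, B, psd_sqrt(A) @ psd_sqrt(B))


def d2(A, B):
    # "mean" is (A^{1/2} B A^{1/2})^{1/2}
    rA = psd_sqrt(A)
    return _dist(A, B, psd_sqrt(rA @ B @ rA))


def d3(A, B):
    # "mean" is A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}; A must be invertible
    rA = psd_sqrt(A)
    irA = np.linalg.inv(rA)
    return _dist(A, B, rA @ psd_sqrt(irA @ B @ irA) @ rA)


def d4(A, B):
    # "mean" is exp((log A + log B) / 2); A and B must be positive definite
    log_mean = 0.5 * (herm_fun(A, np.log) + herm_fun(B, np.log))
    return _dist(A, B, herm_fun(log_mean, np.exp))


if __name__ == "__main__":
    A = np.array([[2.0, 1.0], [1.0, 2.0]])
    B = np.array([[1.0, 0.0], [0.0, 3.0]])
    for name, d in [("d1", d1), ("d2", d2), ("d3", d3), ("d4", d4)]:
        print(name, d(A, B))
```

When \(A\) and \(B\) commute (for example, when both are diagonal), the four "means" coincide, so all four functions return the same value; this gives a simple sanity check for the sketch.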
      geometric mean
      matrix divergence
      relative entropy
      strict convexity
      barycentre
