Eigen structure of a new class of covariance and inverse covariance matrices (Q2405151)
Language | Label | Description | Also known as |
---|---|---|---|
English | Eigen structure of a new class of covariance and inverse covariance matrices | scientific article | |
Statements
Eigen structure of a new class of covariance and inverse covariance matrices (English)
21 September 2017
The increasing availability of large databases increases the number of statistical analyses of phenomena in which there are \(n\) observations, \(V_1,\dots,V_{n}\), of a random vector \(V\) of dimension \(p\) with \(p\gg n\). The results of many such analyses depend on an estimator of \(\Sigma(V)\), the covariance matrix of \(V\). It is well known that in this situation, \(p\gg n\), it is impossible, except in trivial cases, to obtain consistent estimators of \(\Sigma(V)\) based on \(V_1,\dots,V_{n}\); it is therefore necessary to assume certain structural properties of \(\Sigma(V)\). In this work, assuming that \(\Sigma(V)\) is positive definite, properties of \(\Sigma(V)\) and \(\Sigma(V)^{-1}\) are imposed through the logarithm of \(\Sigma(V)\), defined as the \(p\times p\) matrix \(L(V)\) such that \[ \Sigma(V)=\exp(L(V)):=\sum_{k=0}^{\infty}\frac{1}{k!}(L(V))^{k}. \] The existence and uniqueness of such a matrix is proved in the paper.

Let \(M_{p}(\mathbb{R})\) be the space of \(p\times p\) matrices with real entries, and let \[ V(p,\mathbb{R}):=\{S\in M_{p}(\mathbb{R}) \mid S=S^{T}\}, \] where \(S^{T}\) is the transpose of \(S\). Let \(B=B_1\cup B_2\) be the basis of the vector space \(V(p,\mathbb{R})\) given by \[ B_1:=\{B\in M_{p}(\mathbb{R}) \mid B=e_{j}e_{j}^{T},\; j\in \{1,\dots,p\}\}, \] \[ B_2:=\{B\in M_{p}(\mathbb{R}) \mid B=e_{j}e_{k}^{T}+e_{k}e_{j}^{T},\; j,k\in \{1,\dots,p\},\; j\neq k\}, \] where \(e_1,\dots,e_{p}\) are the canonical basis vectors of \(\mathbb{R}^{p}\), and let \(|B|=p(p+1)/2\) be the number of elements of \(B\). For each \(\alpha=(\alpha_1,\dots,\alpha_{|B|})\in \mathbb{R}^{|B|}\) define \[ L(\alpha):=\sum_{m=1}^{|B|}\alpha_{m}B_{m},\qquad \Sigma(\alpha):=\exp(L(\alpha)), \] \[ \mathrm{Supp}(\alpha):=\{m\in \{1,\dots,|B|\} \mid \alpha_{m}\neq 0\},\qquad s^*(\alpha):=\sharp(\mathrm{Supp}(\alpha)), \] the latter being the ``sparsity of \(\alpha\)''.

Let \(\lambda_1(\alpha)\geq\dots\geq\lambda_{p}(\alpha)\) be the ordered eigenvalues of \(\Sigma(\alpha)\), and let \(\Xi(\alpha)\) be the corresponding matrix of eigenvectors. Then \[ \Sigma(\alpha)=\Xi(\alpha)\,\Lambda(\alpha)\,\Xi(\alpha)^{T},\qquad \Lambda(\alpha):=\mathrm{diag}(\lambda_1(\alpha),\dots,\lambda_{p}(\alpha)), \] and \[ L(\alpha)=\Xi(\alpha)\,\Delta(\alpha)\,\Xi(\alpha)^{T},\qquad \Delta(\alpha):=\mathrm{diag}(\delta_1(\alpha),\dots,\delta_{p}(\alpha)), \] where \(\delta_{j}(\alpha):=\log(\lambda_{j}(\alpha))\) for \(j\in \{1,\dots,p\}\). Hence \(\delta_{j}(\alpha)=0\) if and only if \(\lambda_{j}(\alpha)=1\). Let \[ A(\alpha):=\{j\in \{1,\dots,p\}\mid \lambda_{j}(\alpha)\neq 1\}. \] Then \[ \Sigma(\alpha)=\Xi_{A(\alpha)}\Lambda_{A(\alpha)}\Xi_{A(\alpha)}^{T}+\Xi_{A(\alpha)^{c}}\Xi_{A(\alpha)^{c}}^{T}, \] where \(\Xi_{A(\alpha)}\) denotes the restriction of \(\Xi(\alpha)\) to the columns indexed by \(A(\alpha)\), and analogously for \(\Lambda_{A(\alpha)}\).

For each \(s^*\in \{1,\dots,|B|\}\) and each \(\alpha=(\alpha_1,\dots,\alpha_{|B|})\in \mathbb{R}^{|B|}\) such that \(s^*(\alpha)=s^*\), define the sets \[ S(s^*,\alpha):=\{m\in \{1,\dots,|B|\}\mid \alpha_{m}\neq 0\},\qquad B_{S(s^*,\alpha)}:=\{B_{m}\in B \mid m\in S(s^*,\alpha)\}, \] and \[ L(s^*,\alpha):=\{b_{j}^{(m)} \mid j\in \{1,\dots,p\},\; b_{j}^{(m)}\neq 0,\; m\in S(s^*,\alpha)\}, \] where \(b_{j}^{(m)}\), for \(m\in S(s^*,\alpha)\), is the \(j\)th column of \(B_{m}\in B_{S(s^*,\alpha)}\).
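To make the construction concrete, here is a minimal numerical sketch (an illustration assuming NumPy and SciPy, not code from the paper under review): it builds the basis \(B=B_1\cup B_2\) for \(p=4\), forms \(L(\alpha)\) for a sparse coefficient vector \(\alpha\), and checks that the principal matrix logarithm of \(\Sigma(\alpha)=\exp(L(\alpha))\) recovers \(L(\alpha)\).

```python
import numpy as np
from scipy.linalg import expm, logm

p = 4
e = np.eye(p)

# B1: the diagonal generators e_j e_j^T.
B1 = [np.outer(e[j], e[j]) for j in range(p)]
# B2: the off-diagonal generators e_j e_k^T + e_k e_j^T, j < k.
B2 = [np.outer(e[j], e[k]) + np.outer(e[k], e[j])
      for j in range(p) for k in range(j + 1, p)]
B = B1 + B2
assert len(B) == p * (p + 1) // 2          # |B| = p(p+1)/2

# A sparse coefficient vector alpha with sparsity s*(alpha) = 2.
alpha = np.zeros(len(B))
alpha[0] = 0.5   # weight on e_1 e_1^T
alpha[p] = 0.3   # weight on e_1 e_2^T + e_2 e_1^T

L = sum(a * Bm for a, Bm in zip(alpha, B))  # L(alpha)
Sigma = expm(L)                             # Sigma(alpha) = exp(L(alpha))

# L(alpha) is symmetric, so Sigma(alpha) is symmetric positive definite
# and its (principal) matrix logarithm recovers L(alpha).
assert np.allclose(logm(Sigma), L)
```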
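Continuing the sketch, the eigenstructure relations above can be verified directly: the eigenvalues of \(\Sigma(\alpha)\) are the exponentials of those of \(L(\alpha)\), and splitting off the set \(A(\alpha)\) of eigenvalues different from 1 reproduces \(\Sigma(\alpha)\) (variable names are the sketch's own).

```python
# Eigendecomposition: Sigma(alpha) = Xi Lambda Xi^T, L(alpha) = Xi Delta Xi^T,
# with delta_j = log(lambda_j). (eigh sorts ascending; the ordering is
# irrelevant for these identities.)
lam, Xi = np.linalg.eigh(Sigma)
delta = np.log(lam)
assert np.allclose(Xi @ np.diag(delta) @ Xi.T, L)

# A(alpha): indices with lambda_j != 1, i.e. delta_j != 0.
A = np.abs(lam - 1.0) > 1e-10
# Sigma = Xi_A Lambda_A Xi_A^T + Xi_{A^c} Xi_{A^c}^T.
Sigma_split = (Xi[:, A] @ np.diag(lam[A]) @ Xi[:, A].T
               + Xi[:, ~A] @ Xi[:, ~A].T)
assert np.allclose(Sigma_split, Sigma)
```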
Finally, let \(D^*(s^*,\alpha)\) be the number of distinct elements of \(L(s^*,\alpha)\). Note that \(s^*(\alpha)=s^*=s^*(\alpha')\) does not imply that \(D^*(s^*,\alpha)=D^*(s^*,\alpha')\). The main result of the paper is the following.

\textbf{Theorem.} Let \(C\) be the set \[ C:=\{\alpha\in \mathbb{R}^{|B|} \mid \Sigma(\alpha)\text{ is positive definite}\}, \] let \(X=(X_1,\dots,X_{|B|})\) be a random vector with values in \(C\), and let \(\alpha=(\alpha_1,\dots,\alpha_{|B|})\) be an observed value of \(X\). Then \(\Sigma(\alpha)=\exp(L(\alpha))\) with \(L(\alpha):=\sum_{m=1}^{|B|}\alpha_{m}B_{m}\) and \(s^*(\alpha)=s^*\) if and only if \[ \Sigma(\alpha)=PKP^{-1}, \] with \(P\) a permutation matrix, \(K\) a block diagonal matrix with blocks \(K_1\) and \(I_{p-D^*(s^*,\alpha)}\), and \(E\bigl(D^*(s^*,X)\mid s^*(X)=s^*\bigr)=d^*\), where \[ \frac{4p+p(p-1)}{2(p+1)}\left[\log\left(\frac{p}{p-d^*}\right)-\frac{d^*}{2p(p-d^*)}\right]=s^*. \] In a lemma preceding the theorem it is proved that \[ |A(\alpha)|=D^*(s^*,\alpha) \] for all \(\alpha\in C\).

The authors propose the following interpretation of the theorem: ``In terms of the random vector \(V\) itself, sparsity of \(\alpha\) implies that \(V\) can be decomposed into two subsets of variables, \(V_1\) and \(V_2\), such that \(V_1\) has covariance structure \(K_1\), whilst the elements of \(V_2\) are completely uncorrelated with each other and with the elements of \(V_1\).'' Finally, the authors propose an estimator of the matrix \(L(\alpha)\) of the theorem; unfortunately, the notation used in the definition of this estimator is not defined. The paper is rigorous in its proofs and its results are interesting, but it is hard to understand because the notation is somewhat confusing.
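The lemma \(|A(\alpha)|=D^*(s^*,\alpha)\) can be checked on the same toy example: the sketch below (a hypothetical illustration, not the authors' code) computes \(D^*(s^*,\alpha)\) as the number of distinct nonzero columns occurring among the basis matrices in the support of \(\alpha\) and compares it with \(|A(\alpha)|\).

```python
# Supp(alpha), and the set L(s*, alpha) of distinct nonzero columns b_j^(m)
# of the basis matrices B_m with m in the support.
support = [m for m in range(len(B)) if alpha[m] != 0]
cols = {tuple(B[m][:, j]) for m in support for j in range(p)
        if B[m][:, j].any()}          # entries are exactly 0.0 or 1.0
D_star = len(cols)                    # D*(s*, alpha)
assert D_star == int(A.sum())         # the lemma: |A(alpha)| = D*(s*, alpha)
```

In the example, the support contributes the columns \(e_1\) and \(e_2\), so \(D^*(s^*,\alpha)=2\), matching the two eigenvalues of \(\Sigma(\alpha)\) different from 1.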
covariance matrix
matrix logarithm
precision matrix
spectral theory