On the empirical multilinear copula process for count data (Q396007)

The aim of the paper under review is to study the asymptotic behavior of the empirical process associated with the multilinear extension copula based on \(d\)-variate count data. The main result suffices to deduce the limiting distribution of classical statistics for monotone trend, such as Spearman's rho and Kendall's tau, and it performs well for sparse contingency tables whose dimensions depend on the sample size. The first two authors had studied this copula earlier and established some important dependence properties in the two-dimensional case. In Section 2, the authors consider a vector \(X=(X_1, \dots, X_d)\) of discrete random variables with joint cumulative distribution function \(H\), and define the multilinear extension copula \(C^*\) of \(H\) as the copula whose density with respect to the Lebesgue measure is given by \[ c^*(u_1, \dots, u_d) = \frac{\Pr[X_1=A_1(k_1), \dots, X_d=A_d(k_d)]}{\Pr[X_1=A_1(k_1)]\cdots \Pr[X_d=A_d(k_d)]}, \] for all \(F_j[A_j(k_j-1)]< u_j \leq F_j[A_j(k_j)]\) for some \(k_j\in \mathbb{N}\), where \(F_j\) is the distribution function of \(X_j\) (\(1\leq j \leq d\)). The explicit form of \(C^*\) is given in Proposition 2.1 and proved in Appendix A. The multilinear extension copula satisfies Sklar's representation and is invariant with respect to strictly increasing transformations of the margins. Consider a random sample \({\chi}=\{(X_{11}, \dots, X_{1d}), \dots, (X_{n1}, \dots, X_{nd})\}\) from \(H\) and let \(H_n\) be the corresponding empirical distribution function. The empirical counterparts of \(c^*\) and \(C^*\) are then defined by \[ c_n^*(u_1, \dots, u_d) = \frac{h_n[A_{n1}(k_1), \dots, A_{nd}(k_d)]}{\Delta F_{n1}[A_{n1}(k_1)]\cdots \Delta F_{nd}[A_{nd}(k_d)]}, \] \[ C_n^*(u_1, \dots, u_d) = \sum_{S\subset\{1, \dots, d\}}\lambda_{H_n, S}(u_1, \dots, u_d)H_n\{ F_{n1}^{-1}(u_{S_1}), \dots, F_{nd}^{-1}(u_{S_d})\}. \] The estimator \(C_n^*\) is asymptotically equivalent to other versions of the empirical copula commonly used in the literature.
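
The definition of \(c_n^*\) can be illustrated for \(d=2\) with a minimal Python sketch. It is not taken from the paper: the function name `empirical_multilinear_density` and the toy sample are hypothetical, and the sketch assumes that \(h_n\) is the empirical joint probability mass function, that \(\Delta F_{nj}\) is the jump of the \(j\)th empirical margin, and that the points \(A_{nj}(k)\) are the distinct observed values of the \(j\)th coordinate.

```python
import numpy as np

def empirical_multilinear_density(sample):
    """Piecewise-constant density c_n^* for a bivariate count sample (d = 2).

    Assumption: the grid points are the distinct observed values per margin.
    Returns the matrix of density values and the two vectors of marginal jumps.
    """
    sample = np.asarray(sample)
    n = sample.shape[0]
    # Distinct observed values per margin and the cell index of each observation.
    a1, i1 = np.unique(sample[:, 0], return_inverse=True)
    a2, i2 = np.unique(sample[:, 1], return_inverse=True)
    # Empirical joint probability mass function h_n on the observed grid.
    h = np.zeros((a1.size, a2.size))
    np.add.at(h, (i1.ravel(), i2.ravel()), 1.0 / n)
    # Marginal jumps Delta F_n1 and Delta F_n2 (row and column sums of h_n).
    p1, p2 = h.sum(axis=1), h.sum(axis=0)
    # Density value on each rectangle: joint mass over product of marginal masses.
    return h / np.outer(p1, p2), p1, p2

# Hypothetical toy sample of bivariate counts.
sample = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2), (0, 0)]
density, p1, p2 = empirical_multilinear_density(sample)
```

Each rectangle has side lengths `p1[i]` and `p2[j]`, so integrating the density over one coordinate gives 1 for every row and every column. This is a quick numerical check that the margins of the resulting copula are uniform.
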
Section 3 gives the main result on the asymptotic behavior of \(C_n^*\) as \(n\to \infty\). Theorem 3.1 shows that the empirical process associated with \(C_n^*\) converges weakly over any compact subset \(K\). Section 4 and Appendix B are devoted to the proof of Theorem 3.1. The first step considers the case where the margins of \(H\) are known. Proposition 4.1 shows that the empirical process \(C_n^*\) converges in this case, and that the process \(\hat{C}_n^*\), in which the margins are unknown, can be decomposed as \(\tilde{C}_n^* + \tilde{D}_n\). The next step shows that \(\tilde{C}_n^* - C_n^*\) and \(\tilde{D}_n - D_n\) both converge to zero, and that \(\hat{C}_n^*\) is a uniformly consistent estimator of \(C^*\). Section 5 presents applications that illustrate the usefulness of this result. Kendall's tau and Spearman's rho are two classical measures of monotone trend for two-way cross-classifications of ordinal or interval data. Proposition 5.1 shows the weak convergence of tau for \(d=2\), Proposition 5.2 shows the weak convergence of rho for \(d\geq 2\), and Proposition 5.3 shows that a test based on the Cramér-von Mises statistic \(S_n\) is consistent against any alternative. However, the limiting null distribution of \(S_n\) depends on the margins of \(H\), which are generally unknown. For \(d=2\), Algorithm 5.1 describes how to carry out the test by resorting to resampling techniques such as the multiplier bootstrap. The paper ends with a conclusion in Section 6. It is nice that all proofs are given in the appendices.
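
As an illustration of the kind of functional treated in Section 5, the following Python sketch computes a plug-in Spearman's rho for \(d=2\) from the piecewise-constant density of the empirical multilinear copula. It uses the standard copula identity \(\rho = 12\int\!\!\int uv\, dC(u,v) - 3\). The function name `multilinear_spearman_rho` is hypothetical, the grid is assumed to consist of the distinct observed values, and the exact normalization of the statistic in Proposition 5.2 may differ from this sketch.

```python
import numpy as np

def multilinear_spearman_rho(x, y):
    """Plug-in Spearman's rho, 12 * integral of u*v dC_n^* minus 3, for d = 2.

    Assumption: C_n^* has constant density on the rectangles defined by the
    empirical margins, so the integral reduces to a finite sum over cells.
    """
    x, y = np.asarray(x), np.asarray(y)
    n = x.size
    # Cell index of each observation along each margin.
    _, ix = np.unique(x, return_inverse=True)
    _, iy = np.unique(y, return_inverse=True)
    # Empirical joint probability mass function on the observed grid.
    h = np.zeros((ix.max() + 1, iy.max() + 1))
    np.add.at(h, (ix.ravel(), iy.ravel()), 1.0 / n)
    p1, p2 = h.sum(axis=1), h.sum(axis=0)
    # Midpoints of the intervals (F_n(a_{k-1}), F_n(a_k)] on each axis.
    m1 = np.cumsum(p1) - p1 / 2
    m2 = np.cumsum(p2) - p2 / 2
    # With a constant density per rectangle, the integral of u*v over cell (i, j)
    # equals h[i, j] * m1[i] * m2[j].
    return 12.0 * (m1 @ h @ m2) - 3.0

rho_comonotone = multilinear_spearman_rho([0, 1], [0, 1])               # about 0.75
rho_independent = multilinear_spearman_rho([0, 0, 1, 1], [0, 1, 0, 1])  # about 0.0
```

For perfectly comonotone binary data the value is 0.75 rather than 1. This reflects the fact that, for count data, ties keep such measures away from the bounds \(\pm 1\).
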


Mathematics Subject Classification ID

62H20
