Optimal rates for independence testing via $U$-statistic permutation tests (Q130903)

scientific article; zbMATH DE number 7438256

Statements

15 January 2020
3 December 2021
math.ST
stat.ME
stat.ML
stat.TH
Optimal rates for independence testing via \(U\)-statistic permutation tests (English)
The authors study the problem of independence testing in a general framework, where the data consist of independent copies of a pair \((X,Y)\) taking values in a separable measure space \(\mathcal{X}\times\mathcal{Y}\), equipped with a \(\sigma\)-finite measure \(\mu\). Assuming that the joint distribution of \((X,Y)\) has a density \(f\) with respect to \(\mu\), one may define a measure of dependence \(D(f)\), given by the squared \(L^2(\mu)\) distance between the joint density and the product of its marginal densities. This satisfies the natural requirement that \(D(f)=0\) if and only if \(X\) and \(Y\) are independent. However, Theorem 1 shows that it is not possible to construct a valid independence test with nontrivial power against all alternatives satisfying a lower bound on \(D(f)\). This motivates the introduction of classes satisfying an additional Sobolev-type smoothness condition as well as boundedness conditions on the joint and marginal densities. The first main goal of this work is to determine the minimax separation rate of independence testing over these classes, and to this end, a new permutation test of independence based on a \(U\)-statistic estimator of \(D(f)\) is defined. Further, Theorem 2 in Section 3 provides a very general upper bound on the separation rate of independence testing. Note that the framework is broad enough to include both discrete and absolutely continuous data, as well as data that may take values in infinite-dimensional spaces. The authors show how the bound can be simplified in many special cases and, in Section 4, how to construct adaptive versions of their tests that incur only a small loss in effective sample size. Moreover, in Section 5, matching lower bounds are provided in several instances, which allows the conclusion that the proposed USP test attains the minimax optimal separation rate for independence testing in such settings.
In Section 6, an approximation to the power function of the test at local alternatives is derived, giving a detailed description of its behaviour. Numerical properties are studied in Section 7. The proposed methodology is implemented in the \texttt{R} package \texttt{USP}.
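To illustrate the general idea of a permutation test built on an estimate of \(D(f)\), the following is a minimal Python sketch for discrete data under counting measure, where \(D(f)=\sum_{i,j}(p_{ij}-p_{i\cdot}p_{\cdot j})^2\). It is not the authors' USP test and does not use the \texttt{USP} package: for simplicity it swaps in a plug-in estimator of \(D(f)\) in place of the \(U\)-statistic estimator, and the function names, the integer coding of the categories and the default of 999 permutations are all assumptions made for this example.

```python
import numpy as np


def plugin_dependence(x, y, kx, ky):
    """Plug-in estimate of D(f) for integer-coded discrete data.

    Assumption: x takes values in {0, ..., kx-1} and y in {0, ..., ky-1}.
    This is NOT the U-statistic estimator used by the USP test.
    """
    n = len(x)
    joint = np.zeros((kx, ky))
    np.add.at(joint, (x, y), 1.0)   # contingency table of counts
    joint /= n                      # empirical joint probabilities
    px = joint.sum(axis=1)          # empirical marginal of X
    py = joint.sum(axis=0)          # empirical marginal of Y
    return float(((joint - np.outer(px, py)) ** 2).sum())


def permutation_pvalue(x, y, kx, ky, n_perm=999, seed=0):
    """Permutation p-value: permute y to mimic independence of X and Y."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    y = np.asarray(y)
    observed = plugin_dependence(x, y, kx, ky)
    exceed = 0
    for _ in range(n_perm):
        y_perm = rng.permutation(y)
        if plugin_dependence(x, y_perm, kx, ky) >= observed:
            exceed += 1
    # The "+1" terms count the observed statistic among the permuted ones,
    # which keeps the test valid at any nominal level.
    return (1 + exceed) / (1 + n_perm)
```

Calling `permutation_pvalue(x, y, kx, ky)` on two integer-coded samples of equal length returns a p-value; rejecting when it is at most the nominal level gives a test whose validity follows from exchangeability under the null, whichever statistic is used.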
independence testing
minimax separation rates
permutation tests
Stein's method
U-statistics
\(U\)-statistic permutation (USP) test