FastMMD: Ensemble of Circular Discrepancy for Efficient Two-Sample Test
From MaRDI portal
Publication:5380252
Abstract: The maximum mean discrepancy (MMD) is a recently proposed test statistic for the two-sample test. Its quadratic time complexity, however, greatly hampers its applicability to large-scale problems. To accelerate the MMD calculation, in this study we propose an efficient method called FastMMD. The core idea of FastMMD is to equivalently transform the MMD with shift-invariant kernels into the amplitude expectation of a linear combination of sinusoid components, based on Bochner's theorem and the Fourier transform (Rahimi & Recht, 2007). By sampling the Fourier transform, FastMMD decreases the time complexity of the MMD calculation from \(O(N^2 d)\) to \(O(L N d)\), where \(N\) and \(d\) are the size and dimension of the sample set, respectively. Here \(L\) is the number of basis functions used to approximate the kernel, which determines the approximation accuracy. For kernels that are spherically invariant, the computation can be further accelerated to \(O(L N \log d)\) by using the Fastfood technique (Le et al., 2013). The uniform convergence of our method is also proved theoretically for both the unbiased and the biased estimates. We further provide a geometric explanation of our method, namely the ensemble of circular discrepancy, which gives insight into the MMD and may help inspire a wider range of metrics for the two-sample test. Experimental results substantiate that FastMMD achieves accuracy similar to that of exact MMD, with faster computation and lower variance than existing MMD approximation methods.
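The frequency-sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' FastMMD implementation: it assumes a Gaussian kernel with bandwidth sigma, covers only the biased estimate, omits the Fastfood acceleration, and uses the standard library alone for portability. The function names (exact_mmd2_biased, fourier_mmd2_biased) and parameters (num_freqs standing in for the number of basis functions, sigma, seed) are hypothetical. The sketch relies on the fact that, for a shift-invariant kernel normalized so that k(x, x) = 1, the biased squared MMD equals the expected squared amplitude of the difference between the two empirical characteristic functions, with frequencies drawn from the kernel's spectral density.

```python
import cmath
import math
import random


def exact_mmd2_biased(xs, ys, sigma=1.0):
    """Biased squared MMD with a Gaussian kernel; quadratic in the sample size."""
    def k(a, b):
        sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return math.exp(-sq / (2.0 * sigma ** 2))

    def mean_k(us, vs):
        return sum(k(u, v) for u in us for v in vs) / (len(us) * len(vs))

    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)


def _char_fn(points, w):
    """Empirical characteristic function of `points` at frequency `w`."""
    return sum(
        cmath.exp(1j * sum(wi * pi for wi, pi in zip(w, p))) for p in points
    ) / len(points)


def fourier_mmd2_biased(xs, ys, num_freqs=512, sigma=1.0, seed=0):
    """Approximate the biased squared MMD from sampled frequencies.

    Cost is linear in the sample size: O(num_freqs * N * d).
    Assumption: Gaussian kernel, whose spectral density is N(0, I / sigma^2).
    """
    rng = random.Random(seed)
    d = len(xs[0])
    total = 0.0
    for _ in range(num_freqs):
        # Draw one frequency from the kernel's spectral density (Bochner).
        w = [rng.gauss(0.0, 1.0 / sigma) for _ in range(d)]
        # Squared amplitude of the difference of the two sinusoid averages.
        total += abs(_char_fn(xs, w) - _char_fn(ys, w)) ** 2
    return total / num_freqs
```

Called on two lists of equal-length numeric vectors, fourier_mmd2_biased(xs, ys, num_freqs=2000) should approach exact_mmd2_biased(xs, ys) as num_freqs grows, at a cost that scales linearly rather than quadratically in the number of samples.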
Recommendations
- EuMMD: efficiently computing the MMD two-sample test statistic for univariate data
- A fast algorithm for two-dimensional Kolmogorov-Smirnov two sample tests
- Fast tests for the two-sample problem based on the empirical characteristic function
- A two-sample nonparametric test for circular data -- its exact distribution and performance
- Two-sample test for equal distributions in separate metric space: New maximum mean discrepancy based approaches
- Efficient and adaptive nonparametric test for the two-sample problem
- Exact asymptotic goodness-of-fit testing for discrete circular data with applications
- A New Test of Discordancy in Circular Data
- Simple and efficient adaptive two-sample tests for high-dimensional data
Cites work
- 10.1162/15324430260185619
- 10.1162/15324430260185646
- A Distribution Free Version of the Smirnov Two Sample Test in the \(p\)-Variate Case
- A Hilbert Space Embedding for Distributions
- A kernel two-sample test
- Asymptotic Behaviors of Support Vector Machines with Gaussian Kernel
- Bipartite graph matching for points on a line or a circle
- Equivalence of distance-based and RKHS-based statistics in hypothesis testing
- Hilbert space embeddings and metrics on probability measures
- Multivariate generalizations of the Wald-Wolfowitz and Smirnov two-sample tests
- On the Asymptotic Properties of a Nonparametric \(L_1\)-Test Statistic of Homogeneity
- On the bootstrap of \(U\) and \(V\) statistics
- On the mathematical foundations of learning
- Permutation tests for equality of distributions in high-dimensional settings
- Quasi-Monte Carlo feature maps for shift-invariant kernels
- Statistical Analysis of Circular Data
- Universality, Characteristic Kernels and RKHS Embedding of Measures
Cited in (8)
- A review of multivariate distributions for count data derived from the Poisson distribution
- Multivariate Rank-Based Distribution-Free Nonparametric Testing Using Measure Transportation
- A fast algorithm for two-dimensional Kolmogorov-Smirnov two sample tests
- scientific article; zbMATH DE number 7758314
- Unbalanced optimal transport and maximum mean discrepancies: interconnections and rapid evaluation
- EuMMD: efficiently computing the MMD two-sample test statistic for univariate data
- FastMMD
- Large-scale kernel methods for independence testing