FastMMD: Ensemble of Circular Discrepancy for Efficient Two-Sample Test
Publication:5380252
DOI: 10.1162/NECO_A_00732; zbMATH Open: 1476.62082; DBLP: journals/neco/0001M15; arXiv: 1405.2664; OpenAlex: W2002870907; Wikidata: Q47571345; Scholia: Q47571345; MaRDI QID: Q5380252; FDO: Q5380252
Publication date: 4 June 2019
Published in: Neural Computation
Abstract: The maximum mean discrepancy (MMD) is a recently proposed test statistic for the two-sample test. Its quadratic time complexity, however, greatly hampers its applicability to large-scale problems. To accelerate the MMD calculation, in this study we propose an efficient method called FastMMD. The core idea of FastMMD is to equivalently transform the MMD with shift-invariant kernels into the amplitude expectation of a linear combination of sinusoid components, based on Bochner's theorem and the Fourier transform (Rahimi & Recht, 2007). By sampling from the Fourier transform, FastMMD decreases the time complexity of the MMD calculation from \(O(N^2 d)\) to \(O(L N d)\), where \(N\) and \(d\) are the size and dimension of the sample set, respectively, and \(L\) is the number of basis functions used to approximate the kernel, which determines the approximation accuracy. For kernels that are spherically invariant, the computation can be further accelerated to \(O(L N \log d)\) by using the Fastfood technique (Le et al., 2013). Uniform convergence of the method is proved theoretically for both the unbiased and biased estimates. We also provide a geometric explanation of the method, namely an ensemble of circular discrepancies, which gives insight into the MMD and may help inspire further metrics for two-sample testing. Experimental results show that FastMMD achieves accuracy similar to that of exact MMD, with faster computation and lower variance than existing MMD approximation methods.
Full work available at URL: https://arxiv.org/abs/1405.2664
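The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Gaussian kernel with bandwidth `sigma`, uses the biased estimate of the squared MMD, and applies the plain random Fourier feature construction (no Fastfood acceleration). The function names `exact_mmd2_biased`, `rff_mmd2_biased`, and `mean_phase` are hypothetical and chosen only for this example. For each sampled frequency, the squared amplitude of the difference between the two samples' mean complex exponentials is computed, and these values are averaged over `num_features` frequencies.

```python
import math
import random


def gaussian_kernel(x, y, sigma):
    """Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def exact_mmd2_biased(X, Y, sigma):
    """Biased estimate of squared MMD; cost grows quadratically in sample size."""
    def mean_kernel(A, B):
        total = sum(gaussian_kernel(a, b, sigma) for a in A for b in B)
        return total / (len(A) * len(B))
    return mean_kernel(X, X) + mean_kernel(Y, Y) - 2.0 * mean_kernel(X, Y)


def mean_phase(S, w):
    """Real and imaginary parts of the sample mean of exp(i * w.s)."""
    re = im = 0.0
    for s in S:
        t = sum(wi * si for wi, si in zip(w, s))
        re += math.cos(t)
        im += math.sin(t)
    return re / len(S), im / len(S)


def rff_mmd2_biased(X, Y, sigma, num_features, rng):
    """Random Fourier feature approximation; cost is linear in sample size."""
    d = len(X[0])
    total = 0.0
    for _ in range(num_features):
        # By Bochner's theorem, the Gaussian kernel corresponds to
        # frequencies drawn from N(0, sigma^-2 I).
        w = [rng.gauss(0.0, 1.0 / sigma) for _ in range(d)]
        rx, ix = mean_phase(X, w)
        ry, iy = mean_phase(Y, w)
        # Squared amplitude of the difference of the two mean sinusoids.
        total += (rx - ry) ** 2 + (ix - iy) ** 2
    return total / num_features


# Two 2-dimensional samples whose means differ by 1 in each coordinate.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(100)]
Y = [[rng.gauss(1.0, 1.0) for _ in range(2)] for _ in range(100)]
exact = exact_mmd2_biased(X, Y, sigma=1.0)
approx = rff_mmd2_biased(X, Y, sigma=1.0, num_features=2000, rng=rng)
print(exact, approx)
```

With more frequencies the approximation tightens at a cost proportional to `num_features`, which is the accuracy and speed trade-off the abstract attributes to the number of basis functions.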
Recommendations
- EuMMD: efficiently computing the MMD two-sample test statistic for univariate data
- A fast algorithm for two-dimensional Kolmogorov-Smirnov two sample tests
- Fast tests for the two-sample problem based on the empirical characteristic function
- A two-sample nonparametric test for circular data -- its exact distribution and performance
- Two-sample test for equal distributions in separate metric space: New maximum mean discrepancy based approaches
- Efficient and adaptive nonparametric test for the two-sample problem
- Exact asymptotic goodness-of-fit testing for discrete circular data with applications
- A New Test of Discordancy in Circular Data
- Simple and efficient adaptive two-sample tests for high-dimensional data
Cites Work
- Equivalence of distance-based and RKHS-based statistics in hypothesis testing
- 10.1162/15324430260185619
- On the bootstrap of \(U\) and \(V\) statistics
- Hilbert space embeddings and metrics on probability measures
- A Hilbert Space Embedding for Distributions
- Universality, Characteristic Kernels and RKHS Embedding of Measures
- A kernel two-sample test
- On the mathematical foundations of learning
- Multivariate generalizations of the Wald-Wolfowitz and Smirnov two-sample tests
- Statistical Analysis of Circular Data
- Permutation tests for equality of distributions in high-dimensional settings
- A Distribution Free Version of the Smirnov Two Sample Test in the \(p\)-Variate Case
- Asymptotic Behaviors of Support Vector Machines with Gaussian Kernel
- 10.1162/15324430260185646
- On the Asymptotic Properties of a Nonparametric \(L_1\)-Test Statistic of Homogeneity
- Bipartite graph matching for points on a line or a circle
- Quasi-Monte Carlo feature maps for shift-invariant kernels
Cited In (8)
- EuMMD: efficiently computing the MMD two-sample test statistic for univariate data
- Large-scale kernel methods for independence testing
- Title not available
- Multivariate Rank-Based Distribution-Free Nonparametric Testing Using Measure Transportation
- Unbalanced optimal transport and maximum mean discrepancies: interconnections and rapid evaluation
- A fast algorithm for two-dimensional Kolmogorov-Smirnov two sample tests
- FastMMD
- A review of multivariate distributions for count data derived from the Poisson distribution