Dimension-agnostic inference using cross U-statistics
Abstract: Classical asymptotic theory for statistical inference usually involves calibrating a statistic by fixing the dimension \(d\) while letting the sample size \(n\) increase to infinity. Recently, much effort has been dedicated towards understanding how these methods behave in high-dimensional settings, where \(d\) and \(n\) both increase to infinity together. This often leads to different inference procedures, depending on the assumptions about the dimensionality, leaving the practitioner in a bind: given a dataset with 100 samples in 20 dimensions, should they calibrate by assuming \(n \gg d\), or \(d/n \approx 0.2\)? This paper considers the goal of dimension-agnostic inference: developing methods whose validity does not depend on any assumption on \(d\) versus \(n\). We introduce an approach that uses variational representations of existing test statistics along with sample splitting and self-normalization to produce a refined test statistic with a Gaussian limiting distribution, regardless of how \(d\) scales with \(n\). The resulting statistic can be viewed as a careful modification of degenerate U-statistics, dropping diagonal blocks and retaining off-diagonal blocks. We exemplify our technique for some classical problems, including one-sample mean and covariance testing, and show that our tests have minimax rate-optimal power against appropriate local alternatives. In most settings, our cross U-statistic matches the high-dimensional power of the corresponding (degenerate) U-statistic up to a \(\sqrt{2}\) factor.
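To make the recipe in the abstract concrete, here is a minimal sketch of the one-sample mean-testing instance: split the sample, project the first half onto the mean direction estimated from the second half, and studentize. The function name cross_mean_test, the even 50/50 split, and the one-sided rejection rule are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of a cross U-statistic for H0: E[X] = 0, assuming an even
# sample split and a one-sided rejection rule (illustrative choices).
import numpy as np
from scipy.stats import norm

def cross_mean_test(X, alpha=0.05, seed=0):
    """Dimension-agnostic test of H0: E[X] = 0.

    X : (n, d) array of i.i.d. observations.
    Returns (statistic, p_value, reject).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.permutation(n)
    half = n // 2
    X1, X2 = X[idx[:half]], X[idx[half:]]

    # Project the first half onto the direction estimated from the
    # second half: this retains the cross (off-diagonal) blocks of the
    # degenerate U-statistic while dropping the diagonal blocks.
    direction = X2.mean(axis=0)
    f = X1 @ direction

    # Self-normalize: the studentized mean of the projections is
    # asymptotically N(0, 1) regardless of how d scales with n.
    T = np.sqrt(half) * f.mean() / f.std(ddof=1)
    p_value = norm.sf(T)  # one-sided: the alternative pushes T upward
    return T, p_value, p_value < alpha

# Example: null data in the regime from the abstract (n = 100, d = 20).
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 20))
    print(cross_mean_test(X))
```

Under the alternative, the projections have expectation \(\|\mu\|^2 \ge 0\), which is why a one-sided rejection region is the natural choice in this sketch.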
Cites work
- Scientific article; zbMATH DE number 3551712 (title unavailable)
- Scientific article; zbMATH DE number 1964693 (title unavailable)
- Scientific article; zbMATH DE number 889593 (title unavailable)
- A feasible high dimensional randomization test for the mean vector
- A kernel two-sample test
- A modern maximum-likelihood theory for high-dimensional logistic regression
- A new test for multivariate normality
- A note on data-splitting for the evaluation of significance levels
- A note on testing the covariance matrix for large dimension
- A one-sample test for normality with kernel methods
- A review of 20 years of naive tests of significance for high-dimensional mean vectors and covariance matrices
- A test for the mean vector with fewer observations than the dimension
- A two-sample test for high-dimensional data with applications to gene-set testing
- Asymptotic behavior of M-estimators of \(p\) regression parameters when \(p^2/n\) is large. I: Consistency
- Asymptotic behavior of M-estimators of \(p\) regression parameters when \(p^2/n\) is large. II: Normal approximation
- Asymptotic normality of a consistent estimator of maximum mean discrepancy in Hilbert space
- Asymptotic normality of quadratic estimators
- Bootstrapping and sample splitting for high-dimensional, assumption-lean inference
- Can we trust the bootstrap in high-dimensions? The case of linear models
- Central limit theorem for integrated square error of multivariate nonparametric density estimators
- Central limit theorems and bootstrap in high dimensions
- Central limit theorems for classical likelihood ratio tests for high-dimensional normal distributions
- Classification accuracy as a proxy for two-sample testing
- Conditional Distance Correlation
- Conditional mean and quantile dependence testing in high dimension
- Distribution and quantile functions, ranks and signs in dimension \(d\): a measure transportation approach
- Distribution-Free Consistent Independence Tests via Center-Outward Ranks and Signs
- Distribution-free predictive inference for regression
- Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
- Estimation of integrated squared density derivatives
- Exact and Approximate Stepdown Methods for Multiple Hypothesis Testing
- Exact bounds on the closeness between the Student and standard normal distributions
- Gaussian universal likelihood ratio testing
- Goodness-of-fit Testing in High Dimensional Generalized Linear Models
- High-dimensional probability. An introduction with applications in data science
- Interaction screening for ultrahigh-dimensional data
- Minimax Euclidean separation rates for testing convex hypotheses in \(\mathbb{R}^{d}\)
- Minimax optimality of permutation tests
- Modification of some goodness-of-fit statistics to yield asymptotically normal null distributions
- Monge-Kantorovich depth, quantiles, ranks and signs
- Multinomial goodness-of-fit based on \(U\)-statistics: high-dimensional asymptotic and minimax optimality
- Multivariate Rank-Based Distribution-Free Nonparametric Testing Using Measure Transportation
- Non-asymptotic minimax rates of testing in signal detection
- Nonparametric goodness-of-fit testing under Gaussian models
- On some test criteria for covariance matrix
- On the power of conditional independence testing under model-X
- Optimal hypothesis testing for high dimensional covariance matrices
- Quantifying uncertainty in random forests via confidence intervals and hypothesis tests
- Robust multivariate nonparametric tests via projection averaging
- Some properties of incomplete U-statistics
- Split sample methods for constructing confidence intervals for binomial and Poisson parameters
- Student's t-Test Under Symmetry Conditions
- Testing Statistical Hypotheses
- Tests for high-dimensional covariance matrices
- Tests for high-dimensional regression coefficients with factorial designs
- The Berry-Esseen bound for Student's statistic
- The Holdout Randomization Test for Feature Selection in Black Box Models
- The Large-Sample Distribution of the Likelihood Ratio for Testing Composite Hypotheses
- Two-Sample Test of High Dimensional Means Under Dependence
- Universal inference