The Hardness of Conditional Independence Testing and the Generalised Covariance Measure
Abstract: It is a common saying that testing for conditional independence, i.e., testing whether two random vectors X and Y are independent, given Z, is a hard statistical problem if Z is a continuous random variable (or vector). In this paper, we prove that conditional independence is indeed a particularly difficult hypothesis to test for. Valid statistical tests are required to have a size that is smaller than a predefined significance level, and different tests usually have power against a different class of alternatives. We prove that a valid test for conditional independence does not have power against any alternative. Given the non-existence of a uniformly valid conditional independence test, we argue that tests must be designed so their suitability for a particular problem may be judged easily. To address this need, we propose, in the case where X and Y are univariate, to nonlinearly regress X on Z and Y on Z, and then to compute a test statistic based on the sample covariance between the residuals, which we call the generalised covariance measure (GCM). We prove that the validity of this form of test relies almost entirely on the weak requirement that the regression procedures are able to estimate the conditional means of X given Z, and of Y given Z, at a slow rate. We extend the methodology to handle settings where X and Y may be multivariate or even high-dimensional. While our general procedure can be tailored to the setting at hand by combining it with any regression technique, we develop the theoretical guarantees for kernel ridge regression. A simulation study shows that the test based on GCM is competitive with state-of-the-art conditional independence tests. Code is available as the R package GeneralisedCovarianceMeasure on CRAN.
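The univariate procedure described in the abstract (regress X on Z, regress Y on Z, then test whether the residual products have mean zero) can be illustrated with a minimal Python sketch. This is not the API of the R package GeneralisedCovarianceMeasure. The function names `krr_residuals` and `gcm_test` are hypothetical, and the Gaussian kernel, the bandwidth, the ridge penalty, and the use of in-sample residuals are all assumptions made for illustration. The statistic is the normalised mean of the residual products, compared against a standard normal reference distribution.

```python
import math
import numpy as np


def krr_residuals(z, target, bandwidth=1.0, ridge=1e-2):
    """In-sample residuals from Gaussian-kernel ridge regression of target on z.

    Hypothetical helper: bandwidth and ridge are illustrative defaults,
    not tuned values.
    """
    target = np.asarray(target, dtype=float)
    n = len(target)
    z = np.asarray(z, dtype=float).reshape(n, -1)
    # Pairwise squared Euclidean distances between rows of z.
    sq_dists = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    gram = np.exp(-sq_dists / (2.0 * bandwidth**2))
    # Solve (K + ridge * n * I) alpha = target; fitted values are K @ alpha.
    coef = np.linalg.solve(gram + ridge * n * np.eye(n), target)
    return target - gram @ coef


def gcm_test(x, y, z, **krr_kwargs):
    """Return (statistic, two-sided p-value) for univariate x and y given z."""
    res_x = krr_residuals(z, x, **krr_kwargs)
    res_y = krr_residuals(z, y, **krr_kwargs)
    prod = res_x * res_y
    n = len(prod)
    # sqrt(n) * mean(R) / sqrt(mean(R^2) - mean(R)^2); np.std uses ddof=0.
    stat = math.sqrt(n) * prod.mean() / prod.std()
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=300)
    x = np.sin(z) + 0.5 * rng.normal(size=300)
    y = np.cos(z) + 0.5 * rng.normal(size=300)  # X and Y independent given Z
    print(gcm_test(x, y, z))
```

In practice the regression step would be tuned, for example by cross-validating the bandwidth and penalty, and any other regression method could replace the kernel ridge helper, as the abstract notes.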
Recommendations
- A GENERALIZATION OF TESTING INDEPENDENCE OF SETS OF VARIATES
- General tests of conditional independence based on empirical processes indexed by functions
- Expected conditional characteristic function-based measures for testing independence
- Testing conditional independence via empirical likelihood
- A flexible nonparametric test for conditional independence
- A consistent characteristic function-based test for conditional independence
- Nonparametric tests for conditional independence using conditional distributions
- On the power of conditional independence testing under model-X
- On some tests of the covariance matrix under general conditions
Cited in (40)
- General tests of conditional independence based on empirical processes indexed by functions
- Learning to increase the power of conditional randomization tests
- Minimax optimal conditional independence testing
- Conditional independence testing via weighted partial copulas
- Local permutation tests for conditional independence
- Sufficient variable screening with high-dimensional controls
- Anytime-Valid Tests of Conditional Independence Under Model-X
- The conditional permutation test for independence while controlling for confounders
- On the power of conditional independence testing under model-X
- A double-robust test for high-dimensional gene coexpression networks conditioning on clinical information
- Conditional feature importance for mixed data
- Reconciling model-X and doubly robust approaches to conditional independence testing
- Nonparametric conditional local independence testing
- scientific article; zbMATH DE number 7370579
- Extending greedy feature selection algorithms to multiple solutions
- Test of conditional independence in factor models via Hilbert-Schmidt independence criterion
- GeneralisedCovarianceMeasure
- weightedGCM
- The Holdout Randomization Test for Feature Selection in Black Box Models
- scientific article; zbMATH DE number 7626800
- Efficient and multiply robust risk estimation under general forms of dataset shift
- Assumption-lean falsification tests of rate double-robustness of double-machine-learning estimators
- From statistical to causal learning
- Testing Directed Acyclic Graph via Structural, Supervised and Generative Adversarial Learning
- Yet another look at the omitted variable bias
- Asymptotic distributions of high-dimensional distance correlation inference
- On Azadkia-Chatterjee's conditional dependence coefficient
- comets
- A survey of some recent developments in measures of association
- Demystifying Statistical Learning Based on Efficient Influence Functions
- scientific article; zbMATH DE number 7370587
- Causal structure learning: a combinatorial perspective
- A new covariate selection strategy for high dimensional data in causal effect estimation with multivariate treatments
- Double-estimation-friendly inference for high-dimensional misspecified models
- Testing conditional independence in supervised learning algorithms
- A simple measure of conditional dependence
- Optimal rates for independence testing via U-statistic permutation tests
- Game-theoretic statistical inference: optional sampling, universal inference, and multiple testing based on e-values. Abstracts from the workshop held May 5--10, 2024
- Comment: Reflections on the Deconfounder
- On universally consistent and fully distribution-free rank tests of vector independence