Double-estimation-friendly inference for high-dimensional misspecified models

Publication:6325876

DOI: 10.1214/22-STS850; arXiv: 1909.10828; MaRDI QID: Q6325876; FDO: Q6325876


Authors: Rajen D. Shah, Peter Bühlmann


Publication date: 24 September 2019

Abstract: All models may be wrong -- but that is not necessarily a problem for inference. Consider the standard t-test for the significance of a variable X for predicting response Y whilst controlling for p other covariates Z in a random design linear model. This yields correct asymptotic type I error control for the null hypothesis that X is conditionally independent of Y given Z under an arbitrary regression model of Y on (X,Z), provided that a linear regression model for X on Z holds. An analogous robustness to misspecification, which we term the "double-estimation-friendly" (DEF) property, also holds for Wald tests in generalised linear models, with some small modifications. In this expository paper we explore this phenomenon, and propose methodology for high-dimensional regression settings that respects the DEF property. We advocate specifying (sparse) generalised linear regression models for both Y and the covariate of interest X; our framework gives valid inference for the conditional independence null if either of these holds. In the special case where both specifications are linear, our proposal amounts to a small modification of the popular debiased Lasso test. We also investigate constructing confidence intervals for the regression coefficient of X via inverting our tests; these have coverage guarantees even in partially linear models where the contribution of Z to Y can be arbitrary. Numerical experiments demonstrate the effectiveness of the methodology.
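The low-dimensional claim in the abstract can be checked with a small simulation. The sketch below is not the authors' code or their high-dimensional methodology; it only illustrates the ordinary t-test setting under assumptions chosen here for illustration: Gaussian covariates Z, a linear model for X on Z with independent Gaussian noise, and a deliberately nonlinear model for Y on Z in which X plays no role. The function names `ols_t_pvalue` and `rejection_rate`, the coefficient vector `gamma`, and the particular nonlinear form of Y are all hypothetical choices.

```python
import numpy as np
from scipy import stats


def ols_t_pvalue(y, x, Z):
    """Two-sided p-value of the OLS t-test for the coefficient of x
    in the (possibly misspecified) linear regression y ~ 1 + x + Z."""
    n = len(y)
    D = np.column_stack([np.ones(n), x, Z])   # intercept, X, then Z
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    resid = y - D @ beta
    df = n - D.shape[1]
    sigma2 = resid @ resid / df                # usual residual variance estimate
    cov = sigma2 * np.linalg.inv(D.T @ D)
    t_stat = beta[1] / np.sqrt(cov[1, 1])
    return 2 * stats.t.sf(abs(t_stat), df)


def rejection_rate(n=200, p=3, reps=1000, alpha=0.05, seed=0):
    """Fraction of simulated data sets in which the t-test rejects,
    when X is conditionally independent of Y given Z."""
    rng = np.random.default_rng(seed)
    gamma = np.linspace(1.0, -1.0, p)          # assumed coefficients of X on Z
    rejections = 0
    for _ in range(reps):
        Z = rng.normal(size=(n, p))
        # Linear model for X on Z: this is the specification that holds.
        x = Z @ gamma + rng.normal(size=n)
        # Y depends on Z nonlinearly and not on X, so the linear model
        # for Y on (X, Z) is wrong but the null hypothesis is true.
        y = np.sin(Z[:, 0]) + Z[:, -1] ** 2 + rng.normal(size=n)
        rejections += int(ols_t_pvalue(y, x, Z) < alpha)
    return rejections / reps


if __name__ == "__main__":
    print("empirical type I error:", rejection_rate())
```

If the property described in the abstract holds in this setup, the printed rejection rate should be close to the nominal level `alpha`, even though the fitted model for Y is misspecified. Replacing the line that generates `x` with a nonlinear function of Z would be one way to explore what happens when neither specification holds.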












