Diagnosing missing always at random in multivariate data

From MaRDI portal
Publication:5222234

DOI: 10.1093/BIOMET/ASZ061; zbMATH Open: 1435.62187; arXiv: 1710.06891; OpenAlex: W2766807015; MaRDI QID: Q5222234; FDO: Q5222234


Authors: Iavor Bojinov, Natesh S. Pillai, Donald B. Rubin


Publication date: 1 April 2020

Published in: Biometrika

Abstract: Models for analyzing multivariate data sets with missing values require strong, often unassessable, assumptions. The most common of these is that the mechanism that created the missing data is ignorable, a twofold assumption dependent on the mode of inference. The first part, which is the focus here, under the Bayesian and direct-likelihood paradigms, requires that the missing data are missing at random; in contrast, the frequentist-likelihood paradigm demands that the missing data mechanism always produces missing at random data, a condition known as missing always at random. Under certain regularity conditions, assuming missing always at random leads to an assumption that can be tested using the observed data alone: namely, that the missing data indicators depend only on fully observed variables. Here, we propose three different diagnostic tests that not only indicate when this assumption is incorrect but also suggest which variables are the most likely culprits. Although missing always at random is not a necessary condition to ensure validity under the Bayesian and direct-likelihood paradigms, it is sufficient, and evidence for its violation should encourage the careful statistician to conduct targeted sensitivity analyses.
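
Illustrative sketch (not taken from the paper): the testable condition in the abstract says that missing data indicators depend only on fully observed variables. One consequence is that, within levels of a fully observed discrete variable, the observed values of one partially observed variable should look the same whether or not another variable is missing. The Python sketch below checks this with a stratified permutation test. It is not one of the three diagnostic tests the authors propose; the function name `stratified_permutation_pvalue`, its parameters, and the choice of test statistic are all assumptions made for illustration, and it assumes independent rows and a single discrete fully observed covariate.

```python
import numpy as np

def stratified_permutation_pvalue(y, r_other, stratum, n_perm=2000, seed=0):
    """Hypothetical diagnostic: permutation p-value for association between
    the observed values of y and the missingness indicator r_other of a
    different variable, within levels of a fully observed variable."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    r_other = np.asarray(r_other, dtype=int)
    stratum = np.asarray(stratum)

    # Only rows where y itself is observed can be compared.
    keep = ~np.isnan(y)
    y, r_other, stratum = y[keep], r_other[keep], stratum[keep]
    groups = [np.flatnonzero(stratum == s) for s in np.unique(stratum)]

    def statistic(r):
        # Size-weighted sum of absolute mean differences across strata.
        total = 0.0
        for idx in groups:
            r_g, y_g = r[idx], y[idx]
            if 0 < r_g.sum() < len(idx):  # both indicator values present
                diff = y_g[r_g == 1].mean() - y_g[r_g == 0].mean()
                total += abs(diff) * len(idx)
        return total

    observed = statistic(r_other)
    exceed = 0
    for _ in range(n_perm):
        permuted = r_other.copy()
        for idx in groups:
            # Shuffle the indicator within each stratum only.
            permuted[idx] = rng.permutation(r_other[idx])
        exceed += int(statistic(permuted) >= observed)
    return (exceed + 1) / (n_perm + 1)
```

A small p-value suggests that the indicator `r_other` is associated with `y` beyond what the fully observed variable explains, which would be evidence against the testable condition and might flag `y` as a variable worth a targeted sensitivity analysis. A large p-value does not establish that the data are missing always at random.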


Full work available at URL: https://arxiv.org/abs/1710.06891









Cited In (6)




