Imputation and low-rank estimation with missing not at random data
From MaRDI portal
Abstract: Missing values challenge data analysis because many supervised and unsupervised learning methods cannot be applied directly to incomplete data. Matrix completion methods based on low-rank assumptions are a very powerful solution for dealing with missing values. However, existing methods do not consider the case of informative missing values, which are widely encountered in practice. This paper proposes matrix completion methods to recover Missing Not At Random (MNAR) data. Our first contribution is to suggest a model-based estimation strategy by modelling the missing mechanism distribution. An EM algorithm is then implemented, involving a Fast Iterative Soft-Thresholding Algorithm (FISTA). Our second contribution is to suggest a computationally efficient surrogate estimation by implicitly taking into account the joint distribution of the data and the missing mechanism: the data matrix is concatenated with the mask coding for the missing values; a low-rank structure for an exponential family is assumed on this new matrix, in order to encode links between variables and missing mechanisms. The methodology, which has the great advantage of handling different missing value mechanisms, is robust to model specification errors. The performance of our methods is assessed on real data collected from a trauma registry (TraumaBase) containing clinical information about over twenty thousand severely traumatized patients in France. The aim is then to predict whether doctors should administer tranexamic acid, which would limit excessive bleeding, to patients with traumatic brain injury.
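The surrogate strategy described in the abstract (concatenate the data matrix with the missingness mask, then fit a low-rank structure to the combined matrix) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a squared loss on both blocks and uses iterative singular value soft-thresholding, whereas the abstract assumes an exponential-family low-rank model, under which the binary mask block would normally receive a Bernoulli-type loss. The function names `svd_soft_threshold` and `impute_with_mask`, and the parameters `lam`, `n_iter` and `tol`, are hypothetical and chosen for illustration only.

```python
import numpy as np


def svd_soft_threshold(A, lam):
    """Shrink the singular values of A by lam (proximal step for the nuclear norm)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s = np.maximum(s - lam, 0.0)
    return (U * s) @ Vt


def impute_with_mask(Y, lam=1.0, n_iter=200, tol=1e-6):
    """Impute NaNs in Y using a low-rank fit to the concatenation [Y | mask].

    Assumption (not from the source): squared loss on both the data block
    and the mask block, solved by iterative soft-thresholded SVD.
    """
    observed = ~np.isnan(Y)
    mask = (~observed).astype(float)  # 1.0 where the entry is missing

    # Concatenate data and mask column-wise; the mask block is fully known.
    Z = np.hstack([np.where(observed, Y, 0.0), mask])
    known = np.hstack([observed, np.ones_like(observed)])

    X = Z.copy()
    for _ in range(n_iter):
        # Keep known entries, fill unknown ones with the current estimate.
        filled = np.where(known, Z, X)
        X_new = svd_soft_threshold(filled, lam)
        converged = np.linalg.norm(X_new - X) <= tol * max(1.0, np.linalg.norm(X))
        X = X_new
        if converged:
            break

    # Return the data block only, with observed values left untouched.
    p = Y.shape[1]
    return np.where(observed, Y, X[:, :p])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Y_full = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 10))
    Y_mnar = Y_full.copy()
    Y_mnar[Y_full > 1.5] = np.nan  # self-masked MNAR: large values go missing
    Y_hat = impute_with_mask(Y_mnar, lam=0.5)
    print("imputed entries:", int(np.isnan(Y_mnar).sum()))
```

Because the mask columns are always observed, any association between a variable's values and its own missingness can, in principle, be picked up by the shared low-rank factors. How well this works under a squared loss on the mask is an assumption of the sketch, not a result stated in the source.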
Recommendations
- Majorized proximal alternating imputation for regularized rank constrained matrix completion
- Transposable regularized covariance models with an application to missing data imputation
- Tree-based algorithms for missing data imputation
- Matrix Completion, Counterfactuals, and Factor Analysis of Missing Data
- Graphical models for processing missing data
Cites work
- scientific article; zbMATH DE number 3567782
- scientific article; zbMATH DE number 2140075
- A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
- A Max-Norm Constrained Minimization Approach to 1-Bit Matrix Completion
- A Singular Value Thresholding Algorithm for Matrix Completion
- A principal component method to impute missing values for mixed data
- Exact matrix completion via convex optimization
- Generalized low rank models
- Identification and inference with nonignorable missing covariate data
- Inference and missing data
- Literature survey on low rank approximation of matrices
- Main effects and interactions in mixed and incomplete data frames
- Matrix completion and low-rank SVD via fast alternating least squares
- Missing Covariates in Generalized Linear Models When the Missing Data Mechanism is Non-ignorable
- Multiple imputation: a review of practical and theoretical findings
- Optimal Shrinkage of Singular Values
- Partial and latent ignorability in missing-data problems
- Pattern-Mixture Models for Multivariate Incomplete Data
- Random forest missing data algorithms
- Regularised PCA to denoise and visualise data
- Selecting the number of components in principal component analysis using cross-validation approximations
- Semiparametric maximum likelihood estimation with data missing not at random
- Shadow Prices, Market Wages, and Labor Supply
- Spectral regularization algorithms for learning large incomplete matrices
- Unbiased Risk Estimates for Singular Value Thresholding and Spectral Estimators
- What is meant by "missing at random"?
- \(e\)PCA: high dimensional exponential family PCA
Cited in (10)
- Chunk-wise regularised PCA-based imputation of missing data
- Nonparametric empirical Bayes biomarker imputation and estimation
- Adjacency-based regularization for partially ranked data with non-ignorable missing
- MIDIA: exploring denoising autoencoders for missing data imputation
- An adaptation for iterative structured matrix completion
- Model-based clustering with missing not at random data
- scientific article; zbMATH DE number 7829048
- Matrix Completion, Counterfactuals, and Factor Analysis of Missing Data
- An Imputation–Regularized Optimization Algorithm for High Dimensional Missing Data Problems and Beyond
- Unfolding incomplete data: guidelines for unfolding row-conditional rank order data with random missings
This page was built for publication: Imputation and low-rank estimation with missing not at random data (MaRDI item Q2209726)