Clustering Data with Nonignorable Missingness using Semi-Parametric Mixture Models
From MaRDI portal
Abstract: We are concerned with clustering continuous data sets subject to non-ignorable missingness. We perform clustering with a specific semi-parametric mixture, under the assumption of conditional independence given the component. The mixture model is used for clustering and not for estimating the density of the full variables (observed and unobserved); thus we need neither further assumptions on the component distributions nor a specification of the missingness mechanism. Estimation is performed by maximizing an extension of the smoothed likelihood that allows for missingness. This optimization is achieved by a Majorization-Minorization algorithm. We illustrate the relevance of our approach by numerical experiments. Under mild assumptions, we show the identifiability of the model defining the distribution of the observed data and the monotonicity of the algorithm. We also propose an extension of this new method to the case of mixed-type data, which we illustrate on a real data set.
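The abstract does not spell out the estimation procedure, so the following is only a rough sketch of the general idea and not the authors' Majorization-Minorization algorithm. It swaps in a simpler EM-like heuristic with weighted kernel density estimates, assuming (as the abstract states) conditional independence given the component. The assumptions that go beyond the abstract are: each variable's missingness indicator is modelled as a component-specific Bernoulli, missing entries are coded as NaN, every column has at least one observed value, and a fixed Gaussian kernel bandwidth is used. The function name `fit_mixture_with_missing` and all of its parameters are hypothetical.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def fit_mixture_with_missing(x, n_components, bandwidth=0.5, n_iter=50, seed=0):
    """EM-like clustering sketch for continuous data with NaN-coded missing values.

    Hypothetical illustration only: per component k and variable j it estimates
    a weighted kernel density on the observed entries and a probability tau
    that variable j is observed in component k (an assumed Bernoulli model).
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    observed = ~np.isnan(x)
    # Random soft initialisation of the posterior membership weights, shape (n, K).
    weights = rng.dirichlet(np.ones(n_components), size=n)
    for _ in range(n_iter):
        proportions = weights.mean(axis=0)
        log_post = np.tile(np.log(proportions), (n, 1))
        for j in range(d):
            obs_j = observed[:, j]
            xj = x[obs_j, j]
            # Kernel matrix between all observed entries of variable j.
            kern = gaussian_kernel((xj[:, None] - xj[None, :]) / bandwidth) / bandwidth
            for k in range(n_components):
                w = weights[:, k]
                # Estimated probability that variable j is observed in component k.
                tau = np.clip(w[obs_j].sum() / w.sum(), 1e-6, 1.0 - 1e-6)
                # Weighted kernel density estimate at the observed points.
                dens = kern @ w[obs_j] / w[obs_j].sum()
                log_post[obs_j, k] += np.log(tau) + np.log(dens + 1e-300)
                log_post[~obs_j, k] += np.log1p(-tau)
        # Normalise in log space, then floor the weights to avoid empty components.
        log_post -= log_post.max(axis=1, keepdims=True)
        weights = np.clip(np.exp(log_post), 1e-12, None)
        weights /= weights.sum(axis=1, keepdims=True)
    return weights.argmax(axis=1), weights
```

Because the missingness indicators enter the posterior weights through `tau`, the pattern of missing values itself can inform the partition, which is the intuition behind handling non-ignorable missingness at the clustering stage. The random soft initialisation is a simplification; in practice several restarts or a more informative starting partition would likely be needed.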
Recommendations
- Mixture model clustering for mixed data with missing information
- A semiparametric method for clustering mixed data
- Model-based classification of clustered binary data with non-ignorable missing values
- Robust model-based clustering via mixtures of skew-t distributions with missing information
- Semiparametric mixtures of regressions with single-index for model based clustering
- Incomplete clustering analysis via multiple imputation
- Model-based clustering via mixtures of unrestricted skew normal factor analyzers with complete and incomplete data
- Robust clustering via mixtures of t factor analyzers with incomplete data
Cites work
- scientific article; zbMATH DE number 3942813
- scientific article; zbMATH DE number 1294360
- scientific article; zbMATH DE number 1834445
- scientific article; zbMATH DE number 2124691
- k-POD: A Method for k-Means Clustering of Missing Data
- A family of block-wise one-factor distributions for modeling high-dimensional binary data
- Approximating discrete probability distributions with dependence trees
- Bandwidth selection in an EM-like algorithm for nonparametric multivariate mixtures
- Binary Probability Maps Using a Hidden Conditional Autoregressive Gaussian Process with an Application to Finnish Common Toad Data
- Clustering multiply imputed multivariate high-dimensional longitudinal profiles
- Clustering via finite nonparametric ICA mixture models
- Clustering with missing data: which equivalent for Rubin's rules?
- Estimating multivariate latent-structure models
- Estimating the dimension of a model
- Estimation of the number of components of nonparametric multivariate finite mixture models
- Every Missingness not at Random Model Has a Missingness at Random Counterpart with Equal Fit
- Exact and Monte Carlo calculations of integrated likelihoods for the latent class model
- Finite mixture models
- Flexible imputation of missing data
- Handbook of missing data methodology
- Handbook of mixture analysis
- Identifiability of parameters in latent structure models with many observed variables
- Learning with mixtures of trees.
- Maximum smoothed likelihood for multivariate mixtures
- Non-Parametric Identification and Estimation of the Number of Components in Multivariate Mixtures
- Nonparametric Estimation of Multivariate Mixtures
- Nonparametric and semiparametric models.
- Nonparametric estimation of component distributions in a multivariate mixture
- Nonparametric mixture models with conditionally independent multivariate component densities
- Not so naive Bayes: Aggregating one-dependence estimators
- Pair copula constructions for multivariate discrete data
- Pattern-Mixture Models for Multivariate Incomplete Data
- Semi-parametric estimation for conditional independence multivariate finite mixture models
- Semiparametric theory and missing data.
- Theoretical grounding for estimation in conditional independence multivariate finite mixture models
- When is the naive Bayes approximation not so naive?
Cited in
(5)
This page was built for publication: Clustering Data with Nonignorable Missingness using Semi-Parametric Mixture Models
MaRDI item: Q118021