Model-based clustering for conditionally correlated categorical data
Publication:269537
DOI: 10.1007/S00357-015-9180-4; zbMATH Open: 1335.62103; arXiv: 1401.5684; OpenAlex: W158520387; MaRDI QID: Q269537; FDO: Q269537
Authors: Matthieu Marbac, Christophe Biernacki, Vincent Vandewalle
Publication date: 19 April 2016
Published in: Journal of Classification
Abstract: An extension of the latent class model is presented for clustering categorical data by relaxing the classical "class conditional independence assumption" of variables. This model consists in grouping the variables into inter-independent and intra-dependent blocks, in order to consider the main intra-class correlations. The dependency between variables grouped inside the same block of a class is taken into account by mixing two extreme distributions, which are respectively the independence and the maximum dependency. When the variables are dependent given the class, this approach is expected to reduce the biases of the latent class model. Indeed, it produces a meaningful dependency model with only a few additional parameters. The parameters are estimated, by maximum likelihood, by means of an EM algorithm. Moreover, a Gibbs sampler is used for model selection in order to overcome the computational intractability of the combinatorial problems involved by the block structure search. Two applications on medical and biological data sets show the relevance of this new model. The results strengthen the view that this model is meaningful and that it reduces the biases induced by the conditional independence assumption of the latent class model.
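As a rough illustration of the block distribution described in the abstract, the following Python sketch mixes an independence distribution with a maximum-dependency distribution for a single block of two binary variables within one class. The names `block_pmf`, `rho`, `alpha`, and `tau`, and all numeric values, are hypothetical and chosen for illustration only; the maximum-dependency part is assumed here to put mass only on cells where one variable determines the other, and the paper's actual parameterization may differ.

```python
import itertools


def block_pmf(x, rho, alpha, tau):
    """Probability of the value tuple x for one block within one class.

    rho   : mixing weight in [0, 1] given to the maximum-dependency part
    alpha : per-variable marginal probabilities (independence part)
    tau   : dict mapping full tuples to probabilities (maximum-dependency
            part); tuples that are not listed have probability 0
    """
    indep = 1.0
    for j, xj in enumerate(x):
        indep *= alpha[j][xj]
    return (1.0 - rho) * indep + rho * tau.get(tuple(x), 0.0)


# Illustrative values only (assumptions, not taken from the paper).
alpha = [[0.6, 0.4], [0.3, 0.7]]   # marginals of the two binary variables
tau = {(0, 0): 0.5, (1, 1): 0.5}   # mass only where the variables agree
rho = 0.25                         # weight of the maximum-dependency part

cells = itertools.product(range(2), repeat=2)
probs = {x: block_pmf(x, rho, alpha, tau) for x in cells}
total = sum(probs.values())        # a valid distribution sums to 1
```

Setting `rho = 0` in this sketch recovers plain conditional independence within the block, while `rho = 1` leaves only the maximum-dependency part, so a single extra weight per block moves between the two extremes.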
Full work available at URL: https://arxiv.org/abs/1401.5684
Recommendations
- Latent class model with conditional dependency per modes to cluster categorical data
- Hierarchical latent class models for cluster analysis
- scientific article; zbMATH DE number 2034563
- Mixture of latent trait analyzers for model-based clustering of categorical data
- Using conditional independence for parsimonious model-based Gaussian clustering
Keywords: clustering; correlation; model selection; expectation-maximization algorithm; Gibbs sampler; mixture model; categorical data
Cites Work
- Mixture model clustering using the MULTIMIX program
- Estimating the dimension of a model
- Title not available
- Finite mixture models
- Variable selection in model-based clustering: a general variable role modeling
- Block clustering with Bernoulli mixture models: comparison of different approaches
- Title not available
- Model-Based Gaussian and Non-Gaussian Clustering
- Title not available
- Title not available
- Title not available
- Approximating discrete probability distributions with dependence trees
- Mixture of latent trait analyzers for model-based clustering of categorical data
- Identifiability of parameters in latent structure models with many observed variables
- Learning with mixtures of trees.
- Title not available
- Clustering criteria for discrete data and latent class models
- Exploratory latent structure analysis using both identifiable and unidentifiable models
- Model-based clustering for conditionally correlated categorical data
- Bayesian network classifiers
- Market segmentation using brand strategy research: Bayesian inference with respect to mixtures of log-linear models
- Classification, clustering, and data analysis. Recent advances and applications. Papers presented at the eighth conference of the International Federation of Classification Societies (IFCS), Cracow, Poland, July 16–19, 2002
- An introduction to the Bayes information criterion: theoretical foundations and interpretation
- Local dependence latent structure models
- Title not available
- Random Effects Models in Latent Class Analysis for Evaluating Accuracy of Diagnostic Tests
- Using Latent Class Models to Characterize and Assess Relative Error in Discrete Measurements
Cited In (17)
- Exploring dependence between categorical variables: benefits and limitations of using variable selection within Bayesian clustering in relation to log-linear modelling with interaction terms
- Model-based clustering for conditionally correlated categorical data
- Hill climbing method using Claus model for categorical data
- Title not available
- Hierarchical latent class models for cluster analysis
- Simplex factor models for multivariate unordered categorical data
- Latent class model with conditional dependency per modes to cluster categorical data
- Variable selection methods for model-based clustering
- Density-based clustering with non-continuous data
- Model-based clustering for spatiotemporal data on air quality monitoring
- The clustering of categorical data: a comparison of a model-based and a distance-based approach
- A model-based approach to simultaneous clustering and dimensional reduction of ordinal data
- A latent class distance association model for cross-classified data with a categorical response variable
- UC-LTM: unidimensional clustering using latent tree models for discrete data
- A semiparametric and location-shift copula-based mixture model
- A family of block-wise one-factor distributions for modeling high-dimensional binary data
- Mixture of latent trait analyzers for model-based clustering of categorical data