Model-based clustering for conditionally correlated categorical data

Publication:269537

DOI: 10.1007/S00357-015-9180-4
zbMATH Open: 1335.62103
arXiv: 1401.5684
OpenAlex: W158520387
MaRDI QID: Q269537
FDO: Q269537


Authors: Matthieu Marbac, Christophe Biernacki, Vincent Vandewalle


Publication date: 19 April 2016

Published in: Journal of Classification

Abstract: An extension of the latent class model is presented for clustering categorical data by relaxing the classical "class conditional independence assumption" of variables. The model groups the variables into inter-independent and intra-dependent blocks in order to capture the main intra-class correlations. The dependency between variables grouped inside the same block of a class is taken into account by mixing two extreme distributions: the independence distribution and the maximum dependency distribution. When the variables are dependent given the class, this approach is expected to reduce the biases of the latent class model. Indeed, it produces a meaningful dependency model with only a few additional parameters. The parameters are estimated by maximum likelihood by means of an EM algorithm. Moreover, a Gibbs sampler is used for model selection in order to overcome the computational intractability of the combinatorial problems involved in the block structure search. Two applications on medical and biological data sets show the relevance of this new model. The results strengthen the view that this model is meaningful and that it reduces the biases induced by the conditional independence assumption of the latent class model.


Full work available at URL: https://arxiv.org/abs/1401.5684








