Gamma-based clustering via ordered means with application to gene-expression analysis

DOI 10.1214/10-AOS805; MaRDI QID Q620546

Authors Michael A. Newton, Lisa M. Chung

Publication date 19 January 2011

Published in The Annals of Statistics

Full work available at URL https://arxiv.org/abs/0907.3837

Keywords: mixture model, next generation sequencing, Poisson embedding, gamma ranking, rank probability

Classification and discrimination; cluster analysis (statistical aspects) (62H30)
Applications of statistics to biology and medical sciences; meta analysis (62P10)
Applications of mathematical programming (90C90)
Genetics and epigenetics (92D10)
Inequalities; stochastic orderings (60E15)
Dynamic programming (90C39)

Abstract: Discrete mixture models provide a well-known basis for effective clustering algorithms, although technical challenges have limited their scope. In the context of gene-expression data analysis, a model is presented that mixes over a finite catalog of structures, each one representing equality and inequality constraints among latent expected values. Computations depend on the probability that independent gamma-distributed variables attain each of their possible orderings. Each ordering event is equivalent to an event in independent negative-binomial random variables, and this finding guides a dynamic-programming calculation. The structuring of mixture-model components according to constraints among latent means leads to strict concavity of the mixture log likelihood. In addition to its beneficial numerical properties, the clustering method shows promising results in an empirical study.
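The link between gamma orderings and negative-binomial events mentioned in the abstract can be illustrated in the simplest case of two variables. The sketch below is not the paper's dynamic-programming algorithm; it assumes positive integer shape parameters and uses the standard Poisson-process argument: if X is the time of the a-th event in a process of rate rate_x and Y the time of the b-th event in an independent process of rate rate_y, then X < Y exactly when fewer than b events of the second process precede the a-th event of the first, and that count is negative binomial. The function names and parameters are hypothetical, chosen only for this illustration.

```python
import math
import random


def prob_first_smaller(a, b, rate_x, rate_y):
    """P(X < Y) for independent X ~ Gamma(shape=a, rate=rate_x) and
    Y ~ Gamma(shape=b, rate=rate_y), assuming positive integer shapes.

    Equals P(N <= b - 1), where N ~ NegBin(a, p) counts the events of the
    second process seen before the a-th event of the first process.
    """
    p = rate_x / (rate_x + rate_y)  # chance a merged event is from X's process
    return sum(
        math.comb(a + k - 1, k) * p**a * (1 - p) ** k
        for k in range(b)
    )


def monte_carlo_check(a, b, rate_x, rate_y, n=50_000, seed=0):
    """Simulation estimate of P(X < Y), for comparison with the exact sum."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    hits = sum(
        rng.gammavariate(a, 1 / rate_x) < rng.gammavariate(b, 1 / rate_y)
        for _ in range(n)
    )
    return hits / n
```

For example, prob_first_smaller(2, 3, 1.0, 1.0) evaluates the finite sum to 0.6875, and monte_carlo_check(2, 3, 1.0, 1.0) should agree to within simulation error. Orderings of more than two variables require the more elaborate calculation described in the paper.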
