Classification and clustering of sequencing data using a Poisson model
From MaRDI portal
Abstract: In recent years, advances in high-throughput sequencing technology have led to a need for specialized methods for the analysis of digital gene expression data. While gene expression data measured on a microarray take on continuous values and can be modeled using the normal distribution, RNA sequencing data involve nonnegative counts and are more appropriately modeled using a discrete count distribution, such as the Poisson or the negative binomial. Consequently, analytic tools that assume a Gaussian distribution (such as classification methods based on linear discriminant analysis and clustering methods that use Euclidean distance) may not perform as well for sequencing data as methods that are based upon a more appropriate distribution. Here, we propose new approaches for performing classification and clustering of observations on the basis of sequencing data. Using a Poisson log-linear model, we develop an analog of diagonal linear discriminant analysis that is appropriate for sequencing data. We also propose an approach for clustering sequencing data using a new dissimilarity measure that is based upon the Poisson model. We demonstrate the performance of these approaches in a simulation study, on three publicly available RNA sequencing data sets, and on a publicly available chromatin immunoprecipitation sequencing data set.
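The abstract describes the two ideas only in words, so the following is a minimal Python sketch of what a Poisson-based classifier and dissimilarity could look like, not the authors' implementation. It assumes a log-linear parameterization in which the expected count for observation i and feature j is a per-observation size factor times a per-feature total, scaled by a class-specific ratio. The function names, the smoothing constant `beta`, and the exact form of the dissimilarity (a log-likelihood-ratio style statistic computed on a pair of observations) are all illustrative assumptions.

```python
import numpy as np


def fit_poisson_classifier(X, y, beta=1.0):
    """Fit class-specific ratios d[k, j] under an assumed Poisson log-linear model.

    X is an (n, p) array of nonnegative counts, y holds n class labels.
    beta > 0 is a hypothetical smoothing constant that keeps every ratio positive.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    total = X.sum()
    size = X.sum(axis=1) / total      # per-observation size factors (sum to 1)
    gene = X.sum(axis=0)              # per-feature totals
    d = np.empty((len(classes), X.shape[1]))
    for idx, k in enumerate(classes):
        in_k = y == k
        expected = size[in_k].sum() * gene   # expected class counts if d were 1
        d[idx] = (X[in_k].sum(axis=0) + beta) / (expected + beta)
    prior = np.array([(y == k).mean() for k in classes])
    return {"classes": classes, "gene": gene, "d": d, "prior": prior, "total": total}


def predict_poisson_classifier(model, X_new):
    """Assign each row of X_new to the class with the largest Poisson log-score."""
    X_new = np.atleast_2d(np.asarray(X_new, dtype=float))
    size_new = X_new.sum(axis=1) / model["total"]
    log_d = np.log(model["d"])               # shape (K, p)
    fitted = model["d"] @ model["gene"]      # shape (K,), sum_j gene_j * d[k, j]
    # Terms that do not depend on the class are dropped from the score.
    scores = X_new @ log_d.T - np.outer(size_new, fitted) + np.log(model["prior"])
    return model["classes"][scores.argmax(axis=1)]


def poisson_dissimilarity(x1, x2, beta=1.0):
    """Log-likelihood-ratio style dissimilarity between two count vectors.

    The two vectors are treated as a two-row data set; the statistic compares
    a model with row-specific ratios against the model where all ratios are 1.
    """
    pair = np.vstack([x1, x2]).astype(float)
    total = pair.sum()
    if total == 0:
        return 0.0
    expected = np.outer(pair.sum(axis=1) / total, pair.sum(axis=0))
    d = (pair + beta) / (expected + beta)
    return float(np.sum(pair * np.log(d) - expected * d + expected))
```

Because the score is linear in the counts once the ratios are fixed, the classifier plays the role that diagonal linear discriminant analysis plays for Gaussian data. The dissimilarity can be evaluated for every pair of observations and the resulting matrix passed to any standard hierarchical clustering routine.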
Recommendations
- A two-stage Poisson model for testing RNA-Seq data
- A Poisson model of sequence comparison and its application to coronavirus phylogeny
- Nonparametric Bayesian bi-clustering for next generation sequencing count data
- Bayesian clustering of DNA sequences using Markov chains and a stochastic partition model
- A sequential clustering algorithm with applications to gene expression data
- Classification of molecular sequence data using Bayesian phylogenetic mixture models
- scientific article; zbMATH DE number 1805764
- PUseqClust: a clustering analysis method for RNA-seq data
Cites work
- scientific article; zbMATH DE number 1817585
- Class prediction by nearest shrunken centroids, with applications to DNA microarrays.
- Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data
- Convergence behaviour of Dirichlet-Neumann and Robin methods for a nonlinear transmission problem
- Penalized classification using Fisher's linear discriminant
- Some theory for Fisher's linear discriminant function, 'naive Bayes', and some alternatives when there are many more variables than observations
- Sparse logistic principal components analysis for binary data
- THE TRANSFORMATION OF POISSON, BINOMIAL AND NEGATIVE-BINOMIAL DATA
Cited in (17)
- Variational discriminant analysis with variable selection
- Fast model-based clustering of partial records
- Exact Bayesian designs for count time series
- Two‐group Poisson‐Dirichlet mixtures for multiple testing
- Multiple suboptimal solutions for prediction rules in gene expression data
- A sparse negative binomial classifier with covariate adjustment for RNA-seq data
- PoiClaClu
- Classification of RNA-Seq data via Gaussian copulas
- Discovering political topics in facebook discussion threads with graph contextualization
- Bayesian sparse multivariate regression with asymmetric nonlocal priors for microbiome data analysis
- Exponential-Family Embedding With Application to Cell Developmental Trajectories for Single-Cell RNA-Seq Data
- Dealing with overdispersion in multivariate count data
- Simultaneous estimation of cluster number and feature sparsity in high‐dimensional cluster analysis
- Category encoding method to select feature genes for the classification of bulk and single-cell RNA-seq data
- Variational nonparametric discriminant analysis
- Clustering for multivariate continuous and discrete longitudinal data
- PUseqClust: a clustering analysis method for RNA-seq data