A unified statistical framework for single cell and bulk RNA sequencing data
From MaRDI portal
Abstract: Recent advances in technology have enabled the measurement of RNA levels in individual cells. Compared to traditional tissue-level bulk RNA-seq data, single cell sequencing yields valuable insights into gene expression profiles for different cell types, which is potentially critical for understanding many complex human diseases. However, developing quantitative tools for such data remains challenging because of high levels of technical noise, especially "dropout" events. A dropout happens when the RNA for a gene fails to be amplified prior to sequencing, producing a "false" zero in the observed data. In this paper, we propose a Unified RNA-Sequencing Model (URSM) for both single cell and bulk RNA-seq data, formulated as a hierarchical model. URSM borrows strength from both data sources and carefully models the dropouts in single cell data, leading to more accurate estimation of cell-type-specific gene expression profiles. In addition, URSM naturally provides inference on the dropout entries in single cell data that need to be imputed for downstream analyses, as well as on the mixing proportions of different cell types in bulk samples. We adopt an empirical Bayes approach, where parameters are estimated using the EM algorithm and approximate inference is obtained by Gibbs sampling. Simulation results illustrate that URSM outperforms existing approaches both in correcting for dropouts in single cell data and in deconvolving bulk samples. We also demonstrate an application to gene expression data on fetal brains, where our model successfully imputes the dropout genes and reveals cell-type-specific expression patterns.
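The two data sources described in the abstract can be illustrated with a toy sketch. This is not URSM's hierarchical model, EM procedure, or Gibbs sampler: the function names, the fixed per-gene dropout probability, and the deterministic expected counts are all assumptions made for illustration. It only shows how a dropout produces a "false" zero in single cell data and how a bulk sample can be viewed as a proportion-weighted mixture of cell-type profiles.

```python
import random


def simulate_single_cell(profile, depth, dropout_prob, rng):
    """Toy single cell observation (hypothetical helper, not from the paper).

    profile      -- per-gene expression proportions for the cell's type
    depth        -- assumed sequencing depth used to scale proportions to counts
    dropout_prob -- assumed constant probability that a gene drops out
    rng          -- random.Random instance, passed in for reproducibility
    """
    observed, dropped = [], []
    for p in profile:
        true_count = round(depth * p)           # expected count, no sampling noise
        is_dropout = rng.random() < dropout_prob
        observed.append(0 if is_dropout else true_count)  # dropout -> "false" zero
        dropped.append(is_dropout)
    return observed, dropped


def bulk_expression(profiles, proportions):
    """Toy bulk profile: mixture of cell-type profiles weighted by mixing proportions."""
    n_genes = len(profiles[0])
    return [
        sum(w * prof[g] for w, prof in zip(proportions, profiles))
        for g in range(n_genes)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    profiles = [[0.5, 0.3, 0.2], [0.1, 0.1, 0.8]]   # two made-up cell types
    cell, mask = simulate_single_cell(profiles[0], depth=100, dropout_prob=0.3, rng=rng)
    print("observed single cell counts:", cell)
    print("dropout mask:", mask)
    print("bulk mixture:", bulk_expression(profiles, [0.6, 0.4]))
```

In this sketch an observed zero is ambiguous by itself: it may be a dropout or a true zero count. That ambiguity is what motivates modeling dropouts explicitly and borrowing information from bulk data, as the abstract describes.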
Recommendations
- Statistical modeling of RNA-Seq data
- A hierarchical Bayesian model for single-cell clustering using RNA-sequencing data
- A statistical framework for the analysis of ChIP-seq data
- Gene expression distribution deconvolution in single-cell RNA sequencing
- Detection of differentially expressed genes in discrete single‐cell RNA sequencing data using a hurdle model with correlated random effects
- A survey on normalization of single-cell RNA sequencing
- A compositional model to assess expression changes from single-cell RNA-seq data
- Clustering methods for single-cell RNA-sequencing expression data: performance evaluation with varying sample sizes and cell compositions
- Kinetic Foundation of the Zero-Inflated Negative Binomial Model for Single-Cell RNA Sequencing Data
- Data denoising and post-denoising corrections in single cell RNA sequencing
Cites work
- scientific article; zbMATH DE number 3567782
- 10.1162/jmlr.2003.3.4-5.993
- A unified statistical framework for single cell and bulk RNA sequencing data
- An introduction to variational methods for graphical models
- Bayesian Inference for Logistic Models Using Pólya–Gamma Latent Variables
- Gene expression distribution deconvolution in single-cell RNA sequencing
- Graphical models, exponential families, and variational inference
- Online but accurate inference for latent variable models with local Gibbs sampling
- Sampling-Based Approaches to Calculating Marginal Densities
- Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images
Cited in (16)
- Supervised Adversarial Alignment of Single-Cell RNA-seq Data
- Kinetic Foundation of the Zero-Inflated Negative Binomial Model for Single-Cell RNA Sequencing Data
- Detection of differentially expressed genes in discrete single‐cell RNA sequencing data using a hurdle model with correlated random effects
- A kernel non-negative matrix factorization framework for single cell clustering
- Exponential-Family Embedding With Application to Cell Developmental Trajectories for Single-Cell RNA-Seq Data
- LLE based K-nearest neighbor smoothing for scRNA-seq data imputation
- A unified statistical framework for single cell and bulk RNA sequencing data
- MSIQ: joint modeling of multiple RNA-seq samples for accurate isoform quantification
- Benchmarking penalized regression methods in machine learning for single cell RNA sequencing data
- Data denoising and post-denoising corrections in single cell RNA sequencing
- A hierarchical Bayesian model for single-cell clustering using RNA-sequencing data
- Unsupervised integration of single-cell multi-omics datasets with disproportionate cell-type representation
- A zero-inflated non-negative matrix factorization for the deconvolution of mixed signals of biological data
- Gene expression distribution deconvolution in single-cell RNA sequencing
- Nonparametric Bayesian multiarmed bandits for single-cell experiment design
- Model-based approach to the joint analysis of single-cell data on chromatin accessibility and gene expression