Statistical modeling of RNA-Seq data
From MaRDI portal
Abstract: Recently, ultra high-throughput sequencing of RNA (RNA-Seq) has been developed as an approach for analysis of gene expression. By obtaining tens or even hundreds of millions of reads of transcribed sequences, an RNA-Seq experiment can offer a comprehensive survey of the population of genes (transcripts) in any sample of interest. This paper introduces a statistical model for estimating isoform abundance from RNA-Seq data. The model is flexible enough to accommodate both single-end and paired-end RNA-Seq data, as well as sampling bias along the length of the transcript. Based on the derivation of minimal sufficient statistics for the model, a computationally feasible implementation of the maximum likelihood estimator is provided. Further, it is shown that paired-end RNA-Seq provides more accurate isoform abundance estimates than single-end sequencing at fixed sequencing depth. Simulation studies are also given.
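The abstract does not spell out the form of the model, so the following is only a rough illustration of maximum likelihood estimation of isoform abundance, not the paper's implementation. It assumes a Poisson model in which the number of reads in category j has mean sum_i theta[i] * rates[i][j], where theta[i] is the abundance of isoform i and rates[i][j] is the rate at which isoform i produces reads of category j. The function name, the toy gene with three exons, and the use of a plain EM (multiplicative) update are all assumptions made for this sketch.

```python
from typing import List


def estimate_isoform_abundance(
    counts: List[float],
    rates: List[List[float]],
    iterations: int = 500,
) -> List[float]:
    """EM iterations for the assumed model counts[j] ~ Poisson(sum_i theta[i] * rates[i][j]).

    Hypothetical helper for illustration; each isoform must have a
    positive total sampling rate.
    """
    n_iso = len(rates)
    n_cat = len(counts)
    # Total sampling rate of each isoform over all read categories.
    row_sums = [sum(rates[i]) for i in range(n_iso)]
    # Start from a strictly positive, uniform guess.
    theta = [sum(counts) / sum(row_sums)] * n_iso
    for _ in range(iterations):
        # Expected count in each category under the current abundances.
        expected = [
            sum(theta[i] * rates[i][j] for i in range(n_iso))
            for j in range(n_cat)
        ]
        # Multiplicative EM update: rescale each abundance by how well
        # the categories it contributes to match the observed counts.
        theta = [
            theta[i]
            * sum(
                rates[i][j] * counts[j] / expected[j]
                for j in range(n_cat)
                if expected[j] > 0
            )
            / row_sums[i]
            for i in range(n_iso)
        ]
    return theta


# Toy gene (assumed): isoform A uses exons 1, 2, 3; isoform B skips exon 2.
rates = [
    [1.0, 1.0, 1.0],  # isoform A
    [1.0, 0.0, 1.0],  # isoform B
]
counts = [100.0, 30.0, 100.0]  # reads observed on exons 1, 2, 3
theta = estimate_isoform_abundance(counts, rates)
print([round(t, 3) for t in theta])
```

In this toy case only exon 2 distinguishes the two isoforms, so its 30 reads pin down isoform A and the remaining exon 1 and exon 3 reads are attributed to isoform B. Real data would need rates derived from transcript lengths, read or fragment length, and any positional sampling bias.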
Recommendations
- A penalized likelihood approach for robust estimation of isoform expression
- A two-stage Poisson model for testing RNA-Seq data
- Deconvolution of base pair level RNA-seq read counts for quantification of transcript expression levels
- Simultaneous inference of gene isoform expression for RNA sequencing data
- MSIQ: joint modeling of multiple RNA-seq samples for accurate isoform quantification
Cited in (22)
- Beta approximation of ratio distribution and its application to next generation sequencing read counts
- A rejection principle for sequential tests of multiple hypotheses controlling familywise error rates
- Lognormality and oscillations in the coverage of high-throughput transcriptomic data towards gene ends
- Pathway analysis for RNA-seq data using a score-based approach
- Deconvolution of base pair level RNA-seq read counts for quantification of transcript expression levels
- Exact transcript quantification over splice graphs
- A stable sequential multiple test for Koopman-Darmois family
- Bayesian analysis of RNA-Seq data using a family of negative binomial models
- Generate gene expression profile from high-throughput sequencing data
- Removing technical variability in RNA-seq data using conditional quantile normalization
- A statistical framework for eQTL mapping using RNA-seq data
- A unified statistical framework for single cell and bulk RNA sequencing data
- Statistical method for modeling sequencing data from different technologies in longitudinal studies with application to Huntington disease
- Transcript abundance estimation and the laminar packing problem
- Statistical analysis of next generation sequencing data
- Shrinkage of dispersion parameters in the binomial family, with application to differential exon skipping
- Perplexity: Evaluating Transcript Abundance Estimation in the Absence of Ground Truth
- A penalized likelihood approach for robust estimation of isoform expression
- Simultaneous inference of gene isoform expression for RNA sequencing data
- A two-stage Poisson model for testing RNA-Seq data
- Comparing segmentation methods for genome annotation based on RNA-seq data
- Quantifying alternative splicing from paired-end RNA-sequencing data
This page was built for publication: Statistical modeling of RNA-Seq data (MaRDI item Q635414)