Statistical modeling of RNA-Seq data

DOI 10.1214/10-STS343
MaRDI QID Q635414

Authors Julia Salzman, Hui Jiang, Wing H. Wong

Publication date 19 August 2011

Published in Statistical Science

Full work available at URL https://arxiv.org/abs/1106.3211

Keywords: Fisher information; minimal sufficiency; isoform abundance estimation; paired end RNA-Seq data analysis

Estimation in survival analysis and censored data (62N02)
Applications of statistics to biology and medical sciences; meta analysis (62P10)
Biochemistry, molecular biology (92C40)
Genetics and epigenetics (92D10)
Sufficient statistics and fields (62B05)

Abstract: Recently, ultra high-throughput sequencing of RNA (RNA-Seq) has been developed as an approach for analysis of gene expression. By obtaining tens or even hundreds of millions of reads of transcribed sequences, an RNA-Seq experiment can offer a comprehensive survey of the population of genes (transcripts) in any sample of interest. This paper introduces a statistical model for estimating isoform abundance from RNA-Seq data and is flexible enough to accommodate both single end and paired end RNA-Seq data and sampling bias along the length of the transcript. Based on the derivation of minimal sufficient statistics for the model, a computationally feasible implementation of the maximum likelihood estimator of the model is provided. Further, it is shown that using paired end RNA-Seq provides more accurate isoform abundance estimates than single end sequencing at fixed sequencing depth. Simulation studies are also given.

Cited in (22)
