False discovery rates in somatic mutation studies of cancer
Publication:641083
Abstract: The purpose of cancer genome sequencing studies is to determine the nature and types of alterations present in a typical cancer and to discover genes mutated at high frequencies. In this article we discuss statistical methods for the analysis of somatic mutation frequency data generated in these studies. We place special emphasis on a two-stage study design introduced by Sjöblom et al. [Science 314 (2006) 268–274]. In this context, we describe and compare statistical methods for constructing scores that can be used to prioritize candidate genes for further investigation and to assess the statistical significance of the candidates thus identified. Controversy has surrounded the reliability of the false discovery rate estimates provided by the approximations used in early cancer genome studies. To address this, we develop a semiparametric Bayesian model that provides an accurate fit to the data. We use this model to generate a large collection of realistic scenarios, and evaluate alternative approaches on this collection. Our assessment is impartial, in that the model used for generating data is not used by any of the approaches compared, and objective, in that the scenarios are generated by a model that fits the data. Our results quantify the conservative control of the false discovery rate with the Benjamini and Hochberg method compared to the empirical Bayes approach and the multiple testing method proposed in Storey [J. R. Stat. Soc. Ser. B Stat. Methodol. 64 (2002) 479–498]. Simulation results also show a negligible departure from the target false discovery rate for the methodology used in Sjöblom et al. [Science 314 (2006) 268–274].
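The abstract's remark that the Benjamini and Hochberg method is conservative relative to Storey's approach can be illustrated with a minimal sketch. This is not the code or the simulation design of the article: the function names, the example p-values, and the fixed tuning value `lam=0.5` are assumptions made for illustration only. Storey's q-values rescale the Benjamini-Hochberg step-up quantities by an estimate of the proportion of true nulls, which is at most 1, so they never exceed the Benjamini-Hochberg adjusted p-values.

```python
import numpy as np

def _step_up(pvals):
    """Monotone step-up quantities m * p_(i) / i, returned in input order (unclipped)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    raw = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity: running minimum taken from the largest p-value down.
    raw = np.minimum.accumulate(raw[::-1])[::-1]
    out = np.empty(m)
    out[order] = raw
    return out

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values."""
    return np.minimum(1.0, _step_up(pvals))

def storey_pi0(pvals, lam=0.5):
    """Storey-type estimate of the null proportion with a fixed lambda (assumed value)."""
    p = np.asarray(pvals, dtype=float)
    return min(1.0, float(np.mean(p > lam)) / (1.0 - lam))

def storey_qvalues(pvals, lam=0.5):
    """q-values: step-up quantities scaled by the estimated null proportion."""
    return np.minimum(1.0, storey_pi0(pvals, lam) * _step_up(pvals))

# Hypothetical p-values, chosen only to make the arithmetic easy to follow.
example_p = [0.01, 0.02, 0.03, 0.04, 0.9]
print(bh_adjust(example_p))
print(storey_pi0(example_p))
print(storey_qvalues(example_p))
```

With these five illustrative p-values, one of five exceeds 0.5, so the estimated null proportion is 0.2 / 0.5 = 0.4 and every q-value is 0.4 times the corresponding Benjamini-Hochberg adjusted p-value.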
Recommendations
- Hierarchical Bayesian analysis of somatic mutation data in cancer
- Bayesian local false discovery rate for sparse count data with application to the discovery of hotspots in protein domains
- A Bayesian False Discovery Rate for Multiple Testing
- Detecting mutations in mixed sample sequencing data using empirical Bayes
- Estimation of False Discovery Rates in Multiple Testing: Application to Gene Microarray Data
Cites work
- scientific article; zbMATH DE number 720689
- A Bayesian analysis of some nonparametric problems
- A Direct Approach to False Discovery Rates
- A review of modern multiple hypothesis testing, with particular attention to the false discovery proportion
- Bayesian Density Estimation and Inference Using Mixtures
- Empirical Bayes Analysis of a Microarray Experiment
- Ferguson distributions via Polya urn schemes
- Gamma shape mixtures for heavy-tailed distributions
- Multiple hypothesis testing in microarray experiments.
- Nonparametric Bayesian data analysis
- Optimal two-stage genome-wide association designs based on false discovery rate
- Robbins, empirical Bayes and microarrays
- The control of the false discovery rate in multiple testing under dependency.
- Two-Stage Designs for Gene-Disease Association Studies with Sample Size Constraints
- Two-stage designs for gene-disease association studies
Cited in (8)
- Estimating an oncogenetic tree when false negatives and positives are present
- Using somatic mutation data to test tumors for clonal relatedness
- Multivariate association analysis with somatic mutation data
- Bayesian local false discovery rate for sparse count data with application to the discovery of hotspots in protein domains
- TRAB: testing whether mutation frequencies are above an unknown background
- Sampling designs via a multivariate hypergeometric-Dirichlet process model for a multi-species assemblage with unknown heterogeneity
- Detecting mutations in mixed sample sequencing data using empirical Bayes
- Hierarchical Bayesian analysis of somatic mutation data in cancer