Abstract: This paper discusses the problem of identifying differentially expressed groups of genes from a microarray experiment. The groups of genes are externally defined, for example, sets of gene pathways derived from biological databases. Our starting point is the interesting Gene Set Enrichment Analysis (GSEA) procedure of Subramanian et al. [Proc. Natl. Acad. Sci. USA 102 (2005) 15545--15550]. We study the problem in some generality and propose two potential improvements to GSEA: the maxmean statistic for summarizing gene-sets, and restandardization for more accurate inferences. We discuss a variety of examples and extensions, including the use of gene-set scores for class predictions. We also describe a new R language package GSA that implements our ideas.
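For readers unfamiliar with the maxmean statistic named in the abstract, here is a minimal sketch. It assumes the usual definition: average the positive parts and the negative parts of the per-gene scores in a set separately, each over all genes in the set, then take the larger of the two averages. The function name `maxmean` and the example scores are illustrative assumptions and are not taken from the GSA package. Restandardization is not shown.

```python
import numpy as np


def maxmean(z):
    """Illustrative maxmean summary of per-gene scores z for one gene set.

    Assumed definition: max(mean of positive parts, mean of negative parts),
    where both means divide by the full set size.
    """
    z = np.asarray(z, dtype=float)
    pos_mean = np.maximum(z, 0.0).mean()   # average of max(z_i, 0)
    neg_mean = np.maximum(-z, 0.0).mean()  # average of max(-z_i, 0)
    return max(pos_mean, neg_mean)


# Hypothetical scores for two small gene sets.
mostly_up = maxmean([2.0, -1.0, 0.0, 3.0])
mostly_down = maxmean([-4.0, 1.0])
```

Because each average is taken over the whole set, a single large score in a big set of otherwise null genes contributes little, which is the intended contrast with a plain maximum.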
Recommendations
- Random-set methods identify distinct aspects of the enrichment signal in gene-set analysis
- Linear combination test for hierarchical gene set analysis
- Multiple hypothesis testing in microarray experiments
- A renewed approach to the nonparametric analysis of replicated microarray experiments
- A GS-CORE algorithm for performing a reduction test on multiple gene sets and their core genes
Cited in (65)
- Bayesian analysis of multiple hypothesis testing with applications to microarray experiments
- Calculating the Statistical Significance of Changes in Pathway Activity From Gene Expression Data
- Feature screening via distance correlation learning
- Shrinkage-based diagonal Hotelling's tests for high-dimensional small sample size data
- Identification of differentially expressed spatial clusters using humoral response microarray data
- Partition clustering of high dimensional low sample size data based on \(p\)-values
- Microarrays, empirical Bayes and the two-groups model
- Random-set methods identify distinct aspects of the enrichment signal in gene-set analysis
- Weighted Kolmogorov Smirnov testing: an alternative for Gene Set Enrichment Analysis
- A high-dimensional two-sample test for the mean using random subspaces
- Testing SNPs and sets of SNPs for importance in association studies
- Probabilistic methods in cancer biology
- Penalized model-based clustering
- Power-Enhanced Simultaneous Test of High-Dimensional Mean Vectors and Covariance Matrices with Application to Gene-Set Testing
- Simultaneous inference: when should hypothesis testing problems be combined?
- Incorporation of gene exchangeabilities improves the reproducibility of gene set rankings
- Permutation test for incomplete paired data with application to cDNA microarray data
- Additive varying-coefficient model for nonlinear gene-environment interactions
- Correcting length-bias in gene set analysis for DNA methylation data
- Graph centrality based prediction of cancer genes
- Hunting for significance: Bayesian classifiers under a mixture loss function
- A Bayesian model averaging approach for observational gene expression studies
- A two-sample test for high-dimensional data with applications to gene-set testing
- Linear combination test for hierarchical gene set analysis
- Identification of consistent functional genetic modules
- Test for high dimensional regression coefficients of partially linear models
- The cluster D-trace loss for differential network analysis
- A new test for part of high dimensional regression coefficients
- Discrimination and scoring using small sets of genes for two-sample microarray data
- A GS-CORE algorithm for performing a reduction test on multiple gene sets and their core genes
- A feasible high dimensional randomization test for the mean vector
- Measures of Uncertainty for Shrinkage Model Selection
- Mutual fund performance: false discoveries, bias, and power
- Deriving and comparing the distribution for the number of false positives in single step methods to control \(k\)-FWER
- Combined hypothesis testing on graphs with applications to gene set enrichment analysis
- Testing the Effects of High-Dimensional Covariates via Aggregating Cumulative Covariances
- Gene set analysis for GWAS: assessing the use of modified Kolmogorov-Smirnov statistics
- Robust group variable screening based on maximum Lq-likelihood estimation
- Distance-correlation based gene set analysis in longitudinal studies
- Joint adaptive mean-variance regularization and variance stabilization of high dimensional data
- Conditional mean and quantile dependence testing in high dimension
- Two sample tests for high-dimensional covariance matrices
- Rank-based score tests for high-dimensional regression coefficients
- Statistical significance for genomewide studies
- Randomized test of mean function for high-frequency functional data
- Graphical modeling for gene set analysis: a critical appraisal
- Simultaneous test for linear model via projection
- A decision-theory approach to interpretable set analysis for high-dimensional data
- Power boosting: fusion of multiple test statistics via resampling
- Conditional Test for Ultrahigh Dimensional Linear Regression Coefficients
- A new nonparametric test for high-dimensional regression coefficients
- A Bayesian extension of the hypergeometric test for functional enrichment analysis
- Distribution-free tests of mean vectors and covariance matrices for multivariate paired data
- Rotation testing in gene set enrichment analysis for small direct comparison experiments
- Nonparametric conditional mean testing via an extreme-type statistic in high dimension
- Stable feature selection for biomarker discovery
- Bayesian nonparametric clustering and association studies for candidate SNP observations
- Rotation gene set testing for longitudinal expression data
- Statistical tests for the intersection of independent lists of genes: sensitivity, FDR, and type I error control
- Invisible fence methods and the identification of differentially expressed gene sets
- A statistical framework for testing functional categories in microarray data
- Bayesian gene set analysis for identifying significant biological pathways
- Consistency of invariance-based randomization tests
- A nonparametric method for classification trees using grouped covariates
- Robust \(U\)-type test for high dimensional regression coefficients using refitted cross-validation variance estimation
This page was built for publication: On testing the significance of sets of genes