Abstract: This paper discusses the problem of identifying differentially expressed groups of genes from a microarray experiment. The groups of genes are externally defined, for example, sets of gene pathways derived from biological databases. Our starting point is the interesting Gene Set Enrichment Analysis (GSEA) procedure of Subramanian et al. [Proc. Natl. Acad. Sci. USA 102 (2005) 15545--15550]. We study the problem in some generality and propose two potential improvements to GSEA: the maxmean statistic for summarizing gene-sets, and restandardization for more accurate inferences. We discuss a variety of examples and extensions, including the use of gene-set scores for class predictions. We also describe a new R language package GSA that implements our ideas.
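The maxmean statistic named in the abstract can be illustrated with a minimal sketch. It assumes per-gene scores z_i (for example, two-sample t-statistics converted to z-values) have already been computed for the genes in one set. The positive parts and the negative parts of the scores are averaged separately over the whole set, and the larger of the two averages is taken as the set score. The function name `maxmean` and the `signed` flag are hypothetical names chosen for illustration and are not the interface of the GSA package. The restandardization step is not shown.

```python
import numpy as np


def maxmean(z, signed=False):
    """Maxmean summary of the per-gene scores z for a single gene set.

    Hypothetical helper for illustration only; not the GSA package API.
    """
    z = np.asarray(z, dtype=float)
    if z.size == 0:
        raise ValueError("gene set must contain at least one score")
    # Average of the positive parts, taken over all genes in the set.
    pos = np.maximum(z, 0.0).mean()
    # Average of the negative parts (as magnitudes), over all genes in the set.
    neg = np.maximum(-z, 0.0).mean()
    if pos >= neg:
        return pos
    # Assumption: with signed=True the sign records which direction dominated.
    return -neg if signed else neg


if __name__ == "__main__":
    print(maxmean([2.0, -1.0, 0.0, 3.0]))
    print(maxmean([-4.0, 1.0, -2.0, 0.0], signed=True))
```

Because both averages divide by the full set size, a few large scores in one direction are not diluted by cancellation against scores of the opposite sign, which is the motivation for preferring this summary to a plain mean.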
Recommendations
- Random-set methods identify distinct aspects of the enrichment signal in gene-set analysis
- Linear combination test for hierarchical gene set analysis
- Multiple hypothesis testing in microarray experiments
- A renewed approach to the nonparametric analysis of replicated microarray experiments
- A GS-CORE algorithm for performing a reduction test on multiple gene sets and their core genes
Cited in (65)
- Test for high dimensional regression coefficients of partially linear models
- Rotation testing in gene set enrichment analysis for small direct comparison experiments
- Gene set analysis for GWAS: assessing the use of modified Kolmogorov-Smirnov statistics
- Mutual fund performance: false discoveries, bias, and power
- Testing SNPs and sets of SNPs for importance in association studies
- Rotation gene set testing for longitudinal expression data
- Additive varying-coefficient model for nonlinear gene-environment interactions
- A Bayesian extension of the hypergeometric test for functional enrichment analysis
- Linear combination test for hierarchical gene set analysis
- Graphical modeling for gene set analysis: a critical appraisal
- Distribution-free tests of mean vectors and covariance matrices for multivariate paired data
- Permutation test for incomplete paired data with application to cDNA microarray data
- Simultaneous inference: when should hypothesis testing problems be combined?
- Deriving and comparing the distribution for the number of false positives in single step methods to control \(k\)-FWER
- Calculating the Statistical Significance of Changes in Pathway Activity From Gene Expression Data
- Conditional Test for Ultrahigh Dimensional Linear Regression Coefficients
- Bayesian gene set analysis for identifying significant biological pathways
- A decision-theory approach to interpretable set analysis for high-dimensional data
- Joint adaptive mean-variance regularization and variance stabilization of high dimensional data
- Discrimination and scoring using small sets of genes for two-sample microarray data
- Graph centrality based prediction of cancer genes
- Correcting length-bias in gene set analysis for DNA methylation data
- Random-set methods identify distinct aspects of the enrichment signal in gene-set analysis
- A statistical framework for testing functional categories in microarray data
- Hunting for significance: Bayesian classifiers under a mixture loss function
- A GS-CORE algorithm for performing a reduction test on multiple gene sets and their core genes
- Incorporation of gene exchangeabilities improves the reproducibility of gene set rankings
- Stable feature selection for biomarker discovery
- Bayesian nonparametric clustering and association studies for candidate SNP observations
- Bayesian analysis of multiple hypothesis testing with applications to microarray experiments
- Rank-based score tests for high-dimensional regression coefficients
- Feature screening via distance correlation learning
- Two sample tests for high-dimensional covariance matrices
- Statistical significance for genomewide studies
- A new test for part of high dimensional regression coefficients
- A two-sample test for high-dimensional data with applications to gene-set testing
- A new nonparametric test for high-dimensional regression coefficients
- A feasible high dimensional randomization test for the mean vector
- Identification of consistent functional genetic modules
- Distance-correlation based gene set analysis in longitudinal studies
- Testing the Effects of High-Dimensional Covariates via Aggregating Cumulative Covariances
- Statistical tests for the intersection of independent lists of genes: sensitivity, FDR, and type I error control
- Invisible fence methods and the identification of differentially expressed gene sets
- Shrinkage-based diagonal Hotelling's tests for high-dimensional small sample size data
- Probabilistic methods in cancer biology
- Weighted Kolmogorov Smirnov testing: an alternative for Gene Set Enrichment Analysis
- Robust \(U\)-type test for high dimensional regression coefficients using refitted cross-validation variance estimation
- Penalized model-based clustering
- A high-dimensional two-sample test for the mean using random subspaces
- Identification of differentially expressed spatial clusters using humoral response microarray data
- Partition clustering of high dimensional low sample size data based on \(p\)-values
- Microarrays, empirical Bayes and the two-groups model
- A Bayesian model averaging approach for observational gene expression studies
- Consistency of invariance-based randomization tests
- Conditional mean and quantile dependence testing in high dimension
- Power boosting: fusion of multiple test statistics via resampling
- Robust group variable screening based on maximum Lq-likelihood estimation
- Measures of Uncertainty for Shrinkage Model Selection
- Randomized test of mean function for high-frequency functional data
- A nonparametric method for classification trees using grouped covariates
- The cluster D-trace loss for differential network analysis
- Power-Enhanced Simultaneous Test of High-Dimensional Mean Vectors and Covariance Matrices with Application to Gene-Set Testing
- Nonparametric conditional mean testing via an extreme-type statistic in high dimension
- Combined hypothesis testing on graphs with applications to gene set enrichment analysis
- Simultaneous test for linear model via projection
This page was built for publication: On testing the significance of sets of genes