Robust methods to detect disease-genotype association in genetic association studies: calculate p-values using exact conditional enumeration instead of simulated permutations or asymptotic approximations

From MaRDI portal
Publication:482819

DOI: 10.1515/SAGMB-2013-0084; zbMATH Open: 1302.92074; arXiv: 1307.7536; OpenAlex: W3106510470; Wikidata: Q45914058; Scholia: Q45914058; MaRDI QID: Q482819; FDO: Q482819


Authors: Mette Langaas, Øyvind Bakke


Publication date: 6 January 2015

Published in: Statistical Applications in Genetics and Molecular Biology

Abstract: In genetic association studies, detecting disease-genotype associations is a primary goal. For most diseases, the underlying genetic model is unknown, and we study seven robust test statistics for monotone association. For a given test statistic, there are many ways to calculate a p-value, but in genetic association studies, calculations have predominantly been based on asymptotic approximations or on simulated permutations. We show that when the number of permutations tends to infinity, the permutation p-value approaches the exact conditional enumeration p-value, and further that calculating the latter p-value is much more efficient than performing simulated permutations. We then answer two research questions. (i) Which of the test statistics under study are the most powerful for monotone genetic models? (ii) Based on test size, power, and computational considerations, should asymptotic approximations or exact conditional enumeration be used for calculating p-values? We have studied case-control sample sizes with 500-5000 cases and 500-15000 controls, and significance levels from 5e-8 to 0.05; thus our results are applicable to genetic association studies with only one genetic marker under study, intermediate follow-up studies, and genome-wide association studies. We find that if all monotone genetic models are of interest, the best performance is achieved for a test statistic based on the maximum over a range of Cochran-Armitage trend tests with different scores and for a constrained likelihood ratio test. For significance levels below 0.05, asymptotic approximations may give a test size up to 20 times the nominal level, and should therefore be used with caution. Further, calculating p-values based on exact conditional enumeration is a powerful, valid and computationally feasible approach, and we advocate its use in genetic association studies.
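As a rough illustration of what an exact conditional enumeration p-value means for a 2x3 case-control genotype table, the sketch below may help. It is not the authors' implementation. It assumes a single Cochran-Armitage-type trend statistic with additive scores (0, 0.5, 1), a two-sided test via the absolute deviation, and conditioning on both the genotype totals and the number of cases, so that the case genotype counts follow a multivariate hypergeometric distribution under the null hypothesis. The function name `exact_trend_pvalue` and its parameters are hypothetical.

```python
from math import comb


def exact_trend_pvalue(cases, controls, scores=(0.0, 0.5, 1.0)):
    """Exact conditional p-value for a trend statistic on a 2x3 table.

    cases, controls: genotype counts (e.g. aa, Aa, AA) in each group.
    scores: genotype scores (additive scores are an assumption here).
    """
    n = [x + y for x, y in zip(cases, controls)]  # genotype totals (fixed)
    r = sum(cases)                                # number of cases (fixed)
    total = sum(n)
    # Null expectation of the score sum among cases, given the margins.
    expected = r * sum(s * m for s, m in zip(scores, n)) / total

    def stat(x):
        # With all margins fixed the null variance is constant, so ranking
        # tables by the absolute deviation is equivalent to ranking them by
        # the squared standardized trend statistic.
        return abs(sum(s * xi for s, xi in zip(scores, x)) - expected)

    observed = stat(cases)
    tol = 1e-9  # guards against floating-point ties
    numer = 0
    # Enumerate every table of case counts compatible with the margins.
    for x0 in range(max(0, r - n[1] - n[2]), min(n[0], r) + 1):
        for x1 in range(max(0, r - x0 - n[2]), min(n[1], r - x0) + 1):
            x2 = r - x0 - x1
            if stat((x0, x1, x2)) >= observed - tol:
                # Unnormalized multivariate hypergeometric weight.
                numer += comb(n[0], x0) * comb(n[1], x1) * comb(n[2], x2)
    return numer / comb(total, r)


if __name__ == "__main__":
    print(exact_trend_pvalue((10, 25, 15), (20, 22, 8)))
```

A simulated permutation p-value would estimate the same quantity by repeatedly shuffling case-control labels, whereas the enumeration visits each compatible table exactly once. A maximum over several score vectors, as in the statistics favoured in the abstract, could be handled by replacing `stat` with a standardized maximum, which this sketch does not attempt.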


Full work available at URL: https://arxiv.org/abs/1307.7536






Cited In (3)





