The assessment of replication success based on relative effect size
From MaRDI portal
DOI: 10.1214/21-AOAS1502
zbMATH Open: 1498.62220
arXiv: 2009.07782
MaRDI QID: Q65581
Authors: Leonhard Held, Charlotte Micheloud, Samuel Pawel
Publication date: 1 June 2022
Published in: The Annals of Applied Statistics
Abstract: Replication studies are increasingly conducted to confirm original findings. However, there is no established standard for assessing replication success, and in practice many different approaches are used. The purpose of this paper is to refine and extend a recently proposed reverse-Bayes approach for the analysis of replication studies. We show how this method is directly related to the relative effect size, the ratio of the replication effect estimate to the original one. This perspective leads to a new proposal to recalibrate the assessment of replication success, the golden level. The recalibration ensures that, for borderline significant original studies, replication success can only be achieved if the replication effect estimate is larger than the original one. Conditional power for replication success can then take any desired value if the original study is significant and the replication sample size is large enough. Compared to the standard approach of requiring statistical significance of both the original and the replication study, replication success at the golden level offers uniform gains in project power and controls the Type-I error rate if the replication sample size is not smaller than the original one. An application to data from four large replication projects shows that the new approach leads to more appropriate inferences, as it penalizes shrinkage of the replication estimate relative to the original one, while ensuring that both effect estimates are sufficiently convincing on their own.
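The reverse-Bayes assessment described in the abstract can be illustrated in its simplest special case. The sketch below (my own illustration, not code from the paper; the authors distribute the R package ReplicationSuccess) computes the one-sided sceptical p-value of Held (2020) under the assumption of equal original and replication variances (variance ratio c = 1), where the sceptical z-value reduces to the closed form z_S² = z_o² z_r² / (z_o² + z_r²). Function names are mine; the opposite-sign convention is an illustrative simplification.

```python
# Illustrative sketch, assuming variance ratio c = 1, where the sceptical
# z-value of Held (2020) has the closed form z_S^2 = z_o^2 z_r^2 / (z_o^2 + z_r^2).
from math import sqrt
from statistics import NormalDist

def relative_effect_size(zo: float, zr: float) -> float:
    """Ratio of replication to original effect estimate; equals zr/zo when c = 1."""
    return zr / zo

def sceptical_p(zo: float, zr: float) -> float:
    """One-sided sceptical p-value for equal variances (c = 1).

    zo, zr: z-statistics of the original and replication studies.
    """
    if zo * zr <= 0:
        return 1.0  # opposite-sign estimates: no support (illustrative convention)
    zs = sqrt(zo**2 * zr**2 / (zo**2 + zr**2))
    return 1.0 - NormalDist().cdf(zs)

# A borderline significant original (zo = 1.96) followed by a larger
# replication estimate (zr = 2.8) yields a smaller, more convincing
# sceptical p-value than one with shrinkage (zr = 1.5).
print(sceptical_p(1.96, 2.8) < sceptical_p(1.96, 1.5))  # True
```

This makes the paper's key point concrete: the sceptical p-value penalizes shrinkage of the replication estimate, unlike the two-trials rule, which only asks whether both studies are individually significant.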
Full work available at URL: https://arxiv.org/abs/2009.07782
Recommendations
- Bayesian design of "successful" replications
- Statistical methods for replicability assessment
- Sample size determination in replication attempts: The standard normal z test
- Variation and Covariation in Large-Scale Replication Projects: An Evaluation of Replicability
- An evaluation of statistical methods for aggregate patterns of replication failure
MSC classifications:
- Bayesian inference (62F15)
- Applications of statistics to biology and medical sciences; meta analysis (62P10)
Cites Work
- Sampling and Bayes' Inference in Scientific Modelling and Robustness
- The Well-Calibrated Bayesian
- Statistical Issues in Drug Development
- Introduction to Randomized Controlled Clinical Trials
- Why should clinicians care about Bayesian methods? (With discussions and response)
- Bayesianly justifiable and relevant frequency calculations for the applied statistician
Cited In (9)
- Power calculations for replication studies
- Replication success under questionable research practices -- a simulation study
- Bayesian design of "successful" replications
- Beyond the two-trials rule: Type-I error control and sample size planning with the sceptical $p$-value
- ReplicationSuccess
- Sample size determination in replication attempts: The standard normal z test
- Variation and Covariation in Large-Scale Replication Projects: An Evaluation of Replicability
- Power priors for replication studies
- Statistical review of animal trials -- a guideline
This page was built for publication: The assessment of replication success based on relative effect size