A fantastic paper I found on ResearchGate that seems highly relevant to all of us:
I have not read the entire paper, but I believe it remains to be seen whether replicability is actually improved by these measures.
Are there any general statistics indicating the extent to which findings in studies published in major journals have proved replicable in other studies? It would be very interesting to develop a measure for this and then keep track of it to see if there are any noteworthy trends over time.
Otherwise, the suggestion that the replication crisis is being met with useful strategies that improve the credibility of research is just an untested hypothesis.
Are there any general statistics indicating the extent to which findings in studies published in major journals have proved replicable in other studies?
From the article:
In one of the most impactful replication initiatives of the last decade, the Open Science Collaboration5 sampled studies from three prominent journals representing different sub-fields of psychology to estimate the replicability of psychological research. Out of 100 independently performed replications, only 39% were subjectively labelled as successful replications, and on average, the effects were roughly half the original size. Putting these results into a wider context, a minimum replicability rate of 89% should have been expected if all of the original effects were true (and not false positives; ref. 6). Pooling the Open Science Collaboration5 replications with 207 other replications from recent years resulted in a higher estimate; 64% of effects successfully replicated with effect sizes being 32% smaller than the original effects7. While estimations of replicability may vary, they nevertheless appear to be sub-optimal—an issue that is not exclusive to psychology and found across many other disciplines (e.g., animal behaviour8,9,10; cancer biology11; economics12), and symptomatic of persistent issues within the research environment13,14.
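As a quick sanity check on those pooled numbers, here is a back-of-the-envelope sketch (treating the quoted percentages as exact) that backs out the replication rate implied for the 207 additional replications:

```python
# Figures quoted above from the article.
osc_n, osc_rate = 100, 0.39        # Open Science Collaboration: 39/100 replicated
pooled_n, pooled_rate = 307, 0.64  # pooled estimate across 307 replications

other_n = pooled_n - osc_n  # the 207 additional replications
other_successes = pooled_rate * pooled_n - osc_rate * osc_n
other_rate = other_successes / other_n
print(f"Implied rate for the other {other_n} replications: {other_rate:.0%}")
# → Implied rate for the other 207 replications: 76%
```

So the more recent replications alone replicate at roughly 76%, noticeably above the OSC's 39%, which is consistent with the pooled estimate being higher.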
It would be very interesting to develop a measure for this and then keep track of it to see if there are any noteworthy trends over time.
Good point, I don’t know of any studies that have tested whether the interventions described in the article have actually improved replicability. This reminds me of the metascience learning loop described in A Vision of Metascience. Unfortunately, the whole loop probably takes a while to generate a measurable change in replicability.
From the end of the article:
While developments within the credibility revolution were originally fuelled by failed replications, these in themselves are not the only issue of discussion within the credibility revolution. Furthermore, replication rates alone may not be the best measure of research quality. Instead of focusing purely on replicability, we should strive to maximize transparency, rigour, and quality in all aspects of research18,200.
I view the credibility revolution very positively, but this does seem like shifting the goalposts from a hard-to-measure output of the research process (replicability) to proxies that are potentially easier to measure (transparency, rigour, and quality). I just skimmed the article, so I may have missed it, but it would be nice to see some evidence that the proxies are at least correlated with replicability.
I’m a new member here, so this may have been mentioned before, but in the field of cancer biology a reproducibility study was conducted and published in 2021. Here is the link: https://elifesciences.org/articles/67995
Another paper that fits the context here:
lowndes2017.pdf (329.9 KB)
Hot off the press, it looks like these interventions do work after all:
This paper reports an investigation by four coordinated laboratories of the prospective replicability of 16 novel experimental findings using rigour-enhancing practices: confirmatory tests, large sample sizes, preregistration and methodological transparency. In contrast to past systematic replication efforts that reported replication rates averaging 50%, replication attempts here produced the expected effects with significance testing (P < 0.05) in 86% of attempts, slightly exceeding the maximum expected replicability based on observed effect sizes and sample sizes. When one lab attempted to replicate an effect discovered by another lab, the effect size in the replications was 97% that in the original study. This high replication rate justifies confidence in rigour-enhancing methods to increase the replicability of new discoveries.
See these commentaries as well: