New paper: No raw data, no science: another possible source of the reproducibility crisis

rebecca · February 22, 2020, 1:58pm

Miyakawa, T. No raw data, no science: another possible source of the reproducibility crisis. Mol Brain 13, 24 (2020).

Note: Highlights in bold below by me.

Abstract

A reproducibility crisis is a situation where many scientific studies cannot be reproduced. Inappropriate practices of science, such as HARKing, p-hacking, and selective reporting of positive results, have been suggested as causes of irreproducibility. In this editorial, I propose that a lack of raw data or data fabrication is another possible cause of irreproducibility.

As an Editor-in-Chief of Molecular Brain, I have handled 180 manuscripts since early 2017 and have made 41 editorial decisions categorized as “Revise before review,” requesting that the authors provide raw data. Surprisingly, among those 41 manuscripts, 21 were withdrawn without providing raw data, indicating that requiring raw data drove away more than half of the manuscripts. I rejected 19 out of the remaining 20 manuscripts because of insufficient raw data. Thus, more than 97% of the 41 manuscripts did not present the raw data supporting their results when requested by an editor, suggesting a possibility that the raw data did not exist from the beginning, at least in some portions of these cases.

Considering that any scientific study should be based on raw data, and that data storage space should no longer be a challenge, journals, in principle, should try to have their authors publicize raw data in a public database or journal site upon the publication of the paper to increase reproducibility of the published results and to increase public trust in science.
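
For anyone who wants to check the percentages quoted in the abstract, here is a minimal sketch. Only the counts come from the abstract; the variable names and the `share` helper are my own, for illustration only.

```python
# Counts quoted in the abstract (Miyakawa, 2020).
handled = 180    # manuscripts handled since early 2017
requested = 41   # "Revise before review" decisions requesting raw data
withdrawn = 21   # withdrawn without providing raw data
rejected = 19    # rejected for insufficient raw data


def share(part, whole):
    """Return part/whole as a percentage (hypothetical helper)."""
    return 100 * part / whole


# Manuscripts left after the withdrawals, and the total without adequate raw data.
remaining = requested - withdrawn
no_raw_data = withdrawn + rejected

print(f"Remaining after withdrawals: {remaining}")
print(f"Withdrawn, of those asked for raw data: {share(withdrawn, requested):.1f}%")
print(f"No adequate raw data, of those asked: {share(no_raw_data, requested):.1f}%")
print(f"Asked for raw data, of all handled: {share(requested, handled):.1f}%")
```

The counts are consistent with the wording of the abstract: 21 of 41 is just over half, and 40 of 41 is just over 97%.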

sTeamTraen · February 24, 2020, 6:24pm

As I wrote on Twitter, this feels like a watershed moment. Ignoring for a moment this editor’s apparent talent for spotting results that are too good to be true (40/41), the headline number for me is that in at least 40 out of 180 cases, the authors of a manuscript knew that their data would not pass muster. I’m guessing that if he had asked for the raw data in every case, he would have had a fair few extra refusals.

Here’s a thought experiment. What would the world look like if a third of all research (in the life sciences, or psychology, or some other field where replications are rare and results are reported in terms of statistical inference) were fake, and this had been the case for many years? I suggest that some of the consequences would be:

Many results cannot be replicated
Researchers find a variety of excuses not to share their data, some of them not far from “The dog ate my homework”
Universities and journals — both staffed by senior people who have been part of the “academic science industry” for many years — are reluctant to investigate allegations of misconduct.

rebecca · February 28, 2020, 3:31pm

I completely agree. And it’s rather terrifying.
