WebPlotDigitizer

Gavin · February 22, 2021, 2:37pm

I came across this handy tool for digitizing plots recently and found it very useful. It has enough automation to save a lot of time, but enough interactivity to correct mistakes easily.

It’s not really a replicability-focused tool, but it’s certainly useful for digitizing values from plots in papers that don’t provide raw data.

btrettel · February 23, 2021, 10:57pm

These types of programs are great. I’m partial to this one: https://github.com/pn2200/g3data

I used g3data extensively during my PhD and have continued using it afterwards. You can see a significant fraction of the data I’ve transcribed here: https://github.com/btrettel/pipe-jet-breakup-data/tree/master/data

I think sharing transcribed data should be a bigger part of open science. Transcribing the data takes a lot of time, so it would be best to minimize duplicate work if possible. I also note in README files any issues I noticed with the data, though this could also be done on post-publication peer review websites like PubPeer. I think putting the notice with the data is more convenient, however. And if the problem I noticed prompted me to modify the data, that is definitely worth noting with the data. (I imagine that some people have a different philosophy where the data as published is itself an artifact that should not be changed, but I don’t share that philosophy.)

Also, if you want extra precision or find your hand shaking too much, look for a setting or program to do “mouse emulation”. This’ll allow you to move your cursor one pixel at a time with the keyboard.

mkcor · February 24, 2021, 7:46am

Yay! Brings back memories… like 2014 Plotly Blog - Automatically Grab Data From an Image with...

Gavin · February 24, 2021, 7:45pm

Wow, that’s a lot of papers you’ve transcribed data from!

This is a good point, and I don’t think I’ve ever heard anybody mention it before. I had planned to include the transcribed data in a supplement for what I publish, but this makes me think it would be better to put it in a separate place/repository and link to that from my paper. It seems like it would also be useful to put a link to the data in a comment on PubPeer for the original publication, to make it more visible. Do you do that for the datasets you have on GitHub?

btrettel · March 6, 2021, 6:22pm

I haven’t put any comments on PubPeer for the data I’ve transcribed, but that’s a good idea. I’ve added this to my PubPeer to-do list, which I keep in my reference manager.
