A very brave and mature preprint by Dr. Justin Pickett, who takes responsibility for errors in a published paper even though his co-authors disagree. We need more of this: more bravery and responsibility in science. (The disagreement between him and his co-authors becomes clear only on reading the full preprint.)
My coauthors and I were informed about data irregularities in Johnson, Stewart, Pickett, and Gertz (2011), and in my coauthors’ other articles. Subsequently, I examined my limited files and found evidence that we: 1) included hundreds of duplicates, 2) underreported the number of counties, and 3) somehow added another 316 respondents right before publication (and over a year after the survey was conducted) without changing nearly any of the reported statistics (means, standard deviations, regression coefficients). The survey company confirmed that it sent us only 500 respondents, not the 1,184 reported in the article. I obtained and reanalyzed those data. This report presents the findings from my reanalysis, which suggest that the sample was not just duplicated. The data were also altered—intentionally or unintentionally—in other ways, and those alterations produced the article’s main findings. Additionally, we misreported data characteristics as well as aspects of our analysis and findings, and we failed to report the use of imputation for missing data.
The following eight findings emerged from my reanalysis:
- The article reports 1,184 respondents, but actually there are 500.
- The article reports 91 counties, but actually there are 326.
- The article describes respondents that differ substantially from those in the data.
- The article reports two significant interaction effects, but actually there are none.
- The article reports the effect of Hispanic growth is significant and positive, but actually it is non-significant and negative.
- The article reports many other findings that do not exist in the data.
- The standard errors are stable in our published article, but not in the actual data or in articles published by other authors using similar modeling techniques with large samples.
- Although never mentioned in the article, 208 of the 500 respondents in the data (or 42%) have imputed values.
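
The headline numbers in this list can be checked with simple arithmetic. The sketch below is only an illustration, not the method used in the preprint: it recomputes the imputed share and the gap between reported and delivered respondents from the figures quoted above, and shows one plain way to count duplicated rows. The helper names and the toy records are hypothetical and do not come from the actual survey data.

```python
from collections import Counter

# Figures quoted in the text above.
REPORTED_N = 1184   # respondents reported in the article
DELIVERED_N = 500   # respondents the survey company says it sent
IMPUTED_N = 208     # respondents with imputed values


def imputed_share(imputed, total):
    """Return the percentage of respondents with imputed values."""
    return 100 * imputed / total


def count_extra_copies(rows):
    """Count rows that repeat an earlier, identical row."""
    counts = Counter(rows)
    return sum(n - 1 for n in counts.values())


# Hypothetical toy records (id, age, county); NOT the real survey data.
toy_rows = [(1, 34, "A"), (2, 51, "B"), (1, 34, "A"), (3, 29, "C"), (1, 34, "A")]

print(f"imputed share: {imputed_share(IMPUTED_N, DELIVERED_N):.1f}%")
print(f"reported minus delivered: {REPORTED_N - DELIVERED_N}")
print(f"extra copies in toy data: {count_extra_copies(toy_rows)}")
```

The imputed share comes out at 41.6%, which matches the "42%" in the list after rounding, and the article reports 684 more respondents than the survey company delivered.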