Open Science Reading Recommendations

Many members of the forum community are interested in Open Science, but it’s not always easy to keep up with the latest OS literature if you’re not a metascience researcher yourself (well, if you’re not active on Twitter…). So I wanted to encourage people to post new OS/metascience papers/preprints here, particularly if they think the article will be of general interest and stimulate discussion in the forum.

A previous example of a preprint that generated some discussion is Metascience as a scientific social movement

Like the one above, articles are welcome to cover advanced topics and issues around Open Science. I’d refer anybody looking for an introduction to OS to Easing Into Open Science: A Guide for Graduate Students and Their Advisors and the material in the Intro Papers folder of the ReproducibiliTea Zotero library.

If a given article generates a bit of discussion then I or another moderator can split it into another thread. I’ll also pick an interesting article to feature in the Latest IGDORE Newsletter each month (and give a shout out to whoever posted it originally).


Here is a paper on journal impact factors to start things off (h/t @pcmasuzzo)

Journal impact factors, publication charges and assessment of quality and accuracy of scientific research are critical for researchers, managers, funders, policy makers, and society. Editors and publishers compete for impact factor rankings, to demonstrate how important their journals are, and researchers strive to publish in perceived top journals, despite high publication and access charges. This raises questions of how top journals are identified, whether assessments of impacts are accurate and whether high publication charges borne by the research community are justified, bearing in mind that they also collectively provide free peer-review to the publishers. Although traditional journals accelerated peer review and publication during the COVID-19 pandemic, preprint servers made a greater impact with over 30,000 open access articles becoming available and accelerating a trend already seen in other fields of research. We review and comment on the advantages and disadvantages of a range of assessment methods and the way in which they are used by researchers, managers, employers and publishers. We argue that new approaches to assessment are required to provide a realistic and comprehensive measure of the value of research and journals and we support open access publishing at a modest, affordable price to benefit research producers and consumers.

Some old arguments against the impact factor:

In 1997 Per Seglen (1997) summarized in four points why JIFs should not be used for the evaluation of research:

  1. “Use of journal impact factors conceals the difference in article citation rates (articles in the most cited half of articles in a journal are cited 10 times as often as the least cited half).
  2. Journals’ impact factors are determined by technicalities unrelated to the scientific quality of their articles.
  3. Journal impact factors depend on the research field: high impact factors are likely in journals covering large areas of basic research with a rapidly expanding but short lived literature that uses many references per article.
  4. Article citation rates determine the journal impact factor, not vice versa.”

Some existing alternatives:

A number of alternative metrics to JIF have been developed (Table 1). All of these are based on citation counts for individual papers but vary in how the numbers are used to assess impact. As discussed later, the accuracy of data based on citation counts is highly questionable.

  • CiteScore calculates a citations/published items score conceptually similar to JIF but using Scopus data to count four years of citations and four years of published items.
  • The Source Normalized Impact Factor also uses Scopus data to take a citation/published items score and normalizes it against the average number of citations/citing document.
  • The Eigenfactor (EF) and Scimago Journal Rank work in a manner analogous to Google’s PageRank algorithm, employing iterative calculations with data from Journal Citation Reports and Scopus respectively to derive scores based on the weighted valuations of citing documents.
  • Finally, h-indexes attempt to balance the number of papers published by an author or journal against the distribution of citation counts for those papers. This metric is frequently used and is discussed in more detail in a following section.
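To make two of the metrics above concrete, here is a minimal Python sketch of an h-index and a JIF/CiteScore-style citations-per-item ratio. The citation counts are made-up toy data, and the functions are simplified illustrations: they ignore the citation windows, document-type rules, and data sources (Scopus, Journal Citation Reports) that the real metrics depend on.

```python
from statistics import median


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def citations_per_item(citations_received, items_published):
    """Simplified JIF/CiteScore-style ratio: citations divided by items."""
    if items_published <= 0:
        raise ValueError("items_published must be positive")
    return citations_received / items_published


# Hypothetical citation counts for ten papers in one journal (toy data).
paper_citations = [120, 30, 8, 5, 3, 2, 1, 1, 0, 0]

print(h_index(paper_citations))
print(citations_per_item(sum(paper_citations), len(paper_citations)))
print(median(paper_citations))
```

With this toy data the journal-level ratio (17.0) is far above the median paper (2.5 citations), which is the skew Seglen's first point warns about: a few highly cited papers dominate the average.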

Some new suggestions for evaluative criteria:

If so, and recognizing that any evaluation based on a single criterion alone can be criticized, what are the criteria we should consider in order to devise a more effective system for recognition and assessment of accomplishments which also supports an equitable publishing process that is not hidden behind expensive paywalls and OA fees? The following are all metrics that could be collectively looked at to aid in assessment although as we have discussed, if used alone, all have their limitations:

  1. Contribution of an author to the paper including preprints, i.e. first author, last author, conducted experiments, analyzed data, contributed to the writing, other?
  2. Number of years active in research field and productivity
  3. Number of publications in journals where others in the same field also publish
  4. Views and downloads
  5. Number of citations as first, last, or corresponding author.

The authors get bonus points for including a reference to a Bob Dylan song as well!

I think those final points could be good factors to start building an assessment metric. One thing I note is that these are all quantitative factors that are easy to collect, but I wonder if there is scope to include other qualitative assessments (although these generally require more effort to create)? For instance, I think that peer usage and validation of research findings and/or output would be a very positive indicator. A replication study is basically the essence of peer validation, but in some cases, information indicating usage could be easier to get (e.g. looking at forks/contributions to a code repository that go on to be used for other papers).

Any thoughts? Or other suggestions for assessment metrics?


I came across this month’s preprint at a Nowhere Lab meeting (h/t Dwayne Lieck). The paper is quite heavy on Philosophy of Science, but I think it does a good job of showing the value of (combining) different types of replications.

A Falsificationist Treatment of Auxiliary Hypotheses in Social and Behavioral Sciences: Systematic Replications Framework

In short:

we investigate how the current undesirable state is related to the problem of empirical underdetermination and its disproportionately detrimental effects in the social and behavioral sciences. We then discuss how close and conceptual replications can be employed to mitigate different aspects of underdetermination, and why they might even aggravate the problem when conducted in isolation. … The Systematic Replications Framework we propose involves conducting logically connected series of close and conceptual replications and will provide a way to increase the informativity of (non)corroborative results and thereby effectively reduce the ambiguity of falsification.

The introduction is catchy:

At least some of the problems that social and behavioral sciences tackle have far-reaching and serious implications in the real world. Among them one could list very diverse questions, such as “Is exposure to media violence related to aggressive behavior and how?” … Apart from all being socially very pertinent, substantial numbers of studies investigated each of these questions. However, the similarities do not end here. Curiously enough, even after so much resource has been invested in the empirical investigation of these almost-too-relevant problems, nothing much is accomplished in terms of arriving at clear, definitive answers … Resolving theoretical disputes is an important means to scientific progress because when a given scientific field lacks consensus regarding established evidence and how exactly it supports or contradicts competing theoretical claims, the scientific community cannot appraise whether there is scientific progress or merely a misleading semblance of it. That is to say, it cannot be in a position to judge whether a theory constitutes scientific progress in the sense that it accounts for phenomena better than alternative or previous theories and can lead to the discovery of new facts, or is degenerating in the sense that it focuses on explaining away counterevidence by finding faults in replications (Lakatos, 1978). Observing this state, Lakatos maintained decades ago that most theorizing in social sciences risks making merely pseudo-scientific progress (1978, p. 88-9, n. 3-4). What further solidifies this problem is that most “hypothesis-tests” do not test any theory and those that do so subject the theory to radically few number of tests (see e.g., McPhetres et. al., 2020). 
This situation has actually been going on for a considerably long time, which renders an old observation of Meehl still relevant; namely, that theoretical claims often do not die normal deaths at the hands of empirical evidence but are discontinued due to a sheer loss of interest (1978).

As researchers whose work doesn’t directly replicate point out, a failed replication doesn’t necessarily mean a theory is falsified:

this straightforward falsificationist strategy is complicated by the fact that theories by themselves do not logically imply any testable predictions. As the Duhem-Quine Thesis (DQT from now on) famously propounds, scientific theories or hypotheses have empirical consequences only in conjunction with other hypotheses or background assumptions. These auxiliary hypotheses range from ceteris paribus clauses (i.e., all other things being equal) to various assumptions regarding the research design and the instruments being used, the accuracy of the measurements, the validity of the operationalizations of the theoretical terms linked in the main hypothesis, the implications of previous theories and so on. Consequently, it is impossible to test a theoretical hypothesis in isolation. In other words, the antecedent clause in the first premise of the modus tollens is not a theory (T) but actually a bundle consisting of the theory and various auxiliary hypotheses (T, AH_1, …, AH_n). For this reason, falsification is necessarily ambiguous. That is, it cannot be ascertained from a single test if the hypothesis under test or one or more of the auxiliary hypotheses should bear the burden of falsification (see Duhem, 1954, p. 187; also Strevens, 2001, p. 516). Likewise, Lakatos maintained that absolute falsification is impossible, because in the face of a failed prediction, the target of the modus tollens can always be shifted towards the auxiliary hypotheses and away from the theory (1978, p. 18-19; see also Popper, 2002b, p. 20).
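For anyone who finds the logic easier to read in symbols, the argument in this passage can be written out as follows. This is my own rendering rather than a formula from the preprint, and O is a symbol I am introducing for the predicted observation:

```latex
\[
\big[(T \land AH_1 \land \dots \land AH_n) \rightarrow O\big],\quad \neg O
\;\;\vdash\;\;
\neg\,(T \land AH_1 \land \dots \land AH_n)
\;\equiv\;
\neg T \,\lor\, \neg AH_1 \,\lor\, \dots \,\lor\, \neg AH_n
\]
```

A failed prediction only tells us that at least one member of the bundle is false, not which one, which is why a single failed replication cannot by itself falsify T.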

Popper considered auxiliary hypotheses to be unimportant background assumptions that researchers had to demarcate from the theory being tested by designing a good methodology. But this is hard to do in the social sciences (my experience suggests this is probably true in many areas of biology as well):

In the social and behavioral sciences, relegating AHs to unproblematic background assumptions is particularly difficult, and consequently the implications of the DQT are particularly relevant and crucial (Meehl, 1978; 1990). For several reasons we need to presume that AHs nearly always enter the test along with the main theoretical hypothesis (Meehl, 1990). Firstly, in the social and behavioral sciences the theories are so loosely organized that they do not say much about how the measurements should be (Folger, 1989; Meehl, 1978). Secondly, AHs are seldom independently testable (Meehl, 1978) and, consequently, usually no particular operationalization qualitatively stands out. Besides, in these disciplines, theoretical terms are often necessarily vague (Qizilbash, 2003), and researchers have a lesser degree of control on the environment of inquiry, so hypothesized relationships can be expected to be spatiotemporally less reliable (Leonelli, 2018). Moreover, in the absence of a strong theory of measurement that is informed by the dominant paradigm of the given scientific discipline (Muthukrishna & Henrich, 2019), the selection of AHs is usually guided by the assumptions of the very theory that is put into test. Consequently, each contending approach develops its own measurement devices regarding the same phenomenon, heeding to their own theoretical postulations. Attesting to the threat this situation poses for the validity of scientific inferences, it has recently been shown that the differences in research teams’ preferences of basic design elements drastically influence the effects observed for the same theoretical hypotheses (Landy et al., 2020).

The proposed Systematic Replications Framework (also depicted in Fig. 2):

SRF consists of a systematically organized series of replications that function collectively as a single research line. The basic idea is to bring close and conceptual replications together in order to weight the effects of the AH_pre and AH_out sets on the findings. SRF starts with a close replication, which is followed by a series of conceptual replications in which the operationalization of one theoretical variable at a time is varied while keeping that of the other constant and then repeats the procedure for the other leg.
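As a rough illustration of how I read that procedure, here is a small Python sketch that enumerates the studies in such a series. It assumes the two "legs" are the operationalizations of the predictor and of the outcome variable; the function name and the example operationalizations are hypothetical and are not taken from the preprint or its Fig. 2.

```python
def srf_series(original_pre, original_out, alt_pre, alt_out):
    """List the studies in one replication series, as (kind, predictor, outcome).

    Starts with a close replication of the original pairing, then varies the
    predictor operationalization while holding the outcome constant, then
    varies the outcome operationalization while holding the predictor constant.
    """
    series = [("close", original_pre, original_out)]
    series += [("conceptual: vary predictor", p, original_out) for p in alt_pre]
    series += [("conceptual: vary outcome", original_pre, o) for o in alt_out]
    return series


# Hypothetical operationalizations for a media-violence study (toy example).
studies = srf_series(
    original_pre="violent video game",
    original_out="noise-blast task",
    alt_pre=["violent film clip"],
    alt_out=["hot-sauce allocation", "self-report questionnaire"],
)

for kind, predictor, outcome in studies:
    print(f"{kind}: {predictor} -> {outcome}")
```

Because only one operationalization changes between the close replication and each conceptual replication, a result that flips can be traced to that specific change rather than to the whole bundle.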

Its benefits for hypothesis testing are:

SRF reduces ambiguities implied by the DQT in original studies as well as in close and conceptual replications. Primarily, it allows for non-corroborative evidence to have differential implications for the components of the TH & AHs bundle. Thereby these components can receive blame not collectively but in terms of a weighted distribution. In cases where it is not possible to achieve this, it allows demarcating on which pairings from possible AH_pre and AH_out sets the truth-value of the TH is conditional. In all cases, the confounding effects deriving from the AHs can be relatively isolated. Lastly, SRF can enable that we approximate to an ideal test of a theoretical hypothesis within the methodological falsificationist paradigm by embedding alternative operationalizations and associated measurement approaches into a severe testing framework (see Mayo, 1997; 2018).

Besides replications, the SRF could also be useful for doing systematic literature reviews:

Another potential practical implication of SRF lies in using the same strategy of logically connecting different AH bundles in conducting and interpreting systematic literature reviews (particularly when the previous findings are mixed). Such a strategy can help researchers distinguish the effects that seem to be driven by certain AHs from the ones in which the TH is more robust to such influences. To put it differently, in a contested literature there are already numerous conceptual replications that have been conducted, and at least some of these replications rely on the same AHs in their operationalizations. Therefore, to the extent that they have overlaps in their AHs, their results can be organized in such a way that resembles a pattern of results that can be obtained with a novel research project planned according to SRF. The term “systematic” in systematic literature review already indicates that the scientific question to be investigated (i.e., the subject-matter, the problem or hypothesis), the data collection strategy (e.g., databases to be searched, inclusion criteria) as well as the method that will be used in analyzing the data (e.g., statistical tests or qualitative analyses) are standardized. However, for various reasons (e.g., to limit the inquiry to those studies that use a particular method), not every systematic literature review is conducive to figuring out whether the TH is conditional on particular AH sets. An SRF-inspired strategy of tabulating the results in a systematic literature review will also help researchers in appraising the conceptual networks of theoretical claims, theoretically relevant auxiliary assumptions and measurements. Thus, it can eventually help in appraising the verisimilitude of the TH by revealing how it is conditional on certain AHs, and can lead to the reformulation or refinement of the TH as well as guide and constrain subsequent modifications to it.

In closing:

The decade-long discussion on a replicability and confidence crisis in several disciplines of social, behavioral and life sciences (e.g., Camerer et al., 2018; OSC, 2015; Ioannidis, 2005) has identified the prioritization of the exploratory over the critical mission as one of the key causes, and led to proposals for slowing science down (Stengers, 2018), applying more caution in giving policy advice (Ijzerman et al., 2020), and inaugurating a credibility revolution (Vazire, 2020). All potential contributions of SRF will be part of a strategy to prioritize science’s critical mission on the way towards more credible research in social, behavioral, and life sciences. This would imply that the scientific community focuses less on producing huge numbers of novel hypotheses with little corroboration and more on having a lesser number of severely tested theoretical claims. Successful implementation of SRF also requires openness and transparency regarding both positive and negative results of original and replication studies (Nosek et al., 2015) and demands increased research collaboration (Landy et al., 2020). Ideally, this would also take the form of adversarial collaboration.

@surya re. the adversarial collaborations. It’s discussed in more detail in its own section.

I’d be interested to hear from some people involved in current psychology replication projects about their thoughts on using conceptual replications to test auxiliary hypotheses vs. just using close/direct replications.

This paper also reminded me of the old concept of strong inference, which also focuses on testing a variety of alternative hypotheses in a given study.

On reflection, I realized that this statement combined with the idea of auxiliary hypotheses reminded me a lot of Technology Readiness Levels, a framework used to assess progress from the observation of a phenomenon to its use in a mature design that is deployed in a real-world system (originally developed for use in aerospace design by NASA, although I believe it is now being used more broadly by the European Commission to assess all types of innovation). I think that this write-up by Ben Reinhardt is quite a helpful introduction: Technology Readiness Levels

How does this relate? Well, testing a hypothesis in the lab vs. using the idea reliably in the real world requires a greater understanding of the phenomena being used, and this could be considered as refining the set of auxiliary hypotheses that determine when the core hypothesis can still be observed in conditions of decreasing experimental control (although I’ve never seen it framed like this). I wonder if refining auxiliary hypotheses could be a useful framing for applied research in academia and am now quite motivated to read more about Lakatos’s model of research programs.

This month’s article is from the authors of the preprint Metascience as a scientific social movement, which critically reflects on the structure of the current scientific reform movement (we previously discussed it here). Peterson and Panofsky now put forward Arguments against efficiency in science (not OA, but preprinted), which is a short response to Hallonsten’s Stop evaluating science: A historical-sociological argument.

Peterson and Panofsky note:

The arguments of the proponents of evaluations and metascientific reform are firmly rooted in the values of liberal society: transparency, accountability, and productivity. Counterarguments are easily cast as defensiveness or obscurantism. This is not just an academic problem. Scientists we interviewed told us that they felt constrained expressing their skepticism of reforms because, while reformers can draw on popular rhetoric of how science should operate, critics must wade into the murky waters of real scientific practice. … Ultimately, our goal is not to suggest that the concept of efficiency has no place in science but, rather, that efficiency is only one value in a cluster of values that includes utility, significance, elegance and, even, sustainability and justice. That efficiency is the easiest to articulate because it accords with other dominant bureaucratic and economic values should not allow it to win policy discussions by default. The fact that the argument against efficiency is challenging makes it all the more pressing to make it.

I am inclined to agree. I just skimmed Stop Evaluating Science: it is interesting, but draws a rather nebulous argument that incorporates discussions on the economization, distrust, democratization and commodification of science to ultimately end in a rhetorical argument against scientific evaluation: Questions like ‘has science been productive enough?’ and ‘how can it be proven that science has been productive enough?’ shall first and foremost be answered with a rhetorical question, namely, ‘how else do you suppose that we have achieved this level of wealth and technical standard in Europe and North America?’. But as Hallonsten then notes:

In spite of the overwhelming logic of this rhetorical counter-question, and the historical evidence that supports it, champions of the view that science is insufficiently productive and must be made productive and held accountable through limitations to its self-governance and the use of quantitative performance appraisals will demand evidence that they can comprehend and, preferably, compare with their own simple and straightforward numbers. A list of counter-examples will therefore probably not suffice, since it can be discarded as mere ‘anecdotal evidence’ against which also the shallowest and most oversimplified statistics usually win.

Peterson and Panofsky present a brief argument against efficiency based on two key points. Firstly, efficiency shouldn’t be equated to scientific progress because we don’t agree on what progress is:

Our inability to chart basic scientific progress undermines the ability to measure efficiency. The notion of efficiency only makes sense in the context of established means/ends relationships. The goal is to organize the means in the optimal way to achieve the desired end. The problem is that, in the area of basic science, the end is unknown. … There is little agreement among the scientists themselves about what constitutes a significant contribution. There is reason to believe this dissensus is not a mere technical deficiency, but is a constitutive feature of the cutting edge of science (Cole, 1992: 18). Rather than clarity, these accounts underscore the complexity of conceptualizing progress in science.

Secondly, incentivizing efficiency may have counterproductive outcomes, compared to which existing inefficient practices are preferable. Incentives may be particularly difficult to apply in academic environments as:

scientific cultures are not Lego sets that can be broken down and rebuilt anew. They have organically evolved their own systems of communication and evaluation. They interpret broadly accepted, but abstract, values like skepticism, verification, and transparency in ways sensible to their particular contexts. Applying blanket rules to maximize efficiency in such systems can lead to unintended and, even, counterproductive outcomes.

The mistaken assumption of trying to make science more efficient stems from misinterpreting scientists as nothing more than value-maximizing, incentive-driven agents. Reformers in science have adopted economic language and, in so doing, have treated scientists as actors primarily motivated by material rewards (e.g., Harris, 2017; Nosek et al., 2012). This can be compared to a Mertonian account which would view them motivated by the interlocking system of scientific norms. Under an economic account, the best way to change behavior in science is to alter the incentive structure to reward or punish specific behaviors. Rational scientists will then react to those incentives and outcomes can be ensured.

The problem with incentive-based legislation has been detailed in a recent book by economist Samuel Bowles (2016). He argues that trying to engineer social systems by treating actors as thoroughly self-interested and incentive-driven ignores the useful role that preexisting cultural values play. In the reformer’s mind, newly introduced incentives and existing preferences are ‘additively separable’ from existing values. That is, if actors already value a behavior, then adding an incentive can only have a positive, cumulative effect. Yet, this need not be the case. Bowles details laboratory and field studies that show how the introduction of incentives can reduce or even reverse existing values.

I’m quite partial to the second point as I feel that grassroots cultural change in academia is more likely to lead to beneficial scientific reform than the use of top-down incentives and rules. Still, data on the effectiveness of institutional policies at promoting Open Science practices should be starting to become available, so this point may prove easier to resolve than the first.

I’m in favour of promoting Open Science, but I do think this paper was a thought-provoking critique that provided:

the beginnings of a counterargument, so that any reform dressed in the language of efficiency must address what it means by efficiency and how it might impinge on other values. Science reform should be a slow, reversable process with input from funders, institutions, those who study science, and, most importantly, the scientists themselves. And although defensiveness and obfuscation are enemies of science, resistance to reforms may have reasonable roots.

I was discussing Peterson and Panofsky’s paper with somebody who thought it didn’t clearly articulate what scientific efficiency meant from a metascience perspective, as it simply stated:

Metascientific activists have conceptualized efficiency in terms of improving the proportion of replicable claims to nonreplicable claims in the literature (e.g., Ioannidis, 2012).

Which was set against the status quo process for scientific progress:

a biologist at MIT who contrasted these organized replication efforts with what he viewed as the current ‘Darwinian process […] which progressively sifts out the findings which are not replicable and not extended by others’. Under this alternative theory of scientific efficiency, there is a natural process in which researchers produce many claims. Some may be flat wrong. Some may be right, yet hard to reproduce, or only narrowly correct and, therefore, be of limited use. However, some provide robust and exciting grounds to build upon and these become the shoulders on which future generations stand (Peterson and Panofsky, 2021)

But one point that came up is that surely reproducibility, whether it comes from directed efforts or natural selection, isn’t enough to ensure efficient scientific progress if you aren’t testing hypotheses that will lead to useful theoretical and/or practical progress in the first place. (Note that the paper’s first point is essentially that we don’t know what progress is in basic science; see my post above.)

This reminded me of the original 2009 article about avoidable research waste, which proposed four stages of research waste: 1) irrelevant questions, 2) inappropriate design and methods, 3) inaccessible or incomplete publications, 4) biased or unusable reports (inefficient research regulation and management was later inserted at position 3). This paper is known for estimating that 85% of investment in biomedical research is wasted, but this only takes into account losses at stages 2, 3, and 4. It is these three stages that are then addressed by the two efficiency-promoting manifestos cited by Peterson and Panofsky (Ioannidis et al. 2015 and Munafò et al. 2017) under the themes of improved Methods, Reporting and Dissemination, Reproducibility and Evaluation, all of which are supported by Incentives. Figure 1 of the latter manifesto does show Generate and specify hypothesis in a circular diagram of the scientific method, but in the context of scientific reproducibility, the discussion focuses on the risks that uncontrolled cognitive biases pose to hypothesising:

a major challenge for scientists is to be open to new and important insights while simultaneously avoiding being misled by our tendency to see structure in randomness. The combination of apophenia (the tendency to see patterns in random data), confirmation bias (the tendency to focus on evidence that is in line with our expectations or favoured explanation) and hindsight bias (the tendency to see an event as having been predictable only after it has occurred) can easily lead us to false conclusions.

Besides the metascience manifestos above, a 2014 Lancet series on increasing value and reducing waste in biomedical research also provided recommendations to address each stage of research waste. The first article in the series considered the problem of choosing what to research but primarily set this out as a challenge for funders and regulators when setting research priorities. While some suggestions are made that could be useful for researchers doing clinical, applied or even use-inspired studies (namely, consider the potential user’s needs), the most broadly applicable advice for individual researchers seems to be using systematic and metareviews to ensure that existing knowledge is recognized and then used to justify additional work.

I feel that the question of what to research (particularly in basic research and for the individual researcher) has been neglected by metascientific reformers and their current focus on improving replicability. Don’t get me wrong, replicability is important as producing unreplicable results from testing innovative hypotheses doesn’t mean much, but I think the two aspects of efficient science need to move forward together.

Refreshingly, a recent article that introduced the Society of Open, Reliable, and Transparent Ecology and Evolutionary biology notes that promoting good theory development is an outstanding question for meta-research and provides a reference to the beguilingly titled paper Why Hypothesis Testers Should Spend Less Time Testing Hypotheses. I’ve yet to look at this last paper and its citations in detail, but I still wonder if I’ve missed something. Has work in metascience really not looked into problem selection as much as the other stages of research waste? Or is this being addressed using a different terminology or by a different field? Or do we continue to rely on researchers developing the tacit skill of selecting good research questions during their training?

I discovered this month’s preprint Empowering early career researchers to improve science in a session of the same name at Metascience 2021 (recording). I was particularly impressed with the diverse range of perspectives (both in terms of early career researcher (ECR) demographics and types of initiatives) that the organizers drew together with an asynchronous unconference (which they reflect on here).

Besides categorizing reform efforts into ‘(1) publishing, (2) reproducibility, (3) public involvement and science communication, (4) diversity and global perspectives, (5) training and working conditions for early career researchers (ECRs), and (6) rewards and incentives’ (section 1), the preprint also discusses seven reasons why ECRs need to be involved in scientific improvements (section 2, briefly summarized):

  1. ECRs are future research leaders and should be able to shape the future of the research enterprise,
  2. ECRs are a more diverse group than senior scientists,
  3. ECRs may be more open to new ideas and less ‘set in their ways’ than senior scientists,
  4. ECRs may still be motivated by unchallenged idealism (i.e. they haven’t yet become jaded and career-driven PIs),
  5. ECRs are at the forefront of technical advances and tool development,
  6. Some ECRs have more time and energy to commit to reform initiatives,
  7. ECRs represent the majority of the scientific workforce.

And six obstacles faced by ECRs involved in efforts to improve research culture (section 3, also summarized):

  1. Scientific improvement initiatives are rarely rewarded or incentivized and aren’t seen as a priority for career progression: ‘This contributes to an endless cycle, where ECRs who work to improve science are pushed out before securing faculty positions or leadership roles where they gain more ability to implement systemic changes.’
  2. Limited funding and resources are available for ECRs (isn’t this true for everybody?) working on scientific reform,
  3. ECRs are generally excluded from decision making in existing institutions,
  4. ECRs often can’t set aside time for scientific improvement work, their positions and career stages are unstable, and their supervisors may see scientific improvement as a diversion from their main research projects,
  5. ECRs are perceived as lacking the required experience to improve science,
  6. The western scientific culture creates added challenges for ECRs who are from marginalized groups.

Yet support from other stakeholders and the use of good practices can help ECRs overcome these obstacles! A list of suggested actions that institutions (e.g. universities, funders, publishers, academic societies, and peer communities) and senior individuals (e.g. supervisors) can take to support ECRs who are working on reform projects is provided in section 4, and the paper ends with a list of lessons learned by ECRs who have previously worked on improving science (section 6, subsection headings):

  • Know what has been done before
  • Start with a feasible goal
  • Collaborate wisely
  • Work towards equity, diversity, and inclusion
  • Build a positive and inclusive team dynamic
  • Anticipate concerns or resistance to change
  • Be persistent
  • Plan for sustainability

Additional (and more detailed) tips are also included in the document ‘Tips and tricks for ECRs organizing initiatives’ on OSF.

Section 5 discusses additional obstacles faced by ECRs in countries with limited funding and mentions: ‘In some countries, postdocs and occasionally PIs lack institutional affiliations, which can be an added challenge when trying to initiate systemic change.’

IGDORE actually provides free institutional affiliation to researchers and researchers in training. IGDORE currently has affiliates on all continents and we’d be very happy to provide institutional support services for ECRs working on scientific reform (or field-specific research for that matter) who don’t have access to an institutional affiliation in their own country. (The Ronin Institute is another option for getting an institutional affiliation.) Additionally, this very forum (On Science & Academia) can also provide a place for ECRs who want to have more nuanced discussions about scientific reform than is possible/probable on Twitter or Slack (we can make public/private categories for specific initiatives, just message a forum moderator or admin about your group’s needs). Readers should feel free to point ECRs towards IGDORE and this forum if they think one or both could assist them.

To conclude:

ECRs are important stakeholders working to catalyse systemic change in research practice and culture. The examples presented reveal that ECRs have already made remarkable progress. Future efforts should focus on incentivizing and rewarding systemic efforts to improve science culture and practice. This includes providing protected time for individuals working in these areas and amplifying ECR voices and meaningfully incorporating them into decision-making structures. ECRs working on improving science in communities or countries with limited research funding should be supported by organizations with access to greater resources to improve science for all. We hope that the tools, lessons learned, and resources developed during this event will enhance efforts spearheaded by ECRs around the world, while prompting organizations and individuals to take action to support ECRs working to improve science.

What about the OS&A forum readers? A few of you are involved in scientific reform initiatives (perhaps more so outside of traditional academia than inside of it). What resonates here? What would you add?

As a Global Board member at IGDORE I’ve personally noticed that having well-thought-out plans and goals is important, as is developing a strong, diverse team (with respect to volunteers) and planning for leadership sustainability (tbh we’re still working to improve all of these points). Persistence is critical even in the most basic things like building up newsletter readership :rofl: Interestingly, global/language diversity of affiliations seems to have come relatively easily to IGDORE, although this may be specific to our particular project. An additional point that I think has been useful for IGDORE, but that I didn’t see mentioned prominently in the preprint, is forming cooperations/collaborations between organisations/initiatives working on similar or allied projects - IGDORE has several organisational collaborations and I’ve found engaging with these other organisations has also helped my work at IGDORE.


Nice find! Will check this out.
