23  Selecting a study for replication

There are (currently) far more candidate papers for replication than there are resources to replicate them. How, then, should we select which paper to replicate?

Some possible criteria (largely drawn from Association for Psychological Science (2026)) are as follows:

The impact of the study: Replication of a highly influential study can be useful as it is relevant to a broad literature and may shape other studies. We might base our estimate of the impact on evidence such as the number of citations or the degree to which the study forms the foundation for subsequent work.

The methodological soundness of the study: If the study lacks methodological soundness, a replication subject to the same flaws may be of little value. For example, if the experimental design fails to provide an effective control (due, say, to spillover), a replication with the same point of failure may not yield any benefit. A new experimental design may be of more value.

The ambiguity of the interpretation of the result: Most experiments are designed to test a hypothesis. However, some experiments are open to multiple interpretations and do not rule out explanations other than the proposed hypothesis. In that case, it should be asked whether a better experimental design might be more useful than a replication of the original experiment.

The theoretical basis of the study: A study that establishes the foundation for a theoretical position or forces a reconsideration of an important theory may be a useful target for replication. Theoretical support could also come from a more precise estimate of the size of the effect.

The level of corroboration: Replication of a study already subject to published replications may be less valuable unless those replications have yielded inconsistent results. Consideration should be given to whether an additional replication would help reduce uncertainty.
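
Whether a further replication would meaningfully reduce uncertainty can be gauged with a simple fixed-effect (inverse-variance) pooling calculation. Below is a minimal sketch in Python; the standard errors are made up for illustration, so substitute the values reported in the original study and its published replications.

```python
import math

def pooled_se(standard_errors):
    """Standard error of a fixed-effect (inverse-variance weighted) pooled estimate."""
    weights = [1 / se ** 2 for se in standard_errors]
    return math.sqrt(1 / sum(weights))

# Hypothetical standard errors: the original study plus two published replications.
existing = [0.20, 0.15, 0.18]
print(f"Pooled SE from existing studies:     {pooled_se(existing):.3f}")

# How much would one further replication (assumed SE = 0.15) tighten the estimate?
print(f"Pooled SE with one more replication: {pooled_se(existing + [0.15]):.3f}")
```

If the reduction in the pooled standard error is marginal, an additional replication of the same design may add little.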

Uncertainty about effect size: If there is reason for substantial uncertainty about the size of the effect, replication may help provide a more precise estimate. This is particularly the case where the original study has a small sample size, which tends to exaggerate the effect size in any significant finding.
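
To see why small samples exaggerate significant effects, here is a minimal simulation sketch in Python. The true effect of 0.2 standard deviations and the sample size of 20 per group are assumptions chosen purely for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_effect = 0.2   # true standardised mean difference (assumed for illustration)
n_per_group = 20    # small per-group sample size
n_sims = 10_000

significant_effects = []
for _ in range(n_sims):
    treatment = rng.normal(true_effect, 1, n_per_group)
    control = rng.normal(0, 1, n_per_group)
    t_stat, p_value = stats.ttest_ind(treatment, control)
    if p_value < 0.05 and t_stat > 0:
        # Record the observed effect only when the simulated study is "significant".
        significant_effects.append(treatment.mean() - control.mean())

print(f"True effect:                        {true_effect}")
print(f"Share of simulations significant:   {len(significant_effects) / n_sims:.2f}")
print(f"Mean effect among significant runs: {np.mean(significant_effects):.2f}")
```

In this configuration the simulations that reach significance overstate the true effect several-fold, which is why a well-powered replication of an underpowered study can be expected to find a smaller effect.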

Personal interest: Although less relevant for publication, you may have a personal interest in the study. Replication can help you better understand the study and phenomena.

Randomness: There may be a case for random selection of studies to give a better indication of the robustness of a discipline than selection under specific criteria.

23.0.1 Tips on searching for replications and other information on the target

  • Search Google Scholar for the target article, click “Cited by”, and then use “Search within citing articles” to search for the phrase “replicat*” (replication, replicated, replicating, etc.). If the article you’re replicating isn’t the first instance of the phenomenon, you may also want to browse “Related articles”. A programmatic alternative is sketched after this list.
  • “Search within citing articles” for the phrase “review” and/or “meta analysis” to see whether there are recent reviews or meta-analyses of, or related to, the phenomenon.
  • “Search within citing articles” for phrases such as “mixed findings”, “boundary conditions”, and/or “debate” to see whether there are any debates regarding the existence and/or constraints on the generality of the phenomenon.
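
Google Scholar has no official API, so the searches above are easiest to do by hand. If you want a programmatic starting point, the sketch below uses the Semantic Scholar Graph API instead (an assumption on my part that its coverage of the citing literature is adequate for your target) to pull citing papers and flag those whose title or abstract mentions replication.

```python
import requests

BASE = "https://api.semanticscholar.org/graph/v1"

# Find the target article by title (the class example below is used here);
# this assumes the first search hit is the paper you want.
search = requests.get(
    f"{BASE}/paper/search",
    params={
        "query": "Algorithm aversion: people erroneously avoid algorithms after seeing them err",
        "fields": "title,paperId",
        "limit": 1,
    },
).json()
paper_id = search["data"][0]["paperId"]

# Pull citing papers and flag any whose title or abstract mentions replication.
citations = requests.get(
    f"{BASE}/paper/{paper_id}/citations",
    params={"fields": "title,abstract,year", "limit": 100},
).json()

for item in citations.get("data", []):
    citing = item["citingPaper"]
    text = f"{citing.get('title') or ''} {citing.get('abstract') or ''}".lower()
    if "replicat" in text:
        print(citing.get("year"), citing.get("title"))
```

The same loop can be rerun with terms such as “meta-analysis” or “boundary conditions” to cover the other searches in the list above.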

23.1 Class example

For this subject, we will prepare a replication of Study 3b from Dietvorst, Simmons and Massey’s (2015) article Algorithm Aversion: People Erroneously Avoid Algorithms After Seeing Them Err. I have chosen this article because:

  • I am interested in the topic and want to run some experiments of a similar nature in field settings. Running the experiment will build my experience in the area.
  • This study is a seminal paper in a rapidly growing research area on how humans use algorithms in their judgment. Many subsequent papers have explored how to reduce algorithm aversion. As of 23 February 2023, Google Scholar identifies 1359 citations.
  • It is an important area beyond academia. How humans use algorithmic inputs (e.g. ChatGPT) will have major implications for human decision making.

Less favourably, I know of at least one existing replication of Study 3b and an extension of the paper by Jung and Seiter (2021). The task in Study 3b was also used in Logg et al. (2019).

Many papers, including Dietvorst et al. (2015), contain more than one study (experiment). A registered replication might replicate one, some or all of the studies in a paper.