21 What is pre-registration?
Pre-registration is a commitment to a specific study plan that documents the hypotheses being tested, the data collection methodology and how the analysis will be carried out. Plans are typically shared publicly on platforms such as the AEA RCT Registry, although they can also be held in locked private repositories.
21.1 Why pre-register?
Pre-registration helps preserve the validity of statistical analysis. Suppose we test an intervention that truly has no effect. Natural variation means that some draws of the data will appear consistent with our hypothesised effect. The p-value is the probability, under the null hypothesis of no effect, of observing data at least as extreme as the data actually observed. Researchers conventionally treat p < 0.05 as “significant”, which still allows a false positive about 5% of the time when there is no true effect.
A problem emerges when researchers try many analytical approaches (excluding data points, changing tests, altering outcomes of interest) and report the one that “works”. Each additional approach is another opportunity for a “significant” result to arise by chance. The nominal p-value is then no longer the probability of such data under the null hypothesis alone: it is valid only for a single analysis chosen in advance, not for the best result from a sequence of analytical choices.
Pre-registration constrains researchers from making these after-the-fact decisions. Stating the planned analysis up front prevents post hoc choices that inflate the false-positive rate and allows us to take the p-value at face value.
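The inflation described above can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions, not a description of any particular study: the intervention has no effect, each of three hypothetical outcome measures is independent with unit variance, and the "flexible" researcher reports whichever outcome gives the smallest p-value. The function names `two_sided_p` and `simulate` and all parameter values are invented for this example.

```python
import math
import random


def two_sided_p(treated, control):
    """Two-sided p-value from a z-test on the difference in means.

    Assumes each observation has known unit variance, which holds for
    the simulated data below.
    """
    n_t, n_c = len(treated), len(control)
    diff = sum(treated) / n_t - sum(control) / n_c
    se = math.sqrt(1 / n_t + 1 / n_c)
    z = diff / se
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_sims=4000, n_per_group=30, n_outcomes=3, alpha=0.05, seed=1):
    """Return (pre-registered, flexible) false-positive rates under the null."""
    rng = random.Random(seed)
    prereg_hits = 0
    flexible_hits = 0
    for _ in range(n_sims):
        p_values = []
        for _ in range(n_outcomes):
            # No true effect: both groups come from the same distribution.
            treated = [rng.gauss(0, 1) for _ in range(n_per_group)]
            control = [rng.gauss(0, 1) for _ in range(n_per_group)]
            p_values.append(two_sided_p(treated, control))
        # Pre-registered: only the first outcome, named in advance, counts.
        prereg_hits += p_values[0] < alpha
        # Flexible: any outcome that reaches significance is reported.
        flexible_hits += min(p_values) < alpha
    return prereg_hits / n_sims, flexible_hits / n_sims


if __name__ == "__main__":
    prereg, flexible = simulate()
    print(f"pre-registered false-positive rate: {prereg:.3f}")
    print(f"flexible false-positive rate:       {flexible:.3f}")
```

With three independent outcomes, the flexible rate should land near 1 − 0.95³ ≈ 0.14 rather than the nominal 0.05. Real analytical choices are usually correlated, so the inflation in practice may be smaller or larger than this stylised case suggests.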
21.1.1 Publication bias
To take a p-value at face value, we also need visibility over the full universe of experiments. If 20 researchers independently test an intervention that has no true effect, each using the 5% threshold, we expect one of them to find a significant result by chance alone. If only that result is published, we may incorrectly believe there is an effect.
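The arithmetic behind the 20-researcher example can be sketched as follows, assuming the studies are independent and each uses a 0.05 significance threshold. The function names are hypothetical and chosen only for this illustration.

```python
def expected_false_positives(n_studies, alpha=0.05):
    """Expected number of significant results when every null is true."""
    return n_studies * alpha


def prob_at_least_one_false_positive(n_studies, alpha=0.05):
    """Chance that at least one of n independent null studies is significant."""
    return 1 - (1 - alpha) ** n_studies


if __name__ == "__main__":
    n = 20
    print(f"expected false positives: {expected_false_positives(n):.2f}")
    print(f"P(at least one):          {prob_at_least_one_false_positive(n):.3f}")
```

For 20 studies the expected count is one false positive, and the probability of at least one is 1 − 0.95²⁰, or roughly 0.64. A lone significant result is therefore weak evidence unless we know how many similar studies were run.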
Pre-registration allows researchers to see the full spectrum of work that has been conducted, even when studies remain unpublished. Knowing the size of the “file drawer” helps us understand whether the published results represent all the research. Ideally, all studies would be published regardless of novelty, effect size or statistical significance; pre-registration and later lodgement of results move us closer to that goal.
21.2 Examples
21.2.1 Example 1
In 2019, students in Leif Nelson and Don Moore’s PhD seminar (Counts, 2019) selected papers on how scarcity shifts attention and impedes cognitive function for replication. They lodged their pre-analysis plans on OSF, the platform developed by the Center for Open Science.
The plan for one of the papers is available here. It is a simple plan for a simple experiment but illustrates the required components. (They also used AsPredicted for the pre-registration.) The final publication that emerged from this replication exercise is O’Donnell et al. (2021).
21.2.2 Example 2
Kristal et al. (2020) conducted a high-powered replication of the honesty priming experiment reported in Shu et al. (2012). The experiment was pre-registered on OSF, providing the full plan for the replication.