30 Replication design
Once planning is complete, you need to design and implement the replication. For this subject, that includes building a Qualtrics survey.
Implementation involves many decisions:
- Who will participate? Will they be in-person or online? Will you seek a sample representative of a certain population?
- Will you include attention checks to ensure participants are engaged? (A data-screening sketch follows this list.)
- What wording will you use for instructions and questions?
- What experimental artefacts are required (e.g. a visual display, a scenario vignette)?
- What departures from the original design are required because of the digital platform?
- How will you fill gaps when the original paper is unclear?
- What demographic data will you collect? How will you manage identifiability and privacy?
- In what order will participants experience the stages of the experiment and any demographic questions?
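Several of these decisions, such as attention checks and privacy, carry over into how you screen the exported data. Below is a minimal Python sketch, assuming a standard Qualtrics CSV export; the attention-check column name, the correct answer, and the file name are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical placeholders: adjust to match your own survey.
ATTENTION_COL = "attention_check"     # item with a known correct response
CORRECT_ANSWER = "Somewhat disagree"  # the response the item asks for

# Qualtrics CSV exports typically include two extra header rows (question
# text and ImportId) beneath the column names; skiprows=[1, 2] removes them.
df = pd.read_csv("replication_export.csv", skiprows=[1, 2])

# Exclude participants who failed the attention check.
passed = df[ATTENTION_COL] == CORRECT_ANSWER
print(f"Excluding {(~passed).sum()} of {len(df)} inattentive participants.")
analysis_df = df.loc[passed].copy()

# Drop default Qualtrics metadata columns that could identify participants.
identifying = ["IPAddress", "LocationLatitude", "LocationLongitude"]
analysis_df = analysis_df.drop(columns=identifying, errors="ignore")
```

Deciding on exclusion rules like these before data collection avoids post hoc choices about who counts as engaged.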
Many decisions will mirror the original paper, but constraints or improvements may lead to deliberate deviations.
30.1 Class example
I implemented the Study 3b replication from Dietvorst et al. (2015) in Qualtrics.
As in the original study, I propose an online sample, but I will recruit Australian adults covered by the UTS Behavioural Lab blanket ethics approval. For your own replication, propose the participant pool that best fits your hypothesis.
30.1.1 Changes from the original study
I implemented several small changes relative to the original study:
- I use a participant information sheet at the start of the survey to match the requirements for ethics approval.
- The gender question follows the Australian Bureau of Statistics standard.
- I changed the education response options to match the Australian rather than the US education system.
- I changed some multiple-choice questions to display their options vertically, which improves the experience on mobile devices.
30.1.2 Extension
The original hypothesis is that seeing the algorithm err reduces willingness to use it, whereas seeing a human err does not have the same effect.
However, across ten trials it is hard for participants to notice the algorithm’s superior accuracy. I wish to test whether making the algorithm’s superiority more salient can ameliorate the effect of seeing it err. To do so, I include a single screen summarising the average performance of the algorithm and of past human participants, shown immediately before the choice for the incentivised round. The screen will read:
- The average error of the statistical model is 4.32 ranks.
- The average error of past human participants is 8.34 ranks.
A more detailed summary of performance across the ten trials might have a stronger effect, but it would be more complex to implement in Qualtrics.
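For concreteness, here is a minimal Python sketch of how such averages could be computed, assuming error is measured as the mean absolute difference between predicted and true ranks. The arrays are illustrative placeholders, not the real figures quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials = 10

# Illustrative placeholder data: true ranks of the forecasting targets and
# the ranks predicted by the model and by one past participant.
true_ranks = rng.integers(1, 101, size=n_trials)
model_ranks = np.clip(true_ranks + rng.integers(-6, 7, n_trials), 1, 100)
human_ranks = np.clip(true_ranks + rng.integers(-12, 13, n_trials), 1, 100)

# Average absolute error in ranks, the metric quoted on the summary screen.
model_error = np.abs(model_ranks - true_ranks).mean()
human_error = np.abs(human_ranks - true_ranks).mean()

print(f"The average error of the statistical model is {model_error:.2f} ranks.")
print(f"The average error of past human participants is {human_error:.2f} ranks.")
```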