This is what happened when psychologists tried to replicate 100 previously published findings

While 97 per cent of the original results showed a statistically significant effect, this was reproduced in only 36 per cent of the replications.

27 August 2015

By Christian Jarrett

After some high-profile and at times acrimonious failures to replicate past landmark findings, psychology as a discipline and as a scientific community has led the way in trying to find out why some scientific findings reproduce and others don't, including instituting reporting practices to improve the reliability of future results. Much of this endeavour is thanks to the Center for Open Science, co-founded by the University of Virginia psychologist Brian Nosek.

Today, the Center has published its latest large-scale project: an attempt by 270 psychologists to replicate findings from 100 psychology studies published in 2008 in three prestigious journals that cover cognitive and social psychology: Psychological Science, the Journal of Personality and Social Psychology, and the Journal of Experimental Psychology: Learning, Memory and Cognition.

The Reproducibility Project is designed to estimate the "reproducibility" of psychological findings and complements the Many Labs Replication Project, which published its initial results last year. The new effort aimed to replicate many different prior results to try to establish the distinguishing features of replicable versus unreliable findings: in this sense, it was broad and shallow, looking for general rules that apply across the fields studied. By contrast, the Many Labs Project involved many different teams all attempting to replicate a smaller number of past findings – in that sense it was narrow and deep, providing more detailed insights into specific psychological phenomena.

The headline result from the new Reproducibility Project report is that whereas 97 per cent of the original results showed a statistically significant effect, this was reproduced in only 36 per cent of the replication attempts. Some replications found the opposite effect to the one they were trying to recreate. This is despite the fact that the Project went to incredible lengths to make the replication attempts true to the original studies, including consulting with the original authors.

Just because a finding doesn't replicate doesn't mean the original result was false – there are many possible reasons for a replication failure, including unknown or unavoidable deviations from the original methodology. Overall, however, the results of the Project are likely indicative of the biases that researchers and journals show towards producing and publishing positive findings. For example, a survey published a few years ago revealed the questionable practices many researchers use to achieve positive results, and it's well known that journals are less likely to publish negative results.

The Project found that studies that originally reported weaker or more surprising results were less likely to replicate. In contrast, the expertise of the original research team and of the replication team was not related to the chances of replication success. Meanwhile, social psychology replications were less than half as likely to achieve a significant finding as cognitive psychology replications, but both fields showed the same average decline in effect size from original study to replication attempt, to less than half (cognitive psychology studies started out with larger effects, which is why more of the replications in that area retained statistical significance).
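To make the two headline measures concrete – whether a replication reaches statistical significance, and how much the effect shrinks – here is a minimal illustrative sketch in Python. It is not the Project's actual analysis code; it simply assumes both effects are expressed as correlation coefficients (the common metric the Project used) and uses a Fisher z-test for the replication. The numbers at the bottom are hypothetical.

```python
import math
from scipy.stats import norm

def replication_summary(r_orig, n_orig, r_rep, n_rep):
    """Compare an original and a replication effect, both expressed as
    correlation coefficients. Returns the replication's two-tailed p-value,
    whether it is significant at p < .05, and the ratio of effect sizes."""
    # Fisher z-transform of the replication effect and its standard error
    z_rep = 0.5 * math.log((1 + r_rep) / (1 - r_rep))
    se_rep = 1 / math.sqrt(n_rep - 3)
    # Two-tailed p-value for the replication effect against zero
    p_rep = 2 * (1 - norm.cdf(abs(z_rep) / se_rep))
    return {
        "replication_p": p_rep,
        "significant": p_rep < 0.05,
        "effect_size_ratio": r_rep / r_orig if r_orig else float("nan"),
    }

# Hypothetical example: an original r = .40 (n = 50) that shrinks to r = .18 (n = 120)
print(replication_summary(0.40, 50, 0.18, 120))
```

On numbers like these, the replication can come out statistically significant even though the effect size has fallen to less than half of the original – which is why the Project reported significance rates and effect-size declines separately.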

Among the studies that failed to replicate was research on loneliness increasing supernatural beliefs; conceptual fluency increasing a preference for concrete descriptions (e.g. if I prime you with the name of a city, that increases your conceptual fluency for the city, which supposedly makes you prefer concrete descriptions of that city); and research on links between people's racial prejudice and their response times to pictures showing people from different ethnic groups alongside guns. A full list of the findings that the researchers attempted to replicate can be found on the Reproducibility Project website (as can all the data and replication analyses).

This may sound like a disappointing day for psychology, but in fact the opposite is true. Through the Reproducibility Project, psychology and psychologists are blazing a trail, helping to shed light on a problem that afflicts all of science, not just psychology. The Project, which was backed by the Association for Psychological Science (publisher of the journal Psychological Science), is a model of constructive collaboration, showing how original authors and the authors of replication attempts can work together to further their field. Indeed, some investigators on the Project were in the position of being both an original author and a replication researcher.

"The present results suggest there is room to improve reproducibility in psychology," the authors of the Reproducibility Project concluded. But they added: "Any temptation to interpret these results as a defeat for psychology, or science more generally, must contend with the fact that this project demonstrates science behaving as it should" – that is, being constantly sceptical of its own explanatory claims and striving for improvement. "This isn't a pessimistic story", added Brian Nosek in a press conference for the new results. "The project shows science demonstrating an essential quality, self-correction – a community of researchers volunteered their time to contribute to a large project for which they would receive little individual credit."

Further reading

Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.