
Psychology’s algorithmic echo chamber

Ciaran Smith on how search engines may be hindering research.

23 January 2025

Few of us are immune from feelings of time pressure. Clocks are on our walls, computers, home appliances, everywhere. Time is a resource, spoken of as a currency – to spend, invest, or waste. So why was I struggling to find the psychological literature around how we schedule our efforts to meet, delay, or choose between the demands of daily life?

As a psychology student at the University of Buckingham, I was delving into theories surrounding time-based decision-making. And I was finding it maddeningly difficult to unearth relevant studies – even when I expanded my search terms from 'time' to 'temporal', 'chronological', and tangential lines of enquiry into why procrastination is prevalent. 

My repeated failure to find results led me, reluctantly, to accept that there were no published theories of time-based decision-making or temporal drive systems at all. I resolved to write one myself.

To my surprise and exasperation, after nearly a year of researching the topic and writing the theory, an existing theory finally began to appear in searches: Temporal Self-regulation Theory (TST; Hall & Fong, 2007). Interestingly, the TST – published in Health Psychology Review – only appeared after I had begun unifying Ajzen's Theory of Planned Behaviour with Rosenstock's Health Belief Model, using dual-process theories and Engel's Biopsychosocial Model as a framework. That search for published papers relating to health may have contributed – perhaps if I had stuck with time-related keywords, I'd still be looking now.

I had found the paper I sought. But the search had gone on for so long that a new concern arose: it wasn't that the published research was missing; it was that I had somehow been missing it. I began to question the role of search engine algorithms in shaping the accessibility and visibility of academic research.

A blind spot in the system?

It was not the first time I had encountered this phenomenon. Near the beginning of my studies, I was researching trait personality frameworks to investigate whether dual-process theories could model the 'Dark Tetrad'. The search engines seemed determined to feed me a steady diet of the Five Factor Model – a strong model, to be sure, but not what I was searching for. In contrast, Cognitive Experiential Self Theory (CEST; Epstein, 1994) was exactly what I was looking for, but it did not appear until I had been searching for models following Jungian principles for several months.

What seemed odd to me at the time was that I only stumbled upon CEST by chance, after research on temporal discounting for a university tutorial happened to bring up a paper published by a similarly named author (Epstein et al., 2010). Just as the time- and health-based searches seemingly combined to reveal the TST, were the algorithms deciding what I 'probably wanted' by reference to my past searches? Was I being nudged toward the mainstream, well-cited literature rather than new or niche ideas that were actually a better fit with my keywords?

Academic search facilitation systems like Google Scholar, PubMed, and JSTOR are no longer the relatively simple search-and-lookup tools their earliest incarnations were. Nowadays they are increasingly sophisticated engines designed to rapidly parse vast bodies of research, then to filter the irrelevant and distil and sort the results. We rely on them to point us to the most 'relevant' material. By some metrics, they succeed admirably. But I was starting to see that 'relevance' for an algorithm often means 'popular and heavily cited' rather than 'innovative and highly applicable'.

Others have noted this. Beel and Gipp's early unpacking of Google Scholar's ranking algorithm (2009; see also Bandara et al., 2015) found that high citation counts and specific title keywords typically dominate the first page of results. This suggests that search algorithms may serve as amplifiers of the Matthew effect (Merton, 1968), whereby well-cited material becomes even more visible while fresh or challenging work is overlooked.
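To make that concern concrete, here is a deliberately toy sketch, in Python, of a citation-weighted ranking rule. It is not Google Scholar's actual algorithm – the real ranking criteria are proprietary and far more complex – but it illustrates how even a modest weighting towards citations can push a highly relevant, lightly cited paper below a heavily cited one that barely matches the query.

    import math

    def relevance(query_terms, paper):
        # Toy keyword match: the fraction of query terms found in the title.
        title_words = paper["title"].lower().split()
        return sum(t.lower() in title_words for t in query_terms) / len(query_terms)

    def ranking_score(query_terms, paper, citation_weight=0.8):
        # Blend keyword relevance with a log-scaled citation count.
        popularity = math.log1p(paper["citations"]) / math.log1p(10_000)
        return (1 - citation_weight) * relevance(query_terms, paper) + citation_weight * popularity

    papers = [
        {"title": "temporal self-regulation theory of health behaviour", "citations": 40},
        {"title": "the five factor model of personality", "citations": 9_000},
    ]
    query = ["temporal", "self-regulation"]
    for p in sorted(papers, key=lambda p: ranking_score(query, p), reverse=True):
        print(f"{ranking_score(query, p):.2f}  {p['title']}")

With the citation weighting set this high, the toy scorer ranks the popular paper first (0.79 against 0.52) even though the other matches every query term – a miniature Matthew effect.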

Even after finding Epstein's CEST, the shadows seemed determined to reclaim it. Having coffee with a tutor, I excitedly told her I had found a relatively recent dual-process theory that followed Jungian principles. But we couldn't find it! Fortunately, I had already downloaded the paper, and when I pasted its title directly into the search bar, it did appear in the results. It seemed unlikely that it kept slipping back into obscurity for lack of merit – CEST is highly esteemed as a unified theory. It began to feel more likely that its title or keywords were no longer – or had never been – part of the academic zeitgeist that search engines reward.

Into the digital hinterlands

I began to believe that ideas which might fill critical gaps, inspire fresh lines of study, or bolster existing concepts, could be passing us by, unseen. I questioned whether search algorithms, designed to prioritise relevance, are inadvertently undermining psychological science. 

Greenberg's 2009 analysis, 'How citation distortions create unfounded authority', identified three mechanisms by which citation practices manufacture authority – citation bias, amplification, and invention. Search algorithms risk layering a further distortion on top: their prioritisation of statistically significant findings is a particularly pressing concern. Null results are at least as important as significant ones and, in some contexts, even more critical. Consider the resources – time, funding, and effort – spent repeatedly revisiting research questions that keep yielding null results, only for those findings to remain unpublished or inaccessible. A vital empirical question emerges: do search engines systematically underrepresent null results, thereby deepening the replication crisis and perpetuating biases in the accessibility and visibility of scientific knowledge?

The citation distortions Greenberg studied can create recursive feedback loops: echo chambers that amplify dominant ideas and obscure dissenting perspectives. Similarly, search engines might magnify biases in citation practices by directing users primarily toward heavily cited, statistically significant studies. Such practices not only reinforce existing paradigms but may also deter us from exploring transformative ideas buried in the digital hinterlands. Students and early-career researchers could be discouraged from pursuing genuinely insightful avenues of research if the algorithms do not reveal relevant material, despite diligent and frequent searching (Kacperski et al., 2024). We must continue to stand upon the shoulders of giants to discern the 'road less travelled', and avoid exclusively walking the easier but recursive 'road more cited'.

The amplifying artificial assistant

Artificial Intelligence (AI) increasingly features in academic search engines, and may compound these issues further. While AI, carefully and properly used, can accelerate the research process, it is also vulnerable to replicating and amplifying existing biases, simply because of the way machine learning works (Vlasceanu & Amodio, 2022). AI-driven algorithms have been shown to resurface discredited scientific theories because those theories were heavily cited in historical training data, and are still heavily promoted by those seeking to resurrect and advance them (Noble, 2018; Hao, 2023). We need transparency in algorithm design, and a rethink of how relevance is defined in the academic context.

Generative pretrained transformers (GPTs) are among the largest and most capable AI systems available to the public. The key word here is pretrained. Unless explicitly guided otherwise by careful prompts or additional lines of custom code, AI follows default assumptions until it 'learns' from its user sufficiently to modify its responses. If those defaults include a preference for statistically significant findings, they risk excluding null results, further entrenching existing biases and narrowing the scope of academic discovery.
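As a purely illustrative example of such guidance – the wording is mine, not a validated template – a researcher might preface a literature query with an instruction along these lines:

    Include null and non-significant findings alongside significant ones.
    If no published work matches my query, say so explicitly rather than
    substituting the closest well-cited match, and flag any reference you
    cannot verify against a real, retrievable source.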

This is no easily solved coding issue, either: even with recursive accuracy-assessment code added, AI still sometimes fabricates academic papers – compounding the very problem of invention that Greenberg describes in his paper.

What can be done?

Fortunately, there are ways to counter these algorithmic tendencies. An important first step, swiftly and easily implemented, would be for all academic bodies to check their websites and determine how their own search systems work. Outsourced or pre-built website designs might have such algorithms built in, causing this same issue to manifest even when searching an organisation's own database of published articles.

Additionally, researchers can experiment with more advanced search techniques, such as Boolean operators, field-specific databases, or Structured Query Language (SQL) – see the example query below – and share knowledge of these strategies within academic communities. Universities and professional bodies could provide training and resources to help academics and students navigate algorithmic biases effectively. Furthermore, thinking carefully about titles and terminology can help new work appear in top results, given the weighting towards title-based keywords (Beel & Gipp, 2009). These approaches would mitigate the impact of both search algorithm biases and the systemic issues highlighted in Greenberg's analysis of citation networks.
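As a hedged illustration – exact syntax varies by database, and this example uses PubMed-style field tags – a Boolean, field-tagged query is far more precise than loose keywords typed into a general search box:

    ("temporal"[Title/Abstract] OR "time-based"[Title/Abstract])
    AND ("decision making"[Title/Abstract] OR "self-regulation"[Title/Abstract])
    NOT "temporal lobe"[Title/Abstract]

A query like this retrieves papers according to what they are about, rather than according to what a ranking model predicts the searcher 'probably wanted'.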

However, the burden should not rest solely on researchers. Developers of academic search engines must work toward reducing algorithmic biases, ensuring that emerging and critical research is not systematically disadvantaged. Specific measures could include algorithmic weighting systems that balance citation counts with measures of intellectual diversity, explicitly promoting null results and underrepresented studies. Incorporating user-controlled filters or settings to balance findings or prioritise novelty – sketched below – could help keep diverse perspectives in view. The 'random page' feature from Wiki-style databases comes to mind. Finally, transparency in algorithm design – such as clear disclosures about ranking criteria – would allow users to mitigate biases in their search strategies.
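As a sketch of what such a setting might look like – the function and weights here are hypothetical, not any real engine's API – a user-controlled 'novelty slider' could re-rank results by blending citation-based and recency-based scores:

    from datetime import date

    def rerank(results, novelty=0.5):
        # Hypothetical re-ranking with a user-controlled 'novelty' slider:
        # 0.0 ranks purely by citations, 1.0 purely by recency.
        this_year = date.today().year
        max_cites = max(r["citations"] for r in results) or 1

        def score(r):
            popularity = r["citations"] / max_cites
            recency = max(0.0, 1 - (this_year - r["year"]) / 50)  # newer -> nearer 1
            return (1 - novelty) * popularity + novelty * recency

        return sorted(results, key=score, reverse=True)

    results = [
        {"title": "Classic, heavily cited study", "citations": 5_000, "year": 1994},
        {"title": "New null-result replication", "citations": 3, "year": 2024},
    ]
    print([r["title"] for r in rerank(results, novelty=0.9)])

With the slider near 1.0, the lightly cited 2024 replication rises to the top; near 0.0, the classic dominates – the choice sits with the user rather than with an opaque default.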

Conclusion

Algorithmic curation is undeniably helpful; few of us would want to scroll through thousands of irrelevant results. We may demand more nuance from the digital tools that shape our enquiries, but they are still tools – and even AI-powered ones are a facsimile of intelligence, not actually intelligent. That said, they are powerful tools offering an alluring path of ease, and time pressure is the great constant (despite so few theories on it emerging from the algorithmic Twilight Zone). 

And therein lies the hidden danger. You're trying to save time, and a system working behind the scenes to criteria you're not aware of is helping you find what it thinks you want. By investigating these potential hidden risks and disseminating the results, the path of ease offered by the quick search becomes less tempting, and the extra time spent on advanced searches seems more worth it. We protect and strengthen our research, and thereby continue to improve our access to, and benefit from, the academic literature. 

After all, the next groundbreaking theories may already exist.

  • Ciaran Smith is a British Psychological Society member and a student of Psychology at the University of Buckingham, as well as being the BPS's Student Ambassador for the University.

Commentary from Marcus Munafò, Associate Editor for Research

Algorithmic biases are just the latest specific instantiation of a long-standing problem in science – our disproportionate interest in 'interesting' results. Publication bias – the tendency for authors to preferentially write up, and journals to preferentially publish, statistically significant findings over null results – is probably the best known of these. But there are many more, and they have cumulative and reinforcing effects. We described these in the context of the efficacy of antidepressant drugs [see in particular Figure 1 of that paper, showing the cumulative impact of reporting and citation biases on the evidence base for antidepressants].

These biases are likely to contribute to problems of low reproducibility – findings that appear robust based on the published literature, and the studies that are cited in that literature, may be anything but. 

What are the solutions? The systematic review methodology – with replicable search strategies developed with specialists in library teams – is designed to counter these biases and surface studies relevant to a research question irrespective of whether or not they have been cited. Non-algorithmic search engines such as PsycInfo are central to these. But of course this is effortful – the very large number of hits typically returned must all be screened, ideally by two independent screeners (to reduce bias!), to isolate the truly relevant studies.

I always suggest to PhD students that a systematic review is a good way to start their PhD – to develop a useful skill and to truly understand the literature they will be basing their work on. But most of us will feel that we don't have the time and resources to do this, well, systematically whenever we start a new project. Perhaps we should take that time, by doing fewer projects and doing them more thoroughly, rigorously and – yes – systematically. Slow science may be better science.

References

Bandara, W., Furtmueller, E., Gorbacheva, E., Miskon, S. & Beekhuyzen, J. (2015). Achieving rigor in literature reviews: Insights from qualitative data analysis and tool-support. Communications of the Association for Information Systems, 37, Article 8. 

Beel, J. & Gipp, B. (2009). Google Scholar's ranking algorithm: An introductory overview. In Proceedings of the 12th International Conference on Scientometrics and Informetrics (ISSI'09) (Vol. 1, pp. 230-241). International Society for Scientometrics and Informetrics.

Epstein, S. (1994). Integration of the cognitive and the psychodynamic unconscious. American Psychologist, 49(8), 709-724. 

Greenberg, S.A. (2009). How citation distortions create unfounded authority: Analysis of a citation network. BMJ, 339, b2680. 

Hall, P.A. & Fong, G.T. (2007). Temporal self-regulation theory: A model for individual health behavior. Health Psychology Review, 1(1), 6-52. 

Hao, K. (2023). Google, Microsoft, and Perplexity are promoting scientific racism in search results. Wired. 

Kacperski, C., Bielig, M., Makhortykh, M., Sydorova, M. & Ulloa, R. (2024). Examining bias perpetuation in academic search engines: An algorithm audit of Google and Semantic Scholar. First Monday, 29(11). 

Merton, R.K. (1968). The Matthew effect in science: The reward and communication systems of science are considered. Science, 159(3810), 56-63. 

Noble, S.U. (2018). Algorithms of oppression: How search engines reinforce racism. New York University Press.

Vlasceanu, M. & Amodio, D.M. (2022). Propagation of societal gender inequality by internet search algorithms. Proceedings of the National Academy of Sciences, 119(29), e2204529119.