‘I am certain psychology will see many more train wrecks’

Our editor Jon Sutton has some questions for Ruth Leys, author of Anatomy of a Train Wreck: The Rise and Fall of Priming Research (University of Chicago Press); plus an extract.

05 December 2024

How did your previous books lead you to this one?

I am a historian of science by training, specialising in the history of the human sciences. One of my first publications was an edition of the early 20th century correspondence between Adolf Meyer, in America the preeminent psychiatrist of his time, and Edward Bradford Titchener, the leading academic psychologist of his time (Leys and Evans, Defining American Psychology: The Correspondence between Adolf Meyer and Edward Bradford Titchener, 1990). Psychology then was plagued by competing theoretical assumptions and research strategies for obtaining objective knowledge of human behaviour. Meyer and Titchener deplored this state of affairs and hoped to reach a consensus about the best ways to proceed. But the two men could not agree, and their correspondence was broken off.

The Meyer-Titchener exchange epitomises the status of psychology to this day: it is a discipline that lacks agreement about theoretical assumptions and research protocols. As a historian I have tried to understand this state of affairs by focusing on certain topics where the various presuppositions and methodologies that have been adopted have led to various impasses of the kind that are so characteristic of the field.

My book on the genealogy of the concept of trauma is a case in point (Trauma: A Genealogy, 2000). On the one hand, trauma is an apparently indispensable notion for understanding the psychic harms associated both with central experiences of the modern world (the Holocaust, Hiroshima) and with the more common harms of everyday life. But on the other hand, it is also a notion regarding which there is (still) no agreed definition. My book shows why that was and remains the case.

From the topic of trauma, it was a short step for me to investigate the history of efforts to understand certain emotions linked to trauma, such as survivor guilt and shame, topics that again have precipitated much unresolved dispute (From Guilt to Shame: Auschwitz and After, 2007). My work on approaches to guilt and shame led in turn to my interest in the subject of the emotions more generally, and to my book The Ascent of Affect: Genealogy and Critique (2017), in which I analyse the controversies that have shaped the scientific study of the emotions from the post-World War 2 period to the present. I have subsequently studied the history of contested approaches to imitation (Neonatal Imitation: The Stakes of an Idea, 2020), and I also devote a chapter of my new book on priming to Bargh's problematic claims about mimicry.

When and why did you specifically become interested in priming research?

I became aware of priming while researching The Ascent of Affect. In that book I was concerned with the challenge emotion scientists faced when treating the intentionality of emotions, the fact that our emotional experiences seem to be about something meaningful, whether real or imagined, and are sensitive to reasons. Specifically, psychologists found it difficult to operationalise the concept of intentionality by subjecting it to empirical norms and practices, and indeed were constantly tempted to reduce intentionality and meaning to non-intentional computational processes.

That is where the topic of priming came in. Bargh's doctoral thesis advisor at the University of Michigan, Robert Zajonc, was interested in the idea that emotions are independent of cognition or meaning, arguing in the early 1980s that they are instead unintentional, automatic, affective processes that occur outside awareness. Zajonc's claims depended in turn on the cognitive revolution that was well under way by then, according to which mental functions could be understood in information-processing terms. In particular, cognitive scientists had begun to suggest that automatic processes were governed by information-processing channels separate from channels controlling conscious processing, and that many actions and events, such as emotionally-laden reactions, operated unintentionally and automatically in this way.

Influenced by Zajonc and the rise of cognitive science's computational assumptions, Bargh in Unintended Thought (1989) explicitly framed the problem of the intentionality of human action as an empirical question, and went on to commit himself to the project of reducing mental functions, including even the so-called 'executive' or conscious functions, to unintentional processes. Priming was an experimental method for studying the unconscious influence of stimuli, such as words, on habitual behaviours of all kinds, and Bargh became the acknowledged leader of this domain of research.

So you were in on the story when that area, priming research, began to run into trouble?

Yes. In 2012 a team of Belgian scientists had reported that they had been unable to replicate one of Bargh's famous priming experiments. This was an experiment that appeared to show that participants 'primed' by, which is to say exposed to, words connoting old age, automatically walked more slowly when leaving the laboratory, without being aware of the influence of the priming words on their walking speed. Only the year before, in his best-selling book, Thinking, Fast and Slow (2011), Nobel Laureate Daniel Kahneman had praised Bargh's elderly priming experiment, and priming research generally, for demonstrating that primes unconsciously influenced people's behaviour, findings that lent support to Kahneman's claims regarding System 1 and System 2 thinking. But when Kahneman learned in 2012 that Bargh's elderly priming experiment had failed to replicate, he published a letter in Nature warning Bargh and other priming scientists that he saw a 'train wreck looming' if they did not put their house in order. Ten years later he concluded that the researchers had failed to repair the damage and that the field of priming was effectively dead.

The replication crisis that erupted in 2012, and is ongoing, has produced an important literature chiefly focused on the methodological problems that haunt psychology… and not only psychology. But as far as I am aware, there have been no attempts to analyse in depth the history of the scientific hypotheses and experimental practices that from the start have guided and motivated priming research itself. This is where I saw my opportunity, and what my new book aims to provide.

One detail that leaps out from your forensic account of this replication crisis is how we are perhaps all – as Psychologists, as people – drawn to the 'alien idea,' the 'cuteness' and 'cleverness' in science. Do you see those tendencies as more significant than the labour conditions and incentive structures Psychologists work within?

I view the labour conditions and incentive structures psychologists work within as having everything to do with the replication crisis in the field. I am not alone in thinking that the incentives in psychology research tend to encourage making a name for yourself by coming up with a new theory and new research findings, regardless of how solid your theory is or whether your latest experiments have been carried out rigorously. An emphasis on 'cuteness' or 'cleverness' or the 'alien idea' is part of this toxic mix.

Throughout your work and writing, have you arrived at broad conclusions about what Psychologists are 'like'? There's something pretty fundamental about Psychology, Psychologists and people in general there in your book I think . . . that we all have a tendency to not take people, and their own reasons, seriously enough. And that applies to how we think about behaviour and how we think about the people that study behaviour! Is that fair to say?

I think so. But I would qualify your statement by noting that throughout the history of psychology there always have been dissenting voices who have opposed the tendency to disqualify people's reasons for acting the way they do. Those dissenting voices often come from outside the field, especially from philosophy, but by no means only from outside. In my new book I pay attention to psychologists who challenged the tendency to treat all of human behaviour in mechanical-causal terms. The paper by Nisbett and Wilson, 'Telling More Than We Can Know' (1977), which I discuss in the excerpt reprinted below, produced an important body of criticism by psychologists who accused those authors of disregarding the difference between reason and causal explanations. But, in my experience, all too many psychologists, and following them the general public, love to disparage the influence of reasons in everyday life in favour of an emphasis on the idea that we are all victims of our 'gut feelings'.

By the way, it is worth noting that reasons don't have to be good reasons to be reasons: having bad reasons is one way of exercising reason, just as having good reasons is another way.

Do you think you would have made a good Psychologist? What would you have studied?

On the basis of my experience in the lab when studying psychology as an undergraduate at Oxford University, I would not have made a good experimental psychologist: I would not have had the patience. I am not sure I would have had the patience to be a therapist either! If I had become a professional psychologist I would have been drawn to the study of the theoretical foundations of the field, especially the philosophy of mind, a subject that is absolutely central to the discipline but is rarely taught.

I believe you have had no contact with John Bargh, the major focus of this book. How do you think he would respond to it? Do you care?

I have deliberately had no contact with Bargh. I have made it my rule as a scholar not to contact scientists whose work I know I am likely to criticise, on the grounds that it is unethical to engage someone in discussions on the basis of which one later plans to disagree. Moreover, I have found from experience that it is quite difficult to maintain a properly critical distance from people once one gets to know them, since it is all too easy to end up liking or at least sympathising with them in ways that make one want to soften one's critique.

I predict that Bargh would reject my arguments, just as he has tended to reject criticisms of his priming research. But I would be interested to read any response of his.

Daniel Kahneman's 'train wreck looming' comment gives the book its title and a strong thread throughout. Do you think the train wreck led to stronger, safer trains? Or just a load of wreckage and 'crud'?

This is a very interesting question. On the one hand, I have been impressed by the enormous efforts of reformers in psychology to mend laboratory practices, with a view to improving standards. On the other hand, as a historian I can't help having a feeling of déjà vu, knowing that replication crises in psychology are not new, indeed that crises not just over methodology but over fundamentals are almost what define the field as a pre-paradigmatic or pre-normal science, in the philosopher Thomas Kuhn's sense of those terms. So I am certain psychology will see many more train wrecks. The rise of social media and the emphasis on self-promotion mean that there is more and more scope for psychologists to make claims about the psyche based on very little evidence or research, and the public audience's appetite for simplistic solutions to life's challenges appears to be insatiable.

The 'crud factor' was defined long ago by the psychologist Paul Meehl as the principle according to which 'everything is correlated with everything, more or less', a principle that for Meehl implied that, in the absence of better statistical and other methods, most of the correlations observed in psychology were uninterpretable or simply meaningless. It seems to me that the crud factor is alive and well in a great deal of psychology today.

"Telling More Than We Can Know": Attribution and the Limits of Introspection

Richard Nisbett and Timothy Wilson's famous – to its critics, infamous – 1977 paper entitled "Telling More Than We Can Know" became well-known for numerous reasons, above all for its skeptical conclusions concerning the relevance of the commonsense assumption that people's agency, intentions, and reasons make a real difference in how they behave. John Bargh was not alone in regarding Nisbett and Wilson's paper as among the most consequential of the many publications emerging from attribution research, one that decisively shaped not only his own thinking about priming and automaticity but also the further development of American social psychology.

In their paper Nisbett and Wilson radicalized attribution theory by claiming not only that ordinary individuals were often mistaken about the causes of their own behavior, as other attribution theorists had already argued, but that they were in principle incapable of identifying them because the workings of their own minds were unavailable to introspection. This claim was indeed far-reaching, because social scientists routinely depended on participants' self-reports to provide information about their experiences. But now Nisbett and Wilson were questioning the use of self-report as a reliable research method – even though such self-reports remained an important source of information in social psychology research as a means of verifying participants' lack of awareness of the influence of primes on their behavior. The authors proceeded to raise their questions by first invoking a key premise of cognitive science, that the contents of mental states are the product of hidden, unconscious information-processing events that are unreachable by consciousness. Nisbett and Wilson quoted several cognitive psychologists to this effect. "'It is the result of thinking, not the process of thinking, that appears spontaneously in consciousness,'" they quoted George Miller as stating. They also quoted Ulric Neisser's assertion that "'the constructive processes [of encoding perceptual sensations] themselves never appear in consciousness, their products do.'" To these they added several characteristic statements by George Mandler: "'The analysis of situations and appraisal of the environment . . . goes on mainly at the nonconscious level.'" Again: "'There are many systems that cannot be brought into consciousness, and probably most systems that analyze the environment in the first place have that characteristic. In most of these cases, only the products of cognitive and mental activities are available to consciousness.'" Or again: "'Unconscious processes . . . include those that are not available to conscious experience, be they feature analyzers, deep syntactic structures, affective appraisals, computational processes, language production systems, [or] action systems of many kinds.'"

For Nisbett and Wilson, then, self-reports were likely to be an inadequate method for detecting the causes of behavior, either one's own or that of others, because we are only able to report on the "contents" or "products" of our cognitions, but not on the cognitive "processes" themselves. This conclusion implied in turn that we are better able to report correctly what we are thinking and feeling than why we are. Nisbett and Wilson also argued that, even on those occasions when we do give correct testimony about our higher mental processes, that testimony is not the result of direct introspective awareness but of the "incidentally correct employment of a priori causal theories" about the connection between stimulus and response.

In support of their claims, Nisbett and Wilson reviewed various empirical findings casting doubt on the ability of individuals to accurately report their cognitive processes, including especially the accumulating evidence from attribution research. For example, in what became a frequently cited study, passersby ostensibly participating in a consumer survey had been invited to evaluate four pairs of nylon stockings laid out in a row, side by side. The participants were asked to say which pair was the best quality and to explain their selections. In actuality, all four pairs were identical. Nisbett and Wilson reported that, by a factor of four to one, the individuals surveyed had a pronounced tendency to favor the rightmost pair of stockings, although no participants spontaneously mentioned that the position of the stockings had played a role in their choices. When asked directly whether a position effect could have influenced their decisions, nearly all the subjects denied this possibility, "usually with a worried glance at the interviewer suggesting that they felt either that they had misunderstood the question or were dealing with a madman." Nisbett and Wilson therefore argued that, unknown to the participants, the position of the stocking pairs was the decisive cause of their selections, adding: "Precisely why the position effect occurs is not obvious. It is possible that subjects carried into the judgment task the consumer's habit of 'shopping around,' holding off on choice of early-seen garments on the left in favor of later-seen garments on the right."

The authors concluded on the basis of this and other experiments that, on the whole, people were unable to identify accurately the causes of their own behavior. Pointing to the fact that we can proficiently perform skilled actions without being able to articulate how we do it, philosopher Michael Polanyi had previously argued that because of our possession of tacit knowledge, "we can know more than we can tell." Now, Nisbett and Wilson were claiming that the converse was also true: we "sometimes tell more than we can know."

Nisbett and Wilson also suggested that people's erroneous reports about their own cognitive processes and the causes of their behavior were not "capricious or haphazard" but "regular and systematic." The evidence for this assertion came from the fact that "observer" subjects, who did not participate in attribution studies but simply read verbal descriptions of the same situations as the experimental subjects, made predictions about how they would react that were remarkably similar to the (inaccurate) reports given by the actual participants. This finding suggested to the researchers that both experimental subjects and observers were drawing on implicit, a priori causal theories about the extent to which a particular stimulus was a plausible cause of a given response. For example, in an experiment described by Nisbett and Wilson, when a word association experiment was described to mere observers,

the judgments of probability that particular word cues would affect particular target responses were positively correlated with the original subjects' "introspective reports" of the effects of the word cues on the target responses . . . Thus, whatever capacity for introspection exists, it does not produce accurate reports about stimulus effects, nor does it even produce reports that differ from predictions of observers operating with only a verbal description of the stimulus situation.

Nisbett and Wilson therefore concluded that if the reports of subjects in such experiments did not differ from those of observers, then it is unnecessary to assume that the former are drawing on "a fount of privileged knowledge" . . . It seems equally likely that subjects and observers are drawing on a similar source for their verbal reports about stimulus effects.

That "similar source," they suggested, was the set of explicit or implicit theories about causal relations embedded in the culture or subculture to which participants and observers alike belonged.

Such theories, added Nisbett and Wilson, might well include the "representativeness heuristic" described by Tversky and Kahneman according to which, in making judgments about the probability that an individual is, say, a librarian, one compares information about the individual with stereotypes about librarians, and if the given information is representative of the stereotype, then one deems it "probable" that the individual is a librarian. "Information that is more pertinent to a true probability judgment, such as the proportion of librarians in a population, is ignored. We are proposing that a similar sort of representativeness heuristic is employed in assessing cause and effect relations in self-perception." Nisbett and Wilson argued that other heuristics described by Tversky and Kahneman, such as the availability heuristic, were undoubtedly involved in the (mistaken) attribution of cause-and-effect relationships.

It is important to note that Nisbett and Wilson's arguments depended on a major shift in psychology's understanding of unconscious processes. Prior to the cognitive revolution, unconscious processes were treated in Freudian terms as mental states or events that were not present to consciousness because they had been actively repressed. Moreover, in Sigmund Freud's conception unconscious and conscious processes were governed by different rules or laws: for instance, unlike people's ordinary consciousness of events, the unconscious knew no time and made no distinction between reality and fantasy. In addition, the psychoanalytic unconscious was dynamic and conflictual, involving the role of an ego capable of banishing the subject's unacceptable desires and thoughts from conscious awareness. Freud's conception did not mean that the unconscious was inaccessible to consciousness: the purpose of the Freudian "talking cure" was to provide a method whereby the patient's unconscious thoughts could be brought into consciousness. In addition, and crucially, for Freud unconscious mental states were intentional: their contents were infantile erotic intentions and desires that were blocked from consciousness because of their prohibited nature.

But with the rise of cognitive science and information-processing theories of mental function, Freud's dynamic theory was abandoned. Unconscious activities were now viewed as forms of automatic, nonconscious, non-intentional information processing that occurred in computational subsystems capable of acting independently of the mind's conscious control. In the process of this transformation in the understanding of the unconscious, the dynamic-conflictual dimension of the psychoanalytic unconscious was lost as mental actions were converted into nonconscious, mechanically filtered, adaptive processes and events.

That these attempts at reformulation were motivated by a lack of comfort with the very notion of intention was spelled out by Matthew Hugh Erdelyi, one of the leading figures in this development. Commenting on the Freudian notion of "defense," Erdelyi observed that "the ultimate problem with defense, as with so many other constructs in psychology, is that it is anchored to the notion of intention and purpose and is thus problematic on both philosophical and methodological grounds . . . [W]e have yet no explicit methodology of purpose." As a solution to this problem, he suggested that the phenomenon of defense and other processes could be accounted for differently as "the nondefensive disruption of perception and memory by emotion or by any other attention-disrupting event." In 1987, John Kihlstrom gave the name "cognitive unconscious" to the unconscious formulated in information-processing terms, with the result that for the majority of psychologists the triumph of the computer model of the mind seemed virtually complete.

Nisbett and Wilson's paper quickly achieved the status of a classic. It also contributed to one of the most sustained and divisive controversies in the postwar period about the direction of social psychology as a field. It was apparent to many critics that the authors' claims trenched on long-standing philosophical debates about the explanation of behavior. In particular, opponents like Allan Buss, Eliot R. Smith and Frederick D. Miller, Peter White and others asserted that the authors' arguments – and indeed attribution theorists' similar contentions about the weakness and irrationality of the individual's grasp of causal processes – amounted to an indictment of intentionality. Nisbett and Wilson's suggestion was that human behavior was determined by causes external to the person as an intentional agent, indeed that individuals were not advantaged over observers in their knowledge of their own intentions, because the causes of their behavior were sub-personal information-processing mechanisms over which they had no conscious control. By making the causes of behavior, whether environmental or personal-dispositional, the focus of interest rather than the reasons individuals gave for their actions, attribution theorists lent support to the idea that people's intentions and reasons could not be explanatory at all. The issue was not just that people held many false beliefs and reasons for their actions: they were wrong in thinking they had reasons of any kind.

In short, Nisbett and Wilson's arguments were influential because of their skeptical conclusions concerning the relevance of the commonsense assumption that people's agency, intentions, and reasons make a real difference in how they behave. Their views fed into and reinforced an ongoing debate over the best ways in which to approach the study of social psychology, one in which the conflict between causal and intentional explanations occupied center stage. Many commentators focused their criticisms on Nisbett and Wilson's stocking experiment, particularly the claim that the participants' tendency to prefer the rightmost pair of identical stockings was based on the stockings' position alone. Objections included the argument that the participants had reasons for thinking that there were differences among the stocking pairs, and that they were not mistaken about their reasons: they necessarily knew what their reasons were. Indeed, without such reasons they would not have made the choices they did. Moreover, critics like Don Locke and Donald Pennington argued, even if in this particular experiment the participants' reasons might appear to be rationalizations because the position of the stocking pair influenced their decisions in ways of which they were unaware, it did not therefore follow that all reasons were explanatorily nugatory. The net result of such critiques was to insist on the significance of the distinction between performing an action for reasons versus being caused to behave in some particular way.