Psychological testing

Choosing the right tools to find the right people

Jacob B. Hirsh looks at performance prediction, an area with some of the strongest relationships in psychological research.

04 September 2009

One of the classic goals of psychometric assessment has been to predict performance based on psychological characteristics. Two major categories of psychometric instruments are tests of intelligence and personality, both of which have a long history in predicting behavioural outcomes. Intelligence tests, for example, were originally used to identify learning disabilities among schoolchildren, while personality tests were geared primarily toward the prediction of dysfunctional behaviour. Following their broader adoption during the two world wars, these techniques gained prominence as tools for assessing performance ability and facilitating job placement. Importantly, the goals of psychometric assessment have expanded over time and now include not only the prediction of dysfunctional behaviour, but also performance differences across the normal range of psychological characteristics.

While psychometric testing and performance prediction have evolved considerably over the past 100 years, their value is often underappreciated. In the current article, two critical lessons from this broad field of research are highlighted. Namely, research on performance prediction has taught us the importance of (a) choosing the right people, and (b) using the right tools to do so.

Choosing the right people

Most people would agree that in a competitive environment, the most qualified individual should be chosen for a given position. However, there are many obstacles to the real-world implementation of this meritocratic ideal. One such obstacle is the fact that people tend to underestimate the massive performance and productivity differences that exist between individuals.

A powerful illustration of such differences is codified in 'Price's law', which describes the unequal distribution of productivity in any creative domain (Price, 1963). According to this formula, the square root of the number of people working within a field produces 50 per cent of the total creative output. For example, if there were 100 scientists working on a problem, the 10 most productive researchers within this group would produce the same amount of material as the remaining 90. This concentration of creative work becomes even more pronounced at the highest ends of the productivity distribution, such that the most prolific individuals within a domain generate disproportionately larger shares of the total output. Similar analyses have shown, for instance, that the 10 most prolific composers produced 40 per cent of the 'masterworks' in classical music (Moles, 1958).
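The square-root rule is easy to check numerically. The following is a minimal sketch, assuming the rule is applied exactly as stated above; the function name is chosen for illustration only and does not come from Price (1963).

```python
import math


def price_law_core_size(field_size: int) -> int:
    """Approximate number of people who produce half of a field's output
    under Price's law (the square root of the field size)."""
    if field_size < 1:
        raise ValueError("field_size must be a positive integer")
    return round(math.sqrt(field_size))


# Illustrative field sizes; only the 100-person case appears in the article.
for n in (25, 100, 10_000):
    core = price_law_core_size(n)
    print(f"{n} people: about {core} produce half the output "
          f"({core / n:.0%} of the field)")
```

For a field of 100 this gives the 10 researchers mentioned in the text. As the field grows, the productive core shrinks as a share of the whole.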

Although Price's law was originally used to describe the unequal distribution of creative output, the substantial between-person variability in productivity and performance outcomes extends to non-creative work domains as well. Meta-analytic studies of performance variability indicate that as the work domain becomes more complex, the variability in performance across individuals becomes larger. One way to examine this variability is as a percentage of an average employee's output levels. Zero variability would indicate that all employees perform at the same level, whereas higher values indicate greater differences between individuals.

For unskilled and semi-skilled work, the standard deviation of work output as a percentage of average output is 19 per cent; for skilled work it is 32 per cent; and for managerial and professional work it is 48 per cent (Schmidt & Hunter, 1998). What this means is that the gap between a professional who performs at the 84th percentile (one standard deviation above the mean) and one who performs at the 16th percentile (one standard deviation below the mean) amounts to 96 per cent of an average employee's output. In financial terms, if average output is valued at a £50,000 yearly salary, this performance difference is worth £48,000 a year. These productivity differences become even more pronounced when they are summed across multiple people. Organisations that are able to identify and recruit high-performing individuals thus have a considerable economic and strategic advantage.
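The arithmetic behind these figures can be sketched in a few lines. The standard deviations are those quoted above from Schmidt and Hunter (1998). Treating the value of average output as equal to a £50,000 salary is the simplifying assumption used in the text. The function names are illustrative.

```python
# Standard deviation of output as a fraction of average output,
# as quoted in the text from Schmidt & Hunter (1998).
OUTPUT_SD = {
    "unskilled and semi-skilled": 0.19,
    "skilled": 0.32,
    "managerial and professional": 0.48,
}


def output_gap(sd_fraction: float) -> float:
    """Gap between performers at +1 SD and -1 SD, as a fraction of average output."""
    return 2 * sd_fraction


def yearly_value_gap(sd_fraction: float, average_output_value: float) -> float:
    """Monetary value of that gap, assuming average output is worth
    `average_output_value` per year (an assumption, here the salary)."""
    return output_gap(sd_fraction) * average_output_value


for job_type, sd in OUTPUT_SD.items():
    gap = output_gap(sd)
    value = yearly_value_gap(sd, 50_000)
    print(f"{job_type}: gap = {gap:.0%} of average output, "
          f"about £{value:,.0f} a year")
```

The same two-standard-deviation gap is much smaller for less complex work, which is why the economic return on good selection rises with job complexity.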

While selecting the best people is an important goal in itself, a parallel goal of no less importance is screening out undesirable candidates. The consequences of choosing the wrong people are substantial, as they lead to increased turnover rates, recruitment costs, and training expenses, along with lost productivity and decreases in morale. The high costs associated with replacing poorly performing individuals make it all the more important to identify and select the best performers in the first place.

Using the right tools

Because there are almost always more applicants than there are open positions, it is inevitable that some selection process is used. While the previous section highlighted the importance of identifying and selecting the right people, we turn now to the importance of using the right tools to do so.

Just as people tend to underestimate the productivity and performance differences that exist across individuals, they also tend to overestimate the effectiveness of common selection methods. A perfect illustration of this problem is found in the field of graphology, which involves the analysis of an individual's handwriting to derive assessments of psychological characteristics and performance potential. Numerous empirical examinations of graphology suggest that it is completely ineffective at discriminating between high and low performers, providing little more than chance estimates of an individual's potential. Nevertheless, this technique is an extremely popular selection tool in certain regions. In France, for instance, graphology-based psychological assessments are used by up to 50 per cent of all companies, and 80 per cent of all organisational consultants (Bradley, 2005).

While it may be easier to see the folly of graphology, there are in fact many widespread selection techniques that provide little more than chance estimates of who will succeed in a given position. Some examples include education level, training and experience ratings, and academic achievement, which are all common selection methods that nonetheless provide minimal predictive utility. Other popular selection methods, such as unstructured interviews, vary considerably in their effectiveness and are far from optimal.

Why, then, are ineffective selection techniques so popular, when there is a large scientific literature detailing best practices for performance prediction? The discrepancy between research and practice in this domain reflects the nature of organisational decision making, which is influenced by many factors beyond the results of empirical validation studies. Indeed, one of the most common reasons for not employing optimal selection methods is that many human resource practitioners and top managers simply do not believe in the real-world effectiveness of empirically validated selection tools (Terpstra & Rozell, 1997). This may not be surprising in light of the fact that most managers and staffing professionals are not deeply familiar with the academic literature. Despite the many studies that examine the utility and validity of different selection procedures, the results of this research have not fully permeated the awareness of managers and decision makers. Selection practices also vary substantially across nations, suggesting that the cultural context in which an organisation operates can influence the manner in which selection tools are evaluated and employed (Ryan et al., 1999).

In order to take advantage of the large individual differences in productivity, it is first necessary to identify the top candidates. In this respect, it is clear that hiring the best people requires the use of the best selection procedures. Based on meta-analyses of numerous validation studies across a variety of domains, the most effective and efficient method for selecting the top performers involves testing for both cognitive ability and personality (Schmidt & Hunter, 1998).

Cognitive ability, also known as general mental ability, intelligence, or simply IQ, is one of the best predictors of performance across many different domains. Broadly speaking, it reflects an individual's ability to plan, reason, process information, and control his or her behaviour. Some would argue that it is in fact the best-validated construct in all of psychology, as its ability to predict performance has been repeatedly demonstrated in thousands of studies carried out across 100 years of research (Schmidt & Hunter, 2004).

Across all job categories, individual differences in cognitive ability account for approximately 25 per cent of the variability in performance. The general factor of cognitive ability predicts performance outcomes even better than aptitude tests claiming to assess the specific skills needed for a given job. If only one variable could be assessed to predict performance across multiple domains, cognitive ability would certainly be the most useful. While there has been some concern that such tests are culturally biased, there are also non-verbal tests of cognitive ability that do not discriminate against respondents from different cultural and linguistic backgrounds (Higgins et al., 2007).
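A share of variance and a correlation coefficient are two ways of expressing the same relationship, since the proportion of variance accounted for by a single predictor is the square of its correlation with the outcome. The sketch below converts between the two. The function names are illustrative.

```python
import math


def variance_explained(r: float) -> float:
    """Proportion of outcome variance accounted for by a predictor with correlation r."""
    return r ** 2


def correlation_from_variance(share: float) -> float:
    """Size of the correlation implied by a given share of explained variance."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must lie between 0 and 1")
    return math.sqrt(share)


# The 25 per cent figure quoted in the text.
r = correlation_from_variance(0.25)
print(f"25 per cent of variance corresponds to r = {r:.2f}")
```

On this conversion, the 25 per cent figure corresponds to a correlation of about .50 between cognitive ability and performance.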

Following cognitive ability, the second most important variable in performance prediction is personality. While researchers have long used a variety of trait dimensions to predict real-world outcomes, the field of performance prediction has benefited greatly from the five-factor model of personality. The five-factor framework, or 'Big Five' model, is a taxonomy that describes personality differences across five broad dimensions of variation (Goldberg, 1993). The five dimensions are extraversion, agreeableness, conscientiousness, emotional stability, and openness. These dimensions demonstrate good cross-cultural reliability, are relatively stable across the lifetime, and incorporate the variance captured by most other personality taxonomies.

In terms of performance prediction, four of these traits in particular stand out. Conscientiousness, which describes individuals who are reliable, hard-working, and self-disciplined, is the best personality predictor of workplace performance and academic success, in addition to health and longevity (Barrick & Mount, 1991). Conscientious individuals have a strong work ethic, and tend to be more effective at pursuing their goals. An individual who is low in conscientiousness will be more easily distracted, less organised, and less productive.

Following conscientiousness, the most important personality trait for predicting success across multiple domains is emotional stability. Individuals who score highly on this trait experience less negative emotion and generally handle stress better. In contrast, less emotionally stable individuals will have higher levels of chronic stress and anxiety. This trait is particularly important for predicting performance in highly demanding positions, and is also associated with increased health, job satisfaction, and lower rates of job burnout (Judge & Bono, 2001).

While most positions are best served by selecting for cognitive ability, conscientiousness, and emotional stability, certain positions can benefit from the examination of other traits as well. In particular, the outgoing, assertive, and talkative nature of extraverts gives them an advantage in domains that require extensive social interaction. Extraversion therefore appears to be a good predictor of success in sales and management positions, in addition to the variables already discussed.

Openness, finally, is a good predictor of performance in domains requiring innovation and creativity (King et al., 1996). This trait is associated with an open-minded, reflective, and exploratory mindset that facilitates divergent thinking and cognitive flexibility. Other things being equal, creative individuals tend to score higher on measures of openness than their less creative counterparts.

Faking and response bias

Although personality traits are extremely useful for predicting performance outcomes, assessment of these traits is limited to some extent by reliance on the method of self-report. In most circumstances, respondents are honest and accurate enough that scores from personality questionnaires are reliable indicators of performance potential. However, in competitive circumstances, where there is motivation to present oneself in a positive light, the predictive validity of these personality questionnaires can be diminished (Rosse et al., 1998). For example, the personality profiles obtained from job applicants tend to be considerably inflated when compared to those obtained from non-applicant samples (Birkeland et al., 2006). This may not be surprising, given that such questionnaires most commonly employ a numeric rating system, or 'Likert scale', on which participants rate their agreement with a variety of personal descriptions. Unfortunately, the transparency of such questionnaires can make them extremely vulnerable to response distortion.

One strategy for dealing with this response bias on self-report questionnaires has been to administer 'validity scales' or measures of social desirability. These questionnaires are intended to assess the extent to which an individual is responding honestly and accurately, with high scores indicating the presence of response bias. Unfortunately, using these scales has failed to improve the actual prediction of performance outcomes, casting doubt on their utility in combating biased responding (Piedmont et al., 2000).

An alternative strategy involves the use of personality questionnaires that are more resistant to biased responding in the first place. With this goal in mind, we have recently developed a 'fake-proof' measure of the Big Five personality traits (Hirsh & Peterson, 2008). This questionnaire involves a number of comparisons between equally desirable personality descriptions (e.g. 'Are you a hard worker or a creative thinker?'). This type of forced choice between two or more desirable options limits the opportunities for self-enhancement, as a respondent cannot inflate scores in one domain without simultaneously deflating scores in another domain. While traditional questionnaires lost their predictive validity when participants were instructed to 'fake good', the new questionnaire was able to predict academic performance and creative achievement outcomes even when participants were actively trying to distort their responses. Bias-resistant questionnaires such as this may prove very useful for assessing personality in competitive environments.
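The logic of forced-choice scoring can be illustrated with a toy example. The item pairings and scoring rule below are hypothetical and are not taken from the Hirsh and Peterson (2008) questionnaire. They show only why a respondent cannot raise every trait score at once: each choice awards a point to one trait and withholds it from another.

```python
# Hypothetical item pairs: each item offers two desirable descriptions
# keyed to different Big Five traits (not the published questionnaire).
ITEMS = [
    ("conscientiousness", "openness"),  # e.g. hard worker vs creative thinker
    ("extraversion", "agreeableness"),
    ("emotional stability", "conscientiousness"),
    ("openness", "extraversion"),
    ("agreeableness", "emotional stability"),
]


def score_forced_choice(choices):
    """Score a response set; choices[i] is 0 or 1, the option picked on ITEMS[i]."""
    if len(choices) != len(ITEMS):
        raise ValueError("one choice is required per item")
    scores = {trait: 0 for pair in ITEMS for trait in pair}
    for pair, choice in zip(ITEMS, choices):
        scores[pair[choice]] += 1
    return scores


first_profile = score_forced_choice([0, 1, 0, 0, 1])
second_profile = score_forced_choice([0, 0, 1, 1, 0])

# The total is fixed by the number of items, whatever the respondent picks.
print(first_profile, sum(first_profile.values()))
print(second_profile, sum(second_profile.values()))
```

Both response sets sum to the number of items, so any gain on one trait is matched by a loss elsewhere. On a Likert-style questionnaire, by contrast, every rating can be pushed upwards independently.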

Individuals and group performance

An important question for the field of performance prediction is whether an emphasis on individual attributes and abilities is the best strategy for ensuring organisational fitness. In particular, one might ask whether the dispositional qualities of high-performing individuals are still relevant in the context of larger groups of people working towards a common goal. Although predictions of individual and group-level outcomes have traditionally been kept separate, more recent work has begun to combine them into multilevel models of group performance (Ployhart & Schneider, 2005).

What this research has shown is that individual-level variables remain important predictors of group-level outcomes. The success of a work team, for example, can be predicted by the cognitive ability and personality scores of its members (Barrick et al., 1998). A work team with higher average levels of conscientiousness would thus perform better than a comparable team with lower scores on this trait. Perhaps even more interesting is the finding that even a single team member with lower levels of conscientiousness can negatively influence the group's dynamics, leading to increased conflict and reduced team effectiveness. Cross-level research thus demonstrates that individual performance ability is still an important determinant of larger-scale organisational effectiveness.
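Two simple ways of summarising a team's standing on a trait are its average score and its lowest individual score. The sketch below uses invented scores on an arbitrary 1 to 5 scale, purely for illustration, to show how two teams with the same average can differ in their weakest member.

```python
from statistics import mean


def team_profile(scores):
    """Summarise a team's trait scores by their mean and their minimum."""
    return {"mean": mean(scores), "minimum": min(scores)}


# Hypothetical conscientiousness scores on a 1-5 scale (not real data).
team_a = [4.0, 4.0, 4.0, 4.0]
team_b = [4.5, 4.5, 4.5, 2.5]

print("Team A:", team_profile(team_a))
print("Team B:", team_profile(team_b))
```

The two teams are indistinguishable on the average, yet the second contains one markedly less conscientious member. This is the kind of case that only the minimum score reveals.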

Another pathway by which individual characteristics can influence group-level performance is through the personality of the group's leader. Just as there is a great deal of variability in other performance domains, so too is there tremendous variability in the quality of leadership. What is unique about leadership positions, however, is that they can directly influence the performance of a large number of other people. As a result, the performance of those in leadership positions has important consequences for the broader success of the organisation. Recent analyses suggest that 15 per cent of the variance in an organisation's profitability is directly influenced by the CEO's actions (Joyce et al., 2003). Thus, while good managers can inspire a group towards higher levels of motivation and productivity, bad managers can be equally effective at hindering group performance. Importantly, a leader's effectiveness is substantially influenced by his or her personality profile (Judge et al., 2002). Indeed, the personality profile of a company's CEO has important implications for the financial performance of the organisation (Peterson et al., 2003).

Situational moderators

While measures of personality and cognitive ability are good overall predictors of performance, their predictive validities can vary somewhat depending on the performance context. Although researchers examining the situational and dispositional determinants of behaviour have traditionally been at odds with one another, contemporary models acknowledge the importance of adopting an interactionist framework. These frameworks emphasise that dispositional influences on behaviour are moderated by situational affordances. The importance of a given personality trait will thus be constrained or enhanced depending on the social, organisational, and task context (Tett & Burnett, 2003). For instance, the presence of clearly structured roles, close supervision, and formalised communication systems may help to reduce the discrepancy between more and less conscientious employees, thereby reducing the importance of this trait for predicting performance. By contrast, situations involving sudden unexpected crises or requiring immediate emergency action may enhance the performance differences between more and less emotionally stable individuals, thereby increasing the extent to which this trait predicts performance. Thus, while the traits described above are able to predict performance across a large number of situations, their importance in any given situation is influenced by the behavioural context.

Comparison of effect sizes

Although valid selection techniques only predict a portion of the variance in performance, it is worth noting that even small gains in predictive validity can lead to substantial improvements in productivity, and the associated economic benefits. It is also revealing to compare the effect sizes obtained from the performance prediction literature with those from other research areas. According to reviews of the psychological literature, the middle third of all obtained effect sizes corresponds to a correlation coefficient between r = .20 and r = .30 (Hemphill, 2003). Correlations higher than r = .30 correspond to the top third of all psychological effect sizes. Based on meta-analytic findings, the mean validity of a combined test of cognitive ability and personality in predicting workplace performance is in the range of r = .65. Our ability to predict performance outcomes using dispositional measures is thus one of the strongest relationships obtained in psychological research. In contrast, there are many well-known medical relationships that actually have lower predictive validities, including the associations between smoking and lung cancer within 25 years (r = .08), ibuprofen and pain reduction (r = .14), and Viagra and improved sexual functioning (r = .38) (see Hogan, 2005). When viewed within this larger context, the effectiveness of performance prediction techniques is striking.
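The comparison can be laid out explicitly. The sketch below sorts the correlations quoted in this section into the thirds described by Hemphill (2003), using the .20 and .30 cut-offs given above. The function name and the labels are illustrative.

```python
def effect_size_third(r: float) -> str:
    """Place the magnitude of r in the lower, middle or upper third of
    psychological effect sizes, using the cut-offs quoted from Hemphill (2003)."""
    magnitude = abs(r)
    if magnitude < 0.20:
        return "lower third"
    if magnitude <= 0.30:
        return "middle third"
    return "upper third"


# Correlations as quoted in the text.
EFFECTS = {
    "cognitive ability plus personality and workplace performance": 0.65,
    "Viagra and improved sexual functioning": 0.38,
    "ibuprofen and pain reduction": 0.14,
    "smoking and lung cancer within 25 years": 0.08,
}

for label, r in sorted(EFFECTS.items(), key=lambda item: item[1], reverse=True):
    print(f"r = {r:.2f}  {effect_size_third(r):<12}  {label}")
```

On these cut-offs, the combined predictor sits well inside the upper third, while two of the three medical relationships fall in the lower third.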

An invaluable tool

Across a broad number of domains, there are large individual differences in performance and productivity outcomes. In order to capitalise on these differences, however, it is necessary to use the most effective selection methods. Even small improvements in the predictive validity of selection processes can lead to substantial economic benefits. A large body of research now indicates that measures of cognitive ability and personality are powerful and efficient tools for predicting performance. While some form of selection is inevitable for any competitive position, psychological assessment remains an invaluable tool for identifying the top performers and making an informed decision.

Jacob B. Hirsh is with the Department of Psychology, University of Toronto

References

Barrick, M.R. & Mount, M.K. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44, 1–26.

Barrick, M.R., Stewart, G.L., Neubert, M.J. & Mount, M.K. (1998). Relating member ability and personality to work-team processes and team effectiveness. Journal of Applied Psychology, 83, 377–391.

Birkeland, S., Manson, T., Kisamore, J. et al. (2006). A meta-analytic investigation of job applicant faking on personality measures. International Journal of Selection and Assessment, 14, 317–335.

Bradley, N. (2005). Users of graphology. Graphology, 69, 55–57.

Goldberg, L.R. (1993). The structure of phenotypic personality traits. American Psychologist, 48, 26–34.

Hemphill, J.F. (2003). Interpreting the magnitudes of correlation coefficients. American Psychologist, 58, 78–79.

Higgins, D.M., Peterson, J.B., Lee, A.G.M. & Pihl, R.O. (2007). Prefrontal cognitive ability, intelligence, Big Five personality, and the prediction of advanced academic and workplace performance. Journal of Personality and Social Psychology, 93, 298–319.

Hirsh, J.B. & Peterson, J.B. (2008). Predicting creativity and academic success with a 'fake-proof' measure of the Big Five. Journal of Research in Personality, 42, 1323–1333.

Hogan, R. (2005). In defense of personality measurement: New wine for old whiners. Human Performance, 18, 331–341.

Joyce, W., Nohria, N. & Roberson, B. (2003). What really works. New York: Harper Business.

Judge, T.A. & Bono, J.E. (2001). Relationship of core self-evaluations traits – self-esteem, generalized self-efficacy, locus of control, and emotional stability – with job satisfaction and job performance: A meta-analysis. Journal of Applied Psychology, 86, 80–92.

Judge, T.A., Bono, J.E., Ilies, R. & Gerhardt, M.W. (2002). Personality and leadership. Journal of Applied Psychology, 87, 765–780.

King, L., Walker, L. & Broyles, S. (1996). Creativity and the five-factor model. Journal of Research in Personality, 30, 189–203.

Moles, A. (1958). Information theory and aesthetic perception. Urbana, IL: University of Illinois Press.

Peterson, R., Smith, D., Martorana, P. & Owens, P. (2003). The impact of chief executive officer personality on top management team dynamics. Journal of Applied Psychology, 88, 795–808.

Piedmont, R., McCrae, R., Riemann, R. & Angleitner, A. (2000). On the invalidity of validity scales. Journal of Personality and Social Psychology, 78, 582–593.

Ployhart, R. & Schneider, B. (2005). Multilevel selection and prediction: Theories, methods, and models. In A. Evers, O. Smit-Voskuyl & N. Anderson (Eds.) Handbook of personnel selection (pp. 495–516). London: Wiley.

Price, D. (1963). Little science, big science. New York: Columbia University Press.

Rosse, J., Stecher, M., Miller, J. & Levin, R. (1998). The impact of response distortion on preemployment personality testing and hiring decisions. Journal of Applied Psychology, 83, 634–644.

Ryan, A., McFarland, L., Baron, H. & Page, R. (1999). An international look at selection practices. Personnel Psychology, 52, 359–362.

Schmidt, F.L. & Hunter, J.E. (1998). The validity and utility of selection methods in personnel psychology. Psychological Bulletin, 124, 262–274.

Schmidt, F.L. & Hunter, J.E. (2004). General mental ability in the world of work. Journal of Personality and Social Psychology, 86, 162–173.

Terpstra, D. & Rozell, E. (1997). Why some potentially effective staffing practices are seldom used. Public Personnel Management, 26, 483–495.

Tett, R. & Burnett, D. (2003). A personality trait-based interactionist model of job performance. Journal of Applied Psychology, 88, 500–517.