Describe and evaluate contemporary use of personality measurement and testing, focusing on issues of reliability and validity, using empirical evidence to support your arguments.
Personality tests are widely used in both professional and informal settings. A person may take a personality test online, for example to determine how much they resemble a film character, or they may take one as part of an employment process or in a clinical setting. This essay, however, will only be looking at formal tests. These tests have many uses, including recognising psychological disorders and predicting future behaviour (Plotnik, 2002). It is important to note that personality tests are like any other instrument used to increase understanding of a topic and, like all instruments and methods, their use can produce both beneficial and undesirable results (Anastasi & Urbina, 1997).
We will look at both projective and self-inventory tests and compare their methods in terms of their validity and reliability. There are many different personality tests available today but we are only concerned with the Rorschach Ink Blot test (henceforth referred to as the Rorschach), the Thematic Apperception Test (TAT) and the Minnesota Multiphasic Personality Inventory – 2 (MMPI-2). Before the discussion of validity and reliability of the tests it is essential to comprehend just what personality is and to obtain an elementary outline of the three tests being discussed.
Personality is interpreted as a mixture of a person’s consistent behaviour, emotion and thought that characterises the way an individual responds to another individual or situation (Letzring, Wells & Funder, 2006). The specific reaction an individual presents is unique and affects their daily life in how they organise events, control emotions and make decisions. Eysenck (2004) outlines personality more thoroughly, discussing it in terms of its stability and regularity throughout life, its internalisation (it comes from within and is affected and changed by the individual), and its individuality, as it differs between individuals in the same situations. The use of personality tests by psychologists, employers and doctors to analyse a character and draw conclusions can be very useful, although what makes an accurate test of personality is a controversial topic, which will become evident as the essay progresses.
Projective tests are one type of personality test and consist of ambiguous words or images that a participant must interpret. They derive from the psychoanalytic approach to psychology, which suggests that personality is governed by unconscious urges and thoughts. The ambiguous stimuli cause the participant to project internal needs and wants onto what is presented (Passer & Smith, 2008). Projective tests are believed to reach the deepest levels of personality structure, therefore offering an impression of the individual as a whole and not simply of one characteristic of personality (Rose, Kaser-Boyd & Maloney, 2001).
The Rorschach Ink Blot test is one of the oldest projective tests, developed in 1921 by Hermann Rorschach. It consists of 10 ambiguous ink blots that a participant interprets as images, and these interpretations are analysed to create an impression of the individual’s personality (Weiner, 1998). The Rorschach is a very controversial test because of the numerous coding schemes available for it and the subjectivity of coding between different psychologists (Blatt, 2000; Aronow, Reznikoff, & Moreland, 1995; Lilienfeld, Wood & Garb, 2000 as cited in Kaplan & Saccuzzo, 2005; Rose, Kaser-Boyd & Maloney, 2001).
The Thematic Apperception Test is another example of a projective test. It consists of 30 ambiguous scenes about which a participant has to create a story. The participant interprets what is happening, the emotions of the characters and the ending that will follow the scene. The psychologist then codes the test based on the desires, motivations and worries of the characters, including how the story ended (Kaplan & Saccuzzo, 2005). The TAT is used to understand the personalities of both normal and mentally ill patients and is therefore often used in a clinical setting (Plotnik, 2002).
An alternative way of measuring personality is by using self-report inventories. These are normally in a paper and pencil format, although computer-based versions are becoming increasingly common. They consist of a series of statements or questions that a participant reads before deciding how much each one relates to their personality. The majority of people will probably have completed a self-report inventory in their lifetime, as they are commonly found in hospitals, market research surveys and online personality tests. These tests are often used to examine past, present and hypothetical behaviours and are therefore useful in predicting behaviour in certain situations.
Examples of self-report inventories for testing personality include the 16 Personality Factor Questionnaire (Cattell, Eber & Tatsuoka, 1988), which was developed to evaluate people based on Cattell’s trait theory of personality, the California Psychological Inventory, which was designed by Gough (1956) to measure characteristics such as self-control and independence, and the MMPI-2. The MMPI-2, a revision of the original MMPI developed in the 1940s, is a set of 567 statements that the participant rates as either true or false. The statements relate to relationships, health and irregular behaviour, as well as religious, social and sexual attitudes (Passer & Smith, 2008).
Research has found the MMPI-2 to be high in reliability (Tarescavage, Wygant, Boutacoff & Ben-Porath, 2013). However, its test-retest reliability could be reduced, as people may change their responses to the questionnaire depending on their mood that day or their current situation. Test-retest reliability indicates the consistency of results when participants are retested using the same or an equivalent test at a different time (Passer & Smith, 2008). The MMPI-2 may also be affected by social desirability bias, due to demand characteristics being present in the questions or participants wanting to present themselves positively. This would reduce the internal validity. Internal validity refers to how far the results reflect the intended cause and effect relationship rather than the influence of confounding variables.
However, the MMPI-2 is shown to be high in internal validity owing to its validity scales: the lie scale and the F scale, which show whether a person is faking good or bad, and the ‘cannot say’ scale, under which the result is invalid if a participant answers ‘cannot say’ to 30 or more questions. These allow researchers to determine whether a participant was responding truthfully and therefore whether their result will be valid. The MMPI-2 could be said to have low external validity because research on it has used college students as participants. External validity is the extent to which research can be generalised to everyday situations and to its target population. However, research has shown that these samples have not affected results and therefore do not undermine the external validity (McCray, Bailly & King, 2005).
The Rorschach is a much more controversial test in terms of its reliability. It can be seen to be lacking in inter-rater reliability. Inter-rater reliability refers to the consistency with which researchers assess participants’ responses; it is high when different researchers reach similar results. The Rorschach’s inter-rater reliability is low because the analysis of a participant’s answers depends heavily on how the individual researcher interprets them. Reliability is further reduced by the existence of numerous coding schemes, each giving a different analysis of the results. Test-retest reliability may also be low, as participants may interpret an ink blot differently depending on their state on that day (Groth-Marnat, 2009).
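One common way to put a number on inter-rater reliability is Cohen's kappa, which compares the agreement two coders actually reach with the agreement expected by chance. The sketch below is illustrative only: the essay's sources do not specify this statistic, and the response categories and codes are hypothetical, not real Rorschach data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one category per response."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of responses on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes given by two psychologists to the same eight responses.
coder_1 = ["human", "animal", "animal", "object", "human", "animal", "object", "human"]
coder_2 = ["human", "animal", "object", "object", "human", "human", "object", "animal"]

kappa = cohens_kappa(coder_1, coder_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

A kappa of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance, so subjective coding of the kind described above would be expected to pull the value down.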
In terms of validity, the Rorschach can be seen to lack face validity because many psychologists have disputed the strength of the link between interpreting ink blots and personality type. Face validity refers to how far a test appears, at face value, to measure what it aims to measure. For the same reason it can also be seen to lack convergent validity, which refers to whether constructs that are believed to be related are actually related. Content validity can also be seen to be lacking, as the test was not originally intended for measuring personality but rather to produce a profile of schizophrenia. It is therefore difficult to see it as a measure that represents all elements of a construct it was not originally designed for. External validity can also be seen to be lacking, as the task of interpreting ink blots is not relatable to real-life situations.
The TAT is an alternative projective test of personality that has various reliability and validity issues. Its reliability can be seen to be generally low. Internal consistency is low for the TAT due to the scoring systems. Internal consistency refers to how strongly the items of a test relate to one another. However, every card in the TAT depicts a different situation and, therefore, should not be expected to produce similar responses (Cramer, 1999). Inter-rater reliability and test-retest reliability are also lacking due to their high variability across scoring methods (Murstein, 1963). Conversely, it could be argued that low test-retest reliability does not undermine the test, as it measures internal conditions.
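Internal consistency is often summarised with Cronbach's alpha, which rises as the items of a test vary together. The following is a minimal sketch under stated assumptions: the essay's sources do not name this statistic, and the per-card scores are hypothetical values invented for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha.

    item_scores is a list of items (for example, cards), each holding one
    score per participant, with participants in the same order throughout.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    if len({len(item) for item in item_scores}) != 1:
        raise ValueError("every item needs a score from every participant")
    # Total score per participant across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores on one coded theme for five participants on three cards.
card_scores = [
    [2, 4, 3, 5, 1],
    [3, 4, 2, 5, 2],
    [1, 5, 3, 4, 2],
]

alpha = cronbach_alpha(card_scores)
print(f"Cronbach's alpha: {alpha:.2f}")
```

If each card is meant to tap a different situation, as Cramer (1999) argues, scores on different cards need not vary together, so a low alpha would not necessarily count against the test.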
The internal validity is also extremely low, as researchers have been shown to code and analyse participants’ answers with accuracy just above chance, and when the TAT was paired with the MMPI it actually reduced the accuracy of results (Wildman & Wildman, 1975). This means that the TAT lacks test validity, as very little meaning can be placed upon results that may be due to chance. It could also be seen to lack face validity, as there may seem to be no causal relationship between how a person interprets pictures and the type of personality they have. The external validity is also low because the task is unrealistic: it is not representative of a real-life situation and is therefore difficult to generalise.
Personality tests continue to be examined frequently and, although new tools are becoming available to address clinical issues, traditional methods continue to stimulate research and remain widely used. The MMPI-2 remains the most widely used method in clinical and research settings, although the TAT and Rorschach are also used in the clinical setting, contrary to expectations (Butcher & Rouse, 1996).
The evidence suggests that conflicting views will always be present when considering personality tests and their accuracy in the study of participants’ personalities. High reliability and validity emerge as the most vital features when determining a test’s ability to measure personality; however, there appears to be considerable debate over how well the three tests discussed here achieve them. Overall, it can be seen that all of the personality tests discussed lack external validity. However, the MMPI-2 had the highest face validity and the Rorschach had the lowest content validity. In terms of reliability, the TAT and the Rorschach both lack inter-rater reliability and all three may lack test-retest reliability.
This tells us that personality tests in general struggle to achieve high external validity. Future research should focus on correcting this by observing participants in a situation and seeing how different participants respond. The participants could also complete the MMPI-2, TAT or Rorschach as part of the study, and the results could then be compared with the coded behaviour from the observation. The points discussed also tell us that it is hard to avoid demand characteristics in personality tests. Future research could conduct more naturalistic observations so that people would believe the situation to be real. This again could be combined with a personality test to see whether the way personality is coded links to the behaviour displayed.
-Anastasi, A. & Urbina, S. (1997). Nature and use of psychological tests. _Psychological testing_ (7th ed., pp. 2-31).
-Butcher, J. N. & Rouse, S. V. (1996). Personality: Individual differences and clinical assessment. _Annual Review of Psychology, 47,_ 87-111.
-Cattell, R. B., Eber, H. W., & Tatsuoka, M. M. (1988). _Handbook for the sixteen personality factor questionnaire (16 PF)_. Champaign, Illinois: Institute for Personality and Ability Testing.
-Cramer, P. (1999). Future directions for the Thematic Apperception Test. _Journal of Personality Assessment, 72,_ 74-92.
-Eysenck, M. (2004). Personality. In _Psychology: An international perspective_ (pp. 445-481). Hove, UK: Psychology Press.
-Gough, H. G. (1956). _California Psychological Inventory_. Palo Alto, CA: Consulting Psychologists Press.
-Groth-Marnat, G. (2009). _Handbook of psychological assessment_ (5th ed.). Hoboken, NJ: John Wiley & Sons, Inc.
-Kaplan, R. M. & Saccuzzo, D. P. (2005). Projective personality tests. In _Psychological testing: Principles, applications, and issues_ (6th ed., pp. 390-420). Belmont, CA: Wadsworth Thomson Learning.
-Letzring, T., Wells, S., & Funder, D. (2006). Information quantity and quality affect the realistic accuracy of personality judgment. _Journal of Personality and Social Psychology, 91,_ 111-123.
-McCray, J., Bailly, M. & King, A. (2005). The external validity of MMPI-2 research conducted using college samples disproportionately represented by psychology majors. _Personality and Individual Differences, 38_(5), 1097-1105.
-Murstein, B. I. (1963). _Theory and Research in Projective Techniques (Emphasizing the TAT)._ New York, NY: John Wiley & Sons
-Passer, M. W., & Smith, R. E. (2008). _Psychology: The science of mind and behavior_ (4th ed.). New York, USA: McGraw-Hill.
-Plotnik, R. (2002). _Introduction to Psychology_ (6th ed.). California, USA: Wadsworth-Thomson Learning.
-Rose, T., Kaser-Boyd, N. & Maloney, M. P. (2001). _Essentials of Rorschach Assessment._ Canada: John Wiley & Sons Inc.
-Tarescavage, A. M., Wygant, D. B., Boutacoff, L. I. & Ben-Porath, Y. S. (2013). Reliability, validity, and utility of the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) in assessments of bariatric surgery candidates. _Psychological Assessment, 25_(4), 1179-1194.
-Weiner, I. B. (Ed.). (1998). _Principles of Rorschach Interpretation._ New Jersey, US: Lawrence Erlbaum Associates Inc.
-Wildman, R. W., & Wildman, R. W. II. (1975). An investigation into the comparative validity of several diagnostic tests and test batteries. _Journal of Clinical Psychology, 31,_ 455-458.