Personality characteristics of medical students are correlated with academic achievement and clinical skill, including preclinical and clinical grades and grade point average, National Board of Medical Examiners’ examinations, and clinical skills examinations (1–7). An additional source of evaluation of students completing third-year clinical clerkships is the clinical evaluation. Typically, this evaluation is completed by an attending physician who rates the student on a standardized form in a variety of areas, including general knowledge of the discipline, history and examination skills, professionalism, team and patient rapport, and the like. Despite wide variability in clinical evaluation methods and evidence for poor validity of these evaluations in relation to more objective indicators of student performance (8, 9), clinical evaluations are often significant contributors to the final grade received for a clerkship. Since clinical evaluations are primarily subjective, global impressions of students gleaned during relatively brief interactions on a clinical service, the extent to which student personality characteristics color these evaluations is an important issue, especially in light of the tenuous validity of these evaluations with respect to other performance assessments.
At a minimum, clinical evaluations of student knowledge and clinical behavior should correlate as strongly (or more strongly) with independent indicators of performance in those areas as they do with student personality characteristics. For example, in multivariate analyses of personality-performance relationships among medical students, personality characteristics, particularly aspects of conscientiousness, were significantly predictive of indicators like grades or grade point averages (1, 2, 4, 7). These indicators, however, had comparable levels of association or, more often, stronger associations with variables like MCAT score, previous academic performance, and ratings from admissions interviews. With respect to clinical evaluations of student performance, where relationships to other performance indicators are generally weak, it is conceivable that the largest proportion of variability in these evaluations is, in fact, explained by the demeanor of the student, as expressed through “positive” or “negative” personality traits.
Recently, Davis and Banken (10) examined this question in an obstetrics-gynecology clerkship. They studied whether personality characteristics of medical students had more predictive validity than the National Board of Medical Examiners’ subject examination with regard to end-of-clerkship clinical evaluations. They found that the Extraversion and Introversion scales of the Myers-Briggs Type Indicator, but not the subject examination score, were significantly correlated with clinical evaluations (i.e., aggregated evaluation of medical knowledge, clinical performance, patient interaction, team interaction, and initiative/work ethic). Davis and Banken concluded that obstetrics-gynecology clerkship grades, often heavily weighted by clinical evaluations, “may be more influenced by personality rather than clinical skill.”
The present study sought to extend the reasoning behind the Davis and Banken study to clinical evaluations in psychiatry. Moreover, the Davis and Banken study was limited by a small sample size (63 students), relatively few personality dimensions, the use of univariate statistical analyses, and the absence of an independent measure of clinical/interpersonal skills. The current study addressed each of these limitations. Personality characteristics of third-year psychiatry clerks were assessed using the Revised NEO Personality Inventory (NEO PI-R), a comprehensive measure of normal adult personality based on the five-factor model of personality, a model that enjoys widespread acceptance among personality theorists in psychology (11, 12). Using both univariate and multivariate statistical analyses, clinical evaluations of knowledge/skill and interpersonal behavior were examined in relation to personality variables, National Board of Medical Examiners’ subject examination performance, and clinical performance as assessed on an Objective Structured Clinical Examination. It was hypothesized that student personality characteristics, particularly those related to extraversion, would explain significantly more variance in clinical evaluations than the National Board of Medical Examiners’ subject examination (general knowledge) or the Objective Structured Clinical Examination (clinical skills).
Study Sample, Setting, and Recruitment
Third-year medical students completing the clinical clerkship in psychiatry during the 2005–2006 academic year at St. Louis University were recruited to participate. At clerkship orientation, students were informed of the opportunity to participate in this research and were provided with a sign-up sheet that detailed the study purpose and the times, dates, and places for participation.
The Revised NEO Personality Inventory (NEO PI-R) was used to assess personality (11). Consistent with the five-factor model of personality, the NEO PI-R measures five domains of normal adult personality (it does not measure psychopathology) and six personality traits per domain (30 traits total). The domains include neuroticism (N) (higher scores=stronger tendency to experience negative affect), with traits of anxiety, angry hostility, depression, self-consciousness, impulsiveness, and vulnerability; extraversion (E) (higher scores=stronger tendency toward sociability and affability), with traits of warmth, gregariousness, assertiveness, activity, excitement-seeking, and positive emotions; openness (O) (higher scores=stronger tendency toward divergent thinking, creativity, emotionality, and unconventionality), with traits of fantasy, aesthetics, feelings, actions, ideas, and values; agreeableness (A) (higher scores=stronger tendency toward trust, altruism, empathy, and cooperativeness), with traits of trust, straightforwardness, altruism, compliance, modesty, and tender-mindedness; and conscientiousness (C) (higher scores=stronger tendency toward purposefulness, reliability, goal striving, and self-discipline), with traits of competence, order, dutifulness, achievement striving, self-discipline, and deliberation. Scores are norm-referenced, gender-based T scores (mean=50, SD=10). As detailed in the NEO PI-R Professional Manual (11), there is substantial conceptual and psychometric research to support the NEO PI-R as a valid, reliable, and comprehensive measure of normal adult personality.
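The T-score scaling described above (mean=50, SD=10 in the reference group) can be illustrated with a minimal sketch. The function name and the norm values below are hypothetical and chosen only for the arithmetic; they are not the published NEO PI-R norms, which are given in the Professional Manual (11).

```python
def to_t_score(raw, norm_mean, norm_sd):
    """Convert a raw scale score to a T score (mean 50, SD 10 in the norm group)."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


# Hypothetical norm values for one scale and one gender group (NOT actual NEO PI-R norms).
example_t = to_t_score(raw=130.0, norm_mean=120.0, norm_sd=20.0)
```

Under these assumed norms, a raw score half a standard deviation above the norm-group mean maps to a T score 5 points above 50.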
The student’s national percentile rank (adjusted for academic year quartile) on the National Board of Medical Examiners’ subject examination was used as an indicator of psychiatry knowledge. A psychiatry Objective Structured Clinical Examination was used to assess clinical skills (13). It included six “stations,” at each of which all students interviewed a standardized patient portraying a psychiatric disorder. Binary checklists reflected student coverage of content regarding history of present illness (0–30 points), physical examination (0–15 points), and communication of information (0–15 points). In addition, standardized patients evaluated the student’s interpersonal behavior (e.g., warmth, respect, empathy, listening skills, openness) using the Patient Perception Questionnaire (PPQ; 0–30 points) (14). Following each station, students also wrote a clinical note, graded on a pass/fail basis (0–25 points). The sum of the Objective Structured Clinical Examination scores (0–115) was also calculated.
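As a check on the scoring arithmetic, the five component ranges listed above sum to the stated 0–115 total. The sketch below assumes each component score has already been aggregated across the six stations; the dictionary keys and the function name are illustrative labels, not names used by the study.

```python
# Maximum points per component, taken from the ranges listed in the text.
MAX_POINTS = {
    "history": 30,
    "physical_exam": 15,
    "communication": 15,
    "ppq": 30,
    "clinical_note": 25,
}


def osce_total(scores):
    """Sum the component scores after checking that each lies within its allowed range."""
    for name, maximum in MAX_POINTS.items():
        value = scores[name]
        if not 0 <= value <= maximum:
            raise ValueError(f"{name} must be between 0 and {maximum}, got {value}")
    return sum(scores[name] for name in MAX_POINTS)


# Hypothetical student scores, for illustration only.
example_total = osce_total(
    {"history": 24, "physical_exam": 10, "communication": 12, "ppq": 25, "clinical_note": 20}
)
```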
At the conclusion of the psychiatry clerkship, each student received a clinical evaluation from an attending physician. The standardized evaluation form addresses 12 areas that are rated on a 5-point Likert-type scale (1=unacceptable; 2=weak; 3=satisfactory; 4=very good; 5=excellent): general knowledge of psychiatry, history taking, physical examination, mental status examination, discrimination/focus of clinical data, communication of clinical data to colleagues, differential diagnosis, diagnostic and therapeutic planning, professionalism in patient care, motivation to learn, patient rapport, and health care team rapport.
The Revised NEO Personality Inventory data were collected in weeks two and five of each of the eight 6-week psychiatry clerkship rotations in the 2005–2006 academic year. Students voluntarily presented to a large classroom in the Department of Psychiatry on one of the scheduled days, with 6–12 students per session. After the study had been fully described to the students, written informed consent was obtained from those who were willing to participate. Most students completed the NEO PI-R within 1 hour. Answer sheets were electronically scored. After clerkship grades were submitted to the School of Medicine, students received a summary profile of their NEO PI-R results, with the option of discussing their results with the investigators. Clerkship directors and faculty had no access to student NEO PI-R results.
All analyses were conducted using SPSS 13.0 for Windows (SPSS, Inc., Chicago, 2004). The chi-square test of association was used to compare study participants with nonparticipants regarding gender and clerkship quartile of the academic year (1=July–September, 2=October–December, 3=January–March, 4=April–June).
Initially, the 12 clinical evaluation items were subjected to a principal components analysis in order to reduce the items to fewer, internally consistent composites. Components (factors) with eigenvalues in excess of 1.0 were retained and rotated using Varimax rotation. Factor loadings ≥ |0.30| were interpreted as significant. Factor scores were calculated from the rotated factors using the standard regression approach, yielding uncorrelated factor scores (mean=0, SD=1). Internal consistency reliability of the resulting clinical evaluation scores was estimated using Cronbach coefficient alpha.
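The item-reduction steps described above can be sketched as follows. This is an illustrative NumPy reconstruction on synthetic ratings, not the SPSS procedure the study used; the function names, the simulated two-factor data, and the rule of assigning each item to the factor with its largest absolute loading are assumptions made for the example.

```python
import numpy as np


def retained_loadings(items):
    """Principal components of the item correlation matrix, keeping eigenvalues > 1.0."""
    corr = np.corrcoef(items, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    return corr, eigvecs[:, keep] * np.sqrt(eigvals[keep])


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation using the common SVD-based iteration."""
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated * (rotated**2).sum(axis=0) / n_items
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() < previous * (1 + tol):       # criterion stopped improving
            break
        previous = s.sum()
    return loadings @ rotation


def cronbach_alpha(items):
    """Cronbach coefficient alpha for a respondents-by-items array."""
    n_items = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return n_items / (n_items - 1) * (1 - item_variances / total_variance)


# Synthetic ratings (NOT study data): 133 students, 12 items driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(133, 2))
weights = np.zeros((12, 2))
weights[:8, 0] = 0.8
weights[8:, 1] = 0.8
ratings = latent @ weights.T + 0.6 * rng.normal(size=(133, 12))

corr, loadings = retained_loadings(ratings)
rotated = varimax(loadings)

# Regression-method factor scores: standardized items times (R^-1 x rotated loadings).
z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0, ddof=1)
factor_scores = z @ np.linalg.solve(corr, rotated)

# Assumed assignment rule: each item belongs to the factor with its largest absolute loading.
assigned = np.abs(rotated).argmax(axis=1)
alphas = [cronbach_alpha(ratings[:, assigned == f]) for f in range(rotated.shape[1])]
```

Because the rotation is orthogonal and the factors come from a principal components solution, the regression-method scores in this sketch are uncorrelated with unit variance, which is the property the text relies on.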
Pearson correlations were used to examine the association of demographic variables (age, gender, and clerkship quartile) and National Board of Medical Examiners, Objective Structured Clinical Examination, and NEO PI-R scores with the clinical evaluation scores (derived from the principal components analysis). If multiple demographic variables, NEO PI-R domain scores, and/or NEO PI-R trait scores were correlated with the clinical evaluation scores, these variables were included in a canonical correlation analysis. Canonical correlation is a multiple regression approach that allows for more than one criterion (outcome) variable (i.e., a set of ≥2 predictors is used to predict a set of ≥2 criterion variables) (15). Initially, predictor set variables and criterion set variables are weighted and linear composites (canonical variates) are derived, such that the correlation of the variates is maximized (the canonical correlation). Iteratively, successive canonical correlations (and variates) are derived from residual variance (variance not explained by previous canonical correlations). The number of canonical correlations extracted is equal to the number of variables in the smaller of the two sets (predictor or criterion). Once canonical variates are extracted, canonical loadings (or structure correlations) are calculated for each variable in the predictor and criterion sets. Loadings represent the linear correlation of a given variable with its respective variate and are, therefore, an indicator of the influence of a given variable on the correlations obtained. Loadings ≥ |0.30| were interpreted as significant. The redundancy index is the final indicator of relevance in canonical correlation analysis. This index, equivalent to R² in multiple regression analysis, represents the percentage of variance in the criterion set variables (evaluated one at a time and averaged) that is explained by the canonical variates for the predictor set.
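The quantities described above (canonical correlations, canonical loadings, and the redundancy index) can be sketched with NumPy. This is an illustrative computation on synthetic data using a QR/SVD formulation, not the SPSS routine used in the study; the function names and the simulated predictor and criterion sets are assumptions for the example.

```python
import numpy as np


def column_correlations(a, b):
    """Pearson correlations between every column of a and every column of b."""
    a = (a - a.mean(axis=0)) / a.std(axis=0)
    b = (b - b.mean(axis=0)) / b.std(axis=0)
    return a.T @ b / len(a)


def canonical_correlation(predictors, criteria):
    """Return canonical correlations and the paired canonical variates."""
    xc = predictors - predictors.mean(axis=0)
    yc = criteria - criteria.mean(axis=0)
    qx, _ = np.linalg.qr(xc)                     # orthonormal basis for the predictor set
    qy, _ = np.linalg.qr(yc)                     # orthonormal basis for the criterion set
    u, s, vt = np.linalg.svd(qx.T @ qy)
    k = min(predictors.shape[1], criteria.shape[1])   # size of the smaller set
    return np.clip(s[:k], 0.0, 1.0), qx @ u[:, :k], qy @ vt.T[:, :k]


# Synthetic data (NOT study data): 10 predictor variables, 2 criterion variables.
rng = np.random.default_rng(1)
traits = rng.normal(size=(133, 10))
evaluations = 0.3 * traits[:, :2] + rng.normal(size=(133, 2))

corrs, x_variates, y_variates = canonical_correlation(traits, evaluations)

# Canonical loadings: correlation of each variable with its own set's variates.
predictor_loadings = column_correlations(traits, x_variates)
criterion_loadings = column_correlations(evaluations, y_variates)

# Redundancy: squared correlation of each criterion variable with each predictor
# variate, averaged over criterion variables, then summed over variates.
cross_loadings = column_correlations(evaluations, x_variates)
redundancy = (cross_loadings**2).mean(axis=0)
total_redundancy = redundancy.sum()
```

When all canonical variates are extracted, the total redundancy in this sketch equals the average R² obtained by regressing each criterion variable on the full predictor set, which is the sense in which the index parallels R² in multiple regression.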
Of the 150 students who completed the psychiatry clerkship, 133 (88.7%) participated in the study. Seventy-eight of 133 (58.6%) were men and 55 of 133 (41.4%) were women, with a mean age of 25.5 years (SD=1.7). Regarding gender, the 17 nonparticipants (58.8% men) did not differ significantly from participants, χ²=0.0, df=1, p=0.99. Age data were unavailable for nonparticipants. Participation rate by clerkship quartile was 40 of 42 (95.2%) for quartile one, 31 of 36 (86.1%) for quartile two, 31 of 35 (88.6%) for quartile three, and 31 of 37 (83.8%) for quartile four, a nonsignificant comparison (χ²=2.9, df=3, p=0.41).
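The two chi-square comparisons above can be approximately reproduced from the reported counts. The sketch below uses only the Python standard library and an uncorrected Pearson chi-square; the count of 10 men among the 17 nonparticipants is inferred from the reported 58.8% and is therefore an assumption, as are the function names. Whether SPSS applied any correction is not stated in the text, so small differences from the reported p values are possible.

```python
import math


def chi_square_association(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


def chi_square_p_value(statistic, df):
    """Upper-tail p value; closed forms are used for df=1 and df=3 only."""
    root = math.sqrt(statistic / 2)
    if df == 1:
        return math.erfc(root)
    if df == 3:
        return math.erfc(root) + math.sqrt(2 * statistic / math.pi) * math.exp(-statistic / 2)
    raise ValueError("this sketch handles only df=1 and df=3")


# Rows: participants, nonparticipants. Columns: clerkship quartiles one to four.
quartile_table = [[40, 31, 31, 31], [2, 5, 4, 6]]
# Rows: participants, nonparticipants. Columns: men, women (10 of 17 inferred from 58.8%).
gender_table = [[78, 55], [10, 7]]

quartile_stat, quartile_df = chi_square_association(quartile_table)
quartile_p = chi_square_p_value(quartile_stat, quartile_df)
gender_stat, gender_df = chi_square_association(gender_table)
gender_p = chi_square_p_value(gender_stat, gender_df)
```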
Initial Analysis of Clinical Evaluation Items
Ratings for the 12 clinical evaluation items were analyzed using principal components analysis with Varimax rotation. The items reduced to two factors that explained 65.4% of the variance. Factor 1, labeled “knowledge and skill,” accounted for 39.0% of the variance and included differential diagnosis (factor loading = 0.81), diagnostic/therapeutic planning (0.80), general knowledge (0.80), discrimination/focus of clinical data (0.78), history taking (0.73), mental status examination (0.67), communication of clinical data to colleagues (0.66), and physical examination (0.64). Internal consistency reliability of this factor was 0.93. Factor 2, labeled “interpersonal behavior,” accounted for 26.4% of the variance and included professionalism in patient care (0.86), health care team rapport (0.83), patient rapport (0.73), and motivation to learn (0.68). Internal consistency reliability of this factor was 0.84. No items cross-loaded on the two factors. Using the regression method, factor scores were generated for the two factors described above.
Associations with Clinical Evaluations
As displayed in Table 1, clinical evaluations of “knowledge and skill” and “interpersonal behavior” were not significantly correlated with demographic variables, National Board of Medical Examiners’ score, or Objective Structured Clinical Examination scores. Of the NEO PI-R domain scores, conscientiousness (C) was significantly correlated with “knowledge and skill” and agreeableness (A) was correlated with “interpersonal behavior.” These variables were omitted from further, multivariate analyses.
Among the 30 NEO PI-R trait scores, warmth (E; the letter denotes the domain, here Extraversion, to which the trait belongs) was significantly correlated (p<0.05) with both clinical evaluation scores. Competence (C) and achievement striving (C) were correlated with “knowledge and skill.” Angry hostility (N), gregariousness (E), positive emotions (E), trust (A), altruism (A), compliance (A), and tender-mindedness (A) were significantly associated with “interpersonal behavior.” Based on these correlations, a canonical correlation analysis was conducted with these 10 NEO PI-R trait scores in the predictor set and the two clinical evaluation scores in the criterion set. Two significant canonical correlations emerged (r=0.37, Wilks’ λ=0.75, p=0.018; r=0.36, Wilks’ λ=0.87, p=0.048). As shown in Table 2, loadings on the first variate indicated that higher warmth, competence, and achievement striving scores were associated with stronger clinical evaluations of “knowledge and skill.” For the second variate, higher warmth, gregariousness, positive emotions, trust, altruism, compliance, and tender-mindedness scores and lower angry hostility scores were associated with stronger clinical evaluations of “interpersonal behavior.” The two canonical variates for the predictor set explained approximately equal amounts of variance in the clinical evaluation scores, with 13% of the variance explained overall.
Consistent with recent research on the relationship of personality characteristics to clinical skills evaluations in obstetrics-gynecology (10), the present study of a psychiatry clerkship found significant relationships between student personality and clinical evaluations by attending physicians. Equally clearly, the results yielded no evidence to support the validity of clinical evaluations of the knowledge, skill, and behavior of medical students when evaluated against relatively more objective assessments of student performance like the National Board of Medical Examiners subject examination and a psychiatry Objective Structured Clinical Examination. It was particularly noteworthy that clinical evaluations of student “knowledge and skill” (including evaluations of students’ general psychiatry knowledge and skills in history taking/interviewing, physical examination, and diagnosis) were essentially uncorrelated with National Board of Medical Examiners performance and Objective Structured Clinical Examination scores for history, physical, and clinical notes. Moreover, clinical evaluations of “interpersonal behavior” (including ratings of student rapport, professionalism in patient care, and motivation) were also uncorrelated with the Objective Structured Clinical Examination Patient Perception Questionnaire score, a measure of students’ interpersonal behavior (e.g., warmth, respect, listening skills, empathy) in simulated clinical encounters.
These results are consistent with our hypothesis that personality factors would explain significantly more variance in clinical evaluations than the National Board of Medical Examiners or Objective Structured Clinical Examination. In fact, the moderate amount of variance that was explainable in clinical evaluations was solely attributable to student personality characteristics. The primary personality predictors of “knowledge and skill” evaluations were the domain of conscientiousness and the traits of warmth (E), competence (C), and achievement striving (C). People who score higher on these traits are notable for their friendliness and the ease with which they form close attachments in social encounters; their sense of capability and effectiveness in interaction with the world; and their lofty aspirations, diligence, and goal-directedness. By contrast, people who score lower on these traits may be described as socially formal and reserved, with a need for more interpersonal space; low in self-esteem, with a poor opinion of their abilities; and lackadaisical regarding goals and ambitions. Thus, clinical evaluations of student “knowledge and skill” in psychiatry were significantly higher for people-friendly, confident “go-getters,” regardless of National Board of Medical Examiners and Objective Structured Clinical Examination performance, the latter representing independent, standardized indicators of psychiatry knowledge and clinical skill, respectively.
Likewise, the domain of agreeableness and the traits of warmth (E), gregariousness (E), positive emotions (E), trust (A), altruism (A), compliance (A), tender-mindedness (A), and angry hostility (N) were associated with evaluations of “interpersonal behavior.” At high levels (low levels for angry hostility), these domains/traits describe people who are friendly and form attachments easily; prefer the company of others; are more likely to experience positive affect; trust that others are honest; are concerned for the welfare of others in need; are deferential and cooperative; are empathic; and are slow to anger. The contrasting profile describes people who are socially formal and reserved; loners; less likely to experience positive affect; cynical and skeptical; self-centered; competitive; unempathic; and quick to anger. Thus, clinical evaluations of student professionalism in patient care, team/patient rapport, and motivation were higher for students who may best be described as “good-natured and deferential,” regardless of Objective Structured Clinical Examination performance (including Objective Structured Clinical Examination Patient Perception Questionnaire and communication scores).
It may be argued that stronger clinical evaluations of “interpersonal behavior” ought to be related to positive personality characteristics of students, particularly those related to empathy, interpersonal warmth, and positivity. Yet, these same evaluations were unrelated to students’ abilities to demonstrate empathy, respect, listening skills, and warmth on an Objective Structured Clinical Examination. This pattern suggests that a “halo effect” (i.e., the tendency to judge someone favorably on specific behaviors, based on a favorable general impression) may have confounded attending evaluations of student behavior in the clinic. In other words, attending physicians may have been positively disposed toward “good-natured and deferential” students, and this disposition may have generalized to their perceptions of student professionalism and rapport during the clerkship. An alternative explanation is that the Objective Structured Clinical Examination is not a particularly valid representation of student behavior with actual patients in a nonsimulated, nonevaluative context, an argument that has been explored in the literature (16). The possibility must at least be considered that the relationship between student personality characteristics and clinical evaluations of interpersonal behavior might be mediated by genuinely good clinical performance with real patients in an actual clinic.
The present results raise serious questions about the validity of clinical evaluations of medical students completing a psychiatry clerkship. Granted, the study is limited in generalizability by virtue of its inclusion of one class of medical students from one school. Nevertheless, in combination with the findings of Davis and Banken (10), and given the fact that clinical evaluations can strongly influence clerkship grades (which in turn can strongly influence residency application transcripts and letters of recommendation) (8–10), the present results should stimulate further research on the validity and utility of clinical evaluations. For example, multiple inputs to the clinical evaluation of students—from attendings, residents, nurses, peers, etc.—may enhance the validity of clinical evaluations. In general, others are more likely to be positively disposed toward people with good-natured personalities, including patients toward their physicians and attendings toward their students. Yet, it seems equally important that clinical evaluators be able to separate, to a considerable degree, the relatively fixed personality tendencies of the student from the acquired knowledge and skills of the student. Otherwise, clinical evaluations may favor students with positive personality traits, regardless of performance.