Brief Report
Using Standardized Patients’ Marks in Scoring Postgraduate Psychiatry OSCEs
Paul Whelan, M.D., M.Sc., M.R.C.Psych; Laurence Church, M.D., M.Sc., M.R.C.Psych; Khaled Kadry, M.D., M.R.C.Psych
Academic Psychiatry 2009;33:319-322. 04090108

Received August 20, 2007; revised November 19, 2007, and March 16, 2008; accepted April 23, 2008. The authors are affiliated with South London and Maudsley NHS Foundation Trust in London. Address correspondence to Paul James Whelan, South London and Maudsley NHS Trust, Old Age Psychiatry, Guy’s Hospital, Weston St, London, SE1 3RR, United Kingdom; paul.whelan@nhs.net (e-mail).

Copyright © 2009 Academic Psychiatry

Abstract

Objective: Standardized patients (SPs) do not contribute scores in postgraduate psychiatry objective structured clinical examinations (OSCEs) in the United Kingdom. However, this may change in the near future. The primary aim of this study was to measure the degree of agreement between scores given by examiners and those given by SPs in an OSCE. Methods: The authors measured the degree of agreement in two consecutive postgraduate OSCEs for psychiatric residents on a London training scheme. Results: Fifty-five candidates participated in the two OSCEs. There was a moderate degree of agreement between examiner and SP scores for communication and for the overall mark. However, there was a stronger relationship between the examiner score for communication and the candidates’ overall mark. Conclusion: Examiners and SPs scored candidates differently. Therefore, the decision to include SP scores in the marking scheme for postgraduate OSCEs would be a significant development.

The Objective Structured Clinical Examination (OSCE) is a well-established examination technique used at all levels of medical education. Originally introduced to reduce some of the problems associated with traditional clinical examinations, OSCEs have been shown to have higher reliability and validity than less structured oral examinations (1, 2). However, educators in psychiatry were slow to adopt the OSCE despite this format having been shown to have acceptable validity in assessing psychiatry trainees (3).

Membership of the Royal College of Psychiatrists is mandatory for psychiatric residents in the United Kingdom who wish to progress in their training and become consultants (attendings). Membership is contingent upon passing both the written and clinical parts of an examination set by the College, which psychiatry trainees take in stages during their first three years of training (equivalent to years 1–3 of a U.S. residency program). Candidates who pass a series of three multiple-choice examinations proceed to take an OSCE in the third year of the program. An OSCE format was introduced in the spring of 2003 (4) as one of two types of clinical examination administered by the College at that time (the other was a traditional “long case” with a real patient). In 2008, the “long case” was discarded and the OSCE was adapted slightly to test a higher level of complexity, given that it is now the only clinical examination taken by U.K. residents before they complete their basic psychiatry training [see Whelan et al. (5) for further details of the examination format].

In the Royal College of Psychiatrists OSCE, standardized patients (SPs) are used in lieu of real patients. In some U.K. universities, SP scores contribute up to 20% of the candidates’ overall marks in undergraduate examinations. However, due to concerns about poor reliability and validity, SP marks are not used in postgraduate-level examinations in the U.K. Within the Royal College of Psychiatrists, there is currently a movement toward SP marks for communication contributing to the overall candidate score. If this is the case, it has implications for the validity of the marking scheme. Although SPs may measure different components of candidates’ abilities in an OSCE than examiners, there needs to be a reasonable degree of agreement between the two to render the former valid. The primary aim of this study is to measure the degree of agreement between scores given by examiners and SPs in a psychiatry OSCE.

Few studies have looked at this particular area, especially in relation to psychiatry OSCEs taken at the postgraduate level. A patient perception scale, similar to the Royal College of Psychiatrists OSCE measure of communication, was used by McLay et al. (6) in a study of a single OSCE taken during a medical student psychiatry clerkship. This showed no significant correlation between scores given by SPs and those given by independent reviewers who rated videotapes of the interviews. However, the SP marks did correlate with the students’ scores in other examination formats (e.g., written papers and ward grades), suggesting some validity. More recently, SP assessments of a broad range of clinical skills in a fourth-year medical student OSCE were studied for evidence of a variety of common “rater errors” (7). The SPs rated with a significant degree of error in severity (i.e., too severe or too lenient), raising the danger of poor reliability of their scores. There were also errors of inconsistency, causing random variability in marking. The authors suggested that the SPs’ dual role of portraying the character in the OSCE and attempting to mark the students could introduce two potential sources of error. Having a faculty member act as the examiner may reduce this problem.

However, regardless of the correlation between examiner and SP scores, good communication skills are considered vital to performing well in an OSCE, especially in psychiatry. The secondary aim of the study is to test whether candidates’ scores for communication predict their overall (pass/fail) mark for the station.

An OSCE is administered twice a year to residents on a London psychiatry rotation in preparation for the Royal College of Psychiatrists examination. Examiners score candidates on a number of criteria depending on the clinical scenario of the station. Candidates are examined on tasks ranging across history taking, practical skills, physical examination, and emergency management.

The OSCE replicates the real examination, except that SPs also give marks for candidates’ communication and an overall score. This method was chosen because candidates may receive helpful feedback on their communication skills from the SPs. The SPs used in the examination are part of the SP program of the medical school with which the training scheme is affiliated. As part of the program, SPs receive training on how to role play in a standardized fashion. Prior to the OSCE, they are sent a description of the patient they will be playing. On the morning of the OSCE, there is a briefing by a lead examiner to remind the SPs of their roles. Finally, they meet with their paired examiner at their station to clarify any final details.

The examiners are mostly psychiatry attendings, together with a few senior fellows who have already passed the Royal College of Psychiatrists examination. The majority have extensive experience as OSCE examiners. Nevertheless, all examiners are given training prior to the OSCE on how to examine and score candidates in a standardized fashion. The majority of the SPs and a number of examiners participate in the real Royal College of Psychiatrists examination, for which they receive further training from the Royal College (e.g., watching videos of “good” and “bad” OSCE performances, group discussion of marking schemes).

Marks were allocated on a rank basis: A=excellent; B=good; C=pass; D=borderline fail; and E=fail. The examiners’ scoring sheet is the same as the validated one used by the Royal College in the real examination.

The agreement between the examiner and the SP scores for communication and the overall mark was measured in two consecutive mock OSCEs (spring and fall, 2005). SPs were only asked to mark candidates in these two areas, whereas the examiners also had to allocate scores for other technical and skills domains. However, for the purposes of the study, only the examiner scores for communication and the overall mark were correlated with SP scores in the same domains. The degree of agreement between examiner and SP scores was calculated using Cohen’s weighted kappa. The interpretation of the kappa coefficient was based on standard classification (8).
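As an illustration of the agreement statistic named above, the following is a minimal sketch of Cohen's weighted kappa for two raters using the A to E marks. The report does not state which weighting scheme was used, so linear weights are an assumption here (quadratic weights are offered as an option). The function name weighted_kappa and the example marks are hypothetical and are not study data.

```python
from collections import Counter

GRADES = ["A", "B", "C", "D", "E"]  # ordered marks from the scoring sheet


def weighted_kappa(rater1, rater2, categories=GRADES, weighting="linear"):
    """Cohen's weighted kappa for two raters marking the same performances.

    weighting="linear" is an assumption; the report does not name the scheme.
    """
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("need two equal-length, non-empty lists of marks")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two ordered categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater1)

    # Joint counts of (rater1 mark, rater2 mark) and each rater's own counts.
    joint = Counter((index[a], index[b]) for a, b in zip(rater1, rater2))
    row = Counter(index[a] for a in rater1)
    col = Counter(index[b] for b in rater2)

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)  # 0 for identical marks, 1 for A versus E
        return d if weighting == "linear" else d * d

    # Weighted disagreement actually observed.
    observed = sum(disagreement(i, j) * c / n for (i, j), c in joint.items())
    # Weighted disagreement expected if the raters marked independently.
    expected = sum(
        disagreement(i, j) * (row[i] / n) * (col[j] / n)
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one identical mark")
    return 1 - observed / expected


# Hypothetical marks for nine performances (illustrative only).
examiner_marks = list("ABBCCCDDE")
sp_marks = list("ABCCCBDEE")
print(round(weighted_kappa(examiner_marks, sp_marks), 2))
```

Pairs in which either rater left the mark blank would need to be dropped before calling the function, since kappa is defined only for complete pairs.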

A total of 55 candidates participated in the OSCEs (27 in the spring, 28 in the fall). In the spring examination, 11 stations had paired SPs and examiners (one further station did not require an SP); in the fall examination, there were 12 such stations. This provided 633 potential sets of paired marks (27 × 11 + 28 × 12). However, examiners sometimes failed to give marks (once for communication and nine times for the overall mark), and SPs failed to give marks 20 times for communication and 22 times for the overall mark.
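The counts in this paragraph can be checked with a short calculation. The sketch below reproduces the 633 potential pairs and then brackets the number of complete pairs available in each domain. The report does not say whether examiner and SP omissions fell on the same candidate-station encounters, so only a lower and an upper bound can be given; the helper name complete_pair_bounds is hypothetical.

```python
# Candidate and paired-station counts as reported in the text.
spring_candidates, spring_stations = 27, 11
fall_candidates, fall_stations = 28, 12

potential_pairs = spring_candidates * spring_stations + fall_candidates * fall_stations

# Missing marks as reported: (examiner omissions, SP omissions) per domain.
missing = {"communication": (1, 20), "overall": (9, 22)}


def complete_pair_bounds(total, examiner_missing, sp_missing):
    """Bounds on usable pairs when the overlap between omissions is unknown."""
    fewest = total - (examiner_missing + sp_missing)  # no omissions coincide
    most = total - max(examiner_missing, sp_missing)  # omissions overlap fully
    return fewest, most


bounds = {
    domain: complete_pair_bounds(potential_pairs, e, s)
    for domain, (e, s) in missing.items()
}
print(potential_pairs, bounds)
```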

The results of the correlations between examiner and SP scores are presented in Table 1. Agreement was lowest (0.41) between the SP communication score and the examiner overall score. Agreement was highest (0.56) between the examiner score for communication and the overall mark that they gave.
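To show how such coefficients are read, the sketch below maps a kappa value to a descriptive band. It assumes that the “standard classification” referred to in the text is the five-band scheme given in the statistics textbook cited as reference 8 (poor, fair, moderate, good, very good); the function name altman_band is hypothetical.

```python
def altman_band(kappa):
    """Descriptive strength of agreement for a kappa value (assumed five-band scheme)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# The lowest and highest coefficients quoted in the text.
for value in (0.41, 0.56):
    print(value, altman_band(value))
```

Under this assumed scheme, both quoted coefficients fall in the same band, which is consistent with the description of agreement as moderate.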

Doctors work and train in an age of increasingly standardized assessment. As such, postgraduate clinical examinations are moving toward an OSCE format. In the near future, all U.K.-trained psychiatric residents will be examined only by means of an OSCE in the clinical part of their postgraduate examinations.

Standardized patients are widely regarded as proxies for real patients. With the increasing emphasis on appraisal and patient feedback in medical practice, the incorporation of SP scores into the overall marking scheme for postgraduate OSCEs may be foreseen (9). Despite the face validity of doing so, this study suggests that caution is needed, because there was only a moderate degree of agreement between examiner and SP marks. Experienced physician-examiners are the closest thing to a “gold standard” in an OSCE. Measured against that standard, SP scores lack concurrent validity.

Our findings are in keeping with those of McLaughlin et al. (10), who reported a weak correlation (coefficient 0.4) between physician-examiners’ scores and those of SPs. Such a level of agreement may not be adequate to rate performance in such an important examination. McLaughlin et al. (10) also found that, unlike physician-examiner scores, SP scores were not related to other measures of competence, like multiple-choice examination marks.

A potential weakness of this study is that the difference between SP and examiner scores may be attributable to the normal variation between examiners found in OSCEs. Interexaminer data were not collected during the two OSCEs reported here, but they were collected during subsequent examinations. In the fall 2006 OSCE, an external examiner scored a proportion of candidates in addition to the usual examiners, and there was a high degree of correlation between their scores (coefficient 0.67). That examination was organized in a very similar manner, with mostly the same SPs and examiners as those used in the 2005 OSCEs.

Another potential limitation is that the marking sheet used by SPs has not been validated. This may especially apply to the SP overall mark, which requires the examiner to score the candidate based on a number of technical criteria beyond the knowledge of a layperson. SPs were not trained in the clinical skills assessed in each station, raising concerns over the validity of their scores for overall performance. However, even in the domain of communication, the degree of agreement between examiners and SPs remained only moderate. Communication skills appear to be perceived differently according to one’s role in the OSCE. Furthermore, judgment of communication is likely to be extremely subjective and hard to quantify. This is in keeping with previous studies that have found scores from OSCE communication checklists to be poor predictors of patient perceptions of doctor communication (11).

We decided to look at potential reasons why SPs and examiners score candidates differently. We reviewed the literature on medical communication skills, and drew up a list of 10 common domains for good communication (ranging from not using medical jargon to responding to nonverbal cues). In the fall 2006 OSCE, examiners (n=21) and SPs (n=11) were asked to rate on a 5-point scale how much they agreed that each quality was important when scoring candidates for communication. When a Mann-Whitney U test was applied, there were no significant differences between the examiner and SP responses. Caution needs to be used when interpreting the results of this post hoc work, which only had a small number of participants. However, it seems that explaining why SPs and examiners mark communication differently is difficult. Identifying the reasons for this is important and should be the focus of future research using a more sophisticated study design.
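For readers unfamiliar with the test used in this post hoc comparison, the sketch below computes the Mann-Whitney U statistics for two independent groups of ordinal ratings, using mid-ranks for ties (which are unavoidable on a 5-point scale). The ratings shown are hypothetical and are not the study's data, and the function name mann_whitney_u is an illustrative choice. The sketch stops at the U statistics; a p value would normally be obtained from tables or a statistics package.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples, using mid-ranks for ties."""
    a, b = list(sample_a), list(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    pooled = sorted(a + b)

    # Assign each distinct value the average of the 1-based positions it occupies.
    mid_rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        mid_rank[pooled[i]] = (i + 1 + j) / 2  # mean of positions i+1 .. j
        i = j

    rank_sum_a = sum(mid_rank[x] for x in a)
    n_a, n_b = len(a), len(b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical 5-point importance ratings for one communication domain.
examiner_ratings = [5, 4, 4, 5, 3, 4, 5]
sp_ratings = [4, 5, 4, 3, 5]
print(mann_whitney_u(examiner_ratings, sp_ratings))
```

The smaller of the two U values is the one conventionally compared with the critical value for the given group sizes.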

There may only be moderate agreement between examiner and SP scores, but the correlation between examiner scores for communication and the overall mark is stronger. This emphasizes the importance of good communication skills in passing psychiatry OSCEs: it is not what one says, but how one says it. Psychiatry educators must now decide on the importance of to whom one is saying it.

TABLE 1. Correlation Between Examiner and Standardized Patient Marks

At the time of submission, the authors disclosed no competing interests.

References

1. Kowlowitz VHA, Sloane PD: Implementing the Objective Structured Clinical Examination in a traditional medical school. Acad Med 1991; 66:345–347
2. Hodges B, Hanson M, McNaughton N, et al: What do psychiatric residents think of an OSCE? Acad Psychiatry 1999; 23:198–204
3. Hodges B, Regehr G, Hanson M, et al: Validation of an Objective Structured Clinical Examination in psychiatry. Acad Med 1998; 73:910–912
4. Sauer J, Hodges B, Santhouse A, et al: The OSCE has landed: one small step for British psychiatry? Acad Psychiatry 2005; 29:310–315
5. Whelan P, Lawrence-Smith G, Church L, et al: Goodbye OSCE, hello CASC: organizing and evaluating a mock CASC examination and course. Psychiatr Bull 2009; 33:149–153
6. McLay MA, Rodenhauser P, Anderson DS, et al: Simulating a full length psychiatric interview with a complex patient. Acad Psychiatry 2002; 26:162–167
7. Iramaneerat C, Yudkowsky R: Rater errors in a clinical skills assessment of medical students. Eval Health Prof 2007; 30:266–283
8. Altman DG: Practical statistics for medical research. London, Chapman & Hall, 1991
9. Greco M, Pocklington S: Incorporating patient feedback into vocational training: an interpersonal skills development exercise for GP trainers and registrars. Educ Prim Care 2001; 12:285–291
10. McLaughlin K, Gregor L, Jones A, et al: Can standardized patients replace physicians as OSCE examiners? BMC Med Educ 2006; 6:12
11. Mazor KM, Ockene JK, Rogers HJ, et al: The relationship between checklist scores on a communication OSCE and analogue patients’ perceptions of communication. Adv Health Sci Educ Theory Pract 2005; 10:37–51
 
