Commentary
Using an Objective Structured Clinical Examination in a Psychiatry Residency
Karen Broquet, M.D.
Academic Psychiatry 2002;26:197-201. 10.1176/appi.ap.26.3.197
Objective Structured Clinical Examination (OSCE); Standardized Patients
Dr. Broquet is an Associate Professor in the Departments of Internal Medicine and Psychiatry, Southern Illinois University School of Medicine, Carbondale, IL. Address correspondence to Dr. Broquet, Department of Internal Medicine, Southern Illinois University School of Medicine, P.O. Box 19636, Springfield, IL 62794-9636.
Objective structured clinical examinations (OSCEs) have become a respected and relatively well validated form of learner assessment, particularly at the medical student level. An OSCE involves a series of stations at which examinees perform a variety of clinical tasks. Stations may or may not utilize a standardized patient (SP). Performance is evaluated with checklists or ratings specifically tailored to the station.
In this issue, Brian Hodges and the University of Toronto Psychiatric Skills Assessment Team provide a very nice tour of the development of an OSCE for evaluating psychiatry clerks. Psychiatric OSCEs have been less prevalent at the postgraduate level. The psychiatry residency program at Southern Illinois University (SIU) has been using a yearly structured clinical examination at the postgraduate level since 1985.
In its current form, the clinical exam at SIU consists of two stations involving a clinical task with an SP and a third, separate station with a real patient interview and an oral examination patterned on the Part II ABPN examination. The complete half-day exam (SPs and mock oral segment) is given yearly to all five PGY-2 residents. PGY-3 and PGY-4 residents receive the mock oral component only. At least one SP station requires a full clinical evaluation (history, mental status exam, recommendations to patient, and write-up of findings and treatment plan). Patient evaluation stations are allotted 30 minutes for the examination and 30 minutes for write-up. The other SP station may be a shorter (5- to 20-minute) clinical task station. We keep a pool of available cases that portray depression, borderline disorder, substance abuse, psychosis, cognitive deficits, acute mental status change, or combinations thereof. Each case has a blueprint that describes the characteristics of a patient problem, the critical items to be tested, a patient database, and a scoring rubric. The patient database contains a history, mental status and physical exam findings, and lab values where available. Both checklists and ratings are used as scoring rubrics.
Each resident is "graded" in three spheres. These include 1) faculty ratings on interview management, 2) SP ratings on interpersonal skills and professional demeanor, and 3) a checklist-based review of the write-up.
Interview management skills are rated on a 6-item, 5-point scale. Items assessed include 1) establishment of rapport, 2) use of open-ended questions, 3) demonstration of empathy, 4) control of interview, 5) exploration of suicidal ideations, and 6) explanation of recommendations. We have historically used two nonphysician faculty clinical raters, each of whom observes the entire station through a one-way mirror. In their 1990 review of SP-based exams, van der Vleuten and Swanson (1) found interrater reliability to be quite good. Two raters are probably unnecessary. However, it is the perception of residents that the input of two people is more fair than just one, and since we have a large pool of nonpsychiatrist faculty from which to draw, we have continued to use two.
The SP feedback is a 4-point rating scale assessing interpersonal/communication skills, ability to be nonjudgmental, explanation of treatment recommendations, level of SP confidence in the physician, and an overall global rating. Typically the SPs' ratings for interpersonal skills are higher than those of the faculty. It is possible that faculty are focusing more on technical interviewing skills, or that the faculty and SP forms are truly measuring different skills (although there is some overlap in items), or that SPs are picking up on levels of warmth and empathy that are not evident from the observation room, which is some feet away.
Two psychiatrists review the resident's write-up against a checklist of critical items that must be addressed. These include necessary history and mental status findings, diagnoses/problems that need to be identified, and appropriate steps in treatment. The rating psychiatrists do not observe the encounter. During the exam, the available psychiatrists (a limited number) are used to do the oral board-type examinations and are not present to assess the clinical aspects of the evaluation in real time. Stations are videotaped, but the tapes vary greatly in quality. The write-up-based approach reinforces the importance of documentation in the real world. The abilities to document completely and appropriately and to communicate one's reasoning in written form are just as important as the process of clinical decision-making, and just as worthy of assessment. Rating psychiatrists are given significant leeway in assessment. If, for example, a resident does not amass a minimum passing number of points but the readers feel that the evaluation, diagnostic formulation, and treatment plan are competently done, a grade of "pass" may still be given. Because the raters have the checklist, they know if critical information is omitted or misidentified. We have not encountered a situation where a resident received a minimum passing score when the raters felt the evaluation was less than competent. If this occurred, it would bring the validity of the checklist into serious question.
Even though the primary purpose of the exam is not formative, it does present a good learning opportunity, and feedback is given to the residents. They receive copies of their ratings and associated comments from the SPs and the faculty observers. They receive feedback on the write-up as to whether it met an acceptable level of performance. Faculty reviewer comments are included. The residents are at liberty to review their write-up and checklist with the program director, but these documents are not given to the residents. Residents are instructed not to discuss the exam cases with classes below them. The impact of test security has been reviewed (2) and has not been found to threaten the integrity of an exam. However, none of the studies reviewed involved hard copies of the case getting into circulation, and it seems prudent to keep them confidential to the greatest extent possible. Residents are given detailed performance feedback on their mock oral exam by the examiners.
The clinical examination presents significant benefits in assessment. First, it allows the program to evaluate patient interaction and clinical skills in one setting. Resident interviews are often observed in teaching settings, but not always an entire interview, and the clinical examination is generally perceived as a fairer, more objective form of assessment. Second, trainees are rated on specific, predetermined criteria. Because our faculty raters know most of the residents being tested, there is still the potential for a "halo effect," or the intrusion of nonclinical factors into the rating process, but it is drastically reduced. Third, the OSCE eliminates the luck factor in patient assignment, since all examinees see the same simulated patient. Most of us can still remember from our ABPN Part II exam the fear of getting a "bad" patient or one who is very difficult to interview. The program can choose the type of patient and level of difficulty of the clinical task. Fourth, the clinical examination is one of the rare occasions when residents get specific feedback (from the SPs) on how they come across to patients.

Logistics

Many of the challenges of the exam are logistical. It is a time-consuming process. Each case takes an estimated 20 person-hours to prepare. However, most of this time investment is up front, and when cases are repeated, it is limited to the time required for training the SPs, giving the exam, and reviewing the write-ups. SIU is fortunate in that we have a fully developed SP program in the school. Although the department hosting an exam must pay the SPs as well as pay for any food, videotapes, and other such expenses, the Medical Education Department recruits SPs and provides an SP trainer (a person who coaches the SP in the nonclinical aspects of any simulation) as well as observation rooms with one-way mirrors and videotaping capability.

Portraying Complex Cases

Another challenge is that psychiatric cases are more difficult to simulate than nonpsychiatric ones. The histories that SPs must learn are more complex, and the simulations sometimes alien. Although medical and surgical simulations often entail illnesses the SP has not actually had, most SPs have experienced pain, fatigue, or other common symptoms in their lives and can draw upon that when simulating a physical symptom. Everyone has been sad at some point and can relate to that in simulating depression, but most have not experienced auditory hallucinations or mania. I think the most difficult SP cases to train have been those involving personality disorders and the subtle nuances in interpersonal behavior that are necessary for a true portrayal of the case. We have overcompensated for this difficulty by training SPs to make the interactions obvious. One could make the argument that this creates a truer threshold of competency. A good clinician can pick up on subtle signs, but a minimally competent clinician must observe obvious ones. Some programs use actors as SPs. I suspect this would be a valuable asset in psychiatric cases.

Achieving Validity

By far the biggest challenge of the clinical exam is what to do with the results. At the medical student level, OSCEs are often used to make pass-fail decisions. An obstacle to using an OSCE as a high-stakes exam at the postgraduate level in psychiatry has been reconciling validity with reliability. Because we have not yet achieved this, no decisions regarding promotion or remediation of residents are made solely on the clinical exam.

Assessing at the Appropriate Skill Level

A valid station will measure accurately the skills that we as educators expect psychiatry residents to possess. These are very different from the skill level we expect of clerkship students, most of whom will not become psychiatrists. With clerkship students, we want to know if they possess basic interpersonal skills and clinical skills appropriate to the primary care setting. We expect them to identify and sometimes manage common psychiatric disorders, assess suicidality, identify and manage life-threatening situations, and know when a patient needs psychiatric consultation. This task can be accomplished in a 15-minute station.
Our expectations of residents are much higher. The level of history we expect them to take goes well beyond basic symptoms and safety issues and encompasses comorbid conditions, patterns of relationships, functional history, and defense mechanisms. We expect them to gather enough of a database to be able to form an understanding of the patient from a diagnostic and biopsychosocial perspective, and to formulate a treatment plan based on these. This is a time-consuming task.
For the SP station involving a full clinical evaluation, SIU has tried various station lengths and found that a minimum of 30 minutes with the SP is required for PGY-2 residents, with another 30 minutes for the write-up. This limits the number of stations a given resident can do. The short station assesses skills not directly tied to clinical diagnosis or treatment. This could include handling a nurse's phone call regarding an agitated patient or informing parents that their child has been diagnosed with schizophrenia. It would be easy enough to break down all stations into more focused tasks such as taking a sexual history, performing a full cognitive exam, examining for extrapyramidal symptoms, or doing a complete suicide risk assessment. However, this would not tell us if the resident knows when it is appropriate to do these things or can put all the information together. Howard Barrows, one of the pioneers of problem-based learning and assessment, used a swimming analogy: you may be able to kick your legs, move your arms, and breathe rhythmically, but if you cannot swim, these individual skills are meaningless. Residents need to demonstrate that they can swim, and this takes time.

Achieving Reliability

Can an OSCE truly identify competence or lack thereof? Obviously, an exam has to be reliable if it is to be used to make high-stakes, pass/fail decisions. We have all seen residents who do well in certain clinical situations but falter in others. With an OSCE, this presents the problem of content specificity, or variation in residents' performance from one station to another. Research has determined that an exam needs to include 10 to 15 stations, or 3 to 12 hours of testing time, to achieve a reliability coefficient of 0.8, an accepted standard for educational tests (1,2). In general, longer stations yield less reproducible results, so an exam made up entirely of long stations would need to be in the upper range of test length. This has so far exceeded the available resources in our program. As noted, the half-day exam is given yearly to 5 PGY-2 residents. PGY-3 and PGY-4 residents receive the mock oral board component only.
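The relationship between the number of stations and overall reliability can be illustrated with the Spearman-Brown prophecy formula, a standard psychometric tool for projecting reliability when a test is lengthened. The commentary does not say that this formula underlies the 10-to-15-station figure, so the following is only a minimal sketch under stated assumptions: the function names are hypothetical, stations are treated as interchangeable, and the per-station reliability of 0.25 is an invented value chosen for illustration, not a result from the SIU exam.

```python
import math


def projected_reliability(station_reliability: float, n_stations: int) -> float:
    """Spearman-Brown projection of exam reliability for n comparable stations.

    station_reliability is the reliability of a single station (assumed value).
    """
    r = station_reliability
    return n_stations * r / (1 + (n_stations - 1) * r)


def stations_needed(station_reliability: float, target: float = 0.8) -> int:
    """Smallest number of comparable stations projected to reach the target."""
    r = station_reliability
    # Spearman-Brown solved for the lengthening factor.
    k = target * (1 - r) / (r * (1 - target))
    # The small tolerance keeps floating-point noise from adding a station.
    return math.ceil(k - 1e-9)


if __name__ == "__main__":
    # Hypothetical per-station reliability; not taken from the article.
    assumed_r = 0.25
    for n in (2, 6, 12):
        print(n, "stations:", round(projected_reliability(assumed_r, n), 2))
    print("stations needed for 0.8:", stations_needed(assumed_r))
```

Under these assumptions, a lower per-station reliability pushes the required number of stations up, which is consistent with the point that an exam built entirely from long, less reproducible stations needs to sit at the upper end of the test-length range.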
The structure of the exam has evolved somewhat. It began with 6 stations involving a combination of 20-minute SP encounters and specific tasks (interpreting MMPI profiles, EEGs, laboratory data). Over time, faculty reevaluated whether some of the tasks (reading EEGs) were truly critical for a competent psychiatrist, and they were removed. With the SP stations, it became clear that 20 minutes was not long enough for most residents (particularly PGY-2s) to do a thorough evaluation, and station length was increased. Even with the original 6 stations, the exam did not achieve a level of reliability sufficient for a concrete summative evaluation (3). Therefore, it made sense to increase the perceived validity as much as we could, by using fewer, longer stations. Around the same time, the mock oral board exam was included to better prepare our residents for their certification exam. Although the oral board exam with a random actual patient has been criticized as being less objective than a standardized case, it still provides an excellent forum to observe and assess interpersonal skills. It also allows faculty examiners to explore the residents' clinical reasoning process on a level we have not achieved with the OSCE checklist. Resident feedback indicates that the mock oral, though anxiety-provoking, is highly valued.
Do OSCEs have a role in assessing Residency Review Committee competencies? Absolutely. Exactly what that role will be remains to be seen. The Accreditation Council for Graduate Medical Education's Toolbox of Assessment Methods (4) lists the OSCE (with or without SPs) as a "most desirable" method of assessment in the areas of patient care, interpersonal and communication skills, and professionalism. It is considered a "next best" or "potentially applicable" method for evaluating practice-based learning and improvement and systems-based practice. Using an OSCE format to assess medical knowledge is not the most efficient use of the format because it tends to increase the amount of testing time needed to reach reliability (1).
The opportunities to assess patient care, professionalism, and interpersonal skills via OSCE are self-evident. Here is an example of how a modified OSCE can be used to assess practice-based learning and improvement and medical knowledge along with clinical skills. For many years, first- and second-year students in the problem-based learning curriculum at SIU were assessed almost exclusively by this method. For each unit exam they had a series of SP cases for which they were assessed on their clinical evaluation and write-up. In addition, the students had to identify the important learning issues (gaps in their database), research them over the following 2 days, then undergo an oral examination based on their readings. Faculty oral examiners rated them on the basis of a template formulated by a team of experts in each discipline being tested. In addition to mastering a specific data set, students were asked for a critical review of their learning resources (books, journals, faculty, etc.) as well as a self-assessment of their knowledge base. Although useful, this process was highly labor-intensive and not necessarily the most efficient way to assess that particular competency. However, it does demonstrate that the possible uses of OSCEs, with or without SPs, are limited only by our imagination and resources. Most psychiatry training programs these days have more imagination than resources.
For programs that choose to utilize OSCEs, there are real opportunities for collaboration in the development and administration of exams. At a minimum, cases and validated checklists/grading rubrics can be shared among programs. For programs in geographic proximity, jointly administered clinical exams make a lot of sense. They can reduce SP expenses because all SPs for a given case can be trained at the same time, no matter how many are needed for the exam. Sharing of faculty as observers/raters/reviewers gives residents the benefit of unknown reviewers. It also gives residents a view of the world outside of their particular program. The biggest benefit of pooled exams would be a larger number of examinees, which can bring us closer to understanding the length and size in an OSCE that are needed to achieve reliability for psychiatry residents.
1. van der Vleuten C, Swanson D: Assessment of clinical skills with standardized patients: state of the art. Teaching and Learning in Medicine 1990; 2:58-76
2. Colliver J, Williams R: Technical issues: test application. Acad Med 1993; 68:454-460
3. Loschen E: Using the objective structured clinical examination in a psychiatry residency. Academic Psychiatry 1993; 17:95-104
4. Toolbox of Assessment Methods. ACGME Outcomes Project. Chicago, Accreditation Council for Graduate Medical Education and American Board of Medical Specialties, September 2000
