
Academic Psychiatry 26:134-161, September 2002
© 2002 Academic Psychiatry
Creating, Monitoring, and Improving a Psychiatry OSCE
A Guide for Faculty
Brian Hodges, M.D., M.Ed., FRCPC, in collaboration with the University of Toronto Psychiatric Skills Assessment Project team:
Mark Hanson, M.D., M.Ed., FRCPC,
Nancy McNaughton, B.A., and
Glenn Regehr, Ph.D.
Dr. Hodges is Associate Professor and Vice-Chair (Education) of the Department of Psychiatry and Wilson Centre for Research in Education, Faculty of Medicine, University of Toronto. Dr. Hanson, Ms. McNaughton, and Dr. Regehr are the members of the Psychiatric Skills Assessment Project team, University of Toronto. Address correspondence to Dr. Hodges, Department of Psychiatry, University Health Network–Toronto General Hospital Site, 200 Elizabeth Street, 8 Eaton, Room 212, Toronto, Ontario M5G 2C4, Canada.
Key Words: Objective Structured Clinical Examination (OSCE)

INTRODUCTION
The objective structured clinical examination (OSCE) was first described by Dr. Ronald Harden in the 1970s (1). As a new evaluation tool that allowed clinicians to be observed performing in many different clinical situations, the OSCE was a major improvement over oral examinations in which only one clinical encounter was observed. The OSCE also incorporated the technology of standardized patients first described by Barrows and Abrahamson in 1964 (2). The use of standardized patients allowed the nature of problems and the level of difficulty to be standardized for all students (3).
This combination of multiple observations and standardization of content and difficulty made the OSCE a very popular evaluation tool. Further, extensive research demonstrated that OSCEs could have excellent psychometric properties. As a result, the use of OSCEs is now extensive in medical schools throughout the world. OSCEs have become indispensable for the assessment of medical students, clinical clerks, interns, and residents and of candidates for licensure and certification. OSCEs are also used extensively for the assessment of the competence of other health professionals, including chiropractors, nurses, nurse practitioners, pharmacists, and physiotherapists. In addition, the development of certification exams by both the Medical Council of Canada and the National Board of Medical Examiners for all physicians has been an impetus toward the growth and development of OSCE technology across the spectrum of medical education.
The Psychiatric Skills Assessment Project (PSAP) was launched at the University of Toronto in 1994 by a group of individuals interested in understanding and improving the technology of assessment at all levels of psychiatry education. In 1994, the OSCE was a relatively new and promising evaluation that had found little application in psychiatry. The PSAP team set out a course of extensive investigation into the use of the OSCE and standardized patients in psychiatry. The result has been a series of research projects and publications that have helped us develop a much greater understanding of the technology of assessing psychiatric skills.
Development of this expertise has allowed us to help many groups across North America and in the rest of the world develop new assessment techniques in psychiatry, and more specifically, psychiatry OSCEs. Over the past several years we have undertaken so many of these consultations that we thought it would be helpful to put together a guide for faculty interested in developing a psychiatry OSCE. The following sections outline the steps of creating a psychiatry OSCE, from the early conceptual phase through the creation and monitoring and on to quality improvement. For those who are interested in undertaking research in psychiatric educational assessment, the last section addresses research directions. We believe that creating and studying innovative assessment methods in psychiatry is a stimulating and rewarding pursuit that contributes to high standards of education and ultimately to excellent clinical practice.

SECTION 1. THE USE OF THE OSCE IN PSYCHIATRY EDUCATION
Performance-Based Assessment in Psychiatry
As is typical at medical schools in North America, clinical clerks in psychiatry at the University of Toronto were evaluated for many years with oral examinations. The format used was a 1-hour patient interview observed by faculty examiners, followed by a second hour during which the students presented cases and discussed diagnosis and management. Unfortunately, year after year, oral examination marks were inflated, poorly distributed (almost everyone received a mark of between 75% and 85%), and significantly discrepant from clinical performance on the wards. In addition, both the faculty and the students became increasingly disgruntled with an oral examination that they felt was more appropriate for the assessment of the skills of psychiatric residents than clinical clerks. Indeed, a large body of literature has criticized the use of an oral examination to assess the skills of medical trainees at all levels (4–8). Most concerning were data gathered by the National Board of Medical Examiners (NBME) in the United States (9) during 3 years of examinations involving over 10,000 medical students. The NBME found that the correlation of independent evaluations by 2 examiners for candidates in a single encounter with a patient was less than 0.25. These data, together with a growing number of examinees, prompted the NBME to discontinue the use of oral examinations for its Part III evaluation across the United States in 1963. In psychiatry, Leichner and co-workers demonstrated statistically that the "luck of the draw" in selection of examiner and patient played a significant role in the outcome of an oral examination, particularly for borderline candidates (10,11).
In Leichner's words, "since it is difficult to improve the reliability of individual raters because of time and cost-effectiveness issues, increasing the number of different tasks and evaluators at an oral examination, as suggested through the use of objective structured clinical examinations (OSCEs), may be a feasible means of improving the validity and reliability of oral examinations" (11, p. 283). Indeed, throughout North America, OSCEs have begun to replace oral examinations for the assessment of clinical clerks and, more recently, residents in psychiatry.
As described by Harden and Gleeson in 1979 (1), an OSCE is a timed examination in which students move from station to station, each station requiring performance in a simulated setting that usually involves interaction with a standardized patient. The student is typically required to demonstrate some combination of history taking, physical examination, counseling, or other aspect of patient management. At each station, candidates' performances are rated on checklists and global rating scales. In virtually every medical discipline except psychiatry, OSCEs have been extensively studied and established as performance-based assessment instruments with good validity and reliability (12–16). Indeed, because OSCEs have been shown to have a much higher reliability and validity than traditional and less structured oral examinations, they have grown in acceptance as a means of episodic performance-based assessment (17).
OSCEs are in common use in many medical disciplines, but psychiatric educators have been slow to adopt this method of evaluation. Although OSCE stations with psychiatric content have been incorporated into large examinations such as the Medical Council of Canada Part 2 Exam, there are few studies describing a complete psychiatry OSCE. A 1996 study provided data demonstrating acceptable reliability and validity of an OSCE designed solely to test communication skills in stations with psychiatric content (18); however, communication skills are but one component of psychiatric competence.
There were only two other papers describing the use of a psychiatry OSCE in the literature before 1995. In the first paper, Loschen (19) described 6 years' experience with a 6-station OSCE for the evaluation of psychiatry residents at Southern Illinois University. Although Loschen did not provide specific data on reliability, he stated that the exam allowed him to "test residents in a number of relevant clinical situations in a short period of time. The clinical situations are accurate and realistic, and can be tailored specifically to the training level of the residents tested." Loschen's examination used 45-minute stations, which he felt were an adequate reflection of the typical encounter of a psychiatric resident with a patient. To save time and money, however, several of the stations involved videotaped or written material and were scored after the fact by videotape assessment or by nonmedical personnel. Only 3 of the stations had live standardized patients, and each of these scenarios involved a brief assessment interview during the first 20 minutes of the station.
A second paper, by Famuyiwa and colleagues (20), described 6 years' experience with an OSCE for the assessment of medical students at the University of Lagos in Nigeria. These authors concluded that "there is some evidence for the justification for using the OSCE as a major form of assessment in psychiatry, but there should be an emphasis on the use of examiner-assessed stations." They based this conclusion on findings of greater reliability in stations where a live examiner was present than in stations utilizing Patient Management Problems or other written questions. This group did not describe the length of the stations but did state that 20 stations were used, thus suggesting a short time period for each station.
The University of Toronto Psychiatry OSCE
In 1994, members of the University of Toronto Department of Psychiatry came together to create the Psychiatric Skills Assessment Project, a multidimensional effort to create a psychometrically sound OSCE for psychiatry clerks. The first OSCE was a modest affair, involving only 4 stations at one hospital site. Stations were 15-minute encounters requiring a focused assessment of a clinical case, emphasizing rapid diagnosis and issues of safety. Stations were intended to reflect the typical interaction of a family physician or emergency psychiatrist. In the first year, 42 fourth-year clinical clerks were tested. Each station score was composed of a checklist score (50%) and global ratings (50%). Informal assessment of interrater reliability, using 3 student encounters at each of the 4 stations, was promising. For 6 of the 12 encounters, raters demonstrated perfect agreement in scoring the checklist items; for 5 of the remaining 6 encounters, raters disagreed on only 1 or 2 of the checklist items. Interstation reliability was also quite high, at 0.61 (21).
Subsequently, the examination was expanded to 6 stations and was used to examine all University of Toronto clinical clerks during their 6-week psychiatry rotation. A committee created an evaluation blueprint emphasizing common diagnostic groups (mania, schizophrenia, depression); skills of diagnosis, management, and patient education; and process components such as rapport, organization, and interview skills. Starting with real cases, station authors created stations portraying patients with difficult psychiatric problems such as thought disorder, delirium, panic, manic psychosis, and personality disorder. Psychiatrists supported the patient trainer with videos, reading, supervision, and practice and helped to pilot test the cases with students and residents. Gradually a bank of more than 25 OSCE stations was created for use in various administrations of the OSCE.
Between 1995 and 2000, more than 1,000 students were tested by using the University of Toronto psychiatry OSCE. During those years, many studies were undertaken to better understand the nature, strengths, and limitations of a psychiatry OSCE. These studies included examination of the reliability of the OSCE (22), its validity (23), ways to integrate child psychiatry in the OSCE (24), usefulness for assessing residents (25), and the psychological impact of playing emotional roles on the standardized patients themselves (26). The Psychiatric Skills Assessment Project is more fully described in Appendix A.
Much has been learned about the use of an OSCE to assess competence in psychiatry, but a great deal remains to be done. From the outset, it has been clear that there are special issues that make simulation in psychiatry different from simulation in other disciplines. For example, the shorter time interval of 10 to 15 minutes is more typical of an assessment in primary care than in psychiatry. As well, the simulation of complex problems such as psychosis and personality disorder is much more involved than simulation of a headache or a sore knee. Finally, there is a whole host of ethical issues to consider, including the impact of playing such roles on the actors, the use of children and adolescents, simulation of disinhibited behavior, boundary crossing and foul language, and the induction of strong reactions in students. Educators who conduct and research psychiatry OSCEs must address all of these issues.
The focus of this guide is primarily on the use of OSCEs for the assessment of medical students. Recently, however, there has been considerable interest in the potential applications of OSCEs and related standardized-patient–based assessment methods for residents. Members of the PSAP team have consulted with both the Royal College of Physicians and Surgeons of Canada and the American Board of Psychiatry and Neurology. Both organizations are exploring revisions to their certifying examinations that will incorporate the elements of standardization and simulation described herein. Only two published papers have explored the use of OSCEs for assessing psychiatry residents (19,25), and both highlight the fact that resident OSCEs need to be quite different from those used for medical students. Thus, a great deal of research is still needed to understand how to appropriately modify and extend OSCE technology into the area of postgraduate assessment.
While there are significant hurdles to overcome in extrapolating what has been learned about medical students' assessment to the postgraduate and perhaps even continuing education domains, it is the very challenge of studying and understanding such complexities that makes OSCE research and development exciting. Indeed, findings from psychiatry OSCE research have led to new methods of teaching, feedback, and evaluation and have even begun to affect thinking about diagnosis and management in psychiatry itself. The members of PSAP have been excited and energized by this line of research and development, and we hope our readers will be as well.

SECTION 2. THE LANGUAGE OF OSCEs: TERMS AND CONCEPTS
The advent of standardized-patient–based teaching and assessment and the objective structured clinical examination has been accompanied by the introduction of new terms and concepts. The following glossary is an attempt to help those new to these technologies to better understand the literature in the area. It also provides a brief introduction to terms and concepts that will be used in later sections of this guide.
Binary Scoring: An item or series of items, usually on a checklist, scored in a dichotomous fashion (Example: Yes or No, Done or Not Done).
Borderline Means Method: A method of standard-setting in which the mean score of all candidates judged to be "borderline" is used to determine the passing score for an OSCE station.
Checklist: A set of items (usually from 10 to 30 items) used to score the appropriateness of the content of student history taking or the maneuvers of physical examination in an OSCE.
Couplet: An OSCE station in which a first interval (e.g., 5 minutes) involves one task (e.g., a history) and a second interval involves a second task (e.g., answering written questions based on the history).
Cut Score: The score above which candidates are determined to pass and below which they fail.
Dry Run: A preview of an OSCE station just prior to an examination to ensure the fidelity of the portrayal and familiarize the examiner with the nature of the station.
Global Ratings: Series of continuous scales on which standardized patients or faculty observers rate the overall qualities of competence of a student in OSCE stations.
Objective Structured Clinical Examination (OSCE): A timed examination in which students move from one room to another. In each room the student interacts with a standardized patient for the purposes of obtaining a history, completing a physical exam, counseling the patient, etc. In most OSCEs the student performance is marked in each station by either a standardized patient or a physician examiner.
OSCEology: A term that denotes the field concerned with utilizing and/or studying objective structured clinical examinations.
Performance-Based Assessment: A form of testing in which the actual performance of clinical skills is observed in order to adjudicate the competence of health professionals.
Pilot Testing: An opportunity to observe an OSCE station as it would function in an examination, by using a real standardized patient and a volunteer "student."
Post-encounter Probe (PEP): A question or questions administered immediately following a patient encounter in an OSCE station.
Role / Script: A detailed outline of the features to be portrayed by a standardized patient and detailed information about the life of the patient he or she is portraying. Unlike a traditional script for a play, this generally does not include lines of dialogue.
SP Trainer: An individual, generally with advanced experience as a standardized patient, who trains standardized patients to portray roles.
Standardized Patient (SP): A healthy person or patient with chronic stable findings who is trained to portray a clinical scenario for the purposes of teaching or assessment.
Standard Setting: A process whereby the criteria to be used to pass or fail students are defined prior to administration of the examination.
Stem: A set of instructions posted outside the door of an OSCE station that gives candidates some minimal information about the problem and directions about the tasks they should undertake during the scenario.
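Several of the scoring terms above (Binary Scoring, Checklist, Global Ratings, Cut Score, Borderline Means Method) fit together in a simple calculation. The following is a minimal sketch, not a description of any particular examination's scoring software. The function names, the 5-point global rating scale, and the rescaling of ratings to a 0 to 100 range are assumptions made for illustration. The equal weighting of checklist and global components mirrors the 50%/50% split reported for the first University of Toronto psychiatry OSCE.

```python
from statistics import mean


def checklist_percent(items):
    """Binary scoring: each item is True (done) or False (not done)."""
    return 100.0 * sum(items) / len(items)


def global_percent(ratings, scale_max=5):
    """Average global ratings on a 1..scale_max scale (assumed) and rescale to 0-100."""
    return 100.0 * (mean(ratings) - 1) / (scale_max - 1)


def station_score(items, ratings, checklist_weight=0.5):
    """Blend checklist and global rating components into one station score."""
    return (checklist_weight * checklist_percent(items)
            + (1 - checklist_weight) * global_percent(ratings))


def borderline_means_cut_score(scores, judgments):
    """Cut score = mean station score of candidates judged 'borderline'."""
    borderline = [s for s, j in zip(scores, judgments) if j == "borderline"]
    if not borderline:
        raise ValueError("no borderline candidates at this station")
    return mean(borderline)


if __name__ == "__main__":
    # Hypothetical data for one station and four candidates.
    scores = [80, 60, 58, 40]
    judgments = ["pass", "borderline", "borderline", "fail"]
    cut = borderline_means_cut_score(scores, judgments)
    print("cut score:", cut)
    print("passed:", [s >= cut for s in scores])
```

In this sketch the examiner's overall pass/borderline/fail judgment is recorded separately from the checklist and global ratings, and only the candidates marked "borderline" contribute to the cut score for that station.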

SECTION 3. PLANNING FOR A PSYCHIATRY OSCE
Budgeting
Evaluation is an expensive undertaking. However, while there is a cost to all medical education, the significant expenses associated with teaching are usually less obvious than those associated with evaluation. The former costs are often buried in the regular duties of academic physicians. When it comes to evaluation, discrete periods of examiner time are involved and additional costs for the creation and administration of tests occur. Often a large number of additional personnel are required. New technologies such as the OSCE have specific costs associated with the use of standardized patients, as well as materials, supplies, and the time of examiners and support staff (27).
The reader might find it odd that we begin with a section on finances. However, it has been our experience that the planning of OSCEs is greatly influenced by the resources available. Thus it is very important to establish at the outset of the OSCE planning process what the costs will be and from where the funding will be obtained. The costs of the OSCE are directly related to size and complexity. OSCEs can be roughly broken into three categories: a) small departmental OSCEs, b) faculty-wide OSCEs, and c) multi-site OSCEs.
Figure 1 provides a budget outline to help you plan for your OSCE. We have assumed that you will be creating either a departmental or a faculty-wide OSCE. The creation of multi-site OSCEs such as those used for certification and licensure is a much more involved process. Such examinations require a full-time dedicated staff and a much larger budget. However, the principles of budgeting outlined here would also apply to a certification or licensure OSCE.
The creation of an OSCE budget will be determined by the educational needs of your institution (such as how many students need to be assessed and how often) and the financial realities (you undoubtedly have a limited budget), both of which interact with the psychometric requirements (the higher stakes the exam, the higher the reliability required and therefore the more stations needed). Budgeting falls into the following major categories:
- Standardized patient costs:
  - Training time
  - Performance time
- Standardized patient trainer costs
- Examiner costs
- Support staff costs
- Supplies
- Rental of space and equipment
- Catering
- Psychometric analysis
Once you have made a preliminary budget, you can significantly affect the overall cost by changing the number of stations or the length of the stations. Of course, this should be done only while considering the educational requirements of the exam.
Exercise:
Imagine you have 30 clinical clerks to evaluate every 6 weeks at the end of a clerkship rotation. You have $10,000 to spend. How many stations can you have? What is the impact of using 6 stations vs. 10? Or 10-minute stations vs. 15-minute stations? Or running 3 circuits of 10 stations vs. 2 circuits of 15? (Refer to Figure 1 in making your calculations.)
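One way to work through the exercise is with a rough cost calculator. The following is a sketch under stated assumptions, not a reproduction of Figure 1, which is not included in this text. Every rate below is a hypothetical placeholder to be replaced with local figures, and the names `OsceRates` and `osce_cost` are invented for illustration. The sketch assumes that circuits run back to back with the same standardized patients and examiners, that each station seats one student at a time, and that a short changeover separates stations.

```python
import math
from dataclasses import dataclass


@dataclass
class OsceRates:
    """Hypothetical rates (not taken from Figure 1); replace with local figures."""
    sp_hourly: float = 20.0          # standardized patient pay per hour
    sp_training_hours: float = 3.0   # training time per station role
    trainer_hourly: float = 40.0     # standardized patient trainer pay per hour
    examiner_hourly: float = 0.0     # assumes donated faculty time
    support_staff: int = 2           # number of support staff on exam day
    support_hourly: float = 20.0
    fixed_costs: float = 500.0       # supplies, space, catering, analysis


def osce_cost(students, stations, station_minutes, rates, changeover_minutes=2):
    """Estimate total cost when circuits run back to back with the same staff."""
    circuits = math.ceil(students / stations)  # one student per station
    circuit_hours = stations * (station_minutes + changeover_minutes) / 60
    exam_hours = circuits * circuit_hours
    training = stations * rates.sp_training_hours * (rates.sp_hourly + rates.trainer_hourly)
    performance = stations * exam_hours * (rates.sp_hourly + rates.examiner_hourly)
    support = rates.support_staff * exam_hours * rates.support_hourly
    return training + performance + support + rates.fixed_costs


if __name__ == "__main__":
    rates = OsceRates()
    for stations, minutes in [(6, 10), (10, 10), (10, 15), (15, 10)]:
        total = osce_cost(30, stations, minutes, rates)
        print(f"{stations} stations x {minutes} min: ${total:,.0f}")
```

Under these placeholder rates, adding stations raises training and performance costs roughly in proportion, while lengthening stations raises only the exam-day costs. Real budgets will differ, particularly where examiner time is paid or circuits run in parallel with separate staff.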
Securing Funds
Sources of Funds:
OSCEs are relatively expensive. The source of funds varies from site to site and program to program, but in most cases the examination coordinator will be involved in the funding of the OSCE at some level. Possible sources include the following:
- Fees charged to students/candidates
- University department
- Faculty/Dean's Office
- Private industry
- Grants
- Profit from other educational activities
Funds to administer an OSCE may be derived from a number of sources. Where funding is sought depends on the organization of the medical school. In many medical schools, budgets are controlled by department heads. Thus, the Chair of Psychiatry may be a very important person to work with in order to secure funds for your OSCE. In other schools, the budget is held centrally and approval must be sought from the Dean's Office. At some sites, educational programs will be funded through a local teaching facility such as a hospital or clinic. For most OSCE planners it will be necessary to make a presentation to an individual or a board that will ultimately approve the budget for the OSCE. We have found that the most useful approach is to "speak in the language" of those controlling the budgets. There are many aspects of an OSCE that are appealing to administrators and academic boards. Examples are outlined below. A skillful presentation of several of these points can be the key to successfully obtaining funding for your OSCE.
Useful Arguments to Help You Secure OSCE Funding:
1. Valid assessment:
Departments of Psychiatry are very concerned with the teaching of interpersonal skills, relationships with patients, and the dynamic techniques of empathy, in addition to the biological components of psychiatry. It is easy to make a case that written tests are not adequate assessments of any of these skills. Thus, a live patient examination is essential if these skills are to be tested. A traditional long oral examination with one "real" patient goes part way to addressing these skills. However, consider that these "real" patients are not as real as we sometimes assume. For example, a student is usually asked to interview a real patient as though the patient were actually under the student's care. But of course the patient is already under the care of someone else and most often is on an inpatient unit. Medical students or residents would rarely be in the position of assessing an already hospitalized and treated patient in real practice. The problem is even greater for medical students, most of whom will not become psychiatrists and thus would never be asked to give second opinions on psychiatric care. Thus, both the context and the approach to the patient can be quite unrealistic in oral exams, making them very "unreal" assessments. As a result, the validity of oral examinations can be quite low. While it is true that some interpersonal skills can be assessed, it is very rare that any of the difficult interpersonal challenges, such as handling anger, active psychiatric symptoms, or a strong countertransference, will emerge during these oral examinations. Patients are simply not selected to participate when they are very ill.
The use of standardized patients allows the evaluator to introduce such elements as angry patients, seductive patients, and patients with difficult problems such as active psychosis. These are much better tests of a student's ability to establish a relationship with a sick patient and show that the student has the capability to make a safe, rapid, and effective assessment. As well, it is more fair to use the same standardized situations for all students. The question to ask skeptics is: "We know how our students will perform in taking a routine history from a relatively stable patient, but how will they do when they are faced with an angry, hostile, or difficult patient?" Convincing a Chair or Dean that it is essential to assess these competencies using a fair and valid OSCE can be a very successful route to obtaining support for your project.
2. Appeals and the Failing Student:
Increasingly, medical educators are asked to justify their assessments of students who appeal marks and grades. Although it is extremely important to have a formal appeal mechanism and defensible evaluation tools at the certification level, it is increasingly true that medical schools themselves are faced with lawsuits and appeals. Thus assessment techniques used in medical schools must also meet a high standard of reliability and validity.
A commonly discussed measure of reliability is Cronbach's alpha, which gives an indication of the degree to which student performance is consistent across multiple observations. For high-stakes exams such as those used for certification and licensure, reliability should exceed 0.80, a level at which little of the observed variance in student performance is attributable to sources of error in the examination process. To achieve this level of reliability, up to 15 stations are generally required. However, even 3 or 4 stations will have a higher reliability than a single observation of a student with one patient. Reliability as high as 0.61 has been obtained with a psychiatry OSCE that contained as few as 4 stations (21). Likely, a psychiatry OSCE with between 5 and 10 stations will produce enough independent observations to be defensible and have a high enough reliability to discriminate the competent from the incompetent student.
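For readers who wish to see how these figures relate, the following sketch computes Cronbach's alpha from a table of station scores and projects how reliability might change with the number of stations. The projection uses the Spearman-Brown formula, a standard psychometric tool that is introduced here as an assumption and is not part of the discussion above; it treats all stations as equally informative, which real stations rarely are. The function names are illustrative only.

```python
import math
from statistics import variance


def cronbach_alpha(scores):
    """scores: one row per student, each row a list of station scores."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 stations and 2 students")
    station_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(station_vars) / total_var)


def spearman_brown(reliability, length_factor):
    """Projected reliability if the exam is lengthened by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


def stations_needed(observed_reliability, observed_stations, target):
    """Stations needed to reach target, assuming equally informative stations."""
    # Reliability of a single station implied by the observed exam.
    r1 = observed_reliability / (
        observed_stations - (observed_stations - 1) * observed_reliability
    )
    n = target * (1 - r1) / (r1 * (1 - target))
    return math.ceil(n - 1e-9)  # small tolerance for floating-point error


if __name__ == "__main__":
    # Hypothetical scores for 4 students at 3 stations.
    demo = [[70, 65, 72], [55, 60, 58], [80, 78, 85], [62, 70, 66]]
    print("alpha:", round(cronbach_alpha(demo), 2))
    print("stations for 0.80:", stations_needed(0.61, 4, 0.80))
```

Under these assumptions, a reliability of 0.61 at 4 stations projects to roughly 11 stations for a reliability of 0.80, which is compatible with the estimate of up to 15 stations given above.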
3. Research:
Chairs and Deans are greatly interested in the development of research among their academic staff and students. Rigorous evaluation techniques such as the OSCE lend themselves naturally to research because of the large amount of data produced. As well, the reliable and valid nature of most OSCEs makes it easier to draw conclusions regarding the data than is possible with other, more unreliable methods. A very convincing argument for funding can be made when Chairs or Deans understand that the money invested in an OSCE will stimulate a whole new research endeavor. Funded OSCEs can pique the research interest of faculty members as well as fellows, residents, and even medical students who may go on to use OSCE data as a springboard for medical education research of all kinds.
4. Congruence:
National certification and licensing authorities, including the Medical Council of Canada and the National Board of Medical Examiners in the United States, are turning to OSCEs for assessment at the certification level. It behooves medical schools to create experiences for their students that will help them to be successful at national licensing examinations. The skills required for success on certification examinations are better developed through opportunities offered during medical school training than in separately funded, profit-oriented preparation courses set up outside of medical schools. Many Chairs and Deans will have heard from students who are anxious about the national licensing exams, and thus student support may be helpful in your effort to obtain funding for an OSCE.
Many techniques can be used to make an effective presentation to your Chair or Dean. One particularly good technique is to present a case vignette. We have used the following true vignette at our medical school to illustrate the importance of the OSCE in psychiatry.
Vignette:
A clinical clerk was interviewing a psychotic patient. As the interview progressed the patient became increasingly paranoid and agitated as the student failed to address the apparent fears the patient had of the student. Finally the situation became so intolerable that the patient bolted from the room. Alarmed by the situation the student leaped to his feet, grabbed the patient and threw him up against the wall in a headlock.
These events occurred during an examination. An examiner was present in the room and the psychotic man was a standardized patient.
This vignette describes an actual situation that occurred at our departmental OSCE. It is an anecdote that leads to a rich discussion about the reality of simulation, the importance of having students experience difficult clinical problems, and the opportunity to give students feedback while in medical training rather than after they have made a mistake in practice. Case vignettes are extremely powerful in organizing a presentation to garner support for your project.
Gathering a Team
A highly respected scholar of education was asked at his retirement party to describe the secret of his success. He said simply, "I have always surrounded myself with people who are smarter than I am!" This wise advice is good to remember when organizing a team to set up something as complex as an OSCE. While no individual should try to set up an OSCE alone, the number of people involved will depend on the size of the examination. Small 4- or 5-station exams administered once or twice a year will need only a couple of staff members, whereas large faculty-wide examinations or repeated administrations will require more personnel.
Generally, three kinds of help are required:
- A small group of educationally astute and enthusiastic participants who will help you create your examination.
- Specialists and consultants in specific areas who can advise you in areas outside your expertise or when problems arise.
- Support staff who are hired to supervise and administer the logistics of the examination.
Finding a small number of like-minded and enthusiastic colleagues (the first category) is very helpful. These might include other faculty psychiatrists, senior residents, and medical educators or allied health professionals. A small team of three or four seems an ideal number to do the initial groundwork such as convincing the Department of the importance of the exam, doing early faculty development and promotion, and creating the first examination material. A retreat outside of business hours, perhaps in a nice location, is an effective way to start the process.
In the second category are specialists with areas of expertise that you may not have. These include statisticians, professional educators, and standardized patient trainers. You will need to involve all of these people if you do not have this expertise yourself. In all cases, explore the degree to which your university employs people with these qualifications. Often you will find that other departments are running OSCEs and are willing to collaborate and advise you in the development of your psychiatry OSCE. These specialists may join your team and in some cases may go on to become collaborators in research and development. In other cases these busy people prefer to be advisers to your group, helping you develop your initial exam plan and consulting when you encounter difficulties. If they are salaried by a university or hospital, they may be willing to assist you as part of their job. Others, particularly standardized-patient trainers, will need to be financially compensated if they do not have permanent salary support.
The third category of personnel is the support staff. The first people to consider are staff already employed by the university in administrative positions. For example, many schools have administrative assistants for undergraduate and postgraduate education, and these people, although very busy, may be able to provide some support either in advance or during the examination process. Standardized patient programs often can provide support staff as well. Also, in our experience residents with an interest in education have been enthusiastic about participating at undergraduate and even postgraduate OSCEs.
Standardized Patient Program
Not all schools have established standardized patient programs, but more and more have. If your school has a standardized patient program, one of the first things you should do is meet with the director to discuss your ideas for a psychiatry OSCE. Standardized patients across North America are "professionalizing," meaning they have increasingly well-defined guidelines for salary and working conditions and are increasingly sophisticated in the general domain of education. Many standardized patient program directors and trainers have advanced degrees in education. Thus, the staff of a local standardized patient program can be invaluable in helping you to develop your psychiatry OSCE. As well, standardized patient programs often have a bank of stations that may be easily modified to save work for authors in your department.
If you do not have a local standardized patient program, you might consider visiting a nearby university that has one. They may be willing to help with the standardized patient component of your examination. They can also provide invaluable advice about how to find and work with standardized patients at your institution. We are often asked about the use of faculty, volunteer allied health professionals, or residents as standardized patients. We have had some experience in this regard and described the use of faculty SPs in our first pilot study of a mini-OSCE (21). Although more readily available, faculty SPs are not ultimately cheaper because their time spent on the exam is time taken away from other pursuits such as clinical activities. There can be difficulties in standardizing the performance of faculty because they are not professionally trained to act as standardized patients. However, when carefully utilized, faculty can role-play very well, and there may be some advantages to encouraging their participation. Other volunteers such as residents, medical students, or volunteers from the community have been used in OSCEs by others. The degree to which these people can be effective standardized patients will relate to the amount of time available for their training and the careful selection of people who are able to portray psychiatric roles. Overall, however, our group would recommend strongly the use of professional standardized patients because of the complexity of the roles they will have to portray in psychiatry. In the exam setting even more than in the teaching setting, the fidelity or accuracy of portrayal is extremely important.
When working with standardized patients who are performing psychiatry roles for the first time, it is very helpful to introduce them to the psychiatry context. We have found that standardized patients sometimes feel confident with the portrayal of psychosocial roles but have never seen realistic psychiatric presentations. Opportunities to meet with patients who have actually experienced psychiatric illness, the use of videos showing psychopathology, and visits to psychiatric hospitals are very useful in helping standardized patients bring their roles to life. When we first began to portray acute psychosis secondary to schizophrenia, the actors greatly benefited from a trip to see the acute care unit of a psychiatric hospital. Further, seeing a portrayal of a psychotic patient on video and meeting patients who had experienced mental illness were invaluable in ensuring the accuracy and authenticity of the role.
A final consideration is recent evidence suggesting that portraying emotionally difficult roles may have an effect on the standardized patients themselves. Our group has explored factors that contribute to this impact on standardized patients and also described interventions that mitigate such adverse effects (27). Most standardized patients and programs that have not delved into complex psychiatric roles do not have experience with these potential adverse effects. Psychiatrists and their colleagues creating psychiatry OSCEs and scenarios for teaching will want to work closely with standardized patient programs to monitor for such effects and to educate one another about the potential risks and benefits of portraying complicated emotional roles. With that caution in mind, we now believe that virtually any psychiatric presentation can be brought to life by a standardized patient in a believable and entirely authentic way. While such roles can be emotionally draining for standardized patients and on occasion can produce some adverse effects, they are on the whole very gratifying for the standardized patients, many of whom return again and again to perform these roles.

SECTION 4. CREATING A PSYCHIATRY OSCE
The OSCE Blueprint
Before creating OSCE stations it is essential to create an OSCE "blueprint." The blueprint is a matrix that outlines parameters for the exam: content areas, knowledge, skills and attitudes, station type, and length. The blueprint matrix ensures that there is balance across the exam and between different administrations of similar exams. Figure 2 shows an example of a blueprint for content areas, and Figure 3 provides a grid to help you create your own OSCE blueprint.
The first consideration in exam design is the station length. OSCE stations ranging from 4 minutes to over an hour have been described in the literature. There are two major factors that have to be considered in determining station length. The first consideration is the congruence of the station length with actual practice. Although residents might be assessed in a 45-minute encounter with a psychiatric patient, it is unlikely that most medical students will ever spend this much time in an assessment. Interactions of family physicians range from 5 to 15 minutes, and thus stations of this length are likely to be most appropriate for the clinical clerk or "undifferentiated physician." Very short stations of 5 minutes or less do not generally permit a complete assessment and therefore require students to be directed to one part of the interaction. For example, a station might direct the student to "examine this patient for extrapyramidal symptoms." Stations of 12 to 15 minutes are long enough for a complete encounter such as would occur in a family physician's office or in an emergency department. Stations of this length are easier to administer and require fewer instructions. Of course, as the station length grows, the total exam time will also grow if the same number of stations is used. Twice as many 5-minute stations can be used as 10-minute stations in the same testing time. The use of very long stations, such as lengths of 30 or 45 minutes, requires a great deal of testing time if a minimum of 4 stations is to be used. Many centers have settled on 15-minute stations that are long enough to test a complete assessment but short enough to allow the use of many different scenarios. It is reasonable for students to have as much as 2 or 2½ hours of testing time in one administration. Thus up to 8 or 10 stations of 15 minutes can be used in one examination. Return to Figure 1 in Section 3 and ensure that you are happy with the station length that you have chosen.
Use your selected station length to fill in the appropriate section of the OSCE blueprint grid shown in Figure 3 in this section.
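The trade-off between station length and number of stations is simple arithmetic, and it can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the sketch assumes no changeover time between stations, which a real timetable would need to add.

```python
def stations_that_fit(testing_minutes: int, station_minutes: int) -> int:
    """Whole number of stations that fit into the available testing time."""
    if station_minutes <= 0:
        raise ValueError("station length must be positive")
    return testing_minutes // station_minutes


def total_testing_minutes(n_stations: int, station_minutes: int) -> int:
    """Testing time one student needs to complete every station once."""
    return n_stations * station_minutes


# Compare candidate station lengths against a 2-hour and a 2.5-hour session.
for minutes in (5, 10, 15, 30, 45):
    print(
        f"{minutes:>2}-minute stations: "
        f"{stations_that_fit(120, minutes)} fit in 2 hours, "
        f"{stations_that_fit(150, minutes)} fit in 2.5 hours; "
        f"4 stations take {total_testing_minutes(4, minutes)} minutes"
    )
```

Running the loop makes the cost of very long stations visible: the testing time needed for even 4 stations grows quickly with station length.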
The next important area to blueprint is content. Start with the assumption that scenarios tested at the OSCE should reflect the content of the curriculum. Let us consider a typical psychiatry clerkship OSCE as an example. Although clerkships vary tremendously between universities, most are between 4 and 10 weeks. Take for example a 6-week clerkship that is organized to include 6 thematic topics: mood disorders, anxiety disorders, child psychiatry, psychosis, personality disorders and substance abuse disorders. For this curriculum in which students spend 1 week focusing on each of these topics, an excellent blueprint would involve a 6-station OSCE with 1 station from each of the theme weeks. If the examination was administered at different times throughout the year, different scenarios within the same topics could be chosen. As shown in Figure 2, the scenario relevant to the "mood" week might be a postpartum depression in one examination and hypomania in a subsequent examination. It is useful to have the group responsible for the curriculum, such as the Undergraduate Education Committee, review and prioritize the curriculum content prior to completing the blueprint. In Figure 3, fill in the desired content for each station.
The next variable required in the blueprint, the task, addresses the balance of knowledge, skills, and attitudes to be tested in the examination. Many things can be tested with the same scenario. For example, a station involving a young person with schizophrenia might include taking a history, performing a physical examination for extrapyramidal symptoms, counseling about medication treatment, and providing support and management of crisis. The tasks undertaken in stations should reflect the competencies taught in the curriculum. They also should be balanced across the examination. For example, an unbalanced blueprint would contain 5 physical examination stations, 1 management, and no history taking. The inclusion of advanced skills such as management depends on whether such skills are an expected competency of students taking the curriculum. Certainly a resident group will be expected to manage and treat patients as well as give them advice about the treatment recommendations. However, clerkship stations are more often focused on history gathering, physical examination, and simple diagnostic decision making. In Figure 3, indicate the task of each station by filling in the knowledge, skills, or attitudes that you want to test. When this is completed, a quick scan of the blueprint will allow you to ensure a balance of station requirements.
Finally, you need to consider the format of the station. There are three major station formats: a) an uninterrupted interaction between the student and the patient (e.g., a history, physical, combination of the two, or counseling); b) a long interaction between the student and the patient, interrupted in the last 1 or 2 minutes for questions from the examiner; or c) a shorter interaction between the patient and the student that is terminated approximately at the halfway mark and followed by written questioning (sometimes called a post-encounter probe or PEP). It is best not to mix different station types within the same examination because this has the potential to confuse the students. If different station types are to be used, the exam should be run in different segments. For example, in a mixed-design 10-station exam, students might begin with 5 long stations. After a rest the students would then take a series of 5 "couplet" stations, each of which includes a 5-minute patient encounter and a 5-minute PEP. In Figure 3, indicate the format of each station in your OSCE.
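One way to keep a blueprint checkable is to record each station as a row with its content, task, format, and length, and then count the tasks. The sketch below is a hypothetical illustration and is not the grid in Figure 3. The field names, the task labels, and the list of expected tasks are assumptions made for the example. The unbalanced blueprint reproduces the example given above (5 physical examination stations, 1 management, and no history taking).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Station:
    content: str          # curriculum topic, e.g. "psychosis"
    task: str             # what the student is asked to do
    station_format: str   # e.g. "uninterrupted", "examiner questions", "PEP"
    minutes: int


def task_counts(blueprint):
    """How many stations test each task."""
    return Counter(station.task for station in blueprint)


def missing_tasks(blueprint, expected_tasks):
    """Expected tasks that no station in the blueprint covers."""
    counts = task_counts(blueprint)
    return [task for task in expected_tasks if counts[task] == 0]


def uses_mixed_formats(blueprint):
    """True if more than one station format appears in the same exam."""
    return len({station.station_format for station in blueprint}) > 1


# Hypothetical list of tasks the curriculum expects to see tested.
EXPECTED_TASKS = ["history", "physical examination", "management"]

# The unbalanced example from the text.
unbalanced = [
    Station(f"station {i}", "physical examination", "uninterrupted", 15)
    for i in range(1, 6)
] + [Station("station 6", "management", "uninterrupted", 15)]

print(task_counts(unbalanced))
print("Not covered:", missing_tasks(unbalanced, EXPECTED_TASKS))
print("Mixed formats:", uses_mixed_formats(unbalanced))
```

A quick scan of the counts plays the same role as a visual scan of the completed grid: it flags tasks that are overrepresented or absent before any stations are written.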
The final consideration in exam blueprinting is the number of students that will take each exam administration. The more stations used, the greater the number of students that can be tested in one examination "circuit." A circuit consists of a complete set of all the stations in an OSCE. Circuits are either repeated or duplicated for greater numbers of students. For example, if 20 clerks were to be tested following a 6-week clerkship rotation, and the blueprint created included 10 stations, two options exist: a) give the examination two times or b) run two circuits of 10 stations simultaneously. The first option makes the testing time for examiners and standardized patients longer, but the second option, although shortening the testing time, requires at least 20 contiguous rooms. Site capacity often determines which of these approaches will be taken. The choice of simultaneous or sequential circuits also has implications for standardized patient performance and thus has an impact on the cost of the examination. As well, longer examinations generally require the provision of coffee or lunch for examiners and standardized patients, thus increasing the cost.
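The choice between repeated and simultaneous circuits can be laid out numerically. The following is a minimal sketch under stated assumptions: every circuit starts with one student at each station, there are no rest stations or changeover time, and the 15-minute station length in the example is chosen only for illustration. The function name is hypothetical.

```python
import math


def circuit_options(n_students: int, n_stations: int, station_minutes: int) -> dict:
    """Rooms and testing time for sequential versus simultaneous circuits."""
    # One circuit of n stations can test n students at a time.
    circuits_needed = math.ceil(n_students / n_stations)
    one_circuit_minutes = n_stations * station_minutes
    return {
        "sequential": {
            "rooms": n_stations,
            "minutes": circuits_needed * one_circuit_minutes,
        },
        "simultaneous": {
            "rooms": circuits_needed * n_stations,
            "minutes": one_circuit_minutes,
        },
    }


# The example from the text: 20 clerks and a 10-station blueprint.
options = circuit_options(n_students=20, n_stations=10, station_minutes=15)
for name, plan in options.items():
    print(f"{name}: {plan['rooms']} rooms, {plan['minutes']} minutes of testing")
```

Comparing the two rows against the rooms actually available at your site shows quickly whether the simultaneous option is feasible.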
Creating Stations
The best OSCE stations are created from real clinical scenarios. In order to run an OSCE, a bank of stations will have to be created. The content of these stations will be determined by the blueprinting process described above. Likely, several stations with similar content (depression) but different scripts (postpartum depression in a young woman, grief in an elderly man) will be required. Once a list of stations and tasks has been developed, station authors need to be engaged to create the stations. Many people can create psychiatry stations, including psychiatrists, psychologists, allied health professionals, psychiatry residents, and even medical students. Of course, for a high-stakes examination there are good arguments to be made for using faculty members in the Department of Psychiatry. We have found that a team consisting of a psychiatrist and a standardized patient is optimal for creating new psychiatry OSCE scenarios. The psychiatrist can provide the clinical detail necessary to make the role realistic, while the standardized patient can provide advice about what is most easily portrayed and what is realistic and achievable in the short period of time. Bringing such teams together to write stations can be done effectively in a group retreat or by using several independent author meetings. The author or pair of authors should be encouraged to write a very detailed story of the case and to create a checklist of items that would be expected of students in the OSCE stations. Once the station has been written in draft, it is wise to pilot-test it as early as possible to ensure that the scenario is realistic and can be accomplished within the constraints of the time limit of the station. A template for station creation is provided in Figure 4.
Measurement Instruments
Literature is available to exam creators to help them decide what type of instruments to use (28–30). This is an evolving field, and there has been a shift in recent years away from binary checklists (Yes/No or Done/Not Done) toward global ratings. The reasons for this change are complex and relate to the properties of assessment instruments in general. This literature is not reviewed in detail here, but the interested reader might review papers by Hunter et al. (31) and Charlin et al. (32). Our group believes that there is a role for checklists but that heavy weight should be given to global ratings that are better able to capture the complex interpersonal interactions emphasized in psychiatry. Qualities such as empathy, organization, and rapport are not well captured by binary checklists. Nevertheless, the specific steps of taking a good diagnostic history or performing a physical examination can still be effectively captured on a well-designed checklist.
An important caution relates to recent research that shows checklists do not capture increasing levels of expertise (33). Although checklists are good tools for assessing the skills of a novice, it has been shown that at higher levels of expertise, clinicians ask fewer rather than more questions about a topic. Their years of experience allow pattern recognition and the incorporation of nonverbal material to help them make a more rapid diagnosis. Often experts ask a small number of questions only to confirm their hypothesis and then move on to management. Thus, faculty who are creating OSCEs for residents and for psychiatrists or other professionals in practice will want to rely heavily on global ratings. We have argued in the literature that new measures of competence are required to assess the skills of experts, but while these are still being developed the best measures we have are the global ratings (34). We have provided a sample checklist (Figure 5) and a sample global rating (Figure 6) that may be used in a psychiatry OSCE.
In an entirely formative examination used solely for feedback to students, psychometric considerations such as reliability and validity (discussed in Section 7) are less important. However, in a high-stakes examination such as one used to derive a grade for the student at the end of a clerkship rotation, exam creators will have to pay attention to the ability of the examination to provide data that accurately discriminate between students. The literature suggests that global ratings are somewhat better at discriminating between students in a reliable fashion (35). On the other hand, the interrater reliability (agreement between two raters) is slightly higher for checklists, as might be expected, because of their very specific behavioral descriptors. You will have to carefully decide which types of ratings to use and how to weight them. In our experience we have been satisfied with a combination of 50% checklist and 50% global ratings. Figure 7 summarizes the characteristics of ratings.
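A weighted combination of checklist and global rating can be computed as in the sketch below. It is an illustration only. The 5-point global scale, the rescaling of that scale so that its lowest point counts as 0%, and all function names are assumptions for the example and are not taken from Figures 5 and 6.

```python
from typing import Sequence


def checklist_percent(items_done: Sequence[bool]) -> float:
    """Percentage of binary checklist items marked Done."""
    if not items_done:
        raise ValueError("checklist has no items")
    return 100.0 * sum(items_done) / len(items_done)


def global_percent(rating: int, scale_min: int = 1, scale_max: int = 5) -> float:
    """Rescale a global rating to 0-100 (assumed 1-to-5 scale by default)."""
    if not scale_min <= rating <= scale_max:
        raise ValueError("rating is outside the scale")
    return 100.0 * (rating - scale_min) / (scale_max - scale_min)


def station_score(
    items_done: Sequence[bool], rating: int, checklist_weight: float = 0.5
) -> float:
    """Weighted station score; the default is 50% checklist, 50% global rating."""
    if not 0.0 <= checklist_weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return (
        checklist_weight * checklist_percent(items_done)
        + (1.0 - checklist_weight) * global_percent(rating)
    )


# A student who completed 3 of 4 checklist items and received a global rating of 4.
print(station_score([True, True, False, True], rating=4))
```

Changing `checklist_weight` lets you see how sensitive a student's station score is to the weighting decision before you commit to one.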
At the end of the encounter it is standard practice for the examiner to make an overall judgment of whether the student is competent or not competent. Sometimes this is put into the language of pass and fail. Most exams now incorporate a middle category of borderline, which not only allows greater feedback to the student but is necessary for an important standard-setting process called the Borderline Means Method. This method is described more fully later.
A final consideration is who will be completing the ratings. The standard in Canada has been to use physician examiners to grade all OSCEs. In the United States, there is a tradition of using standardized patients to assess student performance. A developing literature shows that under the right circumstances, standardized patients can also be effective and reliable evaluators (30). However, there are some special considerations when using standardized patients. First of all, if they are required to complete a checklist they will have to do this entirely after the scenario is completed and the student has left the room. Their ability to do this requires an adequate interval between scenarios to remember and record what was done. They also have to "leave" their role while they are doing the evaluation. It has been our experience that standardized patients performing very emotional or complex psychiatric roles find it difficult to leave the role quickly in the short period of time required to complete the evaluation and then get back into role quickly. If standardized patients will be completing measures, it is necessary to leave ample time and make checklists reasonably short so that standardized patients can easily complete them between stations. Our research group and our standardized patients are for the most part not in favor of asking SPs to move in and out of role frequently to rate students during psychiatry OSCEs.
There is a different implication for global ratings. Certainly ample literature demonstrates that standardized patients are effective raters, and perhaps more effective than faculty to adjudicate the immediate interpersonal relationship variables such as empathy and rapport. However, standardized patients are not usually experienced health professionals and may not be able to adjudicate such qualities as organization, integration of clinical material, or complex management strategies such as handling a borderline or psychotic patient. When creating global ratings for standardized patients, it is important to distinguish which domains are within the assessment abilities of a standardized patient and which would be better adjudicated by a faculty member. When time permits, a combination of assessment by both standardized patient and faculty member is ideal. If it is difficult for the standardized patient to complete a paper rating while "in role," he or she can quickly give a verbal impression to a faculty member who is completing the global rating.

SECTION 5. PREPARING FOR A PSYCHIATRY OSCE
Training Standardized Patients
We strongly recommend that an experienced standardized patient trainer train standardized patients. A standardized patient trainer has a wealth of experience bringing roles to life and can spot problems in station design before the exam. Standardized patient trainers will need to spend about 4 to 6 hours training each standardized patient or group of standardized patients for each new role. Standardized patient trainers generally earn more than do standardized patients, in the range of $20 to $30 (Canadian) per hour. However, we recognize that hiring an experienced trainer may not always be possible because of financial constraints. If it is necessary for faculty members to train standardized patients directly, they should spend time with an experienced standardized patient trainer or standardized patient program at a nearby university. Several publications provide helpful advice for training SPs (36–39).
The training of standardized patients involves several stages. First, the standardized patient should be given the role to read and a chance to reflect on the scenario. If he or she is entirely unfamiliar with the clinical problem (as commonly occurs in psychiatry), supporting experiences such as meeting with patients or visiting clinical settings can be helpful. Videotapes demonstrating psychopathology are also extraordinarily useful. Following this phase the standardized patient and trainer will meet to discuss the role and bring it to life through role-playing. Once a standardized patient has become comfortable with the role it is time to practice with a "student." Often the standardized patient trainer performs this role first, acting as a "standardized student." Where possible it is very helpful to have psychiatrists help with this phase of the training. Once the standardized patient feels comfortable with the role and it looks realistic to someone with experience in psychiatry, it is ready to be incorporated into the OSCE. We strongly recommend the use of a "dry run" prior to the first portrayal in the exam. Using a dry run format, the examiner who will be marking the station interviews the standardized patient to get a sense of how the station works. Dry runs can be completed in 20 to 30 minutes prior to the start of the exam.
Recruiting and Training Examiners
In a busy academic world it is an unfortunate reality for many of us that the recruitment and retention of examiners is one of the most difficult tasks. Nevertheless, it has been our experience that once they experience their first OSCE, even the most skeptical faculty members find it an enjoyable experience. The chance to observe their students performing in OSCE scenarios is very interesting for faculty and often has a powerful influence on their teaching. We have also found that the examination is a rare opportunity for faculty members to meet together and discuss educational issues. For this reason it is useful to have an opportunity for faculty to informally discuss their observations, both before the exam and after. A small investment in coffee and something to eat, as well as reimbursement of parking, goes a long way to maintaining the cooperation and enthusiasm of faculty examiners.
Although residents and allied professionals may also function very effectively as examiners, we have chosen to insist that our examination staff be faculty psychiatrists. We feel that the presence of faculty at the examination demonstrates for students the importance of the process. For giving feedback, a series of observations from 6 or 8 faculty members can have a big impact on students. While residents and allied health professionals certainly can be excellent examiners, we have been conscious of a "slippery slope." That is, if your faculty feel that other colleagues such as residents or allied professionals can be delegated the job of examining at the OSCE, you may never see faculty members again. Allied health professionals and residents can, however, have active roles at an examination, including the development of roles, the marking of written components, and other support functions.
Prior to each OSCE, examiners will need to be oriented. Even for examiners familiar with OSCEs, it is helpful to review the parameters of your specific OSCE with them because they may have recently participated in another OSCE with different characteristics. Send out a package that contains information about the examination site, time, and a brief overview of the process a few weeks in advance. On examination day, gather the examiners together 30 to 45 minutes before the exam for a more formal orientation. This is the time to review the timing signals, timing of stations, number of candidates, use of checklists and global ratings, and so on. You will also want to give a short list of "dos" (e.g., do write narrative comments on each checklist) and "don'ts" (e.g., don't give feedback directly to the students about their performance). It is helpful to create a standard set of orientation instructions that can be read by a volunteer at each orientation. Following the overview of the OSCE, send the examiners to the rooms where they will work and have them meet the standardized patients for a dry run of the station. Having the examiners actually interview the SPs is a great way to familiarize them with the station.
An important caution is to prevent faculty examiners from changing the roles at the exam. Although faculty input is helpful in role creation and training, changes at the exam are not only unhelpful and confusing to the standardized patients, but may also be blatantly unfair to students where other versions of the same station are being performed across the hall or later in the day. Rather, faculty should be encouraged to give their feedback about stations following the exam. When faculty members know that their suggestions are taken seriously and that roles are dynamic and constantly being improved, they are more than willing to complete such post-exam evaluations.
Finding an Exam Site
One of the most difficult tasks in organizing an OSCE is finding an appropriate site. In practice, any space that has a series of contiguous rooms with several chairs can provide an acceptable site for an OSCE. A small examination of 4 to 5 stations can be accommodated within a small clinic or set of contiguous clinical offices. It may not be a big imposition to ask colleagues to move from their offices for a couple of hours for an examination. A larger exam of 8 to 12 stations will probably require a student learning center, an abandoned ward, or a clinic facility. In an era of hospital closures there are often abandoned hospital wards available. The timing of the examination may have to be flexible in order to secure a space. Clinical teaching centers and hospital clinics that shut down at certain times of the day or on evenings and weekends make excellent OSCE sites. While finding a site is one of the more mundane aspects of running an OSCE, locating and confirming a location is something that must be done early and repeated for each academic year. Some centers have the luxury of a testing center that is available for free or for a modest fee. Such facilities often come with the added benefit of videotaping technology, microphones, and prearranged room setups that accommodate an OSCE. These additional features are wonderful; however, in practice, almost any empty room with a couple of chairs will suffice. In complex stations such as those that require patients to be in beds, a little more planning is required. But for ease of organization, a stretcher is better than a bed and a chair is better than a stretcher.

SECTION 6. RUNNING A PSYCHIATRY OSCE
Preparation
There are many steps that allow an exam team to run an efficient OSCE. A standardized patient trainer or other exam coordinator should be appointed to ensure everything is done. A clear sense of "who does what" is essential to a well-coordinated OSCE. An excellent video, How to Run An OSCE, was created by the Educating Future Physicians for Ontario Project (EFPO) and is available to help exam coordinators (40). (Ordering information is found in Appendix B.) Figure 8 is a brief checklist of the tasks to be completed before, during, and after the OSCE. You may wish to tailor the checklist for your own needs.
Useful Documents
The following are useful documents to prepare prior to the OSCE:
- Student orientation instructions
- Examiner orientation instructions
- Examiner feedback form
- Student feedback form
Reporting Results to Students
The format used to report results to students (Figure 9, Figure 10) will depend on the evaluation policy at your school. Some schools prohibit the circulation of marks in favor of an honors-pass-fail or pass-fail system. This will affect the type of OSCE feedback you can give. In general, feedback can include any or all of the following: overall grade, component grades (by station, by checklist, by global rating), and narrative comments. In general, the more feedback students receive, the better. Simply providing a final numeric grade will not direct students to a program of further study or practice. Students are particularly appreciative of specific behavioral feedback such as individual narrative descriptions of performance in OSCE stations.

SECTION 7. MONITORING AND IMPROVING PSYCHIATRY OSCEs
Resources for Monitoring Quality
An OSCE is an excellent opportunity for medical education research. This topic is addressed specifically in the next section. However, even when research is not an explicit aim of your examination team, ongoing monitoring of the quality of the examination is highly recommended. This becomes relatively easy because an OSCE generates a large amount of data. There are several ways of ascertaining the quality of your examination.
The following is a partial list of sources of information about the quality and effectiveness of your OSCE:
- Post-examination feedback from examiners.
- Post-examination feedback from students.
- Post-examination feedback from standardized patients.
- Student examination scores.
- Relationship of examination scores to other variables such as written tests, ward marks, and supervisor evaluations.
- End-of-rotation focus groups with students.
- Focus groups with faculty preceptors.
- Relationship of OSCE results to national licensing exam scores.
- Relationship of exam results to scores on performance-based assessments in other disciplines.
Data Analysis
By their nature, OSCEs generate a large amount of data that can be used to ascertain and improve the quality of the examination. Several types of analysis can be performed as described below. Each will be considered in detail.
The method by which you will enter and analyze data is an important consideration. Checklists and global ratings can be scored by hand if they are not too numerous. For larger exams, it is helpful to enter results into a computer program such as Microsoft Excel. This can be done immediately at the exam site. Very large exams may best be analyzed by using scoring sheets that can be optically scanned.
Basic Psychometric Data:
For each examination, the following calculations are very useful: overall examination mean score and standard deviation, mean and standard deviation for each station, exam total mean for each student, highest mark, lowest mark, and Cronbach's alpha (reliability).
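For teams that prefer a script to a spreadsheet, these calculations can be sketched with the Python standard library. The sketch assumes a complete table of station scores (one row per student, one column per station, no missing marks) and uses the usual variance-based formula for Cronbach's alpha. The function names and the scores are invented for illustration.

```python
from statistics import mean, stdev, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a students-by-stations table of scores."""
    n_stations = len(scores[0])
    if len(scores) < 2 or n_stations < 2:
        raise ValueError("need at least 2 students and 2 stations")
    station_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("all students have the same total score")
    return (n_stations / (n_stations - 1)) * (
        1 - sum(station_variances) / total_variance
    )


def summarize(scores):
    """Basic psychometric summary of one examination."""
    student_means = [mean(row) for row in scores]
    columns = list(zip(*scores))
    return {
        "overall_mean": mean(student_means),
        "overall_sd": stdev(student_means),
        "station_means": [mean(column) for column in columns],
        "station_sds": [stdev(column) for column in columns],
        "student_means": student_means,
        "highest": max(student_means),
        "lowest": min(student_means),
        "alpha": cronbach_alpha(scores),
    }


# Invented percentage scores: 5 students (rows) by 4 stations (columns).
example_scores = [
    [72, 65, 80, 70],
    [58, 60, 62, 55],
    [85, 78, 90, 82],
    [66, 70, 68, 64],
    [75, 72, 77, 79],
]

for name, value in summarize(example_scores).items():
    print(name, value)
```

With real data, a station whose mean or standard deviation looks very different from the others is a natural candidate for closer review.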
Item Analysis:
If you have the resources to enter every data point (each yes/no from the checklist, every individual global scale score), you can perform an item analysis. Such analyses permit a deeper understanding of how a station performs. Standard analyses of response rate (the percentage of students who scored the item correctly, often called item difficulty) and item-total correlation are useful. Additional "expert item classification" schemes have been described and may add richness to item analysis (41).
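As a rough sketch of the two standard item statistics mentioned above, the following Python computes the proportion of students credited with each checklist item and a correlation between each item and the rest of the checklist. Several things here are assumptions rather than anything specified in this guide: checklist marks are coded 1 (done) or 0 (not done), the function names are hypothetical, and the "corrected" item-total correlation (item versus the total of the remaining items) is used in place of the plain item-total correlation so that an item is not correlated with itself.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns NaN if either variable has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def item_analysis(checklist):
    """checklist: one row per student, each row a list of 0/1 item marks."""
    n_items = len(checklist[0])
    results = []
    for j in range(n_items):
        item = [row[j] for row in checklist]
        # Total of all *other* items, so the item does not inflate its own r.
        rest = [sum(row) - row[j] for row in checklist]
        results.append({
            "item": j,
            "p_correct": sum(item) / len(item),
            "item_rest_r": pearson(item, rest),
        })
    return results


# Hypothetical station: 4 students x 3 checklist items.
marks = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]
for row in item_analysis(marks):
    print(row)
```

Items that almost everyone or almost no one is credited with, or items with a near-zero or negative correlation, are natural candidates for review when the station is revised.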
Examiner Feedback:
Helpful examiner feedback includes satisfaction ratings related to elements such as realism of stations, standardized patient performance, and utility of measures. Descriptive analysis of such data can assist in improvement of individual stations, as well as provide information for understanding faculty feeling about the OSCE in general. Problems can be remedied, and positive feedback can be used in reports to Chairs or Deans to support ongoing funding.
Student Feedback:
Simply asking for student feedback sends an important message about faculty openness. Satisfaction measures, like those completed by faculty, can be used to identify problems and meet other challenges of the exam format such as appeals by failing students.
Tests of Concurrent Validity:
With examination results in hand, you are well positioned to assess the concurrent validity of your OSCE. Correlating exam results with other measures is the first step. One would expect to find correlation with similar tests (e.g., process scores on the OSCE might correlate with student marks in an interviewing course). Tests of unrelated domains (OSCE vs. a multiple-choice test) are likely to have low or no correlation. Assuming both tests are relatively reliable (moderate to high Cronbach's alpha), such low correlation is an indicator that the tests are examining different knowledge or skill sets.
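The correlation step described above might be sketched as follows. The example assumes paired scores for the same students on the OSCE and on one other measure. The second function applies the classical correction for attenuation (dividing the observed correlation by the square root of the product of the two reliabilities), which is a standard psychometric formula introduced here for illustration and not a step this guide prescribes. All names and numbers are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def disattenuated_r(r_xy, reliability_x, reliability_y):
    """Estimate the correlation that would be seen with perfectly reliable tests."""
    return r_xy / sqrt(reliability_x * reliability_y)


# Hypothetical paired scores for six students.
osce_process = [62, 70, 55, 81, 74, 66]
interview_course = [65, 72, 58, 79, 70, 69]

r = pearson_r(osce_process, interview_course)
# Assumed reliabilities (Cronbach's alpha) of the two measures.
r_corrected = disattenuated_r(r, reliability_x=0.80, reliability_y=0.75)
print(round(r, 2), round(r_corrected, 2))
```

If the correlation remains low after this correction, unreliability alone is unlikely to explain it, which supports the interpretation that the two tests examine different knowledge or skill sets.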
Tests of Construct Validity:
Construct validity is a demonstration that the OSCE tests an underlying domain or construct. Generally, this is "clinical competence in psychiatry." In order to demonstrate such validity, a common technique is to use the test to assess comparison groups. For example, volunteer residents should obtain higher scores than clerks. Expert psychiatrists should obtain higher scores than residents. Failure to demonstrate such increases in scores would raise questions about exam validity. It could be that the measures are not sensitive to increasing levels of expertise (33) or it may indicate that other constructs such as "test-taking ability" or "sunny personality" are confounding measurement of the desired construct. We have published an example of a validity study that may assist exam teams interested in pursuing this aspect of OSCE development (23).
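A comparison-group check of the kind described above could be sketched as follows. This is only an illustration: the group labels and scores are invented, the function names are hypothetical, and the use of Cohen's d as an effect size is a choice made for this example, not the analysis reported in the validity study cited above.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(lower_group, higher_group):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(lower_group), len(higher_group)
    pooled_var = (
        (n1 - 1) * stdev(lower_group) ** 2 + (n2 - 1) * stdev(higher_group) ** 2
    ) / (n1 + n2 - 2)
    return (mean(higher_group) - mean(lower_group)) / sqrt(pooled_var)


def expertise_gradient(groups):
    """groups: list of (label, scores), ordered from least to most experienced.

    Returns the group means and whether they rise strictly with experience.
    """
    means = [(label, mean(scores)) for label, scores in groups]
    increasing = all(a[1] < b[1] for a, b in zip(means, means[1:]))
    return means, increasing


# Hypothetical total OSCE scores (percent) for three comparison groups.
groups = [
    ("clerks", [58, 62, 65, 60, 63]),
    ("residents", [68, 72, 70, 75, 71]),
    ("psychiatrists", [74, 78, 80, 76, 79]),
]
means, increasing = expertise_gradient(groups)
print(means, increasing)
print(round(cohens_d(groups[0][1], groups[1][1]), 2))
```

With volunteer groups this small, a formal significance test would add little; the ordering of the means and the size of the differences are the main things to inspect.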
Standard Setting
OSCE planners must give careful thought to the method they will use to make pass-fail decisions. In general, it is not sufficient to simply select an arbitrary pass mark, such as 60%. Sophisticated methods of standard setting have been described (42,43), but a full discussion of this topic is beyond the scope of this guide. Briefly, the method that our group has employed for many years is a "borderline means" approach. Using this method of standard setting, the mean score of all candidates whose performance is judged to be "borderline" is used to determine the pass score for each OSCE station. Individual station pass marks ("cut scores") are summed to arrive at an overall examination pass mark. We have been very satisfied with this method, which is also employed in the national licensure OSCE of the Medical Council of Canada.
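The arithmetic of the "borderline means" approach described above can be sketched as follows. The data layout is an assumption: each station is taken to record, for every candidate, a numeric score together with an examiner's overall judgment, one value of which is the label "borderline". The station names, labels, and function names are hypothetical.

```python
from statistics import mean


def station_cut_score(results, borderline_label="borderline"):
    """Mean score of the candidates judged borderline at one station.

    results: list of (score, global_judgment) pairs for that station.
    """
    borderline_scores = [
        score for score, judgment in results if judgment == borderline_label
    ]
    if not borderline_scores:
        raise ValueError("No candidates were judged borderline at this station.")
    return mean(borderline_scores)


def exam_pass_mark(stations):
    """Sum the station cut scores to obtain the overall pass mark.

    stations: dict mapping station name -> list of (score, judgment) pairs.
    """
    return sum(station_cut_score(results) for results in stations.values())


# Hypothetical two-station exam, each station scored out of 20.
stations = {
    "depression interview": [
        (16, "pass"), (11, "borderline"), (9, "borderline"), (6, "fail"),
    ],
    "psychosis assessment": [
        (15, "pass"), (12, "borderline"), (7, "fail"), (10, "borderline"),
    ],
}
for name, results in stations.items():
    print(name, station_cut_score(results))
print("overall pass mark:", exam_pass_mark(stations))
```

A station at which few or no candidates are judged borderline gives an unstable cut score or none at all, which is why the sketch raises an error in that case instead of guessing.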

SECTION 8. RESEARCH AND THE PSYCHIATRY OSCE
As discussed in the previous section, OSCE examinations generate a large amount of data, allowing for comprehensive monitoring of quality. It is a small step to move from quality assurance to research. Much innovative and interesting research has been published since the first description of an OSCE and the use of standardized patients several decades ago. There are a few considerations in moving from quality assurance to research.
The first consideration is consent. Every institution has a scientific and ethical review process. The body responsible for this process should approve all research proposals relating to OSCEs. Although some institutions consider it ethical to publish retrospective analyses of aggregate student data without ethics approval, we have found it very useful to proceed through scientific and ethical review in any case. The process is not onerous and helps to hone the thinking of the research team. In any instance where an intervention is made during the assessment that would not be undertaken under normal educational circumstances, consent must be obtained from participants. There is an important power issue to consider when taking consent from students during an examination. Naturally, students are concerned about their results, and it is important not to put them in a position where they feel that refusing to give consent could in any way affect their academic progress. Researchers not involved with the assessment of students should approach students for consent. The difference between consenting to have one's data used in a study and the necessity of taking a regular examination for purposes of assessment and promotion in the medical school should be made very explicit. Where individual student data are to be assessed for research purposes, it is useful to have an assistant assign code numbers to students and remove their names before the research team analyzes the data. Ultimately, only aggregated student data should be reported in presentations and publications unless specific consent was obtained from students.
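The coding step mentioned above, in which an assistant replaces names with code numbers before analysis, might look like the following sketch. It is an illustration only: the record layout, the "S001"-style codes, and the function name are hypothetical, and the key linking codes to names is assumed to be kept by the assistant and withheld from the research team.

```python
import random


def deidentify(records, seed=None):
    """Replace student names with randomly assigned code numbers.

    records: list of dicts, each with a 'name' key plus score fields.
    Returns (coded_records, key), where key maps code -> name and should
    be stored separately from the data given to the research team.
    """
    rng = random.Random(seed)
    codes = [f"S{n:03d}" for n in range(1, len(records) + 1)]
    rng.shuffle(codes)  # so codes do not reveal the original order

    coded_records, key = [], {}
    for record, code in zip(records, codes):
        key[code] = record["name"]
        coded = {k: v for k, v in record.items() if k != "name"}
        coded["code"] = code
        coded_records.append(coded)
    return coded_records, key


# Hypothetical records as they might come from the exam spreadsheet.
records = [
    {"name": "Student A", "station_1": 14, "station_2": 12},
    {"name": "Student B", "station_1": 17, "station_2": 15},
    {"name": "Student C", "station_1": 9, "station_2": 11},
]
coded_records, key = deidentify(records)
print(coded_records)
```

Only `coded_records` would be passed to the research team. Removing names does not by itself guarantee anonymity in a small class, so aggregate reporting remains important.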
The second consideration in undertaking research is the appropriate payment of colleagues involved. Generally faculty are not paid to be examiners at departmental or faculty OSCEs. Where a greater amount of their time is required because of a research protocol, it is often appropriate to pay a stipend. This likely means applying for a grant to undertake the study. Similarly, departmental staff such as secretaries, standardized patient coordinators, and education coordinators may find their work increased by the presence of a study, and consideration should be given to compensating them for their increased time. Finally, it may be necessary to ask students to be involved in an assessment that is not part of their regular course evaluation. In this case they become research subjects and should be given some compensation for their participation in the research project, in addition to giving consent. In our experience, medical students are generally willing to participate in several hours of OSCE testing outside of their usual assessment for between $25 and $75 Canadian.
Undertaking research on evaluation methods including the OSCE is an excellent way to contribute to the academic mission of one's university. Not only does it bring rigor to the evaluation methodologies being used, but the presentation of results at national and international conferences and publication in journals brings one into contact with a large number of colleagues. As mentioned in the section on financing, the ability to derive academic research from educational endeavors such as an OSCE is a potent factor in assuring continued funding and support from department Chairs and Deans.
Several types of research have been conducted on OSCEs, and there are many areas where further research is greatly needed. As cited above, researchers have studied the basic psychometrics of OSCEs (including reliability and validity), the cost of examinations, and their incorporation into various curricula. Areas of innovative research have included the effects of portraying emotionally difficult roles on the standardized patients themselves, the impact of OSCEs on students' learning, and cross-cultural and gender aspects involved in assessment. A recent and growing literature looks at the development of expertise as a variable in the assessment of competence. Other areas for OSCE research might include:
- Methods of setting pass-fail standards.
- Correlation of results with clinical practice outcomes.
- Ways to incorporate feedback into OSCEs.
- Comparison of the effectiveness of scoring by physicians, student peers, and standardized patients.
- Assessment of skills such as psychotherapy or electroconvulsive therapy.
- Use of OSCEs to assess residents.
- Use of OSCEs to assess physicians in practice or as part of continuing education.
- Assessment of interprofessional skills.
- Simulations of groups or families.
- Assessment of cross-cultural skills.
There are many more questions to be answered regarding the assessment of competence and the OSCE itself. In our experience, the most fruitful way to develop a research project is to talk with faculty and students and find out what topics they become excited or even passionate about. Often a controversy or disagreement about the use of the examination forms the nidus for an excellent research idea. In recent years, many granting agencies and journals have appeared that are eager to fund studies and publish articles in the area of assessment, and they value OSCE research highly. Appendix B lists agencies that will fund studies on assessment in psychiatry, journals that will publish articles related to assessment in psychiatry, and other resources for faculty.

REFERENCES
1. Harden RM, Gleeson FA: Assessment of clinical competence using an objective structured clinical examination. Med Educ 1979; 13:41-47
2. Barrows HS, Abrahamson S: The programmed patient: a technique for appraising student performance in clinical neurology. Journal of Medical Education 1964; 39:802-805
3. van der Vleuten CPM, Swanson D: Assessment of clinical skills with standardized patients: state of the art. Teaching and Learning in Medicine 1990; 2:58-76
4. Abrahamson S (ed): The oral examination: the case for and against, in Evaluating the Clinical Skills of Medical Specialists. Evanston, IL, American Board of Medical Specialties, 1983, pp 121-124
5. McGuire CH: The oral examination as a measure of professional competence. Journal of Medical Education 1966; 41:267-274
6. Davidson RG: A point of view: oral examinations (letter). Annals of the Royal College of Physicians and Surgeons of Canada [Annals RCPSC] 1983; 16:114
7. Valberg LS, Stuart RK: A point of view: university in-training evaluation and oral examinations in internal medicine. Annals RCPSC 1983; 16:513-515
8. Jayawickramarajah PT: Oral examinations in medical education. Med Educ 1985; 19:290-293
9. Muzzin LJ, Hart L: Oral examinations, in Assessing Clinical Competence, edited by Neufeld VR, Norman GR. New York, Springer, 1985, pp 71-93
10. Leichner P, Sisler GC, Harper D: A study of the reliability of the clinical oral examination in psychiatry. Can J Psychiatry 1984; 29:394-397
11. Leichner P, Sisler GC, Harper D: The clinical oral examination in psychiatry: the patient variable. Annals RCPSC 1986; 19:283-284
12. Rothman AI, Cohen R: Understanding the objective structured clinical examination (OSCE): issues and options. Annals RCPSC 1995; 28:283-287
13. Kowlovitz V, Hoole AJ, Sloane PD: Implementing the objective structured clinical examination in a traditional medical school. Acad Med 1991; 66:345-347
14. Matsell DG, Wolfish NM, Hsu E: Reliability and validity of the objective structured clinical examination in paediatrics. Med Educ 1991; 25:293-299
15. Sloan DA, Donnelly MB, Johnson SB, et al: Use of an objective structured clinical examination to measure improvement in clinical competence during the surgical internship. Surgery 1993; 114:343-351
16. McFaul PB, Taylor DJ, Howie PW: The assessment of clinical competence in obstetrics and gynecology in two medical schools by an objective structured clinical examination. British Journal of Obstetrics and Gynaecology 1993; 100:842-846
17. Cohen R, Rothman A, Ross J, et al: A comprehensive assessment of graduates of foreign medical schools. Annals RCPSC 1989; 21:505-509
18. Hodges B, Turnbull J, Cohen R, et al: Evaluating communication skills in the OSCE format: reliability and generalizability. Med Educ 1996; 30:38-43
19. Loschen EL: Using the objective structured clinical examination in a psychiatry residency. Academic Psychiatry 1993; 17:95-104
20. Famuyiwa OO, Zachariah MP, Ilechukwu STC: The objective structured clinical exam in psychiatry. Med Educ 1991; 25:45-50
21. Hodges B, Lofchy J: Examining psychiatry clinical clerks with a mini-OSCE. Academic Psychiatry 1997; 21:219-225
22. Hodges B, Regehr G, Hanson M, et al: Evaluating psychiatric clinical clerks with an objective structured clinical examination. Acad Med 1997; 72:715-721
23. Hodges B, Regehr G, Hanson M, et al: The objective structured clinical exam in psychiatry: a validation study. Acad Med 1998; 73:74-76
24. Hanson M, Hodges B, McNaughton N, et al: The integration of child psychiatry into a psychiatry clerkship OSCE. Can J Psychiatry 1998; 43:614-618
25. Hodges B, Hanson M, McNaughton N, et al: What do psychiatry residents think of an objective structured clinical examination? Academic Psychiatry 1999; 23:1-7
26. McNaughton N, Tiberius R, Hodges B: Effects of portraying psychologically and emotionally complex standardized patient roles. Teaching and Learning in Medicine 1999; 11:135-141
27. Cusimano MD, Cohen R, Tucker W, et al: A comparative analysis of the costs of administration of an OSCE. Acad Med 1994; 69:571-576
28. Reznick R, Regehr G, Yee G, et al: Process rating forms versus task-specific checklists in an OSCE for medical licensure. Acad Med 1998; 73(10, suppl):S97-S99
29. Regehr G, MacRae H, Reznick R, et al: Comparing the psychometric properties of checklists and global rating scales for assessing performance on an OSCE-format examination. Acad Med 1998; 73:993-997
30. Regehr G, Freeman R, Robb A, et al: OSCE performance evaluations made by standardized patients: comparing checklist and global rating scores. Acad Med 1999; 74(10, suppl):S135-S137
31. Hunter DM, Jones RM, Randhawa BS: The use of holistic versus analytic scoring for large-scale assessment of writing. The Canadian Journal of Program Evaluation 1996; 11:61-85
32. Charlin B, Tardif J, Boshuizen PA: Scripts and medical diagnostic knowledge: theory and applications for clinical reasoning instruction and research. Acad Med 2000; 72:182-190
33. Hodges B, Regehr G, McNaughton N, et al: OSCE checklists do not capture increasing levels of expertise. Acad Med 1999; 74:1129-1134
34. Hodges B, McNaughton N, Regehr G, et al: The challenge of creating OSCE measures to capture the characteristics of expertise. Med Educ (in press)
35. Regehr G, Freeman R, Hodges B, et al: Assessing the generalizability of OSCE measures across content domains. Acad Med 1999; 74:1320-1322
36. Barrows HS: Simulated Patients (Programmed Patients): Development and Use of a New Technique in Medical Education. Springfield, IL, CC Thomas, 1971
37. Stillman P, Ruggill JS, Rutala PJ, et al: Patient instructors as teachers and evaluators. Journal of Medical Education 1980; 55:186-193
38. King A, Perkowski-Roger L, Pohl S: Planning standardized patient programs: case development, patient training and costs. Teaching and Learning in Medicine 1994; 6:6-14
39. Hanson M, Tiberius R, Hodges B, et al: Adolescent standardized patients: methods of selection and assessment of benefits and risks. Teaching and Learning in Medicine 2002; 14:104-113
40. Educating Future Physicians for Ontario: How to Run an OSCE (video and manual). University of Toronto, University of Ottawa, Queen's University, McMaster University, and University of Western Ontario, 1994 [See Appendix B for ordering information.]
41. Hodges B, Russell L, MacRury K, et al: The anatomy of an OSCE station: augmenting classical item statistics with expert item classification (abstract). Proceedings of the International Ottawa Conference on Assessment, Maastricht, The Netherlands, 1996
42. Hambleton RK, Powell S: A framework for viewing the process of standard setting. Evaluation and the Health Professions 1983; 6:3-24
43. Meskauskas JA: Setting standards for credentialling examinations: an update. Evaluation and the Health Professions 1986; 9:187-203