Commentary
Appropriate Expertise and Training for Standardized Patient Assessment Examiners
Jay Parkes; Nancy Sinclair; Teresita McCarty
Academic Psychiatry 2009;33:285-288.

Received March 19, 2009; revised and accepted March 20, 2009. The authors are affiliated with Educational Psychology in the College of Education, the Assessment and Learning Division of the School of Medicine, and the Department of Psychiatry at the University of New Mexico in Albuquerque. Address correspondence to Teresita McCarty, M.D., MSC09 5030, 1 University of New Mexico, Albuquerque, NM 87131; tmccarty@salud.unm.edu (e-mail).

Copyright © 2009 Academic Psychiatry

Standardized patient assessments are used routinely in high-stakes licensure examinations (1). The most important consideration for such examinations is whether they measure skills that are important in the clinical evaluation of patients—whether the examinations are valid. One of the key validity considerations is the role of expert judgment, and some educators have argued that physicians, not standardized patients, are the only appropriate judges of patient care skills (2). Although expertise is indeed a critical factor in valid determinations of clinical skills, it is essential to define the requisite expertise and to infuse that expertise throughout the assessment process rather than only when assigning final scores. So although expertise in clinical skills examiners is unequivocally required, those designing high-stakes performance assessments must answer a series of questions. Exactly what expertise is required? How is that expertise acquired? Where does expertise reside in the assessment process? How is examination expertise evidenced? The responses to these questions have implications for training examiners and, more generally, for designing assessments that use standardized patients.
Performance examinations that use standardized patients often assess at least two relatively distinct skill sets: clinical skills (obtaining the history and examining the patient) and communication skills (the techniques and behaviors used to obtain information from and interact with the patient). The nature of these skill sets needs to be considered. There are more concrete as well as more complex aspects to each, although clinical skills tend toward the more concrete while communication skills tend toward the more complex. Either the student did or did not elicit a particular portion of the medical history or perform a particular maneuver in the physical examination. In the context of communication skills, there are concrete, behavioral aspects of both verbal and nonverbal behaviors that may be observed "objectively." For instance, either the examinee made appropriate eye contact or did not. But other qualities associated with relationship-centered patient communication are less obvious. The dynamic aspect of how well the patient feels heard and understood is vital to establishing trust, compliance, and ultimately healthy outcomes. This is a construct assembled by the patient. Therefore, if a goal of the assessment is to measure how effectively a clinician communicates with a patient, the patient’s perception of the effect of that communication should be represented. This may be most validly assessed when the examiner is carefully trained to embody that perception.
In judging an examinee’s ability to take a history and perform a physical examination, an informed clinician, well experienced with the condition and its evaluation, brings the requisite expertise. (Equating experience with expertise is perhaps unwise, given that Hawkins et al. (3) showed a negative correlation between years since medical school graduation and standardized patient checklist scores when the seasoned clinicians were the examinees.) In judging an examinee’s ability to communicate effectively with a patient, a seasoned clinician might not bring the requisite experience, but the informed, observant, thoughtful, and engaged patient would have the requisite experience to be the expert (4, 5). The decision of what constitutes expertise in a particular domain needs to rest squarely on an understanding of the learning objectives to be assessed.
In actual clinical encounters, the patient is often the only judge of the physician’s communication skills. For example, a physician may think he or she did a very nice job of making health care recommendations, but the recommendations may not register with a patient who is distracted by an important concern the physician overlooked. At the global level, a patient’s perception of the clinician’s communication skills is often reflected in his or her satisfaction with the care. Thus, if the educational intent of a licensing examination is to ensure that practicing clinicians can communicate effectively with patients, important patient concerns should be designed into the cases, and the patient’s perception of clinician success in navigating both the patient’s and the clinician’s agendas should be represented strongly in the final score. Although it is difficult for a clinician-observer to monitor these subtle, dynamic, and complex states of "patient satisfaction," a well-designed case and a well-trained standardized patient can provide accurate information regarding patient perception and patient satisfaction.
Expertise is domain specific and therefore depends on the learning objectives of the assessment. Being a clinical expert is not the same as being a well-trained examiner. Examiner expertise requires being trained to apply the scoring guide accurately. So for either concrete or complex skill sets, a key characteristic of an effective examiner is someone who can be trained to score the instrument effectively. In fact, less clinical knowledge may make someone easier to train to be an examiner. It is important to select examiners who are trainable, who observe and remember accurately, whose scoring is consistent with established criteria, who enter the data appropriately, and who complete the task in a timely manner. Just as some patients would be hard to train as standardized patients, so could a clinician be hard to train as an examiner. Clinical experts have developed intuition and habits that help them deliver their expertise in a clinical setting. The instinctive nature of these well-practiced mental habits is not always desirable in an assessment situation where examinees need developmentally appropriate scoring and feedback. Examiner training, not expertise, is the key to scoring accuracy and even examination validity.
Expertise must be applied in the training of standardized patients. Standardized case portrayals support construct validity, while the reliability of scores hinges on the rater’s accuracy. Systematic training techniques to achieve accuracy in both the case portrayal and completion of the rating tool include clear case materials, coaching for accuracy and standardization of portrayal, opportunities for practice, and feedback for improvement (6, 7).
Expertise in training raises the issue that expertise needs to inform the entire assessment process, not just the examiner component. Expertise is needed in the choice of the learning objectives to be assessed, the selection and design of the cases, the training of the standardized patients’ portrayal of the cases, and the construction of the scoring guides.
Interviewing and physical examination may be observed directly and measured concretely. Either the examinee did or did not elicit the appropriate medical history or select the appropriate physical examination and perform it correctly. These skills can be successfully evaluated through a yes/no checklist. The key is having the right checklist. Expertise is needed to construct an appropriate checklist and to establish evidence of content validity (8). More complex tasks are better reflected in more complex scoring rubrics, such as global rating scales (9). Once the instrument exists, the best examiner is not necessarily someone with clinical expertise, but someone who can be trained to complete the scoring guide reliably.
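As a purely illustrative sketch of the yes/no checklist described above, the following Python fragment scores one encounter as the fraction of items the examiner marked as done. The item names and the equal weighting of items are assumptions made for the example; they are not drawn from any actual examination or from the cited studies.

```python
from typing import Dict


def checklist_score(observed: Dict[str, bool]) -> float:
    """Return the fraction of checklist items marked 'yes' (True).

    Equal weighting of items is an assumption of this sketch; a real
    scoring guide may weight or group items differently.
    """
    if not observed:
        raise ValueError("checklist has no items")
    return sum(observed.values()) / len(observed)


# Hypothetical checklist for one encounter (item names are invented).
encounter = {
    "asked_onset_of_symptoms": True,
    "asked_current_medications": False,
    "auscultated_lungs": True,
    "palpated_abdomen": True,
}

score = checklist_score(encounter)
print(f"Checklist score: {score:.2f}")
```

The arithmetic is trivial by design: the expertise lies in choosing the items, which is the point made in the paragraph above.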
Sometimes the best examiner is not someone but something. The National Board of Medical Examiners, a group of assessment design experts, has been using an automated scoring system, eschewing human examiners, let alone expert examiners, in its computer-based case simulation system (10). And there is evidence from other uses of automated scoring that the computers can score more accurately than humans can (11).
If the examiners are human, the validity burden is also borne by the training that the examiners receive, as already described. With sufficient education about the components of the scoring guide and practice at employing the guide across examples of examinee performance, "nonexpert" examiners can reach high levels of interexaminer and intraexaminer agreement (12).
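One common way to quantify interexaminer agreement on yes/no items is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is offered only as an illustration; the ratings are invented, and nothing here implies that the cited study (12) used this particular index.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa for two raters scoring the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    proportion of agreement and p_e is the agreement expected by
    chance from each rater's marginal category frequencies.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("raters must score the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical category throughout; kappa is
        # undefined, and returning 1.0 is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented ratings of ten checklist items by two trained examiners.
examiner_1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes", "no", "yes"]
examiner_2 = ["yes", "yes", "no", "no", "no", "no", "yes", "yes", "yes", "yes"]

kappa = cohens_kappa(examiner_1, examiner_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

In this invented example the examiners agree on 8 of 10 items, but because each marks "yes" 60% of the time, chance alone would produce 52% agreement, so kappa is lower than the raw agreement rate.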
These principles of expertise as applied to standardized patient examinations have implications for the kinds of validity evidence that should be collected. An examiner agreement index or the correlation between clinician and nonclinician examiners (13) needs to be interpreted very carefully, as do studies that employ clinicians’ expert judgment as the criterion for the predictive validity of standardized patient scores (14). As Clauser (15, p 316) succinctly expressed it, "Experts’ ratings are a provisional criterion, not an ultimate standard." The prerequisite question mentioned earlier is, "Is the clinician the appropriate expert?" With clinical skills such as history taking and physical examination, in which the clinician clearly has the relevant expertise, high correlation or agreement is desirable and is evidence of validity. With communication skills, clinician and nonclinician score agreement might actually indicate a validity problem, and examiner disagreement might be positive evidence of validity. Given a well-designed case and a well-trained standardized patient, such disagreement may not be error but may represent the authenticity of the patient experience that is important to clinical care.
The validity of standardized patient assessments is enhanced by infusing expertise throughout the process, not simply in the qualifications of examiners, which may prove to be the least effective place for it. Four steps for this infusion are implied:
Expertise varies by domain and does not readily transfer from one domain to another. In performance assessment, the application of expertise begins with the selection of the objectives to be assessed. Clarity about the assessment of objectives directs the designers to the most relevant domains of needed expertise. For assessment outcomes to be valid, the context, design, scoring guides, examiners, training, and implementation—all of which imply different areas of expertise—must be considered. Sometimes these areas of expertise may reside in one expert, and sometimes they may be constellated across different experts. The realistic infusion of expertise throughout the assessment is what supports validity.

1. Boulet JR, Smee SM, Dillon GF, et al: The use of standardized patient assessments for certification and licensure decisions. Simul Healthc 2009; 4:35-42
2. McLaughlin K, Gregor L, Jones A, et al: Can standardized patients replace physicians as OSCE examiners? BMC Med Educ 2006; 6:12
3. Hawkins R, MacKrell Gaglione M, et al: Assessment of patient management skills and clinical skills of practicing doctors using computer-based case simulations and standardized patients. Med Educ 2004; 38:958-968
4. Makoul G, Krupat E, Chang CH: Measuring patient views of physician communication skills: development and testing of the communication assessment tool. Patient Educ Couns 2007; 67:333-342
5. Mercer LM, Tanabe P, Pang PS, et al: Patient perspectives on communication with the medical team: pilot study using the Communication Assessment Tool-Team (CAT-T). Patient Educ Couns 2008; 73:220-223
6. Wallace P: Coaching Standardized Patients for Use in the Assessment of Clinical Competence. New York, Springer, 2007
7. Heine N, Garman K, Wallace P, et al: An analysis of standardized patient checklist errors and their effect on student scores. Med Educ 2003; 37:99-104
8. Boulet JR, van Zanten M, de Champlain A, et al: Checklist content on a standardized patient assessment: an ex post facto review. Adv Health Sci Educ Theory Pract 2008; 13:59-69
9. Hodges B, McIlroy JH: Analytic global OSCE ratings are sensitive to level of training. Med Educ 2003; 37:1012-1016
10. Clyman SG, Melnick DE, Clauser BE: Computer-based case simulations from medicine: assessing skills in patient management, in Innovative Simulations for Assessing Professional Competence. Edited by Tekian A, McGahie WC. Chicago, University of Illinois, Department of Medical Education, 1999, pp 29-41
11. Williamson D, Behar L, Hone A: "Mental model" comparison of automated and human scoring. J Educational Measurement 1999; 36:158-184
12. van Zanten M, Boulet JR, McKinley D: Using standardized patients to assess the interpersonal skills of physicians: six years’ experience with a high-stakes certification examination. Health Commun 2007; 22:195-205
13. Boulet JR, Ben-David MF, Burdick W, et al: An investigation of the sources of measurement error in the post-encounter written scores from standardized patient examinations. Adv Health Sci Educ Theory Pract 1998; 3:89-100
14. Boulet JR, McKinley DW, Norcini JJ, et al: Assessing the comparability of standardized patient and physician evaluations of clinical skills. Adv Health Sci Educ Theory Pract 2002; 7:85-97
15. Clauser BE: Further discussion of SP checklists and videotaped performances. Acad Med 2000; 75:315-316; author reply 317-318