In July 2002, the Accreditation Council for Graduate Medical Education (ACGME) began requiring residency programs to demonstrate resident competency in six areas: patient care, medical knowledge, practice-based learning and improvement, interpersonal and communication skills, professionalism, and systems-based practice. Previous studies indicate that measuring both professional (1) and medical (2) competence is extremely complex. Assessment techniques have limitations, and therefore multiple strategies are recommended (1). For example, multiple-choice questions (MCQ) may not be the best method to determine how a resident will perform with a patient (3). Global rating forms are limited because few faculty members are trained to use them, and there is evidence of bias and little discrimination among residents when such forms are used (4).
The ACGME developed a "Toolbox" suggesting possible techniques for evaluating each competency. The toolbox defines 13 evaluation techniques: record review, chart stimulated recall, checklist, global rating, standardized patients, objective structured clinical exam (OSCE), simulations and models, 360° global rating, portfolios, multiple-choice question exam, oral exam, procedures/case logs, and patient survey. Each competency has several associated skills (25 skills across all six competencies). Experts in educational measurement rated which evaluation method would be best for each competency skill (see "ACGME Competencies: Suggested Best Methods for Evaluation," http://www.acgme.org/Outcome/).
Research indicates that both medical students’ (2, 5) and residents’ (6) perceptions of the methods used to assess their competence are important. Morgan and Cleave-Hogg (2) advocate that the opinions of those evaluated should be addressed and, if possible, integrated into the proposed assessment technique. Fried, Devore, and Dailey (7) cite the well-established theory of reasoned action, which holds that individuals are likely to exhibit a behavior when they view it positively. In this case, residents who view an evaluation as an effective method of measuring their competency may be more willing to participate in the process.
The purpose of this study was to solicit residents’ perceptions of how effectively different evaluation methods assessed their competency for each of the 25 required skills defined by ACGME. We addressed the following questions:
Which evaluation technique do residents rate highest (or lowest) overall for each competency?
Which evaluation techniques do the residents rate as equally effective for evaluating their competency?
How do residents’ perceptions of evaluation techniques compare to the ACGME expert rankings?
The research team designed a survey to obtain residents’ ratings of each evaluation method. We gave the survey to PGY1-4 general psychiatry residents during the 2001-2002 academic year at the University of Arkansas for Medical Sciences. Residents in this program are evaluated by monthly global evaluation forms, an annual in-training residency examination, and two oral examinations. This was the first year that residents completed a required portfolio as part of their program.
For the survey, we used the ACGME matrix of 12 evaluation techniques and 25 associated skills (permission from S. Swing, personal communication). We did not include global rating, as this is the evaluation currently used in most programs. The survey asked residents about their perceptions of the effectiveness of each of the 12 other techniques for assessing their competency in each of the 25 defined skills. The residents gave one of four ratings: 0 = not at all effective, 1 = somewhat effective, 2 = effective, and 3 = very effective in assessing resident competency. Each resident made 300 (12 techniques × 25 skills) ratings. Additionally, residents provided these demographics: post-graduate year of training, gender, age, and U.S. or non-U.S. medical school graduation. All residents received a brief description of each assessment method as defined by the ACGME toolbox of assessment methods (http://www.acgme.org/Outcome/). We distributed and collected the surveys during scheduled weekly didactic meetings for residents. Participation was voluntary and responses were anonymous. The survey was not timed and took approximately 30 minutes.
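To make the structure of the resulting data concrete, the following is a minimal sketch of one plausible encoding of the responses; the column names, example skill labels, and long-format layout are our own illustrative assumptions, not the study’s actual data file.

```python
import pandas as pd

# Hypothetical long format: one row per (resident, technique, skill)
# rating; each resident contributes 12 techniques x 25 skills = 300 rows.
# Ratings use the survey's 4-point scale:
#   0 = not at all effective ... 3 = very effective.
ratings = pd.DataFrame(
    [
        # resident_id, technique, competency, skill, rating
        ("R01", "360 evaluation", "professionalism", "demonstrates respect", 3),
        ("R01", "procedures/case logs", "patient care", "gathers information", 1),
        # ... the remaining 298 ratings for R01, then the other residents
    ],
    columns=["resident_id", "technique", "competency", "skill", "rating"],
)

# Sanity check: every rating falls on the 0-3 scale.
assert ratings["rating"].isin([0, 1, 2, 3]).all()
```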
A brief follow-up survey was given to the residents to determine their experience with each of the techniques described. For each method, they could indicate whether they had never, rarely, sometimes, or often been evaluated with it. Because all data were anonymous and voluntarily reported, our Institutional Review Board (IRB) considered this study to have exempt status.
To address the first research question, we calculated means and standard deviations for each of the 300 ratings. We then averaged the residents’ ratings across the skills within each of the six competencies for the 12 techniques, resulting in 72 means. We reviewed these 72 means to determine which techniques were rated highest and lowest overall for each of the six defined competency areas. To address the second research question, concerning similarly evaluated methods, we selected Kendall’s tau-b to determine the relationship between perceptions of methods within each of the six competency areas. We chose this non-parametric correlation coefficient because of the limited range of possible scores. We inspected the resulting six (12 techniques × 12 techniques) matrices for intercorrelations exceeding 0.7; such high intercorrelations indicate that residents perceived two methods as functioning in a similar manner for assessing that competency. The third research question concerned how the residents’ ratings compared to the ACGME suggestions: we examined what the residents rated highly compared to what the ACGME suggested. The follow-up survey was analyzed descriptively; percentages describe the residents’ previous experience with each of the methods assessed in this study.
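As an illustration of the analysis just described, here is a short sketch that assumes the hypothetical `ratings` table above; it derives the 72 competency-by-technique means and, for each competency, the pairwise Kendall’s tau-b correlations among the 12 techniques, flagging intercorrelations above 0.7. This is our reconstruction for exposition, not the authors’ actual code.

```python
from scipy.stats import kendalltau  # SciPy computes the tau-b variant by default

# 72 means (with standard deviations): ratings averaged across the
# skills of each competency for each of the 12 techniques.
means = (
    ratings.groupby(["competency", "technique"])["rating"]
    .agg(["mean", "std"])
)

# One 12 x 12 Kendall tau-b matrix per competency: correlate the
# (resident, skill) rating vectors of every pair of techniques.
for comp, sub in ratings.groupby("competency"):
    wide = sub.pivot_table(
        index=["resident_id", "skill"], columns="technique", values="rating"
    )
    techniques = list(wide.columns)
    for i, a in enumerate(techniques):
        for b in techniques[i + 1:]:
            tau, _ = kendalltau(wide[a], wide[b])
            if tau > 0.7:
                # Residents perceive the two methods as functioning
                # similarly for this competency (e.g., OSCE and
                # standardized patients in the study's results).
                print(f"{comp}: {a} ~ {b} (tau-b = {tau:.2f})")
```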
Sixteen residents completed the survey (70%). Their average age was 32.03 years (SD = 6.29), ranging from 27 to 46 years. Eight (50%) were female. The respondents represented all four residency years (PGY1: 18.8% [N=3], PGY2: 31.3% [N=5], PGY3: 18.8% [N=3], PGY4: 31.3% [N=5]). One resident was a foreign medical school graduate. Table 1 provides the means and standard deviations summarized for the six competencies. The first question concerned which evaluation method was rated highest for each competency. The residents chose the 360° evaluation as the most effective for assessing resident competency in all competencies except medical knowledge. Logs were chosen as least effective for assessing resident competency in all of the areas except medical knowledge. Residents rated portfolios as the best way to demonstrate medical knowledge, with oral examination and multiple-choice questions rated as the next most effective techniques in this competency area. Patient surveys were rated least favorably as a means of measuring resident medical knowledge.
Evaluation methods receiving a mean score of 2 or greater were considered "effective" methods; those rated below 2 were seen as ineffective. The following evaluation methods were viewed as ineffective for all six general competencies: checklist, procedures/case logs, OSCE, standardized patients, record review, and simulations and models. The residents did not rate any of the 12 techniques as effective for evaluating their competence in practice-based learning and systems-based practice.
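Under the same hypothetical sketch, this cutoff amounts to a one-line filter on the competency-level means:

```python
# Techniques whose competency-level mean is 2 or greater are "effective".
effective = means[means["mean"] >= 2]
```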
The second research question addressed whether residents rated two methods as equally effective. The residents rated the OSCE and standardized patients similarly in terms of their effectiveness: Kendall’s tau exceeded 0.7 for the correlation between these two techniques for all six competencies. Only one other correlation exceeded 0.7, between simulations and models and standardized patients for systems-based practice.
The last research question concerned agreement between the residents’ ratings and those of the ACGME experts. The ACGME used the "most desirable" rating for only 46 of the 300 possible ratings, and our residents agreed with only 23% of these. Table 2 illustrates the methods perceived as effective for each competency (considering any skill) by the residents and by the ACGME experts.
For communication skills, the experts selected the OSCE and standardized patients as the best techniques; our residents rated these techniques as effective for listening, but not for creating a therapeutic relationship. The residents and experts agreed in the competency of medical knowledge in general, although the residents included portfolios where the experts included simulations and models. In professionalism, both groups supported the use of a 360° evaluation, but the residents preferred patient surveys to the OSCE, whereas the ACGME experts favored the OSCE over patient surveys. For the nine patient care skills, the residents agreed with the ACGME on two of the best methods for assessment, the 360° evaluation and chart stimulated recall. The residents rated all of the methods as very low in effectiveness for evaluating their competency in practice-based learning and systems-based practice; therefore, there was no agreement with the ACGME experts in these areas.
On the follow-up survey, we received responses from 10 of the original participants (63%). As medical students, at least 50% of the residents were often evaluated with checklists, faculty global ratings, and MCQ exams. At least 50% sometimes had experience with simulations and models and with oral exams. Over 50% had never been evaluated in medical school with chart stimulated recall, 360° evaluation, or portfolios, and they rarely experienced record review or patient surveys. Most were evaluated either rarely or sometimes with OSCEs and standardized patients. As residents, at least 50% were often evaluated with checklists, faculty global ratings, and portfolios. The residents had never been evaluated with chart stimulated recall, standardized patients, the OSCE, or 360° evaluation.
This study provides guidance to program directors concerning psychiatry residents’ perceptions of the techniques available to assess their competency effectively. The residents’ ratings indicated awareness that evaluators should use different techniques for different competencies. Residents recognized that different techniques could be equally effective for assessing a given skill. However, residents do perceive the effectiveness of techniques differently from the ACGME experts, who rated the methods based on psychometric qualities. Program directors should attend to residents’ perceptions in order to facilitate acceptance of whatever techniques are chosen to measure competency. The low ratings of techniques for practice-based learning and systems-based practice suggest that program directors should make sure residents understand what is to be evaluated and how the chosen method will work.
Residents are sufficiently knowledgeable about evaluation techniques to provide their perceptions. For example, the residents saw patient surveys as useful for assessing their effectiveness in communication skills and professionalism, but, logically, not in areas such as medical knowledge, and only for selected aspects of patient care, such as demonstrating caring and respectful behavior and providing patient education. Similar patterns occurred with most of the techniques. The residents also rated favorably techniques that they did not know as well: the 360° evaluation was consistently rated highly as an effective method to measure their competency, even though this technique was new to the residents.
Residents had difficulty distinguishing between the effectiveness of the OSCE and standardized patient methods. These two methods are similar in many ways, particularly in a specialty such as psychiatry, and this similarity probably explains the lack of distinction in the pattern of residents’ ratings.
Experts assigned the traditional global rating the descriptor "a potentially applicable method" only five times out of the 25 possible, indicating that they perceive more desirable techniques among the 12 other methods. We found limited agreement between the experts and the residents about the effectiveness of these other methods. Residents are as comfortable as the experts in assessing the best ways to measure competency in medical knowledge; they are most experienced in assessments of their medical knowledge and have comfort and familiarity with the proposed techniques. Experience with traditional techniques may be one reason that they tend to agree with the experts. However, the residents had experience with portfolios and rated these as the best method for assessing medical knowledge, whereas the experts did not rank portfolios. Therefore, the residents are making decisions based on more than just their traditional experience.
Experts frequently rated the OSCE or standardized patients as the most desirable technique in the competency of interpersonal and communication skills, but the residents did not. In other forums, the residents have mentioned the artificiality of these situations and thus rated them as less conducive to evaluating their ability in a skill such as establishing a therapeutic relationship. Because psychiatry requires highly developed communication skills, residents may sense that these artificial situations are limiting. On the other hand, residents could fear poor performance, resulting in their negative view. The residents’ lack of enthusiasm for the OSCE is also evident in the competencies of professionalism and patient care.
For practice-based learning and systems-based practice, the residents agreed with the ACGME on only one skill. Residents gave low ratings overall to all methods; they did not see these assessment options as particularly effective for measuring their competency. This may reflect their actual beliefs, or it may be related to unfamiliarity with these two competencies. Residency education has not emphasized these areas as much as the other four competencies, so it may be very difficult for the residents to make informed judgments. Residency program directors should clarify these competencies as well as explain the choice of an evaluation method.
Residents in our program use portfolios as part of their evaluation: they select examples of actual work they have done to demonstrate a specific competency as a psychiatrist (8). In this study, our residents rated portfolios as the best way to assess their medical knowledge. They also felt that portfolios could effectively measure their competence in some patient care skills, and they rated portfolios as the best way (other than the 360° evaluation) to measure competency in practice-based learning. Their experience likely influenced their perceptions.
Residents also discriminated among evaluation methods not used in their program. For example, the 360° rating was not used in our program at the time, yet residents ranked it as their preferred method of evaluation for all but the medical knowledge competency. During the survey, the residents mentioned liking the idea of receiving input from multiple perspectives, from individuals who are not usually considered in evaluations but who know the residents well. Residents are capable of endorsing new evaluation methods.
A limitation of this study is that residents from only one psychiatry residency program participated. A post-survey focus group could clarify the residents’ rationale for their ratings. Further research is needed to compare these results with surveys of residents in other psychiatry programs. It would also be interesting to compare psychiatry residents’ perceptions with those of residents in other specialties, and to obtain faculty perceptions of these methods.
This paper clarifies what methods are of interest to residents and how their perspective relates to that of experts. Residents can see the value of methods, even ones they do not know well. They also have insights about techniques they use frequently. Careful consideration of these results could allow a program director to work more collaboratively with the residents by considering their perspective in conjunction with expert information to design the best evaluation strategies for a particular program.
This study was supported in part by the Edward J. Stemmler M.D. Medical Education Research Fund of the National Board of Medical Examiners, "Demonstration of a Portfolio Assessment in Residency Education" (P.S. O’Sullivan, PI, #60-9899).