Several medical disciplines have addressed whether rotation timing for third-year medical students affects clerkship performance. The concern has been that completion of a rotation early in the clerkship cycle adversely affects student performance. In particular, the association of rotation timing with National Board of Medical Examiners (NBME) subject examination performance has prompted numerous studies because of the substantial influence these examinations often exert on clerkship grades and the consequences of a particular grade for residency matching. This has potential importance for students if the timing of the clerkship for the specialty in which they want to match is associated with NBME performance.
Overall, research has shown a positive, linear trend in NBME performance over time (1–12). This trend, however, is not universal, depending on specialty and sample, and the effect sizes are sometimes trivial (4, 6, 13–16). Moreover, the slight timing effect on NBME scores does not appear to generalize to United States Medical Licensing Examination (USMLE) Step 2 scores (5, 14, 16–19). Nevertheless, in psychiatry, the NBME has recently published quarterly norms for the psychiatry subject examination (19). Other performance indicators, such as attending physician evaluations, Objective Structured Clinical Examinations (OSCE), and school-specific oral and written examinations, have generally shown little effect of timing (3, 10, 11, 20–22). Several studies, however, have reported positive trends for specific evaluations, including history and physical examination skills (13), final written examinations (22), pattern recognition in surgery (15), and attending evaluations (4).
In summary, clerkship timing effects may or may not occur, depending on the sample and outcome measure. When clerkship timing effects do occur, the effect is often weak. No study has examined the association between clerkship timing and the overall pattern of performance in a clerkship. Data in psychiatry are particularly lacking. In addition, previous studies have treated rotation timing as a random phenomenon, when, in fact, a proportion of students complete a given rotation at their preferred time (1–6, 14). This timing preference effect may contribute to timing effects if students delay their most important rotations until later in the year. This may be related to residency matching, such that the rotation for the specialty in which the student wants to match may be similarly delayed. The purpose of this study was to examine the association between rotation timing and overall pattern of clerkship performance in psychiatry and the moderating effects of preferred rotation order and residency matching.
The Saint Louis University institutional review board gave expedited approval to this study. Data were aggregated for third-year clerks at Saint Louis University from 6 years of psychiatry rotations, eight 6-week rotations per year (N=869; 1997–1998: N=153, 1998–1999: N=151, 1999–2000: N=140, 2000–2001: N=146, 2001–2002: N=131, 2002–2003: N=148; period 1: N=116, period 2: N=112, period 3: N=110, period 4: N=108, period 5: N=98, period 6: N=112, period 7: N=104, period 8: N=109). Gender distribution was approximately equal in all years.
Students were assigned to rotation orders by a preference-lottery system. At this medical school, students must complete internal medicine, obstetrics-gynecology, and psychiatry rotations in one 6-month block and pediatrics, surgery, family medicine, and neurology clerkships in the other 6-month block. Students must complete psychiatry immediately before or after obstetrics-gynecology and family medicine immediately before or after neurology. Within these restrictions, students submit order preferences, which are met at random until it is necessary to alter preferred orders (randomly) to maintain a balance across blocks and rotations.
Data on preferred period for the psychiatry rotation were available for four of the 6 years studied (1999—00 to 2002—03) and 495 of the 869 students (57.0%). Data for residency matching were available for 5 years (1997—98 to 2001—02) and 686 of the 869 students (78.9%).
For the psychiatry clerkship, student grades are determined by the number of points students earn across several performance indicators. The lowest number of points a student may earn is 30, and the highest number of points is 422. Cutoff scores for grades of honors, near honors, pass, and deferred/fail have been established. The primary end-of-rotation performance indicators include the OSCE, the NBME psychiatry subject examination, and an attending evaluation. Additionally, students receive points for completing a series of learning experience exercises. Objective Structured Clinical Examination, NBME subject examination, and attending evaluation scores are weighted and standardized so that the total OSCE score and NBME score each account for 25% of the total possible points a student can earn, while the attending evaluation score accounts for 30% of the total points possible. The learning experiences score accounts for the remaining 20% of the points. For this study, only the OSCE, NBME subject examination, and attending evaluation scores were analyzed.
For the psychiatry OSCE, five stations were 15-minute interviews with standardized patients (SP) and scored by observers using binary (0/1) checklists (9–15 items). These mechanics checklists were content-oriented regarding history taking, mental status examination, communication, and prescriptions. The OSCE mechanics score ranged from 0–28. Standardized patients completed a six-item Patient Perception Questionnaire (PPQ; 5-point scale) on the student (23). The PPQ is process-oriented regarding interpersonal skills (e.g., listening, interest, respect), and the score ranged from 0–25. There were three writing stations requiring a differential diagnosis (12–14 items, scored 0/1) and observational summary (8–17 items, scored 0/1). Objective Structured Clinical Examination differential diagnosis and observation scores ranged from 0–15 and 0–20, respectively. Reliability and validity of these scores are established (24). OSCE scores were equated to account for minor content differences over the years. For the purposes of grading, all of the OSCE scores are summed together to generate a weighted point total.
The NBME psychiatry subject examination (NBME PSE) was administered at the end of each rotation. Each student’s national percentile rank (1–99) was used as the performance measure (norm-adjusted over the 6-year period; the NBME quarterly norms [19] were not used). The reliability and validity of the NBME subject tests are established (25).
The attending evaluation was an aggregate of 12 end-of-rotation ratings that evaluated performance in psychiatric knowledge, history taking, physical examination, mental status examination, information organization, communication, differential diagnosis, treatment planning, professionalism, motivation/attitude, and patient/health care team rapport (5-point scale). Internal consistency for the attending evaluation score (range = 29—147) was 0.95.
All statistical analyses were performed using SPSS 11.5 for Windows (SPSS, Inc., 2002).
A hierarchical agglomerative cluster analysis using Ward’s procedure and a squared Euclidean distance measure was used to establish homogeneous groups of students based on the six performance indicators (26). Cluster formation is iterative (for N cases, there are N − 1 merging stages). In stage one, the squared Euclidean distance between pairs of cases is calculated, and two cases are combined based on the Ward criterion, which minimizes within-cluster variance. There is a decrease in cases/clusters as cases/clusters are merged at successive iterations. At the final stage, all cases are in one cluster. In this study, change in agglomeration coefficient was examined to determine the number of clusters. This coefficient represents change in squared Euclidean distance between the two most dissimilar members in clusters being combined at a particular stage. Small changes indicate that homogeneous clusters are being merged. Large changes indicate that heterogeneous clusters are being combined. A large increase in agglomeration coefficient indicates the optimal number of clusters.
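The merging procedure described above can be illustrated with a minimal pure-Python sketch. It is not the SPSS routine used in the study. It assumes the agglomeration coefficient is tracked as the cumulative within-cluster sum of squares (the usual quantity for Ward's method), it uses the single largest jump as a simple stand-in for the "first large change" criterion, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def ward_merge_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge.

    Each cluster is a (size, centroid) pair.
    """
    (na, ca), (nb, cb) = a, b
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * squared_distance


def ward_agglomeration(points):
    """Return [(clusters_remaining, coefficient), ...], one entry per merge."""
    clusters = [(1, list(p)) for p in points]
    coefficient = 0.0  # assumed: cumulative within-cluster sum of squares
    schedule = []
    while len(clusters) > 1:
        # Pick the pair whose merge adds the least within-cluster variance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        (na, ca), (nb, cb) = clusters[i], clusters[j]
        coefficient += ward_merge_cost(clusters[i], clusters[j])
        n = na + nb
        centroid = [(na * x + nb * y) / n for x, y in zip(ca, cb)]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((n, centroid))
        schedule.append((len(clusters), coefficient))
    return schedule


def suggested_cluster_count(schedule):
    """Cluster count just before the merge with the largest coefficient jump."""
    best_jump, best_count = -1.0, None
    previous_coefficient = 0.0
    clusters_before = len(schedule) + 1
    for clusters_after, coefficient in schedule:
        jump = coefficient - previous_coefficient
        if jump > best_jump:
            best_jump, best_count = jump, clusters_before
        previous_coefficient, clusters_before = coefficient, clusters_after
    return best_count


# Toy data (not study data): two well-separated groups of three cases.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_schedule = ward_agglomeration(toy_points)
print(len(toy_schedule), suggested_cluster_count(toy_schedule))
```

With six cases the schedule has five merges, and the only large jump comes when the two groups are forced together, so the sketch suggests two clusters. The naive pair search is cubic in the number of cases and is meant only to make the logic visible.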
Chi-square tests of association were used to examine relationships between clusters and timing. Cramer’s V statistic was used as an indicator of effect size (27). The V in Cramer’s V statistic represents the four-fold point correlation of categorical variables and is interpreted as a correlation (range = 0.0–1.0), with values less than 0.30 considered weak. Trend analysis and analysis of variance (ANOVA) were used to identify performance trends and differences between clusters. Eta-squared (η²) was used to indicate effect size (27). Eta-squared represents the percentage of variance in the outcome measure that is explained by group designation, with values less than 0.10 considered weak.
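As a rough illustration of the two effect sizes, the sketch below uses their standard textbook definitions, which the text does not spell out: V = sqrt(χ² / (N × (k − 1))), where k is the smaller of the number of rows and columns, and η² = SS_between / SS_total. The function names are hypothetical and the grouped scores are toy values. The call to `cramers_v` uses the chi-square, sample size, and table dimensions (five clusters by eight periods) reported in this paper.

```python
import math


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V from a chi-square statistic (standard definition)."""
    return math.sqrt(chi_square / (n * (min(n_rows, n_cols) - 1)))


def eta_squared(groups):
    """Proportion of total variance explained by group membership."""
    values = [x for group in groups for x in group]
    grand_mean = sum(values) / len(values)
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return ss_between / ss_total


# Reported values: chi-square = 37.3, N = 869, 5 clusters x 8 periods.
print(round(cramers_v(37.3, 869, 5, 8), 2))  # about 0.10

# Toy scores for two groups (not study data).
print(round(eta_squared([[1, 2, 3], [4, 5, 6]]), 3))
```

Under this definition, the reported chi-square for cluster by period gives a V of about 0.10, which is well under the 0.30 threshold for a weak association.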
Performance Cluster Analysis
In this study, the first large agglomeration coefficient change occurred between the five-cluster and four-cluster solutions (iterations 864 and 865, 291-point increase), indicating that a five-cluster solution was optimal. Solution validity was evaluated by examining between-cluster differences on performance indicators using multivariate analysis of variance (MANOVA), which was significant, Wilks’ Λ = 0.12, p<0.001. As shown in Table 1, univariate ANOVAs yielded significant cluster effects for all indicators.
Student performance across clusters was differentiated by examining the pattern of Z-score means on the indicators (high: Z > 0.5; moderately high: 0 < Z < 0.5; moderately low: −0.5 < Z < 0; low: Z < −0.5). As shown in Figure 1, students in cluster 2 performed uniformly low, while students in cluster 4 performed uniformly well. Cluster 1 students performed relatively well in all areas except attending evaluation. Students in cluster 3 had relatively low performance on paper-and-pencil indicators: NBME PSE and OSCE differential diagnosis and observation. Students in cluster 5 did well on NBME PSE and attending evaluation but relatively poorly on the OSCE.
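The Z-score bands used to describe the clusters can be written as a small lookup, sketched here. The text leaves the exact boundary values (0 and ±0.5) unassigned, so how they are handled is an assumption of this sketch, and the function name and example profile are hypothetical rather than taken from Figure 1.

```python
def performance_band(z_mean):
    """Map a cluster's mean Z-score on one indicator to a descriptive band.

    Assumption: boundary values fall into the lower adjacent band.
    """
    if z_mean > 0.5:
        return "high"
    if z_mean > 0:
        return "moderately high"
    if z_mean > -0.5:
        return "moderately low"
    return "low"


# Made-up profile for illustration only (not values from the study).
example_profile = {"NBME PSE": 0.7, "OSCE mechanics": -0.2, "Attending": -0.9}
for indicator, z in example_profile.items():
    print(indicator, performance_band(z))
```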
Performance Cluster Membership and Rotation Period
As shown in Table 2, the proportion of students represented in each cluster as a function of rotation period was not significantly different, χ²=37.3, df=28, p=0.11, V = 0.10. Moreover, the proportion of students receiving grades of honors, near honors, and pass on the psychiatry rotation did not vary significantly as a function of period, χ²=13.8, df=14, p=0.47, V = 0.09.
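For readers who want to see how a chi-square test of association of this kind is assembled, the following sketch computes the Pearson statistic and its degrees of freedom from a table of counts. It uses only toy counts, since the Table 2 data are not reproduced here, and it does not compute a p-value, which would require a chi-square distribution function from a statistics library. The function name is hypothetical.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df


# Toy 2 x 2 table (not study data).
print(chi_square_independence([[10, 20], [20, 10]]))

# A 5-cluster by 8-period table has (5 - 1) * (8 - 1) = 28 degrees of freedom.
print(chi_square_independence([[1] * 8 for _ in range(5)])[1])
```

The degrees of freedom follow directly from the table dimensions, which is why the cluster-by-period test has df=28 and the grade-by-period test (three grades by eight periods) has df=14.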
Rotation Period Preference, Performance Cluster Membership, and Rotation Period
Of the 495 students for whom rotation preference was available, 50.9% (N=252) completed psychiatry in their preferred period. These 252 students were compared with those who did not take psychiatry in their preferred period (N=243) on the association between cluster membership and rotation period. To maximize sample size, the eight periods were collapsed into quarters. Analysis indicated no significant association between cluster membership and psychiatry rotation quarter for students who took psychiatry at their preferred time (χ²=8.1, df=12, p=0.77, V = 0.10) or for those who took it at a nonpreferred time (χ²=12.4, df=12, p=0.41, V = 0.13).
Residency Matching, Performance Cluster Membership, and Rotation Period
Of the 686 students for whom residency match data were available, only 23 (3.4%) matched in psychiatry. Analysis indicated no significant association between cluster membership and psychiatry rotation timing for students who did not match in psychiatry (χ²=20.2, df=12, p=0.07, V = 0.10). This analysis could not be done for students who matched in psychiatry because of the small sample. When the variables were examined separately, analysis yielded no significant relationship between match (psychiatry versus other) and cluster membership (χ²=1.6, df=4, p=0.82, V = 0.05) or match and rotation period (χ²=3.2, df=3, p=0.37, V = 0.07).
Performance Trend Analyses and Rotation Period
Trend analyses evaluated timing effects for each indicator separately. As shown in Figure 2, attending evaluations, OSCE PPQ, and OSCE observation scores showed no significant trend. National Board of Medical Examiners PSE showed a positive linear trend (p<0.05; η²=0.007) and a cubic trend (p<0.05; η²=0.005). The cubic trend, however, was essentially a function of the drop from period 7 to period 8. Objective Structured Clinical Examination mechanics showed a negative linear trend (p<0.05; η²=0.004). Objective Structured Clinical Examination differential diagnosis indicated a cubic trend (p<0.01; η²=0.011), although the trend here was also primarily a function of the drop from period 7 to period 8. While statistically significant, the effect sizes for these trends were very weak.
This study found little support for the rotation timing effect in psychiatry, which is consistent with the limited literature for psychiatry clerkships (6, 14). Across 6 years and eight periods, students were clustered into homogeneous groups that varied in pattern of performance across different indicators. Students in the highest performing and lowest performing clusters were no more likely to appear in any one of the eight periods than in another. While NBME PSE scores increased slightly in later periods, period 8 scores were comparable to scores in periods 1–3. Additionally, NBME performance tended to be balanced by other indicators, including OSCE mechanics and differential diagnosis scores. Attending evaluations and other OSCE indicators had no statistically discernible trends.
This study also found that any timing effects were weak. The NBME PSE trend accounted for less than 1% of the variance. This is consistent with, albeit somewhat lower than, other published data (2, 19). The weak effect sizes and the lack of association between timing and overall performance suggest that psychiatry clerkship performance was essentially unrelated to timing in this sample.
The data also indicate little systematic variation in performance trends attributable to rotation order preference and specialty choice. As a proxy for interest in a discipline, residency matching did not vary systematically as a function of preference and was not associated with performance cluster membership. Thus, specialty match does not appear to be associated with preference for psychiatry rotation timing or student rotation performance, although the majority of students may ultimately match in the discipline they are considering at the beginning of their clinical year (28).
The obvious limitation of this study is that it provides data for psychiatry in one school of medicine. Further, the OSCE and attending evaluation measures used were unique to the psychiatry department studied. As a result, the current findings need to be replicated in other programs and with other performance indicators before they can be generalized (although the results are generally consistent with other studies and the effect sizes are comparable). The literature on timing effects may also benefit from future explication of the role of preference order and other student-level variables on clerkship performance throughout the clinical years. The large drop in NBME performance from period 7 to period 8, for example, merits explanation.
In summary, it is perhaps surprising that performance did not increase linearly over time, given that completion of previous rotations was expected to enhance students’ clinical experience and skills. Rather, performance was distributed relatively evenly across periods, with little significant variance attributable to timing, order preference, or specialty choice. From a practical perspective, the current data indicate that less concern over clerkship timing effects may be warranted among psychiatry clerkship directors, especially when multiple methods of evaluation are used to assess performance. Adjusting student evaluation data for the timing of the clerkship appears unnecessary or, at the most (given the weak effect sizes found here), only a minor point of consideration in the design of performance assessments for third-year clerks in psychiatry.