The Global Assessment of Functioning Scale is a standardized clinical scale that can be used by clinicians to measure and monitor their patients’ clinical level of functioning (1). The Global Assessment of Functioning (GAF) uses a 100-point scale and is recorded under Axis V of the multiaxial diagnostic system used in psychiatry. The GAF measures the patients’ clinical status and progress in psychological, social, and occupational functioning.
The current version of the GAF scale is a modified version of the Luborsky Health Sickness Rating scale, introduced in 1962 (2). This was a standardized, 100-point scale that was used by clinicians as a tool to measure their patients’ overall mental health. In 1976, this scale was modified to have a 10-point interval and was reintroduced as the Global Assessment Scale (GAS) (3).
The Diagnostic and Statistical Manual of Mental Disorders, 3rd edition (DSM-III), introduced a five-axis system for recording psychiatric diagnoses and required that the patients’ level of functioning be noted under Axis V. At that time, a seven-point adaptive functioning scale was used to assess social relations, occupational functioning, and use of leisure time (4).
With the introduction of the Diagnostic and Statistical Manual-III-R (DSM-III-R) in 1987, the Axis V scale was modified to a 90-point scale, which resembled the Global Assessment Scale, and was called the Global Assessment of Functioning scale (5). In the Diagnostic and Statistical Manual-IV (DSM-IV), the GAF was revised again to a 100-point scale (6).
The current Diagnostic and Statistical Manual-IV-TR incorporates the same GAF scale used in DSM-IV (1). The GAF scale has become a standard and essential assessment tool in the current practice of psychiatry. Scores are used, for example, to monitor clinical progress, justify the level of care, and determine eligibility for treatment. Insurance providers also use this scale to ascertain insurance benefits, as well as the type and duration of treatment authorized. Further, because of its standardized format, the GAF serves as a useful tool for clinicians to communicate their patients’ clinical status and progress.
Hilsenroth et al. (7) suggest that clinicians can achieve good reliability when using the GAF. In their study, 44 psychiatric outpatients received a GAF score evaluation by 10 graduate students and a trained examiner. The results showed a high intraclass correlation coefficient (0.86), indicating that the GAF was a reliable clinical scale. However, the small sample size and the variation between trainees/students and experienced clinicians were among the limitations of this study.
Even though the GAF is standardized and fairly easy to use, there is significant room for variability and personal interpretation. DSM-III field trials conducted by Spitzer and Forman (8) in 1979 indicated an intraclass correlation coefficient (ICC) of 0.69 for test-retest evaluations. Fernando et al. (9) reported an ICC of 0.49 when the Axis V GAF scale was used to measure psychosocial functioning by multidisciplinary clinical providers on an inpatient service. Russell et al. (10) found 64% agreement between raters of psychopathology using an adaptation of the seven-point scale used in Axis V of DSM-III. Further training and practice in the use of this scale might increase the ICC.
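As a rough illustration of how an ICC summarizes rater agreement, the following Python sketch computes a one-way random-effects ICC from a small table of ratings. The cited studies do not state which ICC form they used, so the one-way form, the function name `icc_oneway`, and all ratings below are assumptions made for illustration only; none of the numbers come from the studies discussed here.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for a table of ratings.

    `ratings` is a list of rows, one row per rated patient, each row
    holding the scores given by k raters (k must be equal across rows).
    """
    n = len(ratings)        # number of rated patients
    k = len(ratings[0])     # number of ratings per patient
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    patient_means = [sum(row) / k for row in ratings]

    # Between-patient mean square: spread of patient means around the grand mean
    msb = k * sum((m - grand_mean) ** 2 for m in patient_means) / (n - 1)

    # Within-patient mean square: disagreement among raters on the same patient
    msw = sum(
        (score - mean) ** 2
        for row, mean in zip(ratings, patient_means)
        for score in row
    ) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical GAF ratings (4 patients x 3 raters); NOT data from any cited study
consistent = [[50, 55, 52], [30, 35, 28], [70, 65, 72], [45, 50, 40]]
noisy = [[50, 70, 35], [30, 55, 20], [70, 45, 80], [45, 65, 25]]

print("ICC, raters largely agree:", round(icc_oneway(consistent), 2))
print("ICC, raters disagree more:", round(icc_oneway(noisy), 2))
```

An ICC near 1 means that most of the variance in scores reflects real differences between patients rather than disagreement among raters; wider spread among raters scoring the same patient pulls the value down.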
Bates et al. (11) assessed the effects of brief training on the application of the GAF. In their study, 31 clinicians were asked to assess the GAF scores for patients in two clinical vignettes. Then these clinicians were provided with a 1-hour training session on the GAF and were asked to reassess the GAF scores for patients in the same two vignettes. Results suggested that clinicians were more likely to assign higher GAF scores to the patients in the vignettes after their brief training (11).
In this study, we compared the GAF scores assigned by the medical students with those assigned by residents and staff psychiatrists for patients in two clinical vignettes.
The objective of the survey was to ascertain:
1. Whether the GAF scores assigned by medical students differed significantly from those assigned by psychiatry residents and staff psychiatrists.
2. Whether reviewing the GAF scoring guidelines decreased the difference between the GAF scores assigned by the students and those assigned by the psychiatry residents and staff psychiatrists.
We designed a questionnaire with two case vignettes for a cross-sectional survey of medical students, residents, and staff psychiatrists. This study was reviewed and approved by the Creighton University Institutional Review Board. Student subjects were identified from those completing their 1-month psychiatry clerkship at the Omaha Veterans Affairs Medical Center during January of 2005. The clerkship is offered to all third-year medical students and some fourth-year students on elective time. It includes 1- to 2-week rotations in the psychiatry inpatient unit, addictions unit, and/or the partial/day hospital unit. Students also attend lectures on psychiatric disorders (one half-day every week for a total of 4 half-days). The didactic lectures do not specifically address GAF scoring.
The comparison group included all psychiatry residents at the combined Creighton University and University of Nebraska psychiatry residency programs, as well as staff psychiatrists at the universities’ psychiatry departments. All subjects were provided with a brief description of the project and invited in person by one of the investigators to participate in a survey at the end of their psychiatry clerkship. Consenting subjects were asked to provide demographic information and then complete a questionnaire about patients in two clinical vignettes. One of the investigators administered the questionnaire to each subject individually.
The first vignette described a patient with major depression who was not suicidal and was thus appropriate to be maintained in outpatient treatment. The second vignette depicted a patient with psychotic symptoms who needed inpatient treatment (Appendix 1). All subjects were asked to estimate the GAF scores for the patients in both vignettes. They were then provided with the printed GAF scoring guide for review and asked to reassess the GAF scores for the same patients in these vignettes.
A total of 61 subjects participated in this survey. Participants were divided into three groups: medical students (N=19), psychiatry residents (N=24), and staff psychiatrists (N=18). We contacted a total of 29 students in person at the end of their month-long psychiatry rotation, 19 of whom completed the survey (response rate of 65%). Of the 25 residents, 24 residents completed the survey (response rate of 96%), and 18 of the 25 staff psychiatrists completed the survey (response rate of 72%). Sixty-four percent (64%) were men and 36% were women. Sixty-nine percent (69%) were white, 3% were African American, and 28% belonged to other races.
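The response rates above follow directly from the reported counts. The short Python sketch below redoes that arithmetic; the variable and function names are illustrative, and the overall rate is derived here from the stated counts rather than reported in the text.

```python
# Counts taken from the participant description above: (completed, approached)
groups = {
    "medical students": (19, 29),
    "psychiatry residents": (24, 25),
    "staff psychiatrists": (18, 25),
}


def response_rate(completed, approached):
    """Percentage of approached subjects who completed the survey."""
    return 100.0 * completed / approached


rates = {
    name: round(response_rate(done, asked), 1)
    for name, (done, asked) in groups.items()
}
total_completed = sum(done for done, _ in groups.values())
total_approached = sum(asked for _, asked in groups.values())
overall = round(response_rate(total_completed, total_approached), 1)

for name, rate in rates.items():
    print(f"{name}: {rate}%")
print(f"overall: {total_completed}/{total_approached} = {overall}%")
```

Note that 19 of 29 works out to about 65.5%, which the text gives as 65%.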
GAF scores for the three groups were initially combined for both vignettes and measures of central tendency were compared.
Analysis of variance (ANOVA) was performed to compare the mean GAF scores for the patients in the two vignettes, before and after participants reviewed the GAF scoring guidelines. Multiple comparisons using the Tukey HSD method showed that, for the patient in Vignette 1, the GAF scores assigned by medical students differed significantly from those assigned by residents (p=0.003) and staff psychiatrists (p=0.016) before reviewing the GAF scoring guidelines. However, there was no significant difference between the GAF scores assigned by residents and staff psychiatrists (p=0.940) for the same vignette. For the patient in Vignette 2, there was no statistically significant difference among the medical students, residents, and staff psychiatrists in the assigned GAF scores before reviewing the GAF scoring guidelines.
After participants were given the chance to review the GAF scoring guidelines, they were asked to rate the patients in the same vignettes again. For the patient in Vignette 1, there was no statistically significant difference between the GAF scores assigned by medical students and residents (p=0.066), but there was a statistically significant difference between those assigned by medical students and staff psychiatrists (p=0.022). There was again no statistically significant difference between the GAF scores assigned by residents and staff psychiatrists (p=0.803).
For the patient in Vignette 2, there were no statistically significant differences among medical students’, residents’, and staff psychiatrists’ (p>0.05) assigned GAF scores after reviewing the GAF scoring guidelines (Tables 1–3).
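The omnibus step of the analysis described above, a one-way ANOVA across the three rater groups, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis: the source does not state which software was used, the function name `one_way_anova_f` is hypothetical, and the scores below are invented placeholders rather than study data. The Tukey HSD follow-up requires the studentized range distribution and is omitted here; statistical packages such as SciPy provide it (`scipy.stats.tukey_hsd`).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented placeholder GAF scores for one vignette; NOT data from this study
students = [60, 65, 62, 58]
residents = [50, 52, 48, 55]
staff = [51, 49, 53, 50]

f_stat, df_b, df_w = one_way_anova_f([students, residents, staff])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F indicates that the group means differ by more than the within-group spread would explain; pairwise tests such as Tukey HSD then identify which groups differ.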
To our knowledge, this is the first study comparing GAF score assessment by medical professionals at different levels of training. Furthermore, we asked the participants to assess the GAF scores for patients in two different vignettes with varying levels of symptom severity. The results indicate that medical students assigned higher GAF scores than residents and staff psychiatrists for the patient in Vignette 1, who presented with a lower level of severity. Even after the students reviewed the GAF scoring guide, their mean scores remained higher. Interestingly, the GAF scores assigned by all three groups for the patient in Vignette 2, who presented with a higher level of severity, were comparable.
The staff psychiatrists’ estimations of the GAF scores were comparable to estimations by the residents, suggesting that there was general consensus on the GAF scores for these patients. The results also suggest that the medical students might have a better grasp of assessing the correct GAF scores for patients with more severe symptoms, who typically would receive lower GAF scores. Interestingly, the assessment of the GAF scores did not change after reviewing the GAF scoring guide, which may differ from previous reports that clinicians can be trained to assess the correct GAF scores after a brief review of the scoring guide.
With workload constraints, medical students are being asked to take more responsibility for their patients’ initial workup, including a complete initial psychiatric assessment with a GAF score assessment. Our results suggest that in our sample, the medical students might not be completely trained to assess the proper GAF scores for patients, even after their month-long psychiatric rotation. Theoretically, a higher GAF estimation could lead to patients receiving a lower level of care than needed. In extreme cases, inpatient psychiatric care may be denied if third-party payers deem that patients are not sick enough for inpatient psychiatric hospitalization. Further, our study suggests that a short review of the GAF scoring guide does little to affect GAF scoring practices among trainees and physicians. This underscores the importance of providing students with ample practice in GAF scoring during their mandatory psychiatric rotation.
Limitations of our study include a small sample size, geographic limitations, and the use of vignettes rather than live standardized patients. In addition, our study is based on only two vignettes, one of which showed that the students’ GAF assessments were comparable to those of the residents and the staff psychiatrists. Finally, this study was completed while the students were attending their rotation at the VA hospital, and we could not include the medical students who receive their psychiatry training at sites other than the Omaha VA Medical Center. All of these might have had an impact on the results. Nevertheless, our results suggest a trend that needs further study in larger samples.
Present techniques to teach medical students proper assessment of GAF scores may not be adequate. More efforts are needed to provide medical students the supervision and training for the assessment of GAF scores.