
The Journal of TRAUMA® Injury, Infection, and Critical Care

Improving the Glasgow Coma Scale Score: Motor Score Alone Is a Better Predictor
C. Healey, MD, Turner M. Osler, MD, Frederick B. Rogers, MD, Mark A. Healey, MD, Laurent G. Glance, MD,
Patrick D. Kilgo, MS, Steven R. Shackford, MD, and J. Wayne Meredith, MD

Background: The Glasgow Coma Scale (GCS) has served as an assessment tool in head trauma and as a measure of physiologic derangement in outcome models (e.g., TRISS and Acute Physiology and Chronic Health Evaluation), but it has not been rigorously examined as a predictor of outcome.

Methods: Using a large trauma data set (National Trauma Data Bank, N = 204,181), we compared the predictive power (pseudo R², receiver operating characteristic [ROC]) and calibration of the GCS to its components.

Results: The GCS is actually a collection of 120 different combinations of its 3 predictors grouped into 13 different scores by simple addition (motor [m] + verbal [v] + eye [e] = GCS score). Problematically, different combinations summing to a single GCS score may actually have very different mortalities. For example, the GCS score of 4 can represent any of three mve combinations: 2/1/1 (survival = 0.52), 1/2/1 (survival = 0.73), or 1/1/2 (survival = 0.81). In addition, the relationship between GCS score and survival is not linear, and furthermore, a logistic model based on GCS score is poorly calibrated even after fractional polynomial transformation. The m component of the GCS, by contrast, is not only linearly related to survival, but preserves almost all the predictive power of the GCS (ROC_GCS = 0.89, ROC_m = 0.87; pseudo R²_GCS = 0.42, pseudo R²_m = 0.40) and has a better calibrated logistic model.

Conclusion: Because the motor component of the GCS contains virtually all the information of the GCS itself, can be measured in intubated patients, and is much better behaved statistically than the GCS, we believe that the motor component of the GCS should replace the GCS in outcome prediction models. Because the m component is nonlinear in the log odds of survival, however, it should be mathematically transformed before its inclusion in broader outcome prediction models.

Key Words: Glasgow Coma Scale, Predictive power, Outcome.
J Trauma. 2003;54:671–680.

Submitted for publication October 9, 2002.
Accepted for publication January 6, 2003.
Copyright © 2003 by Lippincott Williams & Wilkins, Inc.
From the Department of Surgery, University of Vermont, College of Medicine (C.H., T.M.O., F.B.R., M.A.H., S.R.S.), Burlington, Vermont; Department of Anesthesia, University of Rochester (L.G.G., P.D.K.), Rochester, New York; and Department of Surgery, Wake Forest University School of Medicine (J.W.M.), Winston-Salem, North Carolina.
Presented at the 61st Annual Meeting of the American Association for the Surgery of Trauma, September 26–28, 2002, Orlando, Florida.
Address for reprints: Turner M. Osler, MD, FACS, Department of Surgery, University of Vermont, 111 Colchester Avenue, FL 466, Burlington, VT 05401; email: turner.osler@vtmednet.org.
DOI: 10.1097/01.TA.0000058130.30490.5D

The Glasgow Coma Scale (GCS)1 was introduced a quarter of a century ago and is now a part of the bedrock of outcome prediction after head injury. Created to be reliably used even by workers without specialized training, the GCS seems straightforward: it is simply the sum of three coded values that describe a patient’s motor (1–6), verbal (1–5), and eye (1–4) level of response to speech or pain. Since its creation, the GCS has been widely used. Not only is it used to describe individual trauma patients in the ambulance, the emergency room, and the intensive care unit, but it is also used as a component of several other outcome prediction scores: the Revised Trauma Score;2 Acute Physiology and Chronic Health Evaluation;3 TRISS;4 Circulation, Respiration, Abdomen, Motor, Speech Scale; and A Severity Characterization of Trauma5 all use the GCS as a predictor.

Given the fundamental importance of the GCS, it may seem remarkable that this score has never been subjected to careful statistical evaluation. However, the GCS was accepted as a useful description of consciousness and powerful predictor of outcome long before large databases required for rigorous statistical analysis were available. Although many authors (its creators among them) have noted shortcomings in the GCS, including the inability to calculate the GCS score for many patients6,7 and poor statistical performance,8 we believe a reappraisal is in order. The data available in the National Trauma Data Bank (NTDB) provide a powerful tool for this analysis. We hypothesized that most of the power of the GCS resides in the motor component and that the addition of the verbal and eye components adds little to the predictive power of the GCS. Moreover, we thought it likely that the addition of the verbal and eye subscores would undermine useful mathematical characteristics of the motor-only model of consciousness.

PATIENTS AND METHODS
The American College of Surgeons established the NTDB as a national repository of trauma data. It contains information supplied by 89 hospitals from around the country representing Level I, II, and III trauma centers. Although the NTDB includes a wide variety of information for each case, for this study only GCS subscores (motor [m], values coded 1–6 or blank; verbal [v], values coded 1–5 or blank; and eye [e], values coded 1–4 or blank) and outcome (survival to hospital discharge or death) were available. Cases were collected between 1994 and 2001 and represented all age groups.
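As a minimal sketch of the coding just described (an illustration, not the authors' code), the record below treats a blank subscore as `None` and returns a GCS total only when all three subscores are present. The names `GcsRecord`, `RANGES`, and `total` are hypothetical and introduced here only for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Valid coded ranges for each subscore, as given in the text.
RANGES = {"m": (1, 6), "v": (1, 5), "e": (1, 4)}


@dataclass(frozen=True)
class GcsRecord:
    """One case: each subscore is an integer or None (blank in the data)."""
    m: Optional[int]
    v: Optional[int]
    e: Optional[int]

    def __post_init__(self) -> None:
        # Reject any recorded value that falls outside its coded range.
        for name, (lo, hi) in RANGES.items():
            value = getattr(self, name)
            if value is not None and not lo <= value <= hi:
                raise ValueError(f"{name}={value} outside {lo}-{hi}")

    def total(self) -> Optional[int]:
        """Return m + v + e, or None if any subscore is blank."""
        if None in (self.m, self.v, self.e):
            return None
        return self.m + self.v + self.e


if __name__ == "__main__":
    print(GcsRecord(2, 1, 1).total())     # mve combination 2/1/1
    print(GcsRecord(6, None, 4).total())  # verbal subscore blank
```

Returning `None` rather than substituting a minimum or maximum value keeps the missing-subscore cases visible instead of silently folding them into a particular total.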
We examined the individual predictor elements of the GCS for availability in the NTDB, linearity with respect to survival, and linearity in the log odds of survival (fractional polynomial analysis,9 see Appendix). Survival models were then constructed using the subscores of the GCS (m, v, and e) individually and as simple sums (m + v, m + v + e = GCS score). Although the focus of this analysis was on the discrimination rather than the calibration of these models, mathematical transformations of predictors in these models were also examined using the technique of fractional polynomial analysis10 to optimize calibration (see Appendix). Survival models were evaluated for discrimination (receiver operating characteristic [ROC] curve area), overall percentage of variability explained by the model (pseudo R²),11 misclassification rate, and calibration, which was assessed with the Pearson χ² statistic. (The Hosmer-Lemeshow statistic was not examined because several models had only a few different covariate patterns, a circumstance that so reduced the power of the Hosmer-Lemeshow statistic as to make it meaningless.12) In addition, Akaike’s information criteria13 were computed as a measure of overall information for each model. Ninety-five percent confidence intervals for ROC values were calculated using a resampling algorithm, and the statistical significance of differences between ROC statistics was calculated using the nonparametric method of Hanley and McNeil.14 All analyses were performed using Intercooled STATA version 7.0 (College Station, TX). Model evaluation was performed using SPost commands.15

RESULTS
The NTDB data set used in this analysis included a total of 204,181 cases. However, 177 eye subscores, 1,583 verbal subscores, and 673 motor subscores were unrecorded. As a result, a total of 1,926 GCS scores were unavailable. Overall mortality for the data set was 6%, but for the 1,926 patients for whom the GCS score was unavailable mortality was higher (9.8% for any missing subscore, 6.84% for a missing motor score, 12.99% for a missing eye score, and 9.92% for a missing verbal score). Of the 1,926 patients with no available GCS score, 675 patients also had no motor score (mortality, 6.84%) and 1,253 patients did have a motor score available (mortality, 11.41%). The remainder of this analysis is based on the 202,255 patients for whom all GCS subscores were available.

Eighty percent of GCS scores were 15. GCS scores of 14 and 3 were also frequent, with approximately 6% of observations in each group. GCS scores from 4 through 13 (inclusive) were much less common, with less than 1% of observations in most groups (Fig. 1).

Fig. 1. The frequency of various GCS scores in the data set. Note that almost 80% of patients in the NTDB had a GCS score of 15. GCS scores of 14 and 3 were also common (approximately 6% each), but other scores occurred in less than 1% of cases.

In reality, every GCS score except 3 and 15 is a collection of several different possible groupings of subscores. Thus, a GCS score of 4 may be achieved by any of the following motor-verbal-eye (mve) groupings: 2/1/1, 1/2/1, or 1/1/2. A GCS score of 9 can be attained by any of 18 different mve combinations. Overall, 120 different mve combinations are possible, but this multitude of combinations collapses into just 13 different GCS scores (3–15 inclusive) by the simple convention of summing the subscores. Unfortunately, we found that different mve combinations resulting in a single GCS score may have very different mortalities. Thus, although the mve combination 1/2/1 (GCS score of 4) has a mortality of 28%, the mve combination 2/1/1 (GCS score of 4) has a mortality almost twice as high (52%) (p < 0.001, Fisher’s exact test). Such discrepancies are not rare in the GCS. In fact, every GCS score in this data set has such discrepancies that are statistically significant except 3 and 15 (by definition) and 6, 12, and 13 (Fig. 2).

Survival proved to be a very nonlinear function of the GCS. Moreover, survival was nonlinear for each of the GCS subscores except the motor subscore, which in contrast was strikingly linear (Fig. 3). Neither the GCS score nor its motor component score was linear in the log odds of survival, but the motor score was far less irregular (Fig. 4).
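To make "linear in the log odds" concrete, the following sketch (an illustration, not part of the original analysis) converts a survival proportion to log odds, the scale on which a logistic model assumes a predictor acts linearly. It uses only the two mortalities reported above for mve combinations summing to a GCS score of 4; the helper name `log_odds` is an assumption made for this example.

```python
import math


def log_odds(p_survival: float) -> float:
    """Log odds (logit) of survival for a proportion strictly between 0 and 1."""
    if not 0.0 < p_survival < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return math.log(p_survival / (1.0 - p_survival))


# Mortalities reported in the text for two mve combinations that both sum to 4.
mortality = {"1/2/1": 0.28, "2/1/1": 0.52}

if __name__ == "__main__":
    for mve, died in mortality.items():
        survived = 1.0 - died
        print(f"mve {mve}: survival {survived:.2f}, "
              f"log odds {log_odds(survived):+.2f}")
```

A survival proportion of 0.5 maps to log odds of zero, so the two combinations fall on opposite sides of zero even though they share one GCS score.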


Fig. 2. The 120 possible combinations of GCS subscores with their survival rates (and 95% confidence interval) grouped by GCS score.
Although the GCS is commonly thought of as 13 possible scores ranging from 3 to 15, it is actually 120 different possible combinations of
its component subscores that are grouped into 13 individual “scores” by the simple expedient of addition. Unfortunately, different
combinations of subscores that sum to the same GCS score often have very different survival rates.
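The counts in this caption follow directly from the subscore ranges. As a quick check (an illustrative sketch, not the authors' code; `combos` and `per_score` are names introduced here), enumerating every mve combination gives 120 combinations and 13 distinct sums, with 3 combinations for a GCS score of 4 and 18 for a GCS score of 9.

```python
from collections import Counter
from itertools import product

# Coded ranges: motor 1-6, verbal 1-5, eye 1-4.
combos = list(product(range(1, 7), range(1, 6), range(1, 5)))

# Number of distinct (m, v, e) combinations behind each summed GCS score.
per_score = Counter(m + v + e for m, v, e in combos)

if __name__ == "__main__":
    print(len(combos), "combinations,", len(per_score), "distinct GCS scores")
    for score in sorted(per_score):
        print(f"GCS {score:2d}: {per_score[score]:2d} combination(s)")
```

Only the extreme scores, 3 and 15, correspond to a single combination, which is why every other score can mix patients with different subscore profiles.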

The ability of the GCS to predict survival was evaluated and compared with models based on its individual component subscores (m, v, and e) and the sum of its motor and verbal scores (m + v) to determine where the predictive power of GCS arises. Two further survival prediction models, one based on GCS (“fracpoly GCS model”) and one based on the motor component of GCS alone (“fracpoly m model”), were also created using the technique of fractional polynomial analysis to improve the fit of these single predictors. Finally, a model containing all GCS subscores and their interaction terms appropriately transformed using the technique of multiple fractional polynomials combined into a single predictor (“multiple fracpoly GCS model”) was created and evaluated. This final model represents the state-of-the-art approach to the three predictors available in the GCS. The performance of these eight models is presented in Table 1. We note first that removing the eye component from the GCS results in a model (m + v) indistinguishable from the GCS (m + v + e): not only are the ROC and pseudo R² values essentially the same for these two models, but the smaller, more parsimonious m + v model is actually slightly better calibrated. The further elimination of the verbal subscore from the GCS leaves the “motor-only score” model. This further simplification results in a small but statistically significant decrease in performance: ROC falls (ROC_GCS = 0.891 vs. ROC_m = 0.873, p < 0.001), as does pseudo R² (pseudo R²_GCS = 0.416 vs. pseudo R²_m = 0.403), and misclassifications increase (GCS = 4.9% vs. m = 5.1%).

Table 1 Comparison of Prediction Models

Model | Covariate Patterns | Transformations | ROC (95% CI) | Misclassification (%) | Pseudo R² | AIC | Pearson’s χ² (df)
m | 6 | none | 0.873 (0.870–0.875) | 5.1 | 0.403 | 0.270 | 604.7 (4)
v | 5 | none | 0.881 (0.875–0.883) | 6.0 | 0.382 | 0.279 | 224.2 (3)
e | 4 | none | 0.858 (0.854–0.862) | 6.0 | 0.376 | 0.282 | 138.8 (2)
m + v | 10 | none | 0.890 (0.886–0.893) | 5.0 | 0.418 | 0.263 | 324.3 (8)
GCS score (m + v + e) | 13 | none | 0.891 (0.888–0.894) | 4.9 | 0.416 | 0.263 | 656.6 (11)
Fractional polynomial m | 6 | m^(−1/2), m^3 | 0.873 (0.870–0.875) | 5.1 | 0.408 | 0.269 | 30.8 (3)
Fractional polynomial GCS | 13 | GCS^(−2), GCS^3 | 0.891 (0.888–0.894) | 4.9 | 0.420 | 0.262 | 313.6 (10)
f (m, v, e, me, mv, ev, mev) | 118 | m^(−3), v^(−2), me^(−1/2), mv^3, ev^(−2), mev^2 | 0.891 (0.888–0.894) | 4.9 | 0.424 | 0.260 | 151.3

ROC, receiver operating characteristic; CI, confidence interval; AIC, Akaike’s information criteria; df, degrees of freedom.

Elimination of all patients with a GCS score of 15 from the data set resulted in worse performance of all models, but the relative rankings of models were largely unchanged. However, the difference in ROC values for the m and GCS scores was reduced by 50%. This suggests that the improvement in prediction resulting from literally adding v and e scores to the m score is largely attributable to the superior discrimination of the GCS in patients with a normal level of consciousness (data not shown).

Fig. 3. Survival as a function of the eye, verbal, and motor scores, and their sum, the GCS score. Note that both the eye and verbal scores are distinctly nonlinear and that this is reflected in the GCS score. The motor score, by contrast, is very linear.

Fig. 4. Neither the GCS score nor its motor component are linear in the log odds of survival. However, the motor score is far less irregular.

All models are poorly calibrated as assessed by the Pearson χ² statistic, but calibration is improved by mathematically transforming predictors using the technique of fractional polynomials before creating a logistic model. For example, the logistic model based on m alone can be transformed by including the inverse square root of m and the third power of m in the logistic model with considerable
improvement in calibration. Although this transformed model is still not “well calibrated” (i.e., Pearson χ² value of 30 on 3 degrees of freedom results in a value of p < 0.001), to casual examination calibration is quite good (Fig. 5). Transformation of the GCS score using the inverse square of the GCS score and the third power of the GCS score is much less successful because of the nonmonotonic nature of the GCS score (Fig. 6).

Fig. 5. The motor-only logistic model (dashed line) is poorly calibrated, but mathematical transformation of the motor score greatly improves calibration (dotted line). The solid line represents perfect calibration (predicted mortality = actual mortality).

Fig. 6. The GCS logistic model (dashed line) is poorly calibrated, but mathematical transformation of the GCS score somewhat improves calibration (dotted line). The solid line represents perfect calibration (predicted mortality = actual mortality).

DISCUSSION
The GCS was developed in 1974 by Teasdale and Jennett as a practical way to measure the “depth and duration of impaired consciousness” in a variety of conditions including head trauma. Simplicity was the overriding design concern, with the goal of interrater reliability even by staff without special training. As originally proposed, the GCS score was reported as three independent subscores (motor, verbal, and eye). The further simplification of recording only the sum of the three components as a single score was adopted by Teasdale and Jennett in 1977.

Interestingly, the creators of the GCS foresaw possible shortcomings in their score. In their original article, they noted that some components of the score might be impossible to assess. Of perhaps greater significance, in their 1977 discussion they observed that “[the] validity of the assumption that each of the three parts of the scale should count equally and that each step should differ equally from that next to it has still to be tested.”

Teasdale and Jennett were not able to evaluate these concerns, perhaps because they did not have large patient databases available to them. Their concerns were well founded, however. Numerous authors have since observed that trauma patients who are inebriated, intubated, or pharmacologically paralyzed cannot have their GCS score assessed. This is a particularly troublesome problem because it is precisely these patients who are at the highest risk of dying. This problem is compounded by the wide variety of ways it has been “solved” at different trauma centers, such as scoring components as the minimum possible value, the maximum possible value, as a “T,” or simply as “unknown.”16,17 Attempts at a unified solution, such as imputation of missing values using a linear regression model,18,19 have not been adopted, perhaps because the formula is complicated and may not apply equally well to all case mixes. Other authors have
advocated for the simple replacement of the GCS score with the motor score alone20 or for its use in place of the GCS score in the Revised Trauma Score.21 These last studies are persuasive but were based on small data sets. Still another approach is the use of the Reaction Level Scale,22 which has eight values and resembles an enhanced motor subscore.

We believe that the eye subscore should certainly be removed from the GCS because it adds nothing to the predictive power of the model and is occasionally impossible to obtain. The choice to remove the verbal subscore is more difficult, because the presence of the verbal subscore does improve the model of impaired consciousness at a statistically significant level. Nevertheless, we believe that the verbal subscore should also be removed because its contribution is not great and it is occasionally impossible to assess (e.g., in intubated or inebriated patients). Thus, we advocate the motor subscore as the best choice for a level-of-consciousness indicator. This model preserves most of the power of the GCS while avoiding the problems inherent in collecting the verbal and eye components of the GCS. Moreover, its linearity with respect to survival is far more intuitive and easily remembered than the complex survival graph of the GCS.

There are two circumstances in which the motor-only model is unreliable: in patients with pharmacologic (therapeutic) paralysis and in patients with traumatic paralysis (e.g., high spinal cord injuries). In these cases, the motor score is simply not a measure of consciousness and cannot be used as one. In the case of pharmacologic paralysis, it is a simple matter to allow the drug to wear off before assessing the motor subscore. The case of quadriplegia is more difficult to deal with, but it is possible that a standardized group of facial responses to voice and noxious stimuli can be developed with only slight loss of accuracy overall.

Although the motor subscore is a powerful predictor of mortality, we do not expect it to be used in isolation, because other predictors (e.g., injury severity, age, comorbidities) are also known to influence outcome. Rather, the motor subscore will be used as a component of a larger, more comprehensive survival model. When the motor subscore is incorporated into such a model, however, it will be important that it first be mathematically transformed to be linear in the log odds of survival, because this is a condition of the logistic model.

This study has limitations. Most importantly, it is based on a cohort of trauma patients in whom all GCS subscores were recorded in the NTDB, and therefore its applicability to patients in whom not all subscores were available is not certain. Moreover, in patients in whom all subscores were assigned, the procedures for assigning scores are likely to have been subject to local conventions that were almost certainly not uniform. These two problems may have biased the results of this study in unpredictable ways. Nevertheless, this study represents by far the largest and most comprehensive examination of the GCS available to date.

CONCLUSION
In summary, the GCS is composed of three subscores that contain redundant information. The simple addition of these subscores to create the GCS, although convenient, results in a nonlinear relationship between the GCS score and mortality. We found that the motor component of the GCS score is a powerful predictor of outcome and contains most of the predictive power of the score. The addition of the verbal subscore adds slightly to the predictive power, but the further addition of the eye subscore (resulting in the familiar GCS score) adds nothing to predictive power. We believe that a motor subscore-only model of level of consciousness is the most practical because the verbal score may be impossible to
obtain in seriously injured patients. Adding to the appeal of the motor-only model is its near linearity with respect to mortality. Quadriplegia is a potentially problematic injury because it naturally confounds the motor-only score. Six levels of the motor score based on physical examination of the face will need to be defined to make the motor-only score universally calculable. Finally, the motor-only score will require mathematical transformation to ensure linearity in the log odds of survival before incorporation into more comprehensive logistic models of survival.

APPENDIX
Discrimination and Calibration of Survival Models
Survival predictions are based on mathematical models that take the values of one or more predictors and allow the calculation of the outcome of interest (typically, death in trauma outcome models). There are many possible prediction models, and so the business of selecting the “best” model is of obvious interest. Unfortunately, measuring how well a predictive model performs is not straightforward. Two broad measures of models are discrimination and goodness-of-fit.

Discrimination is the degree to which a model separates survivors from nonsurvivors. This is usually quantified as the area under the ROC curve. This area varies from 0.5 (separation of survivors from nonsurvivors is no better than chance alone) to 1.0 (perfect separation of survivors from nonsurvivors). Although the actual calculation of the ROC curve area is not straightforward, in principle it could be calculated by repeatedly randomly selecting a survivor and a nonsurvivor and recording whether the model under consideration correctly predicts the survivor. The percentage of randomly selected pairs correctly predicted by the model is its ROC curve statistic. Although the ROC curve is conceptually simple, in practice its use requires some care. For example, the actual value of the ROC curve area for a model depends on the distribution of cases in the data set. Thus, in order for two different models to be compared with respect to ROC curve area, the identical data set must be evaluated by each model. As an additional complexity, the results of such model comparisons may actually change with different data sets, again depending on case mix.

Goodness-of-fit tests attempt to capture how well a model predicts outcome throughout the range of predictions. That is, for a model to have acceptable goodness-of-fit, we wish the differences between predicted and actual outcomes to be small and, furthermore, that errors be unsystematically distributed. Unfortunately, there are many goodness-of-fit tests and no agreement on which is “best.” The oldest such test is the Pearson χ². Because the Pearson χ² tests whether a model’s calibration is indistinguishable from “perfectly calibrated,” the very large data sets used in trauma outcome research almost always have sufficient power to reject the “perfectly calibrated” hypothesis, and thus the judgment “not perfectly calibrated” is not very informative. The Hosmer-Lemeshow test is another goodness-of-fit test that has been used extensively in trauma research. Unfortunately, because this test requires grouping the data into at least 6 (and preferably 10) groups based on outcome score, for scores such as the m, v, and e subscores of the GCS it is simply not applicable. Moreover, when most patients’ predictors fall into a single covariate pattern, it can be impossible to partition the data to allow the calculation of the Hosmer-Lemeshow statistic.

For the purposes of this article, we chose to examine three measures of goodness-of-fit: overall misclassification rate, McFadden’s R², and Akaike’s information criterion. We selected overall misclassification because it is unambiguous in its interpretation and easily calculated: one simply finds a cutpoint in the score that minimizes misclassifications and reports this rate. We chose to report McFadden’s (pseudo) R² because of its analogy to R² in linear regression, where R² represents the percentage of variability explained by a model. Although this interpretation of the pseudo R² is not strictly correct for logistic models, because the pseudo R² values reported here are all calculated using the same data set, it is appropriate to compare this statistic between models. Finally, we chose to examine Akaike’s information criterion, which examines the amount of information contained in a score based on the likelihood of the data observed given the model under consideration, corrected for the number of predictors in the model. Although the absolute level of Akaike’s information criterion is uninformative, lower values imply more information conveyed by a model.

We believe that the most informative and convenient approach to goodness-of-fit is a simple graph of predicted survival versus actual survival for each covariate pattern. Not only does this single graph display the performance of a model throughout its range, but it is also immediately interpretable: the closer such a line lies to the diagonal, the more reliable the prediction model. Another attractive feature of this graphical approach to goodness-of-fit is that it is not affected by case mix.

Fractional Polynomial Analysis
A fractional polynomial (FP) is a polynomial whose powers are integers or fractions that may be positive, zero, or negative. Introduced by Royston and Altman in 1994,23 FPs are extremely useful in regression models because they offer greater flexibility than ordinary polynomials. In multivariate regression models, FPs allow us to preserve the continuous nature of predictor variables even when such variables are originally nonlinear and thus allow the creation of better calibrated models. Alternatively, if a model based on FP transformations fails to improve on untransformed predictors, we can be assured that the regression is in fact linear in its predictors.9

As implemented in STATA, a back-fitting algorithm is used that finds a fractional polynomial transformation for each predictor in turn, while holding the functional forms of the other predictors temporarily fixed. The algorithm converges when the functional forms of the predictors do not change.

REFERENCES
1. Teasdale G, Murray G, Parker L, Jennett B. Adding up the Glasgow Coma Score. Acta Neurochir Suppl (Wien). 1979;28:13–19.
2. Champion HR, Sacco WJ, Copes WS, et al. A revision of the trauma score. J Trauma. 1990;29:623–631.
3. Knaus WA, Draper EA, Wagner DP, et al. APACHE II: a severity of disease classification system. Crit Care Med. 1985;13:818–826.
4. Boyd CR, Tolson MA, Copes WS. Evaluating trauma care: the TRISS method. J Trauma. 1987;27:370–381.
5. Champion HR, Copes WS, Sacco WJ, et al. A new characterization of injury severity. J Trauma. 1990;30:539–543.
6. Segatore M, Way C. The Glasgow Coma Scale: time for change. Heart Lung. 1992;21:548–551.
7. Marshall LF, Becker DP, Bowers SA, et al. The national traumatic coma data bank: part I: design, purpose, goals, and results. J Neurosurg. 1983;59:267–279.
8. Teoh LSG, Gowardman JR, Larsen PD, et al. Glasgow Coma Scale: variation in mortality among permutations of specific total scores. Intensive Care Med. 2000;26:157–161.
9. Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. New York: John Wiley & Sons; 2000:100–103.
10. Royston P, Altman DG. Regression using fractional polynomials of continuous covariates: parsimonious parametric modeling. Appl Stat. 1994;43:429–467.
11. Nagelkerke NJD. A note on a general definition of the coefficient of determination. Biometrika. 1991;78:691–692.
12. Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. New York: John Wiley & Sons; 2000:151.
13. Akaike H. Information theory and an extension of the maximum likelihood principle. In: Petrov B, Csaki F, eds. Second International Symposium on Information Theory. Budapest: Akademiai Kiado; 1973:267–281.
14. Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology. 148:839–843.
15. Long JS, Freese J. Regression Models for Categorical Dependent Variables Using STATA. College Station, TX: Stata Press; 2001:63–96.
16. Buechler MC, Blostein PA, Koestner A, Hurt K, Schaars M, McKernan J. Variation among trauma centers’ calculation of Glasgow Coma Scale score: results of a national survey. J Trauma. 1998;45:429–432.
17. Marion DW, Carlier PM. Problems with initial Glasgow Coma Scale assessment caused by prehospital treatment of patients with head injuries: results of a national survey. J Trauma. 1994;36:89–95.
18. Rutledge R, Lentz CW, Fakhry S, Hunt J. Appropriate use of the Glasgow Coma Scale in intubated patients: a linear regression prediction of the Glasgow verbal score from the Glasgow eye and motor scores. J Trauma. 1996;41:514–522.
19. Meredith W, Rutledge R, Fakhry SM, Emery S, Kromhout-Schiro S. The conundrum of the Glasgow Coma Scale in intubated patients: a linear regression prediction of the Glasgow verbal score from the Glasgow eye and motor scores. J Trauma. 1998;44:839–845.
20. Jagger J, Jane JA, Rimel R. The Glasgow coma scale: to sum or not to sum? [letter]. Lancet. 1983;2:97.
21. Offner PJ, Jurkovich GJ, Gurney J, Rivara FP. Revision of TRISS for intubated patients. J Trauma. 1992;32:32–35.
22. Starmark JE, Stalhammar D, Holmgren E. The reaction level scale (RLS 85). Acta Neurochir (Wien). 1988;69:699–760.
23. Royston P, Altman DG. Regression using fractional polynomials of continuous covariates: parsimonious parametric modeling. Appl Stat.

DISCUSSION
Dr. Randall M. Chesnut (Portland, Oregon): Having spent a bit of my training time in Glasgow and gotten to know fairly well Brian Jennett and Graham Teasdale, I could tell you that even the authors are quite impressed by how far this little scale got, which was essentially developed over a summer evening drinking a wee bit of Scotch.

This article is a statistical tour de force that represents the first thorough analysis of the mathematical behavior and predictive value of the Glasgow Coma Scale and its component scores in a large database from the general trauma population. On the basis of their excellent statistical methodology, the authors conclude that the vast majority of the useful predictive value of the GCS lies in the motor component; therefore, the verbal and eye subscales should be discarded.

Although I find this article very well done, and I believe that their results accurately reflect the data, I’m inclined to exercise significant caution with respect to generalizing the conclusions. I have three criticisms of this article, the first two of which are fairly predictable.

This article relies critically on the validity of the data analyzed. Unfortunately, as shown multiple times in the literature, the GCS score is notoriously difficult to collect reliably. Optimally, it is collected by trained observers in the absence of drugs after complete physiologic resuscitation. Unfortunately, nowhere in this article is described the reliability of the GCS values collected by many different people at multiple centers. Would the authors please fill us in on these details.

The second two criticisms revolve around the traumatic brain injury population. Although this study addressed the general trauma population, the GCS was developed for use in patients with suspected neurologic deficits. For the brain injury population, survival to discharge is a terrible index of outcome, because it completely neglects the quality of life in survivorship. I realize that survival to discharge was used here as an endpoint of convenience, given the database available, but I do not believe that any analysis of brain injury–related data can be deemed complete or valid if based solely on mortality figures.

The third criticism is the most important. This study addressed the general trauma population and not a brain injury population. When present, brain injury is highly determinant of outcome. In this article, however, 90% of patients had a GCS score of 15, which is the same GCS score that presumably most of the people in this audience have. Also, the mortality for the 13,114 patients in this study that had a GCS score of 3 was 58%, which is lower than the more expected mortality of 75% to 80% in traumatic brain injury studies. In addition, we know nothing of the intracranial diagnoses of these patients, their pupillary examination, or other relevant indices of neurologic function. Therefore, there is no indica-
1994;43:429 – 467. tion that this study reflects the brain injury population. Be-

678 April 2003


Improving the Glasgow Coma Scale Score

cause the major value of the GCS is in the area of traumatic simplify it into a single measure, you lose information. The
brain injury, any recommendations to change its collection or Glasgow Coma Scale score is a measure of brain injury. My
interpretation must be based on investigations into the trau- concern is that the way the Glasgow Coma Scale is scored
matic brain injury population, per se, using long-term func- across this country, it’s more a global assessment of neuro-
tional outcome. logic injury.
This article makes a very strong statement that the motor Did you consider taking out patients with diagnoses of
subscale of the GCS can and probably should replace the spinal cord injury and seeing how the day-to-day measure-
whole GCS in modeling studies of short-term mortality in the ment of the Glasgow Coma Scale is affected? Could you then
general trauma population. I do not, however, believe that it get best motor score related to brain injury and get rid of the
has any implications regarding the clinical use of the GCS at noise from including spinal cord injuries? Are your findings
this time or in prognostic modeling specific to the traumatic more a reflection of coding based on spinal cord injury and
brain injury population. I hope that further efforts from this not on head injury?
excellent group will allow us to take these steps. Dr. Turner M. Osler (closing): Dr. Chestnut, thank you
Dr. Pascal Udekwu (Raleigh, North Carolina): Having for those insightful thoughts. I, too, am troubled by the
had the opportunity to look at the North Carolina Trauma National Trauma Data Bank because it’s too good to be true:
Registry data set on head injury patients based on Interna- You send some email, and you get 200,000 cases to work on.
tional Classification of Diseases, Ninth Revision codes, I The problem with not having collected the data yourself or
have to say that our conclusions really support those pre- participating in the data collection is that you can never know
sented here today. I think that our results are also strength- what’s going on there.
ened by the fact that a subset of patients had Functional I am extremely troubled by the fact that in the NTDB
Independence Measure scores that provide some support in only 2,000 cases lacked a component of the Glasgow Coma
terms of functional outcome as opposed to mortality alone. Scale. That would be 1%. That’s inconceivable. So, we know
So, I would like to say, wonderful article. Thank you very that this data set must have people filling in blanks, perhaps
much. I look forward to its publication.
according to local protocols that they just assign a 1 or they
Dr. Howard R. Champion (Annapolis, Maryland): I
just assign some of the highest possible number, the lowest
would like to, first of all, state that I agree entirely with your
possible number. We don’t know what these conventions
conclusions, as far as they go. I need to take issue with the
were. They could, of course, bias the data. We have no way
first reviewer in terms of the way in which it may in any way
of knowing in what direction they would have biased the
reflect on Scotch whiskey. The GCS was devised in a pub,
data. However, given the data set, it’s not a problem we can
and is thus more a reflection of Glasgow beer than the fine
solve except by having a better data set. So, this study
Scotch whiskey that is also found in that country.
obviously needs to be repeated.
From a historical point of view, the Glasgow Coma Scale
was developed to measure coma in head-injured patients 24 To comment on the traumatic brain injury population, it
hours or so after injury. When we applied it in prehospital is true that Teasdale and Jennett conceived the Glasgow
care in this country, to get around terms such as “lethargy” Coma Scale to deal with brain surgery and brain injury. They
and “stupor” as the descriptors for head injury, Graham weren’t really thinking about the trauma population. What
Teasdale was actually very upset with that concept. has been discovered, however, is that in severely traumatized
He said, “It wasn’t designed for that” and “should not be patients who are in a shock-like state, their Glasgow Coma
applied in that area.” The questions I have are two: first, when Scale score will fall off.
you take the measure of coma down from the 15-integer As a measure of overall illness, it turns out that the
Glasgow Coma Scale to best motor response, although it Glasgow Coma Scale is quite powerful. Thus, it has been
makes an awful lot of sense, you’re taking the measurement incorporated into our trauma models. That’s not to say that
of coma to five intervals. Most neurosurgeons would say that what we are actually measuring is brain injury. Many patients
is insufficient to characterize head injury in any way in with a depressed Glasgow Coma Scale score have no brain
relationship to any outcome. injury at all; they are merely in shock. The GCS, however, is
The other thing is that the Glasgow Coma Scale relation- a powerful predictor, and it may be that the GCS score is just
ship to mortality differs considerably between blunt and pen- a surrogate for shock.
etrating injury to the head. How do you account for this? I agree that for the day-to-day work of a brain surgeon,
How are you going to address these two issues if you’re having all three components of the GSC may be helpful.
just going to distill it and simplify it? Your proposal has However, looking at a broader trauma population, which is
certain advantages for certain users but big disadvantages for what our mandate was, I think that it’s clear that the motor
others. score contains virtually all of the power. I hope that covers
Dr. K. Dean Gubler (Portland, Oregon): I appreciate your questions.
this article, the presentation, and the amount of work to Dr. Champion, Thank you. Those are both excellent
contemplate. When you combine measures and then try to points. I think that what I’m doing is not for neurosurgeons.


What I’m doing is for trauma demography and outcome prediction. I think it’s perfectly appropriate for neurosurgeons to continue to use whatever scale they want, and I think they are aware that a GCS score of 7 equals a GCS score of 8 equals a GCS score of 9 equals a GCS score of 10 equals a GCS score of 11. They’re aware of that foible, and our mathematical models need to be aware of that foible as well.
I think the blunt-penetrating distinction is an important one, but most of our trauma registries don’t contain very much penetrating head trauma. Therefore, it doesn’t really affect the conclusions of an article that’s looking at such a large number of patients.
Thank you, Dr. Gubler; those are good thoughts. We, too, are concerned about the spinal cord injury problem, because in the case of a quadriplegic, the motor score goes to 1, and of course, this doesn’t reflect anything about the brain. We propose that if we’re going to use the motor score alone, there needs to be a special score for quadriplegics based on just their facial response to noxious stimuli or voice that would fill in the gap. However, quadriplegics don’t represent a large category in anybody’s data set, so that hasn’t affected our overall results.
I’d like to underscore, again, that the Glasgow Coma Scale, although it was conceived to measure brain injury, isn’t measuring brain injury; it’s measuring brain function, and brain function can be affected by many things, including shock. So, the reason the Glasgow Coma Scale works so well as the predictor in general trauma populations is that it’s measuring a lot of different things, not just brain injury. Thank you very much.

