
Original Paper

Eur Neurol 2010;63:364–369
DOI: 10.1159/000292498
Received: November 12, 2009
Accepted: February 22, 2010
Published online: June 16, 2010

Validation of the FOUR Score (Spanish Version) in Acute Stroke: An Interobserver Variability Study
Luis Idrovo^a, Blanca Fuentes^a, Josmarlin Medina^a, Laura Gabaldón^a, Gerardo Ruiz-Ares^a, María José Abenza^a, María José Aguilar-Amat^a, Patricia Martínez-Sánchez^a, Luis Rodríguez^a, Rubén Cazorla^a, Marta Martínez^a, Alfonso Tafur^b, Eelco F.M. Wijdicks^b, Exuperio Diez-Tejedor^a

^a Department of Neurology, Stroke Unit, La Paz University Hospital, Autonoma University of Madrid, IdiPAZ, Madrid, Spain; ^b Department of Neurology, Division of Critical Care Neurology, Mayo Clinic College of Medicine, Rochester, Minn., USA

Key Words
Glasgow Coma Scale · Full Outline of UnResponsiveness score · Transient ischemic attack · Acute stroke

Abstract
Background: Methods to assess impaired consciousness in acute stroke typically include the Glasgow Coma Scale (GCS), but the verbal component has limitations in aphasic or intubated patients. The FOUR (Full Outline of UnResponsiveness) score, a new coma scale, evaluates 4 components: eye and motor responses, brainstem reflexes and respiration. We aimed to study the interobserver variability of the FOUR score in acute stroke patients. Methods: We prospectively enrolled consecutive patients with acute stroke admitted from February to July 2008 to the stroke unit of our Neurology Department. Patients were evaluated by neurology residents and nurses using the FOUR score and the GCS. For both scales, we obtained paired and total weighted kappa values (Kw) and intraclass correlation coefficients (ICC). NIH stroke scale was also recorded on admission. Results: We obtained a total of 75 paired evaluations in 60 patients (41 cerebral infarctions, 15 cerebral hemorrhages and 4 transient ischemic attacks). Thirty-three (55%) patients were alert, 17 (28.3%) drowsy and 10 (16.7%) stuporous or comatose. The overall rater agreement was excellent in the FOUR score (Kw 0.93; 95% CI 0.89–0.97) with an ICC of 0.94 (95% CI 0.91–0.96) and in the GCS (Kw 0.96; 95% CI 0.94–0.98) with an ICC of 0.96 (95% CI 0.93–0.97). A good correlation was found between the FOUR score and the GCS (ρ = 0.83; p < 0.01) and between the FOUR score and the NIH stroke scale (ρ = –0.78; p < 0.001). Conclusions: The FOUR score is a reliable scale for evaluating the level of consciousness in acute stroke patients, showing a good correlation with the GCS and the NIH stroke scale.
Copyright © 2010 S. Karger AG, Basel

Introduction

Acute stroke may affect brain structures that control consciousness. Initially, consciousness is rarely impaired in acute stroke, unless the lesion is strategically located in the dorsal brainstem or the brainstem shifts from mass effect. Coma or any altered state of consciousness predicts the outcome and thus is an important clinical variable [1–3]. There are only limited methods for the

© 2010 S. Karger AG, Basel
0014–3022/10/0636–0364$26.00/0
Fax +41 61 306 12 34, E-Mail karger@karger.ch, www.karger.com
Accessible online at: www.karger.com/ene

Exuperio Diez-Tejedor, MD, PhD
Department of Neurology, La Paz University Hospital
Paseo de la Castellana 261, Planta 11
ES–28046 Madrid (Spain)
Tel. +34 917 277 189, Fax +34 917 277 444, E-Mail hplapazneuro @ meditex.es

assessment of impaired consciousness in acute stroke, and they typically include the NIH stroke scale and the Glasgow Coma Scale (GCS). The NIH stroke scale broadly defines consciousness as alert, not alert with or without ‘stimulation’, and coma, and subsequently evaluates comprehension and the response to 2 questions. The GCS is the most widely used coma scale, but the verbal component in particular becomes much less reliable in patients with stroke due to dysphasia and dysarthria or is impossible to assess in intubated patients [4–6]. Clearly, there is a need for a more comprehensive coma scale that could be useful in evaluating stroke patients, but prior attempts to modify or replace the GCS have not been successful [7–9]. The Full Outline of UnResponsiveness (FOUR) score, a new coma scale developed in the Mayo Clinic, evaluates 4 components: eye and motor responses, brainstem reflexes and respiration, and has been recently validated in acute neurological patients in an intensive care unit setting [10, 11]. This new scale provides greater neurological detail than the GCS and predicts the outcome [10], but its reliability has not been specifically studied in the stroke unit setting. Our aim was to study the interobserver variability of a Spanish version of the FOUR score in acute stroke patients.

Patients and Methods

We prospectively studied consecutive adult patients (>18 years old) with diagnosis of acute stroke (less than 48 h after the onset of symptoms) admitted to the stroke unit of our Neurology Department. This study was approved by the Clinical Research Ethical Committee (HULP PI-644), and written informed consent for the use of patient data was obtained from the patients or their relatives. The validation process was divided into 2 phases.

Phase 1
This was a 4-week standardized process which included the following steps: first, 2 translators (A.T. and L.I.) independently translated the original FOUR instructions. Then, a third translator (P.M.-S.) created a draft of the Spanish translation of the FOUR score that was subsequently presented to 3 Spanish-speaking physicians for comments and suggestions (G.R.-A., M.J.A. and B.F.), which led to a final Spanish version (fig. 1). The final Spanish version was then translated back into English (J.M.).

Phase 2
We prospectively enrolled consecutive stroke patients between February and July 2008. According to our stroke unit’s hospitalization clinical pathway, we admit acute stroke patients (ischemic/hemorrhagic stroke and transient ischemic attack; TIA) with sudden neurological symptoms of less than 48 h after onset. Patients with more than 48 h since the onset of stroke symptoms were not included in this study analysis. On the day of admission, the patients were evaluated with the GCS, the FOUR score and the NIH stroke scale (only by neurologists with NIH stroke scale certification), and 15 patients were reevaluated with the same scales during hospitalization because of changes in their levels of consciousness (LOC) (‘mental status’).
Patients were classified into 1 of 4 clinical categories (alert, drowsy, stuporous and comatose) using definitions proposed by Ropper [12] and were rated on both scales (FOUR score and GCS) by 2 raters (2 neurology residents, R/R, or 2 nurses, N/N, or the combination of a resident and a nurse, R/N). Each pair of raters performed their examination (the order in which the raters examined the patients was random, with equal probability for either observer to be the first or second) within 1 h of each other, without knowledge of the other’s scores [10]. Six neurology residents (3rd or 4th year) and 6 neuroscience nurses with 2–5 years of experience in stroke patient management participated in the data collection. Raters watched a 15-min instructive presentation on both scales that used videotaped patient examples and a detailed description of the scales’ components. Raters were provided with pocket instruction cards for the FOUR score and scoring sheets with detailed descriptions of both scales, and were asked to rate a few patients before starting the validation process.

Statistical Analysis
For a power of 80% and a CI of 95%, we calculated that a total of 20 evaluations were required for each rater combination. Cronbach’s α was calculated for each score to assess internal consistency, and Spearman’s correlation coefficients between the FOUR score and the GCS were calculated to assess construct validity. For both scales, we measured paired and total weighted kappa (Kw) values and intraclass correlation coefficients (ICC) for overall scores with a CI of 95%. Kappa does not take into account the degree of disagreement between observers, and all disagreement is treated equally as total disagreement. We used Kw values to take into account the inherent incremental severity of the score despite it not being a continuum.
Kappa statistic values of 0.4 or less are considered poor, those between 0.4 and 0.6 are considered fair to moderate, those between 0.6 and 0.8 suggest good interobserver agreement and those greater than 0.8 suggest excellent agreement [13]. SPSS 16.0 for Windows was used for the descriptive analysis, internal consistency, construct validity and ICC calculations, and MedCalc 9.5.2.0 for Windows was used for Kappa statistics.

Results

We enrolled 60 patients with an average age of 69.5 ± 14.4 years (median 75; range 31–88); 36 (60%) were men. None of the patients received sedative drugs or neuromuscular blocking agents prior to neurological evaluation. Neurological evaluation revealed that 33 (55%) patients were alert, 17 (28.3%) were drowsy and 10 (16.7%) were stuporous or comatose. Patients scored an average NIH stroke scale of 11.4 ± 9.1 points (range 0–36). Forty-one (68.3%) patients had a cerebral infarction, and the distribution of ischemic stroke subtypes according to the TOAST classification was: 16 with cardioembolism, 13 with small-vessel occlusion, 6 with large-artery athero-




Fig. 1. a Pocket instruction card of the FOUR score (Spanish version).



ESCALA FOUR
Instrucciones para la valoración de las categorías individuales
Para la respuesta ocular (O), puntuar la mejor respuesta posible (de al menos tres intentos) intentando obtener el mejor nivel de conciencia posible. Una puntuación O4 indica al menos tres movimientos oculares voluntarios. Si se encontrara con los ojos cerrados, el examinador deberá abrírselos y evaluar el seguimiento ocular ayudado con un objeto o con su dedo. El seguimiento ocular con la apertura de un solo ojo será suficiente en casos de edema palpebral o traumatismo facial. Si el seguimiento horizontal está ausente, examine el seguimiento vertical. Alternativamente, deberá demostrar parpadeo (en dos ocasiones) mediante una orden verbal. Esto identificará un síndrome de cautiverio. Una puntuación O3 indica la ausencia de seguimiento voluntario con los ojos abiertos. Una puntuación O2 indica apertura ocular con estímulo verbal. Una puntuación O1 indica apertura ocular con un estímulo doloroso. O0 indica ausencia de apertura ocular con el estímulo doloroso.
Para la respuesta motora (M), puntuar la mejor respuesta posible en los brazos. Una puntuación M4 indica que el paciente realiza por lo menos una de las tres posiciones (señales) de la mano (pulgar arriba, puño, signo de la paz), pudiendo ser realizadas con cualquier mano. Una puntuación M3 indica que el paciente ha tocado la mano del examinador luego que este haya realizado un estímulo doloroso mediante la estimulación de la articulación temporomandibular o del nervio supraorbitario. Una puntuación M2 indica movimientos de flexión de las extremidades superiores. Una puntuación M1 indica respuesta extensora al dolor. Si no hay respuesta motora al dolor o el paciente presenta un estatus epiléptico mioclónico se puntuará como M0.
Para los reflejos del tronco cerebral (T), puntuar la mejor respuesta posible. Examine los reflejos pupilares y corneales. Preferentemente, los reflejos corneales se examinan mediante la aplicación de dos o tres gotas de solución salina en la cornea desde una distancia de 10–12 cm. La estimulación corneal con algodón puede también ser utilizada para este propósito. El reflejo tusígeno provocado por succión traqueal se examina sólo si los reflejos pupilares y corneales están ausentes. Una puntuación T4 indica la presencia de reflejos pupilares y corneales normales. Una puntuación T3 indica una pupila midriática y fija. Una puntuación T2 indica ausencia del reflejo pupilar o corneal (cualquiera de los dos). Una puntuación T1 indica ausencia de ambos reflejos. T0 indica ausencia de reflejos pupilar corneal y tusígeno (utilizando succión traqueal).
Para la respiración (R), evaluar el patrón espontáneo de respiración en un paciente no intubado y puntúe simplemente como regular (R4), o irregular (R2), y respiración de Cheyne-Stokes (R3). En pacientes ventilados mecánicamente, evalúe la onda de presión del patrón espontáneo de respiración o del activador del ventilador mecánico (R1). El monitor del ventilador que muestre los diferentes patrones respiratorios puede ser utilizado para identificar las respiraciones generadas por el paciente con el ventilador mecánico. Mientras el paciente está siendo evaluado, no se deberán realizar ajustes de ventilación, y la puntuación se realizara preferentemente con una PaCO2 dentro de límites de normalidad. Un examen estándar de apnea (difusión de oxígeno) puede ser necesario cuando el paciente respire a la velocidad del ventilador mecánico (R0).

Fig. 1. b Spanish translation of the instructions for the evaluation of the individual categories of the FOUR score.

sclerosis, 4 of other determined etiology and 2 of undetermined etiology. Fifteen (25%) patients had a spontaneous cerebral hemorrhage (5 deep, 4 lobar, 2 massive, 2 cerebellar, 1 brainstem and 1 subarachnoid hemorrhage) and 4 (6.7%) had a TIA. Our 4 patients with TIA included in the study were evaluated on admission after regression of the major neurological symptoms. TIA symptoms usually last less than 1 h; therefore, our TIA patients were alert and asymptomatic at the moment of evaluation. Two patients required endotracheal intubation, and 1 patient was diagnosed with locked-in syndrome. We obtained a total of 75 paired evaluations. Sixty evaluations were collected within 24 h of admission and 15 were performed when changes in responsiveness prompted neurological examination.
The overall rater agreement was excellent for both the FOUR score (Kw 0.93, 95% CI 0.89–0.97; ICC 0.94, 95% CI 0.91–0.96) and the GCS (Kw 0.96, 95% CI 0.94–0.98; ICC 0.96, 95% CI 0.93–0.97). Rater agreement was the highest among neurology residents and lower Kw scores were obtained by neuroscience nurses for both scales (table 1). The brainstem component of the FOUR score and the verbal component of the GCS had the lowest total scores (Kw 0.66 and 0.86, respectively). Nevertheless, good-to-excellent interobserver agreement was observed on these scale components.
Rater agreement by LOC was comparable in stuporous and comatose patients in both scales [FOUR score: 0.83 (95% CI 0.67–0.99); GCS: 0.83 (95% CI 0.61–0.99)]. In alert and particularly in drowsy patients, we found less interobserver agreement. Total Kw values of the FOUR score and the GCS for the alert group were 0.70 (95% CI 0.46–0.99) and 0.84 (95% CI 0.72–0.96); for the drowsy group the values were 0.43 (95% CI 0.11–0.75) and 0.76 (95% CI 0.56–0.96).
Interobserver agreement by type of stroke was similar in both scales. The total Kw scores of the FOUR score for the ischemic and hemorrhagic stroke groups were 0.89 (95% CI 0.72–0.99) and 0.96 (95% CI 0.92–0.99), respectively; and for the GCS the scores were 0.97 (95% CI 0.95–0.99) and 0.94 (95% CI 0.88–0.99), respectively.
A high degree of internal consistency was found by measuring Cronbach’s α for the FOUR score (0.84 for the first rater; 0.85 for the second rater) and for the GCS (0.86 for the first rater; 0.87 for the second rater). Spearman’s correlation coefficients between GCS and FOUR scores were 0.83 for the first rater and 0.82 for the second rater (p < 0.01) (fig. 2a).
Spearman’s correlation coefficients between total NIH stroke scale and FOUR scores showed an expected in-



Fig. 2. a Correlation between the FOUR score (y-axis, 0–16) and the GCS (x-axis, 3–15) (ρ = 0.83, p < 0.01). b Correlation between the FOUR score (y-axis, 0–16) and the NIH stroke scale’s LOC item (x-axis, 0–7) (ρ = –0.78, p < 0.001).
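
The rank correlations quoted in the Fig. 2 caption can be illustrated with a short sketch. The Python below is not the authors' SPSS analysis; it is a minimal, assumed implementation of Spearman's ρ (the Pearson correlation of average ranks, with tied values sharing their mean rank). The function names and the six paired scores are hypothetical and serve only as an illustration; they are not study data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        tied_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = tied_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical paired ratings (NOT study data): FOUR totals (0-16) and
# NIH stroke scale LOC item scores (0-7) for six imaginary patients.
four_totals = [16, 16, 14, 12, 9, 5]
nihss_loc = [0, 1, 0, 2, 4, 7]
rho_example = spearman_rho(four_totals, nihss_loc)
```

With these made-up values `rho_example` is negative, which mirrors the direction of the relationship in panel b: a higher FOUR score goes with a lower LOC item score.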

Table 1. Interobserver variability of the FOUR score and the GCS estimated with Kw scores

Rater    Number of     FOUR score                                               GCS
pair     evaluations   ocular     motor      brainstem  respiratory  total      ocular     motor      verbal     total
R/R      26            0.79       0.93       0.79       1            0.95       0.88       0.96       0.92       0.97
R/N      24            0.93       0.93       0.56       0.65         0.89       0.91       0.97       0.93       0.97
N/N      25            0.77       0.67       0.50       0.65         0.72       0.70       0.81       0.67       0.73
Total    75            0.85       0.92       0.66       0.91         0.93       0.90       0.96       0.86       0.96
95% CI                 0.77–0.90  0.88–0.95  0.52–0.77  0.6–0.94     0.89–0.97  0.84–0.94  0.94–0.98  0.78–0.94  0.94–0.98

R/R = 2 neurology residents; R/N = a resident and a nurse; N/N = 2 nurses.
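
The Kw values in Table 1 can be read against the agreement bands given in the Statistical Analysis section. The sketch below is an illustration under stated assumptions, not the MedCalc procedure the authors used: `weighted_kappa` implements Cohen's weighted kappa with linear weights (the paper does not say which weighting scheme was applied), and `agreement_band` encodes the quoted thresholds, with the boundary values 0.6 and 0.8 assigned to the lower band as an assumption. Both function names are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, categories):
    """Cohen's weighted kappa for two raters, using linear weights (assumed)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same, non-zero number of ratings")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two ordered categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    po = pe = 0.0
    for i in range(k):
        for j in range(k):
            weight = 1 - abs(i - j) / (k - 1)  # 1 on the diagonal, 0 at the extremes
            po += weight * observed[i][j]      # weighted observed agreement
            pe += weight * row[i] * col[j]     # weighted chance agreement
    if pe == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (po - pe) / (1 - pe)


def agreement_band(kw):
    """Map a kappa value to the bands quoted in the Statistical Analysis text."""
    if kw <= 0.4:
        return "poor"
    if kw <= 0.6:  # assumption: exactly 0.6 falls in the lower band
        return "fair to moderate"
    if kw <= 0.8:  # assumption: exactly 0.8 falls in the lower band
        return "good"
    return "excellent"
```

Under these assumptions, `agreement_band(0.93)` returns "excellent" for the total FOUR score and `agreement_band(0.66)` returns "good" for the brainstem component, matching the wording used in the Results.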

verse correlation (ρ = –0.61; p < 0.001). We also found an inverse correlation (ρ = –0.78; p < 0.001) between the FOUR score and the LOC item (1A + 1B + 1C; 0–7 points) of the NIH stroke scale (fig. 2b).

Discussion

This is the first interobserver variability study of a Spanish version of the FOUR score in an acute stroke unit setting, and it shows not only a good correlation of the FOUR score with the GCS and the NIH stroke scale, but also a high degree of interobserver agreement. Moreover, a Spanish version of this new coma scale may facilitate its dissemination not only in Spain but also in Spanish-speaking developing countries and among Hispanic communities in North America and Europe.
The FOUR score may have advantages over the GCS and the LOC item of the NIH stroke scale. The FOUR score can identify brainstem abnormalities in patients with stroke of the vertebrobasilar territory and is able to recognize locked-in syndrome [10, 11]. The FOUR score is also able to identify signs of brain herniation (e.g. brain edema in malignant middle cerebral artery infarction) that may require prompt neurosurgical procedures [10, 11]. Furthermore, we noted in our patients that language disturbances such as motor, transcortical or global aphasia, predominantly due to cardioembolic stroke, could be misclassified as altered consciousness by raters using the GCS, but with the FOUR score these



dysphasic but alert patients were rated more in accordance with their LOC.
A prior study identified a negative correlation between the GCS and the NIH stroke scale (ρ = –0.57, p < 0.001), but no association has been previously established between the NIH stroke scale and the FOUR score [14]. The FOUR score and NIH stroke scale (total and LOC item scores) showed an inverse correlation which was even stronger than that between the GCS and NIH stroke scale (ρ = –0.61 and –0.78 vs. –0.57; p < 0.001).
We found a high degree of interobserver agreement in the FOUR score with the best rater agreement among neurology residents, followed by resident and nurse, and nursing staff pairs. Individual components of the FOUR score showed different Kw values (table 1). Brainstem reflexes had the lowest agreement among individual components of the FOUR score; however, neurology residents had better scores than neuroscience nurses. Rating of the eye responses with the FOUR score was comparable between raters, and it was better than in the original validation study [10]. Perfect agreement in the respiration component was found among neurology residents; however, the combination of nurses and physicians had difficulty identifying the irregular breathing and Cheyne-Stokes respiration patterns seen in 4 patients. Increasing the internal validity of our study, all exams were completed within 1 h of the original evaluation. This was the same methodology as in the original validation of the scale [10].
Use of the FOUR score for evaluating acute stroke patients may have some limitations. Although this new scale has been proven simple to apply, previous experience, more training and perhaps more clinical skills (for brainstem reflexes and recognition of respiration patterns) could further improve rater agreement [11]. The lowest agreement was seen in the brainstem component and specifically in the nurse-nurse evaluation. This further supports an improvement in reproducibility linked to level of training. However, the Kw evaluation remained at least fair.
In our study, half of the patients were alert at the moment of neurological evaluation. This may have improved rater agreement values, but in stuporous and comatose patients we also found excellent agreement among raters. Another caveat to be considered is that, as occurs with the GCS, the clinical significance of the error between scores is not a continuum. Therefore, therapeutic decisions should be individualized.
We conclude that the FOUR score is a reliable scale, has good internal consistency, a good correlation with the GCS and an inverse correlation with the NIH stroke scale. We have found excellent interobserver agreement in both the FOUR score and the GCS when neurology residents evaluated the stroke patients. Nurses had lower agreement in both scoring systems; nevertheless, the agreement between nurses and residents was excellent. The FOUR score is a reliable tool for evaluating LOC in patients with acute stroke.

Acknowledgements
We appreciate the work of the staff nurses of the Department of Neurology of La Paz University Hospital who collaborated in this study.

References
1 Weir CJ, Bradford AP, Lees KR: The prognostic value of the components of the Glasgow Coma Scale following acute stroke. QJM 2003;96:67–74.
2 Tsao JW, Hemphill JC 3rd, Johnston SC, Smith WS, Bonovich DC: Initial Glasgow Coma Scale score predicts outcome following thrombolysis for posterior circulation stroke. Arch Neurol 2005;62:1126–1129.
3 Wijdicks EF, Rabinstein AA: Absolutely no hope? Some ambiguity of futility of care in devastating acute stroke. Crit Care Med 2004;32:2332–2342.
4 Teasdale G, Jennett B: Assessment of coma and impaired consciousness: a practical scale. Lancet 1974;ii:81–84.
5 Prasad K, Menon GR: Comparison of the three strategies of verbal scoring of the Glasgow Coma Scale in patients with stroke. Cerebrovasc Dis 1998;8:79–85.
6 Levy DE, Bates D, Caronna JJ, Cartlidge NE, Knill-Jones RP, Lapinski RH, Singer BH, Shaw DA, Plum F: Prognosis in nontraumatic coma. Ann Intern Med 1981;94:293–301.
7 Sternbach GL: The Glasgow coma scale. J Emerg Med 2000;19:67–71.
8 Gill M, Martens K, Lynch EL, Salih A, Green SM: Interrater reliability of 3 simplified neurologic scales applied to adults presenting to the emergency department with altered levels of consciousness. Ann Emerg Med 2007;49:403–407.
9 Healey C, Osler TM, Rogers FB, Healey MA, Glance LG, Kilgo PD, Shackford SR, Meredith JW: Improving the Glasgow Coma Scale score: motor score alone is a better predictor. J Trauma 2003;54:671–678.
10 Wijdicks EF, Bamlet WR, Maramattom BV, Manno EM, McClelland RL: Validation of a new coma scale: the FOUR score. Ann Neurol 2005;58:585–593.
11 Wolf CA, Wijdicks EF, Bamlet WR, McClelland RL: Further validation of the FOUR score coma scale by intensive care nurses. Mayo Clin Proc 2007;82:435–438.
12 Ropper AH: Lateral displacement of the brain and level of consciousness in patients with an acute hemispheral mass. N Engl J Med 1986;314:953–958.
13 Landis JR, Koch GG: The measurement of observer agreement for categorical data. Biometrics 1977;33:159–174.
14 Dominguez R, Vila JF, Augustovski F, Irazola V, Castillo PR, Rotta Escalante R, Brott TG, Meschia JF: Spanish cross-cultural adaptation and validation of the National Institutes of Health stroke scale. Mayo Clin Proc 2006;81:476–480.

