
ORIGINAL RESEARCH: NEURORADIOLOGY

Detection of Simulated Multiple Sclerosis Lesions on T2-weighted and FLAIR Images of the Brain: Observer Performance¹


John H. Woo, MD
Lana P. Henry, MD
Jaroslaw Krejza, MD, PhD²
Elias R. Melhem, MD

Purpose: To determine observer performance in the detection of multiple sclerosis (MS) lesions on magnetic resonance (MR) images of the brain and to assess the dependence of observer performance on lesion size, parenchymal location, pulse sequence, and supratentorial versus infratentorial level.

Materials and Methods: This HIPAA-compliant protocol was approved by the institutional review board, and previously acquired MR data from a healthy volunteer and a patient with MS were used to derive parameter maps, with waiver of informed consent. Parameter maps and image simulator software were used to generate 320 phantom brain images with simulated supratentorial and infratentorial MS lesions. Images were displayed with T2-weighting or fluid-attenuated inversion recovery (FLAIR) contrast. Four readers independently evaluated the images, rating lesions on a five-point certainty scale. Observer performance was measured by using the area under the alternative free-response receiver operating characteristic curve (A1), and significance was determined with the z test.

Results: Pooled A1 scores were significantly better for FLAIR imaging (0.96 ± 0.01 [standard error]) than for T2-weighted MR imaging (0.89 ± 0.04) supratentorially (P = .05) but were similar for FLAIR imaging (0.90 ± 0.06) and T2-weighted MR imaging (0.88 ± 0.05) infratentorially. A1 scores for cortical, deep white matter, and periventricular lesions were 0.93 ± 0.05, 0.97 ± 0.02, and 0.89 ± 0.04, respectively, for FLAIR imaging and 0.77 ± 0.06, 0.99 ± 0.01, and 0.89 ± 0.05, respectively, for T2-weighted MR imaging. FLAIR scores were significantly higher than T2-weighted scores for cortical lesions. Linear correlation was found between A1 and lesion size (r = 0.5).

Conclusion: Supratentorially, performance was better with FLAIR imaging than with T2-weighted MR imaging. Infratentorially, performance was moderate with both modalities. Observers did better with FLAIR imaging in the detection of cortical lesions, and performance improved with increasing lesion size.

© RSNA, 2006

¹ From the Department of Radiology, Division of Neuroradiology, Hospital of the University of Pennsylvania, 3400 Spruce St, Dulles 2, Philadelphia, PA 19104. From the 2004 RSNA Annual Meeting. Received May 9, 2005; revision requested July 7; revision received October 9; accepted November 14; final version accepted December 22. Address correspondence to J.H.W. (e-mail: woojohn@uphs.upenn.edu).

² Current address: Bialystok Medical Academy, Bialystok, Poland.

206 Radiology: Volume 241: Number 1—October 2006


NEURORADIOLOGY: Detection of Simulated Multiple Sclerosis Lesions Woo et al

Published online: 10.1148/radiol.2411050792
Radiology 2006; 241:206–212

Advances in Knowledge
■ Observer performance on fluid-attenuated inversion recovery (FLAIR) images is superior to that on T2-weighted MR images supratentorially and is similar to that on T2-weighted MR images infratentorially.
■ Evidence indicates that performance on FLAIR images is superior to that on T2-weighted MR images in the detection of cortical lesions.
■ Evidence indicates that performance improves with increasing lesion size.

Abbreviations:
A1 = area under the AFROC curve
AFROC = alternative free-response receiver operating characteristic
FLAIR = fluid-attenuated inversion recovery
MS = multiple sclerosis

Author contributions:
Guarantors of integrity of entire study, J.H.W., L.P.H.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; manuscript final version approval, all authors; literature research, J.H.W., L.P.H.; experimental studies, J.H.W., L.P.H.; statistical analysis, J.H.W., J.K.; and manuscript editing, all authors

Authors stated no financial relationship to disclose.

Magnetic resonance (MR) imaging of the brain plays a crucial role in the imaging of multiple sclerosis (MS). However, it is difficult to obtain objective measures of observer performance in the task of MS lesion detection on MR images. Lesions are (in general) easily seen, particularly on T2-weighted and fluid-attenuated inversion recovery (FLAIR) MR images, but pathologic correlation with tissue sampling is rarely, if ever, performed. As a result, any attempt to design a study by using real MR images to measure observer performance in the detection of MS lesions would be impeded by this lack of correlative truth data.

As a partial solution to this problem, previous investigators used an MR image simulator to create phantom images with simulated lesions (1,2). Herskovits et al (1) tested observers to assess how performance depended on lesion location and MR sequence. Their testing paradigm was limited, however, by the forced binary nature of the observer response: a lesion was either present or absent. As a result, their analysis yielded only sensitivity and specificity values as measures of performance. Moreover, they did not test observers with images of the posterior fossa, where many researchers believe that MS lesions are less detectable on FLAIR images (3,4). Thus, the purpose of our study was to determine observer performance in the detection of MS lesions on MR images of the brain and to assess the dependence of observer performance on lesion size, parenchymal location, pulse sequence, and supratentorial versus infratentorial level.

Materials and Methods
The details of the MR simulator have been previously published (1,2). A specialized acquisition sequence was used to derive MR parameter maps of the brain in a healthy 40-year-old male volunteer who had no known neurologic disease. These images were judged to be normal by all four authors (0–12 years of experience in diagnostic neuroradiology). Parameter maps were similarly acquired in a 32-year-old woman with MS. Both participants had previously given written informed consent to a prior study that was approved by the institutional review board. For the current study, which was also approved by the institutional review board, informed consent was waived. The study was compliant with the Health Insurance Portability and Accountability Act because only the MR data were used, with all the identifying information deleted.

Template Data
A 1.5-T MR imager (ACS-NT; Philips Medical Systems, Best, the Netherlands) was used to perform mixed multiecho spin-echo and inversion-recovery MR imaging in the healthy volunteer. This sequence yielded eight spin-echo MR images (repetition time, 1500 msec) with different echo times (20, 40, 60, 80, 100, 120, 140, and 160 msec) and eight inversion-recovery MR images (repetition time, 2000 msec; inversion time, 400 msec) with the same eight echo times. Section thickness was 5 mm, in-plane resolution was 0.80 × 0.86 mm (rectangular field of view, 165 × 220 mm; matrix, 205 × 256), and acquisition time was 9 minutes 30 seconds. Pixel maps (256 × 256) of T1 relaxation rate, T2 relaxation rate, and proton density were generated online (software release 6.2; Philips Medical Systems) by using the multiecho data. The multiecho sequence was performed twice, with images obtained in the brain at two transverse levels (supratentorial and infratentorial). The supratentorial level included the lateral ventricles at the septum pellucidum, and the infratentorial level intersected the fourth ventricle and orbital floors.

Lesion Data
By performing the same multiecho MR sequence on the patient with MS, we obtained similar parameter maps for a lesion in the left centrum semiovale. All four authors agreed on the presence of this lesion. The lesion was digitally extracted by including only those pixels that had T1, T2, and proton density values that were more than 2 standard deviations above the mean value in the contralateral normal-appearing white matter, which resulted in a lesion diameter of 7 pixels. This threshold of 2 standard deviations above the mean was previously shown to separate the lesion from the surrounding white matter (2). Resampling with bicubic interpolation generated smaller lesions that were 2, 3, 4, 5, or 6 pixels in diameter.

MR Image (Test Case) Generation
The MR image simulator software used a steady-state solution to the Bloch equation and calculated signal intensity (S) as a function of three tissue parameters (ie, proton density [ρ], T1, and T2) and three machine parameters (repetition time [TR], echo time [TE], and inversion time [TI]) at the x and y coordinates:

$$S(x,y) = \rho(x,y)\,\left|1 - 2e^{-\mathrm{TI}/T_1(x,y)}\right|\,\frac{\left[1 - e^{-\mathrm{TR}/T_1(x,y)}\right]\,e^{-\mathrm{TE}/T_2(x,y)}}{\left[1 + e^{-\mathrm{TR}/T_1(x,y)}\,e^{-\mathrm{TR}/T_2(x,y)}\right]}.$$

This is the same formula used in previous implementations of the MR simulator (1,2), with an additional term in the denominator that better approximates steady-state signal intensity (5). By applying this formula on a pixel-by-pixel basis, we generated images with either T2-weighting (4500/100 [repetition time msec/echo time msec]) or FLAIR contrast (11 000/140/2600 [repetition time msec/echo time msec/inversion time msec]). A lesion was embedded in a template by choosing the greater of the signal intensities computed for the lesion and normal brain tissue.

A total of 80 supratentorial images were first generated, with 20 images each containing zero, one, two, or three lesions. Thus, a total of 120 lesions were placed. These lesions were deliberately chosen in order to represent equally each of the five lesion sizes (2, 3, 4, 5, or 6 pixels), with 24 lesions of each size. An approximately equal number of lesions was also distributed among cortical, deep white matter, and periventricular locations, with about 40 lesions in each location. This procedure was repeated at the infratentorial level, with similar attention given to the distribution of sizes and locations, thereby creating 80 more images. Each of the resulting 160 images was used twice (once with T2-weighting and once with FLAIR contrast), yielding 320 total test cases (Fig 1). All 320 images were shown to each reader.

Figure 1: Representative simulated transverse FLAIR (11000/140/2600) and T2-weighted (4500/100) MR images show lesions in same location. (a) At the supratentorial level, the callosal lesion (arrow) is seen equally well on T2-weighted (right) and FLAIR (left) images, but the cortical lesion (arrowhead) is much more visible on the FLAIR image. (b) At the infratentorial level, the two lesions are seen equally well on T2-weighted (right) and FLAIR (left) images, with one lesion (arrowhead) in the pontine white matter and the other (arrow) in the periventricular white matter near the fourth ventricle.

Testing
Our experiment followed a multiple-reader multiple-case design. Four board-certified neuroradiologists, each in their 1st year of diagnostic neuroradiology fellowship training, evaluated the images by using testing software written in Interactive Data Language (Research Systems, Boulder, Colo). None of the four readers were authors. Each reader was instructed to find and designate all suspected lesions on the images by using a point-and-click mouse interface (described below). Before actual testing began, the reader was allowed to practice with this interface during a demonstration session. The reader was informed that each image could contain any number of lesions, including zero, with no maximum specified. Each testing session occurred at the same workstation with a 21-inch monitor, and window width and center levels were chosen to match those of standard MR images of the brain.

The 320 images, each linearly interpolated to a 512 × 512 matrix, were
displayed one at a time in a random order, alternating between supratentorial and infratentorial images. After detecting a possible lesion, the reader used the mouse to locate the lesion with a button click. A drop-down menu then allowed the reader to assign one of the following certainty ratings: 4, definite lesion; 3, probable lesion; 2, possible lesion; and 1, cannot exclude lesion. These ratings reflected a decreasing scale of reader certainty regarding the presence of a lesion so that, for example, a rating of 1 (cannot exclude lesion) was less certain than a rating of 2 (possible lesion). The location and rating score were recorded for each response. If desired, the reader could delete a response with a different button click. This functionality was allowed, for example, in case the reader mistakenly clicked at an unintended location or if the reader simply reconsidered a prior response.

The reader was asked to locate and score all lesions that were found on the image within a 30-second time limit. A visual signal indicated when 25 seconds had elapsed, and at 30 seconds the image would be erased from the screen. If satisfied that all the lesions were found, the reader could proceed to the next image before the time limit expired. Breaks were allowed between images, and each reader was allowed to take the test in two separate 160-image sessions to reduce fatigue.

Scoring
The tests were scored by using the alternative free-response receiver operating characteristic (AFROC) scoring method (6). A response was deemed true-positive if located within 5 pixels (in both x and y coordinates) of a true lesion. Each true-positive response was assigned a “hit” score of 1–4, depending on the reader’s certainty (1 for least certain to 4 for most certain). An undetected lesion (ie, a false-negative finding) was assigned a hit score of 0. In this way, each lesion (detected or not) was assigned a hit score of 0–4.

Any response not within 5 pixels of a true lesion was deemed a false-positive response. For each image, the false-positive image score recorded the maximum rating of all false-positive responses for that image. For example, if two false-positive responses were given, with certainty scores of 1 and 2, then the resulting false-positive image score for that image would be 2. An image with no false-positive responses was assigned a false-positive image score of 0 (true-negative result). In this way, each image was assigned a false-positive image score of 0–4.

For each reader, this scoring method enabled the construction of an AFROC curve (6), which is analogous to the receiver operating characteristic curve that is used in traditional receiver operating characteristic studies. This curve plotted the true-positive fraction of lesions versus the false-positive fraction of images. Observer performance was quantified as the area under the AFROC curve (A1), which was estimated by using trapezoidal integration. The four A1 values were averaged together to form a pooled metric of observer performance.

Statistical Analysis
We applied the z test to assess for statistical significance at a P level of .05 for comparing performance by using a statistical software program (Excel 2000; Microsoft, Redmond, Wash). To apply the z test for comparing pooled metrics of observer performance, we needed to estimate the appropriate means and standard errors. Mean performance was estimated by using the average of the four A1 values of the four individual readers. Then, as outlined by Hanley (7), the overall standard error could be estimated by using a function that effectively sums the error components caused by interreader and intercase variability. Interreader variability was estimated by using the sample variance of the four individual A1 values. Intercase variability was estimated with a jackknife approach as the sample variance of the resulting pseudovalues calculated by the JAFROC software (version 1.00; www.devchakraborty.com) (8).

We tested for significant differences in performance with respect to modality (FLAIR vs T2-weighted MR imaging) at the supratentorial and infratentorial levels by using the z test. We also applied the z test to evaluate differences in performance (pooled across supratentorial and infratentorial levels) in the detection of lesions located at cortical, deep white matter, and periventricular locations. Finally, we evaluated the correlation coefficient in order to see how overall performance (pooled across both modalities and both levels) varied with lesion size.

Results
At the supratentorial level, observer performance was excellent on FLAIR images (pooled A1 score, 0.96 ± 0.01 [standard error]) and only moderate on T2-weighted MR images (pooled A1 score, 0.89 ± 0.04). AFROC curves for the four readers and pooled performance values are plotted in Figure 2a (for T2-weighted MR images) and 2b (for FLAIR images). The difference in performance between modalities was statistically significant (z = 1.97, P = .05).

At the infratentorial level, pooled A1 scores showed moderate performance on FLAIR images (0.90 ± 0.06) and T2-weighted MR images (0.88 ± 0.05). Corresponding AFROC curves are plotted in Figure 3a (for T2-weighted MR images) and 3b (for FLAIR images). This difference was not statistically significant (P = .76).

Performance also varied according to lesion location (Fig 4), with scores for the supratentorial and infratentorial levels aggregated together. For cortical lesions, performance was good on FLAIR images (0.93 ± 0.05) but only moderate on T2-weighted MR images (0.77 ± 0.06). For deep white matter lesions, performance was excellent on FLAIR images (0.97 ± 0.02) and T2-weighted MR images (0.99 ± 0.01). For periventricular lesions, performance was moderate on FLAIR images (0.89 ± 0.04) and T2-weighted MR images (0.89 ± 0.05). FLAIR scores were significantly higher than T2-weighted MR imaging scores for cortical lesions (z = 2.06, P = .04) but not for deep white matter (P = .51) or periventricular (P = .93) lesions. We also found a moderate
linear correlation (r = 0.5, P < .05) between A1 and lesion size (Fig 5).

Figure 2: AFROC curves for supratentorial (a) T2-weighted and (b) FLAIR images. Individual AFROC curves for the four observers (R1, R2, R3, and R4) are plotted as dashed lines; the pooled AFROC curve is plotted as a solid line. In a, overall performance in lesion detection is good. Compared with performance in a, performance in b is improved. The y-axis shows the true-positive fraction of lesions, and the x-axis shows false-positive fraction of images.

Figure 3: AFROC curves for infratentorial (a) T2-weighted and (b) FLAIR images. Individual AFROC curves for the four observers (R1, R2, R3, and R4) are plotted as dashed lines; the pooled AFROC curve is plotted as a solid line. In a, overall performance in lesion detection is good. Performance is equivalent in a and b. The y-axis shows the true-positive fraction of lesions, and the x-axis shows false-positive fraction of images.

Discussion
In this study, we applied free-response methods to obtain prospective measures of observer performance in the detection of simulated MS lesions. We found that, supratentorially, performance was significantly better on FLAIR images than on T2-weighted MR images. This finding is in agreement with the findings of Herskovits et al (1) and is also consistent with the results of older studies (3,4,9–13). In many of these older studies, researchers compared modalities by calculating the number or volume of lesions, as determined by the consensus opinion of radiologists (3,4,9–11) or by using semiautomated lesion-detection software (12,13). While such methods can provide some basis for comparison, they do not approximate the usual conditions of image interpretation, and they cannot account for interreader variability. By using a multiple-reader multiple-case design, we provide better objective evidence that FLAIR images allow a meaningful improvement in performance compared with T2-weighted MR images at the supratentorial level.

Performance was equivalent and moderate for the two modalities at the infratentorial level, corroborating the belief that FLAIR imaging performs worse in the posterior fossa (3,4). Various explanations for this phenomenon include the additional artifacts introduced by the FLAIR sequence (for example, cerebrospinal fluid flow artifacts [14]) and/or possible differences in the intrinsic MR characteristics of lesions. Indeed, Stevenson et al (15) reported differences in the T1 and T2 parameters of infratentorial MS lesions that may at least partially explain their decreased visibility on FLAIR images. Interestingly, our results demonstrated that FLAIR imaging performed worse infratentorially, even though we did not model these additional artifacts or consider any differences in underlying lesion characteristics. Therefore, an additional factor must be responsible for this performance decrease.

Perhaps the explanation lies in the anatomic differences between the posterior fossa and the cerebrum. Because the main advantage of FLAIR imaging is that it nulls the high signal intensity within the cerebrospinal fluid spaces, one might postulate that the “performance gain” of FLAIR imaging is more accentuated supratentorially, where the ventricles are larger and the sulci are more abundant. This hypothesis, while reasonable, remains unproved.

We found a monotonic decrease in performance with decreasing lesion size, which is an expected and reasonable finding because smaller lesions should be more difficult to detect. We deliberately chose this size range,
knowing full well that MS lesions are usually larger, so that our measures likely underestimate the true performance of radiologists in clinical practice. However, by using small sizes, we assessed performance with the most difficult of lesions to detect, thereby heightening any differences between the variables that we tested, such as modality, location, or transverse level. Had we included lesions of larger size, the differences would have been harder to detect because all performance measures would have been high.

We found that observer performance varied with lesion location, with improved performance on FLAIR images in the detection of cortical lesions, which corroborates the results of Herskovits et al (1). Conceptually, this finding makes sense because FLAIR imaging nulls the cerebrospinal fluid signal intensity that might obscure these lesions on T2-weighted MR images. The cortical and subcortical lesion burden has been found to correlate with regional atrophy (10) and with cognitive impairment (9,16), thereby highlighting the importance of lesion detection in this location.

Figure 4: Observer performance in the detection of MS lesions is significantly better on FLAIR images (white bars) than T2-weighted MR images (black bars) for gray matter (GM) lesions. Performance is similar for deep white matter (DWM) and periventricular (PVWM) lesions. Error bars = standard error.

Figure 5: Observer performance in the detection of MS lesions on FLAIR and T2-weighted MR images is shown as a function of lesion size. Performance increases with increasing maximum lesion size from 2 to 6 pixels (r = 0.5; P = .04). Error bars = standard error.

The number of cases used in this study represented a compromise between too many images, which would lengthen the study time and perhaps introduce the confounding element of fatigue, and too few images, which might reduce the power of the study. In a previous study, Obuchowski (17) estimated the number of cases that would be needed for a receiver operating characteristic study to have adequate power in demonstrating a significant difference between diagnostic modalities. For studies with four observers, a high (approximately 0.90) accuracy of techniques, a small (approximately 0.05) difference between modalities, a 1:1 ratio of normal to abnormal cases, and a small interreader variability, Obuchowski estimated that a total of 287 cases would be needed to detect a difference with 80% power. Our study had a smaller ratio (1:3) of normal to abnormal cases and was a free-response design rather than a receiver operating characteristic design, but we believe that 320 cases should supply adequate power to detect any relevant differences.

Our study has several limitations. We used only a single template at either transverse level to generate the test cases. Therefore, we did not account for interindividual variability in the brain or for section to section variability. Moreover, having only a single template may have falsely elevated our performance measures because of recall effect. Although the images were shown in an alternating sequence to reduce this effect, almost certainly the readers, as trained neuroradiologists, would be able to remember previous images as they interpreted subsequent images. False-positive responses would be reduced, because readers may not score a lesion that they might otherwise have questioned simply because they remember it as an artifact from a previous image. This bias, however, would have affected all measurements equally so that comparisons of performance may still be valid.

Our study is also limited by our use of a single MS lesion. Similar to Melhem et al (2), we used the MR characteristics of an actual MS lesion as a model for the simulated lesions. This method contrasts with the work of Herskovits et al (1), wherein the model lesion was artificially constructed by using an octagonal shape and MR parameters that are typically reported in MS plaques. However, MS lesions in general have a widely variable appearance on MR images, which likely reflects the heterogeneous processes, such as demyelination, gliosis, axonal loss, remyelination, inflammation, edema, or some combination thereof, that are involved at the histopathologic level. Because we did not account for interlesion variability, it may be difficult to extrapolate our results to estimate the true observer performance in the detection of all MS lesions. Moreover, the validity of our method to generate gray matter lesions may be questioned because our model lesion was extracted from white matter. However, removing interlesion variability should reduce our resulting variances, perhaps increasing the power of our results in demonstrating significant differences in performance.

Besides the limitations related to image generation, others arise from our observer testing methods. We believe that 30 seconds allows readers adequate time to evaluate each image, but this allotment may not accurately reflect actual reading conditions in which single images are usually evaluated only for a few seconds, perhaps longer if abnormalities are detected. We did allow readers to spend less time if they were satisfied that all lesions were found. Still, it is unclear what effect this would have had on observer performance. On the one hand, by looking at an image longer, readers might find more true lesions than they might have otherwise
found, thereby improving performance. On the other hand, they might also find more false-positive lesions, thereby reducing performance. Again, we expect that such biases would affect performance measures equally so that differences in performance may still be applicable.

Finally, the accuracy of our testing methods is limited because of other fundamental differences with standard image interpretation. Our tests used single images, whereas normally readers can assess contiguous images to help decide whether the findings represent actual lesions. Moreover, modern picture archiving and communication systems allow readers to adjust the window settings to improve lesion visibility; our software did not have this feature. Again, because these limitations can be expected to affect performance measures equally, differences may still be detected and valid.

In summary, performance on FLAIR images is excellent and is better than that on T2-weighted MR images supratentorially. Infratentorially, performance is moderate and is similar between the two modalities. Observers performed better on FLAIR images in the detection of cortical lesions, and performance improved with increasing lesion size. We believe that, as a first approximation, our methods provide a useful approach to measure observer performance in the task of MS lesion detection on MR images of the brain. In the future, one might imagine improving the method, perhaps with multiple brain templates, multiple digitized lesions, and testing software that more closely approximated actual viewing conditions.

Acknowledgments: We thank Edward H. Herskovits, MD, PhD, for his original work in performance measurements by using an MR simulator and Abbas Jawad, PhD, for his help in statistical analysis.

References
1. Herskovits EH, Itoh R, Melhem ER. Accuracy for detection of simulated lesions: comparison of fluid-attenuated inversion-recovery, proton density–weighted, and T2-weighted synthetic brain MR imaging. AJR Am J Roentgenol 2001;176:1313–1318.
2. Melhem ER, Herskovits EH, Karli-Oguz K, et al. Defining thresholds for changes in size of simulated T2-hyperintense brain lesions on the basis of qualitative comparisons. AJR Am J Roentgenol 2003;180:65–69.
3. Gawne-Cain ML, O’Riordan JI, Thompson AJ, Moseley IF, Miller DH. Multiple sclerosis lesion detection in the brain: a comparison of fast fluid-attenuated inversion recovery and conventional T2-weighted dual spin echo. Neurology 1997;49:364–370.
4. Filippi M, Yousry T, Baratti C, et al. Quantitative assessment of MRI lesion load in multiple sclerosis. IV. A comparison of conventional spin-echo with fast fluid-attenuated inversion recovery. Brain 1996;119:1349–1355.
5. Droege RT, Wiener SN, Rzeszotarski MS, Holland GN, Young IR. Nuclear magnetic resonance: a gray scale model for head images. Radiology 1983;148:763–771.
6. Chakraborty DP, Winter LH. Free-response methodology: alternate analysis and a new observer-performance experiment. Radiology 1990;174:873–881.
7. Hanley JA. Receiver operating characteristic (ROC) methodology: the state of the art. Crit Rev Diagn Imaging 1989;29:307–335.
8. Chakraborty DP, Berbaum KS. Observer studies involving detection and localization: modeling, analysis, and validation. Med Phys 2004;31:2313–2330.
9. Moriarty DM, Blackshaw AJ, Talbot PR, et al. Memory dysfunction in multiple sclerosis corresponds to juxtacortical lesion load on fast fluid-attenuated inversion-recovery MR images. AJNR Am J Neuroradiol 1999;20:1956–1962.
10. Bakshi R, Ariyaratana S, Benedict RH, Jacobs L. Fluid-attenuated inversion recovery magnetic resonance imaging detects cortical and juxtacortical multiple sclerosis lesions. Arch Neurol 2001;58:742–748.
11. Yousry TA, Filippi M, Becker C, Horsfield MA, Voltz R. Comparison of MR pulse sequences in the detection of multiple sclerosis lesions. AJNR Am J Neuroradiol 1997;18:959–963.
12. Tubridy N, Barker GJ, MacManus DG, Moseley IF, Miller DH. Three-dimensional fast fluid attenuated inversion recovery (3D fast FLAIR): a new MRI sequence which increases the detectable cerebral lesion load in multiple sclerosis. Br J Radiol 1998;71:840–845.
13. Bastianello S, Bozzao A, Paolillo A, et al. Fast spin-echo and fast fluid-attenuated inversion-recovery versus conventional spin-echo sequences for MR quantification of multiple sclerosis lesions. AJNR Am J Neuroradiol 1997;18:699–704.
14. Arakia Y, Ashikaga R, Fujii K, Nishimura Y, Ueda J, Fujita N. MR fluid-attenuated inversion recovery imaging as routine brain T2-weighted imaging. Eur J Radiol 1999;32:136–143.
15. Stevenson VL, Parker GJ, Barker GJ, et al. Variations in T1 and T2 relaxation times of normal appearing white matter and lesions in multiple sclerosis. J Neurol Sci 2000;178:81–87.
16. Rovaris M, Filippi M, Minicucci L, et al. Cortical/subcortical disease burden and cognitive impairment in patients with multiple sclerosis. AJNR Am J Neuroradiol 2000;21:402–408.
17. Obuchowski NA. Sample size tables for receiver operating characteristic studies. AJR Am J Roentgenol 2000;175:603–608.
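
The signal equation given under MR Image (Test Case) Generation can be illustrated with a short Python sketch. This is not the simulator used in the study: the function names `signal` and `embed_lesion` and every tissue value below are hypothetical, and treating the T2-weighted case as having no inversion pulse (inversion factor set to 1) is an assumption, because the text does not state how the inversion term was handled for that contrast.

```python
import math

def signal(rho, t1, t2, tr, te, ti=None):
    """Steady-state signal intensity for one pixel (all times in msec).

    ti=None means no inversion pulse; the inversion factor is then taken
    as 1 (an assumption, not stated in the article).
    """
    inversion = 1.0 if ti is None else abs(1.0 - 2.0 * math.exp(-ti / t1))
    recovery = 1.0 - math.exp(-tr / t1)
    decay = math.exp(-te / t2)
    steady_state = 1.0 + math.exp(-tr / t1) * math.exp(-tr / t2)
    return rho * inversion * recovery * decay / steady_state

def embed_lesion(template_px, lesion_px):
    """Embed a lesion by keeping the greater of the two signal intensities."""
    return max(template_px, lesion_px)

# Pulse-sequence settings quoted in the article (msec).
T2W = dict(tr=4500, te=100)
FLAIR = dict(tr=11000, te=140, ti=2600)

# Hypothetical tissue parameters (rho, T1, T2), for illustration only.
white_matter = (0.70, 600.0, 80.0)
lesion = (0.85, 1000.0, 150.0)
csf = (1.00, 2600.0 / math.log(2.0), 2000.0)  # T1 chosen so FLAIR nulls it

for name, seq in (("T2-weighted", T2W), ("FLAIR", FLAIR)):
    wm = signal(*white_matter, **seq)
    les = signal(*lesion, **seq)
    fluid = signal(*csf, **seq)
    print(name, round(wm, 3), round(embed_lesion(wm, les), 3), round(fluid, 3))
```

In this formula the inversion term vanishes when T1 equals TI/ln 2, which is why the hypothetical fluid value above gives essentially zero signal with the FLAIR settings while remaining bright with the T2-weighted settings.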

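
The rules described under Scoring can likewise be sketched in Python. The sketch assumes that lesions and responses are given as pixel coordinates, that a lesion keeps the highest rating among the responses that hit it, and that the curve is closed at the point (1, 1) before trapezoidal integration. These details and the function names `score_image` and `afroc_area` are illustrative assumptions; they are not taken from the article or from the JAFROC software.

```python
def score_image(lesions, responses, tol=5):
    """Return (hit score per lesion, false-positive image score).

    lesions:   list of (x, y) true lesion centers
    responses: list of (x, y, rating) with rating 1-4
    """
    hits = [0] * len(lesions)      # 0 = undetected lesion (false-negative)
    fp_score = 0                   # 0 = no false-positive response
    for rx, ry, rating in responses:
        matched = False
        for i, (lx, ly) in enumerate(lesions):
            # True-positive: within tol pixels in both x and y.
            if abs(rx - lx) <= tol and abs(ry - ly) <= tol:
                hits[i] = max(hits[i], rating)
                matched = True
        if not matched:
            # Image score is the maximum rating of its false-positives.
            fp_score = max(fp_score, rating)
    return hits, fp_score

def afroc_area(hit_scores, fp_image_scores):
    """Area under the AFROC curve (A1) by trapezoidal integration."""
    points = [(0.0, 0.0)]
    for threshold in (4, 3, 2, 1):             # strictest rating first
        tpf = sum(s >= threshold for s in hit_scores) / len(hit_scores)
        fpf = sum(s >= threshold for s in fp_image_scores) / len(fp_image_scores)
        points.append((fpf, tpf))
    points.append((1.0, 1.0))                  # assumed closing point
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Toy example with hypothetical data: two images, three lesions in total.
cases = [
    ([(100, 120), (200, 80)], [(102, 118, 4), (300, 300, 2)]),
    ([(50, 60)], []),
]
all_hits, all_fp = [], []
for lesions, responses in cases:
    hits, fp = score_image(lesions, responses)
    all_hits.extend(hits)
    all_fp.append(fp)
print(all_hits, all_fp, afroc_area(all_hits, all_fp))
```

The true-positive fraction is computed over lesions and the false-positive fraction over images, matching the axes described for the AFROC curve.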

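
The z test described under Statistical Analysis can be sketched as follows. The sketch assumes that the two pooled A1 estimates are independent, so that the standard error of their difference is the root sum of squares of the two standard errors. The article does not spell out this step, and because the inputs below are the rounded values from Results, the output need not reproduce the published z and P values exactly. The function name `z_test` is hypothetical.

```python
import math

def z_test(a1_first, se_first, a1_second, se_second):
    """Two-sided z test for the difference between two A1 estimates.

    Assumes independent estimates (an assumption of this sketch).
    """
    z = (a1_first - a1_second) / math.sqrt(se_first ** 2 + se_second ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail area
    return z, p

# Cortical lesions, rounded values from Results: FLAIR vs T2-weighted.
z, p = z_test(0.93, 0.05, 0.77, 0.06)
print(f"z = {z:.2f}, P = {p:.3f}")
```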