
Alzheimer's & Dementia 8 (2012) 399–406

Comparing new templates and atlas-based segmentations in the volumetric analysis of brain magnetic resonance images for diagnosing Alzheimer's disease
Qian Shen^a,b, Weizhao Zhao^b, David A. Loewenstein^a,c, Elizabeth Potter^a, Maria T. Greig^a, Ashok Raj^d, Warren Barker^a, Huntington Potter^d, Ranjan Duara^a,c,d,e,f,*

a Wien Center for Alzheimer's Disease and Memory Disorders, Mount Sinai Medical Center, Miami Beach, FL, USA
b Department of Biomedical Engineering, University of Miami, Coral Gables, FL, USA
c Department of Psychiatry and Behavioral Sciences, Miller School of Medicine, University of Miami, Miami, FL, USA
d Johnnie B. Byrd, Sr. Alzheimer's Center & Research Institute, University of South Florida, Tampa, FL, USA
e Department of Medicine, Miller School of Medicine, University of Miami, Miami, FL, USA
f Department of Neurology, Miller School of Medicine, University of Miami, Miami, FL, USA

Abstract

Background: The segmentation of brain structures on magnetic resonance imaging scans for calculating regional brain volumes, using automated anatomic labeling, requires the use of both brain atlases and templates (template sets). This study aims to improve the accuracy of volumetric analysis of the hippocampus (HP) and amygdala (AMG) in the assessment of early Alzheimer's disease (AD) by developing template sets that correspond more closely to the brains of elderly individuals.
Methods: Total intracranial volume and HP and AMG volumes were calculated for elderly subjects with no cognitive impairment (n = 103), with amnestic mild cognitive impairment (n = 68), or with probable AD (n = 46) using the following: (1) a template set consisting of a standard atlas (atlas S), drawn on a young adult male brain, and the widely used Montreal Neurological Institute template (MNI template set); (2) a template set (template S set) in which the template is based on smoothing the image from which atlas S is derived; and (3) a new template set (template E set) in which the template is based on an atlas (atlas E) created from the brain of an elderly individual.
Results: Correspondence to HP and AMG volumes derived from manual segmentation was highest with automated segmentation by template E set, intermediate with template S set, and lowest with the MNI template set. The areas under the receiver operating characteristic curve for distinguishing elderly subjects with no cognitive impairment from elderly subjects with amnestic mild cognitive impairment or probable AD, and the correlations between HP and AMG volumes and cognitive and functional scores, were highest for template E set, intermediate for template S set, and lowest for the MNI template set.
Conclusions: The accuracy of automated anatomic labeling and the diagnostic value of the derived volumes are improved with template sets based on brain atlases closely resembling the anatomy of the to-be-segmented brain magnetic resonance imaging scans.
© 2012 The Alzheimer's Association. All rights reserved.

Keywords: Volumetric segmentation; Magnetic resonance imaging; Hippocampus; Amygdala; Alzheimer's disease; Mild cognitive impairment

*Corresponding author. Tel.: 305-674-2543; Fax: 305-532-5241.
E-mail address: duara@msmc.com
1552-5260/$ - see front matter © 2012 The Alzheimer's Association. All rights reserved.
doi:10.1016/j.jalz.2011.07.002

1. Introduction

Regional brain volumes of the hippocampus (HP) and amygdala (AMG), which undergo atrophy early in the course of Alzheimer's disease (AD), serve as biomarkers in research studies and can assist in the diagnosis of AD, even in its mild cognitive impairment stage [1–3]. The volumes of these structures, as measured on magnetic resonance imaging (MRI) scans, can be calculated by various methods [1], including the following: (1) subjective, labor-intensive, manual delineation; (2) semiautomated segmentation, using operator-initiated landmarks, seed points, or boundary boxes [4,5]; and (3) fully automated segmentation, which is especially advantageous in large-scale studies [6–11]. Among fully automated systems, the most common are template-based methods, for example, individual brain atlases using statistical parametric mapping (IBASPM) [12], which registers target brains to a prelabeled template brain, and probabilistic-based methods (e.g., FreeSurfer [Martinos Center for Biomedical Imaging, Charlestown, MA] [13,14]), which transform the target brain scans into the stereotaxic space of a standard atlas.
The main advantage of IBASPM over FreeSurfer is its
greater general utility for clinical research (it operates on
Windows [Microsoft Corporation, Seattle, WA]-based computers) and the quick throughput (minutes per case). FreeSurfer can take several hours per case on a Windows-based
computer [14], but it has been reported to be more accurate
than IBASPM by some investigators [8,15]. Errors in
segmentation while using IBASPM may arise from
inaccurate image registration. Further errors are caused by
the use of the default Montreal Neurological Institute (MNI)
template, which is derived from a single normal young
subject. Although healthy control subjects can be
distinguished from AD subjects by using IBASPM [16], the
use of such templates may result in inaccuracies when segmenting a structure such as the HP, the volume and shape of
which are particularly distorted by age and disease [17].
The neuroanatomical atlas used in this standard approach
is typically created by manually delineating the region of interest (ROI) in the standard space of a single-subject image.
However, if the subject is not representative of the investigated population, the accuracy of the segmentation may be
reduced [6]. A possible solution to this problem is to use
multiple atlases with a decision fusion approach [11], or a family of templates with automatic selection of the best template
[18]. Although the use of multiple atlases in atlas-based segmentation has shown better performance as compared with
a single atlas [19], the tradeoff is additional computational
cost, as the method involves the registering of several atlases
with each of the to-be-segmented images and the processing
of decision fusion or an atlas selection step. Another possible
solution is to use a probabilistic atlas from a set of images,
thereby reducing computation time as compared with the
multiatlas approach. However, because it still relies on a single registration, average atlas segmentation is less accurate
than registering multiple atlases [11].
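The decision fusion step mentioned above can be illustrated with a minimal sketch. It assumes a simple per-voxel majority vote, which is only one possible fusion rule and is not necessarily the rule used in the cited work. The function name and the flat-list representation of label maps are hypothetical choices made for illustration.

```python
from collections import Counter


def majority_vote_fusion(label_maps):
    """Fuse propagated label maps by per-voxel majority vote.

    label_maps: list of flat lists of integer labels, one list per atlas,
    all already registered to the same target image (assumption).
    """
    if not label_maps:
        raise ValueError("need at least one label map")
    n_voxels = len(label_maps[0])
    if any(len(m) != n_voxels for m in label_maps):
        raise ValueError("label maps must have the same number of voxels")
    fused = []
    for votes in zip(*label_maps):
        # most_common(1) gives the label with the highest count; ties are
        # resolved in favor of the label seen first, i.e. by atlas order.
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


# Three hypothetical atlases voting on four voxels (0 = background, 1 = HP).
print(majority_vote_fusion([[0, 1, 1, 0], [0, 1, 0, 0], [1, 1, 1, 0]]))
```

Each additional atlas adds one registration per target image, which is the computational cost referred to in the text.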
The utility of any volumetric method for the early diagnosis of AD is dependent on the accuracy of segmentation of brain regions on the subjects' brain scans. Therefore, the goal of this endeavor was the accurate segmentation of the HP and AMG on MRI scans for use among elderly subjects with and without degenerative brain diseases, by reducing errors arising from anatomical incompatibilities between the atlases, the templates, and individual subjects' MRI brain scans. The comparative effectiveness of these new atlases and templates in addressing such errors was measured by assessing correspondence of automated and manual segmentation, and by assessing the utility of the resulting HP and AMG volumes for distinguishing between subjects with no cognitive impairment (NCI), those with amnestic mild cognitive impairment (aMCI), and those with probable AD.
2. Methods
2.1. Subjects
A total of 241 subjects from the Wien Center for Alzheimer's Disease and Memory Disorders at Mount Sinai Medical Center (Miami Beach, FL) participated in this study. Subjects were aged ≥65 years, male or female, and had a minimum score of 20 on the Folstein Mini-Mental State Examination (MMSE) [20]. All subjects or a legal representative provided informed consent, and the study was approved by the Mount Sinai Medical Center Institutional Review Board. All subjects had: (1) a neurological and medical evaluation by a physician; (2) a full battery of neuropsychological tests [21], according to the National Alzheimer's Coordinating Center protocol (http://www.alz.washington.edu/), and the following additional tests: the Three-Trial Fuld Object Memory Evaluation [22] and the Hopkins Verbal Learning Test [23]; and (3) a structural, volumetrically acquired MRI scan of the brain. The demographics and clinical characteristics of subjects are presented in Table 1. The sum of boxes from the Clinical Dementia Rating Scale (CDR-sb) was used as the index of functional ability, and the MMSE was used as the index of cognitive ability.
The cognitive diagnosis was made using a combination of the physician's diagnosis and the neuropsychological diagnosis, as described previously [24]. The etiological diagnosis was made by the examining physician. The diagnosis of NCI required that the physician's diagnosis was NCI and that no cognitive test scores were ≥1.5 SD below age- and education-corrected means. The diagnosis of aMCI required a physician's diagnosis of mild cognitive impairment after clinical evaluation and a neuropsychologist's diagnosis of aMCI based on one or more measures scoring at least 1.5 SD below expected levels, thereby satisfying Petersen criteria [25]. A probable AD diagnosis required a dementia syndrome and fulfillment of the National Institute of Neurological and Communicative Disorders and Stroke/Alzheimer's Disease and Related Disorders Association criteria for AD [26].
The subjects' MRI brain scans were obtained on a 1.5-T
MRI machine using proprietary volumetric sequences (3D
Magnetization Prepared Rapid Acquisition Gradient Echo;
Siemens Medical Solutions, Iselin, NJ). Volumetric analysis
of brain MRIs was performed by a modified version of

Table 1
Demographic variables of diagnostic groups (N = 217)

| Variable | NCI (n = 103) | aMCI (n = 68) | AD (n = 46) | F value or χ² | P value |
|---|---|---|---|---|---|
| Age, mean (SD) | 70.8* (5.3) | 76.8† (5.8) | 79.1† (7.1) | 39.75 | <.001 |
| Sex (% female) | 75.7% | 51.4% | 58.7% | 10.8 (χ²) | <.001 |
| MMSE score, mean (SD) | 29.1* (1.0) | 26.2† (2.2) | 22.3‡ (3.3) | 176.85 | <.001 |
| CDR-sb, mean (SD) | 0.60* (0.57) | 2.1† (1.18) | 5.3‡ (2.63) | 110.02 | <.001 |

Abbreviations: NCI, no cognitive impairment; aMCI, amnestic mild cognitive impairment; AD, Alzheimer's disease; MMSE, Mini-Mental State Examination; CDR-sb, Clinical Dementia Rating Scale-sum of boxes.
NOTE. Means with different superscript symbols (*,†,‡) are significantly different at P < .05 by the post hoc Scheffé procedure.

IBASPM [12], using an extension of SPM5 (Wellcome Trust Centre for Neuroimaging, London, United Kingdom; http://www.fil.ion.ucl.ac.uk/spm/) and operating in the MATLAB (MathWorks, Natick, MA; http://www.mathworks.com/) environment.
2.2. Volumetric analysis of brain MRIs using IBASPM with
default atlas S and the MNI template (MNI template set)
Calculation of brain structure volumes was performed as follows: (1) MRIs of individual subjects were segmented into gray matter, white matter, and cerebrospinal fluid; (2) MRI scans were spatially transformed into the MNI stereotactic space, which is standardized by a 3D template image, using affine transformation for approximate registration and nonlinear transformation for fine registration to obtain the transformation parameters (the default stereotactic template in the IBASPM toolbox is the MNI template); (3) each ROI from a predefined anatomical atlas [27] (referred to here as atlas S) was encoded with a unique intensity value and then transformed from MNI space to each individual MRI scan; (4) volume information for each structure was derived by summing the number of voxels assigned the same coding intensity value and multiplying by the voxel size; (5) intracranial volume (ICV) was calculated as the summation of gray matter, white matter, and cerebrospinal fluid volumes, and ICV was used to normalize each brain structure's volume. All operators involved with volumetric analysis were blinded to the subjects' diagnosis and demographic information.
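Steps (4) and (5) reduce to voxel counting and a ratio, which the following minimal sketch illustrates. It is not the IBASPM code: the function names, the flat-list label map, and the label codes 37 and 41 are hypothetical, and tissue volumes are assumed to be already available in mm³. Normalized volumes are returned in thousandths of ICV, the unit used in Table 3.

```python
from collections import Counter


def roi_volumes_mm3(label_voxels, voxel_size_mm):
    """Volume per label: number of voxels with that label times voxel volume.

    label_voxels: iterable of integer labels, one per voxel (0 = background).
    voxel_size_mm: (dx, dy, dz) voxel dimensions in millimeters.
    """
    dx, dy, dz = voxel_size_mm
    voxel_volume = dx * dy * dz
    counts = Counter(label_voxels)
    counts.pop(0, None)  # discard background voxels
    return {label: n * voxel_volume for label, n in counts.items()}


def intracranial_volume_mm3(gm_mm3, wm_mm3, csf_mm3):
    """ICV as the sum of gray matter, white matter and CSF volumes."""
    return gm_mm3 + wm_mm3 + csf_mm3


def normalize_per_mille(volume_mm3, icv_mm3):
    """Express a structure volume in thousandths of ICV."""
    return 1000.0 * volume_mm3 / icv_mm3


# Toy label map with two hypothetical ROI codes on a 1 x 1 x 1 mm grid.
volumes = roi_volumes_mm3([0, 0, 37, 37, 37, 41, 0, 41], (1.0, 1.0, 1.0))
icv = intracranial_volume_mm3(600_000.0, 500_000.0, 400_000.0)
print(volumes, normalize_per_mille(3000.0, icv))
```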
2.3. Volumetric analysis using IBASPM with atlas S and
template S (template S set)
1. Creation of a new stereotactic template (template S):
The high-resolution single-subject image was
processed by convolution with a 3D Gaussian kernel.
The optimization criterion was the normalized cross-correlation between the amplitude spectra of the MNI
template and the processed single-subject image. Their
frequency information was optimally matched when the
SD of the Gaussian kernel was 1.9.
2. Calculation of brain structure volumes using template
S set was performed as for the MNI template set.
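The kernel-width search in step 1 can be sketched as follows. This is an illustration under several assumptions, not the authors' implementation: NumPy is used, the Gaussian is applied as a circular convolution in the Fourier domain, the SD is expressed in voxels (the source does not state the unit), and the SD is chosen by a plain grid search over hypothetical candidate values. All function names are illustrative.

```python
import numpy as np


def gaussian_smooth(volume, sigma):
    """Circular Gaussian smoothing via the FFT (sigma in voxels, assumed)."""
    freqs = np.meshgrid(*[np.fft.fftfreq(n) for n in volume.shape], indexing="ij")
    k2 = sum(f ** 2 for f in freqs)
    # Fourier transform of a Gaussian with standard deviation sigma.
    transfer = np.exp(-2.0 * (np.pi ** 2) * (sigma ** 2) * k2)
    return np.real(np.fft.ifftn(np.fft.fftn(volume) * transfer))


def amplitude_ncc(a, b):
    """Normalized cross-correlation between two amplitude spectra."""
    sa = np.abs(np.fft.fftn(a)).ravel()
    sb = np.abs(np.fft.fftn(b)).ravel()
    sa = sa - sa.mean()
    sb = sb - sb.mean()
    return float(sa @ sb / (np.linalg.norm(sa) * np.linalg.norm(sb)))


def best_sigma(image, template, candidates):
    """Return the candidate SD whose smoothed image best matches the template."""
    scores = [amplitude_ncc(gaussian_smooth(image, s), template) for s in candidates]
    return candidates[int(np.argmax(scores))], scores


# Synthetic check: the "template" is the image smoothed with a known SD.
rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16, 16))
template = gaussian_smooth(image, 1.9)
sigma, _ = best_sigma(image, template, [0.5, 1.0, 1.5, 1.9, 2.5, 3.0])
print(sigma)
```

With real data, the two images would first need to share the same grid and field of view; that resampling step is outside this sketch.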

2.4. Volumetric analysis using IBASPM with a custom
elderly atlas (atlas E) and template E (template E set)
A custom elderly template (template E) was created using the following steps:
1. An MRI scan from a 70-year-old, cognitively normal
woman was normalized to MNI space, using a 12-parameter affine transformation, and resliced to
a voxel size of 1 × 1 × 1 mm³.
2. Manual parcellation of bilateral HP and AMG regions
was performed, following the published protocol described by Tzourio-Mazoyer et al. [27]. Coded intensity values for right and left HP and AMG regions
were used to create a custom anatomical elderly atlas (atlas E)
of these brain regions.
3. The normalized and resliced image was processed,
using the steps described previously for template S, to generate a custom coordinate
template (template E).
4. Calculation of brain structure volumes using template
E set was performed as for the MNI template set.
2.5. Volumetric analysis of brain MRIs using
a modification of FreeSurfer
A modification of the FreeSurfer program (Neuroquant; Cortech Labs, La Jolla, CA), a proprietary direct
voxel-to-voxel mapping procedure, was used to compute
volumes of the HP and AMG among 72 subjects (NCI
= 21, aMCI = 26, AD = 25). This enabled a comparison
of HP and AMG volumes obtained through Neuroquant
and the three atlas/template sets using automated atlas-based segmentation.
2.6. Manual tracing
Manual segmentation by an expert is still regarded as the
gold standard. The boundary definition of the ROIs followed the published
protocol described by Tzourio-Mazoyer et al. [27], which was also used for automated atlas-based labeling. Using publicly available software (MRIcroN [Chris Rorden, http://www.mricro.com]; http://www.nitrc.org/projects/mricron/),
manual tracing of an ROI on a T1-weighted MRI image was
performed on coronal slices that included, from anterior to
posterior, the ROI, as seen in the sagittal plane.


Fig. 1. An 89-year-old male subject's magnetic resonance imaging scan was randomly selected to assess the correspondence of the manually delineated hippocampus and amygdala volumes to automated segmentation of the same regions using the MNI template set, template S set, and template E set. The automated segments were superimposed on the subject's magnetic resonance imaging scan. Hippocampus is in red, and amygdala is in blue. Visually, the automated delineation of the hippocampus and amygdala using template S set and template E set was better than that using the MNI template set.

2.7. Assessment of anatomical registration errors for
AMG and HP ROIs
To assess the correspondence of the gold standard manually delineated HP and AMG volumes to automated segmentation of the same ROI volumes using atlas-based
approaches, a randomly chosen elderly subject was selected
(Fig. 1). The HP and AMG volumes from this 89-year-old
male subject's MRI scan were computed using manual tracing, the MNI template set, template S set, and template E set. The
accuracy of the automated methods was compared with
manual tracing using the following measures:
1. Jaccard similarity index (J) is a widely used spatial overlap agreement measure. Values range between 0 (no overlap) and 1 (perfect agreement) [28];

$$J = \frac{\sum_i |A_i \wedge G_i|}{\sum_i |A_i \vee G_i|}$$

where $A_i$ and $G_i$ are binary values for each voxel $i$ of the automated and manual (gold standard) segmentations, respectively, $\wedge$ stands for the logic AND operation, and $\vee$ stands for the logic OR operation.
2. False-positive error (FP) measures how much of the volume is incorrectly assigned to the volume's label [29];

$$FP = \frac{2 \sum_i |A_i \wedge \tilde{G}_i|}{\sum_i |A_i| + \sum_i |G_i|}$$

where $\tilde{G}_i$ is the logic NOT operation on $G_i$.
3. False-negative error (FN) measures how much of the volume is incorrectly labeled. FN and FP both range between 0 (perfect overlap) and 1 (no overlap) [29];

$$FN = \frac{2 \sum_i |\tilde{A}_i \wedge G_i|}{\sum_i |A_i| + \sum_i |G_i|}$$

4. Volume similarity coefficient (VS) is a conventional measure used for retrospective evaluation of the ratio of the absolute difference between two segmented volumes $V_a$ and $V_g$. $V_a$ is obtained through automated atlas-based labeling, and $V_g$ is the manually segmented volume (gold standard) [5].

$$VS = \frac{|V_a - V_g|}{V_g}$$

VS is equal to 0 if there is no difference between $V_a$ and $V_g$ (perfect overlap).
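The four measures above can be computed directly from two binary masks. The sketch below is illustrative rather than the code used in the study: it assumes both masks are flat sequences of 0/1 values on the same voxel grid, so that the voxel volume cancels in VS, and the function name is hypothetical.

```python
def overlap_metrics(auto_mask, gold_mask):
    """Return J, FP, FN and VS for an automated mask against a manual mask."""
    if len(auto_mask) != len(gold_mask):
        raise ValueError("masks must have the same number of voxels")
    a = [bool(v) for v in auto_mask]
    g = [bool(v) for v in gold_mask]
    both = sum(x and y for x, y in zip(a, g))           # A AND G
    either = sum(x or y for x, y in zip(a, g))          # A OR G
    auto_only = sum(x and not y for x, y in zip(a, g))  # A AND NOT G
    gold_only = sum(y and not x for x, y in zip(a, g))  # NOT A AND G
    n_a, n_g = sum(a), sum(g)
    if n_g == 0 or either == 0:
        raise ValueError("gold-standard mask must not be empty")
    return {
        "J": both / either,
        "FP": 2 * auto_only / (n_a + n_g),
        "FN": 2 * gold_only / (n_a + n_g),
        "VS": abs(n_a - n_g) / n_g,
    }


# Six-voxel toy example: 3 automated voxels, 4 manual voxels, 2 shared.
print(overlap_metrics([1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 1, 0]))
```

Note that VS compares only total volumes, so two masks of equal size that do not overlap at all would still give VS = 0; J, FP and FN are needed to capture spatial agreement.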


2.8. Statistical analysis


Group comparisons of means were analyzed using a series of 1-way analyses of covariance, with age entered
into models as a covariate. After obtaining a statistically
significant F test, the Scheffé post hoc procedure was used
to examine differences between means. Pearson correlation coefficients were used to evaluate the relationship between cognitive (MMSE) or functional (CDR-sb)
measures and normalized HP and AMG values calculated
using each atlas/template set. Comparison of the three
atlas/template sets for distinguishing subjects with NCI
from those with aMCI or AD was conducted by comparison of the area under the receiver operating characteristic
curve (aROC). Comparisons between HP volumes
obtained using Neuroquant and those obtained using the
three template sets were limited to elderly subjects
with NCI.
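Two of the quantities named above, the Pearson correlation coefficient and the aROC, can be sketched in a few lines. The sketch assumes that the aROC is estimated as the proportion of control/patient pairs in which the patient has the smaller normalized volume, with ties counted as one half (the rank-based estimate); the analysis of covariance and the statistical comparison between aROCs are not reproduced here. Function names and the example values are hypothetical.

```python
import math


def auc_lower_is_positive(control_values, patient_values):
    """aROC for separating patients from controls when patients tend to
    have smaller values: fraction of (control, patient) pairs that are
    correctly ordered, counting ties as half."""
    wins = 0.0
    for c in control_values:
        for p in patient_values:
            if p < c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(control_values) * len(patient_values))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical normalized HP volumes (thousandths of ICV).
controls = [3.0, 2.8, 2.5]
patients = [2.6, 2.2, 2.0]
print(auc_lower_is_positive(controls, patients))
```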
3. Results
3.1. Demographic variables
Subjects in the three diagnostic groups differed with
respect to age (P < .001), sex (P < .001), MMSE score
(P < .001), and CDR-sb (P < .001) (Table 1). Post hoc
tests of means by the Scheffé procedure showed that NCI subjects
were younger than both aMCI and AD subjects. Subjects
in the NCI group were predominantly female, in comparison with a more even sex distribution in the other diagnostic
groups. On the MMSE and CDR-sb tests, subjects diagnosed with NCI had better scores than those with aMCI,
whereas those with AD had the worst scores.
3.2. Correspondence of HP and AMG volumes by
automated atlas-based segmentation and manual
segmentation
Statistical comparison of J, FP, FN, and VS values for HP
and AMG volumes across the three atlas/template sets was
not feasible because the data were derived from a single, randomly chosen subject (Table 2). However, it is apparent from
Table 2 and Fig. 1 that, with few exceptions, the largest errors were found using the MNI template set and the smallest
errors were found with template E set; template S set yielded
results that were intermediate between the MNI template
and template E sets.
3.3. Normalized HP and AMG volumes as a percentage of
ICV
Using the MNI template set, the mean normalized HP and
AMG volumes were lowest for probable AD and highest for
NCI subjects, with aMCI subjects being intermediate
(Table 3). Post hoc tests of means using the Scheffé procedure
showed that left and right HP and left AMG volumes were
significantly different among all three diagnostic groups.


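Table 2
Segmentation quality indices for the three template sets

| Template | ROI | J | FP | FN | VS |
|---|---|---|---|---|---|
| MNI template set | Left HP | 0.51 | 0.42 | 0.56 | 0.13 |
| MNI template set | Right HP | 0.35 | 0.54 | 0.75 | 0.19 |
| MNI template set | Left AMG | 0.61 | 0.24 | 0.53 | 0.25 |
| MNI template set | Right AMG | 0.54 | 0.42 | 0.49 | 0.06 |
| Template S set | Left HP | 0.77 | 0.20 | 0.25 | 0.05 |
| Template S set | Right HP | 0.73 | 0.22 | 0.32 | 0.10 |
| Template S set | Left AMG | 0.84 | 0.11 | 0.20 | 0.09 |
| Template S set | Right AMG | 0.81 | 0.13 | 0.25 | 0.11 |
| Template E set | Left HP | 0.80 | 0.10 | 0.19 | 0.02 |
| Template E set | Right HP | 0.73 | 0.27 | 0.27 | 0.01 |
| Template E set | Left AMG | 0.93 | 0.10 | 0.04 | 0.06 |
| Template E set | Right AMG | 0.93 | 0.10 | 0.04 | 0.06 |

Abbreviations: ROI, region of interest; HP, hippocampus; AMG, amygdala; J, Jaccard similarity index; FP, false-positive ratio; FN, false-negative ratio; VS, volume similarity coefficient.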

The right AMG volume showed no differences between
the three diagnostic groups, and the left AMG volume was
different between NCI and AD groups and between aMCI
and AD groups only.
Using template S set and template E set, there were highly
significant differences in volumes among all three diagnostic
groups, with HP and AMG volumes being greatest using
template E set and lowest using the MNI template set. In
comparison, using Neuroquant, HP and AMG volumes
were approximately 10% larger than those obtained using
the MNI template, but equivalent to those using template S
set or template E set.
3.4. Comparison of aROC for distinguishing diagnostic
groups using different template sets
3.4.1. aMCI versus NCI
The aROC was greater for the template S set than the
MNI template set in the right AMG only (Table 4). The
aROC was greater for the template E set than for the MNI
template set in the right and left HP and right AMG. Additionally, the aROC for template E set was greater than for
template S set in the right AMG.
3.4.2. NCI versus AD
The aROCs for template S set were greater than for the
MNI template set in the right HP and the right AMG. The
aROCs for template E set were greater than for the MNI template set for all regions, except the left HP. Additionally, the
aROC for template E set was greater than for template S set
in the left HP and the right AMG.
Correlations of MMSE and CDR-sb scores to HP and
AMG volumes (Table 5) using each template set were compared using bivariate correlation analysis (Pearson r). All
correlations, with the exception of those with the right
AMG, were highly significant (P < .001). With the exception of the correlation between left HP and CDR-sb, all correlations of MMSE and CDR-sb scores to HP and AMG


Table 3
Normalized HP and AMG volumes by diagnostic groups

| Template set | ROI, mean (SD) | NCI (n = 103) | aMCI (n = 68) | AD (n = 46) | F value (df = 2,214) |
|---|---|---|---|---|---|
| MNI template set | HP volume, left | 1.995* (0.26) | 1.753† (0.26) | 1.520‡ (0.33) | 49.9 |
| MNI template set | HP volume, right | 1.834* (0.28) | 1.577† (0.32) | 1.430† (0.39) | 29.4 |
| MNI template set | AMG volume, left | 0.646* (0.06) | 0.606† (0.08) | 0.565‡ (0.1) | 20.1 |
| MNI template set | AMG volume, right | 0.758 (0.08) | 0.729 (0.07) | 0.719 (0.06) | 3.8 |
| Template S set | HP volume, left | 2.602* (0.24) | 2.308† (0.28) | 2.056‡ (0.42) | 57.9 |
| Template S set | HP volume, right | 2.464* (0.29) | 2.146† (0.38) | 1.916‡ (0.45) | 41.6 |
| Template S set | AMG volume, left | 0.795* (0.06) | 0.749† (0.07) | 0.685‡ (0.10) | 38.4 |
| Template S set | AMG volume, right | 0.816* (0.06) | 0.768† (0.07) | 0.746† (0.10) | 19.0 |
| Template E set | HP volume, left | 3.025* (0.29) | 2.621† (0.33) | 2.291‡ (0.47) | 77.1 |
| Template E set | HP volume, right | 2.828* (0.38) | 2.411† (0.37) | 2.153‡ (0.50) | 50.0 |
| Template E set | AMG volume, left | 0.972* (0.07) | 0.917† (0.08) | 0.823‡ (0.11) | 51.8 |
| Template E set | AMG volume, right | 0.996* (0.09) | 0.918† (0.08) | 0.871‡ (0.11) | 32.7 |

| Neuroquant | Brain region, mean (SD) | NCI (n = 21) | aMCI (n = 26) | AD (n = 25) | F value (df = 2,69) |
|---|---|---|---|---|---|
| Neuroquant | HP volume, left | 2.388* (0.26) | 2.139† (0.24) | 1.866‡ (0.34) | 19.2 |
| Neuroquant | HP volume, right | 2.537* (0.36) | 2.163† (0.31) | 1.911† (0.46) | 15.1 |
| Neuroquant | AMG volume, left | 1.077* (0.10) | 0.913† (0.15) | 0.772‡ (0.17) | 23.2 |
| Neuroquant | AMG volume, right | 1.088* (0.19) | 0.872† (0.18) | 0.836† (0.22) | 11.0 |

NOTE. HP and AMG volumes and SD are expressed as 1/1000 of intracranial volume. All F test statistics are significant at P < .001, except for the right AMG volume with the MNI template set (P = .02). Means with different superscript symbols (*,†,‡) are significantly different from each other at P < .05 by the Scheffé procedure.

volumes obtained using template E and S sets were significantly greater than those obtained using the MNI template
set. Correlations between template S and E sets were not different from each other.
4. Discussion
This study demonstrates that replacing the MNI template
set, which consists of a brain atlas (atlas S) and a template (the
MNI template), with template S set, which consists of atlas S
and a template derived from atlas S, improved the accuracy of
anatomical segmentation of the HP and AMG, as reflected in
measures of correspondence between manual and automated
Table 4
aROC values among diagnostic groups using three different template sets

| Comparison groups and region | MNI template set | Template S set | Template E set |
|---|---|---|---|
| aMCI versus NCI: Left HP | 0.75* (0.04) | 0.80* (0.04) | 0.83† (0.03) |
| aMCI versus NCI: Right HP | 0.73* (0.04) | 0.75* (0.04) | 0.80† (0.04) |
| aMCI versus NCI: Left AMG | 0.64* (0.05) | 0.70* (0.04) | 0.70* (0.04) |
| aMCI versus NCI: Right AMG | 0.60* (0.05) | 0.71† (0.04) | 0.75‡ (0.04) |
| AD versus NCI: Left HP | 0.86*,† (0.04) | 0.86* (0.04) | 0.90† (0.03) |
| AD versus NCI: Right HP | 0.80* (0.04) | 0.85† (0.04) | 0.87† (0.03) |
| AD versus NCI: Left AMG | 0.78* (0.05) | 0.85*,† (0.04) | 0.90† (0.03) |
| AD versus NCI: Right AMG | 0.63* (0.06) | 0.75† (0.04) | 0.82‡ (0.04) |

Abbreviation: aROC, area under the receiver operating characteristic curve.
NOTE. HP and AMG volumes and SD are expressed as 1/1000 of intracranial volume. aROC values with different superscript symbols (*,†,‡) are significantly different from each other at P < .05 by the method of DeLong.

segmentation of these structures. A further improvement in
accuracy of segmentation was seen when template S set
was replaced by template E set, which consists of a custom
atlas (atlas E), drawn on an elderly subject's brain, and a template derived from atlas E. In addition to the accuracy of segmentation, the distinction of NCI subjects from aMCI and
probable AD subjects and the correlations of HP and AMG
volumes to MMSE and CDR-sb scores across subject groups
were improved by replacing the MNI template set with template S set, and even more so with template E set.
The most accurate method of obtaining the volumes of
a brain structure on MRI scans has been considered to be
Table 5
Correlation coefficients for CDR-sb and MMSE scores to HP and AMG volumes, using different template sets

| Correlation variables | MNI template set | Template S set | Template E set |
|---|---|---|---|
| CDR-sb: Left HP | −0.484 | −0.519 | −0.545* |
| CDR-sb: Right HP | −0.424 | −0.517** | −0.524** |
| CDR-sb: Left AMG | −0.275 | −0.424*** | −0.434*** |
| CDR-sb: Right AMG | −0.091 | −0.382** | −0.421** |
| MMSE: Left HP | 0.390 | 0.455* | 0.462*** |
| MMSE: Right HP | 0.308 | 0.376* | 0.388*** |
| MMSE: Left AMG | 0.319 | 0.414* | 0.437*** |
| MMSE: Right AMG | 0.126 | 0.322** | 0.384** |

NOTE. Correlation coefficients for template S and template E sets are significantly different from those for the MNI template at the following significance levels:
*P < .05.
**P < .001.
***P < .01.


through the use of manual segmentation [24]. Mean
volumes of the HP, obtained using template E set (3.03 for
the left HP), underestimated, by approximately 9%, results
obtained using manual segmentation-based approaches for
HP volumes in elderly cognitively normal subjects [30]
(3.34 for the left HP). Mean volumes of the HP reported using the Neuroquant FreeSurfer-based program (2.39 for the
left HP) and those using IBASPM with the MNI template
(1.99 for the left HP) underestimated by approximately
28% and 40%, respectively, the volumes of the left HP obtained by manual segmentation. However, HP volumes obtained using the MNI template set are equivalent to those
obtained using a variety of automated or semiautomated
methods that have been reported in the literature [31,32].
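The percentages quoted in this paragraph follow from the left HP means given in the text, all expressed in thousandths of ICV. A short check, with a hypothetical helper name:

```python
def percent_underestimate(automated, manual):
    """Relative shortfall of an automated volume versus the manual volume."""
    return 100.0 * (manual - automated) / manual


manual_left_hp = 3.34  # manual segmentation value cited from the literature [30]
for method, value in [
    ("template E set", 3.03),
    ("Neuroquant", 2.39),
    ("MNI template set", 1.99),
]:
    print(f"{method}: {percent_underestimate(value, manual_left_hp):.1f}%")
```

The three results round to 9%, 28%, and 40%, matching the figures in the text.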
Using automated atlas-based segmentation, the accuracy
of HP and AMG segmentation is dependent on the following: (1) correspondence of the ROIs drawn on the atlas
with the same ROIs on the spatial normalization template;
(2) optimization of the frequency information of the atlas
image to that of the template to be used; (3) correspondence
of the anatomy of the atlas brain and the spatial normalization template to the to-be-segmented brain MRI scan. Using
manual segmentation to validate the automated segmentation obtained through use of three different template sets,
it became evident that the correspondence of automated segmentation using a template-based approach to manual segmentation in elderly normal or cognitively impaired
subjects is most accurate when the template set is based on
an atlas drawn on an aged subject and least accurate using
a template set that is based on a young male subject's
MRI scan. Although the reasons for the high accuracy of
template E set for obtaining HP volumes are self-evident,
it is not generally recognized that the template used for
obtaining regional brain volumes should be customized so
that it approximates the anatomy of the target brain(s) to
be segmented.
Most research studies involving volumetric analysis continue to rely on the MNI template set, without regard to the
anatomy of the to-be-segmented MRI scan. It is likely that
the assumption made in developing the MNI template was
that averaging MRI brain scans from >150 individuals
would provide sufficient accuracy for automated segmentation, regardless of the anatomical atlas used and the anatomical characteristics of the to-be-segmented brain MRI scan.
However, the results of this study suggest that this assumption may not be correct and that a customized atlas with corresponding template, taking into consideration the age of the
subject from whom the atlas and template are derived, may
be particularly important for providing accurate automated
segmentation. Another factor that may need to be taken
into account for customizing atlases and templates is the
manner in which disease states affect the anatomy of the
brain. This study demonstrated that the use of an elderly subject's brain for creating the atlas and template for automated
atlas-based segmentation not only improved the accuracy of
measurement of the volume of the HP, but also improved the


clinical utility of this measurement for distinguishing NCI
subjects from aMCI and AD subjects.
Important and relatively unique aspects of this study
were the comparisons of the performance of different template sets and resulting HP and AMG volumes for distinguishing between NCI and aMCI or AD groups and
comparisons of correlations between scores of cognition
(MMSE) or functional ability (CDR-sb) and HP and AMG
volumes. Atrophy of the entorhinal cortex, HP, and AMG
is an early and reliable indicator of the development of neurofibrillary pathology in the brain, characteristic of the neurodegenerative stage of AD. Increasing severity of atrophy
of the HP and AMG has been shown in this study and
many previous studies to be associated with the severity
of cognitive and functional impairment in patients with
aMCI and AD. It is highly likely that the measurement of
HP and AMG volumes, obtained through the use of template
E set, resulted in better distinctions between diagnostic
groups than those obtained using template S set and the
MNI template set because of greater accuracy in measurement of HP and AMG volumes.
5. Conclusions
In summary, this study has shown that fully automated
atlas-based segmentation using a template-based approach
can be customized so as to make the measurement of regional volumes more accurate for certain target groups of
subjects, such as those who are elderly and with degenerative
diseases. A potential limitation of this study is the lack of
a direct comparison of automated segmentation, using different atlases and templates, to manual segmentation done
in all the NCI, aMCI, and AD subjects. However, the results
of this study show that convenient, template-based approaches to performing MRI volumetry can be optimized
for research and clinical purposes by customizing the anatomical atlases and corresponding spatial coordinate templates, so as to account for the age and disease state of the
subject to be assessed.
References
[1] Barnes J, Whitwell JL, Frost C, Josephs KA, Rossor M, Fox NC. Measurements of the amygdala and hippocampus in pathologically confirmed Alzheimer disease and frontotemporal lobar degeneration. Arch Neurol 2006;63:1434–9.
[2] den Heijer T, Geerlings MI, Hoebeek FE, Hofman A, Koudstaal PJ, Breteler MM. Use of hippocampal and amygdalar volumes on magnetic resonance imaging to predict dementia in cognitively intact elderly people. Arch Gen Psychiatry 2006;63:57–62.
[3] Basso M, Yang J, Warren L, MacAvoy MG, Varma P, Bronen RA, van Dyck CH. Volumetry of amygdala and hippocampus and memory performance in Alzheimer's disease. Psychiatry Res 2006;146:251–61.
[4] Chupin M, Mukuna-Bantumbakulu AR, Hasboun D, Bardinet E, Baillet S, Kinkingnéhun S, et al. Anatomically constrained region deformation for the automated segmentation of the hippocampus and the amygdala: method and validation on controls and patients with Alzheimer's disease. Neuroimage 2007;34:996–1019.

406

Q. Shen et al. / Alzheimers & Dementia 8 (2012) 399406

[5] Colliot O, Ch!etelat G, Chupin M, Desgranges B, Magnin B, Benali H,


et al. Discrimination between Alzheimer disease, mild cognitive impairment, and normal aging by using automated segmentation of the
hippocampus. Radiology 2008;248:194201.
[6] Carmichael OT, Aizenstein HA, Davis SW, Becker JT, Thompson PM,
Meltzer CC, Liu Y. Atlas-based hippocampus segmentation in Alzheimers disease and mild cognitive impairment. Neuroimage 2005;
27:97990.
[7] Morey RA, Petty CM, Xu Y, Hayes JP, Wagner HR 2nd, Lewis DV,
et al. A comparison of automated segmentation and manual tracing
for quantifying hippocampal and amygdala volumes. Neuroimage
2009;45:85566.
[8] Tae WS, Kim SS, Lee KU, Nam EC, Kim KW. Validation of hippocampal volumes measured using a manual method and two automated
methods (FreeSurfer and IBASPM) in chronic major depressive disorder. Neuroradiology 2008;50:56981.
[9] van der Lijn F, den Heijer T, Breteler MM, Niessen WJ. Hippocampus
segmentation in MR images using atlas registration, voxel classification, and graph cuts. Neuroimage 2008;43:70820.
[10] Rodionov R, Chupin M, Williams E, Hammers A, Kesavadas C,
Lemieux L. Evaluation of atlas-based segmentation of hippocampi
in healthy humans. Magn Reson Imaging 2009;27:11049.
[11] Heckemann RA, Hajnal JV, Aljabar P, Rueckert D, Hammers A. Automatic anatomical brain MRI segmentation combining label propagation and decision fusion. Neuroimage 2006;33:11526.
[12] Alem!an-G!
omez Y, Melie-Garc!a L, Vald!es-Hernandez P. IBASPM:
toolbox for automatic parcellation of brain structures [CD-ROM]. Presented at: 12th Annual Meeting of the Organization for Human Brain
Mapping; June 1115, 2006; Florence, Italy. Neuroimage 27 (No.1).
[13] Khan AR, Wang L, Beg MF. FreeSurfer-initiated fully-automated subcortical brain segmentation in MRI using large deformation diffeomorphic metric mapping. Neuroimage 2008;41:73546.
[14] Fischl B, Salat D, Busa E, Albert M, Dieterich M, Haselgrove C, et al.
Whole brain segmentation. Automated labeling of neuroanatomical
structures in the human brain. Neuron 2002;33:34155.
[15] Seixas FL, Saade DCM, Conci A, de Souza AS, Tovar-Moll F, Bramatti I. Anatomical brain MRI segmentation methods: volumetric assessment of the hippocampus. In: IWSSIP 201017th International
Conference on Systems, Signals and Image Processing. January 1719, 2010; Rio de Janiero, Brazil: EdUFF; 2010. pp. 247250.
[16] Hayashi T, Wada A, Uchida N, Kitagaki H. Enlargement of the hippocampal angle: a new index of Alzheimer disease. Magn Reson Med Sci
2009;8:338.
[17] Garc!a-V!azquez V, Reig S, Janssen J, Pascau J, Rodriguez-Ruano A,
Udias A, Chamorro J, Vaquero JJ, Desco M. Use of IBASPM atlasbased automatic segmentation toolbox in pathological brains: effect
of template selection. In: Nuclear Science Symposium Conference Record, NSS 08. Dresden, Germany: IEEE; 2008. p. 42702.
[18] Wu M, Rosano C, Lopez-Garcia P, Carter CS, Aizenstein HJ. Optimum template selection for atlas-based segmentation. Neuroimage
2007;34:16128.
[19] Barnes J, Foster J, Boyes RG, Pepple T, Moore EK, Schott JM, Frost C,
Scahill RI, Fox NC. A comparison of methods for the automated cal-

[20]

[21]

[22]
[23]

[24]

[25]

[26]

[27]

[28]

[29]

[30]

[31]

[32]

culation of volumes and atrophy rates in the hippocampus. Neuroimage 2008;40:165571.


Folstein M, Folstein S, McHugh P. Mini-Mental State: a practical
method for grading the cognitive state of patients for the physician.
J Psychiatr Res 1975;12:18998.
Loewenstein DA, Barker WW, Harwood DG, Luis C, Acevedo A,
Rodriguez I, Duara R. Utility of a modified mini-mental state examination with extended delayed recall in screening for mild cognitive impairment and dementia among community dwelling elders. Int J
Geriatr Psychiatry 2000;15:43440.
Fuld PA. Fuld object-memory evaluation. Wood Dale, IL: Stoelting
Co.; 1981.
Lacritz LH, Cullum CM, Weiner M, Rosenberg RN. Comparison of
the Hopkins Verbal Learning Test Revised to the California Verbal
Learning Test in Alzheimers disease. Appl Neuropsychol 2001;
8:1804.
Duara R, Loewenstein DA, Greig M, Acevedo A, Potter E, Appel J,
et al. Reliability and validity of an algorithm for the diagnosis of normal cognition, mild cognitive impairment, and dementia: implications
for multicenter research studies. Am J Geriatr Psychiatry 2010;
18:36370.
Petersen RC, Smith GE, Waring SC, Ivnik RJ, Tangalos EG,
Kokmen E. Mild cognitive impairment: clinical characterization and
outcome. Arch Neurol 1999;56:3038.
McKhann G, Drachman DA, Folstein MF, Katzman R, Price DL,
Stadlan E. Clinical diagnosis of Alzheimers disease: report of the
NINCDS-ADRDA Work Group under the auspices of the Department
of Health and Human Services Task Force on Alzheimers Disease.
Neurology 1984;34:93944.
Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F,
Etard O, Delcroix N, et al. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the
MNI MRI single-subject brain. Neuroimage 2002;15:27389.
!
Jaccard P. Etude
comparative de la distribution florale dans une portion
des Alpes et des Jura. Bulletin de la Soci!et!e Vaudoise des Sciences Naturelles 1901;37:54779.
Klein A, Andersson J, Ardekani BA, Ashburner J, Avants B,
Chiang MC, et al. Evaluation of 14 nonlinear deformation algorithms
applied to human brain MRI registration. Neuroimage 2009;
46:786802.
S!anchez-Benavides G, G!omez-Ans!on B, Sainz A, Vives Y, Delfino M,
Pe~na-Casanova J. Manual validation of FreeSurfers automated hippocampal segmentation in normal aging, mild cognitive impairment, and
Alzheimer disease subjects. Psychiatry Res 2010;181:21925.
Teipel SJ, Pruessner JC, Faltraco F, Born C, Rocha-Unold M, Evans A,
Moller HJ, Hampel H. Comprehensive dissection of the medial temporal lobe in AD: measurement of hippocampus, amygdala, entorhinal,
perirhinal and parahippocampal cortices using MRI. J Neurol 2006;
253:794800.
Mori E, Yoneda Y, Yamashita H, Hirono N, Ikeda M, Yamadori A. Medial temporal structures relate to memory impairment in Alzheimers
disease: an MRI volumetric study. J Neurol Neurosurg Psychiatry
1997;63:21421.