Magnetic Resonance Imaging (MRI) is favoured over other available methods for the identification of lesions because it can be used in a wide variety of examinations, is non-invasive and does not use ionizing radiation (Stamatakis & Tyler, 2005). It does, however, present the medical professional with two challenges: providing a consistent and reliable method of identifying lesion areas that is repeatable across different operators and, in some medical conditions, providing a fast and accurate diagnosis of the affected areas to determine an appropriate course of treatment. This paper will review existing research surrounding this problem. It will not focus specifically on any lesion type or application to any particular disease but will look at the problem in broad terms. Three main areas will be covered: section 1 will discuss manual segmentation techniques and the issues surrounding this approach, section 2 will discuss the development of automatic segmentation techniques and section 3 will focus on new techniques and approaches which show promising results and may be the focus of future research.
Magnetic Resonance Imaging was first discovered in the 1950s and used initially in the field of spectroscopy. It was not until the 1970s that work undertaken by Lauterbur expanded the use of Magnetic Resonance Imaging into medical applications, which then enabled examinations of the human body in vivo (Liney, 2005). The technique produces the MR image by detecting the presence of hydrogen nuclei (protons) within the body. The MRI machine subjects these protons to a large magnetic field which partially polarizes their nuclear spins. The spins are then excited using tuned radio frequency radiation. Radio frequency radiation is then detected from them as they relax from this magnetic interaction. The frequency of the signal from the proton is proportional to the magnetic field applied during the radiation process. Using these signals, a map of the body area scanned is produced which forms the magnetic resonance image (Nave).
Figure 1 Example of a normal brain MRI image (McGill University, 2006)
While the use of MRI scans has provided good insight into the pathology of the human body, it presents some areas of concern for the identification of lesions. Broadly speaking there are two main categories of lesions that are of interest to medical professionals: white matter lesions (WML), resulting in blood-brain barrier damage (Calabrese, et al., 2008), and gray matter lesions (GML), resulting in demyelination of nerve fibres (Calabrese, et al., 2008). These lesions point towards a number of different medical conditions. Their identification is often paramount in determining treatment for the patient and critical in monitoring the effects of drug therapy in clinical trials (Van Leemput, Maes, Bello, Vandermeulen, Colchester, & Suetens, 2000).
The process of segmenting the MRI scan of patients with WML is difficult because the characteristics of WML are similar to those of gray matter. Techniques such as intensity based statistical classification potentially may classify some WML as gray matter and some gray matter as WML. (Warfield, et al., 1995) To further highlight the subtleties involved in the process required to segment lesions from an MRI scan, take the images displayed in Figure 1 and Figure 2 as an example. These are simulated images generated from the online BrainWeb resource (McGill University, 2006). Both images show a T1 MRI scan taken in 5 mm slices (slice 21 displayed).
Figure 2 Example of a brain MRI image showing MS lesions. (McGill University, 2006)
The differences between the two images are very subtle, and identifying the lesion would be a difficult task given the potentially large number of images within a standard MRI scan. Additionally, an operator examining a large number of images in a given work day may eventually start to misidentify some of the less apparent lesions. The varying type and quality of the images under review add a further level of complexity. Because MRI techniques were well established prior to any concept of an automated method for analysis, the first techniques developed to assess MRI scans were manual. These consisted of trained operators following a predefined measurement scale, as will be discussed in more detail later in this paper. With advancements in computer assisted analysis and its application to the medical profession, a number of techniques have been developed which automate the work of the trained operators. These fall into two main categories: fully automated approaches and hybrid approaches which still require some involvement from a trained operator. A large amount of the literature covered throughout the course of this review addressed lesion segmentation specifically in relation to Multiple Sclerosis, although other lesion-causing diseases such as Alzheimer's and stroke have also been covered. It should also be noted
that in the research covered, the specificity of the application to Multiple Sclerosis by no means invalidates the application of the lesion segmentation techniques to other diseases. It is also notable, and possibly a consequence of the observation above, that the majority of the lesion segmentation techniques focus on the segmentation of WML, with potential applicability to the segmentation of GML. The initial belief that Multiple Sclerosis is primarily a disease of the white matter (Kutzelnigg & Lassmann, 2005) may have resulted in a disproportionate focus on the segmentation of WML over GML. This disproportion has received some attention within the last ten years, with studies concluding that Multiple Sclerosis also causes lesions within gray matter structures (Kidd, Barkhof, McConnell, Algra, Allen, & Revesz, 1999). Demyelination has also been noted prominently in the gray matter of deep cerebral nuclei and the cerebral cortex (Kutzelnigg & Lassmann, 2005). Where possible the approach taken in this review has been to look at the problem of lesion segmentation divorced from any specific disease or specific lesion type. My observations throughout the course of this review have primarily revealed that the issue of segmentation exists across most applications of MRI technology. That said, the source material used within this review focuses narrowly on specific applications. It is conceivable that future developments within the field of MRI technology may address some of these issues by producing images that more clearly identify the areas of interest. Until that stage, however, lesion segmentation will remain a necessary area of research and development.
2. Manual Segmentation
The concept behind manual segmentation is fairly simple; provide a rating system, usually numeric, and an accurate description that enables a similar result across disparate operators and applications. Over time and in the absence of any automated quantitative methodology to assess MR images, operator observation techniques developed.
2.2 White Matter Lesions on CT and MRI
One such technique (van Swieten, Hijdra, Koudstaal, & van Gijn, 1990), focused on white matter lesions within CT and MRI scans, identifies, in addition to the proposed scale, three key observations that could be applied effectively to any manual rating system. They are:
1. The scale used should incorporate anatomical distribution and severity and provide clear definitions for each of the different categories. Any scale used should be applicable to a given anatomical area and provide a measure of the severity of the lesion being examined.
2. Simple. Given this is a quantitative measure involving operator observation, a granular approach to rating would increase the likelihood of variation between operators. As such, the scale needs to remain relatively simple, with clearly defined categories that most reasonably trained operators can readily identify against.
3. The scale should be assessed for reliability against an inter-observer study. The key mechanism involved within this type of approach is a human element. As a result, a number of factors can potentially be involved which may bias the result. Aspects such as operator training, timeframe and equipment/image quality may all play a part in producing different results across different operators. While the human element cannot be removed entirely, it can be mitigated by studies that provide a statistical measure of the accuracy of observations made against this scale.
While the paper specifically covered the application of this scale to white matter lesions within CT or MR images, the observations above and the principles outlined in the scale could be readily applied, with only minor modification and tuning, to most lesion grading requirements. This system identified three severity categories and associated definitions:

Grade   Description
0       No lesion or only a single one
1       Multiple focal lesions
2       Multiple confluent lesions scattered throughout the white matter
Table 1 Three grade rating system, (van Swieten, Hijdra, Koudstaal, & van Gijn, 1990)
During the study conducted for this paper, examinations were undertaken on both CT and MRI scans. For the MRI scans, twenty-four images were obtained from a study of elderly hypertensive patients. The results from the MRI portion of the study were calculated using kappa statistics, giving a weighted kappa of 0.78. While this would seem a reasonable outcome, the conclusions drawn within this paper raise two main questions:
1. Is the sample size of 24 sufficient to draw these conclusions?
2. The only measure of success of this methodology is a measure generated using kappa statistics. The utility of this measure for this type of analysis is seen as controversial, with opinions differing as to its applicability (Uebersax, 2002).
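To make the agreement measure above concrete, the following is a minimal sketch of a weighted kappa calculation for two raters using a three-grade scale. It is not taken from the study: the function name `weighted_kappa`, the choice of linear disagreement weights and the example ratings are all assumptions made for illustration.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_grades, power=1):
    """Cohen's weighted kappa for two raters using grades 0..n_grades-1.

    power=1 gives linear disagreement weights, power=2 quadratic weights.
    (Both weighting schemes are assumptions; the study's choice is not stated here.)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same, non-empty set of scans")
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint grade counts
    marginal_a = Counter(rater_a)              # grade counts for rater A
    marginal_b = Counter(rater_b)              # grade counts for rater B
    max_distance = (n_grades - 1) ** power

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_grades):
        for j in range(n_grades):
            weight = abs(i - j) ** power / max_distance
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * (marginal_a[i] / n) * (marginal_b[j] / n)

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single identical grade")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical grades (0, 1 or 2) given by two raters to the same six scans.
grades_a = [0, 0, 1, 1, 2, 2]
grades_b = [0, 1, 1, 1, 2, 2]
print(round(weighted_kappa(grades_a, grades_b, n_grades=3), 3))  # 0.8
```

With only six scans, a single disagreement moves the value noticeably, which illustrates why the sample size question raised above matters.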
2.3 ARWMC Scale
Another manual segmentation technique (Wahlund, et al., 2001) takes a very similar approach to that identified above. This technique, the ARWMC (Age Related White Matter Change) scale uses two four point scales divided across two different regions of the brain. As you can see from the scales identified in Tables 2 and 3, the three key observations identified above are present within this scale; anatomical and severity measurements have been identified, the scale is simple and (as the study indicates) provides good inter-rater reliability.
Grade   Description
0       No lesions (including symmetrical, well-defined caps or bands)
1       Focal lesions
2       Beginning confluence of lesions
3       Diffuse involvement of the entire region, with or without involvement of U fibres

Table 2 White Matter Lesion Scale from ARWMC scale
Grade   Description
0       No lesions
1       1 focal lesion (≥5 mm)
2       > 1 focal lesion
3       Confluent lesions

Table 3 Basal Ganglia Lesion Scale from ARWMC scale
The observations of this study were conducted across both MRI and CT images. The results of this study indicated good inter-rater reliability of each of the scans. It should be noted that similar statistical measures were used to reach this conclusion and therefore the same issues as identified by (Uebersax, 2002) could potentially apply.
Manual segmentation was essentially born from necessity. MRI and other scanning technologies provided insight into areas of the human body where in vivo examination had never previously been possible. While techniques were developed to apply this type of methodology in a consistent and scientific manner, some shortfalls could never realistically be addressed adequately. These issues include:
1. Generally a high level of expertise will be required. In essence, this process involves only two elements: the rater and the images. There is little additional assistance provided to complete this task.
2. The process is time and labour consuming. Each image needs to be carefully examined in great detail. With this requirement and the large number of images involved in a given MRI scan, this is a large amount of work to complete.
3. The process is subjective and therefore not fully reproducible. While manual segmentation methods have statistically proven to be more or less reliable, the subjective nature of assessment cannot be eliminated entirely (Stokking, Vincken, & Viergever, 2000).
While the future direction of lesion segmentation rests with better and more efficient automated processes, manual segmentation processes still have a place as viable tools to validate new methodologies. A number of studies, such as those covered in later sections of this review (Anbeek, Vincken, van Osch, Bisschops, & van der Grond, 2004), outline steps taken to perform manual segmentation as part of the validation of the proposed automated techniques. This highlights the need to maintain expertise within this area of study.
3. Automatic Segmentation
The fundamental flaw in the manual segmentation approach is the inconsistency of the human operator. A number of factors need to be taken into account which may result in errors during an assessment. These include:
Training level: each operator may be at a different level of experience and expertise.
Time constraints: a manual segmentation approach takes time, and with a large number of MRI scans to assess an operator may not have sufficient time to make an adequate identification.
Large lesions: if a lesion is large enough to be spread over a number of different image slices, the full extent of the lesion may not be accurately assessed.
To this end, studies have been devoted to producing automated methods for the segmentation of MRI scans. Automatic procedures remove a number of human related issues and produce a more consistent result across any number of operators. A number of different methodologies have been developed to achieve this. While each takes a unique approach, a number of common elements are generally present within each:
Uniformity correction: a method used to correct for any inhomogeneities that are present within the scan.
Patient movement: correction for any inconsistencies introduced due to the movement of the patient during the scan.
Isolation of brain tissue: minimizing the size of the problem by ensuring that only the required areas of the scan are examined and not areas of non-interest such as cerebrospinal fluid (CSF) or skull.
Additionally, two main approaches can be identified across the different techniques: fully automated segmentation, a process able to be performed by an operator untrained in image segmentation and analysis; and partially automated segmentation, a process still requiring some image segmentation and analysis decision making by a skilled operator.
3.2 k-Nearest Neighbour Technique
3.2.1 Introduction
A methodology used in a number of different lesion segmentation techniques is k-Nearest Neighbour classification. Applied to the problem of lesion segmentation, this algorithm determines the classification of a given voxel based upon the classification of its neighbouring voxels and a predefined 'learning' set of voxels provided to the system prior to a segmentation attempt (Statsoft Inc., 1984-2008). The k-Nearest Neighbour (k-NN) algorithm is an example of automated machine-based learning where a given object is labelled based upon the frequency of that label among its neighbours (Columbia University, 2007); (van den Bosch, 2009). During the course of this review, I found three different approaches which make use of this classification methodology.
3.2.2 Probability Maps
In the application demonstrated here (Anbeek, Vincken, van Osch, Bisschops, & van der Grond, 2004), the learning element is based upon a feature space. This implementation of the algorithm makes use of five different types of MRI: T1-weighted (T1-w), Inversion Recovery (IR), Proton Density-Weighted (PD), T2-Weighted (T2-w) and Fluid Attenuation Inversion Recovery (FLAIR). The implementation of the k-NN algorithm for this study determines a feature space based upon voxel intensity features and spatial information. The result of this method is an image (probability map) representing, on a per voxel basis, the probability of a given voxel being part of a WML (Anbeek, Vincken, van Osch, Bisschops, & van der Grond, 2004). These probability maps were then evaluated using two methodologies: binary segmentation and direct probability evaluation. For the binary segmentation evaluation, varying thresholds were applied to the probability map to create different segmentations of the WMLs.
From this, a ROC curve analysis was performed, plotting the True Positive Fraction (TPF) as a function of the False Positive Fraction (FPF) (Anbeek, Vincken, van Osch, Bisschops, & van der Grond, 2004).
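The probability map and its threshold-based evaluation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the probability for a voxel is the fraction of its k nearest training voxels labelled as lesion, uses Euclidean distance in feature space, and the function names, feature values and labels are all hypothetical.

```python
import math


def knn_lesion_probability(features, train_features, train_labels, k):
    """Fraction of the k nearest training voxels labelled as lesion (1).

    Distance is Euclidean in feature space (an assumption for this sketch).
    """
    by_distance = sorted(
        (math.dist(features, train_point), label)
        for train_point, label in zip(train_features, train_labels)
    )
    return sum(label for _, label in by_distance[:k]) / k


def roc_points(probabilities, reference, thresholds):
    """Return one (FPF, TPF) pair per threshold.

    A voxel is segmented as lesion when its probability is >= the threshold;
    `reference` holds the manual labels (1 = lesion, 0 = not lesion).
    """
    positives = sum(reference)
    negatives = len(reference) - positives
    points = []
    for threshold in thresholds:
        tp = sum(1 for p, r in zip(probabilities, reference) if p >= threshold and r == 1)
        fp = sum(1 for p, r in zip(probabilities, reference) if p >= threshold and r == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical two-feature training voxels and their labels.
train = [(0.1, 0.1), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
labels = [0, 0, 1, 1, 1]
print(knn_lesion_probability((0.8, 0.8), train, labels, k=3))  # 1.0

# Hypothetical probability map (flattened) and reference segmentation.
print(roc_points([0.9, 0.6, 0.4, 0.1], [1, 1, 0, 0], thresholds=[0.3, 0.5]))
# [(0.5, 1.0), (0.0, 1.0)]
```

Lowering the threshold moves along the ROC curve towards higher TPF at the cost of higher FPF, which is why a range of thresholds is evaluated rather than a single one.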
In addition to this, each binary segmentation was evaluated using three different similarity measures: Similarity Index (SI), a measure for the correctly classified lesion area; Overlap Fraction (OF), a measure of the correctly classified lesion area relative to only the reference WML area; and Extra Fraction (EF), a measure of the area falsely classified as lesion relative to the reference WML area (Anbeek, Vincken, van Osch, Bisschops, & van der Grond, 2004). These measures were defined by

\[ SI = \frac{2 \times TP}{2 \times TP + FP + FN}, \qquad OF = \frac{TP}{TP + FN}, \qquad EF = \frac{FP}{TP + FN} \]

(Anbeek, Vincken, van Osch, Bisschops, & van der Grond, 2004)
For the probabilistic evaluation, each result was analysed using probabilistic versions of the similarity measures. These measures, the probabilistic similarity index (PSI), probabilistic overlap fraction (POF) and probabilistic extra fraction (PEF), are defined by:

\[ PSI = \frac{2 \times \sum P_{x,gs=1}}{\sum 1_{x,gs=1} + \sum P_{x}}, \qquad POF = \frac{\sum P_{x,gs=1}}{\sum 1_{x,gs=1}}, \qquad PEF = \frac{\sum P_{x,gs=0}}{\sum 1_{x,gs=1}} \]

(Anbeek, Vincken, van Osch, Bisschops, & van der Grond, 2004)
Here the sums run over voxels x, P_x is the probability assigned to voxel x, and gs=1 and gs=0 denote voxels labelled as lesion and non-lesion respectively in the reference (gold standard) segmentation.
The preparation process applied to each of these images included three steps.
Step 1 - Inhomogeneity correction: this step applied a process in which the intensity histogram of each image is transformed into a "standard" histogram (Nyul & Udupa, 1999). This is done in two stages. Stage 1 is the training stage, where the parameters of the standardizing transformation are learned from a set of images. This stage identifies specific landmarks of a standard histogram, estimated from a given set of volume images (Nyul & Udupa, 1999). Stage 2 is the transformation stage. The image intensity scale is computed by mapping the landmarks determined from the image histogram to those of the standard histogram (Nyul & Udupa, 1999).
Step 2 - Correction for differences due to patient movement: all patient images were registered by rigid registration (translation and rotation) (Anbeek, Vincken, van Osch, Bisschops, & van der Grond, 2004).
Step 3 - Reduction of the amount of data to be investigated: this step uses a technique called MBRASE (Morphology-based Brain Segmentation). This is a segmentation process that uses a region-growing technique. A seed pixel is selected in the given image and neighbouring pixels are added progressively if they meet set criteria, such as falling within a particular intensity range (Stokking, Vincken, & Viergever, 2000).
3.2.3 Results
The end result of this process is a probability map which gives, for each voxel, the probability of it being part of a lesion. It also provides spatial and volumetric information about the identified WML. This classification method resulted in a high degree of accuracy for a range of different lesion sizes. While focussing on WML for this study, the authors acknowledge the possible utility of the method in the identification of other lesion types. A clear disadvantage of this method is the need for multiple types of MRI to be conducted. It has been observed in the other papers reviewed that a general objective is to standardise the approach to lesion segmentation and use MRIs that would already have been taken for other diagnostic reasons, rather than requiring images to be taken for this specific purpose. However, this needs to be weighed against the relative success of this technique in comparison to others and the requirements of the situation.
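The similarity measures SI, OF and EF used in section 3.2.2, and their probabilistic counterparts PSI, POF and PEF, can be sketched as below. This is an illustrative reading rather than the authors' code: it assumes the probabilistic measures sum voxel probabilities over the reference lesion and non-lesion voxels, and the function names and example values are hypothetical.

```python
def similarity_measures(tp, fp, fn):
    """Return (SI, OF, EF) from true positive, false positive and false negative counts."""
    reference_lesion = tp + fn            # size of the reference WML area
    si = 2 * tp / (2 * tp + fp + fn)      # Similarity Index
    of = tp / reference_lesion            # Overlap Fraction
    ef = fp / reference_lesion            # Extra Fraction
    return si, of, ef


def probabilistic_measures(probabilities, reference):
    """Return (PSI, POF, PEF) for a flattened probability map.

    `reference` holds the gold standard labels (1 = lesion, 0 = not lesion).
    Assumed reading: probabilities replace the binary counts in SI, OF and EF.
    """
    p_in_lesion = sum(p for p, r in zip(probabilities, reference) if r == 1)
    p_outside_lesion = sum(p for p, r in zip(probabilities, reference) if r == 0)
    n_lesion = sum(1 for r in reference if r == 1)
    psi = 2 * p_in_lesion / (n_lesion + sum(probabilities))
    pof = p_in_lesion / n_lesion
    pef = p_outside_lesion / n_lesion
    return psi, pof, pef


# Hypothetical counts from one thresholded segmentation.
print(similarity_measures(tp=8, fp=2, fn=2))  # (0.8, 0.8, 0.2)

# Hypothetical four-voxel probability map and reference labels.
print(probabilistic_measures([1.0, 0.5, 0.5, 0.0], [1, 1, 0, 0]))  # (0.75, 0.75, 0.25)
```

When every probability is exactly 0 or 1, the probabilistic measures reduce to SI, OF and EF, which is a quick consistency check on any implementation.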
3.2.4 Brain Atlas Method
Another study utilizing the k-Nearest Neighbour technique (de Boer, et al., 2009) takes an approach based on the registration of brain atlases. This method is a two-stage approach using T1-weighted and FLAIR MRI. The study identifies a fully automated methodology for the segmentation of CSF, gray matter, white matter and WML. The technique outlines:
The use of atlas registration to automatically train a k-nearest neighbour classifier
Automatic WML segmentation
Twelve brain atlases were acquired for this study. These atlases were sourced from the Rotterdam Scan Study, a large population-based imaging study conducted between 1995 and 1996 consisting of approximately 1700 subjects who underwent MRI scans which were then manually segmented (de Leeuw, et al., 2001). Test data were then acquired from the Rotterdam Scan Study conducted in 2005-2006, which involved 215 subjects. The segmentation process consisted of two main stages: brain tissue segmentation, identifying gray matter, white matter and CSF; and WML segmentation, the final stage of the process. In the first stage (brain tissue segmentation), the CSF, gray matter and white matter are automatically segmented using the trained k-Nearest Neighbour classifier with the T1-weighted image. The training samples for the k-NN classifier are obtained from the subject via atlas-based registration, using one or more registrations of atlases to the subject. In the second stage (WML segmentation), a process of thresholding is applied to obtain the segmentation of the WML. Initially, WMLs present in the image are misclassified as gray matter with a 'halo' of white matter (de Boer, et al., 2009). From this image, a histogram is then created of all voxels in the image classified as gray matter.
Within this histogram, the highest peak corresponds to the true gray matter voxels, with the intensities corresponding to the WML voxels located to the right of this peak. The histogram is then smoothed by convolution with a Gaussian kernel, making it possible to estimate the FLAIR intensity corresponding to the centre of the gray matter peak from the histogram bin containing the most true positive gray matter voxels (de Boer, et al., 2009).
3.2.4.1 Results
The final analysis of the results from this study showed a high degree of accuracy that was validated by a separate and independent manual segmentation process.
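The histogram smoothing step described for the second stage of this method can be sketched as follows. This is only an illustration of Gaussian smoothing followed by peak detection, not the published implementation: the function names, the kernel width and the example histogram are assumptions, and no attempt is made to reproduce how the study derives its final WML threshold from the peak.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalised 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_histogram(counts, sigma):
    """Convolve histogram counts with a Gaussian kernel.

    Bins outside the histogram are treated as empty (no edge renormalisation).
    """
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(sigma, radius)
    smoothed = []
    for i in range(len(counts)):
        value = 0.0
        for offset, weight in zip(range(-radius, radius + 1), kernel):
            j = i + offset
            if 0 <= j < len(counts):
                value += weight * counts[j]
        smoothed.append(value)
    return smoothed


def gray_matter_peak_bin(counts, sigma):
    """Index of the highest bin of the smoothed histogram."""
    smoothed = smooth_histogram(counts, sigma)
    return max(range(len(smoothed)), key=smoothed.__getitem__)


# Hypothetical FLAIR intensity histogram of voxels classified as gray matter.
# The small bump on the right stands in for misclassified WML voxels.
counts = [0, 1, 3, 8, 20, 19, 22, 9, 3, 1, 0, 0, 2, 3, 2, 0]
print(counts.index(max(counts)))               # 6 (raw maximum, sensitive to noise)
print(gray_matter_peak_bin(counts, sigma=1.0))  # 5 (centre of the smoothed peak)
```

The example shows why smoothing is applied first: the raw maximum sits on a noisy bin, while the smoothed histogram places the peak at the centre of the gray matter distribution.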
By comparison to the previously identified method as well as the method outlined in section 3.2.5, this approach requires the fewest MRI images, which presents an advantage in time, cost and possible dual use of scans.
3.2.5 k-Nearest Neighbour and TDS+
This study expands on previous work conducted by the authors. In the previous work, the authors developed and validated a template-driven segmentation methodology combined with a heuristic partial volume correction algorithm (TDS+). In this study, the work has been expanded to develop an automated three-channel TDS (3chTDS+) MRI segmentation pipeline for the identification of MS lesion subtypes (Wu, et al., 2006). There are five stages involved in this methodology, which utilise Proton Density, T2 and contrast-enhanced T1-weighted images. These are described as follows.
3.2.5.1 Segmentation of the Intracranial Cavity
Masks of the Intracranial Cavity were generated from the Proton Density and T2 images. This was done utilising an extraction procedure combining non-parametric intensity-based statistical (Parzen windows) segmentation and automated morphological operations (Wu, et al., 2006). Parzen windows are similar to k-NN. The key difference is that k-NN considers the k training points closest to the point being classified, whereas a Parzen window considers the training points within a fixed distance (Vawter). Further segmentation of material not of interest was undertaken by superimposing the masks onto the Proton Density, T2 and contrast T1 images.
3.2.5.2 Image Correction
Once the Intracranial Cavity masking was complete, EM segmentation was applied to provide inhomogeneity correction and intensity normalisation. The EM segmenter compensated for intra/inter-scan intensity inhomogeneities and normalised the scan intensities.
3.2.5.3 k-Nearest Neighbour Segmentation
The k-Nearest Neighbour segmentation approach selected was based on Friedman's k-NN algorithm (Friedman et al., 1975; Warfield, 1996) (Wu, et al., 2006). Two stages were involved in this process. As with other k-NN based approaches, a learning phase was initially required. For this implementation, two scans, randomly chosen from the full set of scans used within this study, were selected as calibration scans. The information obtained from this process was then applied to the remaining scans in the study.
3.2.5.4 TDS+
TDS+ (Template Driven Segmentation and partial volume artefact correction) was applied to correct misclassifications after the k-NN segmentation process. This improved lesion classification by providing a priori anatomical probabilities.
3.2.5.5 Refining "Black Holes" Segmentation
The "black holes" in the MRI image previously identified in the k-NN segmentation stage do not include areas of the white matter that are hypointense with respect to healthy white matter but isointense with respect to gray matter (Wu, et al., 2006). To address this, an additional classification step is taken to refine the "black holes" to include subtly hypointense signals. To this end, a more sensitive k-NN classifier is obtained by adding training points from mildly T1-hypointense WM regions (Wu, et al., 2006).
3.2.5.6 Results
The results of this study, when compared to manual tracing, demonstrated that the k-NN segmentation was able to identify most of the lesions. Most notably, three types of misclassification were apparent: misclassification of choroid plexus and other enhancing vascular structures as enhancing lesions, misclassification of subtle signal abnormalities of the white matter as gray matter, and misclassification of pixels on the cortical surface as white matter lesions (Wu, et al., 2006). With these issues identified, it would appear that further examination of this technique is required. The authors outline in their discussion of these findings various modifications and other enhancements applied to the original methodology.
3.3 Gray Matter Atrophy
An algorithm developed by Nakamura and Fisher (2009) focuses on the measurement of gray matter atrophy in MS patients. While not specifically looking at lesion load, this approach could be used in determining WML load, as damage to the white matter has been shown to be associated with upstream gray matter atrophy (Sepulcre, et al., 2009). This algorithm (Nakamura & Fisher, 2009) approaches the problem by combining intensity, anatomical and morphological probability maps. It uses analysis from FLAIR and T1-weighted images as well as brain atlas information. The intensity-based probability map is generated with a modified fuzzy c-means (FCM) clustering method, which produces probability maps for each tissue type (Nakamura & Fisher, 2009). During the course of this study, the FCM was applied to the T1-weighted images. The anatomy-based probability map was derived from the Harvard Brain Atlas, a 3-D digitized atlas of the human brain designed for use with MR image sets (Kikinis, et al., 1996). The process at this stage involved converting the atlas to a general GM probability map and then applying morphologic operations and Gaussian filters to smooth the result. The converted map is then aligned with each patient's MRI using a 12 degrees of freedom (DF) affine transformation (Nakamura & Fisher, 2009). The individualized morphological probability map is created from morphological models of the cortical and deep GM. The final stage of this process creates a combined probability image which is the product of all of the GM probability maps. The binary GM mask is then generated by setting a threshold of 0.5 on the combined probability map. The normalized GM volume is defined as:

\[ \text{Gray Matter Fraction} = \frac{\text{Gray Matter Volume}}{\text{Outer Contour Volume}} \]

(Nakamura & Fisher, 2009)
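The final combination step and the gray matter fraction can be sketched as below. This is a minimal illustration, not the published algorithm: the maps are flattened to plain lists, the function names and example values are hypothetical, and treating the threshold as strictly greater than 0.5 is an assumption, since the text only states that a threshold of 0.5 is set.

```python
def combine_probability_maps(maps):
    """Voxel-wise product of equally sized probability maps (flattened lists)."""
    combined = [1.0] * len(maps[0])
    for prob_map in maps:
        if len(prob_map) != len(combined):
            raise ValueError("all probability maps must have the same number of voxels")
        combined = [c * p for c, p in zip(combined, prob_map)]
    return combined


def binary_mask(prob_map, threshold=0.5):
    """1 where the combined probability exceeds the threshold, otherwise 0."""
    return [1 if p > threshold else 0 for p in prob_map]


def gray_matter_fraction(gm_mask, outer_contour_mask, voxel_volume=1.0):
    """Gray matter volume divided by outer contour volume."""
    gm_volume = sum(gm_mask) * voxel_volume
    outer_volume = sum(outer_contour_mask) * voxel_volume
    return gm_volume / outer_volume


# Hypothetical four-voxel intensity, anatomical and morphological GM maps.
intensity = [0.9, 0.8, 0.9, 0.2]
anatomical = [0.9, 0.9, 0.5, 0.9]
morphological = [0.9, 0.8, 0.9, 0.9]

combined = combine_probability_maps([intensity, anatomical, morphological])
gm_mask = binary_mask(combined)
print(gm_mask)                                      # [1, 1, 0, 0]
print(gray_matter_fraction(gm_mask, [1, 1, 1, 1]))  # 0.5
```

Because the maps are multiplied, a voxel is kept as gray matter only when all three sources agree reasonably strongly; a single low probability, as in the third and fourth voxels here, is enough to exclude it.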
Four different tests were developed to validate the results of this method: segmentation of simulated MRI data and comparison to the correct results, segmentation of real MRI data and comparison to manual tracing results, segmentation of scan-rescan images to determine the reproducibility of the method, and segmentation of the same image with simulated MS lesions to determine the effects of lesions on the results.
Simulated MRI data was used to determine the accuracy in terms of volumetric errors and similarity indices by comparing the segmented tissue masks to the gold standard tissue masks. The evaluation was conducted using the similarity index, defined as

\[ \text{Similarity Index} = \frac{2TP}{2TP + FP + FN} \]

(Nakamura & Fisher, 2009)
MRIs from three MS patients and three normal controls were used to evaluate the segmentation accuracy of the algorithm in real MRIs. Each image was processed through the algorithm and then the GM was manually traced in a separate process. Analysis was conducted on each of these results. For a separate study, MRIs were obtained from nine MS patients. Each of the images was analysed, with the reproducibility of the algorithm evaluated by calculating the coefficient of variation of GM volumes calculated from repeated images of each patient (Nakamura & Fisher, 2009). The final test measured the effect of WML in the FLAIR images. For this test, masks of segmented MS lesions were simulated within the MRI images. This test was conducted over 18 MS patients. The results of each of these tests are detailed in full within the study (Nakamura & Fisher, 2009). This particular methodology brings with it a number of advantages over other studies. The images required for this methodology are similar to those acquired for patients undergoing normal MRI procedures. This makes the process applicable in retrospect to many standard MRI tests, without the need for specialised images to be taken solely for the purposes of applying this methodology.
3.3.1 Results
Statistically the results from this methodology appear to be promising. Additionally, a number of observations were made that provide additional benefit to the use of this methodology. Comparison to other GM segmentation methodologies has identified an advantage over methodologies such as SPM (Ashburner) and the partial volume model (Shattuck).
The similarity index for this methodology was 0.938 compared to the other methodologies reporting 0.932 and 0.893 respectively. (Nakamura & Fisher, 2009).
Statistically, the GM volumes measured by this methodology are not strongly correlated with lesion volume, in contrast to methodologies such as SPM. This eliminates the need for any form of manual correction of segmentation errors between the GM and lesion volumes. An interesting point to note with this study (which is further expanded upon in section 3) is the application of an indirect measure to achieve a result. That is, the measurement of one element that is known can also provide information about another element that is not known. This may not seem the most direct approach to achieving the desired segmentation; however, it may provide an easier measure, or at least confirmation of a known measure. While this study focuses on an application to MS, an application to a range of medical conditions such as schizophrenia, HIV dementia and Alzheimer's disease could also be possible (Nakamura & Fisher, 2009).
3.4 Measuring the Whole Brain Structure
An approach taken within a number of the methodologies covered has been to look at the segmentation issue from the perspective of the entire brain structure and then divide and segment it into its respective classifications of matter. This approach differs in that it doesn't initially focus on the immediate identification of GML or WML but addresses each component of the brain. From this macro-scale analysis, it would be possible to identify each component of the brain and arrive at the area of interest by a process of elimination. This methodology would be particularly beneficial in longitudinal studies, where measurements taken over the course of the study could readily identify areas of change. One such application of this methodology (Iosifescu, et al., 1997) implements this approach using an atlas image and elastic matching from automatically segmented MRI scans.
3.4.1 Automated Segmentation
The first stage in this methodology is to perform the initial segmentation of the images. For this stage, the segmentation methodology selected was that published by Wells and co-workers (1996) (Iosifescu, et al., 1997). This was a two-stage process, initially segmenting the image into white matter, gray matter and CSF. The second stage then further segmented the image into cortical gray matter, subcortical gray matter, white matter and CSF. This methodology used a priori knowledge of tissue properties and intensity inhomogeneities to correct for intensity differences in MRI data (Iosifescu, et al., 1997).
3.4.2 Image Correction
The next stage in this methodology was to match the atlas brain image onto the patient brain image. This was undertaken with a linear registration program designed to correct for differences in size, rotation and translation between the two images (Iosifescu, et al., 1997).
The linear registration performed an alignment of the two data sets through a combination of energy minimisation registration techniques. The outcome of this stage was an atlas brain image linearly registered onto the patient brain image (Iosifescu, et al., 1997).
3.4.3 Elastic Matching
The procedure used to elastically match the source and target data (segmented atlas and segmented patient image) was Dengler's regularisation procedure (Dengler et al. 1988; Schmidt and Dengler, 1989).
This process used a procedure that "warped" the atlas image onto the patient's image. Due to the nature of the two images, a simple uniform global displacement (translation, rotation or scaling) would not work (Iosifescu, et al., 1997).
3.4.4 Application to Lesion Segmentation and Identification
As identified earlier, this technique is not specifically aimed at the segmentation and identification of either WML or GML. However, it would appear to have the capability of being applied to this problem. The results from this study determined that the methodology outlined is able to measure the volumes of brain structures with a very high level of accuracy (Iosifescu, et al., 1997). This capability could be utilised to assist with the identification of lesion areas, since the lesion itself has an impact on overall brain structure volume. Over a long-term study, this could be used to track the development of targeted lesion areas. In the implementation outlined in this study, however, some key disadvantages are identified. It was found that the most accurate matching was achieved with large, regularly shaped objects. This limitation would make the method less than optimal for some brain areas due to their size. Certainly, for general application to the issue of lesion segmentation, some modification or further development of this methodology would be needed.
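To illustrate how structure volumes from a segmented image could be tracked across a longitudinal study, as suggested above, the following sketch counts labelled voxels at two time points and reports the change per structure. The label names, the voxel volume and the flattened label lists are hypothetical and are not drawn from Iosifescu, et al. (1997).

```python
from collections import Counter


def structure_volumes(labels, voxel_volume_mm3):
    """Volume of each labelled structure: voxel count times voxel volume."""
    counts = Counter(labels)
    return {label: count * voxel_volume_mm3 for label, count in counts.items()}


def volume_change(baseline, follow_up):
    """Follow-up volume minus baseline volume for every structure seen."""
    structures = set(baseline) | set(follow_up)
    return {s: follow_up.get(s, 0.0) - baseline.get(s, 0.0) for s in structures}


# Hypothetical flattened label maps from two scans of the same patient.
scan_year_0 = ["WM"] * 5 + ["GM"] * 3 + ["CSF"] * 2
scan_year_1 = ["WM"] * 4 + ["GM"] * 3 + ["CSF"] * 3

baseline = structure_volumes(scan_year_0, voxel_volume_mm3=1.5)
follow_up = structure_volumes(scan_year_1, voxel_volume_mm3=1.5)
print(volume_change(baseline, follow_up))
```

A structure whose volume changes between scans would then be a candidate region for closer inspection.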
3.5 Artificial Neural Networks (ANN)
The main objective of an automated lesion segmentation methodology is automation itself: the removal of as much interaction and manual processing as possible, and the reduction of the human-error element of the process. This study (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998) applies artificial neural networks to achieve this objective. The approach achieves automatic detection of white matter MS lesions in axial proton density, T2-weighted, gadolinium-enhanced and fast FLAIR brain MR images.
Figure 3 the original Proton Density image prior to process being conducted (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998)
The general process consists of three stages: firstly, detection and contouring of all hyperintense signal regions within the image; secondly, elimination of false positive segments by size, shape index and anatomical location; and thirdly, the use of an artificial neural network (ANN) for final removal of artefacts and their differentiation from true MS lesions (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998). This methodology outlines four basic assumptions for its processing.
1. In PD, T2-weighted, gadolinium enhanced, and FF-MR images, MS lesions appear much brighter than the rest of the brain
2. Non-MS regions in the brain, which also produce high signal intensity (especially in T2-weighted MR images), such as blood vessels and cerebrospinal fluid within the ventricles, have either a relatively very small or very large (in the case of the ventricles) area
3. MS lesions have a relatively circular shape
4. Most of the MS lesions occur in the periventricular white matter area, and are rarely seen in cortical regions on MR images. Furthermore, they are typically located asymmetrically relative to the brain
(Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998).
Based upon these four assumptions, it was determined that a brain region would be a possible candidate for an MS lesion if it has a relatively high signal intensity, is relatively circular in shape, has a size within a predefined range and has a location that complies with assumption number four (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998). The algorithm itself is applied in three stages (as indicated above).
3.5.1 Detection and Contouring of all Hyperintense Signal Regions within the Image
Normalisation of the image takes place within this stage with the application of an adaptive threshold algorithm. The output from this stage is a set of closed contours described by arrays of contour data points (see Figure 4).
3.5.2 Partial Elimination of Artefacts (False Positives)
The output of this stage is displayed in Figure 5. The area, perimeter and shape index of each of the contoured regions from the previous stage are calculated using the following formulas.
Area: A cross-sectional area bounded by a closed contour can be estimated by Green's theorem in the plane (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998).
Figure 4 the processed Proton Density image after the first stage of the algorithm. Note the presence of artefacts (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998)
\[ A = \frac{1}{2} \oint (x\,dy - y\,dx) \]
(Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998)
Perimeter: The perimeter was estimated using the following formula.
\[ \text{Perimeter} = \sum_{i=1}^{n} \sqrt{(x_i - x_{i-1})^2 + (y_i - y_{i-1})^2} \]
(Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998)
Shape Index: The resemblance of each segmented shape to a circular shape was evaluated using the shape index applied by Gibson et al. (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998). This index equals 1 for a perfect circle and decreases as the shape becomes less circular.
\[ \text{Shape Index} = \frac{4\pi \cdot \text{Area}}{\text{Perimeter}^2} \]
(Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998)
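The area, perimeter and shape index formulas above can be sketched in Python for a contour given as a list of (x, y) points. This is an illustrative sketch only: it assumes the contour is closed by joining the last point back to the first, and it uses the discrete (polygon) form of Green's theorem. The function names and test contours are hypothetical.

```python
import math


def contour_area(points):
    """Discrete Green's theorem: A = 1/2 * |sum(x[i-1]*y[i] - x[i]*y[i-1])|."""
    total = 0.0
    for i in range(len(points)):
        x_prev, y_prev = points[i - 1]  # index -1 wraps round, closing the contour
        x_curr, y_curr = points[i]
        total += x_prev * y_curr - x_curr * y_prev
    return abs(total) / 2.0


def contour_perimeter(points):
    """Sum of straight-line distances between consecutive contour points."""
    return sum(
        math.hypot(points[i][0] - points[i - 1][0], points[i][1] - points[i - 1][1])
        for i in range(len(points))
    )


def shape_index(points):
    """4 * pi * Area / Perimeter^2; close to 1 for near-circular contours."""
    return 4.0 * math.pi * contour_area(points) / contour_perimeter(points) ** 2


# A unit square and a 360-point approximation of a circle of radius 5.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
circle = [
    (5.0 * math.cos(math.radians(a)), 5.0 * math.sin(math.radians(a)))
    for a in range(360)
]
print(shape_index(square), shape_index(circle))
```

The square yields a shape index of pi / 4, while the near-circular contour yields a value close to 1, which is the property the second stage uses to separate roughly circular lesions from irregular artefacts.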
Figure 5 the processed image after removal of the artefacts (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998)
3.5.3 Final Removal of Artefacts by ANN
The Artificial Neural Network (ANN) is applied at the final stage to remove the remaining artefacts. An ANN is a computer algorithm that attempts to describe the biological behaviour of brain neurons (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998). This methodology uses a back-propagation ANN, which employs a form of supervised learning in a training phase.
Figure 6 the final Proton Density image after removal of all artefacts and final tuning stage (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998)
During this training phase, a set of input patterns paired with their desired outputs is entered into the ANN. The ANN then adjusts its synaptic weights so that its outputs closely match the targeted outputs.
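The training phase described above can be illustrated with a very small back-propagation network. The sketch below is not the network used by Goldberg-Zimring et al. (1998): the two input features (shape index and normalised area), the single hidden layer of three units, the learning rate, the number of epochs and the training data are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(model, x):
    """Return hidden activations and the network output for one input."""
    w_hidden, w_out = model
    h = [sigmoid(sum(w * v for w, v in zip(wj, x)) + wj[-1]) for wj in w_hidden]
    y = sigmoid(sum(w * v for w, v in zip(w_out, h)) + w_out[-1])
    return h, y


def predict(model, x):
    return forward(model, x)[1]


def train_backprop(samples, targets, hidden=3, epochs=5000, rate=0.5, seed=0):
    """Online gradient descent on squared error; last weight of each unit is its bias."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    model = (w_hidden, w_out)
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            h, y = forward(model, x)
            # Error terms are propagated backwards from the output to the hidden layer.
            delta_out = (y - t) * y * (1.0 - y)
            delta_hid = [delta_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(hidden)]
            for j in range(hidden):
                w_out[j] -= rate * delta_out * h[j]
                for i in range(n_in):
                    w_hidden[j][i] -= rate * delta_hid[j] * x[i]
                w_hidden[j][-1] -= rate * delta_hid[j]
            w_out[-1] -= rate * delta_out
    return model


def mean_squared_error(model, samples, targets):
    return sum((predict(model, x) - t) ** 2 for x, t in zip(samples, targets)) / len(samples)


# Hypothetical training patterns: (shape index, normalised area); 1 = lesion, 0 = artefact.
SAMPLES = [(0.90, 0.50), (0.85, 0.40), (0.80, 0.60), (0.95, 0.45),
           (0.30, 0.10), (0.40, 0.90), (0.20, 0.50), (0.35, 0.95)]
TARGETS = [1, 1, 1, 1, 0, 0, 0, 0]

trained = train_backprop(SAMPLES, TARGETS)
print([round(predict(trained, x), 2) for x in SAMPLES])
```

After training, regions whose output exceeds a chosen cut-off (0.5 would be a natural choice here) would be kept as lesions and the remainder discarded as artefacts.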
For this implementation, a set of 40 positively identified MS lesions and 40 positively identified artefacts was taken from across 20 images. Once the training was complete, the trained ANN was used for the final sorting of the selected images.
3.5.4 Results
A fully automated algorithm for the detection and segmentation of MS lesions is of course a very desirable tool. The ANN offers a significant advantage over the other automated algorithms covered so far: it has the potential to learn from previous experience. The more information provided during the training phase, the better the resulting tool will be. With this implementation, however, a number of limitations can be observed. The assumptions identified above impose constraints that may not be suitable for all possible MRI scans. The method assumes that the MS lesions being examined are brighter than the rest of the brain (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998). This would certainly limit the use of this tool in a number of circumstances. As identified earlier in this paper, GML, which in more recent years have been identified as playing a role within MS, would not be seen by this method. While this implementation would seem to have some significant limitations, it does demonstrate the utility of ANNs for the lesion segmentation problem. Further study of this methodology may identify possible future applications for MS and other relevant medical conditions.
4. New Techniques and areas of further study
Traditional approaches to this problem have seen advancement from fully manual processes relying on judgment by trained operators to the introduction of either fully or partially automated techniques. There have also been some approaches that have taken different directions with the resolution of this problem. Some studies have been undertaken which look at the problem of segmentation with the emphasis on determining what is known and easily identifiable and using that to assist in the determination of the areas or items of interest in the scan.
4.2 Brain Atrophy
A study conducted to determine whether White Matter Hyperintensities (WMH) were related to subcortical brain atrophy (Wen, Sachdev, Chen, & Anstey, 2006) has provided some evidence to suggest that the brain's WMH load can be correlated with atrophy in other regions of interest, such as gray matter volume reduction. This study doesn't draw any direct conclusions on causality from this observation; however, it does raise an interesting line of reasoning for future study or conjecture. A more recent study (Bendfeldt, et al., 2009) has sought to establish a stronger link between WML and changes in gray matter volumes by means of voxel-based morphometry (VBM). In this study, two hypotheses are raised:
1. Regional gray matter volume reductions occur predominantly in patients with increasing WML volumes
2. Patients with both increasing T1 and T2 lesion burden would show volumetric GM reductions that are qualitatively similar but even more pronounced
(Bendfeldt, et al., 2009).
The results of this study suggest that gray matter volume reductions are directly related to increasing white matter lesion volumes. Based on the results of these two studies, a simple but potentially effective approach to the problem of lesion segmentation may be to focus not so much on identifying what is unknown, but on identifying what is known and working backwards from there. It should be noted that both studies indicate that further long-term follow-up studies are required to support this conclusion.
4.3 Physical Impairment as a Measure
This study (Charil, et al., 2003) looks at identifying a link between lesion location and neurological disability in Multiple Sclerosis. The authors acknowledge initially that there is generally only a weak correlation between disability and the volume of white matter lesions (Charil, et al., 2003); however, the study was able to determine some correlation between lesion location and cognitive dysfunction. The study consisted of a large sample of 452 relapsing-remitting MS patients. From each of the patients, Proton Density, T1 and T2 MRI images were obtained. Disabilities were measured using the Functional System Scale (FSS) and the Expanded Disability Status Scale (EDSS). The EDSS takes a measurement ranging from 0 (normal) through to 10 (death due to MS). The FSS looks at specific functional systems: pyramidal, cerebellar, brainstem, sensory, bowel and bladder, visual, and mental. Each is graded from 0 (normal) through to 5 or 6 (maximal impairment).
4.3.1 Image Capture
Unlike other methodologies covered within this review, this particular technique doesn't present a unique, specially developed image processing technique. The technique used within this study to analyse the images from each patient was INSECT (Intensity Normalised Stereotaxic Environment for Classification of Tissue), a fully automatic system for the mass quantitative analysis of MRI data with a focus on the detection of Multiple Sclerosis lesions (Zijdenbos, Forghani, & Evans, 1998).
4.3.2 Data Analysis
Spearman's rank correlation coefficient was used to calculate the correlations between the total lesion load and each disability score (Charil, et al., 2003). Two main correlation measures were taken: the correlation between total lesion load and disability, and the correlation between lesion location and the rate of disease progression.
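As an illustration of the statistic named in the data analysis above, the following sketch computes Spearman's rank correlation coefficient as the Pearson correlation of the ranks, with tied values given their average rank. The lesion load and EDSS values are invented for the example and are not data from Charil, et al. (2003).

```python
import math


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(a, b):
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_b)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical total lesion loads (ml) and EDSS scores for five patients.
lesion_load = [1.2, 3.4, 0.8, 5.1, 2.2]
edss_score = [2.0, 3.5, 1.0, 6.0, 3.0]
print(spearman(lesion_load, edss_score))
```

Because the statistic depends only on rank order, it suits ordinal disability scales such as the EDSS, where the intervals between grades are not necessarily equal.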
4.3.3 Results
The analysis of the results from this study demonstrated that a relationship between lesion site and type of disability does exist. It also offers an explanation for the poor relationship between lesion load and disability shown in previous studies: that relationship depends on lesions within restricted sites in the white matter (Charil, et al., 2003).
4.3.4 Impact on General Lesion Identification
While the results of this study did statistically demonstrate a link between lesion site and type of disability, the approach also presents some drawbacks when used solely as a measure for the identification of lesion load. The disability measures (the EDSS and FSS scales) are both assessed manually. While the scales in question and the scope of the study are broader than the manual segmentation systems covered earlier in this review, they still involve potential for human interpretation and error.
This section of the review has looked at two techniques that could be used indirectly to address the problem of lesion segmentation. While showing some merit, neither appears to be entirely suitable at its current stage of development to address the problem as a whole. Both would, however, be suitable subjects for further study and research.
This review has identified a range of methodologies utilised to address the issue of lesion segmentation within MR images. While determining the most viable and appropriate methodology is outside the scope of this paper, a few observations can reasonably be drawn from the material reviewed. The type and number of MR images required varied between methodologies. To ensure a methodology remains applicable in the majority of circumstances, it would be a clear advantage for it not to require anything over and above the type or number of images that would normally be taken in support of patient treatment. The more novel approaches identified may provide further scope for study in the future. Segmentation of lesions across the gray and white matter, along with segmentation of other matter contained within the MRI, involves complex issues, and all methodologies present some degree of error, however small. An approach that uses other measures to enhance traditional methodologies may therefore help to reduce the level of error further. Two key areas identified here were the use of cognitive and physical deficit and the measurement of other brain matter to help define areas of interest. Based upon the material reviewed, neither appears to be sufficient to stand as a viable lesion segmentation methodology by itself; however, using them in conjunction with other methodologies may be an approach worth following.
Anbeek, P., Vincken, K. L., van Osch, M. J., Bisschops, R. H., & van der Grond, J. (2004). Automatic segmentation of different-sized white matter lesions by voxel probability estimation. Medical Image Analysis, 8 (3), 205-215.
Bendfeldt, K., Kuster, P., Traud, S., Egger, H., Winklhofer, S., Mueller-Lenke, N., et al. (2009). Association of regional gray matter volume loss and progression of white matter lesions in multiple sclerosis - A longitudinal voxel-based morphometry study. NeuroImage, 45 (1), 60-67.
Calabrese, M., Filippi, M., Rovaris, M., Mattisi, I., Bernardi, V., Atzori, M., et al. (2008). Morphology and evolution of cortical lesions in multiple sclerosis. A longitudinal MRI study. NeuroImage, 42 (4), 1324-1328.
Charil, A., Zijdenbos, A. P., Taylor, J., Boelman, C., Worsley, K. J., Evans, A. C., et al. (2003). Statistical mapping analysis of lesion location and neurological disability in multiple sclerosis: application to 452 patient data sets. NeuroImage, 19 (3), 532-544.
Cocosco, C. A., Kollokian, V., Kwan, R. K., & Evans, A. C. (1997). BrainWeb: Online Interface to a 3D MRI Simulated Brain Database. NeuroImage, 5, 425.
Collins, D. L., Zijdenbos, A. P., Kollokian, V., Sled, J. G., Kabani, N. J., Holmes, C. J., et al. (1998). Design and construction of a realistic digital brain phantom. IEEE Transactions on Medical Imaging, 17 (3), 463-468.
Columbia University. (2007, May 23). Tutorial - Classification. Retrieved June 4, 2009, from Workbench: http://wiki.c2b2.columbia.edu/workbench/index.php/Tutorial__Classification
de Boer, R., Vrooman, H. A., van der Lijn, F., Vernooij, M. W., Ikram, M. A., van der Lugt, A., et al. (2009). White matter lesion extension to automatic brain tissue segmentation on MRI. NeuroImage, 45 (4), 1151-1161.
de Leeuw, F. E., de Groot, J. C., Achten, B., Oudkerk, M., Ramos, L. M., Heijboer, R., et al. (2001). Prevalence of cerebral white matter lesions in elderly people: a population based magnetic resonance imaging study. The Rotterdam Scan Study. Journal of Neurology, Neurosurgery and Psychiatry, 70 (1), 9.
Goldberg-Zimring, D., Achiron, A., Miron, S., Faibel, M., & Azhari, H. (1998). Automated detection and characterisation of multiple sclerosis lesions in brain MR images. Magnetic Resonance Imaging, 16 (3), 311-318.
Iosifescu, D. V., Shenton, M. E., Warfield, S. K., Kikinis, R., Dengler, J., Jolesz, F. A., et al. (1997). An automated registration algorithm for measuring MRI subcortical brain structures. NeuroImage, 6 (1), 13-25.
Kidd, D., Barkhof, F., McConnell, R., Algra, P. R., Allen, I. V., & Revesz, T. (1999). Cortical lesions in multiple sclerosis. Brain, 122 (1), 17-26.
Kikinis, R., Shenton, E. M., Iosifescu, D. V., McCarley, W. R., Saiviroonporn, P., Hokama, H. H., et al. (1996). A digital brain atlas for surgical planning, model-driven segmentation, and teaching. IEEE Transactions on Visualisation and Computer Graphics, 2 (3), 232-241.
Kutzelnigg, A., & Lassmann, H. (2005). Cortical lesions and brain atrophy in MS. Journal of the Neurological Sciences, 233 (1-2), 55-59.
Kwan, R. K., Evans, A. C., & Pike, G. B. (1996). An extensible MRI simulator for postprocessing evaluation. Visualisation in Biomedical Computing (VBC'96). Lecture Notes In Computer Science, 1131, 135-140.
Kwan, R. K., Evans, A. C., & Pike, G. B. (1999). MRI simulation-based evaluation of image-processing and classification methods. IEEE Transactions on Medical Imaging, 18 (11), 1085-1097.
Liney, G. P. (2005). Magnetic Resonance Imaging (MRI). Retrieved May 23, 2009, from MRI Physics Lectures: http://www.hull.ac.uk/mri/lectures/gpl_page.html
McGill University. (2006, June 12). BrainWeb: Simulated Brain Database. Retrieved June 9, 2009, from McConnell Brain Imaging Center: http://www.bic.mni.mcgill.ca/brainweb/
Nakamura, K., & Fisher, E. (2009). Segmentation of brain magnetic resonance images for measurement of gray matter atrophy in multiple sclerosis patients. NeuroImage, 44 (3), 769-776.
Nave, R. (n.d.). Magnetic Resonance Imaging. Retrieved May 23, 2009, from Magnetic Resonance Imaging: http://hyperphysics.phy-astr.gsu.edu/hbase/nuclear/mri.html
Nyul, L. G., & Udupa, J. K. (1999). On standardizing the MR image intensity scale. Magnetic Resonance in Medicine, 42 (6), 1072-1081.
Sepulcre, J., Goni, J., Masdeu, J. C., Bejarano, B., de Mendizabal, N. V., Toledo, J. B., et al. (2009). Contribution of white matter lesions to gray matter atrophy in multiple sclerosis. Archives of Neurology, 66 (2), 173-179.
Stamatakis, E. A., & Tyler, L. K. (2005). Identifying lesions on structural brain images - Validation of the method and application to neuropsychological patients. Brain and Language, 94 (2), 167-177.
Statsoft Inc. (1984-2008). k-Nearest Neighbors. Retrieved May 30, 2009, from Electronic Textbook Statsoft: http://www.statsoft.com/textbook/stknn.html
Stokking, R., Vincken, K. L., & Viergever, M. A. (2000). Automatic morphology-based brain segmentation (MBRASE) from MRI-T1 data. NeuroImage, 12 (6), 726-738.
Uebersax, J. (2002, July 20). Kappa Coefficients: A Critical Appraisal. Retrieved May 20, 2009, from http://ourworld.compuserve.com/homepages/jsuebersax/kappa.htm
van den Bosch, A. (2009). K-nearest neighbor classification. Retrieved June 4, 2009, from Videolectures.net: http://videolectures.net/aaai07_bosch_knnc/
Van Leemput, K., Maes, F., Bello, F., Vandermeulen, D., Colchester, A., & Suetens, P. (2000). Automated segmentation of MS lesions in MR. NeuroImage, 11 (5).
van Swieten, J. C., Hijdra, A., Koudstaal, P. J., & van Gijn, J. (1990). Grading white matter lesions on CT and MRI: a simple scale. Journal of Neurology, Neurosurgery, and Psychiatry, 53 (12), 1080-1083.
Vawter, N. (n.d.). Parzen Windows. Retrieved June 14, 2009, from Parzen Windows: http://web.media.mit.edu/~nvawter/projects/rhythmClassification/c05.html
Wahlund, L. O., Barkhof, F., Fazekas, F., Bronge, L., Augustin, M., Sjogren, M., et al. (2001). A new rating scale for age-related white matter changes applicable to MRI and CT. Stroke, 32 (6), 1318-1322.
Warfield, S., Dengler, J., Zaers, J., Guttmann, C. R., Wells III, W. M., Ettinger, G. J., et al. (1995). Automatic identification of grey matter structures from MRI to improve the segmentation of white matter lesions. Journal of Image Guided Surgery, 1 (6), 326-338.
Wen, W., Sachdev, P. S., Chen, X., & Anstey, K. (2006). Gray matter reduction is correlated with white matter hyperintensity volume: A voxel-based morphometric study in a large epidemiological sample. NeuroImage, 29 (4), 1031-1039.
Wu, Y., Warfield, S. K., Tan, I. L., Wells III, W. M., Meier, D. S., van Schijndel, R. A., et al. (2006). Automated segmentation of multiple sclerosis lesion subtypes with multichannel MRI. NeuroImage, 32 (3), 1205-1215.
Zijdenbos, A., Forghani, R., & Evans, A. (1998). Automatic Quantification of MS Lesions in 3D MRI Brain Data Sets: Validation of INSECT. In A. Zijdenbos, R. Forghani, & A. Evans, Medical Image Computing and Computer-Assisted Intervention - MICCAI '98 (p. 439).
Figure 1 Example of a normal brain MRI image. Image generated from BrainWeb, http://www.bic.mni.mcgill.ca/brainweb/ (McGill University, 2006).
Figure 2 Example of a brain MRI image showing MS lesions. Image generated from BrainWeb, http://www.bic.mni.mcgill.ca/brainweb/ (McGill University, 2006).
Figure 3 the original Proton Density image prior to process being conducted. Image taken from (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998). Used with permission, Assoc. Prof. Haim Azhari D. Sc., Technion Israel Institute of Technology, Israel.
Figure 4 the processed Proton Density image after the first stage of the algorithm. Note the presence of artefacts. Image taken from (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998). Used with permission, Assoc. Prof. Haim Azhari D. Sc., Technion Israel Institute of Technology, Israel.
Figure 5 the processed image after removal of the artefacts. Image taken from (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998). Used with permission, Assoc. Prof. Haim Azhari D. Sc., Technion Israel Institute of Technology, Israel.
Figure 6 the final Proton Density image after removal of all artefacts and final tuning stage. Image taken from (Goldberg-Zimring, Achiron, Miron, Faibel, & Azhari, 1998). Used with permission, Assoc. Prof. Haim Azhari D. Sc., Technion Israel Institute of Technology, Israel.