
Segmentation of Medical Images Using LEGION

Naeem Shareef†, DeLiang L. Wang†‡, and Roni Yagel†


†Department of Computer and Information Science
‡Center for Cognitive Science
The Ohio State University, Columbus, Ohio 43210, USA
{shareef, dwang, yagel}@cis.ohio-state.edu

Abstract. Advances in visualization technology and specialized graphic workstations allow clini-
cians to virtually interact with anatomical structures contained within sampled medical image
datasets. A hindrance to the effective use of this technology is the difficult problem of image seg-
mentation. In this paper, we utilize a recently proposed oscillator network called LEGION (Locally
Excitatory Globally Inhibitory Oscillator Network), whose ability to achieve fast synchrony with
local excitation and desynchrony with global inhibition makes it an effective computational frame-
work for grouping similar features and segregating dissimilar ones in an image. We identify four
key computational components that determine its dynamics. Then, we describe an image segmenta-
tion algorithm that uses these LEGION components and an adaptive scheme to choose tolerances.
We show results of applying the algorithm to 2D and 3D (volume) CT and MRI medical image datasets, and
compare our results with manual segmentation. LEGION’s computational and architectural prop-
erties make it a promising approach for real-time medical image segmentation using multiple fea-
tures.

I. INTRODUCTION
Advances in imaging, visualization, and virtual environments technology are allowing the cli-
nician to not only visualize, but also interact with a virtual patient [25][26]. In many cases, ana-
tomical information contained within sampled image datasets is essential to clinical tasks. One
typical example is in surgery where pre-surgical planning and post-operative evaluations are not
only enhanced, but in many cases depend upon sampled image data for successful results [9][21].
Many sampling devices are now available to image a variety of material. Two commonly used
modalities to image bony structures and soft tissue are CT (Computerized Tomography) and MRI
(Magnetic Resonance Imaging), respectively. Datasets are realized as a sequence of 2D cross-sec-
tional slices that together represent the 3D sample space. The entire image stack may be viewed
as a 3D array of scalar (or vector) values, called a volume, where each voxel (volume pixel) repre-
sents a measured physical quantity at a location in space. Advances in the fields of surface and
volume graphics now make it possible to render a volume dataset with high image quality using
lighting and shading models. Graphic workstations equipped with specialized hardware and tex-
ture mapping capabilities show promise for real-time rendering [25]. Currently, clinicians view
images on photographic sheets containing adjacent image slices, and must mentally reconstruct
the 3D anatomical structures. Even though clinicians have developed the necessary skills to make
use of this presentation, computer-based tools will allow for a higher level of interaction with the
data.
A major hurdle in the effective use of this technology is the accurate identification of anatom-
ical structures within the volume. The computer vision literature typically identifies three process-
ing stages before object recognition: image enhancement, feature extraction, and grouping of
similar features. In this paper, we address the last step, called image segmentation, where pixels
are grouped into regions based on image features. The goal is to partition an image into pixel
regions that together represent objects in the scene. Segmentation is a very difficult problem for
general images, which may contain effects such as highlights, shadows, transparency, and object
occlusion. Sampled medical image datasets largely lack these effects, with a few exceptions; ultrasound
datasets, for example, may still contain occluded objects. The main challenges in segmenting sampled
image datasets are handling the noise artifacts introduced during the acquisition process and coping
with dataset size. Noise artifacts appear as “salt and pepper” noise or as arbitrarily sized noisy areas,
which arise when objects are complex or small relative to the device sampling rate. With new imaging technology, the increasing size of
volume datasets is an issue for most applications. For example, a “medium” size dataset with
dimensions 256x256x125 contains over 8 million voxels. The MRI, CT, and color image datasets
for an entire human male cadaver from the National Library of Medicine’s Visible Human Project
[1] require approximately 15 gigabytes of storage. An additional challenge is that objects may be
arbitrarily complex in terms of size and shape.
Many segmentation methods proposed for medical image data are either direct applications or
extensions of approaches from computer vision. Image segmentation algorithms can be classified
in many ways [15][11][27]. We identify three broad classes of algorithms for segmenting sampled
image data: manual, semi-automatic, and automatic. Reviews of algorithms from each
class can be found in [7][18]. The image segmentation algorithm presented in this paper is a semi-
automatic approach. The manual method requires a human segmenter with knowledge of anat-
omy to use a graphical software tool to outline regions of interest. Obviously this method pro-
duces high quality results, but is time consuming and tedious.

Semi-automatic methods require user interaction to set algorithm parameters or to enhance
the algorithm results. They can be classified according to the space in which features are grouped
together [11]. Measurement space methods map and cluster pixels in a feature space. A com-
monly used method is global thresholding [18][11], where pixel intensities from the image are
mapped into a feature space called a histogram. Thresholds are chosen at valleys between pixel
clusters that each represent a region of similar-valued pixels in the image. This works well if the
target object has distinct and homogeneous pixel values, which is usually the case with bony
structures in CT datasets. On the other hand, spatial information is lost in the transformation to
feature space, which may produce disjoint regions. Segmentation may be difficult because pixels
within the same object may be mapped to different clusters and pixels from different objects may
be mapped into the same cluster.
Spatial domain methods use spatial proximity in the image to group pixels. Edge detection
methods use local gradient information to define edge elements, which are then combined into
contours to form region boundaries [7][11]. For example, a 3D version of the Marr-Hildreth oper-
ator was used to segment the brain from MRI data [5]. Other edge operators, such as the Canny
operator [6], have been used to some extent. However, edge operators are generally sensitive to
noise and produce spurious edge elements that make it difficult to construct a reasonable region
boundary. Region growing methods [2][11][3][27], on the other hand, construct regions by group-
ing spatially proximate pixels, so that some homogeneity criterion is satisfied over the region.
Algorithms may be classified into four categories: merge, split, hybrid, and seeded region grow-
ing. Merge algorithms initially partition the entire image into many homogeneous regions, which
can be single pixels or groups of pixels. At each iteration, two neighboring regions are merged
into one if the resulting region is “homogeneous.” Most metrics for computing homogeneity are
statistical, for example, two regions are merged if their pixel means are close. Split algorithms
define the entire image as a single region, and then successively subdivide regions into smaller
ones. Subdivision stops when subdivided regions are no longer homogeneous. Hybrid approaches
use both split and merge operations, and usually place additional region constraints. For example,
a small region with size below a specified minimum may be merged with a bordering region.
These approaches have drawbacks. Regions thus produced usually have blocky shapes because
initial regions are defined with geometric primitives, such as rectangular shapes. Split approaches
subdivide regions in a regular manner, such as subdivision into quadrants. This makes it difficult
to segment arbitrarily shaped objects. Different segmentation results may be produced for the
same image if merge and split operations are performed in different orders. For example, a top-
down image traversal may generate different results than a bottom-up traversal. Also, noise in the
image may erroneously contribute to the homogeneity metric, and hinder merging or cause pre-
mature splitting.
Seeded region growing algorithms [2][11][27] grow a region from a seed, which can be a sin-
gle pixel or cluster of pixels. Seeds may be chosen by the user, which can be difficult because the
user must predict the growth behavior of the region based on the homogeneity metric. Since the
number, locations, and sizes of seeds may be arbitrary, segmentation results are difficult to repro-
duce. Alternatively, seeds may be defined automatically, for example, the min/max pixel intensi-
ties in an image may be chosen as seeds if the region mean is used as a homogeneity metric [2]. A
region is constructed by iteratively incorporating pixels on the region boundary. Most methods
use two criteria to add pixels. One adds a boundary pixel if its intensity satisfies a homogeneity
metric defined over the entire region, for example, the mean or variance of the region. The other
criterion adds the pixel if its intensity is close enough to a pixel in its local neighborhood.

In contrast to these approaches, we utilize a new biologically inspired oscillator network,
called Locally Excitatory Globally Inhibitory Oscillator Network (LEGION) [20][23], to perform
segmentation on sampled medical image datasets. The network was proposed based on theoretical
and experimental considerations that point to the plausibility of oscillatory correlation as a repre-
sentational scheme. The oscillatory correlation theory assumes that the brain groups and segre-
gates visual features on the basis of correlation between neural oscillations [23][14]. It has been
found that neurons in the visual cortex respond to visual features with oscillatory activity (see [16]
for a review). Oscillations from neurons detecting features of the same object tend to synchro-
nize with a zero phase-shift, whereas oscillations from different objects tend to desynchronize
from each other. Thus, objects seem to be segregated in time.
After briefly introducing some background on neural oscillations, we describe LEGION and
its four key computational components in Sections II and III. In Section IV, a graph formulation
is used to show that LEGION is able to label connected components. From a graph algorithm
derived from LEGION’s computational components, we present results of segmenting CT and
MRI image datasets in Section V. Our algorithm is based on a local neighborhood for grouping
and chooses seeds automatically. Next, we compare our results with manual segmentation in Sec-
tion VI. Finally we conclude with a discussion in Section VII.

II. LEGION MODEL


Recent experimental evidence and earlier theoretical considerations point to neural oscilla-
tions in the visual cortex as a possible mechanism by which the brain detects and binds features in
a visual scene. It is known that neurons in the visual system respond to features such as color, ori-
entation, and motion, and are arranged in regular structures called columns and hypercolumns. In
addition, these neurons respond only to stimuli from a particular part of the visual field. Recent
experimental work showed persistent stimulus-dependent oscillations around 40 Hz exhibited by
neuronal groups in the primary visual cortex [10][8]. Furthermore, synchronized behavior was
observed between spatially separate neuronal groups. This supports earlier theoretical consider-
ations [14] which suggest that cells acting as visual feature detectors bind together through corre-
lation of their firing activities. Previously proposed oscillator networks [19][4][12] represent an
oscillatory event by a single phase variable. These networks are limited when applied to image
segmentation. Oscillations in these networks are built into the system rather than stimulus-depen-
dent. More substantially, these systems rely on fully connected network architectures to achieve
synchronization, which results in indiscriminate grouping and loss of topological information.
LEGION is able to overcome these deficiencies with stimulus-dependent oscillations and fast and
long range synchrony by using local connections. In addition, LEGION achieves fast desyn-
chrony with a global inhibitory mechanism.
LEGION was proposed by Terman and Wang [20][23] as a biologically plausible computa-
tional framework for image segmentation and has been used successfully to segment binary and
grey-level image data [24]. It is a network of relaxation oscillators, each constructed from an
excitatory unit x and an inhibitory unit y as shown in Fig. 1. Unit x sends excitation to unit y,
which responds by sending inhibition back. When external input stimulus I is continuously
applied to x, this feedback loop produces oscillations. Neighboring oscillators are connected via
mutual excitatory coupling, as well as the global inhibitor (see (1a) below).
LEGION is formally defined and analyzed in [24][20]. The behavior of each oscillator, indexed by i
in a network, is defined by the following equations:

ẋi = 3xi – xi³ + 2 – yi + ρ + Ii H( pi + exp(–αt) – θ ) + Si                    (1a)

ẏi = ε ( γ ( 1 + tanh( xi ⁄ β ) ) – yi )                                         (1b)

ṗi = λ ( 1 – pi ) H( ∑_{k ∈ N1(i)} Tik H( xk – θx ) – θp ) – δ pi                (2)

Si = ∑_{k ∈ N2(i)} Wik H( xk – θx ) – Wz H( z – θxz )                            (3)

ż = φ ( σ∞ – z )                                                                 (4)
The behavior of ẋ is defined in (1a) which contains a cubic function. The subtractive term yi rep-
resents inhibition from unit y, Ii is external stimulus, and Si represents excitatory coupling with
neighboring oscillators and coupling with the global inhibitor. We call i stimulated if Ii > 0 , and
unstimulated if Ii ≤ 0 . The Heaviside step function H is defined as H(a) = 1 if a ≥ 0 , and
H(a) = 0 if a < 0 . The function H gates the oscillatory behavior by multiplying Ii . The vari-
able pi is called the lateral potential of the oscillator and is used to suppress noisy regions. Param-
eter θ is set in the range 0 < θ < 1 and is used as a threshold for pi and an exponential term which
decays at rate α . An oscillator whose lateral potential exceeds θ is referred to as a leading
oscillator. Parameter ρ denotes the amplitude of Gaussian noise and plays a role in assisting the separa-
tion of synchronized groups of oscillators. The behavior of ẏ is defined in (1b) which contains a
sigmoid function with β chosen small. Parameter ε is chosen to be small, i.e. 0 < ε « 1 , and
determines that the oscillator defined in (1) is a relaxation oscillator with two time scales.
The potential term pi plays the role of removing the oscillations of noisy regions. Its value is
determined by the activities of its coupled neighbors. As shown in equation (2), for each neighbor
k ∈ N1(i) whose activity exceeds the threshold θx , the permanent connection weight Tik , which
defines the topology of the network, is accumulated. If the sum exceeds the threshold θp , then
the outer Heaviside function equals one and pi grows toward 1 through the term λ ( 1 – pi ) ,
where λ is a constant chosen to be O(1) . If the outer Heaviside function is 0, the potential decays to 0 at rate
δ , chosen to be O(ε) . Following Wang and Terman [24], N1(i) is called the potential neighbor-
hood of i and N 2(i) is called the recruiting neighborhood of i.
Equation (3) defines the coupling to oscillator i, which includes excitation from neighboring
oscillators and inhibition from a global inhibitor z. A threshold θx is applied to the activity of each
coupled oscillator xk , k ∈N 2(i) , using the function H. The resultant value is weighted by Wik ,
which is a dynamic weight used to achieve weight normalization to improve synchronization
[22]. Note that the neighborhood N 2(i) may be different from N1(i) that is used to compute the
potential. If the activity of the global inhibitor z is above the threshold θxz , then the weight Wz is
subtracted (see (3)). The global inhibitor is activated when at least one oscillator in the network is
excited, i.e. in the active phase as will be described. In (4), σ ∞ is 1 if xi ≥ θzx for at least one
oscillator i, and 0 otherwise, and φ is a parameter. Fig. 2 shows a 2D network architecture with 4-
neighborhood coupling. The global inhibitor, shown with a black circle, is coupled with the entire
network.
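To make these dynamics concrete, the following sketch integrates a single, uncoupled, stimulated oscillator from (1a)-(1b) with forward Euler steps, holding the gate H at 1, setting Si = 0, and treating the noise term as a small constant; the parameter values are illustrative assumptions only, and this is not the simulation code used later in the paper (which employs a fourth-order Runge-Kutta method).

/* Sketch only: forward Euler integration of one uncoupled, stimulated
   oscillator from (1a)-(1b), with Si = 0, the gate H held at 1, and the
   noise term replaced by a small constant.  Parameter values are
   illustrative assumptions, not those used in the paper. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double x = 0.1, y = 0.1;          /* excitatory unit x, inhibitory unit y */
    const double Iext = 0.8;          /* external stimulus, Iext > 0          */
    const double eps  = 0.02;         /* 0 < eps << 1: two time scales        */
    const double gam  = 6.0;          /* controls time spent in each phase    */
    const double beta = 0.1;          /* sigmoid steepness                    */
    const double rho  = 0.02;         /* noise amplitude, treated as constant */
    const double dt   = 0.001;
    long n;

    for (n = 0; n < 200000; n++) {
        double dx = 3.0*x - x*x*x + 2.0 - y + rho + Iext;      /* (1a), H = 1, Si = 0 */
        double dy = eps * (gam * (1.0 + tanh(x / beta)) - y);  /* (1b)                */
        x += dt * dx;
        y += dt * dy;
        if (n % 500 == 0)
            printf("%g %g %g\n", n * dt, x, y);   /* x-y trace of the limit cycle */
    }
    return 0;
}

Plotting y against x for such a run traces a counter-clockwise limit cycle of the kind illustrated in Fig. 3a, with slow drift along the left and right branches and fast jumps at the knees.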
The behavior of an oscillator is qualitatively shown in the phase plane diagram (see Fig. 3),
which is an XY plot of the activities of x and y. Fig. 3a illustrates the oscillatory behavior by a
limit cycle trajectory. When equations (1a) and (1b) are each set to zero, i.e. ẋ = 0 and ẏ = 0 ,
two nullclines are defined called the x-nullcline and the y-nullcline, respectively. The x-nullcline
is a cubic with three branches called the left, middle, and right branches (Fig. 3a). The left and
middle branches connect at a point called the left knee (LK) and the middle and right branches
connect at the right knee (RK). The y-nullcline is a sigmoid function, where β is chosen to be
small so that the sigmoid is close to a step function. The two nullclines intersect at an unstable
fixed point along the middle branch of the cubic. A stimulated oscillator starting at some random
position will be attracted to a counter-clockwise limit cycle trajectory illustrated in Fig. 3a. The
section of the orbit that lies on the left branch is called the silent phase, because x has low activity.
Similarly, the section on the right branch is called the active phase. The temporal behavior of an
oscillator exhibits two time scales due to ε . The slow time scale occurs during the two phases,
denoted by the single arrows in Fig. 3a. Parameter γ in equation (1b) controls the relative times
an oscillator spends in the two phases, whereby a larger γ leads to a shorter time in the active
phase. The fast time scale, denoted by double arrows in Fig. 3a, occurs when an oscillator alter-
nates, or jumps, between the two phases at either LK or RK.
In equation (1a), I and S have the effect of vertically shifting the cubic. The position of the cubic
relative to the sigmoid defines two states that an oscillator may be in at any given time. The first is
an oscillatory state where the cubic intersects the sigmoid at exactly one point and a limit cycle
exists (Fig. 3a). The non-oscillatory state occurs when the cubic shifts downward so that LK is
below the left part of the sigmoid (Fig. 3b). Two fixed points are created on two sides of LK, and
the stable fixed point to the left of LK will attract an oscillator travelling in the silent phase and
prevent it from jumping up to the active phase and oscillating.

III. COMPUTATIONAL COMPONENTS OF LEGION


We use LEGION to group similar features and segregate dissimilar ones in a scene. Each
oscillator will respond to a detected feature at some location in the image. Two or more oscillators
that likely respond to features of the same object will group together by the process of synchroni-
zation, where these oscillators have the same phase. A synchronized group of oscillators is sepa-
rated from other synchronized groups that represent different objects by desynchronization. There
are four computational components of LEGION that describe its behavior: group participation,
group formation, group segregation, and group suppression. Each is implemented by the follow-
ing network components: input stimulus, excitatory coupling, global inhibitor, and oscillator
potential, respectively.
From some initial network state, groups of oscillators will form by synchronization due to the
effects of the first two network components. Two or more oscillators are said to be synchronous if
they jump up and down simultaneously. Two coupled oscillators synchronize by excitation via the
coupling term S. An oscillator is able to participate in a group only when it receives a positive
input stimulus, i.e. when I > 0 and H = 1 in equation (1a). In this case, the cubic is shifted
upward as in the oscillatory state shown in Fig. 3a. When an oscillator jumps up, it sends excita-
tion to its neighbors and shifts their cubics upward. Stimulated neighbors that are close to the
jumping oscillator in the silent phase will immediately jump up as well because they will lie
below their LK [17]. Thus, the propagation of local excitation between coupled oscillators forms
groups of oscillators, each of which is stimulated and connected. Terman and Wang [20] have
shown that synchronization occurs at an exponential rate.
A group of synchronized oscillators representing an object is distinguished from other
groups representing different objects by a selective gating mechanism using the global inhibitor
[20]. Global inhibition works to ensure that only one group will be in the active phase at a time.
Assume n oscillator groups are all travelling in the silent phase. When the leading group jumps
up, it sends excitation to the global inhibitor, which in turn responds with global inhibition. The
cubics of all oscillators in the network shift downward. Oscillators in other groups travelling in
the silent phase enter the non-oscillatory state (Fig. 3b). The stable fixed point to the left of LK
acts as a gate to prevent other groups from jumping up. On the other hand, the leading group will
travel in the active phase and eventually reach RK and jump down. This removes global inhibi-
tion and the next group in the silent phase closest to LK will be able to jump up to repeat the pro-
cess. Thus, each synchronous group is desynchronized from all other groups because of the global
inhibitor.
The oscillator potential is added to LEGION to remove small noise fragments that could dis-
rupt network dynamics [24]. It forces certain groups to enter and stay in the non-oscillatory state.
From equation (1a) it is easy to see that the potential is able to remove the effect of I via function
H. At the beginning of network dynamics, the potential is initially zero and the exponential func-
tion has a large value so that H has value 1. The exponential decay is chosen to be slow enough so
that group formation occurs. The potential starts to play an important role when the exponential
function becomes less than the threshold θ . If no oscillator in a group develops a large enough
potential through equation (2), then the entire group enters and stays in the non-
oscillatory state. On the other hand, if at least one oscillator, called a leader [24], is coupled with
enough synchronized neighbors and develops a high potential, then the entire group remains in
the oscillatory state. All oscillators in the group will remain synchronous, even though not all
oscillators in the group are leaders. This is because the leader is able to sustain a large potential at
each cycle and its jumping up propagates local excitation to its coupled neighbors. Leaders are
able to maintain high potentials because they are able to reinforce their potentials when their
neighbors jump up to the active phase synchronously.
The above components have been rigorously analyzed by Terman and Wang [24][20]. To
illustrate, Fig. 4 shows a computer simulation of a 20x20 LEGION network using 4-neighborhood con-
nectivity for both N1 and N 2 (see Fig. 2). The differential equations (1)-(4) were solved using a
fourth-order Runge-Kutta method. The input image is a 20x20 binary image shown in Fig. 4a,
with four objects (arrow, square, cross, and rhombus) and some background noise. Pixels are
mapped to oscillators in one-to-one correspondence, and network connections preserve pixel
adjacency and are set to 1. An oscillator receives a positive input stimulus if its associated pixel is
black. Thus, each object in the image, i.e. a spatially separate black region, is represented by a
synchronized group of oscillators that desynchronizes from all other groups. Fig. 4b shows plots
of the temporal activity of oscillators in each group. Each trace combines the activities for all
oscillators in a group. The horizontal axis is time and the vertical axis is the normalized x activity.
The trace for the global inhibitor shows the z activity. The activity of the oscillators representing the
background noise is shown in the trace labeled “Background.” Initially, each stimulated oscillator
has random activity, but is able to quickly produce oscillatory behavior. The active phase corre-
sponds to high activity durations and the silent phase corresponds to low activity durations. The
instantaneous jumping between phases is shown by the sudden change from low to high activity
or vice versa. The global inhibitor is excited whenever an oscillator enters the active phase. Syn-
chronization is achieved quickly and desynchronization occurs after only two cycles. The near
steady limit cycle behavior occurs after the vertical dotted line because all synchronized groups
are separated, and oscillators representing the background have entered the non-oscillatory state.

IV. CONNECTED LABELLING USING LEGION


Using direct computer simulation of a LEGION network to segment large 2D images and vol-
ume datasets is computationally infeasible because it requires numerically integrating a huge
number of differential equations. Based on LEGION’s computational components, Wang and Ter-
man [24] gave a straightforward algorithmic implementation of LEGION. In this section, we take
a functional viewpoint by showing that the network essentially performs connected labelling of
image pixels. We construct an image segmentation algorithm that utilizes a region growing tech-
nique to group pixels. First, the LEGION network is formulated with a graph representation. Sec-
ond, we use a set of graph operations that are directly derived from LEGION’s computational
components to construct a graph representation of the network in its steady limit cycles. This for-
mulation leads to an image segmentation algorithm called Graph-LEGION, or G-LEGION. Our
algorithm is functionally equivalent to LEGION with appropriately chosen network parameters. It
differs from the Wang and Terman algorithm in our graph formulation, which leads to a more effi-
cient implementation. In the next section, we show results of the image segmentation algorithm
when used to segment medical image datasets.
The LEGION network can be represented by an undirected graph G(V, E, L), where V is the
set of vertices, E is the set of edges, and L is a set of integer labels assigned to vertices. Given a
LEGION network N, the graph G is constructed using the following steps:
1. Create a vertex in G for each stimulated oscillator.
2. Add an edge between two neighboring vertices if their associated oscillators are coupled
with large enough dynamic connection weight Wij .
3. Assign an integer label to each vertex such that the same label is assigned to two different
vertices if and only if their associated oscillators are synchronous.
Only the stimulated oscillators from N are represented in G because only these oscillators will
participate in oscillatory groups. Edges in G represent synchronization between coupled oscilla-
tors. It is easy to see that G contains a collection of components, each of which is a largest con-
nected subgraph (LCS) of G such that a path exists between any two vertices within the subgraph.
A path between any two vertices is a traversal of edges and vertices that starts at one of the two
vertices and ends at the other. Also note that all vertices within the same component will have the
same label assignment, because synchronization is transitive. The graph representing a network in
its initial state may have a large number of small components if oscillators have not synchronized
yet. A graph representing the network in its steady limit cycles will assign the same label to all
vertices in the same component and assign different labels to vertices in different components.
We define graph operations that construct a graph G representing the network in its steady
limit cycles. Each operation is directly derived from the four computational components of
LEGION in the following way:

1. Group Participation
Construct a vertex in G for each stimulated oscillator.

2. Group Formation
Add an edge between two vertices if and only if their associated oscillators
are coupled and synchronous in the steady limit cycles.

3. Group Segregation
Assign a unique label to all vertices in the same LCS.

4. Group Suppression
Remove all LCS’s in G that do not contain any leading vertex.

The first two operations construct LCS’s by creating all vertices and establishing appropriate
edges. Next, each LCS is assigned a unique label. Finally, LCS’s with no leading vertices are
removed from the graph, where a leading vertex corresponds to a leading oscillator in the
LEGION network and will be described in the following two subsections. This graph formulation
reflects some desirable properties of LEGION when applied to image segmentation. First, the
topology of G corresponds to spatial proximity (a key grouping principle), which ensures that
segmented regions will be connected and can represent arbitrarily complex objects. The edge
assignments imply that grouping is local, which allows for effective segmentation of objects with
a relatively smooth change in pixel intensities locally, but with a large intensity variation over the
entire object. The segmentation is order independent because the construction of the LCS is inde-
pendent of the assignment of edges and vertex labels. The last graph operation implements a noise
removal mechanism to prune small noisy fragments as well as large noisy areas, such as sampling
artifacts commonly found in sampled medical images.
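As a minimal illustration of operations 2-4, the following sketch assumes the vertex set and edge set of G have already been constructed, labels each largest connected subgraph with a union-find structure, and then suppresses components that contain no leading vertex. The names and data layout are our own illustration, not part of the G-LEGION implementation given below.

/* Sketch (illustration only): label the LCS's of G with union-find, then
 * suppress components that contain no leading vertex.                      */
#include <stdlib.h>

typedef struct { int u, v; } Edge;

static int find_root(int *parent, int i)
{
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];   /* path halving */
        i = parent[i];
    }
    return i;
}

/* labels[i] receives a positive component label, or -1 (background) when
 * vertex i lies in a component with no leading vertex.                     */
void label_components(int n_vertices, const Edge *edges, int n_edges,
                      const int *is_leader, int *labels)
{
    int *parent     = malloc(n_vertices * sizeof(int));
    int *has_leader = calloc(n_vertices, sizeof(int));
    int *comp_label = calloc(n_vertices, sizeof(int));
    int i, next_label = 1;

    for (i = 0; i < n_vertices; i++) parent[i] = i;

    /* Group formation: merge the endpoints of every edge. */
    for (i = 0; i < n_edges; i++) {
        int a = find_root(parent, edges[i].u);
        int b = find_root(parent, edges[i].v);
        if (a != b) parent[a] = b;
    }

    /* A component survives only if it contains at least one leading vertex. */
    for (i = 0; i < n_vertices; i++)
        if (is_leader[i]) has_leader[find_root(parent, i)] = 1;

    /* Group segregation and suppression: one label per surviving LCS. */
    for (i = 0; i < n_vertices; i++) {
        int r = find_root(parent, i);
        if (!has_leader[r]) { labels[i] = -1; continue; }
        if (comp_label[r] == 0) comp_label[r] = next_label++;
        labels[i] = comp_label[r];
    }

    free(parent); free(has_leader); free(comp_label);
}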

A. G-LEGION For Binary Images


It is easy to see that our graph formulation maps to a LEGION representation in a straightfor-
ward manner. Since all final LCS’s contain at least one leading vertex and all vertices within the
same LCS are reachable from a leading vertex, we can construct an image segmentation algo-
rithm that utilizes a region growing technique to identify pixels within the same region of an
image. Let a leading pixel or leader in an image correspond to a leading vertex in the correspond-
ing graph, which also corresponds to a leading oscillator in LEGION. We use a leader as a seed
from which a region is grown by traversing all possible pixel paths. Whereas each seed spawns
exactly one distinct region in seeded region growing techniques, leaders in our algorithm may or
may not produce a resultant region.
In Section III, we showed results of a computer simulation of a LEGION network segmenting the
black pixel regions in Fig. 4a. Here, we present our algorithm to segment binary images. A
leader is a black pixel with a sufficient number of black pixels in the potential neighborhood of
the corresponding leading oscillator (see (2)). The G-LEGION algorithm for binary images has
the following steps:

G-LEGION For Binary Images
1. Initialization
Set all pixel labels to 0;
Set current_label = 1;
2. Leader Identification and Recruiting
For (each black pixel p) Do
If (unlabeled) AND (# of black pixel neighbors of p > θp ) Then
Set p’s label to current_label;
Traverse all paths from p that consist of only unlabeled black pixels and
Set the pixel labels of all traversed pixels to current_label;
current_label = current_label + 1;
EndIf;
EndFor;
3. Background Collection
Set the label of all remaining unlabeled black pixels to -1.

All pixels are initially unlabeled (label 0). For each unlabeled leader that is found in step 2,
called Leader Identification, a region is grown and assigned a unique label in the Recruiting pro-
cess. Step 2 requires only a single scan of the image. For a leader, the number of black pixels in its
potential neighborhood N1 is greater than the threshold θp . For example, if the potential neighbor-
hood is a 4-nearest neighborhood and θp = 2.5 , then our algorithm will produce the same segmen-
tation result as in the network simulation of Fig. 4. Note that sometimes it is more sensible to
define θp as the proportion of black pixels in N1. Also note that an unlabeled leader may be
recruited by another leader, and thus does not initiate recruiting by itself. Once an unlabeled
leader is chosen, recruiting will label all unlabeled black pixels (including leaders by definition)
that are reachable from the leader along graph paths consisting of only black pixels. A graph path
consists of a sequence of pixels starting at the leader, and each successive pixel on the path is in
the coupling neighborhood N 2 (see (3)) of the previous pixel in the path. After all unlabeled lead-
ers receive labels, the remaining unlabeled pixels are collected into the background, which may
be disjoint. For example, the small noise fragments shown in Fig. 4a will be collected into the
background by this algorithm. It follows from our graph formulation that the order in which unla-
beled pixels are chosen for labeling and the order in which pixels are traversed during recruiting
do not affect the outcome of the algorithm.
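A compact C rendering of the binary algorithm might look as follows. It is a sketch under simplifying assumptions (4-neighborhoods for both N1 and N2, a row-major image, recruiting via an explicit stack) rather than the actual G-LEGION implementation.

/* Sketch of G-LEGION for binary images with N1 = N2 = 4-neighborhood.
 * img[y*w+x] is nonzero for a black pixel; label[] receives 0 initially,
 * region labels 1, 2, ..., and -1 for the background.                      */
#include <stdlib.h>

static const int DX[4] = { -1, 1, 0, 0 };
static const int DY[4] = {  0, 0, -1, 1 };

void glegion_binary(const unsigned char *img, int w, int h,
                    double theta_p, int *label)
{
    int n = w * h, current_label = 1, i;
    int *stack = malloc(n * sizeof(int));

    for (i = 0; i < n; i++) label[i] = 0;

    for (i = 0; i < n; i++) {
        int x = i % w, y = i / w, k, black_nbrs = 0, top = 0;
        if (!img[i] || label[i]) continue;

        /* Leader identification: count black pixels in N1. */
        for (k = 0; k < 4; k++) {
            int nx = x + DX[k], ny = y + DY[k];
            if (nx >= 0 && nx < w && ny >= 0 && ny < h && img[ny*w + nx])
                black_nbrs++;
        }
        if (black_nbrs <= theta_p) continue;

        /* Recruiting: label every unlabeled black pixel reachable from i. */
        stack[top++] = i;
        label[i] = current_label;
        while (top > 0) {
            int p = stack[--top], px = p % w, py = p / w;
            for (k = 0; k < 4; k++) {
                int nx = px + DX[k], ny = py + DY[k];
                if (nx < 0 || nx >= w || ny < 0 || ny >= h) continue;
                if (img[ny*w + nx] && label[ny*w + nx] == 0) {
                    label[ny*w + nx] = current_label;
                    stack[top++] = ny*w + nx;
                }
            }
        }
        current_label++;
    }

    /* Background collection: remaining unlabeled black pixels. */
    for (i = 0; i < n; i++)
        if (img[i] && label[i] == 0) label[i] = -1;

    free(stack);
}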

B. Segmentation Algorithm For Grey Level Images


The LEGION network described in Section II to segment binary images uses external stimu-
lus to define black pixels. To use this network for grey level images, a threshold could be used to
binarize each pixel intensity before it is input to the network. This is equivalent to applying a glo-
bal threshold to the image and then labeling each spatially separate region by the algorithm given
in the previous subsection. As mentioned before, however, this gives poor segmentation results in
grey level images, because there are usually no clear boundaries between objects. Instead, pixels
should be grouped together based upon the similarity between their local intensity values. Thus,
to segment grey level image datasets, we utilize a LEGION network in which external stimulus is
applied uniformly to all oscillators in the network, and dynamic connection weights are assigned
values that reflect the similarity between the corresponding pixels of coupled oscillators.
Our segmentation algorithm for grey level images has the same three steps as the algorithm
for binary image data:

G-LEGION For Grey Level Images


1. Initialization
Set all pixel labels to 0;
Set current_label = 1;
Set boundary_set to an empty set;
2. Leader Identification and Recruiting
For (each pixel p) Do
If (unlabeled) AND (# of “similar” pixel neighbors of p > θp ) Then
Place p into the boundary_set;
While (|boundary_set| ≠ 0) Do
Remove a pixel b from the boundary_set;
Set b’s label to current_label;
For (each a ∈ N 2( b ) ) Do
If (unlabeled) AND (a is “similar” to b) AND (a ∉ boundary_set) Then
Place a into boundary_set;
EndIf;
EndFor;
EndWhile;
current_label = current_label + 1;
EndIf;
EndFor;
3. Background Collection
Set the label of all remaining unlabeled pixels to -1.

As before, pixels are initially unlabeled and the image is scanned once for unlabeled leader pixels
from where regions are grown and labeled. A straightforward assignment for dynamic connection
weights between two coupled oscillators is 1/(1 + |p1 - p2|), where we use p1 and p2 to denote the
pixels, as well as their intensity values, that correspond to the oscillators. A more sophisticated
weight assignment, called the adaptive measure and given later in this section, is used in this paper. Given a “simi-
larity” metric, the grouping between two neighboring pixels, both for leader identification and
recruiting, can simply be a threshold (or tolerance) applied to the pixel intensity difference. A
leader has at least θp pixels in its potential neighborhood that are similar to the leader. Three-
dimensional segmentation is readily obtained by using 3D neighborhood kernels. For example, in
2D, a 4-neighborhood kernel defines the neighboring pixels to the left of, right of, above, and
below the center pixel; an 8-neighborhood kernel contains all pixels that are one pixel away, includ-
ing the diagonal directions; and a 24-neighborhood kernel includes all pixels that are up to two pixels
away. In 3D, a neighborhood may be viewed as a cube. A 6-neighborhood kernel contains voxels
that are face-adjacent, a 26-neighborhood kernel contains voxels that are face, edge, and vertex
adjacent, and a 124-neighborhood kernel includes voxels that are up to two voxels away. As before, the back-
ground collection step accumulates all remaining unlabeled pixels into the background.
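All of the neighborhood kernels above can be enumerated by a small helper. The sketch below, with hypothetical names of our own, generates offsets within a given Chebyshev radius, optionally restricted to face-adjacent offsets, which yields the 4-, 8-, and 24-neighborhoods in 2D and the 6-, 26-, and 124-neighborhoods in 3D.

/* Sketch: enumerate neighborhood offsets.  radius is the Chebyshev radius
 * (1 gives the 8- or 26-neighborhood, 2 gives the 24- or 124-neighborhood);
 * face_only nonzero restricts offsets to face-adjacent ones (4- or
 * 6-neighborhood).  Returns the number of offsets generated.               */
#include <stdlib.h>

int make_offsets(int radius, int face_only, int three_d,
                 int (*offsets)[3], int max_offsets)
{
    int dx, dy, dz, count = 0;
    int zmin = three_d ? -radius : 0, zmax = three_d ? radius : 0;

    for (dz = zmin; dz <= zmax; dz++)
        for (dy = -radius; dy <= radius; dy++)
            for (dx = -radius; dx <= radius; dx++) {
                if (dx == 0 && dy == 0 && dz == 0) continue;   /* skip center */
                if (face_only && abs(dx) + abs(dy) + abs(dz) != 1) continue;
                if (count < max_offsets) {
                    offsets[count][0] = dx;
                    offsets[count][1] = dy;
                    offsets[count][2] = dz;
                }
                count++;
            }
    return count;
}

/* Examples: make_offsets(1, 1, 0, ...) yields the 4-neighborhood,
 *           make_offsets(2, 0, 0, ...) the 24-neighborhood, and
 *           make_offsets(2, 0, 1, ...) the 124-neighborhood.                */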
Once a leader is identified, the recruiting process grows a region by traversing and labeling
pixels along all possible paths from the leader. A path is an ordered list of pixels such that each
pixel is similar to the preceding pixel along the path. Our algorithm finds all paths from the leader
by processing unlabeled pixels on the boundary of the growing region with a data structure called
the boundary_set. Initially, only the leader is placed into the boundary_set. At each iteration, a
pixel is removed from the set, assigned the region’s label, and all unlabeled similar pixels in its
coupling neighborhood N 2 are placed onto the boundary_set. Our graph formulation implies that
the boundary_set is unordered since we essentially identify an LCS. The recruiting process ends
when the boundary_set is empty. The implementation of the boundary_set is a tradeoff between
memory and computation time, which are key performance issues when dealing with large image
datasets. A less memory demanding solution is to eliminate the boundary_set altogether and
search the entire image for an unlabeled boundary pixel at each iteration. But this is time consum-
ing, especially for volume data, because most of the visited pixels will not be on the boundary and
thus need not be visited. A bounding box may be placed around the growing region to reduce the
search. Our implementation uses a stack for the boundary_set (see also [2]). Computation time is
reduced because stack operations contain simple push and pop operations and only boundary pix-
els are processed. The size of the stack is on the order of the surface area of the region, which may be large
for volume datasets.
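The recruiting loop with a stack-based boundary_set can be sketched as follows; similar() stands for the pixel-similarity predicate (for example, the adaptive measure given later in this section), the offsets define N2, and the caller provides a stack with room for one entry per pixel. This is an illustration of the data flow only, not the implementation described in Section V.

/* Sketch of the recruiting step for grey level images: grow one region from
 * a leader pixel, using a stack as the boundary_set.  similar(a, b) is the
 * pixel-similarity predicate and off[][2] holds the 2D N2 offsets.          */
void recruit_region(const unsigned char *img, int w, int h,
                    int leader, int region_label,
                    const int (*off)[2], int n_off,
                    int (*similar)(unsigned char, unsigned char),
                    int *label, int *stack)
{
    int top = 0;
    label[leader] = region_label;
    stack[top++] = leader;               /* boundary_set = { leader }        */
    while (top > 0) {
        int p = stack[--top];            /* remove a pixel from the set      */
        int px = p % w, py = p / w, k;
        for (k = 0; k < n_off; k++) {
            int nx = px + off[k][0], ny = py + off[k][1], q;
            if (nx < 0 || nx >= w || ny < 0 || ny >= h) continue;
            q = ny * w + nx;
            if (label[q] == 0 && similar(img[p], img[q])) {
                label[q] = region_label; /* labeling on push doubles as the  */
                stack[top++] = q;        /* "not in boundary_set" test       */
            }
        }
    }
}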
Our algorithm requires that the user set three parameters, N1, N 2 , and θp , plus the parameters
of the similarity metric. Each parameter influences the number and sizes of segmented
regions. It is easy to see that leader pixels lie in homogeneous parts of an image. N1 and θp deter-
mine the identification of leaders and the minimum size for a segmented region in an image. Usu-
ally, the number of leaders increases as N1 decreases. Since more than one leader may be
incorporated into the same region, a smaller potential neighborhood does not necessarily produce
more segmented regions. A smaller threshold θp usually yields many tiny regions in the final seg-
mentation, and a larger value will produce fewer and larger regions. This is because when θp is
small, tiny regions in the image can produce leaders. N 2 affects the sizes and number of final seg-
mented regions because it determines the possible paths from a leader in the recruiting process. A
smaller coupling neighborhood implies that recruiting is restricted and usually produces smaller
regions. In addition, more regions are produced because fewer leaders will be placed into the
same region.
Most of the interesting structures in sampled medical image datasets have relatively brighter
pixel intensities when compared with other structures. The separation between objects is usually
defined by a relatively large change in the intensities of pixels. In addition, an interesting object
usually has a cluster of pixels that have relatively homogeneous and gradually varying pixel
intensities. For example, see the bony and soft tissue structures in the CT image shown in Fig. 5a,
and the cortex and the extracranial tissue in the MRI image shown in Fig. 6a. The simple thresh-
old scheme mentioned previously applies a constant tolerance r to the difference between the
intensities of two pixels. This means that a pixel p is considered similar to pixels with intensities
in the range p ± r . Thus, small values of r will restrict region growing in areas such as the soft tis-
sue in Fig. 5a, and the brain in Fig. 6a. Large values of r, however, may cause undesirable region
merging between darker pixels and brighter ones (the flooding problem). This consideration led us
to use an adaptive similarity measure as given in the following.
We have three criteria for defining pixel similarity. The first is that leaders should be gener-
ated from both homogeneous and brighter parts of the image. Secondly, brighter pixels should be
considered similar to a wider range of pixels than darker pixels. The third criterion stipulates that
regions should expand to the boundaries of objects where pixel intensities have relatively large
variation. We use an adaptive tolerance scheme that satisfies the above three criteria. In this
scheme, larger tolerances are used when brighter pixels participate in the similarity measure. A
mapping is constructed between the grey level pixel intensities [ 0, p max ] known from the imag-
ery type, and a range of tolerance values [ r min, r max ] given by the user. When two pixels are
tested for similarity, their corresponding tolerance values are found and the larger of the two toler-
ances is applied to the difference between the pixel intensities. The following formula defines our
adaptive similarity measure: two pixels p1 and p2 are similar if and only if |p1 - p2| <
range(Max(p1, p2)). The function range returns the mapped tolerance value for a given pixel
intensity. This mapping determines how much a segmented region depends upon the local pixel
intensity variation. For example, a constant range function results in the simple tolerance scheme
described earlier. In the segmentation experiments to be reported in Section V, we use three map-
ping functions that are linear, square, and cubic. Specifically, we define these functions as
range(p) = ( r max – r min ) ⋅ ( p ⁄ p max )^n + r min , for n = 1 , 2 , and 3 , respectively. Since the range of
pixel intensities is finite, the mapping function may be pre-computed and stored in a lookup table.
This greatly reduces the segmentation time, especially when the range function is complex,
because the function is used heavily within the recruiting procedure.
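In code, the precomputed lookup table and the adaptive similarity test could take the following form; the function and variable names are our own, while pmax, rmin, rmax, and n correspond to the quantities defined above.

/* Sketch: precompute range(p) = (rmax - rmin) * (p / pmax)^n + rmin for
 * every possible intensity p in [0, pmax].                                 */
#include <math.h>
#include <stdlib.h>

double *build_tolerance_lut(int pmax, double rmin, double rmax, int n)
{
    double *lut = malloc((pmax + 1) * sizeof(double));
    int p;
    for (p = 0; p <= pmax; p++)
        lut[p] = (rmax - rmin) * pow((double)p / pmax, (double)n) + rmin;
    return lut;
}

/* Adaptive similarity: |p1 - p2| < range(max(p1, p2)). */
int similar_adaptive(int p1, int p2, const double *lut)
{
    int brighter = p1 > p2 ? p1 : p2;
    return abs(p1 - p2) < lut[brighter];
}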

V. RESULTS
We show results of the G-LEGION image segmentation algorithm on 2D and volume CT and
MRI medical datasets of the human head. The user provides six input parameters: the potential
neighborhood N1, the coupling neighborhood N 2 , the threshold θp , the power n of the adaptive
tolerance mapping function, and the tolerance range variables r min and r max .
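For reference, one possible way to package the six user parameters is a single structure, as sketched below; the layout is our own, and the example values are those reported for Fig. 5b.

/* The six user-supplied G-LEGION parameters (one possible layout, not the
 * actual implementation).  Neighborhoods are given by their kernel sizes.  */
typedef struct {
    int    n1_size;   /* potential neighborhood N1 (e.g., 8 for 8-neighborhood)  */
    int    n2_size;   /* coupling neighborhood N2 (e.g., 24 for 24-neighborhood) */
    double theta_p;   /* leader threshold on similar neighbors in N1             */
    int    n;         /* power of the adaptive tolerance mapping function        */
    double r_min;     /* tolerance assigned to the darkest intensity             */
    double r_max;     /* tolerance assigned to the brightest intensity           */
} GLegionParams;

/* Settings reported for the CT segmentation of Fig. 5b. */
static const GLegionParams ct_nasal_params = { 8, 24, 7.5, 1, 3.0, 32.0 };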
Fig. 5a is a CT image of a view of a horizontal section extracted at the nasal level of a human
head. The white areas are bone and grey areas are soft tissue. Fig. 5b shows results of the G-
LEGION algorithm using an 8-neighborhood N1 , a 24-neighborhood N 2 , θp = 7.5, n = 1, r min =
3, and r max = 32. In this as well as all following figures, we use a color map to indicate the results
of segmentation, where each color indicates a distinct segment. The color map in Fig. 5b contains
105 segmented regions, and the background corresponds to black scattered areas. Due to limited
color discriminability of the printer, spatially separate regions that appear to have the same color
are in fact different segmented regions. Whereas a global thresholding method would collect all
bony structures into a single region and would have difficulty distinguishing the soft tissue, our
algorithm is able to segment both. Furthermore, each spatially distinct bony structure or soft tis-
sue area is separately labeled, as well as surrounding nasal cavities and passages.
MRI image data have the advantage of being able to display soft tissue structures better than
CT image data. They are also more difficult to segment because objects are usually nonhomoge-
neous and adjacent objects may have a low contrast. For example, see the MRI image shown in
Fig. 6a, which is a mid-sagittal section of the head showing many structures, including brain, ver-
tebrae, oral cavity, extracranial tissue, bone marrow, and muscle. Fig. 6b shows the result of seg-
mentation by G-LEGION using an 8-neighborhood N1, a 24-neighborhood N 2 , θp = 3.5, n = 3,
r min = 1, and r max = 95. As shown in the color map of Fig. 6b, the MRI image is segmented into
70 regions, plus a background which is again indicated by black scattered areas. The entire brain
is segmented as a single region (colored in yellow). Other significant segments include the
extracranial tissue (colored in red), the chin and the neck parts, the vertebrae, and so on.
As discussed before, each parameter has an effect on the number and sizes of segmented
regions. An understanding of these effects is helpful for the user to choose appropriate parameter
values in order to achieve desired segmentation. In Figs. 7-9, we illustrate the effects of various
parameter values on results of segmentation. First, we vary the size of the coupling neighborhood.
Figs. 7a-b differ from Fig. 6b in only the size of N 2 , where 8-neighborhood and 4-neighborhood
are used respectively, and our algorithm segmented 309 and 599 regions, respectively. These
results show that when N 2 is reduced, region expansion is more restrictive, and regions become
smaller and more numerous. This is because a smaller coupling neighborhood results in more
restricted pixel paths during the recruiting step of the algorithm. This effect is clearly demon-
strated, for example, in the fragmented regions representing the brain and extracranial tissue in
Figs. 7a and 7b. The size of the potential neighborhood and the threshold θp are used in the leader
identification step to determine which pixels lie in a relatively homogeneous region of an image
and are thus qualified as leaders. When the homogeneity criterion is relaxed, more leaders will be
generated. On the other hand, when the size of N1 decreases, a pixel more easily becomes a leader
because fewer similar pixels are needed in leader identification. Using the same parameter set-
tings of Fig. 6b, we show results of changing N1 from an 8-neighborhood to a 24-neighborhood in
Fig. 7c and 4-neighborhood in Fig. 7d. Fig. 7c contains 15 segments, and Fig. 7d contains 318
segments, consistent with our analysis. In Fig. 7c, the area between the brain and the extracranial
tissue, which contains fragmented bone marrow in Fig. 6b, is collected into the background,
while Fig. 7d, due to the smaller N1 , shows more fragments within this area than Fig. 6b. Changing
the value of θp has a similar effect. Figs. 7e-f show results using the same parameters as for Fig.
6b, except for θp = 7.5 in Fig. 7e and θp = 2.5 in Fig. 7f. The color map in Fig. 7e contains 10
segments and in Fig. 7f contains 485 segments.
The roles of the adaptive similarity measure and its parameters are illustrated in Figs. 8 and 9.
Fig. 8a displays an MRI image of a horizontal section of a human head at the eye level, showing
the structures of brain, two eyeballs, the ventricle at the center, extracranial tissue, and scattered
bone marrow. To apply our adaptive similarity measure, pixel intensities are first mapped to toler-
ance values, which are then applied to the difference between a neighboring pixel pair. As the
power n increases, relatively larger tolerance values are assigned to brighter pixels, which means
that brighter pixels have relatively larger ranges of grouping. This is desirable if brighter pixels
represent more interesting structures, as is usually the case for sampled medical images. In other
words, a brighter pixel groups with a larger range of pixels than does a darker one. Figs. 8b-8d
show segmentation results using the cubic, square, and linear tolerance functions, respectively, on
the MRI image of Fig. 8a. In this case, N1 and N 2 are both set to 8-neighborhood, θp = 4.5, r min
= 4, and r max = 18. The color maps in Figs. 8b-8d contain 288, 239, and 173 regions, respectively.
The three color maps show that areas with bright pixels are consistently segmented, such as the
white structures behind the eyes, the brain, and the brighter parts of the extracranial tissue. Also
consistently segmented are the two eyeballs. In the leader identification step of the G-LEGION
algorithm, the tolerance function has a similar effect in determining leaders as do N1 and θp. In
the recruiting step, the tolerance function affects region growth because it specifies which pixels
in N 2 are on pixel paths. Region expansion depends on how fast pixel intensities change within a
region. The boundaries between regions occur where intensity changes are too large. When the
pixel intensities within a region change substantially, a tolerance function with a higher n usually
tends to break the region apart. For example, the brain is segmented into many parts in Fig. 8b,
whereas it remains a single region in Figs. 8c and 8d. In Fig. 8d, however, boundary details of the
brain are lost. Segmentation of the extracranial tissue shows the same effect. Thus, an appropriate
choice of the tolerance function depends upon the intensity variation characteristics of target
objects.
The user can also control the grouping for darker and brighter pixels by selecting r min and
r max . When either parameter is changed, the tolerance function remaps all intensities to a new
range of tolerance values. When r min is increased, tolerance values will increase more quickly for
darker pixels. A similar effect holds for brighter pixels when r max is increased; moreover, the
effect is more dramatic. In Fig. 9, we show segmentation results using a cubic tolerance function,
clamping r min to 1, and varying r max from 80 in Fig. 9a, to 40 in Fig. 9b and 20 in Fig. 9c. The
remaining parameters are the same as in Fig. 8. The results in Fig. 9a-c contain 278, 375, and 459
regions, respectively. The white areas behind the eyes, the bright areas that correspond to the
brain and the extracranial tissue (see Fig. 8a) are grouped more fully in Fig. 9a than in Figs. 9b
and 9c.
Figs. 9 and 10 also illustrate how our algorithm handles the various types of noise artifacts
commonly found in sampled image data. Fig. 8a contains "salt and pepper" noise in the areas sur-
rounding the head, as well as arbitrarily sized noisy areas caused by too small a sampling rate com-
pared with the sizes of the structures being imaged, such as the nasal area between the two eyes.
Fig. 9 shows that these noise artifacts are collected into the background and do not affect segmen-
tation of other regions. To further illustrate the robustness of our algorithm to noise artifacts, we
reduce the resolution of the image of Fig. 8a by half by throwing away every other pixel value. As
shown in Fig. 10a, this magnifies the noise artifacts, especially near the boundaries of objects.
The result of segmenting Fig. 10a is shown in Fig. 10b using a 24-neighborhood N1 , an 8-neigh-
borhood N 2 , θp = 11.5, n = 3, r min = 1 and r max = 150. The segmentation result contains 131
regions. The algorithm is able to segment the brain, the areas behind the eyes, and the extracranial
tissue while placing a large number of noisy areas into the background.
Figure 11 shows segmentation results on a number of 256x256 MRI images of different head
sections. Parameters have been chosen in order to extract various meaningful structures. In Fig.
11b, significant regions that are segmented include the cortex, the cerebellum, and the extracranial
tissue, as well as two ear segments. In Fig. 11d, the entire brain is segmented as a region, so are
the extracranial tissue and the neck muscle. In Fig. 11f, the cortex and the cerebellum are well
separated. Other interesting regions that are segmented include the chin part, and the extracranial
tissue. In Fig. 11h, again the cortex and the cerebellum are well segmented. In addition, the
brainstem and the ventricle lying at the center of the brain are correctly separated. Other struc-
tures are also well segmented as in Fig. 11f.
As discussed before, our G-LEGION algorithm easily extends to segment volume data by
expanding 2D neighborhoods to 3D. To illustrate 3D segmentation, we show the result of seg-
menting an entire MRI volume dataset, from which the image in Fig. 8a was obtained. The vol-
ume dataset consists of 128 horizontal sections, and each section consists of 256x256 pixels, with
a total of 128x256x256 pixels. The dataset was partitioned into four stacks along the vertical
direction. From top to bottom, stack 1 consists of sections 1-49, stack 2 sections 50-69, stack 3
sections 70-89, and stack 4 sections 90-128. We divide the entire dataset to four stacks for the
purpose of dividing total computing to different stages, and for reflecting major anatomical shifts.
The following parameters are common for all stacks: N 2 = 26, θp is half the size of N1 , a square
tolerance function, and r min = 1. In addition, N1 = 26 for stack 1, and 6 for the remaining stacks;
r max = 10 for stack 1, 25 for stack 2, 30 for stack 3, and 35 for stack 4. The parameters are chosen
to extract the 3D brain. Figs. 12a and 12c show two views of the segmented 3D brain using a vol-
ume rendering software developed in [13]. Fig. 12a displays a top view with the front of the brain
facing downward. Fig. 12c displays a side view of the segmented 3D brain, with the front of the
brain facing leftward. To put our results in perspective, Figs. 12b and 12d show the corresponding
views of manual segmentation of the same volume dataset by a human technician (more discussions
in Sect. VI). As shown in Fig. 12, the results of manual segmentation fit well with prototypical
anatomical knowledge. On the other hand, as discussed in Sect. VI, the results of
our algorithm can better reflect details of a specific dataset.
We have performed many other medical image segmentation tasks using our algorithm,
such as segmenting blood vessels in other 3D volume MRI datasets and segment-
ing sinus cavities from the Visible Human Dataset [1]. The segmentation results are comparable
with those illustrated above. We note that our segmentation results are usually robust to consider-
able parameter variations, that is, major segments are not sensitive to these variations.
Determining appropriate algorithm parameters is not as difficult as it may appear, because
each parameter has an intuitive meaning and its effect on segmentation is fairly predictable. The
method we use to set the parameters is an incremental one where each parameter is set individu-
ally while holding the others constant. First, target structures in the dataset are determined for seg-
mentation. Usually these structures correspond to bright and relatively homogeneous regions
within images. To reduce the number of extraneous regions in segmentation, N1 and θp should
both be set large initially. They essentially act as filters for removing small and nontarget regions.
A cubic tolerance function should be used first, in order to identify bright regions. To limit expan-
sion into extraneous regions, N 2 can be chosen small initially. The most tedious task is to choose
r min and r max , which requires some trial-and-error. Again, with some experience choosing these
parameters is not very hard.
The G-LEGION software is written in C and is compiled to run on both the SGI and HP plat-
forms. The user is able to set each of the six algorithm parameters through a graphical user inter-
face (GUI) written in Motif. The software displays a 3D volume with integer tags, so that the user
can select one particular segmented 3D region for viewing purposes. This viewing utility is also
written in C, with its GUI in Motif.

VI. COMPARISON WITH MANUAL SEGMENTATION


Manual segmentation gives the best and most reliable results when identifying structures for a particular clinical task. Usually, a medical practitioner with knowledge of anatomy uses mouse-based software to outline or fill regions of target structures on each image slice in an image stack, i.e., a volume. Because only 2D image slices can be viewed, the segmenter must mentally reconstruct each structure in 3D, and segmenting one slice requires viewing adjacent images in the stack to reconstruct the object correctly. The task is very tedious and time consuming, and thus does not serve the needs of daily clinical use well. On the other hand, no fully automatic method exists that can match the segmentation quality of manual segmentation. Our algorithm lends itself to a semi-automatic approach that is easy to use and generates results quickly, thus providing the segmenter with a valuable tool for attaining acceptable segmentation quality.
Figure 12 compares the performance of our approach in segmenting MRI volume datasets with manual segmentation performed by a medical technician. Using segmentation software from the National Institutes of Health running on the Apple Macintosh platform, the technician needed many hours to complete the task. For a closer comparison, Fig. 13 displays one horizontal section of the volume together with its segmentations from Fig. 12. The sample section is shown in Fig. 13a, and Fig. 13d shows the manually segmented brain for this section. Fig. 13b shows the segmented brain obtained from the same image by G-LEGION. A simple post-processing step is
then performed to remove the tiny noise artifacts collected in the background; the result is shown in Fig. 13c. A comparison between Fig. 13c and Fig. 13d reveals that the G-LEGION algorithm recovers details that are omitted in the corresponding manual segmentation. For example, the brain ventricles and many fissures are correctly outlined by our algorithm in Fig. 13c, but are too tedious to outline manually.
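The post-processing is simple; one possible sketch (assuming an integer label image and using scipy.ndimage, with an arbitrary size threshold) is:

import numpy as np
from scipy import ndimage

def absorb_tiny_fragments(labels, brain_label, max_size=20):
    # Merge connected components of non-brain labels that are smaller than
    # max_size voxels into the brain segment.
    cleaned = labels.copy()
    for value in np.unique(labels):
        if value == brain_label:
            continue
        comps, n = ndimage.label(labels == value)
        for comp_id in range(1, n + 1):
            comp_mask = comps == comp_id
            if comp_mask.sum() <= max_size:
                cleaned[comp_mask] = brain_label
    return cleaned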

VII. CONCLUDING REMARKS


We have proposed a new neurally inspired algorithm for the problem of segmenting sampled medical
image datasets. Studies from neurobiology and the oscillatory correlation theory suggest that
objects in a visual scene may be represented by the temporal binding of activated neurons via
their firing patterns. In oscillatory correlation, neural oscillators respond to object features with
oscillatory behavior and group together by synchronization in their phases. Different groups rep-
resenting different objects desynchronize from each other. Our algorithm is derived from
LEGION dynamics for image segmentation because of its desirable properties of stimulus-depen-
dent oscillations, rapid synchronization for forming oscillator groups, and rapid desynchroniza-
tion via a global inhibitor for separating oscillator groups.
This paper takes a functional view of LEGION by analyzing its four key computational com-
ponents: input stimulation that determines group participation, local excitatory coupling used to
form groups of oscillators by synchronization, global inhibition used to separate oscillator groups
representing different objects, and oscillator potential used to place noisy fragments into a back-
ground. Because of the performance cost of directly simulating the LEGION network on large image datasets, we derive a graph-based algorithm, called G-LEGION, that closely follows the four computational components. The graph formulation reveals that the formation of oscillator groups is akin to labeling the largest connected subgraphs. To segment grey level CT and MRI
sampled datasets, we group grey level pixels based on their intensity contrasts and introduce an
adaptive tolerance scheme to better identify structures of interest. Our results show that the
LEGION approach is able to segment volume datasets, and with appropriate parameter settings
produces results that are comparable to commonly used manual segmentation. Our software is
currently used to segment structures from sampled medical datasets from various sources includ-
ing the Visible Human Project dataset.
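To make the grouping idea concrete, the following much-simplified 2D sketch selects leaders with a 3x3 potential neighborhood and grows each group over 4-connected neighbors whose intensity contrast stays within a fixed tolerance. It illustrates the graph view only and is not our G-LEGION implementation, which also supports larger 2D and 3D neighborhoods and the adaptive tolerance scheme:

import numpy as np
from collections import deque

def simple_grouping(img, theta_p, tol):
    H, W = img.shape
    labels = np.zeros((H, W), dtype=int)          # 0 marks the background
    next_label = 1
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            if labels[i, j]:
                continue
            patch = img[i - 1:i + 2, j - 1:j + 2].astype(float)
            # Leader test: enough neighbors within tolerance of the center pixel.
            if np.sum(np.abs(patch - float(img[i, j])) <= tol) - 1 < theta_p:
                continue
            labels[i, j] = next_label
            queue = deque([(i, j)])
            while queue:                          # grow the group: connected labeling
                a, b = queue.popleft()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    u, v = a + da, b + db
                    if (0 <= u < H and 0 <= v < W and not labels[u, v]
                            and abs(float(img[u, v]) - float(img[a, b])) <= tol):
                        labels[u, v] = next_label
                        queue.append((u, v))
            next_label += 1
    return labels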
Various extensions to the G-LEGION algorithm can be explored to further improve segmentation results and reduce user involvement in parameter setting. The neighborhood kernels
used for both the potential and coupling neighborhoods may be set in many ways. For example, a
Gaussian kernel may be used as a coupling neighborhood, with the connection strength between
two oscillators falling off exponentially. Such a Gaussian neighborhood can potentially alleviate
unwanted region merging, or the flooding problem (see [11]). Other tolerance functions may also
be used to better define pixel similarity. For example, a tolerance mapping using a Gaussian func-
tion may provide proper tolerance values for both darker and brighter pixels.
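As a rough illustration of both suggestions (hypothetical variants, not part of the current implementation; the parameter values are arbitrary):

import numpy as np

def gaussian_coupling_weight(distance, sigma=1.5):
    # Connection strength between two oscillators falls off with their distance.
    return np.exp(-(distance ** 2) / (2.0 * sigma ** 2))

def gaussian_tolerance(intensity, r_min, r_max, mu=160.0, sigma=60.0):
    # Tolerance is largest near a chosen intensity mu and shrinks toward r_min
    # elsewhere, so darker and brighter pixels receive different tolerances.
    return r_min + (r_max - r_min) * np.exp(-((intensity - mu) ** 2) / (2.0 * sigma ** 2))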
So far, segmentation is performed by a single network layer only. Another G-LEGION layer
may be added to process the result of the first layer. Because the second layer deals with segmented regions, it can easily be made to extract only the major (e.g., large) regions while putting the rest into the background. This can be implemented by choosing a larger potential
neighborhood for the second layer while using the same coupling neighborhood. With the addi-
tion of this second layer, we expect that a majority of small segments in the segmentation results
of Sect. V will be removed, and the number of final segments will be cut dramatically.
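A crude approximation of the intended effect, keeping only regions above a size threshold and pushing the rest into the background, is sketched below; the size threshold merely stands in for the larger potential neighborhood and is an arbitrary assumption:

import numpy as np

def second_layer_filter(labels, min_size=500):
    filtered = np.zeros_like(labels)              # 0 marks the background
    for value in np.unique(labels):
        if value == 0:
            continue
        mask = labels == value
        if mask.sum() >= min_size:                # only major regions survive
            filtered[mask] = value
    return filtered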
While G-LEGION captures the major components of LEGION dynamics and leads to a much faster implementation, it should be pointed out that our algorithm is not equivalent to LEGION dynamics. For example, LEGION dynamics exhibits a segmentation capacity: only a limited number of segments can be separated, whereas our algorithm can produce an arbitrary number of segments. The ability to naturally exhibit a segmentation capacity may be a very useful property for explaining psychophysical data concerning perceptual organization. Also, G-LEGION is iterative in nature, whereas the dynamical system of LEGION is fully parallel and does not require synchronous algorithmic operations.
To conclude, our results suggest that LEGION is an effective computational framework to
tackle the image segmentation problem. The network has been used to perform image segmenta-
tion of other features, such as texture and motion. Layers of LEGION networks may be envi-
sioned that are capable of grouping and segregation based on multiple features, which are
required for general segmentation. The network architecture is amenable to VLSI chip implemen-
tation, which would make LEGION a plausible architecture for real-time segmentation tasks.

ACKNOWLEDGMENTS
We would like to thank S. Campbell for his invaluable help in preparing this manuscript.
Thanks to W. Mayr for his time and effort in providing the manual segmentation, and to K. Mueller for providing the volume rendering software and the images of the 3D brain visualizations. We have also
benefited significantly from interactions with G. Wiet and D. Stredney. Volume datasets were pro-
vided by the Cancer Hospital Research Institute of the Ohio State University, the Alcohol and
Drug Abuse Research Center at the McLean Hospital of the Harvard Medical School, and the
National Library of Medicine. This work was supported in part by an ONR grant N00014-93-1-
0335, an NSF grant IRI-9423312, an ONR Young Investigator Award (to DLW), and a DOD grant
USAMRDC 94228001.

REFERENCES
[1] M. J. Ackerman, “The visible human project,” J. Biocomm., vol. 18, p. 14, 1991.
[2] R. Adams and L. Bischof, “Seeded region growing,” IEEE Trans. Pattern Anal. Machine
Intell., vol. 16, pp. 641-647, 1994.
[3] T. Asano and N. Yokoya, “Image segmentation schema for low-level computer vision,” Pat-
tern Recog., vol. 14, pp. 267-273, 1981.
[4] P. Baldi and R. Meir, “Computing with arrays of coupled oscillators: an application to preat-
tentive texture discrimination,” Neural Comput., vol. 2, pp. 458-471, 1990.
[5] M. Bomans, K.-H. Höhne, U. Tiede, and M. Riemer, “3-D segmentation of MR images of the head for 3-D display,” IEEE Trans. Med. Imaging, vol. 9, pp. 177-183, 1990.
[6] J. Canny, “A computational approach to edge detection,” IEEE Trans. Pattern Anal. Machine
Intell., vol. 8, pp. 679-698, 1986.
[7] L. P. Clarke, et al., “Review of MRI segmentation: methods and applications,” Magnetic Res-
onance Imaging, vol. 13, pp. 343-368, 1995.
[8] R. Eckhorn, et al., “Coherent oscillations: A mechanism of feature linking in the visual cor-
tex?” Biol. Cybern., vol. 60, pp. 121-130, 1988.
[9] M. Friedman, “Three-dimensional imaging for evaluation of head and neck tumors,” Arch.
Otolaryng. Head Neck Surg., vol. 119, pp. 601-607, 1993.
[10] C. M. Gray, P. König, A. K. Engel, and W. Singer, “Oscillatory responses in cat visual cortex
exhibit inter-columnar synchronization which reflects global stimulus properties,” Nature,
vol. 338, pp. 334-337, 1989.
[11] R. M. Haralick and L. G. Shapiro, “Image segmentation techniques,” Comp. Graphics
Image Process., vol. 29, pp. 100-132, 1985.
[12] D. M. Kammen, P. J. Holmes, and C. Koch, “Origin of oscillations in visual cortex: feed-
back versus local coupling,” Models of Brain Func., R. M. J. Cotterill, Ed. Cambridge,
England: Cambridge University Press, pp. 273-284, 1989.
[13] K. Mueller and R. Yagel, “Fast perspective volume rendering with splatting by utilizing a
ray-driven approach,” Proc. IEEE Vis., pp. 65-72, 1996.
[14] C. von der Malsburg, “The correlation theory of brain function,” Max-Planck-Institut Biophys. Chem., Internal Rep. 81-2, Göttingen, FRG, 1981.
[15] S. Sarkar and K. L. Boyer, “Perceptual organization in computer vision: a review and a pro-
posal for a classificatory structure,” IEEE Trans. Syst. Man. Cybern., vol. 23, pp. 382-399,
1993.
[16] W. Singer and C. M. Gray, “Visual feature integration and the temporal correlation hypothe-
sis,” Ann. Rev. Neurosci., vol. 18, pp. 555-586, 1995.
[17] D. Somers and N. Kopell, “Rapid synchronization through fast threshold modulation,” Biol.
Cybern., vol. 68, pp. 393-407, 1993.
[18] P. Suetens, et al., “Image segmentation: methods and applications in diagnostic radiology
and nuclear medicine,” Eur. J. Radiol., vol. 17, pp. 14-21, 1993.
[19] H. Sompolinsky, D. Golomb, and D. Kleinfeld, “Cooperative dynamics in visual process-
ing,” Phys. Rev. A, vol. 43, pp. 6990-7011, 1991.
[20] D. Terman and D.L. Wang, “Global competition and local cooperation in a network of neu-
ral oscillators,” Physica D, pp. 148-176, 1995.
[21] M. W. Vannier, et al., “Three-dimensional CT reconstruction images for craniofacial surgi-
cal planning and evaluation,” Radiol., vol. 150, pp. 179-184, 1984.
[22] D. L. Wang, “Emergent synchrony in locally coupled neural oscillators,” IEEE Trans. Neu-
ral Net., vol. 6, pp. 941-948, 1995.
[23] D. L. Wang and D. Terman, “Locally excitatory globally inhibitory oscillator networks,”
IEEE Trans. Neural Net., vol. 6, pp. 283-286, 1995.
[24] D. L. Wang and D. Terman, “Image segmentation based on oscillatory correlation,” Neural
Comput.; See also Proc. of World Congress on Neural Net., pp. II.521-II.525, 1995.
[25] G. J. Wiet, et al., “Cranial base tumor visualization through high performance computing,”
Proc. of Med. Meets Virt. Real. 4, Health Care Inf. Age, pp. 43-59, 1996.
[26] R. Yagel, et al., “Multisensory platform for surgical simulation,” Proc. IEEE Virt. Real.
Ann. Int. Symp., pp. 72-78, 1996.
[27] S. W. Zucker, “Region growing: childhood and adolescence,” Comp. Graphics Image Pro-
cess., vol. 5, pp. 382-399, 1976.

FIGURES
FIGURE 1. Diagram of a single oscillator with an excitatory unit x and an inhibitory unit y in a
feedback loop. A triangle indicates an excitatory connection and a circle indicates an
inhibitory connection. I indicates external input and S indicates the coupling with the rest of
the network.

FIGURE 2. A 2D LEGION network with 4-neighborhood connections. The global inhibitor is
indicated with a black circle and is coupled with the entire network.

FIGURE 3. Phase plane diagrams illustrating two states of a single oscillator. (a) An oscillatory
state occurs when LK of the cubic is above the left part of the sigmoid. (b) A non-oscillatory
state occurs when LK is below the left part of the sigmoid. Two fixed points are created on
two sides of LK, where the fixed point to the left of LK is stable and acts as a gate to prevent
the oscillator from oscillating.

FIGURE 4. Computer simulation of a 20x20 LEGION network to segment a binary image. (a)
The input image with four objects (arrow, square, cross, and rhombus) and some background
noise. (b) Temporal activity of stimulated oscillators shown for 13,750 integration steps.
Except for the global inhibitor, the vertical axis shows the x value and the horizontal axis
indicates time. The activities of all oscillators in a group are shown together in one trace. Both
synchronization and desynchronization occur after two cycles, denoted by the dashed line.
The network parameters are: ε = 0.02, α = 0.003, β = 0.1, γ = 20.0, θ = 0.8, λ = 2.0,
θx = –0.5, θp = 7.0, Wz = 2.0, δ = 0.0006, φ = 3.0, ρ = 3.0, Tik = 2.0, θzx = 0.1, and θxz = 0.1.
Dynamic weights to a stimulated oscillator are normalized to 8.0.

FIGURE 5. The G-LEGION algorithm applied to a 256x256 CT image. (a) Original grey level
image showing a horizontal section of a human head. (b) A color map showing the result of
segmentation.

FIGURE 6. G-LEGION applied to a 256x256 T1-weighted MRI image. (a) Original grey level
image showing a mid-sagittal section of a human head. (b) A color map showing the result of
segmentation.

FIGURE 7. Effects of parameters in segmenting the MRI image of Fig. 6a. The parameters vary
with respect to those used in Fig. 6b. (a) and (b) show segmentation results when the coupling
neighborhood N 2 changes to 8- and 4-neighborhood, respectively. (c) and (d) show
segmentation results when N1 changes to 24- and 4-neighborhood, respectively. (e) and (f)
show segmentation results when θp changes to 7.5 and 2.5, respectively.

FIGURE 8. G-LEGION applied to a 256x256 T1-weighted MRI image. (a) Original grey level
image showing a horizontal section of a human head. (b), (c), and (d) show segmentation
results when the tolerance function is cubic, square, and linear, respectively.

FIGURE 9. Effects of tolerance parameters in segmenting the MRI image of Fig. 8a. A cubic
tolerance function is used with r min clamped to 1. (a), (b), and (c) show segmentation results
when r max varies among 80, 40, and 20, respectively.

FIGURE 10. Segmentation of a reduced resolution image of Fig. 8a. (a) Original grey level image
after a reduction in resolution by half. (b) The result of segmentation.

FIGURE 11. G-LEGION applied to 256x256 T1-weighted MRI images. For all segmentation
results a cubic tolerance function is used. (a) Original grey level image showing a coronal
section of a human head. (b) Segmentation result for the image in (a) with a 24-neighborhood
N1 , a 4-neighborhood N 2 , θp = 11.5, r min = 1 and r max = 316. (c) Original grey level image
showing another coronal section of a human head. (d) Segmentation result for the image in (c)
with an 8-neighborhood N1 , an 8-neighborhood N 2 , θp = 7.5, r min = 7 and r max = 90. (e)
Original grey level image showing a sagittal section of a human head. (f) Segmentation result
for the image in (e) with a 24-neighborhood N1 , an 8-neighborhood N 2 , θp = 17.5, r min = 7
and r max = 75. (g) Original grey level image showing another sagittal section of a human
head. (h) Segmentation result for the image in (g) with a 24-neighborhood N1 , an 8-
neighborhood N 2 , θp = 19.5, r min = 4 and r max = 250.

FIGURE 12. Segmentation of a 3D MRI volume dataset for extracting the brain. The
segmentation results are displayed using volume rendering. (a) and (c) show the results of G-
LEGION with a top view (the front facing downward) and a side view (the front facing
leftward), respectively. (b) and (d) show the corresponding results produced by slice-by-slice
manual segmentation.

FIGURE 13. Comparison between G-LEGION and manual segmentation. (a) Original grey level
image showing a horizontal section of the volume dataset used in Fig. 12. (b) Segmentation
result of the image in (a) from 3D segmentation in Fig. 12 by G-LEGION. (c) Result of (b)
after tiny holes merge with the brain segment. (d) Segmentation result of the image in (a) from
3D manual segmentation in Fig. 12.

Figure 1 (diagram labels: I, x, S)
Figure 2 (labels: Oscillator, Global Inhibitor)
Figure 3 (phase plane labels: x and y axes, nullclines ẋ = 0 and ẏ = 0, LK, RK, limit cycle, stable fixed point)
Figure 4 (trace labels: Arrow, Square, Circle, Rhombus, Background, Inhibitor; horizontal axis: Time)
Figure 5 (panels a, b)
Figure 6 (panels a, b)
Figure 7 (panels a-f)
Figure 8 (panels a-d)
Figure 9 (panels a-c)
Figure 10 (panels a, b)
Figure 11 (panels a-h)
Figure 12 (panels a-d)
Figure 13 (panels a-d)
