
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 9, NO. 12, DECEMBER 2000

Real-Time Morphology Processing Using Highly Parallel 2-D Cellular Automata CAM²
Takeshi Ikenaga, Member, IEEE, and Takeshi Ogura, Member, IEEE

Abstract—Mathematical morphology is a promising computer paradigm based on set theory and has many applications in image processing. Although some architectures have been proposed, there are as yet no compact, practical computers that can handle a variety of morphological operations with large, complex structuring elements at video rates. This has prevented the great potential of morphology from being fully realized. This paper describes a morphology processing method that uses a highly parallel two-dimensional (2-D) cellular automaton architecture called CAM² (Cellular AutoMata on Content Addressable Memory). New mapping methods achieve high-throughput complex morphology processing. Evaluation results show that CAM² performs one morphological operation for basic structuring elements within 30 µs. Furthermore, CAM² can also handle an extremely large and complex structuring element of 100 × 100 at video rates. CAM² will increase the potential use of morphology and make a significant contribution to the development of various real-time image processing systems.

Index Terms—Cellular automaton, content addressable memory, mathematical morphology, pattern spectrum, real-time image processing.

I. INTRODUCTION
MATHEMATICAL morphology [1] is an image transformation technique that locally modifies geometric features through set operations. It is a powerful tool with various applications [2]–[6], such as nonlinear image filtering, noise suppression, smoothing and shape recognition, and it is becoming very common in image processing. There are three prerequisites for the fuller realization of the potential of morphology:
• complex processing combining various morphological operations (including other operations, such as discrete-time cellular neural networks [7], linear filtering [8], and area calculation);
• processing with large and complex structuring elements;
• high-speed (real-time) processing.
The achievement of these goals requires hardware with extremely high performance and high-frequency memory accesses. That makes general-purpose sequential machines like personal computers (PC) and workstations (WS) totally unsuitable.
To address these problems, some special-purpose architectures for morphology have been proposed [9]–[13]. Most of them employ a pipeline technique in which a raster-scan image is sequentially fed into a processing element (PE) array and the morphological operations are carried out in parallel in each PE. Since the functions of the PEs and the network structure are fully tuned to morphology, other operations crucial to practical image processing cannot be performed. The fixed network structure also limits the size and shape of the structuring elements. Furthermore, there are at most several dozen PEs. This prevents the full use of the abundant parallelism (pixel order) of morphology, and, as a result, the processing speed of the pipeline type is not high enough for many real-time applications. Against this backdrop, it is clear that none of these conventional architectures is suitable for building a morphology processing platform that satisfies the above three prerequisites.
A two-dimensional (2-D) cellular array architecture [14]–[16], which consists of 2-D PEs and interconnection networks, is another candidate for the platform because it is the most natural architecture for morphology. The drawback of the conventional fully parallel approach is the huge amount of hardware involved. At most, only several dozen PEs can be embedded onto one VLSI chip. So, enormous numbers of VLSI chips are required to realize pixel-order parallelism, which is crucial for extracting the performance. Moreover, 2-D interconnection networks cause I/O bottlenecks, so it is difficult to increase the number of PEs even if state-of-the-art LSI technology is used.
This paper describes a morphology processing method that uses CAM². CAM² is a compact, high-performance, flexible, and highly parallel 2-D cellular automaton (CA) [17]. CAM² can attain pixel-order parallelism on a single PC board because it is composed of a content addressable memory (CAM), which makes it possible to embed great numbers of PEs, corresponding to CA cells, onto one VLSI chip. New mapping methods achieve high-throughput complex morphology processing. Evaluation results show that CAM² performs one morphological operation for basic structuring elements within 30 µs. This means that more than 1000 operations can be carried out on the whole pixel image at video rates (33 ms). CAM² also handles an extremely large and complex 100 × 100 structuring element at video rates. Furthermore, it performs practical image processing, such as pattern spectrum and multiple object tracking, through a combination of morphology and other algorithms.
Section II presents the features of CAM². This is followed by a description of the morphology processing method including pattern spectrum processing in Sections III and IV. After discussing the application development environment in Section V, performance evaluation results and some examples of image processing combining morphology and other algorithms are presented in Section VI.

Manuscript received February 26, 1998; revised June 12, 2000. This work was supported by Y. Sakai, O. Karatsu, K. Takeya, and R. Kasai. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Sridhar Lakshmanan.
The authors are with NTT Lifestyle and Environmental Technology Laboratories, Kanagawa 243-0198, Japan (e-mail: ikenaga@aecl.ntt.co.jp).
Publisher Item Identifier S 1057-7149(00)10069-7.
1057–7149/00$10.00 © 2000 IEEE

Authorized licensed use limited to: Ming Chi University of Technology. Downloaded on July 27,2021 at 15:52:16 UTC from IEEE Xplore. Restrictions apply.
IKENAGA AND OGURA: REAL-TIME MORPHOLOGY PROCESSING 2019

Fig. 1. Image processing system based on HiPIC.

Fig. 2. Block diagram of CAM².

II. FEATURES OF CAM²

A. Key Technologies: CAM and HiPIC
CAM² was established on our CAM LSI and CAM-based system technologies. As a CAM-based system model, a Highly-parallel Integrated Circuits and System (HiPIC) was proposed [18], [19] for real-time image processing, and various practical real-time image processing systems [20]–[24] have been developed. Fig. 1 illustrates a typical image processing system based on HiPIC. The configuration is very simple. It consists of a video camera, a personal computer, and add-on boards based on HiPIC. Using HiPIC, an application-specific system that achieves high performance and flexibility can be easily realized. That is why we also employed it for CAM².
Fig. 2 shows a block diagram of CAM². According to the HiPIC concept, CAM² consists of a highly parallel PE array, a reconfigurable logic element, a RISC processor or DSP, and some memory. The highly parallel PE array, a 2-D array of dedicated CAMs, executes SIMD (Single Instruction, Multiple Data stream) processing for high-volume image data. The logic element controls the PE array and interfaces with the image data and an external processor. The processor performs serial data processing. The memory stores images, microprograms, and temporary data.
The main feature of CAM² is a dedicated CAM for the highly parallel PE array. A CAM performs various types of parallel data processing for CA with words as the basic unit. Moreover, since its memory-based structure is the most suitable for implementation with LSI technology, several hundred thousand PEs, that is, CA cells, can be realized on a single PC board using state-of-the-art deep-submicron CMOS technology. Furthermore, multiple zigzag mapping [17] enables 2-D CA cells to be mapped into CAM words, even though physically a CAM has a one-dimensional structure as shown in Fig. 2.
Another important feature is a control scheme that uses an FPGA, which is a reconfigurable logic element. Since an FPGA can easily generate various command sequences, CAM² efficiently performs practical image processing using not only binary but also gray-scale morphology in combination with other algorithms, such as discrete-time cellular neural networks [25].

B. Basic Functions of CAM
CA processing using CAM² is carried out by iterative operations of CA-value transfer and update. In the former, the value of the original cell is transferred to its nearest neighboring cells. In the latter, the next value of the original cell is calculated by a particular transition rule. To carry these out, CAM has not only normal RAM operations, such as word reads and writes using addresses (3.1), but also the following three functions:
• maskable OR search (3.2);
• partial and parallel write (3.3);
• shift up/down mode of hit-flags (3.4).
For the search, the results are accumulated in hit-flag registers by means of OR logic. For the writes, the data are written into specific bit positions of multiple words for which the value of the hit-flag register is 1. For the shift, the hit flags are shifted to upper or lower words. Although these functions are very simple, any type of CA operation can be carried out in a bit-serial, word-parallel manner through the iteration of these operations.
Since CAM has only simple functions (thus allowing high-density implementation of CAM²), the processing power of each CA cell (PE) is much lower than that of conventional highly parallel machines, which support a variety of multibit arithmetic functions. Because of this drawback, processing time becomes longer as the complexity and bit length of operations increase. However, morphology requires only simple operations like logical OR and maximum, not complex ones like multiplication, which is commonly used in image processing filters. Furthermore, the dynamic range of morphological operations is fixed; for example, 1 bit and 8 bits for SP (set processing) and FSP (function and set processing), respectively. Therefore, the drawback mentioned above is not a serious obstacle to morphological processing. On the contrary, the simplicity is an advantage because it enables an enormous number of PEs to be built on a single CAM chip, which allows the parallelism of morphology to be more fully exploited.

III. MORPHOLOGY PROCESSING USING CAM²

A. Definition of Morphology
Morphology falls into three categories [2]: set processing (SP), function and set processing (FSP), and function processing (FP). Each has four basic operations: dilation, erosion,


closing, and opening. Dilation ( ) and erosion ( ) are defined by

SP: (1), (2)
FSP: (3), (4)
FP: (5), (6)

where is the original image and is the structuring element (SE).

Fig. 3. Example of dilation (FSP).

As shown in the equations, dilation in SP employs the Minkowski addition of the original image and the structuring element. For erosion, addition is replaced by subtraction. FSP and FP are used for gray-scale image processing, which employs maximum and minimum operations. Closing ( ) and opening ( ) are combinations of dilation and erosion.
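For orientation, the standard textbook forms of these six definitions can be written as follows. The symbols X (binary image), f (gray-scale image), B (flat structuring element), and g (gray-scale structuring element) are assumptions chosen here, and reflection and sign conventions differ between authors, so these expressions may differ in detail from (1)-(6).

```latex
% SP: X_b denotes X translated by b
X \oplus B = \bigcup_{b \in B} X_{b}, \qquad
X \ominus B = \bigcap_{b \in B} X_{-b}

% FSP: gray-scale image f, flat structuring element B
(f \oplus B)(x) = \max_{b \in B} f(x - b), \qquad
(f \ominus B)(x) = \min_{b \in B} f(x + b)

% FP: gray-scale image f, gray-scale structuring element g
(f \oplus g)(x) = \max_{y} \{ f(x - y) + g(y) \}, \qquad
(f \ominus g)(x) = \min_{y} \{ f(x + y) - g(y) \}

% Closing and opening as combinations of dilation and erosion
X \bullet B = (X \oplus B) \ominus B, \qquad
X \circ B = (X \ominus B) \oplus B
```

In this form the SP case is the FSP case applied to the indicator function of X, which is why the same neighbor-transfer and set-operation scheme can serve both.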

B. Morphology Mapping to CAM²

CAM² is a 2-D cellular automaton defined as follows:
• a set of 2-D cells (PEs), each with its own value;
• all the cells update their values simultaneously in discrete steps by a transition rule using the values of the original cell and its nearest neighbors.
For efficient execution of morphological equations by CAM², the following mapping scheme was devised:
• map each pixel of the original image to a CA cell (PE) of CAM²;
• the next value of the CA cells (the result of morphological operations) is determined by set operations (logical OR/AND, maximum, minimum, etc.) on the values of the original and its neighboring cells. The cell location is defined by the structuring element.
If this mapping is adopted, morphology can be considered to be CA, in which neighbors are determined by the structuring element.
An example of dilation in FSP, in which the original image is gray scale and the structuring element is binary, is shown in Fig. 3. The set operation is maximum. For cell (7, 7), the value in the pixel below is the maximum of the values of the original and neighboring pixels. So, this value is selected as the dilation result. For cell (7, 3), all the values are 0. So, the dilation result is also 0. Any type of morphological processing can be done in the same way.

C. Morphology Processing Method
In this section, morphology processing using CAM² is explained in detail. As an example, dilation in FSP with a rhombic structuring element is used.
Fig. 4 shows the CAM word configuration for it. Each CAM word consists of an original image field, a dilation image field, neighboring cell fields, and a temporary field. The original cell field and the dilation image field store the values of the original image and the dilation image, respectively. The neighbor cell fields store the values of the right (R), left (L), up (U), and down (D) cells. The temporary field is used for storing carry, flag, and so on.

Fig. 4. CAM word configuration for dilation (rhombus).

The dilation is executed in the following sequence:
1) Load all pixel data of the original gray-scale image into the original cell field of the corresponding CA cells of CAM².
2) Transfer the data of the original cell field to the dilation image field of the same cell.
3) Transfer the data of the original cell field of the four neighboring cells (right, left, upper and lower cells) to the corresponding neighboring cell fields.
4) Find the maximum value among the cell's own value and the data of the neighboring cell fields, and store it in the dilation image field.
5) Read out the dilation result from the dilation image field.
The image data loading and retrieval processing in steps 1 and 5 can be done by the normal RAM operations (3.1). Steps 2 and 3 are also effectively performed by the combination of intra-CAM and inter-CAM transfer [17]. Step 4 is executed by the iteration of “greater than” operations. Next, the processing method is explained in more detail.

Fig. 5. CAM word configuration for P bit greater than operation.

Fig. 5 shows the CAM word configuration for the “P-bit greater than” operation, where the P bits of two fields are compared, and the larger one is stored in a destination field.
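The five-step dilation sequence can be sketched in software as follows. This is a minimal Python sketch under stated assumptions, not the CAM² microprogram: the function name `dilate_rhombus`, the list-of-lists image layout, and the choice to ignore neighbors that fall outside the image are all hypothetical. Each pass over one direction stands in for one neighbor transfer followed by one "greater than" update.

```python
def dilate_rhombus(image):
    """FSP dilation with a rhombic SE: max of a cell and its 4 neighbors.

    `image` is a list of equal-length rows of non-negative integers.
    Border handling (out-of-image neighbors are skipped) is an assumption.
    """
    rows, cols = len(image), len(image[0])
    # Step 2: copy the original value into the result (dilation) field.
    result = [row[:] for row in image]
    # Steps 3 and 4: for each direction, fetch the neighbor's original
    # value and keep the larger of it and the running result.
    for dr, dc in ((0, 1), (0, -1), (-1, 0), (1, 0)):
        for r in range(rows):
            for c in range(cols):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    result[r][c] = max(result[r][c], image[nr][nc])
    # Step 5: the result field holds the dilated image.
    return result


if __name__ == "__main__":
    img = [[0, 0, 0],
           [0, 9, 0],
           [0, 0, 0]]
    for row in dilate_rhombus(img):
        print(row)
```

On the hardware every cell performs its update at the same time; the nested loops here only emulate that word-parallel behavior sequentially.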


The “greater than” operation is executed in the following sequence.
1) Set search and write mask. Maskable search for cells
whose are .
2) Maskable search for cells whose are
.
3) Maskable OR search for cells whose
are .
4) Parallel writing of to of the hit
cells.
5) Maskable search for cells whose are
.
6) Parallel writing of to of the hit
cells.
7) Maskable search for cells whose are
.
8) Parallel writing of to of the hit
cells.
9) Repeat 1-8 from MSB ( ) to LSB ( )
In the sequence, and are used for flags and are stored in
the temporary field. The initial values of and are both 1.
The condition indicates that “ ” is
determined and the condition indicates the
opposite.
As this example shows, the operation is carried out through the
iteration of the maskable search (3.2) and the parallel write (3.3).
For example, in steps 2 and 3, CAM words for which the value of
is greater than that of , are detected, and “1” is written into
the field of the words in step 4. Fig. 6 shows examples of the
maskable search in step 7 and the parallel write in step 8.
In the operation, the processing time is proportional to the bit
length. It is nine cycles per bit for the greater-than operation.
However, since all the words are processed in parallel, the op-
erations can be finished in an extremely short period of time.
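The idea of a bit-serial, word-parallel comparison can be illustrated with the short Python sketch below. It is a generic MSB-first comparison written under assumptions, not a transcription of the mask and search sequence above: the names `bitserial_max`, `undecided`, and `a_greater` are hypothetical stand-ins for the flags kept in the temporary field, and the inner loop over words emulates what the hardware does for all words at once.

```python
def bitserial_max(a_words, b_words, bits=8):
    """Return the element-wise maximum, deciding each pair MSB first."""
    n = len(a_words)
    undecided = [True] * n    # hypothetical flag: comparison still open
    a_greater = [False] * n   # hypothetical flag: a > b has been determined
    for k in range(bits - 1, -1, -1):      # from MSB down to LSB
        for i in range(n):                 # all words in parallel on a CAM
            if not undecided[i]:
                continue
            a_bit = (a_words[i] >> k) & 1
            b_bit = (b_words[i] >> k) & 1
            if a_bit != b_bit:             # first differing bit decides
                a_greater[i] = a_bit == 1
                undecided[i] = False
    # Words that never differed are equal, so either operand is the maximum.
    return [a if g else b for a, b, g in zip(a_words, b_words, a_greater)]


def greater_than_cycles(bits, cycles_per_bit=9):
    """Cycle count implied by a fixed per-bit cost (nine cycles in the text)."""
    return bits * cycles_per_bit


if __name__ == "__main__":
    print(bitserial_max([5, 200, 7, 0], [9, 100, 7, 255]))
    print(greater_than_cycles(8))
```

With nine cycles per bit, an 8-bit comparison takes 72 cycles, which is 1.8 µs at the 40 MHz clock assumed in Section VI-A, independent of the number of words.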
Fig. 6. Examples of maskable search and parallel write.
D. Processing Method for Large and Complex SEs
The size and shape of the structuring element are important factors in increasing the potential use of morphology. For the processing, the CA value must be transferred a long distance and to arbitrarily positioned CA cells. To do this efficiently, we devised the following method.
The CAM word configuration is shown in Fig. 7. A CAM word consists of the original image field ( ), processed image field ( ), and shift image fields ( , , ). The separated cell value is transferred efficiently and processed as follows.
1) Transfer the data of of horizontal cells to the field using intra-CAM transfer.
2) Execute the set operation to and if the corresponding structuring element is defined.
3) Repeat steps 1 and 2 until the horizontally defined structuring element runs out.
4) Transfer the data of of vertical cells to (after that or are used alternately) using inter-CAM transfer. This step is carried out at the same time as step 1.
5) Repeat steps 1 to 4 using or instead of C until the vertically defined structuring element runs out.
In the sequence, a structuring element of any shape can be handled by deciding, according to the structuring element, whether or not the set operation is executed.

Fig. 7. CAM word configuration for large and complex SEs.

IV. PATTERN SPECTRUM PROCESSING
Pattern spectrum processing [26] has been proposed as a morphology-based algorithm. It is very useful for getting information on the global features of target objects, and some applications, such as gender recognition [27], have been devised.
Fig. 8 shows an example of pattern spectrum processing, where and show scales and structuring elements, respectively. shows the area of image . As shown in Fig. 8, to


obtain a pattern spectrum, operations other than morphological ones are required. They are summarized as follows:
• pixel-by-pixel subtraction ( );
• area calculation (number of black pixels in ( ) image);
• null set ( ) assessment of .
The following mapping schemes provide an efficient way to do these.

Fig. 8. Examples of pattern spectrum processing.

A. Pixel-by-Pixel Subtraction
The first scheme is for pixel-by-pixel subtraction. Subtraction of arbitrary bit width can be performed at the rate of about ten cycles per bit by combining the maskable search (3.2) and parallel write (3.3), just as in the “greater than” operations described in Section III-C. However, making use of the relationship of opening, , shortens the processing time still more. This can be done in a sequence that takes only two cycles:
1) set search mask;
2) maskable search for cells whose is 1 and is 0.
A positive result for a particular cell is stored in its hit-flag register. Thus, the processing time can be shortened significantly by exploiting the features of target algorithms, even though the performance of each cell (PE) of CAM² is not very high, as mentioned before.

B. Area Calculation
To calculate an area, the pixel values stored in all the cells must be summed up. Generally speaking, however, highly parallel machines based on local operations, including cellular automata, cannot handle such global operations efficiently. Indeed, a normal CAM has only one global network, which is the data I/O. So, to perform the calculation, data in each cell (word) must be retrieved through the I/O one by one and summed up using an external processor or some special circuits.
To address this problem, CAM² has both counters in each CAM block (to count the number of hit flags) and horizontal and vertical inter-CAM connection networks (for data transfer between adjacent CAM blocks), as shown in Fig. 9. These functions can be implemented just by changing the peripheral circuit of CAM, i.e., changing the memory cell part is unnecessary. Moreover, the counter can be shared with the pipeline register for the transfer. So, they can be implemented without degrading the high density of CAM.

Fig. 9. CAM structure for area calculation.

The area calculation is done as follows.
1) Shift the hit-flag registers of CA cells in which the pixel values are stored by means of pixel-by-pixel subtraction and count the number of hit flags using the counters. (This yields the area of each CAM block.)
2) Sum up the areas of all the CAM blocks through the iteration of the inter-block transfer using the connection networks and addition using the maskable searches (3.2) and the parallel writes (3.3). (This operation moves through the area data in a tree-like fashion and stores the area of the whole image in a particular cell.)
Step 1 of this sequence is carried out in parallel in each block. Moreover, the number of iterations in step 2 varies logarithmically with the number of blocks. So, an area calculation only takes a short time.

C. Null Set Assessment
The final mapping technique is for null set assessment. The scale when becomes a null set changes according to the geometric features of the input image. So, null set assessment is required in order to eliminate redundant transitions.
In the assessment, we must examine whether values stored in all the cells become “0” or not. To do this efficiently, we output a hit flag (HFO) that is the logical OR of all the hit-flag registers as shown in Fig. 9.
The assessment using the HFO is executed as follows:
1) set search mask;
2) maskable search for cells whose is 1;


TABLE I
PROCESSING TIME FOR BASIC SEs (µs)

Fig. 10. Basic structuring elements.

3) null set assessment (if “HFO = 0,” then end the processing; otherwise repeat the processing for a new scale).
When all values become 0, the maskable search hits none of the cells. So, HFO becomes 0. A null set is assessed by examining the value of HFO. Since CAM² performs these sequences in only several cycles, the null set assessment is also finished in an extremely short period of time.
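The overall flow of Sections III-D and IV (shift-and-accumulate morphology with an arbitrary SE, difference of successive openings, tree-like area summation, and a null-set test that ends the loop) can be sketched in Python as below. This is an illustrative sketch under assumptions, not the CAM² command sequence: all function names are hypothetical, scale n is represented by a (2n+1) × (2n+1) square SE, image rows stand in for CAM blocks in the tree sum, and pixels outside the image are treated as 0.

```python
def square_se(n):
    """Offsets of a (2n+1) x (2n+1) square SE (assumed scale family)."""
    return [(dr, dc) for dr in range(-n, n + 1) for dc in range(-n, n + 1)]


def erode(img, se):
    """Binary erosion: AND of one shifted copy of the image per SE point."""
    rows, cols = len(img), len(img[0])
    out = [[1] * cols for _ in range(rows)]
    for dr, dc in se:
        for r in range(rows):
            for c in range(cols):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                out[r][c] &= img[nr][nc] if inside else 0
    return out


def dilate(img, se):
    """Binary dilation: OR of one shifted copy of the image per SE point."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for dr, dc in se:
        for r in range(rows):
            for c in range(cols):
                nr, nc = r - dr, c - dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    out[r][c] |= img[nr][nc]
    return out


def opening(img, se):
    return dilate(erode(img, se), se)


def tree_sum(block_counts):
    """Pairwise (tree-like) summation; the step count grows logarithmically."""
    vals = list(block_counts)
    while len(vals) > 1:
        vals = [sum(vals[i:i + 2]) for i in range(0, len(vals), 2)]
    return vals[0]


def area(img):
    """Per-row counts play the role of per-block hit-flag counters."""
    return tree_sum(sum(row) for row in img)


def pattern_spectrum(img):
    """Area removed between successive openings, until a null set appears."""
    spectrum = []
    n = 0
    current = opening(img, square_se(n))
    while area(current) > 0:               # null-set assessment
        larger = opening(img, square_se(n + 1))
        # Keep pixels that are 1 at scale n and 0 at scale n + 1.
        diff = [[a & (1 - b) for a, b in zip(ra, rb)]
                for ra, rb in zip(current, larger)]
        spectrum.append(area(diff))
        current, n = larger, n + 1
    return spectrum


if __name__ == "__main__":
    img = [[0] * 7 for _ in range(7)]
    for r in range(1, 4):
        for c in range(1, 4):
            img[r][c] = 1                  # a 3 x 3 object
    img[5][5] = 1                          # an isolated pixel
    print(pattern_spectrum(img))
```

Because an opening at a larger scale is contained in the opening at a smaller scale, the difference reduces to a single "1 here and 0 there" test per pixel, which is the property the two-cycle search in Section IV-A relies on.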

V. APPLICATION DEVELOPMENT ENVIRONMENT


To execute various types of processing including morphology using CAM², control logic for generating the various command sequences described in Sections III-C, III-D and IV must be mapped into the FPGA on CAM². For efficient mapping, a programming language (CAM² PL) and an application development environment (CAM² ADE) for CAM² have been developed.
CAM² PL includes various arithmetic and logical operations, such as addition and logical OR, and various associative operations, such as maskable search and parallel write. Furthermore, to describe various morphological operations easily, the following dedicated operations are added:
• (dilation type se);
• (erosion type se);
where the morphology category (SP, FSP, and FP) and shape of the structuring element (rhombus, square, etc.) are given in “type” and “se,” respectively. An example of CAM² PL is presented in Section VI-B3.
CAM² ADE consists of a compiler and a simulator. The compiler compiles CAM² PL and generates a microprogram for a CAM² board. The simulator reports simulation results, like processing speed, for various input images and test data. They are used for debugging and evaluation. CAM² ADE should significantly speed up system development.

VI. EVALUATION

A. Processing Performance
Morphology processing performance is evaluated in this section. We have already finished the design of CAM² and have described it in Verilog HDL [28]. The data presented here come from the Verilog functional simulator. In the evaluation, the system clock of CAM² was assumed to be 40 MHz. Image size was pixels.
Table I shows the processing performance for one dilation of the basic SEs shown in Fig. 10. Erosion is performed in almost the same time. Through a combination of these structuring elements, morphological operations with various-sized regular-shaped structuring elements, such as square and circle, can be performed. As shown in the table, CAM² performs one dilation within 30 µs. This means that more than 1000 dilations can be carried out on the whole pixel image at video rates (33 ms).
Fig. 11 shows the processing performance for one dilation of large SEs. In the evaluation, regular square structuring elements with size L were used. The figure shows that the execution time for one dilation is almost proportional to the size of the structuring element. Therefore, an extremely large structuring element with a size of about 100 × 100 can be handled at video rates.

Fig. 11. Processing time for large SEs.

In view of these simulation results, we think CAM² satisfies the three prerequisites mentioned in Section I, and, therefore, has great potential for morphology processing.

B. Image Processing
Some examples of image processing using CAM² are shown in this section. These data were calculated by CAM² ADE based on the Verilog functional simulator. These examples require iterative morphological operations, and the Verilog simulator takes an extremely long time to run. So, a CAM² with CA cells and a image were used here. Since CAM² has scalability, a image can be processed in almost the same time (except for data loading and retrieval processing) if a CAM² with CA cells is used.
1) Data Loading and Retrieval Processing: The completion of image processing requires not only morphological processing but also data loading and retrieval processing. For loading, all the pixel data of the input image are loaded into the corresponding CA cells of CAM². For retrieval, the processed data are retrieved from the result fields of all CA cells.
Using parallel loading and partial word retrieval techniques [17], CAM² can also handle such processing effectively. It takes about 0.1 ms for both the data loading and retrieval of a image. The processing time needed for data loading and retrieval processing lengthens with image size. However, since it


Fig. 13. Example of image processing (morphology).

Fig. 12. Pattern spectrum for images with and without crack.

TABLE II
PATTERN SPECTRUM PROCESSING TIME

only takes 1.6 ms even for a relatively large image ( ), more than 30 ms is available in which to perform morphology or other CA algorithms for real-time, or video rate (33 ms), applications.
2) Pattern Spectrum: Fig. 12 shows examples of pattern spectrum processing (SE: circle) for two different images, in which one object has a crack and the other one does not. As shown in Fig. 12, since the size and shape of the objects in image 1 are uniform, the spectrum concentrates on scales 5 and 6. In contrast, since the object in image 2 has cracks, the spectrum is scattered. Since the features of the spectra are quite different, it is easy to distinguish them.
Table II shows the processing time for the images. As shown in the table, the processing time for opening increases with the scale because the size of the structuring element becomes larger as the scale increases. In contrast, the processing time for the area calculation is constant, and is about that of the conventional method using the data I/O and the external processor. It takes only 0.6 µs per scale for the rest of the processing, such as pixel-by-pixel subtraction.
It takes about 1 ms for the whole pattern spectrum processing. When data loading time is included, the total time becomes 1.1 ms and 1.8 ms for and images, respectively. Thus, a pattern spectrum can also be obtained at video rates.
3) Multiple Object Tracking: Another example of image processing using CAM² is shown in Fig. 13. By applying various morphological operations to perform line erasure, edge detection, hole filling and noise reduction, a binary image of target objects can be obtained. The processing requires 15 morphological operations with various structuring elements. CAM² can do it in just 200 µs. As shown in Section VI-B1, the data loading and retrieval times are 0.1 ms and 1.6 ms for and images, respectively. The processing can be finished at video rates.
Fig. 14 shows an example of CAM² PL [25] for the processing in Fig. 13. Here, “copy_8” means the intra-word copy of 8 bits and “sub_data8” means pixel-by-pixel subtraction of 8 bits. “Dilation” and “erosion” are the dedicated operations for describing morphological operations with various structuring elements, as mentioned before. Using these operations, the processing is described in only 20 operations.

Fig. 14. Example of CAM² PL.

As discussed above, CAM² efficiently performs not only morphology, but also other CA-based algorithms. Using these

the LSIs. Fig. 16 is a photograph of the board. The board con-


sists of a highly parallel PE array, FPGAs, and some memory.
The PE array is a 2-D array of 1-Mb CAM LSIs, and
can handle a image. (Since the CAM LSI is scal-
able, a larger image can be processed by increasing the number
of chips.) The FPGA generates various command sequences to
perform practical image processing based on morphology and
Fig. 15. Example of image processing (other CA).
other CA-based algorithms. In addition, PCI bus and NTSC
video interfaces are also embedded in this board. So, a compact
image-processing platform can be built simply by connecting
the board to a personal computer and a video camera.
This prototype board demonstrates that an economically fea-
sible morphology platform can actually be obtained. Using this
CAM board, we plan to develop various real-time image pro-
cessing applications based on morphology and other algorithms.

ACKNOWLEDGMENT
The authors would like to thank Y. Takahashi, Y. Fujino, T. Tsuchiya, T. Nakanishi, M. Nakanishi, and E. Hosoya for their many valuable suggestions and constructive discussions.

Fig. 16. CAM² board with 256 × 256 CA cells.

algorithms, the center points of target objects and a distance map for them can be obtained as shown in Fig. 15. By applying the processing in Figs. 13 and 15 to the input image and by finding the center points nearest those in the previous frame, multiple object tracking can be performed.
These examples demonstrate that CAM² is flexible enough to perform practical image processing employing a combination of morphology and other algorithms.

VII. CONCLUSION
This paper has described a morphology processing method based on a highly parallel 2-D cellular automaton architecture called CAM² and has presented some evaluation results. New mapping methods using maskable search, partial & parallel write, and hit-flag shift achieve high-throughput complex morphology processing. Evaluation results show that CAM² performs one morphological operation for basic structuring elements within 30 µs. This means that more than 1000 operations can be carried out on an entire 512 × 512 pixel image at video rates (33 ms). CAM² can also handle an extremely large and complex structuring element of 100 × 100 at video rates. Furthermore, CAM² can perform practical image processing, such as pattern spectrum and multiple object tracking, through a combination of morphology and other algorithms. Thus, CAM² will enable fuller realization of the potential of morphology and make a significant contribution to the development of real-time image processing systems based on morphology and other algorithms.

APPENDIX
We have completed our development of a dedicated 1-Mb CAM LSI for CAM² [29], [30]. We fabricated a chip capable of operating at 56 MHz and 2.5 V using 0.25-µm full-custom CMOS technology with five aluminum layers. Since it has 16k words, or CA cells, a single chip can process 128 × 128 pixels in parallel. We have also developed a prototype CAM² board using

REFERENCES
[1] J. Serra, Image Analysis and Mathematical Morphology. New York: Academic, 1982.
[2] P. Maragos, “Tutorial on advances in morphological image processing and analysis,” Opt. Eng., vol. 26, 1987.
[3] R. M. Haralick, S. R. Sternberg, and X. Zhuang, “Image analysis using mathematical morphology,” IEEE Trans. Pattern Anal. Machine Intell., vol. 9, no. 4, pp. 532–550, 1987.
[4] L. Vincent, “Graphs and mathematical morphology,” Signal Process., vol. 16, no. 4, pp. 365–388, 1989.
[5] S. Yamamoto, M. Matsumoto, Y. Tateno, T. Iinuma, and T. Matsumoto, “Quoit filter—A new filter based on mathematical morphology to extract the isolated shadow, and its application to automatic detection of lung cancer in X-ray CT,” in Proc. 13th Int. Conf. Pattern Recognition (ICPR’96), vol. 2, 1996, pp. 3–7.
[6] Y. Takahashi, A. Shio, and K. Ishii, “Morphology based thresholding for character extraction,” IEICE Trans. Inform. Syst., vol. E76-D, no. 10, pp. 1208–1215, 1997.
[7] H. Harrer, “Multiple layer discrete-time cellular neural networks using time-variant templates,” IEEE Trans. Circuits Syst. II, vol. 40, pp. 191–199, Mar. 1993.
[8] E. R. Dougherty and P. A. Laplante, Real-Time Imaging. New York: IEEE Press, 1995.
[9] M. Hassoun, T. Meyer, P. Siqueira, J. Basart, and S. Gopalratnam, “A VLSI gray-scale morphology processor for real-time NDE image processing applications,” SPIE Image Algebra Morphological Image Processing, 1990.
[10] R. Lin and E. K. Wong, “Logic gate implementation for gray-scale morphology,” Pattern Recognit. Lett., vol. 13, no. 7, 1992.
[11] C. H. Chen and D. L. Yang, “Realization of morphological operations,” IEE Proc. Circuits Devices Systems, vol. 142, 1995.
[12] L. Lucke and C. Chakrabarti, “A digital-serial architecture for gray-scale morphological filtering,” IEEE Trans. Image Processing, vol. 4, pp. 387–391, Mar. 1995.
[13] E. R. Dougherty and D. Sinha, “Computational gray-scale mathematical morphology on lattices (a comparator-based image algebra)—Part I: Architecture,” Real-Time Imag., vol. 1, pp. 69–85, 1995.
[14] T. Kondo et al., “Pseudo MIMD array processor-AAP2,” in Proc. 13th Symp. Computer Architecture Conf., 1986, pp. 330–337.
[15] Thinking Machines Corp, Connection machine model, CM-2 tech. summary, Ver. 5.1, 1989.
[16] J. R. Nickolls, “The design of the MasPar MP-1: A cost-effective massively parallel computer,” in Proc. COMPCON Spring’90, 1990, pp. 25–28.
[17] T. Ikenaga and T. Ogura, “CAM²: A highly-parallel 2D cellular automata architecture,” IEEE Trans. Comput., vol. 47, pp. 788–801, July 1998.


[18] T. Ogura, M. Nakanishi, T. Baba, Y. Nakabayashi, and R. Kasai, “A 336-kbit content addressable memory for highly parallel image processing,” in Proc. Custom Integrated Circuits Conf. (CICC’96), 1996, pp. 273–276.
[19] T. Ogura and M. Nakanishi, “CAM-based highly-parallel image processing hardware,” IEICE Trans. Electron., vol. E80-C, no. 7, pp. 868–874, 1997.
[20] Y. Fujino, T. Ogura, and T. Tsuchiya, “Facial image tracking system architecture utilizing real-time labeling,” in Proc. SPIE VCIP’93, 1993.
[21] M. Nakanishi and T. Ogura, “A real-time CAM-based Hough transform algorithm and its performance evaluation,” 13th Int. Conf. Pattern Recognition (ICPR’96), vol. 2, pp. 516–521, 1996.
[22] M. Nakanishi and T. Ogura, “Real-time extraction using a highly parallel Hough transform board,” Proc. IEEE Int. Conf. Image Processing (ICIP’97), vol. 2, pp. 582–585, 1997.
[23] M. Meribout, M. Nakanishi, and T. Ogura, “Hough transform implementation on a reconfigurable highly parallel architecture,” in Proc. Computer Architectures Machine Perception (CAMP’97), 1997, pp. 276–279.
[24] E. Hosoya, T. Ogura, and M. Nakanishi, “Real-time 3D feature extraction hardware algorithm with feature point matching capability,” in Proc. IAPR Workshop Machine Vision Applications (MVA’96), 1996, pp. 430–433.
[25] T. Ikenaga and T. Ogura, “A DTCNN universal machine based on highly-parallel 2-D cellular automata CAM²,” IEEE Trans. Circuits Syst. I, vol. 45, pp. 538–546, May 1998.
[26] P. Maragos, “Pattern spectrum and multiscale shape representation,” IEEE Trans. Pattern Anal. Machine Intell., vol. 11, pp. 701–716, July 1989.
[27] K. Sudo, J. Yamato, and A. Tomono, “Determining gender using morphological pattern spectrum” (in Japanese), IEICE Trans. Inform. Syst., vol. J80-D-II, no. 5, pp. 1037–1045, 1997.
[28] E. Sternheim et al., Digital Design with Verilog HDL. New York: Automata, 1990.
[29] T. Ikenaga and T. Ogura, “A fully-parallel 1 Mb CAM LSI for real-time pixel-parallel image processing,” in IEEE Int. Solid-State Circuits Conf. (ISSCC99) Dig. Tech. Papers, 1999.
[30] T. Ikenaga and T. Ogura, “A fully parallel 1-Mb CAM LSI for real-time pixel-parallel image processing,” IEEE J. Solid State Circuits, vol. 35, pp. 536–544, Apr. 2000.

Takeshi Ikenaga (M’95) received the B.E. and M.E. degrees in electrical engineering from Waseda University, Tokyo, Japan, in 1988 and 1990, respectively.
He joined LSI Laboratories, Nippon Telegraph and Telephone Corporation (NTT), in 1990, where he has been working on the research of the design and test methodologies for high-performance ASICs. He is presently a Senior Research Engineer with the Parallel Processing Systems Research Group, NTT Lifestyle and Environmental Technology Laboratories, Kanagawa, Japan. His current interests are highly parallel system design and its applications to computer vision. In 1999–2000, he was a Visiting Researcher with the University of Massachusetts, Amherst.
Mr. Ikenaga is a member of the Institute of Electronics, Information, and Communication Engineers (IEICE) of Japan, and the Information Processing Society of Japan (IPSJ). He received the IEICE Research Encouragement Award in 1992 for his paper “A test pattern generation for arithmetic execution units.”

Takeshi Ogura (M’86) received the B.S., M.S., and Ph.D. degrees in electrical engineering from Osaka University, Osaka, Japan, in 1976, 1978, and 1991, respectively.
In 1978, he joined Musashino Electrical Communication Laboratories, Nippon Telegraph and Telephone Public Corporation (NTT), Tokyo, Japan. He is currently an Executive Manager with the Multimedia Electronics Laboratory of NTT Lifestyle and Environmental Technology Laboratories, Kanagawa, Japan. He is engaged in the research and development of CAM LSIs and their applications and is also engaged in the development of image processing and encoding LSIs.
Dr. Ogura is a member of the Institute of Electronics, Information, and Communication Engineers (IEICE) of Japan and the Information Processing Society of Japan (IPSJ).
