Received 21 July 2006; received in revised form 21 November 2006; accepted 6 December 2006
Abstract
We describe an approach for automatic feature extraction for shape characterization of seven distinct species of Eimeria, a protozoan parasite
of domestic fowl. We used digital images of oocysts, a round-shaped stage presenting inter-specific variability. Three groups of features were
used: curvature characterization, size and symmetry, and internal structure quantification. Species discrimination was performed with a Bayesian
classifier using Gaussian distribution. A database comprising 3891 micrographs was constructed and samples of each species were employed
for the training process. The classifier presented an overall correct classification of 85.75%. Finally, we implemented a real-time diagnostic
tool through a web interface, providing a remote diagnosis front-end.
© 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Keywords: Shape analysis; Feature extraction; Pattern classification; Image processing; Remote diagnosis; Real-time systems; Eimeria; Avian coccidiosis
Fig. 1. Photomicrographs of oocysts of the seven Eimeria species of domestic fowl. Samples: (a) E. maxima, (b) E. brunetti, (c) E. tenella, (d) E. necatrix,
(e) E. praecox, (f) E. acervulina, and (g) E. mitis.
pathogenicity and virulence, their precise discrimination is important for epidemiological studies and disease control measures. Parasite oocysts, a round-shaped developmental stage, are shed in profuse amounts in the feces of infected chicks. Oocysts of distinct species present differences of size (area, diameter), contour (elliptic, ovoid, circular), internal structure, thickness and color of the oocyst wall, among other morphological variations (Fig. 1). However, correct species discrimination by human visual inspection is severely restricted by the slight morphological differences that exist among the distinct species and the overlap of characteristics. Considering the limitations imposed by morphology-based diagnosis, different molecular approaches have been devised for species discrimination, such as a PCR-based diagnostic assay using the ribosomal ITS1 as a target [11,12]. Our group has also developed molecular diagnostic tools for Eimeria spp., including a multiplex PCR assay for the simultaneous diagnosis of the seven species that infect the domestic fowl [13]. These molecular diagnostic assays are very sensitive and specific, but require highly trained personnel and sophisticated infrastructure.

Previous works have reported the differentiation of Eimeria [7–9] and helminths [14] using digital image recognition. Kucera and Reznicky [7] reported the species differentiation of Eimeria spp. of domestic fowl using only two features, the length and width of oocysts, which were computed in a semiautomatic fashion. Such a limited number of characters, however, restricted the ability to differentiate all seven species, due to the similar morphology and overlap among the distinct species. Sommer [15,16], working with cattle Eimeria, used a more complex approach, where the parametric contour was considered as input to compute the amplitude of the Fourier transform. Nevertheless, the classification method (average linkage clustering) does not consider the distribution of elements and is not particularly suitable for real-time systems. Yang et al. [17] developed an automatic system for human helminth egg detection and classification using artificial neural networks (ANNs). The authors followed the work developed by Sommer [15], where the parametric contour of the object was used to compute the amplitude of the Fourier transform. Cross-validation results showed correct classification rates of 86.1–90.3%, but the small number of samples utilized severely restricted an estimation of the confidence level of the approach. Another work using ANNs for object detection was described by Widmer et al. [18] for the identification of Cryptosporidium parvum. The authors differentiated parasite oocysts from sample debris with success, but no species differentiation was conducted.

The small number of features utilized in these previous works can be explained by the difficulty of quantifying morphological features. This limitation, together with the high complexity of the algorithms, makes the development of real-time systems for automatic diagnosis a challenging task. In addition, the set of features to be used is strongly dependent on the characteristics of the image domain. In this regard, our group has reported several techniques for shape characterization. Thus, Bruno et al. [19] used multiscale features for the characterization of cat ganglion neural cells, whereas Coelho et al. [20] proposed another set of features (diameter, eccentricity, fractal dimension, influence histogram, influence area, convex hull area and convex hull diameter) for the same problem. Costa et al. [21] used digital curvature as a feature for the morphological characterization and classification of landmark shapes.

In this paper we present an approach to extract morphological information by using different computer vision techniques in order to perform an automatic species differentiation of Eimeria spp. oocysts. We report the development of a shape representation approach that considers three types of morphological characteristics: (a) multiscale curvature, (b) geometry, and (c) texture. All these features are automatically extracted, constituting a 13D (13-dimensional) feature vector for each oocyst image.

While the considered measurements and adopted classification methods used throughout this work are not necessarily
C.A.B. Castañón et al. / Pattern Recognition 40 (2007) 1899 – 1910 1901
3. Shape characterization

Images can be mathematically understood as sets of connected points in a bidimensional space F, which can be approximated in a discrete binary image space. Image classification, performed directly on F, is a hard task that may require O(N²) comparisons, assuming that each image has N pixels.

We used oocyst micrographs as the starting point for the automatic analysis. The pictures were obtained with an optical microscope (Nikon Eclipse E800) coupled to a 4-megapixel CCD camera (Nikon Coolpix 4500). The images were captured with a 40× magnification objective and saved as 24-bit JPEG (fine quality option) files. Under these conditions, all pictures presented a spatial resolution of 11.1 pixels/μm. Depending on the sample concentration and purity, a single micrograph could provide many oocysts for further processing. Low quality oocyst images were not considered for downstream analysis. Common practical problems included out-of-focus images, oocysts not adequately positioned, and atypical oocyst morphologies caused by accidental cracking or squeezing. Another problematic aspect we observed is related to the presence of debris and bacteria in dirty samples, thus complicating object segmentation. The process of oocyst image isolation was carried out manually using an image processing program (Gimp or Adobe Photoshop). The objects of interest, single oocysts, were cropped out of the picture and used to create new image files that were in turn used as input data to our system. A total of 3891 oocyst images constituted the data set of the present work.

Image quality can be substantially heterogeneous due to differences of illumination, contrast, focus and acquisition resolution, thus hampering object detection. To reduce the effect of illumination variations, we equalized the images through the histogram specification method [26], considering as eigenimage a prototype computed previously for each species from the training set. For object segmentation, we applied a thresholding approach [26] with a cut-off value manually determined for each image. As a result, binary images were produced with the respective object being defined by black pixels on a background of white pixels. The steps of converting an original color oocyst image into a parametric contour are depicted in Fig. 2.

Fig. 2. Different stages of oocyst image pre-processing. An original color image is firstly converted into (a) a gray scale image. After segmentation, the resulting (b) binarized image is used for (c) contour detection.

The binarized images (see Fig. 2b) are submitted to an algorithm that extracts the external contour of the object. This is done by selecting an initial point belonging to the contour of the object. The algorithm involves successive detections of the next contour pixel by using chain-code directions. The result is a parametric representation, where every point in the contour is identified by coordinates x(t) and y(t) [25].

3.3. Curvature based on multiscale Fourier transformation

The curvature of an object is an important characteristic that can be extracted from the respective contour. The pioneering work of Attneave [27] emphasized the importance that transient events and asymmetries have in human visual perception, thus influencing the subsequent research on shape in computer vision. Riggs [28], for instance, postulated that curvature detectors would be present at the neuronal level in humans. Due to its biological motivation, curvature analysis has gained attention from the pattern recognition community, and many methods have been proposed to compute it [29].

Our approach takes advantage of the closed parametric contour that is represented by the x(t) and y(t) signals, which are used for curvature estimation using the Fourier derivative property [30]. Let the parametric representation of the contour be

c(t) = (x(t), y(t)).   (1)

The curvature k(t) of c(t) is then defined as

k(t) = (ẋ(t)ÿ(t) − ẍ(t)ẏ(t)) / (ẋ(t)² + ẏ(t)²)^(3/2),   (2)

where ẋ and ẏ are the first derivatives, and ẍ and ÿ the second derivatives, of the signals x(t) and y(t), respectively. Those values can be easily computed using the Fourier derivative property [25].

Using an arc length parameterization (under which the denominator of Eq. (2) is unity), and convolving the original contour signal c(t) with derivatives of a Gaussian function of varying standard deviation a, the multiscale curvature derived from Eq. (2) is defined as described by Mokhtarian et al. [29]:

k(t, a) = ẋ(t, a)ÿ(t, a) − ẍ(t, a)ẏ(t, a).   (3)

The multiscale approach to curvature estimation leads to the so-called curvegram, where the curvature values appear as a scale-space representation. Fig. 3 shows the contour of an oocyst (panel a) and its corresponding curvegram (panel b). Gaussian smoothing is essential for controlling curvature instabilities caused by noise along the contour c(t), which would otherwise produce many peaks of variable height. The smoothing level is determined by the standard deviation a of the Gaussian function. A small a value (Fig. 3b, a = 10) results in a noisy curvature, whereas a higher value yields a smoother curvature (Fig. 3c, a = 50). This effect can be better observed in a 3D curvegram that includes different scale values (Fig. 3d).

Fig. 3. An oocyst contour (a) and the corresponding curvegrams using standard deviation values of a = 10 (b) and a = 50 (c), or a range of standard deviation values for the Gaussian function, displayed in a 3D curvegram (d).

While the curvature itself can be used as a feature vector, this approach presents some serious drawbacks, including the fact that the curvature signal can be too large (involving thousands of points, depending on the contour) and highly redundant. Once the curvature has been estimated, the following shape measures [25] can be calculated in order to circumvent these problems: sampled curvature, curvature statistics (mean, median, variance, standard deviation, entropy, higher moments, etc.), maxima, minima, inflection points, and bending energy.

3.4. Geometrical measurements

For some oocyst species, a characterization based only on shape and size is not distinctive, making it necessary to find additional features to characterize them. For instance, principal component analysis [25] was applied in order to find the main directional vectors (eigenvectors), which were used to define some measurements such as diameters and symmetry. We used bilateral symmetry, which is considered a primary case of the geometric concept of symmetry [31]. Considering a binary image, the shape is reflected with respect to the orientation defined by its major axis to find a bilateral symmetry degree; the same process is applied with respect to the minor axis [25]. Some additional measurements related to symmetry have also been described in the literature [32–36].

In the present work, we considered the diameters (major and minor axes) and symmetry of the oocysts. Simple global descriptors included area (number of pixels in the region), eccentricity (length of major axis/length of minor axis), circularity (perimeter²/area), and bending energy [37].

3.5. Texture characterization based on co-occurrence matrices

The many methods for texture analysis have been classified by Tuceryan and Jain [38] into four categories: statistical, geometrical, model based and signal processing based. A powerful, frequently used method involves the so-called co-occurrence matrices [39]. This method provides a second-order approach for generating texture features. Although mainly applied to texture discrimination of images, co-occurrence matrices have also been used for region segmentation [40].
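The multiscale curvature estimation of Section 3.3 (Eqs. (1)–(3)) can be sketched in a few lines of numpy. This is a minimal illustration under our own naming and parameter choices, not the authors' C++ implementation; in particular, the Gaussian smoothing with standard deviation `sigma` is applied directly in the frequency domain.

```python
import numpy as np

def multiscale_curvature(x, y, sigma):
    """Curvature k(t, sigma) of a closed contour (x(t), y(t)), smoothed by a
    Gaussian of standard deviation sigma (in contour samples), with the
    derivatives obtained via the Fourier derivative property."""
    n = len(x)
    z = x + 1j * y                               # complex contour signal
    freqs = np.fft.fftfreq(n)                    # frequencies in cycles/sample
    Z = np.fft.fft(z)
    G = np.exp(-2.0 * (np.pi * freqs * sigma) ** 2)   # Gaussian transfer function
    d1 = np.fft.ifft(Z * G * (2j * np.pi * freqs))       # first derivative
    d2 = np.fft.ifft(Z * G * (2j * np.pi * freqs) ** 2)  # second derivative
    xd, yd = d1.real, d1.imag
    xdd, ydd = d2.real, d2.imag
    # Eq. (2): signed curvature along the contour
    return (xd * ydd - xdd * yd) / (xd ** 2 + yd ** 2) ** 1.5
```

As a sanity check, a counter-clockwise circle of radius R yields a near-constant curvature of about 1/R (slightly biased by the Gaussian shrinkage of the contour).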
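As a concrete illustration of the co-occurrence approach, a gray-level co-occurrence matrix for one displacement vector, together with the four texture descriptors used in this work (angular second moment, contrast, inverse difference moment and entropy), might be computed as follows. This is a sketch with our own function name and an assumed small number of gray levels, not the original implementation.

```python
import numpy as np

def glcm_features(img, dx=1, dy=0, levels=8):
    """GLCM of a 2D integer image (values in [0, levels)) for the
    displacement (dx, dy), plus four Haralick-style descriptors."""
    h, w = img.shape
    C = np.zeros((levels, levels))
    # count co-occurrences of gray-level pairs separated by (dx, dy)
    for r in range(max(0, -dy), h - max(0, dy)):
        for c in range(max(0, -dx), w - max(0, dx)):
            C[img[r, c], img[r + dy, c + dx]] += 1
    C /= C.sum()                                  # joint probabilities C_ij
    i, j = np.indices(C.shape)
    nz = C > 0                                    # avoid log(0) in the entropy
    asm = np.sum(C ** 2)                          # angular second moment
    contrast = np.sum((i - j) ** 2 * C)
    idm = np.sum(C / (1.0 + (i - j) ** 2))        # inverse difference moment
    entropy = -np.sum(C[nz] * np.log2(C[nz]))
    return asm, contrast, idm, entropy
```

A perfectly uniform image gives the degenerate values asm = 1, contrast = 0, idm = 1 and entropy = 0, which is a convenient quick test of the implementation.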
The co-occurrence matrices take into account information about the relative positions of the various gray levels within the image. There are two parameters used for computing co-occurrence matrices: (a) the relative distance among the pixels and (b) their relative orientation. They involve the conditional joint probabilities, Cij, of all pairwise combinations of gray levels given the inter-pixel displacement vector (Δx, Δy), which represents the separation of the pixel pairs in the x- and y-directions, respectively. Traditionally, the probabilities are stored in a gray level co-occurrence matrix (GLCM) [39,41]. This resultant matrix is a second-order histogram from which some information can be extracted [42]: angular second moment, contrast, inverse difference moment and entropy.

4. Pattern classification

4.1. Bayesian classifier

Classification is always performed with respect to some properties (or features) of the objects. Indeed, the fact that objects share the same property defines an equivalence relation in terms of partitioning of the object space. In this sense, a sensible classification operates in such a way as to group together into the same "class" entities that share some properties, while distinct classes are assigned to entities with distinct properties. We use the term "classifier" for each statistical tool, trained using a specific data set, to discriminate distinct "classes".

A Bayesian classifier [43] utilizes a probabilistic approach for classification. It can be used to compute the probability that an example x belongs to class ωᵢ. The computer implementation is facilitated using the multivariate normal density function, which is entirely defined by two parameters: the mean μᵢ and the covariance matrix Σᵢ. Although the Bayesian decision rule is not a discriminant function, it defines regions that can be expressed in terms of discriminant functions gᵢ(x). To classify a new element x into one of the classes, we take the highest value of the gᵢ functions as the corresponding true class.

4.2. Algorithm for the partition process and classifier

Aiming at obtaining a robust and reliable classifier, we developed an algorithm (Algorithm 1) to select the best combination of features, evaluate the most adequate size of the training set, and evaluate the classification accuracy.

For each class, the corresponding data set was randomly divided into two groups, the training and the test sets. Different proportions of these sets were tested using intervals defined by integers (e.g. from 10:90 to 90:10). In addition, for each training:test proportion we generated a user-defined number of randomly selected paired sets, which were evaluated independently to reduce possible sampling biases. Each set was then evaluated with respect to its ability to correctly classify. The average of the classification scores, obtained for each of these paired sets, was considered as the final score of correct classification for that particular proportion of training:test sets. This approach was recursively applied to the different training:test set percentages. Finally, the classification matrix was calculated as the average of all confusion matrices resulting from each training:test partition.

Algorithm 1. CLASSIFICATION()
Require: DataSet;
Require: Nc ← # of classes;
Require: Nf ← # of features;
Require: %training ← % of training set;
Require: %test ← % of test set;
Require: NrandomPartitions ← # of random sets;
Require: LC ← # of learning cycles;
Ensure: MclassMean[ ][ ]
1: set MclassAux[ ][ ] with zeros;
2: for i = 1 to NrandomPartitions do
3:   [TrainingSet, TestSet] = PARTITION(DataSet, %training, %test, Nc);
4:   Mclass = BAYESIANCLASSIFIER(TrainingSet, TestSet, Nc, Nf, LC);
5:   MclassAux = MclassAux + Mclass;
6: end for
7: MclassMean = MclassAux / NrandomPartitions;
8: return MclassMean;

The procedure requires a DataSet with a defined number of classes (Nc) and a number of features (Nf). The partition is defined by the %training:%test proportion, and the number of times that the random process of partition will occur is determined by the NrandomPartitions parameter. Additionally, an LC parameter defines the number of learning cycles of the classifier. The resultant matrix is MclassMean.

For a better understanding of the algorithm, the partition process and the classifier are represented as separate implementations. The PARTITION function is responsible for the random partition of the DataSet, using the following parameters as input: the data set, the training:test proportion, and the number of classes. The function thus returns the respective training and test sets. The BAYESIANCLASSIFIER function is the core process that implements the classifier. The classifier is trained with the TrainingSet and evaluated with the TestSet. Both tasks also require as input the number of classes, features, and learning cycles. The function then returns a classification confusion matrix Mclass. Finally, MclassMean is the resultant confusion matrix, calculated as the average of all Mclass confusion matrices, computed for each of the distinct random partitions.

4.3. Image similarity

Following class assignment of the x vector through a Bayesian classifier, the next step is to know the level of similarity between the query image and the assigned species. In this sense, the prototype element of the class is the mean μ of the normal density. Considering a training set composed of samples x₁, . . . , xₙ, the prototype of this set is the average of the samples. Thus, we adopted this prototype as the most representative element of each class.

The Mahalanobis distance is used as a similarity metric between the element x classified in class ωᵢ and its prototype μᵢ. This distance is adequate for multivariate normal data, which tends to cluster around the mean vector μ, falling in an ellipsoidally shaped cloud whose principal axes are the eigenvectors of the covariance matrix Σ. Thus, the natural measure of the distance from x to the mean μ is provided by the quantity

r² = (x − μ)ᵗ Σ⁻¹ (x − μ).   (4)

Table 2
Feature space for morphological characteristics of Eimeria spp. of domestic fowl

Type        ID  Feature name
Curvature    1  Mean of curvature
             2  Standard deviation of curvature
             3  Entropy of curvature
Geometry     4  Major axis
             5  Minor axis
             6  Symmetry through major axis
             7  Symmetry through minor axis
             8  Area
             9  Entropy of oocyst content
Texture     10  Angular second moment
            11  Contrast
            12  Inverse difference moment
            13  Entropy

The 13D space is divided into three types of features: curvature, geometry, and texture.

numbers of utilized features. Thus, the best combination of two features (4 and 5) yielded a correct classification of 77.25%. The highest overall correct classification value (85.90%) was obtained with a combination of 12 features. Since the correct classification rates observed by using 10–13 features varied within the range of one standard deviation (data not shown), we decided to employ the 13 features in all subsequent analyses.

5.3. Analysis of species differentiation

Species differentiation experiments were performed with a data set of 3891 oocyst images, comprising multiple strains of the different Eimeria species that infect the domestic fowl. The complete list of the strains and species utilized in this work is presented in Table 1. From the overall data set, we used 30% of the images for the training set, and 70% for the test set. A total of 100 paired sets were randomly generated and each one was used as an input for the classification process (see Algorithm 1), which in turn generated a confusion matrix as a result of species discrimination. Therefore, at the end of the recursive process, we generated 100 confusion matrices which were used to compute the average confusion matrix. This latter matrix contained the mean correct classification for all tested species. Finally, by computing the diagonal average, we obtained the overall percentage of correct classification of the system.
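The classification scheme of Sections 4.1 and 4.3, one multivariate normal per class, discriminant functions gᵢ(x), and the Mahalanobis distance of Eq. (4) to the class prototype, can be sketched as follows. Equal class priors are assumed, and the class and method names are our own, not those of the original C++ implementation.

```python
import numpy as np

class GaussianBayes:
    """One multivariate normal per class; a query is assigned to the class
    with the largest log-density discriminant (equal priors assumed)."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        # per-class prototype (mean) and covariance, as in Sections 4.1/4.3
        self.mu = {c: X[y == c].mean(axis=0) for c in self.classes}
        self.cov = {c: np.cov(X[y == c], rowvar=False) for c in self.classes}
        return self

    def mahalanobis2(self, x, c):
        # Eq. (4): r^2 = (x - mu)^T Sigma^-1 (x - mu)
        d = x - self.mu[c]
        return float(d @ np.linalg.inv(self.cov[c]) @ d)

    def predict(self, x):
        def g(c):  # discriminant g_c(x), dropping class-independent terms
            _, logdet = np.linalg.slogdet(self.cov[c])
            return -0.5 * (self.mahalanobis2(x, c) + logdet)
        return max(self.classes, key=g)
```

The Mahalanobis distance of a query to the prototype of its assigned class then doubles as the image-similarity score of Section 4.3.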
Table 3
Feature selection using the SFS test

No. of features    Correct classification (%)
2                  77.25
3                  79.90
4                  81.02
5                  82.45
6                  83.89
7                  85.04
8                  85.64
9                  85.63
10                 85.75
11                 85.73
12                 85.90
13                 85.75

For each subset size, the best combination of features (out of features 1–13) and the resulting correct species classification value are presented.
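The feature-selection procedure behind Table 3 can be sketched as a greedy sequential forward search: at each step, add the feature whose inclusion maximizes the classification score, recording the best subset at every size. The exact SFS variant used by the authors is not specified, so this is an illustrative assumption, with `score` standing in for any correct-classification estimate.

```python
def sfs(all_features, score):
    """Greedy sequential forward selection over a pool of feature IDs.
    Returns one (subset, score) entry per subset size, as in Table 3."""
    selected, history = [], []
    remaining = list(all_features)
    while remaining:
        # pick the feature whose addition maximizes the evaluation score
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
        history.append((tuple(selected), score(selected)))
    return history
```

With a toy additive score, the search simply picks features in order of decreasing weight, which makes the behavior easy to verify.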
The overall percentage of correct species assignment observed was 85.75%. Table 4 presents the final confusion matrix, where we can clearly see that the best classification was obtained for E. maxima (99.21%). Conversely, E. praecox and E. necatrix presented the worst results, with 74.23% and 74.90% correct discrimination rates, respectively. These results were due to cross-classification with other Eimeria species. Thus, E. necatrix was incorrectly classified as E. acervulina (6.10%) and E. tenella (9.94%). Similarly, some other Eimeria species were also incorrectly classified as E. necatrix (E. acervulina in 12.53%, E. praecox in 10.94% and E. tenella in 12.22% of the cases). These results show that E. necatrix and E. praecox are certainly the most difficult species to differentiate, due to their morphological similarity to each other and to other species. This is in agreement with what is classically reported by personnel involved with the visual inspection and classification of Eimeria field samples.

The analysis subsystem represents the kernel of the system and is responsible for the image pre-processing, feature extraction and pattern classification. This module was entirely developed in C++, resulting in a rapid response of the system during the image processing step, thus permitting real-time processing through the web.

Considering that different users have distinct setups of microscopes and digital cameras, the magnification and resolution of the captured images can vary significantly from those used in this work. In order to normalize the image scale, the user must first determine the number of pixels/μm of the captured image. This can simply be done using a calibrated microscope scale, such as those imprinted on specialized measuring slides. Alternatively, hemocytometer counting chambers, commonly used in many laboratories, can also be employed. Once a picture of the scale is obtained, the custom spatial resolution, expressed as the number of pixels/μm, can be easily determined using any
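The scale normalization described above amounts to rescaling the user's image by the ratio of the reference resolution (11.1 pixels/μm in this work) to the user-determined resolution. A sketch, with a hypothetical helper name:

```python
def normalized_size(width_px, height_px, user_res_px_per_um,
                    ref_res_px_per_um=11.1):
    """Target dimensions that bring an image captured at user_res_px_per_um
    to the reference spatial resolution used for training.
    Hypothetical helper; 11.1 pixels/um is the resolution reported here."""
    s = ref_res_px_per_um / user_res_px_per_um  # rescale factor
    return round(width_px * s), round(height_px * s)
```

The actual resampling to the returned dimensions can then be performed with any image library's resize routine.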
Table 4
Confusion matrix of species differentiation of Eimeria spp. of domestic fowl
Fig. 5. Framework of the real-time system for automatic diagnosis of Eimeria species.
discrimination. Other sources of data variability were also is considerably simpler for on-line and interactive implemen-
assessed, including differences on microscope illumination tations of the system.
and contrast, as well as the volume of the parasite suspension Several possible applications of our system can be foreseen
between the slide and coverslip. Finally, we used a relatively in a near future. Initially, the large image data set of Eimeria
high number of features, that were submitted to a feature selec- oocysts was made publicly available as the Eimeria Image
tion process to evaluate how many and which of them would Database. The database also includes now circa 2500 images
compose the most discriminative set. of 11 Eimeria species that infect the domestic rabbit. Since this
The approach described here is simple and permits a reliable database can be added with new parasite images in the future,
identification of the parasite species. Features are not limited it may represent an invaluable resource for classical parasitol-
to the simplest and most traditional geometric measures, as ogists and also for teaching purposes. From the computational
we also computed curvature to represent the form, and texture standpoint, it represents a novel repository of parasite im-
for internal structure characterization. Considering that this age data, useful for experimental protocols involving pattern
diagnostic system is based on morphology, the correct species recognition methods. As such, new algorithms could be tested
assignment rate obtained (85.75%) can be considered a very using this data set as a golden standard of validated bio-
good result, especially if compared to a subjective human logical samples.
diagnosis. Furthermore, given the complexity of the algorithms In addition to the image database, the precise morphomet-
for feature extraction, the current implementation is computa- ric data of the different Eimeria species provides a unique op-
tionally efficient, permitting a rapid and real-time interaction portunity to revisit the classic size estimations [10]. As such,
of the end-user through a web interface. we intend to provide new parasite identification charts where
Finally, because the system uses generic algorithms, it can be morphometric data will be presented in the light of the current
easily extended to discriminate other organisms. For this task, modern microscope optics and digital image technology. This
the user just needs to provide a new image data set and use it to kind of data will certainly be of a high value to the Eimeria
train the system to discriminate the different classes. In fact, a scientific community, as well as to researchers in pattern recog-
preliminary study, including 11 Eimeria species that infect the nition, which may use such repository to test new measurement
domestic rabbit, showed a similar discriminative performance and classification methodologies.
(data not shown). An envisaged application of the shape characterization
Previous studies using digital image processing applied to methodology described here is the implementation of a real-
Eimeria [7–9] have been reported in the literature. These sys- time diagnostic tool through a web interface. In this direction,
tems, however, were restricted to a semiautomatic oocyst diam- we have created an experimental front-end for public access.
eter measurement and still required a strong human interaction Since diagnosis is performed in real-time, there are almost no
during processing. In addition, most studies employed a small delay between the sample querying and the final diagnostic
number of morphological characters. Thus, some works used result. We foresee that such system would allow for a reliable
as features the oocyst diameters [7,9], whereas others used the diagnosis with no need of biological sample transportation be-
Fourier transform of the contour [15] or computed statistics tween the farms and the reference laboratory. This represents
from it [17]. a particularly important achievement, since live sample traffic
Another general limitation was related to the classification may represent a sanitary risk due to the potentiality of disease
method, where multidimensional data distribution has not been dissemination. Also, compared to other diagnostic approaches,
considered. Sommer [15] used Euclidean distance as a metric our system does not require trained personnel on parasite iden-
for clusterization. This metric assumes that the data is homo- tification or molecular biology techniques. The incorporation
geneously distributed, which is not necessarily the case, espe- of other parasites to the system may even increase the scope
cially when multidimensional data is used. Yang et al. [17], of applicability of this electronic diagnostic tool. Coccidian
working with human helminth eggs, used four morphometric protozoa and helminth eggs, by presenting a morphology
features and two stages of ANNs. These ANNs were used for similar to Eimeria oocysts, are the obvious candidates to be
the identification of eggs from artifacts, and for species dis- included in a near future. With the current decreasing prices of
crimination, respectively. However, the estimation of the aver- high resolution (above four megapixels) digital cameras, our
age correct classification ratio was based on a very small image system is relatively cheap. In fact, any reasonable microscope
data set, and the possible influence of intra-specific variability with a digital photo documentation system (a camera and an
was not assessed by the authors. adapter tube) would represent the minimum apparatus for such
We also preliminarily considered alternative classification methodology.
methodologies, such as SVM [47,48]. More specifically, we Another aspect where shape characterization may have an
compared the performance of Bayesian classifier and SVM con- interesting impact is on phylogenetic analysis. Classic phylo-
sidering situations involving seven Eimeria categories and 13 genetics used to rely on morphometric data, but since DNA
features. Because the obtained results did not indicate superior sequencing became a mainstream and relatively cheap tech-
performance of the SVM methodology (actually, slightly bet- nique, most current inferences are now based on molecular data.
ter results were achieved for the Bayesian classifier), we de- Because our morphological features have a quantitative repre-
cided to adopt the Bayesian methodology. An additional reason sentation, they can be discretized and converted into data matri-
motivating such a choice is the fact that the Bayesian classifier ces amenable to phylogenetic methods. Phylogenetic inference
C.A.B. Castañón et al. / Pattern Recognition 40 (2007) 1899 – 1910 1909
Phylogenetic inference of the genus Eimeria has been reported using the ribosomal 18S sequence [49]. Our group has recently characterized the complete mitochondrial genome of the seven chicken Eimeria species (Romano et al., manuscript in preparation) and used this data set to reconstruct the phylogeny of the group. Preliminary results show good agreement between inferences based on these molecular markers and the morphological features described in this work. Thus, morphometric data applied to phylogenetic inference may provide an interesting counterpart to molecular-based phylogenies, with potentially exciting evolutionary implications.
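The Bayesian classification adopted earlier in this section can be sketched in miniature as a Gaussian maximum-likelihood rule. This is a minimal sketch, not the authors' implementation: it assumes equal class priors and independent features (a naive simplification of a full multivariate Gaussian model), and the species labels and measurements used below are invented placeholders, not data from the paper.

```python
import math

def fit(samples):
    """Estimate per-class, per-feature mean and variance.
    samples: dict mapping class label -> list of feature vectors."""
    model = {}
    for label, vectors in samples.items():
        n, dims = len(vectors), len(vectors[0])
        means = [sum(v[d] for v in vectors) / n for d in range(dims)]
        # small floor keeps every variance strictly positive
        varis = [sum((v[d] - means[d]) ** 2 for v in vectors) / n + 1e-9
                 for d in range(dims)]
        model[label] = (means, varis)
    return model

def log_gauss(x, mu, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

def classify(model, x):
    """Assign x to the class maximizing the summed per-feature Gaussian
    log-likelihood (equal class priors assumed)."""
    return max(model, key=lambda c: sum(
        log_gauss(xi, mu, var) for xi, mu, var in zip(x, *model[c])))
```

After training on a handful of feature vectors per species, `classify(model, x)` picks the class whose per-feature Gaussians best explain the measurement; an SVM would be swapped in at the same point for the comparison described above.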
7. Conclusions detection and discrimination of the seven Eimeria species that infect
In this paper, an effective shape characterization approach for automatic species differentiation in Eimeria spp. is proposed. The extracted features capture different morphological properties of the oocysts, related to form, geometry, and internal structure. This shape representation was applied to the differentiation of the seven Eimeria species of domestic fowl, and the results revealed good reliability of the feature set. Finally, a real-time diagnosis system was implemented and made available to the scientific community. We believe that our system demonstrates the feasibility of using computer-assisted systems as an interesting alternative for the rapid diagnosis of parasites.
Acknowledgments

Luciano da F. Costa (308231/03-1) and Arthur Gruber (306793/2004-0) are grateful to CNPq for financial support. César A.B. Castañón received a fellowship from CAPES, and the work presented herein formed part of his Ph.D. Thesis. Jane S. Fraga and Sandra Fernandez received fellowships from CNPq and FAPESP, respectively.
References

[1] D. Comaniciu, P. Meer, D. Foran, Image-guided decision support system for pathology, Mach. Vision Appl. 11 (4) (1999) 213–224.
[2] D. Sabino, L. Costa, E. Rizzatti, M. Zago, A texture approach to leukocyte recognition, Real-Time Imaging 10 (4) (2004) 205–216.
[3] A. Jalba, M. Wilkinson, J. Roerdink, Shape representation and recognition through morphological curvature scale spaces, IEEE Trans. Image Process. 15 (2) (2006) 331–341.
[4] S. Trattner, H. Greenspan, G. Tepper, S. Abboud, Automatic identification of bacterial types using statistical imaging methods, IEEE Trans. Med. Imaging 23 (7) (2004) 807–820.
[5] X. Long, W. Cleveland, Y. Yao, Effective automatic recognition of cultured cells in bright field images using Fisher's linear discriminant preprocessing, Image Vision Comput. 23 (13) (2005) 1203–1213.
[6] M. Sampat, A. Bovik, J. Aggarwal, K. Castleman, Supervised parametric and non-parametric classification of chromosome images, Pattern Recognition 38 (8) (2005) 1209–1223.
[7] J. Kucera, M. Reznicky, Differentiation of species of Eimeria from the fowl using a computerized image-analysis system, Folia Parasitol. (Praha) 38 (2) (1991) 107–113.
[8] A. Daugschies, S. Imarom, W. Bollwahn, Differentiation of porcine Eimeria spp. by morphologic algorithms, Vet. Parasitol. 81 (3) (1999) 201–210.
[9] A. Plitt, S. Imarom, A. Joachim, A. Daugschies, Interactive classification of porcine Eimeria spp. by computer-assisted image analysis, Vet. Parasitol. 86 (2) (1999) 105–112.
[10] P.L. Long, B.J. Millard, L.P. Joyner, C.C. Norton, A guide to laboratory techniques used in the study and diagnosis of avian coccidiosis, Folia Vet. Lat. 6 (3) (1976) 201–217.
[11] B.E. Schnitzler, P.L. Thebo, J.G. Mattsson, F.M. Tomley, M.W. Shirley, Development of a diagnostic PCR assay for the detection and discrimination of four pathogenic Eimeria species of the chicken, Avian Pathol. 27 (5) (1998) 490–497.
[12] B.E. Schnitzler, P.L. Thebo, F.M. Tomley, A. Uggla, M.W. Shirley, PCR identification of chicken Eimeria: a simplified read-out, Avian Pathol. 28 (1) (1999) 89–93.
[13] S. Fernandez, A.H. Pagotto, M.M. Furtado, A.M. Katsuyama, A.M. Madeira, A. Gruber, A multiplex PCR assay for the simultaneous detection and discrimination of the seven Eimeria species that infect domestic fowl, Parasitology 127 (4) (2003) 317–325.
[14] A. Joachim, N. Dulmer, A. Daugschies, Differentiation of two Oesophagostomum spp. from pigs, O. dentatum and O. quadrispinulatum, by computer-assisted image analysis of fourth-stage larvae, Parasitol. Int. 48 (1) (1999) 63–71.
[15] C. Sommer, Quantitative characterization, classification and reconstruction of oocyst shapes of Eimeria species from cattle, Parasitology 116 (1) (1998) 21–28.
[16] C. Sommer, Quantitative characterization of texture used for identification of eggs of bovine parasitic nematodes, J. Helminthol. 72 (2) (1998) 179–182.
[17] Y. Yang, D. Park, H. Kim, M. Choi, J. Chai, Automatic identification of human helminth eggs on microscopic fecal specimens using digital image processing and an artificial neural network, IEEE Trans. Biomed. Eng. 48 (6) (2001) 718–730.
[18] K.W. Widmer, K.H. Oshima, S.D. Pillai, Identification of Cryptosporidium parvum oocysts by an artificial neural network approach, Appl. Environ. Microbiol. 68 (3) (2002) 1115–1121.
[19] O. Bruno, R. Cesar Jr., L. Consularo, L. Costa, Automatic feature selection for biological shape classification in SYNERGOS, in: Proceedings of the SIBGRAPI'98, International Symposium on Computer Graphics, Image Processing, and Vision, 1998, pp. 363–370.
[20] R. Coelho, V.D. Gesù, G.L. Bosco, J. Tanaka, C. Valenti, Shape-based features for cat ganglion retinal cells classification, Real-Time Imaging 8 (3) (2002) 213–226.
[21] L. Costa, S. dos Reis, R. Arantes, A. Alves, G. Mutinari, Biological shape analysis by digital curvature, Pattern Recognition 37 (3) (2004) 515–524.
[22] D. Regan, Human Perception of Objects, York University, New York, 2000.
[23] B. Olshausen, D. Field, Vision and the coding of natural images, Am. Sci. 88 (3) (2000) 238–245.
[24] D. Zhang, G. Lu, Review of shape representation and description techniques, Pattern Recognition 37 (1) (2004) 1–19.
[25] L. Costa, R. Cesar Jr., Shape Analysis and Classification: Theory and Practice, CRC Press, Boca Raton, FL, 2000.
[26] R. Gonzalez, R. Woods, Digital Image Processing, Addison-Wesley, Reading, MA, 1993.
[27] F. Attneave, Some informational aspects of visual perception, Psychol. Rev. 61 (3) (1954) 183–193.
[28] L. Riggs, Curvature as a feature of pattern vision, Science 181 (4104) (1973) 1070–1072.
[29] F. Mokhtarian, A. Mackworth, A theory of multiscale, curvature-based shape representation for planar curves, IEEE Trans. Pattern Anal. Mach. Intell. 14 (8) (1992) 789–805.
[30] R. Cesar Jr., L. Costa, Towards effective planar shape representation with multiscale digital curvature analysis based on signal processing techniques, Pattern Recognition 29 (9) (1996) 1559–1569.
[31] H. Weyl, Symmetry, Princeton University Press, New Jersey, 1980.
[32] H. Zabrodsky, S. Peleg, D. Avnir, A measure of symmetry based on shape similarity, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR92), 1992, pp. 703–706.
[33] M. Brady, H. Asada, Smoothed local symmetries and their implementation, Technical Report, Cambridge, MA, USA, 1984.
[34] J. Sato, R. Cipolla, Affine integral invariants for extracting symmetry axes, Image Vision Comput. 15 (8) (1997) 627–635.
[35] Y. Bonneh, D. Reisfeld, Y. Yeshurun, Quantification of local symmetry: application to texture discrimination, Spat. Vision 8 (4) (1994) 515–530.
[36] B. Zavidovique, V.D. Gesù, Kernel based symmetry measure, in: ICIAP, 2005, pp. 261–268.
[37] I. Young, J. Walker, J. Bowie, An analysis technique for biological shape I, Inform. Control 25 (4) (1974) 357–370.
[38] M. Tuceryan, A. Jain, Texture analysis, in: C.H. Chen, L.F. Pau, P.S.P. Wang (Eds.), The Handbook of Pattern Recognition and Computer Vision, second ed., World Scientific Publishing Co., Singapore, 1998, pp. 207–247.
[39] R. Haralick, K. Shanmugam, I. Dinstein, Textural features for image classification, IEEE Trans. Systems Man Cybern. SMC-3 (6) (1973) 610–621.
[40] R. Jobanputra, D. Clausi, Preserving boundaries for image texture segmentation using grey level co-occurring probabilities, Pattern Recognition 39 (2) (2006) 234–245.
[41] R. Conners, Towards a set of statistical features which measure visually perceivable qualities of texture, in: Proceedings of Pattern Recognition Image Processing Conference, 1979, pp. 382–390.
[42] S. Theodoridis, K. Koutroumbas, Pattern Recognition, Academic Press, San Diego, 1998.
[43] R. Duda, P. Hart, D. Stork, Pattern Classification, Wiley, New York, 2001.
[44] P.M. Narendra, K. Fukunaga, A branch and bound algorithm for feature subset selection, IEEE Trans. Comput. 26 (9) (1977) 917–922.
[45] A. Jain, R. Duin, J. Mao, Statistical pattern recognition: a review, IEEE Trans. Pattern Anal. Mach. Intell. 22 (1) (2000) 4–37.
[46] A. Jain, D. Zongker, Feature selection: evaluation, application, and small sample performance, IEEE Trans. Pattern Anal. Mach. Intell. 19 (2) (1997) 153–158.
[47] N. Cristianini, J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Cambridge University Press, Cambridge, 2000.
[48] K. Crammer, Y. Singer, On the algorithmic implementation of multiclass kernel-based vector machines, J. Mach. Learn. Res. 2 (5) (2001) 265–292.
[49] J.R. Barta, D.S. Martin, P.A. Liberator, M. Dashkevicz, J.W. Anderson, S.D. Feighner, A. Elbrecht, A. Perkins-Barrow, M.C. Jenkins, H.D. Danforth, M.D. Ruff, H. Profous-Juchelka, Phylogenetic relationships among eight Eimeria species infecting domestic fowl inferred using complete small subunit ribosomal DNA sequences, J. Parasitol. 83 (2) (1997) 262–271.
About the Author—CÉSAR ARMANDO BELTRÁN CASTAÑÓN holds a B.Sc. degree in Systems Engineering from Universidad Católica de Santa María, Peru, and an M.Sc. in Computer Science from the University of São Paulo (USP), Brazil. He is currently a Ph.D. student in Bioinformatics at the USP.
His research interests include 2D image shape analysis, pattern recognition, feature extraction and selection, content-based image retrieval, and computational
biology.
About the Author—JANE SILVEIRA FRAGA holds a Veterinary Medicine degree from the University of São Paulo (USP). She is currently finishing her
Ph.D. Thesis on the characterization of dsRNA viruses infecting Eimeria spp. of domestic fowl.
About the Author—SANDRA FERNANDEZ holds a Veterinary Medicine degree and a Ph.D. in Parasitology (USP). She is currently heading a research and
development group at Laboratório Biovet S/A, a private company that is a major vaccine producer in Brazil.
About the Author—ARTHUR GRUBER holds a Veterinary Medicine degree, and a Ph.D. in Biochemistry (USP). He is currently an Associate Professor at
the Department of Parasitology of the USP. His main research interests include molecular biology and genomics of coccidian parasites, and the development
of bioinformatics applications for sequence analysis.
About the Author—LUCIANO DA FONTOURA COSTA holds a B.Sc. in Electronic Engineering and Computer Science, an M.Sc. in Applied Physics, and a
Ph.D. in Electronic Engineering (King’s College, University of London). He is currently a Full Professor at the USP. His main interests include natural and
artificial vision, shape analysis, pattern recognition, computational neuroscience, computational biology, and bioinformatics.