
Animal–Vehicle Collision Mitigation System for Automated Vehicles

Abdelhamid Mammeri, Depu Zhou, and Azzedine Boukerche

Abstract—Detecting large animals on roadways using automated systems such as robots or vehicles is a vital task. This can be achieved using conventional tools such as ultrasonic sensors, or with innovative technology based on smart cameras. In this paper, we investigate a vision-based solution. We begin the paper by performing a comparative study between three detectors: 1) Haar-AdaBoost; 2) histogram of oriented gradient (HOG)-AdaBoost; and 3) local binary pattern (LBP)-AdaBoost, which were initially developed to detect humans and their faces. These detectors are implemented, evaluated, and compared to each other in terms of accuracy and processing time. Based on our evaluation and comparison results, we design a two-stage architecture which outperforms the aforementioned detectors. The proposed architecture detects candidate regions of interest using LBP-AdaBoost in the first stage, which offers robustness to false positives in real-time conditions. The second stage is based on support vector machine classifiers that were trained using HOG features. The training data are generated from our novel dataset called the large animal dataset, which contains common and thermographic images of large road-animals. We emphasize that no such public dataset currently exists.

Index Terms—Animal–vehicle collision (AVC) detection, automated systems, obstacle detection and avoidance.

Manuscript received November 4, 2014; revised March 7, 2015; accepted July 31, 2015. This work was supported in part by the Canada Research Chair Programs, in part by the DIVA Strategic Research Network, and in part by the Natural Sciences and Engineering Research Council of Canada. This paper was recommended by Associate Editor E. Tunstel. The authors are with the School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON K1N 6N5, Canada (e-mail: amammeri@uottawa.ca). Digital Object Identifier 10.1109/TSMC.2015.2497235

I. INTRODUCTION

WITH the rapid advances in technology, there is a concerted determination to use automatic systems to increase driver aptitude in safety tasks such as driving vehicles. Automatic collision avoidance systems are intended to assist drivers in obstacle detection and avoidance [1]. The animal–vehicle collision (AVC) avoidance system is an example of such systems, which serve to enhance the safety of roadway users and increase highway throughput.

AVCs are challenging issues for vehicles, particularly in rural regions of North America and Europe. They account for about 200 human deaths, 29 000 injuries, and $1.1 billion in property damage every year in the U.S. [2]. Similar situations are encountered in Europe, Africa, and Asia. For instance, some European countries witnessed more than 507 000 collisions, resulting in around 300 human fatalities, 30 000 human injuries, and more than $1 billion in damages every year.

Over the last decade, many AVC mitigation architectures have been proposed in [3] and [4]. These architectures can be grouped into two main categories: 1) passive methods, which use deterrence to keep large animals away from roadways and 2) active methods, based on animal detection. Passive methods make use of deterrence strategies to warn animals; an example of this is the use of ultrasonic noise such as whistles (e.g., Hornet V120 [3]), or the generation of high-intensity lights from vehicles, which increase the distance from which they may be perceived by animals. Other earlier and inefficient techniques, such as electronic mats, animal reflectors, roadside refractors, and break-the-beam methods, have also been used to keep animals away from roads. It is observed in [4] that the most effective way to reduce the number of AVCs is to detect animals using cameras, rather than relying on deterrence strategies. This is because camera-based systems are the most efficient and accurate way to see around regions under investigation in order to reduce AVCs. However, the disadvantages of camera-based solutions lie in the fact that they focus on animals within the road, and ignore those outside the field of investigation. These systems also fail to detect animals on curved lanes. Deterrence methods (e.g., ultrasonic devices), on the other hand, require a clear line of sight to establish beam connections. Another problem is their false activation by smaller species of animals, air movement, or humans, which results in false alarms. The advantage of these systems is that they are relatively insensitive to changes in temperature. In this paper, we focus only on the active methods from a computer-vision perspective. To detect the presence of animals, vision-based detection systems use visible-range cameras, thermographic cameras, radars, or lasers. These devices are installed inside cars or along roadsides. When a large animal is detected, drivers are notified through a warning message from a car dashboard system, or through flashing roadside signs.

In this research paper, based on [5], we explore the detection of roadway animals using a camera-based architecture. We begin this paper with a comparative study between three detectors: 1) Haar-AdaBoost; 2) histogram of oriented gradient (HOG)-AdaBoost; and 3) local binary pattern (LBP)-AdaBoost, all of which were originally developed to detect humans and their faces. We then evaluate and compare these detectors in terms of accuracy and processing time. Additionally, we compare these detectors to the

well-known HOG-SVM approach. Based on the evaluation results, we propose a two-stage architecture that outperforms the aforementioned schemes: Haar-AdaBoost, HOG-AdaBoost, LBP-AdaBoost, and HOG-SVM. All of these detectors were evaluated and tested in different daytime and nighttime conditions. Our two-stage architecture exhibits good performance in daytime conditions; however, at nighttime, it is less efficient. The first stage, i.e., LBP-AdaBoost, is used to detect regions of interest (ROIs) that potentially contain large animals. The second stage is based on support vector machine (SVM) classifiers that were trained using HOG features, which increases the detection rate, particularly in daytime conditions, as explained in Section VIII [6].

To train our classifiers and to test the aforementioned architectures, we have created a new dataset called the large animal dataset (LADSet). We emphasize that no such public dataset for large animals exists in the literature. This dataset is frequently updated by the addition of new images. To perform our experiments, we have focused mainly on lateral views of moose; the lateral position is the most common for animals to take when crossing roadways. Other uncommon postures of animals, such as the posterior view and the recumbent position, are left to our future work.

We begin this paper by reviewing the most important research works on large animal detection in Section II, and presenting the animal detection scenario in Section III. Next, in Section IV, we introduce our dataset, which primarily consists of Internet images and video frames. In Section V, the three features used to detect animals are briefly explained. The main algorithms used in training are then introduced in Section VI. The three detectors constructed in this paper are explained and compared to each other in Section VII. To address the problems found with these detectors, we propose a two-stage architecture in Section VIII. In Section IX, we investigate nighttime conditions, and suggest a new system. We conclude our paper with some useful remarks that outline our future work in Section X.

II. RELATED WORK

Surprisingly, large animal detection in the automotive context has not received a great deal of interest from the human-machine systems community, despite the existence of some AVC mitigation architectures. In fact, almost all of these papers present some countermeasures that are mainly used to prevent collisions; however, they do not address the detection algorithms of AVC systems (see [3], [4]). The only exception comes from [7], in which a contour-based HOG-SVM method is developed to detect deer. In this section, we review the detection and recognition methods for large animals in general applications. Haar-like features, originally developed to detect human faces, are explored in [8] to detect the faces of lions. Indeed, these features are extracted from a color map calculated from the difference between the R and G color channels, instead of being extracted from gray images as performed in [9]. These features are trained using the AdaBoost algorithm. The Kanade–Lucas–Tomasi method is used in combination with the detector to track animal faces. It is stated in [8] that this method of extracting features improves the robustness of the lion-face detector against shadow and illumination changes in natural scenes. However, no performance results are drawn in [8] to support this claim. Moreover, this method only considers the anterior view of lions, making it vulnerable to situations in which animal faces are not available. In [10], a Haar-based method that builds visual models of animals is proposed. This method is based on the assumption that animals can be represented by many segments signifying different body parts, as already performed by part-based methods to detect humans in [5] and [11]. To identify animals in [10], three different techniques are used: 1) histograms of textons; 2) intensity-normalized patch pixel values; and 3) the scale invariant feature transform (SIFT) descriptor. Three classifiers are also tested: 1) K-way logistic regression; 2) SVMs; and 3) K-nearest neighbors (KNN). It is reported that KNN, with K = 1, performs better than SVM and K-way logistic regression. Despite the good performance of the proposed system, some limitations are still found. For instance, its application is restricted to lateral views of animals. Additionally, this system is only appropriate for frames with mono-targeted animals. The authors also use SIFT, which is considered slow compared to the HOG method. This is because SIFT is a local descriptor which only computes the gradient histogram for blocks around specific interest points, while HOG is computed for an entire image. HOG also performs better than SIFT in terms of its false positive rate (FPR). Haar-like features are used to detect and identify African penguins by their chest pattern in [12], based on the presence of black chest spots in adult penguins. The AdaBoost algorithm is used to train the penguins' features. The system proposed in [12] does not work for penguins that change their feather pattern, for penguins with extraordinary patterns, or for posterior views.

A two-stage strategy architecture is developed in [7]. It begins by segmenting images into many regions according to their grayscale values. Contours of animals are found using a contour-finding function. After the generation of ROIs, they are resized in such a way that they may contain the contour of animals. In the second stage, HOG has been applied to detect animals without any adaptation, leading to unoptimized results. In this paper, an extensive set of experiments is conducted to select the best parameters of HOG that yield optimized results. In [7], HOG features are extracted from ROIs instead of the entire image. For animal identification, a linear SVM classifier is used. In general, the system proposed in [7] fails to detect animals at longer distances. Moreover, directly applying HOG without tuning its parameters leads to unoptimized results. Instead of extracting Haar features from gray or color images as performed originally in [13], Zhang et al. [14], [15] extracted Haar features from four channels to capture local patterns. These features, which are essentially Haar-like features, are renamed in [15] as Haar of oriented gradients (HOOG). HOOG is used to handle the shape and variation in texture of the animal head. The classification stage jointly captures shape and texture features; a second step, called deformable detection, is then performed. The role of this second step is to handle the spatial misalignment observed between the output of shape and texture detectors. The results shown in [14]

Fig. 1. Scenario of animal detection using roadside cameras. When a roadside camera detects an animal, it sends a warning message to the flashing system.

Fig. 2. Example (video) of large animal detection using our architecture proposed in this paper.

are very promising, since they consider both texture and edge information. However, this paper is validated only by using still images (obtained from the Internet) of animals' heads. Aside from texture and shape, color hue can also be used to detect animals, particularly in daylight. For instance, the work in [16] starts by preprocessing the input images to reduce the amount of processed data. The color space Luv is used, and the mean-shift clustering algorithm is applied to the preprocessed images to perform color segmentation. This approach uses training images to obtain a color model for an animal. Unfortunately, this method is only applicable to daytime conditions, and does not seem adaptable to night conditions. Khorrami et al. [17] proposed a method that is capable of detecting multiple types of animals using principal components analysis (PCA). PCA is a mathematical technique used to reduce data dimensionality. After detecting animals, local entropy and connected component analysis are used to isolate the foreground containing animals from the background. At the end, large displacement optical flow is applied to ensure that areas in the frames correspond to large changes in velocity. The mentioned works indicate that gradient features can also be used alongside color or texture features. This is due to the fact that gradient features, such as HOG, are invariant to scale and illumination, and are hence well suited to nighttime applications.

III. ANIMAL DETECTION SCENARIO

In this paper, the proposed architecture can be implemented in a dashboard system or in roadside units (RSUs). Stationary cameras are installed at the roadside; when a large animal enters their field of view, the cameras detect the animal and notify upcoming vehicles through flashing signs installed on the roadside (see Fig. 1). Approaching drivers, after seeing the flashing signs, immediately reduce their vehicle's speed and make appropriate decisions to avoid a collision. If a dashboard camera (see Fig. 2) is installed inside the vehicle, flashing signs are not necessary. The dashboard camera will detect animals crossing the roadway, and will notify the driver through a warning message.

We have successfully tested our proposed architecture on many videos and images taken from moving dashboard and RSU cameras. The detection of road-faring animals by stationary or mobile cameras is a challenge due to several undesirable factors. These factors are mainly related to the surrounding environment, camera resolution, vehicle mobility, and the large intraclass variability between different types of large animals.

The detection of large animals imposes several new challenges compared to the detection of human faces or pedestrians. Human faces have relatively fixed texture features that can be accurately described by Haar features, as shown in [9]. The HOG descriptor was originally developed to detect pedestrians, since their outlines are nearly invariable even when they are walking in different directions. It is shown in [18] that HOG outperforms most object detection algorithms. With large animals, however, colors and outlines vary greatly. Compared to human faces and heads, which are almost unique and standard, animal faces and heads have greater variation in appearance. Furthermore, human faces are characterized by skin texture, while the texture of animals' faces is more varied and complex [14]. Moreover, the body of a large animal exhibits high variability within the same class (e.g., moose), and between animals of different classes. This is due to the fact that animals possess specific properties including texture, height, shape, and different views (posterior, anterior, and lateral) that distinguish them from others. The effective use of those features to detect large animals is a challenging issue, and is largely discussed in this paper.

IV. DATASET CREATION

We created our dataset (called LADSet) from a large set of images and videos collected mainly from the Internet [6]. Collecting videos directly from nature is challenging, and to the best of our knowledge, no such public dataset exists. Approximately 20 h of various videos were selected, containing images of large animals such as moose, elk, horses, cows, and deer in residential areas, zoos, or forests. Particular interest was given to large animals crossing rural roads and highways, and to videos recorded from moving vehicles. This reflects real situations, and helps improve system performance. Moreover, videos with different weather conditions (rain, sun, and snow) are considered. The collected videos were downloaded from video websites (e.g., YouTube and Youku) and converted to image format using the ratio of 1:4 continuous video frames to avoid repeated sampling. For nighttime conditions, a small dataset was constructed; it is explained in Section IX.
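As an illustration of the 1:4 sampling step described above, the following minimal OpenCV sketch keeps every fourth frame of a downloaded video. The file names are hypothetical, and the exact decoding pipeline used by the authors is not specified in the paper.

```python
import cv2

def sample_frames(video_path, out_pattern, step=4):
    """Keep one frame out of every `step` consecutive frames (1:4 sampling)."""
    cap = cv2.VideoCapture(video_path)
    kept, index = 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:            # 1:4 ratio -> keep frames 0, 4, 8, ...
            cv2.imwrite(out_pattern % kept, frame)
            kept += 1
        index += 1
    cap.release()
    return kept

# Hypothetical paths, for illustration only.
# sample_frames("moose_highway.mp4", "frames/moose_%05d.jpg")
```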

Fig. 3. HTR shape category images taken from LADSet.

Fig. 4. HTL shape category images taken from LADSet.

A North American moose stands 1.4–2.1 m high at shoulder height; the length from head to tail, however, lies between 2.4–3.2 m [19]. Other animals such as black-tailed deer, elk, donkeys, young horses, and caribou have similar body ratios, and are considered in our dataset. Based on these general characteristics, we select the image aspect ratio that matches animals' natural sizes. Hence, the image aspect ratio width:height = 7:5 is adopted. The size of the images collected from the Internet varies from 42 × 30 to 560 × 400 pixels. These images are normalized to 21 × 15, 28 × 20, 35 × 25, and 56 × 40 pixels. All images are annotated and aligned manually. We noticed that each image should contain a 10% boundary area around the body of the animal (around 5–7 pixels for the size of 56 × 40 pixels), which allows for the use of edge detectors [20].

Unlike pedestrian datasets (see [21], [22]), which take into consideration very limited human poses, the process of designing a LADSet is a challenging task; this is because it should consider different categories of large animals with diverse postures. Therefore, as a first step toward the creation of this dataset, in this first version we consider the following two main categories of lateral-view shapes shown in Figs. 3 and 4: "head to left" (HTL) and "head to right" (HTR). The justification behind this choice is that these postures are the most frequently encountered shapes on roads. However, some uncommon postures of animals, such as the posterior view, the anterior view, and the reclining position, will be considered in a future version of our dataset. This is because these animal postures are rarely encountered on roads. The current version of the LADSet contains training and testing datasets, which include positive and negative images.

A. Positive Animal Dataset Images

After completing the video collection, we cut the standard image from the sampled video frames using the aspect ratio (width:height = 7:5). As previously mentioned, the window size was determined by the size of the animal's body, plus 10% of the margin. Our positive training dataset mainly includes moose and deer, in addition to some other large animals such as horse, caribou, etc. It contains 3462 images equally divided into the HTL and HTR categories (see Figs. 3 and 4).
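The cropping rule above (a 7:5 window plus a 10% margin) can be sketched as follows. The helper below is an illustrative reconstruction, not the authors' annotation tool, and assumes a hand-labeled animal bounding box is available.

```python
import cv2

def crop_positive(frame, box, margin=0.10, out_size=(56, 40)):
    """Crop an annotated animal with a surrounding margin and
    normalize it to a 7:5 (width:height) training window."""
    x, y, w, h = box                       # hand-labeled body box
    # grow the box by `margin` on every side
    x0 = max(0, int(x - margin * w))
    y0 = max(0, int(y - margin * h))
    x1 = min(frame.shape[1], int(x + w * (1 + margin)))
    y1 = min(frame.shape[0], int(y + h * (1 + margin)))
    crop = frame[y0:y1, x0:x1]
    # force the 7:5 aspect ratio by resizing (56x40, 28x20, ...)
    return cv2.resize(crop, out_size, interpolation=cv2.INTER_AREA)
```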
Fig. 5. Example of images taken from the negative dataset.

B. Negative Animal Dataset Images

A good negative dataset helps improve the performance of the detector by reducing the set of false positives. The most crucial point is that the target object (i.e., animals) is not included in the negative dataset. Any other animals, such as small dogs, cats, etc., are also excluded from the negative dataset, because they may have similar shapes to large animals. In addition, images that contain road objects such as vehicles, pedestrians, traffic signs, road surfaces, grass, houses, trees, etc., are considered in the dataset. These images should cover different types of illumination and other environmental conditions. We collected around 10 000 background images; 7000 were used in the training process, and 31 712 small negative images were used for false positive testing. A sample of the negative dataset is shown in Fig. 5.

When comparing detectors in Section VII, we divide the testing samples into ten nonoverlapping subsets; that is, ten groups of 711 positive images for computing true detection/miss rates and ten groups of 3171 negative images for computing false positive detection rates. The average mean values of each detector are recorded and compared accordingly, as shown in Table II.

TABLE I. Different parameters of the HOG descriptors used to select a combination that yields the best performance results for detecting animals. The results of video test speeds come from the same video (length: 57 s, size: 320 × 240, total frames: 1027).

Fig. 6. Image representation of the MB-LBP feature with a block size of 3 × 2. Compared to the original LBP with 3 × 3 pixels, MB-LBP can capture large-scale structures.

V. FEATURES EXTRACTION

In this paper, we used three features to describe large animals: LBP, HOG, and Haar. A brief description of these features is given in the following sections [6].

A. LBP Features

The LBP is a simple and powerful texture descriptor used to discover and summarize local patterns in frames. Each pixel is described by its relative gray level to its direct neighboring pixels using the original basic version of LBP, or to indirectly neighboring pixels using the extended LBP version. If the intensity of the neighboring pixel $p_i$ is lower than the intensity of the center pixel $p_c$, it is set to zero; otherwise, it is set to one. Consequently, each pixel is represented by a binary code. For instance, for a region of 3 × 3 pixels, the LBP code of its center pixel $(x_c, y_c)$ is expressed in its decimal form as $\mathrm{LBP}(x_c, y_c) = \sum_{i=0}^{7} S(p_i - p_c) \times 2^i$, where $S(v) = 1$ if $v \geq 0$, and $0$ otherwise.
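To make the definition concrete, the following sketch computes the basic LBP(8,1) code of one pixel exactly as in the equation above; it is illustrative only and ignores image borders.

```python
import numpy as np

# Offsets of the eight neighbors p_0..p_7, enumerated clockwise from the
# top-left pixel (any fixed enumeration order defines a valid code).
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
             (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(gray, y, x):
    """Decimal LBP(8,1) code of pixel (x, y): sum_i S(p_i - p_c) * 2^i."""
    pc = int(gray[y, x])
    code = 0
    for i, (dy, dx) in enumerate(NEIGHBORS):
        if int(gray[y + dy, x + dx]) >= pc:    # S(v) = 1 when v >= 0
            code |= 1 << i
    return code

img = np.random.randint(0, 256, (5, 5), dtype=np.uint8)
print(lbp_code(img, 2, 2))   # value in [0, 255]
```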
More distinctive block features, called multiblock LBP (MB-LBP) features, were proposed in [23]. Here, the authors applied MB-LBP and AdaBoost learning methods to face detection, which perform better than the original LBP features and Haar-like features. The basic idea of MB-LBP is that the simple binary difference rule that works on a single pixel is transferred to a block, which may have a different size. This means that the MB-LBP operator is defined by comparing the central block intensity $b_c$ with those of its eight neighborhood blocks $b_0, \ldots, b_7$ (see Fig. 6). The size of the blocks considered in this paper varies from 1 × 1 to 5 × 5 pixels. Furthermore, we only apply LBP(8,1) as the operator of MB-LBP. This is because the larger blocks have very limited descriptive functions compared with adjacent blocks. The final binary sequence can be obtained as $\mathrm{MB\text{-}LBP} = \sum_{i=0}^{7} S(b_i - b_c) \times 2^i$, where $b_c$ is the average intensity of the central block and $b_i$ $(i = 0, \ldots, 7)$ is the average intensity of the neighbor block $i$.
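A sketch of the MB-LBP operator follows: block averages are taken over an integral image, so each of the nine blocks costs four lookups. The default block size and the helper names are ours, chosen for illustration.

```python
import numpy as np

def block_sum(ii, y, x, h, w):
    """Sum of gray[y:y+h, x:x+w] in four lookups on integral image ii."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def mb_lbp_code(gray, y, x, h=3, w=2):
    """MB-LBP code for the 3x3 grid of h-by-w blocks whose central block
    has its top-left corner at (y, x). Comparing block sums is equivalent
    to comparing block averages, since all blocks have the same area."""
    ii = np.pad(gray, ((1, 0), (1, 0))).astype(np.int64).cumsum(0).cumsum(1)
    bc = block_sum(ii, y, x, h, w)                  # central block
    code, i = 0, 0
    for dy in (-h, 0, h):
        for dx in (-w, 0, w):
            if dy == 0 and dx == 0:
                continue                            # skip the central block
            if block_sum(ii, y + dy, x + dx, h, w) >= bc:   # S(b_i - b_c)
                code |= 1 << i
            i += 1
    return code

gray = np.random.randint(0, 256, (20, 20), dtype=np.uint8)
print(mb_lbp_code(gray, 6, 6))   # code in [0, 255]
```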
In this paper, we consider MB-LBP for animal detection, and we use the AdaBoost algorithm in order to select an optimal set of local regions and their weights (see Section VI). By doing this, a smaller MB-LBP feature set, which represents animals, may be generated compared to instances in which earlier versions of LBP are used.

B. HOG Features

HOG [18] is a gradient-based method initially proposed for the detection of pedestrians. Usually, the first step of HOG is gamma and color normalization of the input image. The horizontal and vertical gradients of each pixel are then computed using the 1-D mask [−1 0 1], as shown in [18]. The image is then divided into a set of cells with a size of 8 × 8 pixels. After that, four adjacent cells are regrouped to form a block. For each cell, the histogram of gradients with nine orientation bins is computed for later use as a descriptor block. This is performed by accumulating votes into bins for each orientation. The vote is weighted by the magnitude of the gradient at each pixel. Finally, when all histograms are computed, the descriptor vector is built into a single vector, and cell histograms are normalized. For normalization purposes, cell histograms are organized into blocks of 16 × 16 pixels. L1-norm, L1-sqrt, L2-norm, and L2-hys can be applied to normalize the gradient intensity (to make the feature vector space robust to local illumination changes). Once this normalization step has been performed, all the histograms can be concatenated into a single feature vector.

Fig. 7. Comparison between different parameters of HOG.

The parameters of HOG, as defined in [18], yield a good performance when used to detect pedestrians. In this paper, we vary the parameters of HOG in order to obtain the combination that yields the best performance in detecting large animals using many videos and images (see Table I). We show in Table I and Fig. 7 that HOG1 performs better than HOG2–HOG5 in terms of both detection speed and low FPRs. On the other hand, HOG3 has the highest true positive rate; unfortunately, it is extremely slow (343 ms) compared to HOG1, HOG2, HOG4, and HOG5. Its FPR is also high, thus it does not meet the requirements of our entire system. Hence, the HOG1 parameters are used in our experiments.
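For reference, the cell/block/bin layout described above can be instantiated directly with OpenCV's HOGDescriptor. The window size below matches the 56 × 40 training images; the remaining values (16 × 16 blocks, 8 × 8 stride, 8 × 8 cells, 9 bins) follow the parameters reported in Section VIII, and we make no claim that this reproduces the authors' exact implementation.

```python
import cv2
import numpy as np

# 56x40 detection window; 16x16 blocks; 8x8 block stride; 8x8 cells; 9 bins.
hog = cv2.HOGDescriptor((56, 40), (16, 16), (8, 8), (8, 8), 9)

window = np.random.randint(0, 256, (40, 56), dtype=np.uint8)
vec = hog.compute(window)
print(vec.size)   # 24 block positions x 4 cells x 9 bins = 864 values
```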

Fig. 8. Example of Haar features shown in the detection window, for instance: edge features (a, d), line features (b, c, e), and center-surround features (f) [9], [13], [24].

C. Haar Features

Haar-like features are a set of 2-D Haar functions used to encode the texture of objects [13] in images. Each Haar-like feature consists of at least two adjacent "black" and "white" rectangles. The value of a Haar-like feature is then found by computing the difference between the sums of the pixel values within the black and white rectangular regions. Originally, two kinds of Haar features were introduced in [13], as shown in Fig. 8(a) and (b). These features were first extended in [9] to include a third feature, as shown in Fig. 8(c); in [24], they were extended to contain tilted (45°) Haar-like features, as shown in Fig. 8(d)–(f).

The set of the basic Haar features extracted from a given image is extremely large. For instance, in a sub-window of 24 × 24 pixels, more than 45 000 Haar features of the types in Fig. 8(a)–(c) can be computed. A convenient and fast method for computing the huge number of Haar-like features is through the integral image. In this paper, the features shown in Fig. 8(a) and (b), and a tilted (90°) version of them, are used to represent animals. After that, we trained them using the AdaBoost algorithm as performed in [9] to detect large animals.
VI. C LASSIFICATION AdaBoost algorithm on MB-LBP features, it will be difficult
In this paper, AdaBoost is used to separately train the three to use the threshold classification function, since the value
aforementioned features: 1) Haar; 2) MB-LBP; and 3) HOG. of MB-LBP features is nonmetric. As performed in [23], the
AdaBoost is extremely simple to use and to implement, and weak classifier known as the decision tree or regression tree is
often yields very effective results. applied. Hence, the multibranch tree is adopted to design the
weak classifiers based on MB-LBP features. The multibranch
A. AdaBoost Algorithm tree has 256 branches, each of which correspond to a certain
Given T weak classifiers ht (each of them represented by discrete value of MB-LBP features [23]. The weak classifier
one feature: HOG, MB-LBP, or Haar) learned through an iter- (in the case of MB-LBP features) is defined as

ative process, the strong classifier is then formed through a ⎪
⎪ a0 , if xk = 0


linear combination of weak classifiers ⎨ · · ·
 T 
 ht (x) = aj , if xk = j (3)


H(x) = sign αt .ht (x) (1) ⎪···


t=1 a255 , if xk = 255
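Equation (1) amounts to a weighted vote, which the few lines below evaluate for an already-trained ensemble. Since the weak outputs here are 0/1, the sign test in (1) is usually instantiated as a comparison against half the total weight, as in [9]; the sketch follows that convention, and the stump parameters are placeholders.

```python
def strong_classify(x, weak_classifiers, alphas):
    """Weighted vote of Eq. (1). Each weak classifier returns 0 or 1;
    following the usual Viola-Jones convention, the window is accepted
    when the weighted vote reaches half of the total weight."""
    score = sum(a * h(x) for h, a in zip(weak_classifiers, alphas))
    return 1 if score >= 0.5 * sum(alphas) else 0

# Toy usage with two hypothetical threshold stumps on feature values.
stumps = [lambda x: 1 if x[0] < 10 else 0,
          lambda x: 1 if x[1] > 3 else 0]
alphas = [0.8, 0.4]
print(strong_classify((5, 7), stumps, alphas))   # -> 1
```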
B. AdaBoost Training

1) Weak Classifiers: As we know, the sets of Haar, MB-LBP, and HOG features computed over a given image are extremely large. For instance, in a sub-window of 20 × 20 pixels, we may find 45 891 Haar-like features, 3600 MB-LBP features, and 576 HOG features. Although the numbers of MB-LBP and HOG features are much lower than that of Haar-like features, the three descriptors contain a considerable amount of redundant information. For instance, the features that only appear in the background are useless. The AdaBoost algorithm is then used to select significant features and to construct a powerful binary classifier.

In this paper, each single feature (i.e., Haar, HOG, or MB-LBP) is used as a weak classifier to separate positive from negative images. For each weak classifier, an optimal threshold classification function is defined with the purpose of maximizing the classification ability and minimizing the number of misclassified sub-windows. For that purpose, we define a weak classifier by $h_t$; this consists of a feature $f_t$, a threshold $\theta_t$, and a coefficient factor $p_t$ which indicates the direction of the inequality sign ("<" or ">" between $f_t$ and $\theta_t$), as performed in [25]. Equation (2) defines the optimal threshold classification function for a weak classifier as follows:

$h_t(x) = \begin{cases} 1, & \text{if } p_t f_t(x) < p_t \theta_t \\ 0, & \text{otherwise} \end{cases}$  (2)

where $x$ is a sub-window of an input image and $f_t$ indicates the $t$th Haar feature or the HOG histogram bin value (we use $H_{k,t}$ instead of $f_t$ to denote the $t$th histogram bin value in the $k$th cell when extracting HOG weak classifiers). If we apply the AdaBoost algorithm to MB-LBP features, it is difficult to use the threshold classification function, since the value of MB-LBP features is nonmetric. As performed in [23], the weak classifier known as the decision tree or regression tree is applied instead. Hence, the multibranch tree is adopted to design the weak classifiers based on MB-LBP features. The multibranch tree has 256 branches, each of which corresponds to a certain discrete value of MB-LBP features [23]. The weak classifier (in the case of MB-LBP features) is defined as

$h_t(x) = \begin{cases} a_0, & \text{if } x^k = 0 \\ \;\vdots \\ a_j, & \text{if } x^k = j \\ \;\vdots \\ a_{255}, & \text{if } x^k = 255 \end{cases}$  (3)

where $x^k$ is the $k$th element of the feature vector $x$ and $a_j$ $(j = 0, \ldots, 255)$ are regression parameters learned in the AdaBoost training process. We calculate the best tree-based weak classifier just as we would learn a node in a decision tree [23]. The minimization of (3) gives the following parameters:

$a_j = \dfrac{\sum_i w_i \, y_i \, \delta\!\left(x_i^k = j\right)}{\sum_i w_i \, \delta\!\left(x_i^k = j\right)}.$  (4)

Obviously, the parameters $a_j \in [-1, +1]$; $a_j > 0$ indicates that the $t$th MB-LBP feature extracted from a positive sample is greater than that extracted from a negative sample. Thus, when $h_t(x)$ is greater than a threshold $T_{\mathrm{LBP}}$, we say that the input window $x$ represents a true animal. Otherwise, it is not considered to be an animal.
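To illustrate (4), the sketch below fits the 256 regression values of one multibranch stump from weighted samples. Labels are taken as $y_i \in \{-1, +1\}$ so that each $a_j$ is the weighted mean label of the samples falling in branch $j$; this is our reading of the formula, consistent with $a_j \in [-1, +1]$.

```python
import numpy as np

def fit_multibranch_stump(codes, labels, weights):
    """Regression parameters a_j of Eq. (4) for one MB-LBP feature.
    codes:   per-sample MB-LBP value x_i^k in [0, 255]
    labels:  y_i in {-1, +1}
    weights: AdaBoost sample weights w_i"""
    a = np.zeros(256)
    for j in range(256):
        mask = codes == j                    # delta(x_i^k = j)
        denom = weights[mask].sum()
        if denom > 0:
            a[j] = (weights[mask] * labels[mask]).sum() / denom
    return a                                 # each a_j lies in [-1, +1]

codes = np.random.randint(0, 256, 1000)
labels = np.random.choice([-1, 1], 1000)
weights = np.full(1000, 1e-3)
a = fit_multibranch_stump(codes, labels, weights)
print(a.min(), a.max())
```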
2) HOG-AdaBoost Algorithm: With the SVM classifier, all the 36N HOG values ($N$ is the number of blocks; assuming each block contains four cells divided into nine bins, the final HOG feature vector has dimension $N \times 4 \times 9 = 36N$) are extracted to participate in the classification process. However, in the case of the AdaBoost algorithm, only a small set of histogram values, known as weak classifiers, are used. This means that each single histogram value in one bin of a cell has classification capabilities. Actually, in each cell, we have nine weak classifiers, each of which corresponds to one bin. The AdaBoost algorithm aims to pick the most powerful weak classifiers from the 36N histogram bins. We set a threshold $\theta_t$ for each bin value; we then compared the value $H_{k,t}$ (which indicates the $t$th histogram bin value in the $k$th cell) of the input image to the threshold $\theta_t$ (which corresponds to the $t$th feature); this was done based on (2), using $h_{k,t}(x) = 1$ if $p_t H_{k,t}(x) < p_t \theta_t$, and $0$ otherwise. Finally, we combined the selected weak classifiers into a strong final one. A number of trained, strong AdaBoost classifiers can be linked by a "cascade" algorithm to get a more efficient and accurate classifier, as explained in the next section.
3) Cascade of AdaBoost Classifiers: A cascade of classifiers was constructed in order to improve detection performance and reduce processing time. The word cascade means that the resulting classifier consists of several simple classifiers applied subsequently to an input window, until the recognition or rejection of the target takes place. In this paper, the AdaBoost training algorithm was applied to each stage. The key principle is that the simple, but efficient, AdaBoost classifiers are arranged at early stages to reject many negative sub-windows and to accept almost all positive sub-windows. The subsequent complicated and strong classifiers aim to achieve low FPRs. Algorithm 1 shows the pseudo-code of the cascade classifier [9].

Algorithm 1: Cascade Classifier
  Input: detection sub-window x
  Output: detection result (positive or negative)
  for i = 1 to N do                      // the N stages
      S_i <- 0                           // score of stage i
      for t = 1 to T do                  // the T weak classifiers of stage i
          S_i <- S_i + alpha_t * h_t(x)
      end
      if S_i < Threshold_i then
          return NEGATIVE                // rejected: skip the remaining stages
      end                                // otherwise continue to stage i+1
  end
  return POSITIVE                        // accepted by all N stages

4) Classifier Training: We assume that we need to train a classifier with $N$ stages. The FPR and the detection rate of the cascade are then $F = \prod_{i=1}^{N} f_i$ and $T = \prod_{i=1}^{N} t_i$, where $F$ and $T$ are the desired FPR and the accuracy rate, respectively; $f_i$ and $t_i$ are, respectively, the FPR and the detection rate of the $i$th stage classifier. If we intend our classifier to achieve a detection rate of 90% for the whole 20 stages, the minimum detection rate of each stage should be $0.9^{1/20} \approx 0.995$. At the same time, if we define the FPR of each classifier stage as ≤ 50%, the maximum overall FPR should be less than $0.50^{20} \approx 0.95 \times 10^{-6}$. This is considered to be a very low FPR.
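These two per-stage requirements can be checked with a couple of lines of arithmetic:

```python
stages = 20
target_detection = 0.90
per_stage_fpr = 0.50

min_stage_rate = target_detection ** (1 / stages)   # 0.9^(1/20) ~ 0.9947
overall_fpr = per_stage_fpr ** stages               # 0.5^20  ~ 9.5e-07
print(round(min_stage_rate, 4), overall_fpr)
```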
Note that the $F$ and $T$ rates are not the final performance rates of our architecture; they are estimated rates obtained from the training dataset. Recall that in this paper, Haar, MB-LBP, and HOG features are trained separately on our image dataset using the AdaBoost learning algorithm. These classifiers are referred to in this paper as "Haar-AdaBoost," "LBP-AdaBoost," and "HOG-AdaBoost."

Fig. 9. Number of features chosen by each stage for the three classifiers.

Fig. 9 shows the number of features selected at each stage. The classifier HOG-AdaBoost uses the largest number of weak classifiers in each stage; the LBP cascade classifier uses fewer than ten features in earlier stages, and around 17 in later stages; and the Haar detector chooses from 15 to 56 features. Before we conducted a comparison, we observed that LBP features were the most powerful and efficient features compared to HOG and Haar.

VII. COMPARISON BETWEEN DETECTORS

To evaluate the performance of each detector, two main criteria were considered: 1) detection accuracy and 2) processing time. We tested these detectors on more than 7000 testing still images and on 13 sequences of video (see Figs. 2 and 16–18) under different weather conditions. The experiments were performed on an Intel Core i5-2450M 2.50 GHz dual-core with 4 GB of RAM.

A. Processing Time

In order to measure the time required to recognize animals in a given scene, otherwise known as processing time, we chose continuously playing video instead of single images to simulate a real-time situation (as shown in Figs. 2, 17, and 18). We tested each detector (i.e., HOG-AdaBoost, Haar-AdaBoost,

Fig. 10. Processing time per frame: input video size: 320 × 240; total frame number: 1026; initial (minimal) detection window size: 56 × 40; scale rate: 1.05.

Fig. 11. Average processing time of HOG-AdaBoost, Haar-AdaBoost, MB-LBP-AdaBoost, and HOG-SVM.

Fig. 12. Video detection results using (from left to right) Haar-AdaBoost, HOG-AdaBoost, and LBP-AdaBoost.

Fig. 13. Evaluation results of HOG-AdaBoost, Haar-AdaBoost, MB-LBP-AdaBoost, and HOG-SVM.

and MB-LBP-AdaBoost) on each video to record the processing time required to recognize animals. We then plotted Figs. 10 and 11. Fig. 10 illustrates the specific detection time needed for each frame; Fig. 11, however, shows the average processing time of each detector, calculated from Fig. 10.

Fig. 11 shows that the Haar-AdaBoost detector consumes the most time (89.0 ms) compared to the other two detectors; conversely, HOG-AdaBoost performs the best (49.1 ms), although it is faster than LBP-AdaBoost by just 8 ms per frame when only considering the speed criterion. The explanation for this finding can be understood as follows. In fact, calculating a single feature (Haar or LBP) is more complicated than calculating an HOG feature when the HOG block size is very small. Moreover, LBP-AdaBoost is faster than Haar-AdaBoost for two reasons. First, LBP is a binary feature compared to Haar, which is integer-based. Extracting LBP features is, therefore, less time-consuming than extracting Haar features. Second, the total number of features in the LBP cascade classifier is much lower than in the Haar cascade classifier (see Fig. 9). We note that even HOG-AdaBoost has a significant number of false positive results (see Fig. 12), which reduces its speed; however, it remains the fastest option. Our first observation can be summarized as follows. Besides the "minimum initial window size" and "scale rate," the speed of the cascaded classifiers is directly related to the image feature types, the number of features (weak classifiers) triggered per sub-window, and the quantity of final "positive" results.

In Fig. 11, we also show the performance of the three detectors compared to the well-known HOG-SVM. The processing time required for HOG-SVM to detect animals seems to be higher than that of the three detectors.

B. Detection Accuracy

We ran each classifier several times on positive and negative images of LADSet, separately, with different scale rates. We then recorded whether the positive sample was identified, and how many false positive results were obtained. After several experiments, we plotted the detection error tradeoff (DET) curves on a log-linear coordinate, i.e., the "miss rate versus false positives per image (FPPI)" curve of each detector, as illustrated in Fig. 13.
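For clarity, the sketch below shows one way such a DET point can be computed from raw counts; the bookkeeping is not spelled out in the paper, and the numbers in the usage line are illustrative only (using the 7110-positive/31 712-negative test split described in Section IV).

```python
def det_point(missed_positives, total_positives,
              false_positives, total_negative_images):
    """One point of a 'miss rate versus FPPI' DET curve."""
    miss_rate = missed_positives / total_positives
    fppi = false_positives / total_negative_images
    return miss_rate, fppi

# e.g., 2133 misses out of 7110 positives, 3 false alarms on 31 712 negatives
print(det_point(2133, 7110, 3, 31712))   # -> (0.3, ~9.5e-05)
```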
Fig. 13 shows that the LBP-AdaBoost and Haar-AdaBoost detectors outperform HOG-AdaBoost, especially for low FPPI values. Meanwhile, the former detector performs slightly better than the latter when the FPPI rate is greater than $10^{-3}$. In addition, the LBP-AdaBoost detector has a relatively low miss rate of around 30% compared to the 45% miss rate of

TABLE II. Some information comparisons for Haar, LBP, and HOG.

the Haar-AdaBoost, when the FPPI rate is less than $10^{-4}$. We have also noticed that the Haar-AdaBoost classifier curve can achieve an extremely low miss rate; at this point, the system can recognize 7030/7110 positive samples, but has more than 3000 false positive results in 31 712 negative images.

On the other hand, if the FPPI is greater than $10^{-3}$, the LBP-AdaBoost and Haar-AdaBoost methods have comparable miss rates. However, as Haar-AdaBoost consumes much more time than LBP-AdaBoost (as shown in Figs. 10 and 11), and has a significant number of false positives (see Table II), LBP-AdaBoost can be considered the best solution, followed by Haar-AdaBoost. The video detection results of Fig. 12 show a clear difference in performance between the three detectors. The LBP-AdaBoost detector wins the comparison in terms of true positive results and false negative misjudgments. In addition, the Haar-AdaBoost detector obtains better results than the HOG-AdaBoost detector, particularly when considering FPRs. The three aforementioned schemes are also compared to the well-known HOG-SVM in terms of accuracy of detection (see Fig. 13). Again, LBP-AdaBoost seems to be the most accurate scheme compared to the three other detectors. On the other hand, HOG-SVM only outperforms the HOG-AdaBoost detector. If we associate the evaluation results of Fig. 13 with Fig. 9, we find that LBP features are very efficient at animal detection: only 219 LBP features applied in 18 stages achieve the highest detection rate.

VIII. PROPOSED ARCHITECTURE

We want to emphasize that due to the large diversity of animal types and postures, compared to human faces or pedestrian postures, it is hard to develop detectors for all large animals at once. As a first attempt, we propose a two-stage architecture in this paper that can detect some large animals, as highlighted in Section IV. We begin this section with a qualitative comparison between the aforementioned detectors, which clarifies the fundamental reasons for, and advantages of, our proposed two-stage architecture for detecting large animals. The HOG-SVM and HOG-AdaBoost methods are less beneficial than the LBP-AdaBoost and Haar-AdaBoost algorithms if they are applied to the entire image (in the first stage), for the following reasons. In Fig. 13, we show that HOG-AdaBoost and HOG-SVM have the highest FPRs compared to Haar-AdaBoost and LBP-AdaBoost. The application of these two detectors might have a severe impact on our system. Furthermore, as we are dealing with real-time systems, HOG-SVM has the highest processing time compared to the other schemes, as shown in Fig. 11. Therefore, only LBP-AdaBoost and Haar-AdaBoost can be applied to the entire image (first stage) [6].

Now, if we compare LBP-AdaBoost and Haar-AdaBoost to determine which is best suited as a first stage, we select LBP-AdaBoost because of its low FPR and noticeably superior processing time, despite its lower true positive rate. Moreover, we show in Table II a quantitative comparison between the three features. Again, we observe that the use of LBP yields few false positives. The second stage of our architecture is used to eliminate the false positive results extracted by the first stage. This can dramatically increase the final detection accuracy rate. The stages' order is driven by the accuracy of detection and processing time.

Fig. 14. Architecture design of the two-stage animal detection system.

With the two-stage architecture, the whole image is scanned in the first stage to yield ROIs that possibly contain animals. After a preprocessing step, which involves adapting the size of the ROIs to the requirements of the second stage, these resized ROIs are then scanned by a second stage to verify whether or not they contain animals. The first stage of our proposed scheme can be considered as a "detector" and the second stage as a "classifier" or "recognizer."

A. Architecture Design

We design our system with two main criteria in mind: accuracy of detection and processing time. This was performed in order to decrease FPRs and to increase the accuracy of the system. Moreover, in order to obtain a real-time detection system, the processing time is used as a second design criterion. This enables quick target detection. To achieve these criteria, a two-stage system is suggested in Fig. 14.

In the first stage, we apply a fast detection algorithm, which supplies the second stage with a set of ROIs that may contain animals and other similar objects (false positive targets). To fulfil the system requirements, the detector of the first stage should operate simply and quickly, because it is applied to the entire input frame. We chose LBP-AdaBoost for the first stage, since AdaBoost is quicker to reject false targets and less complex than SVM. Moreover, it is shown in Section VII and Fig. 13 that LBP-AdaBoost has good performance results compared to other schemes. The first stage, LBP-AdaBoost, is applied to the entire image to obtain ROIs that may contain animals. Furthermore, this first stage uses a single classifier trained by both animal-dataset categories (HTL and HTR). Conversely, the second stage uses two parallel sub-classifiers.
Fig. 15. Example showing that an ROI detected by the first-stage classifier (red rectangle) does not cover the minimum detectable rectangle size (green rectangle) of the second-stage classifiers; this is the case unless the ROI is extended to the blue region.

Fig. 16. Experimental results of large animal detection performed on still images in snowy, sunny, cloudy, and nighttime conditions.

Each sub-classifier uses HOG-SVM because of its excellent performance in contour detection, and its strong ability to eliminate false positive results obtained from the first stage. Each sub-classifier of the second stage was used to recognize a category of animals, HTL or HTR (see Figs. 3 and 4). This means that the two sub-classifiers are trained by the HTL and HTR datasets separately. Moreover, HOG-SVM is used in the second stage rather than Haar-SVM because the training time of Haar-based methods is high compared to the method using HOG, as shown in Table II. This is basically due to the number of features per window. For instance, in a window of size 28 × 20, the number of Haar features is greater than 200 000, while there are only 560 HOG features. Nighttime detection using Haar-AdaBoost also seems to be the poorest, as shown in Section IX (Fig. 21). This is due to the fact that HOG excels in contour detection over the Haar-based method. We emphasize that we have adapted the parameters of HOG, developed originally for pedestrians, to the context of animal detection. Specifically, the following parameters were selected after a set of experiments: block size: 2 × 2 cells (16 × 16 pixels); stride size: 8 × 8; cell size: 8 × 8; and 9 bins.

According to our experiments, the second stage, HOG-SVM, usually recognizes ROI rectangles larger than those generated by the first stage, LBP-AdaBoost. The unfortunate result of this was that the ROIs (the red ROIs shown in Fig. 15) obtained from the LBP-AdaBoost stage were not large enough to be detected by the second-stage HOG-SVM (green rectangle in Fig. 15). To solve this problem, a preprocessing step was involved that consisted of adapting the size of the ROIs to the requirements of the second stage. Many scale sizes were evaluated, varying from 1.05 to 1.4, using hundreds of images. Their average mean value (i.e., 1.2, which has the lowest standard deviation and covers the minimum detectable rectangle size of HOG-SVM) was adopted; this is shown by the blue rectangle in Fig. 15. The resized ROIs can also be very large, which may result in extra processing time in the second stage. For example, if the ROI is a large moose with a size of 280 × 200 pixels and the minimum detection window of the second stage is set to 56 × 40 for a larger detection range, important processing time will be wasted on the incomplete animal body. Consequently, a substantial amount of time can be saved if we resize the 280 × 200 region to a smaller one. Experimentally, the dimensions 70 × 50 were selected as a normalization size.
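Putting the pieces together, a per-frame pass of the two-stage scheme can be sketched as below, assuming a trained LBP cascade (an OpenCV CascadeClassifier) and a trained HOG-SVM pair. The 1.2 ROI expansion follows the value given above, while the model files are hypothetical and any binary classifier with a scikit-learn-style predict method would fit; this is a sketch under those assumptions, not the authors' implementation.

```python
import cv2

# Hypothetical model file; any trained LBP cascade will do.
lbp_cascade = cv2.CascadeClassifier("lbp_animal_cascade.xml")

def detect_animals(frame, hog, svm_htl, svm_htr):
    """Stage 1: LBP-AdaBoost over the whole frame.
    Stage 2: HOG-SVM verification of each expanded ROI."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    detections = []
    rois = lbp_cascade.detectMultiScale(gray, scaleFactor=1.05, minSize=(56, 40))
    for (x, y, w, h) in rois:
        # expand the ROI by the empirical factor 1.2 around its center
        cx, cy = x + w / 2, y + h / 2
        w2, h2 = int(w * 1.2), int(h * 1.2)
        x0, y0 = max(0, int(cx - w2 / 2)), max(0, int(cy - h2 / 2))
        roi = gray[y0:y0 + h2, x0:x0 + w2]
        if roi.size == 0:
            continue
        # the paper normalizes large ROIs to 70x50 before scanning;
        # we simplify to a single 56x40 HOG window
        feat = hog.compute(cv2.resize(roi, (56, 40))).reshape(1, -1)
        # accept if either the HTL or the HTR sub-classifier fires
        if svm_htl.predict(feat)[0] == 1 or svm_htr.predict(feat)[0] == 1:
            detections.append((x0, y0, w2, h2))
    return detections
```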
Fig. 17. Experimental results performed on video showing a moose crossing the roadway in evening conditions.

Figs. 17 and 18 show some frames of large animal detection cut from three videos in different weather conditions.

B. Evaluation of the Two-Stage Scheme

We evaluated the performance of the two-stage system, LBP-AdaBoost/HOG-SVM, following the same criteria used in Section VII, i.e., detection accuracy and processing time. Hence, two sets of experiments were performed. In the first set, we compared our LBP-AdaBoost/HOG-SVM system with the well-known HOG-SVM and with the aforementioned one-stage schemes (i.e., HOG-AdaBoost, LBP-AdaBoost, and Haar-AdaBoost). After that, in the second set of experiments, cascades of these detectors were compared to each other. More precisely, the LBP-AdaBoost/HOG-SVM combination is compared to HOG-SVM/LBP-AdaBoost. The results of processing time and detection accuracy are regrouped in Figs. 19 and 20, respectively.
Fig. 18. Experimental results performed on video showing a horse crossing the roadway in snowy conditions.

Fig. 19. Average detection speed; LBP-AdaBoost cascaded with HOG-SVM takes 64.32 ms, and HOG-SVM operating as a first stage cascaded with LBP-AdaBoost takes 167.45 ms. Video size: 320 × 240; total frame number: 1026; initial (minimal) detection window size: 56 × 40; scale rate: 1.05.

Fig. 20. Miss rate versus FPPI curves used to demonstrate the two-stage animal detection system.

Fig. 21. Miss rate versus FPPI curves used to demonstrate the two-stage animal detection system (nighttime scenario).

The average processing time of each classifier is expressed in the bar chart of Fig. 19. The left bar represents our two-stage system's processing time; this is only 7.2 ms slower than the single LBP-AdaBoost. As explained before, although HOG-SVM consumes a great deal of time compared with other detectors, due to the preprocessing of ROIs only an extra 7.2 ms is spent on the second stage. However, exchanging the order of the two stages leads to a completely different result. The cost for the HOG-SVM/LBP-AdaBoost detector would be 167.45 ms, which is slower than the single HOG-SVM detector.

On the other hand, we evaluated the performance accuracy of our system, and drew the final DET curves. We plotted the miss rate versus FPPI curves of all one-stage and two-stage classifiers in Fig. 20, and drew the following conclusion. The two-stage detector LBP-AdaBoost/HOG-SVM outperforms all single detectors, in addition to the two-stage HOG-SVM/LBP-AdaBoost detector. Particularly in the event that the false positive per image rate is around $1 \times 10^{-4}$, LBP-AdaBoost/HOG-SVM has a distinct advantage compared with other detectors. When the FPPI increases to more than $5 \times 10^{-4}$, the three detectors, LBP-AB/HOG-SVM, HOG-SVM/LBP-AB, and LBP-AdaBoost, have almost similar levels of accuracy. Moreover, the Haar-AdaBoost detector achieves a moderate level of performance, but still does much better than the HOG-AdaBoost and HOG-SVM detectors.

IX. NIGHTTIME DETECTION

It is shown in [27] that AVCs might occur anytime over a 24-h period. However, the most dangerous period for AVCs seems to be from 6:00 P.M. to 8:00 A.M. The night period is characterized by weak illumination and a limited field of view, thus creating a different detection environment than that of daytime detection. Usually, a thermographic (or infrared) camera with a wavelength of up to 14 000 nm is applied to capture objects at night. That is, an image is formed using the infrared radiation released from warm-blooded animals [28].

The use of LADSet for the purpose of animal detection at night is unrealistic. That is, a specific dataset for nighttime conditions is mandatory. We have created only a small nighttime dataset, since it is hard to find such videos/images on the Internet. That is, only 654 thermographic positive images (327 per subclassifier) and 6000 negative images were collected.
Some objects, such as tree branches or underbrush, and some


color backgrounds are automatically filtered out by the infrared
camera, which lowers the FPR. Furthermore, some textural
features of the animals are generalized or undermined; how-
ever, profile features are strengthened. However, we should
note that since our dataset contains only 654 thermographic
positive images, the nighttime detection result is not as good
as that of the daytime conditions. The final result would
be better if we used a large number of positive images.
In Figs. 22 and 23, nighttime animal detection of infrared
video is shown.

Fig. 22. Nighttime experimental results performed on infrared still images.

Fig. 23. Experimental results performed on infrared videos at nighttime.

X. CONCLUSION

In this paper, we investigated the problem of AVC mitigation using camera-based systems. Three detectors based on HOG, LBP, and Haar features were used to detect large animals. These features were trained separately on our dataset, LADSet, using the well-known AdaBoost algorithm; that is, three different detectors called Haar-AdaBoost, HOG-AdaBoost, and LBP-AdaBoost were constructed. These detectors were then assessed and compared with each other in terms of accuracy and processing time, and subsequently compared to the well-known HOG-SVM. Overall, LBP-AdaBoost showed strong results compared to the other schemes; however, a high FPR was observed. To cope with this issue, we took advantage of the good detection rate of LBP-AdaBoost and combined it with HOG-SVM, which performs well at detecting the contours of animals; that is, a new scheme based on a two-stage strategy was developed. The aforementioned schemes were evaluated and tested under different illumination conditions. In our experiments, we concentrated on the lateral view of the moose, since this is the position it takes when crossing roadways; other positions, such as posterior and anterior views and postures at various angles, are left to future work. Our two-stage architecture LBP-AdaBoost/HOG-SVM showed good performance in daytime conditions; however, we observed that during the nighttime the combination of LBP-AdaBoost and HOG-SVM has limited capabilities.
ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their comments, which helped to improve the manuscript.

REFERENCES

[1] I. D. Katzourakis, C. F. J. de Winter, M. Alirezaei, M. Corno, and R. Happee, "Road-departure prevention in an emergency obstacle avoidance situation," IEEE Trans. Syst., Man, Cybern., Syst., vol. 44, no. 5, May 2014.
[2] J. M. Conn, J. L. Annest, and A. Dellinger, "Nonfatal motor-vehicle animal crash-related injuries, United States, 2001-2002," J. Safety Res., vol. 35, no. 5, pp. 571–574, 2004.
[3] M. A. Sharafsaleh et al., "Evaluation of an animal warning system effectiveness phase two-final report," Dept. Transp., Inst. Transp. Studies, Univ. California, Berkeley, CA, USA, Tech. Rep. UCB-ITS-PRR-2012-12, 2012.
[4] K. Knapp et al., "Deer-vehicle crash countermeasure toolbox: A decision and choice resource," Midwest Regional Univ. Transp. Center, Deer-Veh. Crash Inf. Clearinghouse, Univ. Wisconsin-Madison, Madison, WI, USA, Tech. Rep. DVCIC-02, 2004.
[5] A. Mammeri, T. Zuo, and A. Boukerche, "Extending the detection range of vision-based driver assistance systems: Application to pedestrian protection system," in Proc. IEEE Glob. Commun. Conf. (GLOBECOM), Austin, TX, USA, Dec. 2014, pp. 1358–1363.
[6] D. Zhou, "Real-time animal detection system for intelligent vehicles," M.S. thesis, School Elect. Eng. Comput. Sci., Univ. Ottawa, Ottawa, ON, Canada, 2014.
[7] Z. Debao, W. Jingzhou, and W. Shufang, "Contour based HOG deer detection in thermal images for traffic safety," in Proc. Int. Conf. Image Process. Comput. Vis. Pattern Recognit., Las Vegas, NV, USA, Jul. 2012, pp. 1–6.
[8] T. Burghardt and J. Calic, "Real-time face detection and tracking of animals," in Proc. 8th Seminar Neural Netw. Appl. Elect. Eng. (NEUREL), Belgrade, Serbia, Sep. 2006, pp. 27–32.
[9] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 1, Kauai, HI, USA, Dec. 2001, pp. 511–518.
[10] D. Ramanan, D. A. Forsyth, and K. Barnard, "Building models of animals from video," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 8, pp. 1319–1334, 2006.
[11] H. Cho, P. E. Rybski, A. Bar-Hillel, and W. Zhang, "Real-time pedestrian detection with deformable part models," in Proc. IEEE Intell. Veh. Symp. (IV), Alcalá de Henares, Spain, Jun. 2012, pp. 1035–1042.
[12] T. Burghardt, B. Thomas, P. J. Barham, and J. Calic, "Automated visual recognition of individual African penguins," in Proc. 5th Int. Penguin Conf., Ushuaia, Argentina, Sep. 2004.
[13] C. P. Papageorgiou, M. Oren, and T. Poggio, "A general framework for object detection," in Proc. 6th IEEE Int. Conf. Comput. Vis., Mumbai, India, Jan. 1998, pp. 555–562.
[14] W. Zhang, J. Sun, and X. Tang, "From tiger to panda: Animal head detection," IEEE Trans. Image Process., vol. 20, no. 6, pp. 1696–1708, 2011.
[15] W. Zhang, J. Sun, and X. Tang, "Cat head detection—How to effectively exploit shape and texture features," in Proc. ECCV, vol. 4, Marseille, France, 2008, pp. 802–806.
[16] M. Zeppelzauer, "Automated detection of elephants in wildlife video," EURASIP J. Image Video Process., vol. 46, no. 1, pp. 1–44, 2013.
[17] P. Khorrami, J. Wang, and T. Huang, "Multiple animal species detection using robust principal component analysis and large displacement optical flow," in Proc. Workshop Vis. Observation Anal. Animal Insect Behav. (VAIB), Tsukuba, Japan, 2012.
[18] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. (CVPR), vol. 1, San Diego, CA, USA, Jun. 2005, pp. 886–893.
[19] R. M. Nowak, Walker's Mammals of the World. Baltimore, MD, USA: Johns Hopkins Univ. Press, 1999, pp. 1081–1091.
[20] J. Stallkamp, M. Schlipsing, J. Salmen, and C. Igel, "Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition," Neural Netw., vol. 32, pp. 323–332, Aug. 2012.
[21] J. Ge, Y. Luo, and G. Tei, "Real-time pedestrian detection and tracking at nighttime for driver-assistance systems," IEEE Trans. Intell. Transp. Syst., vol. 10, no. 2, pp. 283–298, Jun. 2009.
[22] D. Chen, X. B. Cao, H. Qiao, and F.-Y. Wang, "A multiclass classifier to detect pedestrians and acquire their moving styles," in Proc. IEEE Int. Conf. Intell. Security Informat., San Diego, CA, USA, May 2006, pp. 758–759.
[23] L. Zhang, R. Chu, S. Xiang, S. Liao, and S. Z. Li, "Face detection based on multi-block LBP representation," in Advances in Biometrics. Berlin, Germany: Springer, 2007, pp. 11–18.
[24] R. Lienhart and J. Maydt, "An extended set of Haar-like features for rapid object detection," in Proc. Int. Conf. Image Process., vol. 1, Rochester, NY, USA, 2002, pp. I-900–I-903.
[25] P. Viola and M. Jones, "Robust real-time object detection," Int. J. Comput. Vis., vol. 4, no. 2, pp. 51–52, 2001.
[26] T. Ojala, M. Pietikainen, and D. Harwood, "Performance evaluation of texture measures with classification based on Kullback discrimination of distributions," in Proc. 12th IAPR Int. Conf. Pattern Recognit. (Conf. A: Comput. Vis. Image Process.), vol. 1, Jerusalem, Israel, 1994, pp. 582–585.
[27] Tardif & Associates Inc., "Collisions involving motor vehicles and large animals in Canada," Transp. Canada Road Safety Directorate, Ottawa, ON, Canada, Final Rep., Mar. 2003, p. 44.
[28] P. Klocek, Handbook of Infrared Optical Materials. New York, NY, USA: Marcel Dekker, 1991.

Abdelhamid Mammeri received the M.Sc. degree from the Catholic University of Louvain, Louvain-la-Neuve, Belgium, in 2002, and the Ph.D. degree from Sherbrooke University, Sherbrooke, QC, Canada, in 2010, both in electrical and computer engineering.
He is a Research Associate with the DIVA Strategic Research Network, University of Ottawa, Ottawa, ON, Canada. His current research interests include visual sensor networks, wireless ad hoc networks, vehicular networks, energy minimization schemes for visual sensor networks, and vision-based object detection applied to vehicular ad hoc networks. He has published extensively in highly ranked international conferences and journals in the above areas.
Dr. Mammeri was a recipient of the FQRNT Quebec Scholarship Award at the postdoctoral level in 2012. He has served as a Technical Program Committee Member for several conferences, including the IEEE Vehicular Technology Conference 2013, the IEEE Local Computer Networks 2013, and ACM Modeling, Analysis and Simulation of Wireless and Mobile Systems 2013.

Depu Zhou is currently pursuing the master's degree in electrical and computer engineering with the School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON, Canada. His current research interests include video streaming over vehicular networks.

Azzedine Boukerche received the M.Sc. and Ph.D. degrees in computer science from McGill University, Montreal, QC, Canada.
He is a Full Professor and the Canada Research Chair Tier-1 with the University of Ottawa, Ottawa, ON, Canada, where he is the Scientific Director of the NSERC-DIVA Strategic Research Network and the Director of the PARADISE Research Laboratory. He was a Faculty Member with the University of North Texas, Denton, TX, USA, and a Senior Scientist with the Simulation Sciences Division, Metron Corporation, San Diego, CA, USA. He spent one year with JPL/NASA-California Institute of Technology, Pasadena, CA, USA, where he contributed to a project centered on the specification and verification of the software used to control interplanetary spacecraft operated by the JPL/NASA Laboratory. His current research interests include vehicular networks, sensor networks, mobile ad hoc networks, mobile and pervasive computing, wireless multimedia, performance evaluation and modeling of large-scale distributed systems, distributed computing, and large-scale distributed interactive simulation. He has published several research papers in the above areas.
Dr. Boukerche was a recipient of the Ontario Distinguished Researcher Award, the Premier of Ontario Research Excellence Award, the G. S. Glinski Award for Excellence in Research, the IEEE Computer Society Golden Core Award, the IEEE CS-Meritorious Award, the University of Ottawa Award for Excellence in Research, and several best research paper awards for his work on vehicular and sensor networking and mobile computing. He is an Editor of three books on mobile computing, wireless ad hoc, and sensor networks. He serves as an Associate Editor for several IEEE TRANSACTIONS and ACM journals, and as the Steering Committee Chair for several IEEE and ACM international conferences. He is a Fellow of the Engineering Institute of Canada, the Canadian Academy of Engineering, and the American Association for the Advancement of Science.
