
IET Intelligent Transport Systems

Research Article

Using GLCM features in Haar wavelet transformed space for moving object classification

ISSN 1751-956X
Received on 30th April 2018
Revised on 13th March 2019
Accepted on 15th March 2019
E-First on 5th April 2019
doi: 10.1049/iet-its.2018.5192
www.ietdl.org

Nadia Kiaee1, Elham Hashemizadeh2 , Nima Zarrinpanjeh3


1Department of Computer Engineering, Karaj Branch, Islamic Azad University, Karaj, Iran
2Department of Mathematics, Karaj Branch, Islamic Azad University, Karaj, Iran
3Department of Civil Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran

E-mail: hashemizadeh@kiau.ac.ir

Abstract: This article proposes an integrated system for the segmentation and classification of two classes of moving objects, cars and pedestrians, viewed from the side in a video sequence. Based on the grey-level co-occurrence matrix (GLCM) in Haar wavelet transformed space, the authors calculate texture features from the different sub-bands separately. The Haar wavelet transform is chosen because the resulting wavelet sub-bands strongly affect the orientation elements in the GLCM computation. To evaluate the proposed method, the results obtained from the different sub-bands are compared with each other. The extracted object features are classified using a support vector machine (SVM). The experimental results show that using three wavelet sub-bands instead of two is more effective and yields good precision.

1 Introduction

Automatic image analysis systems for the detection, tracking, and classification of moving objects have been developed within different fields of computer vision, such as human-computer interaction, robotics and medical imaging, smart video compression, and automatic video analysis [1, 2]. This is one of the most important problems in image processing and machine vision. Locating an object, tracking it through a sequence of video images, and classifying objects with high accuracy each serve a specific purpose and can support the precise analysis and monitoring of traffic conditions or the retrieval of a specific object from a video sequence [3-5].

In recent years, many studies have addressed the detection and classification of cars and pedestrians. They often use features such as Haar wavelets, histograms of oriented gradients (HOG), local binary patterns (LBP), and the grey-level co-occurrence matrix (GLCM). There are different methods for video surveillance. In [6], Liang and Juang classified four types of moving objects (pedestrians, cars, motorcycles, and bicycles) using local shape features and HOG in the Haar wavelet transformed space. These features improve classification performance and reduce data dimensionality. In [7], Zangenehpour et al. present a histogram-based method to extract features of a specific range of moving objects and use a support vector machine (SVM) for classification in crowded traffic scenes. Kumar et al. classified flying birds in the sky and airplanes on the runway using three feature-extraction methods, namely Haar wavelets, the co-occurrence matrix based on the Haar wavelet, and the plain co-occurrence matrix, in [8]. They concluded that the co-occurrence matrix based on the Haar wavelet gives better results in a short time and increases classification accuracy. Papageorgiou et al. extracted modified Haar wavelet features from the Haar wavelet transformed space and reported results for vehicle and face detection in [9]. Dalal and Triggs in [10] introduced histogram features in the Haar wavelet transformed space. In recent years, the histogram feature has been used for detection and classification tasks such as vehicle detection and facial recognition. Different features have different power to describe an object. Histograms store the local shape features of the objects of interest based on the image direction and the magnitude of the derivatives. LBP (local binary pattern) is a texture descriptor that is computationally efficient, invariant to grey-level variation, and can filter out noise in multi-scale mode by means of the same pattern [11].

Wen et al. in [12] presented a comprehensive system for the extraction of parked-vehicle information, which includes vehicle recognition, localisation, classification, and change detection for parking monitoring. The system presents a variable vehicle model to fit the vehicle hypothesis for accurate localisation, and each parameter is used as target visualisation. In [13], Chandra Mouli et al. presented a pedestrian-detection method to overcome complex backgrounds, based on an SVM and the human visual mechanism. Following human visual attention, mean and Laplacian-of-Gaussian filters are first applied to reduce the background noise while increasing the contrast between the foreground and background. In addition, a morphological process on the filtered image is used to remove the remaining noise. Then, pedestrian and non-pedestrian regions are obtained by local threshold segmentation on the morphologically processed image. To detect pedestrians reliably and reduce the false-detection rate, an SVM is used.

This paper provides a two-class object detection and classification system for use in a video sequence with a fixed camera. The aim is to improve the classification rate for these two types of moving objects by proposing a new set of features in this field. The proposed method classifies objects using GLCM features in the Haar wavelet transformed space with an SVM classifier.

The remainder of this paper is organised as follows: Section 2 introduces the segmentation method used to detect moving objects, Section 3 explains the feature extraction process, Section 4 presents the classification process, Section 5 presents the experimental results of the proposed classification approach, and Section 6 concludes the paper.

2 Proposed method

2.1 Background subtraction

Video data recorded by a surveillance camera are converted into frames. The conventional technique for extracting moving objects is background subtraction, in which the moving object is obtained by subtracting the input frame from the background image. To guarantee motion detection and achieve better results, brightness-change problems can be overcome and reduced by updating the background image. In this

IET Intell. Transp. Syst., 2019, Vol. 13 Iss. 7, pp. 1148-1153 1148
© The Institution of Engineering and Technology 2019
Fig. 1  Block diagram of sub-band's feature extraction by applying wavelet

Fig. 2  Haar wavelet transform is applied on an image in the three spectrums


(a) Red, (b) Blue, and (c) Green

case, the exact background image will be created. Several background estimation algorithms have been proposed. The background model of our proposed method is implemented by the Gaussian mixture model [12]. In practice, we used the ForegroundDetector function in our MATLAB code, which compares a colour video frame to a background model to determine whether individual pixels belong to the background or the foreground, and computes and returns a foreground mask using the Gaussian mixture model. Using this background subtraction, we detect the foreground objects in each frame. We set two parameters of this function: NumGaussians to 3 and NumTrainingFrames to 50, where NumGaussians is the number of Gaussian modes in the mixture model and NumTrainingFrames is the number of initial video frames used for training the background model.

2.2 Feature extraction

Feature extraction is one of the most important and challenging issues in image processing. In this process, we extract features that describe the image and the objects in it well. Fig. 1 shows the flowchart of the grey-level co-occurrence matrix-wavelet (GLCM-W) feature extraction process. Pre-processing usually prepares the image for segmentation and handles changes in the size and brightness of objects in an image. The Haar wavelet transformation is proposed to improve the distinguishability of the extracted features in the two object classes. So first, the first-level Haar wavelet transformation is carried out. Then, the GLCM features are obtained from the area of each object in each sub-band separately (see Fig. 2) [13, 14].

2.2.1 Pre-processing: To remove the undesirable background noise and fill the empty spaces of the foreground mask, morphological opening and closing operations are used (Fig. 3). As the main objective of this paper is moving-object classification using the extracted features, we extract the features from the information obtained from the object. Although the two-dimensional image after the morphological operations carries no appropriate texture information, we use it to find the coordinates of the moving object in the colour frames. Therefore,
Fig. 3  Effect of the morphological operation in removing background noise
(a), (b) Images containing noise, (c), (d) Images after noise reduction

Fig. 4  Rectangular window created around the (a) car and (b) pedestrian

the background and foreground labels refer to the pixels with values of zero and one, respectively. After the labelling step, the maximum and minimum coordinates of the rows and columns of pixels with value one are applied to the colour frames. In this case, we find the position of the moving objects in the colour frame using the coordinates obtained from the two-dimensional frame. A rectangular window is created around any detected object, as shown in Fig. 4.
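The segmentation steps above (foreground mask, morphological opening and closing, and a bounding window from the minimum/maximum foreground coordinates) can be sketched as follows. For brevity, the Gaussian-mixture background model of MATLAB's ForegroundDetector is replaced here by simple frame differencing, so this is an illustrative approximation rather than the paper's implementation.

```python
import numpy as np
from scipy import ndimage

def object_window(frame, background, thresh=25, struct_size=3):
    """Return (row_min, row_max, col_min, col_max) of the detected object.

    Background subtraction is simplified to absolute frame differencing;
    the paper uses a Gaussian-mixture background model instead.
    """
    # 1. Foreground mask: pixels that differ strongly from the background.
    mask = np.abs(frame.astype(int) - background.astype(int)) > thresh

    # 2. Morphological opening (removes noise specks) and closing
    #    (fills holes inside the object), as in Section 2.2.1.
    st = np.ones((struct_size, struct_size), dtype=bool)
    mask = ndimage.binary_opening(mask, structure=st)
    mask = ndimage.binary_closing(mask, structure=st)

    # 3. Bounding window from the min/max coordinates of foreground pixels.
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return rows.min(), rows.max(), cols.min(), cols.max()
```

The returned window is then cut out of the corresponding colour frame, as in Fig. 4.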

2.2.2 Haar wavelet transformed space: Alfred Haar, a Hungarian mathematician, first proposed the wavelet idea in its simplest form in 1909. The only problem with this wavelet is its discontinuity and uncertain nature [16, 17]. The Haar mother wavelet is defined as follows:

Ψ(t) = 1 for 0 ≤ t ≤ 1/2; −1 for 1/2 ≤ t ≤ 1; 0 otherwise   (1)

The scaling function Φ(t) of the Haar wavelet is:

Φ(t) = 1 for 0 ≤ t ≤ 1; 0 otherwise

The scaling function can decompose an image into four sub-images (Fig. 5).

Fig. 5  Decomposition into four sub-bands

Applying the wavelet to an image: In the discrete wavelet transform, the image signal is decomposed into two sub-sections through low- and high-pass filters at each step. The low-pass filter corresponds to the averaging operator, which summarises the coarse information of the signal. The high-pass filter corresponds to the differential operator, which summarises the detailed signal information. A two-dimensional transform is the result of two separate one-dimensional transforms. According to Fig. 6, the low-pass filter coefficients are stored in the left-hand side of the matrix and the high-pass filter coefficients in the right-hand side. Due to the down-sampling, the entire transformed image is the same size as the original image. We filter the image along the x and y axes and down-sample it by a factor of two at each step. After applying the filters along the x and y axes, the image is divided into the four sub-bands HH, LH, HL, and LL [18-20].

2.2.3 GLCM features in the Haar-wavelet transformed space: In order to achieve further details in the input image, we apply the Haar wavelet transform in the three colour spectrums (red, green, and blue) separately. Each object passes through a bank of high-pass and low-pass filters; thus, each image in each spectrum is decomposed into four sub-bands, as shown in Fig. 5, and a total of 12 images is achieved. While the other sub-bands contain more detailed information in different orientations, the LL sub-band provides an approximate image. Therefore, we ignored the LL sub-band and used only the other sub-bands for feature extraction. In fact, the low-pass filter corresponds to an averaging operator that summarises the coarse signal information, and the high-pass filters correspond to the difference operator that provides a summary of the detailed signal information. Since applying the wavelet affects the image texture, we use the GLCM to extract features. In this matrix, as shown in Fig. 7, the co-occurrences of pixel values are counted for the (2, 0) relation. After applying the wavelet on each spectrum, nine GLCMs are extracted from the three sub-bands separately [21].

The feature vector, which consists of four components (energy, correlation, contrast, and homogeneity), is computed from each matrix using the formulas in Table 1. Then, the 36 features of each object are stored in the feature matrix. Because of the inappropriate information of the LL sub-band, we do not consider it. Alternatively, we extract six GLCMs from two sub-bands separately, as shown in Fig. 2; in this case, 24 features of each object are stored in the feature matrix, as shown in Fig. 1.

2.3 Classification

The purpose of using the SVM is to map the feature vectors of a p-dimensional space into a higher-dimensional feature space and then create a separating hyper-plane (of dimension one less than the space) with maximum margin. The SVM decides based on this maximum-margin hyper-plane. Equation (2) defines the hyper-plane, where X is the input

variables, W is the coefficient vector, and b is the constant, which together determine the hyper-plane:

W^T X + b = 0   (2)

Here, the features are energy, homogeneity, contrast, and correlation, and the classes are pedestrian and car. First, we create the label matrix by assigning a label to each row of the feature matrix. Then, using the cross-validation method, 70% of the feature matrix is used for training and the rest as test data. In the test phase, with the arrival of each new object, the SVM predicts using the features of the input object and their similarity. If the value of W^T X + b satisfies y > 0, the object is classified as a pedestrian and otherwise as a car (Fig. 8). In other words, if there is similarity with the pedestrian feature class, the object is classified as a pedestrian and otherwise as a car. There are a total of N training samples for each of the pedestrian and car classes. Suppose that each of the relevant data points belongs to one of two classes. We want to assign the input vector X to one of the classes −1 or +1. Therefore, if the sign of the function in (2) is positive, we assign +1 to X, and otherwise −1. For training the SVM, the desired output for the N training samples is '1' for the pedestrian class and '−1' for cars. It should be mentioned that the main subject here is to select features that help us improve our problem solving, and we use the RBF kernel function of second poly-order. To conclude, the reason for choosing this classifier is that we had two classes and the SVM is well suited to this problem.
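The training and test protocol above (70/30 split, kernel SVM, labels +1 for pedestrians and −1 for cars) can be sketched with scikit-learn. The library choice and the gamma setting are assumptions for illustration, since the paper's implementation is in MATLAB.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def train_and_score(features, labels):
    """Train the two-class SVM on 70% of the feature matrix, test on 30%.

    `features`: (n_objects, n_features) GLCM-W feature matrix.
    `labels`: +1 for pedestrian, -1 for car (following the paper).
    The RBF kernel follows the paper; gamma='scale' is an assumption.
    """
    x_tr, x_te, y_tr, y_te = train_test_split(
        features, labels, train_size=0.7, random_state=0)
    clf = SVC(kernel="rbf", gamma="scale").fit(x_tr, y_tr)
    # Held-out accuracy on the remaining 30% of objects.
    return clf, clf.score(x_te, y_te)
```

At prediction time, `clf.predict` returns +1 or −1 for each new feature vector, which corresponds to the sign test on (2).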

Fig. 6  Original image and the result of applying a two-level wavelet transform
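The filter-and-downsample procedure of Section 2.2.2 can be sketched in a few lines. The unnormalised averaging/differencing convention below is one common choice for the Haar filters; implementations differ in how they scale the coefficients.

```python
import numpy as np

def haar2d(img):
    """One level of the 2-D Haar transform: returns LL, LH, HL, HH.

    Uses the unnormalised averaging/differencing convention (an
    assumption; other implementations scale the filters differently).
    Height and width of `img` must be even.
    """
    x = img.astype(float)
    # Filter and downsample along columns: low = average, high = difference.
    lo = (x[:, 0::2] + x[:, 1::2]) / 2
    hi = (x[:, 0::2] - x[:, 1::2]) / 2
    # Filter and downsample along rows to obtain the four sub-bands.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2   # approximation
    lh = (lo[0::2, :] - lo[1::2, :]) / 2   # horizontal detail
    hl = (hi[0::2, :] + hi[1::2, :]) / 2   # vertical detail
    hh = (hi[0::2, :] - hi[1::2, :]) / 2   # diagonal detail
    return ll, lh, hl, hh
```

Applying `haar2d` to each of the red, green, and blue channels yields the 12 sub-band images described in Section 2.2.3; the three detail bands LH, HL, and HH are kept for GLCM feature extraction while LL is discarded.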

Fig. 7  Calculation of the co-occurrence matrix for the (2, 0) relation

Table 1 GLCM texture statistics
Row  Feature       Formula
1    energy        Σ_i Σ_j c_ij²
2    entropy       −Σ_i Σ_j c_ij log₂ c_ij
3    contrast      Σ_i Σ_j (i − j)² c_ij
4    homogeneity   Σ_i Σ_j c_ij / (1 + (i − j)²)
5    correlation   Σ_i Σ_j (i − μ_x)(j − μ_y) c_ij / (σ_x σ_y)
6    average       Σ_i Σ_j i·c_ij
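A minimal sketch of the GLCM and the statistics used from Table 1, assuming the (2, 0) relation means a horizontal displacement of two pixels and that the input has already been quantised to a small number of grey levels:

```python
import numpy as np

def glcm_features(img, dr=0, dc=2, levels=8):
    """Normalised GLCM for displacement (dr, dc) plus Table 1 statistics.

    A horizontal displacement of two pixels (dr=0, dc=2) is assumed for
    the paper's (2, 0) relation. `img` must contain integers in
    [0, levels).
    """
    c = np.zeros((levels, levels))
    h, w = img.shape
    # Count co-occurrences of grey-level pairs at the given displacement.
    for r in range(h - dr):
        for col in range(w - dc):
            c[img[r, col], img[r + dr, col + dc]] += 1
    c /= c.sum()  # normalise counts to joint probabilities

    i, j = np.indices((levels, levels))
    mu_x, mu_y = (i * c).sum(), (j * c).sum()
    sd_x = np.sqrt(((i - mu_x) ** 2 * c).sum())
    sd_y = np.sqrt(((j - mu_y) ** 2 * c).sum())
    return {
        "energy": (c ** 2).sum(),
        "contrast": ((i - j) ** 2 * c).sum(),
        "homogeneity": (c / (1 + (i - j) ** 2)).sum(),
        "correlation": (((i - mu_x) * (j - mu_y) * c).sum() / (sd_x * sd_y)
                        if sd_x > 0 and sd_y > 0 else 1.0),
    }
```

Evaluating these four statistics on each of the three detail sub-bands in each of the three colour spectrums gives the 3 × 3 × 4 = 36 features per object described above.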

Fig. 8  Flowchart of pedestrian and car classification using Support Vector Machine

Fig. 9  Collected training samples

Table 2 Evaluation of two and three sub-bands
Proposed method    Features number  Accuracy  Precision  Sensitivity  Specificity
three sub-bands    36               0.9875    0.9867     1            0.8333
two sub-bands      24               0.8125    0.9360     0.8559       0.2778
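The measures reported in Tables 2 and 3 follow directly from the counts of true/false positives and negatives on the test set; the sketch below assumes pedestrians are taken as the positive class (an assumption, since the paper does not state which class is positive).

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, sensitivity (recall), and specificity from a
    2x2 confusion matrix; the positive class is assumed to be pedestrian.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
    }
```

For example, a test set with 8 true positives, 2 false positives, 1 false negative, and 9 true negatives yields an accuracy of 0.85 and a precision of 0.80.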

Fig. 10  Comparison of two and three sub-bands in red, green and blue spectrums. Features number = 12 in three sub-bands and features number = 8 in two
sub-bands

3 Results and discussion

To evaluate the proposed method, data collected in the field of urban transport are provided by a fixed camera capturing the objects from their side view (Fig. 9). A human then labels the data so that ground-truth data are available. The images of cars and pedestrians are captured by a Canon A1400 camera with a frame size of 640 × 480 and an imaging rate of 30 frames per second. The MATLAB programming environment is used for implementing and testing the proposed system. This programming language has the ability and flexibility to work with video and images in various fields such as image processing and computer vision. The program is run on a personal computer with a Core-i7 processor at 2.2 GHz and 8 GB of memory. This hardware is not much more powerful than what is used in real applications in intelligent systems, so solving the problem on this hardware promises good performance in real settings.

3.1 Performance of the proposed method in different sub-bands

Table 2 shows the accuracy, sensitivity, and specificity for each of the three- and two-sub-band configurations. Fig. 10 shows a comparison of the number of features and the accuracy for each set of extracted features. It illustrates that by increasing the number of features extracted from the different sub-bands, the accuracy of the classification increases.

Since the sensitivity of surveillance cameras differs across colour spectrums, we have examined the results of the features extracted from each frame by applying a wavelet in the red, green, and

Table 3 Evaluation of two and three sub-bands in the red, green, and blue spectrums separately
Proposed method   Features number  Spectrum  Accuracy  Precision  Sensitivity  Specificity  Recall
three sub-bands   12               red       0.9858    0.9955     1            0.9444       1
                  12               green     0.9958    0.9955     1            0.9444       1
                  12               blue      0.9875    0.9867     1            0.8333       1
two sub-bands     8                red       0.9942    0.9942     1            0.9333       1
                  8                green     0.9917    1          0.9910       1            0.9910
                  8                blue      0.9917    0.9955     0.9955       0.9944       0.9955

blue separately, as shown in Table 3 [22, 23]. Fig. 10 shows the results of the extracted features in the three colour spectrums in a bar graph; the blue colour range leads to decreased classification accuracy [24].

4 Conclusion

This paper classifies two types of moving objects using GLCM features in the Haar wavelet transformed space with an SVM, for intelligent monitoring systems with fixed cameras. First, according to the characteristics of the wavelet, such as de-noising and precise image retrieval, the wavelet has been applied to the area of each object. As the LL sub-band contains noise and other useless information, the results have been examined by removing this sub-band and using two and three sub-bands of the wavelet. Texture information of the image has been computed from the GLCM. The GLCM-W feature, which extracts the GLCM features from the Haar wavelet-transformed space, shows the advantages of the desired classification performance. The location information of the area of each object is investigated in the three red, green, and blue colour spectrums separately. It can be concluded that for three sub-bands of wavelet decomposition in the red and green spectrums, the features were more outstanding than when determining them in two sub-bands of the blue one. In addition, the introduced algorithm is optimised for applications in which the colour-histogram variation of each object is low. It is also recommended to compare the results with other similar methods, as performed in the pioneering studies of Varma et al. [24] and Fu et al. [25], which we did not pursue in this research due to our focus on the SVM and the wavelet. This could be an interesting field of research for future work.

5 Acknowledgments

The authors would like to thank the anonymous reviewers of this paper for their careful reading, constructive comments, and suggestions, which have greatly improved the paper.

6 References

[1] Hu, W., Tan, T., Wang, L., et al.: 'A survey on visual surveillance of object motion and behaviors', IEEE Trans. Syst. Man Cybern. C, Appl. Rev., 2004, 34, (3), pp. 334-352
[2] Zhang, R., Liu, X., Hu, J., et al.: 'A fast method for moving object detection in video surveillance image', Signal Image Video Process., 2017, 11, (5), pp. 841-848
[3] Kavitha, C., Ashok, S.D.: 'A new approach to spindle radial error evaluation using a machine vision system', Metrol. Meas. Syst., 2017, 24, (1), pp. 201-219
[4] Wen, X., Shao, L., Fang, W., et al.: 'Efficient feature selection and classification for vehicle detection', IEEE Trans. Circuits Syst. Video Technol., 2015, 25, (3), pp. 508-517
[5] Shah, M., Deng, J.D., Woodford, B.J.: 'Video background modeling: recent approaches, issues and our proposed techniques', Mach. Vis. Appl., 2014, 25, (5), pp. 1105-1119
[6] Liang, C.-W., Juang, C.-F.: 'Moving object classification using local shape and HOG features in wavelet-transformed space with hierarchical SVM classifiers', Appl. Soft Comput., 2015, 28, pp. 483-497
[7] Zangenehpour, S., Miranda-Moreno, L.F., Saunier, N.: 'Automated classification based on video data at intersections with heavy pedestrian and bicycle traffic: methodology and application', Transp. Res. C, Emerg. Technol., 2015, 56, pp. 161-176
[8] Mohanaiah, P., Sathyanarayana, P., GuruKumar, L.: 'Image texture feature extraction using GLCM approach', Int. J. Sci. Res. Publ., 2013, 3, (5), p. 1
[9] Papageorgiou, C.P., Oren, M., Poggio, T.: 'A general framework for object detection'. Sixth Int. Conf. on Computer Vision, Bombay, India, 1998
[10] Dalal, N., Triggs, B.: 'Histograms of oriented gradients for human detection'. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition (CVPR 2005), 2005
[11] Nanni, L., Lumini, A., Brahnam, S.: 'Survey on LBP based texture descriptors for image classification', Expert Syst. Appl., 2012, 39, (3), pp. 3634-3641
[12] Zivkovic, Z.: 'Improved adaptive Gaussian mixture model for background subtraction'. Proc. of the 17th Int. Conf. on Pattern Recognition (ICPR 2004), Cambridge, UK, 2004
[13] Mokji, M., Bakar, S.A.: 'Gray level co-occurrence matrix computation based on haar wavelet' (IEEE, London, UK, 2007)
[14] Guo, L., Cui, H., Li, P., et al.: 'Urban road congestion recognition using multi-feature fusion of traffic images', J. Artif. Intell., 2016, 1, pp. 20-24
[15] Dheekonda, R.S.R., Panda, S.K., Khan, N., et al.: 'Object detection from a vehicle using deep learning network and future integration with multi-sensor fusion algorithm'. SAE Technical Paper, 2017
[16] Kaur, H., Mittal, R., Mishra, V.: 'Haar wavelet approximate solutions for the generalized Lane-Emden equations arising in astrophysics', Comput. Phys. Commun., 2013, 184, (9), pp. 2169-2177
[17] Fard, M.R., Mohaymany, A.S., Shahri, M.: 'A new methodology for vehicle trajectory reconstruction based on wavelet analysis', Transp. Res. C, Emerg. Technol., 2017, 74, pp. 150-167
[18] Agarwal, S., Verma, A., Singh, P.: 'Content based image retrieval using discrete wavelet transform and edge histogram descriptor'. 2013 Int. Conf. on Information Systems and Computer Networks (ISCON), 2013
[19] Tang, Y., Zhang, C., Gu, R., et al.: 'Vehicle detection and recognition for intelligent traffic surveillance system', Multimedia Tools Appl., 2017, 76, (4), pp. 5817-5832
[20] Hashemizadeh, E., Rahbar, S.: 'The application of Legendre multiwavelet functions in image compression', J. Mod. Appl. Stat. Methods, 2016, 15, (2), pp. 510-525
[21] Kamkar, S., Safabakhsh, R.: 'Vehicle detection, counting and classification in various conditions', IET Intell. Transp. Syst., 2016, 10, (6), pp. 406-413
[22] Huang, X., Zhang, L.: 'An SVM ensemble approach combining spectral, structural, and semantic features for the classification of high-resolution remotely sensed imagery', IEEE Trans. Geosci. Remote Sens., 2013, 51, (1), pp. 257-272
[23] Kiaee, N., Hashemizadeh, E., Zarrinpanjeh, N.: 'Evaluation of moving object detection based on various input noise using fixed camera', ISPRS-Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., 2017, pp. 151-154
[24] Varma, S., Sreeraj, M.: 'Object detection and classification in surveillance system'. 2013 IEEE Recent Advances in Intelligent Computational Systems (RAICS), Trivandrum, 2013, pp. 299-303
[25] Fu, Y., Ma, D., Zhang, H., et al.: 'Moving object recognition based on SVM and binary decision tree'. 2017 6th Data Driven Control and Learning Systems (DDCLS), Chongqing, 2017, pp. 495-500

