
2016 International Symposium on Computer, Consumer and Control

Real-time Driver Drowsiness Detection System Based on PERCLOS and Grayscale Image
Processing

Jun-Juh Yan1, Hang-Hong Kuo2, Ying-Fan Lin3, Teh-Lu Liao3,*


1. Department of Computer and Communication, Shu-Te University, Kaohsiung 824, Taiwan, R.O.C.
2. NeoVictory Technology Co., Ltd., Tainan 710, Taiwan, R.O.C.
3. Department of Engineering Science, National Cheng Kung University, Tainan 701, Taiwan, R.O.C.
* Corresponding author

Abstract: This study develops a real-time drowsiness detection system based on grayscale image
processing and PERCLOS to determine if the driver is fatigued. The proposed system comprises three
parts: first, it calculates the approximate position of the driver’s face in grayscale images, and then uses
a small template to analyze the eye positions; second, it uses the data from the previous step and
PERCLOS to establish a fatigue model; and finally, based on the driver’s personal fatigue model, the
system continuously monitors the driver’s state. Once the driver exhibits fatigue, the system alerts the
driver to stop driving and take a rest.

Keywords: Fatigue model, Template matching, Grayscale image processing, PERCLOS

1. Introduction

Fatigue reduces a driver’s attention, especially when driving long distances or at night, when reaction ability declines. The fatigue effect is a common, yet dangerous, driving experience that may even include a few seconds of shallow sleep. In response to this problem, several automobile manufacturers have begun installing onboard computers featuring a driver drowsiness detection system in their cars.

Currently, driver drowsiness detection systems can be classified into two categories: contact types and contactless types. Contact types include measuring the pulse or body temperature, while detecting and analyzing the driver’s facial expression is a standard contactless type [1]. However, products requiring physical contact are inconvenient for drivers and easily forgotten. On the other hand, most commercially available contactless driver drowsiness detection systems and related products use visible light to achieve face detection [2-4]. During the day, they do not affect drivers; at night, however, since these systems need to gather the skin-color region, extra light is projected on the driver’s face to properly obtain color images, which is likely to increase the burden on the driver’s eyes.

This study aims to build a real-time model that uses gray-scale image information and a template of a particular region of the human face to rapidly detect the driver’s current state. The weak point is that the skin region is harder to analyze in gray-scale images than in color images [5,6]. However, using gray-scale images means that the system does not need to project visible illumination light on the driver’s face. As a result, it will not affect the driver’s eyes or cause eyestrain due to the illumination light.

2. Material and Methods

System Architecture
The system features three main parts. First, the system collects the necessary data to locate the driver’s face region and eyes; second, the system analyzes the captured data and collects new data from the images to set the fatigue model. Once the model is built, the system then starts to detect the driver’s physical state, and if the driver shows signs of fatigue, an alarm sounds to warn the driver.

2.1: Eyes Location
After an image is captured, image pre-processing is performed to extract useful information. The first part of the system includes data collection and facial region detection. A flowchart detailing this procedure is given in Figure 1.

[Flowchart blocks: Image Capture; Image Preprocessing; Edge Detection; Noise Reduction; Data Collection; Facial Region Detection; Histogram; Template Matching; Quick Sort; Median Filter; Calculated Region]
Figure 1. Flow chart of eyes detection.

Image Preprocessing
Edge detection finds the edge pixels in a gray-scale image; edge pixels are the positions where the gray level suddenly changes. This study uses the Sobel Operator for edge detection since it can locate approximate edges in the shortest period of time. The amount of edges and the center of the face could be computed as follows:

978-1-5090-3071-2/16 $31.00 © 2016 IEEE 242


243
DOI 10.1109/IS3C.2016.72
Authorized licensed use limited to: VIT University. Downloaded on October 06,2023 at 17:17:46 UTC from IEEE Xplore. Restrictions apply.
$$H_i = \sum_{j=-50}^{50} h_{i+j}, \quad 50 \le i \le 309, \qquad CF = \{\, j \mid H_j = \max\{H_i\} \,\}$$

where $i$ is a point on the x-axis and $h_i$ is the histogram value of edge pixels at $i$. $H_i$ is a number, which is the amount of histogram value, while $CF$ is the face center position, a point on the x-axis. Figure 2 shows the edge detection image and histogram image.

It can be seen that both sides of the face have the maximum amount of edge pixels. As a result, through the histogram of edge pixels, the face region can be located. The positions of the maximum values on the left and right sides of the point CF are defined as point L and R, respectively. The distance between L and R is defined as w, the width of the face. L, R and w could be calculated as follows:

$$L = \{\, j \mid h_j = \max\{h_i\} \,\}, \quad (h_i \ne 0,\ j \le i \le CF)$$
$$R = \{\, j \mid h_j = \max\{h_i\} \,\}, \quad (h_i \ne 0,\ CF \le i \le j)$$
$$w = R - L$$

where i, j, L, R and w are measured along the x-axis.

[Figure 2: edge detection image and histogram image, with L, CF, R and the face width w marked]
Figure 2. Image after preprocessing.

Template Matching
In general, a face could be separated into five parts along the horizontal axis; the two eyes are located in the second and fourth parts. Moreover, the aspect ratio of an eye is about 2:3. For a human face, the edge pixels are concentrated in the hair, eyes, nose, and mouth. Using the characteristics of the eyes, nose and cheeks, a unique template distinct from the other portions can be established, as in Figure 3.

[Figure 3: template diagram with dimension labels /5, /8 and 3/5]
Figure 3. Template of eyes and cheek.

In Figure 4, the result of template matching is given, where c is defined as the center of the two eyes. This template can quickly and efficiently locate the position of the human eyes. Compared with a whole face template, it is a smaller template, which has a high calculation speed but also a high error rate.

Figure 4. Template matching.

Quicksort and Median Filter
Following the previous steps, each image needs only to retain two data items: c, the center position of the eyes, and w, the face width. However, these data may contain some errors; therefore, the two kinds of data need to be sorted to identify the erroneous information. The values of w and c collected over successive images each form a sequence; after sorting, each sequence is ordered from small to large.
A median filter can remove the very large or very small data from the c and w sequences; therefore, it can enhance the accuracy of the results. The previous steps lead to the eyes region being located.

2.2: Fatigue Modeling
In this step, the system needs to collect data regarding the driver’s physical state and then perform an analysis to build a fatigue model. Figure 5 shows the process in a flowchart.

Binarization and Quicksort
The next step is to binarize the region of concern, as in Figure 6. It can be seen that the iris region and eyelashes are darker than the skin and whites of the eyes. Hence, the eyelids’ closure degree will directly affect the amount of black pixels in each image. It is intuitive that when the eyelids are closed, the number of black pixels will be much less than when they are open.
The system can obtain a variety of different eyes-state images since the webcam captures images continuously, including when the eyes are open, closed, or semi-closed. Moreover, the quantity of black pixels of each image is recorded in order to analyze the state the eyes are in.
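The binarization and black-pixel counting step of Section 2.2 can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `count_black_pixels`, the fixed cutoff of 80, and the toy pixel values are all assumptions, since the paper does not state how the binarization level is chosen.

```python
def count_black_pixels(gray_region, cutoff=80):
    """Binarize a grayscale region and count the pixels classified as black.

    gray_region: list of rows, each a list of 0-255 gray levels.
    cutoff: assumed binarization level (not given in the paper);
            pixels darker than this value are counted as black.
    """
    return sum(1 for row in gray_region for value in row if value < cutoff)


# Toy 3x6 "eye regions": dark values stand in for the iris and eyelashes.
open_eye = [
    [200, 190, 60, 50, 195, 205],
    [198, 40, 30, 35, 45, 201],
    [202, 188, 55, 65, 192, 207],
]
closed_eye = [
    [200, 195, 198, 201, 199, 205],
    [70, 190, 185, 188, 192, 75],
    [202, 199, 197, 203, 200, 207],
]

open_count = count_black_pixels(open_eye)      # many dark pixels
closed_count = count_black_pixels(closed_eye)  # few dark pixels
print(open_count, closed_count)
```

In this toy data the open eye yields more black pixels than the closed one, which is the property the fatigue model relies on.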

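Returning to the face-localization formulas of Section 2.1, the computation of CF, L, R and w from the edge histogram might be sketched as below. It assumes one reading of the formulas: that H_i is a sum of the column histogram h over a window of 50 columns on each side of i, and that L and R are the columns with the largest histogram value on the left and right of CF. The function name `locate_face`, the `half_window` parameter, and the toy histogram are hypothetical, and the histogram itself is assumed to have been computed already from a Sobel edge image.

```python
def locate_face(h, half_window=50):
    """Estimate face center and width from a column histogram of edge pixels.

    h: list where h[i] is the number of edge pixels in image column i.
    half_window: columns summed on each side of i (50 under the assumed
                 reading of the paper's formula).
    Returns (cf, left, right, width).
    """
    n = len(h)
    centers = range(half_window, n - half_window)
    # Windowed sum H_i around every admissible column i.
    H = {i: sum(h[i - half_window:i + half_window + 1]) for i in centers}
    cf = max(centers, key=lambda i: H[i])            # face center CF
    left = max(range(0, cf + 1), key=lambda i: h[i])  # strongest column left of CF
    right = max(range(cf, n), key=lambda i: h[i])     # strongest column right of CF
    return cf, left, right, right - left


# Toy 21-column histogram: strong face-contour edges at columns 5 and 15,
# weaker edges (eyes, nose) in between. A small window is used to match.
toy_h = [0, 0, 0, 0, 0, 9, 0, 0, 2, 3, 4, 3, 2, 0, 0, 8, 0, 0, 0, 0, 0]
cf, left, right, width = locate_face(toy_h, half_window=5)
print(cf, left, right, width)
```

The window has to be wide enough to cover both face contours when centered on the face; otherwise the windowed sum peaks near one contour instead of the center.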
[Figure 5 flowchart blocks: Image Capture; Locate; Data Collection; Image Preprocessing; Analysis; Fatigue Modeling; Data Collection; Summation; Quick Sort; Threshold (P80); Statistic and Analysis; Calculate Time Proportion]
Figure 5. Flowchart of fatigue modeling.

[Figure 6: binarized eye region, dimension label 3/5]
Figure 6. Binarized region of concern.

After collecting sufficient data of black pixels from the eyes, the system can begin to establish the fatigue model as shown in Figure 7.

[Figure 7 flowchart blocks: Quick Sort; Threshold (P80); Statistic and Analysis; Calculate Time Proportion]
Figure 7. Flowchart of fatigue modeling.

Firstly, the two important states of the eyes, both open and closed, must be established. When the eyes are open, the amount of black pixels is at a maximum; conversely, when the eyes are closed, the black pixels are at a minimum.
The system uses the Quick Sort method again to confirm the distribution range of black pixels. Let there be N images; then:

$$O = \frac{\sum_{i=N-N/10}^{N-N/20} n_i}{N/20}, \qquad C = \frac{\sum_{i=N/20}^{N/10} n_i}{N/20}$$

where n is the sorted number of black pixels, in small to large order, and i is the sorted image number. The index limits set aside the largest and smallest 1/20 of the data. The average of the largest 1/20 of the rest of the data is chosen and marked as O, which means the open state. Likewise, C, the closed state, is defined by the average of the smallest 1/20 of the rest of the data.

Threshold – PERCLOS: P80
PERCLOS is the abbreviation for the percentage of eyelid closure over the pupil over time. PERCLOS is currently the most promising real-time measure of alertness for drowsiness detection [7, 8].
In this study, P80 is taken as a very important criterion of a driver’s physical state.
PERCLOS is measured over a fixed period of time. P80 can be generally defined by the eyes state which is determined during the early driving period. This can be regarded as a threshold, which separates the eyes state into two cases: closed more than 80% or not.

$$\mathrm{Threshold} = (O - C) \times 0.2 + C$$

If the amount of black pixels in an image is less than Threshold, which means that in this image the eyes are under the P80 situation, the image is defined as CLOSED; otherwise, the eyes are open more than twenty per cent, and the image is therefore defined as OPEN.

Time Proportion
Every person has a unique frequency and speed when blinking. In order to build a personal fatigue model, the time proportion of this person’s blink frequency and speed must be computed.
The next step is to use the personal blinking information to determine the proportion of time during which the eyelids are closed more than 80%. In order to ensure reliable results, the system requires several minutes of image capturing, the images of which are then separated into either the CLOSED or OPEN state.

$$TP = \frac{N_C}{N_C + N_O}$$

where TP is the personal standard time proportion of P80, while N_C is the number of images in the CLOSED state, and N_O is the number of images in the OPEN state.
As the time proportion becomes larger, the system recognizes this as the driver closing their eyes more, which in turn indicates that the driver is becoming fatigued.

2.3: Drowsiness Detection
After the model is built, the system starts to detect driver drowsiness.
The system separates each image into either the CLOSED or OPEN state, and continues computing the time proportion. Once

the time proportion is higher than the standard, TP, the fatigue level is increased.
Sometimes, people briefly close their eyes for one second or less, but they do not actually feel tired. As a result, the system does not sound the alarm immediately when the proportion increases. Further, if the driver’s eyelid closure proportion returns to the standard, TP, the fatigue level is set back to the normal condition. However, if the fatigue level keeps increasing, the system will sound the alarm to alert the driver of their fatigued state. Additionally, the sensitivity of the alarm standard can be adjusted according to the user.
As a final safety feature, if the driver closes their eyes for more than several seconds, the system will consider the driver drowsy and immediately sound the alarm.

3. Results
The proposed detection method has no negative impact on the driver and is contactless. Furthermore, this system uses grayscale images, which can significantly reduce the required memory space compared to a color image based system.
In general, an RGB image needs 24 bits for each pixel: 8 bits for each of the red, green, and blue channels. In other words, an RGB image is composed of three images. However, the system only needs 8 bits for each pixel, since the grayscale image consists of one image.
Therefore, the system demands only one-third of the memory needed when using a color image, as shown in Table 1.
Besides, using the eyes-and-cheek template, the region of concern can be found in a reasonable time and with highly acceptable accuracy, as listed in Table 2.

Table 1. Comparison of Required Memory.
Image    RGB        Grayscale
QVGA     225 KB     75 KB
VGA      900 KB     300 KB
SVGA     1.38 MB    0.46 MB
XGA      2.25 MB    0.75 MB

Table 2. Accuracy of Different Conditions.
Open Proportion    Without glasses    With glasses
Above 50%          92.67%             94.36%
Below 50%          72.66%             77.84%
0%                 84.29%             86.35%

4. Conclusion
This study proposed a real-time, gray-scale simulation system to detect driver drowsiness by image processing. In testing and results, and based on the fatigue model, the system can help monitor the drivers’ physical state, and remind drivers if they are tired, which they themselves may not have noticed. The biggest difference between commercially available products and the system proposed in this study is the use of gray-scale images, which means that detection of skin color is not required. Although the proposed system features additional calculation steps, it requires less memory and could be applied in different environmental conditions. For example, it could be used even when the driver is wearing glasses or a respiratory mask.

Acknowledgments
The authors gratefully acknowledge the support of the Ministry of Science and Technology of Taiwan through the grants MOST-103-2221-E-006-186-MY2 and MOST-102-2221-E-006-207-MY2.

References
[1] Lal SKL, Craig A. A critical review of the psychophysiology of driver fatigue. Biological psychology 2001; 55(3): 173-194.
[2] Abate AF, Nappi M, Riccio D, Sabatino G. 2D and 3D face recognition: a survey. Pattern recognition letters 2007; 28: 1885-906.
[3] Kim M, Lee D, Kim KY. System Architecture for Real-Time Face Detection on Analog Video Camera. International Journal of Distributed Sensor Networks 2015: Article ID 251386, 11 pages.
[4] Liu YH, Ting Y, Shyu SS, Chen CK, Lee CL, Jeng MD. A Support Vector Data Description Committee for Face Detection. Mathematical Problems in Engineering 2014: Article ID 478482, 9 pages.
[5] Kong SG, Heo J, Abidi BR, Paik J, Abidi MA. Recent advances in visual and infrared face recognition: a review. Computer vision and image understanding 2005; 97(1): 103-35.
[6] Zhang B, Zhang L, Zhang D, Shen L. Directional binary code with application to PolyU near-infrared face database. Pattern recognition letters 2010; 31(14): 2337-44.
[7] Dinges DF, Grace R. PERCLOS: a valid psychophysiological measure of alertness as assessed by psychomotor vigilance. USA Department of Transportation, Federal Highway Administration 1998.
[8] Trutschel U, Sirois B, Sommer D, Golz M, Edwards D. PERCLOS: an alertness measure of the past. International driving symposium on human factors in driver assessment, training and vehicle design 2011; 6: 172-9.
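As a supplementary illustration of the fatigue model in Section 2.2, the open and closed reference levels, the P80 threshold, and the time proportion TP might be computed as sketched below. This is a sketch under stated assumptions rather than the authors' code: the function names are hypothetical, Python's built-in `sorted` stands in for Quicksort, and half-open index ranges are assumed so that each average covers exactly N/20 samples (the paper does not spell out the index convention).

```python
def open_closed_levels(black_counts):
    """Return (O, C): reference black-pixel counts for open and closed eyes."""
    n_sorted = sorted(black_counts)   # small to large; stands in for Quicksort
    total = len(n_sorted)
    k = total // 20                   # N/20 samples per average
    if k == 0:
        raise ValueError("need at least 20 samples")
    # Skip the k smallest and k largest samples, then average the next k.
    closed_level = sum(n_sorted[k:2 * k]) / k
    open_level = sum(n_sorted[total - 2 * k:total - k]) / k
    return open_level, closed_level


def p80_threshold(open_level, closed_level):
    """Threshold = (O - C) * 0.2 + C, as in the paper."""
    return (open_level - closed_level) * 0.2 + closed_level


def time_proportion(black_counts, threshold):
    """TP = N_C / (N_C + N_O); an image is CLOSED if its count is below threshold."""
    closed_images = sum(1 for count in black_counts if count < threshold)
    return closed_images / len(black_counts)


# Toy calibration data: 40 frames with black-pixel counts 1..40.
samples = list(range(1, 41))
O, C = open_closed_levels(samples)
threshold = p80_threshold(O, C)
tp = time_proportion(samples, threshold)
print(O, C, threshold, tp)
```

During detection, the same classification would be applied to new frames and the resulting proportion compared against the personal standard TP obtained here.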
