
2020 17th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON)

Real-Time Eye State Detection System for Driver Drowsiness Using Convolutional Neural Network
1st Wissarut Kongcharoen
Department of Information Technology
King Mongkut's University of Technology North Bangkok
Bangkok, Thailand
firstwantup@gmail.com

2nd Siranee Nuchitprasitchai
Department of Information Technology
King Mongkut's University of Technology North Bangkok
Bangkok, Thailand
siranee.n@it.kmutnb.ac.th

3rd Yuenyong Nilsiam
Department of Information Technology
King Mongkut's University of Technology North Bangkok
Bangkok, Thailand
yuenyong.n@eng.kmutnb.ac.th

4th Joshua M. Pearce
Department of Materials Science & Engineering and Department of Electrical & Computer Engineering
Michigan Technological University
Houghton, MI, USA
pearce@mtu.edu

Abstract— One of the top reasons for road accidents that result in injuries and deaths is drivers dozing off. In this study, eye tracking using a novel open source Internet of Things (IoT)-based system has been developed. This study evaluated three driver eye recognition algorithms to be integrated into the open source solution to wake drivers as they begin to doze off: 1) Convolutional Neural Network (CNN) with Haar Cascade, 2) 68 facial landmark points and 3) gaze detection, in three different face positions, for both day and night driving conditions, and with and without glasses. Each combination of those factors is tested 100 times. The best algorithm is chosen based on the number of correct detections, and this algorithm is then tested again based on light (day and night), angle of face (left, right, and center), angle of camera (left and right), and glasses (on and off) to detect both blinking and closed eyes. The results show that the most accurate algorithm for detecting a driver's eyes is CNN with Haar Cascade, with 94% accuracy. The system can detect the status of the driver's eyes during driving, and if the driver's eyes stay closed for longer than two seconds, it sounds an alert to wake the driver and avoid an accident. The proposed open source system costs about US$100 and could be widely deployed to help reduce accidents on the road throughout the world.

Keywords— Drowsiness Detection Algorithm, Eye Blinking, Convolutional Neural Network, Haar Cascade, Automobile Safety, Traffic Fatalities.

I. INTRODUCTION
Thailand's roads are among the most dangerous in the world, and the government vowed at a United Nations (UN) forum in 2015 to halve the number of road traffic deaths by 2020 [1,2]. The World Health Organization (WHO) reports that Thailand is far from achieving its 2020 goal of bringing road deaths down to fewer than 20 for every 100,000 people [3]. Traffic-related fatalities have been a concern for a number of years, and significant analysis to find interventions has been undertaken [4-8]. Unfortunately, road accidents still account for most deaths in Thailand (particularly among the poor), with more than 20,000 preventable fatalities each year [9]. In 2019 there were 20,169 deaths [10]. An accident study conducted by the Department of Transportation of Bangkok and Provincial Department of Transportation offices around the country during February 2019 shows that accidents are often caused by the behavior of drivers (47 times) [11]. One of the top causes of accidents that result in injuries and deaths is drivers dozing off (10 out of 47 times) [11]. The symptoms of drowsiness during driving are: 1) yawning, 2) distraction, 3) tiredness and worry, 4) being unable to remember the last few kilometers driven, 5) being unable to keep the eyes open, 6) dizziness or giddiness, 7) swerving between lanes and 8) missing traffic signs [12].

In order to overcome these challenges, this study evaluates techniques and systems to prevent drowsiness during driving. Systems have been developed to monitor the driver for drowsiness or dozing off while on the road, including electroencephalogram (EEG), electrocardiography (ECG), and eye status tracking [13]. In this study, eye tracking was selected as the best option because no equipment needs to be attached to the driver. A novel open source Internet of Things (IoT)-based system has been developed and is disclosed and evaluated here. The system detects whether the driver's eyes are open or closed, uses the eye status to indicate whether the driver is drowsy or distracted, and then alerts the driver with a sound. First, the face detection and face recognition literature is briefly reviewed in the context of the IoT. Next, several of the most popular algorithms are tested with different controlled factors, which include light (day and night) and angle of face (left, right, and center), to detect both blinking and closed eyes. Each combination of those factors is tested 100 times. The best algorithm is chosen based on the number of correct detections and is then tested again based on light (day and night), angle of face (left, right, and center), angle of camera (left and right), and glasses (on and off) to detect both blinking and closed eyes. Next, open source software is developed to prevent drowsy driving accidents, which can use three algorithms. This study evaluated three driver eye recognition algorithms: 1) Convolutional Neural Network (CNN) with Haar Cascade, 2) 68 facial landmark points and 3) gaze detection, in three different face positions, for both day and night driving conditions, and with and without glasses. The results of this study are discussed in the context of the proposed system being able to help reduce accidents on the road in Thailand and other regions.

978-1-7281-6486-1/20/$31.00 ©2020 IEEE
II. LITERATURE REVIEW
Face detection and face recognition [13] have been used for many applications, such as security. Face detection is the process of finding a human face in images or video footage, and face recognition is the subsequent process of comparing the detected faces with the faces in a database to identify the person. Extracting the features of the face is a crucial part of face recognition and face analysis. Many techniques have been studied: Convolutional Neural Network (CNN) [14], Facial Landmark Detection [15], the Viola-Jones Method [16, 17] and others. Regardless of the method, it is important that any device follows the Internet of Things and Thailand 4.0 policy [18]. The National Broadcasting and Telecommunications Commission (NBTC) has been working with many parties to bring together the different standards of communication between devices so that they are able to communicate with each other [18]. There have also been many attempts to use these techniques to detect drowsiness. First, Vivekanandan et al. [19] proposed a machine learning model for drowsiness detection. Even though the detection results were accurate, the computation was done on a personal computer running the Windows operating system, which is not practical for the application of this study. Next, Danisman et al. [20] detected drowsy drivers using eye blinking patterns. The accuracy of the system is 94%, which is very high. However, the system analyzes the patterns in video footage, which consumes substantial computing resources and is sensitive to external light. Irtija et al. [21] built a system to detect driver fatigue using facial landmarks. That system also consumes substantial resources because of its image processing technique.

Next, Tiwari et al. [22] implemented an IoT-based driver drowsiness detection and health monitoring system. The system analyzes the status of the driver's eyes and also monitors heart rate, temperature, alcohol gas in the car, and the location of the car. All information is sent to the cloud. This is particularly interesting as it could also be used to reduce drunk driving. Finally, Rawal [23], Joshi et al. [24], Rajneesh et al. [25], Sooksatra et al. [26], Said et al. [27], and Jayswal et al. [28] have developed systems that detect driver drowsiness and alert the driver to prevent accidents. These prior studies are based on the Haar Cascade algorithm, whereas this research proposes and tests a Convolutional Neural Network (CNN) combined with Haar Cascade, which has the potential to both reduce the processing time and improve the accuracy.

III. METHODOLOGY
Three approaches are taken to monitor drowsiness and prevent driving accidents using eye recognition algorithms: 1) Convolutional Neural Network (CNN) with Haar Cascade, 2) 68 facial landmark points and 3) gaze detection.

A. Convolutional Neural Network (CNN)
A Convolutional Neural Network (CNN), as shown in Figure 1, is a kind of neural network used to analyze images. In this research, it was used to classify images of drivers by the status of their eyes: open or closed.

Fig. 1. Convolutional Neural Network

As an example of a convolutional layer, consider an input that is a 5x5x3 array of pixel values. A filter (also called a neuron or kernel) covers a 3x3x3 array of weights or parameters. The depth of the filter has to be the same as the depth of the input. The first filter position is the top left corner. As the filter slides, or convolves, across the input image by 1 pixel per stride (or more than 1 pixel per stride), an element-wise multiplication is applied between the values in the filter and the original pixel values of the image. The region of the input covered by the filter is called the receptive field, and the products are summed into a single number, which becomes one entry of the first hidden layer, the feature map. Here the feature map is a 3x3x1 array. More than one filter can be used in a convolutional layer; with 16 filters, the feature map has size 3x3x16. The feature map is smaller than the input, and padding the boundary of the input with zeros increases the size of the feature map. The Rectified Linear Unit (ReLU) is a piecewise linear activation function: the output is zero when the input is not positive, and when the input is positive the slope is always equal to one, as shown in Equations (1) and (2). Max pooling is the most common pooling layer used after the convolutional layer; it reduces the size of the feature map only in width and height, not in depth.

\[ f(x) = \max(0, x) = \begin{cases} 0, & x \le 0 \\ x, & x > 0 \end{cases} \tag{1} \]

\[ f'(x) = \begin{cases} 0, & x \le 0 \\ 1, & x > 0 \end{cases} \tag{2} \]

The model was developed using 104,559 pictures of eyes from the Media Research Lab [29]. Of the images, 52,727 open-eye and 48,903 closed-eye pictures are from the Media Research Lab, and 10,355 open-eye and 8,572 closed-eye pictures were taken by the authors. The ratio of training data to test data is 80 to 20.

B. 68 Facial landmarks points
The 68 facial landmark points are used to estimate the location of 68 (x,y)-coordinates that represent the eyes, eyebrows, nose, mouth, and jawline on the human face. To detect an eye blink in an image, a face first needs to be localized using Histogram of Oriented Gradients (HOG) and Linear Support Vector Machine (SVM) object detectors [30]. Histogram of Oriented Gradients (HOG) [31] is a feature descriptor built from histograms of the oriented gradient information of an image. The magnitude and direction of the gradient are used to create a histogram. The histogram of gradients is divided into nine bins between 0 and 180 degrees. The direction of the gradient determines which bin is selected, and the magnitude of the gradient is the amount added to that bin. Linear Support Vector Machine (SVM) [32] is a supervised learning technique for analyzing data for classification. A linear SVM categorizes data into two classes.

C. Eye gaze detection
Eye gaze detection (EGD) locates the eye positions on the human face and analyzes eye blinking. The right and left eye coordinates correspond to the 68 facial landmark points with indexes 37 to 42 and 43 to 48, respectively. These feature points are used to detect the edges of the eyelids and then calculate the distance between the top and bottom edges of each eye (left and right). These distances are used to determine whether a driver's eyes are blinking or closed.
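The eyelid-distance idea in Section III-C can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `closed_below` pixel threshold, and the choice of averaging two lid-to-lid distances per eye are assumptions made for illustration. Only the landmark indexes (37 to 42 and 43 to 48, shifted by one for 0-based Python lists) and the two-second limit come from the paper.

```python
import math

# Landmarks 37-42 and 43-48 in the 1-based numbering used in the text;
# Python sequences are 0-based, so each index is shifted down by one.
RIGHT_EYE = slice(36, 42)
LEFT_EYE = slice(42, 48)


def eyelid_distance(eye):
    """Mean distance between the upper and lower lid points of one eye.

    `eye` holds six (x, y) points in the 68-point order: corner,
    two upper-lid points, corner, two lower-lid points.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    return (dist(eye[1], eye[5]) + dist(eye[2], eye[4])) / 2.0


def eye_states(samples, closed_below, alarm_after=2.0):
    """Label each frame as "open", "blinking", or "closed".

    samples: iterable of (timestamp_seconds, landmarks), where landmarks
        is a sequence of 68 (x, y) points.
    closed_below: hypothetical distance threshold (pixels) under which
        the eyes count as shut; it would need tuning per camera setup.
    alarm_after: seconds the eyes may stay shut before the state becomes
        "closed" (two seconds, following the paper's alert rule).
    """
    closed_since = None
    states = []
    for timestamp, landmarks in samples:
        gap = (eyelid_distance(landmarks[RIGHT_EYE])
               + eyelid_distance(landmarks[LEFT_EYE])) / 2.0
        if gap >= closed_below:
            closed_since = None
            states.append("open")
            continue
        if closed_since is None:
            closed_since = timestamp
        if timestamp - closed_since > alarm_after:
            states.append("closed")  # the point at which a buzzer would sound
        else:
            states.append("blinking")
    return states
```

A short closure is labeled as a blink, and only a closure lasting longer than `alarm_after` is labeled as closed. A raw pixel distance depends on how far the face is from the camera, so a deployed system would likely normalize it, for example by eye width.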
Fig. 3. Examples of results of 68 facial landmark points algorithm.

D. Experiments
Using the approaches discussed in Sections III-A to III-C, an open source algorithm was coded in Python (3.5.6). The full code is available at: https://github.com/wissarutkong/cnn_eyedetection.git. The hardware used to conduct the experiments is shown in Table I. The experiments are run in two environments, daytime and nighttime, with three algorithms: 1) Convolutional Neural Network (CNN) with Haar Cascade, 2) 68 facial landmark points and 3) gaze detection. Each algorithm is tested with three different face positions: 1) the face is centered and straight (C), 2) the face is turned 45 degrees to the left (L), and 3) the face is turned 45 degrees to the right (R). The 45-degree turn imitates a driver looking at the side mirrors. Finally, the experiments were repeated with and without glasses. Each experiment is repeated 100 times.
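The experimental design described above can be written out as a grid of conditions. The sketch below assumes the four factors are fully crossed, which is one reading of the description. The identifier names are hypothetical and are not taken from the authors' repository.

```python
from itertools import product

# Factors named in the experiment description; labels are illustrative.
LIGHTING = ("day", "night")
ALGORITHMS = ("cnn_haar_cascade", "facial_landmarks_68", "eye_gaze_detection")
FACE_POSITIONS = ("C", "L", "R")  # center, 45 degrees left, 45 degrees right
GLASSES = (False, True)
TRIALS_PER_CONDITION = 100


def experiment_conditions():
    """Return every (lighting, algorithm, face position, glasses) combination."""
    return list(product(LIGHTING, ALGORITHMS, FACE_POSITIONS, GLASSES))


def accuracy(correct_detections, trials=TRIALS_PER_CONDITION):
    """Fraction of trials in which the eye state was detected correctly."""
    return correct_detections / trials


conditions = experiment_conditions()
print(len(conditions), "conditions,",
      len(conditions) * TRIALS_PER_CONDITION, "trials")
```

Under the fully crossed assumption this gives 36 conditions and 3,600 trials in total, and a count such as 94 correct detections out of 100 corresponds to `accuracy(94) == 0.94`.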

Fig. 4. Examples of results of eye gaze detection algorithm.

TABLE I. BILL OF MATERIALS. ALL MATERIALS WERE SOURCED FROM ARDUITRONICS.

Item                                                              Number   Cost (Baht)
Raspberry Pi 4 Model B                                            1        1,980
Raspberry Pi NoIR Camera Module Infra-red Sensitive Camera V2     1        1,150
Infrared Night Vision Unit (LED 1 Watt)                           1        89.50
Active Buzzer module (3.3 – 5 V)                                  1        35
Jumper Female to Female (40 x 10 cm)                              1        45
Total                                                                      3,299.50
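The total in Table I can be checked with a few lines of arithmetic. The exchange rate below is an assumption: roughly 31.4 baht per US dollar, which is the rate implied by the camera price the paper quotes later (1,883.69 baht for $59.99 USD). The dictionary keys are shortened item names chosen for illustration.

```python
# Table I as item -> (quantity, unit cost in baht).
BILL_OF_MATERIALS = {
    "Raspberry Pi 4 Model B": (1, 1980.00),
    "Raspberry Pi NoIR Camera Module V2": (1, 1150.00),
    "Infrared Night Vision Unit (LED 1 Watt)": (1, 89.50),
    "Active Buzzer module": (1, 35.00),
    "Jumper Female to Female": (1, 45.00),
}

# Assumed rate, inferred from the paper's own baht/USD price pair.
ASSUMED_BAHT_PER_USD = 31.4


def total_cost_baht(bom):
    """Sum quantity times unit cost over all items."""
    return sum(quantity * cost for quantity, cost in bom.values())


total = total_cost_baht(BILL_OF_MATERIALS)
print("Total: {:,.2f} baht (about ${:.0f} USD)".format(
    total, total / ASSUMED_BAHT_PER_USD))
```

The sum reproduces the 3,299.50 baht total in the table, and the assumed rate gives about $105 USD, consistent with the figure quoted in the results.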

Fig. 5. Real-Time Eye State Detection (RTESD) System Installation.

IV. RESULTS
Some examples of the outputs of the system are shown for the Convolutional Neural Network (CNN) with Haar Cascade (see Fig. 2), 68 facial landmark points (see Fig. 3), and eye gaze detection (see Fig. 4). The 68 facial landmark points and eye gaze detection algorithms are able to detect the status of the eyes (open or closed). However, the performance of the CNN with Haar Cascade algorithm is significantly better than that of the other two algorithms. Performance is measured as the number of correct detections of blinking and closed eyes out of 100 trials. The eye blinking detection results are shown in Table II. Table III shows that the best position for detecting the status of the eyes is with the face centered and straight, for almost all cases: with glasses, without glasses, camera installed on the left, and camera installed on the right (see Fig. 5).

Fig. 2. Examples of results of CNN with Haar Cascade algorithm.

TABLE II. COMPARISON OF THE EYE BLINKING DETECTION ALGORITHMS

TABLE III. ALGORITHM COMPARISON BY SYSTEM LOCATION

Based on the results in this section, a system architecture is proposed (see Fig. 6) using a Raspberry Pi 4 and a 4G/LTE shield, which sends data to a Linux server running Python 3. The buzzer module for the Raspberry Pi is used to alert and wake the driver. The cost of the system is about 3,299.50 baht (around $105 USD). This makes it accessible both to after-market installations by third-party vendors and to automobile manufacturers, for whom buying the components in bulk would reduce the cost to well under $100 USD. Python is a popular programming language because it is easy to learn and is applicable for everyone and a wide range of industries. A large number of people around the world have contributed their time and effort to develop new libraries in most fields. It is an open source programming language, so everyone can use and improve the existing libraries as needed. Python will continue to grow, and the quality of the libraries will improve as it is used by additional developers. Overall, the results of this study were extremely promising. More advanced cameras may help solve some of the challenges uncovered if they have better resolution and are able to handle low-light situations. For example, the Arducam 16MP Pi Camera 4K (IMX298) costs about 1,883.69 baht ($59.99 USD) [33] and could be used in future work.

Fig. 6. Proposed Real-Time Eye State Detection (RTESD) Architecture.

V. CONCLUSIONS
This paper has shown that the proposed system, the Real-Time Eye State Detection System Using Convolutional Neural Network with Driver Drowsiness Alert, was implemented successfully. The system can detect the status of the driver's eyes during driving. When the driver's eyes stay closed for longer than two seconds, the alert system notifies the driver with a sound. There are some false alarms and misses due to the environment around the driver, such as glasses and external light. For future work, these issues should be taken into account and resolved.

REFERENCES
[1] Beech, H. Why are Thailand's roads among the deadliest in the world? (2019, September 3). The Independent. https://www.independent.co.uk/news/long_reads/thailand-roads-deadly-traffic-accidents-class-inequality-a9071696.html
[2] Initiative to improve Road safety in Thailand. (2015). http://www.uncrd.or.jp/content/documents/3489Thailand%20Initiative%20-.pdf
[3] Road accidents still account for most deaths in Thailand, report shows. (2019). https://www.nationthailand.com. Retrieved January 29, 2020, from https://www.nationthailand.com/news/30373606
[4] Tanaboriboon, Y., & Satiennam, T. (2005). Traffic accidents in Thailand. IATSS Research, 29(1), 88-100.
[5] Nakahara, S., Chadbunchachai, W., Ichikawa, M., Tipsuntornsak, N., & Wakai, S. (2005). Temporal distribution of motorcyclist injuries and risk of fatalities in relation to age, helmet use, and riding while intoxicated in Khon Kaen, Thailand. Accident Analysis & Prevention, 37(5), 833-842.
[6] Kumphong, J., Satiennam, T., & Satiennam, W. (2016). A correlation of traffic accident fatalities, speed enforcement and the gross national income of Thailand and its cross-border countries. International Journal of Technology, 7(7), 1141-1146.
[7] Permpoonwiwat, C. K., & Kotrajaras, P. (2012). Pooled time-series analysis on traffic fatalities in Thailand. World Review of Business Research, 2(6), 170-182.
[8] Suriyawongpaisal, P., & Kanchanasut, S. (2003). Road traffic injuries in Thailand: trends, selected underlying determinants and status of intervention. Injury Control and Safety Promotion, 10(1-2), 95-104.
[9] Thaiger, T., & Nation, T. (2019, July 25). Road incidents still the biggest killer in Thailand. The Thaiger. https://thethaiger.com/hot-news/road-deaths/road-incidents-still-the-biggest-killer-in-thailand
[10] Permpoonwiwat, C. K., & Kotrajaras, P. (2012). Pooled time-series analysis on traffic fatalities in Thailand. World Review of Business Research, 2(6), 170-182.
[11] Statistic of public transportation and truck, https://www.dlt.go.th/site/ltsb/m-news/3937/view.php?_did=20224, last accessed 2019/09/05.
[12] How to deal with drowsy during driving, https://www.thaihealth.or.th/Content/40214-วิธีปฏิบัติหาก "ง่วง" ขณะขับรถ.html, last accessed 2019/09/14.
[13] Fazli, S., & Esfehani, P. (2012, March). Tracking eye state for fatigue detection. In International Conference on Advances in Computer and Electrical Engineering (ICACEE 2012) (pp. 17-20).
[14] Convolutional Neural Network (CNN), https://blog.datawow.io/มาลองดูวิธีคิดของ-cnn-กัน-e3f5d73eebaa, last accessed 2019/12/15.
[15] Facial Landmarks with dlib, OpenCV, and Python, https://www.pyimagesearch.com/2017/04/03/facial-landmarks-dlib-opencv-python/, last accessed 2019/09/14.
[16] Face Detection, https://pongpich-v.blogspot.com/2018/11/blog-post.html, last accessed 2019/09/14.
[17] Viewer Tally from Visual and Depth Image, https://isl2-dev.cp.eng.chula.ac.th/sites/default/files/Senior%20Project%20Proposal.docx, last accessed 2019/09/14.
[18] Internet of Things and Thailand 4.0 Policy, http://www.nbtc.go.th/getattachment/Services/quarter2560/ปี-2561/32279/เอกสารแนบ.pdf.aspx, last accessed 2019/09/05.
[19] Ranjith, V. et al.: Machine Learning Model for Drowsiness Detection. International Journal of Research in Engineering and Management 1(1), 334-338 (2019).
[20] Danisman, T., Bilasco, I. M., Djeraba, C., & Ihaddadene, N.: Drowsy driver detection system using eye blink patterns. In: 2010 International Conference on Machine and Web Intelligence, pp. 230-233, IEEE (2010).
[21] Irtija, N., Sami, M., & Ahad, M. A. R.: Fatigue Detection Using Facial Landmarks. In: International Symposium on Affective Science and Engineering ISASE2018, pp. 1-6, Japan Society of Kansei Engineering (2018).
[22] Tiwari et al.: IOT Based Driver Drowsiness Detection and Health Monitoring System. International Journal of Research and Analytical Review 6(2), 163i-167i (2019).
[23] Rawal, R. S., & Nagtilak, S. S.: Drowsiness Detection Using RASPBERRY-Pi Model Based On Image Processing (2016).
[24] Joshi, A., Gujrati, S., & Bhati, A. (2013). Eye state and head position technique for driver drowsiness detection. International Journal of Electronics and Computer Science Engineering, 2(3), 874-879.
[25] Goraya, A., & Singh, G. Real Time Drivers Drowsiness Detection and alert System by Measuring EAR. International Journal of Computer Applications, 975, 8887.
[26] Sooksatra, S., Kondo, T., & Bunnun, P. (2015, June). A robust method for drowsiness detection using distance and gradient vectors. In 2015 12th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON) (pp. 1-5). IEEE.
[27] Said, S., AlKork, S., Beyrouthy, T., Hassan, M., Abdellatif, O. E., & Abdraboo, M. F. Real Time Eye Tracking and Detection-A Driving Assistance System.
[28] Jayswal, A. S., & Modi, R. V. (2017). Face and eye detection techniques for driver drowsiness detection. Int. Research J. Engineering and Technology (IRJET), 4(04).
[29] MRL Eye Dataset, http://mrl.cs.vsb.cz/eyedataset, last accessed 2019/09/14.
[30] Facial landmarks with dlib, OpenCV, and Python, https://www.pyimagesearch.com/2017/04/03/facial-landmarks-dlib-opencv-python/, last accessed 2019/12/15.
[31] Histogram of Oriented Gradients, https://www.learnopencv.com/histogram-of-oriented-gradients/, last accessed 2019/12/15.
[32] Cortes, C., & Vapnik, V. N. (1995). Support-vector networks. Machine Learning, 20(3), 273-297.
