
2015 International Conference on Information Processing (ICIP)

Vishwakarma Institute of Technology. Dec 16-19, 2015

Human Computer Interaction For Disabled Using Eye Motion Tracking

Uma Sambrekar
Department of Instrumentation and Control
Cummins College of Engineering for Women
Pune, 411052, India
Email: uma.yoga.birje@gmail.com

Dipali Ramdasi
Department of Instrumentation and Control
Cummins College of Engineering for Women
Pune, 411052, India
Email: dipali.ramdasi@gmail.com

Abstract—Human Computer Interaction is a trending technology. Working in this field, we have developed a system that provides a solution for limb disabled people to interact with a computer. Different algorithms are implemented in this paper to estimate the gaze of the user and recognize the reference key. The first step performed on each input frame is face detection, which is achieved using the Viola-Jones algorithm, whereas the Circular Hough Transform is used to locate the pupil in the eye image. Similarly, the glint (a small and intense dot inside the pupil image) is detected by the blob analysis method. Using the pupil location and glint location, the gaze direction of the user is estimated. With these functions, the user is able to handle input devices of a computer, such as the keyboard, and the system helps to recognize the key to be pressed. Further, the type-in process is continued with the blinking phenomenon, which is detected using the template matching method. Currently the system is limited to a 4-key keypad. The system is also designed to open some window application softwares like Skype, media player, etc., which are useful to limb disabled people. The camera used in this application to access video is a Logitech HD 720p webcam (C310), and the software implementation is done in MATLAB. The system is tested under various environmental conditions, giving appreciable results and speed.

Keywords: Human Computer Interaction, Circular Hough Transform, Pupil Detection, Glint Detection, Gaze Estimation, Key Recognition, Blink Detection

I. INTRODUCTION

Nowadays the computer has become a part of everyone's life, as the convenience of using computers increases day by day. However, the use of a computer is restricted to those people who can easily handle computer peripherals like the keyboard, mouse, etc.; due to limb disability, handicapped people are kept away from it. As technologies change very often, it is necessary to develop a system that provides an alternative way for the disabled to interact with a computer. The gaze of the user's eye is one of the best options for human computer interaction (HCI). For the same reason, eye tracking has become an active research topic in various applications [1].

During this ongoing research, various non-invasive eye tracking techniques have been developed for HCI, based either on head movement alone, eye movement alone, or a combination of both. With the development of digital image processing, eye gaze tracking techniques have pursued a wide bandwidth for HCI.

Various techniques have been proposed to estimate the eye gaze. Zhiwei Zhu et al. proposed novel eye gaze tracking techniques [2] in which the gaze point is determined by intersecting the 3D gaze direction with the object. This 3D gaze direction is estimated by determining the angular deviation between the 3D optical axis of the eye (determined by assuming the cornea of the eye to be a convex mirror) and the visual axis. Even though this method allows natural head movement and calibration is required only once, the requirement of several cameras and IR illuminators makes the set-up complex and costly. Seung-Jin Baek et al. proposed an eyeball model-based iris center localization method for visible image-based eye-gaze tracking systems [3]. In this technique, the eyeball is assumed to be a spherical model to estimate the iris radius. The eyeball is then allowed to rotate, and the elliptical iris shapes along with their corresponding iris center locations are stored in a database. Finally, the iris center locations of incoming video frames are detected by matching them with the database. Even though accurate iris center localization is possible with this technique, the direct mapping of the iris center onto the target plane affects the performance of gaze tracking. Xindian Long et al. presented a high speed head mounted binocular eye tracking system that uses a symmetric mass center algorithm to detect the eye position [4]. Yuki Oyabu et al. proposed a novel eye input device using the eye movement derived from the relative position of the pupil, which is estimated from the characteristics of the unbroken pixel line [5]. There are some other systems like ERICA and Tobii, but most of the existing techniques involve complex hardware, low accuracy, and high cost. To overcome these limitations, a novel system for HCI is implemented in this paper. The system is implemented by concatenating several algorithms: the Viola-Jones algorithm for face and eye detection; the Circular Hough Transform (CHT) for accurate segmentation of the pupil/iris from the sclera; blob analysis for detection of the glint (a small and intense dot obtained because of the reflection of infrared light from the corneal surface); the Pupil Corneal Reflection (PCR) method to estimate the gaze of the user by determining the relationship between the pupil center and the glint; and the template matching method for blink detection.

The organization of the paper is as follows. Section II introduces the general system for HCI. Section III explains the implementation of the algorithm developed for HCI in detail. Section IV describes blink detection, Section V illustrates the results, and finally a conclusion is drawn in Section VI.
978-1-4673-7758-4/15/$31.00 © 2015 IEEE

Fig. 1. General Structure for HCI [6]

II. SYSTEM OVERVIEW

The system uses eye images to decide the gaze direction of the person. The general structure shown in Fig.1 describes how a human eye can interact with the computer. In the system, a 4-key keypad is created on the computer screen. Whenever the user looks at the key to be pressed, the position of the pupil center varies with respect to the reference. At that time, eye images are captured by the video camera and transmitted to the personal computer through a USB cable. The pupil corneal reflection method [1] can be used to recognize the position of the eye and to determine the gaze of the user. The image processing software developed on the personal computer compares the position of the user's eye with the reference keys present in the database to recognize the appropriate key. Furthermore, the typing process can be continued with the blink detection method.

The camera used to capture eye images can be either a laptop camera on which an infrared light source is mounted, to get better quality images, or a separate USB camera which can be mounted close to the eye.

III. ALGORITHM IMPLEMENTATION

The block diagram shown in Fig.2 illustrates the different steps required to implement the system for HCI. Each step is explained in detail below.

Fig. 2. Flow Diagram To Develop A System For HCI

A. Real Time Video Input / Video Processing

The input given to the system is video, which is acquired by the Logitech HD 720p webcam. The USB port of the camera is connected to the computer and the camera is mounted on the top of the computer screen. A limited size keyboard with 4 keys is created, as shown in Fig.3. Whenever the user looks at a particular key, eye images are captured and transmitted to the computer through the USB cable. The image processing software then accesses each frame and performs processing operations to estimate the gaze of the user.

Fig. 3. 4-Key Keypad
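As a concrete illustration, a minimal MATLAB acquisition sketch is given below. The adaptor and format strings, and the use of the Image Acquisition Toolbox, are assumptions; the paper does not state its exact acquisition settings.

```matlab
% Minimal frame-acquisition sketch (assumed adaptor/format strings).
vid = videoinput('winvideo', 1, 'MJPG_1280x720'); % Logitech C310 over USB
vid.ReturnedColorSpace = 'rgb';
frame = getsnapshot(vid);   % grab one frame; loop this for live video
imshow(frame);
```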

B. Face Detection / Eye Detection

As we are using a non-touch user interface approach, the video input device is located away from the user. Hence, in this case, the first step is face detection, after which eye detection is carried out. To detect the face and eyes, the Viola-Jones algorithm [7] is used. In the Viola-Jones algorithm, Haar-like features are used to detect the face. The three key points involved in this algorithm are: first, forming an integral image for fast feature evaluation; second, an efficient classifier, built by an AdaBoost learning algorithm, to select the most suitable Haar-like features for detecting faces; and third, cascading these classifiers to discard background regions quickly and consider only the face-like regions.

The input to the Viola-Jones algorithm is a grayscale image. It is difficult to evaluate features directly on the gray image, as it has varying shades of gray. Therefore the image is converted into an integral image, in which the value at any point is the summation of the pixels above and to the left of that point [7]. The advantage of the integral image formation is that it reduces the computation time while evaluating the Haar-like features.
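To make this computation concrete, the sketch below (base MATLAB; the image file name is an assumption) builds the integral image with two cumulative sums and evaluates one rectangular sum from four look-ups.

```matlab
% Integral image via two cumulative sums (a sketch; input name assumed).
I  = double(rgb2gray(imread('face.png')));  % assumed RGB input frame
ii = cumsum(cumsum(I, 1), 2);               % ii(r,c) = sum of I(1:r,1:c)

% Sum over any rectangle (rows r1..r2, cols c1..c2) in four look-ups:
r1 = 10; r2 = 33; c1 = 10; c2 = 33;         % example 24x24 window
rectSum = ii(r2,c2) - ii(r1-1,c2) - ii(r2,c1-1) + ii(r1-1,c1-1);
```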
There are approximately 160,000 Haar-like features available, but only a few of them are useful for detecting the face. Therefore an AdaBoost learning algorithm is used to select these suitable Haar-like features. The selected features form the strong classifiers, which are cascaded together.

Each strong classifier then checks whether its selected features are present or absent in the incoming input frame. If the first feature is present in the frame, the frame is passed to the next classifier; if it is absent, the frame is discarded at that stage as a non-face part. A frame with a minimum of 10 features present is considered a face. The same algorithm can be used to detect the eyes, but the features selected for this are different. The result for face detection and eye detection is shown in Fig.4.
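The detection step can be reproduced with MATLAB's built-in Viola-Jones implementation, as sketched below using the frame grabbed in the acquisition sketch above; the choice of the stock frontal-face and eye-pair models is an assumption, since the paper does not name its trained models.

```matlab
% Face and eye detection with MATLAB's Viola-Jones cascades (a sketch).
faceDet = vision.CascadeObjectDetector();             % frontal-face model
eyeDet  = vision.CascadeObjectDetector('EyePairBig'); % eye-pair model

faceBox = step(faceDet, frame);                       % [x y w h] per face
if ~isempty(faceBox)
    faceImg = imcrop(frame, faceBox(1, :));           % restrict eye search
    eyeBox  = step(eyeDet, faceImg);                  % eyes inside the face
end
```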

Fig. 4. Face Detection and Eye Detection

C. Pupil Localization for Gaze Estimation

To estimate the gaze of the user, the Pupil Corneal Reflection method is used. The method consists of pupil detection and glint detection. When a fraction of light enters the eye, it is reflected from the retinal surface and forms the pupil image, whereas a fraction of the light is reflected from the corneal surface itself, producing a small and intense dot in the pupil image called the glint. Further, by establishing the relationship between the pupil center and the glint, the gaze of the user is determined. To determine the gaze of the user, it is necessary to segment the pupil and glint areas accurately and precisely. A segmentation algorithm known as the Hough Transform (HT) is used to extract the pupil from the sclera. The inputs to the Hough transform are a smoothed image (obtained using a Gaussian filter) and a Canny edge detected image.

Fig. 5. Pupil Detection using Hough Transform

1) Hough Transform: The Hough Transform (HT) is a technique that locates shapes in images. It is used to extract lines, circles and ellipses (or conic sections). The implementation of the HT defines a mapping from the image points into an accumulator space (Hough space). As the pupil area is circular, here we use the Circular Hough Transform to locate the pupil in the eye image. The aim is to find the center co-ordinates of the pupil and its radius, to detect the circular shape of the pupil for gaze estimation [8] [9].

The HT can be extended by replacing the equation of the curve in the detection process. The equation of the curve can be given in explicit or parametric form. In explicit form, the HT can be defined by considering the equation for a circle, given by

(x − x0)² + (y − y0)² = r²    (1)

This equation defines the locus of points (x, y) centered on an origin (x0, y0) with radius r. The equation can be visualised in two dual ways: as a locus of points (x, y) in an image, or as a locus of points (x0, y0) centered at (x, y) with radius r.

Fig. 6. Hough Transform Mapping [10]

Fig.6 illustrates this dual definition. Each edge point defines a set of circles in the accumulator space. All these circles are defined by all possible values of the radius and are centered on the co-ordinates of the edge point. Fig.6(a) shows a circle to be detected in the x-y plane. Fig.6(b) shows three circles defined by three edge points, for a given radius value. In fact, each edge point defines circles for the other values of the radius as well. This implies that the accumulator space is three dimensional (for the three parameters of interest) and that edge points map to a cone of votes in the accumulator space. Fig.6(c) illustrates this accumulator mapping. After gathering evidence from all the edge points, the maximum in the accumulator space corresponds to the parameters of the circle in the original image. The procedure of evidence gathering is the same as that of the HT for lines, but votes are generated in cones, according to Eq.(1) of the circle.

The equation of a circle can also be defined in parametric form as

x = x0 + r cos θ    (2)

y = y0 + r sin θ    (3)

The advantage of this representation is that it allows us to solve for the parameters. These equations define the points in the accumulator space dependent on the radius r.

θ is not a free parameter, but defines the trace of the curve. The trace of the curve (or surface) is commonly referred to as the point spread function.

The input to the Hough transform algorithm is generally an edge detected eye image. The CHT normally needs more time if the circle search space is larger. To reduce the search space, the maximum and minimum values of the iris radius are initially set to establish the search parameters. In this application, in calibration mode, the minimum value of the search radius is considered as 40% of the height of the eye image and the maximum as 60%. Once the average radius is found in calibration mode, the search window radius is restricted to +/− 3 pixels from the average radius.

In Hough transform mapping, an empty 3-D accumulator array is usually formed to save the weights of the edge points. If an edge point satisfies the circle equation for a specific dimension, the weight of the corresponding center point in the Hough array is incremented. The number of edge points on a circle with specific dimensions (center (x, y) and radius r) is thus counted by the value of the corresponding point in the accumulator array, and the output is saved in a matrix containing [x co-ordinate, y co-ordinate, radius, count (no. of edge pixels on that circle)]. The circle with the maximum count is then considered the prominent circle, and its position and radius are assigned to the pupil center and radius.
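A compact MATLAB sketch of this evidence-gathering step is shown below; the image file name and helper variables are assumptions, and duplicate votes within one edge point are counted once, which is adequate for a sketch.

```matlab
% Circle voting in a 3-D accumulator (a sketch with assumed names).
eyeGray = rgb2gray(imread('eye.png'));       % cropped eye image (assumed)
rMin = round(0.4 * size(eyeGray, 1));        % 40% of eye-image height
rMax = round(0.6 * size(eyeGray, 1));        % 60% of eye-image height

bw = edge(imgaussfilt(eyeGray, 2), 'canny'); % smoothed + Canny edge map
[ey, ex] = find(bw);                         % edge point co-ordinates
radii = rMin:rMax;
acc = zeros([size(bw), numel(radii)]);       % empty 3-D accumulator

theta = linspace(0, 2*pi, 90);
for k = 1:numel(radii)                       % vote using Eq.(2)-(3)
    for p = 1:numel(ex)
        x0 = round(ex(p) - radii(k)*cos(theta));
        y0 = round(ey(p) - radii(k)*sin(theta));
        ok = x0 >= 1 & x0 <= size(bw,2) & y0 >= 1 & y0 <= size(bw,1);
        idx = sub2ind(size(acc), y0(ok), x0(ok), repmat(k,1,nnz(ok)));
        acc(idx) = acc(idx) + 1;             % increment center weights
    end
end

[count, best] = max(acc(:));                 % circle with maximum count
[cy, cx, ck] = ind2sub(size(acc), best);
pupil = [cx, cy, radii(ck), count];          % [x, y, radius, count]
```

In practice, the Image Processing Toolbox function imfindcircles(eyeGray, [rMin rMax]) performs the same Circular Hough Transform search far more efficiently.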
2) Blob Analysis: Blob analysis is an object detection method. In this case it is used to detect the glint in the eye image. The input to the blob analysis method is a binary image. A blob is a connected region of white pixels (pixels with value '1') in the binary image. These blobs are detected using a connected component analysis algorithm. Since the glint is a bright, intense, small object present in the pupil region, a pixel with value '1' is searched for in the cropped binary image. Once a pixel with value '1' is identified, the algorithm searches for white pixels around it using the 8-connectivity concept [11] (i.e., it searches for white pixels from the current pixel in 8 directions). If the neighbouring pixels are white, they are considered part of the blob; otherwise the search chain is broken. The procedure is continued for all white pixels and thus the blobs are detected.

MATLAB provides a blob analysis function that detects all the available blobs in a binary image, along with their centroids and the area covered by each blob (the number of pixels in the blob, i.e., its size). From these blobs, the glint is chosen by satisfying two criteria: first, blobs connected to the image edges are not considered; and second, the blob with the maximum count of white pixels is considered as the glint.
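A possible realization of this selection with standard toolbox functions is sketched below; the threshold value and variable names are assumptions (the blob analysis function mentioned above corresponds to vision.BlobAnalysis or regionprops).

```matlab
% Glint selection from the binary eye crop (a sketch with assumed names).
pupilBW = imbinarize(eyeGray, 0.9);     % keep only the brightest pixels
pupilBW = imclearborder(pupilBW);       % criterion 1: drop edge-connected blobs
stats = regionprops(pupilBW, 'Area', 'Centroid');  % connected components
if ~isempty(stats)
    [~, biggest] = max([stats.Area]);   % criterion 2: max white-pixel count
    glint = stats(biggest).Centroid;    % [x, y] glint location
end
```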
Once the glint and pupil center positions are determined, the gaze of the user can be estimated by calibrating the current glint location with respect to the pupil center. A marker is created to show the eye movement on the keypad window. Its position is estimated by scaling the glint position in the pupil image to the corresponding position in the keyboard image. When the user stares at the key to be pressed, the marker moves towards the respective key, and the key can then be recognized. Further, the key type-in can proceed with the blink detection method.
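The scaling itself can be as simple as the linear map sketched below, building on the earlier sketches; all gains, offsets, and window dimensions here are assumed values that would come from the calibration step.

```matlab
% Linear scaling of the glint offset to keypad co-ordinates (assumed values).
kbWidth = 800; kbHeight = 600;        % keypad window size in pixels
gx = 40; gy = 40;                     % calibration gains (assumed)
dx0 = 0; dy0 = 0;                     % offset while gazing at the center

dx = glint(1) - pupil(1);             % horizontal glint-to-pupil offset
dy = glint(2) - pupil(2);             % vertical glint-to-pupil offset
markerX = kbWidth/2  + gx*(dx - dx0); % marker position on keypad window
markerY = kbHeight/2 + gy*(dy - dy0);
```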
IV. BLINK DETECTION

A blink is a natural action performed by the eye in any human being. By utilizing this action, the key type-in process and the opening of application software are implemented with this algorithm. The template matching method is one of the most common blink detection methods [12], because of its high accuracy and flexibility. In this method, before segmentation of the pupil, two template patterns are created: an open-eye template and a close-eye template. For the open-eye template pattern, four further templates are created for left gaze, right gaze, up gaze and down gaze, according to the key arrangement in the GUI. Then, by matching each real time input frame with the created templates as shown in Fig.7, blink detection is confirmed. That is, if the frame matches the open-eye template it is not a blink, and if it matches the close-eye template it is considered a blink. This action allows the respective key to be typed in the editing window as well as in Notepad. The results are shown in the next section.

Fig. 7. Template Matching
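One way to realize the matching step (a sketch, not the authors' exact code) is normalized cross-correlation against the stored templates; the file names below are assumptions.

```matlab
% Blink test by template matching (a sketch with assumed file names).
openT  = rgb2gray(imread('open_eye.png'));   % stored open-eye template
closeT = rgb2gray(imread('close_eye.png'));  % stored close-eye template
frame  = rgb2gray(imread('frame.png'));      % current cropped eye frame

cOpen  = normxcorr2(openT,  frame);          % normalized cross-correlation
cClose = normxcorr2(closeT, frame);
isBlink = max(cClose(:)) > max(cOpen(:));    % closed template fits better
```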

V. RESULTS AND DISCUSSION

A novel system has been developed as an alternative tool for limb disabled people to interact with a computer using their eyes, involving different algorithms. An initial step in this system is the selection of a camera to track the minimally distorted eye movement of the user and transfer it to the computer. We have used a low cost Logitech HD 720p camera with a USB cable. The next necessary step is the camera mounting. As we use the non-touch user interface approach, the camera is located away from the user, on the top of the computer screen. The distance between the computer and the user is maintained at approximately 16 to 30 inches so that a clear face image can be obtained. In order to create a prominent glint blob for gaze estimation, an LED or IR source with a supply of 5 to 12 Volts can be used.

Once the setup is ready, the video input device captures continuous images of the user, which are processed in MATLAB. The processing speed of the system is 5 frames/sec; this speed depends on the complexity of the algorithm. During processing, the first step performed is face detection on each input frame, discarding the non-face parts using the Viola-Jones algorithm with suitable Haar-like features for the face. The same algorithm is used to detect the eye, but with different Haar features.

In order to estimate the gaze of the user, the Pupil Corneal Reflection method is used to record eye movement. Some preprocessing operations are performed before segmentation of the pupil, which include grayscale conversion, image binarization, Gaussian smooth filtering, Canny edge detection, etc., as shown in Fig.8 and Fig.9.
TABLE I. PERFORMANCE EVALUATION

Background:           Complex 80%          Static 90%
Lighting Conditions:  Sufficient 80%-90%   Insufficient 65%-70%
Resolution:           Low 70%              High 90%
Viewing Angle:        Center 90%-95%       Corner 80%-85%
Distance:             Near 85%-95%         Far 65%-70%
Fig. 8. Preprocessing Steps: (a) Captured Eye Image, (b) Gray Image, (c) Binary Image, (d) Holes Filled Image

Fig. 9. (a) Gaussian Filtered Image, (b) Canny Edge Detected Image


Next, the segmentation of the pupil is carried out using the Circular Hough Transform. It gives accurate results for pupil boundary detection along with its center and radius parameters. Since the computer screen has a limited size, the variations of the pupil center alone are too small to measure reliably. Hence the eye movement cannot be tracked using only the pupil location. Therefore, the glint location is estimated using the blob analysis method. As the glint position changes with eye movement, the gaze of the user is estimated by calibrating the glint location with respect to the pupil center.

Once the gaze is estimated, a marker is inserted in the 4-key keypad window. The marker moves according to the movement of the eye, to recognize a reference key. Further, the template matching method is implemented to detect blinks, which proceeds to key type-in in the editing window as shown in Fig.10. However, with this kind of keypad arrangement, the accuracy of recognizing a key is low, and repeated keys occur, as can be seen in the editing window of Fig.10. The other key arrangement, shown in Fig.11, overcomes this problem.

Fig. 10. Output With Key Recognition and Key Type-in

Fig. 11. Improved Results of Key Recognition and Key Type-in
We have also designed this system to open application softwares like Skype, Media Player, Notepad, WordPad, and email functionality with Microsoft Outlook, etc., as shown in Fig.12. These are the most suitable applications for limb disabled people to make a video call, listen to music, or check emails by themselves.

Fig. 12. Result with Opening Application Softwares

In order to evaluate the performance of the developed system, the experiment was tested on 25 volunteers, who were given minimal training before using the system. The system gives appreciable results and speed, with an accuracy of 90%. Performance evaluation of the system was carried out under various conditions, as shown in Table I, to check the robustness of the system.





VI. CONCLUSION

Several techniques exist in practice for tracking the direction of eye gaze. All these techniques are either intrusive or less accurate, involve hardware complications, and are hence costly. This paper presents a cost effective system for limb disabled people to interact with computers using their eyes. In the algorithm implemented, the methods used for eye tracking, from pupil localization to blink detection, produce promising results and achieve an appreciable speed. Currently the system is implemented for 4 keys and the application softwares suitable for limb disabled people. In future, with some improvement in the algorithm, the number of keys on the keypad can be increased, or a separate virtual keyboard can be created with accurate and fast processing techniques. Also, by making some specific customizations in each software, we may control the entire software by tracking gaze.

ACKNOWLEDGMENT
The authors would like to acknowledge Dr. Anagha Panditrao, Head of the Department of Instrumentation and Control, Cummins College of Engineering for Women, Pune, and Dr. Madhuri Khambete, Principal of Cummins College of Engineering for Women, Pune, for their support and encouragement during this work.

REFERENCES

[1] Jianbin Xiong, Weichao Xu, Wei Liao, Qinruo Wang, Jianqi Liu, and Qiong Liang, "Eye Control System Base on Ameliorated Hough Transform Algorithm", IEEE Sensors Journal, vol. 13, no. 9, pp. 3421-3429, 2013.
[2] Zhiwei Zhu and Qiang Ji, "Novel Eye Gaze Tracking Techniques Under Natural Head Movement", IEEE Transactions on Biomedical Engineering, vol. 54, no. 12, pp. 2246-2260, 2007.
[3] Seung-Jin Baek, Kang-A Choi, Chunfei Ma, Young-Hyun Kim, and Sung-Jea Ko, "Eyeball Model-based Iris Center Localization for Visible Image-based Eye-Gaze Tracking Systems", IEEE Transactions, vol. 22, no. 10, pp. 415-421, 2010.
[4] Xindian Long, Ozan K. Tonguz, and Alex Kiderman, "A High Speed Eye Tracking System with Robust Pupil Center Estimation Algorithm", 29th Annual International Conference of the IEEE EMBS, Cité Internationale, Lyon, France, pp. 3331-3334, Aug. 2007.
[5] Yuki Oyabu, Hironobu Takano, and Kiyomi Nakamura, "Development of the Eye Input Device Using Eye Movement Obtained by Measuring the Center Position of the Pupil", IEEE International Conference on Systems, Man, and Cybernetics, pp. 2948-2952, 2012.
[6] Uma Sambrekar and Dipali Ramdasi, "Estimation of Gaze for Human Computer Interaction", International Conference on Industrial Instrumentation and Control (ICIC), pp. 1236-1239, 2015.
[7] Paul Viola and Michael J. Jones, "Robust Real-Time Face Detection", International Journal of Computer Vision, vol. 57, no. 2, pp. 137-154, 2004.
[8] Takeshi Takegami and Toshiyuki Gotoh, "A Hough Based Eye Direction Detection Algorithm without On-site Calibration", Proc. VIIth Digital Image Computing: Techniques and Applications, pp. 459-468, 2003.
[9] D. H. Ballard, "Generalizing the Hough Transform to Detect Arbitrary Shapes", pp. 714-725, 1987.
[10] Mark S. Nixon and Alberto S. Aguado, Feature Extraction and Image Processing, Newnes, Oxford, pp. 173-201, 2002.
[11] Stephen Bialkowski, Zeb Johnson, Kathryn Morgan, Xiaojun Qi, and Donald H. Cooley, "A Non-Intrusive Approach to Gaze Estimation", Computer Science Department, Utah State University, Logan.
[12] Abha Dubey, Rishi Soni, and Ashok Verma, "Automatic Eye Blink Generation and Detection System in Digital Image Processing", International Journal of Electronics and Computer Science Engineering, vol. 1, no. 4, pp. 1913-1918.

