
A Low-Cost Autonomous Attention Assessment System for Robot Intervention with Autistic Children

Fady S. Alnajjar
Department of Computer Science and Software Engineering, UAE University, Al Ain, UAE
fady.alnajjar@uaeu.ac.ae

Abdulrahman Majed Renawi
Department of Computer Science and Software Engineering, UAE University, Al Ain, UAE
abdulrahman.m@uaeu.ac.ae

Massimiliano Cappuccio
Department of Philosophy, UAE University, Al Ain, UAE
mlorenzo@uaeu.ac.ae

Omar Mubin
Western Sydney University, Australia
o.mubin@westernsydney.edu.au

Abstract— Attention is an essential mental process that is important for achieving learning progress. We cannot improve in our academic learning unless we concentrate our attention on the person giving the educational material, such as the teacher or trainer. Children with Autism Spectrum Disorder (ASD) may have attention difficulties that can directly influence their academic skills. In recent years, robot intervention in autism therapy and assessment has become a popular research topic, due to its role in enhancing children's attention more than a regular human therapist can, as well as the increasing number of autistic children relative to the availability of professional therapists. Robot intervention helps reduce therapy time and makes early therapeutic sessions easier and more promising. Much research has been conducted to develop robot intervention techniques for ASD children, and some methods have already been used to assess autistic individuals' attention during robot intervention sessions. Yet, the existing attention assessment methods are either very complex or simple, with only one measured interaction cue. This paper presents a practical and low-cost automatic approach to assess autistic individuals' attention during robot intervention, addressing multiple interaction cues. Experimental results show that the proposed attention assessment system can accurately measure the child's attention and enhance therapy progress. This automatic attention system can open a new era for utilizing technologies to monitor students' attention in class to enhance educational systems.

Keywords—Autism Therapy, Robot Intervention, NAO Robot, Assessment System, Autism Diagnosis

Fig. 1. The NAO robot with our customized chest holder carrying a mobile phone to enhance the robot's sensory and motor capabilities.

I. INTRODUCTION

Children with autism usually experience neurodevelopmental disorders characterized by attention deficit and learning difficulties [1]. Earning the autistic child's attention (i.e., getting them to perform eye/face contact, focus on the information they hear while filtering out noise, and perform the task assigned to them) in a complex environment is, therefore, an extreme challenge. This challenge can affect their learning and academic skills and their potential for long-term success.

Autism therapy and assessment with robot intervention is a trending research topic, since statistics show a noticeable increase in the number of autistic individuals in the UAE relative to the number of professional therapists [2]. There is plenty of research addressing autism therapy and assessment techniques with the aid of robotic interventions, yet these are applied only in controlled experimental setups and do not focus intensively on automatic attention-level assessment [3,4].

Recent research on robotic intervention presents several techniques for diagnosing autistic individuals using robots and accompanying sensors [4,5]. Some of these techniques produce a score-like assessment to represent the interaction level between the robot and the autistic individual [6]. These scores take several forms to represent multiple interaction and attention cues. Previous literature suggests that attention, in the context of robotic autism assessment, can be modeled to include factors such as:

(1) Eye/face attention: represents how much the individual is making eye/face contact with the robot. This can be measured using a variety of techniques, ranging from eye-pupil diameter measurement [7] to image processing utilizing the robot's onboard camera [8].

(2) Joint attention: represents the ability of the individual to react to the robot's motoric actions, like looking at the same object that the robot is pointing to. This can be measured using the same hardware used to measure attention, with different data processing techniques [3], or it can be measured separately using a motion-tracking method [9].
(3) Facial emotions: detection of the facial reactions of the individual interacting with the robot. Detected facial features can be used to infer emotions and other aversion and attention cues [10]. A camera can be enough to detect facial features, and a depth camera [11] can be used for better accuracy. The output of facial feature detection helps in understanding the emotional state of the individual, to relate it to the other attention measures.

(4) Sound/linguistic response: the ability to answer the given questions during the interventions within a certain time.

(5) Imitation response: the ability to imitate the robot's movements.

Most of the existing assessment techniques in robot intervention studies either use external hardware components not mounted on the robot, which limits portability, or target a single attention cue due to the robot's onboard hardware limitations. This paper presents a method that uses upgraded robot onboard/mounted hardware targeting automatic assessment of the above-mentioned attention cues.

II. METHODOLOGY

This study introduces a simple and efficient numerical assessment system for robot intervention without the use of external monitoring equipment. The robot used for this study is the humanoid robot NAO from SoftBank Robotics, with our added mobile screen mounted on the robot body, shown in Fig. 1.

A. Technical and hardware configurations

For the robot intervention with autistic children, besides the regular NAO robot intervention [6], we have added mobile-based hardware with a facial features display to enhance the intervention, as seen in Fig. 1. The presented approach does not compromise the mobility of the robot and ensures applicability in human-robot interaction scenarios. The display is assembled with the aid of a custom chest-mount designed for the NAO robot, shown in Fig. 2, to minimize the changes in the moment of inertia due to the added weight. The added display is a mobile phone that shows emotions by running a custom-designed mobile application named Emotions Selector. A mobile magnetic holder was customized to fit on the chest-mount and carry the mobile screen. The mobile application was designed using the MIT App Inventor [12]; it receives the emotion control data from the computer via socket communication over Wi-Fi. The application is made easy to use, requiring nothing more than specifying the computer's IP address to link it with the mobile automatically.
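As a concrete illustration, here is a minimal sketch of the laptop-side sender for the Emotions Selector app. The message format and port are assumptions; the paper states only that the emotion control data travels over a Wi-Fi socket to the phone:

```python
# Hypothetical sender for the Emotions Selector app; the plain-text
# "emotion name + newline" protocol and the port are assumptions.
import socket

MOBILE_IP = "192.168.1.42"   # hypothetical address of the chest-mounted phone
MOBILE_PORT = 5050           # hypothetical port the app listens on

def send_emotion(emotion):
    """Send one emotion command (e.g. 'happy') to the mobile display."""
    with socket.create_connection((MOBILE_IP, MOBILE_PORT), timeout=2.0) as sock:
        sock.sendall((emotion + "\n").encode("utf-8"))

send_emotion("happy")  # the phone then switches its facial-features display
```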
The chest holder (chest-mount) was designed in our lab using Fusion 360 CAD software to be a multipurpose tool for holding any object, using up to four screws for rigidity. Velcro straps are used to fasten the holder from the back. It can be used by other researchers to carry a mobile phone, a camera, a laser sensor, or any other item on the chest of the NAO robot. Moreover, the USB socket on the head of the NAO robot is used to charge the mobile phone. This arrangement maintains the robot's equilibrium ability, so the robot's built-in features can be used extensively during the diagnosis and therapy processes.

Fig. 2. [a] The customized NAO chest holder front side with the mobile magnetic holder. [b] The holder back side with the added strap to fasten it.

B. Autonomous Assessment System

Since the target of this paper is specifically to present the autonomous assessment system for the robot intervention, rather than the intervention technique itself, the following sections highlight the assessment factors:

Attention Score:

The mounted mobile's camera is used for facial recognition to produce and accumulate the attention score (i.e., how much the individual is making eye/face contact with the robot; a smiley face indicates maximum attention). Image frames are retrieved online from the camera of the mobile, acting as an IP camera, and processed using Haar classifiers [13] to detect the participant's face. The participant's face is detected when it is directed towards the robot's front body, which is taken to mean that the participant is paying attention to the robot. Once the participant's face is detected, the attention score is incremented by a set increment value. As long as the participant's face is not detected, the attention score is decremented by a smaller set decrement value. The increment and decrement values are variables that can be reset based on future findings from experiments with autistic participants. The attention score is designed to be a comparison factor between the results of several interaction sessions for a single participant; thus, its parameters should be constant for each participant. The increment and decrement values used in this study, tuned experimentally, are shown in Table I.

TABLE I. INCREMENT AND DECREMENT VALUES AFFECTING THE ATTENTION SCORE.

                  Increment Value (points) | Decrement Value (points)
Attention         5                        | 1
Joint Attention   15                       | 1
Sound Response    Eq. (1)                  | 5
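To make the accumulation concrete, the following is a minimal sketch of the attention-score loop using the Table I values. It assumes the phone exposes an MJPEG stream over Wi-Fi (the URL is hypothetical) and substitutes OpenCV's stock frontal-face Haar cascade for the authors' exact classifier configuration [13]:

```python
# Attention-score accumulation from the chest-mounted phone's IP camera.
import cv2

STREAM_URL = "http://192.168.1.42:8080/video"   # hypothetical IP-camera URL
INCREMENT, DECREMENT = 5, 1                     # attention values from Table I

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(STREAM_URL)
attention_score = 0.0
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # A detected frontal face implies the participant faces the robot.
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) > 0:
        attention_score += INCREMENT
    else:
        attention_score -= DECREMENT   # smaller penalty while looking away
```

The asymmetric values (5 vs. 1) make brief detection dropouts erode the score slowly, while sustained eye/face contact builds it quickly.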
Joint Attention Score:

This score is designed to measure how the participant reacts to the robot's motoric actions and requests. For instance, if the robot talks about a box on the side and points towards it, the participant is expected to look towards the same box. The participant's head motion detection is simplified to detecting the participant's left and right ears: if the participant looks to his/her right side, his/her left ear should appear to the robot, and the same applies for the left side and right ear. This detection is made using the same image frames retrieved from the mobile camera. If the detected participant's head direction matches the current robot's head direction, the joint attention score is incremented by a set increment value; otherwise, it is decremented by a set decrement value (Table I).

In addition, the attention score itself is incremented or decremented by set values at each joint attention instant, since the attention score is designed to report on the participant's general attention.
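The same stream can drive the joint-attention check, sketched below. Stock OpenCV does not ship an ear cascade, so its profile-face cascade stands in for the authors' left/right-ear detection: it fires on one profile orientation directly and on the mirrored frame for the other.

```python
# Joint-attention check: does the participant's head turn match the robot's?
import cv2

profile_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_profileface.xml")

def head_direction(gray_frame):
    """Return 'left', 'right', or None for the participant's head turn."""
    if len(profile_cascade.detectMultiScale(gray_frame, 1.1, 5)) > 0:
        return "left"
    mirrored = cv2.flip(gray_frame, 1)          # check the other orientation
    if len(profile_cascade.detectMultiScale(mirrored, 1.1, 5)) > 0:
        return "right"
    return None

def update_joint_attention(score, robot_direction, gray_frame, inc=15, dec=1):
    """Apply the Table I joint-attention increment or decrement."""
    if head_direction(gray_frame) == robot_direction:
        return score + inc
    return score - dec
```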

Sound Response:

The mobile microphone is used as an IP mic, defined as a sound input on the laptop for the sound response module. The sound response module calculates the time between the instant when the NAO robot finishes asking a question and the instant when the participant starts answering. This time difference is reported as the sound response time, and it is used to calculate an increment in the attention score: a lower sound response time (faster response) generates a bigger increment. The maximum increment, granted at the lowest response time (ideally zero), can be reset, but it should be constant for the same participant across all sessions for comparison.
This module has some parameters that can be adjusted based on the participant's normal speaking volume and the background noise. These parameters are:

- Threshold-sound-peak: the peak sound level above which the module assumes the person has started speaking. Set to 2000 (raw amplitude units) in the following experiments.

- Time-out: the maximum time the module waits before killing the process and ignoring the sound response. This value is expected to be larger for participants with speech difficulties. Set to 2.5 seconds in the following experiments.

- Text-is-done-time-lapsed: the maximum time the module waits for the robot to reply with a text-is-done signal, after which the process is killed if the robot did not finish speaking. This value should be set larger than the time the robot takes to say the longest expected question. Set to 5 seconds in the following experiments.

- Time-delay: a delay in the code to compensate for the mic latency, i.e., the time between a sound peak being captured at the mic and the same peak being received at the computer. Set to 0.25 seconds in the following experiments.
The increment in the attention score due to the sound response (Incr) is made a function of the response time (RespTime) and the maximum score for response time (MaxRspScore), as given in Eq. (1):

    Incr = MaxRspScore - (MaxRspScore / Time-out) x RespTime    (1)
Based on this increment equation, the faster the sound response, the larger the attention score increment granted. The value of MaxRspScore used in the following experiments is 25 points.
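The following sketch ties the timing parameters above to Eq. (1). It is a simplification under stated assumptions: the sounddevice library stands in for the IP-mic audio path, and the text-is-done handshake with the robot is reduced to calling the function right after the robot finishes speaking:

```python
# Sound-response timing and the Eq. (1) attention increment.
import time
import numpy as np
import sounddevice as sd

THRESHOLD_SOUND_PEAK = 2000   # raw amplitude units (16-bit samples assumed)
TIME_OUT = 2.5                # seconds
TIME_DELAY = 0.25             # mic-to-laptop latency compensation, seconds
MAX_RSP_SCORE = 25            # points

def measure_response_time():
    """Wait up to TIME_OUT for the first loud sample after the robot's
    question; return the latency in seconds, or None on timeout."""
    start = time.monotonic()
    with sd.InputStream(samplerate=16000, channels=1, dtype="int16") as stream:
        while time.monotonic() - start < TIME_OUT:
            block, _ = stream.read(1024)
            if np.abs(block).max() >= THRESHOLD_SOUND_PEAK:
                return max(0.0, time.monotonic() - start - TIME_DELAY)
    return None   # timed out: logged as a slow/missing response

def attention_increment(resp_time):
    """Eq. (1): faster answers earn a larger share of MAX_RSP_SCORE."""
    return MAX_RSP_SCORE - (MAX_RSP_SCORE / TIME_OUT) * resp_time
```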
Moreover, the IP mic allows the operator to hear what the mobile hears, i.e., what the participant and the robot say, without interfering with the sound response module. All of these communications are made over Wi-Fi on different ports.
Emotions Detection:

The proposed assessment system is also able to estimate the participant's emotion from facial features with the aid of the NaoQi [8] emotion detection Python module, utilizing the NAO robot's onboard camera. The module returns the probability of the participant's current emotion among five possible emotions: happy, sad, angry, surprised, and neutral. The most probable emotions are logged and graphically represented as color bands, where each color represents a specific emotion.
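The paper does not spell out the module's API. The sketch below assumes NAOqi's ALFaceCharacteristics module, whose expression properties cover the same five emotions and are published to ALMemory; the robot address and memory key layout follow the NAOqi 2.x documentation, so treat them as assumptions:

```python
# Reading per-emotion probabilities from the NAO robot (NAOqi Python SDK).
from naoqi import ALProxy

ROBOT_IP, PORT = "192.168.1.10", 9559   # hypothetical robot address
EMOTIONS = ["neutral", "happy", "surprised", "angry", "sad"]

memory = ALProxy("ALMemory", ROBOT_IP, PORT)
face_char = ALProxy("ALFaceCharacteristics", ROBOT_IP, PORT)

def most_probable_emotion(person_id):
    """Return (label, probability) for a tracked person, or None."""
    face_char.analyzeFaceCharacteristics(person_id)
    key = "PeoplePerception/Person/%d/ExpressionProperties" % person_id
    probs = memory.getData(key)        # five floats, one per emotion
    if not probs:
        return None                    # logged as a "None" emotion band
    best = max(range(len(EMOTIONS)), key=lambda i: probs[i])
    return EMOTIONS[best], probs[best]
```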
Fig. 3. Assessment System and Scenario GUIs. The first page in the GUI [a] is the assessment system control interface; it supports local and IP cameras, and Plot Results plots the latest session results. The second [b] and third [c] pages are the English and Arabic versions of the designed scenario, respectively. They can be accessed from the Open Scenario button.

C. Experimental Setup:

Our robot intervention scenario is divided into two parts [6]: a control part, in which the therapist can control the robot from the operator's room, Fig. 4[b] (more details are in [6]), and the experimental room, in which the child interacts with the robot while our system assesses the interaction (Fig. 4[a]).

The control part is made as a graphical interface, as shown in Fig. 3, to be easy to use by non-technical operators such as therapists. The operator specifies the maximum time limit and starts the interaction session, with the ability to stop it at any time. During the diagnosis process, the operator can control the flow of the predesigned scenario, and can type a new response if the participant deviates from the preset scenario.

The scenario includes dialogues to interact with the participant, to grab his/her attention and test his/her interaction capabilities. Sound responses, joint attention requests, and robot movements are called automatically in the preset scenario parts.

Five experiments were carried out with healthy participants to test the robustness of the designed control and assessment systems. As stated before, the control station was placed in a different room from the one where the participant and the robot were. The participant and robot were seated close to each other, as seen in Fig. 4[a]. A side camera was used to record and monitor the experiments remotely.

Fig. 4. Robot Interaction Room [a] and Control Station [b]. Communication is made over Wi-Fi. The operator may not need any technical training to use the control station. In the lower figure [b], the right screen is the monitor showing the Robot Interaction Room to the operator, the middle screen is the scenario control GUI, and the left screen shows the real-time plot and the robot's general motions and behaviors controller.

III. RESULTS

The output assessment scores are accumulated in real time and shown in a separate diagnostic real-time plot, as shown in Fig. 5, giving the operator the level of interaction between the autistic individual and the robot as attention and joint attention scores. The real-time plot reports the fast responses' values, the sound response requests, the attention score, and the joint attention request instants. The results are saved at the end of each session as a '.pickle' data file for further processing. The operator can directly plot the most recent experiment's results from the scoring system GUI. Fig. 6 and Fig. 7 show the results obtained from experiments done with two different participants.

Fig. 5. Real-time plot. The red line is the real-time attention score. The blue line is the real-time joint attention score (this experiment had no joint attention requests). Yellow ribbons mark a sound response request, i.e., a sound response was expected at that time instant. The numbers in white are the fast sound responses, those below the set time-out value.

Final results after the end of each session are presented in three subplots:
1- Attention mixed scores plot: emotions color bands with the attention and joint attention scores.
2- Sound response plot: shows the participant's response times versus interaction time in seconds.
3- Emotions counter: shows how many times each emotion has been detected.
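A minimal sketch of the save/plot round trip appears below. The dictionary layout and file name are assumptions; the paper specifies only that session results are stored as a '.pickle' file and plotted from the GUI:

```python
# Saving one session's results and re-plotting them, with toy data.
import pickle
import matplotlib.pyplot as plt

session = {
    "time": [0, 1, 2, 3, 4],                # seconds since session start
    "attention": [0, 5, 10, 15, 14],        # accumulated attention score
    "joint_attention": [0, 0, 15, 30, 29],  # accumulated joint attention score
    "emotions": ["neutral", "happy", "happy", "neutral", "sad"],
    "responses": [None, 1.2, None, 0.8, None],  # seconds; None = timed out
}
with open("session_001.pickle", "wb") as f:
    pickle.dump(session, f)

# What the GUI's Plot Results button would do for the latest session:
with open("session_001.pickle", "rb") as f:
    data = pickle.load(f)
plt.plot(data["time"], data["attention"], "r-", label="attention")
plt.plot(data["time"], data["joint_attention"], "b-", label="joint attention")
plt.xlabel("interaction time (s)")
plt.ylabel("score (points)")
plt.legend()
plt.show()
```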
The assessment system results show the expected interaction level; after 100 seconds, the participant in Fig. 6 stopped looking at the robot for a duration of 95 seconds, which appears as a clear drop in the attention score. The other participant, Fig. 7, showed better attention to the robot, with smaller drops in the attention score.
The second participant, Fig. 7, did not receive any joint attention requests from the robot, while the first participant, Fig. 6, was expected to show some joint attention cues, which is reflected in the increase of the joint attention score.
The sound response results show that both participants were only partially responsive throughout the interaction, which appears in the real-time plots as yellow bands with no response time written on top of them. A missing response time means the response time exceeded the set timeout value. This is confirmed in the sound response plots, where the participants had many large response times, meaning they were slow in their vocal responses at those instants.
Emotion bands’ plots show the most probable emotion at
each instant compared to attention and joint attention scores
versus interaction time. This helps in matching participants’
interaction levels to emotional states. The emotions counter
makes it easier to know the dominant emotion compared to
other emotional states in the entire interaction session.
Moreover, some autistic traits could be tested with knowledge of the participant's emotional state. Both participants show more 'Happy' emotion predictions than 'Sad' at the beginning of the interactions, and more 'Sad' than 'Happy' at the end. This shows a decay in the level of excitement as the interaction time increases.

Fig. 7. Results of the experiment with the second participant.

Fig. 6. Results of the experiment with the first participant. The upper plot shows the attention scores (red) and joint attention scores (blue), normalized and plotted over the emotions color bands; attention and emotional states can be mapped to each other at each time instant. A "None" emotion means that no emotion data could be generated at that instant. The middle plot shows all recorded sound responses: slow sound responses exceed the timeout value and are not plotted in real time, while fast sound responses are equal to or less than the timeout value. The timeout value can be reset for each participant to match their case, and several parameters can be adjusted by the operator to customize the scoring system to the participant's case, which enhances attention cue detection. The lower plot shows the participant's emotions counter, i.e., the overall count of emotions detected during the session. These results are comparable for the same participant across different sessions given the same parameter values.

IV. CONCLUSION

A novel low-cost scoring system for robot intervention in autism therapy and diagnosis was designed and tested on two participants. The results show a match between the interaction trend captured by the scoring system and observations. The proposed scoring system uses a simplified hardware and software setup that is easy to use by non-technical autism therapists, as well as by parents in domestic environments. This scoring system facilitates using robots as daily assessment tools and therapy assistants in autism schools, hospitals, and domestic settings. Our future direction is to apply the attention testing in clinical setups to a cohort of autistic children with attention deficit and compare the results with therapists' assessment results, to test the validity of the proposed system clinically.

Having clinically validated results would help develop a permanently available clinical system that can be used in domestic setups as well.

REFERENCES

[1] J. L. Matson and M. Shoemaker, "Intellectual disability and its relationship to autism spectrum disorders," Research in Developmental Disabilities, vol. 30, no. 6, pp. 1107-1114, 2009.
[2] V. Eapen, A. A. Mabrouk, T. Zoubeidi, and F. Yunis, "Prevalence of pervasive developmental disorders in preschool children in the UAE," Journal of Tropical Pediatrics, vol. 53, no. 3, pp. 202-205, 2007.
[3] S. M. Anzalone et al., "How children with autism spectrum disorder behave and explore the 4-dimensional (spatial 3D + time) environment during a joint attention induction task with a robot," Research in Autism Spectrum Disorders, vol. 8, no. 7, pp. 814-826, 2014.
[4] K. S. Lohan, E. Sheppard, G. Little, and G. Rajendran, "Towards improved child robot interaction by understanding eye movements," IEEE Transactions on Cognitive and Developmental Systems, 2018.
[5] J.-J. Cabibihan, H. Javed, M. Aldosari, T. W. Frazier, and H. Elbashir, "Sensing technologies for autism spectrum disorder screening and intervention," Sensors, vol. 17, no. 1, p. 46, 2016.
[6] M. Alahbabi et al., "Avatar based interaction therapy: A potential therapeutic approach for children with autism," in Mechatronics and Automation (ICMA), 2017 IEEE International Conference on, 2017, pp. 480-484.
[7] Codamotion CX1-800 [Online]. Available: https://codamotion.com/3d-measurement/
[8] Aldebaran NAO robot [Online]. Available: http://www.ald.softbankrobotics.com/
[9] Kinect for Windows features [Online]. Available: http://www.microsoft.com/en-us/kinectforwindows/meetkinect/features.aspx
[10] Y. Feng, Q. Jia, M. Chu, and W. Wei, "Engagement evaluation for autism intervention by robots based on dynamic Bayesian network and expert elicitation," IEEE Access, vol. 5, pp. 19494-19504, 2017.
[11] StereoLabs, zed-ros-wrapper [Online]. Available: https://github.com/stereolabs/zed-ros-wrapper
[12] S. C. Pokress and J. J. D. Veiga, "MIT App Inventor: Enabling personal mobile computing," arXiv preprint arXiv:1310.2830, 2013.
[13] P. I. Wilson and J. Fernandez, "Facial feature detection using Haar classifiers," Journal of Computing Sciences in Colleges, vol. 21, no. 4, pp. 127-133, 2006.

