Available online at www.sciencedirect.com

ScienceDirect

Procedia Computer Science 159 (2019) 323–332
www.elsevier.com/locate/procedia

23rd International Conference on Knowledge-Based and Intelligent Information & Engineering Systems
Bob - A General Culture Game with Voice Interaction

Marta Filimon, Adrian Iftene*, Diana Trandabăț

Faculty of Computer Science, “Alexandru Ioan Cuza” University, General Berthelot, 16, 700483, Iasi, Romania

Abstract

This paper presents a general culture game, called Bob, implemented as a skill for Amazon’s software assistant Alexa. The main motivation of this work is to enable learning through games and smart devices, which are nowadays part of most children’s lives and homes. The Bob game provides users with general culture questions from the geography field, more specifically related to cities, countries, and lakes. The questions are formed by extracting and analyzing information from two major knowledge sources, DBpedia and Wikidata, both massively used in the natural language processing field. Bob can initiate general culture tests, during which questions are automatically adapted to the knowledge level of the player through Computer Adaptive Testing (CAT). Following user-specified settings, Bob also offers the possibility to rank players according to their performance and to post statistics on Twitter on request. One major advantage of our solution comes from the fact that the application can be used through Amazon Echo or through a smartphone, allowing the child to access it from home or from any place with an Internet connection.
© 2019 The Author(s). Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of KES International.

Keywords: Educational assistant; Knowledge extraction; Computer Adaptive Testing (CAT); User interface

1. Introduction

The number of smart devices present in our lives has substantially increased nowadays. From smartphones and tablets to smart watches and smart speakers, these kinds of devices have become indispensable in our current

* Corresponding author. Tel.: +40 232 201091.
E-mail address: adiftene@info.uaic.ro
* Corresponding author. Tel.: +40 232 201771.
E-mail address: dtrandabat@info.uaic.ro

1877-0509 © 2019 The Author(s). Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of KES International.
10.1016/j.procs.2019.09.187

activities. The rapid acceptance of these smart devices is obvious if we consider that in 2018 over 16% of Americans owned a smart speaker (according to a recent study by Perez, S. stating that “39 million Americans now own a smart speaker”3), although they were introduced to the market only a few years ago. One such smart speaker is Echo, enhanced with the Alexa software assistant developed by Amazon, which allows users to interact with devices in an intuitive way through voice commands (a comprehensive description of the available voice commands can be found in this article4).
Echo has become widely present and is frequently used as a timer, as a music player, or as a convenient way to read the news. More and more users see it as an important vector in the development of the intelligent house concept5. The number of devices integrating Alexa increases every day, and their types have begun to diversify6. Some services prove to be very useful in the kitchen7, where more than 85% of users have set an alarm while cooking using the smart speakers and 45% have added an object to their shopping cart8. This was probably one of the motivations that led to the collaboration between Amazon and LG to introduce the smart InstaView fridge.
In the automotive domain, Ford has integrated Alexa with the Ford Sync application, which allows users to easily
perform operations through voice commands while they drive. They can use the system to set up their GPS or find
information about the car, road, weather, but they can also use it as a road companion, since Alexa is often used for
entertainment. Statistics say that 82% of users listened to music with Alexa, and 66% listened to news from various
newspapers.
Mobile phones, tablets and personal computers have long been used for entertainment, although their original purpose was different. For example, 49% of those who own a mobile device in the United States use it to play video games. This opens up the possibility that, in the near future, Echo, or other devices which integrate software assistants, will also be appealing to video game lovers.
Amazon has provided developers with a service through which they can integrate the Alexa voice interface with various applications they develop. The Alexa Skill Kit (ASK) is a collection of APIs that help developers quickly and easily create “skills” for Alexa. Skills are voice applications that a user can access through Alexa.
A skill (or ability) consists of two parts: the interface and the skill service. In order to have a functional skill, the two parts must be coded to allow for perfect communication. There are currently around 60,000 skills available among Amazon services, from companies like Starbucks, Uber and Capital One, as well as from innovative designers and developers, and their number continues to grow. Some of these skills are used in education [4], [12], [15], and we can mention here: (1) skills to read from Kindle or from online libraries; (2) skills to spell a word, to provide synonyms or definitions, or to extract information from Wikipedia; (3) skills to perform simple mathematical operations or to solve equations; (4) skills to translate a word and to pronounce it in various languages.
Additionally, it is well known that educational games are becoming ever more frequent today, the majority of them related to general culture and designed to improve knowledge acquisition by students. In this context, this paper introduces a complex application offering a novel solution for geography learning. The proposed system creates test questions by extracting data from well-known information sources such as DBpedia and Wikidata, constantly adapting tests to the level of the students using Computer Adaptive Testing (CAT). The whole system uses Alexa’s interactive voice interface, which can be activated through the Amazon Echo device or from a smartphone.
In the next section a series of similar applications is presented, followed in Section 3 by the architecture and implementation details of our general culture application, Bob. The final section discusses several conclusions of this paper.

3 TechCrunch. Oath Tech Network (all web page references were accessed March 27, 2019).
4 Potter, M. (2017) “What can Amazon Echo (Alexa) Say and Do?” on the Steemit.com website.
5 Beebom.com website: 10 Cool Amazon Echo Alternatives You Can Use.
6 Smartwatches.com website: When You See These 10 Awesome Amazon Echo Alternatives You’ll Be Sold.
7 Cnet.com website: 11 ways to use Alexa in the kitchen.
8 Weinberger, M. “Why Amazon’s Echo is totally dominating — and what Google, Microsoft, and Apple have to do to catch up” on businessinsider.com.

2. Similar applications

This section presents existing educational applications available for Alexa, with their specificities and differences when compared to our application. Like all Alexa applications, they need to be activated by a special phrase.

2.1. Question of the Day

The Question of the Day9 application is activated when the user pronounces “Alexa, ask the question of the day”, “Alexa, play question of the day” or “Alexa, play the game question of the day”. Every day, Question of the Day proposes a different question to its users, ranging from art or entertainment up to literature or science. Its main objective is similar to ours, namely to improve the general culture level of its users. Another similar feature is that users can collect points and see statistics on their responses and those of other users. To help the user answer questions, four answer variants are suggested. If the user doesn’t hit the correct answer, he is offered the challenge of the day. At the end, the application tells the player how many of the players answered the question correctly. For example, the question for April 24, 2017, was the following: What famous American playwright wrote the play “A Streetcar Named Desire”? A major difference from our system is that the collection of questions and answer hints is a closed collection, and it is not dynamically adapted to the knowledge level of the user.

2.2. Amazing Word Master Game

Another type of educational application is the Amazing Word Master Game10, activated when the user says “Alexa, ask Word Master to play a game”, “Alexa, ask Word Master to start with <<Apple>>” or “Alexa, open Word Master”. Each time the application is started a word is presented, and the user has to say a word that starts with the last letter of the previous word, and so on. A score for both the system and the user is calculated, based on the sum of the lengths of the words given as answers. After each user response, the device communicates both players’ scores at the current stage. This game differs more from ours, since the user competes with the application, while in Bob each user competes with himself. Also, each test question consists of only one word, while in our application test questions are more complex.

2.3. Tricky Genie

The Tricky Genie11 application is activated by “Alexa, start tricky genie”, “Alexa, launch tricky genie” or “Alexa, ask tricky genie for a story”. This game is another type of exercise, where a story is presented and the user is fictively put in the position of a character facing a challenge, and needs to make a choice between three solutions, unknown to the player. The Genie will reveal the content of the first solution, and the user can either accept it or go on to discover the next one. The game contains only twelve stories. This game is well suited for improving the understanding of English texts. Compared to our application, this game has a limited number of predefined stories, and does not adapt to the level of English that the user needs to improve.

2.4. Speech in Education

In educational contexts, speech has been used in various ways: for the evaluation of children in multicultural contexts [3], for learning primary school mathematics via explanations [1], or for activating eLearning system components through speech recognition [11]. More sophisticated eLearning systems have used voice recognition to detect children’s emotional states and to identify their stressful moments [10]. These solutions are most often

9 Alexa Skills Store. Question of the Day
10 Alexa Store. Amazing Word Master Game
11 Alexa Store. Tricky Genie

expensive and require the child’s presence at school or wherever these solutions are installed. In contrast, our solution can be used by those who own an Amazon Echo device or a smartphone, and can be played at home or in any place with access to the Internet.

3. The Bob Game

This section presents the architecture of the Bob application, as well as elements related to setting up the working environment needed for the skill to function. The presentation focuses on the services, their functionalities and the architectural choices. Figure 1 shows an overview of the interaction between all the services used to develop the application.

3.1. Voice service

To set up the Voice service, an account was created at https://developer.amazon.com, and then a new skill for Alexa was created. The skill received a unique identifier, and various pieces of information were collected about the interaction model, integration with the skill service, description, test instructions, etc. The interaction model contains plans for intents, slot types and utterances.

Fig. 1. The interaction between services used to develop the application.

Intent schema
The intent schema is a JSON object that contains a list of objects with information about each intent that the skill will be able to handle. The objects containing information about an intent store its name along with an optional list of objects indicating the slots that can appear within the intent, along with their types.
Intent names can be offered by ASK or defined by the developer, and the same holds for slot types. Our Bob application has several intents:
• AMAZON.CancelIntent is defined in ASK, and under this name are collected a series of replies that the user can give when they want to stop their tests;
• BobBeginTest is defined by the developer, and it includes replies that the user can say when they want to start a new general culture test;
• BobAnswer is defined by the developer, and it contains input phrases that the user can provide when he wants to answer a general culture question. Because there are many possible answers, a number of slots are used. Two of these can be seen in Figure 2:
• AnswerL, which has the AMAZON.Landform type, defined by the developer;
• AnswerC, which has the AMAZON.Country type, defined in ASK.

{ "intents": [
    { "intent": "AMAZON.CancelIntent" },
    { "intent": "BobBeginTest" },
    { "intent": "BobAnswer",
      "slots": [
        { "name": "AnswerL", "type": "AMAZON.Landform" },
        { "name": "AnswerC", "type": "AMAZON.Country" }
      ]
    }
  ]
}

Fig. 2. Part of the intents from the Bob application.

Slot types
The type of a slot is defined by listing all its possible values. These are responses that contain possible answers to questions, such as:
• AMAZON.GB_CITY contains the names of all cities that compose correct answers to questions related to cities, in the form most commonly used in Great Britain;
• AMAZON.Landform contains the names of landforms which can appear as correct answers to questions related to lakes. In a further stage of the application, other types of landforms will be added (mountains, hills, valleys, etc.);
• AMAZON.Country contains the names of countries that can be correct answers to questions about countries.

Utterances
Utterances are added to the Sample Utterances section of the interaction model. Through these utterances, we specify each possible input that can come from the user and precede it by the intent it relates to.
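For illustration, a few hypothetical sample utterances for the intents above might look as follows; the exact phrases used by Bob are not listed in this paper, so these lines are only a sketch of the intent-first format described above:

```
BobBeginTest start a new test
BobBeginTest give me a general culture test
BobAnswer the answer is {AnswerL}
BobAnswer i think it is {AnswerC}
```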

3.2. Skill service

In order to develop the application, the AWS Lambda service was chosen because it provides support for Alexa skills and facilitates calls to other AWS services through a lambda function (https://aws.amazon.com/lambda/). The programming language is Node.js, chosen because it is recommended for small server-side applications where non-blocking operations are very important.
In order to create a lambda function, the developer needs to access the management section in the AWS account. In the section dedicated to creating lambda functions, the first step involves selecting a blueprint, i.e. the running environment, which for Bob is Node.js 6.10, along with a template for the function to be developed. The second step is to configure the function trigger, in our case the voice interaction through the Alexa Skill Kit. The function configuration section provides information about permissions, memory limits, timeout, etc. After creating the function, its ID must be provided in the voice service configuration section. This step ensures that the information from the user is sent, after processing, to the lambda function that will decide on the answer to provide. In order to have a functional skill, the code needs to be deployed, in our case through the AWS Command Line Interface (CLI).
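A minimal skeleton of such a lambda function, in the classic callback style supported by the Node.js 6.10 runtime mentioned above, might look as follows; the handler body is a sketch, not the actual Bob code, and the response texts are invented:

```javascript
'use strict';

// Build a minimal Alexa response object (plain-text speech, optional session end).
function buildResponse(speechText, endSession) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: speechText },
      shouldEndSession: endSession
    }
  };
}

// Lambda entry point: dispatch on the intent name sent by the Voice Service.
function handler(event, context, callback) {
  var intentName = event.request && event.request.intent
    ? event.request.intent.name
    : null;

  if (intentName === 'BobBeginTest') {
    callback(null, buildResponse('Starting a new general culture test.', false));
  } else if (intentName === 'AMAZON.CancelIntent') {
    callback(null, buildResponse('Goodbye!', true));
  } else {
    callback(null, buildResponse('Sorry, I did not understand that.', false));
  }
}

exports.handler = handler;
```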

3.3. Description of the manipulation of the intents in the Bob application

The Bob application offers users the possibility to perform the following main actions:
• to receive general culture questions in the field of geography;
• to take general culture tests that end with a score which enables them to advance in the overall player ranking;
• to find their place in the overall ranking and the number of points they have accumulated;
• to post their accumulated score on Twitter.

We briefly describe the above-mentioned functionalities, assuming that we already have in the database the information needed to form questions. The next section shows how the database is populated with data extracted from two common information sources.
Building Questions: Existing studies in education have shown that complex questions, in contrast to simple factoid questions, have greater benefits for reading comprehension [6], [14]. Most of the existing approaches used to generate questions focus on generating questions from a single sentence [9], [16, 17], and only a few approaches use semantic analysis of the text [2], [7] or semantic resources to create test questions. The Bob application can receive requests from users to ask questions about cities, countries, lakes, or a random category. These are translated by the Voice Service into BobAskCity, BobAskCountry, BobAskLake and BobRandomQuestion respectively. We exemplify below how the BobAskLake intent is handled, the procedure being similar for all intents.
When the BobAskLake intent is received by the Skill Service component, the handleLakeIntent function that is part of the lake module is called. Following this call, the user is asked a question about a lake for which Bob finds information in the database. The handleLakeIntent function forms a configuration object containing: the maximum and minimum ID values in the database, the name of the table containing information about the category to which the question belongs (in this case, lakes), and the function that builds the question. Finally, it calls the ask_answer.handleAsk function, which, regardless of the category of the question to be formed, randomly chooses a record from the database. The ask_config.question function then creates the question. Before providing the user with the question, the correct answer is added to the session, along with other information, so that they can be used later.
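The per-category configuration plus a generic ask handler described above can be sketched as follows; the table name, field names and question wording here are illustrative assumptions, not the exact ones from the Bob code base:

```javascript
// Sketch of the configuration object built by a category handler such as
// handleLakeIntent; the real object also carries additional fields.
function makeLakeConfig(db) {
  return {
    minId: 1,
    maxId: db.lakes.length,
    table: 'lakes',
    question: function (record) {
      return 'On which continent is ' + record.name + ' located?';
    }
  };
}

// Generic handler: regardless of category, pick a random record and build
// the question with the category-specific function (cf. ask_answer.handleAsk).
// randomFn is injected so the choice can be made deterministic in tests.
function handleAsk(config, db, randomFn) {
  var id = config.minId +
    Math.floor(randomFn() * (config.maxId - config.minId + 1)) - 1;
  var record = db[config.table][id];
  return {
    question: config.question(record),
    correctAnswer: record.answer   // stored in the session for later checking
  };
}
```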
The request received from the Voice Service contains the request key, a JSON object containing information about the last interaction of the client with the Alexa voice interface. Within this object, a summary of the request is encoded, specifying the type of intent and the values of the slots, if any. Each slot of an intent is included in the object, but if it is not used in the client’s request, the value field will not exist. To evaluate the user’s response, it is extracted from the JSON request object and compared to the session data representing the correct answer. If there is a match, the answer is considered correct; the user is notified and, if the database contains more information, is asked whether he wants to find out more about the answer. The user will answer this question with Yes or No, which will generate an AMAZON.YesIntent or an AMAZON.NoIntent, respectively. These two intents can be generated at any time; for example, a user can say yes in response to a question or to an operation he asks Alexa to perform. In order to distinguish the moments when the user is asking for more information from the database, a MoreInfo field was added to the session, taking the value true when it is legal for the next intent to be AMAZON.YesIntent or AMAZON.NoIntent.
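A sketch of this evaluation step, assuming the JSON layout described above, could look as follows; the session key names (correctAnswer, hasExtraInfo) are illustrative, while the absence of the value field for unused slots follows the description in the text:

```javascript
// Evaluate a BobAnswer request against the session data. A slot the user did
// not fill has no "value" field at all, so we scan all slots for one that does.
function evaluateAnswer(request, session) {
  var slots = request.intent.slots || {};
  var given = null;
  Object.keys(slots).forEach(function (key) {
    if (slots[key] && slots[key].value) { given = slots[key].value; }
  });
  var correct = given !== null &&
    given.toLowerCase() === session.correctAnswer.toLowerCase();
  // Only allow a follow-up AMAZON.YesIntent/NoIntent when there is more to tell.
  session.MoreInfo = correct && session.hasExtraInfo === true;
  return correct;
}
```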
Testing: The Bob application can receive requests from users to start a new general culture test. This is translated by the Voice Service component into a BobBeginTest intent. The first question in a test is randomly chosen from the database, similar to the way the BobRandomQuestion intent is handled. To mark the fact that the user is in the middle of a test, a test key containing an object is added to the session. This object, in turn, has two key-value elements representing the number of remaining questions and the number of questions that the user has correctly answered. The only difference between the BobRandomQuestion and BobBeginTest intents is this session component. The answer is transmitted to the device integrated with Alexa via a callback that receives as parameters an object representing the session and an object containing the answer that Alexa will provide to the user, along with other information. In order not to duplicate code, we chose to send the BobRandomQuestion intent function a fake callback that adds the test-specific items to the session and then calls the real callback, which transmits the response to the user. Answering the questions in a test is similar to answering a standalone question. The only differences are the changes that need to be made to the test-specific elements in the session, and the fact that each question can be attempted only once.
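The session bookkeeping for a test can be sketched as follows; the key names (test, remaining, correct) follow the two key-value elements described above, but are assumptions about the exact naming:

```javascript
// Mark the session as being in the middle of a test.
function startTest(session, totalQuestions) {
  session.test = { remaining: totalQuestions, correct: 0 };
}

// Update the test counters after each answer; each question can be attempted
// only once. Returns true when the test is finished.
function recordAnswer(session, wasCorrect) {
  session.test.remaining -= 1;
  if (wasCorrect) { session.test.correct += 1; }
  return session.test.remaining === 0;
}
```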
Choosing the Questions for a Test: After completing a test, each player receives 4 or 10 points depending on the level of geography knowledge he/she possesses. We consider that there are two levels, beginner and master, beginners receiving 4 points at the end of the game and masters 10. To objectively determine the level of a player, we used an algorithm that adapts the questions during a test (a process called Computer Adaptive Testing, or CAT) [20]. The classification of a player’s level of knowledge is based on the answers given to questions, the a priori probability of each question, and the a priori likelihood of classifying the population into levels of knowledge. In order

to choose the most appropriate question, 3 randomly extracted questions in each category are selected from among the ones with the maximum information gain.
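The scoring rule and the information-gain-based choice of the next question can be illustrated as follows; the 4/10 point values come from the text above, while the infoGain field and the selection over pre-drawn candidates are simplifying assumptions about the CAT procedure, whose real update is more involved:

```javascript
// Points awarded at the end of a test: beginners get 4 points, masters 10.
function pointsForLevel(level) {
  return level === 'master' ? 10 : 4;
}

// Among the randomly drawn candidate questions of a category, pick the one
// that is expected to tell us most about the player's level.
function pickNextQuestion(candidates) {
  return candidates.reduce(function (best, q) {
    return q.infoGain > best.infoGain ? q : best;
  });
}
```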
Publishing the Score on Twitter: The Bob application can receive requests from users to post their score on Twitter. This is translated by the Voice Service component into a BobPost intent. In order to implement this functionality, the skill first had to be configured so that it is linked to the user’s Twitter account. For this, the application needs access to the Twitter user’s accessToken, through which it can access, modify, or add information to the user’s account. When a BobPost intent is received by the application, the score of the user, together with his place, is extracted from the database. A user’s ID can be extracted from the JSON object that the Voice Service component forms in order to handle customer requests. Once we have obtained the user data, the Twitter Node.js package posts the user’s score and place. To do this, we had to integrate an endpoint that implements the OAuth 2.0 protocol, for which we defined an authorization URL in the Alexa Skill Developer Portal, in the Configuration section.
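Composing and posting the tweet can be sketched as follows; the client.post('statuses/update', …) call follows the interface of the npm twitter package, but the tweet wording and the user record fields (score, rank) are invented for illustration, and a real client would be constructed with the user’s OAuth credentials:

```javascript
// Build the text of the status update from the user's database record.
function buildTweet(user) {
  return 'I scored ' + user.score + ' points with Bob and I am ranked #' +
    user.rank + '!';
}

// Post the score via an already-authenticated Twitter client object.
function postScore(client, user, callback) {
  client.post('statuses/update', { status: buildTweet(user) }, callback);
}
```

In tests (and in the sketch below) the client can be replaced by a stub, so no network access or credentials are needed.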

3.4. Extracting data for questions

In the previous section we assumed we already had the database of information needed to generate questions. To extract the data, two popular knowledge bases, DBpedia [8] and Wikidata [19], were used. Data from Wikipedia was previously used in similar approaches [18]. Access to the two knowledge bases is made through a separate application designed to extract and filter data, so as to only store the information that the developed skill will use.
Why is this a Separate Component? DBpedia and Wikidata can be accessed through a SPARQL endpoint. Initially, the Skill Service component handled the requests made to the two knowledge bases. Although at first glance communication is fast, with the increase in the number of extracted instances and with the creation of operations similar to those of relational databases, the latency increased considerably, reaching more than a second. The effect of the increased latency could easily be noticed by a user of the Bob application: the interaction between Alexa and the user no longer gave the impression of a natural conversation. To reduce latency, it was necessary to store the information needed to construct a question in a database. We chose DynamoDB [5] because it is a NoSQL database. To further reduce network latency, we created the database in the same region where we hosted the Skill Service component, meaning the Lambda function.
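The caching pattern described above amounts to a read-through cache: answer from the local database when possible, and only fall back to the slow SPARQL endpoints on a miss. A minimal synchronous sketch, with the cache and the fetch function both injected (function and field names are assumptions, not Bob’s actual code):

```javascript
// Read-through cache: the fast local store (DynamoDB in Bob) is consulted
// first; the slow path (a SPARQL endpoint) is hit only on a miss, and the
// result is stored for subsequent reads.
function getRecord(cache, fetchFromSparql, key) {
  if (Object.prototype.hasOwnProperty.call(cache, key)) {
    return cache[key];
  }
  var value = fetchFromSparql(key);
  cache[key] = value;
  return value;
}
```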
Extracting Data: In order to extract the data, a combination of Wikidata and DBpedia is used, as follows:
1. Make a query using the SPARQL endpoint provided by Wikidata. This query extracts data about the desired instance;
2. If the data extracted in step 1 is not enough, a query is made to the SPARQL endpoint provided by DBpedia;
3. Several similar response variants are generated. For example, if the correct answer is the Caspian Sea, a lake located in Asia, we will not consider lakes located in North America as response variants.
The use of both knowledge bases was necessary because, as the comparative study between DBpedia, Freebase, OpenCyc, Wikidata and YAGO [13] suggests, DBpedia tends to lack consistently structured data. The three steps above are exemplified below for data extracted about cities.
In step 1 of the information retrieval process, a relevant query is similar to the one in Figure 3, which retrieves, for each country, its most populous city, ordered decreasingly by the number of inhabitants.

SELECT DISTINCT ?item WHERE {
  { SELECT (MAX(?population) AS ?population) ?country WHERE {
      ?item wdt:P31/wdt:P279* wd:Q515 .
      ?item wdt:P1082 ?population .
      ?item wdt:P17 ?country . }
    GROUP BY ?country ORDER BY DESC(?population)
  }
  ?item wdt:P31/wdt:P279* wd:Q515 .
  ?item wdt:P1082 ?population .
  ?item wdt:P17 ?country .
  ?item wdt:P625 ?loc
} ORDER BY DESC(?population) LIMIT 100

Fig. 3. Example of SPARQL query.

In step 2, the SPARQL endpoint of the DBpedia knowledge base is called for information about each city. This step can sometimes cause problems because Wikidata and DBpedia have different methods of identifying instances. In DBpedia, the name is used, but sometimes there may be differences between the name it uses as an ID and what Wikidata provides as the name of the instance. For example, when extracting country data, Wikidata provides the name People’s Republic of China, which is the official name of the country, but DBpedia uses the name China. To mitigate the effects of this inconsistency, other predicates that specify the name in other forms, such as dbo:longName for the above example, can be identified. Then a query is built that matches the name given by Wikidata to the one provided by DBpedia.
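The reconciliation step can be sketched as a simple check of a Wikidata label against a DBpedia record’s primary name and its alternative names (such as the dbo:longName value); the record layout here is an illustrative assumption:

```javascript
// Does the Wikidata label match the DBpedia record under any known name?
// "longNames" stands in for alternative-name predicates such as dbo:longName.
function matchesDbpedia(wikidataLabel, dbpediaRecord) {
  var names = [dbpediaRecord.name].concat(dbpediaRecord.longNames || []);
  return names.some(function (n) {
    return n.toLowerCase() === wikidataLabel.toLowerCase();
  });
}
```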
In step 3, some plausible alternatives to the correct answer are extracted to make it difficult for the player to identify the correct answer. In the case of questions related to cities, locations in the same country as the correct answer were selected, together with some randomly selected cities, to ensure that we have at least four possible variants to choose from. Cities in the same country as the correct answer are selected according to the following criterion: they should be among those with a large population, i.e. in the top 12 cities with respect to population. In this step, one of the difficulties encountered was the fact that not all cities had Q515 (the city ID) as the object of the P31 (instance of) predicate. After analyzing the objects of the P31 predicate, we noticed that the instances that are cities are related to the Q15284 instance (the municipality ID), meaning that:
1. there is an object for which the value of the P31 predicate has the subject of the P279 (subclass of) predicate equal to Q15284, which in natural language means that the selected subject is an instance of a subclass of the concept of municipality;
2. the selected subject may also be an instance of a subclass that is, in turn, a subclass of the concept of municipality.
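The distractor-selection rule for city questions can be sketched as follows; the top-12-by-population criterion and the minimum of four variants come from the text above, while the record fields and the padding strategy are illustrative assumptions:

```javascript
// Build answer variants for a city question: prefer large cities from the
// same country as the correct answer (top 12 by population), then pad with
// other cities until at least four variants exist.
function buildVariants(correct, allCities, randomFn) {
  var sameCountry = allCities
    .filter(function (c) {
      return c.country === correct.country && c.name !== correct.name;
    })
    .sort(function (a, b) { return b.population - a.population; })
    .slice(0, 11);                       // correct answer + 11 = top 12
  var variants = [correct.name].concat(
    sameCountry.map(function (c) { return c.name; }));
  var others = allCities.filter(function (c) {
    return variants.indexOf(c.name) === -1;
  });
  while (variants.length < 4 && others.length > 0) {
    var i = Math.floor(randomFn() * others.length);
    variants.push(others.splice(i, 1)[0].name);
  }
  return variants;
}
```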

3.5. Usability testing

In order to evaluate the application, we invited seven participants (three female and four male) and organized
different playing sessions. We recorded their experience with the game while they were performing usual tasks.
During the tests, we took into account their social interaction activity: four participants showed high activity,
two moderate, and one reduced. Analysis of the recordings clearly showed that:
• Interaction was very attractive, despite a few points where it was a bit confusing;
• Difficulties were encountered because the questions did not target the participants' native country;
• The social interaction between participants motivated them to learn the presented notions and try the tests
again.
The usability test consisted of an introduction, five tasks, and a short interview. We instructed the participants
to think aloud and express their thoughts during the test. From our observations during the test sessions, the users
found the needed actions very quickly and were able to connect to the application successfully. On the other hand,
the participants had some trouble pronouncing complicated or compound words. The main issues were that the
speaking speed and the speaker volume could not be configured during a test. Another issue was the lack of
flexibility in choosing the domain of the questions during a test. Also, participants with a high level of knowledge
asked for the possibility to choose the level of difficulty at the beginning of a test, in order to skip the simple
questions in the first part of the test, while the system is guessing the user's level.

3.6. Error analysis

Most of the problems were related to understanding and evaluating the answers to the questions that Bob asks,
which affects the testing side. These components of the system are the most error-prone, because the user is in the position
to say complicated words that may not be correctly understood by Bob, especially if the user is a non-native
speaker. Unwanted behavior occurs when the user's response is not clear enough and fails to be associated with an
existing intent, or when the associated intent is not the expected one. These problems stem from the way the voice
service component works: when a user speaks, it compares the words, one by one, with the items in the list of
utterances that the system accepts, and identifies the one that seems closest to what the user uttered. Problems arise
when the user does not pronounce certain words correctly or speaks too quickly, or when the interaction with Alexa
is disturbed by other sounds or an unexpected event (for example, coughing or sneezing).
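The closest-match principle can be illustrated with Python's standard difflib module. This is only a sketch of fuzzy utterance matching under our own assumptions (the utterance list and cutoff are invented), not Alexa's actual algorithm:

```python
import difflib

# Hypothetical list of utterances the skill accepts for the current question.
ACCEPTED_UTTERANCES = ["Bucharest", "Budapest", "Belgrade", "post on Twitter"]

def match_intent(utterance: str, cutoff: float = 0.6):
    """Return the accepted utterance closest to what the user said,
    or None if nothing is similar enough (the input is then ignored)."""
    matches = difflib.get_close_matches(
        utterance.lower(),
        [u.lower() for u in ACCEPTED_UTTERANCES],
        n=1, cutoff=cutoff)
    return matches[0] if matches else None

# A slightly mispronounced answer still maps to the intended city,
# but a noisy utterance can map to an unrelated intent, as described above.
near_miss = match_intent("Bukarest")
noisy = match_intent("toast on Twitter")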
If the user interaction with Bob fails to be assigned to any intent, it will be ignored. Naturally, the user will
repeatedly say the most likely answer, which may give the impression that the application has been blocked. If the
interaction was assigned to another intent, which is unlikely, then the application will give an unexpected response.
For example, if during a test the correct answer sounds very similar to “Post on Twitter”, the application will tell the
user that the score has been posted on Twitter. One of the consequences of this behavior is the interruption of the
test. At this time, the problems with processing the input from the user cannot be solved by the Bob application
because this component is provided by Alexa. In the future, in order to improve the understanding of the user's
pronunciation, we intend to build a learning module in which Bob presents minimal information about certain
geographic elements, and the user's task is to repeat the name of each element.

4. Conclusions

The purpose of this paper was to use the Alexa Voice Interface to create a general culture game with voice
interaction, Bob, and to document each step in order to facilitate the development of other skills. As the Alexa
Voice Interface is being integrated into more and more devices, car and home appliance companies have integrated
it into their products. In the future, Bob can be a source of entertainment while we wait in traffic or while we
cook. In the coming period, more and more skills will appear on the Amazon site, and we tend to think that some of
them will be dedicated to education. Our main contributions come from proposing a solution for building a skill
that helps users learn geography, with (1) a new method of obtaining evaluation questions based on the extraction
of information from the DBpedia and Wikidata resources, and (2) a new method of adapting these tests to the user's
level based on CAT (Computer Adaptive Testing). To improve the application, new categories of questions from
the geography domain will be added; statistics can be obtained on the likelihood that users' answers to certain
questions will be correct or wrong depending on their knowledge level; and more levels of difficulty, as well as
other social networks on which to post the score, can be added. An important advantage over similar approaches
comes from the fact that our solution is cheap and can be used anywhere we have access to the Internet, through
Amazon Echo or through a smartphone.
Future work will focus on two main directions. The first is to use Bob in the learning and assessment processes of
students studying geography. The second is to expand Bob's knowledge base to related fields such as history,
biology, and literature, as well as to other fields such as art, sports, cinema, and music. Regardless of the domain
to be added, the programmer should follow these steps: (1) Decide which predicates of the Wikidata records will
take part in a question; (2) Find the relation between the IDs of the records in Wikidata and the ones used in
DBpedia, in order to increase the probability that the information we consider important for a question is not
missing; (3) In the skill service part, add the logic for forming the questions of the new domain, and modify
accordingly the configuration file that handles the test creation.
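The three steps above can be sketched as a single configuration entry. The structure, field names, and question template below are our own illustration of the idea, not the application's actual configuration file; P580/P582 are real Wikidata properties (start time / end time), and owl:sameAs is the usual DBpedia-to-Wikidata link:

```python
# Hypothetical configuration sketch for registering a new question domain,
# mirroring the three steps described above.
NEW_DOMAIN = {
    "domain": "history",
    # Step 1: Wikidata predicates that take part in a question.
    "wikidata_predicates": {
        "P580": "start time",
        "P582": "end time",
    },
    # Step 2: how a Wikidata record is related to its DBpedia counterpart,
    # so missing information can be filled in from either source.
    "dbpedia_link": "owl:sameAs",
    # Step 3: question template used by the skill service logic.
    "question_template": "In which year did {event} begin?",
}

def render_question(config: dict, event: str) -> str:
    """Fill the domain's question template with a concrete entity name."""
    return config["question_template"].format(event=event)

question = render_question(NEW_DOMAIN, "the French Revolution")
```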

Acknowledgements

This work was supported by a grant of the Romanian Ministry of Research and Innovation, PCCDI – UEFISCDI,
project number PN-III-P1-1.2-PCCDI-2017-0818 / 73PCCDI, within PNCDI III. It was also partially supported by
POC-A1-A1.2.3-G-2015 program, as part of the PrivateSky project (P_40_371/13/01.09.2016).

