
How to Design Personalized Challenges for Long-Term Engagement? Ask your players

Leave Authors Anonymous for Submission
City, Country
e-mail address

ABSTRACT

INTRODUCTION
Serious games can be defined as "a game in which education (in its various forms) is the primary goal, rather than entertainment" [14]. Research on serious games is growing, and they are routinely applied for learning, education, and training. For example, in recent years they have been successfully applied with the objective of increasing citizens' engagement in sustainable mobility behaviours [10, 8, 5, 6].

Although serious games have been proven to effectively promote changes in the behaviour of end users, one crucial aspect of their application is long-term user engagement. To be truly effective, the behavioural changes they promote must be sustained over time. The typical approach is to recognize the effort of the players based on their performance and to provide gratification, some of which must be independent of the accumulation of game status over time.

A common game mechanic for this objective is the challenge. It consists in proposing that a given player reach a specific goal whose achievement requires a prolonged individual commitment, typically within a limited period that is significantly shorter than the game as a whole. A prize is awarded to the player if he manages to fulfil the goal within the allotted time.

Challenge goals target either in-game performance (such as gaining a certain number of points, reaching a game level, or completing a badge collection) or specific behaviours promoted by the game (such as performing a certain number of virtuous actions).

To maximise the engagement provided by the challenge mechanic, two concepts are fundamental: the choice between different goals, and the difficulty of each goal.

By providing a choice between goals within the gamified system, players are able to create their own story. Feeling this autonomy in the system directly links to a more positive mental state, as suggested by Self-Determination Theory [15].

Moreover, the difficulty of the goals should increase as the player's skill increases. This is the concept of flow: a player who is in a state of flow is fully engaged with the system. This state can occur when the player understands what actions need to be taken to reach specific goals, which typically happens when the difficulty of the challenges in the gamification system increases along with the player's skill [3]. When a gamification system does not get more challenging, it creates boredom. On the other hand, the player may feel anxiety or frustration if the challenge is too far above his or her skill level. Engagement is reached when the challenges match the skill level of the player.

In this paper we present a novel approach for the automatic generation of personalized challenges, taking into account the need to ensure both choice and appropriate difficulty in order to maximize long-term user retention. We tackle this problem through Procedural Content Generation (PCG) of playable units that appeal to each individual player and make her user experience more varied and compelling [7].

We evaluate the proposed approach in the context of a large-scale, long-running sustainable urban mobility project. The approach can be directly applied to any other gamification application that bases user advancement on performance.

RELATED WORK
Gamification is an appealing topic, but over the years some common problems have arisen. One of the main limitations is the difficulty of sustaining the interest of a player in a gamified system over the long term. As players' interest in the game decreases, the behavioural-change effects brought by the game typically decrease as well. Since players have different characteristics, a universal solution is infeasible, as it would bring some players to boredom and others to frustration.

A solution is to exploit personalization techniques able to tailor the game experience to each player. Procedural Content Generation (PCG) is a set of techniques designed to automate the construction of Game Design Elements (GDEs), such as game items, encounters, buildings, or even whole levels. This improves the experience of players by reducing repetition, which is usually perceived negatively.
For example, in [17] the authors introduce a combinatorial optimization approach for creating training scenarios from scratch, tailoring them to the skill level of the players. This in turn leads to an increase in user engagement and to more effective training.

Special attention must be given to the calibration of the difficulty of the game. "Given that both failure and success can become repetitive quickly, games must address the problem of meeting all players with the correct level of challenge." [13]. Rather than rewarding only with points and badges, it is necessary to provide opportunities for the player to succeed in a game by overcoming challenges [1].

The idea of personalizing game content to the players shows great potential for keeping players engaged through the concept of flow, which can be described as a balance between challenge and competence, or between complexity and boredom [4].

One of the goals of adaptive challenge calibration is to keep players in a state of flow. Many attempts have been conducted in this direction.

In [12], the authors apply Dynamic Difficulty Adjustment (DDA) to the game of Tetris. The idea is to observe a given game, extract a defined set of features based on piece movement and placement, and categorize the player's skill on a three-level scale (newbie, average, expert). Based on this placement, the game varies the number of "helpful" next Tetris pieces provided. Their experiments show that users evaluate their experience more favourably when it is improved by DDA.

In [16], the authors develop automatic generation of tracks for a racing game. They first evaluate the player's skill, and then proceed to generate a track that the player would describe as "fun". This is determined by the difficulty of the curves presented in the track, and by the variation in this difficulty.

In [2], the authors aim to introduce difficulty adjustment in the context of Human Computation Games (HCGs). They do so by drawing inspiration from the user matching commonly found in multiplayer games, in order to determine an interesting choice of tasks for the final user.

PROBLEM DEFINITION
We implemented our approach in the context of FictiousGameName, a gamified system whose purpose is to induce a Voluntary Travel Behavior Change (VTBC). The game aims to promote CO2-free means of transportation and/or public transportation. The project has been run annually since 2015. The last iteration was conducted over a six-month period (October 2018 - March 2019?), spanning 25 weeks.

In FictiousGameName, citizens record their daily trips, indicating the mean of transportation: walking, bike, bus, train, or car. For each virtuous trip (all but the last one) they receive points, called green leaves, which are then used to compute the leaderboard. Tracked itineraries are checked by an automatic itinerary validation algorithm. Each week, physical prizes offered by sponsors were awarded to players chosen from the weekly top 50. The leaderboard and the prizes constitute the basic extrinsic motivation for the players to play the game.

In order to further incentivize players and keep them engaged, weekly challenges were introduced. They consist of a goal given to the player, to be completed within a given maximum amount of time. If completed, they reward the user with a given amount of additional green leaves (the point mechanic used in the game). The most common kind of challenge asked the user to reach a given amount of tracked kilometres using a given mean of transportation during the course of a week.

Given their nature, the challenges naturally need to be tailored to the player's mobility habits. The objective of the goal is to promote a personal improvement of the citizen's lifestyle. To be effective, this improvement should be "appropriate". Asking for an improvement that is too slight brings no utility to the overall goal of the gamified system. On the other hand, an improvement that is too large might scare the user and cause him to leave the game. This shares many similarities with the concept of flow.

Choice of the user, autonomy
The concepts of choice and autonomy are very important for the creation of an immersive gameplay. Thus they were introduced in the gamified system right from the start.

For all users the gamified system provided a fixed challenge to be beaten in the following week. However, upon reaching a given level, users were given two challenges to choose from, and later three challenges. The required levels were fairly low, in order to offer this game mechanic to every player who was even slightly active, while still presenting it as a reward. The game offered two different kinds of choices.

The first was the choice of the performance indicator to improve. Five were available in the gamified system: the total amount of kilometres recorded by either walking or biking, the total amount of trips recorded using either the bus or the train, or the amount of green leaves (the point system) collected. The gamified system offered a choice between the indicators in which the user was most active in the previous week. The goal for each indicator was computed based on the user's past performance. We will detail this in Section 5.

The second kind was the choice of the goal. The performance indicator was automatically chosen as the one in which the user was most active in the previous week. Again, the initial goal was computed based on the user's past performance, as detailed in Section 5. Based on this, additional goals were presented to the user. For the two-challenge choice, the additional goal was 10% higher than the initial one. For the three-challenge choice, the two additional goals were respectively 10% higher and 10% lower. Each goal would bring a different amount of green leaves (the points in the serious game), giving the user a choice over the difficulty of their game.

PERFORMANCE ESTIMATION
As a preliminary step, we modelled the performances of each player as time series, with the objective of predicting future performances based on the past recorded ones. This unfortunately proved infeasible.

We used GMDH, a method that has been credited with superior performance over classical forecasting algorithms such as Single Exponential Smoothing, Double Exponential Smoothing, ARIMA, and back-propagation neural networks [11]. On the task of predicting the performance of the next week, GMDH obtained a mean MASE (Mean Absolute Scaled Error, [9]) of 1.46. This means that it was less accurate than the naïve predictor, which simply forecasts the last seen value.

We speculate that this is due to two primary reasons: the scarcity of data in the time series, and the high volatility of the players' performance. Regarding the first point, the number of observations in each time series was at most 25. The number of non-zero values, however, was typically much lower, as most of the players started playing after the game's start, and some stopped playing before its end.
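The MASE comparison above can be reproduced with a short routine. A MASE above 1 means the forecaster did worse than the naïve last-value predictor; the sketch below uses plain Python and a hypothetical weekly-kilometres series (not data from the paper):

```python
def mase(actual, forecast):
    """Mean Absolute Scaled Error [9]: the forecast's mean absolute
    error, scaled by the in-sample error of the naive predictor that
    repeats the previous week's value. MASE > 1 means the forecast
    is less accurate than the naive predictor."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_mae = sum(abs(actual[i] - actual[i - 1])
                    for i in range(1, len(actual))) / (len(actual) - 1)
    return mae / naive_mae

# Hypothetical weekly walking kilometres and a model's one-step forecasts.
weeks = [12.0, 15.0, 9.0, 14.0, 11.0]
predicted = [11.0, 13.0, 12.0, 12.0, 13.0]
print(round(mase(weeks, predicted), 2))  # → 0.47
```

A GMDH score of 1.46 on this metric, as reported above, therefore indicates that the model lost to the trivial "repeat last week" baseline.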

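The two- and three-challenge choices described above follow a fixed scheme: the initial goal, a 10% harder variant, and (for three choices) a 10% easier one. A minimal sketch, with hypothetical function and parameter names:

```python
def goal_variants(initial_goal, n_choices):
    """Goals offered to a player: the initial goal computed from past
    performance, plus a 10% higher variant (two choices) and a 10%
    lower one (three choices), as described in the text."""
    variants = [initial_goal]
    if n_choices >= 2:
        variants.append(initial_goal * 1.10)  # 10% higher
    if n_choices >= 3:
        variants.append(initial_goal * 0.90)  # 10% lower
    return [round(v, 1) for v in variants]

print(goal_variants(20.0, 3))  # → [20.0, 22.0, 18.0]
```

Each variant would be paired with a proportional green-leaves reward, so picking a harder goal also means a bigger prize.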
DIFFICULTY CALIBRATION
As we have mentioned, deciding the correct difficulty of a task is a challenge in itself. A difficulty too low, or too high, would push the player to stop playing, through boredom or frustration respectively. This concept is easily explained, but difficult to define.

Drawing ideas from (?), we designed a questionnaire to help us understand the range of improvement we should expect from a player. The questionnaire consisted of nine graphs of past performance. Each graph regarded a single action recorded in the game over the last six weeks, in order. For example, the first graph, shown in Figure 1a, pertains to the total amount of kilometres that a user recorded by walking in each of the last six weeks (the last point is the most recent).

[Figure 1: Example of a graph of the questionnaire, pertaining to the total amount of kilometres recorded walking during the previous six weeks. (a) Question posed to the user. (b) Example of answer. (c) Example of aggregated response.]

The nine graphs were taken from real data, and chosen to represent the whole range of possible performance progressions, as shown in Figure ??. The first row shows a rising pattern (the user is improving their performance); the second row a stable pattern (the user is maintaining their performance); the third row a declining pattern (the user is worsening their performance). The second and third columns exhibit a sudden change in performance during one week. Each user received a questionnaire in which the graphs were randomly shuffled.

We selected 20 users from (explain where and why).

We asked each user to write on the graph what he or she thought would have been the correct challenge to issue to that particular player, taking into account only the short window of past performance shown. Users were also able to write down a short description of their thought process, as shown in Figure 1b.

We collected the answers and recorded the proposed goal for each graph. An example of the aggregated responses is given in Figure 1c, while the complete responses are shown in Figure ??.

The next step was to find a function able to approximate the intentions expressed in the questionnaire. We desired a function that was easy to interpret. By its definition, a challenge requires an increment of the player's efforts. We therefore chose to model the challenge prediction as:

C = P · I,    (1)

where C is the goal of the challenge, P is the prediction of the player's efforts, and I is the improvement. We chose to model it as a linear function; future work could assess other functions.

For the estimation of P, we considered the following extrapolation functions: Linear, Polynomial, Conic, Moving Average, and Weighted Moving Average. The rationale was to approximate the intentions expressed in the questionnaire, as all of the respondents' reasoning took into consideration the past performances shown in the graph. In particular, no respondent indicated that their reasoning was based only on the last performance. Thus all methods were tested taking into consideration a number of previous points in the range [2, 5] (that is, basing the prediction on the performance of the past two weeks, three weeks, and so on, up to five).

The functions were evaluated by comparing their outcome to the mean responses of the questionnaire, using the MAPE (Mean Absolute Percentage Error) as the error function, since the graphs pertained to different performance indicators with different ranges.

In the end, the lowest MAPE was obtained with the combination of WMA-5 as the prediction function (Weighted Moving Average over the last 5 recorded levels of performance) and I = 1.3, corresponding to an increase of 30% over the predicted user performance.

EVALUATION
We employed the proposed approach during the course of the serious game. For the first part of the game, the challenges were proposed with the naïve approach of setting a goal directly computed from the performance observed during the last week and a fixed improvement factor in {1.2, 1.3, 1.4}. Our novel approach was introduced starting from the eleventh week. This allows us to directly compare its effects with those of the naïve approach.

[Figure 2: Mean difficulty of the challenges proposed in the serious game.]

In Figure 2 we compare the mean difficulty of the challenges proposed. The difficulty was computed as the ratio between the requested performance and the performance observed during the last week. It is a measure of the effort required of the players. We can observe that the mean difficulty was higher during the first weeks, since most of the players were just starting to play, and slightly decreased during the following weeks. After the introduction of the novel approach, at week 11, the mean difficulty of the challenges was more regular, and slightly higher than during the application of the naïve approach.

The main objective of this work was to improve user engagement. As a measure of this we compare the choice rate of the challenges. When players reached a fixed level they were proposed a set of challenges, and were able to choose their challenge for the next week. If a user didn't make a choice, the system would automatically choose one challenge for them at twelve o'clock on Friday. We then measured the percentage of players who did choose their challenge, out of all the players that were given a choice (a set of different available challenges). We call this percentage the choice rate. We use it to measure user engagement, as a user who performed the choice can reasonably be expected to still be interested in the game and actively playing, compared to a user who passively accepts the choice made for them by the system. We considered for each week only the users that recorded at least one activity during that week; inactive users were thus not considered. The results are shown in Figure 3.

[Figure 3: Choice ratio of the challenges proposed in the serious game.]

In the first weeks the choice rate was high, indicating a high engagement most probably due to the novelty of the game, but it soon started to decrease, most probably due to players losing interest. After the introduction of the novel approach, at week 11, the choice rate showed an increase that was sustained over time. We can conclude that the novel approach had a direct effect in improving user engagement within the gamified system.

We also observed the effect of the novel approach on the completion rate of the challenges, that is, the percentage of users that completed their challenges. Again, we considered for each week only users that were active during that week. The results are shown in Figure 4.

[Figure 4: Completion ratio of the challenges proposed in the serious game.]
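Putting the pieces of the calibration described earlier together, weekly challenge generation reduces to a weighted moving average over the last five recorded performances followed by a 30% increase (Eq. 1 with I = 1.3); the difficulty measure used in this evaluation is then the ratio between the resulting goal and the last observed performance. In the sketch below, the linearly increasing weights are an assumption, as the text names WMA-5 without specifying the weighting scheme:

```python
def wma(history, n=5):
    """Weighted moving average over the last n recorded performances.
    Linearly increasing weights (most recent week counts most) are an
    assumption; the text only states that WMA-5 was selected."""
    recent = history[-n:]
    weights = list(range(1, len(recent) + 1))
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)

def challenge_goal(history, improvement=1.3):
    """C = P * I (Eq. 1): predicted performance times the improvement."""
    return wma(history) * improvement

history = [10.0, 12.0, 8.0, 14.0, 11.0]  # hypothetical weekly kilometres
goal = challenge_goal(history)
difficulty = goal / history[-1]          # requested / last observed performance
```

A flat history of 10 km/week yields a goal of 13 km, i.e. exactly the 30% increase over the prediction.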
During the first weeks the completion rate was high, but it quickly degraded, probably due to the novelty of the game wearing off. After the introduction of our novel approach, the completion rate increased, stabilizing at a higher mean value. This shows that our novel approach improved the completion rate within the game system. It should be noted that during the second part of the game the mean difficulty of the challenges did not decrease, as shown in Figure 2.

Finally, we also observed the improvement in the desired behavioural change brought by the gamified system. We compare the mean improvement of the users' performance for each week, for each of the desirable actions rewarded by the game: the total number of kilometres recorded by walking or biking, the total number of trips recorded using the bus or the train, or the amount of green leaves (the point system) collected. In this evaluation too, we considered for each week only the users that were active in that week. The improvement is computed as the ratio between a user's performance in a given week and their performance in the previous week. We show the mean improvement in Figure 5.

[Figure 5: Mean improvement of the challenges proposed in the serious game, for each of the five indicators: Walk Km, Bike Km, Bus Trips, Train Trips, and green leaves.]

Following the introduction of the novel approach, the mean improvement generally increased for all five indicators.

CONCLUSION
We introduced a novel approach for the automatic generation of challenges in a gamified system whose purpose was to promote CO2-free and public transportation. Our approach was based on the responses to a questionnaire we devised with the goal of finding a consistent approach to challenge generation.

REFERENCES
1. Darryl Charles, A. Kerr, M. McNeill, M. McAlister, M. Black, J. Kcklich, A. Moore, and K. Stringer. 2005. Player-centred game design: Player modelling and adaptive digital games. In Proceedings of the Digital Games Research Conference, Vol. 285.

2. Seth Cooper, Christoph Sebastian Deterding, and Theo Tsapakos. 2016. Player Rating Systems for Balancing Human Computation Games. In Proceedings of the 1st International Joint Conference of DiGRA and FDG.

3. Mihaly Csikszentmihalyi. 1997. Finding flow: The psychology of engagement with everyday life.

4. Mihaly Csikszentmihalyi. 2014. Flow and the foundations of positive psychology. Springer.

5. Silvia Gabrielli, Paula Forbes, Antti Jylha, Simon Wells, Miika Sirén, Samuli Hemminki, Petteri Nurmi, Rosa Maimone, Judith Masthoff, and Giulio Jacucci. 2014. Design challenges in motivating change for sustainable urban mobility. Computers in Human Behavior 41 (2014), 416–423.

6. Juho Hamari, Jonna Koivisto, and Harri Sarsa. 2014. Does gamification work? A literature review of empirical studies on gamification. In 2014 47th Hawaii International Conference on System Sciences.

7. Mark Hendrikx, Sebastiaan Meijer, Joeri Van Der Velden, and Alexandru Iosup. 2013. Procedural Content Generation for Games: A Survey. ACM Trans. Multimedia Comput. Commun. Appl. 9, 1 (2013), 1:1–1:22.

8. Paul Holleis, Marko Luther, Gregor Broll, Hu Cao, Johan Koolwaaij, Arjan Peddemors, Peter Ebben, Martin Wibbels, Koen Jacobs, and Sebastiaan Raaphorst. 2019. TRIPZOOM: A System to Motivate Sustainable Urban Mobility. In 1st Int. Conf. on Smart Systems, Devices and Technologies.

9. Rob J. Hyndman and Anne B. Koehler. 2006. Another look at measures of forecast accuracy. International Journal of Forecasting (2006), 679–688.

10. R. Kazhamiakin, A. Marconi, M. Perillo, M. Pistore, G. Valetto, L. Piras, F. Avesani, and N. Perri. 2015. Using gamification to incentivize sustainable urban mobility. In 2015 IEEE First International Smart Cities Conference (ISC2). 1–6.

11. Rita Yi Man Li, Simon Fong, and Kyle Weng Sang Chong. 2017. Forecasting the REITs and stock indices: Group Method of Data Handling Neural Network approach. Pacific Rim Property Research Journal 23, 2 (2017), 123–160.

12. Diana Lora, Antonio A. Sánchez-Ruiz, Pedro A. González-Calero, and Marco Antonio Gómez-Martín. 2016. Dynamic Difficulty Adjustment in Tetris. In FLAIRS Conference. 335–339.

13. Randy J. Pagulayan, Kevin Keeker, Dennis Wixon, Ramon L. Romero, and Thomas Fuller. 2002. User-centered design in games. In The human-computer interaction handbook. CRC Press, 915–938.

14. David R. Michael and Sandra L. Chen. 2006. Serious Games: Games That Educate, Train, and Inform. (2006).

15. Richard M. Ryan and Edward L. Deci. 2000. Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. (2000).

16. J. Togelius, R. De Nardi, and S. M. Lucas. 2007. Towards automatic personalised content creation for racing games. In 2007 IEEE Symposium on Computational Intelligence and Games. 252–259.

17. Alexander Zook, Stephen Lee-Urban, Michael R. Drinkwater, and Mark O. Riedl. 2012. Skill-based mission generation: A data-driven temporal player modeling approach. In Proceedings of the Third Workshop on Procedural Content Generation in Games. 6.
