You are on page 1of 8

The Effects of Age on Player Behavior

in Educational Games

Eleanor ORourke, Eric Butler, Yun-En Liu, Christy Ballweber, and Zoran Popovic
Center for Game Science
Department of Computer Science & Engineering, University of Washington

ABSTRACT lenges for designers, since it is difficult to create games that

Casual games attract a diverse group of players with varied appeal to diverse audiences without first understanding how
needs and interests. In order to effectively tailor games to players differ. It is also challenging for researchers to gen-
specific audiences, designers must consider the effects of de- eralize experimental results without understanding the ex-
mographics on player behavior. This is particularly impor- pected behaviors of the populations they study.
tant when developing educational games for children, since
research shows that they have different design needs than These challenges are particularly prevalent in the design of
adults. In this work, we develop in-depth metrics to capture games for children. Young children make up a large portion
demographic differences in player behavior in two educa- of video game players, in part due to the growing interest in
tional games, Refraction and Treefrog Treasure. To learn games as educational tools [11, 20]. Child-computer interac-
about the effects of age on behavior, we use these metrics to tion researchers have shown that the design needs of children
analyze two player populations, children on the educational and adults differ, and that technology for children should
website BrainPOP and adults of the popular Flash website consider these requirements [22, 9]. Game researchers have
Kongregate. We show that BrainPOP players make more developed guidelines that suggest design considerations for
mathematical mistakes, display less strategic behavior, and childrens games [19, 16, 3]. This work provides a valuable
are less likely to collect optional rewards than Kongregate theoretical grounding, however very few of these guidelines
players. Given these results, we present design suggestions have been verified through empirical studies comparing the
for casual games that target children. behavior of adults and children in widely released games.

Categories and Subject Descriptors In this work, we study the effects of age on player behavior
in natural settings with two casual online games developed
K.8.0 [Personal Computing]: General Games; H.5.0
by our research group, Refraction and Treefrog Treasure.
[Information interfaces and presentation]: General
Previous research has shown that it is challenging to col-
lect demographic information from online players, and as
Keywords a result most existing studies rely on optional surveys or
games, player behavior, analytics, children. bring players into unnatural settings such as labs [27, 21].
Instead of collecting demographic data explicitly, we note
1. INTRODUCTION that many websites target a demographically distinct pop-
Video game players are a diverse group of people with varied ulation of players. We released our games on two websites:
interests, backgrounds, and motivations for playing. A re- Kongregate, which targets males ages 18 to 35, and Brain-
cent study shows that women make up 47% of players, and POP, which targets elementary school children. We compare
that player age is evenly split between children under 18, the behavior of these two player populations by analyzing
young adults ages 18 to 35, and adults older than 35 [10]. detailed telemetric data. Though we do not know the demo-
Games are also rising to prominence as a way to motivate graphics of any particular player, we can measure systematic
people to achieve serious goals, such as education [11], health differences in behavior and make inferences about the pos-
[26], or scientific discovery [8]. Serious games must appeal to sible causes based on the target population of each website.
widely varied audiences to successfully achieve these goals.
Despite this diversity of players, little is known about how We describe the fine-grained metrics we developed for Re-
demographics affect in-game behavior. This presents chal- fraction and Treefrog Treasure to measure differences in
player behavior, and present results from an analysis of 8,000
players showing that the Kongregate and BrainPOP popula-
tions behave very differently. Although they show different
levels of engagement in the two games, we find that Brain-
POP players make more mathematical mistakes, display less
strategic behavior, and are less likely to collect optional re-
wards than Kongregate players in both games. We provide
explanations for our results and discuss possible implications
for the development of games that target children.
2. BACKGROUND classroom, and offers 54 games. While BrainPOP does not
Human computer interaction researchers have studied how collect demographic information about players, the Alexa
children use technology for decades, showing that they have rankings show that the website is frequented by children
unique design needs [22, 9]. This research suggests that cog- and women ages 35 to 44, who are most likely teachers [1].
nitive development affects childrens interactions with tech-
nology. Gelderblom and Kotze present design guidelines for Given the distinct target audiences of these websites, we
children based on cognitive theories and describe concrete expected to observe differences in the behavior of the two
lessons from their research [12, 13]. They suggest that chil- populations. Many factors could produce these differences,
drens limited short-term memory affects their ability to re- such as variations in the website interfaces or differences
member instructions, and that their problem-solving strate- in the social incentives and achievements provided by each
gies are influenced by brain maturation, conceptual under- site. Kongregate offers more games than BrainPOP, giving
standing, and past experiences [13]. Other work has com- players many alternatives if they become bored with their
pared how children and adults search the web [14, 4], finding current game. BrainPOP is primarily used in the classroom,
that children repeat the same search queries frequently, pos- which could influence player interactions. Despite these dif-
sibly due to their poor cognitive recall. ferences, we believe that player demographics will have the
strongest effects on behavior. Kongregate attracts young
Researchers have applied cognitive theories to game design male adults with prior gaming experience, while BrainPOP
as well, developing guidelines for childrens games. Moreno- attracts children from diverse backgrounds. These popula-
Ger et al. explore methods of balancing fun and pedagogical tions will have different skill sets and developmental abilities,
goals [19], Linehan et al. look at ways to incorporate the strongly affecting how they interact with games.
Applied Behavioral Analysis teaching method into games
[16], and Baauw et al. design a question-based evaluation 2.2 Expected Behavioral Differences
methodology for assessing fun and usability in childrens
While there are many demographic differences between the
games [3]. This work provides a valuable foundation, how- populations we studied, we expected player age to affect
ever these high-level guidelines are not based on empirical behavior the most. Previous work has shown that age can
evidence. Very few studies explore the effects of age on in-
influence play more strongly than gender [27], which will
game behavior, even though Yee found that age is the most
be most apparent when comparing very young players and
effective predictor of players behavioral roles in a study of adults. This intuition informed our hypotheses about the
adults in online role-playing games [27]. Most closely related differences in behavior we expected to observe, which are
is a study by Pretorius et al. that compared how adults over based on in-person observations of children and adults and
40 and children 9 to 12 learn an unfamiliar game by record-
relevant theories in education and cognitive development.
ing eye-tracking data and observing interactions with the
tutorial. They found that adults approached the new game Both Refraction and Treefrog Treasure are games about frac-
systematically and read instructions, while children ignored tions. Fractions are a challenging concept for most elemen-
instructions in favor of a trial-and-error approach [21].
tary school children, and are considered one of the first se-
rious roadblocks in math education [25]. As a result, we
With this work, we build on existing research in child- expected BrainPOP players to make more mistakes relating
computer interaction and cognitive development. We design to fractions than Kongregate players. Although many adults
in-depth behavioral metrics, perform an empirical analysis find fractions difficult, we expected them to have stronger
of how adults and children play games, and provide design
mathematical skills than children on average.
considerations for games for children.
Hypothesis 1: Kongregate players will make fewer mathe-
2.1 Game Website Audience matical mistakes than BrainPOP players.
We released our games on two casual game websites that at-
tract very different types of players. The first, Kongregate, Children are also likely to display fewer strategic and
is a popular portal for free Flash games. Developers can up- problem-solving skills than adults. Extensive research in
load games to the site, which currently has over 50,000 titles cognitive development has shown that problem-solving skills
available. Kongregate provides a variety of social features, take years to develop [23, 24]. Children develop these skills
and players can create optional accounts to become part slowly as they refine strategies and develop the metacogni-
of this community. Kongregate attracts 15 million unique tive abilities needed to reflect on problem solutions [5]. As
players per month, which they report are 85% male with an a result, we expected that adult players would display more
average age of 21 [15]. The Alexa rankings for the website strategic behavior than children.
support this data, showing that visitors are predominantly
males between ages 18 and 24 [2]. Hypothesis 2: Kongregate players will display more strate-
gic behavior than BrainPOP players.
The second, BrainPOP, is a popular educational website
[6]. BrainPOP is best known for its curriculum resources, Achievement systems are a huge part of the casual game
including content for students and support materials for industry for websites like Kongregate [18], and previous re-
teachers. The BrainPOP Educators community has over search has shown that Kongregate players will go out of
210,000 members [7], and the website is used as a resource their way to collect optional rewards [17]. However, during
in around 20% of elementary schools in the United States playtesting we have observed that children are less interested
(Traci Kampel, personal communication). BrainPOPs ed- in rewards, often ignoring them entirely. We expected adult
ucational game portal GameUp was designed for use in the players to care about optional rewards more than children.
Hypothesis 3: Kongregate players will collect more op-
tional rewards than BrainPOP players.

We explore these hypotheses by conducting an in-depth

analysis of player behavior in two educational games.

In this work, we study player behavior in two educational
games developed by our research group: Refraction and
Treefrog Treasure. Both games were designed to teach frac-
tion concepts to elementary school children, however they
have found success with older audiences as well. Both games
are implemented in Flash and played in a web browser, and
while both cover mathematical topics, they come from dif-
ferent genres and provide distinct gaming experiences.
Figure 1: A level of Refraction. The pieces on the
3.1 Refraction right are used to split lasers into fractional amounts
Refraction is a puzzle game that involves splitting lasers and redirect them to satisfy the target spaceships.
into fractional amounts. The player interacts with a grid All ships must be satisfied at the same time to win.
that contains laser sources, target spaceships, and aster-
oids, as shown in Figure 1. The goal is to satisfy the target
spaceships and avoid asteroids by placing pieces on the grid.
Some pieces change the laser direction and others split the
laser into two or three equal parts. To win, the player must
correctly satisfy all targets at the same time, a task that re-
quires both spatial and mathematical problem-solving skills.
Refraction has 62 levels, some of which contain coins, op-
tional rewards that can be collected by satisfying all target
spaceships while a laser of the correct value passes through
the coin. It has been played over 200,000 times on Brain-
POP since its release in April 2012, and over 500,000 times
on Kongregate since its release in September 2010.

3.2 Treefrog Treasure

Treefrog Treasure is a platformer that involves jumping
through a jungle world and solving numberline problems to
reach an end goal. The player navigates sticky, bouncy, and Figure 2: A screenshot of Treefrog Treasure. The
slippery surfaces and avoids hazardous lava to win. Num- player navigates sticky, slippery, and bouncy sur-
berline problems serve as barriers that the player solves by faces and solves numberline problems to progress to
hitting the correct target location, as shown in Figure 2. the finish. Gems are collected along the way.
The player can collect gems placed throughout the level,
and gains additional gems by solving numberline problems.
Gems are lost when the player dies, but they are scattered
around the area of death and can be recollected. Treefrog periods, we randomly sampled 3,000 players from each set
Treasure has 36 levels, covering increasingly complex num- to control for possible timing effects.
berline concepts. The game is newer than Refraction; it has
been played over 90,000 times on BrainPOP and over 17,000 We also collected two Treefrog Treasure data sets. Due to
times on Kongregate since its release in November 2012. On time constraints, we collected less Kongregate data than we, the most popular Flash game portal in China, it wanted. The Kongregate data set contains 1,412 players,
has been played over 5 million times since July 2012. collected from December 12, 2012 to December 13, 2012.
Kongregate does not award any badges for this game. The
BrainPOP data set contains 30,893 players, collected from
3.3 Data Collection December 7, 2012. to December 13, 2012. We randomly
We collected two Refraction data sets, one from Kongregate sampled 1,000 players from each data set for our analysis.
and one from BrainPOP. The Kongregate data set contains
6,174 players and was collected from April 1, 2012 to De- Both games recorded and logged every interaction players
cember 2, 2012. We note that Kongregate awards players a made with the game or its interface. We only included new
badge if they collect all Refraction coins, which could influ- players who were not familiar with the games in our analysis,
ence player motivations. The BrainPOP data set contains and only used data from a players first session to control for
4,756 players and was collected from November 26, 2012 un- issues with shared computers in schools. In both games, we
til December 3, 2012. Since these data sets had different tracked players by storing their progress in the Flash cache,
numbers of players and were collected over different time allowing us to selectively include new players and determine
whether players returned to the game. One drawback of pothesis, we designed game-specific metrics to capture math-
this method is that players who clear the cache or change ematical mistakes in Refraction and Treefrog Treasure.
computers will be treated as new players. However, since the
Flash cache is inconvenient to clear and this action deletes 4.2.1 Refraction
all game progress, we considered this risk to be small. Refraction players must understand fractions to split lasers
appropriately and satisfy target ships. We measure two com-
4. DATA ANALYSIS AND RESULTS mon mistakes to capture mathematical understanding. The
We explore our hypotheses by performing a statistical anal- first occurs when players try to satisfy a ship with an in-
ysis of game-specific behavioral metrics. The evaluation of correct laser value, and the second occurs when an unsatis-
the Kolmogorov-Smirnov statistic for each of our metrics fied ship cannot be satisfied given the remaining lasers and
was statistically significant, indicating that they violate the pieces. Conducting a clean analysis of mathematical ability
assumptions of normality. We therefore use non-parametric is challenging because it is often entangled with both spatial
statistical methods: a Mann-Whitney U Statistic and a r ability and the players attempts to collect coins. To control
measure of effect size for continuous variables, and a Chi- for this, we analyzed two early levels that have no coins and
square statistic and a Cramers V measure of effect size for require minimal spatial reasoning.
categorical variables. We report effect sizes in addition to
p-values to show the magnitude of the difference between In this analysis, we wanted to capture the players tolerance
our populations, since we are likely to find significant trivial of mathematical mistakes rather than the raw number of
differences due to our large sample sizes. For both tests, mistakes made. Often, players place pieces on the grid to
effect sizes with values less than 0.1 are considered trivial, see what effect they produce, which could artificially inflate
0.1 are small, 0.3 are moderate, and 0.5 or greater are large. the total number of mistakes. We defined an erroneous
game state as any state in which a mistake was present on
4.1 High-Level Behavior the grid. If a player makes many moves without correcting
First, we calculated descriptive statistics to gain an under- a mathematical error, her trace will contain a large number
standing of any high-level differences in player behavior. We of erroneous game states. Both of our mistake metrics count
looked at three metrics in both games: total time played, the total number of states in which the mistake is present.
number of unique levels played, and return rate.
The first mistake metric counts the number of game states
We calculated total time played by counting the number of in which a laser with the incorrect fractional value enters
active seconds of play, excluding menu navigation and idle a target ship. A larger proportion of BrainPOP players
periods with more than thirty seconds between actions. In states contained these mistakes, with medians of 0.06 and
Refraction, we found that Kongregate players play longer 0.17 mistakes for the two levels compared to 0.00 and 0.00
than BrainPOP players, with a median time of 437.00 sec- for Kongregate players (p<.001, r=0.31 and r=0.47). We
onds compared to 189.00 (p<.001, r=0.36). However, in also calculated the proportion of active time spent in states
Treefrog Treasure they play for less time, with a median time with this mistake. Again, BrainPOP players perform worse,
of 312.00 seconds compared to 476.00 (p<.001, r=0.15). with medians of 0.09 and 0.20 compared to 0.00 and 0.00 for
Kongregate players (p<.001, r=0.36 and r=0.43).
We calculated the number of unique levels completed for
each player by counting levels with at least one game action. The second mistake metric counts the number of game states
We exclude players who quit a new level before making any where the pieces are placed such that an unsatisfied ship
moves. In Refraction, Kongregate players play more unique cannot be satisfied with the remaining lasers and dividers.
levels, a median of 12 levels compared to 6 for BrainPOP For example, if the player splits to create two 1/2 lasers,
(p<.001, r=0.31), but in Treefrog Treasure they play fewer it becomes impossible to satisfy a ship that requires 1/3
unique levels (medians of 8 and 9, p<.008, r=0.06). power. We found that a larger proportion of BrainPOP
players states contained these mistakes, with medians of
We calculated the return rate for each population by com- 0.17 and 0.12 compared to 0.00 and 0.00 for Kongregate
puting the percentage of players who come back to the game players (p<.001, r=0.38 and r=0.37). BrainPOP players
within three days. We found no significant difference in the also spent a greater proportion of time in these states, with
return rate for Refraction, but found that 29.2% of Brain- medians of 0.23 and 0.14 compared to and 0.00 and 0.00 for
POP players return to Treefrog Treasure, while only 9.2% Kongregate players (p<.001, r=0.38 and r=0.37).
percent of Kongregate players return (p<.001, V=0.25).
These findings support our first hypothesis. BrainPOP play-
The Kongregate and BrainPOP populations reacted differ- ers are more tolerant of mathematical mistakes than Kon-
ently to the games. Kongregate players complete more lev- gregate players, and spend more time in erroneous states on
els and play longer than BrainPOP players in Refraction, average. The effect sizes for all of these calculations were
but complete fewer levels and play for less time in Treefrog moderate to large, indicating that the two populations have
Treasure. The median values for these metrics indicate that very different levels of mathematical understanding.
Kongregate players are more engaged by Refraction, while
BrainPOP players are more engaged by Treefrog Treasure. 4.2.2 Treefrog Treasure
Players must understand fractions to solve the numberline
4.2 Mathematical Understanding problems in Treefrog Treasure. To solve a problem, the
We expected Kongregate players to make fewer mathemat- player jumps into the numberline multiple times until hitting
ical mistakes than BrainPOP players. To explore this hy- the correct location. Each failed attempt can be viewed as
a mathematical mistake. The severity of the mistake can be
captured by calculating the distance between the guess and
the correct solution. We performed our analysis on three
numberline problems from the first non-introductory level.

First, we calculated the proportion of correct attempts to

total attempts across all three numberlines. Players who
solved all three problems correctly on the first try have a
proportion of 1. We found that Kongregate players have
a greater proportion of correct attempts than BrainPOP
players, but the medians for both are 1.0 (p<.001, r=0.29).
Next, we calculated the average error, or distance from the (a) BrainPOP (b) Kongregate
correct solution, for failed attempts across all three num-
berlines. We found that BrainPOP players have a greater
Figure 3: Refraction search graph representations,
average error, with a median of 6% compared to 3% for
showing that BrainPOP players search less effi-
Kongregate players (p<.001, r=0.37).
ciently. Red is the start state, yellow boxes are win
states, and orange boxes are intermediate states.
One explanation for these results is that Kongregate players
are more careful and take the time to place their jumps cor-
rectly. To explore this possibility, we calculated the number
search efficiently and remember previously viewed states.
of seconds spent between the state preceding the first jump
We found that Kongregate players have a higher proportion
into the numberline and the first jump. We found that
of unique states than BrainPOP players in both of the ana-
BrainPOP players spend more time considering their first
lyzed levels, with medians of 0.5 and 0.5 compared to 0.36
jump on average, with a median of 16.37 seconds compared
and 0.37 (p<.001, r=0.27, and r=0.26).
to 12.21 seconds for Kongregate players (p<.001, r=0.27).
Next, we calculated each players search degree. Search de-
These results also support our first hypothesis. BrainPOP
gree as the average out degree of the players graph, where
players spend more time planning their jumps than Kongre-
out degree is the number of edges leaving a particular node.
gate players, but still make more mistakes with a greater
This metric captures the type of search the player is using.
margin of error. The effect sizes are moderate, indicating
A high out degree indicates the use of a breadth-first-like
that the differences are considerable. It is possible that
strategy, and a low out degree indicates the use of a depth-
BrainPOP players make mistakes because they have have
first-like strategy. Lower search degree could also show that
lower mouse proficiency than adults, however we believe that
the player is thinking ahead and investigating multi-move
mathematical understanding plays a greater role because we
hypotheses. We found that Kongregate players have a lower
see the same effect in Refraction.
search degree, with medians of 1.5 for both levels, com-
pared medians to 2.0 and 2.5 for BrainPOP players (p<.001,
4.3 Strategy r=0.35 and r=0.43).
We expected Kongregate players to act more strategically
than BrainPOP players. To explore this hypothesis, we de- Finally, we calculated the number of dead ends a player
signed game-specific metrics to capture strategic behavior reaches. A dead end is a node with only one neighbor, cre-
in Refraction and Treefrog Treasure. Refraction is a puzzle ated when the player immediately takes back a move and
game that requires the use of problem-solving skills, so we returns to the previous state. A node is also a dead end if
were able to perform a deep analysis of strategic behavior the player resets the board immediately after reaching that
in this game. Treefrog Treasure targets a younger audience, state. We would expect players who search inefficiently and
and was designed to require less strategy. However, we iden- have trouble predicting the effects of moves to have a large
tified a few key tasks in this game that require planning, a number of dead ends. BrainPOP players visited more dead
central component of strategy, to study in our analysis. end states, with medians of 1.0 and 3.0, compared to 0.0
and 0.0 for Kongregate players (p<.001, r=0.38 and r=0.43).
4.3.1 Refraction During this analysis, we noticed that BrainPOP players re-
To complete a refraction level, a player must search the space turn to the start state frequently. We counted the number
of all possible moves to find a correct solution. One way to of times players remove all pieces from the board, and found
visualize her search through this space is as a graph, where that BrainPOP players return to the start state more, with
the nodes represent unique board states and the edges rep- a median of 9 returns to Kongregates 3 (p< .001, r=0.43).
resent moves. Given this representation, we can calculate
metrics that capture characteristics of the players search These results support our second hypothesis. Kongregate
strategy. In this analysis, we use the same two levels with- players display more efficient and strategic search behavior,
out coins used to study mathematical mistakes. returning to states less often, searching more deeply, and
reaching fewer dead ends than BrainPOP players. Kongre-
First, we calculated the proportion of visited states that are gate players also reset the board less frequently, indicating
unique. High values indicate that the player does not revisit that they are better able to reason about intermediate states.
states often, while low values show that the player returns to Figure 3 shows aggregate search graphs for a single level.
states over and over again. We would expect strategic play- These graphs were created by randomly selecting 240 play-
ers to have a high proportion of unique states, because they ers from each population and displaying states visited by
(a) (b) (c)

Figure 4: Screenshots of analyzed levels. Figure 4(a) shows moving hazards in Treefrog Treasure, Figure 4(b)
shows a solution to a coin level in Refraction, and Figure 4(c) shows inconvenient gems in Treefrog Treasure.

at least two players. Size represents the volume of players 4.4.1 Refraction
and color indicates state type. BrainPOP players are less Refraction coins are static pieces with fractional values that
focused; they search more broadly, visit a wider variety of appear on the game grid. Coins are collected by satisfying all
states, and reach more dead ends than Kongregate players. the spaceships while a correctly valued laser passes through
the coin, as shown in Figure 4(b). In this analysis, we calcu-
4.3.2 Treefrog Treasure lated whether players successfully collect coins and the total
Treefrog Treasure was explicitly designed to require minimal time spent working to collect them. We analyze three early
strategy. As a result, we explored our second hypotheses levels with coins, including the one shown in Figure 4(b).
by identifying three level regions where players will perform
better if they plan ahead and consider multiple moves in First, we looked at how players interact with coins in the
advance. The first region, shown in Figure 4(a), involves three levels. We bucketed players into four interaction
planning jumps to avoid moving hazardous objects. The groups: players who never touch the coin with a laser, play-
other two regions involve jumping back and forth between ers who only touch the coin with an incorrectly valued laser,
walls to reach a better vantage point. players who touch the coin with the correctly valued laser
but do not win it, and players who win the coin. A graph
To measure performance in the hazardous region, we count showing this distribution for one of the levels is shown in
the number of times players die by hitting the moving haz- Figure 5(a). Next, we calculated the raw number of coins
ards. We found that BrainPOP players die more frequently that the two player groups collected on average, and found
than Kongregate players, although the median number of that Kongregate players collect more coins, with a median
deaths for both was 1.0 (p<.001, r=0.19). For the other two of 2.0 out of three possible coins collected, compared to 1.0
regions, we count the number of jumps players use to reach out of three for BrainPOP (p<.001, r=0.42).
the top of the wall, expecting players who plan ahead to re-
quire fewer jumps. BrainPOP players use more jumps in the We also calculated the amount of time players spent at-
first region, but the medians for both are 4 jumps (p<.001, tempting to collect each coin on average. We looked at each
r=0.21), and we found no statistically significant difference play of the level during the players first session, and only in-
in the second region. cluded plays in which the coin was touched with the laser at
least once. We summed the amount of time spent across all
These results also support our second hypothesis. While level plays with these characteristics, and found that Kon-
the effect sizes are small and this analysis does not provide gregate players spent more time on average working to col-
as deep of a picture of strategic thinking as our Refraction lect coins, with a median of 413 seconds compared to 268
analysis, these results suggest that Kongregate players are for BrainPOP players (p< .001, r=0.24).
better able to plan moves in advance, and therefore avoid
obstacles and jump between objects more efficiently. These results support our third hypothesis; Kongregate
players collect more coins than BrainPOP players. Coin col-
4.4 Optional Rewards lection is conflated with mathematical ability, which could
We expected Kongregate players to collect more optional increase the size of the observed effect. However, Brain-
rewards than BrainPOP players. To explore this hypothe- POP players spend less time than Kongregate players work-
sis, we designed game-specific metrics to capture player in- ing towards coins, indicating that they are less interested
teractions with coins and gems in Refraction and Treefrog in collecting them. It is important to note that Kongregate
Treasure. While both games include optional rewards, each players may be motivated to collect coins primarily due to
requires different behavior to collect them. In Refraction, the badge awarded to players who collect them all.
coins are collected by solving a more challenging problem
with additional constraints, while in Treefrog Treasure, in- 4.4.2 Treefrog Treasure
level gems are collected by taking the time to visit inconve- Gems in Treefrog Treasure can be collected in two ways.
nient locations. Players receive gems when they complete numberline prob-
0.6 0.9
0.5 0.7

Proportion players
Proportion players

0.3 0.4
0.2 0.2
0 0 1 2 3 4 5
Not Attempted Touched Coin Satisfied Coin Won With Coin # Tricky Gems

(a) (b)

Figure 5: Optional rewards graphs. Figure 5(a) shows four types of interaction with Refraction coins. Figure
5(b) shows the total number of tricky gems that Treefrog Treasure players collect, of five possible gems. In
both graphs, Kongregate is blue and BrainPOP is red.

lems correctly, and they can also collect fixed gems that the two player populations experience different levels of en-
appear in the level. For this analysis, we look at the total gagement with each game; Kongregate players finish more
number of gems collected in nine early levels and the number levels and play longer in Refraction than BrainPOP play-
of gems players recollect after losing them by dying. We also ers, but have the opposite behavior in Treefrog Treasure.
identified a few fixed gems that are particularly inconvenient Despite these differences, all three of our experimental hy-
to collect, and examined these specifically in our analysis. potheses were empirically supported in both games. Brain-
POP players make more mathematical mistakes than Kon-
First, we calculated the percentage of the total possible gems gregate players, creating more incorrect lasers in Refraction
that players collected across the nine levels, and found that and requiring more tries to complete numberline problems
BrainPOP players collect fewer gems on average, with a me- in Treefrog Treasure. They also display less strategic behav-
dian percentage of 93% compared to 95% for Kongregate ior. BrainPOP players use less efficient search strategies in
players (p<.003, r=0.12). Next, we looked at the percent- Refraction, returning to states more frequently and reach-
age of lost gems that players regain. When players jump or ing more dead ends than Kongregate players, and they are
fall into hazardous lava, they die and are re-spawned. Play- less efficient in the parts of Treefrog Treasure that require
ers lose a few gems every time they die, which are scattered planning. Finally, BrainPOP players were less interested in
around the area of death. We calculated the percentage of optional rewards in both games, spending less time work-
these gems that players recollected after dying on average, ing towards Refraction coins and returning to collect tricky
and found no significant difference. gems in Treefrog Treasure less frequently.

We also analyzed two sets of gems that are inconvenient While the specific demographics of the Kongregate and
to collect. One set, shown in Figure 4(c), is only visible BrainPOP player populations are only loosely confirmed,
after the player has gone past the tunnel where the gems our results suggest that age had the strongest effect on player
can be collected. We hypothesized that players would only behavior. While our results should be repeated and con-
go back to collect these tricky gems if they cared strongly firmed by future studies with verified demographic data, the
about optional rewards. We found that BrainPOP players observed behavioral trends suggest design considerations for
were less likely to collect these gems, with a median of 4.0 games that target children. Our data show that children
gems collected out of five compared to 5.0 out of five for have trouble searching large state spaces, suggesting that
Kongregate players (p<.001, r=0.33), shown in Figure 5(b). game designers should restrict the amount of searching and
planning their games require. This could be achieved by
These results support our third hypothesis. Kongregate limiting the size of the search space or by using tutorials
players collect more gems than BrainPOP players, and while and scaffolding to teach search strategies directly. Children
they do not recollect more lost gems, they are more likely to also struggled with the mathematical concepts in our games.
collect inconvenient gems. Kongregate players do not receive While this is not surprising, game designers should offer scaf-
any achievements for collecting gems in Treefrog Treasure, folding or visual feedback when errors are made to ease the
indicating that they are genuinely more motivated or able cognitive load for young players. Finally, children were less
to capture these optional rewards than BrainPOP players. likely to collect optional rewards, indicating that they are
not a strong motivator for this population. Designers may
want to exclude optional rewards from games for young au-
5. CONCLUSION diences.
Our analysis of Kongregate and BrainPOP players in two
casual games shows that the populations behave very differ- With this work, we provide a model for studying the effects
ently. The games we studied, Refraction and Treefrog Trea- of demographics on player behavior that could be general-
sure, come from different genres and provide distinct gaming ized to study other player populations. Our results show
experiences. High-level behavioral statistics indicate that
that each casual game website attracts players with distinct [12] H. Gelderblom and P. Kotze. Designing technology for
demographics, and that the effects of those demographics on young children: what we can learn from theories of
player behavior should be considered when generalizing re- cognitive development. In Proceedings of the 2008 annual
research conference of the South African Institute of
search results to different populations. It would be valuable Computer Scientists and Information Technologists on IT
to study the effects of demographics such as age, gender, research in developing countries: riding the wave of
and education on in-game behavior further, to allow design- technology, SAICSIT 08, pages 6675, New York, NY,
ers and researchers to make principled predictions about how USA, 2008. ACM.
specific populations will react to a given game. Our research [13] H. Gelderblom and P. Kotze. Ten design lessons from the
method could also be applied to work in adaptive games. In literature on child development and childrens use of
order to appropriately adapt a game to a particular player, technology. In Proceedings of the 8th International
Conference on Interaction Design and Children, IDC 09,
the adaptive game must appropriately cluster players with pages 5260, New York, NY, USA, 2009. ACM.
similar needs and behaviors. A classifier trained on data [14] T. Gossen, T. Low, and A. Nurnberger. What are the real
sets with distinct demographics, such as the Kongregate and differences of childrens and adults web search. In
BrainPOP data sets, could produce successful clustering al- Proceedings of the 34th international ACM SIGIR
gorithms. We will explore this direction in future work. conference on Research and development in Information
Retrieval, SIGIR 11, pages 11151116, New York, NY,
This work highlights significant differences in the behavior of USA, 2011. ACM.
[15] E. Greer and A. Pecorella. Core games, real numbers:
Kongregate and BrainPOP players, and represents the first Comparative stats for mmos & social games. Presented at
in-depth comparative analysis of in-game behavior based on the Game Developers Conference, San Francisco, CA, 2012.
demographic characteristics. Our results support our predic- [16] C. Linehan, B. Kirman, S. Lawson, and G. Chan. Practical,
tion that any observed differences would be primarily caused appropriate, empirically-validated guidelines for designing
by the age of the players, which suggests design guidelines educational games. In Proceedings of the SIGCHI
for games that target children based on our results. Conference on Human Factors in Computing Systems, CHI
11, pages 19791988, New York, NY, USA, 2011. ACM.
[17] Y.-E. Liu, E. Andersen, R. Snider, S. Cooper, and
6. ACKNOWLEDGMENTS Z. Popovic. Feature-based projections for effective playtrace
We would like to thank the creators of Refraction and analysis. In Proceedings of the 6th International Conference
Treefrog Treasure for making this work possible. We also on Foundations of Digital Games, FDG 11, pages 6976,
thank Anthony Pecorella of Kongregate and Allisyn Levy New York, NY, USA, 2011. ACM.
[18] C. Moore. Modern gaming trends: Achievements, 2012.
and Norman Basch of BrainPOP for promoting our games
[19] P. Moreno-Ger, D. Burgos, I. Martnez-Ortiz, J. L. Sierra,
and helping us gather data. This work was supported by and B. Fernandez-Manjon. Educational game design for
the NSF Graduate Research Fellowship grant DGE-0718124, online education. Comput. Hum. Behav., 24(6):25302540,
the Bill and Melinda Gates Foundation grant OPP1031488, Sept. 2008.
the Office of Naval Research grant N00014-12-C-0158, the [20] H. F. ONeil, R. Wainess, and E. L. Baker. Classification of
Hewlett Foundation grant 2012-8161, Adobe, and Intel. learning outcomes: evidence from the computer games
literature. The Curriculum Journal, 16(4):455474, 2005.
[21] M. Pretorius, H. Gelderblom, and B. Chimbo. Using eye
7. REFERENCES tracking to compare how adults and children learn to use
[1] Alexa Ranking, BrainPOP. an unfamiliar computer game. In Proceedings of the 2010
[2] Alexa Ranking, Kongregate. Annual Research Conference of the South African Institute
[3] E. Baauw, M. M. Bekker, and W. Barendregt. A structured of Computer Scientists and Information Technologists,
expert evaluation method for the evaluation of childrens SAICSIT 10, pages 275283, New York, NY, USA, 2010.
computer games. In Proceedings of the 2005 IFIP TC13 ACM.
international conference on Human-Computer Interaction, [22] J. C. Read, P. Markopoulos, N. Pares, J. P. Hourcade, and
INTERACT05, pages 457469, Berlin, Heidelberg, 2005. A. N. Antle. Child computer interaction. In CHI 08
Springer-Verlag. Extended Abstracts on Human Factors in Computing
[4] D. Bilal and J. Kirby. Differences and similarities in Systems, CHI EA 08, pages 24192422, New York, NY,
information seeking: children and adults as web users. Inf. USA, 2008. ACM.
Process. Manage., 38(5):649670, Sept. 2002. [23] R. F. Sternberg, editor. Thinking and Problem Solving.
[5] D. F. Bjorklund, editor. Childrens Strategies: Academic Press, San Diego, California, USA, 1994.
Contemporary views of cognitive development. Lawrence [24] S. Thornton. Children Solving Problems. Harvard
Erlbaum Associates, Inc., Hillsdale, New Jersey, USA, 1990. University Press, Cambridge, Massachusetts, USA, 1995.
[6] BrainPOP. [25] U.S. Department of Education. The final report of the
[7] BrainPOP Educators. national mathematics advisory panel, 2008. [26] Y. Xu, E. S. Poole, A. D. Miller, E. Eiriksdottir,
[8] S. Cooper, F. Khatib, A. Treuille, J. Barbero, J. Lee, R. Catrambone, and E. D. Mynatt. Designing pervasive
M. Beenen, A. Leaver-Fay, D. Baker, Z. Popovic, and health games for sustainability, adaptability and sociability.
F. Players. Predicting protein structures with a multiplayer In Proceedings of the International Conference on the
online game. Nature, 466(7307):756760, August 2010. Foundations of Digital Games, FDG 12, pages 4956, New
[9] A. Druin. Cooperative inquiry: developing new technologies York, NY, USA, 2012. ACM.
for children with children. In Proceedings of the SIGCHI [27] N. Yee. Motivations of play in online games. Journal of
conference on Human Factors in Computing Systems, CHI CyberPsychology and Behavior, 9:772775, 2007.
99, pages 592599, New York, NY, USA, 1999. ACM.
[10] Entertainment Software Association. Essential facts about
the computer and video game industry, 2012.
[11] J. P. Gee. What Video Games Have to Teach Us About
Learning and Literacy. St. Martins Press, revised and
updated edition. edition, March 2008.