
UNIVERSITY OF HERTFORDSHIRE

School of Computer Science

MSc in Advanced Computer Science

Advanced Computer Science Masters Project

7COM1039

13/09/2016

"A comparative study among two algorithms on the development of a

civilization-type resource allocation strategy game"


Table of Contents

Chapter 1 Introduction
1. Introduction
1.1. Overall Aim
1.2. Motivation
1.3. Aims and Objectives
1.4. Research Questions
1.5. Approach
Chapter 2 Analysis and Design
2. Analysis and Design
2.1. Civilization type strategy games
2.1.1. Humans
2.1.2. Resources
2.1.3. Advance through the ages
2.1.4. War
2.1.5. Fog of War
2.2. Games and Artificial Intelligence
2.3. The game of the project
2.3.1. Moves
2.3.2. Rules
Chapter 3 Literature Review
3. Literature Review
3.1. Literature Review
Chapter 4 Algorithms
4. Algorithms
4.1. Monte Carlo Tree Search
4.2. Upper Confidence Bounds applied to Trees
4.3. Minimax
Chapter 5 Implementation
5. Implementation
5.1. Description
5.1.1. The tools that were used
5.2. Design Documents
5.2.1. Class Diagram
5.2.2. Description of the Class Diagram
5.3. Classes of the implementation
5.3.1. Player.java class
5.3.2. MinMaxPlayer.java class
5.3.3. MCTSNode.java class
5.3.4. MonteCarlo.java class
5.3.5. MCTSPlayer.java class
5.3.6. EmpiresAge.java class
5.4. Testing
Chapter 6 Evaluation and Discussion
6. Evaluation and Discussion
6.1. Research Questions
6.2. Conclusion
6.3. Future work and what could have been better
Appendices
References

Chapter 1
Introduction

1. Introduction

Monte Carlo Tree Search has been successfully applied to a great variety of games in
the last few years, including classic turn-based games like backgammon, poker and
Scrabble. The main reason behind its increased popularity is the recent success of
MCTS in the game of Go through the AlphaGo [Deepmind.com, 2016] computer program,
which was able to beat Lee Se-dol in March 2016 [Cellan-Jones, 2016]. This project
tests whether MCTS can also be applied successfully to a civilization-type resource
allocation strategy game.

1.1. Overall Aim

The problem examined in this project's scope is a comparative study between
two algorithms on the development of a civilization type resource allocation strategy
game. It tests whether the Monte Carlo Tree Search algorithm, along with the Upper
Confidence Bounds applied to trees (UCT), can be successfully applied to this kind of
game. It also tests whether the Monte Carlo Tree Search algorithm is more suitable than
the Minimax algorithm. The main idea of the game is to have two players, one per
contested algorithm (MCTS or UCT versus Minimax), compete over who will build the
biggest empire faster. For an empire to be categorized as big enough there are some
conditions, which are also the factor that ends the game. These conditions refer to
the maximum population, the maximum protection and the amount of fighting
units.

1.2. Motivation

Nowadays computer science has a plethora of subfields that can be separated into
both theoretical and practical branches, which makes selecting a subject for a
Masters project a difficult choice. It is true to say that Artificial Intelligence becomes more
popular and important as the years pass, because it grows more complex and can be
applied in many domains. Also, I always wanted to implement a game with an
AI approach, so I thought it was a great idea to implement a civilization-type
strategy game where I could compare three algorithms (the MCTS, the UCT and the
Minimax). This game has the basic elements that allow it to be categorized as a
civilization-type strategy game, since the important part was the comparison of the
algorithms.

1.3. Aims and Objectives

The aims and objectives of this project are listed below.

• Design and implement an application with three contested algorithms (MCTS, UCT and Minimax) for a civilization type resource allocation strategy game.
• Find and read relevant literature on the Monte Carlo Tree Search algorithm, the Upper Confidence Bounds applied to trees algorithm, the Minimax algorithm and strategy games.
• Set some research questions and try to answer them.
• Run a suite of tests and analyse the results.

1.4. Research Questions

This project was implemented in order to answer the following questions. To do so,
three algorithms were implemented: the Monte Carlo Tree Search, the Upper
Confidence Bounds applied to trees and the Minimax. The first question tests whether or not
the MCTS algorithm can be successfully applied to a strategy game, and not only to
classic turn-based games. The second question compares the two approaches (the
simple Monte Carlo and the MCTS with UCT) to see which one yields better results and
what the limitations of each approach are. The third question investigates what factors
could affect the construction of the resources of the game. The last question checks which
algorithm (the MCTS or the Minimax) is the more suitable when it comes to
indicating successful population growth.

• Can MCTS be successfully applied to a strategy game?
• Which of the two approaches, simple MC or MCTS with UCT, is better? What are the limitations of each approach?
• What factors could affect the construction of the resources (population growth)?
• Is MCTS a more suitable algorithm than Minimax to indicate successful population growth?

1.5. Approach

Structure of the report:

The first chapter of the report (Introduction) includes the introduction of the project,
the overall aim, and the motivation behind the subject of this project along with the aims
and objectives. It also includes the research questions that were set.

The second chapter of the report (Analysis and Design) describes the logic behind
Civilization type strategy games, what elements are included in these kinds of games
and how AI has contributed to the development of games through the years. There is
also a description of how the game of this project works, including its rules and
available moves.

The third chapter of the report (Literature Review) covers the literature review that
was done before the start of the implementation of the application. Many papers,
books and journal articles that mention the three algorithms used in this project
(the MCTS, the UCT and the Minimax) were read and studied in order to establish
what has already been done on this subject.

The fourth chapter (Algorithms) gives a description of the algorithms that were used
for the implementation of the project (the MCTS, the UCT and the Minimax).

The fifth chapter (Implementation) includes a description of the tools that were used for
the implementation of the project. It also gives information about the classes of the
application, along with the tests that were performed in order to verify the
implementation. Last but not least, there is a description of the class diagram of
the application.

The last chapter of the report (Evaluation and Discussion) contains the answers to the
research questions, the results of the performed tests and the conclusion. Also, it
includes some future work and what could have been better in the current
implementation.

At the end of the report there are the appendices with the tables and the source code.

Chapter 2
Analysis and Design

2. Analysis and Design

The logic behind Civilization type strategy games

This section describes the logic behind Civilization type strategy games, what
elements are included in these kinds of games and how Artificial Intelligence has
contributed to the development of games through the years. There is also a
description of how the game of this project works, including its rules and available
moves.

2.1. Civilization type strategy games logic

Nowadays, civilization type strategy games are one of the most popular and
marketable game genres, with many famous titles such as the Age of Empires
[Ensemble Studios, 1997], Command and Conquer [Westwood Studios, 1995], Warcraft
[Blizzard Entertainment, 1994] and Starcraft franchises [Blizzard
Entertainment, 1998]. Generally, for a game to be categorized as a civilization
type strategy game it needs to have certain elements [Stanescu and Certicky, 2016].

First of all, there is a map of limited size which can have different terrain
such as water, sand, mountain or grass. Units can move on this map
unless the terrain is inaccessible. On the map there are also some consumable
resources (food, minerals, etc.) used to produce buildings (some of which have
specific technological prerequisites), like houses, or fighting units; producing them
costs the above resources and time, and through them the player improves his economic
and military power and can therefore defeat his opponent. Moreover, the player is given
units with different types of strength, such as archers and cavalry. Furthermore, in order
to win, the player needs tactical planning, strategy and unit management. The main goal
for the player is to control a given nation and make it progress through history.

2.1.1. Humans

In almost every civilization type strategy game there are "humans" who have
specific tasks and can be divided into two categories: the villagers and the warriors.
Most of the time a villager can be categorized as a builder (a person
responsible for constructing buildings or farms), a farmer (a person who gathers food from
a farm), a forager (a person who gathers food from bushes), a gold/stone miner (a
person who mines at gold/stone mines looking for gold/stone), a plain villager (a person
who either goes to war or does nothing) or a woodcutter (a person
responsible for cutting wood from nearby trees). The warrior is responsible for providing
protection and fighting other units. He can belong to different fighting units, including the
archer category, the cavalry, the infantry, etc. [Mecham and Bishop, 1999],
[Honeywell, 2003]

2.1.2. Resources

These kinds of games also contain resources, needed for constructing buildings or
creating units. The wood resource can be used for constructing buildings or
fighting units. The food resource is essential in order to create villagers, and most of
the time it is represented by fruits, grains, berries or meat. The gold and stone
resources are needed to research different kinds of technologies and to upgrade and create
fighting units like towers or walls [Mecham and Bishop, 1999], [Honeywell, 2003].

2.1.3. Advance through the ages

If the player wants to win these kinds of games he has to advance through different
ages. For example, in the Age of Empires franchise [Ensemble Studios, 1997] (which
is the closest game to this project's approach) there exist four different periods, from
the Old Stone Age to the beginnings of the Iron Age. As the player goes from one
age to another he gains access to new buildings, fighting units and technologies, but
this costs resources and time as well. Furthermore, there are many different kinds of
technologies, including storage pit technologies, market technologies, government
center technologies and temple technologies. [Mecham and Bishop, 1999],
[Honeywell, 2003]

2.1.4. War

In order to conquer, the player needs to win the war, which can be achieved by
the Infantry, the Archers or the Cavalry. The infantry in most games consists of
different villagers who have specific roles: a villager can be an axeman,
a swordsman, or belong to the legion or the centurion class. The archer category
consists of bowmen or archers. The cavalry, along with the other two categories,
has both weak and strong fighting units, with the weakest being the scout and the
strongest being the heavy cavalry. A strong factor contributing to victory is the
weapons that are used; they can range from catapults to ballistas. [Mecham and
Bishop, 1999], [Honeywell, 2003]

2.1.5. Fog of war

The fog of war is a common characteristic in these kinds of games, where a part of the
map cannot be seen because it has not been explored yet. The player can only see
the parts of the map where he has placed his units and buildings, but he has the option
of scouting in order to detect whether the enemy is nearby, and of exploring further to
extend his territory [Setear, 1989].

2.2. Games and Artificial Intelligence

Artificial Intelligence was introduced in games a long time ago, when games
needed an obstacle for the player to encounter. This obstacle had the form of an
enemy character that moved around the space along with the player, and in order for
the player to advance further into the game this enemy needed to be defeated. The
oldest game believed to have used an AI technique, in the form of a state
machine, was Pac-Man [Midway Games West, Inc., 1979]. Many years then had to
pass for AI to make a great impact in games. This impact
occurred when real time strategy games began to emerge. A great example of AI
path finding can be seen in the Warcraft game [Blizzard Entertainment, 1994]. Also,
games like The Sims [Maxis Software, Inc., 2000] or Spore [Maxis Software, Inc.,
2008] have been created with AI as their basic element; each character in these games
works with a neural network, which sometimes does not look very clever when it takes
actions. These days there are many different AI techniques in games, including the old
ones. Some games use the same bots as before in first person shooter games, while
others use the Co-op versus AI mode in order to help the players understand the
game, as in League of Legends [Riot Games, Inc., 2009]. In general, most games
that use AI now have in mind three basic factors that need to be implemented:
first, the ability to move characters either individually or in groups; second,
deciding where these characters are supposed to move; and
last but not least, the ability to plan tactically and strategically.

This project focuses on Civilization type strategy games, which have a great number of
elements that need to be controlled (the units, the buildings, the resources, etc.)
and therefore leave great room for Artificial Intelligence. Moreover, they
are an interesting domain for AI because they pose AI challenges in terms of
planning, learning, uncertainty and reasoning. Many different Artificial Intelligence
techniques can be considered suitable for these kinds of games. An Artificial Neural
Network (ANN) is a great solution for pattern recognition by adjusting the weights in
the network (backpropagation) [Jain, Jianchang Mao and Mohiuddin, 1996]. Another
popular approach for AI in games is genetic algorithms [Chih-Sheng Lin
and Chuan-Kang Ting, 2011]. Genetic algorithms (GA), however, have a drawback when
it comes to resource handling, because many games may need to run
simultaneously to deal with the resource management. Another approach is the Finite
State Machine (FSM), which is a good approach for controlling tasks. The Bayesian model
is a good approach for game AI that needs to make decisions about the current state of
the game when the information is incomplete [Langseth and Portinale, 2007]. It is also
very useful because in these kinds of games the player most of the time needs to
take decisions under uncertainty. It has been used by Gabriel Synnaeve and Pierre
Bessière in their approach to the StarCraft game, where they tried to control units
locally [Synnaeve and Bessiere, 2015]. Another approach that should be taken into
account, and one of the approaches used in this project, is the Monte Carlo
Tree Search, which along with the Upper Confidence Bounds applied to trees and the
Minimax algorithm will be explained in detail later.

2.3. The game of the project

In order for the game to have some consistency some rules needed to be created.
There are 6 rules in total and 6 available moves.

2.3.1. Moves

1. Produce a new villager
2. Produce a new warrior
3. Build a house
4. Build a fighting unit
5. Make the villager gather food resources
6. Make the villager gather wood resources

These are the moves available in the game: the player can create a new villager,
create a new warrior, build a house or a fighting unit, and make a villager gather
either food resources or wood resources.

2.3.2. Rules

1. In order to produce a villager: actual population < allowed population, and food resource >= 50
2. In order to produce a warrior: actual population < allowed population, food resource >= 100, and at least one fighting unit
3. In order to build a house: number of available villagers = 3, and wood resource >= 150
4. In order to build a fighting unit: number of available villagers = 5, and wood resource >= 300
5. There need to exist food resources on the matrix
6. There need to exist wood resources on the matrix

The first rule is about the production of a villager: a villager can be produced if the
actual population is smaller than the allowed population (each house
allows 5 people to stay in it) and the number of stored food resources is greater than or
equal to 50. The second rule covers the production of a warrior: a
warrior can be created if the allowed population is greater than the actual population,
the number of food resources is greater than or equal to 100, and at least one fighting
unit exists. The third rule governs building a house: if the number of available villagers
needed to build the house equals 3 and the number of wood resources is greater than or
equal to 150, then a house can be built. The fourth rule is about building a fighting unit,
which can be built if the number of available villagers needed to build it equals 5 and
the number of wood resources is greater than or equal to 300. The fifth and sixth rules
are essential for the smooth functioning of the game; they say that both food resources
and wood resources need to exist on the matrix.

Chapter 3
Literature Review

3.1. Literature Review

This section describes the literature review that was done before the implementation
of the project. In order to have a full understanding of what exists so far on the
algorithms that are used (the MCTS, the UCT and the Minimax), many papers, journal
articles and books were read and studied. Some of them are listed below, while others
are mentioned in other parts of this project's report as references.

First of all, there is a paper written by Paolo Ciancarini and Gian Piero Favini which
refers to a game called Kriegspiel, or invisible chess, a game that combines
complex tasks including heuristic search and opponent modelling. Even though the
rules are exactly the same as the ones used in a classical chess game, what changes
is the way the player sees the board, since he has access only to his side of it. This
means that the level of uncertainty is high and that limited information is received.
The paper suggests three Monte Carlo tree search methods for playing Kriegspiel.
The first approach failed against a traditional program based on the Minimax
algorithm, although it performed better than a random player. The second approach
had better results, since the opponent did not play randomly but based on average
expectations where the actual moves were never disclosed. The third approach, which
was the most successful, emphasized selection strategies over simulation strategies.
After tests they reached the conclusion that the third approach, where the simulations
last only one move plus the quiescence, performed the best, defeating even a Minimax
based program. They also show that an MCTS algorithm is suitable for difficult
environments like Kriegspiel and that it can perform well in a reasonable amount of
time [Ciancarini and Favini, 2010].

Another approach is made by Edward J. Powley, Peter I. Cowling and Daniel
Whitehouse; in their paper they try to enhance MCTS by using a framework called
ICARUS (Information Capture and ReUse Strategies), able to collect information from
visits to one area of the game tree in order to inform the future policy in other areas.
This framework was used in games like Dou Di Zhu, Hearts, Lord of the Rings: The
Confrontation, Checkers, Othello and Backgammon, games of perfect and
imperfect information where MCTS could be used. After a set of experiments this
framework was proven an efficient method to define MCTS enhancements [Powley,
Cowling and Whitehouse, 2014].

Also, Keh-Hsun Chen focuses on presenting two dynamic randomization techniques
for Monte Carlo Go, one being dynamic parameter randomization and the other
dynamic hierarchical move generator randomization [Chen, 2012].

In their paper, Michael Chung, Michael Buro and Jonathan Schaeffer suggest a plan
selection algorithm called MCPlan, which is based on Monte Carlo sampling, simulation
and replanning. This algorithm is used for planning in RTS games without relying on
scripting. It evaluates a sample of possible plans for a player and then selects the plan
with the highest statistical outcome [Chung et al., 2005].

Another approach, by Vojtech Kovarik and Viliam Lisy, presents a modified MCTS
algorithm called SM-MCTS-A, which updates the selection functions with the averages
of the sampled values instead of the current values. It is focused on two-player
zero-sum games with perfect information. It is shown that this modified algorithm,
with the help of any Hannan consistent (HC) selection function, can lead to
subgame-perfect Nash equilibrium in these kinds of games [Kovarik and Lisy, 2015].

Moreover, Markus Enzenberger, Martin Müller, Broderick Arneson and Richard
Segal introduce FUEGO, an open source software framework based on MCTS for full
information two player board games like the game of Go. It helps researchers
experiment with new algorithms without the cost of implementing a
complete state-of-the-art system [Enzenberger et al., 2010].

In addition, Zhe Wang, Kien Quang Nguyen, Ruck Thawonmas and Frank Rinaldo focus
in their paper on an application of Monte Carlo planning that controls units in an RTS
game, Starcraft. In order to improve MCPlan's performance it uses an ε-greedy
algorithm. The tests that were performed showed that MCPlan has great potential in
this domain, as it can reduce the required expert knowledge by using an effective
evaluation function [Wang et al., 2012].

Another way to use Monte Carlo Tree Search is the one that Tom Pepels, Mark H. M.
Winands and Marc Lanctot discuss in their paper, where MCTS is applied to a
real time game, Ms Pac-Man. It uses four enhancements to do so, since Ms Pac-Man
is a game whose terminal state is not conclusive: you can either win by surviving
the game or by scoring as many points as possible. MCTS is a suitable algorithm for
this game since there is limited time available for the player to make a move. The
results have shown that these four enhancements improved the overall
performance [Pepels, Winands and Lanctot, 2014].

Diego Perez, Sanaz Mostaghim, Spyridon Samothrakis and Simon M. Lucas in their
paper propose a Multiobjective MCTS algorithm responsible for planning and
controlling in real time game domains. When tested, this algorithm was shown to
outperform the nondominated sorting evolutionary algorithm II (NSGA-II),
one of the algorithms it was compared against. The algorithm was also
tested in two different games: the real time version of the deep-sea treasure game and
the multiobjective version of the physical travelling salesman problem [Perez et al.,
2015].

Another approach is presented by Naoyuki Sato, Kokolo Ikeda and Takayuki Wada in
their paper, which focuses on creating team-mate computer players that estimate
the preferences of the human players and reduce their dissatisfaction. Monte Carlo
plays an important role here, as it helps in deciding the parameters for each player
[Sato et al., 2015].

In another paper Rémi Coulom suggests a new algorithm for Monte
Carlo tree search that has both a Minimax phase and a Monte Carlo phase; it was
implemented in a Go-playing program called Crazy Stone. The algorithm was
entered in tournaments and won a 100-game match against a state-of-the-art Monte
Carlo Go program [Coulom, 2009].

Furthermore, in their paper Maarten P.D. Schadd, Mark H.M. Winands, H. Jaap van
den Herik, Guillaume M.J-B. Chaslot and Jos W.H.M. Uiterwijk note that the A*
and IDA* algorithms are well known and successful choices for one player games, but
cannot perform well without the help of an evaluation function, and they suggest using
MCTS as an alternative approach. They add a new variant called Single
Player Monte Carlo Tree Search (SP-MCTS), which has a different selection and
backpropagation strategy from the classic MCTS. After the experiments were performed it
was clear that SP-MCTS is suitable for single player perfect information games,
with good results [Schadd et al., 2008].

The Dominion game is challenging for Artificial Intelligence
approaches because it has hidden information. Robin Tollisen, Jon Vegard Jansen,
Morten Goodwin and Sondre Glimsdal in their paper propose two novel variants,
AI-UCB (Upper Confidence Bounds without trees) and AI-UCT (Upper
Confidence Bounds applied to trees), that use MCTS for playing the game
Dominion. The results have shown that AI-UCB performed better than AI-UCT
[Tollisen et al., 2015].

J. (Pim) A.M. Nijssen and Mark H.M. Winands in their paper propose a technique
called Playout Search, used with MCTS in order to improve the reliability of the
playouts. The maxn, Paranoid and BRS variants were analyzed in two deterministic
perfect information multi-player games: Focus and Chinese Checkers. The results have
shown that in both games Playout Search improved the quality of the playouts, but
at a lower playout speed [Nijssen and Winands, 2012].

A paper written by Diego Perez, Philipp Rohlfshagen and Simon M. Lucas investigates
whether MCTS can be applied to the Physical Travelling Salesman Problem (PTSP).
Monte Carlo methods are not suitable for long-term planning, since games there may
require thousands of moves, but MCTS can be used to look ahead. This paper suggests a
solution for two major problems of the PTSP: the order of the waypoints and the driving
of the ship [Perez, Rohlfshagen and Lucas, 2012].

Also, Sylvain Gelly, Yizao Wang, Rémi Munos and Olivier Teytaud in their paper
introduce MoGo, a Monte Carlo Go program and one of the first Go
programs to use UCT. Their results showed that UCT
is more efficient than alpha-beta search when the search time is limited [Gelly,
Wang, Munos and Teytaud, 2006].

Another approach is proposed by Weijia Wang and Michèle Sebag, whose paper
proposes an extension of MCTS called MO-MCTS (Multi-Objective
Monte Carlo Tree Search) for multi-objective sequential decision making. The
algorithm was tested on the classical Artificial Intelligence problem Deep Sea
Treasure and on grid scheduling. The various tests showed that MO-MCTS can
solve the Deep Sea Treasure problem [Wang and Sebag, 2012].

Radha-Krishna Balla and Alan Fern in their paper introduce a way to use UCT for the
problem of tactical assault planning in real time strategy games, in particular the
game of Wargus. Compared to baselines and a human player it performed
better, showing that it is a promising approach for tactical assault planning [Balla
and Fern, 2009].

Cameron Browne in his paper examines a 5x5 Hex (a two-player, zero-sum game)
position that can easily defeat simple Monte Carlo methods along with
plain Upper Confidence Bounds applied to trees (UCT) [Browne, 2013].

Last but not least, Nathan R. Sturtevant in his paper examines whether the UCT
algorithm is able to perform better than, or as well as, other existing algorithms in
multi-player games. UCT was tested in Chinese Checkers, Spades and Hearts, and it
was shown to be capable of benefiting multi-player games even though it does not
answer the problem of opponent modelling [Sturtevant, 2008].

Chapter 4
Algorithms

Description of the algorithms that were used in the implementation

This section describes the Monte Carlo Tree Search algorithm, the Upper Confidence
Bounds applied to trees algorithm and the Minimax algorithm that were used for the
implementation of this project.

4.1. Monte Carlo Tree Search

Monte Carlo Tree Search (MCTS) is a very popular decision tree search algorithm
that is able to find optimal decisions in a given domain with the help of random
samples; it is based on the Monte Carlo method [Browne et al., 2012]. The
Monte Carlo method was first introduced in the 1940s by Stanislaw Ulam
[Sullivan, 1984], and MCTS was later explored by others including C.
Suttner [Suttner and Ertel, 1989]. It has helped AI to produce successful strategies in
real time strategy games, board games and non-deterministic games as well; examples
can be seen in the games of Go [Gelly and Silver, 2011], Hex [Arneson, Hayward and
Henderson, 2009], Poker, Ms. Pac-Man [Pepels, Winands and Lanctot, 2014],
Starcraft [Wang et al., 2012] and Scrabble. It also plays a great role in
two-player zero-sum games and when planning under uncertainty [Keller and
Eyerich, 2012]. But even though it has been a popular approach, there are still no
theoretical results showing the MCTS algorithm to be as efficient as Nash equilibrium
strategies in imperfect information games.

The basic idea behind MCTS is to run a large number of random simulations of a
problem and then, based on the collected data, find the optimal decisions
and build a search tree according to the results. It runs simulations until it reaches a
terminal state in order to grow a tree rooted at the current state. In the beginning the
tree is empty, and with every iteration that is made a leaf is added to the tree. Every
node of the tree represents a state of the game [Browne et al., 2012].

MCTS is based on four steps, which are shown in Figure 1.1; a minimal code sketch of the resulting loop is given after the figure.

1. Selection: The first step is the selection, where children in the tree are
selected according to a selection policy. This selection policy is used to find
the most urgent node of the tree. If a node is reached that does not
represent a terminal state, it is selected for the next step, the expansion.
2. Expansion: During the expansion a child of the final selected node is added.
One or more child nodes can be added, based on the available actions.
3. Simulation: In order to estimate the outcome of the game there is a third step
called simulation (or playout). It starts from the state of the added node, playing
random actions or a heuristic strategy until a terminal state is reached.
4. Backpropagation: The last step is the backpropagation, in which the result of
the simulation is propagated from the selected node to the root node, updating
all the nodes that were visited during the first two steps.

Figure 1.1: The four basic Monte Carlo Tree Search steps.
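To make the four steps concrete, the following is a minimal sketch of the MCTS main loop in Java. The helper methods select, expand, simulate, backpropagate and bestChild, as well as the isTerminal accessor, are assumptions standing in for the steps described above, not the exact methods of the implementation in Chapter 5:

// Illustrative sketch of the MCTS main loop; helper names are assumed.
MCTSNode search(MCTSNode root, int iterations) {
    for (int i = 0; i < iterations; i++) {
        MCTSNode node = select(root);        // 1. Selection: descend the tree via the tree policy
        if (!node.isTerminal()) {
            node = expand(node);             // 2. Expansion: add a child for an untried action
        }
        double outcome = simulate(node);     // 3. Simulation: random playout to a terminal state
        backpropagate(node, outcome);        // 4. Backpropagation: update visits and utilities up to the root
    }
    return bestChild(root);                  // the most promising move found so far
}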

MCTS has two basic distinct policies. The first is the tree policy, which is used to
select or create a node from the nodes already inside the search tree during the
selection and expansion phases. The second is the default policy, which is used to
produce a value estimate by playing out the domain from a non-terminal state during
the simulation phase [Perez et al., 2015] [Browne et al., 2012].

In general MCTS is the best approach when non-terminal states need to be
evaluated, and it can outperform any depth limited search method. There are three
main aspects that make MCTS a popular and successful algorithm. First of all,
it is aheuristic, so it does not need any domain specific knowledge, and it is a popular
choice for any domain that can be modelled as a tree. Moreover, MCTS
backpropagates (the fourth step) the outcome of each game instantly, meaning that
the algorithm has the ability to return an action from the root at any time. This ability
has the advantage of allowing extra iterations to take place, leading to an improved
result, and since MCTS can be interrupted at any time the most
promising move found so far is always available. Also, by growing an asymmetric tree
that adapts to the topology of the search space, the algorithm can visit the more interesting
and promising nodes more often. This helps the algorithm focus on the more relevant parts
of the tree and was an important factor in making it suitable for games like 19x19 Go.
Furthermore, MCTS is better suited than depth-limited Minimax when intermediate states
are hard to judge, because it does not have to evaluate the values of the middle states,
so less domain knowledge is required [Perez et al., 2015] [Browne et al., 2012].

On the other hand, MCTS can have a major drawback: during the simulation there
might be a variety of possible moves in every given turn, but there is no guarantee
that all of them are good. Most of the time only one or two of the
possible moves are good enough, and if a random move is chosen each turn then it is
unlikely for the simulation to reveal the best path. It is true, however, that with the proper
enhancements the algorithm can be improved.

4.2. Upper Confidence Bounds applied to Trees

The UCT is a standard tree based Monte Carlo algorithm proposed by Kocsis and
Szepesvari [Kocsis and Szepesvari, 2006] and the most popular algorithm of the
MCTS family. Kocsis and Szepesvari suggested using UCB1 as the tree policy in
order to create UCT. It has played a very important role in the majority of
today's MCTS success stories, including the game of Go. The basic reason that UCT
works so well in Go is that UCT only needs to evaluate end game
states, and in Go it is almost impossible to evaluate states in the middle of the game.
Another important factor is that since Go lends itself to random play, it
is easy to reach an end game state.

The basic idea behind UCT is that it performs many multi-phase playouts instead of many
purely random simulations. UCT does not require an evaluation function or a depth
bound, in contrast with Minimax; instead it incrementally constructs a tree whose values
are updated based on Monte Carlo rollouts that begin from the
root of the tree and run until a terminal state. It biases the rollout trajectories towards
the ones that seem more promising based on past trajectories. This helps
the tree grow its most promising branches first, while still making sure that an ideal
decision will eventually be made given enough rollouts. Every node in the
emerging tree holds an estimated value for each of the available actions, which is
used for the selection of the next action to execute. UCT is a promising
solution to the exploration and exploitation dilemma which is often
found in MCTS problems.
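For reference, the UCB1 selection rule that Kocsis and Szepesvari applied to trees chooses, at each node, the child j that maximizes the following quantity (written here in its commonly cited form, where C is an exploration constant, often taken as the square root of 2):

\[ UCT(j) = \bar{X}_j + C \sqrt{\frac{\ln N}{n_j}} \]

where \( \bar{X}_j \) is the average reward observed for child j, \( n_j \) is the number of times child j has been visited and \( N \) is the number of visits to its parent. The first term favours exploitation of moves with high average reward, while the second favours exploration of rarely visited moves.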

4.3. Minimax

Minimax is an algorithm applied to two player games with perfect
information where the players take turns. It has been used as a first choice algorithm
for tree search for many decades and has produced great results against human
players in games like checkers, tic-tac-toe, chess and tactics. But it can also be applied
to games without perfect information, like Poker.

In the Minimax algorithm two players are involved, the MAX player and the MIN player.
The MAX player tries to select the move that has the highest value, while the MIN
player tries to choose the move that is best for him, which at the same time
minimizes MAX's outcome. Thus a search tree is generated from the current game
position down towards the end game positions. The search is usually stopped early, and
a value function is then used to estimate the outcome of the game. The Minimax
search aims at an optimal strategy based on the best available moves of a
given player, while keeping in mind the moves available to the other player,
up to a certain given depth. The number of positions Minimax must examine grows
rapidly with the search depth. This makes the Minimax algorithm take a long
time to perform a full search of the game tree, because it needs to search all the nodes
of the tree for the best solution, but this can be helped by using pruning methods. The
α-β heuristic is one of the most used methods for pruning the tree, where α is
the best score that the computer is able to achieve and β is the best score that the
human is able to achieve [Strong, 2011].
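As an illustration of the pruning idea, the following is a generic textbook sketch of depth-limited Minimax with α-β pruning in Java; the GameState type and the children and evaluate helpers are assumptions, and this is not the exact code of this project's MinMaxPlayer class:

// Generic depth-limited Minimax with alpha-beta pruning (illustrative sketch).
// Initial call: alphaBeta(root, depth, Double.NEGATIVE_INFINITY, Double.POSITIVE_INFINITY, true)
double alphaBeta(GameState state, int depth, double alpha, double beta, boolean maximizing) {
    if (depth == 0 || state.isTerminal()) {
        return evaluate(state);                  // value function at the search horizon
    }
    if (maximizing) {
        double value = Double.NEGATIVE_INFINITY;
        for (GameState child : children(state)) {
            value = Math.max(value, alphaBeta(child, depth - 1, alpha, beta, false));
            alpha = Math.max(alpha, value);
            if (alpha >= beta) {
                break;                           // beta cut-off: MIN will never allow this branch
            }
        }
        return value;
    } else {
        double value = Double.POSITIVE_INFINITY;
        for (GameState child : children(state)) {
            value = Math.min(value, alphaBeta(child, depth - 1, alpha, beta, true));
            beta = Math.min(beta, value);
            if (beta <= alpha) {
                break;                           // alpha cut-off: MAX will never allow this branch
            }
        }
        return value;
    }
}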

Chapter 5
Implementation

5. Implementation

5.1 Description

This chapter describes the tools that were chosen for the implementation
of this project. There is also a description of the classes of the application, the test
methods that were used for testing the project, and the class diagram of the
application.

5.1.1. The tools that were used

Integrated Development Environment

NetBeans IDE 8.1, Java: 1.8.0_65; Java HotSpot(TM) 64-Bit Server VM 25.65-b01
Runtime: Java(TM) SE Runtime Environment 1.8.0_65-b17

System: Mac OS X version 10.10.5 running on x86_64

Word Editor

Microsoft Office Word 2007

General Used Tools

Web Browser: Google Chrome, Version 52.0.2743.116 m

PDF Reader: Adobe Reader, Version 11.0.3

Used Tools for the diagrams and tests

Edraw Soft: Edraw Max 6, Version 6.1

SPSS version 21

5.2. Design Documents

5.2.1. Class Diagram

Figure 1.2: The Class Diagram of the application

5.2.2. Description of the Class Diagram

The Class Diagram consists of the following Java classes: the main class EmpiresAge,
which creates objects from the MonteCarlo.java, MinMaxPlayer.java and
MCTSPlayer.java classes; the MinMaxPlayer.java class, which inherits from the
Player.java class; and the MonteCarlo.java and MCTSPlayer.java classes, which
inherit from the Player.java class and create MCTSNode.java objects.

5.3. Classes of the Implementation

5.3.1. Player.java class

The Player.java class describes what the player can do. It is the super class of the
MinMaxPlayer.java, MonteCarlo.java and MCTSPlayer.java classes. The reason this
class was chosen as a super class is that the three algorithms (MCTS,
UCT and Minimax) share the same attributes, like the state of the game, the utility of a
state and the possible states (every possible move that the player can make in a state).

Attributes of the class:

The Player.java class has two attributes: a HashMap holding the state of the game, and
an Integer player which can only have the values 1 or 2. The value 1
means it is the first player and 2 means it is the second player.

The state of the game is represented by a HashMap<String, Integer>; the maintained
key is a String and the mapped value is an Integer. Its size is 12 (6 * 2), since it
represents the six things each player can have in the game (number of villagers,
number of warriors, number of built houses, number of fighting units, amount of
wood resources and amount of food resources). In this game the state is how many
available resources and units a player has at the current point of the game. When the
state of the game needs to be updated, the player's move is used to update
the HashMap. For example, if player one's move is to build a new house, the
element with the key numberOfHousesPlayerOne will be updated.

Let state be the HashMap object. This is how the state is updated in the program:

state.put("numberOfHousesPlayerOne", state.get("numberOfHousesPlayerOne") + 1);

which increments the value by 1, and

state.put("woodPlayerOne", state.get("woodPlayerOne") - 50);

which subtracts the cost of the wood that is needed in order to build a house.

The main reason the private possibleMoves method was created was to give the
possible moves that a player can make in a given state. It takes two parameters: a state
(HashMap state) and which player it is (Integer player). It returns an ArrayList of the
possible moves, where each move is an integer number:
1 for collecting wood,
2 for collecting food,
3 for creating a new house,
4 for creating a fighting unit,
5 for creating a new villager and
6 for creating a new warrior.

The method takes all the variables that the player has in this specific state, calculates
which moves the player is able to make, adds them to the ArrayList and then returns
this ArrayList.

The utility method’s goal is to return how good a state is. It takes the state of the game
as a parameter. It takes all the valuables that the player has on this specific state and
then calculates which move the player is able to make, adds them in the ArrayList and
then return this ArrayList. The more analogue the attributes are between themselves
the better the utility. But this means that the smaller number the utility is the better.
Since Minimax needs the utility to be bigger in order to be better, the utility changes
the sign.

The endGame method takes the state of the game and returns whether the game has
ended or not. In order to win the game, one of the two players needs to have
villagers > 50 && warriors > 50 && fighting units > 30.

The possibleStates method’s aim is to return an ArrayList of possible states from a


specific state which is passed as a parameter. It calls the possibleMoves method in
order to identify which moves it can take from a specific state. Then it runs a loop for
all these moves, creates a new state for each move, add these states to the ArrayList
and then returns the list.
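A minimal sketch of the shape such a method could take, assuming a hypothetical applyMove helper that copies the state and applies one legal move (the internals of the actual implementation may differ):

private ArrayList<HashMap<String, Integer>> possibleStates(HashMap<String, Integer> state, Integer player) {
    ArrayList<HashMap<String, Integer>> states = new ArrayList<HashMap<String, Integer>>();
    for (Integer move : possibleMoves(state, player)) {
        // applyMove is assumed: it returns a copy of the state with one move applied
        HashMap<String, Integer> next = applyMove(new HashMap<String, Integer>(state), move, player);
        states.add(next);
    }
    return states;
}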

5.3.2. MinMaxPlayer.java

The MinMaxPlayer.java class is used to create and run the Minimax algorithm. It
inherits from the Player.java class so that it can share its attributes.

The alphabeta method is used for pruning the Minimax algorithm with α-β pruning. It
takes as parameters the state of the game on which α-β is run, the depth of the tree that
will be searched, the alpha value (initialized to minus infinity), the beta value
(initialized to plus infinity) and the player, who is either max or min. It runs
recursively until the depth is 0 or the state is a terminal state, and it returns the utility.
If it is the max player, it runs a loop over all possible states and calls itself (for the
min player) in order to calculate the alpha value; if the alpha value becomes greater
than or equal to the beta value, the loop breaks. If it is the min player it does the same
thing but calls itself for the max player.

The updateMove method decides the next state of the player by using Minimax with
α-β pruning and returns the updated state. It takes the current state of the game as an
argument, runs α-β pruning for all the possible states and returns the state with the
highest score.

5.3.3. MCTSNode.java

The MCTSNode.java class is used in the MonteCarlo.java and MCTSPlayer.java
classes in order to build the tree. Its attributes are the following: the
visitors field contains the number of times this node has been visited
in the tree (used by the MCTS algorithm); the utility field is the utility of the
node; the terminal field describes whether the node is terminal or not; the
parentNode field points to the node's parent in the tree; and the state field
is a HashMap for the state of the node. The childrenList field is the list of the
children of this node in the tree. The class contains all the setters and getters of these
fields.

5.3.4. MonteCarlo.java

The MonteCarlo.java class has been created for the simple Monte Carlo algorithm. It
extends the Player.java class in order to inherit its attributes.

The simulation method takes an MCTSNode as a parameter and returns an ArrayList
of such nodes, which is the tree of the simulation. It runs a while loop until a terminal
node is reached; it takes the possible states of a node and adds them to the
ArrayList, and for each node that it processes it updates the node's utility.

The monteCarlo method takes a state and calculates the total utility of the simulation
of this state. It then divides this by the height of the simulated tree, and the result is
how good the state is.

The updateMove method decides the next state of the player by using
the Monte Carlo algorithm and returns the updated state. It takes the
current state of the game as a parameter, runs Monte Carlo for all the possible states
and returns the state with the highest score.
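A minimal sketch of this scoring idea, assuming the simulation method described above; the MCTSNode constructor shown and the treeHeight helper are assumptions, and the names are illustrative rather than the exact code:

double monteCarlo(HashMap<String, Integer> state) {
    // Roll out from the given state until a terminal node is reached
    ArrayList<MCTSNode> tree = simulation(new MCTSNode(state));
    double totalUtility = 0;
    for (MCTSNode node : tree) {
        totalUtility += node.getUtility();    // sum the utilities gathered along the rollout
    }
    return totalUtility / treeHeight(tree);   // normalise by the height of the simulated tree
}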

5.3.5. MCTSPlayer.java

The MCTSPlayer.java class is used for the creation of the Monte Carlo Tree Search
algorithm. It extends the Player.java class so that it can share its attributes.

The expand method takes a node as an argument and expands it (it creates all the
possible child nodes and adds them as children of the node).

The backPropagation method propagates from the parameter node to the root node
using recursion, and at every step it updates the visitor count and the backpropagation
value (utility) of the node.

The simulation method works similarly to the simulation method described in the
MonteCarlo.java class, except that this one calls the expand method at the end.

The updateMove method decides the next state of the game by running
the MCTS algorithm. It decides which is the best possible state by running MCTS
(the simulation(), backPropagation() and expand() methods) and calculates the max
score for each candidate by using the UCT formula:

double maxScore1 = backPropagationNode1.getUtility() +
        Math.sqrt(Math.log(backPropagationNode1.getVisitor()) / currentNode.getVisitor());

where backPropagationNode1 is the node on which the MCTS algorithm is run and
currentNode is the current state's node, which is also the parent node of the
backpropagation node.

5.3.6. EmpiresAge.java

The EmpiresAge.java class is the class which contains the main method of the
implementation. It is used to run the contested algorithms and output the results. It
runs three games (the Minimax algorithm against the MCTS algorithm, the MCTS
algorithm against the simple Monte Carlo, and the simple Monte Carlo against the
Minimax algorithm).

5.4. Testing

In order to identify any errors in the design and coding of the program, a testing
technique was essential. The testing method most suitable for this project was the
white box testing technique [Ehmer and Khan, 2012] [Ehmer, 2010]. White box
testing can be carried out only if the tester is fully aware of the source code that has
been written, and it deals with the internal logic and structure of the program code.

Figure 1.3: Process of the white box testing.
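As an illustration of what a white box test of this code base could look like, the sketch below exercises the endGame condition described in section 5.3.1 (villagers > 50, warriors > 50, fighting units > 30). It is a hypothetical example written with JUnit, not one of the tests actually run for this project; the state keys follow the naming used in EmpiresAge.java:

import static org.junit.Assert.assertTrue;
import java.util.HashMap;
import org.junit.Test;

public class PlayerEndGameTest {

    @Test
    public void gameEndsWhenPlayerOneExceedsAllThresholds() {
        HashMap<String, Integer> state = new HashMap<String, Integer>();
        state.put("numberOfVillagersPlayerOne", 51);  // villagers > 50
        state.put("numberOfVillagersPlayerTwo", 1);
        state.put("numberOfFightersPlayerOne", 51);   // warriors > 50
        state.put("numberOfFightersPlayerTwo", 1);
        state.put("numberOfHousesPlayerOne", 11);
        state.put("numberOfHousesPlayerTwo", 1);
        state.put("numberOfFightingUnitOne", 31);     // fighting units > 30
        state.put("numberOfFightingUnitTwo", 1);
        state.put("woodPlayerOne", 0);
        state.put("woodPlayerTwo", 0);
        state.put("foodPlayerOne", 0);
        state.put("foodPlayerTwo", 0);
        MinMaxPlayer player = new MinMaxPlayer(state, 1);
        assertTrue(player.endGame(state));
    }
}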

Chapter 6
Evaluation and Discussion

6.1. Research Questions

• Can MCTS be successfully applied to a strategy game?
• Which of the two approaches, simple MC or MCTS with UCT, is better? What are the limitations of each approach?
• What factors could affect the construction of the resources (population growth)?
• Is MCTS a more suitable algorithm than Minimax to indicate successful population growth?

The above questions were answered using data obtained from the study and analysed
with the statistical package SPSS (version 21). Paired-sample t-tests were employed
to test for significant differences on the variables of food, wood, villagers, warriors,
houses and fighting units between the Monte Carlo and UCT algorithms. Paired-sample
t-tests were also performed to indicate potential significant differences in the
population growth of the UCT and Minimax algorithms. Normality tests were
performed in both cases.

Can MCTS be successfully applied to a strategy game?

There are many papers that have studied whether or not the Monte Carlo Tree
Search algorithm can be applied successfully to strategy games. The majority of these
papers showed that the MCTS algorithm is well suited to strategy games as well as to
other types of games.

First of all, the paper by Paolo Ciancarini and Gian Piero Favini [Ciancarini and
Favini, 2010] studies how MCTS can be applied to the game of Kriegspiel. They ran
a variety of tests comparing the MCTS approach to the Minimax approach, and the
results showed that MCTS can produce good results in a reasonable amount of time,
but only if the design of the application of the Monte Carlo method is well structured.

Another approach has been made by István Szita, Guillaume Chaslot, and Pieter
Spronck. In their paper they apply MCTS in the game of Settlers of Catan [Herik,
Herik and Herik, 2010] which has as a main objective the gathering of resources in
order to use them for the creation and development of settlers, cities, roads etc. They
proved that the MCTS is a successful approach for the Settlers of Catan game and to
games with more complex rules than the Go game [Szita et al., 2009].

In their paper Naoyuki Sato, Kokolo Ikeda and Takayuki Wada suggested using
the Monte Carlo method for setting the preferences in an RPG game. After a series of
experiments on both human and AI players the Monte Carlo method was proven
effective [Sato et al., 2015].

Another example is the paper by Michael Chung, Michael Buro and Jonathan
Schaeffer, which shows that Monte Carlo planning can produce promising results in
RTS games [Chung et al., 2005].

In addition, Zhe Wang, Kien Quang Nguyen, Ruck Thawonmas and Frank Rinaldo in
their paper used Monte Carlo in the famous game Starcraft, where it performed
better than the original AI and showed promising results [Wang et al., 2012].

In our approach we ran Monte Carlo Tree Search against simple Monte Carlo, and
Monte Carlo Tree Search against Minimax, on the same game, and we observed
that out of 20 playouts of each match-up the UCT was always surpassing the simple
Monte Carlo and the Minimax. So we can conclude that Monte Carlo Tree Search is a
suitable algorithm for strategy games, and also the best of the tested options.

Which of the two approaches, simple MC or MCTS with UCT, is better? What
are the limitations of each approach?

According to the following results (Figure 1), the statistical analysis concluded that
there was high variability among the five parameters of the two studied algorithms.
More specifically, the UCT algorithm showed a more effective performance on the
house (t(20)=2.961, P=0.008) and warrior factors of the strategy game
(t(20)=6.648, P<0.005) (Appendix 1), while Monte Carlo surpassed the UCT by 9
units in the villager parameter (t(20)=-6.576, P<0.005) and by 72 units in the wood
parameter (t(20)=-1.12, P>0.005). Meanwhile, the food parameter narrowed the
difference between the two algorithms to 29 units (t(20)=-1.12, P>0.005) and the
fighting unit buildings to only 2 units (t(20)=-5.152, P<0.005) (Appendix 2).

Figure 1: The number and type of parameters that were calculated on the study.

Is MCTS a more suitable algorithm than Minimax to indicate successful
population growth?

Similar differences to the above question were also drawn from the statistical
analysis of the UCT and Minimax algorithms regarding population growth (Figure
2). More specifically, the UCT algorithm showed very low variability against the
Minimax algorithm in the house parameter (t(12)=-8.350, P<0.005) (Appendix 3),
while on the resources no significant differences were noticed between them
(t(12)=-0.805, P=0.436) (Appendix 4).

Figure 2: The total number of variables summarized from the two algorithms which
were performed on the game platform

What factors could affect the construction of the resources (population growth)?

Population growth is one of the most crucial parts of the logic of strategy games. It is
highly essential, as it defines the number of individuals that can occupy a
specific piece of land. Therefore, population growth depends on the number of
houses constructed in a specific area. However, the number of houses,
and hence the population growth, is notably affected by the resources: in order for
the construction of a house to be available, a specific amount of wood is
required. Hence, wood is a required component that can indirectly affect
population growth. Lastly, population growth is also influenced by the food
resource.

6.2. Conclusion

Taking all the above into account, it was concluded that the Monte Carlo Tree Search
algorithm performed better than the other two algorithms (Monte Carlo and
Minimax). The Monte Carlo Tree Search was the winning algorithm
most of the time in both match-ups, against the simple Monte Carlo and against the
Minimax algorithm. Additionally, it was the fastest algorithm, as it ran in less than 23
seconds while the Minimax algorithm needed ten times more time. Moreover, it can
be used successfully in a civilization type resource allocation strategy game.

6.3. Future work and what could have been better

It is true to say that, as in every application, there are some things in this one that
could have been better and some additions that could have been made.

First of all, with a greater variety of moves and rules, the results along with the
consistency of the game could have been improved. For example, there could have
been more unit types (cavalry, archers, etc.), more fighting units (towers, weapons, etc.)
and more resources (stone, gold, a variety of gathered food resources, etc.), like there
are in games such as Age of Empires [Ensemble Studios, 1997] or Command and
Conquer [Westwood Studios, 1995].

Also, a good addition would have been the use of resources for purposes other than
fighting units, for example investment in technologies. Furthermore, with more rules
the gameplay could have been richer and moves could have had a longer duration. For
example, if a house takes two turns to build, the villagers working on it are
unavailable for those two turns, so building another house in the next turn requires
finding other available villagers; a sketch of such a mechanic follows.
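
A minimal sketch of how such a rule might look is given below; the HouseBuild and
ConstructionQueue classes and the two-turn build time are hypothetical and do not
exist in the current implementation:

// Hypothetical multi-turn construction rule (not in the current code):
// a house takes two turns to finish and occupies one villager meanwhile.
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class HouseBuild {
    int turnsLeft = 2; // assumed build time of a house
}

class ConstructionQueue {
    List<HouseBuild> inProgress = new ArrayList<>();
    int busyVillagers = 0; // villagers tied up in construction

    void startHouse() {
        inProgress.add(new HouseBuild());
        busyVillagers++; // this villager is unavailable for other moves
    }

    // called once per game turn; finished houses free their villagers
    int advanceTurn() {
        int finished = 0;
        Iterator<HouseBuild> it = inProgress.iterator();
        while (it.hasNext()) {
            HouseBuild h = it.next();
            if (--h.turnsLeft == 0) {
                it.remove();
                busyVillagers--;
                finished++;
            }
        }
        return finished; // number of houses completed this turn
    }
}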

Another good addition would have been a Graphical User Interface, allowing the user
to interact with the application rather than only through the command line. With a
Graphical User Interface it would have been easier to set the parameters, run the
experiments and get a clearer view of each state of the game; a minimal sketch
follows.
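
As an illustration, a minimal Swing sketch of such a parameter window might look as
follows; the window title, field names and button label are purely illustrative and
not part of the implementation:

// Minimal Swing sketch of a parameter window (illustrative only; the
// actual project runs on the command line).
import javax.swing.*;

public class ParameterFrame {
    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("EmpiresAge parameters");
            JPanel panel = new JPanel();
            panel.add(new JLabel("Initial villagers:"));
            JTextField villagers = new JTextField("1", 5);
            panel.add(villagers);
            JButton run = new JButton("Run experiment");
            // a real version would build the players and start the game loop here
            run.addActionListener(e ->
                System.out.println("villagers = " + villagers.getText()));
            panel.add(run);
            frame.add(panel);
            frame.pack();
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setVisible(true);
        });
    }
}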

Last but not least, all three algorithms (MCTS, Monte Carlo and Minimax) could have
been implemented in a more sophisticated manner.

Appendices

7.1. Appendices

Appendix 1: Summary of the statistics from the factors of the two algorithms (MCTS
and MC)

Appendix 2: Descriptive statistics results of three of the parameters of the study

Appendix 3: Summary of the statistics from the two algorithms (MCTS and
Minimax), (Mean, Standard deviation, confidence interval)

Appendix 4: Descriptive statistics of the two algorithms (MCTS and Minimax) (Mean,
Standard deviation, confidence interval)

7.2. Tables with data

Data Table of Minimax and MCTS

Data Table of MCTS vs MC

7.3. Source Code
package empiresage;

import java.util.HashMap;

/**
*
* @author Lydia
*/
public class EmpiresAge {

/**
* @param args the command line arguments
*/
public static void main(String[] args) {
HashMap <String,Integer> state = new HashMap <String,Integer>();
state.put("numberOfVillagersPlayerOne", 1);
state.put("numberOfVillagersPlayerTwo", 2);
state.put("numberOfFightersPlayerOne", 2);
state.put("numberOfFightersPlayerTwo", 2);
state.put("numberOfHousesPlayerOne", 1);
state.put("numberOfHousesPlayerTwo", 1);
state.put("numberOfFightingUnitOne", 1);
state.put("numberOfFightingUnitTwo", 1);
state.put("woodPlayerOne", 0);
state.put("woodPlayerTwo", 0);
state.put("foodPlayerOne", 0);
state.put("foodPlayerTwo", 0);

// MiniMax VS MCTS
MinMaxPlayer minMax = new MinMaxPlayer(state,1);
MCTSPlayer mscts = new MCTSPlayer(state,2);
while (!mscts.endGame(state) && !minMax.endGame(state)){
state= mscts.updateMove(state);
mscts.setState(state);
minMax.setState(state);
state = minMax.updateMove(state);
mscts.setState(state);
minMax.setState(state);
}

if(minMax.endGame(state)){
System.out.println("The winner of the battle between "
+ "Minimax and MCTS is Minimax");
}
if(mscts.endGame(state)){
System.out.println("The winner of the battle between "
+ "Minimax and MCTS is MCTS");
}

// MCTS VS Monte Carlo


HashMap <String,Integer> state2 = new HashMap <String,Integer>();
state2.put("numberOfVillagersPlayerOne", 1);
state2.put("numberOfVillagersPlayerTwo", 2);
state2.put("numberOfFightersPlayerOne", 2);
state2.put("numberOfFightersPlayerTwo", 2);
state2.put("numberOfHousesPlayerOne", 1);
state2.put("numberOfHousesPlayerTwo", 1);
state2.put("numberOfFightingUnitOne", 1);
state2.put("numberOfFightingUnitTwo", 1);
state2.put("woodPlayerOne", 0);
state2.put("woodPlayerTwo", 0);
state2.put("foodPlayerOne", 0);
state2.put("foodPlayerTwo", 0);

MonteCarlo mc = new MonteCarlo(state2,1);


MCTSPlayer mscts2 = new MCTSPlayer(state2,2);
while (!mscts2.endGame(state2) && !mc.endGame(state2)){ // bug fix: test mc, not minMax
state2= mscts2.updateMove(state2);
mscts2.setState(state2);
mc.setState(state2);
state2 = mc.updateMove(state2);
mscts2.setState(state2);
mc.setState(state2);
}

if(mc.endGame(state2)){
System.out.println("The winner of the battle between"
+ " MCTS and Monte Carlo is Monte Carlo");
}
if(mscts2.endGame(state2)){
System.out.println("The winner of the battle between "
+ "MCTS and Monte Carlo is MCTS");
}

// Minimax VS Monte Carlo


HashMap <String,Integer> state3 = new HashMap <String,Integer>();
state3.put("numberOfVillagersPlayerOne", 1);
state3.put("numberOfVillagersPlayerTwo", 2);
state3.put("numberOfFightersPlayerOne", 2);
state3.put("numberOfFightersPlayerTwo", 2);
state3.put("numberOfHousesPlayerOne", 1);
state3.put("numberOfHousesPlayerTwo", 1);
state3.put("numberOfFightingUnitOne", 1);
state3.put("numberOfFightingUnitTwo", 1);
state3.put("woodPlayerOne", 0);
state3.put("woodPlayerTwo", 0);
state3.put("foodPlayerOne", 0);
state3.put("foodPlayerTwo", 0);

MonteCarlo mc2 = new MonteCarlo(state3,2);


MinMaxPlayer minMax2 = new MinMaxPlayer(state3,1);
while (!minMax2.endGame(state3) && !mc2.endGame(state3)){
state3= minMax2.updateMove(state3);
minMax2.setState(state3);
mc2.setState(state3);
state3 = mc2.updateMove(state3);
minMax2.setState(state3);
mc2.setState(state3);
}

if(mc2.endGame(state3)){
System.out.println("The winner of the battle between"
+ " Minimax and Monte Carlo is Monte Carlo");
}
if(minMax2.endGame(state3)){
System.out.println("The winner of the battle between"
+ " Minimax and Monte Carlo is Minimax");
}
}
}

package empiresage;

import java.util.ArrayList;
import java.util.HashMap;

/**
*
* @author Lydia
*/
public class MCTSNode {

private int visitors = 0; // visitors of the node


private int utility = 0; //utility of the node
private boolean terminal = false ;
private MCTSNode parentNode = null;
private HashMap <String,Integer> state = new HashMap <String,Integer> ();

private ArrayList <MCTSNode> childrenList = new ArrayList <MCTSNode> ();

/**
* Creates an MCTSNode object
* @param parent parent node in the tree
* @param hm state of the node
* @param player player identifier
*/
public MCTSNode(MCTSNode parent, HashMap hm,int player){
this.parentNode = parent ;
this.state =hm;
terminal = this.endGame(hm, player);
}

/**
* Adds a child node to the childrenList.
* @param n the child node to add
*/
public void addChild(MCTSNode n){
childrenList.add(n);
}

/**
* update visitor value by 1
*/
public void updateVisitor(){
visitors++;
}

/**
* Increases the visitor count by the given amount
* @param visitorN number of visits to add
*/
public void setVisitor(int visitorN){
visitors = visitors + visitorN;
}

/**
* Update utility
* @param int new utility
*/
public void updateUtility(int utility){
this.utility = utility;
}

/**
* Update utility for backpropagation purpose
* @param int new utility
*/
public void updateUtilityBackPropagation(int utilityP){
this.utility = this.utility + utilityP;
}

/**
* Returns the visitor count
* @return int
*/
public int getVisitor(){
return visitors;
}

/**
* Returns list of children
* @return ArrayList
*/
public ArrayList<MCTSNode> getChildrenList() {
return childrenList;
}

/**
* Returns if node is terminal
* @return boolean
*/
public boolean isTerminal() {
return terminal;
}

/**
* Returns whether the node has a parent
* @return boolean
*/
public boolean hasParent(){
return parentNode != null;
}

/**
* Returns the parent of the node
* @return MCTSNode
*/
public MCTSNode getParentNode() {
return parentNode;
}

/**
* Returns the state
* @return HashMap
*/
public HashMap<String, Integer> getState() {
return state;
}

/**
* Returns the utility
* @return integer
*/
public int getUtility(){
return utility;
}

/**
* sets state
* @param HashMap<String, Integer> state
*/
public void setState(HashMap<String, Integer> state) {
this.state = state;
}

/**
* This method returns whether the game has ended in the specific state.
* @param statePS - state of a game
* @param playerN player identifier
* @return boolean true if there is a winner, false if there is none
*/
public boolean endGame(HashMap statePS,int playerN){
String playerString = (playerN == 1) ? "One" : "Two";
int villagers = (int) statePS.get("numberOfVillagersPlayer"+playerString);
int warriors = (int) statePS.get("numberOfFightersPlayer"+playerString);
int fightingUnit = (int) statePS.get("numberOfFightingUnit"+playerString);

// a player wins once all three thresholds are exceeded
return villagers > 50 && warriors > 50 && fightingUnit > 30;
}
}
package empiresage;

import java.util.*;

/**
*
* @author Lydia
*/
public class MCTSPlayer extends Player {

private Random r = new Random ();

/**
* Creates a new MCTSPlayer object by initializing the state of the player
* and the player identifier
* @param state first state of the player
* @param player identifier
*/
public MCTSPlayer(HashMap state,int player){
super(state,player);
}

/**
* Expands a node with its children (all possible nodes)
* @param node
*/
public void expand(MCTSNode node){
if (node.isTerminal()){
return ;
}
// all possible states
for (Object x : this.possibleStates(node.getState(), playerN)){
// create MCTSNode for a possible node
MCTSNode mn = new MCTSNode(node, (HashMap) x,playerN);
node.addChild(mn);
mn.setState((HashMap<String, Integer>) x);

}
}

/**
* Runs until the root node is reached, updating the utility and the
* visitors along the way
* @param node
* @return MCTSNode the root node
*/
public MCTSNode backPropagation(MCTSNode node){
//if there is no parent that means that it is the root node
if (node.hasParent()){
node.getParentNode().setVisitor(node.getVisitor());
node.getParentNode().updateUtilityBackPropagation(node.getUtility());
return backPropagation(node.getParentNode());

}else{
return node;//root node
}
}

/**
* runs the simulation for a node and returns the tree that has been
* produced
* @param MCTSNode node
* @return ArrayList <MCTSNode> tree of the simulation
*/
public ArrayList <MCTSNode> simulation(MCTSNode node){
ArrayList <MCTSNode> simuList = new ArrayList <MCTSNode>();
simuList.add(node);

// as long as the last node is not terminal, keep building the tree
while (!simuList.get(simuList.size()-1).isTerminal()){
ArrayList <HashMap> possList = this.possibleStates
(simuList.get(simuList.size()-1).getState(), playerN);
// pick a random successor state for the playout
MCTSNode mn = new MCTSNode(simuList.get
(simuList.size()-1), possList.get(r.nextInt(possList.size())),playerN);
mn.updateVisitor();
mn.updateUtility(this.utility(mn.getState()));
simuList.add(mn);
expand(mn);//expand the node
}

return simuList;
}

/**
* This method decides what will be the player's next state
* by using MCTS and returns this state
* @param g current state
* @return state
*/
public HashMap updateMove(HashMap g){
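
// Outline of the MCTS phases as implemented below:
// 1. for every possible successor state a candidate node is created,
// 2. simulation(): the candidate is played out with random moves until
// a terminal state is reached, expanding each playout node,
// 3. backPropagation(): utilities and visit counts are propagated back
// to the playout root,
// 4. selection: the successor with the highest UCT-style score is returned.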

//current state's node


MCTSNode currentNode = new MCTSNode(null,g,playerN);
currentNode.updateVisitor(); // update visitor +1
//update utility by using the backpropagation method
currentNode.updateUtilityBackPropagation(currentNode.getUtility());
// possible states from current state
ArrayList <HashMap> possibleStates = this.possibleStates(g, playerN);

//first possible state is assumed to have the biggest value


MCTSNode node = new MCTSNode(null,possibleStates.get(0),playerN);
node.updateVisitor();
node.updateUtilityBackPropagation(node.getUtility());
ArrayList <MCTSNode> simulation = simulation(node);
MCTSNode backPropagationNode =
backPropagation(simulation.get(simulation.size()-1));

//use UCT
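// Note: standard UCB1 scores a node as
// utility + C * sqrt( ln(parentVisits) / childVisits )
// with an exploration constant C; the expression below is this
// implementation's variant of that exploration term.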
double maxScore = backPropagationNode.getUtility() + Math.sqrt
(Math.log(backPropagationNode.getVisitor()) / currentNode.getVisitor());
int counter = 0;

//calculate the node which has the biggest score from MCTS
for(int i=1; i <possibleStates.size(); i++){
MCTSNode node1 = new MCTSNode(null,possibleStates.get((i)),playerN);
node1.updateVisitor();
node1.updateUtilityBackPropagation(node1.getUtility());
ArrayList <MCTSNode> simulation1 = simulation(node1);
MCTSNode backPropagationNode1 =
backPropagation(simulation1.get(simulation1.size()-1));
double maxScore1 = backPropagationNode1.getUtility() + Math.sqrt
(Math.log(backPropagationNode1.getVisitor()) / currentNode.getVisitor());

if(maxScore < maxScore1){

maxScore = maxScore1;
counter = i;
}
}

// returns the state with the biggest score from MCTS


return possibleStates.get(counter);
}

}
package empiresage;

import java.util.*;

/**
*
* @author Lydia
*/
public class MinMaxPlayer extends Player{

private Random r = new Random ();

/**
* Creates a new MinMaxPlayer object by initializing the state of the player
* and the player identifier
* @param state first state of the player
* @param player identifier
*/
public MinMaxPlayer(HashMap state,int player){
super(state,player);
}
/**
* Uses minimax with alpha-beta pruning;
* runs until a state is terminal or the depth reaches 0
*
* @param stateMM the state on which minimax will run
* @param depth the depth of the search
* @param alpha the alpha bound
* @param beta the beta bound
* @param player player identifier
* @return utility of the final state
**/
public int alphabeta(HashMap stateMM, int depth, int alpha, int beta,
int player){

if (depth == 0 || this.endGame(stateMM)){
return this.utility(stateMM);
}else{
if(player == 1){//max player
for(Object xo : this.possibleStates(stateMM,1)){
int al = this.alphabeta((HashMap) xo, depth-1, alpha,
beta, 2);
alpha = Math.max(alpha, al);
if (alpha >= beta){break;}
}
return alpha;

}else{//min

for(Object xo : this.possibleStates(stateMM,2)){
int be = this.alphabeta((HashMap) xo, depth-1, alpha,
beta, 1);
beta = Math.min(beta, be);
if (alpha >= beta){break;}

}
return beta;
}
}
}

/**
* This method decides what will be the player's next state
* by using minimax (alpha-beta pruning) and returns
* this state
* @param g current state
* @return state
*/
public HashMap updateMove(HashMap g){
ArrayList <HashMap> possibleStates = this.possibleStates(g, playerN);
ArrayList <Integer> list = new ArrayList <> ();

int max = alphabeta(possibleStates.get(0), 6, -999, 999, 1);


list.add(0);

for (int i=1; i<possibleStates.size();i++){


int alpha = alphabeta(possibleStates.get(i), 6, -999, 999, 1);
if (alpha > max){
max = alpha;
list.clear();
list.add(i);
} else if (alpha == max){ // else-if: avoid adding i twice when a new max is found
list.add(i);
}
}
return possibleStates.get(list.get(r.nextInt(list.size())));

}
}
package empiresage;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Random;

/**
*
* @author Lydia
*/
public class MonteCarlo extends Player{

Random r = new Random ();

/**
* Creates a new MonteCarlo object by initializing the state of the player
* and the player identifier
* @param state first state of the player
* @param player identifier
*/
public MonteCarlo (HashMap state,int player){
super(state,player);
}

/**
* Returns the score value of the Monte Carlo algorithm: the average
* utility over one random playout starting from the given state
* @param state the state to evaluate
* @return double score
*/
public double monteCarlo(HashMap state){
int total = 0;
MCTSNode currentNode = new MCTSNode(null,state,playerN);
ArrayList <MCTSNode> simulationList = this.simulation(currentNode);
for (MCTSNode x :simulationList){
total = total + x.getUtility();
}

return (double) total / simulationList.size(); // average, not integer division
}

/**
* runs the simulation for a node and returns the tree that has been
* produced
* @param MCTSNode node
* @return ArrayList <MCTSNode> tree of the simulation
*/
public ArrayList <MCTSNode> simulation(MCTSNode node){
ArrayList <MCTSNode> simuList = new ArrayList <MCTSNode>();
simuList.add(node);

// as long as the last node is not terminal, keep building the tree
while (!simuList.get(simuList.size()-1).isTerminal()){
ArrayList <HashMap> possList = this.possibleStates
(simuList.get(simuList.size()-1).getState(), playerN);
// pick a random successor state for the playout
MCTSNode mn = new MCTSNode(simuList.get(simuList.size()-1),
possList.get(r.nextInt(possList.size())),playerN);
mn.updateUtility(this.utility(mn.getState()));
simuList.add(mn);
}

return simuList;
}

/**
* This method decides what will be the player's next state
* by using the Monte Carlo score and returns this state
* @param g current state
* @return state
*/
public HashMap updateMove(HashMap g){

ArrayList <HashMap> possibleStates = this.possibleStates(g, playerN);

double maxScore = monteCarlo(possibleStates.get(0));


int counter = 0;

for(int i=1; i <possibleStates.size(); i++){


double maxScore1 = monteCarlo(possibleStates.get(i));

if(maxScore < maxScore1){


maxScore = maxScore1;
counter = i;
}
}

return possibleStates.get(counter);
}
}

package empiresage;

import java.util.ArrayList;
import java.util.HashMap;

/**
*
* @author Lydia
*/
public class Player {

protected int playerN = 0; // identifier of player


//current state of the player
protected HashMap <String,Integer> state = new HashMap <String,Integer>();
private final int WARRIORCOST = 100; // cost of warrior
private final int VILLAGERCOST = 50; //cost of villager
private final int HOUSECOST = 100; //cost of house
private final int FIGHTINGUNITCOST = 150; //cost of fighting unit

/**
* Creates a new Player object by initializing the state of the player and
* the player's identifier
* @param state the first state of the player
* @param player identifier
*/
public Player(HashMap state,int PlayerNc){
this.state = state;
this.playerN = PlayerNc;
}

/**
* Returns an ArrayList of possible moves:
* 1 collect wood
* 2 collect food
* 3 create new house
* 4 create fighting unit
* 5 create new villager
* 6 create new warrior
* @param statePM - state of the game
* @param playerPM player identifier
* @return ArrayList of possible moves
*/
private ArrayList <Integer> possibleMoves(HashMap statePM,int playerPM) {
ArrayList <Integer> possibleMoves = new ArrayList <Integer> ();

//for which player should get data from the state


String playerString = (playerPM == 1) ? "One" : "Two";

//get the data from the state


int villagers = (int)statePM.get("numberOfVillagersPlayer"+playerString);
int house = (int)statePM.get("numberOfHousesPlayer"+playerString);
int warriors = (int)statePM.get("numberOfFightersPlayer"+playerString);
int fightingUnit = (int)statePM.get("numberOfFightingUnit"+playerString);
int wood = (int)statePM.get("woodPlayer"+playerString);
int food = (int)statePM.get("foodPlayer"+playerString);

int allowedPopulation = house * 5; // allowed population is 5 per house


int population = warriors +villagers;

possibleMoves.add(1); // always can collect wood


possibleMoves.add(2); // always can collect food

// can build a house if the available wood is at least the house cost
if ( wood >= HOUSECOST){
possibleMoves.add(3);
}
if( wood >= FIGHTINGUNITCOST){ //can build a fighting unit
possibleMoves.add(4);
}
//can produce new villager if available food is bigger than the cost
//and the population hasn't overcome the available population
if(food > VILLAGERCOST && population < allowedPopulation){
possibleMoves.add(5);
}
//can produce a new warrior
if(food > WARRIORCOST && population < allowedPopulation){
possibleMoves.add(6);
}

return possibleMoves;
}

/**
* Returns the utility of a state
* @param state - state of the game
* @return int utility of the state
*/
public int utility(HashMap state){

int utilityLevel =0;
String playerString = (playerN == 1) ? "One" : "Two";

//get the data from the state


int villagers = (int) state.get("numberOfVillagersPlayer"+playerString);
int house = (int) state.get("numberOfHousesPlayer"+playerString);
int warriors = (int) state.get("numberOfFightersPlayer"+playerString);
int fightingUnit = (int) state.get("numberOfFightingUnit"+playerString);
int wood = (int) state.get("woodPlayer"+playerString);
int food = (int) state.get("foodPlayer"+playerString);

int allowedPopulationU = house * 5;


int populationU = warriors + villagers;

//the utility is better when every attribute is kept in proportion


utilityLevel = (allowedPopulationU - populationU) +
( warriors/fightingUnit) + (warriors - villagers) +
(food / villagers) + (wood / villagers) + (food / warriors) +
(wood / warriors);

utilityLevel *= -1; // change the sign: a smaller value is a better
// utility here, but minimax looks for the bigger value
return utilityLevel;
}

/**
* This method returns if the game has ended in the specific state.
* @param statePS - state of the game
* @return boolean true if there is a winner false if there is none
*/
public boolean endGame(HashMap statePS){
String playerString = (playerN == 1) ? "One" : "Two";
int villagers = (int) statePS.get("numberOfVillagersPlayer"+playerString);
int warriors = (int) statePS.get("numberOfFightersPlayer"+playerString);
int fightingUnit = (int) statePS.get("numberOfFightingUnit"+playerString);

return villagers > 50 && warriors > 50 && fightingUnit > 30;
}

/**
* Returns an ArrayList of possible states
* @param statePS - state of the game
* @param player player identifier
* @return ArrayList <HashMap> of possible states
*/
public ArrayList possibleStates(HashMap statePS,int player){

ArrayList <HashMap> list = new ArrayList <> ();


String playerString = (player == 1) ? "One" : "Two";
for (Integer x: this.possibleMoves(statePS,player)){

HashMap <String,Integer> possibleStateH = statePS;

int wood = possibleStateH.get("woodPlayer" + playerString);
int food = possibleStateH.get("foodPlayer" + playerString);
int house = possibleStateH.get("numberOfHousesPlayer" + playerString);
int fighting = possibleStateH.get("numberOfFightingUnit" + playerString);
int villager = possibleStateH.get("numberOfVillagersPlayer" + playerString);
int fighters = possibleStateH.get("numberOfFightersPlayer" + playerString);

switch (x) {
case 1:
wood = 60 + wood;
break;
case 2:
food = 60 + food;

break;
case 3:
house = 1 + house;
wood = wood - HOUSECOST;
break;
case 4:

fighting = 1 + fighting;
wood = wood - FIGHTINGUNITCOST;
break;
case 5:
villager = 1 + villager;
food = food - VILLAGERCOST;
break;
case 6:

fighters = 1 + fighters;
food = food - WARRIORCOST;
break;
}

HashMap <String,Integer> xc= new HashMap <>();

String playerString2 = (player == 1) ? "Two" : "One";

// update values of possible states


xc.put("woodPlayer" + playerString, wood);
xc.put("foodPlayer" + playerString, food);
xc.put("numberOfHousesPlayer" + playerString, house);
xc.put("numberOfFightingUnit" + playerString, fighting);
xc.put("numberOfVillagersPlayer" + playerString, villager);
xc.put("numberOfFightersPlayer" + playerString, fighters);

xc.put("woodPlayer" + playerString2, possibleStateH.get("woodPlayer" + playerString2));


xc.put("foodPlayer" + playerString2, possibleStateH.get("foodPlayer" + playerString2));
xc.put("numberOfHousesPlayer" + playerString2, possibleStateH.get("numberOfHousesPlayer" +
playerString2));
xc.put("numberOfFightingUnit" + playerString2, possibleStateH.get("numberOfFightingUnit" +
playerString2));
xc.put("numberOfVillagersPlayer" + playerString2, possibleStateH.get("numberOfVillagersPlayer" +
playerString2));

xc.put("numberOfFightersPlayer" + playerString2, possibleStateH.get("numberOfFightersPlayer" +
playerString2));

list.add(xc);

}
return list;
}

/**
* Returns the state in which the player is .
* @return HashMap <Sting,Integer> state
*/
public HashMap getState(){
return state;
}

/**
* Sets the state.
*
* @param h to set state
*/
public void setState(HashMap h){
state = h ;
}

/**
* Returns which player is - it can be 1 or 2
* @return int player identifier
*/
public int getPlayerN() {
return playerN;
}
}

References
1. Stanescu, M. and Certicky, M. (2016). Predicting Opponent's Production in Real-Time
Strategy Games With Answer Set Programming. IEEE Trans. Comput. Intell. AI Games, 8(1),
pp.89-94.
2. Jain, A., Mao, J. and Mohiuddin, K. (1996). Artificial neural networks: a tutorial.
Computer, 29(3), pp.31-44.
3. Lin, C.-S. and Ting, C.-K. (2011). Emergent Tactical Formation Using Genetic
Algorithm in Real-Time Strategy Games. Conference on Technologies and Applications of
Artificial Intelligence.
4. Synnaeve, G. and Bessiere, P. (2015). Multi-scale Bayesian modeling for RTS games: an
application to StarCraft AI. IEEE Trans. Comput. Intell. AI Games, pp.1-1.
5. Langseth, H. and Portinale, L. (2007). Bayesian networks in reliability. Reliability
Engineering & System Safety, 92(1), pp.92-108.
6. Balla, R. (2009). UCT for tactical assault battles in real-time strategy games.
7. Chung, M. (2005). Monte Carlo planning in RTS games.
8. Mecham, J. and Bishop, R. (1999). Age of empires II. Rocklin, Calif.: Prima Pub.
9. Honeywell, S. (2003). Command & conquer. Roseville, CA: Prima Games.
10. Setear, J. (1989). Simulating the fog of war. Santa Monica, CA: Rand Corp.
11. Ciancarini, P. and Favini, G. (2010). Monte Carlo tree search in Kriegspiel. Artificial
Intelligence, 174(11), pp.670-684.
12. Powley, E., Cowling, P. and Whitehouse, D. (2014). Information capture and reuse strategies
in Monte Carlo Tree Search, with applications to games of hidden information. Artificial
Intelligence, 217, pp.92-116.
13. Chen, K. (2012). Dynamic randomization and domain knowledge in Monte-Carlo Tree Search
for Go knowledge-based systems. Knowledge-Based Systems, 34, pp.21-25.
14. Enzenberger, M., Muller, M., Arneson, B. and Segal, R. (2010). Fuego - An Open-
Source Framework for Board Games and Go Engine Based on Monte Carlo Tree Search.
IEEE Trans. Comput. Intell. AI Games, 2(4), pp.259-270.
15. Pepels, T., Winands, M. and Lanctot, M. (2014). Real-Time Monte Carlo Tree Search in Ms
Pac-Man. IEEE Trans. Comput. Intell. AI Games, 6(3), pp.245-257.
16. Perez, D., Mostaghim, S., Samothrakis, S. and Lucas, S. (2015). Multiobjective Monte Carlo
Tree Search for Real-Time Games. IEEE Trans. Comput. Intell. AI Games, 7(4), pp.347-360.
17. Browne, C. (2013). A Problem Case for UCT. IEEE Trans. Comput. Intell. AI Games, 5(1),
pp.69-74.
18. Browne, C., Powley, E., Whitehouse, D., Lucas, S., Cowling, P., Rohlfshagen, P., Tavener, S.,
Perez, D., Samothrakis, S. and Colton, S. (2012). A Survey of Monte Carlo Tree Search
Methods. IEEE Trans. Comput. Intell. AI Games, 4(1), pp.1-43.
19. Gelly, S. and Silver, D. (2011). Monte-Carlo tree search and rapid action value
estimation in computer Go. Artificial Intelligence, 175(11), pp.1856-1875.
20. Sullivan, W. (1984). Stanislaw Ulam, Theorist on Hydrogen Bomb. [online]
Nytimes.com. Available at: http://www.nytimes.com/1984/05/15/obituaries/stanislaw-ulam-
theorist-on-hydrogen-bomb.html [Accessed 21 Aug. 2016].
21. Suttner, C. and Ertel, W. (1989). Automatic acquisition of search guiding heuristics.
München Inst. für Informatik.
22. Arneson, B., Hayward, R. and Henderson, P. (2009). MOHEX wins Hex Tournament. ICG,
32(2), pp.114-116.
23. Wang, Z., Nguyen, K., Thawonmas, R. and Rinaldo, F. (2012). Monte-Carlo Planning
for Unit Control in StarCraft. The 1st IEEE Global Conference on Consumer Electronics
2012.

24. Keller, T. and Eyerich, P. (2012). PROST: Probabilistic Planning Based on UCT. In
Proceedings of the Twenty-Second International Conference on Automated Planning and
Scheduling (ICAPS), pp.119-127.
25. Kovarik, V. and Lisy, V. (2015). Analysis of Hannan Consistent Selection for Monte Carlo
Tree Search in Simultaneous Move Games. HC selection for MCTS in Simultaneous Move
Games.
26. Sato, N., Ikeda, K. and Wada, T. (2015). Estimation of Player's Preference for
Cooperative RPGs Using Multi-Strategy Monte-Carlo Method. IEEE CIG 2015.
27. Coulom, R. (2009). Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search.
LIFL, SequeL, INRIA Futurs, Université Charles de Gaulle, Lille, France.
28. Schadd, M., Winands, M., Herik, H., Chaslot, G. and Uiterwijk, J. (2008). Single-Player
Monte-Carlo Tree Search. IFIP International Federation for Information Processing 2008.
29. Tollisen, R., Jansen, J., Goodwin, M. and Glimsdal, S. (2015). AIs for Dominion Using Monte
Carlo Tree Search. Springer International Publishing Switzerland 2015.
30. Nijssen, J. and Winands, M. (2012). Playout Search for Monte-Carlo Tree Search in Multi-
player Games. Springer-Verlag Berlin Heidelberg 2012.
31. Perez, D., Rohlfshagen, P. and Lucas, S. (2012). Monte Carlo Tree Search: Long-term versus
Short-term Planning. 2012 IEEE Conference on Computational Intelligence and Games
(CIG’12).
32. Gelly, S., Wang, Y., Munos, R. and Teytaud, O. (2006). Modification of UCT with
Patterns in Monte-Carlo Go. Research Report RR-6062, INRIA.
33. Wang, W. and Sebag, M. (2012). Multi-objective Monte-Carlo Tree Search. JMLR:
Workshop and Conference Proceedings, 25, pp.507-522.
34. Balla, R. and Fern, A. (2009). UCT for Tactical Assault Planning in Real-Time Strategy
Games. Proceedings of the Twenty-First International Joint Conference on Artificial
Intelligence (IJCAI-09).
35. Sturtevant, N. (2008). An Analysis of UCT in Multi-player Games. IFIP International
Federation for Information Processing 2008, CG 2008, LNCS 5131, pp. 37-49.
36. Cellan-Jones, R. (2016). Artificial intelligence: Google's AlphaGo beats Go master Lee Se-dol
- BBC News. [online] BBC News. Available at: http://www.bbc.co.uk/news/technology-
35785875 [Accessed 31 Aug. 2016].
37. Deepmind.com. (2016). AlphaGo | Google DeepMind. [online] Available at:
https://deepmind.com/alpha-go [Accessed 31 Aug. 2016].
38. Berardi, D., Calvanese, D. and De Giacomo, G. (2005). Reasoning on UML class diagrams.
Artificial Intelligence, 168(1-2), pp.70-118.
39. Ramanujan, R., Sabharwal, A. and Selman, B. (2011). On the behavior of UCT in
synthetic search spaces. Presented at the Monte Carlo Tree Search Workshop, 21st Int.
Conf. on Automated Planning and Scheduling, Freiburg, Germany.
40. Strong, G. (2011). The Minimax Algorithm. 10 February 2011.
41. Ehmer, M. and Khan, F. (2012). A Comparative Study of White Box, Black Box and Grey
Box Testing Techniques. International Journal of Advanced Computer Science and
Applications, 3(6).
42. Khan, M.E. (2010). Different Forms of Software Testing Techniques for Finding
Errors. International Journal of Computer Science Issues, 7(3), No. 1.
43. Szita, I., Chaslot, G. and Spronck, P. (2009). Monte-Carlo Tree Search in Settlers
of Catan. ACG'09 Proceedings of the 12th International Conference on Advances in
Computer Games. Springer-Verlag Berlin Heidelberg.
