Risk Project Design Document

AI player for Risk Design Document
Dirk Brand, 16077229 September 26, 2013
Contents
1 Introduction
    1.1 Game Phases
2 Framework
    2.1 Protocol Design
        2.1.1 Protocol Messages
        2.1.2 Protocol Flow
    2.2 Class Design
        2.2.1 Common Objects
        2.2.2 Facilitator
        2.2.3 Controller
        2.2.4 Engine
    2.3 Method Summary
    2.4 User Interface
        2.4.1 Facilitator Connection
        2.4.2 Pre-Game Screen
        2.4.3 Game Screen
    2.5 Logging Functionality
3 Computer Player
    3.1 Submissive Player
    3.2 Baseline Player
    3.3 Expectiminimax with Alpha-Beta Pruning
    3.4 Monte Carlo Tree Search
        3.4.1 Stochasticity
        3.4.2 The Algorithm
4 Testing Design
Introduction
This document describes the design of the game framework of the Risk game outlined in this project. The design of a testing framework is also described. The main focus of the project is to investigate computer players for Risk, so emphasis is placed on the design of the algorithms that will feature in the computer players. Additional details can be found at the project website [10].
1.1
Game Phases
Our game of Risk consists of several phases, outlined in Figure 1. The diagram illustrates how a game transitions between these phases.
Figure 1: The game phases. The phases are: pre-game (P), setup (S), recruitment (R), battle (B), and manoeuvre (M).
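The phase codes above map naturally onto an enumeration. The following is a minimal sketch, not the project's implementation: the names GamePhase, next() and fromCode() are hypothetical, and the transition order (in particular that a manoeuvre is followed by a recruitment phase for the next turn) is an assumption, since the arrows of Figure 1 are not reproduced in the text. The class design in Section 2.2 stores the phase as an int, so an enum is only one possible representation.

```java
/** Game phases with the one-letter codes used in this document (hypothetical type). */
enum GamePhase {
    PRE_GAME('P'), SETUP('S'), RECRUITMENT('R'), BATTLE('B'), MANOEUVRE('M');

    final char code;

    GamePhase(char code) {
        this.code = code;
    }

    /** Assumed successor phase; after a manoeuvre the next turn starts with recruitment. */
    GamePhase next() {
        switch (this) {
            case PRE_GAME:    return SETUP;
            case SETUP:       return RECRUITMENT;
            case RECRUITMENT: return BATTLE;
            case BATTLE:      return MANOEUVRE;
            default:          return RECRUITMENT; // MANOEUVRE
        }
    }

    /** Looks up a phase by its one-letter code. */
    static GamePhase fromCode(char code) {
        for (GamePhase phase : values()) {
            if (phase.code == code) return phase;
        }
        throw new IllegalArgumentException("Unknown phase code: " + code);
    }
}
```

A controller could keep the current phase in a field and call next() whenever a phase ends; ending the game on a failure is deliberately left out of the sketch.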
Framework
The framework consists of the following components:
- The controller, which handles the playing of a game between two engines.
- The facilitator, which allows engines to connect to it and then initializes a controller with two engines to start a game.
- The graphical user interface (GUI) for human players.
- The GUI backend engine.
- Various computer-based engines (AIs).
- A component that facilitates logging.
- A text-based communication protocol for managing communication between the components.
The framework structure will be based on a model described in the Go Text Protocol (GTP) [4]. The structure of the framework is shown in Figure 2.
2.1
Protocol Design
The communication protocol will be a text-based communication system based on the GTP [4]. For both human players and AI players, the communication between the engine and the controller or facilitator will take place over TCP/IP channels, so the communication could also take place over a network. Communication will be done in Unicode, so the components of the framework could be implemented in different languages. Additional message text (for success and failure commands) should be in plain text. The protocol specifies various messages. A message can either be a request, issued by the controller, or a reply from an engine. If an engine is not able to handle a certain request from the controller, it can reply with an error message.

Request. The structure of a request will be as follows:
    id request_type [arguments]
The id field is a unique positive integer value and request_type is a string (without white space) describing the specific command, followed by a list of arguments (possibly empty, comma separated).

Reply. The structure of a reply will be as follows:
    id = [responses]
Figure 2: Structure of the framework. (a) Engines connected to the controller and facilitator (components shown: Controller, Facilitator and two Engines, each Human/AI). All connections use TCP/IP as base for communication. (b) Human interaction with the engine (components shown: Human Player, User Interface, GUI Backend Engine).
The id field is again a positive integer value and should match the id value included with the request that the reply is in response to. The responses field contains a list of values in response to the previous request (comma separated and possibly empty). The = indicates a success. Note that some replies may function as an acknowledgement of success without a list of responses (see the list of requests in Figures 3 and 4).

Error. The structure of a failure reply will be as follows:
    id ? [#]error_message
The id field is again a positive integer value and should match the id value included with the request that triggered this error. The error_message field is a compulsory string that describes the reason why the request could not be handled (with an optional # symbol as prefix).

Failure Mode. If an error message starts with the # symbol, followed by a message containing only the request type that triggered it, it indicates that the request was not expected by the engine. The controller does not recover, but instead logs the failure and ends the game. If the engine sends an unrecognised or unexpected message to the controller, the failure is logged and the game ends. If the controller or the engine reaches a network timeout, it also does not recover but logs the failure. In all instances where the game ends, the log will state that the misbehaving engine lost the game.

2.1.1 Protocol Messages
Different messages are used in different phases of the game and some messages may be re-used in different game phases. The facilitator uses a specific set of messages for communicating with the engines. Once a game has been set up, a controller is launched for the game and the controller then communicates with the engines using a different set of messages. The two sets of request messages that will be handled by the protocol are shown in Figures 3 and 4 (appropriate responses and game phase usage are also shown).
State | Request | Expected Reply | Game Phase | Description
1 | opponents [AI 1, AI 2, ...] | = | P | Sends a list of available opponents to the engine (these include AIs and connected human players).
2 | maps [world map, starcraft map, ...] | = | P | Sends a list of available game maps to the engine.
3 | start_choices | = playerName, opponent, map | P | Requests the engine to submit the initial configuration of a game. The engine replies with the appropriate choices made by the player.
4 | start_game [Player1name, Player2name, chosenMap] | | P | Informs the engine of the configuration of the game and that a game is starting. The first player name is the first player to place troops and will have their turn first.

Figure 3: Protocol for communication between the Facilitator and the engines.
State | Request | Expected Reply | Game Phase | Description
5 | initial_own_territories [territory_id1, territory_id2, ...] | = | S | Informs the engine of its initial allocated territories. The territories not mentioned belong to the opponent (the engine knows which territories exist, since they are specified in the map file).
6 | place_troops [number] | = territory_id1, number1, territory_id2, number2, ... | S, R | Informs the engine of the number of troops the player receives to place. The engine then replies with a list of territories and the number of troops to place on each territory.
7 | troops_placed [territory_id1, number1, territory_id2, number2, ...] | = | S, R | Informs the engine on which territories an opponent placed troops and how many troops the opponent placed on each territory.
8 | attack | = source_territory_id, dest_territory_id | B | Requests the engine to provide a source and a destination for an attack. The engine replies with a source and destination, or with an empty list if the player wants to stop attacking.
9 | attack_result [attack_dice1, attack_dice2, attack_dice3, defend_dice1, defend_dice2, source_id, dest_id] | = | B | Informs the engine of the results of an attack. The message includes the individual dice rolls of the players (a 0 is returned for unused dice) and where the attack took place. The engine calculates the result of the attack and removes troops from territories if necessary.
10 | manoeuvre | = source_territory_id, dest_territory_id, number | B, M | Requests the engine to provide a source and a destination for a manoeuvre, as well as the number of troops to move. The engine replies with a source, destination and number of troops. If the engine does not wish to move any troops, it provides an empty list. The request is used in the manoeuvre phase as well as in the battle phase, when a player defeats a territory belonging to another player and then has to move troops from the source of the attack to the defeated territory.
11 | move_troops [source_territory_id, dest_territory_id, number] | | B, M | Informs the engine that the specified number of troops have been moved from the source to the destination.
12 | result [message] | | ALL | Informs the engine what the result of the game was with the appropriate message (either a victory or a defeat). If the game ends as the result of a failure, the engine is not informed, but simply disconnected. The failure is logged.

Figure 4: Protocol for communication between the Controller and the engines.
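The request, reply and error formats from Section 2.1 can be illustrated with a small parser. This is a sketch under assumptions rather than the project's ProtocolManager: the class ProtocolMessage and its members are hypothetical names, request types are assumed to be written with underscores (for example place_troops), and the square brackets in the figures are treated as notation that is not sent on the wire, although the parser tolerates them.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical value class for one line of the text protocol. */
final class ProtocolMessage {
    enum Kind { REQUEST, REPLY, ERROR }

    final Kind kind;
    final int id;
    final String name;          // request type, or the error message for ERROR; empty for REPLY
    final List<String> values;  // arguments or responses, possibly empty

    private ProtocolMessage(Kind kind, int id, String name, List<String> values) {
        this.kind = kind;
        this.id = id;
        this.name = name;
        this.values = values;
    }

    /** Parses "id request_type [arguments]", "id = [responses]" or "id ? [#]error_message". */
    static ProtocolMessage parse(String line) {
        String[] parts = line.trim().split("\\s+", 3);
        int id = Integer.parseInt(parts[0]);
        String second = parts.length > 1 ? parts[1] : "";
        String rest = parts.length > 2 ? parts[2] : "";
        if (second.equals("=")) {
            return new ProtocolMessage(Kind.REPLY, id, "", splitList(rest));
        }
        if (second.equals("?")) {
            return new ProtocolMessage(Kind.ERROR, id, rest, Collections.<String>emptyList());
        }
        return new ProtocolMessage(Kind.REQUEST, id, second, splitList(rest));
    }

    /** Splits a comma-separated list into trimmed items; blank input gives an empty list. */
    private static List<String> splitList(String text) {
        String t = text.trim();
        if (t.startsWith("[") && t.endsWith("]")) {
            t = t.substring(1, t.length() - 1).trim();
        }
        if (t.isEmpty()) {
            return Collections.<String>emptyList();
        }
        return Arrays.asList(t.split("\\s*,\\s*"));
    }

    /** True if the error starts with '#', marking a request the engine did not expect. */
    boolean isUnexpectedRequest() {
        return kind == Kind.ERROR && name.startsWith("#");
    }
}
```

For example, parsing "6 = 4, 2, 9, 1" yields a reply with id 6 and four response values, while "8 ? #attack" yields an error for which isUnexpectedRequest() is true, the case in which the controller logs the failure and ends the game.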
2.1.2
Protocol Flow
The protocol is known to the engines, the controller and the facilitator. Thus, when an engine receives a request message, it replies with the correct response and the controller or facilitator knows to expect that response (and vice versa for the engine). This makes it easy to detect when an unexpected message has arrived. The facilitator only communicates with the engines during the pre-game phase; the controller then handles the communication with the engines in the other four game phases. The flow of a typical game is shown in Figure 5, with blue edges indicating the flow of requests and replies for the engine currently playing their turn and red edges indicating the same for the other engine. State 12 (a result message) can be reached at the end of the game, or from any state if the game fails at any point.
Figure 5: The usage of the protocol in our game of Risk.
2.2
Class Design
The facilitator and the controller components both have access to the protocol manager that communicates with the engines. The engines have a separate protocol manager for communicating with the facilitator and controller. The engines, the facilitator and the controller all have access to the Logger class. Class diagrams of the various classes are shown in Figures 6 to 9, with the individual method descriptions shown in Figure 10. All fields in classes are private, but appropriate getters and setters are provided by default.
2.2.1
Common Objects
Some object classes are used by multiple classes throughout the framework of the game. The GameState object is abstract and can be extended by users that wish to write their own engines and want to store additional information in the GameState object.
GameState (Abstract)
    mapName : String
    mapLocation : String
    players : Player[]
    gamePhase : int
    player1Territories : Territory[]
    player2Territories : Territory[]
    currentPlayerID : int
    + allocateTerritory(int player, int territory)
    + placeTroop(int player, int territory)
    + removeTroop(int player, int territory)

ConnectedPlayer
    playerInfo : Player
    socket : Socket
    output : OutputStream
    input : BufferedReader
    + closeConnections()
    + send(String mes)

ProtocolManager
    - messages : String[]
    - clients : ConnectedPlayer[]
    + sendCommand(int destID, int id, String command, String arguments)

Territory
    name : String
    ID : int
    continent : String
    continentID : int
    neighbours : Territory[]
    x : int
    y : int
    troops : int

Logger
    - currentLog : File
    - debugLevel : int
    + log(int level, String message)

Player
    name : String
    id : int
    color : Color
    ipAddress : String
    port : int

Figure 6: Common objects used by other classes.

2.2.2 Facilitator
The facilitator allows engines to connect to the controller and once two engines start a game, launches the controller with the two connected players and their choices made in the pre-game phase. The structure of the facilitator is shown in Figure 7.
FacilitatorLogic
    - PM : ProtocolManager
    - log : Logger
    + main()
    + readAIOpponents(String path)
    + connectPlayer(Socket socket, int id)
    + disconnectPlayer(int id)

Figure 7: Structure of the facilitator.

2.2.3 Controller
The controller keeps the game state and performs all the logic of the game (management of turns, simulation of dice rolls, management of game phases). The structure of the controller is shown in Figure 8.
ControllerLogic
    game : GameState
    PM : ProtocolManager
    log : Logger
    randomSeed : long
    + ControllerLogic(ConnectedPlayer p1, ConnectedPlayer p2, String mapName, Logger log)
    + playGame()
    + loadMap(String mapPath)
    + resolveAttack()
    - genRoll()

Figure 8: Structure of the controller.

2.2.4 Engine
The Engine consists of an EngineProtocolManager to process all requests from the controller and send relevant replies, and an EngineLogic unit that handles said requests by either making changes to the local game state, or retrieving the information requested by the controller. The EngineLogic unit has a local copy of the game state. The controller, facilitator and engines all have access to the same set of map files. If an engine is outdated and does not have a map specified by the facilitator and the controller, it should be updated by getting the latest maps from the project website [10]. All map files will be in the same format as the one used in the Risk implementation by Yura [7].
The structure of the engine is shown in Figure 9.
EngineLogic
    - game : GameState
    - epm : EngineProtocolManager
    + main()
    + establishControllerConnection(String address, int port)
    + playGame(Player p1, Player p2)
    + loadMap(String mapName, String mapURL)
    + troopPlaced(int[] territory_ids, int[] numberOfTroops)
    + recruitment(int number)
    + resolveAttack(int a1, int a2, int a3, int d1, int d2)
    + transferTerritoryControl(int territory_id, int player_id1, int player_id2)
    + setOppManoeuvre(int territory_id1, int territory_id2, int number)

EngineProtocolManager
    replies : String[]
    controllerIP : String
    controllerPort : int
    controllerSocket : Socket
    + process(String m)
    - sendSuccess(int id, String arguments)
    - sendFailure(int id, String message)

Figure 9: Structure of the engine.
2.3
Method Summary
All methods have a void return type, unless specified otherwise.

Class / Return Type | Method | Description
GameState | |
 | allocateTerritory(int player_id, int territory_id) | Allocates the specified territory to the player (adds the territory to the player's list of territories).
 | placeTroop(int player_id, int territory_id) | Increments the number of troops at the territory belonging to the specified player.
 | removeTroop(int player_id, int territory_id) | Decrements the number of troops of the territory belonging to the specified player.
ConnectedPlayer | |
 | closeConnections() | Closes the Socket, OutputStream and BufferedReader associated with the ConnectedPlayer.
 | send(String message) | Sends the specified message via the Socket of the ConnectedPlayer.
Logger | |
 | log(String message, int level) | Checks if the specified level is less than or equal to the logging level of the Logger instance and if so, appends the message to the log.
ProtocolManager | |
 | sendCommand(int destID, int id, String command, String arguments) | Sends a message to the ConnectedPlayer that corresponds to the destID. The message consists of the id value (if positive), the command String and the arguments String (unless empty).
ControllerLogic | |
 | ControllerLogic(ConnectedPlayer p1, ConnectedPlayer p2, String mapName, Logger log) | Constructor for the ControllerLogic class. Initiates the controller with two ConnectedPlayers and a map to play. The controller is also initialised with a Logger object, as it keeps the same log as the facilitator.
 | playGame() | Starts a game at the setup phase. The method will cycle through the game phases and the player turns and communicate with the engines while updating the game state when changes happen in the game.
 | loadMap(String mapPath) | Loads the map from the file at the mapPath. Will generate a list of territories from the map file and allocate them to players randomly.
 | resolveAttack() | Gets dice rolls and removes troops from the appropriate territories in the game state, then informs players of the result.
int | genRoll() | Generates a random number between one and six.
FacilitatorLogic | |
 | main() | The starting point of the facilitator. Initialises the ProtocolManager and Logger global objects, opens a ServerSocket and waits for players to connect to it.
 | readAIOpponents(String path) | Reads the list of AI opponents in the directory at the given path and adds each of them to the list of ConnectedPlayers in the protocol manager kept by the facilitator.
 | connectPlayer(Socket socket, int id) | Adds a ConnectedPlayer to the list of connected players in the protocol manager with the given socket and id value.
 | disconnectPlayer(int id) | Removes the ConnectedPlayer corresponding to the given identifier from the list of connected players and calls the closeConnections() method inside the ConnectedPlayer class.
EngineLogic | |
 | main() | The starting point of the engine. Initialises the objects and fields of the class. Attempts to connect to the controller and on a successful connection, awaits further instructions from the controller.
 | establishFacilitatorConnection(String address, int port) | Connects to the facilitator. Throws an IOException if the port is already bound to another connection or an UnknownHostException if no available connection is found at the specified address.
 | playGame(Player p1, Player p2) | Initializes the game phases, starting with the setup phase. All messages from the controller will be received and processed, with relevant replies sent back.
 | loadMap(String mapPath) | Loads the map from the file at the mapPath to display in the user interface. It will process the territories listed in the map file and allocate territories to players based on the list sent by the controller with the initial_own_territories message. Throws a FileNotFoundException if the map file at the provided path does not exist.
 | troopsPlaced(int[] territory_ids, int[] numberOfTroops) | After receiving the troops_placed message from the controller, this method will be called to update the game state.
 | recruitment(int number) | Resolves the recruitment of the given number of troops during the recruitment phase. The player selects territories where to place recruited troops and the engine sends the selection to the controller.
 | resolveAttack(int a1, int a2, int a3, int d1, int d2) | Resolves the attack between the chosen source and destination territory. The first three integer parameters are the values of the attacker's dice and the other two the values of the defender's dice (a value of zero for unused dice). The result of the battle is computed and the appropriate troop numbers of the territories involved in the attack are updated in the game state.
 | transferTerritoryControl(int territory_id, int player_id1, int player_id2) | Removes the territory from the list of territories belonging to the first player and adds it to the list of territories belonging to the second player.
 | setOppManoeuvre(int territory_id1, int territory_id2, int number) | Decrements the number of troops of the source by the specified number, then increments the number of troops of the destination by the same number.
EngineProtocolManager | |
 | process(String m) | Processes the message sent by the controller by calling the relevant methods and sending the correct reply.
 | sendSuccess(int id, String arguments) | Sends a success message to the controller on the controllerSocket. The message will consist of the id value (if a positive id parameter is given), a = and the arguments (unless an empty parameter is provided).
 | sendFailure(int id, String message) | Sends a failure message to the controller on the controllerSocket. The message will consist of the id value (if a positive id parameter is given), a ? and the message String.

Figure 10: Method descriptions.
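The troop losses that resolveAttack has to derive from the five dice values could be computed as sketched below. The document does not state the comparison rule, so the sketch assumes the standard Risk rule: each side's dice are sorted in descending order and compared pairwise, and the defender wins ties. The class AttackResolver and the method losses are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical helper that computes the troop losses of one round of attack dice. */
final class AttackResolver {
    /** Returns {attackerLosses, defenderLosses}; a die value of 0 means "unused". */
    static int[] losses(int a1, int a2, int a3, int d1, int d2) {
        int[] attack = sortedDescending(a1, a2, a3);
        int[] defend = sortedDescending(d1, d2);
        int attackerLosses = 0;
        int defenderLosses = 0;
        for (int i = 0; i < defend.length; i++) {
            // Unused dice sort to the end, so the first 0 ends the comparison.
            if (attack[i] == 0 || defend[i] == 0) break;
            if (attack[i] > defend[i]) {
                defenderLosses++;
            } else {
                attackerLosses++; // assumed rule: the defender wins ties
            }
        }
        return new int[] {attackerLosses, defenderLosses};
    }

    /** Returns a copy of the dice values sorted from highest to lowest. */
    private static int[] sortedDescending(int... dice) {
        int[] sorted = dice.clone();
        Arrays.sort(sorted);
        for (int i = 0, j = sorted.length - 1; i < j; i++, j--) {
            int tmp = sorted[i];
            sorted[i] = sorted[j];
            sorted[j] = tmp;
        }
        return sorted;
    }
}
```

With attacker dice 6 and 3 (third die unused) against defender dice 5 and 4, the pairs are 6 versus 5 and 3 versus 4, so each side loses one troop.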
2.4
User Interface
The user interface will be created using the Java Swing API. The user interface will consist of only two windows. These will include a window to set up a game (selecting an opponent, selecting the game map, choosing a name and choosing a colour) and a window that will function as a front-end for the game engine. The latter will contain a panel for displaying information as well as a panel that displays the map and the various troop allocations per territory. Some prototype screenshots are shown in Figures 11 to 13. Each of the prototype interfaces consists of various components, which are also discussed.

2.4.1 Facilitator Connection
Before any game playing can commence, the facilitator has to be launched. Since communication can be done over the network, the facilitator can be on either of the human players' machines, or on a different remote machine. Any human player that wishes to play a game must first establish a connection to the facilitator. The facilitator can be used in two ways. Either a player connects to the facilitator and creates a game with their choices of name, map and opponent (either an AI or a player that has already connected to the facilitator) or they join a game that has been created by another player. The facilitator could also be bypassed and the controller could be launched directly by a handler that will set up a game between two engines. This will be used when testing AI engines. The window in Figure 11 will co-ordinate the connection procedure. The player need only provide the internet address (can be localhost), the port on which to connect to the controller and the proxy (if left blank, the default system proxy is used). Once a player has connected to the controller, he/she is registered, so when another human player connects to the controller, they will be able to choose any previously connected players as opponents.
Figure 11: Connecting to the controller. (a) Prototype GUI. (b) Warning message if port value not filled in.

The window in Figure 11 consists of three JLabels, three JTextFields and a JButton. A user can enter text into the text fields, then press connect. If any of the fields are left empty, an error message will be displayed (like in Figure 11b).
2.4.2
Pre-Game Screen
Once a player has connected to the facilitator, the next window will allow a player to choose a name, select which opponent they wish to play against, select a map to use in the game and choose the colour each player will be represented by. The prototype for this window is shown in Figure 12a.
Figure 12: The Pre-game Screen. (a) Prototype for the pre-game phase window. (b) Warning message if user did not fill in a name.

The pre-game screen consists of a JList that shows the list of available opponents, another JList with the list of available maps, two JColorChoosers that the user can use to associate players with colours, a JTextField that takes the player's name, and a JButton that allows the user to proceed to the next window. If the player does not enter a name, choose an opponent or choose a map, an error message will be displayed (like in Figure 12b).

2.4.3 Game Screen
The game screen will show a panel that represents the actions a player may take during each phase of the game as well as the map with the troop allocations per territory. The actions a player takes might influence the game state and what will be displayed on both the panel and the map. When it is not a player's turn, the player will be informed of their opponent's actions in this window. As is shown in Figures 13a, 13b and 13c, the panel on the left shows different content for the different game phases.
Figure 13: The front-end for the engine. (a) During the setup and recruitment phases. (b) During the battle phase. (c) During the manoeuvre phase.

The map is an image, with circles (indicating which player owns a territory) and numbers (indicating the number of troops currently on the territory) drawn on it with methods from the Java Swing API.

The recruitment panel consists of a list of territories that belong to the player currently playing their turn, with JTextFields where the player indicates how many troops to recruit to each territory. When the player is satisfied with their choices and all the recruited troops have been placed, they continue to the next phase by pressing the JButton labelled Done.

The battle panel consists of two JLabels saying Source and Destination, with two JComboBoxes that function as lists of territories the player can choose from. The battle panel also contains a JTextArea that displays all battle results (like the outcome of dice rolls) and JButtons that allow the player to Attack Again, to Manoeuvre troops from the source of the attack to the destination, or to end the battle phase by pressing the Done button. The Manoeuvre button takes the player to the manoeuvre window (Figure 13c), but with the source and destination fixed.

The manoeuvre panel has three JLabels, two JComboBoxes, a JSpinner and a JButton. The combo boxes, like in the battle panel, allow the player to select a territory from a list of territories. The spinner allows the player to specify how many troops should be moved from the source to the destination. The player then presses the Done button to end their turn (if in the manoeuvre phase) or to continue attacking (if still in the battle phase).

If, during any of the phases, the player makes illegal selections (like attempting to move troops between two territories that are not adjacent), an error message will be displayed like in Figure 14.
Figure 14: Player attempted to manoeuvre troops between two disconnected territories.
2.5
Logging Functionality
The controller, facilitator and the engines will all have access to the logger and will periodically generate information and store it in a log file that can be consulted after a game has been played. The facilitator and controller will have one log file, while the engines each have their own. The log that the controller and facilitator keep can be used for all levels of logging, while the log that an engine keeps can only be used for debugging (as it does not have enough information to replay a game).

Different levels of logging can be done. These levels also subsume one another, so level 30 will also contain the information of levels 10 and 20. This is not strictly true for the log the engines keep, as they will only be able to do logging for debugging that will not contain all the information from replay levels. The levels are:
10. minimal
20. replay
30. debug (input and output)
40. debug (method invocations)
50. debug (all variables)

The structure of the predetermined levels allows a user to set custom logging levels between existing levels when implementing their own engine. A user can put different invocations of the log method (specified in Figure 10) with a specific level in different parts of the code. If the level is less than or equal to the level of the logger, the message will be logged, so the log will only show as much as the user has set it to show. The log will, for instance, show only the pre-game details and the result of the game if the logger is set to the minimal level. If the replay level is set, all the information of the setup and the various player turns and game phases will be shown. If the debug option is selected, the log will show which methods were called and from where the call originated (in addition to all the replay information). If the game fails at any point, the failure is logged and the log is closed.
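The level check described above can be sketched as follows. This is a minimal illustration under assumptions: the class name SketchLogger, the constant names and the in-memory list (used here instead of a log file) are hypothetical, and the parameter order of log follows the class diagram in Figure 6.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of level-based filtering; keeps entries in memory instead of a log file. */
final class SketchLogger {
    static final int MINIMAL = 10;
    static final int REPLAY = 20;
    static final int DEBUG_IO = 30;
    static final int DEBUG_METHODS = 40;
    static final int DEBUG_VARIABLES = 50;

    private final int debugLevel;
    private final List<String> entries = new ArrayList<>();

    SketchLogger(int debugLevel) {
        this.debugLevel = debugLevel;
    }

    /** Appends the message only if its level does not exceed the logger's level. */
    void log(int level, String message) {
        if (level <= debugLevel) {
            entries.add(level + ": " + message);
        }
    }

    /** Returns the entries that passed the level check, in logging order. */
    List<String> entries() {
        return entries;
    }
}
```

A logger created with the custom level 25 would record messages logged at levels 10 and 20 but drop those at level 30, which is how a user-defined level between two predetermined levels behaves.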
Computer Player
Four computer players will be implemented:
- A submissive AI that only defends and never attacks.
- A baseline AI that plays according to a greedy scheme.
- An AI making use of expectiminimax tree search with alpha-beta pruning.
- An AI based on Monte Carlo Tree Search.
To allow testing of these computer players against the AI players implemented by Yura [7], the interface specified by Yura will be implemented for each AI. Some of the methods might not necessarily be used by the AI players (e.g. trading is not implemented in the version of Risk used in this project), but will return default values for the purpose of robustness when inserted into Yura's framework.
interface AI Player
    - game : GameState
    - id : int
    - name : String
    + setGame(GameState game)
    + getType()
    + getCommand()
    + getBattleWon()
    + getTacMove()
    + getTrade()
    + getPlaceArmies()
    + getAttack()
    + getRoll()
    + getCapital()
    + getAutoDefendString()

Implementing classes: AI Baseline, AI Submissive, AI MCTS, AI ExpectiMinimaxAlphaBeta
3.1
Submissive Player
The submissive player will make defensive choices based on a basic strategy. During the setup phase, it will place a single troop on each territory until each territory has two troops, then a third, etc. During the recruitment phase, it will place half of the recruited troops on the territory it owns with the least number of troops, and the other half on the territory with the second least number of troops. During the battle phase, it will never attack, but only defend. During the manoeuvre phase, it will move troops from the territory with the most troops to the territory with the least number of troops in such a way that the two territories will have the same number of troops after the manoeuvre.
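The manoeuvre rule of the submissive player can be sketched as below. This is an illustration under assumptions, not the planned implementation: the class SubmissiveManoeuvre and the method choose are hypothetical, territories are identified by array index, adjacency between the two territories is not checked, and an odd difference is rounded down so that the two territories end up within one troop of each other.

```java
/** Hypothetical helper for the submissive player's manoeuvre choice. */
final class SubmissiveManoeuvre {
    /**
     * Returns {sourceIndex, destIndex, troopsToMove} for the given troop counts.
     * The array must contain at least one territory.
     */
    static int[] choose(int[] troops) {
        int most = 0;
        int least = 0;
        for (int i = 1; i < troops.length; i++) {
            if (troops[i] > troops[most]) most = i;
            if (troops[i] < troops[least]) least = i;
        }
        // Moving half the difference equalises the two territories
        // (integer division rounds down when the difference is odd).
        int toMove = (troops[most] - troops[least]) / 2;
        return new int[] {most, least, toMove};
    }
}
```

For troop counts 5, 1 and 3, the sketch moves two troops from the first territory to the second, leaving both with three troops.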
3.2
Baseline Player
The baseline player will play according to a simple greedy scheme. During the setup phase, it will behave like the submissive player and repeatedly place a single troop on every territory until no more troops are left. During the recruitment phase, it will place all of its troops on the territory it owns where the ratio of troops on the territory to troops on neighbouring enemy territories is the highest (e.g. if there are three troops on the territory and the opponent has four neighbouring territories each with two troops, the ratio is $\frac{3}{2 \times 4} = \frac{3}{8} = 0.375$). In the event of a tie, a territory is chosen at random. During the battle phase, it will repeatedly attack from the territory it placed its recruited troops on, to the neighbouring territory with the fewest troops, and will keep attacking either until the troops run out or until the territory is defeated, in which case it will move all remaining troops to the defeated territory and repeat the procedure from the new territory. It will follow the same manoeuvre scheme as the submissive player.
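The recruitment rule of the baseline player can be sketched as follows. The names BaselineRecruitment, ratio and choose are hypothetical, territories are identified by array index, and the treatment of a territory without enemy neighbours (it is ranked last) is an assumption that the text does not cover.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical helper for the baseline player's recruitment choice. */
final class BaselineRecruitment {
    /** Ratio of own troops to the total troops on neighbouring enemy territories. */
    static double ratio(int ownTroops, int[] enemyNeighbourTroops) {
        int enemyTotal = 0;
        for (int t : enemyNeighbourTroops) enemyTotal += t;
        // Assumption: a territory with no enemy neighbours is ranked last.
        if (enemyTotal == 0) return Double.NEGATIVE_INFINITY;
        return (double) ownTroops / enemyTotal;
    }

    /** Index of the territory with the highest ratio; ties are broken at random. */
    static int choose(int[] ownTroops, int[][] enemyNeighbourTroops, Random rng) {
        List<Integer> best = new ArrayList<>();
        double bestRatio = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < ownTroops.length; i++) {
            double r = ratio(ownTroops[i], enemyNeighbourTroops[i]);
            if (r > bestRatio) {
                bestRatio = r;
                best.clear();
                best.add(i);
            } else if (r == bestRatio) {
                best.add(i);
            }
        }
        return best.get(rng.nextInt(best.size()));
    }
}
```

The example from the text corresponds to ratio(3, new int[] {2, 2, 2, 2}), which evaluates to 0.375.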
3.3
Expectiminimax with Alpha-Beta Pruning
The expectiminimax algorithm is a recursive algorithm that builds a game tree. The game tree is a graph consisting of nodes representing different states of the game and directed edges between nodes representing the actions that change one game state into another. These actions are either choices based on the current game state at a node or stochastic events. Expectiminimax uses a heuristic to evaluate the game state at a node when the search cannot continue to the end of the game.

Alpha-beta pruning seeks to improve expectiminimax by decreasing the number of nodes that are evaluated. In principle, it stops investigating a branch of the game tree as soon as that branch cannot do better than a previously determined best value. In that case, the branch is pruned and the search continues on a different branch.

The pseudocode for expectiminimax tree search with alpha-beta pruning is shown below. The initial call to Emm_AB(node, depth, α, β, maxPlayer) will be with the current game state as starting node, a predefined depth to search to, α = −∞, β = +∞ and maxPlayer = true (indicating that in the current game state, the current player wishes to maximise his/her gain). The method nextPlayer(node) is a boolean method that determines which player is next by examining the game state at the node. It could be the same player, in which case the method would return true, or the next player, in which case the method would return false.

procedure Emm_AB(node, depth, α, β, maxPlayer)
    if node is terminal node then
        return Result of game
    end if
    if depth == 0 then
        return Heuristic value of node
    end if
    if random event at node then            ▷ Return expected value of children
        let a := 0
        for all children of node, as child do
            a := a + (Probability[child] * Emm_AB(child, depth − 1, α, β, nextPlayer(node)))
        end for
        return a
    else if maxPlayer then                  ▷ Return value of max child
        for all children of node, as child do
            α := max(α, Emm_AB(child, depth − 1, α, β, nextPlayer(node)))
            if β ≤ α then
                break
            end if
        end for
        return α
    else if !maxPlayer then                 ▷ Return value of min child
        for all children of node, as child do
            β := min(β, Emm_AB(child, depth − 1, α, β, nextPlayer(node)))
            if β ≤ α then
                break
            end if
        end for
        return β
    end if
end procedure
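As an illustration of the expectiminimax pseudocode above, the following is a minimal Python sketch on a small hand-built tree. It is not the project's implementation. The `Node` class, the `leaf` helper and the example values are hypothetical, and the node type (max, min or chance) is stored in the node itself rather than derived through nextPlayer(node). As a cautious choice of this sketch, the children of a chance node are searched with a full (−∞, +∞) window, because a value cut off by pruning is only a bound and would distort the probability-weighted average.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                       # "max", "min" or "chance" (hypothetical encoding)
    value: float = 0.0              # game result at a leaf, heuristic value otherwise
    children: List["Node"] = field(default_factory=list)
    probs: List[float] = field(default_factory=list)   # used by chance nodes only


def leaf(value):
    """Helper for the example: a terminal node with a known result."""
    return Node("max", value)


def emm_ab(node, depth, alpha=-math.inf, beta=math.inf):
    if not node.children:           # terminal node: result of the game
        return node.value
    if depth == 0:                  # depth limit reached: heuristic value
        return node.value
    if node.kind == "chance":
        # Expected value over all outcomes; each child gets a full window
        # (assumption of this sketch, see the paragraph above).
        return sum(p * emm_ab(c, depth - 1)
                   for p, c in zip(node.probs, node.children))
    if node.kind == "max":
        best = -math.inf
        for child in node.children:
            best = max(best, emm_ab(child, depth - 1, alpha, beta))
            alpha = max(alpha, best)
            if beta <= alpha:       # the minimising parent will never allow this
                break
        return best
    best = math.inf                 # "min" node
    for child in node.children:
        best = min(best, emm_ab(child, depth - 1, alpha, beta))
        beta = min(beta, best)
        if beta <= alpha:           # the maximising parent already has better
            break
    return best


# Example tree: the third subtree is cut off after its first leaf,
# because 1.0 is already worse than the 3.0 found in the first subtree.
root = Node("max", children=[
    Node("min", children=[leaf(3.0), leaf(5.0)]),
    Node("chance", children=[leaf(4.0), leaf(0.0)], probs=[0.5, 0.5]),
    Node("min", children=[leaf(1.0), leaf(9.0)]),
])
print(emm_ab(root, depth=3))
```

Storing the node type in the tree keeps the sketch short; a Risk implementation would instead derive it from the game state, as nextPlayer(node) does in the pseudocode.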
3.4 Monte Carlo Tree Search
This algorithm also involves building a game tree. Unlike expectiminimax, where the tree is searched to a predefined depth and pruned with alpha-beta pruning, this game tree is grown incrementally and only until the allowed computation time (usually a predetermined time) has expired. The tree consists of nodes (representing the game state at different time steps) and directed edges (representing an action being performed).
Each iteration [1] of the algorithm consists of the following phases:

Selection - Recursively select nodes until an expandable node is reached. A node is expandable if it is not a terminal node and has children that have not been visited.
Expansion - Add one or more child nodes to the tree from the expandable node. The children added depend on what actions are available at the expandable node.
Simulation - A simulation is performed from the newly added node(s). The simulation typically involves playing a game until the game finishes.
Backpropagation - The result of the simulation (which player won the game) is propagated back to the root node and every intermediate node's status is updated.

Each node carries a ratio of games won to games lost (obtained by simulating games from the node) and this ratio is updated in the backpropagation phase. Each node also carries a count of how many times it has been visited, which is also updated in the backpropagation phase.

The algorithm is iterated while the computing time has not expired. When the algorithm stops, the actions that lead from the root are considered and the best performing action is returned. The best performing action is defined as either the action from the root node that leads to the child with the highest win-to-loss ratio (max reward child) or the action that leads to the child that has been visited the most (robust child). For using MCTS with Risk AIs, the robust child will be considered.

3.4.1 Stochasticity
Each simulation contains stochastic elements, since random dice rolls are involved during the attack phase. Some nodes will have choices that lead to nodes with stochasticity (e.g. if the player chooses to attack, the choice leads to a dice roll). It is not yet clear how stochasticity will be handled in the MCTS algorithm. The algorithm needs to be adapted to handle the stochasticity in the Selection and Backpropagation phases. During the selection phase, when the selection reaches a node with stochasticity, the edges from the node will be the various outcomes with their probabilities, and all the edges of a stochastic node could be expanded (since they would all be possible). During backpropagation, the results of expanding and simulating along one of these edges could be added to the current results at the node, weighted according to the probability of the outcome of the edge. Using selection and backpropagation in this way would result in a much wider game tree. This will be thoroughly investigated during the implementation and testing of the MCTS algorithm.
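To make the idea of outcome edges with probabilities concrete, the sketch below enumerates the outcomes of a single attack roll and weights hypothetical child values by those probabilities. It assumes the standard Risk dice rules, which this section does not spell out: the highest dice are compared pairwise and ties go to the defender. The function names and the child values are illustrative only and are not taken from the design.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def battle_outcomes(attack_dice, defend_dice):
    """Map (attacker_losses, defender_losses) to its probability for one roll."""
    counts = Counter()
    for roll in product(range(1, 7), repeat=attack_dice + defend_dice):
        attack = sorted(roll[:attack_dice], reverse=True)
        defend = sorted(roll[attack_dice:], reverse=True)
        attacker_losses = defender_losses = 0
        # Compare the highest dice pairwise; a tie counts for the defender
        # (assumed standard rule).
        for a, d in zip(attack, defend):
            if a > d:
                defender_losses += 1
            else:
                attacker_losses += 1
        counts[(attacker_losses, defender_losses)] += 1
    total = 6 ** (attack_dice + defend_dice)
    return {outcome: Fraction(n, total) for outcome, n in counts.items()}


def expected_value(outcomes, child_values):
    """Probability-weighted average of the values of the outcome children."""
    return sum(p * child_values[outcome] for outcome, p in outcomes.items())


# Edges of a chance node for an attack with three dice against two.
outcomes = battle_outcomes(3, 2)
for outcome, probability in sorted(outcomes.items()):
    print(outcome, probability, float(probability))

# Hypothetical win ratios stored at the three outcome children.
child_values = {(0, 2): 0.8, (1, 1): 0.5, (2, 0): 0.2}
print(expected_value(outcomes, child_values))
```

Exact fractions are used so that the probabilities of all edges of a chance node sum to exactly one, which is a convenient sanity check when the weights are later used during backpropagation.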
3.4.2 The Algorithm
The pseudocode of the algorithm is as follows [2]. ST is taken to denote the search tree built so far.

procedure MCTSSearch(Node root_node)
    while time not expired do
        current_node ← root_node
        while current_node ∈ ST do                      ▷ Selection
            last_node ← current_node                    ▷ Selection
            current_node ← Select(current_node)         ▷ Selection
        end while
        last_node ← Expand(last_node)                   ▷ Expansion
        R ← Simulate(last_node)                         ▷ Simulation
        current_node ← last_node
        while current_node ∈ ST do                      ▷ Backpropagation
            current_node.UpdateRatio(R)                 ▷ Backpropagation
            current_node.visit_count++                  ▷ Backpropagation
            current_node ← current_node.parent
        end while
    end while
    return Action(BestChild(root_node))
end procedure
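The four phases can be sketched in Python on a toy game, shown here only as an illustration and not as the project's implementation. The game is hypothetical: a single pile of stones, each player removes one to three stones, and whoever takes the last stone wins. Two further assumptions are made: the Select step uses the UCB1 formula, which the design does not prescribe, and an iteration budget replaces the time budget so that runs are reproducible.

```python
import math
import random


class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones          # stones left in the pile
        self.player = player          # player to move at this node (0 or 1)
        self.parent = parent
        self.move = move              # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.wins = 0.0               # wins for the player who moved into this node
        self.visits = 0


def select(node, c=1.4):
    """UCB1 (an assumption of this sketch): balance win ratio and exploration."""
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))


def expand(node, rng):
    move = node.untried.pop(rng.randrange(len(node.untried)))
    child = Node(node.stones - move, 1 - node.player, node, move)
    node.children.append(child)
    return child


def simulate(node, rng):
    """Play uniformly random moves to the end and return the winner."""
    stones, player = node.stones, node.player
    if stones == 0:
        return 1 - player             # the previous mover took the last stone
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return player
        player = 1 - player


def mcts_search(root, iterations, rng):
    for _ in range(iterations):
        node = root
        while not node.untried and node.children:     # Selection
            node = select(node)
        if node.untried:                              # Expansion
            node = expand(node, rng)
        winner = simulate(node, rng)                  # Simulation
        while node is not None:                       # Backpropagation
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    # Robust child: the move whose child was visited most often.
    return max(root.children, key=lambda ch: ch.visits).move


rng = random.Random(0)
root = Node(stones=5, player=0)
best_move = mcts_search(root, iterations=2000, rng=rng)
print(best_move, [(ch.move, ch.visits) for ch in root.children])
```

With five stones, taking one stone leaves the opponent a multiple of four, which is the theoretically winning reply, so the printed visit counts give a simple sanity check of the search. Replacing the iteration loop with a clock check gives the time-limited variant described in the pseudocode.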
Testing Design
The framework will be tested in several ways. These are listed below.

Unit testing of the methods used in the facilitator, controller, engine and AI classes, as well as in the common objects.
Integration testing of the various classes and of the interaction of the different components over the network.
Code coverage testing of all code used in the project.
User interface testing of all the windows used in the user interface.
Testing of the computer players through a combination of the evaluation plan and the testing plan.

The unit and integration testing can be organised with a tool like the Apache Maven Project [9], which organises a project build schedule and can split unit and integration tests so that they are built and executed at different times. The tests can be automated with tools like CodePro Analytix [8]. For code coverage, tools like EclEmma [6] and eCobertura [3] could be used. The user interfaces can be tested with behaviour testing using a tool like FEST [5].
References
[1] Cameron B Browne, Edward Powley, Daniel Whitehouse, Simon M Lucas, Peter I Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez, Spyridon Samothrakis, and Simon Colton. A Survey of Monte Carlo Tree Search Methods. Computational Intelligence and AI in Games, IEEE Transactions on, 4(1):1–43, 2012.
[2] Guillaume M JB Chaslot, Mark HM Winands, Jaap Van Den Herik, Jos WHM Uiterwijk, and Bruno Bouzy. Progressive strategies for Monte-Carlo tree search. New Mathematics and Natural Computation, 4(03):343–357, 2008.
[3] eCobertura by jmhofer. http://ecobertura.johoop.de/, Accessed: 24 May 2013.
[4] Gunnar Farnebäck. Specification of the Go Text Protocol. Version 2, Draft 2, October 2002.
[5] FEST: Fixtures for Easy Software Testing. https://code.google.com/p/fest/, Accessed: 24 May 2013.
[6] Java Code Coverage for Eclipse. http://www.eclemma.org/, Accessed: 24 May 2013.
[7] Domination (Risk Board Game). http://sourceforge.net/p/domination/wiki/Home/, Accessed: 27 Mar 2013.
[8] CodePro Analytix User Guide. https://developers.google.com/java-dev-tools/codepro/doc/, Accessed: 24 May 2013.
[9] Apache Maven Project. http://maven.apache.org/index.html, Accessed: 24 May 2013.
[10] Risk AI Project. https://sites.google.com/a/ml.sun.ac.za/risk-ai/, Accessed: 21 May 2013.
