Yitong Dai
The chip is designed specifically to play a pebble game. Figure 1 demonstrates how the game is played.
The top row belongs to the PC player and the bottom row belongs to the human player. At the beginning
of a match each square holds two pebbles, so each player has four pebbles in his own row. On his turn,
a player must select one square in his row that contains at least one pebble. The pebbles in that square
are then distributed to the other squares in clockwise order. Figure 1 shows what the game board looks
like after the PC player chooses the top-left square. A player loses once the total number of pebbles in
his row is zero. I chose this topic because I am interested in using digital circuit design to solve
practical problems, and I am curious about the performance difference between a software solution and
a hardware solution designed to solve the same problem.
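The rules above can be sketched in a few lines of Python. This is only an illustration, not the C++ software or the Verilog design from this project. The square numbering (0 = top-left, then clockwise) and all function names are assumptions, and so is the choice to let a pile of four or more pebbles wrap around onto the chosen square.

```python
# Assumed numbering, clockwise: 0 = top-left, 1 = top-right,
# 2 = bottom-right, 3 = bottom-left.
PC_ROW = (0, 1)      # top row
HUMAN_ROW = (2, 3)   # bottom row


def initial_board():
    """Every square starts with two pebbles."""
    return [2, 2, 2, 2]


def legal_squares(board, row):
    """A player may only pick a non-empty square in his own row."""
    return [i for i in row if board[i] > 0]


def sow(board, square):
    """Empty `square` and hand its pebbles out one at a time, clockwise."""
    new_board = list(board)
    pebbles = new_board[square]
    new_board[square] = 0
    position = square
    while pebbles > 0:
        position = (position + 1) % 4   # assumption: wraps through the chosen square
        new_board[position] += 1
        pebbles -= 1
    return new_board


def loser(board):
    """Return "PC" or "human" once that player's row is empty, else None."""
    if sum(board[i] for i in PC_ROW) == 0:
        return "PC"
    if sum(board[i] for i in HUMAN_ROW) == 0:
        return "human"
    return None
```

With this numbering, `sow(initial_board(), 0)` corresponds to the PC player choosing the top-left square at the start of a match.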
The AI part of the chip is driven by the Min-Max algorithm. Figure 2 shows how this algorithm works. The
basic idea is that, given a certain game state, the AI examines all possible moves and generates a new
game state for each of them. For each of these new game states, the AI examines all possible moves
again. Repeating this eventually forms a search tree in which each layer represents either the AI’s turn
or the human player’s turn. In figure 2, the AI looks four steps ahead. At the leaf nodes of the tree, a
heuristic function returns a value that estimates how likely the AI is to win the match. The heuristic
values generated at the leaf nodes are then propagated back up the tree. For example, the second layer
from the bottom is the human player’s turn, so the minimum values are chosen, because a lower chance
for the AI to win means a better chance for the human player to win. At the third layer from the bottom
it is the AI’s turn, and the AI chooses the maximum values in its own best interest. At the very top, the
AI chooses the move that leads to the largest heuristic value returned from the bottom. In figure 2, at
the top the middle move will be chosen
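The way heuristic values travel back up the tree can be illustrated with a minimal Python sketch. It works on an explicit tree written as nested lists, where a number is a leaf's heuristic value and a list is an inner node. The tree shape, the leaf values and the function names are made up for illustration; they are not taken from figure 2 or from the project's own code.

```python
def minimax(node, ai_to_move):
    """Return the value backed up to `node`.

    A leaf is a number (its heuristic value); an inner node is a list
    of child nodes. Layers alternate between the AI and the human.
    """
    if not isinstance(node, list):
        return node
    child_values = [minimax(child, not ai_to_move) for child in node]
    # The AI maximizes its chance to win, the human minimizes it.
    return max(child_values) if ai_to_move else min(child_values)


def best_move(root):
    """Index of the root move with the largest backed-up value.

    The AI moves at the root, so each child is a human (min) layer.
    """
    values = [minimax(child, False) for child in root]
    return values.index(max(values))


# Hypothetical two-layer tree: three AI moves, each with two human replies.
example_tree = [[3, 5], [6, 9], [1, 8]]
```

For `example_tree`, the human layer backs up the minima 3, 6 and 1, so `best_move(example_tree)` picks index 1, the middle move.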
Min-Max algorithm that is used to verify the simulation result. The chip has three inputs besides the clock.
“Reset” resets the game board to its initial state. “Play” and “player_position” form a bundle, and both
signals come from user input. For example, when a player presses a button to select a square, “play”
becomes “1” to indicate that a button was pressed, and “player_position” indicates which button was
pressed. The chip also has four board outputs (Fig 3): “pos1”, “pos2”, “pos3” and “pos4” give the number
of pebbles in each square, listed in clockwise order just as shown in figure 1. A further output, “winner”,
indicates which player has won or that a match is still being played. To verify that the result
generated by ModelSim matches figure 4, look at “player_position” first. If “player_position” is “1”
in figure 3, square “2” is selected in figure 4; if “player_position” is “0” in figure 3, square “3” is
selected in figure 4. In both figure 3 and figure 4, each player selects the same square when his turn
comes. For example, the bottom-left square is selected in the human player’s first turn; according to
figure 4, the new game board should be “3-3-2-0” when the values of the squares are read in clockwise
order. Figure 3 also shows “3-3-2-0”, which means that the chip generates the new game board correctly.
After the human player’s first turn, the AI player in figure 4 selects the top-left square, which leads to
the result “0-4-1-3”; the same result “0-4-1-3” is shown in figure 3. Comparing the game states in
figure 3 and figure 4 shows that the results match each other, which indicates that the AI module of the
chip returns the same result as the software solution. In figure 3, a second match was played to test
that the transition between two matches works as expected and to verify the AI module further. To keep
the report simple, I did not provide the software-generated result for the second match shown in
figure 3, but I included the executable (Mac version) and the C++ source code of my software solution in
my submission. So the second match will be able
the square that is selected is divided by 4; the quotient is the base number of pebbles added to each
square, and the remainder tells how many squares after the chosen square, in clockwise order, receive
one more pebble on top of the base number. To make this clearer, the quotient refers to a number of
pebbles, while the remainder refers to a number of squares. The purpose of the “input factory unit” is to
make sure that the “controller unit” is sensitive only to a positive edge of the “play” signal, so that
the game board is not updated multiple times even if a player keeps a button pressed. The “AI unit” and
the “position decoder unit” work in parallel; the job of both is to tell the “position updater unit”
which square is selected. The difference is that the “position decoder unit” generates its output from
the human player’s input, while the “AI unit” generates its output with the Min-Max algorithm. The
“controller unit” ensures that the AI and the human player take turns during a match. In addition, if
the human player selects a square with zero pebbles on his turn, the next turn still belongs to the
human player rather than the AI, until the human player provides a valid input. The “controller unit”
also ends a match when there is a winner, so that the game board is no longer updated by either the AI
or user input.
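The behaviour of the units described above can be approximated in software. The Python sketch below is a behavioural illustration only, not the Verilog used on the chip. All names are hypothetical, squares are assumed to be indexed 0 to 3 in clockwise order, and "each square" is assumed to include the chosen square when the quotient is added.

```python
def update_board(board, square):
    """Quotient/remainder form of one move on a four-square board."""
    base, extra = divmod(board[square], 4)
    new_board = list(board)
    new_board[square] = 0
    for i in range(4):
        new_board[i] += base                  # every square gains the quotient
    for step in range(1, extra + 1):
        new_board[(square + step) % 4] += 1   # next `extra` squares gain one more
    return new_board


class RisingEdge:
    """Reports True only when the sampled signal goes from 0 to 1."""

    def __init__(self):
        self.previous = 0

    def sample(self, play):
        fired = play == 1 and self.previous == 0
        self.previous = play
        return fired


def human_turn(board, square):
    """Apply the human's choice and return (board, whose turn is next).

    Selecting an empty square changes nothing and keeps the turn with
    the human, mirroring the controller behaviour described above.
    """
    if board[square] == 0:
        return board, "human"
    return update_board(board, square), "AI"
```

For example, a square holding six pebbles gives a quotient of 1 and a remainder of 2: every square gains one pebble and the two squares after the chosen one gain a second. Sampling `play` as 0, 1, 1, 1 with `RisingEdge` fires only once, which is why holding a button down does not trigger repeated updates.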
The AI unit (fig 6) is the largest module of the chip; it consists of 62 “position updater units” to
achieve a five-layer prediction. I chose five as the prediction depth because, based on my software
simulation, five is the minimum prediction depth needed for a competitive AI. The circuit itself mimics
the structure of the Min-Max algorithm. One major problem that needs to be tackled is that the circuit
must contain the complete structure of a search tree, while at the same time illegal branches need to be
pruned. That is why extra wires connect each “position updater unit” to a “min comparator unit”, so that
the “min comparator unit” knows whether there are illegal moves. For example, if the top input of a “min
comparator unit” is the result of an illegal move, the top input will not be selected even if its value is less than
the smallest clock cycle as even if I achieved a 30 ns clock cycle, it would not change the order of
magnitude in the comparison between the hardware and the software solution. I ran the software version
of the Min-Max algorithm on an Intel Core i5-4260U processor, which uses 22 nm technology, whereas my
hardware solution uses 180 nm technology. Measuring the elapsed time of the software solution and taking
the fastest record rather than the average, it takes 17000 ns to predict five steps ahead, which is about
243 times (17000 / 70 ≈ 243) longer than my hardware solution takes. This comparison clearly
demonstrates the advantage of having a dedicated hardware component to do a certain job. While the trade-off of dedicated
Acknowledgements:
I thank Edouard Giacomin for helping me figure out many issues during the placement & routing stage and
for answering my questions at the initial stage of my design.
Figure 1: Example of how the pebble game is played.
Figure 6: The internal structure of the AI unit, with a prediction depth of two as an example.
Figure 7: Timing report.