1 Official site: http://kde.org
KBlocks can be run in two modes: KDE users can use it as a normal desktop game; researchers can choose to start a game engine, a GUI, or a player. The GUIs and the players are connected to the game engine via UDP sockets. The components can be run on one or several computers. KBlocks can be configured with parameters defined in a text file. The game engine (and the GUI) supports game

learners. They are illustrated as the gray rectangles, the regular rectangles, and the round-cornered rectangles in the figure. Figure 2 also shows the relations among the components. We align these components vertically according to the catalogs. A lower algorithm uses the outputs of the upper one as its inputs. The learners compute the models, which are used in the algorithms.

The middle column with the dotted arrow lines shows the sequence of the computation in the games. With the current

[Fig. 1. The System Components of KBlocks — game engine connected via socket connections to GUIs and players (Player1, Player2, ...)]

[Fig. 2 (caption not recovered) — learning components: Data Preprocessor, Filters (filtered data, counting), Pattern Calculator (information gain, activated patterns), Hand-Coded Features, and Support Vector Machine with class labels]
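As a sketch of how a player process might talk to the game engine over a UDP socket, the following is illustrative only: the address, port, and message format are invented here and are not KBlocks' actual protocol (which is defined by the engine and its configuration file).

```python
import socket

# Hypothetical address: KBlocks' real host, port, and message format
# come from its configuration file and are not documented here.
ENGINE_ADDR = ("127.0.0.1", 15000)

def send_move(rotation: int, column: int) -> bytes:
    """Send one placement decision to the game engine over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        message = f"move {rotation} {column}".encode()
        sock.sendto(message, ENGINE_ADDR)  # fire-and-forget datagram
        return message
    finally:
        sock.close()

print(send_move(1, 4))  # b'move 1 4'
```

Because each component only needs a socket endpoint, the engine, GUI, and players can run on the same machine or on different ones, as the text describes.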
1 http://www.informatik.uni-freiburg.de/∼kiro/KBlocksDummyPlayer.tgz
In order to reduce the data set, the types of pieces are used in the data preprocessor to separate the data into 7 subsets. Each subset is used to train its own filter, patterns, and SVM. In other words, seven SVMs work together in the artificial player.

When placing the current piece, human players can first reduce the candidates to a limited number by observing the surface of the accumulated blocks. Then they choose one of the filtered candidates as their final decision. This idea was used to develop the filters for reducing the amount of data in the learning.

From another aspect, it is important to consider some "global" parameters in Tetris. For example, a candidate placement can clear 4 rows; this would be important for the game. The patterns, however, cannot express this occurrence.

We designed hand-coded features to acquire this "global" information. If the patterns can define the tactics of the games, the features can be used to describe the strategies. In order to define these features, we use Figure 4 to illustrate some terms: hole, flat, column, and well. A well or a hole is buried if it is no deeper than three cells from the surface.
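The preprocessing step above can be sketched as grouping the recorded placements by the seven tetromino types; the record layout used here, a (piece, board, placement) tuple, is a hypothetical stand-in for the actual KBlocks log format.

```python
from collections import defaultdict

PIECES = ["I", "O", "T", "S", "Z", "J", "L"]  # the 7 tetromino types

def split_by_piece(records):
    """Group (piece, board, placement) records into 7 subsets, one per
    piece type; each subset later trains its own filter, patterns, and SVM."""
    subsets = defaultdict(list)
    for piece, board, placement in records:
        subsets[piece].append((board, placement))
    return subsets

records = [("L", "board0", (0, 3)),
           ("I", "board1", (1, 5)),
           ("L", "board2", (2, 0))]
subsets = split_by_piece(records)
print(len(subsets["L"]))  # 2
```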
A filter consists of a set of patterns. Figure 3 shows the concept of the patterns. The current piece is denoted by 'c'; it is an 'L' in the figure. We use 'x' to denote the already occupied cells. Around the placement, a small field, which is marked in gray, is chosen as the activated area for the patterns. The patterns are smaller than the small field; for example, the deeper gray area in the figure shows a pattern containing 5×2 cells. The cells with a 'c' or an 'x' inside are occupied.

A pattern can be activated by a placement. As mentioned above, the small field is activated by the placement. All the 5×2 patterns can be enumerated. We move a pattern around the small field; it is activated by the placement if the occupied cells in the pattern match the occupied cells in the background (the small field).

Filters can thus be learned by counting. If a pattern is never activated by the placements of the imitated player, it can be used to reduce the candidates. Each filter is a set of such patterns, and it can be learned by running the activation tests over all the training data.

[Fig. 4. The Illustrations of the Features — a board of 'x' (occupied) and 'c' (current piece) cells annotated with a flat, a hole, a buried well, and a column]

The features are listed in Table I. Items 2 and 3 are for the column, items 4-6 are about the flat, 9-11 are for the hole, and 14-18 are about the well.

TABLE I
List of Hand-Coded Features

 1*  How many attacks are possible after the current placement.
 2   The number of the columns.
 3   The increased height of the column.
 4   The increased number of the flats.
 5   The decreased number of the flats.
 6   The maximum length of the flat.
 7   The increased height of the accumulated blocks.
 8   The height difference between the current placement and the highest position of the accumulated blocks.
 9   How many holes will be created after the current placement.
 10  How many holes will be removed after the current placement.
 11* How many occupied cells are added over a hole.
 12  The number of removed lines of the current placement.
 13* How well the next piece will be accommodated.
 14  If a well is closed by the current placement, how deep the well is.
 15  If a well is opened by the current placement, how deep the well is.
 16* How many occupied cells are added over a buried well.
 17  The number of the open wells.
 18  How deep the well is, if it is created by the current placement.
 19  Whether a well is removed by the current placement.
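The activation test and the counting step for learning a filter can be sketched as follows, assuming boards and patterns are represented as tuples of 0/1 rows (the exact data layout in KBlocks is not specified here):

```python
from itertools import product

def windows(field, h, w):
    """All h-by-w windows of a 0/1 grid (the activated small field)."""
    rows, cols = len(field), len(field[0])
    for r, c in product(range(rows - h + 1), range(cols - w + 1)):
        yield tuple(tuple(field[r + i][c + j] for j in range(w))
                    for i in range(h))

def activated(pattern, field):
    """A pattern is activated by a placement if some window of the
    small field matches the pattern's occupancy exactly."""
    h, w = len(pattern), len(pattern[0])
    return pattern in set(windows(field, h, w))

def learn_filter(all_patterns, fields):
    """Learn a filter by counting: keep the patterns that are never
    activated by the imitated player's placements; their activation
    then marks a candidate the player would not have chosen."""
    counts = {p: 0 for p in all_patterns}
    for field in fields:
        for p in all_patterns:
            if activated(p, field):
                counts[p] += 1
    return {p for p, n in counts.items() if n == 0}

# Tiny 3x2 field with two 2x2 candidate patterns (illustration only;
# the paper uses 5x2 patterns inside a larger small field).
field = ((1, 0), (1, 1), (0, 0))
p_seen, p_unseen = ((1, 0), (1, 1)), ((0, 1), (0, 1))
print(learn_filter([p_seen, p_unseen], [field]))  # {((0, 1), (0, 1))}
```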
Our features are compared with the features listed in [1]; the items marked with * were not mentioned there. There are differences in the descriptions of the features because we use them as the inputs of the SVMs, whereas the other researchers developed an evaluation function from linear combinations of the weighted features.

Support Vector Machines

The patterns are useful not only in the filters but also for modeling the skills of the imitated players. For instance, suppose a pattern was activated 1000 times over the training set, among which 900 activations came from positive cases. This pattern cannot be used in a filter because there are mixed negative and positive cases; however, its activation apparently indicates that the placement tends to be positive, because of the positive-to-negative rate in the training data. Therefore, the patterns are also used in this section to compute the inputs of the SVMs. The patterns can only capture "local" information, though: they are checked within the small field around the placement.

A large number of patterns can be created by enumeration. For example, an enumeration of 5×2 cells will create 1024 patterns. It is difficult to consider all these patterns as inputs of the SVMs because of the required computational power, and to our knowledge there is no trivial way to compute the subset of the patterns which yields the optimal performance of the SVMs. Therefore, we employ the information gain used in decision trees to compute a subset of 20 patterns for each SVM.
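The information-gain selection can be sketched with the standard decision-tree entropy criterion; the data layout assumed here (one 0/1 activation vector per pattern, plus positive/negative labels) is an illustration, not the paper's actual implementation.

```python
from math import log2

def entropy(pos, neg):
    """Shannon entropy of a positive/negative split."""
    total = pos + neg
    if total == 0 or pos == 0 or neg == 0:
        return 0.0
    p = pos / total
    return -p * log2(p) - (1 - p) * log2(1 - p)

def info_gain(activations, labels):
    """Information gain of one binary pattern feature over labeled data."""
    pos = sum(labels)
    base = entropy(pos, len(labels) - pos)
    on = [l for a, l in zip(activations, labels) if a]
    off = [l for a, l in zip(activations, labels) if not a]
    gain = base
    for part in (on, off):
        if part:
            gain -= len(part) / len(labels) * entropy(sum(part),
                                                      len(part) - sum(part))
    return gain

def top_k_patterns(pattern_activations, labels, k=20):
    """Rank patterns by information gain and keep the best k as SVM inputs."""
    ranked = sorted(pattern_activations,
                    key=lambda p: info_gain(pattern_activations[p], labels),
                    reverse=True)
    return ranked[:k]

# Pattern "A" splits the labels perfectly, "B" is uninformative.
acts = {"A": [1, 1, 0, 0], "B": [1, 0, 1, 0]}
print(top_k_patterns(acts, [1, 1, 0, 0], k=1))  # ['A']
```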
SVMs are a popular method in data classification, in which the whole data set is globally classified with a set of labels. Nevertheless, the data in Tetris are grouped by the current piece: among the candidate placements of the current piece, the algorithm needs to choose the one which is closest to the choice of the imitated player. LIBSVM [13] provides an API to compute this probability, which is used in our implementation.

The values of the inputs should be within the same range in the SVMs. The patterns always have a value of 0 or 1, which denotes whether or not a pattern is activated by the current placement. The values of the features, however, can be much bigger; for example, the maximum length of the flat can be up to 9 in a standard Tetris game. In order to avoid this situation, the values of the features were mapped to 0, 0.5, or 1 in our implementation.

The rate at which the trained model chooses the same placements as the imitated player can be calculated. The results are shown in the upper plot of Figure 5. The data were averaged over 10 slices.

The solid lines show the performance of the player that imitates the human player; the dotted lines are for the player that was imitating Fehey's player. Both imitations achieved a similarity of about 0.7. The curves resemble a typical learning curve because the similarity is regarded as the evaluation in the learning. The similarity cannot be higher because of the differences in the data representation and the models between the imitating system and the imitated systems.

The trained models compete against Fehey's player in the synchronized two-player games. 200 random piece sequences were generated for the 200 games, so that each model was evaluated on the same set of games.
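The mapping of feature values onto 0, 0.5, or 1 can be sketched with per-feature thresholds; the threshold values below are invented for illustration, since the text does not list the actual cut-offs.

```python
def scale_feature(value, low, high):
    """Map a raw feature value to 0, 0.5, or 1 so that all SVM inputs
    share the range of the binary pattern activations.
    `low` and `high` are hypothetical per-feature thresholds."""
    if value <= low:
        return 0.0
    if value <= high:
        return 0.5
    return 1.0

# Example: the maximum length of the flat can be up to 9 in a
# standard Tetris game (thresholds chosen for illustration only).
print(scale_feature(2, low=3, high=6))  # 0.0
print(scale_feature(9, low=3, high=6))  # 1.0
```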
[Fig. 5. Upper plot: similarity (y-axis 0.5-0.8); middle plot: y-axis 15-35; x-axis 0-100; legends: "Imitating Human" and "Imitating AI"]

The middle plot in Figure 5 shows the winning rates of the imitating players. The player imitating the human finally achieved 0.25 as its rate of wins
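The similarity measure in the upper plot of Figure 5 — the rate at which the trained model chooses the same placements as the imitated player — amounts to a plain agreement rate; the placement encoding used here is an assumption.

```python
def similarity(model_choices, player_choices):
    """Fraction of decisions where the trained model chose the same
    placement as the imitated player."""
    assert len(model_choices) == len(player_choices)
    agree = sum(m == p for m, p in zip(model_choices, player_choices))
    return agree / len(model_choices)

# Placements encoded as hypothetical (rotation, column) pairs;
# the model agrees with the player on 2 of 3 decisions.
print(similarity([(0, 3), (1, 5), (2, 2)],
                 [(0, 3), (1, 5), (0, 0)]))  # ≈ 0.667
```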
References
[1] S. B. T. Christophe, "Building Controllers for Tetris," International Computer Games Association Journal, vol. 32, pp. 3-11, 2009.
[2] C. Fehey, "Tetris AI," http://www.colinfahey.com/tetris/tetrisen.html, 2003, accessed on 02-August-2010.
[3] G. K. S. M. N. Böhm, "An evolutionary approach to Tetris," in Proceedings of the Sixth Metaheuristics International Conference (MIC2005), 2005.
[4] I. A. Lőrincz, "Learning Tetris using the noisy cross-entropy method," Neural Computation, vol. 18, pp. 2936-2941, 2006.
[5] L. Reyzin, "2-player Tetris is PSPACE hard," in Proceedings of the 16th Fall Workshop on Computational and Combinatorial Geometry, 2006.