Master's Thesis
An Automatic Table Tennis Match Umpiring System
July 2014 (Republic of China year 103)
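Abstract (Chinese)
At present, table tennis scoring is decided mainly by the human eye. However, misjudgments can occur when the ball is hit too fast or because of the umpire's personal factors. It is therefore necessary to build a complete and accurate automatic umpiring system for table tennis. Among earlier related applications, the Hawk-Eye system used in tennis is the best known; it tracks and records the path of the ball, displays a graphic of the recorded actual path, and can also predict the future path of the ball.
This thesis applies that general concept to table tennis matches. A top-hat filter (tophat) is used to sharpen the edges of the table, and morphology together with connected component labeling is used to segment the table. The net is segmented by morphology combined with linear regression to obtain its line equation. Next, the background gray levels are subtracted from the foreground gray levels to retain the moving objects, and morphology plus local-area detection is used to improve the accuracy of detecting the ball center. Finally, the three resulting features (the table region, the centroid of the ball placement, and the line equation of the net) are input to the table tennis rule system to judge whether a point is scored. The videos were recorded with a JVC-GZ-GX1 camcorder, and the results were compared with human judgment for table segmentation, ball centroid detection, placement detection, and scoring; the average accuracies are 0.9919, 0.8488, 0.8460, and 0.8322, respectively. The experimental results show that the proposed algorithm achieves a certain level of accuracy in ball detection and automatic scoring.
Keywords: automatic table tennis umpiring, Hawk-Eye system, trajectory detection, linear regression, morphology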
Abstract
The current refereeing of table tennis games relies on the human eye to determine whether a point is scored. However, fast strokes and the referee's personal factors may cause misjudgments. It is therefore necessary to establish a complete and accurate automated referee system for table tennis games. The most famous related application is Hawk-Eye, which visually tracks the trajectory of the ball and displays a record of its statistically most likely path; it can also predict the future path of the ball.
This thesis applies this general concept to table tennis. First, a top-hat filter is used to sharpen the edges of the table. Then morphology and connected component labeling are applied to detect the table. Net detection uses morphology and linear regression to find the linear equation of the net. After that, the background intensities are subtracted from the foreground intensities to retain the moving objects, and morphological and local-area detection improve the accuracy of ball-center detection. Finally, the three results (the table region, the linear equation of the net, and the ball center) are passed to the automated referee system to decide whether a point is scored. The videos were shot with a JVC-GZ-GX1 camcorder, and the results were compared with human judgment on four measures: table detection, ball detection, placement detection, and scoring. The average accuracies are 0.9919, 0.8488, 0.8460, and 0.8322, respectively. The experimental results show that the proposed algorithm achieves reasonable accuracy in ball detection and automatic scoring.
Table of Contents
Abstract (Chinese)
Abstract
3.3.1 Part A-Candidate point detection
3.3.2 Part B-Local area detection
3.3.3 Placement detection
3.4 Regular system
3.4.1 Effective service judgment
3.4.2 Effective service return judgment
3.4.3 Reset serving point
Reference
List of Tables
Table 4-1 The assessment results of table detection
Table 4-2 The assessment results of net detection
Table 4-3 The comparing results of the ball detection
Table 4-4 The comparing results of the placement detection
Table 4-5 The final results of the referee system
Table 4-6 The cost times of ATTMU
List of Figures
Fig.1-1 Three main objects of the referee system of table tennis games
Fig.1-2 The final position of the camera
Fig.1-3 The environment of table tennis games in this thesis
Fig.2-1 Three possible states of B[x]
Fig.2-2 The schematic diagram of erosion
Fig.2-3 The schematic diagram of dilation
Fig.2-4 The schematic diagram of opening
Fig.2-5 The schematic diagram of closing
Fig.2-6 The schematic diagram of area filling algorithm
Fig.2-7 The relationship between P and the pixels around P
Fig.2-8 Connected component labeling in the binary image
Fig.2-9 A linear regression equation having a variable
Fig.3-1 The flow chart of table detection
Fig.3-2 The flow chart of table detection
Fig.3-3 The structure element: disc
Fig.3-4 The process of the table detection
Fig.3-5 The flow chart of net detection
Fig.3-6 The process of the net detection
Fig.3-7 The flow chart of net detection
Fig.3-8 The flow chart of ball detection in Part A
Fig.3-9 The process of the ball detection in Part A
Fig.3-10 The flow chart of ball detection in Part B
Fig.3-11 The process of the ball detection in Part B
Fig.3-12 The feature of the placement
Fig.3-13 The flow chart of regular system
Fig.3-14 The schematic diagram of effective service
Fig.3-15 The schematic diagram of effective and invalid service return
Fig.4-1 The experiment results of table detection
Fig.4-2 The experiment results of net detection
Fig.4-3 Movie(a)-white, the experiment results of table detection
Fig.4-4 Movie(a)-yellow, the experiment results of table detection
Fig.4-5 Movie(b)-white, the experiment results of table detection
Fig.4-6 Movie(b)-yellow, the experiment results of table detection
Fig.4-7 Movie(c)-white, the experiment results of table detection
Fig.4-8 Movie(c)-yellow, the experiment results of table detection
Fig.4-9 Movie(d)-white, the experiment results of table detection
Fig.4-10 Movie(d)-yellow, the experiment results of table detection
Chapter 1 Introduction
1.1 Background
Table tennis is one of the most popular sports in the world. There are no age or gender barriers. Players can play according to their own capabilities and limitations and still be competitive. They do not have to worry about the bruises or even broken bones that contact sports can cause. Many athletes with disabilities can compete on equal terms with able-bodied athletes. Table tennis can be played all year round, day or night, without worrying about bad weather or covering up against harmful UV rays. It does not cost much to play, and a huge amount of space is not needed to have fun playing at home. Table tennis is easy to play, yet difficult to master; there is always another challenge to look forward to and another mountain to climb. It is great for working up a sweat and raising the heart rate, and it is also good for the brains of older people. Table tennis is a wonderful sport to
Table tennis match umpiring is a very demanding task, since a table tennis umpire needs to make accurate judgments about the ball-moving environment, which involves a series of fast actions whose legitimacy is strictly governed by the Laws of Table Tennis [2]. A more pragmatic approach [3] is to employ computerized tools capable of making accurate and fast measurements of the movement of the ball, to aid the
An intuitive, non-disruptive way of evaluating the ball-moving environment is to capture the moving ball, the table, and the net, called the objects of interest (OOIs), using a video camera and then to detect and track the OOIs in real time on a frame-wise basis. Doing so raises several difficulties:
• The ball travels very fast. A camera with a high shutter speed is necessary; otherwise, the ball can become blurred, faded in color, and distorted in shape.
• Besides the ball, many moving objects, such as players, table-tennis bats, and the
• Uneven color and illumination exist since some objects are moving.
• The ball, table, and net may be blocked by the player, the bat, clothing, or other objects; moreover, the ball may disappear from view when it moves too high or too low.
• When the contrast between an OOI and its surroundings is low, the OOI may become indistinguishable.
• An OOI may be confused with other objects of similar color, size, and shape.
• The size of the ball may be only a few percent of the frame size.
• In a real-time application, detecting and tracking the ball must be as fast as possible.
This study proposes an automatic table tennis match umpiring system (ATTMU system) that extracts the OOIs from a table tennis competition video, analyzes the actions of the extracted objects, judges whether all the actions comply with the rules, and then suggests a recommendation for the umpire to consider. The ATTMU system will assist umpires in making correct rulings quickly and precisely in a table tennis game.
One of the official table tennis rules, the Articles of the 11-point scale (A11) [2], is as follows:
1. When the server serves, the server must hit the ball so that it hits the table on the server's side of the net and then goes over or around the net
2. When one player serves or returns the ball, the opponent must hit the ball over the
3. The winner of a game is the player who first scores 11 points unless both players score 10 points. If both players score 10 points, then the game is won by the first
In this research, the table tennis rules will be used to investigate the performance of
Automated refereeing of a table tennis game [4] is a difficult task. On the one hand, the shooting angle makes it difficult to judge the placement, and the players' positions may cover part of the table surface and cause misjudgment. On the other hand, because the ball is deformed in the recorded film, it is not easy to find the correct ball position at each time point.
detect table tennis ball. Therefore, choosing the right camcorder and shooting angles
In the approach [4], the shooting modes considered are double-camcorder and single-camcorder shooting, as follows:
1. Double-camcorder shooting: One camcorder is set up next to the table, and the other camcorder is set up above the table. The advantage is that the positions of the ball are clearer and closer to the actual ones. Also, the shooting angle of the camcorder set up next to the table does not affect the judgment of the placement.
advantage is that it only needs to process the images captured by one video
The goal of this thesis is to establish a real-time referee system. Detecting the objects simultaneously in the footage of two camcorders is a major problem, so we use a single camcorder to record the films. After several experiments, we found the most appropriate way to set up the camcorder, which minimizes the misjudgment of the actual location of the ball when shooting with a single camcorder.
In a table tennis tournament video, there are three main OOIs: the ball, the table, and the net; Fig.1-1 shows the three OOIs. First, the ATTMU system extracts the three OOIs from each frame of the video and then judges whether the actions comply with the rules of A11 according to the spatial relationship between the ball and the table as well as that between the ball and the net. The placement is determined from where the ball hits the table surface. The placement and the net are then used to determine whether the service return is effective. Finally, we can judge which player scores. Thus,
Fig.1-1 Three main objects of the referee system of table tennis games;
environment, we choose four match environments, each with the yellow and the white ball, which are the two main balls specified for international competitions, as shown in Fig.1-3. We therefore obtain a total of eight films to be tested. This thesis uses a JVC-GZ-GX1 camcorder for shooting. The automatic mode is used, and the shutter speed is set to 1/30, which means thirty images are shot per second, for the purpose of improving the clarity of the ball and the accuracy. After several tests and comparisons of various setups, the camcorder is finally aligned with the net, 175 cm away from the table. It stands 200 cm above the table surface, and the angle between the table surface and the tripod is 35 degrees, as shown in Fig.1-2.
Fig.1-2 The final position of the camera
1.4 Organization
In subsequent chapters, the second chapter introduces the existing methods used in this thesis, including morphology, the area filling algorithm, connected component labeling, Otsu's thresholding, and linear regression. The third chapter presents the main method of this thesis, which is divided into detection of the table surface, detection of the linear equation of the net, and detection of the center of the ball. The results obtained in these steps are passed to the referee system of the table tennis game in order to judge the scores. The fourth chapter
Chapter 2 Related Works
This chapter briefly reviews the techniques applied in this thesis.
2.1 Morphology
Morphology [5] is the method that analyzes the geometry of the structure, based on
1. Erosion
2. Dilation
3. Opening
4. Closing
Morphological image processing is based on these four operators, from which other morphological algorithms are derived. This thesis uses the top-hat filter, whose principle is to subtract the opened image from the original image. And
2.1.1 Erosion
Erosion [5] makes the objects in a binary image shrink or become thinner. The way and
elements of different sizes can remove the objects of different sizes. In addition, if there is a small link between two objects, we can separate the two objects by erosion.
2. B[x] ⊆ $A^c$;
The first condition shows that B[x] has the closest relation with image A. The set of points x that satisfy the first condition is called the erosion of A by B, denoted by $A \ominus B$. It is defined as follows:
$$A \ominus B = \{\, z \mid (B)_z \subseteq A \,\} \tag{1}$$
where $(B)_z$ denotes B translated by z; the erosion is thus the set of all points z for which the translated B is contained in A.
Fig.2-2 shows an example of erosion. The solid lines in (b) represent the limit of movement of the structuring element B. When B moves beyond this line, B can no longer be entirely contained in A.
Fig.2-2 The schematic diagram of erosion; (a) image A and structuring element B;
2.1.2 Dilation
Dilation [5] makes the objects in a binary image grow or thicken. The way and extent of the thickening are also controlled by a structuring element. Assume that the target image is called A and the structuring element is B. The condition that A dilates by
Equation (3) means that, after reflecting B about its origin and shifting it by z, the dilation is the set of all displacements z for which the reflected B still overlaps A. Fig.2-3 shows an example of dilation. The solid lines in (b) represent the limit of movement of the structuring element B. When B moves beyond this line, the intersection of A and B becomes an empty set. Thus, the area inside the solid lines represents A dilated by B.
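The two definitions above can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this thesis; the function names `erode` and `dilate`, the list-of-lists image format, the centred origin of the structuring element, and the treatment of pixels outside the image as background are all assumptions made for illustration.

```python
def _offsets(selem):
    """Offsets of the set pixels of the structuring element, origin at its centre."""
    ci, cj = len(selem) // 2, len(selem[0]) // 2
    return [(a - ci, b - cj)
            for a, row in enumerate(selem)
            for b, v in enumerate(row) if v]


def erode(image, selem):
    """Erosion: keep z only if every pixel of B translated by z lies inside A."""
    h, w = len(image), len(image[0])
    offs = _offsets(selem)
    return [[int(all(0 <= i + di < h and 0 <= j + dj < w and image[i + di][j + dj]
                     for di, dj in offs))
             for j in range(w)] for i in range(h)]


def dilate(image, selem):
    """Dilation: set z if the reflected B translated by z overlaps A."""
    h, w = len(image), len(image[0])
    offs = _offsets(selem)
    return [[int(any(0 <= i - di < h and 0 <= j - dj < w and image[i - di][j - dj]
                     for di, dj in offs))
             for j in range(w)] for i in range(h)]


# A 3x3 square object inside a 5x5 image, and a 3x3 square structuring element.
A = [[0, 0, 0, 0, 0],
     [0, 1, 1, 1, 0],
     [0, 1, 1, 1, 0],
     [0, 1, 1, 1, 0],
     [0, 0, 0, 0, 0]]
B = [[1, 1, 1]] * 3

eroded = erode(A, B)    # only the centre pixel survives
dilated = dilate(A, B)  # the object grows by one pixel in every direction
```

With this example, erosion shrinks the square to its centre pixel, while dilation grows it to fill the whole 5x5 image.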
Fig.2-3 The schematic diagram of dilation; (a) image A and structuring element
B; (b) the result that A is dilated by B
2.1.3 Opening
Opening [5] is composed of the two basic operators, erosion and dilation. Here we define the opening of image A by structuring element B:
$$A \circ B = (A \ominus B) \oplus B \tag{4}$$
Here we first erode A by B, so the narrow parts of A are cut off. Then, we use dilation to smooth the truncated A. So, opening can be used to cut off the narrow parts of an image, as shown in Fig.2-4.
Fig.2-4 The schematic diagram of opening; (a) original image; (b) structuring
element; (c) the result after erosion; (d) the result after (c) is dilated by (b)
2.1.4 Closing
The closing of image A by structuring element B is defined as follows:
$$A \bullet B = (A \oplus B) \ominus B \tag{5}$$
Here we first dilate A by B and then erode the result by B. Because A is first dilated by B, the small gaps in image A are filled. The subsequent erosion then smooths image A. The difference between opening and closing is that closing can be used to fill small gaps, whereas opening cuts off narrow parts.
Fig.2-5 The schematic diagram of closing; (a) original image; (b) structuring
element; (c) the result after dilation; (d) the result after (c) is eroded by (b)
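As a hedged illustration of opening and closing, the sketch below composes erosion and dilation in the two orders described in this section. The helper `_morph`, the function names, the centred origin, and the border handling (pixels outside the image count as background) are assumptions for illustration, not the code of this thesis.

```python
def _morph(image, selem, erode):
    """Binary erosion (erode=True) or dilation (erode=False), origin at the centre."""
    h, w = len(image), len(image[0])
    ci, cj = len(selem) // 2, len(selem[0]) // 2
    offs = [(a - ci, b - cj) for a, row in enumerate(selem)
            for b, v in enumerate(row) if v]
    test, s = (all, 1) if erode else (any, -1)
    return [[int(test(0 <= i + s * di < h and 0 <= j + s * dj < w
                      and image[i + s * di][j + s * dj] for di, dj in offs))
             for j in range(w)] for i in range(h)]


def opening(image, selem):
    """Erosion followed by dilation: removes parts narrower than the element."""
    return _morph(_morph(image, selem, True), selem, False)


def closing(image, selem):
    """Dilation followed by erosion: fills gaps smaller than the element."""
    return _morph(_morph(image, selem, False), selem, True)


B = [[1, 1, 1]] * 3

# A 3x3 block with a one-pixel-wide spur on its right side.
spur = [[0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0, 0],
        [0, 1, 1, 1, 1, 1, 0],
        [0, 1, 1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0]]
opened = opening(spur, B)   # the spur is cut off, the block remains

# A 3x3 block with a one-pixel hole in its centre.
ring = [[0] * 7 for _ in range(7)]
for i in range(2, 5):
    for j in range(2, 5):
        ring[i][j] = 1
ring[3][3] = 0
closed = closing(ring, B)   # the hole is filled
```

The first example shows opening cutting off a narrow part; the second shows closing filling a small gap while leaving the outline of the block unchanged.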
2.2 Area Filling Algorithm
foreground. The area filling algorithm [6] uses dilation, complementation, and intersection to fill the holes in an image. Assume A is the set of boundary pixels of a region in the image. The algorithm is defined as follows:
$$X_k = (X_{k-1} \oplus B) \cap A^c, \quad k = 1, 2, 3, \ldots \tag{6}$$
where $X_0$ is the pixel P inside the boundary, and B is the structuring element.
Fig.2-6 The schematic diagram of Area filling algorithm; (a) original image; (b)
structuring element; (c) 𝑋0=P; (d) 𝑋1; (e) intersection of 𝑋3 and A
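Equation (6) can be sketched as a simple iteration on sets of pixel coordinates. The following is an illustrative sketch only: the function name `fill_region`, the cross-shaped structuring element, the stopping rule (iterate until $X_k = X_{k-1}$), and returning the union of the filled set with the boundary are assumptions based on the usual textbook form of the algorithm.

```python
def fill_region(boundary, seed, shape):
    """Iterate X_k = (X_{k-1} dilated by B) minus the boundary A until it stops growing.

    boundary: set of (row, col) pixels forming a closed boundary A
    seed:     the starting pixel P inside the boundary
    shape:    (height, width) of the image, used to clip the dilation
    """
    height, width = shape
    cross = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]  # structuring element B
    x = {seed}
    while True:
        dilated = {(i + di, j + dj) for i, j in x for di, dj in cross
                   if 0 <= i + di < height and 0 <= j + dj < width}
        nxt = dilated - boundary          # intersection with the complement of A
        if nxt == x:                      # no new pixels: the hole is filled
            return x | boundary
        x = nxt


# A square boundary around a 3x3 hole in a 7x7 image.
ring = {(i, j) for i in range(1, 6) for j in range(1, 6)
        if i in (1, 5) or j in (1, 5)}
filled = fill_region(ring, seed=(3, 3), shape=(7, 7))
```

In this example the 16 boundary pixels and the 9 interior pixels together give a filled 5x5 square.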
2.3 Connected Component Labeling
Connected component labeling [7] distinguishes the regions in a binary image. Pixels in the same connected region receive the same label, and different regions are given different
8-connected labeling. 4-connected labeling means that if the pixels around a pixel in the four directions (up, down, left, and right) have the same intensity as that pixel, they are given the same label. 8-connected labeling additionally considers the four diagonal directions. In this thesis, we use
Step 1
Scan the image from left to right and from top to bottom. Let P be the pixel currently being scanned. If P equals 0, move on to the next pixel. If P equals 1, judge as follows:
If exactly one of the pixels q, r, s, t around P is 1, give P the same label as that pixel.
If two or more of the pixels q, r, s, t around P are 1, give P the label of one of these pixels, whether or not their labels are the same. And other
Step 2
Give each equivalence class a unique label. Then replace the label of each pixel in the image with the label of the class to which the pixel belongs. Fig.2-8(b) shows that the labels of some pixels are not yet correct after step 1. After
Fig.2-8 Connected component labeling in the binary image; (a) binary image; (b) the result after step 1; (c) the result after step 2
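The two steps above can be sketched in pure Python as a two-pass labeling with a small union-find table for the equivalence classes. This is a minimal sketch under assumptions: the function name `label_components`, 8-connectivity with q, r, s, t taken as the upper-left, upper, upper-right, and left neighbours of P, and the rule that a pixel with no labeled neighbour receives a new label are not taken from the text above.

```python
def label_components(image):
    """Two-pass 8-connected labeling of a binary image given as a list of lists."""
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    parent = [0]  # parent[k] points towards the representative of label k

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    # Step 1: provisional labels, recording which labels are equivalent.
    for i in range(h):
        for j in range(w):
            if not image[i][j]:
                continue
            prior = [labels[a][b]
                     for a, b in ((i - 1, j - 1), (i - 1, j), (i - 1, j + 1), (i, j - 1))
                     if 0 <= a and 0 <= b < w and labels[a][b]]
            if not prior:
                parent.append(len(parent))        # start a new label
                labels[i][j] = len(parent) - 1
            else:
                labels[i][j] = min(prior)
                for k in prior:                   # merge equivalent labels
                    ra, rb = find(k), find(labels[i][j])
                    parent[max(ra, rb)] = min(ra, rb)

    # Step 2: replace every label by a unique label of its equivalence class.
    remap = {}
    for i in range(h):
        for j in range(w):
            if labels[i][j]:
                root = find(labels[i][j])
                labels[i][j] = remap.setdefault(root, len(remap) + 1)
    return labels, len(remap)


# A U-shaped region (its two arms meet only in the last row) and a separate bar.
binary = [[1, 0, 1, 0, 0],
          [1, 0, 1, 0, 1],
          [1, 1, 1, 0, 1]]
labels, count = label_components(binary)
```

The U-shaped region receives two provisional labels in step 1, which step 2 merges into one, so the example yields two components in total.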
grayscale image into a binary image. Here we introduce the process of automatic
1. Select an initial value T, which is usually the median intensity of the image.
2. Separate all the pixels in the image into two classes, $G_1$ and $G_2$, by T. $G_1$ is the set of pixels whose intensities are less than T. $G_2$ is the set of the pixels
Otsu's thresholding is a method based on the histogram. The probability of a pixel with
where $n_q$ denotes the number of pixels with gray level $r_q$ and $n$ is the total number of pixels in a gray-level image. Assume that the threshold k has been
where, with $L$ the number of gray levels,
$$\omega_1 = \sum_{q=0}^{k-1} p_q(r_q) \tag{9}$$
$$\omega_2 = \sum_{q=k}^{L-1} p_q(r_q) \tag{10}$$
$$\mu_1 = \sum_{q=0}^{k-1} q\, p_q(r_q) / \omega_1 \tag{11}$$
$$\mu_2 = \sum_{q=k}^{L-1} q\, p_q(r_q) / \omega_2 \tag{12}$$
$$\mu_r = \sum_{q=0}^{L-1} q\, p_q(r_q) \tag{13}$$
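The quantities in equations (9) to (13) can be computed with a short sketch such as the one below. It assumes that the threshold k is chosen to maximize the standard between-class variance of Otsu's method, $\omega_1(\mu_1 - \mu_r)^2 + \omega_2(\mu_2 - \mu_r)^2$; the function name `otsu_threshold` and the flat list of integer gray levels are also assumptions for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold k that maximizes the between-class variance.

    Class 1 holds gray levels 0..k-1 and class 2 holds gray levels k..levels-1.
    """
    n = len(pixels)
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    p = [c / n for c in hist]                       # p_q(r_q) = n_q / n
    mu_r = sum(q * p[q] for q in range(levels))     # overall mean, equation (13)

    best_k, best_var = 0, -1.0
    w1 = 0.0   # running omega_1, equation (9)
    s1 = 0.0   # running sum of q * p_q over class 1
    for k in range(1, levels):
        w1 += p[k - 1]
        s1 += (k - 1) * p[k - 1]
        w2 = 1.0 - w1                               # omega_2, equation (10)
        if w1 <= 0.0 or w2 <= 0.0:
            continue                                # one class is empty
        mu1 = s1 / w1                               # equation (11)
        mu2 = (mu_r - s1) / w2                      # equation (12)
        var = w1 * (mu1 - mu_r) ** 2 + w2 * (mu2 - mu_r) ** 2
        if var > best_var:
            best_var, best_k = var, k
    return best_k


# Two well-separated groups of gray levels: dark background and bright lines.
pixels = [10] * 50 + [200] * 50
k = otsu_threshold(pixels)
binary = [1 if v >= k else 0 for v in pixels]
```

For a bimodal example like this one, any threshold between the two groups separates them, and the sketch returns the first such value it encounters.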
2.5 Linear Regression
regression coefficients of the model parameters. The process is that we model the
then given without its accompanying value of y, the fitted model can be used to make
In order to find the regression line, we need a dataset in which x and y are given:
$$x_1, x_2, \ldots, x_m, \quad y_1, y_2, \ldots, y_m$$
Then, assume a straight line containing the unknowns α and β:
$$f(x) = \alpha x + \beta \tag{14}$$
Define the error $\varepsilon_i$ and the squared error $E_2(\alpha, \beta)$ of the model:
$$\varepsilon_i = y_i - f(x_i) \tag{15}$$
$$E_2(\alpha, \beta) = \sum_{i=1}^{m} \varepsilon_i^2 \tag{16}$$
If we can find α and β that minimize $E_2$, the resulting line is the linear regression line. The goal of the least squares method is to find the line which lets $E_2$ be
follows:
$$\begin{cases} \dfrac{\partial}{\partial \alpha} E_2(\alpha, \beta) = -2 \sum_{i=1}^{m} (y_i - \alpha x_i - \beta)\, x_i = 0 \\[2ex] \dfrac{\partial}{\partial \beta} E_2(\alpha, \beta) = -2 \sum_{i=1}^{m} (y_i - \alpha x_i - \beta) = 0 \end{cases} \tag{17}$$
2. We can get α and β by the equations above. And we can also get α and β by
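Setting the two partial derivatives of the squared error to zero gives a pair of linear equations in α and β with a closed-form solution. The sketch below computes it for the line model $f(x) = \alpha x + \beta$ of equation (14); the function name `fit_line` is an assumption for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of f(x) = alpha * x + beta to the points (xs[i], ys[i])."""
    m = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = m * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are equal; the line is vertical")
    alpha = (m * sxy - sx * sy) / denom   # slope
    beta = (sy - alpha * sx) / m          # intercept
    return alpha, beta


# Points lying exactly on y = 2x + 1.
alpha, beta = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

Note that this form cannot represent a vertical line, so for a net that appears nearly vertical in the image it may be preferable to swap the roles of x and y before fitting.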
Chapter 3 ATTMU System
The ATTMU system consists of four components: table detection, net detection, ball detection, and the regular system; the flow chart is shown in Fig.3-1. The table detection, net detection, and ball detection components extract the table, net, and ball from each frame of the table tennis competition video. The regular system judges the actions according to the competition rules of A11 based on the spatial relationship between the ball and the table as well as that between the ball and the net. The ATTMU system first transforms each frame of a color table tennis competition video into a gray-level frame. Let $RGB$ be one frame of a color table tennis competition video, and $RGB(i, j)$ be the pixel located at the coordinates (i, j) on $RGB$. The corresponding
where $R(i, j)$, $G(i, j)$, and $B(i, j)$ are the R, G, and B color components of $RGB(i, j)$ [10].
A. Table detection: The goal is to cut the table region out of the film, so that we can determine whether the ball lands inside the table region. Ball positions located in the table region are considered as the placements of the ball. And we
B. Net detection: The goal is to find the equation of the net. We can substitute the placements of the ball into the equation of the net and determine whether the ball is located on the right side or the left side of the net. In order to determine the
D. Regular system: The system contains the rules of the table tennis game. We combine the three features (the table region, the equation of the net, and the centroid coordinates of the ball) with this system to determine which player scores.
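The side test mentioned in item B can be sketched as a sign check on the net equation. The sketch assumes the net is described by a line $y = \alpha x + \beta$ in image coordinates; the function name `side_of_net` is hypothetical, and which sign corresponds to the left or right half of the table depends on the camera setup, so the mapping is left to the caller.

```python
def side_of_net(point, alpha, beta):
    """Return +1, -1, or 0 according to the side of the line y = alpha*x + beta.

    point: (x, y) centroid of a ball placement in image coordinates.
    The result is 0 when the point lies exactly on the line.
    """
    x, y = point
    value = y - (alpha * x + beta)        # substitute the point into the equation
    return (value > 0) - (value < 0)


# With the line y = x, points on opposite sides get opposite signs.
first = side_of_net((0, 5), alpha=1.0, beta=0.0)
second = side_of_net((5, 0), alpha=1.0, beta=0.0)
on_line = side_of_net((2, 2), alpha=1.0, beta=0.0)
```

Two placements whose results have opposite signs lie on opposite sides of the net, which is the information the rule judgment needs.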
3.1 Table detection
Table detection accurately cuts the table tennis table out of a frame of the table tennis competition video. The step can be divided into two parts as follows:
Fig.3-2 The flow chart of table detection
In a table tennis game, since the camcorder is generally stationary, the ATTMU system extracts the table only from the first frame $f$ of the gray-level table tennis competition video. Since white lines 2 cm wide are painted along each edge of a table tennis table, and a center line 3 mm wide parallel to the side lines divides the playing surface in half, the ATTMU system uses the top-hat filter technique to enhance the
The top-hat filter employs the ranked values from two regions of different sizes: the brightest value in a circular interior region is compared with the brightest value in a
$$f_T = f - (f \circ E) \tag{21}$$
where "$\circ$" denotes the opening operation, $E$ is the structuring element (Fig. 3-3), and $f_T$ is a symbol adopted here for the top-hat filtered image. The white top-hat transform returns an image containing those "objects" of an input image that are smaller than the
Fig. 3-4(a) is a frame of a color table tennis competition video. Fig. 3-4(b) is the image obtained by executing the top-hat filtering operation on the gray-level image of Fig. 3-4(a). From Fig. 3-4(b) one can clearly observe that the top-hat filtering operation effectively highlights the boundary of the upper surface of the table. In $f_T$, the pixels on this boundary are generally brighter. Otsu's thresholding method is hence used to decide a suitable threshold $o$ for generating a binary image $b$ by
where a 0-bit represents a black pixel and a 1-bit represents a white pixel. Fig. 3-4(c)
on $b$. In the closing operation, the ATTMU system executes a dilation operation and then an erosion operation based on the structuring element shown in Fig. 3-3. Then the biggest region in $b$ is considered to be the extracted upper surface of the table. Fig. 3-4(d) depicts the extracted upper surface of the table on the image in Fig. 3-4(a).
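The white top-hat transform of equation (21) can be illustrated with a small gray-level sketch. It assumes a flat disc-shaped structuring element, for which gray-level erosion and dilation reduce to minimum and maximum filters; the names `_rank_filter` and `white_top_hat`, the `radius` parameter, and the handling of image borders (pixels outside the image are ignored) are assumptions for illustration.

```python
def _rank_filter(img, radius, pick):
    """Minimum (erosion) or maximum (dilation) over a disc of the given radius."""
    h, w = len(img), len(img[0])
    disc = [(di, dj) for di in range(-radius, radius + 1)
            for dj in range(-radius, radius + 1)
            if di * di + dj * dj <= radius * radius]
    return [[pick(img[i + di][j + dj] for di, dj in disc
                  if 0 <= i + di < h and 0 <= j + dj < w)
             for j in range(w)] for i in range(h)]


def white_top_hat(img, radius):
    """Original image minus its opening, as in equation (21)."""
    opened = _rank_filter(_rank_filter(img, radius, min), radius, max)
    return [[f - o for f, o in zip(frow, orow)]
            for frow, orow in zip(img, opened)]


# A dark 7x7 frame with a one-pixel-wide bright vertical line (like a table edge).
frame = [[10] * 7 for _ in range(7)]
for i in range(7):
    frame[i][3] = 200
top_hat = white_top_hat(frame, radius=1)
```

Because the bright line is thinner than the disc, the opening removes it, and the subtraction leaves only the line on a zero background, which is then easy to threshold.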
Fig.3-4 The process of the table detection; (a) original image; (b) the result after top-hat filtering; (c) the binary image after Otsu's thresholding; (d) the result after applying closing once and keeping the biggest region
3.2 Net detection
Net posts are set apart for a doubles court. The net detection approach detects the location of the net and describes it with an equation. To extract the net, the ATTMU system first cuts off a small rectangular region $R_n$ that contains the net and then derives the location of the net from this region. Let $b$ consist of M×N pixels. The central pixel of $b$ is located at the coordinates $(i_c, j_c)$ given by equation (23).
$$(i_c, j_c) = \left( \left\lfloor \frac{M+1}{2} \right\rfloor, \left\lfloor \frac{N+1}{2} \right\rfloor \right) \tag{23}$$
Since the net is located at the middle of the table, the ATTMU system takes the top-left corner of $R_n$ to be located at $\left( i_c - \left\lfloor \frac{M+1}{10} \right\rfloor, j_c - \left\lfloor \frac{N+1}{10} \right\rfloor \right)$ and the bottom-right corner to be located at $\left( i_c + \left\lfloor \frac{M+1}{10} \right\rfloor, j_c + \left\lfloor \frac{N+1}{10} \right\rfloor \right)$. The $R_n$ cut off from the
binary image $R_b$. Fig. 3-6(b) delineates the $R_b$ obtained from the $R_n$ in Fig. 3-6(a).
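The corner coordinates of the search region $R_n$ follow directly from equation (23) and the two corner expressions. A minimal sketch, with the function name `net_search_region` chosen here for illustration:

```python
def net_search_region(M, N):
    """Corners of the rectangle R_n around the centre of an M x N image.

    Returns ((top, left), (bottom, right)) in (i, j) coordinates.
    """
    ic, jc = (M + 1) // 2, (N + 1) // 2          # central pixel, equation (23)
    di, dj = (M + 1) // 10, (N + 1) // 10        # half-extent of the region
    return (ic - di, jc - dj), (ic + di, jc + dj)


# Example for a 480 x 720 image (the image size is only an example value).
top_left, bottom_right = net_search_region(480, 720)
```

The region therefore covers roughly one fifth of the image in each direction, centred on the image centre.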
Besides, the thinning operation is used to trim all the lines on $R_b$ to single-pixel thickness, and the result is shown in Fig. 3-6(c). Then, the 1-bit pixels in $R_b$ are input
3.3 Ball detection
The goal of this step is to detect the location of the ball in the film. It is divided into two parts, Part A and Part B. In Part A, we search the entire image until we find the coordinates of the ball. After that, in Part B, we search only the area extending from the coordinates of the ball by a given range. The ball does not move far between one frame and the next. So, if we search a narrow range around the coordinates of the ball, we can not only reduce the execution time but also improve the detection accuracy. If the ball is not detected in Part B, we
3.3.1 Part A-Candidate point detection
The method is to find the objects similar to the ball and their centroid coordinates. These features are then provided to Part B to detect the ball within a narrow range.
as foreground, and the first grayscale image 0 as background. Because the moving
objects are mainly from the foreground image, we have to highlight the high-luminance portion of the foreground. In this case, we retain the area with the stronger contrast in the image $F$. Here we get the image $N_F$ by using Otsu's thresholding
After that, we subtract the intensities of the foreground from the intensities of the
get the image $N_B$ by using Otsu's thresholding method to create a binary image of
final background removing image . The purpose is to enhance the contrast of the
retain the area and the shape of the object which is most similar to the ball. First, we consider the area of the object. The ratio of the area of the ball to that of the table falls within a certain range [3], approximately $\frac{1}{4500}$ to $\frac{1}{4000}$. So, we use this range as the threshold to remove the regions which are too large or too small. It's
$$\text{region} = \begin{cases} 0, & \text{if area} < \frac{1}{4500} \times A_i \ \text{or area} > \frac{1}{4000} \times A_i, \\ 1, & \text{otherwise.} \end{cases} \tag{24}$$
Then, we consider the shape of the object. We use a value d to determine whether the regions in the image are nearly circular. The value d is defined by equation (25).
$$d = \frac{4\pi \cdot \text{area}}{\text{perimeter}^2} \tag{25}$$
$$\text{region} = \begin{cases} 0, & \text{if } d > 2, \\ 1, & \text{otherwise.} \end{cases} \tag{26}$$
far from 1, it means the region is definitely not circular. Thus, we remove the regions whose d is greater than 2. The remaining regions are the candidate objects similar to the ball. The result is shown in Fig. 3-9(c). We use connected component labeling to label these regions, where $C_z(i, j)$ is the centroid of the z-th labeled candidate object similar to the ball, and find the most similar one. We do not need to search the entire image when examining the objects similar to the ball; instead, we expand from the center of each candidate object by a given range. This not only reduces the execution time but also improves the detection accuracy. Here, we find $C_z(i, j)$ in the grayscale image 0. Next, we expand from $C_z(i, j)$ by the range of
$$\begin{cases} m_2 = \frac{1}{10} M, \\ n_2 = \frac{1}{10} N. \end{cases} \tag{27}$$
The detection range only needs to be slightly larger than the size of the ball. The result is shown in Fig.3-9(e). Then, we use Otsu's thresholding method to create binary images of the range areas. The result is shown in Fig.3-9(f). Similar to Step 2, we also evaluate the objects similar to the ball by area and shape. We remove the regions which are too large or too small by equation (24) in these binary images of the range areas. These images are denoted $B_z$. We compute the value d of the objects in $B_z$ by equation (25) and find the image whose d is the closest to 1. The purpose is to find the object which is the most similar to the ball. Finally, we define this object as the ball and find its centroid coordinate $F_C(i, j)$.
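The area test of equation (24), the shape test of equations (25) and (26), and the final choice of the region whose d is closest to 1 can be sketched as follows. This is an illustrative sketch rather than the code of this thesis: the function names, the representation of a region as an (area, perimeter) pair, and the example numbers are assumptions, while the bounds 1/4500, 1/4000, and d > 2 come from the text.

```python
import math


def circularity(area, perimeter):
    """The value d of equation (25); an ideal disc gives d = 1."""
    return 4.0 * math.pi * area / perimeter ** 2


def passes_area(area, table_area, low=1 / 4500, high=1 / 4000):
    """Equation (24): keep regions whose area is a plausible fraction of the table."""
    return low * table_area <= area <= high * table_area


def passes_shape(area, perimeter, d_max=2.0):
    """Equation (26): discard regions whose d is greater than 2."""
    return circularity(area, perimeter) <= d_max


def most_ball_like(regions, table_area):
    """Among the regions passing both tests, return the one with d closest to 1.

    regions: list of (area, perimeter) pairs; returns None if nothing passes.
    """
    kept = [r for r in regions
            if passes_area(r[0], table_area) and passes_shape(*r)]
    return min(kept, key=lambda r: abs(circularity(*r) - 1.0), default=None)


table_area = 425000                                   # example value in pixels
disc = (100.0, 2.0 * math.sqrt(math.pi * 100.0))      # ideal disc of area 100
square = (100.0, 40.0)                                # 10 x 10 square
blob = (500.0, 90.0)                                  # too large for a ball
best = most_ball_like([square, blob, disc], table_area)
```

In this example the large blob fails the area test, and the disc is preferred over the square because its d is closer to 1.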
Fig.3-9 The process of the ball detection in Part A; panels (e) and (f)
3.3.2 Part B-Local area detection
The ball does not move far between two consecutive images. If we continued to use Part A to search the whole image, it would be time consuming. So, Part B is proposed to improve on Part A. The method narrows the detection range to the area around the ball location in the previous image. If the ball is detected in the following image, we continue to use this procedure. This not only reduces the execution time but also improves the detection accuracy.
this range.
$$\begin{cases} m_3 = \frac{1}{3} M, \\ n_3 = \frac{1}{3} N. \end{cases} \tag{28}$$
contrast of the moving objects, so we can cut out the ball. We perform the same two actions as in Part A. The first action is to use Otsu's thresholding method to create a binary image of the range area. The result is shown in Fig.3-11(a). In order to remove the background, the second action is to subtract the intensities of the ball in the range area from the intensities of the ball of the first frame. Also, we use Otsu's thresholding
intersect both of the binary images to get the image $M_B$ in which the moving objects remain.
object which is most similar to the ball. The object similar to the ball is evaluated by area and shape. For area, we remove the regions which are too large or too small in the image $M_B$ by equation (24). For shape, we compute the value d of the objects in $M_B$ by equation (25) and retain the object whose d is the closest to 1. The result is shown in Fig.3-11(b). Finally, we define the remaining object as the ball, so we can get the centroid coordinate $F_C(i, j)$.
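The local search area of Part B can be sketched as a window of the size given by equation (28) placed around the ball position of the previous frame. The sketch assumes that the window is centred on the previous centroid and clipped to the image; the text does not state this explicitly, and the function name `local_window` is hypothetical.

```python
def local_window(center, M, N):
    """Search window of size (M/3) x (N/3) around the previous ball position.

    center: (i, j) centroid of the ball in the previous frame
    Returns (top, left, bottom, right), with bottom and right exclusive.
    """
    ci, cj = center
    m3, n3 = M // 3, N // 3                      # window size, equation (28)
    top = max(0, ci - m3 // 2)                   # clip at the upper and left edges
    left = max(0, cj - n3 // 2)
    bottom = min(M, top + m3)                    # clip at the lower and right edges
    right = min(N, left + n3)
    return top, left, bottom, right


# Example for a 480 x 720 frame (the frame size is only an example value).
middle = local_window((240, 360), 480, 720)
corner = local_window((10, 10), 480, 720)
```

Only the pixels inside this window are thresholded and filtered in Part B, which is why the step is faster than searching the whole frame.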
Fig.3-11 The process of the ball detection in Part B; panels (a) and (b)
3.3.3 Placement detection
The method is to find the placement of the ball by the centroid coordinate 𝐹𝐶 (𝑖, 𝑗)
in ball detection.
Fig.3-12The feature of the placement (a) the k-1th centroid coordinate (a) the kth
Fig.3-12 shows that the y value of the ball first increases and then declines around
the placement, as in (b). Here we use the following steps to determine whether the nth centroid
First, we take the intersection of the candidate point 𝐹𝐶 (𝑖, 𝑗) and the table region
𝐴𝑖 , where 𝐴𝑖 is the set of the coordinates in the table region. If the intersection is
empty, 𝐹𝐶 (𝑖, 𝑗) is not located on the table and is not taken into
account. Then, the remaining centroid coordinates detected in ball detection are
stored in the array CSet[ ], which means the set of total centroid coordinates of the
\[
\mathrm{CSet}\left[ F_{C_1}^{n_1}(x_{c_1}, y_{c_1}),\ F_{C_2}^{n_2}(x_{c_2}, y_{c_2}),\ \ldots,\ F_{C_k}^{n_k}(x_{c_k}, y_{c_k}) \right], \quad \text{if } F_C(i, j) \in A_i \tag{29}
\]
where $x_{c_k}$ and $y_{c_k}$ are the x-coordinate and the y-coordinate of the kth centroid
coordinate and $n_k$ is the frame number of 𝐹𝐶 (𝑖, 𝑗). And the frame number will
Then, we store the remaining centroid coordinates of the ball which meet the
following condition in the array PSet[ ], which means the set of total centroid
placement.
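The construction of CSet[ ] and PSet[ ] can be sketched as follows. This is a hedged illustration: the function names and the (frame, x, y) tuple layout are hypothetical, and the local-maximum test on y is an assumption based on the description of Fig.3-12 (the y value increases and then declines); it stands in for the selection condition, which is not reproduced here.

```python
def collect_centroids(detections, table_region):
    """CSet: keep (frame, x, y) detections whose (x, y) lies in the table region.

    table_region is a set of (x, y) coordinates, playing the role of A_i.
    """
    return [(n, x, y) for (n, x, y) in detections if (x, y) in table_region]


def find_placements(cset):
    """PSet: centroids whose y value is a local maximum along the sequence.

    Assumption: in image coordinates y grows downward, so a bounce shows
    up as y rising and then falling between neighbouring centroids.
    """
    pset = []
    for prev, cur, nxt in zip(cset, cset[1:], cset[2:]):
        if prev[2] < cur[2] and cur[2] > nxt[2]:
            pset.append(cur)
    return pset
```

Each stored placement keeps its frame number, so later steps can tell which frame a bounce came from.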
3.4 Regular system
As shown in Fig.3-13, the regular system combines the three features obtained by the
methods above (the table region $A_i$, the equation of the net $E_q$, and the centroid
coordinates of the placements $P_m^{n_k}(x_{P_m}, y_{P_m})$) with two judgment methods,
effective service judgment and effective service return judgment, to determine the
scores and the winner. Finally, we reset the subsequent centroid coordinates of
the ball to be the serving point, and repeat the above actions until the end of the
game.
The definition of an effective service is that the ball hit by the server must first
land on the server’s side of the table, and then go over or around the net before it
lands on the opponent’s side of the table. Thus, the features needed by the method are
the first centroid coordinate of the ball $F_{C_1}^{n_1}(x_{c_1}, y_{c_1})$, the first and
the second centroid coordinates of the placements $P_1^{n_1}(x_{P_1}, y_{P_1})$ and
$P_2^{n_2}(x_{P_2}, y_{P_2})$, and the equation of the net $E_q$.
First, we examine the serving side. The first centroid coordinate of the ball,
$F_{C_1}^{n_1}(x_{c_1}, y_{c_1})$, is taken as the serving point $S_C(i, j)$. And we do the following
Fig.3-14 The schematic diagram of effective service: (a) serve from the right side;
(b) the first placement lands on the right side of the table; (c) the second placement
lands on the left side of the table
Fig.3-14 shows that if the first placement $P_1^{n_1}(x_{P_1}, y_{P_1})$ is located on the same
side as the serving point $S_C(i, j)$, and the second placement $P_2^{n_2}(x_{P_2}, y_{P_2})$ is
then located on the opposite side from the first placement $P_1^{n_1}(x_{P_1}, y_{P_1})$, we can
regard this situation as an effective service. Accordingly, if the value obtained by
substituting $S_C(i, j)$ into the equation of the net $E_q$ and the value obtained by
substituting the first placement $P_1^{n_1}(x_{P_1}, y_{P_1})$ into $E_q$ have the same
sign, the server has hit his own side of the table successfully, by equation (31). And vice
If this condition is met, we continue the effective service judgment as follows.
Similarly to the above, if the value obtained by substituting the second placement
$P_2^{n_2}(x_{P_2}, y_{P_2})$ into the equation of the net $E_q$ and the value obtained by
substituting the first placement $P_1^{n_1}(x_{P_1}, y_{P_1})$ into $E_q$ have different
signs, the ball has landed on the server’s own side and then on the opponent’s side of
the table, by equation (32). The situation then fulfills the definition of an effective
service, so we can execute the subsequent method, effective service return judgment.
Otherwise, the service is invalid and the receiver scores.
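The two sign tests of the effective service judgment can be sketched as follows. This is a minimal sketch, assuming the net equation is stored as coefficients (a, b, c) of the line a·x + b·y + c = 0; that representation and the function names are assumptions, not taken from the thesis.

```python
def net_value(eq, point):
    """Value obtained by substituting a point (x, y) into the net line."""
    a, b, c = eq
    x, y = point
    return a * x + b * y + c


def is_effective_service(eq, serving_point, p1, p2):
    """True if P1 is on the server's side and P2 is on the opposite side."""
    v_s = net_value(eq, serving_point)
    v_1 = net_value(eq, p1)
    v_2 = net_value(eq, p2)
    own_side_hit = v_s * v_1 > 0        # same sign: first bounce on server's side
    crossed_net = v_1 * v_2 < 0         # different sign: second bounce across the net
    return own_side_hit and crossed_net
```

For example, with a vertical net at x = 50, written as (1, 0, -50), a serve from (80, 20) that bounces at (70, 30) and then at (30, 30) passes both tests, while a second bounce at (60, 30) fails the second test.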
3.4.2 Effective service return judgment
Once the service has been confirmed as effective by the steps above, the subsequent
placements are passed to the effective service return judgment. The definition of an
effective service return is that, after one player serves or returns the ball, the other
player must hit the ball over the net so that it lands on the first player’s side of the
table. The features needed by the method are the subsequent placements
$P_m^{n_k}(x_m, y_m)$, where $P_m$ denotes the mth centroid coordinate of the placements,
and the equation of the net $E_q$.
Fig.3-15 The schematic diagram of effective and invalid service return: from
(a) to (b) is an effective service return; from (c) to (d) is an invalid service return
Fig.3-15 shows that if the placement $P_m^{n_k}(x_m, y_m)$ is located on the opposite side
from the previous placement $P_{m-1}^{n_{k-1}}(x_{m-1}, y_{m-1})$, we can regard this situation as
an effective service return. Conversely, if $P_m^{n_k}(x_m, y_m)$ is located on the same side
as $P_{m-1}^{n_{k-1}}(x_{m-1}, y_{m-1})$, it is an invalid service return. The previous method,
effective service judgment, uses the first and the second centroid coordinates of the
placements $P_1^{n_1}(x_{P_1}, y_{P_1})$ and $P_2^{n_2}(x_{P_2}, y_{P_2})$. If the server serves
successfully, we substitute the subsequent placements into the equation of the net $E_q$.
Then, we multiply the values of two consecutive placements to get a new value;
this new value has to be negative, so we can ensure that the ball hits the other
This value is updated each time a subsequent placement is substituted
into the equation of the net $E_q$. When this value becomes positive, it indicates that
the placement $P_m^{n_k}(x_m, y_m)$ is located on the same side as
the previous placement $P_{m-1}^{n_{k-1}}(x_{m-1}, y_{m-1})$. This situation means invalid service return,
Case 1 and Case 2 in Fig.3-13 show the two situations after determined by
Case 1 : When the value obtained by substituting $P_m^{n_k}(x_m, y_m)$ into the
equation of the net $E_q$ is updated to a positive sign and the value obtained by
substituting $S_C(i, j)$ into the equation of the net $E_q$ is greater than 0, the
server has made an invalid service return, and the receiver scores.
Case 2 : When the value obtained by substituting $P_m^{n_k}(x_m, y_m)$ into the
equation of the net $E_q$ is updated to a positive sign and the value obtained by
substituting $S_C(i, j)$ into the equation of the net $E_q$ is less than 0, the
receiver has made an invalid service return, and the server scores.
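The running product test on consecutive placements can be sketched as follows. This is a hedged sketch, not a transcription of the thesis equations: the net is assumed to be stored as coefficients (a, b, c) of a·x + b·y + c = 0, the function names are hypothetical, and the attribution of the point (the player on whose side the ball lands twice in a row loses the rally) is one interpretation of how Case 1 and Case 2 would be applied.

```python
def net_value(eq, point):
    """Value obtained by substituting a point (x, y) into the net line."""
    a, b, c = eq
    x, y = point
    return a * x + b * y + c


def judge_returns(eq, serving_point, placements):
    """Scan placements P1, P2, ... of a rally whose service was effective.

    Returns (scorer, m): scorer is "server" or "receiver" and m is the
    0-based index of the placement that ended the rally, or (None, None)
    if every return so far is effective.
    """
    for m in range(2, len(placements)):
        product = net_value(eq, placements[m - 1]) * net_value(eq, placements[m])
        if product > 0:  # two consecutive placements on the same side
            # Assumption: the player on whose side the ball landed twice
            # failed to return it, so the other player scores.
            fault_on_server_side = (
                net_value(eq, placements[m]) * net_value(eq, serving_point) > 0
            )
            return ("receiver" if fault_on_server_side else "server"), m
    return None, None
```

The returned index m marks the last placement consumed by the rally, so the placement at m + 1, if any, would be the natural starting point for the next rally.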
3.4.3 Reset serving point
When either player scores, we regard the next placement, the one that has not yet been
substituted into the equation of the net, as the first placement of the next rally. For
example, if the placement $P_m^{n_k}(x_m, y_m)$ was substituted into the equation of the
net $E_q$ when the player scored, we set $P_{m+1}^{n_{k+1}}(x_{m+1}, y_{m+1})$ as the first
placement. Likewise, the centroid coordinate of the ball $F_{C_k}^{n_k}(x_{c_k}, y_{c_k})$ is
taken as the serving point $S_C(i, j)$. This completes the reset of the serving point.
Chapter 4 Results and Discussion
In the experiment, the results of each method, including the edge of the
table, the equation of the net, the centroid coordinates of the ball, and the centroid
coordinates of the placements, are marked on the films of the table tennis games, and
we assess their accuracy by the following methods. We compare the edge of the
table obtained in table detection with the experts’ hand-drawn table region, and
compute the true positives (TP), true negatives (TN), false positives (FP), and false
negatives (FN) of the table detection result. The experts also judge the
results of net detection: if the position of the detected net lies within a
reasonable margin of error, the experts denote it as True; otherwise, it is denoted as
False. Then, we compute the accuracy of the centroid coordinates of the balls and
where $n_R$ is the number of balls and placements from ball detection that the
experts denote as effective, and $n_G$ is the number of balls and placements that the
experts mark in all the films. Finally, we compare the results of the regular system with
the results judged by the referee on the spot to see whether the system makes
correct judgments.
A smaller ME value means fewer misclassified samples, which has the
2. \[
\mathrm{RFAE} =
\begin{cases}
\dfrac{FN - FP}{TP + FN}, & \text{if } (FP + TP) < (TP + FN) \\[2ex]
\dfrac{FP - FN}{FP + TP}, & \text{if } (FP + TP) \geq (TP + FN)
\end{cases}
\tag{36}
\]
3. \[
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FN + FP} \tag{37}
\]
where the denominator is the total number of pixels, and the numerator is
the number of pixels that agree with the result classified by the experts. Thus, the
4. \[
\mathrm{Sensitivity} = \frac{TP}{TP + FN} \tag{38}
\]
Sensitivity is the proportion of target samples that are classified correctly.
A larger value therefore means a better segmentation result.
5. \[
\mathrm{Specificity} = \frac{TN}{FP + TN} \tag{39}
\]
Specificity is the proportion of non-target samples that are classified correctly.
A larger value therefore means a better segmentation result.
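As a worked illustration, the measures of equations (36) to (39) can be computed directly from the TP, TN, FP, and FN pixel counts. The function names and the sample counts in the usage note are hypothetical and are not taken from the experiments.

```python
def rfae(tp, fp, fn):
    """Relative foreground area error, equation (36)."""
    if fp + tp < tp + fn:
        return (fn - fp) / (tp + fn)
    return (fp - fn) / (fp + tp)


def accuracy(tp, tn, fp, fn):
    """Share of all pixels that agree with the experts, equation (37)."""
    return (tp + tn) / (tp + tn + fn + fp)


def sensitivity(tp, fn):
    """Share of target pixels classified correctly, equation (38)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of non-target pixels classified correctly, equation (39)."""
    return tn / (fp + tn)
```

With the made-up counts TP = 90, TN = 880, FP = 10, FN = 20, the detected foreground (100 pixels) is smaller than the expert foreground (110 pixels), so the first branch of equation (36) applies and RFAE = 10/110, while the accuracy is 970/1000.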
Table 4-1 shows that table detection performs very well on
these five assessment methods. And the accuracy is almost entirely in line with the
As shown in Table 4-2, after visual assessment by the experts, all results of net
detection are within a reasonable margin of error and are marked as True.
Here the results of ball detection, namely the centroid coordinates and the
placements, are marked on the table tennis match films. The centroid coordinates are
marked with green boxes, and the placements with red boxes. The experts then
count the centroid coordinates and placements that are within a reasonable
margin of error, as well as the balls and placements that the experts themselves
mark in the films, so that we can calculate the accuracy by
equation (37). The results are shown in Fig.4-3, Fig.4-4, Fig.4-5, Fig.4-6, Fig.4-7,
Fig.4-3 Movie(a)-white, the experiment results of table detection
Fig.4-4 Movie(a)-yellow, the experiment results of table detection
Fig.4-5 Movie(b)-white, the experiment results of table detection
Fig.4-6 Movie(b)-yellow, the experiment results of table detection
Fig.4-7 Movie(c)-white, the experiment results of table detection
Fig.4-8 Movie(c)-yellow, the experiment results of table detection
Fig.4-9 Movie(d)-white, the experiment results of table detection
Fig.4-10 Movie(d)-yellow, the experiment results of table detection
Table 4-3 The comparison results of ball detection
Movie                   TP+TN+FN+FP   TP+TN   Accuracy (%)
Movie(a)-white ball     852           783     91.90
Movie(a)-yellow ball    963           892     92.62
Movie(b)-white ball     743           620     83.44
Movie(b)-yellow ball    841           759     90.24
Movie(c)-white ball     925           868     93.83
Movie(c)-yellow ball    864           823     95.25
Movie(d)-white ball     792           721     91.03
Movie(d)-yellow ball    833           793     95.20
As shown in Table 4-3 and Table 4-4, the accuracy of ball detection and placement
detection is more than 90% in the test movies, with the exception of Movie(b) using
the white ball, whose accuracy is relatively low compared with the other table tennis
match films. This is because the white wall occupies a large part of the background in
that film. When the white ball is in front of the white wall, it is difficult to detect
the centroid coordinates of the ball. With fewer accurate centroid coordinates,
fewer samples can be used for placement detection, which lowers
the accuracy of the placements. Overall, there are still a very high accuracy rate for
Here we compare the results. We can see that Movie(b) using the white ball is affected
by the lower accuracy of the placements; thus, the result of the regular system differs
from the result judged by the referee. Conversely, owing to the high accuracy of the
placements, the experiment results for the other test table tennis match films are
more accurate.
4.6 The efficacy of ATTMU
Finally, we assess the efficacy of ATTMU. Table 4-6 shows that ATTMU performs
well and spends little time on detection, because we narrow the detection range in ball
detection and thereby avoid time-consuming whole-image processing. Combining the results
in Table 4-5 and Table 4-6, we can confirm that ATTMU has the advantage of high
Chapter 5 Conclusions and Future Prospects
5.1 Conclusions
The thesis proposes an automatic table tennis match umpiring system (ATTMU)
that achieves high accuracy in a variety of environments. The system is divided into
four stages: table detection, net detection, ball detection, and the regular system.
Table detection cuts out the table area. Net detection derives the
equation of the net from the table area. Ball detection detects the centroid
coordinates of the ball, and then uses the centroid coordinates and the table area to
identify the placements. The regular system takes the placements from the previous
step and checks them against the rules of A11 to obtain the final scores.
environments with the two major table tennis balls specified in international
competitions, the white ball and the yellow ball. We then compare the
results, namely the edge of the table, the equation of the net, the centroid coordinates
of the ball, and the centroid coordinates of the placements, with the experts’ judgment,
and assess the performance of ATTMU. The experiment results show that regardless of
specificity, ME, RFAE, accuracy in table detection, and the accuracy in ball detection,
5.2 Future Prospects
The experiment results for most of the test films have high accuracy. However, if the
contrast between the background and the color of the ball is not large enough, as with
the white ball against the white wall, the accuracy of ball detection declines,
and the results of ATTMU become inconsistent with the
actual scores. In addition, the films we made do not include situations in which the ball
touches the net or the edge of the table, nor curveballs, and ATTMU is suitable only for
singles. In the future, we will make more films covering more of the circumstances that
may happen. We also expect to improve the accuracy of ball detection by selecting a
camcorder with higher resolution or by adding more detailed steps to ball detection in
future research. Hope