
Department of Computer Science and Engineering, National Chung Hsing University

Master's Thesis

桌球比賽裁判自動系統
An Automatic Table Tennis Match Umpiring System

Advisor: Shyr-Shen Yu (喻石生)
Advisor: Yung-Kuan Chan (詹永寬)
Graduate Student: Pei-Han Wang (王珮翰)

July 2014

摘要 (Abstract)

At present, the refereeing of table tennis matches relies mainly on the human eye to decide whether a point is scored. However, misjudgments can occur when the ball is hit too fast or because of the referee's personal factors. It is therefore necessary to build a complete and accurate automatic table tennis umpiring system. Among related applications, the Hawk-Eye system used in tennis is the most famous; it tracks and records the path of the ball, displays a graphical image of the recorded actual path, and can also predict the ball's future path.

This thesis applies this general concept to table tennis matches. A top-hat filter is used to sharpen the edges of the table, and morphology together with connected component labeling segments the table from the frame. For net segmentation, morphology combined with linear regression yields the straight-line equation of the net. Next, the background gray levels are subtracted from the foreground gray levels to retain moving objects, and morphology together with local area detection raises the accuracy of detecting the ball center. Finally, the three resulting features (the table region, the centroid of the ball placement, and the straight-line equation of the net) are fed into the proposed table tennis rule system to effectively decide whether a point is scored. The videos were shot with a JVC-GZ-GX1 camcorder and compared against human judgment for table segmentation, ball centroid detection, placement detection, and scoring; the average accuracies are 0.9919, 0.8488, 0.8460, and 0.8322, respectively. The experimental results show that the proposed algorithm achieves a reasonable level of accuracy for ball detection and automatic scoring.

Keywords: automatic table tennis umpiring, Hawk-Eye system, trajectory detection, linear regression, morphology

Abstract
Current refereeing of table tennis matches relies on the human eye to determine the score. However, fast strokes and the referee's personal factors may cause misjudgments. It is therefore necessary to establish a complete and accurate automated refereeing system for table tennis matches. The most famous related application is Hawk-Eye, which visually tracks the trajectory of the ball and displays a record of its statistically most likely path as a moving image.

This thesis applies this general concept to table tennis matches. First, a top-hat filter is used to sharpen the edges of the table tennis table. Morphology and connected component labeling are then applied to detect the table. Net detection uses morphology and linear regression to find the net's linear equation. After that, the background intensities are subtracted from the foreground intensities to retain the moving objects, and morphological and local area detection improve the accuracy of ball center detection. Finally, the above results, namely the table region, the linear equation of the net, and the ball center, are imported into the automated refereeing system to decide whether a point is scored. The videos were shot with a JVC-GZ-GX1 camcorder and compared with human judgment on the following measures: the accuracy of table detection, ball detection, placement detection, and scoring. The average accuracies are 0.9919, 0.8488, 0.8460, and 0.8322, respectively. Experimental results show that the proposed algorithm achieves high accuracy.

Keywords: automated refereeing system for table tennis, Hawk-Eye, trajectory detection, linear regression, morphology

Table of Contents
摘要 (Abstract)
Abstract
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
    1.1 Background
    1.2 Motivation and goal
    1.3 Environment Settings
    1.4 Organization
Chapter 2 Related Works
    2.1 Morphology
        2.1.1 Erosion
        2.1.2 Dilation
        2.1.3 Opening and Closing
    2.2 Area Filling Algorithm
    2.3 Connected Component Labeling
    2.4 Otsu's Thresholding
    2.5 Linear Regression
Chapter 3 ATTMU System
    3.1 Table detection
    3.2 Net detection
    3.3 Ball detection
        3.3.1 Part A-Candidate point detection
        3.3.2 Part B-Local area detection
        3.3.3 Placement detection
    3.4 Regular system
        3.4.1 Effective service judgment
        3.4.2 Effective service return judgment
        3.4.3 Reset serving point
Chapter 4 Results and Discussion
    4.1 Experimental evaluation methods
    4.2 The experiment results of Table Detection
    4.3 The experiment results of Net Detection
    4.4 The experiment results of Ball Detection
    4.5 The experiment result of Regular System
    4.6 The efficacy of ATTMU
Chapter 5 Conclusions and Future Prospects
    5.1 Conclusions
    5.2 Future Prospects
Reference

List of Tables
Table 4-1 The assessment results of table detection
Table 4-2 The assessment results of net detection
Table 4-3 The comparing results of the ball detection
Table 4-4 The comparing results of the placement detection
Table 4-5 The final results of the referee system
Table 4-6 The cost times of ATTMU

List of Figures
Fig.1-1 Three main objects of the referee system of table tennis games
Fig.1-2 The final position of the camera
Fig.1-3 The environment of table tennis games in this thesis
Fig.2-1 Three possible states of B[x]
Fig.2-2 The schematic diagram of erosion
Fig.2-3 The schematic diagram of dilation
Fig.2-4 The schematic diagram of opening
Fig.2-5 The schematic diagram of closing
Fig.2-6 The schematic diagram of the area filling algorithm
Fig.2-7 The relationship between P and the pixels around P
Fig.2-8 Connected component labeling in the binary image
Fig.2-9 A linear regression equation having a variable
Fig.3-1 The flow chart of the ATTMU system
Fig.3-2 The flow chart of table detection
Fig.3-3 The structuring element: disc
Fig.3-4 The process of the table detection
Fig.3-5 The flow chart of net detection
Fig.3-6 The process of the net detection
Fig.3-7 The flow chart of ball detection
Fig.3-8 The flow chart of ball detection in Part A
Fig.3-9 The process of the ball detection in Part A
Fig.3-10 The flow chart of ball detection in Part B
Fig.3-11 The process of the ball detection in Part B
Fig.3-12 The feature of the placement
Fig.3-13 The flow chart of regular system
Fig.3-14 The schematic diagram of effective service
Fig.3-15 The schematic diagram of effective and invalid service return
Fig.4-1 The experiment results of table detection
Fig.4-2 The experiment results of net detection
Fig.4-3 Movie(a)-white, the experiment results of ball detection
Fig.4-4 Movie(a)-yellow, the experiment results of ball detection
Fig.4-5 Movie(b)-white, the experiment results of ball detection
Fig.4-6 Movie(b)-yellow, the experiment results of ball detection
Fig.4-7 Movie(c)-white, the experiment results of ball detection
Fig.4-8 Movie(c)-yellow, the experiment results of ball detection
Fig.4-9 Movie(d)-white, the experiment results of ball detection
Fig.4-10 Movie(d)-yellow, the experiment results of ball detection

Chapter 1 Introduction

1.1 Background

Table tennis is one of the most popular sports in the world. There are no age or

gender barriers. One can play table tennis according to his own capabilities and

limitations, and still be competitive. He does not have to worry about those bruises or

even broken bones that he can get in contact sports. Even many athletes with

disabilities can compete on equal terms with able-bodied athletes at table tennis. He

can also play table tennis all year round, day or night, and don't have to worry about

bad weather or covering up to keep harmful UV rays off him. He does not have to

spend a lot of money to play table tennis either. A huge amount of space is not needed

to have fun playing table tennis at home. Table tennis is easy to play, yet difficult to

master. You will always have another challenge to look forward to, and another

mountain to climb. Table tennis is great for getting up a sweat and getting the heart

rate up. It is also good for the brain, especially for older people. Table tennis is a wonderful sport to

take up for life [1].

Table tennis match umpiring is a very demanding task, since a table tennis umpire needs to make accurate judgments about the moving ball and its environment, which involve a series of fast actions whose legitimacy is strictly governed by the Laws of
Table Tennis [2]. A more pragmatic approach [3] is to employ computerized tools

capable of making accurate and fast measurements of the ball moving, to aid the

umpire in making correct decisions.

An intuitive, non-disruptive way of evaluating the ball-moving environment is to capture the moving ball, table, and net, called objects of interest (OOIs), using a video camera and then to detect and track the OOIs in real time on a frame-wise basis.

However, accurately segmenting and tracking the OOIs in match situations is

extremely challenging for a myriad of reasons, including:

 The ball travels very fast. A high shutter speed camera is necessary; otherwise,

the object can become blurred, color faded, and distorted in shape.

 Besides the ball, many moving objects, such as players, table-tennis bats, and the

crowd, may appear in the video.

 Uneven color and illumination exist since some objects are moving.

 The ball, table, and net may be blocked by the player, the bat, clothing, or other objects; moreover, the ball may disappear from view when it moves too high or too low.

 When the contrast between OOI and others is low, the OOI may become

indistinguishable.

 An OOI and others with similar color, size and shape may be confused.

 The size of the ball may be only a few percent of the frame size.

 In a real time application, detecting and tracking the ball must be as fast as

possible.

The study proposes an automatic table tennis match umpiring system (ATTMU

system) to extract the OOIs from a table tennis competition video, and analyze the

actions of the extracted objects, judge whether all the actions comply with the rules,

and then suggest a recommendation for the umpire to consider. The ATTMU system will assist umpires in making correct rulings quickly and precisely in a table tennis game.

One of the official table tennis rules, the Articles of the 11-point scale (A11) [2], is as follows:

1. When the server serves, the server must hit the ball so that the ball can hit the

table on the server side of the net, and then the ball can go over or around the net

before it hits the table on his opponent's side of the net.

2. When one player serves or fights back, his opponent must hit the ball over the

net and let the ball hit the opponent's court.

3. The winner of a game is the player who first scores 11 points unless both players

score 10 points. If both players score 10 points, then the game is won by the first

player gaining a lead of 2 points.

In this research, the table tennis rules will be used to investigate the performance of

the ATTMU system by experiments.

1.2 Motivation and goal

An automated referee for table tennis games [4] is a difficult task. On the one hand, the shooting angle makes placement judgment difficult, and the athletes' positions may also cover part of the desktop, which can cause misjudgment. On the other hand, because of the ball deformation caused by filming, it is not easy to find the correct ball position at each time point. Furthermore, because the lighting changes in the environment, it is much more difficult to detect the ball. Therefore, choosing the right camcorder and shooting angle is important in the automated detection of table tennis games.

In the approach [4], shooting modes have taken into double and single camcorder

shooting, as follows:
1. Double-camcorder shooting: One camcorder is set up next to the table, and the

other camcorder is set up over the table. The advantage is that we can have the

clearer and actual positions of table tennis balls. Also, the shooting angles of the

camcorder set up next to the table will not affect the placement of judgment.

2. Single-camcorder shooting: The camcorder is set up next to the table. The

advantage is that it only needs to process the images captured by one video

camera, so it has the higher processing speed.

The goal of this thesis is to establish a timely referee system. Detecting from two camcorders simultaneously is a major problem, so we use a single camcorder to record the films. After several experiments, we found the most appropriate way to set up the camcorder, so that a single camcorder can minimize the misjudgment of the actual location of the ball.

In a table tennis tournament video, there are three main OOIs: the ball, the table, and the net; Figure 1-1 shows the three OOIs. First, the ATTMU system extracts the three OOIs from each frame of the table tennis tournament video, and then judges whether the actions comply with the rules of A11 according to the spatial relationship between the ball and the table as well as that between the ball and the net. We determine the placement from where the ball hits the desktop, and then use the placement and the net to determine whether a service return is effective. Finally, we can judge which athlete scores. Thus, detecting these objects accurately is very important.

Fig.1-1 Three main objects of the referee system of table tennis games;

1.3 Environment Settings

Because the system is expected to adapt to a variety of table tennis competition environments, we chose four match environments, each with both the yellow ball and the white ball, the two major ball types specified for international competitions, as shown in Fig.1-3. So we obtain a total of eight films to be tested. This thesis uses a JVC-GZ-GX1 camcorder for shooting. The automatic mode is used, and the shutter speed is set to 1/30, which means thirty images are shot in one second, for the purpose of improving the clarity of the ball and the accuracy. After several tests and comparisons of various setups, the final position of the camcorder is aligned with the net and 175 cm away from the desktop. The camcorder is mounted 200 cm above the desktop, and the angle between the desktop and the tripod is 35 degrees, as shown in Fig.1-2.

Fig.1-2 The final position of the camera

Fig.1-3 The environment of table tennis games in this thesis

1.4 Organization

In subsequent chapters, the second chapter will introduce the existing methods

which this thesis has used, including morphology, area filling algorithm, connected

component labeling, Otsu’s thresholding, and linear regression. The third chapter is

the main method of this thesis, and it can be divided into the table tennis desktop

detection, the linear equation of the table tennis net detection, and the center of the

table tennis ball detection. And we import the results obtained in these steps into the

referee system of the table tennis games in order to judge the scores. The fourth chapter presents the experimental results. The fifth chapter gives the conclusion and future work.

Chapter 2 Related Works

This subsection briefly reviews some techniques which will be applied to this

thesis.

2.1 Morphology

Morphology [5] is a method that analyzes geometric structure based on set algebra. The basic morphological operators are as follows:

1. Erosion

2. Dilation

3. Opening

4. Closing

Morphological image processing is based on these four operators, from which other morphological algorithms are derived. This thesis uses the top-hat filter, whose principle is to subtract the opened image from the original image. The four basic operators are introduced as follows.

2.1.1 Erosion

Erosion [5] makes the objects in a binary image shrink or thin. The way and

extent of the shrinkage is controlled by a structuring element. Using the structural

elements of different sizes can remove the objects of different sizes. In addition, if

there is a small link between two objects, we can separate the two objects by erosion.

First, we have an image A and a structuring element B. Then, B is moving on the

image A. There are three possible states as follows:


1. B[x] ⊆ A;

2. B[x] ⊆ 𝐴𝑐 ;

3. Both B[x] ∩ A and B[x] ∩ 𝐴𝑐 are not null.

Fig.2-1 Three possible states of B[x]

The first condition shows that B[x] has the closest relation with image A. So, the

point x which satisfies the first condition is said to belong to the erosion of A by B, denoted by A ⊖ B. It is defined as follows:

A ⊖ B = { z | (B)z ⊆ A }    (1)

(𝐵)𝑧 means the set of all points z, and the equation represents the displacement of B

contained in A.

Fig.2-2 shows an example of erosion. The solid lines in (b) represent the limit of

movement of the structuring element B. When B moves out over the line, B can’t be

entirely contained in A.

Fig.2-2 The schematic diagram of erosion; (a) image A and structuring element B;

(b) the result that A is eroded by B

2.1.2 Dilation

Dilation [5] makes the objects in a binary image larger or thicker. The way and extent of the expansion is also controlled by a structuring element. Assume that the

target image is called A and a structuring element is B. The condition that A dilates by

B is denoted by A⊕B. It is defined as follows:


(2)
𝐴 ⊕ 𝐵 = { 𝑧|(𝐵̂)𝑧 ∩ 𝐴 ≠ ∅ }

(𝐵̂ ) represents the reflection of B. It is defined as follows:


𝐵̂ = { 𝑤|𝑤 = −𝑏,𝑤ℎ𝑒𝑛 𝑏 ∈ 𝐵 } (3)

Equation (3) means that B̂ is the reflection of B about the origin, and (B̂)z denotes this reflection shifted by z units. Fig.2-3 shows an example of dilation. The

solid lines in (b) represent the limit of movement of the structuring element B. When

B moves out over the line, it will lead to the situation that the intersection of A and B

is an empty set. Thus, the area inside the solid lines represents that A dilates by B.

Fig.2-3 The schematic diagram of dilation; (a) image A and structuring element
B; (b) the result that A is dilated by B

2.1.3 Opening and Closing

Opening [5] is constituted from two basic operators, including erosion and dilation.

It is denoted by A。B. It is defined as follows:


A。B = (A ⊖ B) ⊕ B    (4)

Here we define the opening which is between image A and structuring element B.

First, we erode A by B. Next, we dilate the result by B. Because A is first eroded by B,

the narrow part of A will be cut off. Then, we use dilation to smooth the truncated A.

So, opening can be used to cut off the narrow part of the image. It is shown as Fig.2-4.

Fig.2-4 The schematic diagram of opening; (a) original image; (b) structuring
element; (c) the result after erosion; (d) the result after (c) is dilated by (b)

Closing [5] between image A and structuring element B is denoted by A・B, It is

defined as follows:
A・B = (A ⊕ B) ⊖ B    (5)

Here we dilate A by B. And then, we erode the result A by B. Because A is first dilated

by B, we can fill the small gaps in the image A. Next, the erosion we used can smooth

the image A. The difference between opening and closing is that closing can be used

to fill the narrow gaps in the image. It is shown as Fig.2-5.

Fig.2-5 The schematic diagram of closing; (a) original image; (b) structuring
element; (c) the result after dilation; (d) the result after (c) is eroded by (b)
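To make the operators of Sections 2.1.1-2.1.3 concrete, the following is a minimal sketch using SciPy's binary morphology routines. The toy image, the 3×3 cross-shaped structuring element, and the variable names are assumptions of this example, not values taken from the thesis.

```python
# Minimal sketch of erosion, dilation, opening, and closing on a toy binary image.
import numpy as np
from scipy import ndimage

A = np.zeros((40, 40), dtype=bool)
A[10:30, 10:30] = True                       # a square object
A[19:21, 30:36] = True                       # thin protrusion, removed by opening

B = ndimage.generate_binary_structure(2, 1)  # 3x3 cross-shaped structuring element

eroded  = ndimage.binary_erosion(A, structure=B)   # shrink objects, eq. (1)
dilated = ndimage.binary_dilation(A, structure=B)  # grow objects, eq. (2)
opened  = ndimage.binary_opening(A, structure=B)   # erosion then dilation, eq. (4)
closed  = ndimage.binary_closing(A, structure=B)   # dilation then erosion, eq. (5)

# White top-hat used later in Chapter 3: the original minus its opening
# (for grayscale images, scipy.ndimage.white_tophat plays the same role).
tophat = A & ~opened
```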

2.2 Area Filling Algorithm

A hole can be considered an area of background surrounded by pixels of foreground. The area filling algorithm [6] uses dilation, complementation, and intersection to fill the hole in the image. Assume A is the set of foreground pixels of the image. The iteration is defined as follows:

Xk = (Xk−1 ⊕ B) ∩ Aᶜ,   k = 1, 2, 3, ⋯    (6)

where X0 contains the starting pixel P inside the hole, and B is the structuring element. If Xk = Xk−1, the algorithm stops iterating. Finally, we take the union of Xk and A to fill the hole in the image.

Fig.2-6 The schematic diagram of the area filling algorithm; (a) original image; (b) structuring element; (c) X0 = P; (d) X1; (e) union of X3 and A
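A minimal sketch of this iteration is given below, assuming a seed pixel inside the hole is known and using SciPy's binary dilation; the function name and the toy image are illustrative only.

```python
# Minimal sketch of the area filling iteration in equation (6).
import numpy as np
from scipy import ndimage

def fill_hole(A, seed):
    """Fill the hole of binary image A that contains the given seed pixel."""
    B = ndimage.generate_binary_structure(2, 1)
    X = np.zeros_like(A, dtype=bool)
    X[seed] = True                                       # X0 = P, inside the hole
    while True:
        X_next = ndimage.binary_dilation(X, B) & ~A      # (X ⊕ B) ∩ A^c
        if np.array_equal(X_next, X):                    # stop when Xk = Xk-1
            break
        X = X_next
    return X | A                                         # union of the hole and A

A = np.zeros((9, 9), dtype=bool)
A[2:7, 2:7] = True
A[3:6, 3:6] = False                                      # a 3x3 hole inside the square
filled = fill_hole(A, seed=(4, 4))
```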

2.3 Connected Component Labeling

Connected component labeling [7] distinguishes the regions in the binary image.

The connected regions have the same labels. The different regions are given different

labels. Connected component labeling is divided into 4-connected labeling and

8-connected labeling. 4-connected labeling means that if the pixels around a pixel in four directions (up, down, right, and left) have the same intensity as that pixel, we give them the same label. 8-connected labeling additionally includes the four diagonal directions. In this thesis, we use 8-connected labeling. The algorithm is divided into two steps as follows.

Step 1

Scan the image from left to right and top to bottom. Let P be the pixel currently being scanned. If P equals 0, we scan the next point. If P equals 1, we judge as follows:

 If the pixels around P, namely q, r, s, t, are all 0, we give P a new label.

 If exactly one of the pixels q, r, s, t is 1, we give P the same label as that pixel.

 If two or more of the pixels q, r, s, t are 1, no matter whether the labels of these pixels are the same or not, we give P one of their labels, and the other labels of these pixels are recorded in the same equivalence class.

Fig.2-7 The relationship between P and the pixels around P

Step 2

Give each equivalence class a unique label, and then replace the label of each pixel in the image with the label of the class that the pixel belongs to. Fig.2-8(b) shows that the labels of some pixels are not yet correct after step 1; after step 2, we get the correct labels.

Fig.2-8 Connected component labeling in the binary image; (a) binary image; (b) the result after step 1; (c) the result after step 2
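In practice the same result can be obtained with a library routine; the following minimal sketch uses SciPy's 8-connected labeling instead of the two-pass algorithm described above, and the toy image is only an illustration.

```python
# Minimal sketch of 8-connected component labeling (Section 2.3) with SciPy.
import numpy as np
from scipy import ndimage

img = np.array([[1, 1, 0, 0, 1],
                [0, 1, 0, 1, 1],
                [0, 0, 0, 0, 0],
                [1, 0, 1, 1, 0]], dtype=bool)

# A 3x3 all-ones structure makes the labeling 8-connected (diagonals included).
labels, num = ndimage.label(img, structure=np.ones((3, 3)))
print(num)      # number of connected regions
print(labels)   # every region carries its own integer label
```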

2.4 Otsu’s Thresholding

Otsu’s thresholding method [8] specifies the threshold to transform the

grayscale image into a binary image. Here we introduce the process of automatic

threshold selection as follows:

1. Select the initial value T, which is usually the median intensity of the image.

2. Separate all the pixels in the image into two classes by T: 𝐺1 and 𝐺2. 𝐺1 is the set of the pixels whose intensities are less than T, and 𝐺2 is the set of the pixels whose intensities are greater than T.

3. Compute the means of 𝐺1 and 𝐺2.

4. Compute the new threshold T as the average of the two means.

5. Repeat steps 2 to 4 until the difference between successive values of T is smaller than a predefined value.

Otsu’s thresholding is a method based on the histogram. The probability of a pixel with gray-level 𝑟𝑞 in the image can be computed as follows.

p(𝑟𝑞) = 𝑛𝑞 / n,   q = 0, 1, 2, …, L − 1    (7)

Where 𝑛𝑞 denotes the number of pixels with gray-level 𝑟𝑞 and n is the total number of pixels in the gray-level image. Assume that the threshold k has been chosen. 𝐺1 is the set of pixels with gray levels in [0, 1, …, k − 1], and 𝐺2 is the set of pixels with gray levels in [k, k + 1, …, L − 1]. Otsu's thresholding chooses the threshold k that maximizes the between-class variance 𝜎𝐵², which is defined as follows.

𝜎𝐵² = 𝜔1 (𝜇1 − 𝜇𝑟)² + 𝜔2 (𝜇2 − 𝜇𝑟)²    (8)

Where,

𝜔1 = ∑_{q=0}^{k−1} p(𝑟𝑞)    (9)

𝜔2 = ∑_{q=k}^{L−1} p(𝑟𝑞)    (10)

𝜇1 = ∑_{q=0}^{k−1} 𝑟𝑞 p(𝑟𝑞) / 𝜔1    (11)

𝜇2 = ∑_{q=k}^{L−1} 𝑟𝑞 p(𝑟𝑞) / 𝜔2    (12)

𝜇𝑟 = ∑_{q=0}^{L−1} 𝑟𝑞 p(𝑟𝑞)    (13)
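A minimal sketch of this threshold search, computing the between-class variance of equation (8) for every candidate k and keeping the maximizer, is shown below; the 256-bin histogram and the function name are assumptions for 8-bit images.

```python
# Minimal sketch of Otsu's threshold selection following equations (7)-(13).
import numpy as np

def otsu_threshold(gray, L=256):
    hist, _ = np.histogram(gray, bins=L, range=(0, L))
    p = hist / hist.sum()                       # eq. (7): p(r_q) = n_q / n
    r = np.arange(L)                            # gray levels r_q
    mu_r = (r * p).sum()                        # eq. (13): global mean
    best_k, best_var = 0, -1.0
    for k in range(1, L):
        w1 = p[:k].sum()                        # eq. (9)
        w2 = 1.0 - w1                           # eq. (10)
        if w1 == 0 or w2 == 0:
            continue
        mu1 = (r[:k] * p[:k]).sum() / w1        # eq. (11)
        mu2 = (r[k:] * p[k:]).sum() / w2        # eq. (12)
        var_b = w1 * (mu1 - mu_r) ** 2 + w2 * (mu2 - mu_r) ** 2   # eq. (8)
        if var_b > best_var:
            best_var, best_k = var_b, k
    return best_k

gray = (np.random.rand(64, 64) * 255).astype(np.uint8)
binary = gray >= otsu_threshold(gray)
```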

2.5 Linear Regression

Linear regression [9] models a scalar dependent variable as a linear combination of one or more explanatory variables weighted by regression coefficients. A linear regression equation with one variable represents a straight line. Most commonly, linear regression refers to a model constructed from a dataset in which both x and y are given. After developing such a model, if an additional value of x is given without its accompanying value of y, the fitted model can be used to predict the value of y.

Fig.2-9 A linear regression equation having a variable

Here we analyze linear regression as follows. In order to find the regression line, we need a dataset in which x and y are given:

(x1, y1), (x2, y2), …, (xm, ym)

And then, assume a straight line containing the unknowns α and β:

f(x) = α + βx    (14)

Define the error value 𝜀𝑖 and the squared error 𝐸2 (𝛼, 𝛽) of the model:
𝜀𝑖 = 𝑦𝑖 − 𝑓(𝑥𝑖 ) (15)

E2(α, β) = ∑_{i=1}^{m} εi²    (16)

If we can find α and β to let the 𝐸2 be the minimum value, this line is the linear

regression line. The goal of least square method is to find the line which lets 𝐸2 be

the minimum value. It is defined as follows:

1. Assume both partial derivatives of 𝐸2 equal 0. We can get two equations as

follows:
∂E2/∂α = −(2/m) ∑_{i=1}^{m} (yi − α − β xi) = 0
∂E2/∂β = −(2/m) ∑_{i=1}^{m} (yi − α − β xi) xi = 0    (17)

2. We can get α and β by the equations above. And we can also get α and β by

ordinary least squares estimator. It is defined as follows:


β̂m = ∑_{i=1}^{m} (xi − x̄m)(yi − ȳm) / ∑_{i=1}^{m} (xi − x̄m)²    (18)

α̂m = ȳm − β̂m x̄m    (19)
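A minimal sketch of the closed-form estimates in equations (18) and (19) is given below; the function name and the sample data are illustrative only.

```python
# Minimal sketch of the ordinary least squares estimates, eqs. (18)-(19).
import numpy as np

def fit_line(x, y):
    """Return (alpha, beta) such that y ≈ alpha + beta * x."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    beta = ((x - x_bar) * (y - y_bar)).sum() / ((x - x_bar) ** 2).sum()  # eq. (18)
    alpha = y_bar - beta * x_bar                                          # eq. (19)
    return alpha, beta

alpha, beta = fit_line([0, 1, 2, 3], [1.1, 2.9, 5.2, 7.1])
```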

Chapter 3 ATTMU System

The ATTMU system consists of four approaches: table detection, net detection, ball

detection, and regular system, and the flow chart is shown as Fig.3-1. The table

detection, net detection, and ball detection approaches are to extract the table, net, and

ball from each frame of the table tennis competition video. The regular system

approach is to judge the actions according to the competition rules of A11 based on

the spatial relationship of ball and table as well as that of ball and net. The ATTMU

system first transforms each frame of a color table tennis competition video into a

gray-level frame. Let 𝑅𝐺𝐵 be one frame of a color table tennis competition video, and 𝑅𝐺𝐵(𝑖, 𝑗) be the pixel located at coordinates (𝑖, 𝑗) on 𝑅𝐺𝐵. The corresponding gray-level frame 𝑓0 can be obtained by equation (20).

𝑓0(𝑖, 𝑗) = 0.299 × 𝑅(𝑖, 𝑗) + 0.587 × 𝐺(𝑖, 𝑗) + 0.114 × 𝐵(𝑖, 𝑗)    (20)

where 𝑅(𝑖, 𝑗), 𝐺(𝑖, 𝑗), and 𝐵(𝑖, 𝑗) are the R, G, and B color components of 𝑅𝐺𝐵(𝑖, 𝑗) [10]. The system contains the rules of the table tennis game. We combine

these three features, including the table region, the equation of the net, the centroid

coordinates of the ball and this system to determine which athlete scores.
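As an illustration of equation (20), the following is a minimal sketch of the gray-level conversion with NumPy; the array layout (H×W×3 in RGB channel order) and the variable names are assumptions of this example, not part of the thesis.

```python
# Minimal sketch of the RGB-to-grayscale conversion in equation (20).
import numpy as np

def to_gray(frame_rgb):
    r = frame_rgb[:, :, 0].astype(np.float64)
    g = frame_rgb[:, :, 1].astype(np.float64)
    b = frame_rgb[:, :, 2].astype(np.float64)
    gray = 0.299 * r + 0.587 * g + 0.114 * b      # eq. (20)
    return gray.astype(np.uint8)

frame = (np.random.rand(480, 640, 3) * 255).astype(np.uint8)   # stand-in frame
gray = to_gray(frame)
```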

A. Table detection: The goal is to cut out the table region in the film, so that we can determine whether the placement of the ball is located in the table region. The balls located in the table region are considered as the placements of the balls, and we can import the placements into the regular system.

B. Net detection: The goal is to find the equation of the net. We can substitute the placements of the balls into the equation of the net and determine whether the ball is located on the right side or the left side of the net, in order to determine whether the service is effective.


C. Ball detection : Detect the location of the ball in the film. Then we use this and

the table region to determine the placement of the ball.

D. Regular system: The system contains the rules of the table tennis game. We

combine these three features, including the table region, the equation of the net,

the centroid coordinates of the ball and this system to determine which athlete

scores.

Fig.3-1 The flow chart of the ATTMU system

3.1 Table detection

Table detection is to accurately cut off the tennis table from the frame of the table

tennis competition video. The step can be divided into two parts as follows:

Fig.3-2 The flow chart of table detection

In a table tennis game, since the camcorder is generally stationary, the ATTMU

system extracts the table only from the first frame 𝑓 of the gray-level table tennis

competition video. Since white lines 2 cm wide are painted along each edge of a

tennis table, and a center line 3 mm wide parallel to the side lines divides the playing

surface in half, the ATTMU system then uses top-hat filter technique to enhance the

brighter edge of the table.

The top-hat filter employs the ranked value from two different size regions: the brightest value in a circular interior region is compared with the brightest value in a surrounding annular region. If the brightness difference is greater than a threshold level, the pixel is kept; otherwise it is erased. Let 𝐸 be a grayscale structuring element. The top-hat transform of 𝑓 is given by equation (21).

top-hat(𝑓) = 𝑓 − 𝑓 。𝐸    (21)

where “。” denotes the opening operation. The white top-hat transform returns an

image, containing those “objects” of an input image that are smaller than the

structuring element and brighter than their surroundings.

Fig. 3-4(a) is a frame of a color table tennis competition video. Fig. 3-4(b) is the
image obtained by executing top-hat filtering operation on the gray-level image of Fig.

3-4(a). From Fig. 3-4(b) one can obviously observe that the top-hat filtering operation

can effectively highlight the boundary of the upper table surface. In the top-hat filtered image, the pixels on the boundary of the upper table surface are generally brighter. Otsu's thresholding method is hence used to decide a suitable threshold 𝑜 to generate a binary image 𝑏 by equation (22).

𝑏(𝑖, 𝑗) = { 0, if the top-hat value at (𝑖, 𝑗) < 𝑜;  1, otherwise }    (22)

where 0-bit represents the black pixel and 1-bit represents the white pixel. Fig. 3-4(c)

is the binary image obtained from the image in Fig. 3-4(b).

Next, closing operation is adopted to connect the disconnected object boundary

on 𝑏. In the closing operation, the ATTMU system executes dilation operation and

then the erosion operation based on the structuring element shown in Fig. 3-3. Then the biggest region in 𝑏 is considered to be the extracted upper table surface. Fig. 3-4(d) depicts the extracted upper table surface on the image in Fig. 3-4(a).
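The table detection steps described above (top-hat filtering, Otsu thresholding, closing, and keeping the biggest region) can be sketched as follows with OpenCV and NumPy; the 15×15 elliptical kernel and the function names are assumptions of this example, since the thesis does not specify implementation details.

```python
# Minimal sketch of the table detection pipeline of Section 3.1.
import cv2
import numpy as np
from scipy import ndimage

def detect_table(gray):
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
    tophat = cv2.morphologyEx(gray, cv2.MORPH_TOPHAT, kernel)      # eq. (21)
    _, binary = cv2.threshold(tophat, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # eq. (22)
    closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)      # reconnect edges
    labels, _ = ndimage.label(closed > 0, structure=np.ones((3, 3)))
    counts = np.bincount(labels.ravel())
    counts[0] = 0                                 # ignore the background label
    biggest = int(np.argmax(counts))
    return labels == biggest                      # binary mask of the table region
```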

Fig.3-3 The structuring element: disc

Fig.3-4 The process of the table detection: (a) original image; (b) the result after the top-hat filter; (c) the binary image after Otsu's thresholding; (d) the result after the closing operation and keeping the biggest region

3.2 Net detection

Fig.3-5 The flow chart of net detection

Net posts are set apart for a doubles court. The net detection approach is to detect

the net location and use an equation to describe the location. To extract the net, the

ATTMU system first cuts off a small region (a rectangle) 𝑅𝑛 which contains the net,

and then derives the location of the net from the small region. Let 𝑏 consist of

M×N pixels. The central pixel of 𝑏 is located at the coordinates (𝑖𝑐 , 𝑗𝑐 ) by the

equation (23).
(𝑖𝑐, 𝑗𝑐) = (⌊(M+1)/2⌋, ⌊(N+1)/2⌋)    (23)
Since the net is located at the middle of the table, the ATTMU system thinks that

the top-left corner of 𝑅𝑛 is located at (𝑖𝑐 − ⌊(M+1)/10⌋, 𝑗𝑐 − ⌊(N+1)/10⌋) and the bottom-right corner is located at (𝑖𝑐 + ⌊(M+1)/10⌋, 𝑗𝑐 + ⌊(N+1)/10⌋). The 𝑅𝑛 cut off from the

image 𝑏 is shown in Fig. 3-6(a).

After that, Otsu’s thresholding method is also applied to convert 𝑅𝑛 into a

binary image 𝑅𝑏 . Fig. 3-6(b) delineates the 𝑅𝑏 obtained from the 𝑅𝑛 in Fig. 3-6(a).
Besides, the thinning operation is used to trim all the lines on 𝑅𝑏 to single pixel

thickness and the result is shown in Fig. 3-6(c). Then, the 1-bit pixels in 𝑅𝑏 are input

to derive the linear equation of the net by linear regression method.
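A minimal sketch of this net detection procedure is given below, under the assumption that the net appears roughly vertical in the frame, so the column coordinate is regressed on the row coordinate; the library choices (OpenCV, scikit-image, NumPy) and the function name are assumptions, not the thesis implementation.

```python
# Minimal sketch of Section 3.2: crop R_n, binarize with Otsu, thin, fit a line.
import cv2
import numpy as np
from skimage.morphology import skeletonize

def detect_net(gray):
    M, N = gray.shape
    ic, jc = (M + 1) // 2, (N + 1) // 2            # central pixel, eq. (23)
    di, dj = (M + 1) // 10, (N + 1) // 10
    Rn = gray[ic - di:ic + di, jc - dj:jc + dj]    # rectangle around the net
    _, Rb = cv2.threshold(Rn, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    thin = skeletonize(Rb > 0)                     # trim lines to one-pixel width
    rows, cols = np.nonzero(thin)
    beta, alpha = np.polyfit(rows, cols, 1)        # col = beta * row + alpha
    # Net line a*x + b*y + c = 0 in full-frame coordinates (x = column, y = row),
    # after shifting back by the crop offsets.
    a, b = 1.0, -beta
    c = -(alpha + (jc - dj) - beta * (ic - di))
    return a, b, c
```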


Fig.3-6 The process of the net detection

3.3 Ball detection

Fig.3-7 The flow chart of ball detection

The goal of this step is to detect the location of the ball in the film. It can be divided

into two parts, including Part A, and Part B. We detect the entire image until we find

the coordinates of the ball in Part A. After that, we only detect the area expanding

from the coordinates of the ball by a range in Part B. The moving range between the
ball in this frame and the ball in the next frame isn’t too wide. So, if we detect around

the coordinates of the ball by a narrow range, we can not only reduce the execution

time but also improve the detection accuracy. If the ball is not detected in Part B, we

step back to Part A. Else, we repeat Part B.

3.3.1 Part A-Candidate point detection

Fig.3-8 The flow chart of ball detection in Part A

The method is to find the objects similar to ball and their centroid coordinates. Next,

they can provide these features to detect the ball by a narrow range in Part B.

 Step 1,Remove the background, and remain moving objects:


The balls in the film are the moving objects, so we only have to remain these

objects to do the subsequent image processing. First we consider the grayscale image 𝐹 of the current frame as the foreground, and the first grayscale image 0 as the background. Because the moving objects are mainly from the foreground image, we have to highlight the high-luminance portion of the foreground. In this case, we keep the areas with the stronger contrast in the image 𝐹. We obtain the image 𝑁𝐹 by using Otsu's thresholding method to create a binary image of 𝐹. The result is shown in Fig.3-9(a).

After that, we subtract the intensities of the foreground from the intensities of the background, which gives the initial background-removed image 𝐹𝐵. Similarly, we obtain the image 𝑁𝐵 by using Otsu's thresholding method to create a binary image of 𝐹𝐵. The result is shown in Fig.3-9(b). We then intersect 𝑁𝐹 and 𝑁𝐵 to get the final background-removed image; the purpose is to enhance the contrast of the initial background-removed image. The result is shown in Fig.3-9(c). In this way we retain the moving objects initially.
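The background-removal step just described can be sketched as follows, assuming the first grayscale frame serves as the static background; the function and variable names are illustrative only and use OpenCV's built-in Otsu thresholding.

```python
# Minimal sketch of Step 1: keep moving objects by double Otsu + intersection.
import cv2

def moving_object_mask(frame_gray, background_gray):
    # Binarize the current frame to keep its high-contrast (bright) areas.
    _, nf = cv2.threshold(frame_gray, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Foreground-minus-background difference keeps pixels that changed.
    diff = cv2.subtract(frame_gray, background_gray)
    _, nb = cv2.threshold(diff, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Intersection of the two binary images retains the moving objects.
    return cv2.bitwise_and(nf, nb)
```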

 Step 2,Remain the objects similar to ball, and identify the


candidate point coordinates:
The objects similar to the ball are determined by area and shape; we only keep the objects whose area and shape are most similar to the ball. First, we consider the area of the object. The area of the ball and the area of the table show a certain ratio [3], and this ratio is around 1/4500 to 1/4000. So, we use this range as the threshold to remove the regions which are too large or too small. It is defined by equation (24).

region = { 0, if area < (1/4500) × 𝐴𝑖  or  area > (1/4000) × 𝐴𝑖;  1, otherwise }    (24)

where 𝐴𝑖 denotes the area of the table region.

Then, we consider the shape of the object. We use a value d to determine whether the remaining regions are nearly circular. The value d is defined by equation (25).

d = 4π × area / perimeter²    (25)

region = { 0, if d > 2;  1, otherwise }    (26)

If the d is closer to 1, it means the region is nearly circular. Conversely, if the d is

far from 1, it means the region is absolutely not circular. Thus, we remove the region

whose d is greater than 2. The remaining regions are the candidate objects similar to

the ball. The result is shown at Fig. 3-9(c). And we use connected component labeling

to label these regions, where 𝐶𝑧(𝑖, 𝑗) is the zth candidate centroid of the labeled objects similar to the ball.
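The area and circularity tests of equations (24)-(26) can be sketched as follows with scikit-image region properties; the function name, the use of regionprops, and the argument table_area (the pixel count of the table region from Section 3.1) are assumptions of this example.

```python
# Minimal sketch of candidate filtering by area ratio and circularity.
import numpy as np
from skimage.measure import label, regionprops

def ball_candidates(moving_mask, table_area):
    lo, hi = table_area / 4500.0, table_area / 4000.0        # eq. (24) bounds
    candidates = []
    for region in regionprops(label(moving_mask, connectivity=2)):
        if not (lo <= region.area <= hi):
            continue                                          # too large or too small
        if region.perimeter == 0:
            continue
        d = 4.0 * np.pi * region.area / (region.perimeter ** 2)   # eq. (25)
        if d > 2:                                             # eq. (26): not circular
            continue
        candidates.append(region.centroid)                    # (row, col) of C_z
    return candidates
```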

 Step 3,Detection range expansion:


There is usually more than one object similar to the ball, so we have to examine each such object and find the most similar one. We do not need to scan the entire image while examining the objects similar to the ball; instead, we can expand a detection range around the center of each object. This not only reduces the execution time but also improves the detection accuracy. Here, we locate each 𝐶𝑧(𝑖, 𝑗) in the grayscale image 0. Next, we expand from 𝐶𝑧(𝑖, 𝑗) by a range of

𝑚2 × 𝑛2 . It’s defined by equation (27).

𝑚2 = M / 10,   𝑛2 = N / 10    (27)

The size of the detection range only needs to be slightly larger than the size of the ball. The result is shown in Fig.3-9(e). Then, we use Otsu's thresholding method

to create binary images of the range areas. The result is shown at Fig.3-9(f). Similar to

Step 2, we also consider the objects similar to the ball by area and shape. We remove

the regions which are too large or too small by equation (24) in these binary images of

the range areas, and these images are regarded as 𝐵𝑧. We compute the value d of the objects in 𝐵𝑧 by equation (25) and find the image whose d is the closest to 1. The purpose is to find the object which is the most similar to the ball. Finally, we define this object as the ball and find its centroid coordinate 𝐹𝐶(𝑖, 𝑗).


Fig.3-9 The process of the ball detection in Part A

3.3.2 Part B-Local area detection

Fig.3-10 The flow chart of ball detection in Part B

The moving range between the balls in two consecutive images isn’t too wide.

If we continue to use Part A to detect the whole image, it can be time consuming. So,

Part B has been proposed to improve Part A. The method is to narrow the detection

range around the ball location of the previous image. If the ball is detected in the

following image, we can continue to use the procedure to detect. It can not only

reduce the execution time but also improve the detection accuracy.

 Step 1,Detection range expansion:


First, we find the 𝐹𝐶 (𝑖, 𝑗) from the next grayscale image 1 . And expand from

𝐹𝐶 (𝑖, 𝑗) by the range of 𝑚3 × 𝑛3 by equation (28). Because the moving range


between the balls in two consecutive images isn’t too wide, we only have to detect in

this range.

𝑚3 = M / 3,   𝑛3 = N / 3    (28)

 Step 2,Remove the background, and remain moving objects:


This procedure is similar to the first step in Part A. The purpose is to enhance the

contrast of the moving objects, so we can cut out of the ball. And we do the same two

actions in Part A. The first action is that we use Otsu’s thresholding method to create a

binary image of the range area. The result is shown at Fig.3-11(a). In order to remove

the background, the second step is to subtract the intensities of the ball in the range

area from the intensities of the ball of the first frame. Also, we use Otsu’s thresholding

method to create a binary image of the background moving image. Finally, we

intersect both of the binary images to get the moving objects remaining image 𝑀𝐵 .

 Step 3,Remain the objects similar to ball, and identify the


centroid coordinates:
This procedure is similar to the second step in Part A. The purpose is to find the

object which is most similar to the ball. The object similar to the ball can be

considered by area and shape. Considered by area, we remove the regions which are

too large or too small in the image 𝑀𝐵 by equation (24). And considered by shape,

we compute the value d of the objects in 𝑀𝐵 by equation (25) and keep the object whose d is the closest to 1. The result is shown in Fig.3-11(b). Finally, we define

the remaining object as the ball, so we can get the centroid coordinate 𝐹𝐶 (𝑖, 𝑗).


Fig.3-11 The process of the ball detection in Part B

3.3.3 Placement detection

The method is to find the placement of the ball by the centroid coordinate 𝐹𝐶 (𝑖, 𝑗)

in ball detection.

Fig.3-12 The feature of the placement: (a) the (k−1)th centroid coordinate; (b) the kth centroid coordinate; (c) the (k+1)th centroid coordinate

Fig.3-12 shows that the y value of the ball first increases and then declines at a placement, as in (b). Here we use the following steps to determine whether the kth centroid coordinate is a placement or not.

 Step 1, determine whether 𝑰𝑭𝑪 (𝒊, 𝒋) is located in the table region


𝑨𝒊 .

First, we take the intersection of the candidate point 𝐹𝐶(𝑖, 𝑗) and the table region 𝐴𝑖, where 𝐴𝑖 is the set of the coordinates in the table region. If the intersection is empty, 𝐹𝐶(𝑖, 𝑗) is not located on the table and is not taken into account. Then, the remaining centroid coordinates detected in ball detection will be
stored in the array CSet[ ], which means the set of total centroid coordinates of the

ball. It is expressed by the equation (29),

CSet[ ] = [ 𝐹𝐶1^{n1}(𝑥𝑐1, 𝑦𝑐1), 𝐹𝐶2^{n2}(𝑥𝑐2, 𝑦𝑐2), …, 𝐹𝐶𝑘^{nk}(𝑥𝑐𝑘, 𝑦𝑐𝑘) ],   if 𝐹𝐶(𝑖, 𝑗) ∈ 𝐴𝑖    (29)

Where 𝑥𝑘 , 𝑦𝑘 mean the x-coordinates and the y-coordinates of the kth centroid

coordinate and 𝑛𝑘 means the frame number of 𝐹𝐶 (𝑖, 𝑗). And the frame number will

be used in regular system.

 Step 2, determine whether 𝑰𝑭𝑪𝒌 (𝒙𝒌 , 𝒚𝒌 ) is the placement.

Then, we store the remaining centroid coordinates of the ball which meet the

following condition in the array PSet[ ], which means the set of total centroid

coordinates of the placement. It is expressed by equation (30),

PSet[ ] = [ 𝑃1^{n1}(𝑥𝑃1, 𝑦𝑃1), 𝑃2^{n2}(𝑥𝑃2, 𝑦𝑃2), …, 𝑃𝑚^{nk}(𝑥𝑃𝑚, 𝑦𝑃𝑚) ]    (30)

𝑃𝑚^{nk}(𝑥𝑃𝑚, 𝑦𝑃𝑚) = 𝐹𝐶𝑘^{nk}(𝑥𝑐𝑘, 𝑦𝑐𝑘),   if 𝑦𝑐𝑘 − 𝑦𝑐𝑘−1 > 0 and 𝑦𝑐𝑘+1 − 𝑦𝑐𝑘 < 0

Where 𝑥𝑚 , 𝑦𝑚 mean the x-coordinates and the y-coordinates of the mth


placement. If the conditions are satisfied, we define 𝑃𝑚^{nk}(𝑥𝑚, 𝑦𝑚) as a placement.
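A minimal sketch of this placement test, under the assumption that the centroids are given as (frame number, x, y) tuples in temporal order and that the table region is available as a boolean mask, is:

```python
# Minimal sketch of the placement test in equation (30).
def detect_placements(centroids, table_mask):
    """centroids: list of (frame_no, x, y); table_mask[y][x] is True on the table."""
    inside = [(n, x, y) for (n, x, y) in centroids
              if table_mask[int(y)][int(x)]]             # step 1: build CSet, eq. (29)
    placements = []
    for k in range(1, len(inside) - 1):
        _, _, y_prev = inside[k - 1]
        n, x, y = inside[k]
        _, _, y_next = inside[k + 1]
        if y - y_prev > 0 and y_next - y < 0:            # eq. (30): local maximum of y
            placements.append((n, x, y))
    return placements
```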

3.4 Regular system

Fig.3-13 The flow chart of regular system

As shown in Fig.3-13, the regular system combines the three features obtained from the above methods, including the table region 𝐴𝑖, the equation of the net 𝐸𝑞, and the centroid coordinates of the placements 𝑃𝑚^{nk}(𝑥𝑃𝑚, 𝑦𝑃𝑚), with two judgment methods, effective service judgment and effective service return judgment, to determine the scores and the winner. Finally, we reset the serving point to the next centroid coordinate of the ball, and repeat the above actions until the end of the game.

3.4.1 Effective service judgment

The definition of an effective service is that the ball hit by the server must first hit the server's desktop, and the ball must then go over or around the net before it hits the table on the opponent's side. Thus, the features needed by this method are the first centroid coordinate of the ball 𝐹𝐶1^{n1}(𝑥𝑐1, 𝑦𝑐1), the first and the second centroid coordinates of the placements 𝑃1^{n1}(𝑥𝑃1, 𝑦𝑃1) and 𝑃2^{n2}(𝑥𝑃2, 𝑦𝑃2), and the equation of the net 𝐸𝑞: ax + by + c = 0. We determine an effective service by these three features.

First, we examine the serving side. The first centroid coordinate of the ball 𝐹𝐶1^{n1}(𝑥𝑐1, 𝑦𝑐1) is considered as the serving point 𝑆𝐶(𝑖, 𝑗), and we do the following actions to judge an effective service.

Fig.3-14 The schematic diagram of effective service; (a) serve from the right side;
(b) the first placement hit on the right side of the desktop; (c) the second placement hit
on the left side of the desktop

Fig.3-14 shows that if the first placement 𝑃1^{n1}(𝑥𝑃1, 𝑦𝑃1) is located on the same side as the serving point 𝑆𝐶(𝑖, 𝑗), and the second placement 𝑃2^{n2}(𝑥𝑃2, 𝑦𝑃2) is located on the other side from the first placement, we can regard this situation as an effective service. Concretely, if the value obtained by substituting 𝑆𝐶(𝑖, 𝑗) into the equation of the net 𝐸𝑞 and the value obtained by substituting the first placement 𝑃1^{n1}(𝑥𝑃1, 𝑦𝑃1) into 𝐸𝑞 have the same sign, the server has hit his own desktop successfully, by equation (31). Otherwise, it is an invalid service, and the receiver scores.

effective service, if (a𝑥𝐶1 + b𝑦𝐶1 + c) × (a𝑥𝑃1 + b𝑦𝑃1 + c) > 0
invalid service,  if (a𝑥𝐶1 + b𝑦𝐶1 + c) × (a𝑥𝑃1 + b𝑦𝑃1 + c) < 0    (31)

If these results meet the definition of an effective service, we perform the following check. Similarly to the above, if the value obtained by substituting the second placement 𝑃2^{n2}(𝑥𝑃2, 𝑦𝑃2) into the equation of the net 𝐸𝑞 and the value obtained by substituting the first placement 𝑃1^{n1}(𝑥𝑃1, 𝑦𝑃1) into 𝐸𝑞 have different signs, the server has hit his own desktop and then the opponent's desktop successfully, by equation (32). The situation then fulfills the definition of an effective service, so we can execute the subsequent method of effective service return judgment. Otherwise, it is an invalid service, and the receiver scores.

effective service, if (a𝑥𝑃1 + b𝑦𝑃1 + c) × (a𝑥𝑃2 + b𝑦𝑃2 + c) < 0
invalid service,  if (a𝑥𝑃1 + b𝑦𝑃1 + c) × (a𝑥𝑃2 + b𝑦𝑃2 + c) > 0    (32)
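The sign tests of equations (31) and (32), and of equation (33) in the next subsection, can be sketched as follows; the function names and the tuple representations are assumptions of this example.

```python
# Minimal sketch of the side-of-net sign tests used by the regular system.
def side(point, net):
    """Sign of a*x + b*y + c; net = (a, b, c) from Section 3.2."""
    a, b, c = net
    x, y = point
    return a * x + b * y + c

def effective_service(serve_pt, first_placement, second_placement, net):
    same_side = side(serve_pt, net) * side(first_placement, net) > 0        # eq. (31)
    crossed = side(first_placement, net) * side(second_placement, net) < 0  # eq. (32)
    return same_side and crossed

def effective_return(prev_placement, placement, net):
    return side(prev_placement, net) * side(placement, net) < 0             # eq. (33)
```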

3.4.2 Effective service return judgment

If the service is confirmed to be effective after the steps above, the subsequent placements are imported into the method of effective service return judgment. The definition of an effective service return is that after one player serves or returns, the other player must hit the ball over the net and make it land on the opponent's desktop. The features needed by this method are the subsequent placements 𝑃𝑚^{nk}(𝑥𝑚, 𝑦𝑚), the mth centroid coordinates of the placements, and the equation of the net 𝐸𝑞.

Fig.3-15 The schematic diagram of effective and invalid service return; from (a) to (b) is an effective service return; from (c) to (d) is an invalid service return

Fig.3-15 shows that if the placement 𝑃𝑚^{nk}(𝑥𝑚, 𝑦𝑚) is located on the other side from the previous placement 𝑃𝑚−1^{nk−1}(𝑥𝑚−1, 𝑦𝑚−1), we can regard this situation as an effective service return. Conversely, if 𝑃𝑚^{nk}(𝑥𝑚, 𝑦𝑚) is located on the same side as 𝑃𝑚−1^{nk−1}(𝑥𝑚−1, 𝑦𝑚−1), it is an invalid service return. The previous method, effective service judgment, uses the first and the second centroid coordinates of the placements 𝑃1^{n1}(𝑥𝑃1, 𝑦𝑃1) and 𝑃2^{n2}(𝑥𝑃2, 𝑦𝑃2). If the server serves successfully, we substitute the subsequent placements into the equation of the net 𝐸𝑞, multiply each pair of consecutive values, and require the product to be negative, which ensures that the ball hits the other player's desktop; we then regard it as an effective service return.

This product is updated as the subsequent placements are substituted into the equation of the net 𝐸𝑞 one after another. Once the product becomes positive, it indicates that the placement 𝑃𝑚^{nk}(𝑥𝑚, 𝑦𝑚) is located on the same side as the previous placement 𝑃𝑚−1^{nk−1}(𝑥𝑚−1, 𝑦𝑚−1). This situation is an invalid service return, so the opponent scores. It is expressed by equation (33).


effective service return, if (a𝑥𝑃𝑚−1 + b𝑦𝑃𝑚−1 + c) × (a𝑥𝑃𝑚 + b𝑦𝑃𝑚 + c) < 0
invalid service return,  if (a𝑥𝑃𝑚−1 + b𝑦𝑃𝑚−1 + c) × (a𝑥𝑃𝑚 + b𝑦𝑃𝑚 + c) > 0    (33)

Case 1 and Case 2 in Fig.3-13 show the two situations determined by the effective service return judgment. They are defined as follows.

 Case 1: When the value obtained by substituting 𝑃𝑚^{nk}(𝑥𝑚, 𝑦𝑚) into the equation of the net 𝐸𝑞 turns the product positive and the value obtained by substituting 𝑆𝐶(𝑖, 𝑗) into 𝐸𝑞 is greater than 0, the server has made an invalid service return, and the receiver scores.

 Case 2: When the value obtained by substituting 𝑃𝑚^{nk}(𝑥𝑚, 𝑦𝑚) into the equation of the net 𝐸𝑞 turns the product positive and the value obtained by substituting 𝑆𝐶(𝑖, 𝑗) into 𝐸𝑞 is less than 0, the receiver has made an invalid service return, and the server scores.

3.4.3 Reset serving point

If either athlete scores, we regard the next placement that has not yet been substituted into the equation of the net as the new first placement. For example, if the placement 𝑃𝑚^{nk}(𝑥𝑚, 𝑦𝑚) has been substituted into the net equation 𝐸𝑞 when an athlete scores, then we set 𝑃𝑚+1^{nk+1}(𝑥𝑚+1, 𝑦𝑚+1) as the new first placement. Also, the corresponding centroid coordinate of the ball 𝐹𝐶𝑘^{nk}(𝑥𝑐𝑘, 𝑦𝑐𝑘) is regarded as the new serving point 𝑆𝐶(𝑖, 𝑗). This completes the reset of the serving point.

Chapter 4 Results and Discussion

4.1 Experimental evaluation methods

In the experiment, the results of each of these methods, including the edge of the

table, the equations of the nets, the centroid coordinates of the balls, the centroid

coordinates of the placements, will be marked on the films of the table tennis games, and we will assess the accuracy by the following methods. We will compare the edge of the

table which is the result in table detection with the experts’ hand-painted desktop. And

compute the true positive (TP), true negative (TN), false positive (FP), false negative

(FN) of the result in table detection. Also the experts make a judgment about the

results of net detection. If the positions of the net detection results are located within a

reasonable margin of error, the experts denote it as True. Otherwise, it’s defined as

False. Then, we compute the accuracy of the centroid coordinates of the balls and

placements by equation (34).


accuracy = 𝑛𝑅 / 𝑛𝐺    (34)

where 𝑛𝑅 is the number of detected balls and placements that the experts judge to be correct, and 𝑛𝐺 is the total number of balls and placements that the experts mark in the films. Finally, we compare the results of the regular system and

the result which is judged by referee on the spot to see whether the system can make a

judgment correctly.

There are five assessment methods of the table detection, including

Misclassification error (ME), relative foreground area error (RFAE), accuracy,

sensitivity, and specificity. They are defined as follows [11].


1. ME = (FN + FP) / (TP + FN + TN + FP) = 1 − (TP + TN) / (TP + FN + TN + FP)    (35)

TP (true positive): the number of target pixels correctly classified as target

FP (false positive): the number of non-target pixels incorrectly classified as target

TN (true negative): the number of non-target pixels correctly classified as non-target

FN (false negative): the number of target pixels incorrectly classified as non-target

The smaller ME value means the error number of samples is fewer, which has the

higher accuracy of classification.

2. RFAE = (FN − FP) / (TP + FN),  if (FP + TP) < (TP + FN)
          (FP − FN) / (FP + TP),  if (FP + TP) ≥ (TP + FN)    (36)

The smaller RFAE value means it has better results of segmentation.

3. Accuracy = (TP + TN) / (TP + TN + FN + FP)    (37)

where the denominator is the total number of pixels, and the numerator is the number of pixels that agree with the experts' classification. Thus, the

larger value means the higher accuracy of classification.

4. Sensitivity = TP / (TP + FN)    (38)

Sensitivity means the ratio of the correct number of samples in target sample

classification. So, the larger value means it has better results of segmentation.

5. Specificity = TN / (FP + TN)    (39)

Specificity means the ratio of the correct number of samples in non-target sample

classification. So, the larger value means it has better results of segmentation.
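A minimal sketch that computes the five measures of equations (35)-(39) from the TP, FP, TN, and FN counts is given below; the function name is illustrative only.

```python
# Minimal sketch of the five assessment measures, eqs. (35)-(39).
def assessment(tp, fp, tn, fn):
    total = tp + fp + tn + fn
    me = (fn + fp) / total                                   # eq. (35)
    if (fp + tp) < (tp + fn):
        rfae = (fn - fp) / (tp + fn)                         # eq. (36), first case
    else:
        rfae = (fp - fn) / (fp + tp)                         # eq. (36), second case
    accuracy = (tp + tn) / total                             # eq. (37)
    sensitivity = tp / (tp + fn)                             # eq. (38)
    specificity = tn / (fp + tn)                             # eq. (39)
    return me, rfae, accuracy, sensitivity, specificity
```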

4.2 The experiment results of Table Detection

Fig.4-1 The experiment results of table detection

Table 4-1 The assessment results of table detection

Table 4-1 shows that the table detection has a very excellent performance in

these five assessment methods. And the accuracy is almost entirely in line with the

correct desktop painted by the experts.

4.3 The experiment results of Net Detection

Fig.4-2 The experiment results of net detection

Table 4-2 The assessment results of net detection

As shown in Table 4-2, after being visually assessed by the experts, the results of net detection are all within a reasonable margin of error and are marked as True.

4.4 The experiment results of Ball Detection

Here the results of ball detection, such as the centroid coordinates and the

placement are marked in the table tennis match films. The centroid coordinates are

marked by the green box, and the placements are marked by the red box. Then, the

experts compute the number of the centroid coordinates and placements which are

within a reasonable margin of error, and also compute the number of the balls and

placements the experts mark in the films. So we can calculate the accuracy by

equation (37). The results are shown at Fig.4-3, Fig.4-4, Fig.4-5, Fig.4-6, Fig.4-7,

Fig.4-8, Fig.4-9, and Fig.4-10.

Fig.4-3 Movie(a)-white, the experiment results of ball detection

Fig.4-4 Movie(a)-yellow, the experiment results of ball detection

Fig.4-5 Movie(b)-white, the experiment results of ball detection

Fig.4-6 Movie(b)-yellow, the experiment results of ball detection

Fig.4-7 Movie(c)-white, the experiment results of ball detection

Fig.4-8 Movie(c)-yellow, the experiment results of ball detection

Fig.4-9 Movie(d)-white, the experiment results of ball detection

Fig.4-10 Movie(d)-yellow, the experiment results of ball detection

Table 4-3 The comparing results of the ball detection
Movie Num. TP+TN+FN+FP TP+TN Accuracy (%)
Movie(a)-white ball 852 783 91.90
Movie(a)-yellow ball 963 892 92.62
Movie(b)-white ball 743 620 83.44
Movie(b)-yellow ball 841 759 90.24
Movie(c)-white ball 925 868 93.83
Movie(c)-yellow ball 864 823 95.25
Movie(d)-white ball 792 721 91.03
Movie(d)-yellow ball 833 793 95.20

Table 4-4 The comparing results of the placement detection


Movie Num. TP+TN+FN+FP TP+TN Accuracy (%)
Movie(a)-white ball 38 35 92.10
Movie(a)-yellow ball 42 39 92.85
Movie(b)-white ball 29 25 86.20
Movie(b)-yellow ball 37 37 100.00
Movie(c)-white ball 48 46 95.83
Movie(c)-yellow ball 43 40 93.02
Movie(d)-white ball 35 33 94.28
Movie(d)-yellow ball 41 39 95.12

As shown in Table 4-3 and Table 4-4, the accuracies of ball detection and placement detection are more than 90% in most of the eight test movies. However, the accuracy of Movie(b) using the white ball is relatively low compared with the other table tennis match films. This is because the white wall occupies a great part of the background in that film; when the white ball is located in front of the white wall, it is difficult to detect the centroid coordinates of the ball. The lower number of accurately detected centroid coordinates leaves fewer samples for placement detection, which leads to lower placement accuracy. Overall, there is still a very high accuracy rate for most of the test films in ball detection and placement detection.


4.5 The experiment result of Regular System

Table 4-5 The final results of the referee system


Movie Num. Actual score Proposed method
Movie(a)-white ball 7:11 7:11
Movie(a)-yellow ball 8:11 8:11
Movie(b)-white ball 11:5 10:5
Movie(b)-yellow ball 9:11 9:11
Movie(c)-white ball 6:11 6:11
Movie(c)-yellow ball 11:13 11:13
Movie(d)-white ball 11:8 11:8
Movie(d)-yellow ball 11:9 11:9

Here we compare the results. We can see that Movie(b) using the white ball is affected by the lower accuracy of placement detection; thus, the result of the regular system differs from the score judged by the referee. Conversely, thanks to the high accuracy of the placements, the experimental results for the other test table tennis match films agree with the actual scores.

4.6 The efficacy of ATTMU

Table 4-6 The cost times of ATTMU


Movie Num. Total cost time (s) Cost time per frame (s)
Movie(a)-white ball 583 0.3993
Movie(a)-yellow ball 712 0.4102
Movie(b)-white ball 636 0.4030
Movie(b)-yellow ball 843 0.3881
Movie(c)-white ball 667 0.4238
Movie(c)-yellow ball 962 0.3957
Movie(d)-white ball 923 0.4112
Movie(d)-yellow ball 879 0.4581

Finally, we assess the efficacy of ATTMU. Table 4-6 shows that ATTMU performs well and spends little time on detection, because we narrow the detection range in ball detection and thus avoid time-consuming full-image searches. Integrating the results in Table 4-5 and Table 4-6, we can confirm that ATTMU has the advantages of high accuracy and low processing time.

Chapter 5 Conclusions and Future Prospects

5.1 Conclusions

This thesis proposes an automatic table tennis match umpiring system (ATTMU) that works in a variety of environments with high accuracy. The system is divided into four stages: table detection, net detection, ball detection, and the regular system. The purpose of table detection is to cut out the table area. Net detection detects the equation of the net from the table area. Ball detection detects the centroid coordinates of the ball, and then uses the centroid coordinates and the table area to identify the placements. Finally, the regular system takes the placements from the previous step and checks them against the rules of A11, so we can get the final scores.

In the experimental results, we use eight films recorded in four different

environments with two major specified table tennis balls in the international

competitions, including the white ball and the yellow ball. Then, we compare the

results, such as the edge of table, the equations of the nets, the centroid coordinates of

the balls, the centroid coordinates of the placements with experts’ judgment. Then we

assess the performance of the ATTMU. The experimental results show that in terms of specificity, ME, RFAE, and accuracy for table detection, as well as the accuracy of ball detection, placement detection, and the regular system, ATTMU performs outstandingly.

5.2 Future Prospects

The experimental results of most of the test films show high accuracy. But if the contrast between the background and the color of the ball is not large enough, the accuracy of ball detection will decline, as with the white wall and the white ball. This can lead the results of ATTMU to be inconsistent with the actual scores. Also, the films we recorded do not include situations such as the ball touching the net or the table edge, or curveballs, and ATTMU currently handles only singles matches. In the future, we will record more films covering more of the circumstances that may occur. We also expect to improve the accuracy of ball detection by selecting a camcorder with higher resolution or by adding more detailed steps to ball detection in future research. We hope to overcome a variety of environmental constraints and achieve the goal of umpiring scores accurately.

Reference

[1] S. T. Rodrigues, J. N. Vickers, and A. M. Williams, "Head, eye and arm coordination in table tennis," Journal of Sports Sciences, vol. 20, pp. 187-200, 2002.

[2] X. Zhang, "Analysis of the effect of new competition rules on table tennis technique," Journal of Anhui Sports Science, vol. 01, 2002.

[3] B. Zhang, W. Chen, W. Dou, Y. Zhang, and L. Chen, "Content-based table tennis games highlight detection utilizing audiovisual clues," Fourth International Conference on Image and Graphics (ICIG 2007), pp. 833-838, 2007.

[4] W. Chen and Y. Zhang, "Tracking ball and players with applications to highlight ranking of broadcasting table tennis video," Computational Engineering in Systems Applications, IMACS Multi-conference, pp. 1896-1903, 2006.

[5] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed., Prentice-Hall, 2002.

[6] C. Ballester, V. Caselles, J. Verdera, M. Bertalmio, and G. Sapiro, "A variational model for filling-in gray level and color images," Proc. Int. Conf. Computer Vision, pp. 10-16, 2001.

[7] S. W. Yang, M. H. Sheu, H. H. Wu, H. E. Chien, P. K. Weng, and Y. Y. Wu, "VLSI architecture design for a fast parallel label assignment in binary image," IEEE International Symposium on Circuits and Systems, pp. 2393-2396, 2005.

[8] N. Otsu, "A threshold selection method from gray-level histograms," IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62-66, 1979.

[9] H. Tanaka, S. Uejima, and K. Asai, "Linear regression analysis with fuzzy model," IEEE Transactions on Systems, Man, and Cybernetics, vol. 12, pp. 903-907, 1982.

[10] T. Horiuchi and S. Hirano, "Colorization algorithm for grayscale image by propagating seed pixels," Proc. IEEE International Conference on Image Processing (ICIP), vol. 1, pp. 457-460, 2003.

[11] ITU-R Rec. BT.1210-3, "Test materials to be used in subjective assessment," February 2004.

