Jeng-Shyang Pan
Jianpo Li
Pei-Wei Tsai
Lakhmi C. Jain Editors
Volume 157
Series Editors
Robert J. Howlett, Bournemouth University and KES International,
Shoreham-by-Sea, UK
Lakhmi C. Jain, Faculty of Engineering and Information Technology,
Centre for Artificial Intelligence, University of Technology Sydney,
Sydney, NSW, Australia
The Smart Innovation, Systems and Technologies book series encompasses the
topics of knowledge, intelligence, innovation and sustainability. The aim of the
series is to make available a platform for the publication of books on all aspects of
single and multi-disciplinary research on these themes in order to make the latest
results available in a readily accessible form. Volumes on interdisciplinary research
combining two or more of these areas are particularly sought.
The series covers systems and paradigms that employ knowledge and intelligence
in a broad sense. Its scope is systems having embedded knowledge and intelligence,
which may be applied to the solution of world problems in industry, the environment
and the community. It also focusses on the knowledge-transfer methodologies and
innovation strategies employed to make this happen effectively. The combination of
intelligent systems tools and a broad range of applications introduces a need for a
synergy of disciplines from science, technology, business and the humanities. The
series will include conference proceedings, edited collections, monographs, handbooks, reference books, and other relevant types of book in areas of science and
technology where smart systems and technologies can offer innovative solutions.
High quality content is an essential feature for all book proposals accepted for the
series. It is expected that editors of all accepted volumes will ensure that
contributions are subjected to an appropriate level of reviewing process and adhere
to KES quality principles.
** Indexing: The books of this series are submitted to ISI Proceedings,
EI-Compendex, SCOPUS, Google Scholar and Springerlink **
Editors
Advances in Intelligent
Information Hiding
and Multimedia Signal
Processing
Proceedings of the 15th International
Conference on IIH-MSP in conjunction
with the 12th International Conference
on FITAT, July 18–20, Jilin, China, Volume 2
Editors

Jeng-Shyang Pan
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao Shi, Shandong, China

Jianpo Li
Northeast Electric Power University, Chuanying Qu, Jilin, China

Pei-Wei Tsai
Swinburne University of Technology, Hawthorn, Melbourne, Australia

Lakhmi C. Jain
Centre for Artificial Intelligence, University of Technology Sydney, Sydney, NSW, Australia; Liverpool Hope University, Liverpool, UK; University of Canberra, Canberra, Australia; KES International, UK
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Conference Organization
Conference Founders
Honorary Chairs
Advisory Committees
General Chairs
Program Chairs
Publication Chairs
Finance Chairs
Program Committees
Committee Secretaries
Preface
Acknowledgements The IIH-MSP 2019 and FITAT 2019 Organizing Committees wish to
express their appreciation to Prof. Keun Ho Ryu from Chungbuk National University for his
contribution to organizing the conference.
Lakhmi C. Jain Ph.D., M.E., B.E. (Hons), Fellow (Engineers Australia), serves at
University of Technology Sydney, Australia, University of Canberra, Australia,
Liverpool Hope University, UK and KES International, UK. He founded KES
International to provide the professional community with opportunities for publication, knowledge exchange, cooperation, and teaming. Involving around 5000
1.1 Introduction
Over the past years, motorcycles and scooters have dominated in developing countries like Vietnam. However, as such countries become wealthier, they are likely to move toward car ownership, placing a great burden on already overcrowded roads [1]. In addition, the growing number of cars and their exhaust emissions cause air pollution, traffic jams, and an energy crisis. Ridesharing is believed to be the most effective strategy to achieve green and efficient transportation [2, 3].

T. H. N. Vu (B)
Faculty of Information Technology, University of Engineering and Technology, Vietnam National University, Hanoi, Vietnam
e-mail: vthnhan@vnu.edu.vn
Most existing ride-on-demand and activity-based travel demand services directly
use raw GPS data like coordinates and timestamps without much understanding.
These systems usually force the riders to adapt to their recommended travel routes
instead of receiving an itinerary based on their needs. These systems do not provide
much support in giving useful information about geospatial locations while the users
are traveling either. Naturally, before going to an unknown region, users wish to know which locations are the most interesting places in the region and which travel sequences they should follow. Ridesharing recommendation services
enable a group of people with similar frequent trips or similar activity preferences
to share a car [4, 5].
So far, there are two popular types of ridesharing, namely casual carpooling and real-time ridesharing. The former is usually used by commuters who share common routes, departing from public transit centers to work locations. However, a problem with casual carpooling is that it requires users to register in advance, and usually they already have some relationship, while in practice users often have spontaneous travel demands. Real-time ridesharing addresses this problem with the support of mobile devices and automated ride-matching algorithms, which enable participants to be organized only minutes before the trip begins or even while the trip is under way. Popular applications deployed lately include Uber [6]. However, most of these applications do not take common trips into account because they operate like traditional taxis, even though common trips could bring considerable value.
Profile matching is an approach to generating groups of participants. One of the most recent studies uses a social distance to measure the relationship between participants, but only the distance between home and office is discussed [7]. Another work introduces a time–space network flow technique to address the ridesharing problem using pre-matching information such as smoking, non-smoking, or gender of the participant [8]. However, knowledge of frequent routes is not included in this work.
In this paper, we propose a framework for ridesharing and location-based recommendation services that exploits knowledge discovered by spatiotemporal data mining techniques. Users can send a ride request at any time. Depending on the time the user needs a ride and his activity at the destination, the request can be executed immediately or deferred to construct an optimal rideshare, possibly suggesting a location for the demanded activity so that the ride fare is lowest.
1 A Framework for Ridesharing Recommendation Services

To receive a ridesharing recommendation from the system, the user must first send a request for a ridesharing service to the Premium Ridesharing Service. The pickup and drop-off locations as well as the validity period of the request are included. Users can select their pickup and drop-off locations from a list of frequent addresses. A default
time limit can be used in case there’s no validity period specified. The system will
process the request by sending the specified addresses to geocoding service and
get back the coordinates for them. All information regarding the request is then
sent to ridesharing engine. The request can be executed in online or offline fashion
depending on the specified validity period. Ridesharing engine calls the appropriate
algorithm to construct a rideshare which consists of users with common or similar
routes. The information about the rideshare is then sent to the scheduling/routing
engine (Fig. 1.1).
This section explains the basic concepts and trajectory preprocessing procedure with
semantic information.
Definition 1.1 (GPS Trajectory): a raw trajectory of a moving user is formally represented as a pair (oid, ⟨p_1, …, p_n⟩), in which oid is the moving-user identifier and ⟨p_i⟩ is the sequence of geographical points of the user. Each point p_i is represented by (x, y, t), in which x and y are the spatial coordinates and t is the timestamp at which the position is captured.
Definition 1.2 (Point Of Interest): POI is a geographical location where people can
perform an activity. Formally, POI is defined as a tuple {p, lbl, topic, T } where p
is a spatial location, lbl is the name of the POI, topic is a category assigned to POI
depending on the application, and T is the business hour represented by the time
interval [open, close].
Definition 1.3 (Stay point): A stay point is a geographic region in which the user stayed for a certain interval of time. Given a raw trajectory, stay points are detected with the use of two scale parameters, a temporal threshold θ_t and a spatial threshold θ_s. A stay point is characterized by a group of consecutive points P = {p_i, p_{i+1}, …, p_j} in which, for each i ≤ k ≤ j, the distance dist(p_i, p_k) between the two points p_i and p_k is less than the threshold θ_s and the time difference between the first and last points is greater than the threshold θ_t (i.e., dist(p_i, p_k) ≤ θ_s and p_j.t − p_i.t ≥ θ_t). Formally, the stay point is denoted by s = (x, y, beginT, endT), where

s.x = (Σ_{k=i}^{j} p_k.x) / (j − i + 1)   and   s.y = (Σ_{k=i}^{j} p_k.y) / (j − i + 1)

are the average coordinates of the points of the set P, and beginT = p_i.t and endT = p_j.t are the entering and leaving times of the user.
The first step is to detect the stay points from a user's raw trajectory. Usually, each stay point carries a particular semantic meaning, such as a restaurant, a rest area, or some other tourist attraction. Annotating each stay point with a POI and annotating each POI with a human activity can be done either manually or automatically. Given a trajectory of a user and the thresholds θ_t and θ_s, all of the stay points can be detected according to Definition 1.3.
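The detection procedure above can be sketched as follows. This is a minimal illustration, assuming points are (x, y, t) tuples with planar coordinates and t in seconds; a real system would use a geodesic distance on raw GPS coordinates, and the helper names are ours:

```python
def detect_stay_points(points, s_thresh, t_thresh):
    """Detect stay points following Definition 1.3.

    points   -- list of (x, y, t) tuples, t in seconds
    s_thresh -- spatial threshold (same unit as x, y)
    t_thresh -- temporal threshold in seconds
    Returns a list of stay points (x, y, beginT, endT).
    """
    def dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

    stays, i, n = [], 0, len(points)
    while i < n:
        j = i
        # Extend the group while every point stays within s_thresh of p_i.
        while j + 1 < n and dist(points[i], points[j + 1]) <= s_thresh:
            j += 1
        # Keep the group only if the user lingered long enough.
        if points[j][2] - points[i][2] >= t_thresh:
            group = points[i:j + 1]
            x = sum(p[0] for p in group) / len(group)
            y = sum(p[1] for p in group) / len(group)
            stays.append((x, y, points[i][2], points[j][2]))
        i = j + 1
    return stays
```

With a 10-unit spatial threshold and a 200-second temporal threshold, three points clustered near the origin over 300 seconds collapse into a single stay point at their average coordinates.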
Movement history is the set of locations that the user visited in geographical space over an interval of time. Here, a user's movement history MovH is represented by the sequence of stay points the user visited, with the corresponding entering time (beginT) and leaving time (endT). Therefore, we have MovH = ⟨s_1, s_2, …, s_n⟩.
After detecting all the stay points, with the definition of the user’s movement
history above we now move on to segmenting the user movement into a set of routine
routes. A routine route is defined as a regular behavior about spatial and temporal
aspects of a user who performs the trip on a daily basis. This task is tackled by splitting
the series of locations into individual routes that the user took at a predefined time
window tw.
In this step, the stay points of the routine routes are mapped onto a reference plane composed of geographical regions. In this study, we use the raster method to represent regions; that is, the reference space is decomposed into regular cells, so we call the reference plane a spatial grid. As a result, each stay of a user is represented by the cell in which the user remained for a time interval [beginT, endT].
Figure 1.2 illustrates stay points detected from a raw trajectory. Two stay points s1 and s2 are constructed from the two sets of points p1, p2, p3 and p8, p9, respectively.
The spatial grid is represented by a matrix D[nx, ny]. Since each cell corresponds to an element D[i, j], we label the cell Dij. A route can then be represented by a sequence of cell labels. For instance, for the route shown in Fig. 1.2, the sequence of stay points p0, s1, p5, p6, p7, s2, p10 can be converted into the series of cell labels D10, D20, D30, D31, D21, D11, D12, D13.
Users can move from a cell to its neighbors. From this idea, we can represent the grid as a directed graph whose vertices are cells; the connection between two cells is called an edge e. A route, consisting of a sequence of cell labels Dij, can then be represented by a sequence of edges e+.
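The conversion of a route into grid cells and directed edges can be sketched as below; the cell_of helper, the grid origin at (0, 0), and the cell size are illustrative assumptions:

```python
def cell_of(x, y, cell_size):
    # Map a coordinate to its grid cell label Dij (illustrative origin at 0, 0).
    return (int(x // cell_size), int(y // cell_size))

def route_to_edges(points, cell_size):
    """Convert a sequence of (x, y) points into a sequence of directed
    edges between distinct consecutive grid cells."""
    cells = []
    for x, y in points:
        c = cell_of(x, y, cell_size)
        if not cells or cells[-1] != c:   # drop repeats within one cell
            cells.append(c)
    return list(zip(cells, cells[1:]))    # directed edges e = (cell, cell)
```

Consecutive points falling in the same cell are collapsed, so the edge sequence reflects only cell-to-cell transitions.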
The algorithm in Fig. 1.3 reveals all of the stay points from a user's raw trajectory.
Fig. 1.3 Stay point detection: a raw GPS trajectory (points p0–p10) with stay points s1 and s2 mapped onto the spatial grid D[nx, ny]
With the routes obtained from the previous step the frequent routes can be discovered
using the algorithm introduced in [7].
The frequency f(e) of a directed edge e is defined as the number of routes passing through this edge. An edge is said to be qualified if its frequency is greater than a threshold α (i.e., f(e) > α).
The frequency of a route r is reflected by a measure, the route score Sr(), defined as follows:

Sr(r) = (Σ_{e∈r} h(f(e), α)) / n   (1.1)

in which n is the number of edges of r and h() is a membership function defined as h(x, θ) = 1 if x > θ, and 0 otherwise.
A route is said to be qualified if its score is greater than a threshold γ.
Generally, the algorithm for discovering frequent routes works as follows. First, all of the qualified edges are determined by computing the edge frequencies, and all edges whose frequency is greater than α are kept in a linked list named qEList. Second, from the qualified edges found, all of the qualified routes are discovered using the route score of Eq. (1.1) and stored in the list qRList. The third step determines which qualified edges are no longer frequent and removes them from qEList. Steps 2 and 3 are repeated until no more routes are removed from qRList. The remaining elements of qRList are the result of the algorithm.
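A sketch of this iterative pruning, under the thresholds α (edge frequency) and γ (route score), might look as follows; the full pseudocode is given in [7], so the details here are only our reading of the description above:

```python
from collections import Counter

def route_score(route, q_edges):
    # Eq. (1.1): fraction of the route's edges that are qualified.
    return sum(1 for e in route if e in q_edges) / len(route)

def frequent_routes(routes, alpha, gamma):
    """routes: list of routes, each a list of hashable edge identifiers."""
    freq = Counter(e for r in routes for e in r)
    q_edges = {e for e, f in freq.items() if f > alpha}          # step 1
    q_routes = routes
    while True:
        kept = [r for r in q_routes
                if route_score(r, q_edges) > gamma]              # step 2
        new_freq = Counter(e for r in kept for e in r)
        q_edges = {e for e in q_edges if new_freq[e] > alpha}    # step 3
        if len(kept) == len(q_routes):                           # fixpoint
            return kept
        q_routes = kept
```

The loop terminates because the set of surviving routes can only shrink from one iteration to the next.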
In response to the user request, the algorithm finds possible new destinations from the same category of POI and also proposes an adjustment to his or her original schedule, which allows the user to still perform the desired activity while being more flexible than under the original schedule.
For each route tr in the set of frequent routes FR, the algorithm first finds all the possible pickup cells on the route tr within the time interval [RR.deptT − tw, RR.arrT + tw]. Second, all of the cells containing the requested POI category (RR.POICat) that are traversed by the route tr during that interval
would be determined. All the possible pairs of pickup point and destination would
be sent to the user.
With this spatiotemporal service matching strategy, the user has more options when deciding how to perform his/her activity, and can be more flexible instead of sticking to the original schedule.
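The matching step described above can be sketched as follows; the request fields mirror RR.deptT, RR.arrT, and RR.POICat from the text, while the timed-cell route representation and the poi_index structure are assumptions:

```python
def match_request(freq_routes, poi_index, rr, tw):
    """Spatiotemporal service matching sketch.

    freq_routes -- dict route_id -> list of (cell, time) stops along the route
    poi_index   -- dict cell -> set of POI categories present in the cell
    rr          -- dict with keys deptT, arrT, POICat (the ride request)
    tw          -- time window width
    Returns candidate (route_id, pickup_cell, dest_cell) triples.
    """
    lo, hi = rr["deptT"] - tw, rr["arrT"] + tw
    matches = []
    for rid, stops in freq_routes.items():
        # Stops of this route falling in [deptT - tw, arrT + tw].
        in_window = [(c, t) for c, t in stops if lo <= t <= hi]
        pickups = [c for c, _ in in_window]
        # Cells on the route containing a POI of the requested category.
        dests = [c for c, _ in in_window
                 if rr["POICat"] in poi_index.get(c, ())]
        matches += [(rid, p, d) for p in pickups for d in dests if p != d]
    return matches
```

All candidate pickup/destination pairs are then returned to the user, matching the behavior described in the text.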
1.5 Conclusion
In this work, travel demands are modeled based on the activities individuals intend
to perform. A driver is a person who has a relatively stable routine, owns a car,
and is willing to offer a ride to other people. Given a ridesharing request including
information such as departure place and time, arrival place and time, and intended
activity at the visited place, a driver as well as an optimal routing is recommended by
the system. To this end, the frequent routes of the person who can share his/her vehicle are employed. Besides that, the matching method also considers the demanded activity in connection with spatial and temporal constraints. Consequently, both driver and rider benefit from ridesharing in terms of travel expense.
We are currently carrying out the performance analysis of the proposed method
on real datasets and implementing a system prototype.
References
7. He, W., Hwang, K., Li, D.: Intelligent carpool routing for urban ridesharing by mining GPS trajectories. IEEE Trans. Intell. Transp. Syst. 15(5), 2286–2296 (2014)
8. Yan, S., Chen, C.Y.: A model and a solution algorithm for the car pooling problem with pre-matching information. Comput. Ind. Eng. 61(3), 512–524 (2011)
Chapter 2
Optimal Scheduling and Benefit Analysis
of Solid Heat Storage Devices in Cold
Regions
Feng Sun, Xin Wen, Wei Fan, Gang Wang, Kai Gao, Jiajue Li and Hao Liu
2.1 Introduction
The solid heat storage device can be installed within the heating range of a thermal power plant so that low-valley electricity and the power plant jointly supply heat, or it can be connected directly to wind power generators so that curtailed wind power is stored as heat, achieving clean-energy heating. A scheduling scheme using the heat storage device only needs to ensure that the heat load demand is met within one scheduling period, thereby providing greater flexibility for the unit and effectively mitigating the time–space mismatch of the energy load of the power system. The specific implementation mode is as follows: at load valleys, when the grid curtails wind power, the heat storage device is put into operation, the wind power is stored in the form of heat energy, and the wind power consumption space is increased. When the user needs heat, the solid-state heat storage device replaces the cogeneration unit in transferring the stored heat energy to the heat user, alleviating the operating pressure of the thermal power unit during peak hours.
The operating principle of solid-state heat storage technology in the power system is shown in Fig. 2.1. The solution has the following characteristics: when the heat storage device is put into operation, it can be treated as a constant power load, and compared with the heating operation mode of the combined heat and power unit, the operating technical constraints are greatly reduced, giving the approach broad applicability. It changes the traditional heating mode, decouples electricity generation from heating, better matches the power demand with the clean energy output characteristics, and effectively addresses the problem of clean energy consumption.
During low-load periods, the limited peak-shaving capacity of thermal power units leaves the grid with insufficient space to accept wind power, and the serious mismatch between the output characteristics of wind power and the load causes the power system to curtail wind power. This is illustrated by the annual wind power curtailment of a provincial power grid in a cold area, shown in Fig. 2.2.

Fig. 2.2 Wind power curtailment (MW) throughout the year in a cold-region provincial grid, January–December, at 15-min time resolution
It can be seen from Fig. 2.3 that the curtailed wind power is the difference between the equivalent wind power output and the network load:

P_j^curtail = P_j^wind + P_min^unit − P_j^load   (2.1)

Among them, P_j^wind indicates the total wind power at the j-th moment, P_min^unit indicates the minimum output of the thermal power units, and P_j^wind + P_min^unit is the equivalent wind power output. In this paper, the time interval T_c of curtailment occurring during the load valley period is optimized.
When the wind power is sufficient, scheduling is limited by the total capacity of the heat storage devices, and the optimized scheduling plan is regulated according to a fixed target. Owing to the uncertainty and volatility of wind power output, when the wind power output is insufficient for some period to charge all devices, the dispatch plan is adjusted to follow the wind power output. Therefore, in order to effectively suppress the peak-to-valley difference of the load and maximize the consumption of wind power, the scheduling target at the j-th moment takes the minimum of the total capacity limit of the system's heat storage devices and the equivalent wind power output, that is,
Fig. 2.3 Equivalent wind power output and network load during the valley period (power in MW vs. time), showing the unit adjustable capacity, upward rotation reserve, total network load, network wind power curtailment, wind power equivalent output, wind power consumption space, and unit minimum output
P_j^goal = min( Σ_{i=1}^{N} P_i^heat + P_min^DG ,  P_j^wind + P_min^unit )   (2.2)

Among them, P_i^heat represents the rated power of the i-th heat storage device, N represents the total number of heat storage devices, P_min^DG is the minimum load of the curve during the low-valley period, P_j^wind indicates the total wind power at the j-th moment, and P_j^wind + P_min^unit is the equivalent wind power output.
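Eqs. (2.1) and (2.2) can be illustrated with a short computation; the numbers below are invented for illustration:

```python
def curtailed_power(p_wind_j, p_unit_min, p_load_j):
    # Eq. (2.1): curtailed wind power at moment j = equivalent wind
    # power output minus the network load.
    return p_wind_j + p_unit_min - p_load_j

def scheduling_target(p_heat, p_dg_min, p_wind_j, p_unit_min):
    # Eq. (2.2): minimum of the storage-side capacity limit and the
    # equivalent wind power output.
    return min(sum(p_heat) + p_dg_min, p_wind_j + p_unit_min)
```

For example, with three 40-MW devices, a 100-MW valley minimum load, 150 MW of wind, and an 80-MW unit minimum output, the capacity-side limit (220 MW) is smaller than the equivalent wind output (230 MW) and becomes the target.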
According to the previous section, the plan is set according to the scheduling target, and the optimal scheduling model of the heat storage devices is established. By controlling the switching in and out of the large-scale heat storage devices, the wind power is stored in the form of heat energy to the maximum extent. The scheduling objective function is

min z = Σ_{j=1}^{M} | P_j^goal − ( Σ_{i=1}^{N} P_i^heat x_{i,j} + P_j^load ) |   (2.3)

Among them, x_{i,j} denotes the on/off state of the i-th heat storage device at the j-th scheduling time, N denotes the total number of heat storage devices, M denotes the total number of nodes in the low-valley scheduling time, and P_j^load denotes the power load of the power system at the j-th time.
2.3.3 Restrictions
System thermal load constraint. In order to reduce the waste of resources, the total amount of heat stored from wind power should not exceed the heat load demand during the dispatch cycle:

Σ_{i=1}^{N} Σ_{j=1}^{M} P_i^heat x_{i,j} · T_c · β ≤ Q_l^period   (2.4)

Among them, β represents the efficiency of the solid-state heat storage device and Q_l^period represents the total heat load during the scheduling period.
Heat storage device capacity constraint. During a single scheduling period, the operating capacity of each heat storage device participating in the dispatch does not exceed its effective capacity:

Σ_{j=1}^{M} P_i^heat x_{i,j} · Δt ≤ C_i^rated − C_i^reserve   (2.5)

Among them, C_i^rated is the rated capacity of the i-th heat storage device and Δt is the minimum scheduling time step. Considering the short-term prediction error of wind power, the heat storage device reserves the capacity C_i^reserve to cope with the situation in which the actual wind power output exceeds the predicted value.
System operation security constraint. From the perspective of safety and reliability, when the wind power fluctuates and cannot be merged into the power grid, the thermal power units must be able to bear this part of the heat storage load. Therefore, the total heat storage power connected at each time node must not exceed the maximum peaking capacity of the thermal power units:

Σ_{i=1}^{N} P_i^heat x_{i,j} + P_j^load ≤ P_max^peak   (2.6)

Among them, P_max^peak is the maximum adjustable peak power of the system.
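A minimal greedy sketch of the model in Eqs. (2.2)–(2.6) follows; this excerpt does not state a solution method, so the largest-device-first rule is an assumption (a production implementation would use a MILP solver), and the heat load constraint (2.4) is omitted for brevity. The returned x mirrors the decision variable x_{i,j}:

```python
def schedule_storage(p_goal, p_load, p_heat, c_avail, p_peak_max, dt=0.25):
    """Greedy on/off dispatch of heat storage devices.

    p_goal, p_load -- per-interval target and system load (MW), length M
    p_heat         -- rated powers of the N devices (MW)
    c_avail        -- per-device available energy C_rated - C_reserve (MWh)
    p_peak_max     -- Eq. (2.6) security limit (MW)
    dt             -- scheduling step (h)
    Returns the x[i][j] on/off matrix.
    """
    n, m = len(p_heat), len(p_goal)
    energy = [0.0] * n                      # energy already stored per device
    x = [[0] * m for _ in range(n)]
    # Try larger devices first at each interval (greedy assumption).
    order = sorted(range(n), key=lambda i: -p_heat[i])
    for j in range(m):
        total = p_load[j]
        for i in order:
            if energy[i] + p_heat[i] * dt > c_avail[i]:
                continue                    # Eq. (2.5) capacity constraint
            if total + p_heat[i] > p_peak_max:
                continue                    # Eq. (2.6) security constraint
            if total + p_heat[i] <= p_goal[j]:
                x[i][j] = 1                 # charge device i, chasing Eq. (2.3)
                total += p_heat[i]
                energy[i] += p_heat[i] * dt
    return x
```

With two devices (40 MW and 60 MW), a 200-MW target, and a 100-MW base load, both devices charge in the first hour, after which the larger one exhausts its capacity.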
Considering the stable heating load during the winter heating period, the heat storage device stores heat in the low-valley period and supplies heat at its rated exothermic power to meet the heating demand at peak load times or even for the whole day. Therefore, the direct economic benefit of using solid-state heat storage devices for heating is

F_heat = (S_unit − S_wind) · L_unit − Σ_{i=1}^{N} (1 / T_i^heat) · ( F_i^build + F_i^deprecit + F_i^maintain )   (2.7)

where S_unit and S_wind are the power supply costs of the cogeneration units and the wind power supply units, L_unit is the total power consumed by the solid-state heat storage devices for heat storage at the low valley, and T_i^heat, F_i^build, F_i^deprecit, and F_i^maintain, respectively, denote the service life, construction cost, total depreciation cost, and total maintenance cost of the i-th heat storage device.
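Eq. (2.7) can be evaluated directly; the cost figures below are invented for illustration:

```python
def heating_benefit(s_unit, s_wind, l_unit, devices):
    """Eq. (2.7): direct economic benefit of heat storage heating.

    s_unit, s_wind -- power supply costs (currency per kW h)
    l_unit         -- total low-valley energy consumed by the devices
    devices        -- list of (service_life, build_cost, depreciation_cost,
                      maintenance_cost) tuples, one per storage device
    """
    amortized = sum((build + dep + maint) / life
                    for life, build, dep, maint in devices)
    return (s_unit - s_wind) * l_unit - amortized
```

For instance, a 0.4 cost difference over 1000 units of energy against one device amortizing 650 of cost over a 10-year life yields a benefit of 335.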
Reducing auxiliary service market compensation for power plant peak shaving. Optimizing the dispatch of the heat storage devices increases the wind power consumption space, as shown in Fig. 2.4; it can solve the clean heating problem and improve the peaking ability of the system. Without solid-state heat storage devices, the large amount of wind power connected to the grid to meet clean energy consumption goals would inevitably bring peaking pressure to the thermal power units. According to the Interim Measures for the Administration of Auxiliary Services of Grid-connected Power Plants, the regional auxiliary service market compensation prices are shown in Table 2.1.
After the application of the heat storage devices, the compensation cost for peak shaving of the thermal power units can be reduced indirectly:

F_comp = Σ_{i=1}^{P} Σ_{j=1}^{M} ( η_j^N · f^N − η_j^L · f^L ) · P_i^unit · Δt   (2.8)

η_j^L and f^L, respectively, represent the proportion of deep peak shaving and the compensation cost of the unit before the heat storage devices are dispatched at the j-th time; η_j^N and f^N, respectively, represent the proportion of deep peak shaving
Fig. 2.4 Principle of heat storage device optimized scheduling for lifting the system wind power consumption space
Table 2.1 Unit peaking compensation fee schedule in the regional auxiliary services market

  Unit peaking depth   Electricity subsidy per kW h (yuan/kW h)   Remarks
  60%                  Fine                                       —
  50%                  —                                          System-defined peak shaving depth
  40–50%               0.4                                        —
  40% or less          1                                          —
of the unit after the optimal dispatch of the heat storage devices at the j-th time and the corresponding compensation cost. P represents the total number of units operating during the dispatch day, and P_i^unit is the rated active power of the i-th unit.
Based on the total load data of the network in Liaoning Province from September 22
to 25, 2017, it can be seen that the variation of load peaks and valleys in the heating
season is obvious (Fig. 2.5).
By analyzing the load characteristics of the province, the trough period (21:00 to 7:00) of the typical days September 22 and 23 is selected. The capacity allocation of the heat storage devices of the three major thermal storage plants in Liaoning Province is shown in Table 2.2.
The FuXin plant, with a total heat storage capacity of 400 MW, has the advantages of small unit capacity and flexible dispatching and distribution. The total heat storage capacities of Dandong Jinshan and Diaobingshan are 300 MW and 260 MW, respectively, with the characteristics of large unit capacity and high stability.
Table 2.2 Capacity of heat storage units in Liaoning Province

  Power plant name   Number of units per heat storage capacity
                     40 MW   60 MW   70 MW   80 MW
  DiaoBingShan       —       —       2       2
  DanDong JinShan    —       2       2       —
  FuXin              10      —       —       —
In order to simplify the analysis, network load constraints are not considered. The optimal scheduling model is solved for heat storage devices of different capacities. The calculations are carried out in the following three modes:
Mode 1: The heat storage devices do not participate in scheduling. The units supply the heat load, and the operation mode of "determining power generation by heating" leaves the system with insufficient wind power consumption space and causes the grid to curtail wind power.
Mode 2: The heat storage devices do not adopt the optimized scheduling scheme and are controlled according to the following operational principles. Based on the province's real-time wind power data, the heat storage devices are scheduled to operate during curtailment periods, with the grid and the wind turbines cooperating to charge them. When the curtailed wind power is greater than the rated capacity of the heat storage devices, the devices are put into operation and charged by the wind turbines; otherwise, the system still curtails wind and draws energy from the grid for heat storage. Outside curtailment periods, the heat storage devices only release heat to meet the heat load demand. This control strategy increases the pressure on the thermal power units, and the wind power cannot be completely absorbed.
Mode 3: The heat storage devices operate according to the optimized scheduling plan. The unit operation mode is unchanged, and the dispatch of heat storage is rationalized to absorb wind power.
The typical-day trough period in Liaoning Province is analyzed, the optimized scheduling model of the heat storage devices is solved, and the operation scheme of the solid heat storage devices is obtained under the three operating modes. For the same wind power output, the output plans of the heat storage devices in the different modes are shown in Fig. 2.6.
Fig. 2.6 Heat storage device output (MW) under operation modes 1, 2, and 3
Combined with the actual wind power data, the curtailed wind power under the three scheduling schemes is shown in Fig. 2.7. It can be seen from the curtailment comparison that operation modes 2 and 3 effectively reduce the curtailed wind power, and the plan produced by the optimized solid-state heat storage scheduling method reduces curtailment the most.
It should be noted that when operating mode 2 is adopted, the cogeneration units are required to provide part of the heat storage energy while the curtailed wind power is reduced. This converts high-grade electrical energy into heat, creating unnecessary energy waste. The results show that the scheduling plan produced by the optimized scheduling model is better. According to the relevant provisions of the notice on the trial peak-to-valley electricity price policy for residential electric heating users, Table 2.3 gives a comparison of the benefits of the three schemes.
Fig. 2.7 Comparison of wind power curtailment (MW) under the three system operation modes
2.6 Conclusion
This paper proposes using large-scale high-power solid-state heat storage to absorb wind power and reduce the system peak-to-valley difference. The example shows that optimized dispatch of heat storage devices can not only save the high electricity cost generated by peak-time heating but also raise the valley load level, reduce deep peak shaving of the units, maximize the system's wind power consumption space, and improve the overall utility income.
3.1 Introduction
In WSNs, most sensor nodes are randomly deployed and their specific localization is
unknown [1]. Node localization based on RSSI ranging has certain localization error
because of the vulnerability of electromagnetic wave transmission to environmental
interference [2]. Therefore, how to improve node localization algorithm in order to
improve localization accuracy without additional hardware has become a research
hot spot of node localization technology [3]. A trigonometric extremum suppres-
sion localization algorithm is proposed. It has better stability, but cannot avoid the
existence of gross error [4]. A cooperative localization algorithm based on received
signal strength is proposed. It improved the localization accuracy of nodes to a cer-
tain extent, but it ignored the information between unknown nodes, resulting in a lot
of waste redundant information [5]. A node deployment strategy of wireless sensor
network based on IRVFA algorithm is presented. The strategy can improve network
coverage rate and effective utilization rate of nodes at the same time, but it will also
lead to increased node localization costs [6]. A parameter tracking method based on
RSSI can improve localization accuracy, but the algorithm is complex and has
difficulty meeting quick localization requirements [7]. The shuffled frog leaping
algorithm can reduce the localization error, but it is not suitable for large-scale
networks [6].
In order to find more suitable parameters of the transmission model in the detection
area, this paper proposes an optimization algorithm of RSSI transmission model for
distance error correction. This algorithm combines the global optimization
characteristics of the FA algorithm with the fast approximation characteristics of
the PSO algorithm: the FA algorithm is introduced into the PSO algorithm to help it
obtain the global optimal solution. This paper also proposes a logarithmic decrement
inertia weight to improve the precision of the search solution and accelerate the
convergence speed.
The localization algorithm based on RSSI ranging includes a ranging phase and a
localization phase. In the ranging phase, the commonly used signal transmission
model is the logarithmic-distance path loss model, which is described as
PL(d_ut) = PL(d_0) + 10k lg(d_ut / d_0) + x_σ    (3.1)
where PL(d_ut) (dBm) is the path loss when the distance between unknown node u
and anchor node t is d_ut (m), PL(d_0) (dBm) is the path loss at the reference
distance d_0 (m), typically d_0 = 1 m, k is the path loss exponent, usually k = 2-6,
and x_σ is a Gaussian noise variable with zero mean and standard deviation σ [8].
3 Optimization Algorithm of RSSI Transmission Model … 29
Therefore, the distance between unknown node and anchor node is depicted as
d_ut = d_0 × 10^((PL(d_ut) − PL(d_0) − x_σ) / (10k))    (3.2)

(x − x_ε)² + (y − y_ε)² + (z − z_ε)² = d_utε²,  (ε = 1, 2, 3, 4)    (3.3)
The distance between the unknown node U (x, y, z) and the anchor node
Tε (xε , yε , zε ), ε = 1, 2, 3, 4 is dutε . But in the actual environment, RSSI localiza-
tion algorithm is easily affected by the surrounding environment, multipath effect,
non-line-of-sight transmission, and so on, thus generating localization errors, so only
the estimated coordinate value of the unknown node U (x, y, z) can be obtained.
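The ranging model of Eqs. (3.1) and (3.2) can be sketched directly in code; the parameter values below (PL(d_0) = 40 dB, k = 3) are illustrative placeholders, not the paper's calibrated values.

```python
import math

def path_loss(d, pl_d0=40.0, k=3.0, noise=0.0):
    # Log-distance path loss model, Eq. (3.1), with d0 fixed at 1 m:
    # PL(d) = PL(d0) + 10*k*lg(d/d0) + x_sigma
    return pl_d0 + 10.0 * k * math.log10(d / 1.0) + noise

def estimate_distance(pl, pl_d0=40.0, k=3.0):
    # Invert the model, Eq. (3.2), to recover the node separation (meters)
    # from an observed path loss; the noise term is unknown at the receiver.
    return 1.0 * 10.0 ** ((pl - pl_d0) / (10.0 * k))
```

Without noise the inversion is exact; with x_σ ≠ 0 the estimated distance deviates, which is exactly the ranging error that fitting k and PL(d_0) to the local environment tries to reduce.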
In logarithmic-distance path loss model, the parameters affecting RSSI ranging accu-
racy are path loss factor and reference path loss. Their values are related to the
surroundings. Therefore, in this paper RSSI-DEC is proposed. First, FA algorithm
is introduced into PSO algorithm to help it obtain the global optimal solution. In
addition, in order to enable the algorithm to search a large area at high speed at the
beginning of iteration and gradually shrink to a better space at the end of iteration
and implement more detailed search, this paper introduces logarithmic decrement
inertia weight based on the method of linear decrement of inertia weight to improve
the accuracy of the search solution and speed up the convergence speed.
By using the information of all the anchor nodes (M) that can communicate with
each other, for any two anchor nodes u and t that can communicate, an objective
function is constructed according to the principle of the minimum sum of squared
errors,
30 Y. Liu et al.
where d_ut is the measured distance between anchor nodes, D_ut is the actual distance
between anchor nodes, and PL(d_ut) is the path loss measured in the node's current
environment. The individuals optimized by the intelligent algorithm are the path loss
exponent k and the reference path loss PL(d_0) between anchor nodes, recorded as
x(k, PL(d_0)). With this as the objective function of the FA and PSO algorithms, the
optimal parameters of the signal transmission model are found.
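The objective function's equation does not survive in this excerpt, so the sketch below assumes the plain least-squares form the text describes: the fitness of an individual x(k, PL(d_0)) is the sum of squared differences between distances estimated from measured inter-anchor path losses via Eq. (3.2) and the known inter-anchor distances.

```python
def ranging_objective(x, anchor_pairs):
    # x = (k, pl_d0) is one individual; anchor_pairs is a list of
    # (measured_path_loss, actual_distance) tuples for anchor pairs in range.
    k, pl_d0 = x
    error = 0.0
    for pl, d_actual in anchor_pairs:
        d_est = 10.0 ** ((pl - pl_d0) / (10.0 * k))  # Eq. (3.2), d0 = 1 m
        error += (d_est - d_actual) ** 2
    return error  # FA/PSO minimize this fitness
```

A candidate whose (k, PL(d_0)) matches the environment drives this error toward zero, so minimizing it by FA/PSO yields the transmission model parameters.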
Inertial weight ω plays an important role in adjusting the search capability in PSO
algorithm and FA algorithm. In order to avoid the prematurity of algorithm and
balance the local search ability and global search ability of algorithm, this paper
optimizes the inertia weight function.
where ω_i represents the current iteration's weight value, ω_max and ω_min are the
maximum and minimum values of the inertia weight, respectively, iter_max is the
maximum number of iterations, iter is the current iteration number of the algorithm,
and λ is the logarithmic adjustment factor: when 0 < λ < 1 it is a compression factor,
and when λ > 1 it is an expansion factor.
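The exact inertia weight formula is not reproduced in this excerpt; the following is only one plausible logarithmic-decrement form consistent with the description above, using the usual ω_max = 0.9 and ω_min = 0.4 and applying the λ adjustment factor inside the logarithm.

```python
import math

def log_decrement_weight(it, iter_max, w_max=0.9, w_min=0.4, lam=0.2):
    # Hypothetical logarithmic-decrement inertia weight (the paper's exact
    # formula is not quoted here). At it = 0 it returns w_max; with lam = 1
    # it reaches w_min exactly at it = iter_max.
    return w_max - (w_max - w_min) * math.log10(1.0 + 9.0 * lam * it / iter_max)
```

Early iterations keep ω large for fast, wide exploration; late iterations shrink ω for the finer local search described above.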
In this paper, the logarithmic decrement type inertia weight function is introduced
into the FA algorithm's position update formula and the PSO algorithm's speed
update formula, respectively.
(1) Localization update of FA algorithm
The relative fluorescence brightness of fireflies is
β = β_0 × e^(−γ r_ij²)    (3.7)
The better the objective function value, the higher the brightness of the firefly
itself. γ is the absorption coefficient of light intensity: because fluorescence
gradually weakens with increasing distance and with absorption by the propagation
medium, this coefficient, which reflects that characteristic, can be set as a constant.
r_ij is the Euclidean distance between fireflies i and j, and in this article it is the
Euclidean distance between x(k, PL(d_0))_i and x(k, PL(d_0))_j.
The logarithmic decrement type inertia weight function is introduced into the
localization update formula, in which firefly i is attracted to move toward firefly j,
where x_i and x_j are the spatial positions of fireflies i and j, α is the step factor,
a constant on [0, 1], and rand is a uniformly distributed random factor.
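A sketch of the weighted attraction move in the two-dimensional parameter space x(k, PL(d_0)); since the paper's update formula is not reproduced in this excerpt, the placement of the inertia weight ω on the attraction term is an assumption.

```python
import math
import random

def firefly_move(xi, xj, w=0.7, beta0=0.2, gamma=1.0, alpha=0.5):
    # Attractiveness beta = beta0 * exp(-gamma * r_ij^2), Eq. (3.7), where
    # r_ij is the Euclidean distance between fireflies i and j.
    r2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    beta = beta0 * math.exp(-gamma * r2)
    # Firefly i is pulled toward the brighter firefly j plus a random step;
    # scaling the attraction by the inertia weight w is an assumed placement.
    return tuple(a + w * beta * (b - a) + alpha * (random.random() - 0.5)
                 for a, b in zip(xi, xj))
```

With the random step disabled, each call strictly shrinks the distance to the brighter firefly, which is the behavior the attraction term is meant to provide.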
(2) PSO algorithm speed update
The logarithmic decrement type inertia weight function is introduced into the
particle velocity update formula and the position update formula

x_i = x_i + v_i    (3.10)
(3) If f(x_FA) < f(x_p) and f(x_FA) < f(x_g) are satisfied, update the optimal
solutions pbest_i and gbest_i and proceed to the next step; otherwise return to (2).
(4) Update the speed and localization of the PSO algorithm.
(5) Check the termination condition, which is set as the maximum number of
iterations. If the iteration count reaches this number, the algorithm ends and
returns the current global optimal particle position, which is the best combination
of the three parameters in the FA algorithm. If the termination condition is not
met, return to (3).
According to the above RSSI-DEC algorithm, more precise distances between nodes
can be obtained. In order to realize node localization in three-dimensional environ-
ment, four-sided ranging method can be used to obtain coordinates of unknown
nodes.
Four-sided ranging method is extended from three-sided measurement method.
Assuming that the coordinates of four beacon nodes are, respectively, Aa (xa , ya , za ),
Ab (xb , yb , zb ), Ac (xc , yc , zc ), and Ad (xd , yd , zd ) and the coordinates of unknown node
U are (x, y, z), the distance measured from the unknown node to each beacon node
is da , db , dc , and dd . According to the three-dimensional spatial distance formula, a
set of nonlinear equations can be obtained as follows:
(x − x_a)² + (y − y_a)² + (z − z_a)² = d_a²
(x − x_b)² + (y − y_b)² + (z − z_b)² = d_b²
(x − x_c)² + (y − y_c)² + (z − z_c)² = d_c²
(x − x_d)² + (y − y_d)² + (z − z_d)² = d_d²    (3.11)
By subtracting these equations pairwise to eliminate the quadratic terms and solving the resulting linear equations, the coordinates of the unknown node can be obtained.
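Concretely, subtracting the first sphere equation of (3.11) from the other three cancels the quadratic terms and leaves a linear system. A NumPy sketch (least squares also tolerates noisy distances and more than four beacons):

```python
import numpy as np

def four_sided_localize(anchors, dists):
    # anchors: beacon coordinates, shape (m, 3), m >= 4;
    # dists: measured distances from the unknown node to each beacon.
    anchors = np.asarray(anchors, dtype=float)
    d = np.asarray(dists, dtype=float)
    a0, d0 = anchors[0], d[0]
    # Subtract equation 0 from equations 1..m-1: 2*(a_i - a_0) . p = b_i
    A = 2.0 * (anchors[1:] - a0)
    b = d0 ** 2 - d[1:] ** 2 + np.sum(anchors[1:] ** 2, axis=1) - np.sum(a0 ** 2)
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p  # estimated (x, y, z) of the unknown node
```

With four non-coplanar beacons the 3 × 3 system has a unique solution; with noisy RSSI-derived distances the least-squares solution gives the best-fit position.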
In MATLAB 2014, we distribute 150 nodes (including 50 anchor nodes and 100
unknown nodes) in the area of 100 × 100 × 100 m3 . Node communication radius
rnode = 10 m, reference distance between nodes d0 = 1 m, ωmax = 0.9, ωmin = 0.4,
and λ = 0.2, firefly brightness I = 1, attractive force β = 0.2, and learning factor
c1 = c2 = 2.
Figure 3.1 shows the fitness curve of FAPSO algorithm with different inertia
weights. As can be seen from the figure, with the increase of iteration times, the
fitness function value gradually decreases, that is, gradually approaches the optimal
value. The logarithmic decrement inertia weight function proposed in this paper
Fig. 3.1 RSSI-DEC algorithm with different inertia weights: optimal value change curve (fitness function value (%) versus iteration for ω = 1, ω = 0.8, and ω = ω_i)
has the smallest fitness function value and the smallest error. When the number of
iterations is about 127, the fitness function value tends to the minimum. It can be
seen that the attenuating inertia weight proposed in this paper plays an active role
in the operation of the algorithm. At this stage, the algorithm has relatively weak
global search capability and relatively strong local search capability, so it searches
more strongly near the extreme value, which helps to find the optimal solution.
Figure 3.2 shows the comparison of node localization errors after localization
using different model parameter optimization algorithms. The WPSO, WFA, and
RSSI-DEC algorithms are used to optimize the model parameters, and localization
is then performed with the four-sided localization method. The X(k, PL(d_0))
parameters obtained by the three algorithms are X_WPSO(3.91, −45.32),
X_WFA(3.62, −40.09), and X(3.17, −41.53), respectively, and the average
localization errors after localization using the three signal transmission models are
about 24.08%, 18.98%, and 9.17%, respectively. It can be seen that the average
relative localization error of the RSSI-DEC algorithm is lower than that of the WPSO
and WFA algorithms, which effectively validates the optimization effect of the
RSSI-DEC algorithm model; the optimal parameter obtained is X(3.17, −41.53).
Figure 3.3 shows the comparison of average relative localization errors of nodes
after localization using different node localization algorithms. As can be seen from the
Fig. 3.2 Comparison of the average relative localization error of unknown nodes after localization using different model parameter optimization algorithms (average relative localization error (%) versus unknown nodes for WPSO, WFA, and RSSI-DEC)
figure, the average relative errors of the WRSSI algorithm, the ARSSI algorithm, and
the RSSI-DEC algorithm proposed in this paper are 31.46%, 15.02%, and 9.17%,
respectively. The average relative localization error of the RSSI-DEC algorithm is
therefore lower than that of the WRSSI and ARSSI algorithms, so it has a good
localization effect.
3.5 Conclusion
In order to obtain the most suitable parameters of the signal transmission model for
the wireless sensor network node localization algorithm based on RSSI ranging, an
RSSI-DEC optimization algorithm based on ranging error correction is proposed. The
optimal parameters are solved by intelligent algorithm, and a new transmission model
is constructed. The model is applied to node localization. The simulation results show
that the algorithm proposed in this paper overcomes the limitations of the traditional
RSSI algorithm's model parameters, improves the environmental adaptability of the
algorithm, and has better ranging accuracy and stability than comparable parameter-
optimization algorithms. The RSSI-DEC-based node localization algorithm proposed in
Fig. 3.3 Comparison of average relative localization error of unknown nodes after localization using different node localization algorithms (average relative localization error (%) versus unknown nodes for WRSSI, ARSSI, and RSSI-DEC)
this paper has an average relative localization error of 9.17%. It is 22.29% lower than
the RSSI-based weighted centroid localization algorithm (WRSSI) and it is 5.85%
lower than the adaptive RSSI localization algorithm (ARSSI).
Acknowledgements This work was supported by “Research on Lightweight Active Immune Tech-
nology for Electric Power Supervisory Control System”, a science and technology project of State
Grid Co., Ltd. in 2019.
References
1. Yourong, C., Siyi, L., Junjie, C.: Node localization algorithm of wireless sensor networks with
mobile beacon node. Peer-to-Peer Netw. Appl. 10(3), 795–807 (2017)
2. Fariz, N., Jamil, N., Din, M.M.: An improved indoor location technique using Kalman filtering
on RSSI. J. Comput. Theor. Nanosci. 24(3), 1591–1598 (2018)
3. Teng, Z., Qu, Z., Zhang, L., Guo, S.: Research on vehicle navigation BD/DR/MM integrated
navigation positioning. J. Northeast Electr. Power Univ. 37(4), 98–101 (2017)
4. Rencheng, J., Zhiping, C., Hao, X.: An RSSI-based localization algorithm for outliers suppres-
sion in wireless sensor networks. Wirel. Netw. 21(8), 2561–2569 (2015)
5. Zhang, X., Xiong, W., Xu, B.: A cooperative localization algorithm based on RSSI model in
wireless sensor networks. J. Electr. Meas. Instrum. 30(7), 1008–1015 (2016)
6. Teng, Z., Xu, M., Zhang, L.: Nodes deployment in wireless sensor networks based on improved
reliability virtual force algorithm. J. Northeast Dianli Univ. 36(2), 86–89 (2016)
7. Jinze, D., Jean, F.D., Yide, W.: A RSSI-based parameter tracking strategy for constrained position
localization. EURASIP J. Adv. Signal Process. 2017(1), 77 (2017)
8. Yu, Z., Guo, G.: Improvement of localization technology based on RSSI in ZigBee networks.
Wirel. Pers. Commun. 95(3), 1–20 (2016)
9. Sun, Z., Zhou, C.: Adaptive clustering algorithm in WSN based on energy and distance. J.
Northeast Dianli Univ. 36(1), 82–86 (2016)
Chapter 4
A New Ontology Meta-Matching
Technique with a Hybrid Semantic
Similarity Measure
Abstract Ontology is the kernel technique of semantic web, which can be used to
describe the concepts and their relationships in a particular domain. However, dif-
ferent domain experts would construct the ontologies according to different require-
ments, and there exists a heterogeneity problem among the ontologies, which hinders
the interaction between ontology-based intelligent systems. Ontology matching tech-
nique can determine the links between heterogeneous concepts, which is an effective
method for solving this problem. Semantic similarity measure is a function to calcu-
late to what extent two concepts are similar to each other, which is the key component
of ontology matching technique. Generally, multiple semantic similarity measures
are used together to improve the accuracy of the concept recognition. How to com-
bine these semantic similarity measures, i.e., the ontology meta-matching problem,
is a challenge in the ontology matching domain. To address this challenge, this paper
proposes a new ontology meta-matching technique, which applies a novel combi-
nation framework to aggregate two broad categories of similarity measures. The
experiment uses the famous benchmark provided by the Ontology Alignment Eval-
uation Initiative (OAEI). Comparing results with the participants of OAEI shows the
effectiveness of the proposal.
4.1 Introduction
Since ontology can reach consensus on the meaning of concepts in a certain field
and provides rich domain knowledge and semantic vocabularies for the interaction
between intelligent systems, it is considered as a solution to the heterogeneity of
data in the semantic web. However, due to the decentralized nature of the semantic
web, the same concept may have different definitions in different ontologies, which
causes the so-called ontology heterogeneity problem. The ontology heterogeneity
problem seriously affects the sharing between domain knowledge and has become
the bottleneck of interaction and collaboration between semantic web application
systems. Ontology matching technique can determine the links between heteroge-
neous concepts, which is an effective method for solving this problem. Semantic
similarity measure is a key component of ontology matching technology, which
is a function to calculate the similarity between two concepts. There are currently
four types of semantic similarity measures, i.e., literal-based method, background-
knowledge-based method, context-based method, and instance-based method [1].
Each type of method is subdivided into a number of specific methods, for example,
with respect to the background-knowledge-based similarity measure [2], the specific
method could be the node-based methods, the edge-based methods, and the mixed
methods of two approaches. Usually, multiple semantic similarity measures are used
together to improve the accuracy of the concept recognition [3, 4], but how to com-
bine these semantic similarity measures, i.e., the ontology meta-matching problem,
is a challenge in the ontology matching domain [5]. To address this challenge and
improve the ontology alignment’s quality, in this paper, a new combination frame-
work is proposed to aggregate two broad categories of similarity measures, i.e., the
ones based on edit distance and background knowledge base.
The rest of this paper is organized as follows: Sect. 4.2 introduces the basic con-
cepts, Sect. 4.3 describes the composition of similarity measures in detail, Sect. 4.4
shows the experimental study, and finally Sect. 4.5 draws the conclusion and presents
the future work.
There are many definitions of ontology. Here, for the convenience of work, ontology
is defined as follows:
O = (C, P, I ),
4 A New Ontology Meta-Matching Technique … 39
where C, P, and I are the sets of classes, properties, and instances of the ontology,
respectively.

An entity matching is a triple

(e, e′, n),

where e and e′ are the entities in the two ontologies and n is the similarity value
between e and e′, which is in [0, 1].

An ontology matching is

AN = f(O, O′, A),

where O and O′ are the two ontologies, respectively, and A is the set of entity
similarity values n in Definition 4.2.
The ontology matching value is the average of the entity matching values; in this
paper, it is the average of all attribute similarities, with the same weight adopted for
different attributes. The similarity value interval of an ontology pair is [0, 1]. When
the similarity value of two ontologies is 1, the two ontologies are equivalent; when
it is 0, the two ontologies are completely unrelated.
This paper utilizes two broad categories of similarity measures, i.e., the edit-distance-
based similarity measure and the background-knowledge-based similarity measure.
For the latter, we use the similarity measure proposed by Wu and Palmer [6], which
works with WordNet. With respect to the edit-distance-based similarity measure,
we use the N-gram distance [7] similarity measure and the cosine distance similarity
measure. Next, these measures are described one by one in detail.
Similarity measure technology based on background knowledge base
WordNet is an electronic lexical database that covers collections of synonyms for
various vocabularies. It has hierarchical parent-child relationships and is commonly
used to measure similarity relationships between concepts. This paper uses the Wu and
40 J. Lu et al.
Palmer similarity measure, which considers the depth of the closest common parent
concept of the two concepts in WordNet. The deeper this common parent concept is
in WordNet, the stronger the semantic relationship between the two concepts.
Compared with the SimLC similarity measure [8], it takes into account the change in
the strength of the connection between concepts, so the measurement is more
accurate. Given two concepts c1 and c2, the Wu and Palmer similarity measure
Simwp(c1, c2) between them is
Simwp(c1, c2) = 2 × depth(LCA_c1,c2) / (depth(c1) + depth(c2))    (4.1)
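Eq. (4.1) can be illustrated over a toy taxonomy, with a child-to-parent map standing in for WordNet; counting the root at depth 1 is a common convention assumed here.

```python
def wu_palmer(c1, c2, parent):
    # parent maps each concept to its parent; the root maps to None.
    def chain(c):
        out = []
        while c is not None:
            out.append(c)
            c = parent[c]
        return out  # path from c up to the root
    anc1, anc2 = chain(c1), chain(c2)
    depth = {c: len(anc1) - i for i, c in enumerate(anc1)}  # depths on c1's path
    lca = next(c for c in anc2 if c in depth)  # lowest common ancestor
    return 2.0 * depth[lca] / (len(anc1) + len(anc2))
```

For a taxonomy entity → animal → {dog, cat}, sim(dog, cat) = 2·2 / (3 + 3) = 2/3, while a pair sharing only the root scores lower, reflecting the "deeper common parent, stronger relationship" intuition.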
N-gram(s1, s2) = 2 × comm(s1, s2) / (N_s1 + N_s2)    (4.2)
where comm(s1, s2) represents the number of common substrings in the two strings,
and N_s1 and N_s2 represent the numbers of substrings in s1 and s2, respectively.
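A minimal sketch of Eq. (4.2); the excerpt does not fix which substrings are counted, so character bigrams (n = 2, counted with multiplicity) are an assumption.

```python
from collections import Counter

def ngram_similarity(s1, s2, n=2):
    # Character n-gram multisets of each (lowercased) string.
    grams = lambda s: Counter(s[i:i + n] for i in range(len(s) - n + 1))
    g1, g2 = grams(s1.lower()), grams(s2.lower())
    comm = sum((g1 & g2).values())            # common n-grams, with multiplicity
    total = sum(g1.values()) + sum(g2.values())
    return 2.0 * comm / total if total else 1.0
```

Identical strings score 1, disjoint strings score 0, and partially overlapping IDs or labels fall in between.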
As a famous edit distance measure, cosine distance is suitable for the similarity
measure of sentences. Given two sentences D1 and D2 , the cosine distance is defined
as follows:
Cos(D1, D2) = (V1 · V2) / (|V1| × |V2|)    (4.3)
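A sketch of Eq. (4.3), where the vectors V1 and V2 are taken as term-frequency vectors of the whitespace-tokenized, lowercased sentences (this vectorization scheme is an assumption of the sketch).

```python
import math
from collections import Counter

def cosine_similarity(d1, d2):
    # Eq. (4.3): dot product of term-frequency vectors over the product of
    # their Euclidean norms.
    v1, v2 = Counter(d1.lower().split()), Counter(d2.lower().split())
    dot = sum(v1[t] * v2[t] for t in v1)
    norm1 = math.sqrt(sum(c * c for c in v1.values()))
    norm2 = math.sqrt(sum(c * c for c in v2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0
```

This makes it suitable for comparing the multi-word comment attributes of entities, where character-level edit distances are too brittle.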
The quality of ontology matching results is usually evaluated through recall and
precision. Recall (also known as completeness) is used to measure the proportion
of the correct matching results found among all correct results. A value of 1
for recall means that all correct matching results have been found. However, recall
does not reveal the number of incorrect matching results among the found matching
results. Therefore, recall needs to be considered together with precision (also called
correctness), which is used to measure the proportion of the correct matching result
in the found matching results. A precision value of 1 means that all found matching
results are correct, but this does not mean that all correct matching results have
been found. Therefore, recall and precision must be weighed together, which can
be achieved by the f-measure (i.e., the weighted harmonic mean of the recall and
precision).
Given a reference matching R and a matching result A, the recall, precision, and
f-measure can be calculated by the following formula:
recall = |R ∩ A| / |R|    (4.4)

precision = |R ∩ A| / |A|    (4.5)

f-measure = 2 × recall × precision / (recall + precision)    (4.6)
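Eqs. (4.4)-(4.6) in code, treating the reference matching R and the found matching A as sets of entity pairs:

```python
def evaluate(reference, found):
    # reference (R) and found (A) are sets of (entity, entity) matching pairs.
    correct = len(reference & found)
    recall = correct / len(reference)       # Eq. (4.4)
    precision = correct / len(found)        # Eq. (4.5)
    f_measure = (2 * recall * precision / (recall + precision)
                 if correct else 0.0)       # Eq. (4.6)
    return precision, recall, f_measure
```

The harmonic mean in Eq. (4.6) penalizes an imbalance between the two: a matcher that returns everything (recall 1, low precision) or almost nothing (high precision, low recall) both score a low f-measure.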
After parsing the benchmark test set in OAEI, three types of entities are obtained, i.e.,
data property, class, and object property. Each of the three types of entities contains
three attributes, i.e., ID, label, and comment. This paper will measure the similarity
of the three types of entities separately.
• According to the similarity measure matrix, when an entity in the ontology com-
pares with all entities of the same type in another ontology, we consider the ID and
label in the entity as a group, and measure the two entities by the N-gram method
and the Wu and Palmer method, respectively. If the maximum value of the ID and
label similarity values is larger than the threshold, the corresponding matching
pairs will, respectively, be added into the similar sets N, W of the N-gram and Wu
and Palmer. After that, N and W are combined to obtain U. In particular, when
combining N and W, there are four types of possible situations in the above three
sets as follows:
1. For the first type of situation, take the entity matching pair and put it into the
same entity matching set S;
2. For the second type of situation, the entities involved in the multiply-matched
pairs in the union U are taken out, the similarity measure is performed on the
entities' comments using the cosine distance, and finally the matching pair with
the largest similarity value is put into the set S;
3. For the third type of situation, take out the matching pairs in the set N − W
and the set W − N , and use the cosine distance to measure the similarity of the
entity’s comment attribute, and finally take the matching pair with the largest
common similarity value into the set S;
4. For the fourth type of situation, use cosine to measure the similarity of the
comments of all entities, and finally take the matching pair with the largest
common similarity value into the set S;
5. For the second, third, and fourth types of situation, there will often be no
comment; in that case, the N-gram distance is used to measure the similarity of
the ID and the label, taking the average of the two as the similarity value
between the entities. When this similarity value is greater than the threshold,
the corresponding matching pair is put into the set S.
• We find that when the entity matching order is disturbed (reverse-order
comparison, random comparison), the entity matching pairs in the set S change.
Through sequential comparison, reverse-order comparison, and random
comparison, we extract the entities in the changed matching pairs, measure the
similarity of their IDs and labels with the N-gram distance, and measure the
similarity of their comments with the cosine distance. Finally, the average of the
three similarity values is taken, and the entity pair with the largest average value
is put into the set S.
In this test, the famous Ontology Alignment Evaluation Initiative (OAEI) 2016 test
case set was used. A brief description of the OAEI 2016 test case set is shown in
Table 4.1. Each test case in the OAEI test case set consists of two ontologies to be
matched and one reference alignment for evaluating the matching results.
In this experiment, each entity string is lowercased in the preprocessing stage.
When a matching pair cannot be determined and the ID and label need to be
measured, WordNet is first used to detect whether the vocabulary constituting the
ID and label exists.
In terms of thresholds, the thresholds for each phase are determined by commis-
sioning as follows:
• When using N-gram distance and Wu and Palmer similarity measure to measure
ID and label, the threshold is taken as 0.9. (When the similarity value is greater
than 0.9, the concepts being measured may be considered similar or identical.)
• When using cosine distance to measure the similarity of comment, the threshold
is taken as 0.9. (When the similarity value is greater than 0.9, the concepts being
measured may be considered similar or identical.)
• When the ID and label cannot determine the matching pair and there is no
comment, the N-gram distance is used to measure the ID and label, and the
average of the two similarity values is taken as the final entity similarity. The
threshold is 0.95. (Because WordNet is not used here, raising the threshold
improves the precision of the metric.)
Table 4.2 compares the results obtained by the method presented in this paper with
those of OAEI 2016 participants, where the values are the matching results for the
three types of test cases described in Table 4.1. According to the relevant OAEI
regulations, the test cases that are not automatically generated (102-104, 203-210,
230-231) are removed for convenience of comparison. The results obtained by the
method in this paper are the average of five independent runs (as are those of the
OAEI participants), and in Table 4.2, the symbols P, F, and R represent the values
of precision, f-measure, and recall, respectively.
It can be seen from Table 4.2 that the precision obtained in this paper ranks third,
but the recall rate and f-measure are higher than other measures, so the measure of
this paper is effective.
Acknowledgements This work is supported by the Program for New Century Excellent Talents
in Fujian Province University (No. GY-Z18155), the Program for Outstanding Young Scientific
Researcher in Fujian Province University (No. GY-Z160149), the 2018 Program for Outstanding
Young Scientific Researcher in Fujian, the Scientific Research Project on Education for Young and
Middle-aged Teachers in Fujian Province (No. JZ170367), and the Scientific Research Foundation
of Fujian University of Technology (No. GY-Z17162).
References
1. Xue, X., Wang, Y.: Using memetic algorithm for instance coreference resolution. IEEE Trans.
Knowl. Data Eng. 28(2), 580–591 (2016)
2. Xue, X., Pan, J.S.: A compact co-evolutionary algorithm for sensor ontology meta-matching.
Knowl. Inf. Syst. 56(2), 335–353 (2018)
3. Xue, X., Wang, Y.: Optimizing ontology alignments through a memetic algorithm using both
MatchFmeasure and unanimous improvement ratio. Artif. Intell. 223, 65–81 (2015)
4. Cai, Y., Zhang, Q., Lu, W., et al.: A hybrid approach for measuring semantic similarity based
on IC-weighted path distance in WordNet. J. Intell. Inf. Syst. 51(1), 23–47 (2018)
5. Xue, X., Wang, Y., Ren, A.: Optimizing ontology alignment through memetic algorithm based
on partial reference alignment. Expert Syst. Appl. 41(7), 3213–3222 (2014)
6. Wu, Z., Palmer, M.: Verbs semantics and lexical selection. In: Proceedings of the 32nd Annual
Meeting on Association for Computational Linguistics, pp. 133–138. Association for Compu-
tational Linguistics (1994)
7. Mascardi, V., Locoro, A., Rosso, P.: Automatic ontology matching via upper ontologies: a
systematic evaluation. IEEE Trans. Knowl. Data Eng. 22(5), 609–623 (2010)
8. Leacock, C., Chodorow, M.: Combining local context and WordNet similarity for word sense
identification. In: WordNet: An Electronic Lexical Database, vol. 49, no 2, pp. 265–283 (1998)
9. Richard Benjamins, V. (ed.): Knowledge Engineering and Knowledge Management: Ontologies
and the Semantic Web. Springer Verlag, Berlin (2003)
Chapter 5
Artificial Bee Colony Algorithm
Combined with Uniform Design
Abstract As artificial bee colony algorithm is sensitive to the initial solutions, and
is easy to fall into local optimum and premature convergence, this study presents a
novel artificial bee colony algorithm based on uniform design to acquire the better
initial solutions. It introduces an initialization method with uniform design to replace
random initialization, and selects the better ones of those initial bees generated by the
initialization method as the initial bee colony. This study also introduces a crossover
operator based on uniform design, which can search evenly the solutions in the
small vector space formed by two parents. This can increase searching efficiency and
accuracy. The best two of the offsprings generated by the crossover operator based on
uniform design are taken as new offsprings, and they are compared with their parents
to determine whether to update their patents or not. The crossover operator can ensure
that the proposed algorithm searches uniformly the solution space. Experimental
results performed on several frequently used test functions demonstrate that the
proposed algorithm has more outstanding performance and better global searching
ability than standard artificial bee colony algorithm.
5.1 Introduction
Artificial bee colony (ABC) algorithm [1–3] is a novel heuristic optimization algo-
rithm inspired by bees’ collecting honey. Standard ABC accomplishes the optimiza-
tion for a problem by simulating the process of bees’ looking for nectar sources,
which includes the stage of employed bees, that of onlookers, and that of scouters
as well. Because of few control parameters, high accuracy, and strong search perfor-
mance, ABC has been applied to continuous space optimization, data mining, neural
network training, etc. However, ABC has still some disadvantages such as premature
convergence and be easy to fall into local optimum. Many researchers have proposed
a variety of improvement methods to improve the performance of ABC; however, till
now, it is still a difficult problem how to improve the convergence of the algorithm
and avoid falling into the local optimum.
Uniform design was first proposed by Wang and Fang in 1978. It addresses how to
distribute design points uniformly within the test range, so as to obtain as much
information as possible using as few test points as possible. Uniform design performs
the assay by a set of elaborately designed tables, similar to orthogonal design.
Each uniform design table is accompanied by a usage table, which indicates how
to select the appropriate columns from the design table and the uniformity levels of
the testing programs formed by the selected columns. Uniform design extends the
methods for classical, deterministic univariate problems to the calculation of
multivariate problems. Its main goal is to sample a small number of points from a
given set of points so that the sampled points are evenly distributed over the whole
solution vector space.
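The good-lattice-point method is the textbook construction of such a table; the sketch below builds U_n(n^s) from generators coprime with n (the accompanying usage tables for column selection are not reproduced here).

```python
from math import gcd

def uniform_design_table(n, s):
    # Generators: the first s integers in [1, n) that are coprime with n.
    gens = [h for h in range(1, n) if gcd(h, n) == 1][:s]
    def entry(i, h):
        v = (i * h) % n
        return v if v else n  # map residue 0 to level n
    # Row i (i = 1..n), column j: u_ij = i * h_j (mod n), levels 1..n.
    return [[entry(i, h) for h in gens] for i in range(1, n + 1)]
```

Because each generator is coprime with n, every column is a permutation of the levels 1..n; scaling those levels onto each parameter's range spreads the initial points evenly over the feasible space, which is how uniform initialization replaces random initialization here.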
In order to search the solution space uniformly, this study introduces uniform design to generate the initial bee colony, so that the individuals in the bee colony scatter evenly over the feasible space of a problem. In order to increase the guidance and influence of the optimal nectar source on each nectar source, the study introduces a crossover operator based on uniform design, so that the two parents participating in crossover acquire their offspring uniformly. The crossover operator is performed between each nectar source and the optimal nectar source, in order to search evenly the small vector space formed by them. This increases the influence of the optimal nectar source and achieves a good fine search.
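A minimal sketch of this crossover idea follows. The function name, the level-shift scheme, and the default Q1 = 5 (the value used in the experiments below) are illustrative, not the chapter's exact operator: it samples Q1 offspring spread evenly inside the hyper-rectangle spanned by a nectar source and the best source.

```python
import numpy as np

def uniform_crossover(parent, best, q1=5):
    """Sample q1 offspring evenly inside the box spanned by two parents.

    Each dimension is discretized into q1 centered levels between the two
    parents; a lattice-style per-dimension shift spreads the levels so the
    offspring cover the small box formed by parent and best evenly.
    """
    parent, best = np.asarray(parent, float), np.asarray(best, float)
    lo, hi = np.minimum(parent, best), np.maximum(parent, best)
    d = len(parent)
    rows = np.arange(q1).reshape(-1, 1)   # q1 runs
    shifts = np.arange(1, d + 1)          # one shift per dimension
    levels = (rows * shifts) % q1         # levels in {0, ..., q1 - 1}
    frac = (levels + 0.5) / q1            # centered fractions in (0, 1)
    return lo + frac * (hi - lo)          # q1 x d offspring matrix

offspring = uniform_crossover([0.0, 2.0, -1.0], [1.0, 0.0, 3.0], q1=5)
print(offspring.shape)  # (5, 3)
```

The best of the Q1 offspring would then replace the current nectar source whenever it is superior, as described in Step 7 below.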
Artificial bee colony (ABC) is a swarm intelligence optimization algorithm. Inspired by the process of bees' collecting honey, it simulates the types of bees, the roles of bees, and the process of collecting honey to address practical optimization problems. If the problem to optimize is regarded as the nectar source to search, then its feasible solution is equivalent to the location of a nectar source, while its fitness is equivalent to the amount of nectar in the nectar source. The more
5 Artificial Bee Colony Algorithm Combined with Uniform Design 49
nectar a source holds, the better the nectar source is. A maximization optimization problem can be solved directly using ABC, while a minimization optimization problem needs to be transformed before ABC can be applied. According to their roles, bees are divided into three types: employed bees, onlookers, and scouters. The number of employed bees is generally assumed to be equal to the number of onlookers and to the number of nectar sources. However, the number of scouters is only 1, and a scouter works only when certain conditions have been met. Accordingly, the search process for an optimization problem is divided into the stage of employed bees, that of onlookers, and that of scouters.
Given that the dimension of a problem is D and the number of nectar sources, employed bees, and onlookers is SN, the standard ABC algorithm regards the process of seeking the solution of the problem as that of searching for nectar sources in a D-dimensional vector space. Its detailed steps are as follows:
(1) Initialization of bee colony
The random initialization method is utilized to initialize SN nectar sources, and the initialization formula is shown in formula (5.1):

x_id = L_d + r_1 (U_d − L_d)    (5.1)

where x_id denotes the d-th dimensional value of the i-th nectar source x_i, L_d and U_d denote the lower and upper bounds of the d-th dimension, and r_1 denotes a random number distributed uniformly within the interval [0, 1].

(2) Stage of employed bees

At this stage, each employed bee updates its nectar source according to formula (5.2):

v_id = x_id + r_2 (x_id − x_kd)    (5.2)

where v_id indicates a new nectar source, x_id is the same as in formula (5.1), x_k represents a nectar source different from x_i, x_kd indicates the d-th dimensional value of x_k, k ≠ i and k ∈ {1, 2, . . . , SN}, and r_2 denotes a random number distributed uniformly within the interval [0, 1]. Formula (5.2) looks for a different neighbor nectar source and updates a bee's old nectar source in a differential manner. Formula (5.2) cannot ensure that the updated nectar sources of employed bees lie within the feasible region of the problem to optimize. Therefore, the bounds need to be checked: values less than the lower bound or larger than the upper bound are set to the lower bound or the upper bound, respectively. After the new nectar source was obtained by the above formula, greedy algorithms are
50 J. Zhang et al.
utilized to compare the fitness of the new nectar source with that of the employed bee's current nectar source. The greedy selection strategy is employed to select the better nectar source.
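The employed-bee update, bound check, and greedy selection can be sketched as follows. This is a sketch, not the chapter's exact code: the fitness function and bounds are placeholders, fitness is maximized (a better source has higher fitness), and one randomly chosen dimension is updated per bee, as is common in ABC implementations.

```python
import numpy as np

rng = np.random.default_rng(0)

def employed_bee_step(sources, fitness, lower, upper, f):
    """One pass of the employed-bee stage over all SN nectar sources."""
    sn, dim = sources.shape
    for i in range(sn):
        k = rng.choice([j for j in range(sn) if j != i])  # neighbor k != i
        d = rng.integers(dim)                             # one dimension
        v = sources[i].copy()
        r2 = rng.random()                                 # uniform in [0, 1]
        v[d] = sources[i, d] + r2 * (sources[i, d] - sources[k, d])
        v = np.clip(v, lower, upper)                      # bound check
        fv = f(v)
        if fv > fitness[i]:                               # greedy selection
            sources[i], fitness[i] = v, fv
    return sources, fitness

# illustrative maximization problem and bounds
f = lambda x: -np.sum(x ** 2)
src = rng.uniform(-5, 5, size=(10, 3))
fit = np.array([f(x) for x in src])
src, fit = employed_bee_step(src, fit, -5, 5, f)
print(fit.max())
```

Because the greedy selection only ever accepts improvements, each source's fitness is non-decreasing across passes.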
(3) Stage of onlookers
At this stage, onlookers select nectar sources by means of the roulette strategy. This ensures that a nectar source with higher fitness is more likely to be updated. The selection probability of each nectar source is calculated according to formula (5.3):

P_i = F_i / Σ_{i=1}^{SN} F_i    (5.3)
where Fi denotes the fitness of the i-th nectar source, and its calculation formula is
shown in the following formula (5.4):
F_i = 1 / (1 + fit_i),   if fit_i ≥ 0
F_i = 1 + |fit_i|,       if fit_i < 0    (5.4)
where fit i and |fit i | represent the objective function value and its absolute value,
respectively.
Similar to an employed bee, after selecting a nectar source, an onlooker updates the nectar source using formula (5.2), checks its bounds, compares the fitness of the new nectar source with that of the current one in terms of greedy algorithms, and selects the better nectar source by means of the greedy selection strategy.
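Formulas (5.3) and (5.4) together define the roulette wheel used by the onlookers; a minimal sketch:

```python
import numpy as np

def roulette_probabilities(obj_values):
    """Map objective values fit_i to fitness F_i (formula 5.4) and
    to selection probabilities P_i (formula 5.3)."""
    fit = np.asarray(obj_values, float)
    F = np.where(fit >= 0, 1.0 / (1.0 + fit), 1.0 + np.abs(fit))
    return F / F.sum()

def roulette_pick(probs, rng):
    """Select one nectar source index in proportion to its probability."""
    return int(rng.choice(len(probs), p=probs))

p = roulette_probabilities([0.0, 3.0, -2.0])
print(p)  # smaller objective values receive higher selection probability
```

For the example values, formula (5.4) gives F = (1, 0.25, 3), so the third source (the best for a minimization objective) draws the most onlookers.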
(4) Stage of scouters
For each nectar source, the parameter trail counts the number of consecutive iterations in which the nectar source has not been updated. This is equivalent to the number of iterations in which the current optimal solution of the problem to optimize does not change. At initialization, the trail values of all nectar sources are equal to 0. At the stages of employed bees and onlookers, if a nectar source is updated, namely, a better nectar source is found, then trail ← 0, while if a nectar source remains the same as before, then trail ← trail + 1. In ABC, a predefined parameter limit is utilized to control scouters: if trail is larger than or equal to limit, then the stage of scouters starts.
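The trail bookkeeping and the scout replacement via formula (5.1) can be sketched as follows (function names, bounds, and the toy fitness are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def scout_stage(sources, fitness, trail, limit, lower, upper, f):
    """Replace exhausted nectar sources by fresh random ones (formula 5.1)."""
    for i in range(len(sources)):
        if trail[i] >= limit:                        # abandoned source
            r1 = rng.random(sources.shape[1])
            sources[i] = lower + r1 * (upper - lower)
            fitness[i] = f(sources[i])
            trail[i] = 0                             # counter restarts
    return sources, fitness, trail

def record_update(trail, i, improved):
    """trail <- 0 if source i improved this iteration, else trail + 1."""
    trail[i] = 0 if improved else trail[i] + 1
    return trail

g = lambda x: -np.sum(x ** 2)
sources = np.zeros((3, 2))
fitness = np.array([g(x) for x in sources])
trail = np.array([0, 5, 2])
sources, fitness, trail = scout_stage(sources, fitness, trail, 5, -1.0, 1.0, g)
print(trail)  # the exhausted source's counter is reset to 0
```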
Until the terminal condition is satisfied, ABC repeatedly goes through the abovementioned stages of employed bees, onlookers, and scouters in order. The best nectar source found so far is saved in each loop, and the solution of the optimal nectar source is regarded as the optimal solution of the problem to optimize [4].
Uniform design [5–11] is a sampling method. It enables the sampled data points to scatter uniformly over the solution space of a problem to optimize, which both increases the diversity of data points and improves the search efficiency. The solution space is first divided into multiple subspaces, and then uniform design is applied in each subspace to obtain the initial population generation algorithm based on uniform design [6, 8, 9]. According to the intersection of the upper and lower bounds of two parents, uniform design is applied to the two parents to obtain the crossover operator with uniform design [6].
The ABC algorithm is sensitive to the initial solutions, and the initial population plays an important role in the subsequent iterations. Good initial solutions may lead to the optimal solution quickly, while poor ones may cause the algorithm to fall into local optima. The ABC algorithm uses the random initialization method, which does not ensure that the obtained initial solutions are scattered over the vector space of the problem; these solutions may concentrate in only a few regions while other regions contain no solutions at all.
Therefore, the study presents an artificial bee colony based on uniform design. It
uses the initial colony generation algorithm based on uniform design to generate a
group of the initial bee colony scattered evenly over the vector space. Between each
nectar source and the optimal nectar source, the crossover-based uniform design is
conducted to generate the better nectar source. If the better nectar source is generated,
then the current nectar source is substituted by the better nectar source, otherwise
the current nectar source is kept.
Step 5 Go to the stage of scouters. For each nectar source in P3 , if trail ≥ limit,
then generate a new nectar source using formula (5.1) to replace the current
nectar source; otherwise, keep the current nectar source. The generated new
nectar source is marked as P4 .
Step 6 Calculate the fitness of each nectar source in P4 and find out the optimal
nectar source bestP2 . If bestP2 is superior to bestP1 , then bestP1 ← bestP2 .
Step 7 Perform the crossover operator based on uniform design on each nectar source in P4 and bestP1 and find out the best one Oopt from the generated Q1 offspring. If Oopt is superior to the current nectar source, then update the current nectar source to obtain a new nectar source P5 . If Oopt is superior to bestP1 , then bestP1 ← Oopt .
Step 8 If the terminal condition is not satisfied, then P1 ← P5 and go to Step 3; otherwise, output the optimal nectar source bestP1 and terminate the algorithm.
Several commonly used test functions are utilized to evaluate the performance of the proposed algorithm UABC. These test functions take 50, 100, and 200 dimensions, respectively, to evaluate the robustness of UABC. UABC and ABC are each run 20 times to calculate the average value and standard deviation of the optimal values.
The symbols and function names of the test functions are as follows: f 1 ↔ Sphere, f 2 ↔ Rosenbrock, f 3 ↔ Griewank, f 4 ↔ Rastrigin, f 5 ↔ Schwefel's problem 2.22, f 6 ↔ Ackley, f 7 ↔ Sum of different powers, f 8 ↔ Step, f 9 ↔ Quartic, and f 10 ↔ Axis parallel hyper-ellipsoid. The expressions and search scopes of the test functions are shown in Table 5.1.
• Parameters for ABC: the size of bee colony SN = 60; the number of employed
bees, onlookers, and nectar sources is all equal to SN, while the number of scouters
is 1; the predefined parameter at the stage of scouters limit = 10.
• Parameters for the uniform design: the number of subintervals S = 4; the number
of the sample points or the size of bee colony in each subinterval Q0 = 17; the
parameter in uniform cross Q1 = 5.
f2 = Σ_{i=1}^{n−1} [100(x_{i+1} − x_i²)² + (x_i − 1)²],  search scope [−30, 30]
f3 = (1/4000) Σ_{i=1}^{n} x_i² − Π_{i=1}^{n} cos(x_i / √i) + 1,  search scope [−600, 600]
f4 = Σ_{i=1}^{n} [x_i² − 10 cos(2π x_i) + 10],  search scope [−5.12, 5.12]
f5 = Σ_{i=1}^{n} |x_i| + Π_{i=1}^{n} |x_i|,  search scope [−10, 10]
f6 = −20 exp(−0.2 √((1/N) Σ_{i=1}^{N} x_i²)) − exp((1/N) Σ_{i=1}^{N} cos(2π x_i)) + exp(1) + 20,  search scope [−30, 30]
f7 = Σ_{i=1}^{n} |x_i|^{i+1},  search scope [−1, 1]
f8 = Σ_{i=1}^{n} (⌊x_i + 0.5⌋)²,  search scope [−100, 100]
f9 = Σ_{i=1}^{n} i·x_i⁴ + rand(),  search scope [−1.28, 1.28]
f10 = Σ_{i=1}^{n} (Σ_{j=1}^{i} x_j)²,  search scope [−100, 100]
• Terminal condition: the number of maximal iterations t max = 100. When the number of iterations t satisfies t > t max , UABC terminates.
5.4.3 Results
When the dimensions of the test functions are set as 50, 100, and 200, respectively, the results obtained by ABC and UABC are shown in Tables 5.2 and 5.3, respectively.
A comparison between Tables 5.2 and 5.3 shows that the average values obtained by UABC are much better than those obtained by ABC, and their difference is several orders of magnitude. If, considering floating-point errors, values less than 10⁻⁶ are regarded as 0, then for the 50-dimensional test functions UABC obtains the theoretical optimal value 0 except for f 2 , f 9 , and f 10 , while ABC does not obtain the theoretical optimal value for any test function. For the 50-dimensional test functions f 1 , f 2 , f 3 , f 4 ,
Table 5.2 Average value and standard deviation of the optimal values obtained by ABC
Average value Standard deviation
50 100 200 50 100 200
f1 3.77E+04 1.73E+05 5.01E+05 3.87E+03 1.10E+04 1.34E+04
f2 1.52E+08 8.74E+08 2.48E+09 2.30E+07 7.05E+07 7.53E+07
f3 341.74 1.54E+03 4.58E+03 45.55 112.34 103.80
f4 576.91 1.41E+03 3.14E+03 19.37 27.42 43.35
f5 1.37E+05 1.15E+30 3.55E+80 3.47E+05 2.37E+30 1.46E+81
f6 2.98 3.06 3.08 0.0203 5.83E−03 1.86E−03
f7 0.468 0.699 0.840 0.169 0.137 0.132
f8 4.09E+04 1.70E+05 5.02E+05 4.32E+03 1.11E+04 1.67E+04
f9 4.56 7.06 8.93 0.159 0.123 0.0638
f 10 1.42E+05 5.37E+05 2.19E+06 1.40E+04 8.63E+04 1.83E+05
Table 5.3 Average value and standard deviation of the optimal solutions obtained by UABC
Average value Standard deviation
50 100 200 50 100 200
f1 2.40E−11 4.91E−11 1.84E−09 5.43E−11 1.89E−11 7.39E−09
f2 36.86 79.50 162.39 9.14 23.69 46.63
f3 6.99E−08 2.61E−04 5.36E−04 3.13E−07 1.16E−03 8.88E−04
f4 6.55E−10 3.66E−09 1.28E−08 4.59E−10 1.32E−09 2.75E−09
f5 6.04E−06 3.18E−05 9.86E−05 2.48E−06 7.27E−06 9.74E−06
f6 1.44E−06 2.29E−06 2.95E−06 7.41E−07 4.84E−07 3.28E−07
f7 3.65E−12 5.92E−12 1.26E−11 3.71E−12 7.93E−12 1.50E−11
f8 0 0 0 0 0 0
f9 4.42E−04 4.08E−04 2.73E−04 3.52E−04 4.22E−04 2.46E−04
f 10 0.653 10.28 15.04 2.38 4.54 3.26
f 5 , f 8 , and f 10 , the optimal values obtained by ABC are several orders of magnitude larger than the theoretical optimal values, while the maximal difference between the optimal values obtained by UABC and the theoretical optimal values is only about one order of magnitude (for f 2 ). For the 100-dimensional and 200-dimensional test functions, the phenomena are similar to those of the 50-dimensional test functions.
From Tables 5.2 and 5.3, it can also be seen that, for both ABC and UABC, the optimal values of the 100-dimensional functions are worse than those of the 50-dimensional functions, and those of the 200-dimensional functions are worse than those of the 100-dimensional functions. This is reasonable because the differences between the obtained optimal values and the theoretical optimal values are bound to increase as the dimensions of the problem increase. However, the rate of this increase for UABC is much smaller than that for ABC. On f 1 , f 3 , f 4 , f 5 , f 6 , f 7 , f 8 , and f 9 , the increase for UABC is very tiny; in particular, on f 8 there is no increase at all, and the result always equals the theoretical optimal value 0. In contrast, the increase for ABC spans orders of magnitude; in particular, on f 5 the optimal value in 50 dimensions is of the order of 10⁵, while those in 100 and 200 dimensions are of the order of 10³⁰ and 10⁸⁰, respectively. This fully demonstrates that UABC is not sensitive to the dimensions of the problem and is suitable for very high-dimensional problems.
From Table 5.3, we can clearly see that the standard deviations obtained by UABC are very small except for f 2 and f 10 , which demonstrates that UABC has very high robustness. By comparing Tables 5.2 and 5.3, it can also be observed that the standard deviations obtained by UABC are much smaller than those obtained by ABC. This fully demonstrates that the robustness of UABC is much stronger than that of ABC.
This study presents an artificial bee colony algorithm based on uniform design. It makes full use of the advantages of uniform design: it generates the initial bee colony by means of uniform design, so that the nectar sources scatter evenly over the vector space of the feasible solutions, and it conducts the crossover operator based on uniform design on each nectar source and the optimal nectar source. The latter performs a fine search as early as possible in the potentially optimal vector space, in order to jump out of local optima quickly and find the global optimal solution. The experimental results on several common test functions demonstrate that the proposed algorithm has a strong ability to seek the optimal solutions and can obtain satisfactory solutions for problems of different dimensions. This fully shows that the proposed algorithm has strong robustness and applicability.
Further enhancement and improvement of this algorithm are ongoing. One attempt is to use a more efficient method to improve its convergence speed. Another is to extend its application scope to other problems, such as community detection, brain network analysis, and single-cell data analysis.
References
1. Cao, Y., et al.: An improved global best guided artificial bee colony algorithm for continuous
optimization problems. Clust. Comput. 2018(2018), 1–9 (2018)
2. Cui, L., et al.: Modified Gbest-guided artificial bee colony algorithm with new probability
model. Soft. Comput. 22(7), 2217–2243 (2018)
3. Ning, J., et al.: A food source-updating information-guided artificial bee colony algorithm.
Neural Comput. Appl. 30(3), 775–787 (2018)
4. Bharti, K.K., Singh, P.K.: Chaotic gradient artificial bee colony for text clustering. Soft Comput. 20(3), 1113–1126 (2016)
5. Liu, X., Wang, Y., Liu, H.: A hybrid genetic algorithm based on variable grouping and uniform
design for global optimization. J. Comput. 28(3), 93–107 (2017)
6. Leung, Y.-W., Wang, Y.: Multiobjective programming using uniform design and genetic algo-
rithm. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 30(3), 293–304 (2000)
7. Zhang, J., Wang, Y., Feng, J.: Attribute index and uniform design based multiobjective associ-
ation rule mining with evolutionary algorithm. Sci. World J. 2013(2013), 1–16 (2013)
8. Dai, C., Wang, Y.: A new decomposition based evolutionary algorithm with uniform designs
for many-objective optimization. Appl. Soft Comput. 30(1), 238–248 (2015)
9. Zhu, X., Zhang, J., Feng, J.: Multi-objective particle swarm optimization based on PAM and
uniform design. Math. Probl. Eng. 2015(2), 1–17 (2015)
10. Jia, L., Wang, Y., Fan, L.: An improved uniform design-based genetic algorithm for multi-
objective bilevel convex programming. Int. J. Comput. Sci. Eng. 12(1), 38–46 (2016)
11. Dai, C., Wang, Y.: A new uniform evolutionary algorithm based on decomposition and CDAS
for many-objective optimization. Knowl. Based Syst. 85(1), 131–142 (2015)
Chapter 6
An Orthogonal QUasi-Affine
TRansformation Evolution (O-QUATRE)
Algorithm for Global Optimization
6.1 Introduction
Many evolutionary and swarm intelligence algorithms have been proposed for global optimization, such as the Genetic Algorithm (GA) [3], Particle Swarm Optimization (PSO) [4], Ant Colony Optimization (ACO) [5], Differential Evolution (DE) [6], the Ebb-Tide-Fish (ETF) algorithm [7], Monkey King Evolution [8], and the QUasi-Affine TRansformation Evolution (QUATRE) algorithm [9].
In 2016, Meng et al. proposed the QUATRE algorithm to overcome the positional bias of the DE algorithm. The related works on the QUATRE algorithm can be found in
[7–11]. The QUATRE algorithm is a swarm-based intelligence algorithm, which has
many advantages and has been used for hand gesture segmentation [10]. However,
it has the same disadvantages as the DE and PSO algorithms. Many researchers
have studied these evolutionary algorithms and proposed many variants to enhance their performance. Zhang and Leung [12] advocated incorporating experimental design methods into the GA and proposed the Orthogonal Genetic Algorithm (OGA). Their experimental results demonstrated that OGA is more robust and statistically sound, and has better performance than the traditional GA.
Tsai et al. [13] have adopted the Taguchi method (namely, Taguchi orthogonal arrays)
into the GA’s crossover operator, and have presented the Hybrid Taguchi–Genetic
Algorithm (HTGA). Other researchers have used the Taguchi method to improve the performance of PSO [14], PCSO [15], and DE [16]. The improved algorithms mentioned above all use orthogonal arrays to reduce the number of experiments, thereby improving the performance and robustness of the algorithms. In this paper, we use an orthogonal array to improve the performance of the QUATRE algorithm.
The rest of the paper is organized as follows. The QUATRE algorithm and the orthogonal array are briefly reviewed in Sect. 6.2. Our proposed Orthogonal QUasi-Affine TRansformation Evolution (O-QUATRE) algorithm is presented in Sect. 6.3. The experimental analysis of the O-QUATRE algorithm under the CEC2013 test suite for real-parameter optimization, together with a comparison against the QUATRE algorithm, is given in Sect. 6.4. The conclusion is given in Sect. 6.5.
The QUATRE algorithm was proposed by Meng et al. for solving optimization problems. The individuals in the QUATRE algorithm evolve according to Eq. (6.1), which is a quasi-affine transformation evolution equation. X = [X_{1,G}, X_{2,G}, . . . , X_{i,G}, . . . , X_{ps,G}]^T denotes the individual population matrix with ps different individuals; X_{i,G} = [x_{i1}, x_{i2}, . . . , x_{id}, . . . , x_{iD}], i ∈ {1, 2, . . . , ps} denotes the location of the ith individual of the Gth generation, which is the ith row vector of the matrix X, and each individual X_{i,G} is a candidate solution for a specific D-dimensional optimization problem. B = [B_{1,G}, B_{2,G}, . . . , B_{i,G}, . . . , B_{ps,G}]^T denotes the donor matrix, and it has several different calculation schemes, which can be found in [10].
6 An Orthogonal QUasi-Affine TRansformation Evolution (O-QUATRE) … 59
X ← M ⊗ X + M̄ ⊗ B    (6.1)

B = X_{gbest,G} + F · (X_{r1,G} − X_{r2,G})    (6.2)

where X_{r1,G} and X_{r2,G} both denote random matrices generated by randomly permutating the sequence of row vectors in the population matrix X of the Gth generation, with all elements of each row vector unchanged; F is the mutation scale factor, which ranges from 0 to 1 and whose recommended value is 0.7; and X_{gbest,G} = [X_{gbest,G}, X_{gbest,G}, . . . , X_{gbest,G}]^T is the global best matrix with each row vector equal to the Gth global best individual X_{gbest,G}.
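A sketch of one generation of Eq. (6.1), using the donor scheme of Eq. (6.2); this is an illustration under stated assumptions (the evolution matrix M is taken as given here, and the replicated global best matrix is built from an arbitrary row), not the reference implementation of [9]:

```python
import numpy as np

def quatre_step(X, M, X_gbest, rng, F=0.7):
    """One QUATRE generation: X <- M (*) X + M_bar (*) B  (Eq. 6.1),
    with donor matrix B = X_gbest + F * (X_r1 - X_r2)     (Eq. 6.2)."""
    ps = X.shape[0]
    Xr1 = X[rng.permutation(ps)]       # row-permuted copies of X
    Xr2 = X[rng.permutation(ps)]
    B = X_gbest + F * (Xr1 - Xr2)      # donor matrix
    M_bar = 1 - M                      # binary complement of M
    return M * X + M_bar * B           # elementwise (Hadamard) products

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(6, 4))
M = rng.integers(0, 2, size=(6, 4)).astype(float)
Xg = np.tile(X[0], (6, 1))             # global best replicated per row
print(quatre_step(X, M, Xg, rng).shape)  # (6, 4)
```

Wherever an entry of M is 1, the corresponding coordinate is inherited from X; wherever it is 0, the coordinate comes from the donor matrix B.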
When ps = D, M_ini is a D × D lower triangular matrix of ones:

M_ini = ⎡ 1          ⎤
        ⎢ 1 1        ⎥  ∼  M    (6.3)
        ⎢ ⋮     ⋱    ⎥
        ⎣ 1 1  ⋯  1 ⎦

where the operator "∼" denotes the transformation that randomly permutates the elements within each row vector of M_ini and then randomly permutates its row vectors; the result is the evolution matrix M.
60 N. Liu et al.
When ps > D, M_ini is formed by stacking such D × D lower triangular blocks of ones until ps rows are filled (the last block may be truncated), and M is obtained by the same transformation:

M_ini = ⎡ 1          ⎤
        ⎢ 1 1        ⎥
        ⎢ ⋮     ⋱    ⎥
        ⎢ 1 1  ⋯  1 ⎥  ∼  M    (6.4)
        ⎢ 1          ⎥
        ⎢ 1 1        ⎥
        ⎣ ⋮          ⎦
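The "∼" transformation in Eqs. (6.3) and (6.4) — permuting the elements within each row of the lower triangular M_ini and then permuting the rows — can be sketched as follows (a sketch; the stacking rule for ps > D follows the description above):

```python
import numpy as np

def evolution_matrix(ps, D, rng):
    """Build the evolution matrix M from M_ini (Eqs. 6.3-6.4).

    M_ini stacks D x D lower triangular blocks of ones until ps rows
    are filled; each row's elements and then the rows themselves are
    randomly permuted to obtain M.
    """
    tri = np.tril(np.ones((D, D)))
    reps = -(-ps // D)                                      # ceil(ps / D)
    M_ini = np.vstack([tri] * reps)[:ps]
    M = np.array([rng.permutation(row) for row in M_ini])   # within rows
    return M[rng.permutation(ps)]                           # permute rows

rng = np.random.default_rng(0)
M = evolution_matrix(ps=7, D=4, rng=rng)
print(M)
```

The permutations shuffle where the ones sit but preserve how many ones each row of M_ini contributes, which keeps the overall inheritance ratio between X and B fixed.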
The orthogonal array [13] is a fractional factorial matrix, which can be used in
many designed experiments to determine which combinations of factor levels can be
used for each experimental run and for analyzing the data, and it is a major tool of
experimental design method and Taguchi method. An orthogonal array can ensure a
balanced comparison of levels of any factor or interactions between factors. Each row
in it represents the level of the factors for one run of the experiment, and each column
in it indicates a specific factor that can be evaluated independently. What’s more, the
merit of orthogonal array is that it can reduce the number of experiments efficiently.
Although it reduces the number of experiments, it is still reliable due to the powerful
support of statistical theory. For example, a problem involving three factors, three
6 An Orthogonal QUasi-Affine TRansformation Evolution (O-QUATRE) … 61
levels per
4factor,
requires 33 = 27 experiments to be tested, but with orthogonal
array L9 3 [13], only nine representative experiments need to be conducted.
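The L9(3⁴) array mentioned above can be built from two base columns over GF(3); this is a standard construction shown as a sketch, not necessarily the exact array tabulated in [13]:

```python
import itertools

def l9_3_4():
    """Construct an L9(3^4) orthogonal array: 9 runs, 4 three-level factors.

    The columns are a, b, (a + b) mod 3, and (a + 2b) mod 3. Any two of
    these linear forms are independent over GF(3), so every pair of
    columns contains each of the 9 level combinations exactly once.
    """
    rows = []
    for a, b in itertools.product(range(3), range(3)):
        rows.append((a, b, (a + b) % 3, (a + 2 * b) % 3))
    return rows

for row in l9_3_4():
    print(row)
```

This pairwise balance is exactly what lets nine runs stand in for all twenty-seven.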
In this paper, we adopt a two-level orthogonal array to change the evolution matrix M of the QUATRE algorithm. The general notation for a two-level orthogonal array is L_n(2^{n−1}), where L, n, n − 1, and 2 denote the Latin square, the number of experimental runs, the number of columns in the orthogonal array, and the number of levels per factor, respectively. For example, assume that we have two sets of solutions with 10 dimensions in the optimization problem and we want to find the best combination of their values. Then the L12(2^{11}) orthogonal array is given in Table 6.1. The number on the left of each row represents the experiment number and varies from 1 to 12. The elements "0" and "1" of each row indicate which factor's value should be used in one run of the experiment: the element "1" represents that the value of the factor should be taken from the first set of solutions, and the element "0" represents that the value of the factor should be taken from the second set of solutions. The illustration of the eighth experiment for the 10-factor/dimension problem according to the eighth row of the orthogonal array is shown in Fig. 6.2.
Table 6.1 L12(2^{11}) orthogonal array
Experiment number Considered factors
1 2 3 4 5 6 7 8 9 10 11
1 0 0 0 1 0 0 1 0 1 1 1
2 0 0 1 0 0 1 0 1 1 1 0
3 0 0 1 0 1 1 1 0 0 0 1
4 0 1 0 0 1 0 1 1 1 0 0
5 0 1 0 1 1 1 0 0 0 1 0
6 0 1 1 1 0 0 0 1 0 0 1
7 1 0 0 0 1 0 0 1 0 1 1
8 1 0 0 0 1 0 0 0 1 0 0
9 1 0 1 1 1 0 0 0 1 0 0
10 1 1 0 0 0 1 0 0 1 0 1
11 1 1 1 0 0 0 1 0 0 1 0
12 1 1 1 1 1 1 1 1 1 1 1
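Mixing two candidate solutions per one row of the array, as described for the eighth experiment above, can be sketched as follows (the two solution vectors are illustrative placeholders):

```python
def combine(row, first, second):
    """Pick each factor's value according to one orthogonal-array row.

    Element 1 takes the value from the first set of solutions,
    element 0 takes it from the second set.
    """
    return [a if bit == 1 else b for bit, a, b in zip(row, first, second)]

# eighth row of L12(2^11), restricted to the first 10 factors/dimensions
row8 = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0]
sol1 = [10] * 10   # first set of solution values (illustrative)
sol2 = [20] * 10   # second set of solution values (illustrative)
print(combine(row8, sol1, sol2))  # [10, 20, 20, 20, 10, 20, 20, 20, 10, 20]
```

Running all twelve rows yields twelve balanced mixtures of the two solutions, from which the fittest combination can be kept.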
Fig. 6.3 Illustration of changing the evolution matrix M using orthogonal array
Fig. 6.4 Simulation (fitness error versus NFE, ×10⁵) of functions f13 , f24 , and f28 with 10-D
The first 5 functions f1 –f5 are unimodal functions, the next 15 functions f6 –f20 are multi-modal functions, and the remaining 8 functions f21 –f28 are composition functions. All test functions' search ranges are [−100, 100]^D, and they are shifted to the same global best location O = {o1 , o2 , . . . , od }.
In this paper, for all these benchmark functions, we compare the performance of the algorithms on 10-dimensional problems. Each algorithm is run on each benchmark function 150 times independently, and the best, mean, and standard deviation of these runs are recorded for statistical analysis. The parameter settings of the O-QUATRE algorithm are ps = 100, F = 0.7, rc = 0.1, D = 10, Generations = 1000 (NFE = 209,890, where NFE denotes the number of function evaluations), and orthogonal array L12(2^{11}); the parameter settings of the QUATRE algorithm are ps = 100, F = 0.7, D = 10, Generations = 2100 (NFE = 210,000). The comparison results are shown in Table 6.2, and the simulation results of some benchmark functions are shown in Fig. 6.4.
From Table 6.2, we can see that the QUATRE algorithm has a better best value on functions f2 –f4 , f7 , f13 –f16 , f20 , f23 , and f25 , the O-QUATRE algorithm has a better best value on functions f8 , f10 , f12 , f17 –f19 , f22 , f24 , and f26 , and they have the same best value on the remaining eight functions. The QUATRE algorithm finds two more results with a better best value than the O-QUATRE algorithm, but the O-QUATRE algorithm has a better mean and standard deviation of fitness error than the QUATRE algorithm, which means that the O-QUATRE algorithm is more robust and has better stability.
6.5 Conclusion
Table 6.2 Comparison results of best, mean, and standard deviation of 150-run fitness error
between QUATRE algorithm and the O-QUATRE algorithm under 10-D CEC2013 test suite
10-D QUATRE Algorithm O-QUATRE Algorithm
No. Best Mean Std Best Mean Std
1 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
2 0.0000E+00 2.1373E−13 2.0590E−12 1.2476E−08 3.1751E−06 1.2088E−05
3 0.0000E+00 1.0646E−01 7.3970E−01 5.2296E−12 7.4604E−01 3.9460E+00
4 0.0000E+00 4.3959E−14 9.0093E−14 1.0687E−11 1.0451E−09 1.1094E−09
5 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
6 0.0000E+00 4.4586E+00 4.7865E+00 0.0000E+00 5.7996E+00 7.8827E+00
7 1.1369E−13 8.8622E−01 4.2265E+00 4.2608E−07 1.1732E+00 5.5359E+00
8 2.0191E+01 2.0454E+01 9.5454E−02 2.0143E+01 2.0420E+01 8.5264E−02
9 0.0000E+00 1.6659E+00 1.2816E+00 0.0000E+00 1.8768E+00 1.2334E+00
10 3.2016E−02 1.7697E−01 1.1713E−01 9.8573E−03 1.7678E−01 1.2074E−01
11 0.0000E+00 2.9849E+00 1.6863E+00 0.0000E+00 2.3879E+00 1.3042E+00
12 2.9849E+00 1.4597E+01 5.6877E+00 5.4788E−01 1.2752E+01 6.6829E+00
13 1.9899E+00 2.0407E+01 8.8161E+00 4.2551E+00 1.9388E+01 7.7625E+00
14 3.5399E+00 1.0737E+02 8.5581E+01 3.6648E+00 8.0443E+01 7.1238E+01
15 1.7137E+02 9.7413E+02 3.0636E+02 2.7315E+02 9.1454E+02 3.0936E+02
16 3.1547E−01 1.1312E+00 3.5646E−01 4.0250E−01 1.1659E+00 3.1999E−01
17 5.9338E−01 1.0659E+01 3.3035E+00 3.7821E−02 1.0272E+01 3.0306E+00
18 1.0477E+01 3.1443E+01 8.9107E+00 1.0370E+01 3.1674E+01 8.3332E+00
19 2.2333E−01 6.1946E−01 1.9613E−01 1.0191E−01 6.0428E−01 1.8916E−01
20 7.9051E−01 2.9757E+00 5.9590E−01 1.2980E+00 2.9364E+00 5.4814E−01
21 1.0000E+02 3.6082E+02 8.1914E+01 1.0000E+02 3.6683E+02 7.5749E+01
22 1.7591E+01 1.9007E+02 1.2395E+02 8.9048E+00 1.6717E+02 1.0956E+02
23 1.6517E+02 9.5665E+02 3.0949E+02 1.6626E+02 8.9420E+02 3.2540E+02
24 1.0905E+02 2.0543E+02 9.2427E+00 1.0704E+02 2.0479E+02 1.1772E+01
25 1.0617E+02 2.0336E+02 1.3893E+01 1.1073E+02 2.0178E+02 1.5877E+01
26 1.0398E+02 1.6999E+02 4.8561E+01 1.0298E+02 1.6630E+02 4.9120E+01
27 3.0000E+02 3.8820E+02 9.7914E+01 3.0000E+02 3.6475E+02 9.1244E+01
28 1.0000E+02 2.8780E+02 6.5449E+01 1.0000E+02 2.9200E+02 3.9323E+01
Win 11 10 12 9 16 14
Lose 9 16 14 11 10 12
Draw 8 2 2 8 2 2
The best results of the comparisons are emphasized in BOLDFACE fonts
The proposed O-QUATRE algorithm is evaluated under the CEC2013 test suite for real-parameter optimization. The
experimental results indicate that the O-QUATRE algorithm has a better mean and
standard deviation of fitness error than the QUATRE algorithm, which means that
the O-QUATRE algorithm has the advantages of more robustness and better stability.
References
1. Pan, J.S., Kong, L.P., Sung, T.W., et al.: Hierarchical routing strategy for wireless sensor
network. J. Inf. Hiding Multimed. Signal Process. 9(1), 256–264 (2018)
2. Chang, F.C., Huang, H.C.: A survey on intelligent sensor network and its applications. J. Netw.
Int. 1(1), 1–15 (2016)
3. Holland, J.H.: Adaptation in Nature and Artificial Systems. The University of Michigan Press,
Ann Arbor (1975)
4. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of IEEE International
Conference on Neural Networks, vol. 4, pp. 1942–1948. IEEE (1995)
5. Dorigo, M., Maniezzo, V., Colorni, A.: Ant system: optimization by a colony of cooperating
agents. IEEE Trans. Syst. Man Cybern. Part B Cybern. 26(1), 29–41 (1996)
6. Storn, R., Price, K.: Differential evolution-a simple and efficient heuristic for global optimiza-
tion over continuous spaces. J. Global Optim. 11(4), 341–359 (1997)
7. Meng, Z., Pan, J.S., Alelaiwi, A.: A new meta-heuristic ebb-tide-fish inspired algorithm for
traffic navigation. Telecommun. Syst. 62(2), 1–13 (2016)
8. Meng, Z., Pan, J.S.: Monkey king evolution: a new memetic evolutionary algorithm and its
application in vehicle fuel consumption optimization. Knowl.-Based Syst. 97, 144–157 (2016)
9. Meng, Z., Pan, J.S., Xu, H.: QUasi-Affine TRansformation Evolutionary (QUATRE) algo-
rithm: a cooperative swarm based algorithm for global optimization. Knowl.-Based Syst. 109,
104–121 (2016)
10. Meng, Z., Pan, J.S.: QUasi-affine TRansformation Evolutionary (QUATRE) algorithm: the
framework analysis for global optimization and application in hand gesture segmentation. In:
2016 IEEE 13th International Conference on Signal Processing (ICSP), pp. 1832–1837 (2016)
11. Meng, Z., Pan, J.S.: QUasi-Affine TRansformation Evolution with External ARchive
(QUATRE-EAR): an enhanced structure for differential evolution. Knowl.-Based Syst. 155,
35–53 (2018)
12. Zhang, Q., Leung, Y.W.: An orthogonal genetic algorithm for multimedia multicast routing.
IEEE Trans. Evol. Comput. 3, 53–62 (1999)
13. Tsai, J.T., Liu, T.K., Chou, J.H.: Hybrid Taguchi-genetic algorithm for global numerical opti-
mization. IEEE Trans. Evol. Comput. 8(4), 365–377 (2004)
14. Liu, C.H., Chen, Y.L., Chen, J.Y.: Ameliorated particle swarm optimization by integrating
Taguchi methods. In: The 9th International Conference on Machine Learning and Cybernetics
(ICMLC), pp. 1823–1828 (2010)
15. Tsai, P.W., Pan, J.S., Chen, S.M., Liao, B.Y.: Enhanced parallel cat swarm optimization based
on Taguchi method. Expert Syst. Appl. 39, 6309–6319 (2012)
16. Ding, Q., Qiu, X.: Novel differential evolution algorithm with spatial evolution rules. High Tech. Lett. 23(4), 426–433
17. Liang, J.J., et al.: Problem definitions and evaluation criteria for the CEC 2013 special session
on real-parameter optimization. Computational Intelligence Laboratory, Zhengzhou University,
Zhengzhou, China and Nanyang Technological University, Singapore, Technical report 201212
(2013)
Chapter 7
A Decomposition-Based Evolutionary
Algorithm with Adaptive Weight
Adjustment for Vehicle Crashworthiness
Problem
Cai Dai
7.1 Introduction
In the automotive industry, crashworthiness refers to the ability of a vehicle and its
components to protect its occupants during an impact or crash [1]. The crashwor-
thiness design of vehicles is of special importance, yet, highly demanding for high-
quality and low-cost industrial products. Liao et al. [2] presented a multi-objective
model for the vehicle design which minimizes three objectives: (1) weight (mass),
(2) acceleration characteristics (Ain), and (3) toe-board intrusion (intrusion).
Multi-objective optimization problems (MOPs) are complex; they usually include two or more conflicting objectives. A minimization MOP can be described as follows [3]:

min F(x) = (f1 (x), f2 (x), . . . , fm (x))
s.t. g_i(x) ≤ 0, i = 1, 2, . . . , q    (7.1)
     h_j(x) = 0, j = 1, 2, . . . , p
C. Dai (B)
School of Computer Science, Shaanxi Normal University, Xi’an 710119, China
e-mail: cdai0320@snnu.edu.cn
where x denotes the decision vector, f1 , . . . , fm are the m objective functions, and g_i and h_j are the inequality and equality constraints, respectively.
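For the minimization MOP above, a solution x dominates y when it is no worse in every objective and strictly better in at least one; a minimal check:

```python
def dominates(fx, fy):
    """True if objective vector fx Pareto-dominates fy (minimization)."""
    no_worse = all(a <= b for a, b in zip(fx, fy))
    strictly_better = any(a < b for a, b in zip(fx, fy))
    return no_worse and strictly_better

print(dominates([1.0, 2.0], [1.0, 3.0]))  # True
print(dominates([1.0, 2.0], [0.5, 3.0]))  # False: a trade-off, neither dominates
```

The set of solutions not dominated by any other forms the Pareto front (PF) that the algorithms below approximate.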
In this subsection, the adaptive weight vector adjustment [19] is used. The main idea of this adjustment is: if the distance between two adjacent non-dominated solutions is large, some weight vectors are added between the weight vectors corresponding to these two solutions; otherwise, some weight vectors are deleted. This strategy uses the distances between the obtained non-dominated solutions to add or delete weight vectors, so that problems with complex PFs can be handled while the weight vectors remain relatively stable. The details of the adaptive weight vector adjustment are as follows.
the fitness value of a solution in the selection operators. In this way, a solution with a sparser neighborhood is more likely to be selected to generate new solutions. These new solutions are likely to become non-dominated solutions in the sub-region to which solution d belongs, and such non-dominated solutions are closer to the true PF. Thus, this selection scheme helps to improve convergence.
MOEA/DA uses the evolutionary framework of MOEA/D. The steps of MOEA/DA are as follows:
Input:
N the number of weight vectors (the sub-problems);
T the number of weight vectors in the neighborhood of each weight vector,
0 < T < N ; and
λ1 , . . . , λN a set of N uniformly distributed weight vectors;
Output: Approximation to the PF: F(x^1), F(x^2), ..., F(x^N)
Step 1 Initialization:
Step 2 Update:
For i = 1, ..., N, do
Step 2.1 Reproduction: A better solution x^i is selected by the selection strategy. Randomly select two indexes r2, r3 from B(i), then generate a new solution y from x^i, x^{r2}, and x^{r3} by using the crossover operators.
Step 2.2 Mutation: Apply a mutation operator on y to produce y^j.
Step 2.3 Update of z: For s = 1, ..., m, if z_s < f_s(y^j), then set z_s = f_s(y^j).
Step 2.4 Update of neighboring solutions and sub-population: For each index k ∈ B(i), if g^TE(y^j | λ^k, z) < g^TE(x^k | λ^k, z), then x^k = y^j
End for;
Update evol_pop according to the Pareto dominance and the vicinity distance.
Step 3 Adaptive Weight adjustment
Use the adaptive weight vector adjustment of Sect. 7.3.1 to modify the weight vectors W, re-determine B(i) = {i_1, ..., i_T} (i = 1, ..., H, where H is the size of W), and randomly select solutions from POP to serve as the current solutions of the new sub-problems.
Step 4 Stopping criteria:
If the stopping criteria are satisfied, then stop and output F(x^1), F(x^2), ..., F(x^N); otherwise, go to Step 2.
In this work, the aggregation function is the variant of the Tchebycheff approach, whose form is as follows:

minimize_{x∈Ω}  g^TE(x | W^i, Z*) = max_{1≤j≤m} { (f_j(x) − z*_j) / W_{i,j} }     (7.7)
where Z* is the reference point of the MOP. The optimal solution x*_i of (7.7) must be a Pareto optimal solution of (7.1). Indeed, if the optimal solution x*_i of (7.7) were not a Pareto optimal solution of (7.1), there would be a solution y better than x*_i, so f_j(y) − z*_j ≤ f_j(x*_i) − z*_j for j = 1, ..., m, and hence max_{1≤j≤m}(f_j(y) − z*_j)/W_{i,j} ≤ max_{1≤j≤m}(f_j(x*_i) − z*_j)/W_{i,j}. Thus x*_i would not be the optimal solution of (7.7), which is a contradiction.
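As a sketch, the aggregation of Eq. (7.7) and the neighborhood update of Step 2.4 might look like this; the function names and the small `eps` guard against zero weights are assumptions, not part of the original algorithm:

```python
import numpy as np

def g_te(f, w, z_star, eps=1e-12):
    """Variant Tchebycheff aggregation of Eq. (7.7):
    g^TE(x | W^i, Z*) = max_j (f_j(x) - z*_j) / W_{i,j}."""
    return np.max((f - z_star) / (w + eps))  # eps guards zero weights

def update_neighbors(pop_f, weights, neighbors_i, y_f, z_star):
    """Step 2.4: replace neighbor x^k whenever the offspring y improves
    the aggregated value g^TE under that neighbor's weight vector."""
    replaced = []
    for k in neighbors_i:
        if g_te(y_f, weights[k], z_star) < g_te(pop_f[k], weights[k], z_star):
            pop_f[k] = y_f
            replaced.append(k)
    return replaced
```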
In this section, MOEA/D [9], MOEA/D-AWA [20], and NSGAII [4] are compared with MOEA/DA on the MOP of the vehicle crashworthiness problem. These algorithms are implemented on a personal computer (Intel Xeon CPU 2.53 GHz, 3.98 GB RAM). The individuals are all coded as real vectors. Polynomial mutation and simulated binary crossover (SBX [21]) are used in MOEA/DA. In the SBX operator, the distribution index is 20 and the crossover probability is 1; in the mutation operator, the distribution index is 20 and the mutation probability is 0.1. The population size is 105. Each algorithm is run 20 times independently and stops after 500 generations. In real-world cases, the Pareto optimal solutions are often not available. Therefore, to compare the performance of these three algorithms on the vehicle crashworthiness problem quantitatively, the HV metric [22] and the coverage metric [23] (C metric) are used.
Table 7.1 presents the mean and standard deviation of the C and HV metrics obtained by MOEA/DA, MOEA/D, MOEA/D-AWA, and NSGAII. In this experiment, the reference points are set to (1700, 12, 1.1). From the table, it can be seen that the convergence performance of MOEA/DA is better than that of MOEA/D, MOEA/D-AWA, and NSGAII. The mean values of the C metric obtained by these four algorithms are all smaller than 0.04, which indicates that the convergence performances of these four algorithms
Table 7.1 C and HV obtained by MOEA/DA, MOEA/D, MOEA/D-AWA, and NSGAII on vehicle
crashworthiness problem (A represents the algorithm MOEA/DA and B represents the algorithms
MOEA/D, MOEA/D-AWA, and NSGAII)
MOEA/DA MOEA/D MOEA/D-AWA NSGAII
C(A,B) Mean NA 0.0156 0.0214 0.0345
Std NA 0.0062 0.0071 0.0102
C(B,A) Mean NA 0.0084 0.0101 0.0135
Std NA 0.0025 0.0094 0.0100
HV Mean 103.5694 96.8083 99.0426 102.4931
Std 1.2827 5.8045 2.1546 1.2017
are almost the same. The mean value of HV obtained by MOEA/DA is much bigger than those obtained by MOEA/D, MOEA/D-AWA, and NSGAII on VCP, which indicates that the coverage and convergence of the solutions obtained by MOEA/DA with respect to the true PF are better than those of the other three algorithms, and that MOEA/DA can effectively approach the true PF. In summary, the comparison of the simulation results of these four algorithms shows that MOEA/DA is able to obtain better spread, better distributed, and more convergent PFs.
7.5 Conclusions
References
1. Du Bois, P., et al.: Vehicle crashworthiness and occupant protection. American Iron and Steel
Institute, Southfield, MI, USA, Report (2004)
2. Liao, X., Li, Q., Yang, X., Zhang, W., Li, W.: Multiobjective optimization for crash safety design
of vehicles using stepwise regression model. Struct. Multidiscipl. Optim. 35(6), 561–569 (2008)
3. Van Veldhuizen, D.A.: Multiobjective Evolutionary Algorithms: Classifications, Analyses, and
New Innovations. Air Force Institute of Technology Wright Patterson AFB, OH, USA (1999)
4. Deb, K., et al.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol.
Comput. 6(2), 182–197 (2002)
5. Tang, B., Zhu, Z., Shin, H., Tsourdos, A., Luo, J.: A framework for multi-objective optimisation
based on a new self-adaptive particle swarm optimisation algorithm. Inf. Sci. 420, 364–385
(2017)
6. Wang, X.P., Tang, L.X.: An adaptive multi-population differential evolution algorithm for
continuous multi-objective optimization. Inf. Sci. 348, 124–141 (2016)
7. Shang, R.H., Jiao, L.C., Liu, F., Ma, W.P.: A novel immune clonal algorithm for MO problems.
IEEE Trans. Evol. Comput. 16(1), 35–50 (2012)
8. Zhan, Z.H., Li, J.J., Cao, J.N., Zhang, J., Chung, H.H., Shi, Y.H.: Multiple populations for mul-
tiple objectives: a coevolutionary technique for solving multiobjective optimization problems.
IEEE Trans. Cybern. 43(2), 445–463 (2013)
9. Zhang, Q.F., Li, H.: MOEA/D: a multiobjective evolutionary algorithm based on decomposi-
tion. IEEE Trans. Evol. Comput. 11(6), 712–731 (2007)
10. Zhao, S.Z., Suganthan, P.N., Zhang, Q.F.: Decomposition-based multiobjective evolutionary
algorithm with an ensemble of neighborhood sizes. IEEE Trans. Evol. Comput. 16(3), 442–446
(2012)
11. Wang, L., Zhang, Q., Zhou, A.: Constrained subproblems in a decomposition-based multiob-
jective evolutionary algorithm. IEEE Trans. Evol. Comput. 20(3), 475–480 (2016)
12. Zhu, H., He, Z., Jia, Y.: A novel approach to multiple sequence alignment using multiobjec-
tive evolutionary algorithm based on decomposition. IEEE J. Biomed. Health Inform. 20(2),
717–727 (2016)
13. Jiang, S., Yang, S.: An improved multiobjective optimization evolutionary algorithm based on
decomposition for complex Pareto fronts. IEEE Trans. Cybern. 46(2), 421–437 (2016)
14. Zhou, A., Zhang, Q.: Are all the subproblems equally important? Resource allocation in
decomposition-based multiobjective evolutionary algorithms. IEEE Trans. Evol. Comput.
20(1), 52–64 (2016)
15. Zhang, H., Zhang, X., Gao, X., et al.: Self-organizing multiobjective optimization based on
decomposition with neighborhood ensemble. Neurocomputing 173, 1868–1884 (2016)
16. Li, H., Zhang, Q.F.: Multiobjective optimization problems with complicated Pareto sets,
MOEA/D and NSGA-II. IEEE Trans. Evol. Comput. 13(2), 284–302 (2009)
17. Al Mpubayed, N., Petrovski, A., McCall, J.: D2MOPSO: MOPSO based on decomposition
and dominance with archiving using crowding distance in objective and solution spaces. Evol.
Comput. 22(1), 47–78 (2014)
18. Zhang, H., et al.: Self-organizing multiobjective optimization based on decomposition with
neighborhood ensemble. Neurocomputing 173, 1868–1884 (2016)
19. Dai, C., Lei, X.: A decomposition-based multiobjective evolutionary algorithm with adaptive weight adjustment. Complexity (2018)
20. Qi, Y., Ma, X., Liu, F., Jiao, L., Sun, J., Wu, J.: MOEA/D with adaptive weight adjustment.
Evol. Comput. 22(2), 231–264 (2014)
21. Deb, K.: Multiobjective Optimization Using Evolutionary Algorithms. Wiley, New York (2001)
22. Deb, K., Sinha, A., Kukkonen, S.: Multi-objective test problems, linkages, and evolutionary
methodologies. In: Proceedings of the 8th Annual Conference on Genetic and Evolutionary
Computation GECCO’06, Seattle, WA, pp. 1141–1148 (2006)
23. Zitzler, E., Thiele, L.: Multi-objective evolutionary algorithms: a comparative case study and
the strength Pareto approach. IEEE Trans. Evol. Comput. 3(4), 257–271 (1999)
Chapter 8
Brainstorm Optimization in Thinned
Linear Antenna Array with Minimum
Side Lobe Level
8.1 Introduction
tion problems with respect to the synthesis of linear array geometry with minimum
side lobe level [14]. Wang et al. proposed a modified binary PSO in the synthesis
of thinned linear and planar arrays with a lower SLL. The chaotic sequences were
embedded in the proposed algorithm to determine the inertia weight of the binary
PSO for the diversity of particles, resulting in improved performance [15]. Ma et al.
modeled a hybrid optimization method of particle swarm optimization and convex
optimization of which the peak side lobe level is considered as the objective function
to optimize the linear array synthesis [16].
In this paper, we present the method of optimization of uniformly spaced lin-
ear arrays based on a Brainstorm Optimization (BSO) algorithm. The remainder of
this paper is organized as follows. In Sect. 8.2, the thinned linear antenna array is
described. In Sect. 8.3, the brainstorm optimization is modified for thinned antenna
array. Simulation experiments and comparisons are provided in Sect. 8.4. Finally,
conclusion is given in Sect. 8.5.
F(φ, θ) = Σ_{m=1}^{N} f_m(φ, θ) A_m e^{j(2π/λ) d_m (cos θ sin φ − cos θ_0 sin φ_0)}     (8.1)
In this paper, the following antenna constraints were assumed for simplicity of calculation: amplitude-only excitation with no phase difference; a uniform array of elements, i.e., f_m(φ, θ) = 1 and A_m = 1; and uniform spacing of λ/2 between neighboring elements.
[Fig. 8.1: Geometry of the N-element uniform linear antenna array with element spacing d along the y-axis; θ and φ denote the elevation and azimuth angles]
78 N. Bulgan et al.
F(θ) = Σ_{m=0}^{N−1} e^{j(2π/λ) d_m (sin θ − sin θ_0)}     (8.2)

F(θ) = Σ_{m=0}^{N−1} f_m e^{j(2π/λ) d_m (sin θ − sin θ_0)}     (8.3)
From this equation, we can see that F(θ ) is a complex nonlinear continuous
function. Typically, the array factor is expressed by an absolute value by above
formula, normalized to its maximum and is plotted in dB scale.
In our case, we search for the Minimum Side Lobe Level (MSLL) value in dB:

MSLL = max_{θ∈S} F_dB(θ) = max_{θ∈S} { |F(θ)| / max(|F(θ)|) }     (8.4)

where S denotes the side lobe region, S = {θ | θ_min ≤ θ ≤ θ_0 − φ_0} ∪ {θ | θ_0 + φ_0 ≤ θ ≤ θ_max}, and max(|F(θ)|) is the peak of the main beam.
To suppress the SLL, the fitness function can be defined as

f = min(MSLL)     (8.5)
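The array-factor and MSLL computations of Eqs. (8.3)–(8.4) can be sketched as follows; the sampling grid and the fixed main-lobe half-width used to mark the side-lobe region S are simplifying assumptions, not values from the text:

```python
import numpy as np

def msll_db(on, d=0.5, theta0=0.0, main_lobe_halfwidth=5.0):
    """Sketch of Eqs. (8.3)-(8.4): normalized array factor of a thinned
    linear array and its maximum side-lobe level in dB.
    on: 0/1 vector f_m marking active elements; d: spacing in wavelengths;
    theta0: steering angle in radians.
    main_lobe_halfwidth (degrees) defines the excluded main-beam region
    (it plays the role of phi_0 in the text; the value is an assumption)."""
    theta = np.radians(np.linspace(-90.0, 90.0, 2001))
    m = np.arange(len(on))
    # F(theta) = sum_m f_m * exp(j * 2*pi/lambda * d*m * (sin theta - sin theta0))
    phase = 2j * np.pi * d * np.outer(np.sin(theta) - np.sin(theta0), m)
    af = np.abs(np.exp(phase) @ on)
    af_db = 20 * np.log10(af / af.max() + 1e-12)   # normalized pattern in dB
    side = np.abs(np.degrees(theta) - np.degrees(theta0)) > main_lobe_halfwidth
    return af_db[side].max()                       # MSLL in dB (negative)
```

For a fully populated uniform array this sketch reproduces the classic first-sidelobe level of roughly −13 dB.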
[Fig. 8.2: Flowchart of the modified BSO: population initialization, population clustering, new individual generation, selection operation, and binary operation are repeated until the stopping condition is true, then the confirmed individuals are output]
randomly selected. The class center of the selected class is replaced, then a new individual is created and the competitive selection operation is performed. This process continues until the termination condition is met. Figure 8.2 shows the thinning operation flowchart.
For the above BSO steps to conform to the characteristics of the thinning linear
array, some of the procedures need modification. The processes to be adjusted are as
follows:
Population clustering: Group n individuals into m clusters by a clustering algorithm.
New individual generation operation: Select one or two cluster(s) randomly to gen-
erate new individual (solution).
Selection: The newly generated individual is compared with the existing individual
with the same individual index and the better one is kept and recorded as the new
individual.
Binary operation: The candidate solution is obtained using the equation shown below:

x_ij(t + 1) = 1 if r_ij < sig(x_ij(t)), 0 otherwise     (8.6)

where r_ij is a uniform random number in the range [0, 1] and the normalization function sig is the sigmoid function:

sig(x_ij(t)) = 1 / (1 + e^{−x_ij(t)})     (8.7)
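A minimal sketch of the binary operation of Eqs. (8.6)–(8.7); the function name and the use of a NumPy random generator are assumptions:

```python
import numpy as np

def binarize(x, rng):
    """Binary operation of Eqs. (8.6)-(8.7): squash each real-valued
    component through the sigmoid, then threshold with a uniform random
    number r_ij in [0, 1): output 1 if r_ij < sig(x_ij), else 0."""
    sig = 1.0 / (1.0 + np.exp(-x))                     # Eq. (8.7)
    return (rng.random(x.shape) < sig).astype(int)     # Eq. (8.6)
```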
The experimental uniform linear array is set to 100 array elements with an equal spacing of half a wavelength (the wavelength is 1 m, so the spacing is 0.5 m), equal omnidirectional amplitudes, and an array aperture of 49.5 m. The antenna beam is pointed at 0°. To achieve a minimum side lobe level, the array is thinned to 50 active elements and the resulting pattern is optimized for the lowest side lobe level.
The thinned antenna array based on the modified BSO algorithm is shown in
Fig. 8.3 and the element locations are illustrated in Fig. 8.4.
The change of Ymin and Ymax in the individual transformation process is shown in Fig. 8.5; during the constraint conversion, the suspected change range lies within (0, Ymin) ∪ (Ymax, 1), and the process is run for 200 iterations. Figure 8.6 shows the best fitness curves of both BSO and GA over 200 iterations. The curves show that the GA surpasses the BSO in the early iterations (before 40 iterations), but afterwards the modified BSO clearly outperforms the GA. Comparing how the two algorithms reach a local optimum, the BSO attains it in a shorter time. This implies the BSO algorithm has good applicability and importance for the synthesis of the thinned linear array.
[Fig. 8.3: Radiation pattern of the thinned antenna array: array gain in dB (−60 to −20 dB) versus angle in degrees (−80° to 80°)]
[Fig. 8.4: Element locations of the thinned array over the 100 array element positions]
[Fig. 8.5: Change of YMAX and YMIN during the individual transformation process (x-axis: 1000–10,000)]
8.5 Conclusions
The combinatorial nature of antenna array is challenging and this makes the designing
of a suitable algorithm for thinning a large-scale antenna array very complex and
difficult. The application of our modified BSO algorithm to pattern synthesis of a
linear antenna array is successful and the simulation results of the proposed BSO
[Fig. 8.6: Best fitness value curves (about 13–16) of BSO and GA over 200 iterations]
algorithm establish its importance and good applicability to the synthesis of thinned linear arrays. The superiority of the proposed algorithm is verified by comparing it to the GA, which our modified algorithm outperformed.
Acknowledgements This work is supported by the National Key R&D Program of China (No.
2018YFC0407101), Fundamental Research Funds for the Central Universities (No. 2019B22314),
National Natural Science Foundation of China (No. 61403121), Program for New Century Excellent
Talents in Fujian Province University (No. GYZ18155), Program for Outstanding Young Scientific
Researcher in Fujian Province University (No. GY-Z160149), and Scientific Research Foundation
of Fujian University of Technology (No. GY-Z17162).
References
9. Chen, K., He, Z., Han, C.: A modified real GA for the sparse linear array synthesis with multiple
constraints. IEEE Trans. Antennas Propag. 54(7), 2169–2173 (2006)
10. Jain, R., Mani, G.S.: Solving antenna array thinning problem using genetic algorithm. Appl.
Comput. Intell. Soft Comput. 24 (2012)
11. Ha, B.V., Mussetta, M., Pirinoli, P., Zich, R.E.: Modified compact genetic algorithm for thinned
array synthesis. IEEE Antennas Wirel. Propag. Lett. 15, 1105–1108 (2016)
12. Quevedo-Teruel, O., Rajo-Iglesias, E.: Ant colony optimization in thinned array synthesis with
minimum sidelobe level. IEEE Antennas Wirel. Propag. Lett. 5, 349–352 (2006)
13. Li, W.T., Shi, X.W., Hei, Y.Q.: An improved particle swarm optimization algorithm for pattern
synthesis of phased arrays. Prog. Electromagn. Res. 82, 319–332 (2008)
14. Mandal, D., Das, S., Bhattacharjee, S., Bhattacharjee, A., Ghoshal, S.: Linear antenna array
synthesis using novel particle swarm optimization. In: 2010 IEEE Symposium on Industrial
Electronics and Applications (ISIEA), pp. 311—316, Oct 2010
15. Wang, W.B., Feng, Q.Y., Liu, D.: Synthesis of thinned linear and planar antenna arrays using
binary PSO algorithm. Prog. Electromagn. Res. 127, 371–388 (2012)
16. Ma, S., Li, H., Cao, A., Tan, J., Zhou, J.: Pattern synthesis of the distributed array based on
the hybrid algorithm of particle swarm optimization and convex optimization. In: 2015 11th
International Conference on Natural Computation (ICNC), pp. 1230–1234, Aug 2015
17. Shi, Y.: Brain storm optimization algorithm. In: International conference in swarm intelligence,
pp. 303–309, June 2011. Springer, Berlin, Heidelberg
Chapter 9
Implementation Method of SVR
Algorithm in Resource-Constrained
Platform
Abstract With the development of the Internet of Things and edge computing,
machine learning algorithms need to be deployed on resource-constrained embedded
platforms. Support Vector Regression (SVR) is one of the most popular algorithms for problems characterized by small samples, high dimensionality, and nonlinearity, thanks to its good generalization ability and prediction performance. However, the SVR algorithm requires a lot of resources when it is implemented. Therefore, this paper proposes a method to implement the SVR algorithm on a resource-constrained embedded platform. The method analyses the characteristics of the data
to the characteristics of the embedded platform, the implementation process of the
algorithm is optimized. Experiments using UCI datasets show that the implemented
SVR algorithm is correct and effective, and the optimized SVR algorithm reduces
time and memory consumption at the same time, which is of great significance for
the implementation of SVR algorithm in resource-constrained embedded platforms.
9.1 Introduction
All calculations in early embedded intelligent systems were concentrated in the MCU, such as A/D conversion, signal conditioning, and dimensional transformation of the sensor signals [1]. However, with the advent of smart sensors, these computing tasks are transferred to the front end or back end of embedded intelligent systems. Such calculations assigned to smart sensors can also be referred to as edge calculations for embedded systems. This transfer of calculations makes the system more uniform and more real-time, and allows the MCU to take on new tasks.
Therefore, it is of great significance and economic value to implement machine
learning algorithms on resource-constrained embedded platforms.
In recent years, machine learning has developed rapidly. Support Vector Regres-
sion (SVR) is widely used in pattern recognition, probability density functions esti-
mation, time series prediction, and regression estimation. In most application sce-
narios, data is collected by the embedded platform and sent to the PC. The training
process of the SVR algorithm is performed on the PC rather than on the embedded
platform.
Since the SVR algorithm needs to occupy a large amount of resources, especially
memory in the training process, it is difficult to implement the SVR algorithm in
the resource-constrained embedded platform. There are not many related researches
in this field. Therefore, this paper proposes a method to implement SVR algorithm
on the resource-constrained embedded platform to reduce the resources and time
consumption of SVR algorithm. In this paper, the data structure of SVR algorithm
and its solution flow are analyzed and then optimized considering the constrained
resource of the embedded platform. Then UCI datasets are applied to verify the
correctness of the implemented SVR algorithm, and the effectiveness of the proposed
time and memory optimization method.
The structure of the rest of the paper is as follows. The second and third sections introduce the principles of the SVR algorithm and the SMO algorithm, respectively; the fourth section proposes the implementation and optimization method of this paper; the fifth section carries out experimental verification and analysis; and the sixth section concludes the paper.
Linear support vector machines were proposed by Cortes and Vapnik [2]. At the same time, Boser, Guyon, and Vapnik introduced the kernel trick and proposed nonlinear support vector machines [3]. Drucker et al. extended them to support vector regression [4].
For a training set T = {(x_1, y_1), ..., (x_l, y_l)}, where x_i ∈ R^n (i = 1, ..., l) is the feature vector, y_i ∈ R is the target value, and l is the number of training samples, SVR seeks a linear model f(x) = w^T x + b such that f(x) is as close as possible to y:
min_{w,b,ξ,ξ*}  (1/2)‖w‖² + C Σ_{i=1}^{l} (ξ_i + ξ*_i)
s.t.  w^T φ(x_i) + b − y_i ≤ ε + ξ_i,
      y_i − w^T φ(x_i) − b ≤ ε + ξ*_i,                    (9.1)
      ξ_i, ξ*_i ≥ 0,  i = 1, ..., l.
Among them C > 0 is the regularization constant, which is used to balance the
complexity and generalization ability of the model. ε > 0 is the upper error limit,
indicating that the sample with an absolute error less than ε is not punished. ξi , ξ∗i ≥ 0
are the relaxation factors used to process samples that exceed the error limit. Using
the Lagrange multiplier method, we can get the dual problem:

min_{α,α*}  (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} (α_i − α*_i) K(x_i, x_j) (α_j − α*_j) + ε Σ_{i=1}^{l} (α_i + α*_i) + Σ_{i=1}^{l} y_i (α_i − α*_i)

s.t.  Σ_{i=1}^{l} (α_i − α*_i) = 0,  0 ≤ α_i, α*_i ≤ C,  i = 1, ..., l.     (9.2)
where K(x_i, x_j) ≡ φ(x_i)^T φ(x_j) is a kernel function introduced to avoid calculating the inner product in the high-dimensional feature space. The most widely used kernel function is the Gaussian kernel K(x_1, x_2) = exp(−γ‖x_1 − x_2‖²), where γ = 1/(2σ²). The KKT conditions need to be met in the process of solving the above dual problem.
After solving the above problem (9.2), the final model can be obtained as

f(x) = Σ_{i=1}^{l} (−α_i + α*_i) K(x_i, x) + b.     (9.3)
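Given the solved coefficients, the final model of Eq. (9.3) can be sketched as follows; the function names are hypothetical and `coef` holds the already-solved (−α_i + α*_i) values:

```python
import numpy as np

def rbf(a, b, gamma):
    """Gaussian kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def svr_predict(x, X_train, coef, b, gamma):
    """Eq. (9.3): f(x) = sum_i (-alpha_i + alpha*_i) K(x_i, x) + b."""
    return sum(c * rbf(xi, x, gamma) for c, xi in zip(coef, X_train)) + b
```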
The training process of SVR is essentially the process of solving the dual problem of the primal convex quadratic programming problem: first solve the dual problem to get the optimal solution (−α + α*), and then calculate b of the optimal solution of the original problem. Such convex quadratic programming problems have global optimal solutions, and many optimization algorithms can be used to solve
88 B. Liu et al.
them, and the sequential minimal optimization (SMO [5]) algorithm is one of the most popular methods. The dual problem of convex quadratic programming that the SMO algorithm solves can be re-expressed as

min_{α,α*}  (1/2) [α*^T, α^T] [K, −K; −K, K] [α*; α] + [εe^T − y^T, εe^T + y^T] [α*; α]

s.t.  y^T [α*; α] = 0,  0 ≤ α_i, α*_i ≤ C,  i = 1, ..., l,     (9.4)
Among them

Q = [K, −K; −K, K],  K_ij = exp(−γ‖x_i − x_j‖²),  y = [1, ..., 1, −1, ..., −1]^T,

where y contains l ones followed by l minus-ones.
The SMO algorithm is a heuristic algorithm. The basic idea is that if the solu-
tions of all variables satisfy the Karush–Kuhn–Tucker (KKT) optimality condition
[6] of the optimization problem, then the solution of this optimization problem is
obtained, because the KKT condition is a sufficient and necessary condition for the
optimization problem. Otherwise, the original quadratic programming problem is
continuously decomposed into suboptimization problems with only two variables,
and the subproblems are solved analytically until all variables satisfy the KKT condition. In each subproblem, one variable is the one that violates the KKT condition most seriously, and the other is automatically determined by the constraint, so the two variables are updated simultaneously. Because the subproblems have analytical solutions, each one is solved very quickly; although many subproblems are needed, the algorithm is generally efficient. The SMO algorithm mainly consists of two parts: an analytical method for solving the two-variable quadratic programming subproblem, and a heuristic method for selecting subproblems. As shown in (9.4), the main difficulty in implementing the SVR algorithm on a resource-constrained embedded platform is that the matrix Q consumes a lot of memory and computation, which needs to be considered and resolved.
9.4 Method
Given formula (9.5) and the principle of the SMO algorithm, this paper first implemented an initial version of the SVR algorithm, then optimized it according to the characteristics of the resource-constrained embedded platform after analyzing the data structure and the algorithm flow. The flowcharts of the initial and the optimized SVR algorithm are shown in Fig. 9.1.
[Flowchart residue: both versions run Start → Initialize → Update iterations, with success/failure branches; the initial version maintains the G matrix and the optimized version the Y matrix]
Fig. 9.1 The flowchart of SVR (the left one is initial and the right one is optimized)
Among them

G = [(ε − y_1), ..., (ε − y_l), (ε + y_1), ..., (ε + y_l)]^T,     (9.5)

Alpha = [α_1, ..., α_l, α*_1, ..., α*_l]^T,  Alpha′ = [−α_1 + α*_1, ..., −α_l + α*_l]^T,     (9.6)

Alpha Status = [s_1, ..., s_2l]^T,  s_i ∈ {upper, lower, free},     (9.7)

Q = [K, −K; −K, K],  Q′ = [K; −K],  QC_ij = K_ii + K_jj − 2K_ij,  i, j = 1, ..., l.     (9.8)
In the solution process of SVR algorithm, since the values in Q need to be called
frequently, the values in QC are also calculated from the values in Q. Therefore, in
order to avoid repeated operations and reduce the time of function calls, matrix Q
and matrix QC are calculated and stored in the beginning.
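This precomputation step can be sketched as follows; it is a vectorized NumPy sketch for illustration, whereas the embedded implementation would use plain loops in C:

```python
import numpy as np

def precompute_kernels(X, gamma):
    """Sketch of the caching step: build the Gaussian kernel matrix K once,
    then derive QC_ij = K_ii + K_jj - 2*K_ij from it (Eq. (9.8)), so neither
    has to be recomputed inside the SMO iterations."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared distances
    K = np.exp(-gamma * d2)
    diag = np.diag(K)                                # K_ii (= 1 for RBF)
    QC = diag[:, None] + diag[None, :] - 2.0 * K
    return K, QC
```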
At the same time, because the Gaussian kernel function is used in this paper, a large number of floating-point exponential operations are needed, but most embedded platforms do not have a separate floating-point unit: the exponential function is implemented in software and is time consuming. Therefore, this paper uses the approximation e^x ≈ (1 + x/N)^N to avoid calculating the exponential function directly. Although this leads to some loss of accuracy, when N is large enough, such as N = 256, the loss can be ignored to some degree.
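With N a power of two, (1 + x/N)^N can be evaluated with repeated squaring, avoiding any call to exp; a sketch (the function name is an assumption):

```python
def exp_approx(x, squarings=8):
    """Approximate e^x as (1 + x/N)^N with N = 2**squarings (N = 256 by
    default, as in the text), computed with repeated squaring so no
    floating-point exponential is needed."""
    n = 1 << squarings          # N = 2**squarings
    y = 1.0 + x / n
    for _ in range(squarings):  # ((1 + x/N)^2)^2 ... = (1 + x/N)^N
        y *= y
    return y
```

For x near zero (as with normalized features and a Gaussian kernel) the error stays well below a tenth of a percent.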
Note that the matrix Q consumes the most memory. For a data set with l training samples stored as floating-point values on a 32-bit embedded platform, Q needs to occupy 4 ∗ 4 ∗ l ∗ l bytes of RAM, but the memory of the embedded platform is very limited. In order to save memory, the implementation of the SVR algorithm exploits the symmetry of Q and cuts it down to Q′, whose memory cost is only half that of Q. At the same time, the Alpha Status matrix in the original algorithm flow indicates the state of each sample and is updated with the value of Alpha′ after each subproblem is solved. But this is not necessary: this paper judges the state of a sample by directly comparing the value of Alpha′ with 0 and C, thus saving the memory occupied by Alpha Status and the time spent updating it.
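Storing only one triangle of a symmetric matrix can be sketched as follows; the class name and index layout are illustrative assumptions, not the layout used in the paper:

```python
import numpy as np

class PackedSymmetric:
    """Sketch of storing only the upper triangle of a symmetric n-by-n
    matrix in a flat array of n*(n+1)/2 floats, roughly halving the RAM
    cost of Q as described in the text."""
    def __init__(self, n):
        self.n = n
        self.data = np.zeros(n * (n + 1) // 2, dtype=np.float32)

    def _idx(self, i, j):
        if i > j:
            i, j = j, i          # symmetry: Q[i, j] == Q[j, i]
        # row i starts after i*n - i*(i-1)/2 stored entries
        return i * self.n - i * (i - 1) // 2 + (j - i)

    def get(self, i, j):
        return self.data[self._idx(i, j)]

    def set(self, i, j, v):
        self.data[self._idx(i, j)] = v
```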
In this paper, LIBSVM dataset [7] and UCI datasets [8–10] are used to verify the
proposed optimization method. The experimental platform is 32-bit ARM micro-
controller STM32F103. The chip has 512 KB Flash and 64 KB SRAM. The clock
frequency used in the experiment is 72 MHz.
A training set of 40 samples was randomly selected; each sample has 5 values, 4 of which are input features and the remaining one is the output. The 4 input features were normalized to the interval (0, 1). Then fivefold cross-validation was performed on the 40 samples to test the prediction accuracy of the SVR. The RMSE and R² of the prediction results of the primary algorithm and the optimized algorithm are shown in Table 9.1.
As shown in Table 9.1, the results of the primary and the optimized SVR algorithm are almost the same: the average root mean square error (RMSE) is only about 1%, and the average R² is above 0.91, which shows that both the primary and the optimized algorithm have good prediction accuracy.
Then the number of training samples and the number of input features are changed
to verify time and memory optimization. At first, the number of input features is fixed
as 4, the number of training samples is 40, 50, 60, 70, respectively. Then the number
of training samples is fixed as 40, and the number of input features is 4, 7, 8, 12,
respectively. Using these data and the SVR algorithm before and after optimization,
16 experiments were performed, and in each experiment, the time taken by the
algorithm initialization process and training and the RAM and ROM occupied by
the algorithm were recorded. The results of time and ROM optimization is shown in
Fig. 9.2.
The experimental results in Fig. 9.2 show that the time of the algorithm training process is mainly related to the number of training samples and has nothing to do with the number of input features, while the time of the initialization process is related to both the number of training samples and the number of input features, which is consistent with the previous analysis. Moreover, the optimization method proposed in this paper reduces the time consumed by the SVR training process and the initialization process by about 25%.
And the experimental results in Fig. 9.2 show that the RAM used in the algorithm
training process is mainly related to the number of training samples, and is indepen-
dent of the number of input features, which is consistent with the previous analysis.
At the same time, using the optimization method proposed in this paper, the ROM
occupied by the SVR algorithm is reduced by about 25%, and the RAM is reduced
by 22–24%.
The above experimental results prove that the proposed method for implementing the SVR algorithm on a resource-constrained platform is correct, and that the proposed performance improvements, which increase the running speed of the algorithm and reduce its memory consumption, are effective.
9.6 Conclusion
With the development of edge computing, machine learning, and the Internet of Things, it is of great significance to implement shallow machine learning algorithms on resource-constrained embedded platforms. SVR is a widely used machine
[Fig. 9.2: Training and initialization time, RAM, and ROM before and after optimization, plotted against the number of samples (40, 50, 60, 70) and the dimension of the feature vectors (4, 7, 8, 12, with 40 samples)]
learning algorithm, but it needs a lot of resources in the training process. Therefore, this paper analyses the characteristics of the process and the data of the algorithm, and combines them with the characteristics of the embedded platform to optimize the algorithm. The experimental results using the UCI data sets demonstrate that the time of each iteration of the SVR algorithm and the time of initialization are both reduced by about 25% by precomputing the data that needs to be frequently invoked, removing redundant algorithm steps, and introducing a substitute for the exponential function. The experimental results also demonstrate that the cost of RAM is reduced by 22–24%, and the cost of ROM by about 25%, by exploiting the symmetry of the data structure, removing unnecessary variables, and adjusting the flow of the algorithm.
Chapter 10
A FPGA-Oriented Quantization Scheme
for MobileNet-SSD
Yuxuan Xie , Bing Liu , Lei Feng, Xipeng Li and Danyin Zou
Abstract The rising popularity of mobile devices with high-performance object detection calls for a method to implement algorithms efficiently on mobile devices. Deep learning is a good approach to achieving state-of-the-art results, but it requires a great deal of computation and resources, while mobile devices are often resource-limited because of their small size. Recently, the FPGA, a device famous for its parallelism, has attracted many attempts to implement deep learning networks on it. After investigation, we chose MobileNet-SSD to implement on FPGA, because this network is designed for mobile devices and its size and cost are relatively small. There are also challenges in implementing the network on FPGA, such as the large demand for resources and the need for low latency, which are important for mobile devices. In this paper, we present a quantization scheme for object detection networks based on FPGA and a process to simulate the FPGA on a PC, which helps us predict the performance of networks on FPGA. Besides, we propose an integer-only inference scheme for FPGA, which greatly reduces the cost of resources. The dynamic fixed-point method is adopted, with improvements for object detection networks, to quantize MobileNet-SSD, an object detection network well suited to embedded systems. Our improvements make its performance better than Ristretto.
10.1 Introduction
Deep learning is gradually replacing traditional computer vision methods and plays an increasingly important role in object detection [1]. To obtain better performance, deep neural networks are becoming more complicated, and their requirements for computation, storage, and energy are extremely large and still increasing, as Table 10.1 shows. At the same time, applying this technology to FPGAs is increasingly popular because of their parallelism and high performance [2]. But the resources of mobile devices are limited and precious, and the FPGA is no exception. It can be difficult to implement deep neural networks on an FPGA and achieve good real-time performance, so approaches to reducing resource consumption and speeding up inference are very popular. Quantizing floating-point data to fixed point is a very effective way to achieve this.
The computational cost represents the number of calculations in an inference; its unit is GFLOPS, i.e., 10^9 floating-point operations. The DRAM access is the number of bytes read from and written to memory, and the throughput is the theoretical number of frames per second.
Approaches to quantization can roughly be divided into two categories. The first category focuses on designing novel networks that exploit computational efficiency to limit resource consumption, such as MobileNet [3] and SqueezeNet [4]. The second quantizes the weights from floating point to other types to reduce the cost of resources; this methodology includes ternary weight networks (TWN [5]) and XNOR-Net [6]. Our scheme also belongs to the second category: it quantizes floating-point data into fixed-point data with a smaller bit width.
It is well established that floating-point arithmetic is more complicated than fixed-point arithmetic and requires more resources and time. In addition, the accuracy loss caused by the reduced precision can be restricted to a small range. DianNao [7] quantizes data to 16-bit fixed point with an accuracy loss of less than 1% on classification networks, and Ristretto successfully quantizes CaffeNet and SqueezeNet to 8 bits in dynamic fixed-point format [8]. When we apply Ristretto to quantize object detection networks, however, the mAP declines greatly. We therefore make some improvements to dynamic fixed point to quantize MobileNet-SSD and obtain higher performance than Ristretto: we quantize the floating-point MobileNet-SSD to fixed point and limit the bit width of the data. Besides, we design an integer-only inference scheme on FPGA, which truly reduces the cost of resources [9]. We also run our fixed-point model in HLS, a simulation tool for FPGA, and obtain a report on resource consumption. To improve working efficiency, we propose a quantization scheme based on FPGA and a method to simulate the FPGA on a PC, and we prove that the data match in every bit. This matters because setting up deep neural networks on an FPGA is difficult as a result of its completely different programming method [10], and we cannot know the performance of a network until it is set up on the FPGA. Simulating the FPGA on a PC is a good way to solve this and can greatly improve working efficiency.
In this section, we introduce our quantization arithmetic, dynamic fixed point, which was proposed as an improvement on plain fixed-point quantization. Early work quantized float models to plain fixed point and could not get good results because of the large loss. Dynamic fixed point addresses the problem that different layers have significantly different dynamic ranges, so every layer can have the least precision loss. In fixed point, each number is formed by three parts: a sign, the integer part, and the fractional part. The data format is shown in Fig. 10.1. We can represent this data format in C++, which allows different types to appear in one struct: a bool variable s represents the sign, two char variables bw and fl stand for the bit width and the length of the fractional part, respectively, and the real (unquantized) data are represented by rd.
$(-1)^s \times 2^{-fl} \times \sum_{i=0}^{bw-2} 2^i x_i$   (10.1)
We can get the quantized data through (10.2), and the precision loss is less than $2^{-fl}$:

$\dfrac{\mathrm{round}(rd \times 2^{fl})}{2^{fl}}$   (10.2)
We define round(x) as follows, where [x] denotes the largest integer not greater than x:

$\mathrm{round}(x) = \begin{cases} [x] & \text{if } [x] \le x < [x] + 0.5 \\ [x] + 1 & \text{if } [x] + 0.5 \le x < [x] + 1 \end{cases}$   (10.3)
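The quantization step in (10.2) and the rounding rule in (10.3) can be sketched as a short Python function. This is a minimal illustration, not the authors' implementation; clipping values that would overflow the representable range is an assumption added here.

```python
import math

def round_half_up(x):
    # (10.3): [x] if [x] <= x < [x] + 0.5, else [x] + 1
    fx = math.floor(x)
    return fx if x < fx + 0.5 else fx + 1

def quantize(rd, bw, fl):
    """Dynamic fixed-point quantization per (10.2).

    bw is the total bit width (1 sign bit + bw-1 magnitude bits, as in
    (10.1)); fl is the length of the fractional part.  Saturating values
    that would overflow the representable range is our added assumption.
    """
    q = round_half_up(rd * 2 ** fl)
    hi = 2 ** (bw - 1) - 1           # largest magnitude expressible by (10.1)
    q = max(-hi, min(hi, q))
    return q / 2 ** fl               # precision loss < 2^-fl

print(quantize(0.7253, 16, 12))      # 0.725341796875
```

With bw = 16 and fl = 12, the value 0.7253 maps to the integer 2971 and back to 2971/4096, illustrating the sub-$2^{-fl}$ precision loss.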
One remaining problem is how to determine the lengths of the fractional and integer parts. Philipp Matthias Gysel uses (10.4) and gets good performance on classification networks; in that equation, the data represent a set of values such as the inputs or weights of a layer.
Besides, we merge the Batch Normalization layers [11] into the neighboring convolution layers to make it convenient to deploy MobileNet-SSD. This is possible because the main function of Batch Normalization is to speed up the training process, and merging has no adverse effect on inference. First, we define μ as the mean of the input data, σ² as its variance, and ε as a small number that keeps the denominator nonzero. In addition, there are two trainable parameters γ and β; since what we quantize is a model that has already been trained, these two parameters can be treated as constants. Then we calculate the intermediate variable α by (10.5):
$\alpha = \dfrac{\gamma}{\sqrt{\sigma^2 + \varepsilon}}$   (10.5)
Then we calculate the two new convolution-layer parameters Weight_new and bias_new through (10.6) and (10.7), where weight and bias are the parameters before merging the Batch Normalization layers. This yields a MobileNet-SSD with no Batch Normalization layers.
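Equations (10.6) and (10.7) are not reproduced in the text above; the sketch below assumes they correspond to the standard Batch Normalization folding formulas, weight_new = α·weight and bias_new = α·(bias − μ) + β, which is an assumption on our part.

```python
import math

def fold_batchnorm(weight, bias, mu, var, gamma, beta, eps=1e-5):
    """Fold a Batch Normalization layer into the preceding convolution.

    alpha = gamma / sqrt(var + eps) is (10.5); the folding formulas
    weight_new = alpha * weight and bias_new = alpha * (bias - mu) + beta
    are the standard ones, assumed here since (10.6)/(10.7) are not
    reproduced in the text.  Per-channel scalars are used for simplicity.
    """
    alpha = gamma / math.sqrt(var + eps)
    weight_new = [alpha * w for w in weight]
    bias_new = alpha * (bias - mu) + beta
    return weight_new, bias_new
```

Folding leaves inference unchanged: for any input, applying BN after the convolution gives the same output as the convolution computed with the folded parameters.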
[Fig. 10.2: Data path of a quantized layer: the 16-bit weight and input are convolved into a 32-bit result, the 32-bit bias is added, the sum passes through ReLU, and the output becomes input2 of the next layer.]
In this section, we describe our quantization scheme and our improvements in detail. The method we use is dynamic fixed point. First, we run several epochs to get the maximum of every layer's input and weight, respectively; then we calculate the length of the integer part to make sure the data will not overflow. Ristretto uses (10.4) to get the length of the integer part, but this method does not perform well on object detection networks, so we modify (10.4) into (10.8) and obtain better performance.
After we get the format of every layer's input and weight, we replace the traditional convolution layer with our own convolution layer, which quantizes the data into fixed point inside the layer. Although the data are still represented as floats, their values are identical to the fixed-point values. To achieve equality between the PC and FPGA results, we quantize the input and weight before the convolution operation. In principle we could also quantize the output of every layer, but the output of the current layer becomes the input of the next layer and is quantized there.
In general, the framework of our quantization scheme is shown in Fig. 10.2. Taking one layer as an example, the data path is as follows. First, we quantize the input and weight into 16 bits based on the length of the integer part. Then we convolve the input and weight, using 32 bits to represent the results so that the data will not overflow. The format of the result depends on the input and weight: the length of the result's fractional part is the sum of the fractional lengths of the input and the weight. We also quantize the bias into a 32-bit integer with the same format as the result, because two fixed-point numbers must have the same fractional length so that their decimal points align when they are added. The result is then sent to ReLU, and finally the data are passed to the next layer.
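The data path just described can be sketched end to end. This is a simplified 1-D Python model under the assumptions already stated (16-bit inputs and weights, 32-bit accumulator and bias, fractional lengths fl_input and fl_weight); a dot product stands in for the convolution.

```python
def quantize_to_int(x, fl):
    # scale by 2^fl and round to the nearest integer (simplified rounding)
    return int(round(x * 2 ** fl))

def quantized_layer(inputs, weights, bias, fl_input, fl_weight):
    """1-D sketch of the Fig. 10.2 data path.  The accumulator's
    fractional length is fl_input + fl_weight, so the bias is
    quantized to that same format before the addition."""
    q_in = [quantize_to_int(x, fl_input) for x in inputs]     # 16-bit inputs
    q_w = [quantize_to_int(w, fl_weight) for w in weights]    # 16-bit weights
    q_bias = quantize_to_int(bias, fl_input + fl_weight)      # 32-bit bias
    acc = sum(a * b for a, b in zip(q_in, q_w)) + q_bias      # 32-bit result
    acc = max(acc, 0)                                         # ReLU
    return acc / 2 ** (fl_input + fl_weight)                  # real value

print(quantized_layer([0.5, -0.25], [1.0, 2.0], 0.125, 12, 12))  # 0.125
```

Dividing the accumulator by $2^{fl\_input + fl\_weight}$ at the end is exactly the recovery step of (10.10).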
The weight and bias are handled in the same way, and the real output value can be recovered through (10.10):

$\dfrac{output}{2^{fl\_input} \times 2^{fl\_weight}}$   (10.10)
Then the results are sent to ReLU. After that, the data are sent into the next layer.
We now prove that the data on the PC and on the FPGA are equal. Having built up the model of the data path, we describe it mathematically and track the value of the data at every stage. The data path on the PC is as follows, where data stands for either input or weight:
$data_{int16} = \dfrac{\mathrm{round}(data_{float32} \times 2^{fl\_data})}{2^{fl\_data}}$   (10.11)

$bias_{int32} = \dfrac{\mathrm{round}(bias_{float32} \times 2^{fl\_input + fl\_weight})}{2^{fl\_input + fl\_weight}}$   (10.12)

$output_{int32} = bias_{int32} + \mathrm{convolution}(input_{int16}, weight_{int16})$   (10.13)

$input2_{int16} = \dfrac{\mathrm{round}(\mathrm{ReLU}(output_{int32}) \times 2^{fl\_input2})}{2^{fl\_input2}}$   (10.14)
The data path of our scheme on the FPGA is as follows, where primed quantities denote the FPGA-side values and data again stands for input or weight:

$data'_{int16} = \mathrm{round}(data_{float32} \times 2^{fl'\_data})$   (10.15)

$bias'_{int32} = \mathrm{round}(bias_{float32} \times 2^{fl'\_input + fl'\_weight})$   (10.16)

$output'_{int32} = bias'_{int32} + \mathrm{convolution}(input'_{int16}, weight'_{int16})$   (10.17)

$input2'_{int16} = \mathrm{round}\!\left(\dfrac{\mathrm{ReLU}(output'_{int32}) \times 2^{fl'\_input2}}{2^{fl'\_input + fl'\_weight}}\right)$   (10.18)
Since the fractional lengths are generated from the unquantized data, these parameters are the same on the FPGA and the PC, so fl′ = fl. Moreover, output′_int32 divided by $2^{fl\_input + fl\_weight}$ equals the PC-side value output_int32, so (10.18) can be simplified to (10.19):

$\dfrac{input2'_{int16}}{2^{fl\_input2}} = \dfrac{\mathrm{round}(\mathrm{ReLU}(output_{int32}) \times 2^{fl\_input2})}{2^{fl\_input2}}$   (10.19)
Comparing with (10.14), we obtain (10.20), which proves that the data on the FPGA and on the PC are completely equal:

$\dfrac{input2'_{int16}}{2^{fl\_input2}} = input2_{int16}$   (10.20)
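The equality in (10.20) can be checked numerically. The sketch below runs both data paths of the proof, scaled floats on the "PC" side and integers only on the "FPGA" side, with illustrative values of our own choosing, and compares the results.

```python
import math

def rnd(x):
    # round-half-up, as in (10.3)
    fx = math.floor(x)
    return fx if x < fx + 0.5 else fx + 1

def pc_path(inputs, weights, bias, fl_in, fl_w, fl_in2):
    # PC side: quantized values stored as (exactly representable) floats
    q = lambda v, fl: rnd(v * 2 ** fl) / 2 ** fl                         # (10.11)/(10.12)
    out = q(bias, fl_in + fl_w) + sum(q(x, fl_in) * q(w, fl_w)
                                      for x, w in zip(inputs, weights))  # (10.13)
    return rnd(max(out, 0.0) * 2 ** fl_in2) / 2 ** fl_in2                # (10.14)

def fpga_path(inputs, weights, bias, fl_in, fl_w, fl_in2):
    # FPGA side: integer-only arithmetic
    qi = lambda v, fl: rnd(v * 2 ** fl)                                  # (10.15)/(10.16)
    out = qi(bias, fl_in + fl_w) + sum(qi(x, fl_in) * qi(w, fl_w)
                                       for x, w in zip(inputs, weights)) # (10.17)
    return rnd(max(out, 0) * 2 ** fl_in2 / 2 ** (fl_in + fl_w))          # (10.18)

inputs, weights, bias = [0.37, -0.81], [1.25, 0.5], 0.6
pc = pc_path(inputs, weights, bias, 12, 12, 12)
fpga = fpga_path(inputs, weights, bias, 12, 12, 12)
assert fpga / 2 ** 12 == pc   # (10.20): bit-for-bit agreement
```

Because every intermediate value is a dyadic rational that fits in a double, the floating-point "PC" computation is exact, which is why the two paths agree bit for bit.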
10.5 Experiments
In this part, we present the performance of our scheme and of Ristretto. The experiments were run on a PC with an Intel Core i5-6200U CPU and no GPU. We evaluate the quantization scheme on VOC0712 with 1000 test images; the metric is mAP, the average of the maximum precisions at different recall values, and the dataset has 20 classes. In addition, we merge the batch normalization layers into the convolutional layers. The results are shown in Table 10.2.
From the results, ours performs better than Ristretto and has nearly no loss when quantized to int16, and it also outperforms Ristretto when the data are quantized to int8. As a result, our quantization scheme can be applied to FPGA without loss, which contributes to the development of AI mobile devices.
References
1. Lee, A.: Comparing Deep Neural Networks and Traditional Vision Algorithms in Mobile
Robotics. Swarthmore University (2015)
2. Chen, X., Peng, X., Li, J.-B., Peng, Yu.: Overview of deep kernel learning based techniques
and applications. J. Netw. Intell. 1(3), 83–98 (2016)
3. Howard, A.G., Zhu, M., Chen, B., et al.: Mobilenets: efficient convolutional neural networks
for mobile vision applications (2017). arXiv:1704.04861
4. Iandola, F.N., Han, S., Moskewicz, M.W., et al.: Squeezenet: Alexnet-level accuracy with 50x
fewer parameters and <0.5 mb model size (2016). arXiv:1602.07360
5. Yin, P., Zhang, S., Xin, J., et al.: Training ternary neural networks with exact proximal operator
(2016). arXiv:1612.06052
6. Rastegari, M., Ordonez, V., Redmon, J., et al.: Xnor-net: Imagenet classification using binary
convolutional neural networks. In: European Conference on Computer Vision, pp. 525–542.
Springer, Cham (2016)
7. Chen, Y., Du, Z., Sun, N., Wang, J., et al.: Diannao: a small-footprint high-throughput acceler-
ator for ubiquitous machine-learning. In: ASPLOS, vol. 49, no. 4, pp. 269–284. ACM (2014)
8. Kuang, F.-J., Zhang, S.-Y.: A novel network intrusion detection based on support vector machine
and tent chaos artificial bee colony algorithm. J. Netw. Intell. 2(2), 195–204 (2017)
9. Fan, C., Ding, Q.: ARM-embedded implementation of H.264 selective encryption based on
chaotic stream cipher. J. Netw. Intell. 3(1), 9–15 (2018)
10. Gysel, P.: Ristretto: hardware-oriented approximation of convolutional neural networks (2016).
arXiv:1605.06402
11. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing
internal covariate shift (2015). arXiv:1502.03167
12. Liu, B., Zou, D., Feng, L., Feng, S., Fu, P., Li, J.: An FPGA-based CNN accelerator integrating
depthwise separable convolution. Electronics 8, 281 (2019)
Chapter 11
A High-Efficient Infrared Mosaic
Algorithm Based on GMS
Abstract The most important component of infrared image mosaic technology is image registration. To meet the real-time requirements of the battlefield, this paper uses the ORB algorithm for feature extraction. To obtain high-quality feature matching points, this paper proposes a new IGMS algorithm on the basis of the GMS algorithm; experimental results show that the correct matching rate is increased by 8%. RANSAC is used to obtain the transformation matrix between the two images, and finally the fade-in and fade-out fusion algorithm is used to obtain a complete wide-field military reconnaissance map.
11.1 Introduction
(2) In the feature extraction stage, in order to meet the requirements of real-time matching, this paper uses the ORB algorithm, which is fast to compute. The algorithm not only has rotation invariance and scale invariance, but can also resist noise interference.
(3) In the image registration stage, based on the unidirectional matching principle
in GMS, a bilateral search algorithm is proposed to achieve high-quality feature
matching. Experimental data shows that the accuracy of the matching algorithm
is increased by 8%.
(4) In the image fusion stage, for images with large overlapping regions, a fade-in and fade-out fusion algorithm is used to eliminate the stitching gap that uneven illumination causes in image registration.
The development of image mosaic technology can be traced back to 1996, when Richard Szeliski proposed a new eight-parameter panoramic mosaic model based on L-M iterative nonlinear minimization [1]; its biggest advantages are fast convergence and very high accuracy. In 2000, Shmuel Peleg, inspired by the movement of the camera, proposed an image mosaic algorithm that adaptively selects the motion model [2]. In 2004, M. Brown and D. Lowe [3] proposed the SIFT (Scale-Invariant Feature Transform) feature point detection algorithm for multi-resolution image mosaic; the feature points have scale invariance and rotation invariance, and in later experiments SIFT achieved a good stitching effect. In 2006, Bay [4] applied dimensionality reduction to the SIFT operator and proposed the new SURF feature, which is about twice as fast to compute as SIFT. In 2008, BRIEF grew out of research that uses binary tests to train a set of classification trees [5], using a probabilistic model to implement automatic sorting. In 2009, Rittavee Matungka proposed the APT (Adaptive Polar Transform) algorithm [6], which solves the problem of unevenness in the polar coordinate transformation. In 2010, Jungpil Shin [7] used energy spectrum technology to eliminate the ghosts that appear after image fusion, making the transition across the image gap smoother. In 2011, Ethan Rublee [8] proposed the ORB algorithm, suitable for real-time applications: features are extracted by the FAST algorithm and then described using the BRIEF algorithm, while maintaining rotation invariance and scale invariance. Its biggest contribution is that the overall computation is about 100 times faster than SIFT and 10 times faster than SURF.
In 2013, Yigitsoy M. and Navab N. established a new mosaic method based on the idea of structure propagation. The core of the algorithm is to divide the overall image into many subregions, search for valid information in each subregion, and then propagate it mutually among the subregions; with this method, a structural overall mapping can be formed between the two images [9]. In 2017, Jiawang Bian proposed a simple encapsulation of statistical feature matching methods. The algorithm can quickly distinguish correct matches from false matches and greatly improves the stability of feature point matching; experimental data show that the GMS algorithm is strongly real-time and highly robust [10].
The ORB [11] algorithm was proposed by Ethan Rublee in 2011. It combines the
detection method of FAST feature points with the BRIEF descriptor.
The FAST detector [12] mainly considers the gray-level change around a pixel: if the difference between the gray values of pixels in the neighborhood and the point under test is sufficiently large, the candidate point is a feature point. Although the FAST algorithm is fast, it only compares gray values. To achieve scale and rotation invariance, the ORB algorithm constructs a Gaussian pyramid and uses the intensity centroid [13] method: the pyramid provides the scale feature, and the main direction is determined by the offset between the gray centroid and the corner point.
BRIEF [14] describes the feature point with a binary code string, which greatly improves speed. To make the BRIEF descriptor rotation invariant, the coordinate system that ORB establishes when computing the BRIEF descriptor is centered on the feature point, with the X-axis of the 2D [15] coordinate system along the line connecting the feature point and the centroid of the point region [16]. Thus, regardless of how the image is rotated, the coordinate system of the ORB point pairs is fixed [17], and at different rotation angles the points taken out in the same point pattern are consistent [18]. This solves the problem of rotation consistency.
using the BRIEF algorithm, and finally GMS is used to eliminate the false matches.
Although the GMS algorithm has good robustness and efficiency, its correct matching rate is still insufficient. Experiments show that when the overlapping area of the two pictures is very large, two or more points in Ia may match the same point of Ib, and GMS cannot eliminate these matches. We therefore use the idea of bilateral matching to obtain better matching points, and we call the improved algorithm IGMS. As shown in Fig. 11.1, we first take a point A in Ia and find its matching point B in Ib; we then find the matching point A′ corresponding to B in Ia. If A and A′ are the same point, we regard A and B as a true matching pair. The calculation formula is
$\begin{cases} A \to B, & A \in I_a \\ B \to A', & B \in I_b \end{cases} \quad \text{if } A = A' \text{ then } A \Leftrightarrow B$   (11.1)
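The bilateral (cross-check) criterion in (11.1) can be sketched in a few lines. This is a minimal Python illustration over precomputed nearest-neighbour maps; match_ab and match_ba are assumed inputs mapping each keypoint index to its best match in the other image.

```python
def bilateral_matches(match_ab, match_ba):
    """Keep only pairs (A, B) with match_ab[A] == B and match_ba[B] == A,
    i.e. the cross-check of (11.1).  match_ab / match_ba are dicts mapping
    keypoint indices Ia -> Ib and Ib -> Ia, respectively."""
    return [(a, b) for a, b in match_ab.items()
            if match_ba.get(b) == a]

# Two points in Ia (0 and 1) both match point 5 in Ib; only the
# mutually consistent pair survives the bilateral check.
ab = {0: 5, 1: 5, 2: 7}
ba = {5: 0, 7: 2}
print(bilateral_matches(ab, ba))  # [(0, 5), (2, 7)]
```

This is exactly the many-to-one situation described above: point 1 of Ia also claims point 5 of Ib, but only the pair confirmed in both directions is kept.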
The fade-in and fade-out method determines the weight of each pixel in the overlap region according to its distance from the boundary of the region. Its calculation formula is as follows:
$f(x, y) = \begin{cases} f_1(x, y) & (x, y) \in R_1 \text{ and } (x, y) \notin R_2 \\ d_1 f_1(x, y) + d_2 f_2(x, y) & (x, y) \in R_1 \text{ and } (x, y) \in R_2 \\ f_2(x, y) & (x, y) \in R_2 \text{ and } (x, y) \notin R_1 \end{cases}$   (11.2)
where d1 = (M − width)/M and d2 = width/M must satisfy 0 < d1, d2 < 1 and d1 + d2 = 1. As shown in Fig. 11.2, width represents the width of the overlapping pixels of the two images, M represents the overlapping pixel area, and W1 and W2 are the widths of the two images.
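A 1-D sketch of the weighted blend in (11.2) follows; it assumes, as an interpretation on our part, that the weights d1 = (M − w)/M and d2 = w/M are evaluated per column as w runs across the overlap.

```python
def fade_blend(f1, f2, overlap):
    """Blend two 1-D image rows whose last/first `overlap` pixels coincide.

    Outside the overlap each image is copied directly, per the first and
    third cases of (11.2); inside it, the weights d1 = (M - w)/M and
    d2 = w/M shift linearly from image 1 to image 2.
    """
    M = overlap
    left = f1[:-M]                       # (x, y) in R1 only
    right = f2[M:]                       # (x, y) in R2 only
    blended = []
    for w in range(M):                   # (x, y) in both R1 and R2
        d1, d2 = (M - w) / M, w / M
        blended.append(d1 * f1[len(f1) - M + w] + d2 * f2[w])
    return left + blended + right

row = fade_blend([10, 10, 10, 10], [30, 30, 30, 30], 2)
print(row)  # [10, 10, 10.0, 20.0, 30, 30]
```

The linear ramp across the overlap is what removes the visible stitching gap caused by uneven illumination.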
For the feature matching of infrared images, the following three experiments compare the ORB+RANSAC, ORB+GMS, and ORB+IGMS matching algorithms in terms of matching accuracy, RMSE, and matching speed. The experiments were performed on a computer with an Intel Core i7 CPU at 3.60 GHz, 8.00 GB of memory, and the Windows 7 operating system. The algorithms are implemented in OpenCV 2.4.13 with Visual Studio 2017.
The following are the matching effect pictures of three sets of infrared images collected in the garden; the image size is 720 × 576.
There are three evaluation indicators for image registration: CMR, RMSE, and
Matching Speed.
(1) CMR (Correct Matching Rate) is the ratio of the number of correct matching points NC to the total number of matching points N. The larger the CMR, the more accurate the matching. The calculation expression for CMR is

$CMR = N_C / N$   (11.3)
110 X. Pei et al.
(2) RMSE (Root Mean Square Error) measures the deviation between observed and true values: it is the square root of the mean of the squared deviations. The larger its value, the greater the difference between the reference image and the registered image. The calculation expression for RMSE is:

$E_{rmse}(f) = \sqrt{\dfrac{1}{N}\sum_{i=1}^{N} \left\| (x_i, y_i) - f(x'_i, y'_i) \right\|^2}$   (11.4)

where N represents the number of exact matching pairs, (x′i, y′i) represents the coordinates of the feature points in the image to be registered, and (xi, yi) represents the coordinates of the corresponding feature points in the reference picture.
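Equation (11.4) can be computed directly; in the small Python sketch below, transform is an assumed estimated mapping f applied to points in the image to be registered.

```python
import math

def registration_rmse(ref_pts, reg_pts, transform):
    """RMSE per (11.4): ref_pts are (x_i, y_i) in the reference image,
    reg_pts are (x'_i, y'_i) in the image to be registered, and
    transform is the estimated mapping f from reg_pts to ref_pts."""
    n = len(ref_pts)
    total = 0.0
    for (x, y), p in zip(ref_pts, reg_pts):
        fx, fy = transform(p)
        total += (x - fx) ** 2 + (y - fy) ** 2
    return math.sqrt(total / n)

# A pure 2-pixel horizontal shift, perfectly recovered: RMSE is 0.
shift = lambda p: (p[0] + 2, p[1])
print(registration_rmse([(3, 4), (7, 1)], [(1, 4), (5, 1)], shift))  # 0.0
```

A lower value means the estimated transformation maps the registered image's feature points closer to their reference positions.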
(3) Matching speed is measured by the running time of the algorithm: the longer the run time, the slower the matching.
As shown in Table 11.1, the registration rates of the three groups of experiments are compared. From the data, the false matching points remaining after purification with the RANSAC algorithm are still relatively numerous, as shown in Fig. 11.3. The ORB+GMS registration rate is relatively high, as shown in Fig. 11.4, but IGMS has the highest registration accuracy after bilateral matching. Experiments show that the accuracy of IGMS is 8% higher than that of GMS (Fig. 11.5).
Table 11.2 Match rate and root mean square error table

Image | Algorithm | RMSE | Matching speed/s
A | ORB+RANSAC | 1.071 | 0.16931
A | ORB+GMS | 1.003 | 0.0704
A | IGMS | 0.679 | 0.0333
B | ORB+RANSAC | 1.365 | 0.1559
B | ORB+GMS | 1.264 | 0.0503
B | IGMS | 0.726 | 0.0345
C | ORB+RANSAC | 0.951 | 0.1687
C | ORB+GMS | 0.841 | 0.0661
C | IGMS | 0.545 | 0.0419
As shown in Table 11.2, from the perspective of registration accuracy, the smaller the RMSE value, the higher the registration accuracy, so the IGMS registration is the most accurate. In terms of matching speed, the matching time of IGMS is the shortest, so it is better suited to real-time matching requirements. In summary, IGMS is more suitable for matching infrared images in terms of registration accuracy, registration rate, and real-time performance.
are smoother, and the texture and edge details of the image remain very complete,
as shown in Fig. 11.6.
11.6 Conclusion
The IGMS matching algorithm in this paper is well suited to the stitching of infrared images. Experimental data show that IGMS eliminates mismatched pairs better than the RANSAC algorithm, and the matching accuracy of IGMS is 8% higher than that of GMS. Finally, the fade-in and fade-out fusion algorithm makes the stitching gap of the image transition more smoothly and naturally, so the algorithm achieves very good results in the mosaic of infrared images.
References
1. Shin, J., Tang, Y.: Deghosting for image stitching with automatic content-awareness. Pattern
Recognit. 23(26), 26–27
2. Zheng, W., Zhengchao, C., Bing, Z., et al.: CBERS-1 digital images mosaic and mapping of
China. 11(6), 787–791
3. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis.
60(2), 91–110
4. Bay, H., Tuytelaars, T., Van Gool, L.: Surf: speeded up robust features. In: European Conference
on Computer Vision, May 2006
5. Calonder, M., Lepetit, V., Fua, P.: Keypoint signatures for fast learning and recognition. In:
European Conference on Computer Vision (2008)
6. Matungka, R., Zheng, Y.F.: Image registration using adaptive polar transform. IEEE Trans.
Image Process. 18(10), 2340–2354 (2009)
7. Schmid, C., Mohr, R.: Local grayvalue invariants for image retrieval. IEEE Trans. Pattern Anal.
Mach. Intell. 19(5), 530–534 (1997)
8. Rublee, E., Rabaud, V., Konolige, K., et al.: ORB: An efficient alternative to SIFT or SURF
(2011)
9. Yigitsoy, M., Navab, N.: Structure propagation for image registration. IEEE Trans. Med. Imag-
ing 32(9), 1657–1670 (2013)
10. Bian, J., Lin, W.-Y., Matsushita, Y., Yeung, S.-K., Nguyen, T.D., Cheng, M.-M.: GMS: grid-
based motion statistics for fast, ultra-robust feature correspondence. In: IEEE CVPR (2017)
11. Rublee, E., Rabaud, V., Konolige, K.: ORB: an efficient alternative to SIFT or SURF. IEEE
Int. Conf. Comput. Vis. 58(11), 2564–2571 (2011)
12. Rosten, E., Porter, R., Drummond, T.: Faster and better: a machine learning approach to corner
detection. IEEE Trans. Pattern Anal. Mach. Intell. 32, 105–119 (2010)
13. Rosin, P.L.: Measuring corner properties. Comput. Vis. Image Underst. 73(2), 291–307 (1999)
14. Calonder, M., Lepetit, V., Strecha, C.: BRIEF: binary robust independent elementary features.
In: European Conference on Computer Vision, pp. 778–792 (2010)
15. Chen, W.-K., Chen, H.-P., Tso, H.-K.: A friendly and verifiable image sharing method. J. Netw.
Intell. 1(1), 46–51 (2016)
16. Shen, W., Hao, S., Qian, J., Li, L.: Blind quality assessment of dehazed images by analyzing
information, contrast, and luminance. J. Netw. Intell. 2(1), 139–146 (2017)
17. Hong, S., Wang, A., Zhang, X., Gui, Z.: Low-dose CT image processing using artifact sup-
pressed total generalized variation. J. Netw. Intell. 3(1), 26–49 (2018)
18. Harold, C., Nelta, N.: Blind images quality assessment of distorted screen content images. J.
Netw. Intell. 3(2), 91–101 (2018)
Chapter 12
A Load Economic Dispatch Based on Ion
Motion Optimization Algorithm
Abstract This paper presents a new approach for dispatching the generating powers of thermal plants based on the ion motion optimization algorithm (IMA). Electrical power systems are constrained by power balance, transmission loss, and generating capacity. The scheduling of power generating units to stabilize the different dynamic responses of the controlled power system is mathematically modeled as the objective function: economic load dispatch (ELD) is taken as the objective and optimized by applying IMA. In the experimental section, several cases with different numbers of thermal units are used to test the performance of the proposed approach. Preliminary results compared with other methods in the literature show that the proposed approach performs better.
12.1 Introduction
Recently, electrical modes of renewable energy sources have increased rapidly [1]. Fast load changes and fluctuations in power grids require a stable balance. Economic load dispatch (ELD) [2] refers to a scheduling method that rationally allocates the productive output of each generating unit to meet the constraints of the power system
$F = \sum_{i=1}^{m} f_i(P_i)$   (12.1)
According to the analysis, the cost function of the ith generating unit, fi(Pi), is a quadratic polynomial described as:

$f_i(P_i) = a_i + b_i P_i + c_i P_i^2 \ (\$/\mathrm{h})$   (12.2)
The following variables and notations are used. PD is the power load demand; Pi is the active power delivered; Pi^min and Pi^max are the minimum and maximum real power outputs; Pi^0 is the previous real power output; ai, bi and ci are the fuel cost coefficients; ei and fi are the coefficients of the valve-point effects in the power grid system; m is the number of committed units; PLoss denotes the transmission loss; Bij, B0i, B00 are the B-matrix coefficients for transmission power loss; Uri and Dri are the up-ramp and down-ramp limits; P^L_i,pzk and P^U_i,pzk are the lower and upper limits of the kth prohibited zone of the ith generating unit; Imax is the maximum number of iterations and I is the current iteration.
According to the equality constraint, the total power generation $\sum_{i=1}^{m} P_i$ is equal to the load demand PD plus the total loss:

$\sum_{i=1}^{m} P_i = P_D + P_{Loss}$   (12.3)
For the loss coefficients on the transmitted load, the total loss is derived as

$P_{Loss} = \sum_{i=1}^{m}\sum_{j=1}^{m} P_i B_{ij} P_j + \sum_{i=1}^{m} B_{0i} P_i + B_{00}$   (12.4)
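The cost (12.1)-(12.2), the balance constraint (12.3), and the B-loss formula (12.4) can be put together in a few lines. The Python sketch below uses illustrative, made-up coefficients; it is not a dispatch from the paper's test cases.

```python
def fuel_cost(P, a, b, c):
    # (12.1)-(12.2): total quadratic fuel cost in $/h
    return sum(ai + bi * Pi + ci * Pi ** 2
               for ai, bi, ci, Pi in zip(a, b, c, P))

def b_loss(P, B, B0, B00):
    # (12.4): transmission loss from the B-matrix coefficients
    n = len(P)
    quad = sum(P[i] * B[i][j] * P[j] for i in range(n) for j in range(n))
    return quad + sum(B0[i] * P[i] for i in range(n)) + B00

def balance_violation(P, PD, B, B0, B00):
    # (12.3): should be ~0 for a feasible dispatch
    return sum(P) - PD - b_loss(P, B, B0, B00)

# Hypothetical 2-unit example (all coefficients are illustrative only).
P = [100.0, 150.0]
a, b, c = [500, 400], [5.3, 5.5], [0.004, 0.006]
B = [[3e-5, 1e-5], [1e-5, 4e-5]]
print(fuel_cost(P, a, b, c))                  # 2430.0 $/h
print(b_loss(P, B, [0.0, 0.0], 0.0))          # 1.5 MW
```

A metaheuristic such as IMA would search over P subject to the capacity limits while driving the balance violation to zero, typically via a penalty term added to F.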
The ion motion optimization algorithm (IMA) [15] simulates the motion of ions with an anion (negative charge) set and a cation (positive charge) set. These two sets serve as the candidate solutions during the optimization. They follow different evolutionary strategies in the liquid phase and the solid phase, and circulate between the two phases to optimize the ions. Ions in IMA move toward the best ion of the opposite charge: anions move toward the best cation, and cations move toward the best anion. This movement guarantees the improvement of all ions throughout the iterations. The strength of an ion's movement depends on the attraction/repulsion forces between the ions, and the amount of this force specifies the momentum of each ion. The following steps represent the process of the algorithm.
Initialization
An initial random population is randomly generated according to a uniform distri-
bution within the lower and upper boundaries with D dimensions.
Liquid phase
In the liquid phase, the anion group (A) and the cation group (C) are updated according
to the following patterns, respectively:

Ai,j = Ai,j + AFi,j × (Cbestj − Ai,j)   (12.9)

Ci,j = Ci,j + CFi,j × (Abestj − Ci,j)   (12.10)
where Cbest and Abest are the best cation and the best anion, respectively. Subscript
i = 1, 2, 3, …, NP/2 (NP/2 is the size of each ion population), and j = 1, 2, 3, …, D.
For a minimization problem, the best anion and the best cation are those with the
lowest fitness value in the entire anion group and cation group, respectively.
The resultant attraction forces AFi,j and CFi,j are mathematically modeled as follows:

AFi,j = 1 / (1 + e^{−0.1/ADi,j})   (12.11)

CFi,j = 1 / (1 + e^{−0.1/CDi,j})   (12.12)

where ADi,j and CDi,j are the distances in dimension j of the ith anion from the best
cation and of the ith cation from the best anion, respectively: ADi,j = |Ai,j − Cbestj|
and CDi,j = |Ci,j − Abestj|.
Solid phase
With iteration, the ions gradually gather near the optimal ion under the attractive
force. The solid phase is designed to break this excessive concentration and to provide
diversity in case over-concentration of the ions makes the algorithm fall into a local
optimum. As in the physical process, the ion motion gradually slows down from the
initial intense motion as the iterations proceed, and the liquid-state ions gradually
recrystallize into crystals. This recrystallization process, as simulated in the IMA,
is known as the solid phase.
Aj = Aj + ϕ1 × (Cbest − 1),  if rand > 0.5
Aj = Aj + ϕ1 × Cbest,        otherwise          (12.13)

Cj = Cj + ϕ2 × (Abest − 1),  if rand > 0.5
Cj = Cj + ϕ2 × Abest,        otherwise          (12.14)
Termination condition
After the solid-phase evolution strategy is completed, the algorithm checks whether
its termination conditions are reached. The termination conditions include a preset
accuracy, a maximum number of iterations, and so on. If a condition is reached, the
optimal ion is output directly; otherwise, the anions and cations return from the
solid phase to the liquid phase and continue to iterate. In this process, anions and
cations circulate between the liquid and solid phases, and the optimal solution is
gradually obtained with iteration.
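The phases above can be sketched as a compact routine on a generic objective. The forces follow Eqs. (12.11) and (12.12) and the updates follow Eqs. (12.9), (12.10), (12.13), and (12.14); the solid-phase trigger rate and the ϕ coefficients are assumptions, since the chapter does not fix them numerically.

```python
import math
import random

def ima_minimize(f, dim, bounds, n_ions=20, iters=200, seed=1):
    """Minimal IMA sketch: liquid-phase attraction plus occasional
    solid-phase re-crystallisation; returns the best ion evaluated."""
    rng = random.Random(seed)
    lo, hi = bounds
    anions = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_ions // 2)]
    cations = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_ions // 2)]
    gbest = min(anions + cations, key=f)[:]          # best ion seen so far
    for _ in range(iters):
        abest = min(anions, key=f)[:]
        cbest = min(cations, key=f)[:]
        # Liquid phase: each ion drifts toward the best opposite charge with a
        # force that grows as the distance shrinks (Eqs. 12.9-12.12).
        for group, best in ((anions, cbest), (cations, abest)):
            for ion in group:
                for j in range(dim):
                    dist = max(abs(ion[j] - best[j]), 1e-12)   # AD_ij / CD_ij
                    force = 1.0 / (1.0 + math.exp(-0.1 / dist))
                    ion[j] += force * (best[j] - ion[j])
        # Solid phase: occasional perturbation around the opposite best
        # restores diversity (Eqs. 12.13-12.14); the 0.1 trigger is assumed.
        if rng.random() < 0.1:
            for group, best in ((anions, cbest), (cations, abest)):
                for ion in group:
                    phi = rng.uniform(-1.0, 1.0)     # assumed coefficient range
                    for j in range(dim):
                        shift = best[j] - 1.0 if rng.random() > 0.5 else best[j]
                        ion[j] += phi * shift
        cand = min(anions + cations, key=f)
        if f(cand) < f(gbest):
            gbest = cand[:]
    return gbest

best = ima_minimize(lambda p: sum(x * x for x in p), dim=2, bounds=(-5.0, 5.0))
print([round(x, 3) for x in best])
```

On the sphere function the two populations contract toward each other; the routine returns the best ion evaluated during the run.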
The search space of the ELD optimization includes both feasible and infeasible points;
the main task is to identify the feasible points that produce near-optimum results
within the boundary framework. Feasible points must satisfy all the constraints, while
infeasible points violate at least one of them. As mentioned in the above section,
power system economic scheduling problems have multiple constraints, such as power
balance constraints, operational constraints, ramp limits, and prohibited operating
zones. These constraints make the feasible domain of the problem very complicated [3].
Therefore, the solution or set of optimized points must be feasible, i.e., the points
must satisfy all constraints. It is thus essential to design a suitable objective
function, on which the success of an optimization problem depends. Performance indices
are widely used for this purpose. The objective function is characterized by the given
execution conditions and constraints [3].
To handle the constraints, we use penalty functions to deal with infeasible points.
We attempt to solve an unconstrained problem in the search space by modifying the
objective function in Eq. (12.1). The formulation is as follows:

Min f = f(Pi),                if Pi ∈ F
Min f = f(Pi) + penalty(Pi),  otherwise          (12.15)
120 T.-T. Nguyen et al.
Fig. 12.1 Flowchart of the proposed IMA for dispatch power generation (ELD): model
the dispatch space from the ELD model; map the ions and search feasible points for
optimization; calculate the objectives and check feasibility and success; increment
i and loop while i < iterMax
The distance to the nearest points in the feasible region measures the effort needed
to refine the solution.

Min f = ∑_{i=1}^{n} Fi(Pi) + q1·(∑_{i=1}^{n} Pi − PL − PD)² + q2·∑_{j=1}^{n} Vj   (12.17)
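A minimal sketch of the penalized fitness of Eq. (12.17) follows, using the factors q1 = 1000 and q2 = 1 stated in the simulation settings. The unit data are hypothetical, and interpreting Vj as the distance outside each unit's operating limits is an assumption.

```python
Q1, Q2 = 1000.0, 1.0  # penalty factors from the chapter's simulation settings

def penalized_cost(p, coeffs, limits, pd, ploss=0.0):
    """Fuel cost plus quadratic power-balance penalty and limit-violation
    terms, in the spirit of Eq. (12.17)."""
    cost = sum(a + b * pi + c * pi * pi for (a, b, c), pi in zip(coeffs, p))
    balance = sum(p) - ploss - pd                   # equality constraint (12.3)
    violation = sum(max(lo - pi, 0.0) + max(pi - hi, 0.0)   # assumed V_j
                    for pi, (lo, hi) in zip(p, limits))
    return cost + Q1 * balance ** 2 + Q2 * violation

coeffs = [(240.0, 7.0, 0.0070), (200.0, 10.0, 0.0095)]  # hypothetical units
limits = [(100.0, 500.0), (50.0, 200.0)]                # hypothetical MW limits
print(penalized_cost([300.0, 150.0], coeffs, limits, pd=450.0))  # → 4883.75
```

A balanced, in-limit dispatch incurs no penalty; a 10 MW imbalance would add 1000 × 10² to the fitness, steering the search back to feasibility.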
Fig. 12.2 Comparison of the proposed IMA for dispatching load scheduling generators
with the FA, GA, and PSO approaches under the same conditions: mean fitness value
(×10⁴) versus iterations (0–200), case study with a six-unit system
The penalty factors and constants associated with the power balance are tuned
empirically: q1 is set to 1000 and q2 is set to 1 in the simulation section.
The necessary steps of IMA optimization for scheduling power generation dispatch are:
Step 1. Initialize the IMA population associated with the modeled dispatch space.
Step 2. Update the anion group (A) and the cation group (C) according to Eqs. (12.9)
and (12.10), respectively.
Step 3. Evaluate the ions with the fitness function of Eq. (12.17), determine the
current nearest solutions, and update their positions in the feasible archive.
Step 4. If the termination condition (e.g., the maximum number of iterations) is not
met, go to Step 2; otherwise, terminate the process and output the result (Fig. 12.1).
To evaluate the performance of the proposed approach, we use case studies of
six-unit and fifteen-unit systems to optimize the objective function in Eq. (12.17).
The outcomes of the test cases for dispatch ELD are compared with other approaches,
i.e., the genetic algorithm (GA) [11], the firefly algorithm (FA) [16], and particle
swarm optimization (PSO) [10]. The parameters are set as follows: the population size
N is set to 40, and the dimension of the solution space D is set to 6 and 15 for the
six-unit and fifteen-unit systems, respectively. The maximum iteration is set to 200,
and the number of runs is set to 15. The final results are averaged over all runs.
The compared results for ELD are shown in Fig. 12.2.
A. Case study of six units
The features of the system with six thermal units are listed in Table 12.1. The power
load demand is set to 1200 MW.
The coefficients of Eq. (12.2) for the six-unit system operating normally with a
capacity base of 100 MVA are given as follows:
Bij = 10⁻³ ×
⎡ 0.15 0.17 0.14 0.19 0.26 0.22 ⎤
⎢ 0.17 0.60 0.13 0.16 0.15 0.20 ⎥
⎢ 0.15 0.13 0.65 0.17 0.24 0.19 ⎥
⎢ 0.19 0.16 0.17 0.71 0.30 0.25 ⎥
⎢ 0.26 0.15 0.24 0.30 0.69 0.32 ⎥
⎣ 0.22 0.20 0.19 0.25 0.32 0.85 ⎦

B0 = 10⁻³ × [−0.390 −0.129 0.714 0.059 0.216 −0.663],
B00 = 0.056,
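With the six-unit B coefficients above, the transmission loss of Eq. (12.4) can be evaluated directly. The dispatch vector in this sketch is a hypothetical example, with powers in per-unit on the 100 MVA base.

```python
# B-matrix loss coefficients for the six-unit system (from the chapter).
B = [[r * 1e-3 for r in row] for row in [
    [0.15, 0.17, 0.14, 0.19, 0.26, 0.22],
    [0.17, 0.60, 0.13, 0.16, 0.15, 0.20],
    [0.15, 0.13, 0.65, 0.17, 0.24, 0.19],
    [0.19, 0.16, 0.17, 0.71, 0.30, 0.25],
    [0.26, 0.15, 0.24, 0.30, 0.69, 0.32],
    [0.22, 0.20, 0.19, 0.25, 0.32, 0.85],
]]
B0 = [v * 1e-3 for v in [-0.390, -0.129, 0.714, 0.059, 0.216, -0.663]]
B00 = 0.056

def transmission_loss(p):
    """P_Loss = p'Bp + B0'p + B00 (Eq. 12.4), p in per-unit."""
    quad = sum(p[i] * B[i][j] * p[j] for i in range(6) for j in range(6))
    lin = sum(B0[i] * p[i] for i in range(6))
    return quad + lin + B00

p = [4.0, 2.0, 1.5, 1.0, 3.0, 0.5]   # hypothetical per-unit outputs
print(round(transmission_loss(p), 4))
```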
Table 12.4 depicts the comparison of the proposed approach with the other procedures,
e.g., the FA, GA, and PSO methods, under the same conditions for the optimization of
the system with 15 generators. The statistical results, including the generation cost,
evaluation value, and average CPU time, are summarized in the table.
As observed from the tables, the quality performance of the proposed method in terms
of cost, power loss, and time consumption is better than that of the other approaches;
the proposed IMA outperforms the other methods. The observed results in terms of
convergence speed and time consumption likewise show that the proposed optimization
method outperforms the other methods.
12.5 Conclusion
In this paper, we presented a new approach based on the ion motion optimization
algorithm (IMA) for dispatching the outputs of power generators. Economic load
dispatch (ELD) is optimized with different responses of the control system in
balancing, transmission loss, and generating capacity. Linear equality and inequality
constraints were employed in modeling the objective function. In the experimental
section, several cases with different numbers of thermal units are used to test the
performance of the proposed approach.
The preliminary results are compared with other methods in the literature, such as
FA, GA, and PSO, under the same conditions; the comparison shows that the proposed
approach provides better quality performance and requires less runtime than the
other methods.
References
1. Tsai, C.-F., Dao, T.-K., Pan, T.-S., Nguyen, T.-T., Chang, J.-F.: Parallel bat algorithm applied
to the economic load dispatch problem. J. Internet Technol. 17 (2016). https://doi.org/10.6138/
JIT.2016.17.4.20141014c
2. Al-Sumait, J.S., Sykulski, J.K., Al-Othman, A.K.: Solution of different types of economic load
dispatch problems using a pattern search method. Electr. Power Compon. Syst. 36, 250–265
(2008). https://doi.org/10.1080/15325000701603892
3. Dao, T., Pan, T., Nguyen, T., Chu, S.: Evolved bat algorithm for solving the economic load
dispatch problem. In: Advances in Intelligent Systems and Computing, pp. 109–119 (2015).
https://doi.org/10.1007/978-3-319-12286-1_12
4. Vajda, S., Dantzig, G.B.: Linear programming and extensions. Math. Gaz. (2007). https://doi.
org/10.2307/3612922
5. Nguyen, T.-T., Pan, J.-S., Chu, S.-C., Roddick, J.F., Dao, T.-K.: Optimization localization in
wireless sensor network based on multi-objective firefly algorithm. J. Netw. Intell. 1, 130–138
(2016)
6. Yeniay, Ö.: Penalty function methods for constrained optimization with genetic algorithms.
Math. Comput. Appl. (2005)
7. Soliman, S.A.-H., Mantawy, A.-A.H.: Modern Optimization Techniques with Applications in
Electric Power Systems (2012). https://doi.org/10.1007/978-1-4614-1752-1
8. Nguyen, T.-T., Pan, J.-S., Wu, T.-Y., Dao, T.-K., Nguyen, T.-D.: Node coverage optimization
strategy based on ions motion optimization. J. Netw. Intell. 4, 1–9 (2019)
9. Xue, X., Ren, A.: An evolutionary algorithm based ontology alignment extracting technology.
J. Netw. Intell. 2, 205–212 (2017)
10. Sun, J., Palade, V., Wu, X.J., Fang, W., Wang, Z.: Solving the power economic dispatch problem
with generator constraints by random drift particle swarm optimization. IEEE Trans. Ind.
Informatics. 10, 222–232 (2014). https://doi.org/10.1109/TII.2013.2267392
11. Chiang, C.L.: Improved genetic algorithm for power economic dispatch of units with valve-
point effects and multiple fuels. IEEE Trans. Power Syst. 20, 1690–1699 (2005). https://doi.
org/10.1109/TPWRS.2005.857924
12. Suppapitnarm, A., Seffen, K.A., Parks, G.T., Clarkson, P.J.: Simulated annealing algorithm for
multiobjective optimization. Eng. Optim. (2000). https://doi.org/10.1080/03052150008940911
13. Du, K.L.: Clustering: a neural network approach. Neural Netw. (2010). https://doi.org/10.1016/
j.neunet.2009.08.007
14. Nanda, S.J., Panda, G.: A survey on nature inspired metaheuristic algorithms for partitional
clustering (2014). https://doi.org/10.1016/j.swevo.2013.11.003
15. Javidy, B., Hatamlou, A., Mirjalili, S.: Ions motion algorithm for solving optimization prob-
lems. Appl. Soft Comput. J. 32, 72–79 (2015). http://dx.doi.org/10.1016/j.asoc.2015.03.035
16. Apostolopoulos, T., Vlachos, A.: Application of the firefly algorithm for solving the economic
emissions load dispatch problem (2011). https://doi.org/10.1155/2011/523806
Chapter 13
Improving Correlation Function Method
to Generate Three-Dimensional
Atmospheric Turbulence
Abstract Atmospheric turbulence is a common form of wind field that causes turbu-
lence for aircraft. A high-intensity turbulence field may negatively affect flight safety.
With the development of simulation modeling and software engineering, the influence
of the atmospheric turbulence on an aircraft has been widely studied using simula-
tion experiments. Because the method for generating one-dimensional atmospheric
turbulence is now mature, researchers have been confronted with a growing need to
generate the three-dimensional atmospheric turbulence field that is required in the
new simulation experiments. In the current study, we generate a three-dimensional
atmospheric turbulence field based on an improved correlation function method.
The main innovation is that we use the double random switching algorithm to adapt
the Gaussian white noise sequence so that it is closer to the ideal condition when
creating the one-dimensional atmospheric turbulence field. The two-dimensional and
the final three-dimensional atmospheric turbulence fields can then be generated from
the one-dimensional field by iteration. Experimental results confirm that the
three-dimensional atmospheric turbulence generated by this method provides improved
transverse and longitudinal correlations as well as reduced error when compared with
the theoretical values.
13.1 Introduction
Atmospheric turbulence is the random motion of the atmosphere that usually accom-
panies the transfer and exchange of energy, momentum, and matter. Such turbulence
L. Lin (B) · J. Li
School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin
150001, China
e-mail: linlianlei@hit.edu.cn
K. Yan
Beijing Near Space Airship Technology Development Co., Ltd, Beijing 100070, China
may potentially have significant adverse effects on flight safety. As simulation
modeling and software engineering have developed, these techniques have been widely
used to study the influence of atmospheric turbulence on an aircraft. In such
simulation experiments, virtual atmospheric turbulence fields can be constructed,
which is of great significance to atmospheric turbulence field modeling.
Studies of atmospheric turbulence using mathematical models started in 1942, when
Dryden established a spectrum model of atmospheric turbulence based on massive
observational data. Later, von Karman created an atmospheric turbulence energy
model with higher accuracy and a more complex form [1]. Both are considered the
classic models of the atmospheric turbulence field. The development of modeling
techniques for atmospheric turbulence started with one-dimensional methods [2],
of which the main one is the shaping filter method [3]. Not only has a method for
one-dimensional atmospheric turbulence modeling been developed based on Dryden's
model [4, 5], but modeling methods based on von Karman's model have also been
reported [6, 7]. Currently, the methods for modeling one-dimensional atmospheric
turbulence are considered mature.
As technology developed, multidimensional modeling and simulation technology
for atmospheric turbulence emerged in the 1990s. A method for constructing two-
dimensional atmospheric turbulence out of the one-dimensional shaping filter method
was reported by Xiao [8]. Then a method for generating two-dimensional atmospheric
turbulence based on the spatial correlation function was developed by Lu et al. [9].
In 2001, a Monte Carlo method which generated three-dimensional atmospheric
turbulence values using a correlation function matrix was reported by Hong. This
method, however, was unwieldy because of its large memory footprint and lengthy
computation time [10]. In 2012, an improved method to solve the disadvantages of the
Monte Carlo method was developed by Gao et al., but the low efficiency remained
as an unsolved problem [11]. In 2008, Gao et al. generated a three-dimensional
atmospheric turbulence field using a time–frequency transform, yet its requirement
for pre-stored data makes it unsuitable for real-time simulation [12]. Based on the
study by Lu et al. [9], an algorithm for simulating a three-dimensional atmospheric
turbulence field with good real-time performance and accuracy was developed using
the correlation function method [13].
Gaussian white noise is used in existing models of atmospheric turbulence, and its
quality directly affects the generation of atmospheric turbulence in simulation
experiments. Hunter and Kearney reported that white noise can be improved by using
a double random switching algorithm [14]. The von
Karman three-dimensional atmospheric turbulence field modeling established by
Gao et al. in 2012 also used this algorithm [11]. We develop an improved method for
generating three-dimensional atmospheric turbulence by referring to reported studies
[11] and our previous work [13]. An improved Gaussian white noise sequence is used
to generate the initial one-dimensional atmospheric turbulence, which improves the
correlation of the overall three-dimensional atmospheric turbulence field.
13 Improving Correlation Function Method … 129
where r is the Gaussian white noise; a and σw are undetermined parameters which
can be determined by the correlation function method. According to the random model
and the definition of the correlation function, we have the following equations:

R0 = E[w(x)w(x)] = E{[a·w(x − h) + σw·r(x)]²} = a²·R0 + σw²   (13.2)

a = R1/R0,   σw = √(R0·(1 − a²))   (13.4)
Substitute the resulting Gaussian white noise sequence into the random model to
get the atmospheric turbulence value.
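The one-dimensional recursion w(x) = a·w(x − h) + σw·r(x), with a and σw from Eq. (13.4), can be sketched as follows. The exponential correlation R(ξ) = σ²·e^(−ξ/L) used here is an assumed illustrative example; the grid step, correlation length, and variance are likewise hypothetical.

```python
import math
import random

def generate_1d(n, h=10.0, L=200.0, sigma2=4.0, seed=7):
    """1-D turbulence sequence from the correlation-function recursion.
    R(xi) = sigma2*exp(-xi/L) is an assumed correlation model."""
    R0 = sigma2                       # R(0), the variance
    R1 = sigma2 * math.exp(-h / L)    # R(h), correlation at one grid step
    a = R1 / R0                       # Eq. (13.4)
    sigma_w = math.sqrt(R0 * (1.0 - a * a))
    rng = random.Random(seed)
    w = [rng.gauss(0.0, math.sqrt(R0))]   # initial value at the origin
    for _ in range(n - 1):
        w.append(a * w[-1] + sigma_w * rng.gauss(0.0, 1.0))
    return w

series = generate_1d(2000)
var = sum(v * v for v in series) / len(series)
print(round(var, 2))  # sample variance fluctuates around R0 = 4
```

This choice of a and σw makes the stationary variance of the recursion equal R0, consistent with Eq. (13.2).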
In two- and three-dimensional space, a random model can be set up as
The detailed procedure is described as follows. First, the initial value of the atmo-
spheric turbulence at the origin is set, followed by the calculation of parameter values
and turbulence values for the one-dimensional model. Then the parameters and tur-
bulence value for a two-dimensional model are calculated using the one-dimensional
turbulence values as boundary conditions. Finally, the three-dimensional turbulence
value is deduced based on the two-dimensional turbulence values as boundary con-
ditions.
It should be noted that the Gaussian white noise r is used during the whole calcu-
lation, so the quality of atmospheric turbulence field largely depends on the quality
of Gaussian white noise.
According to the above theory, two main factors affect the accuracy of the numerical
simulation of atmospheric turbulence: one is the calculation of the model parameters.
Errors can be avoided so long as the original model is not simplified in the theoretical
derivation. The other factor is the choice of values for the Gaussian white noise
sequence substituted into the random model. If the standard Gaussian white noise
sequence is generated, the resulting turbulence value should satisfy the characteristics
of the frequency domain and time domain of the atmospheric turbulence; however,
in real numerical simulation experiments, the generated Gaussian white noise is not
ideal.
The ideal “Gaussian white noise” indicates that the frequency distribution function
of the noise fits a normal distribution (also known as a Gaussian distribution). Mean-
while, in the power density part, the ideal “white noise” refers to a noise signal that
has a constant density of spectral power, which means that the power of the signal
uniformly distributes over each frequency range.
The sequence is set to x(n). To make the mean value of the sequence approximately 0
and the standard deviation approximately 1, and to obtain a better normal
distribution characteristic, the following formula is applied:

y(n) = (x(n) − μ)/σ   (13.13)
where μ is the mean value of sequence, σ is the standard deviation of sequence, and
y(n) is the improved sequence.
The probability distribution of the noise sequence is already very close to the ideal
characteristics, and the spectral power density can be improved with a double random
switching algorithm while retaining the probability density [14]. The main idea is
to randomly switch the arrangement of two points in the sequence and repeat such
switching until the spectral power density is more evenly distributed. This is based
on the uniformity of the sequence power spectrum evaluated by the least squares of
the autocorrelation function.
The detailed steps are described as follows:
Step 1: Sequence x i (n) is generated by interchanging the positions of two randomly
selected data points in sequence x i−1 (n);
Step 2: Calculate the autocorrelation function of sequence x_i(n) as follows:

r_i(k) = (1/N)·∑_{n=0}^{N−k−1} x_i(n)·x_i(n + k),   k = 0, 1, …, N − 1   (13.14)
132 L. Lin et al.
Fig. 13.1 Power spectral density (power/frequency, dB/rad/sample) versus normalized
frequency (×π rad/sample) of the numerical Gaussian noise and the improved noise
Step 3: Calculate the sum of the squares of the autocorrelation function as follows:

SS_i = ∑_{k=1}^{N−1} [r_i(k)]²,   i = 0, 1, 2, …   (13.15)
Step 4: Stop the program if SS_i < ε or if i reaches the predetermined maximum number
of switching times Nmax. ε is the preset threshold for the interchanges, below
which the calculation stops.
Step 5: If SS_i < SS_{i−1}, return to Step 1 to continue the calculation; otherwise,
drop the current random exchange and return to Step 1, repeating the above
process until the requirement in Step 4 is satisfied.
Theoretically, the algorithm only changes the order rather than the value of the
stochastic sequence, so the mean value and standard deviation, as well as the proba-
bility distribution of the sequence will not be affected.
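Steps 1–5 can be sketched as follows; the sequence length, threshold, and maximum number of switching attempts are illustrative choices, and the normalization of Eq. (13.13) is assumed to have been applied beforehand.

```python
import random

def autocorr_ss(x):
    """Sum of squared autocorrelations SS (Eqs. 13.14-13.15)."""
    n = len(x)
    ss = 0.0
    for k in range(1, n):
        r = sum(x[i] * x[i + k] for i in range(n - k)) / n   # r(k), Eq. (13.14)
        ss += r * r
    return ss

def double_random_switch(x, eps=1e-3, n_max=500, seed=3):
    """Steps 1-5: swap two random points, keep the swap only if SS decreases."""
    rng = random.Random(seed)
    x = list(x)
    ss = autocorr_ss(x)
    for _ in range(n_max):
        if ss < eps:                         # Step 4: threshold reached
            break
        i, j = rng.randrange(len(x)), rng.randrange(len(x))
        x[i], x[j] = x[j], x[i]              # Step 1: random interchange
        ss_new = autocorr_ss(x)              # Steps 2-3: recompute SS
        if ss_new < ss:                      # Step 5: keep improving swaps only
            ss = ss_new
        else:
            x[i], x[j] = x[j], x[i]          # undo a non-improving swap
    return x, ss

rng = random.Random(0)
seq = [rng.gauss(0.0, 1.0) for _ in range(64)]
improved, ss_after = double_random_switch(seq)
print(ss_after <= autocorr_ss(seq))  # prints True: SS never increases
```

Because only the order of the points changes, the mean, standard deviation, and probability distribution of the sequence are preserved exactly, as the text notes.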
In the experiment, the Gaussian white noise sequence contains 2000 points, and after
the algorithm is applied, the sum of the squares of the autocorrelation function is
reduced by 60%.
The power density spectrums before and after improvement are shown in Fig. 13.1.
According to the figure, the power spectrum of the improved Gaussian white noise
sequence is much more evenly distributed than the numerical Gaussian noise, with
fewer isolated points and a smaller amplitude, making it closer to the ideal spectrum.
Fig. 13.2 Sectional profiles of the generated turbulence field: vertical velocity
w (m/s) over the x/h–y/h plane
Fig. 13.3 Transverse and longitudinal correlations R versus separation ξ (m):
theoretical values compared with the improved method
The initial 60 grids are used to verify the turbulence field in the 10th grid (at a
height of 700 m) and the 20th grid (at a height of 1400 m). The generated sectional
profile is shown in Fig. 13.2.
The correlation is calculated and compared with the theoretical value, and the
results are shown in Fig. 13.3.
From Fig. 13.2, we can see that the random variation of the generated atmospheric
turbulence agrees with real atmospheric turbulence. Figure 13.3 shows that the trends
of both the transverse and longitudinal correlations of the three-dimensional
turbulent flow field produced with the proposed method are consistent with the
theoretical values, within a limited error.
13.5 Conclusions
Acknowledgements This work is supported by the National Science Foundation of China under
Grant No. 61201305.
References
1. Real, T.R.: Digital simulation of atmospheric turbulence for Dryden and von Karman models.
J. Guid. Control Dyn. 16(1), 132–138 (1993)
2. Reeves, P.M.: A non-Gaussian turbulence simulation. Air Force Flight Dynamics Lab Technical
Report AFFDL-TR-69-67, Wright-Patterson Air Force Base, OH, Nov 1969
3. Fichtl, G.H., Perlmutter, M.: Nonstationary atmospheric boundary-layer turbulence simulation.
J. Aircr. 1(12), 639–647 (1975)
4. Zhao, Z.Y., et al.: Dryden digital simulation on atmospheric turbulence. Acta Aeronaut. Astro-
naut. Sin. 10, 7(5), 433–443
5. Ma, D.L., et al.: An improved method for digital simulation of atmospheric turbulence. J.
Beijing Univ. Aeronaut. Astronaut. 3, 57–63 (1990)
6. Djurovic, Z., Miskovic, L., Kovacevic, B.: Simulation of air turbulence signal and its applica-
tion. In: The 10th Mediterranean Electrotechnical Conference, vol. 1(2), pp. 847–850 (2000)
7. Zhang, F., et al.: Simulation of three-dimensional atmospheric turbulence based on Von Karman
model. Comput. Stimul. 24(1), 35–38 (2007)
8. Xiao, Y.L.: Digital generation method for two-dimensional turbulent flow field in flight simu-
lation. Acta Aeronaut. Astronaut. Sin. 11(4), B124–B130 (1990)
9. Lu, Y.P., et al.: Digital generation of two-dimensional field of turbulence based on spatial
correlation function. J. Nanjing Univ. Aeronaut. Astronaut. 31(2), 139–145 (1999)
10. Hong, G.X., et al.: Monte Carlo stimulation for 3D-field of atmospheric turbulence. Acta
Aeronaut. Astronaut. Sin. 22(6), 542–545 (2001)
11. Gao, J., et al.: Theory and method of numerical simulation for 3D atmospheric turbulence field
based on Von Karman model. J. Beijing Univ. Aeronaut. Astronaut. 38(6), 736–740 (2012)
12. Gao, Z.X., et al.: Generation and extension methods of 3D atmospheric turbulence field. J.
Traffic Transp. Eng. 8(4), 25–29 (2008)
13. Wu, Y., Jiang, S., Lin, L., Wang, C.: Simulation method for three-dimensional atmospheric
turbulence in virtual test. J. Comput. Inf. Syst. 7(4), 1021–1028 (2011). Proctor, F.H., Bowles,
R.L.: Three-dimensional simulation of the Denver 11 July 1988 Microburst-producing storm.
Meteorol. Atmos. Phys. 49, 108–127 (1992)
14. Hunter, I.W., Kearney, R.E.: Generation of random sequences with jointly specified probability
density and autocorrelation functions. Biol. Cybern. 47, 141–146 (1983)
15. Cai, K.B., et al.: A novel method for generating Gaussian stochastic sequences. J. Shanghai
Jiaotong Univ. 38(12), 2052–2055 (2004)
Chapter 14
Study on Product Name Disambiguation
Method Based on Fusion Feature
Similarity
Abstract Analyzing and processing the data of product quality safety supervision
and spot checks is key to maintaining the healthy and sustainable development of
products. Because the data sources are extensive, product names in the data are often
ambiguous. In view of this ambiguity, a method based on fusion feature similarity is
proposed, which disambiguates product names using features such as manufacturer-
related information, product-related information, and topic-related information.
Experimental results show that the proposed method is effective for product name
disambiguation.
14.1 Introduction
In recent years, product quality safety incidents have occurred continuously in China,
severely affecting people's lives and property. The incidents are attributed to many
causes. Every year, the relevant state authorities conduct supervision and spot checks
on key products, disclose the results to the public in time, and analyze and process
the supervision and spot check data, which is of great significance for improving
product quality. However, a large amount of supervision and spot check data contains
identical references to different products, so it is necessary to disambiguate
product names.
For example:
(1) 15 batches of notebooks are identified as unacceptable in spot check, because
the sizing degree, brightness, dirt, marks, insufficient gutter, page number, and
deviation are not up to standards.
(2) Ms. Wang from Chengdu complained that a laptop she had just bought could
not boot up, and the laptop was found to have quality problems in inspection.
X. Ning · X. Lu (B) · Y. Xu · Y. Li
Quality Management Branch, China National Institute of Standardization, Beijing 10001, China
e-mail: 475554762@qq.com
© Springer Nature Singapore Pte Ltd. 2020 137
J.-S. Pan et al. (eds.), Advances in Intelligent Information Hiding and Multimedia
Signal Processing, Smart Innovation, Systems and Technologies 157,
https://doi.org/10.1007/978-981-13-9710-3_14
138 X. Ning et al.
In the above examples, "notebooks" in example (1) refer to paper notebooks, while
"laptops" in example (2) refer to notebook computers. To better analyze and process
the supervision and spot check data, it is necessary to fundamentally solve the
product name ambiguity problem.
The essence of disambiguation is to calculate the similarity between the reference
and the product, and to select the most similar products as correlative products [1].
In recent years, many scholars at home and abroad have studied disambiguation
methods. Bunescu and Pasca [2] proposed a disambiguation method based on cosine
similarity ranking. Bagga [3] and Mann [4] et al. expressed the context of the
reference and the context of the object, respectively, as BOW (bag of words) vectors,
and realized personal name disambiguation using the vector space model. Huai et al.
[5] proposed an object naming correlation method based on the probabilistic topic
model; Ning et al. [6] proposed a hierarchical clustering method based on a
heterogeneous knowledge base for Chinese object name disambiguation; and Zhu et al.
[7] proposed a method combining the disambiguation of reference clustering with the
disambiguation of the same reference in the Baidu Baike word list.
By analyzing related reports of product quality safety supervision and spot check, the
following attributes are defined and attribute values are listed, as shown in Table 14.1.
In the profile structure, the attribute value is obtained from related reports of
product quality safety supervision and spot check, or taken as null if it is unavailable
from the reports. According to Table 14.1, information expressed by some attributes
is correlative to some degree, such as product name, trademark, and model, all of
which represent information related to products. Therefore, attributes are classified
into the following three features, manufacturer-related information, product-related
information, and topic-related information, based on the correlation of information
expressed by attributes. The manufacturer-related information includes the manu-
facturer name and manufacture place, the product-related information includes the
trademark, model, and manufacture date, and the topic-related information includes
the standard and inspection items.
The most important aspect of product name disambiguation is to choose main features
that can distinguish different products to the greatest extent. The selected features
are analyzed, different feature weights are assigned according to their importance
for product name distinction, the feature weights are combined, the product name
similarity is calculated, and the ambiguity is eliminated. For example, for any two
texts T1 and T2 to be disambiguated, the computational complexity can be reduced by
improving the similarity calculation method for the three categories of features.

[Figure: flowchart of the disambiguation process — the text to be disambiguated is
preprocessed; manufacturer-related, product-related, and topic-related features are
extracted; feature similarities are calculated and combined; and the disambiguation
result is produced]
According to the correlation between the manufacturer and the product name in the
reports, product names corresponding to the same manufacturer refer to the same
product in most cases, so the manufacturer provides a high degree of product name
distinction.
For the manufacturer-related information, the calculation method log d/df is used,
and the similarity is as follows:

simP(T1, T2) = ∑_{k=1}^{2} log(d/dfk)   (14.3)

where d is the total number of reports and dfk is the number of reports in which both
the product name to be disambiguated and the manufacturer name or manufacture place
are referred to.
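Eq. (14.3) amounts to an IDF-style score summed over the two manufacturer attributes (name and place); a minimal sketch follows, with hypothetical report counts.

```python
import math

def manufacturer_similarity(d, dfs):
    """sim_P = sum over attributes k of log(d / df_k), Eq. (14.3).
    Attributes with df_k = 0 (no co-occurrence) are skipped."""
    return sum(math.log(d / df) for df in dfs if df > 0)

d = 200        # total number of reports (hypothetical)
dfs = [5, 8]   # co-occurrence counts for manufacturer name and place (hypothetical)
print(round(manufacturer_similarity(d, dfs), 3))  # → 6.908
```

Rarer co-occurrences yield larger log(d/df) terms, so references tied to a specific manufacturer score as more strongly matched.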
simC(T1, T2) = ∑_{i=1}^{3} simC(con1i ∩ con2i)   (14.4)
14 Study on Product Name Disambiguation Method … 141
P(ωi, ωj) represents the co-occurrence frequency, x represents a topic, X represents
a topic cluster, ωi and ωj represent different words, and Y represents a word cluster.
Assuming that each report expresses one topic and N reports are included in total,
the prior probability of a topic is P(x) = 1/N. Given that a word appears in a report
with posterior probability p(ωi|x) = 1, if the words ωi and ωj occur simultaneously
in m sentences, their joint probability is P(ωi, ωj) = m/N. Therefore, the word
co-occurrence probability can be calculated by the following formula:
T(ωi, ωj) = |text(ωi, ωj)| / |text|    (14.6)
where text(ωi, ωj) represents the cluster of reports whose text vectors include both
ωi and ωj, text represents the individual texts, and |·| represents the number of
elements.
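A minimal sketch of Eq. (14.6); the toy corpus and the function name `cooccurrence_probability` are invented for illustration:

```python
def cooccurrence_probability(reports, wi, wj):
    """Eq. (14.6): T(wi, wj) = |text(wi, wj)| / |text|, where
    text(wi, wj) is the cluster of reports containing both words
    and |text| is the total number of reports."""
    both = sum(1 for words in reports if wi in words and wj in words)
    return both / len(reports)

# Invented toy corpus: each report reduced to its set of words.
reports = [
    {"steel", "pipe", "pressure"},
    {"steel", "pipe"},
    {"cable", "insulation"},
    {"steel", "cable"},
]
t = cooccurrence_probability(reports, "steel", "pipe")  # 2/4 = 0.5
```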
A text set matrix Q_{m×n}, similar to a vector space model, is constructed. Assuming
that the text set Q includes n reports and m concurrent word classes, the text set can
be expressed as an m × n matrix in which each column vector represents a report and
each row vector represents the distribution of a concurrent word over the texts; the
matrix entry is 1 if the concurrent word occurs and 0 if it does not; namely:
          ⎡ q11 q12 · · · q1n ⎤
Q_{m×n} = ⎢ q21 q22 · · · q2n ⎥    (14.7)
          ⎢  ⋮   ⋮        ⋮  ⎥
          ⎣ qm1 qm2 · · · qmn ⎦
142 X. Ning et al.
Judge whether the two product names refer to the same product according to the
similarity of the two product names.
Con = f(product) = { 1, product(T1, T2) ≥ threshold
                   { 0, product(T1, T2) < threshold    (14.10)
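Formula (14.9) is not reproduced in this excerpt, but the surrounding text (three feature similarities with weights summing to one) suggests a weighted sum; under that assumption, the decision rule of Eq. (14.10) can be sketched as:

```python
def same_product(sim_p, sim_c, sim_t,
                 alpha=0.17, beta=0.54, gamma=0.29, threshold=0.52):
    """Combine the manufacturer-, product-, and topic-related
    similarities with weights alpha + beta + gamma = 1 (assumed
    form of formula (14.9)) and apply Eq. (14.10): return 1 if the
    combined similarity reaches the threshold, else 0."""
    product = alpha * sim_p + beta * sim_c + gamma * sim_t
    return 1 if product >= threshold else 0

# Default weights and threshold are the optimal values reported in
# the experiments (alpha=0.17, beta=0.54, gamma=0.29, threshold=0.52).
```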
2,000 quality safety supervision and spot check reports are collected from the websites
of local market supervision and administration authorities as the data set for this
experiment; the incomplete texts are then removed, and finally 200 reports are
selected at random as the experimental data. First, the 8 product names to be
disambiguated in the 200 reports are marked manually, and the number of reports
selected is shown in Fig. 14.2, including the numbers of product name references
{12, 9, 15, 11, 6, 14, 14, 10}, which indicates the randomness of the data. Then, the
Stanford NLP word segmentation tool is used for word segmentation of the standard
and inspection items in the reports.
With the accuracy, recall rate, and F-measure of the eight product names to be
disambiguated as the experimental results, the feature weights, similarity threshold,
and different similarity feature combinations are analyzed.
According to formulas (14.9) and (14.10), the feature weights satisfy α + β + γ = 1
with α, β, γ ∈ (0, 1), together with a similarity threshold. Different values of α, β, γ,
and the threshold are tested many times to evaluate F, so as to obtain the optimal
feature weight combination: when α = 0.17, β = 0.54, γ = 0.29, and threshold =
0.52, the largest F for an individual product name is F = 91.36, and the largest
average F over the eight sets of data is F = 89.15.
Figures 14.3, 14.4, and 14.5 show the effects of α, β, and γ on the accuracy,
recall rate, and F. As α, β, and γ increase, the accuracy increases while the recall
rate decreases; the largest F occurs when α = 0.17, β = 0.54, γ = 0.29. The
contribution of the manufacturer-related, product-related, and topic-related
information to the reference similarity rises as the weights α, β, and γ increase,
so references that merely share the same manufacturer-related, product-related, or
topic-related information are mistakenly regarded as the same product, resulting
in a continual decrease of the recall rate. However, the accuracy is highest when α
approaches 1, because the manufacturer-related information (manufacturer name and
place of manufacture) best distinguishes different products.
Figure 14.6 shows the effects of the similarity threshold on accuracy, recall rate, and
F. When the threshold is too small, many different references are grouped into the
same category and cannot be accurately identified, resulting in a very high recall
rate but low accuracy. The largest F occurs when threshold = 0.52. When the
threshold is too high, only references with high similarity are identified as the
same reference, resulting in high accuracy but a decreasing recall rate.
14.4 Conclusion
To address the large number of product name ambiguity problems in product quality
safety supervision and spot check reports, an analysis is made in terms of the
manufacturer-related information, product-related information, and topic-related
information; topic features of product names are then selected with a method based
on word co-occurrence clustering; finally, product names are disambiguated with
different feature weight parameters and similarity thresholds. The simulation
experiment shows that the method used in this paper is effective for product name
disambiguation.
Acknowledgements This research is supported and funded by the National Science Foundation
of China under Grant No. 91646122 and the National Key Research and Development Plan under
Grant No.2016YFF0202604 and No.2017YFF0209604.
Chapter 15
Delegated Preparation of Quantum
Error Correction Code for Blind
Quantum Computation
Abstract The universal blind quantum computation protocol allows a client to del-
egate quantum computation to a remote server, and keep information private. Since
the qubit errors are inevitable in any physical implementation, quantum error cor-
rection codes are needed for fault-tolerant blind quantum computation. In this paper,
a quantum error correction code preparation protocol is proposed based on remote
blind qubit state preparation (RBSP). The code is encoded on the brickwork state for
fault-tolerant blind quantum computation. The protocol only requires the client to
emit weak coherent pulses, which frees the client from dependence on quantum memory
and quantum computing.
15.1 Introduction
Quantum computation has come into the focus of quantum information science
because quantum algorithms can quickly solve certain hard problems such as factoring
large numbers [15]. Existing traditional protocols are threatened as a result of
the huge progress in quantum computing. In order to resist quantum attacks, many
signature and transfer protocols have been presented based on the assumed hardness
of lattice problems [8, 17]. Although modern quantum computation is making
strides toward scalable quantum computers, small, privately owned quantum
computers remain very distant. If large quantum computers are offered as rental
systems, users can be granted access to them to perform quantum computation.
Broadbent, Fitzsimons, and Kashefi proposed universal blind quantum computation
[4], which allows the client (named Alice) to execute a quantum computation on
a quantum server (named Bob) without revealing any information about the com-
Q. Zhao · Q. Li (B)
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
e-mail: qiongli@hit.edu.cn
putation except the upper bound of the size. This protocol has been experimentally
realized in an optical system [2, 3].
In blind quantum computation, a quantum computation can be conceptually
divided into a classical part and a quantum part in the framework of measurement-based
quantum computation [11, 12]. Alice, as the classical controller unit, prepares qubits
and decides the measurement angles, while Bob, as the quantum unit, performs the
measurements. The inputs are prepared into the desired single-photon states by Alice.
However, quantum states are easily affected by the environment and imperfect devices
[1, 5, 10, 13], which will inevitably produce errors. The errors may occur during qubit
preparation, quantum transmission, and quantum measurement. Hence, a practical
blind quantum computation system requires Alice to have the ability to prepare
encoded logical qubits for quantum error correction.
Quantum error correction was independently presented by Shor and Steane
[14, 16]. The Shor code, a combination of the 3-qubit phase flip and bit flip codes,
is a nine-qubit code. Steane's code uses seven qubits to encode one qubit and
can protect against the effects of an arbitrary error on a single qubit. Moreover,
the Steane method has an advantage over the Shor procedure in syndrome
measurement, in which only 14 ancilla bits and 14 CNOT gates are needed. Hence,
Steane's code is used for quantum error correction in our paper.
For fault-tolerant blind quantum computation, the encoded logical qubits, which
are prepared based on the encoding circuit, are required to replace the original qubits
in the brickwork state. In [4], Broadbent, Fitzsimons, and Kashefi proposed a
fault-tolerant blind quantum computation protocol, which can convert the encoding
circuit to a measurement-based quantum computation on the brickwork state. However,
this encoded preparation requires Alice to have the ability to prepare single-photon
states, and it consumes a large number of qubits to prepare one encoded logical
qubit. Chien presented two fault-tolerant blind quantum computation protocols [5].
In the first protocol, Alice prepares the encoded logical qubits based on a quantum
circuit and then sends them to Bob. In the second protocol, Bob prepares initial
encoded logical qubits, Alice randomly performs phase gates on these logical qubits,
and then she sends them back to Bob via quantum teleportation. Both protocols require
Alice to have
the ability of quantum memory and quantum computing. In the ideal blind quan-
tum computation, Alice has to prepare perfect qubits for the blindness. However,
the preparation will inevitably be imperfect in any physical implementation. Hence,
a remote blind qubit state preparation (RBSP) protocol is presented by Dunjko et
al. [6] to prepare the approximate blind qubits. To improve the preparation effi-
ciency, a modified RBSP protocol with two decoy states is proposed by Zhao and Li
[18, 19]. Nevertheless, these prepared single qubits cannot be used for fault-tolerant
blind quantum computation.
In the paper, a quantum error correction code preparation protocol is proposed
based on RBSP, which is able to prepare the encoded logical qubits for fault-tolerant
blind quantum computation. In the protocol, Alice emits weak coherent pulses, and
delegates Bob to prepare quantum error correction code on the brickwork state, i.e.,
a universal family of graph state. According to Alice’s instructions, Bob performs
the measurement-based quantum computation on the brickwork state to prepare the
15 Delegated Preparation of Quantum Error Correction … 149
encoded logical qubits. The protocol only requires Alice to have the ability of emitting
weak coherent pulses.
The rest of this paper is organized as follows: in Sect. 15.2, technical preliminaries
are introduced. In Sect. 15.3, the delegated preparation protocol is presented, which
can prepare the encoded logical qubits on the brickwork state for fault-tolerant blind
quantum computation. In Sect. 15.4, conclusions are drawn.
H = (1/√2) ⎡ 1  1 ⎤ ,  S = ⎡ 1  0 ⎤ ,  T = ⎡ 1     0     ⎤ ,
           ⎣ 1 −1 ⎦        ⎣ 0 −i ⎦        ⎣ 0  e^{iπ/4} ⎦

       ⎡ 1 0 0 0 ⎤
CNOT = ⎢ 0 1 0 0 ⎥    (15.1)
       ⎢ 0 0 0 1 ⎥
       ⎣ 0 0 1 0 ⎦
Fig. 15.1 Diagram of quantum gates a Pauli-X gate. b Pauli-Z gate. c Hadamard gate. d Phase
gate S. e π/8 gate. f Controlled-NOT (CNOT) gate. g Controlled-Z (CZ) gate. h Controlled-Phase
(CPhase) gate. i SWAP gate
A popular quantum code is the [[n, k, d]] stabilizer code, which encodes k qubits
into n qubits [7, 9, 10]. The parameter d is the distance of the code. The stabilizer
code can also be described by its generator matrix G, which has 2n columns and
n − k rows and is denoted as G = (X_G | Z_G). In this paper, we use a common
stabilizer code, the 7-qubit Steane code [[7, 1, 3]]. The code encodes one qubit in
seven qubits and corrects any 1-qubit error. The encoded logical qubit basis is
denoted as {|0_L⟩, |1_L⟩}. The generator matrix of the [[7, 1, 3]] code is as
follows [9]:
                ⎛ 0 0 0 1 1 1 1   0 0 0 0 0 0 0 ⎞
                ⎜ 0 1 1 0 0 1 1   0 0 0 0 0 0 0 ⎟
G_{[[7,1,3]]} = ⎜ 1 0 1 0 1 0 1   0 0 0 0 0 0 0 ⎟ .    (15.2)
                ⎜ 0 0 0 0 0 0 0   0 0 0 1 1 1 1 ⎟
                ⎜ 0 0 0 0 0 0 0   0 1 1 0 0 1 1 ⎟
                ⎝ 0 0 0 0 0 0 0   1 0 1 0 1 0 1 ⎠
The quantum circuit that encodes qubits can be designed according to the generator
matrix. The circuit shown in Fig. 15.2 is used to prepare an unknown logical qubit
for quantum error correction [10]. The CNOT gates of the circuit are based on an
alternative expression for X_G that permutes the columns. An unknown qubit
α|0⟩ + β|1⟩ and six ancilla qubits |0⟩^⊗6 can be encoded into α|0_L⟩ + β|1_L⟩,
as shown in Fig. 15.2.
As is well known, the encoded logical qubit |0_L⟩ is the equally weighted superposition
of all the even-weight codewords of the Hamming code, and the logical qubit |1_L⟩
is the equally weighted superposition of all the odd-weight codewords of the
Hamming code.
|0_L⟩ = (1/(2√2)) (|0000000⟩ + |0001111⟩ + |0110011⟩ + |0111100⟩
                 + |1010101⟩ + |1011010⟩ + |1100110⟩ + |1101001⟩)
                                                               (15.3)
|1_L⟩ = (1/(2√2)) (|1111111⟩ + |1110000⟩ + |1001100⟩ + |1000011⟩
                 + |0101010⟩ + |0100101⟩ + |0011001⟩ + |0010110⟩)
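The two logical states of Eq. (15.3) can be checked numerically; this sketch builds them as 128-dimensional vectors in plain Python (no quantum library assumed) and verifies normalization, orthogonality, and the even-weight property of the |0_L⟩ codewords:

```python
ZERO_L = ["0000000", "0001111", "0110011", "0111100",
          "1010101", "1011010", "1100110", "1101001"]
# The |1_L> codewords are the bitwise complements of the |0_L> ones.
ONE_L = ["".join("1" if b == "0" else "0" for b in w) for w in ZERO_L]

def superposition(codewords):
    """Equally weighted superposition of 7-qubit basis states with
    amplitude 1/(2*sqrt(2)), as in Eq. (15.3)."""
    amp = 1.0 / (2.0 * 2.0 ** 0.5)
    vec = [0.0] * 128
    for w in codewords:
        vec[int(w, 2)] = amp
    return vec

zero_l = superposition(ZERO_L)
one_l = superposition(ONE_L)
norm = sum(a * a for a in zero_l)                    # normalization check
overlap = sum(a * b for a, b in zip(zero_l, one_l))  # orthogonality check
even = all(w.count("1") % 2 == 0 for w in ZERO_L)    # even-weight codewords
```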
To prepare the unknown encoded logical qubits, a good scheme was presented
by Preskill for the Steane’s [[7, 1, 3]] code [10]. A qubit in an unknown state can
be encoded using the circuit, as shown in Fig. 15.2. The alternative expression of
the generator matrix G [[7,1,3]] is used to construct the encoding circuit. The encoded
logical qubits are determined by the generators of G. Since the rank of the matrix X G
is 3, the 3 bits of the Hamming string completely characterize the data represented
in Eq. (15.3). The remaining four bits are the parity bits that provide the needed
redundancy to protect against errors. Hence, we can use two CNOT gates to prepare
the state (1/√2)(|0000000⟩ + e^{iθ}|0000111⟩) for the unknown input state |+_θ⟩. To
extend this state to the logical qubit, the rest of the CNOT gates of the circuit switch
on the parity bits determined by G_{[[7,1,3]]}.
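The two-CNOT step can be simulated on a sparse state vector; the qubit indexing below (data qubit last, CNOTs targeting the two neighboring ancillas) is an assumption of this sketch, not fixed by the paper:

```python
import cmath

def cnot(state, control, target):
    """Apply a CNOT to a sparse n-qubit state {bitstring: amplitude}:
    flip the target bit wherever the control bit is 1."""
    out = {}
    for bits, amp in state.items():
        if bits[control] == "1":
            flipped = "0" if bits[target] == "1" else "1"
            bits = bits[:target] + flipped + bits[target + 1:]
        out[bits] = out.get(bits, 0) + amp
    return out

theta = cmath.pi / 4
# |+_theta> on the data qubit (index 6), six ancillas in |0>:
# (|0000000> + e^{i theta} |0000001>) / sqrt(2)
state = {"0000000": 1 / 2 ** 0.5,
         "0000001": cmath.exp(1j * theta) / 2 ** 0.5}
# Two CNOTs controlled by the data qubit copy it onto qubits 5 and 4,
# giving (|0000000> + e^{i theta} |0000111>) / sqrt(2).
state = cnot(state, 6, 5)
state = cnot(state, 6, 4)
```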
According to the encoding circuit in Fig. 15.2, Bob can prepare the unknown encoded
logical qubit |+_θ⟩_L from an unknown qubit |+_θ⟩. If Alice wants to delegate the
preparation of the encoded logical qubits to Bob, Bob needs to convert the encoding
circuit from Alice into a measurement-based quantum computation. In our paper, we present
152 Q. Zhao and Q. Li
Fig. 15.3 a The encoding circuit for Steane’s [[7, 1, 3]] code. b The encoding circuit where CNOT
gates only operate on adjacent qubits. Red solid boxes represent SWAP gates, which are replaced
with three consecutive CNOT gates. c The encoding circuit that quantum gates are arranged to fit
the bricks in the brickwork state. d The brickwork state to implement the encoding circuit
a universal family of graph states, i.e., the brickwork state, to prepare the encoded
logical qubits.
If Bob uses the brickwork state to perform the encoding computation, he needs
to preprocess the input qubits in Fig. 15.2. In order for Bob to entangle the ancilla
qubits and the desired qubits |+_θ⟩ using CZ gates, the input ancilla qubits have to be
in the |+⟩ state. In addition, since the bricks are even–odd interleaved in the brickwork
state, CNOT gates can only act on specific pairs of adjacent lines of qubits in each
layer. Thus, SWAP gates are required to implement quantum gates that operate
on two nonadjacent qubits. In the following, the encoding circuit in Fig. 15.2 is
converted to a measurement-based quantum computation on the brickwork state.
The specific steps are described as follows.
Step 1—the encoding circuit preprocesses the input ancilla qubits into |+⟩ using
Hadamard gates, as shown in Fig. 15.3a.
Step 2—SWAP gates are added to make sure that CNOT gates operate on adjacent
qubits, as shown in Fig. 15.3b. Since the construction of SWAP gates on the brickwork
state is very complex, each SWAP gate can be replaced with three consecutive
CNOT gates.
Step 3—the encoding circuit is divided into many layers so that all quantum gates
are arranged to fit a brick in the brickwork states as shown in Fig. 15.3c.
Step 4—these 1-qubit gates and CNOT gates can be implemented on the brickwork
state, as shown in Fig. 15.3d. In blind quantum computation, Fig. 15.3d shows that
the brickwork state needs to be divided into the bricks corresponding to the quantum
gates of the encoding circuit. The measurement basis from Alice are assigned to each
qubit of the brickwork state.
Based on the above analysis, the delegated preparation of the quantum error correction
code on the brickwork state is designed as follows. In our protocol, 97 layers of
bricks are required to prepare an encoded logical qubit. The seven input qubits
of the encoding circuit are converted to seven rows of qubits in the brickwork state.
Thus, this brickwork state consists of 2723 qubits, and Bob needs 3298 CZ gates to
create it. The measurement bases from Alice are assigned to each qubit in the
brickwork state, except the last column of qubits, which are the output qubits. Thus,
2716 measurements are required for the preparation computation on the brickwork
state.
In our protocol, the interaction measurement stage differs from that of the basic
universal blind quantum computation. Since the ancilla qubits of the encoding circuit
carry no encoded information, their measurement bases do not need to be encrypted.
We only need to ensure that the required qubit |+_θ⟩ prepared via RBSP is ε-blind to
Bob in the encoding computation. In the basic blind quantum computation, the
measurement basis of an encoded qubit is encrypted as δ = φ + θ + πr, r ∈ {0, 1}.
Thus, the polarization angle θ is independent of δ in our protocol. Hence, if the qubit
prepared via RBSP is ε-blind to Bob, the encoded logical qubit is also ε-blind.
Protocol: Delegated preparation of quantum error correction code on the brickwork state
(1) Alice’s preparation
(1.1) Alice sends N weak coherent pulses whose polarization angles σ are chosen at random
from {kπ/4 : 0 ≤ k ≤ 7}
(1.2) Alice sends a sequence of ancilla pulses with the polarization state |+⟩ to Bob. The
ancilla qubits can be public
(2) Bob's preparation
(2.1) According to the remote blind qubit state preparation protocol [6], Bob prepares the
required qubits |+_θi⟩, i = 1, 2, ..., S
(2.2) Bob entangles the required qubits |+_θi⟩ and a group of ancilla qubits to create the
brickwork state using CZ gates
(3) The interaction measurement
For each column x = 1, ..., m in the brickwork state
For each row y = 1, ..., n in the brickwork state
(3.1) Alice computes δ_{x,y} = φ_{x,y} + θ_{x,y} + πr_{x,y}, r_{x,y} ∈ {0, 1}, based on the
real measurement angle φ and the previous measurement results. If the qubit used is an
ancilla state, θ_{x,y} = 0
(3.2) Alice transmits δ_{x,y} to Bob via the classical channel. Bob measures in the basis
{|+_{δx,y}⟩, |−_{δx,y}⟩}
(3.3) Bob transmits the one-bit measurement result to Alice via the classical channel
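Step (3.1) can be sketched as follows; the function and variable names are ours, and Alice would draw r at random, while here it is passed in explicitly so the result is checkable:

```python
import math

def encrypted_angle(phi, theta, r, is_ancilla=False):
    """Alice's encrypted measurement angle for step (3.1):
    delta = phi + theta + pi * r, with r in {0, 1}; for an
    ancilla qubit theta = 0, so its basis is effectively public."""
    if is_ancilla:
        theta = 0.0
    return (phi + theta + math.pi * r) % (2 * math.pi)

delta = encrypted_angle(math.pi / 4, math.pi / 2, 1)            # 7*pi/4
public = encrypted_angle(math.pi / 4, math.pi / 2, 0, is_ancilla=True)
```

Because θ is folded into δ before transmission, Bob learns only δ, which is consistent with the ε-blindness argument above.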
15.4 Conclusions
In this paper, a delegated preparation protocol for quantum error correction code on
the brickwork state is proposed based on RBSP; Alice only needs to emit weak coherent
pulses, and no quantum memory or quantum computing is needed on her side. In addition,
the resource consumption of our protocol for preparing an encoded logical qubit is
analyzed.
Acknowledgements This work is supported by the Space Science and Technology Advance
Research Joint Funds (Grant Number: 6141B06110105) and the National Natural Science Founda-
tion of China (Grant Number: 61771168).
References
1. Aharonov, D., Ben-Or, M.: Fault-tolerant quantum computation with constant error rate. SIAM
J. Comput. (2008)
2. Barz, S., Fitzsimons, J.F., Kashefi, E., Walther, P.: Experimental verification of quantum com-
putations. arXiv preprint arXiv:1309.0005 (2013)
3. Barz, S., Kashefi, E., Broadbent, A., Fitzsimons, J.F., Zeilinger, A., Walther, P.: Demonstration
of blind quantum computing. Science 335(6066), 303–308 (2012)
4. Broadbent, A., Fitzsimons, J., Kashefi, E.: Universal blind quantum computation. In: 50th
Annual IEEE Symposium on Foundations of Computer Science, 2009. FOCS’09, pp. 517–
526. IEEE
5. Chien, C.H., Van Meter, R., Kuo, S.Y.: Fault-tolerant operations for universal blind quantum
computation. ACM J. Emerg. Technol. Comput. Sys. 12, 9 (2015)
6. Dunjko, V., Kashefi, E., Leverrier, A.: Blind quantum computing with weak coherent pulses.
Phys. Rev. Lett. 108(20) (2012)
7. Gottesman, D.: Stabilizer codes and quantum error correction. arXiv preprint quant-ph/9705052
(1997)
8. Liu, M.M., Hu, Y.P.: Equational security of a lattice-based oblivious transfer protocol. J. Netw.
Intell. 2(3), 231–249 (2017)
9. Nielsen, M.A., Chuang, I.: Quantum computation and quantum information (2002)
10. Preskill, J.: Fault-tolerant quantum computation. In: Introduction to Quantum Computation
and Information, pp. 213–269. World Scientific (1998)
11. Raussendorf, R., Briegel, H.J.: A one-way quantum computer. Phys. Rev. Lett. 86(22) (2001)
12. Raussendorf, R., Browne, D.E., Briegel, H.J.: Measurement-based quantum computation on
cluster states. Phys. Rev. A 68(2) (2003)
13. Shor, P.W.: Fault-tolerant quantum computation. In: Proceedings of 37th Annual Symposium
on Foundations of Computer Science, 1996, pp. 56–65. IEEE
14. Shor, P.W.: Scheme for reducing decoherence in quantum computer memory. Phys. Rev. A
52(4), 2493 (1995)
15. Shor, P.W.: Polynomial-time algorithms for prime factorization and discrete logarithms on a
quantum computer. SIAM Rev. 41(2), 303–332 (1999)
16. Steane, A.M.: Error correcting codes in quantum theory. Phys. Rev. Lett. 77(5), 793 (1996)
17. Sun, Y., Zheng, W.: An identity-based ring signcryption scheme in ideal lattice. J. Netw. Intell.
3(3), 152–161 (2018)
18. Zhao, Q., Li, Q.: Blind Quantum Computation with Two Decoy States. Springer International
Publishing (2017)
19. Zhao, Q., Li, Q.: Finite-data-size study on practical universal blind quantum computation.
Quantum Inf. Process. 17(7), 171 (2018)
Chapter 16
Design of SpaceWire Interface
Conversion to PCI Bus
16.1 Introduction
The SpaceWire standard became an ECSS standard and has been published since
2003. Since then, it has been adopted for use on many spacecraft, with over 100
spacecraft in orbit or under design using SpaceWire [4].
Throughout the specification, design, development, and testing of a SpaceWire
system, it is important that the system is tested and verified against the various levels
of the standard [5]. If a spacecraft uses SpaceWire as its data-handling network, the
design of SpaceWire electronic checkout and ground support equipment is necessary.
On the other hand, CompactPCI/PXI modular test systems have the advantages of
small size, low cost, ease of construction, high integration, flexible software, etc.,
and are widely used in aerospace and other industrial test fields. Thus, there is a need
for a SpaceWire-cPCI/PXI communication card.
In a CompactPCI-based automatic test system developed by us, a four-channel
SpaceWire-cPCI communication card is required. It is mainly used as a SpaceWire
receiving node and is responsible for transmitting SpaceWire data to the cPCI
controller. In this system, the amount of test data is very large, so the system needs to
make good use of the PCI bandwidth. However, the standard SpaceWire-cPCI
card can work at a maximum data transfer rate of 160 Mbit/s on a single SpaceWire
channel [6, 7], which cannot meet the requirement in the limit case (200 Mbit/s).
We redesigned a SpaceWire-cPCI communication card with an FPGA on a single
hardware board and optimized the firmware for data receiving and data transfer. After
testing, the maximum data transfer rate was greatly improved when the card was used
as a receiving node. This design, which maximizes the utilization of bandwidth and
storage resources, is well suited to SpaceWire instruments that use PCI as the
host interface. The remaining sections of this paper introduce the design and
optimization of the FPGA firmware in detail.
The host interface uses a 9-bit coding, which comprises eight data bits and one
control flag. Table 16.1 shows this coding form [1].
The format converter's function is to convert the SpaceWire packet format into
another format suitable for 32-bit PCI transfer. The DMA controller connects the
format converter with the PCI interface, which provides high data throughput. In
addition, we designed Avalon slave/master interfaces for all blocks, so that the whole
firmware is interconnected by the Avalon Memory Map, an on-chip bus defined by
Altera.
The increase in transfer rate mainly depends on the format converter and the
DMA controller. The next section introduces the structure and operation of
these two blocks.
As shown in Table 16.1, code with the control flag set to zero is a normal SpaceWire
data, and any code with the control flag set to one and the least significant bit of
Table 16.1 Host data interface coding

Control flag   Data bits (MSB…LSB)        Meaning
0              xxxxxxxx                   8-bit data
1              xxxxxxx0 (use 00000000)    EOP
1              xxxxxxx1 (use 00000001)    EEP
158 Z. Wang et al.
the data set to zero represents an EOP (End of Packet), while one with that bit set to
one represents an EEP (Error End of Packet). Thus, a valid SpaceWire packet for a
receiving node comprises multiple data bytes and an end_of_packet marker (EOP or
EEP). Figure 16.2 shows this format.
This format makes it easy for a computer to identify whether the code currently
acquired is a valid data byte or an end_of_packet marker, so that different packets
can be distinguished.
However, the common data types of a computer are "char", "short", "int", and
"long"; none of them is a 9-bit type. If we processed each SpaceWire code directly
as a short (16 bits), nearly half of the board's storage and bandwidth resources would
be wasted. So we convert the SpaceWire packet format.
Figure 16.3 shows the structure diagram of the format converter. It comprises three
major blocks: the conversion logic, a DATA_FIFO built on external DDR2, and an
MSG_FIFO built on the storage resources of the FPGA.
When a SpaceWire code is sent to the format converter, the conversion logic
identifies whether the code is a data byte or an end_of_packet marker. In addition,
there is a packet length counter which automatically increments every time a code is
received. Once an end_of_packet marker is identified, the value of the length counter
is stored into MSG_FIFO and reset to zero. Therefore, the length of each
SpaceWire packet is stored in MSG_FIFO in chronological order.
After identification, the highest bit of the original code is discarded, leaving only
the remaining 8 bits. Meanwhile, the conversion logic combines every four processed 8-bit
data into a 32-bit word. When the end_of_packet marker is detected but the number
of remaining data bytes is less than 4, the conversion logic appends zeros after the
end_of_packet marker. Because the width of DATA_FIFO is also 32 bits, all data
operations can be processed with the int type. In this way, bandwidth and storage
resources are utilized maximally. Figure 16.4 shows the converted format stored at
continuous addresses as int-type words, when there are three continuous SpaceWire
packets whose lengths are 10, 12, and 9, respectively.
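The conversion described above can be sketched in software; the MSB-first byte ordering within the 32-bit word (suggested by Fig. 16.4) and the length counter including the end marker are assumptions of this sketch:

```python
def convert(packets):
    """Software sketch of the format converter: each packet (a list
    of data bytes) is terminated by a 9-bit EOP code whose low 8
    bits are 0x00; the control bit is stripped, the code count goes
    to MSG_FIFO, and the 8-bit codes are packed four per 32-bit
    word (zero-padded after the end_of_packet marker)."""
    msg_fifo, data_fifo = [], []
    for packet in packets:
        codes = list(packet) + [0x00]          # data bytes + EOP marker
        msg_fifo.append(len(codes))            # packet length counter
        for i in range(0, len(codes), 4):
            chunk = codes[i:i + 4] + [0] * (4 - len(codes[i:i + 4]))
            word = (chunk[0] << 24) | (chunk[1] << 16) \
                   | (chunk[2] << 8) | chunk[3]
            data_fifo.append(word)
    return msg_fifo, data_fifo

# Three packets of 9, 11, and 8 data bytes give the MSG_FIFO
# entries 10, 12, and 9, as in the Fig. 16.4 example.
msg, data = convert([[0x11] * 9, [0x22] * 11, [0x33] * 8])
```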
(Figure: flow of the interrupt service — on a packet interrupt, if MSG_FIFO is
non-empty and transfer_flag = 0, the DMA controller is configured and enabled and
transfer_flag is set to 1; on the DMA DONE interrupt, transfer_flag is reset to 0;
otherwise the interrupt returns.)
Since what the DMA controller reads is the DATA_FIFO, there is no need to configure
the DMA read address.
Next, the dmaEnable function is called to start the 32-bit bursting transfer of
the DMA controller, and the transfer_flag variable is set to 1 before exiting
the first interrupt service so that the computer cannot operate the DMA controller
repeatedly while it is working.
Then the computer goes into the idle state and waits for the DMA DONE interrupt.
After it arrives, a SpaceWire packet will have been stored in the physical memory we
previously allocated, and the computer enters the interrupt service a second time.
At this point, it is ready to write the SpaceWire data to a file or perform data
processing. The transfer_flag variable is reset to 0 before exiting the second
interrupt service, ready for the next DMA transfer. At this moment, one data transfer
process is completed.
16 Design of SpaceWire Interface Conversion to PCI Bus 161
16.4 Testing
The major work of testing is to obtain the data transfer speed. We built two
identical SpaceWire-cPCI communication cards using the firmware above to set
up the testing environment. To prevent conflicting occupation of the PCI bus, we
placed the two cards in different CompactPCI chassis and tested the rate by making
them communicate with each other. Card A serves as the sender while card B serves
as the receiver. Figure 16.6 shows the structure of the test environment.
It is important to obtain accurate timing for a speed test. In this design, the relevant
time is that taken for data to be written to physical memory from the receiver card.
In order to include the time spent calling driver functions, we decided to use
the high-precision software timer under the Windows OS [8].
This paragraph introduces the software timing operations. Computer B calls the
QueryPerformanceFrequency function to get the frequency of the internal
timer during initialization. Then card A receives commands from computer A and
sends packets of different lengths. After card B has received packets from the
SpaceWire router and started the data transfer, computer B calls the
QueryPerformanceCounter function twice, when entering the interrupt service
for the first time and for the second time, respectively. In this way, we can realize
high-precision software timing and calculate the time taken by the data transfer process.
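The same two-timestamp pattern can be illustrated with Python's perf_counter (the Windows functions in the text wrap the identical idea; the function name and the stand-in transfer below are ours):

```python
import time

def measure_rate_mbit(transfer, n_bytes):
    """Timestamp at the first entry to the interrupt service (DMA
    enabled) and at the second entry (DMA DONE), then compute the
    effective transfer rate in Mbit/s."""
    t0 = time.perf_counter()   # first interrupt: transfer started
    transfer()                 # stands in for the DMA transfer
    t1 = time.perf_counter()   # second interrupt: data in memory
    return n_bytes * 8 / (t1 - t0) / 1e6

# A 10 ms stand-in transfer of 250 kB corresponds to at most
# 200 Mbit/s (sleep may overshoot slightly, lowering the figure).
rate = measure_rate_mbit(lambda: time.sleep(0.01), 250_000)
```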
Table 16.2 shows the transfer rates under different packet lengths after testing.
As the packet length increases, the effective SpaceWire data transfer rate of the PCI
interface increases too. Its theoretical maximum bandwidth is significantly higher
than the value of the standard SpaceWire-cPCI card (160 Mbit/s) [7], and it is still
increasing at the end of Table 16.2. We conclude that this benefits from the
converted packet format and the 32-bit burst transfer mode of the DMA controller.
(Fig. 16.6: structure of the test environment — computer A with card A as the sender
issues test commands from a keyboard; computer B with card B as the receiver
displays test data on a monitor; the two cards are connected via SpaceWire cables
through an 8-port SpaceWire router.)
Table 16.2 Test data of transfer rate

Packet length (in Bytes)   Transfer rate (in Mbit/s)
2500                       48.35
5000                       93.56
7500                       119.83
10,000                     146.31
15,000                     187.32
20,000                     218.87
25,000                     244.51
30,000                     265.42
However, due to the poor real-time performance of the Windows OS and the low execution
efficiency of NI-VISA, most of the transfer time is spent servicing the corresponding
interrupt and calling driver functions. Therefore, the transfer rate is still very low when
the packet length is smaller than 2500 bytes. Future work could develop
real-time software drivers or use a real-time operating system such as VxWorks.
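The shape of Table 16.2 is consistent with a fixed per-transfer software cost. As an illustrative model (an assumption of this sketch, not a claim from the paper), take rate(L) = 8L / (t0 + 8L/B), where t0 is the fixed interrupt and driver overhead and B the asymptotic link rate; fitting t0 and B from the first and last table rows reproduces the intermediate rows to within roughly 7%.

```python
def elapsed_us(length_bytes, rate_mbit):
    # Transfer time implied by a table row, in microseconds (Mbit/s == bit/us)
    return length_bytes * 8 / rate_mbit

# Two-point fit of t(L) = t0 + 8L/B using the 2500- and 30,000-byte rows
t_small, t_large = elapsed_us(2500, 48.35), elapsed_us(30000, 265.42)
slope = (t_large - t_small) / ((30000 - 2500) * 8)  # microseconds per bit = 1/B
B = 1 / slope                     # asymptotic rate, about 448 Mbit/s
t0 = t_small - 2500 * 8 * slope   # fixed per-transfer overhead, about 369 us

def model_rate(length_bytes):
    # Effective rate predicted by the fixed-overhead model, in Mbit/s
    return length_bytes * 8 / (t0 + length_bytes * 8 / B)
```

Under this fit, the fixed overhead dominates for short packets (about 369 µs against a 45 µs payload at 2,500 bytes), matching the observation that interrupt servicing and driver calls consume most of the time.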
References
Abstract A general approach based on a control factor for controlling the amplitude
of the Logistic map is discussed in this paper. The approach is
illustrated using the Logistic map as a typical example. It is proved that the amplitude
of the Logistic map can be controlled completely. Since the approach is derived from
the general quadratic map, it is suitable for all quadratic chaotic maps.
17.1 Introduction
Most of the classical chaotic attractors in these classical chaotic systems are
generated by unstable equilibria or fixed points. However, there may be hidden
attractors in these chaotic systems, and they may or may not be chaotic attractors.
The basin of a hidden attractor does not contain neighborhoods of equilibria or
fixed points. The investigations of hidden attractors can be traced back to the second
part of Hilbert's 16th problem for two-dimensional polynomial systems [21]. In
1961, the problem of hidden oscillations in two-dimensional phase-locked loop systems was revealed by Gubar [22]. With continuing research on hidden
attractors in automatic control systems, hidden oscillations have been found in automatic
control systems with a unique stable fixed point and a nonlinearity [23]. The development of this field was greatly promoted when a
new way of finding hidden attractors in Chua's circuit was proposed [24]. Judging from the
research progress, the existing investigations of hidden attractors
are almost always in continuous-time dynamic systems, and few are in discrete-time
dynamic systems.
Now, research on chaotic systems mainly focuses on the study of chaotic
attractors and other chaotic behaviors, while the amplitude of chaotic systems is
relatively less studied. However, amplitude control of chaotic signals is also an
important area in the application of chaotic systems. In 2013, Li and Sprott first
proposed an approach to control the amplitude of chaotic signals. By introducing control
functions, the amplitude of the Lorenz chaotic system was well controlled. Since
then, amplitude control of chaotic systems has been studied further; Li and Sprott
used the amplitude control method to find coexisting chaotic attractors. However,
the existing research on amplitude control of chaotic signals focuses only on
continuous chaotic systems. To the best of the authors' knowledge, none of the existing
amplitude control approaches of chaotic systems is for discrete chaotic maps.
Therefore, a new approach is proposed in this paper to control the amplitude of the quadratic
chaotic map. The approach is illustrated using the Logistic
map as a typical example. Since the approach is derived from the general quadratic
map, it is suitable for all one-dimensional quadratic chaotic maps.
In 1976, May proposed the famous Logistic map, whose iteration is x(n + 1) = μ·x(n)·(1 − x(n)). The Logistic
map is a one-dimensional discrete chaotic map, and it satisfies the period-three
theorem proposed by Li and Yorke [30]. The period-three theorem is very important for
one-dimensional chaotic maps and is an important theoretical tool to study one-dimensional chaotic maps. From the relationship between period-three points and
period-one points of discrete dynamical systems, the period-one points are
also period-three points of the dynamical system. For comparison, period-two
points correspond to period-four points. Therefore, we first obtain the period-one
points of the Logistic map. Letting x(n) = f(μ, x(n)), we can obtain
Simplifying H(μ, x(n)), we get a polynomial function of μ and x(n).
Letting H(μ, x(n)) = 0, the roots of the equation are the period-three points of the
Logistic map. H(μ, x(n)) is a polynomial function of x(n) of degree six, so it has at most six roots. Since the Logistic
map is a chaotic map, it must have period-three points according to the period-three
theorem. Therefore, the cases in which the equation H(μ, x(n)) = 0 has two, four, or five roots can be ruled out. Since the Logistic map must have period-three points,
the equation must have three distinct double roots. It is difficult to obtain an analytic expression for
the solution of the equation H(μ, x(n)) = 0, and it is also difficult to obtain accurate
values by numerical solution in Matlab. A small error in the period-three points of a
chaotic map ultimately changes the entire map. If truncation is
performed on the roots of the equation H(μ, x(n)) = 0, then the obtained period-three
points of the Logistic map are not the true period-three points, and controlling them is not a true control of the Logistic map. To avoid the influence
of calculation error on the period-three points of the Logistic map, the period-three
points are rearranged in this paper, and the relationship between the control factor and
the Logistic map coefficients is derived from the period-three points. First, suppose the
quadratic function is
f(x) = a1·x^2 + a2·x + a3    (17.5)
Suppose it has period-three points x31, x32, x33 with x31 < x32 < x33. Substituting the
period-three points into (17.5), we obtain three equations:
f(x31) = a1·x31^2 + a2·x31 + a3 = x32
f(x32) = a1·x32^2 + a2·x32 + a3 = x33    (17.6)
f(x33) = a1·x33^2 + a2·x33 + a3 = x31
Suppose m is the control factor, and let x̄31 = m·x31, x̄32 = m·x32, x̄33 = m·x33.
Substituting them into Eqs. (17.7–17.9), the relationship between the new parameters and the old
parameters is obtained:

ā1 = a1/m,  ā2 = a2,  ā3 = m·a3.    (17.10)
17 A Chaotic Map with Amplitude Control 167
From the classical Logistic map, the new chaotic map with the amplitude control factor
is obtained.
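Assuming the controlled map follows from applying the parameter relationship to the Logistic coefficients a1 = −μ, a2 = μ, a3 = 0, i.e. x(n + 1) = μ·x(n)·(1 − x(n)/m), the scaling can be checked numerically (a sketch; the function names are illustrative):

```python
def logistic(x, mu=4.0):
    # Classical Logistic map x(n+1) = mu * x(n) * (1 - x(n))
    return mu * x * (1.0 - x)

def logistic_m(x, mu=4.0, m=2.0):
    # Controlled map obtained by a1 -> a1/m (a2 unchanged, a3 = 0):
    # x(n+1) = mu * x(n) * (1 - x(n)/m), with the range scaled by m
    return mu * x * (1.0 - x / m)

def orbits(x0, m, steps=30, mu=4.0):
    # Iterate both maps, the controlled one from the scaled value m * x0
    xs, ys = [x0], [m * x0]
    for _ in range(steps):
        xs.append(logistic(xs[-1], mu))
        ys.append(logistic_m(ys[-1], mu, m))
    return xs, ys
```

For m a power of two the scaling is exact in double precision: the controlled orbit started from m·x(0) reproduces the classical orbit multiplied by m, term by term.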
Fig. 17.2 When μ = 4 and m = 2, the nonlinear dynamic behaviors of the Logistic map with
amplitude control factor m: a bifurcation diagram; b output time series for x(0) = 0.1; c phase
diagram; d output time series for x(0) = 0.05
168 C. Wang and Q. Ding
doubling the amplitude is the same as the classical Logistic map in Fig. 17.1b.
Therefore, when m = 2, the output time series with initial value x(0) = 0.05
is on the same orbit as the output time series of the classical Logistic map with
initial value x(0) = 0.1. From the theory of topological conjugation, we
know that two different initial values may correspond to the same orbit in two
different chaotic maps. The chaotic maps that are topologically conjugate to the
Logistic map include the Tent map [32] and the U-N (Ulam and von Neumann) map [33]. For
the Tent map, the transform function is g(x(n)) = sin^2(π·x(n)/2); for the U-N map, it is
h(x(n)) = 0.5 − 0.5·x(n). Compared with g(x(n)), h(x(n)) is
simpler. Compared with the classical Logistic map, some topologically conjugate maps
have the same range of x(n), and some have different ranges of x(n). However,
the amplitude of these topologically conjugate maps is fixed and cannot be changed.
If parameters in the corresponding transform function are altered, it cannot
be guaranteed that the transformed maps still show chaotic behavior, and it
is difficult to find suitable transform functions. Although the existing topological
conjugation methods cannot greatly control the amplitude of the Logistic map, they
still belong to the control methods based on internal change of the Logistic map; the
block diagram is shown in Fig. 17.3a. In addition, another way is to add an extra
amplifier, which directly scales the amplitude of the time series of the Logistic
map. Supposing the scaling factor is k, the block diagram of this scheme is shown in
Fig. 17.3b. Since the scaling factor k is applied directly to the output time series of
the Logistic map, it cannot control the orbit of the Logistic map.
From Fig. 17.1a, we can see that the Logistic map itself has some ability to control
amplitude: the amplitude of the Logistic map changes with the parameter μ.
However, it is difficult to guarantee that the Logistic map always behaves chaotically
for different values of μ. In addition, the maximum amplitude of the Logistic map
controlled by the parameter μ is 1. Therefore, the control ability of the parameter μ over the
Logistic map is limited. Its block diagram is shown in Fig. 17.3c. The block diagram
of the method proposed in this paper is shown in Fig. 17.3d. After introducing the
amplitude control factor m, the new Logistic map cannot be decomposed into two
small subsystems. The Logistic map with amplitude control factor m is still a chaotic
system with a chaotic attractor, which is inseparable and topologically transitive.
When m = 0.25, 0.5, 2, 4, the bifurcation and phase diagrams are shown in Fig. 17.4.
Fig. 17.4 The bifurcation and phase diagram with different amplitude control factor m. a Bifurca-
tion diagram, b phase diagram
17.3 Conclusion
References
1. Chen, G., Mao, Y., Chui, C.: A symmetric image encryption scheme based on 3D chaotic cat
maps. Chaos Solitons Fractals 21, 749–761 (2004)
2. Chen, C.-M., Linlin, X., Tsu-Yang, W., Li, C.-R.: On the security of a chaotic maps-based
three-party authenticated key agreement protocol. J. Netw. Intell. 1(2), 61–66 (2016)
3. Chen, C.-M., Wang, K.-H., Wu, T.-Y., Wang, E.K.: On the security of a three-party authenticated
key agreement protocol based on chaotic maps. Data Sci. Pattern Recogn. 1(2), 1–10 (2017)
4. Fan, C., Ding, Q.: ARM-embedded implementation of H.264 selective encryption based on
chaotic stream cipher. J. Netw. Intell. 3(1), 9–15 (2018)
5. Wu, T.-Y., Fan, X., Wang, K.-H., Pan, J.-S., Chen, C.-M.: Security analysis and improvement
on an image encryption algorithm using Chebyshev generator. J. Internet Technol. 20(1), 13–23
(2019)
6. Wu, T.-Y., Fan, X., Wang, K.-H., Pan, J.-S., Chen, C.-M., Wu, J.M.-T.: Security analysis
and improvement of an image encryption scheme based on chaotic tent map. J. Inf. Hiding
Multimed. Signal Process. 9(4), 1050–1057 (2018)
7. Chen, C.-M., Linlin, X., Wang, K.-H., Liu, S., Wu, T.-Y.: Cryptanalysis and improvements on
three-party-authenticated key agreement protocols based on chaotic maps. J. Internet Technol.
19(3), 679–687 (2018)
8. Chen, C.-M., Fang, W., Liu, S., Tsu-Yang, W., Pan, J.-S., Wang, K.-H.: Improvement on a
chaotic map-based mutual anonymous authentication protocol. J. Inf. Sci. Eng. 34, 371–390
(2018)
9. Wu, T.-Y., Wang, K.-H., Chen, C.-M., Wu, J.M.-T., Pan, J.-S.: A simple image encryption
algorithm based on logistic map. Adv. Intell. Syst. Comput. 891, 241–247 (2018)
10. Lorenz, E.N.: Deterministic non-periodic flow. J. Atmos. Sci. 20, 130–141 (1963)
11. Rössler, O.E: An equation for continuous chaos. Phys. Lett. A 57, 397–398 (1976)
12. Chua, L.O., Lin, G.N.: Canonical realization of Chua’s circuit family. IEEE Trans. Circuits
Syst. 37, 885–902 (1990)
13. Chen, G., Ueta, T: Yet another chaotic attractor. Int. J. Bifurc. Chaos 9, 1465–1466 (1999)
14. Lü, J., Chen, G.: A new chaotic attractor coined. Int. J. Bifurc. Chaos 3, 659–661 (2000)
15. Qi, G., Chen, G., Du, S., Chen, Z., Yuan, Z: Analysis of a new chaotic system. Phys. A Stat.
Mech. Appl. 352, 295–308 (2005)
16. May, R.M: Simple mathematical models with very complicated dynamics. Nature 261, 459–467
(1976)
17. Hénon, M.: A two-dimensional mapping with a strange attractor. Commun. Math. Phys. 50,
69–77 (1976)
18. Chen, G., Lai, D.: Feedback control of Lyapunov exponents for discrete-time dynamical sys-
tems. Int. J. Bifurc. Chaos 06, 1341–1349 (1996)
19. Lin, Z., Yu, S., Lü, J., Cai, S., Chen, G.: Design and ARM-embedded implementation of a
chaotic map-based real-time secure video communication system. IEEE. Trans. Circ. Syst.
Video 25, 1203–1216 (2015)
20. Wang, C.F., Fan, C.L., Ding, Q.: Constructing discrete chaotic systems with positive Lyapunov
exponents. Int. J. Bifurcat. Chaos 28, 1850084 (2018)
21. Hilbert, D.: Mathematical problems. Bull. Amer. Math. Soc. 8, 437–479 (1902)
22. Gubar, N.A.: Investigation of a piecewise linear dynamical system with three parameters. J.
Appl. Math. Mech. 25, 1011–1023 (1961)
23. Markus, L., Yamabe, H.: Global stability criteria for differential systems. Osaka Math. J. 12,
305–317 (1960)
24. Leonov, G.A.: Algorithms for finding hidden oscillations in nonlinear systems. The Aizerman
and Kalman conjectures and Chua's circuits. J. Comput. Syst. Sci. Int. 50, 511–543 (2011)
Chapter 18
Analysis of Factors Associated
to Smoking Cessation Plan Among Adult
Smokers
Abstract According to the World Health Organization (WHO), smoking causes
many diseases, and tobacco has been one of the biggest threats to human health. The
government of the Republic of Korea has implemented policies to reduce the damage from
smoking since 1986, but almost 1 out of 5 Koreans still smoked in 2017 (21.2%). In
this research, we collected datasets from the Korea National Health and Nutrition Examination
Survey (KNHANES) from 2013 to 2015 and used statistical methods to analyze
the smoking patterns of adult smokers. We applied the chi-square test to 28
independent variables against the dependent variable (pre-contemplation versus preparation)
and evaluated the results based on the significance level obtained from the
statistical analysis program SPSS. In our results, the gender distribution was
2,407 (84.4%) males and 444 (15.6%) females. The mean age was 46.36 ±
15.13 years, ranging roughly from 31 to 61 years. There were more single smokers than
married smokers, and this result was significant in this study. An anti-smoking policy
at home was reported not to be relevant, whereas an anti-smoking policy in public
places was statistically significant (p = 0.007). The results of this study suggest
that presenting the negative aspects of smoking, as a significant factor related to the
preparation stage of smoking cessation, may help many smokers decide to quit.
J. S. Lee
Department of Smart Factory, Chungbuk National University, Cheongju, South Korea
e-mail: richard@dblab.chungbuk.ac.kr
K. H. Ryu (B)
Faculty of Information Technology, Ton Duc Thang University,
Ho Chi Minh City 700000, Vietnam
e-mail: phamvanhuy@tdtu.edu.vn; khryu@tdtu.edu.vn; khryu@chungbuk.ac.kr
Department of Computer Science, College of Electrical and Computer Engineering,
Chungbuk National University, Cheongju, South Korea
18.1 Introduction
In our experiment, the data preprocessing process is shown in Fig. 18.1.
In the first step, we collected raw data from the KNHANES (Korea National Health
and Nutrition Examination Survey) from 2013 to 2015; after registering personal
information and signing a pledge of confidentiality, anyone can download the raw
datasets from the website [7]. The raw datasets include 22,948 instances and 862
features.
In the second step, we selected adult smokers as our research subjects; the resulting
target dataset included 3,027 instances.
This dataset contains many features, including many unrelated features and
a number of missing values. Thus, in the third step, we removed some irrelevant
features such as personal ID number, life type, and so on. In the last step, we
deleted records with missing values, which occur, for example, when a respondent
declines to answer part of the personal survey.
A total of 2,851 adult smokers were included in this study. The ratio of smokers
from 2013 to 2015 is shown in Fig. 18.2: 37.4% in 2013, 33.2%
in 2014, and 29.4% in 2015. The yearly numbers and percentages of contemplation and preparation
smokers are shown in Table 18.1.
18.2.2 Measures
For the measures, we used one question from the KNHANES questionnaire to
classify preparation for smoking cessation, as shown in Fig. 18.3 [7].
Other studies using KNHANES data divided smokers into three categories and studied
the stages of change [8], but we divided them into two categories: smoking cessation
preparation and no thought of cessation. Responses 1 and 2 were treated as smoking cessation
preparation, and responses 3 and 4 were regarded as no thought of smoking cessation.
Our experimental framework is shown in Fig. 18.4. We collected data from the
KNHANES for adult smokers aged over 18 years from 2013 to 2015,
removed missing values and outliers through data preprocessing, and performed the
chi-square test through complex-sample cross-tabulation analysis.
Feature selection extracts a new set of attributes that provides the necessary
information and, in some cases, better information. Therefore, statistically significant (p
< 0.05) results were extracted and used to analyze the characteristics of people
who contemplated smoking cessation [9, 10].
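The chi-square step can be sketched on one variable from Table 18.2. Note that this is a plain Pearson test built from the standard library, not the complex-sample SPSS analysis used in the study, so the p-value differs from the reported 0.001, though the marriage-by-stage association remains significant.

```python
import math

def chi2_2x2(a, b, c, d):
    # Pearson chi-square (no continuity correction) for the 2x2 table
    # [[a, b], [c, d]]; with 1 degree of freedom, p = erfc(sqrt(chi2 / 2)).
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Marriage status by stage, counts taken from Table 18.2:
# pre-contemplation 1,404 married / 357 single; preparation 821 / 269
chi2, p = chi2_2x2(1404, 357, 821, 269)
```

Running this gives chi2 of roughly 7.6 and p well below 0.05, consistent with marriage being one of the significant variables.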
Our results are shown in Table 18.2. A total of 2,851 people were assigned to the
pre-contemplation and preparation groups. The gender distribution was 2,407
(84.4%) males and 444 (15.6%) females. The mean age was 46.36 ± 15.13 years and
Table 18.2 The general characteristics of the pre-contemplation and preparation groups

Variable | Value | Pre-contemplation (%) | Preparation (%) | P-value
Gender | Male | 1,490 (87.5) | 917 (86.1) | 0.260
Gender | Female | 271 (12.5) | 173 (13.9) |
Age | 19–24 | 114 (8.9) | 105 (13.0) | 0.008
Age | 25–49 | 913 (60.1) | 559 (58.3) |
Age | 50–74 | 662 (28.8) | 391 (27.2) |
Age | 75–80 | 72 (2.2) | 35 (1.5) |
Age | Mean ± SD | 46.99 ± 14.98 | 45.35 ± 15.32 |
Education | Middle school or lower | 480 (20.0) | 246 (17.7) | 0.342
Education | High school | 716 (44.5) | 478 (47.0) |
Education | College graduate or higher | 565 (35.5) | 366 (35.4) |
Marriage | Married | 1,404 (74.4) | 821 (67.5) | 0.001
Marriage | Single | 357 (25.6) | 269 (32.5) |
BMI, kg/m2 | ≤18.4 | 68 (4.0) | 41 (3.9) | 0.788
BMI, kg/m2 | 18.5–24.0 | 1,048 (57.9) | 653 (59.4) |
BMI, kg/m2 | ≥25.0 | 645 (38.0) | 396 (36.7) |
Physical activity at company | Intense | 11 (6.1) | 11 (14.0) | 0.047
Physical activity at company | Moderate | 120 (69.6) | 78 (72.0) |
Physical activity at company | Both | 41 (24.3) | 17 (14.0) |
Exercises (per week) | Walking | 980 (54.9) | 562 (51.7) | 0.055
Exercises (per week) | Muscle | 38 (2.4) | 28 (2.9) |
Exercises (per week) | Both | 402 (24.9) | 319 (29.9) |
Exercises (per week) | None | 341 (17.8) | 181 (15.5) |
Stress | Yes | 522 (31.2) | 367 (35.4) | 0.037
Stress | No | 1,239 (68.8) | 723 (64.6) |
EQ-5D | 0.0–0.999 | 500 (25.1) | 309 (25.8) | 0.703
EQ-5D | 1 | 1,257 (74.9) | 781 (74.2) |
Alcohol (per month) | Never or under a glass | 406 (20.3) | 232 (19.2) | 0.489
Alcohol (per month) | Over a glass | 1,355 (79.7) | 858 (80.8) |
the range was roughly from 31 to 61 years, with statistical significance p = 0.008.
There were more single smokers than married ones, and the result was significant
in this study. Physical activity at the company was statistically significant, with
moderate activity the most common, and stress was higher in the preparation group.
Physical activity at the company was examined to determine its relevance to smoking
cessation preparation.
Smoking-related characteristics of the pre-contemplation and preparation groups
are shown in Table 18.3. Among the general characteristics related to smoking, the
incidence of secondhand smoke was high in both groups, but it was not statistically
significant and appeared not to be relevant. Rather, secondhand smoke at
home was reported not to be relevant, while an anti-smoking policy in public places was statistically significant.
The smoking initiation age was 20.26 ± 5.84 years, meaning that most subjects started smoking
between 15 and 25 years of age. The number of cigarettes smoked per day was 14.21 ± 7.97, meaning that many
smokers smoked roughly 7 to 22 cigarettes a day.
Smoking initiation age was examined to determine its relevance to smoking cessation
preparation.
18.4 Conclusion
Smoking is one of the major causes of various diseases and deaths. That is why
the government of the Republic of Korea started smoking cessation programs and has tried to
lower the smoking rate, but many people still smoke.
Based on this research, we expect our results to help people who want to quit
smoking recognize the negative aspects of smoking and take a step toward smoking cessation.
Acknowledgements This research was supported by Basic Science Research Program through the
National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future
Planning (No. 2017R1A2B4010826), supported by the KIAT (Korea Institute for Advancement
of Technology) grant funded by the Korea Government (MOTIE: Ministry of Trade Industry and
Energy) (No. N0002429).
References
1. Choi, H.S., Sohn, H.S., Kim, Y.H., Lee, M.J.: Factors associated with failure in the continuity
of smoking cessation among 6 month’s smoking cessation successes in the smoking cessation
clinic of public health center. J. Korea Acad. Ind. Coop. Soc. 13(10), 4653–4659 (2012)
2. Kim, D.H., Suh, Y.S.: Smoking as a disease. Korean J. Fam. Med. 30(7), 494–502 (2009)
3. Kim, E.S.: Smoking high risk group woman, out-of-school youth research on development of
smoking cessation service strategy results report (2016)
4. National Prevention, Health Promotion and Public Health Council. In: 2010 Annual Status
Report. http://www.hhs.gov/news/reports/nationalprevention2010report.pdf. Accessed July
2010
5. Ministry of Health & Welfare. Yearbook of Health and Welfare Statistics (2001). http://www.
moha.go.kr
6. Kim, H.O.: The effect of smoking cessation program on smoking cessation and smoking behav-
ior change of adult smokers. Commun. Nurs. 13(1) (2002)
7. Korea Centers for Disease Control and Prevention. Korea National Health and Nutrition Exam-
ination Survey Data. Korea National Health and Nutrition Examination Survey, 1 Mar 2015
8. Leem, A.Y., Han, C.H., Ahn, C.M., Lee, S.H., Kim, J.Y., Chun, E.M.: Factors associated with
stage of change in smoker in relation to smoking cessation based on the Korean National Health
and Nutrition Examination Survey II–V. PLoS One 12(5), e0176294 (2017)
9. Dash, M., Liu, H.: Feature selection for classification. Intell. Data Anal. 1(1–4), 131–156 (1997)
10. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res.
3(Mar), 1157–1182 (2003)
Chapter 19
An Efficient Semantic Document
Similarity Calculation Method Based
on Double-Relations in Gene Ontology
Abstract Semantic text mining has been a challenging research topic in recent years. Much
research focuses on measuring the similarity of two documents with ontologies
such as Medical Subject Headings (MeSH) and Gene Ontology (GO). However,
most studies consider only a single relationship in an ontology. To represent
documents more comprehensively, a semantic document similarity calculation
method is proposed, based on the Average Maximum Match algorithm with
double-relations in GO. In the experiment, the results show that the double-relations
based similarity calculation method outperforms traditional semantic similarity
measurements.
J. Hu · M. Li (B) · Z. Zhang · K. Li
College of Information Engineering, Shanghai Maritime University, Shanghai, China
e-mail: mjli@shmtu.edu.cn
J. Hu
e-mail: Jingyu-Hu@outlook.com
Z. Zhang
e-mail: Zijun.Zhang1105@outlook.com
K. Li
e-mail: Kaitong-Li@outlook.com
19.1 Introduction
Recent years have witnessed rapid growth in the number of biological documents.
Classifying this enormous literature efficiently is of vital significance for management
and reference consulting. Hence, biological text mining has become important for
automatic classification, which is faster than traditional manual methods. At present,
many researchers focus on the study of text similarity measurement, such as
cross-lingual similarity measures [1], contextual similarity measures [2], passage-based
similarity measures [3], page-count-based similarity measures [4], and so
on. Besides content-based similarity calculation methods, ontology-based text similarity
calculation methods are commonly used for semantic text mining. Current
semantic similarity measures can be roughly divided into path-based methods [5–8]
and IC-based methods including Lord [9–11]. Many researchers have begun to apply
these methods to biological text data analyses [12–14, 16]. Using these algorithms, the
similarity of terms can be extended from one-to-one to many-to-many in text
clustering. The common feature of the work above is that researchers
focus on inter-document calculation with a single relationship [12–14]. Nevertheless,
it is known that relations, such as 'is-a', 'part-of', and 'regulate', differ among
gene ontology (GO) [15] terms; consequently, the role of other relations in clustering has been
neglected.
To consider more possible relationships between two documents, we propose
a new method to calculate document similarity based on double-relations in the
ontology. With these double-relations combined, a document's structure can be
described more specifically.
Fig. 19.1 The workflow of semantic biological document similarity calculation and clustering
To represent documents with semantic information, GO terms were extracted from
documents as semantic biological features. Transitivity, which offers a theoretical basis
for connecting paths, holds for both 'is-a' and 'part-of' relations, while for other
relations such as 'regulate' it remains unproven.
Double-relations Similarity between Two Features. In this paper, we used two
kinds of semantic similarity measurement methods: path-based similarity measures
and weighted-information-content-based similarity.
The path-based similarity algorithm used in this research is WP [5]. WP compares
two terms through their nearest common ancestor. When multiple paths are reachable
between two terms, the ancestor with the shortest path, the Lowest Common Ancestor (LCA),
is chosen as the nearest ancestor term c. The similarity goes
to zero when the two terms have no common ancestor.
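The WP measure scores two terms as 2·depth(LCA) / (depth(a) + depth(b)). A minimal sketch over a toy 'is-a' hierarchy (the term names and helper functions are illustrative, not real GO identifiers):

```python
# Toy "is-a" hierarchy as child -> parent; the root's parent is None.
PARENT = {
    "biological_process": None,
    "cell_organization": "biological_process",
    "spindle_organization": "cell_organization",
    "mitotic_spindle_assembly": "spindle_organization",
    "mitotic_spindle_elongation": "spindle_organization",
}

def ancestors(term):
    # The chain from a term up to the root, inclusive of the term itself
    chain = [term]
    while PARENT[term] is not None:
        term = PARENT[term]
        chain.append(term)
    return chain

def depth(term):
    # Number of nodes on the path to the root (the root has depth 1)
    return len(ancestors(term))

def wp_similarity(a, b):
    # Wu-Palmer: 2 * depth(LCA) / (depth(a) + depth(b)); 0 with no common ancestor
    common = set(ancestors(a)) & set(ancestors(b))
    if not common:
        return 0.0
    lca = max(common, key=depth)  # nearest (deepest) common ancestor
    return 2.0 * depth(lca) / (depth(a) + depth(b))
```

Two sibling terms at depth 4 whose LCA sits at depth 3 thus score 6/8 = 0.75, and any term compared with itself scores 1.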
Double-relations Similarity between Two Documents. Generally, a document corresponds to
a set of ontology terms rather than a single term, so a reallocation of similarity is
essential for multi-term comparison. A new double-relations text similarity scheme is
proposed, based on the Average Maximum Match (AMM).
By referring to the AMM proposal, the similarity with a single relation between
documents Cm and Cn can be defined as

WSim(Cm, Cn) = ( Σ_{i=1..m} Simt(Cmi, Cn) + Σ_{j=1..n} Simt(Cnj, Cm) ) / (m + n)    (19.1)

Simt(Cma, Cn) = MAX_j ( F(Cma, Cnj) ),  a ∈ [1, m], j = 1, …, n    (19.2)
where Cmi refers to the ith term of document Cm with m terms and Cnj means the jth term
of document Cn with n terms. F(Ci, Cj) is the similarity between terms Ci and Cj computed by
one of the path-based or IC-based algorithms above. Afterwards, to adapt this
fundamental AMM module to double-relations conditions, Eq. (19.1) is
rearranged so as to pick out the biggest similarity index between term x and
term y over the different relations R. The newly produced algorithm is as follows:
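A sketch of the scheme just described: the per-term maximum of Eqs. (19.1)–(19.2) is taken additionally over a list of per-relation similarity functions, which here are hypothetical stand-ins for F under each relation R.

```python
def simt(term, other_doc, rel_sims):
    # Best match of `term` against the other document's terms, taking the
    # largest similarity index across all relation-specific measures.
    return max(f(term, u) for f in rel_sims for u in other_doc)

def wsim(doc_m, doc_n, rel_sims):
    # Average Maximum Match (AMM): average of the per-term best matches in
    # both directions, as in Eq. (19.1). A single-entry rel_sims recovers
    # the single-relation scheme; two entries give double-relations.
    total = sum(simt(t, doc_n, rel_sims) for t in doc_m)
    total += sum(simt(u, doc_m, rel_sims) for u in doc_n)
    return total / (len(doc_m) + len(doc_n))
```

The result is symmetric in the two documents and stays within the range of the underlying term similarity functions.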
The definition of term w's TF–IDF in the ith document of set D is as follows:

TF(w, Di) = count(w, Di) / size(Di),   IDF(w, D) = log( size(D) / N )    (19.4)
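Equation (19.4) translates directly; in this sketch, N is read as the number of documents containing w, which is an assumption since N is not defined in the surrounding text.

```python
import math

def tf(w, doc):
    # TF(w, Di) = count(w, Di) / size(Di), with a document as a list of terms
    return doc.count(w) / len(doc)

def idf(w, docs):
    # IDF(w, D) = log(size(D) / N); N is taken here as the number of
    # documents that contain w (assumed, not defined in the text above)
    n = sum(1 for d in docs if w in d)
    return math.log(len(docs) / n) if n else 0.0

def tf_idf(w, doc, docs):
    return tf(w, doc) * idf(w, docs)
```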
The experiment dataset contains 848 documents, equally divided into four
classes. An 848 × 848 similarity matrix S is
obtained with the selected similarity measurement.
To assess the performance of document clustering, three evaluation measures,
precision, recall, and F-measure, are chosen to examine the difference
between the test results and the original cluster labels. Precision refers to the proportion of
mutually similar documents in the same cluster; recall is defined as the probability
of clustering similar documents into the same cluster; the F-measure considers precision
and recall together. The formulas are the following equations:
Precision = TP / (TP + FP)    (19.5)

Recall = TP / (TP + FN)    (19.6)

F = (2 × Precision × Recall) / (Precision + Recall)    (19.7)
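The three measures in Eqs. (19.5)–(19.7) translate directly, with tp, fp, and fn counting true-positive, false-positive, and false-negative decisions:

```python
def precision(tp, fp):
    # Eq. (19.5): fraction of same-cluster pairs that truly belong together
    return tp / (tp + fp)

def recall(tp, fn):
    # Eq. (19.6): fraction of truly-similar pairs placed in the same cluster
    return tp / (tp + fn)

def f_measure(p, r):
    # Eq. (19.7): harmonic mean of precision and recall
    return 2 * p * r / (p + r)
```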
Cluster Annotation: We used a TF–IDF-based annotation method to label the
text clusters. Compared with the part-of relation based similarity measurement, the
double-relations based and is-a relation based similarity measurements
describe the text cluster more comprehensively (Table 19.1).
Comparison with other methods: In the experiment, we compared our
proposed double-relations similarity measurement with two other similarity measurements
based on a single relation. As the experiment shows, the double-relations similarity
measurement ranks first.
Table 19.1 Cluster 1 annotation result with top five TF–IDF terms

Original | Double-relations | Is-A only | Part-of only
Protein binding | Mitotic spindle organization | Protein binding | Nucleus
Mitotic spindle assembly | Centrosome | Mitotic spindle assembly | Protein binding
Mitotic spindle midzone | Protein binding | Mitotic sister chromatid segregation | Mitotic sister chromatid segregation
Mitotic spindle elongation | Mitotic spindle midzone | Condensed nuclear chromosome kinetochore | ESCRT III complex
Microtubule cytoskeleton organization | Nucleus | Nucleus | Mitotic spindle pole body
Table 19.2 Clustering quality evaluation among re-weighting, Is-A, and Part-of only

Method | Precision | Recall | F-measure
Similarity measure with double-relations | 0.7489 | 0.7505 | 0.7497
Similarity measure with is-a | 0.6712 | 0.6980 | 0.6843
Similarity measure with part-of | 0.3585 | 0.8045 | 0.4960
19.4 Conclusion
Acknowledgements This study was supported by the National Natural Science Foundation of
China (61702324).
References
1. Danushka, B., Georgios, K., Sophia, A.: A cross-lingual similarity measure for detecting
biomedical term translations. PLoS One 10(6), 7–15 (2015)
2. Spasić, I., Ananiadou, S.: A flexible measure of contextual similarity for biomedical terms.
In: Pacific Symposium on Biocomputing Pacific Symposium on Biocomputing, pp. 197–208
(2005)
3. Rey-Long, L.: Passage-based bibliographic coupling: an inter-article similarity measure for
biomedical articles. PLoS One 10(10), 6–10 (2015)
4. Chen, C., Hsieh, S., Weng, Y.: Semantic similarity measure in biomedical domain leverage
Web Search Engine. In: 2010 Annual International Conference of the IEEE Engineering in
Medicine and Biology (2010)
5. Wu, Z., Palmer, M.: Verbs semantics and lexical selection. In: Proceedings of the 32nd Annual
Meeting of the Associations for Computational Linguistics (ACL’94), pp. 133–138 (1994)
6. Leacock, C., Chodorow, M.: Filling in a sparse training space for word sense identification. In:
Proceedings of the 32nd Annual Meeting of the Associations for Computational Linguistics
(ACL94), pp. 248–256 (1994)
7. Li, Y., Bandar, Z., McLean, D.: An approach for measuring semantic similarity between words
using multiple information sources. IEEE Trans. Knowl. Data Eng. Bioinform. 15(4), 871–882
(2003)
8. Choudhury, J., Kimtani, D.K., Chakrabarty, A.: Text clustering using a word net-based
knowledge-base and the Lesk algorithm. Int. J. Comput. Appl. 48(21), 20–24 (2012)
9. Lord, P., Stevens, R., Brass, A., Goble, C.: Investigating semantic similarity measures across
the gene ontology: the relationship between sequence and annotation. Bioinformatics 19(10),
1275–1283 (2003)
10. Resnik, P.: Semantic similarity in a taxonomy: an information-based measure and its applica-
tion to problems of ambiguity and natural language. J. Artif. Intell. Res. 11,
95–130 (1999)
11. Lin, D.: Principle-based parsing without overgeneration. In: 31st Annual Meeting of the Associ-
ation for Computational Linguistics, pp. 112–120. Association for Computational Linguistics,
USA (1993)
12. Zhang, X., Jing, L., Hu, X., et al.: A comparative study of ontology based term similarity
measures on PubMed document clustering. In: International Conference on Database Systems,
pp. 115–126. Springer, Berlin, Heidelberg (2007)
13. Jing, Z., Yuxuan, S., Shengwen, P., Xuhui, L., Hiroshi, M., Shanfeng, Z.: MeSHSim: an
R/Bioconductor package for measuring semantic similarity over MeSH headings and MED-
LINE documents. J. Bioinform. Comput. (2015) (BioMed Central)
14. Logeswari, S., Kandhasamy, P.: Designing a semantic similarity measure for biomedical doc-
ument clustering. J. Med. Imaging Health Inform. 5(6), 1163–1170 (2015)
15. The Gene Ontology Resource Home. http://geneontology.org/. Accessed 27 Feb 2019
16. Wang, J.Z., Du, Z., Payattakool, R., Yu, P.S., Chen, C.F.: A new method to measure the semantic
similarity of go terms. Bioinformatics 23(10), 1274–1281 (2007)
17. Zare, H., Shooshtari, P., Gupta, A., Brinkman, R.: Data reduction for spectral clustering to
analyze high throughput flow cytometry data. BMC Bioinform. (2010)
18. Dongen, V.: A cluster algorithm for graphs. In: Information Systems, pp. 1–40. CWI (2000)
19. Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters
in large spatial databases with noise. In: KDD’96 Proceedings of the Second International
Conference on Knowledge Discovery and Data Mining, pp. 226–231 (1996)
20. MacKay, D.: An example inference task: clustering. In: Information Theory, Inference and
Learning Algorithms, pp. 284–292. Cambridge University Press (2003)
21. Robertson, S.: Understanding inverse document frequency: on theoretical arguments for IDF.
J. Doc. 60(5), 503–520 (2004)
Chapter 20
Analysis of the Dispersion of Impact
Point of Smart Blockade and Control
Ammunition System Based on Monte
Carlo Method
Abstract In order to study the dispersion and analyze the influencing factors of the impact point of the smart blockade and control ammunition system, a simplified ballistic model of the parachute–payload system is established. Based on the Monte Carlo method, the dispersion range of the impact point is acquired, and the main sensitive factors affecting the dispersion of the impact point are compared and analyzed. Simulation results show that the lateral dispersing velocity of the dispenser and the factors of the parachute are the sensitive factors that affect the dispersion of the impact point, of which the parachute factors are the most obvious. The research in this paper provides a reference and basis for the design of the smart ammunition system of the airborne dispenser.
20.1 Introduction
In future wars, it is crucial to effectively attack and block key targets or areas. With the development and application of microcomputer, wireless communication, sensor, and network technologies, various new types of regional blockade ammunition are emerging. Therefore, research on new regional blockade ammunition systems adapted to the modern battlefield and based on airborne dispensers, rockets, and other platforms has become a hot spot [1].
The combat mission of the smart blockade and control ammunition system is to
blockade the key areas on the battlefield. The smart blockade and control ammuni-
tion studied in this paper is scattered by the platform of the airborne dispenser, and
the deceleration and attitude adjustment are realized by parachute. The dispersion of
the impact point has a direct impact on the network communication between ammu-
nitions, thus affecting the combat effectiveness of the whole system. Therefore, it is
necessary to strengthen the research on the dispersing technique, and the dispersion of
the impact point. In this paper, the dynamic model of the parachute–payload system
is established and the flight simulation experiment is carried out by the Monte Carlo
method. The range of distribution and dispersion of the impact point are obtained.
The main sensitive factors affecting the dispersion of the impact point are compared
and analyzed.
The airborne dispenser loaded with ammunitions is divided into three cabins, namely, the front, the middle, and the rear. The four ammunitions contained in each cabin are divided into upper and lower layers, 12 ammunitions in total. The arrangement of the six ammunitions in the lower layer is shown in Fig. 20.1. The process of dispersion is as follows.
1. First, ammunition no. 1 and no. 2 in the rear cabin are thrown laterally at the speed of v1;
2. After the time delay Δt1, no. 3 and no. 4 in the middle cabin are thrown laterally at the speed of v2;
3. After the time delay Δt2, no. 5 and no. 6 in the front cabin are thrown laterally at the speed of v3.
[Fig. 20.1: initial dispersing position and layout of the six lower-layer ammunitions (1–6), thrown laterally at v1, v2, v3 with delays Δt1, Δt2 and spacings Δx1, Δx2 in the o-xz plane.]
Assume that the mass of the parachute–payload system remains unchanged and that the mass of the parachute is negligible. The air drag of the system can be simplified as a pulling force along the direction opposite to the velocity. All moments and forces that have little influence on the motion are ignored (Fig. 20.2).
Based on the above assumptions, Newton's laws, and the kinematics theorem, the simplified dynamical model of the parachute–payload system is established as follows [2]:
\[
\left\{
\begin{aligned}
&\frac{dv_x}{dt} = \frac{F_x}{m} = \frac{F_b + F_p}{2m}\cdot\frac{v_x - w_x}{v_r}\\
&\frac{dv_y}{dt} = \frac{F_y}{m} = \frac{F_b + F_p}{2m}\cdot\frac{v_y}{v_r} - g\\
&\frac{dv_z}{dt} = \frac{F_z}{m} = -\frac{F_b + F_p}{2m}\cdot\frac{v_z - w_z}{v_r}\\
&\frac{dx}{dt} = v_x = v\cos\varphi\cos\theta\\
&\frac{dy}{dt} = v_y = v\sin\theta\\
&\frac{dz}{dt} = v_z = v\sin\varphi\cos\theta
\end{aligned}
\right.
\qquad
\left\{
\begin{aligned}
&\mathbf{v} = \mathbf{v}_r + \mathbf{w}\\
&v_r = \sqrt{(v_x - w_x)^2 + v_y^2 + (v_z - w_z)^2}\\
&F_b = -\tfrac{1}{2}\rho\,(CA)_b\,v_r^2\\
&F_p = -\tfrac{1}{2}\rho\,(CA)_p\,v_r^2\\
&\theta = \arctan\frac{v_y}{\sqrt{v_x^2 + v_z^2}}\\
&\varphi = \arctan\frac{v_z}{v_x}
\end{aligned}
\right.
\tag{20.1}
\]
[Fig. 20.2: the force of the parachute–payload system under the ground (O-XYZ) and reference (C-xyz) coordinate systems.]
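Model (20.1) can be integrated numerically once initial conditions are chosen. The sketch below uses a forward-Euler step; all parameter values (mass, drag characteristics, initial velocities, wind) are illustrative assumptions rather than the paper's data, and the drag is applied opposite to the relative velocity on all three axes.

```python
import math

# Forward-Euler integration of the simplified parachute-payload model (20.1).
# The 1/(2m) factor follows the printed model; parameter values below are
# illustrative assumptions, not the paper's data.
def simulate(m=15.0, rho=1.225, CA_b=0.018, CA_p=0.6,
             vx=200.0, vy=0.0, vz=16.0, x=0.0, y=100.0, z=0.0,
             wx=0.0, wz=0.0, g=9.8, dt=1e-3):
    while y > 0.0:                          # integrate until ground impact
        vrx, vry, vrz = vx - wx, vy, vz - wz
        vr = math.sqrt(vrx**2 + vry**2 + vrz**2) or 1e-12
        # total drag of payload and parachute, opposing the relative velocity
        F = -0.5 * rho * (CA_b + CA_p) * vr**2
        ax = F / (2.0 * m) * vrx / vr
        ay = F / (2.0 * m) * vry / vr - g
        az = F / (2.0 * m) * vrz / vr
        vx, vy, vz = vx + ax * dt, vy + ay * dt, vz + az * dt
        x, y, z = x + vx * dt, y + vy * dt, z + vz * dt
    return x, z                              # impact point on the x-z plane

print(simulate())
```

With a finer step or a higher-order integrator the impact point converges; the structure of the loop is what matters here.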
The basic idea and principle of using Monte Carlo method to simulate the impact
point dispersion is as follows [3].
There are m mutually independent random variables Xi (i = 1, 2, 3, …, m). According to the distribution of each random variable Xi, n sets of random numbers x1, x2, …, xn obeying the normal distribution N(μi, σi²) are generated, where μi and σi are, respectively, the mean and standard deviation of the normal distribution of the random variable Xi. The random-variable sampling for each random perturbation factor is completed in this way, so that the flight trajectory and the dispersion of the impact point under the random disturbance factors can be simulated and calculated.
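The sampling procedure above can be sketched as follows. The `impact` function and the (μ, σ) values are illustrative stand-ins, since the full trajectory model and the Table 20.1 disturbance values are not reproduced here.

```python
import random, statistics

# Monte Carlo sampling: each disturbance factor X_i is drawn from N(mu_i,
# sigma_i^2); every sample set is fed to a trajectory model to produce one
# impact point.  `impact` and the (mu, sigma) values are illustrative stand-ins.
def impact(vx, vz, t_fall):
    return vx * t_fall, vz * t_fall        # toy drag-free impact point (x, z)

factors = {"vx": (200.0, 2.0), "vz": (16.0, 1.0), "t_fall": (5.0, 0.1)}

random.seed(0)
points = []
for _ in range(500):                       # 500 samples, as in the simulation
    draw = {k: random.gauss(mu, sigma) for k, (mu, sigma) in factors.items()}
    points.append(impact(**draw))

xs, zs = zip(*points)
print(f"x: mean={statistics.mean(xs):.1f} m, stdev={statistics.stdev(xs):.1f} m")
print(f"z: mean={statistics.mean(zs):.1f} m, stdev={statistics.stdev(zs):.1f} m")
```

Replacing `impact` with a numerical integration of model (20.1) turns this sketch into the experiment described in the chapter.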
20.3 Simulation
Based on the mathematical model of the parachute–payload system and the existing literature [4–7], the random disturbance factors affecting the dispersion of the impact point are the initial dispersing conditions, the dispersion of the parachute–payload system, and the dispersion of random wind. The initial dispersing altitude is set as 100 m, and the horizontal and lateral dispersing velocities are 200 and 16 m/s. The standard coordinates of the impact point on the x–z plane are (128.641 m, 10.2913 m). The values of the random disturbance factors are shown in Table 20.1.
Taking a single initial dispersing condition as a disturbance factor, 500 samples are
selected (Fig. 20.3 and Table 20.2).
The disturbances of the horizontal and vertical velocities mainly affect the dispersion in the x-direction. The disturbance of the initial lateral dispersing velocity influences the dispersion in the z-direction, and its influence is more pronounced.
Assume the random wind as a breeze, so that the crosswinds are wx = ±0.5 m/s, wz
= ±0.5 m/s (Fig. 20.5 and Table 20.4).
[Fig. 20.3 panels: (a) effect of horizontal velocity; (b) effect of vertical velocity; (c) effect of lateral velocity — random impact points vs. the standard impact point on the x–z plane.]
Fig. 20.3 Dispersion of impact point caused by initial dispersing conditions
Table 20.2 Dispersion deviation of the impact point caused by initial dispersing conditions
Disturbance factors Deviation in x-direction (m) Deviation in z-direction (m)
Horizontal velocity −0.6 to 1.0 −0.17 to 0.19
Vertical velocity −0.6 to 0.8 −0.04 to 0.06
Lateral velocity −0.23 to 0.19 −3.56 to 4.15
192 Y. Li et al.
[Fig. 20.4 panels: (a) effect of resistance characteristic dispersion of the parachute; (b) effect of resistance characteristic dispersion of the payload; (c) effect of opening delay time dispersion of the parachute — random impact points vs. the standard impact point.]
Fig. 20.4 Dispersion of impact point caused by the parachute–payload system
Table 20.3 Dispersion deviation of the impact point caused by the parachute–payload system
Disturbance factors Deviation in x-direction (m) Deviation in z-direction (m)
Parachute resistance characteristic −11.76 to 14.05 −0.94 to 1.12
Payload resistance characteristic −0.34 to 0.29 −0.027 to 0.023
Parachute opening delay time 11.83 to 48.38 0.95 to 3.87
[Fig. 20.5 panels: (a) effect of horizontal crosswind; (b) effect of lateral crosswind; (c) effect of comprehensive random wind — random impact points vs. the standard impact point on the x–z plane.]
Fig. 20.5 Dispersion of impact point caused by the disturbance of random wind
[Fig. 20.6: random impact points and the standard impact point under all disturbance factors combined, x/m vs. z/m.]
The influence of the crosswind disturbance in the lateral direction on the impact
point dispersion is more obvious.
Taking the influence of all the above random perturbation factors into consideration,
10,000 random samples are selected for the simulation test (Fig. 20.6).
The dispersion range is an elliptical region centered on the standard impact point: the closer to the center, the greater the probability of impact; the farther from the center, the smaller the probability. The initial dispersing velocity of the dispenser and the factors of the parachute are the sensitive factors affecting the dispersion of the impact point, of which the parachute factors are the most obvious.
20.4 Conclusion
With the established mathematical model and by virtue of the Monte Carlo method, the flight simulation is carried out, and the dispersion regularity of the impact point is obtained. The influence of the random disturbance factors on the dispersion of the impact point is analyzed, and the most significant factor is identified. The results show that the initial dispersing velocity of the dispenser and the parachute factors are the sensitive factors, of which the parachute factors are the most obvious.
References
1. Sun, C., et al.: Development of smart munitions. Chin. J. Energ. Mater. 6 (2012)
2. Hang, Z.-P., et al.: The Exterior Ballistics of Projectiles, 1st edn. Beijing Institute of Technology
Press, Beijing (2008)
3. Rubinstein, R.Y., Kroese, D.P.: Simulation and the Monte Carlo Method, vol. 10. Wiley (2016)
4. Mathews, J.H., Fink, K.D.: Numerical methods using MATLAB, vol. 3. Pearson Prentice Hall,
Upper Saddle River, NJ (2004)
5. Kong, W.-H., Jiang, C.-L., Wang, Z.-C.: Study for bomblets distribution on ground of aerial
cluster bomb. J. Aero Weapon. 4, 43–46 (2005)
6. Zeng B.-Q., Jiang, C.-L., Wang, Z.-C.: Research on the ballistic fall point spread of the parachute-
bomb system. J. Proj. Rockets Missile Guid. 30(1), 1–4 (2010)
7. Zhang, G., Feng, S.: Study on point dispersing of conductive fiber based on exterior ballistic
model. Trans. Beijing Inst. Technol. 36(12), 1216–1220 (2016)
Chapter 21
Analysis of the Trajectory Characteristics
and Distribution of Smart Blockade
and Control Ammunition System
Abstract In order to study the ballistic trajectory and distribution of the smart blockade and control ammunition system, a simplified ballistic model of the parachute–payload system is established. Flight trajectory characteristics and distribution of the
smart blockade and control ammunition system are obtained and analyzed, and the
distribution and the area of the blockade zone of the 12 ammunition are simulated.
Simulation results show that the dispersing altitude has the greatest influence on the
falling time. The initial horizontal velocity of the dispenser and the resistance char-
acteristics together with the opening time delay of the parachute have an important
impact on the horizontal displacement and the lateral displacement, respectively.
The study of this paper provides an effective analysis method for the design of the
weapon system.
21.1 Introduction
With the breakthrough of various key technologies, various new types of blockade
ammunition have been introduced. At present, countries are increasing their efforts
in research and development of new blockade munitions [1]. Under various weather and geographical conditions, the airborne dispenser can distribute many kinds and large quantities of blockade ammunition to multiple areas in a single delivery, which provides a wide blockade area and high reliability.
As an important part of the airborne dispenser, the dispersal system has a deci-
sive influence on the operational effectiveness of the ammunition. Therefore, it is
necessary to strengthen the research on the dispersing technique, the trajectory of
ammunition and the distribution of impact points. Based on the work of predecessors
[2, 3], this paper uses the Newton mechanics method to establish the dynamic model
of the smart blockade and control ammunition system and elaborates trajectory simulation programs. Then the trajectory, velocity, and displacement of the parachute–payload system are obtained, and the main factors affecting the trajectory characteristics are analyzed. The impact point distribution and blockade range of the 12 ammunitions are acquired.
The smart blockade and control ammunitions carried by the dispenser are divided into upper and lower layers. Each dispenser has three cabins, namely, the front, the middle, and the rear. Each cabin is equipped with four ammunitions, 12 ammunitions in total. The dispersal process and the arrangement of the six ammunitions in the lower layer are shown in Fig. 21.1.
In order to simplify the complexity of the calculation in the analysis, the following
basic assumptions are made.
[Fig. 21.1: (a) system composition and dispersal process — the dispenser with front, middle, and rear cabins holds 12 parachute-payloads; (b) layout of the six lower-layer ammunitions (1–6) at the initial time, thrown at v1, v2, v3 with delays t1, t2.]
Fig. 21.1 The dispersal process and the arrangement of the six ammunitions in the lower
21 Analysis of the Trajectory Characteristics … 197
1. The parachute opens instantly; the processes of straightening the parachute rope and inflating the parachute are ignored, as are the changes in the mass and attitude of the parachute;
2. The pull force of the parachute on the payload is always parallel to the motion direction of the barycenter, and its point of action is the barycenter of the payload;
3. The lift force, Coriolis acceleration, Magnus force, and Magnus moment are ignored, and all moments and forces that have little influence on the payload motion are omitted;
4. The gravity acceleration is assumed constant (g = 9.8 m/s²), directed vertically downward.
5. The ground inertial coordinate system (O-XYZ) and the reference coordinate system (C-xyz) are established, and the simplified dynamical model of the parachute–payload system is derived [4, 5] (Fig. 21.2).
Using Newton’s law and kinematics theorem gives
dv
m = Fi Fi = Fb + Fp + G
(21.1)
dt
[Fig. 21.2: the parachute–payload system at barycenter C with axes (x, y, z), acted on by Fb + Fp and G, in the ground frame (O-XYZ).]
Fig. 21.2 The force of the parachute–payload system under the coordinate system
\[
\left\{
\begin{aligned}
&\frac{dv_x}{dt} = \frac{F_x}{m} = \frac{F_b + F_p}{2m}\cdot\frac{v_x - w_x}{v_r}\\
&\frac{dv_y}{dt} = \frac{F_y}{m} = \frac{F_b + F_p}{2m}\cdot\frac{v_y}{v_r} - g\\
&\frac{dv_z}{dt} = \frac{F_z}{m} = -\frac{F_b + F_p}{2m}\cdot\frac{v_z - w_z}{v_r}\\
&\frac{dx}{dt} = v_x = v\cos\varphi\cos\theta\\
&\frac{dy}{dt} = v_y = v\sin\theta\\
&\frac{dz}{dt} = v_z = v\sin\varphi\cos\theta
\end{aligned}
\right.
\qquad
\left\{
\begin{aligned}
&\mathbf{v} = \mathbf{v}_r + \mathbf{w}\\
&v_r = \sqrt{(v_x - w_x)^2 + v_y^2 + (v_z - w_z)^2}\\
&F_b = -\tfrac{1}{2}\rho\,(CA)_b\,v_r^2\\
&F_p = -\tfrac{1}{2}\rho\,(CA)_p\,v_r^2\\
&\theta = \arctan\frac{v_y}{\sqrt{v_x^2 + v_z^2}}\\
&\varphi = \arctan\frac{v_z}{v_x}
\end{aligned}
\right.
\tag{21.2}
\]
where m, g, and ρ represent, respectively, the mass of the payload, the gravity acceleration, and the air density. v, vx, vy, and vz denote the resultant, horizontal, vertical, and lateral velocities of the system, respectively. wx and wz indicate the crosswind velocities in the x- and z-directions. (CA)b and (CA)p denote the resistance characteristics of the payload and the parachute. Fb and Fp stand for the aerodynamic drags of the payload and the parachute. θ and ϕ indicate the trajectory inclination angle and the trajectory deflection angle.
Dispersing altitudes H = 100, 150, and 200 m are selected for simulation [6, 7], and the results show that only the landing time and horizontal displacement increase with the dispersing altitude (Fig. 21.3).
The landing time and horizontal displacement increase with the dispersal altitude.
The lateral displacement, steady falling velocity, and final falling angle are almost
invariable under different altitudes.
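The altitude-independence of the steady falling velocity follows from the drag–gravity balance ½ρ(CA)p·v² = mg. A quick check uses the payload mass m = 15 kg and (CA)p = 0.6 m² quoted later in this chapter, with an assumed sea-level air density ρ = 1.225 kg/m³ and the payload drag neglected:

```python
import math

# Steady (terminal) falling velocity from the drag-gravity balance
#   0.5 * rho * (CA)_p * v^2 = m * g
# m and (CA)_p are quoted later in this chapter; rho = 1.225 kg/m^3 is an
# assumed sea-level value, and the payload drag (CA)_b is neglected.
m, g, rho, CA_p = 15.0, 9.8, 1.225, 0.6
v_steady = math.sqrt(2 * m * g / (rho * CA_p))
print(f"steady falling velocity = {v_steady:.1f} m/s")  # → 20.0 m/s
```

Since none of these quantities depends on the dispersing altitude, the steady falling velocity stays the same for H = 100, 150, and 200 m, as observed.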
Initial horizontal velocities of the dispenser are 200, 220, 240, 260, 280, and 300 m/s (Fig. 21.4).
[Fig. 21.3: ballistic parameters under different dispersing altitudes; panels include (c) velocity versus time (v–t) and (d) trajectory inclination angle versus time (θ–t).]
The larger the initial horizontal velocity of the dispenser, the larger the horizontal displacement and the smaller the lateral displacement.
The delay times of opening the parachute are 0.1, 0.2, 0.3, 0.4, and 0.5 s, respectively (Fig. 21.6).
Opening delay time only affects the horizontal and lateral displacement of the
landing. The longer the opening time delay, the larger the horizontal displacement
and lateral displacement.
The spacing between two adjacent ammunitions is set within (20, 50) m. vx0 = 200 m/s, vy0 = 0 m/s, vz0 = ±16, ±30, ±16, ±30, ±16, ±30 m/s, H = 100 m, m = 15 kg,
Fig. 21.5 Ballistic parameters under different resistance characteristics of the parachute
(CA)p = 0.6 m2 , (CA)b = 0.018 m2 . The dispersing time interval of different cabins
is 0.2 s.
The trajectories of the 12 ammunitions do not overlap with each other. The minimum and maximum distances between two adjacent impact points are 20.583 and 38.38 m. The 12 ammunitions communicate through the network in the way shown in Fig. 21.7b. Assuming the detection radius of each ammunition is 50 m, the blockade area is about 12,324 m².
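A blockade-area figure of this kind can itself be estimated by a Monte Carlo (hit-or-miss) method over the union of the 12 detection circles. Only the 50 m detection radius comes from the text; the impact-point layout below (two rows of six, 30 m apart) is an assumed illustration, so the estimate does not reproduce the 12,324 m² quoted above.

```python
import random

# Hit-or-miss Monte Carlo estimate of the area covered by 12 detection
# circles of radius R.  The layout of the circle centers is an assumption
# made for illustration; only R = 50 m comes from the text.
R = 50.0
centers = ([(30.0 * i, 0.0) for i in range(6)] +
           [(30.0 * i, 30.0) for i in range(6)])

xmin, xmax = -R, 150.0 + R        # bounding box enclosing all circles
ymin, ymax = -R, 30.0 + R

random.seed(1)
n, hits = 100_000, 0
for _ in range(n):
    px = random.uniform(xmin, xmax)
    py = random.uniform(ymin, ymax)
    if any((px - cx) ** 2 + (py - cy) ** 2 <= R * R for cx, cy in centers):
        hits += 1

area = hits / n * (xmax - xmin) * (ymax - ymin)
print(f"estimated blockade area = {area:.0f} m^2")
```

Because the circles overlap heavily at this spacing, the union area is far smaller than 12 separate circles but much larger than one.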
Fig. 21.7 The trajectory and impact point distribution of 12 ammunition in the airborne dispenser
21.4 Conclusion
In this paper, a model for calculating the trajectory of the smart blockade and control ammunition system is established, and the effects of different conditions on the trajectory characteristics and the distribution of impact points of the ammunition system are compared and analyzed. The study shows that the dispersing altitude has the greatest influence on the falling time, while the initial horizontal velocity of the dispenser, the resistance characteristics, and the opening time delay of the parachute have an important impact on the horizontal and lateral displacements.
References
1. Yang, J., He, G., Zhang, Z.: Common terminal-sensitive submunition with function of blockade
and control. In: 2016 5th International Conference on Advanced Materials and Computer Science
(ICAMCS 2016). Atlantis Press (2016)
2. Sun, C., Lu, Y.: Analysis of submunition distribution of an unguided cluster munition. J. Proj.,
Rocket., Missile Guid. 30(1), 1–4 (2010)
3. Fang, Y., Jiang, J.: Stochastic exterior ballistic model of submunitions and its Monte Carlo
solution. Trans. Beijing Inst. Technol. 29(10), 850–853 (2009)
4. Dmitrievskii, A.A.: Exterior Ballistics. Moscow Izdatel Mashinostroenie (1979)
5. Hang, Z., et al.: The Exterior Ballistics of Projectiles, 1st edn. Beijing Institute of Technology
Press, Beijing (2008)
6. White, F.M., Wolf, D.F.: A theory of three-dimensional parachute dynamic stability. J. Aircr.
5(1), 86–92 (1968)
7. Klee, H., Allen, R.: Simulation of Dynamic Systems with MATLAB® and Simulink®. CRC Press
(2018)
Chapter 22
Study on Lee-Tarver Model Parameters
of CL-20 Explosive Ink
22.1 Introduction
Modern warfare is driving weapon systems toward miniaturization and intelligence. Since the 1990s, MEMS (microelectromechanical systems) technology has developed rapidly [1]. How to realize the precise charging of micro-explosives in the explosive train and ensure that the explosive initiates and boosts reliably has become a difficult problem, which restricts the development of MEMS booster sequences. In direct write deposition of explosives, explosive ink is written directly onto the base surface of the MEMS device by a digitally controlled direct write device. When the solvent in the ink evaporates, the explosive solids are deposited at the predetermined position. The method is safe, supports batch deposition, and produces accurate patterns, and it has become a promising micro-charging method for MEMS devices.
Explosive ink is a multicomponent mixing system consisting of explosive solid,
binder system (including binder and solvent) and other additives (other high-energy
explosive components or additives), usually in suspension or colloidal state. Since
2005, Fuchs [2] has developed EDF series of CL-20-based explosive ink, and suc-
cessfully loaded it into the MEMS fuze by direct write technology, and verified its
performance of detonation propagation in complex structures. In 2010, Ihnen [3] dis-
persed RDX in the binder system of cellulose acetate butyrate or polyvinyl acetate
to obtain the RDX-based explosive ink formulation. In 2013, Zhu [4] designed CL-
20/polyvinyl alcohol/ethylcellulose/water/isopropanol-based explosive ink. In 2014,
Stec III [5] reported the formulation of CL-20/polyvinyl alcohol/ethyl cellulose ink
which can be used in MEMS devices. In 2016, Wang [6] developed CL-20/GAP-based explosive ink, which can be used for micro-scale charges, with a critical detonation size less than 0.4 × 0.4 mm. In 2018, Xu [7] developed CL-20-based explosive ink with ethyl cellulose (EC) and polyazide as binders and ethyl acetate as a solvent, and studied its critical detonation propagation characteristics.
The critical size of an explosive refers to the minimum charge size for stable detonation. The critical size of CL-20 is significantly lower than that of RDX and HMX, which means it is suitable for preparing explosive ink. At present, research on CL-20 explosive ink mainly focuses on formulation design and experiment, and seldom on simulation. In finite element simulation, the JWL EOS is generally used to describe the work capacity of the explosive ink and its detonation products. The JWL EOS parameters of an explosive are usually calibrated by the cylinder test method proposed by Kury [8]. However, it is difficult to realize a large-size charge structure with explosive ink, so the JWL EOS parameters cannot be obtained by the cylinder test. In the MEMS explosive train, because of the small size and the diameter effect of the charge, the Lee-Tarver model is needed to describe the nonideal detonation behavior.
In order to determine the Lee-Tarver model parameters of CL-20 explosive ink with a formed density of 1.45 g/cm³ (93% CL-20, 3% GAP, 2% NC), we write CL-20 explosive ink into grooves of different sizes and measure the detonation velocities. Explo-5 software is used to calculate the detonation and JWL EOS parameters of
22 Study on Lee-Tarver Model Parameters of CL-20 Explosive Ink 207
CL-20 explosive ink. Besides, simulation models are established with AUTODYN
software according to the detonation velocity test. Combining with finite element
simulation and test results, the Lee-Tarver model parameters of CL-20 explosive ink
are fitted. According to the determined Lee-Tarver model parameters, a simulation
model is established to calculate the critical size of CL-20 explosive ink.
Due to the effect of high temperature and high pressure in the detonation reaction of
explosive ink, a sudden change of electrical signal is produced at the electrode probe.
The instantaneous information sensed by the probe is transformed into a pulse signal
with obvious waveform by RLC network, which is input into the transient recorder
as a timing pulse signal. Then, after the input signal is amplified and impedance
transformed, the analog signal is converted into a digital signal in A/D and sent to
memory for storage. After further D/A conversion, the analog signal is transmitted to
the ordinary oscilloscope for display in the form of the analog voltage. The data stored
in the transient recorder can also be read into the RAM of the computer through the
special interface inserted in the expansion slot of the computer, and then transmitted
or printed. The main test equipment and schematic diagram are shown in Fig. 22.1.
The length and width of the CL-20 explosive ink charge are kept at 100 mm and 1 mm, and the charge thickness is 0.2, 0.4, 0.8, 1, and 1.5 mm, respectively. The material of the base plate is 2024Al, and its size is 180 × 40 × 12 mm. The cover plate is of the same material, with a size of 180 × 40 × 10 mm. An electric detonator is used to detonate the CL-20 explosive ink. The test device is shown in Figs. 22.2, 22.3, and 22.4.
After signal processing, the average detonation velocity of CL-20 explosive ink
with different sizes is calculated as shown in Table 22.1.
As is shown in Fig. 22.5, the experimental data can be fitted with a correlation
coefficient of 0.997.
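The fitted size–velocity relation reported in this chapter's conclusion (with the exponent read as −x/0.4354, x the charge thickness in mm) can be evaluated directly; its asymptote reproduces the quoted limit velocity of about 6871 m/s.

```python
import math

# Size-detonation-velocity relation fitted in this chapter's conclusion:
#   D_j(x) = 6871.52 - 852.67 * exp(-x / 0.4354),  x = charge thickness in mm.
# The reading of the exponent as -x/0.4354 is an assumption about the
# typesetting of the original formula.
def Dj(x_mm):
    return 6871.52 - 852.67 * math.exp(-x_mm / 0.4354)

for x in (0.2, 0.4, 0.8, 1.0, 1.5):    # the tested charge thicknesses
    print(f"thickness {x:.1f} mm -> D_j = {Dj(x):.0f} m/s")
```

The velocity increases monotonically with thickness and saturates toward 6871.52 m/s, consistent with the size effect discussed in the text.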
Accurately describing the characteristics of the materials is the basis for ensuring reliable calculation results. The materials involved in this study are the constraining shell, the high explosive, and air.
High Explosive. The detonator is replaced by a 0.5 cm high cylindrical charge,
which is only used to detonate the CL-20 explosive ink. The detonation process of
210 R. Liu et al.
the explosive is neglected, and the expansion process of the product is described by
the JWL EOS, which is
\[
p(V, E) = A\left(1 - \frac{\omega}{R_1 V}\right)e^{-R_1 V} + B\left(1 - \frac{\omega}{R_2 V}\right)e^{-R_2 V} + \frac{\omega E}{V}
\tag{22.2}
\]
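For reference, Eq. (22.2) in executable form. The parameter set in the example call is purely illustrative (made up for the demonstration), not the chapter's fitted values, which are given in its tables.

```python
import math

# Numerical form of the JWL EOS (22.2).  Units follow the usual JWL
# convention (p and E in Mbar, V the relative volume).  The parameters in
# the example call below are illustrative only, not the chapter's values.
def jwl_pressure(V, E, A, B, R1, R2, omega):
    return (A * (1.0 - omega / (R1 * V)) * math.exp(-R1 * V)
            + B * (1.0 - omega / (R2 * V)) * math.exp(-R2 * V)
            + omega * E / V)

# Example call with made-up parameters, only to show the functional form:
p = jwl_pressure(V=1.0, E=0.09, A=6.0, B=0.1, R1=4.5, R2=1.2, omega=0.3)
print(f"p(V=1, E=0.09) = {p:.4f} Mbar")
```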
The reaction rate is described by the Lee-Tarver ignition and growth model, which is

\[
\frac{dF}{dt} = I(1-F)^b\left(\frac{\rho}{\rho_0} - 1 - a\right)^x + G_1(1-F)^c F^d p^y + G_2(1-F)^e F^g p^z
\tag{22.3}
\]

Here, F is the fraction reacted; t is the time in μs; p is the pressure in Mbar; ρ is the current density in g/cm³ and ρ0 is the initial density; I, x, and b are the parameters controlling the ignition term; a is the critical compression preventing ignition: only when the compression satisfies ρ/ρ0 > 1 + a can the charge be ignited; G1, c, d, and y control the early growth of the reaction after ignition; G2, e, g, and z determine the rate of the high-pressure reaction. According to the meanings of the Lee-Tarver model parameters and Li's [10] work, G1, G2, and z are taken as variables, and the remaining parameters are fixed, as shown in Table 22.3.
Air. The air in the Euler grids is described by the ideal gas equation of state, which is

\[
p = (\gamma - 1)\rho E_g
\tag{22.4}
\]

where γ is the adiabatic exponent (for the ideal gas, γ = 1.4); ρ is the density, and the initial density of air is 0.001225 g/cm³; the initial pressure is 10⁵ Pa; Eg is the gas specific thermodynamic energy.
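As a quick consistency check of Eq. (22.4), the initial specific internal energy of the air implied by the quoted initial density and pressure is:

```python
# Initial specific internal energy of air from Eq. (22.4):
#   E_g = p / ((gamma - 1) * rho)
# using the initial values quoted above (SI units).
gamma = 1.4
rho = 0.001225 * 1000.0      # 0.001225 g/cm^3 -> 1.225 kg/m^3
p0 = 1.0e5                   # initial pressure, Pa
E_g = p0 / ((gamma - 1) * rho)
print(f"E_g = {E_g:.0f} J/kg")   # → 204082 J/kg
```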
2024Al. The material parameters of 2024 aluminum are taken from the AUTODYN material library and summarized in Table 22.4. The dynamic response behavior of 2024Al is described by the Johnson–Cook strength model and the shock equation of state. The shock EOS is the Mie–Gruneisen form of EOS that uses the shock Hugoniot as reference.
\[
p - p_H = \frac{\gamma}{v}\,(e - e_H)
\tag{22.5}
\]
Here, p is the pressure, γ is the Gruneisen constant, v is the specific volume, and e is the specific internal energy. The subscript H denotes the shock Hugoniot, defined as the locus of all shocked states of the material. The shock EOS requires the p–v Hugoniot, which is obtained from the U–u Hugoniot, i.e., the relationship between the shock and particle velocities:
U = C0 + su (22.6)
The Johnson–Cook model gives the flow stress as

\[
\sigma = \left(A + B\varepsilon^n\right)\left(1 + C\ln\frac{\dot{\varepsilon}}{\dot{\varepsilon}_0}\right)\left[1 - \left(\frac{T - T_r}{T_m - T_r}\right)^m\right]
\]

Here, σ is the yield stress or flow stress, A is the static yield stress, B is the hardening constant, ε is the strain, n is the hardening exponent, C is the strain rate constant, ε̇ is the strain rate, ε̇0 is the reference strain rate, T is the temperature, Tr is the reference temperature, Tm is the melting point, and m is the thermal softening exponent.
Table 22.5 Detonation velocity of CL-20 explosive ink in 1.5 mm deep channel
Gauge #5 #6 #7 #8 #9 #10 Average
Peak time (μs) 1.5095 1.822 2.1341 2.4461 2.7581 3.07
Time interval (μs) 0.3125 0.3121 0.312 0.312 0.3119 0.3121
Dj (m/s) 6400 6408 6410 6410 6412 6408
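The per-gauge velocities in Table 22.5 follow from D = (gauge spacing)/(peak-time interval). The 2 mm spacing assumed below is not stated in the text, but it is consistent with every tabulated pair (e.g. Δt = 0.3125 μs gives 6400 m/s):

```python
# Detonation velocity from successive gauge peak times, D = spacing / dt.
# The 2 mm gauge spacing is an assumption, chosen because it reproduces the
# tabulated velocities; the peak times are taken from Table 22.5.
spacing_m = 0.002                       # assumed distance between gauges (2 mm)
peak_times_us = [1.5095, 1.822, 2.1341, 2.4461, 2.7581, 3.07]

velocities = []
for t0, t1 in zip(peak_times_us, peak_times_us[1:]):
    dt_s = (t1 - t0) * 1e-6             # microseconds -> seconds
    velocities.append(spacing_m / dt_s)

print([round(v) for v in velocities])   # → [6400, 6408, 6410, 6410, 6412]
```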
in Table 22.6, in which H is the charge thickness of the CL-20 explosive ink, and Ds and Dt are, respectively, the detonation velocities from simulation and test.
As Table 22.6 shows, the detonation velocity of the CL-20 explosive ink increases with increasing charge thickness. The deviation between the calculated and experimental detonation velocities is within 10%; the experimental measurement deviation of the detonation velocity is greater at smaller sizes. This proves that the Lee-Tarver model is suitable for describing the diameter effect of CL-20 explosive ink at small sizes.
Critical Size. According to the determined Lee-Tarver model parameters, a numeri-
cal model with 0.1 mm thick CL-20 explosive ink is established to explore the critical
size. The pressure histories of gauge points are recorded, as is shown in Fig. 22.9.
Distance between each gauge point is 0.05 cm.
As can be seen from Fig. 22.9, the detonation pressure decreases with increasing detonation depth and the detonation eventually extinguishes. When the shock wave acts on the CL-20 explosive ink, some of the explosive reacts because of the high pressure; as a result, the pressure decreases slowly at Gauges #1–#5. From Gauge #6 on, the low shock wave pressure cannot stimulate the explosive to react, so the pressure decreases exponentially and the detonation eventually extinguishes.
22.4 Conclusion
(1) The detonation velocity of CL-20 explosive ink is measured under different charge sizes. The relation between detonation velocity and charge size is fitted as \( D_j = 6871.52 - 852.67\,e^{-x/0.4354} \). The limit detonation velocity is about 6871 m/s.
(2) Based on BKW equation, the detonation parameters and JWL EOS parameters
of CL-20 explosive ink with a density of 1.4 g/cm3 are calculated by Explo-5
software.
(3) Lee-Tarver model can describe the diameter effect of small-sized charge. Com-
bining with finite element simulation and test results, a set of Lee-Tarver model
parameters which can describe the detonation velocity–size relationship of CL-
20 explosive ink is obtained.
(4) According to the determined parameters of the Lee-Tarver model, the critical thickness of CL-20 explosive ink under the existing charge width and constraints is calculated to be in the range of 0.1–0.2 mm.
References
1. Wang, K.-M.: Study on Interface Energy Transfer Technology of Explosive Train. Beijing
Institute of Technology, Beijing (2002)
2. Fuchs, B.E., Wilson, A., Cook, P., et al.: Development, performance and use of direct write
explosive inks. In: The 14th International Detonation Symposium, Idaho (2010)
3. Ihnen, A., Lee, W.: Inkjet printing of nanocomposite high explosive materials for direct write
fuzing. In: The 54th Fuze Conference, Kansas (2010)
4. Zhu, Z.-Q., Chen, J., Qiao, Z.-Q., et al.: Preparation and characterization of direct write explo-
sive ink based on CL-20. Chin. J. Ener. Mater. 21(2), 235–238 (2013)
22 Study on Lee-Tarver Model Parameters of CL-20 Explosive Ink 215
5. Stec III, D., Wilson, A., Fuchs, B.E., et al.: High explosive fills for MEMS devices. U.S. Patent
8,636,861, 28 Jan 2014
6. Wang, D., Zheng, B., Guo, C., et al.: Formulation and performance of functional sub-micro
CL-20-based energetic polymer composite ink for direct-write assembly. RSC Adv. 6(113),
112325–112331 (2016)
7. Xu, C.-H., An, C.-W., Wu, B.-d., Wang, J.-y.: Performances and direct writing of CL-20 based
explosive ink. Init. Pyrotechn. 1, 41–44 (2018)
8. Kury, J.W., Hornig, H.C., Lee, E.L., et al.: Metal acceleration by chemical explosives. In: 4th
Symposium (Int) on Detonation
9. Tarver, C.M., Urtiew, P.A., Chidester, S.K.: Shock compression and initiation of LX-10. Pro-
pellants, Explos., Pyrotech. 18, 117–127 (1993)
10. Li, Y., Yang, X., Wen, Y., et al.: Determination of Lee-Tarver model parameters of JO-11C
explosive. Propellants, Explos., Pyrotech. 43, 1–10 (2018)
11. Ihnen, A., Fuchs, B., Petrock, A., et al.: Inkjet printing of nanocomposite high explosive
materials. In: The 14th International Detonation Symposium, NJ (2010)
Chapter 23
Optimal Design of Online Peer
Assessment System
23.1 Introduction
Y. Lin (B)
Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang
University, Fuzhou 350121, People’s Republic of China
e-mail: 109109850@qq.com
Y. Lin · Y. Lin
School of Computer and Control Engineering, Minjiang University, Fuzhou 350108, People’s
Republic of China
e-mail: leafmissyou@126.com
they lack teaching experience and professional knowledge. Teachers need to strengthen
monitoring and management, which undoubtedly increases their burden.
Up to now, many researchers have been exploring how to improve peer assessment
and have worked out some effective strategies [4–8]. This study draws on the research
results of these scholars, analyses the shortcomings of existing peer assessment
systems, and proposes a comprehensive solution suited to the teaching characteristics
of engineering courses at our university, in order to further enhance the reliability
and effectiveness of online peer assessment and reduce the workload of teachers.
The results of the mutual assessment can reflect the students' normal learning level,
although a single score may contain some random noise. The system is typically used
for course unit tests and homework rather than final exams, which require formal
grading. If most peer scorers grade casually and irresponsibly, the scoring
information is meaningless, and even the best algorithmic strategy cannot recover
the students' true results from it. Therefore, the Total Grade Contribution Value
(TGCV, defined by a formula below) of each student should account for a large
proportion of the regular-coursework grade when formulating the course assessment
methods. Only in this way will students take this work seriously and consolidate
their knowledge in the process of peer assessment. In practical application, TGCV
accounts for 50% of the regular-coursework grade.
Due to space limitations, this chapter discusses only the key algorithms for
implementing peer assessment in the system.
Each subjective question in the question bank consists of six parts: (1) topic, (2)
reference answer, (3) scoring standard, (4) trusted threshold—CT, (5) non-trusted
threshold—UCT, and (6) topic total score (TTS).
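As a sketch, the six parts above can be carried in a simple record; the field names are illustrative, not the system's actual schema, and the validity check applies inequality (23.1):

```python
from dataclasses import dataclass

@dataclass
class SubjectiveQuestion:
    """One question-bank entry with the six parts listed above."""
    topic: str
    reference_answer: str
    scoring_standard: str
    ct: float    # trusted threshold (CT)
    uct: float   # non-trusted threshold (UCT)
    tts: float   # topic total score (TTS)

    def thresholds_valid(self) -> bool:
        # Inequality (23.1): 0 < CT < TTS/2 < UCT < TTS
        return 0 < self.ct < self.tts / 2 < self.uct < self.tts

q = SubjectiveQuestion("Explain normalization", "…", "2 pts per normal form",
                       ct=2, uct=6, tts=10)
print(q.thresholds_valid())  # True: 0 < 2 < 5 < 6 < 10
```
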
In order to implement peer evaluation, teachers need to quantify the evaluation
criteria carefully, so the scoring criteria contain many specific scores. Therefore,
when designing the system, a total score attribute is added to each question in the
question bank, corresponding to the score in the scoring criteria. When assigning
homework, teachers can add multiple questions, and the final score of the student's
homework is converted to a percentage scale.
Each subjective question differs in difficulty and scoring accuracy because of its
content. CT reflects the allowable scoring error range. Suppose that student X's
score on an answer Y is V(X,Y), that the result for the answer computed by the
algorithm strategy is V(F,Y), and that the trusted and untrusted thresholds of the
corresponding question are CT_Y and UCT_Y. If |V(F,Y) − V(X,Y)| <= CT_Y, the
student's evaluation of the answer is credible. UCT reflects the lower limit of an
unreliable evaluation; that is, if |V(F,Y) − V(X,Y)| >= UCT_Y, the student's
evaluation of the answer is not credible. Usually, when a good answer is given a
poor score, the evaluation is very likely not credible. So the two thresholds
satisfy the following inequality:

0 < CT_Y < (1/2)TTS_Y < UCT_Y < TTS_Y    (23.1)
The teacher can set the CT and UCT of a question according to its grading
characteristics. If the score of a question is not prone to deviation, CT and UCT
can be set smaller, and vice versa. For example, for a 10-point question, CT can be
set to 2 points and UCT to 6 points.
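These threshold rules can be sketched as a small classifier; treating scores whose deviation falls strictly between CT and UCT as an "uncertain" middle band is an assumption about how that gap is handled:

```python
def classify_evaluation(v_f, v_x, ct, uct):
    """Classify student X's score v_x against the algorithm result v_f
    using the trusted threshold ct and untrusted threshold uct."""
    diff = abs(v_f - v_x)
    if diff <= ct:
        return "credible"
    if diff >= uct:
        return "not credible"
    return "uncertain"

# 10-point question with CT = 2 and UCT = 6, as in the example:
print(classify_evaluation(8, 7, 2, 6))   # within CT -> credible
print(classify_evaluation(9, 2, 2, 6))   # beyond UCT -> not credible
print(classify_evaluation(8, 4, 2, 6))   # between the thresholds
```
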
F(V(X,Z))    (23.2)

Then, the reliability degree (CD) of student X in the Yth assignment scoring is
defined through Eqs. (23.2) and (23.3), and the cumulative grade contribution is

GCD(X,Y) = Σ_{y=1}^{Y} CD(X,y)    (23.4)
Assuming that the course has a total of M assignments, the student's total grade
contribution value TGCV is defined as

TGCV_X = (GCD(X,M) / M) × 100    (23.5)
Scoring tasks in the system are allocated at answer granularity rather than
homework granularity. Thus one student's homework may be evaluated by more
students, reducing the possibility of cheating in grading.
Each answer is randomly assigned to K graders. Assuming that each evaluator has
the same scoring workload, the average number of answers evaluated by each student
is no more than K. Generally, the larger the value of K, the better the algorithm
performs. This system chooses K = 5 according to the experience of many peer
evaluators.
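A possible sketch of answer-granularity allocation with K = 5; the least-loaded-first balancing heuristic is an assumption, since the text does not specify the allocation algorithm:

```python
import random

def assign_graders(answers, students, k=5, seed=0):
    """Randomly assign each answer to k graders other than its owner,
    keeping the grading load roughly even across students."""
    rng = random.Random(seed)
    load = {s: 0 for s in students}
    tasks = {}
    for ans_id, owner in answers:
        candidates = [s for s in students if s != owner]
        # prefer the least-loaded candidates, break ties randomly
        candidates.sort(key=lambda s: (load[s], rng.random()))
        chosen = candidates[:k]
        for s in chosen:
            load[s] += 1
        tasks[ans_id] = chosen
    return tasks, load

students = [f"S{i}" for i in range(1, 11)]
answers = [(f"A{i}", students[i % 10]) for i in range(10)]
tasks, load = assign_graders(answers, students, k=5)
print(sum(load.values()))  # 10 answers x 5 graders = 50 tasks
```
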
We can see from formulas (23.2) and (23.3) that the CD calculation for student X
needs V(F,Z). After the first student assignment is submitted, the system requires
both teacher assessment and peer assessment. For an answer Z, if the teacher's
grade is V_T, then V(F,Z) = V_T. Therefore, the first round of peer assessment does
not reduce the teachers' scoring workload; its main role is to produce GCD(X,1).
The higher GCD(X,1) is, the higher the credibility of student X's evaluations.
For an answer Y in the ith homework (i >= 2), the system assigns K students to
correct the answer. Without loss of generality, assume these K students are
numbered S1, S2, …, SK; the pseudocode for the calculation of V(F,Y) is as follows:
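A minimal Python sketch of this selection strategy, assuming `scores` and `gcd_prev` map grader ids to their scores and previous-round GCD values:

```python
def final_score(scores, gcd_prev):
    """Compute V(F, Y) for an answer: among the K assigned graders, take
    the score given by the grader with the highest GCD from the previous
    round, as described in the surrounding text."""
    best = max(scores, key=lambda s: gcd_prev[s])
    return scores[best]

scores = {"S1": 8, "S2": 6, "S3": 9}
gcd_prev = {"S1": 2.1, "S2": 3.4, "S3": 1.7}
print(final_score(scores, gcd_prev))  # S2 has the highest GCD -> 6
```
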
The pseudocode indicates that the grade given by the student with the highest GCD
from the previous round is selected as the final score. The reason for choosing
this strategy is that students with a high GCD in the past are more likely to give
scores close to the real ones. Combined with formulas (23.2) and (23.3), the
algorithm reflects the following idea: if a student scores well, they gain more
trust, and more trust promotes higher credibility.
23.3.4.4 Others
In order to prevent students from submitting objections casually, the system has
designed the following strategies:
For an answer X, suppose the peer-corrected score is V1 and the owner of the
answer, student Y, believes the answer deserves V2. Only when |V2 − V1| > CT_X
does the system allow student Y to submit objection information, including V1 and
V2, which is reserved for teacher evaluation. Suppose the teacher's score for
answer X is V3. If |V3 − V1| <= CT_X, the student is deducted a certain score for
this assignment. The idea embodied in this strategy is that if the peer score does
not differ much from the real score, it is considered a valid score, and the
objection was unwarranted.
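A sketch of the objection strategy; the deducted amount (`penalty`) is a placeholder, since the text only says "a certain score" is deducted, and penalizing the objector when the teacher's score confirms the peer score follows the validity idea stated above:

```python
def handle_objection(v1, v2, v3, ct, penalty=1.0):
    """v1: peer score, v2: owner's claimed score, v3: teacher's score,
    ct: the question's CT. Returns (objection_allowed, penalty_applied)."""
    if abs(v2 - v1) <= ct:          # gap too small: objection rejected
        return False, 0.0
    # Teacher arbitrates; if the peer score was close to the teacher's
    # "real" score, the peer score was valid and the objector is penalized.
    if abs(v3 - v1) <= ct:
        return True, penalty
    return True, 0.0

print(handle_objection(v1=6, v2=9, v3=7, ct=2))   # peer score upheld
print(handle_objection(v1=4, v2=9, v3=9, ct=2))   # objection justified
```
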
The peer assessment system has been tested in several classes of the College of
Computer and Control Engineering at Minjiang University, in courses such as "User
Interface Design and Evaluation", "Database Course Design", and "Software
Development Tools and Environment". The results were satisfactory. Through this
system, the author collected 117 peer-evaluation records from the 2016 cohorts of
the Software Engineering and Computer Science and Technology majors. To test the
reliability of peer assessment, teacher assessment and peer assessment were carried
out simultaneously for one assignment in two classes. We analyzed the assessment
results and obtained the scatter plots (see Figs. 23.1 and 23.2).
[Figs. 23.1 and 23.2: scatter plots of peer assessment scores against teacher assessment scores]
As can be seen from Figs. 23.1 and 23.2, the peer scores and the teacher scores
show an approximately linear relationship, indicating that the correlation between
the two is high and the consistency is good. A questionnaire survey conducted after
using this system shows that teachers and students are satisfied with the peer
assessment system. The introduction of peer assessment in blended teaching can
better mobilize students' enthusiasm to participate.
The online peer assessment system basically meets the practical needs of peer
evaluation, but some problems remain. For example, under the existing algorithm
strategy, if a peer score is higher than the real level, the graded student will
raise no objection; everyone is happy to receive high marks, but this undermines
the value of the evaluation. It is easy to see from the two scatter plots that peer
scores are generally higher than teacher scores. Therefore, a more complete
mechanism is needed so that students do not casually give high marks to their
peers. How to measure students' attitude toward the grading task is worth further
study.
This system provides an effective solution to carry out peer assessment activities.
In the next stage, we will consider extending the system to more courses, so as to
find more problems to be solved.
Acknowledgements This study was supported by the teaching reform Research project of Minjiang
University under Grants No. MJU2018B044 and supported by Fujian Provincial Key Laboratory
of Information Processing and Intelligent control.
References
1. Gielen, S., Peeters, E., Dochy, F., et al.: Improving the effectiveness of peer feedback for learning.
Learn. Instr. 20(4), 304–315 (2010)
2. Kollar, I., Fischer, F.: Peer assessment as collaborative learning: A cognitive perspective. Learn.
Instr. 20(4), 344–348 (2010)
3. Speyer, R., Pilz, W., Kruis, J.V.D., et al.: Reliability and validity of student peer assessment in
medical education: a systematic review. Med. Teacher 33(11), e572–e585 (2011)
4. Ueno, M., Okamoto, T., Nagaoka, K.: An item response theory for peer assessment. IEEE Trans.
Learn. Technol. 9(2), 157–170 (2008)
5. Shu, C.: Design and optimization of online peer assessment system. E-educ. Res. (1), 80–85
(2017)
6. Bai, H., Su, Y., Shen, S.: An empirical study on a blended learning model integrated peer review.
E-educ. Res. 12, 79–85 (2017)
7. Sun, L., Zhong, S.: Probabilistic models of peer assessment in MOOC system. Res. Open Educ.
20(5), 83–88 (2014)
8. Xu, T.: Design of peer assessment in xMOOCs. Res. Open Educ. 21(2), 70–77 (2015)
9. Get the Java SSM framework development. https://www.imooc.com/course/programdetail/pid/
59/. Accessed 23 Dec 2018
Part II
Power Systems
Chapter 24
A Method of Calculating the Safety
Margin of the Power Network
Considering Cascading Trip Events
Huiqiong Deng, Chaogang Li, Bolan Yang, Eyhab Alaini, Khan Ikramullah
and Renwu Yan
Abstract Aiming at the phenomenon of cascading trips in power systems, this paper
studies a method for calculating the safety margin of a power system considering
cascading trips. First, the paper analyses the behavior of cascading trips
according to the action equation of relay protection, and proposes the concept of
the critical state to determine whether cascading trip events occur in a power
system. Then, it proposes an index for measuring the safety margin of a power
system considering cascading trips, together with a model and an algorithm for
calculating the safety margin. Finally, the rationality of the algorithm is
demonstrated on the IEEE39 system.
24.1 Introduction
Cascading tripping may cause cascading failures in complex power grids, and even
blackouts. In recent years this phenomenon has been confirmed by a number of power
blackouts around the world. Due to the great impact of blackouts, cascading
failures, including cascading trips, have received extensive attention and research
in recent years [1–3]. Researchers have studied cascading failures from the
perspectives of their mechanism, their simulation, risk analysis, the effect of
power network structure on their propagation, and so on. They have obtained many
beneficial results, which provide much inspiration for the further study of
cascading failures in power systems.
In addition, the paper [4] studies a risk prediction method for power network fault
evolution based on an intelligent-state causality chain and builds a corresponding
fault risk assessment system. Reference [5] mines the main cascading-failure modes
of the system with a sequential pattern mining algorithm and, combined with the
system topology and power flow state, analyzes the propagation law of cascading
failures, introducing stochastic power flow and value-at-risk theory. In the paper
[6], a high-risk fault chain model is constructed, and the risk level of cascading
outages in a system with large-scale wind power is studied and analyzed. These
works approach the problem from the angles of intelligent states, data mining, and
stochastic power flow, providing some new perspectives. In a word, research on
cascading failures is very important to power system security, but most current
work is still at the exploratory stage, and there is a long way to go before it can
guide actual power grid operation.
In the blackouts of recent years, cascading trips usually appeared in the early
stage. Although actual power grid operation generally includes various safety
checks, blackouts cannot be completely eliminated. In view of this, defense against
cascading trips and power blackouts should be studied from a broader perspective.
This paper gives a safety margin model and a solving method based on optimization
theory. The node injection power of the grid is taken as the main parameter in the
definition of the cascading trip safety margin. Finally, the IEEE39 system is used
for numerical example analysis and verification.
This paper focuses on the first-level cascading trips caused by an initial failure
during normal operation of the power grid. In the process of a cascading trip, the
main equipment that drives the subsequent faults is the line relay protection used
for backup [7]. The research in this paper mainly addresses current-type backup
protection. A method of determining the expression of the cascading trip condition,
based on the action behavior of backup relay protection, is presented in the
literature [8]. Suppose an initial fault occurs at a certain time in a power
network, and the initial fault branch is a branch L_ij between node i and node j.
For any other branch of the power network, say the branch L_st between node s and
node t, whether a cascading trip occurs can be determined by Eq. (24.1):

ω_st·dist = ω_st·lim − ω_st    (24.1)
In Eq. (24.1), ω_st·lim and ω_st are the fixed (setting) value and the measured
value of the backup protection arranged on the branch L_st, and are related to the
action equation of the backup protection. Taking current-type backup protection as
an example, ω_st·lim can be taken as the protection setting value I_st·set, and
ω_st as the protection measured value I_st. ω_st·dist measures the electrical
distance between ω_st·lim and ω_st on the branch L_st: if ω_st·dist < 0, the branch
L_st undergoes a cascading trip; if ω_st·dist > 0, it does not; and if
ω_st·dist = 0, the branch L_st is at the boundary of cascading tripping. The
analysis in this paper assumes that the grid can be uniformly analyzed by
Eq. (24.1).
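The sign convention for ω_st·dist can be sketched as a small classifier, using ω_st·dist = ω_st·lim − ω_st; for current-type protection this means a branch trips when the measured current exceeds the setting:

```python
def branch_state(omega_lim, omega):
    """Classify branch L_st after the initial fault:
    omega_dist < 0 -> cascading trip, = 0 -> boundary, > 0 -> no trip.
    For current-type backup protection, omega_lim = I_set and
    omega = I_measured."""
    omega_dist = omega_lim - omega
    if omega_dist < 0:
        return "cascading trip"
    if omega_dist == 0:
        return "boundary"
    return "no trip"

# Current-type example with a 7.5 kA setting, as in the numerical example:
print(branch_state(7.5, 8.2))   # measured current exceeds the setting
print(branch_state(7.5, 6.9))
```
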
For the power network, if at least one branch in addition to the initial fault
branch undergoes a cascading trip, a cascading trip occurs in the power grid.
Cascading trips do not occur when no branch satisfies the trip condition. When
every branch L_st satisfies ω_st·dist >= 0 and at least one branch lies at the
boundary (ω_st·dist = 0), the grid is at the boundary of cascading tripping. When
the change in node injection power before and after the initial failure is ignored,
ω_st·dist is mainly determined by the node injection power before the initial
fault. According to the node injection power, the operation state of the power
network can thus be divided into three states: the state in which cascading trips
do not occur, T_1; the state in which cascading trips occur, T_2; and the critical
state, T_0.
In order to quantify the safety margin, let S denote a node injection power vector
for which the grid is in the critical state T_0, and let S′ denote the node
injection power vector of the current state to be analyzed. The distance between
these two vectors can be expressed by Eq. (24.2):

D = ‖S − S′‖    (24.2)
If the current running state of the power grid is in T_1, let min D denote the
shortest distance between the current running state and the critical state T_0.
Obviously, min D > 0 shows that, for the given initial fault, the power grid has a
certain safety margin. According to the previous analysis, however the current
operating state of the grid changes, as long as min D > 0 the grid retains a
certain safety margin for the given initial fault. Thus min D is a key parameter
that can be used as a safety margin index. The following analysis mainly focuses on
a current state S′ in the set T_1. For a given initial fault, the safety margin
problem of the operating state therefore reduces to solving for min D, which can be
classified as an optimization problem. Its objective function can be written in the
form of Eq. (24.3).
F = min D (24.3)
From the previous analysis and Eqs. (24.2) and (24.3), S′ is the known node power
vector of the current state to be analyzed. The quantity to be found in Eq. (24.3)
is S_0, the vector in T_0 closest to S′. Within S_0, the power of the balance node
is determined by the power flow constraints. The variables to be optimized in S_0
are the active power of the PV nodes and the active and reactive power of the PQ
nodes, which are denoted by O; Z denotes the remaining variables in S_0.
When the power network is in the T 0 set, the nodal injection power is S0 , and
the corresponding equality constraints are the power flow constraints which must be
satisfied. Its specific form is shown in the Eq. (24.4) [4].
P0_Gi = P0_Di + U0_i Σ_{j=1}^{n} U0_j (G0_ij cos θ0_ij + B0_ij sin θ0_ij)
Q0_Gi = Q0_Di + U0_i Σ_{j=1}^{n} U0_j (G0_ij sin θ0_ij − B0_ij cos θ0_ij)    (24.4)
i = 1, 2, …, N;  θ0_Vθ = 0
In Eq. (24.4), the superscript "0" indicates the state before the initial fault,
when the power grid is in the T_0 set and the node injection power is S_0. N
represents the total number of nodes in the grid. P_Gi and Q_Gi, respectively,
indicate the active and reactive power of the power supply at node i; P_Di and
Q_Di, respectively, represent the active and reactive load at node i. U_i
represents the modulus of the voltage phasor at node i. G_ij and B_ij,
respectively, represent the real and imaginary parts of the element Y_ij in row i
and column j of the node admittance matrix. θ_Vθ represents the voltage phase angle
of the balance node. θ_ij is the voltage phase angle difference between node i and
node j, with the specific form shown in Eq. (24.5).
When the power grid is in the T_0 set with node power S_0, the inequality
constraints are mainly the various constraints required for normal grid operation
before the initial fault. For any node i (i = 1, 2, …, N), the corresponding
inequality constraints can be expressed in the form of Eqs. (24.7)–(24.10).
Except for the initial fault branch, there are L branches in the network. According
to Eqs. (24.7)–(24.10), 3 × N + L + 1 inequality constraints are formed. These
inequality constraints are unified and written in the form of Eq. (24.11):
g0(x0, y0, z0) ≤ 0    (24.11)
When the power grid is in the T_0 set and the node power is S_0, after the initial
fault occurs the grid should first satisfy power flow constraints similar to
Eq. (24.4). They can be written in the abbreviated form (24.12):

hb(xb, yb, zb) = 0    (24.12)
In order to further express the critical state of the cascading trip, the branch
L_st is numbered as branch l. Let J_l = ω_st·dist, and form the matrix shown in
(24.13):

J = diag(J_1, …, J_l, …, J_L)    (24.13)
From the previous analysis, when every element of the matrix J is greater than or
equal to zero and J is singular, the power grid is in a critical state for the
given initial fault. This can be summed up in Eq. (24.14):

|J| = 0,  J_l ≥ 0, l = 1, 2, …, L    (24.14)
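Since J is diagonal, |J| = 0 exactly when some J_l = 0, so the criticality test of Eq. (24.14) can be sketched on the diagonal entries alone; a numerical tolerance replaces exact zero tests:

```python
def grid_state(j_diag, tol=1e-9):
    """Classify the grid from the diagonal entries J_l = omega_dist of
    each non-faulted branch, per Eq. (24.14): all J_l >= 0 with at least
    one J_l = 0 means the grid sits on the cascading-trip boundary."""
    if any(j < -tol for j in j_diag):
        return "cascading trip occurs"   # some branch already trips
    if any(abs(j) <= tol for j in j_diag):
        return "critical state"          # |J| = 0 for a diagonal matrix
    return "no cascading trip"

print(grid_state([0.4, 0.2, 0.7]))
print(grid_state([0.4, 0.0, 0.7]))
print(grid_state([0.4, -0.1, 0.7]))
```
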
The equality and inequality parts of Eq. (24.14) are abbreviated in the form of
Eqs. (24.15) and (24.16):

fb(xb, yb, zb) = 0    (24.15)

g_nb(xb, yb, zb) ≤ 0,  n = 1, 2, …, L    (24.16)
Through the above analysis, Eqs. (24.2)–(24.17) together form a complete
mathematical model for solving the safety margin. As in the former analysis, y0 and
yb can be considered equal and are unified as y. In this way, the final model is
shown in Eq. (24.18):

min D = ‖S − S′‖
s.t.  h0(x0, y, z0) = 0
      hb(xb, y, zb) = 0
      fb(xb, y, zb) = 0
      g0(x0, y, z0) ≤ 0
      gb(xb, y, zb) ≤ 0    (24.18)
The safety margin D is obtained by solving Eq. (24.18).
Considering the complex constraint conditions, this paper uses the particle swarm
optimization algorithm for the solution, with the variable y to be optimized taken
as the particle.
In the solution process, the equality constraints in Eq. (24.18) that correspond to
Eqs. (24.6) and (24.12) can be handled by solving the power flow equations;
particles that do not meet the requirements are removed and new particles are
generated. The other constraints in Eq. (24.18) are processed in the form of
penalty functions. The problem represented by Eq. (24.18) can then be converted
into the problem represented by Eq. (24.19), where α, β, and γ are penalty factors:

min D′ = D + Σ_k α_k [min(0, −g0_k(x0, y, z0))]²
           + Σ_k β_k [min(0, −gb_k(xb, y, zb))]² + γ [fb(xb, y, zb)]²    (24.19)
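A sketch of the penalty evaluation in Eq. (24.19), with scalar penalty factors assumed for simplicity; inequality constraints are feasible when their values are ≤ 0, so only violations contribute:

```python
def penalized_objective(d, g0_vals, gb_vals, fb_vals,
                        alpha=1e3, beta=1e3, gamma=1e3):
    """Unconstrained objective of Eq. (24.19). d is the distance of
    Eq. (24.2); g0_vals and gb_vals hold inequality-constraint values
    (feasible when <= 0); fb_vals hold the equality residuals of (24.15)."""
    pen = 0.0
    for g in g0_vals:
        pen += alpha * min(0.0, -g) ** 2     # active only when g > 0
    for g in gb_vals:
        pen += beta * min(0.0, -g) ** 2
    for f in fb_vals:
        pen += gamma * f ** 2                # any equality violation
    return d + pen

# Feasible point: no penalty; infeasible point: penalized.
print(penalized_objective(2.0, [-0.5], [-0.1], [0.0]))        # 2.0
print(penalized_objective(2.0, [0.2], [-0.1], [0.05]) > 2.0)  # True
```
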
Thus, Eq. (24.19) has an unconstrained form. When the particle swarm algorithm is
used, the basic form of Eqs. (24.20) and (24.21) is adopted in this paper:

v_i^{k+1} = w·v_i^k + c1·r1·(P_best,i − y_i^k) + c2·r2·(g_best − y_i^k)    (24.20)

The fitness function is taken as

F = 1/D    (24.22)
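A minimal sketch of the velocity update of Eq. (24.20), with the linearly decreasing inertia weight used in the example; the position update y ← y + v stands in for the missing Eq. (24.21) and is an assumption:

```python
import random

def pso_step(y, v, p_best, g_best, w, c1=2.0, c2=2.0, rng=random):
    """One particle update per Eq. (24.20); the position update assumed
    for Eq. (24.21) is the standard y <- y + v."""
    r1, r2 = rng.random(), rng.random()
    v_new = [w * vi + c1 * r1 * (pb - yi) + c2 * r2 * (gb - yi)
             for vi, yi, pb, gb in zip(v, y, p_best, g_best)]
    y_new = [yi + vi for yi, vi in zip(y, v_new)]
    return y_new, v_new

def inertia(k, k_max, w_start=0.9, w_end=0.1):
    """Linearly decreasing inertia weight, from 0.9 down to 0.1 as in
    the numerical example."""
    return w_start - (w_start - w_end) * k / k_max

y, v = [0.5, 1.0], [0.0, 0.0]
y, v = pso_step(y, v, p_best=[0.6, 1.1], g_best=[0.7, 0.9],
                w=inertia(0, 100))
print(y)
```
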
In this paper, the IEEE39-node system is used as an example. The wiring diagram is
shown in Fig. 24.1. Following the idea of solving the safety margin, the particle
swarm solution process is carried out as described below. We mainly calculate the
D value in Eq. (24.19) and the F value in Eq. (24.22) as the computational results.
The reference capacity is 100 MVA. The initial fault branch is assumed to be
L_17–18. In the system of Fig. 24.1, it is assumed that the backup protection of
the lines is current-type protection; in other words, ω_st·lim is taken as the
protection setting value I_st·set, which is assumed to be 7.5 kA.
For the active and reactive power of the generator outputs in Fig. 24.1, it is
assumed that the lower limit of the active power output of each generator node is 0
and no upper limit is imposed, and the reactive power output of each generator is
assumed to be unbounded. The voltage modulus of each node in Fig. 24.1 is allowed
to range from 0.95 to 1.05 (p.u.). The lower limit of the active power transmitted
by the branches is assumed to be 0 and the upper limit 1000 MVA.
In this example, the voltage modulus of each PV node is assumed to be the same as
in the current state, and the current state of the power grid is taken as the
typical state shown in Fig. 24.1; the corresponding node power data can be found in
the literature [8]. After this treatment, the voltage modulus of the PV nodes is no
longer used as an optimization variable.

[Fig. 24.1: wiring diagram of the IEEE39-node system]

At this time, the optimization variables are the active power of the PV nodes and
the active and reactive power of the PQ nodes, that is, the part of the vector S_0
expressed by O. Each element of the vector O corresponds to the active power of a
PV node or to the active and reactive power of a PQ node, arranged in sequence
according to the node numbers of the system shown in Fig. 24.1; the two elements
representing a PQ node's optimized variables are placed adjacent to each other.
Next, each particle corresponds to a vector O, which can be generated by
assignment. The specific operation is: based on the typical data of the system
shown in Fig. 24.1, ΔP is added to each element of O corresponding to the power of
a PV node, and ΔP and ΔQ are added, respectively, to the elements corresponding to
the power of a PQ node. In the iterative solution process, w in Eq. (24.20) is
reduced linearly from 0.9 to 0.1; c1 and c2 are both taken as 2.
A large amount of calculation indicates that when the node injection power is
increased, cascading trips occur in the power grid, whereas they do not occur when
the node injection power is reduced. This indicates that the safety margin
calculated in the example is credible and that the calculation method is effective.
24.5 Conclusion
Based on the running state of the power network, cascading trips were studied from
the viewpoint of the safety margin. The main conclusions are as follows: the
distance between the actual operating state of the power grid and the critical
state at which cascading trips occur can be used as a safety margin index for the
power network; the cascading trip safety margin index can be represented by an
optimization model and solved by optimization methods. The example indicates that
using an optimization method to solve the cascading trip safety margin problem is
feasible. This provides a reference for further research.
Acknowledgment This research was financially supported by Fujian Provincial Natural Science
Foundation of China under the grant 2015J01630, Doctoral Research Foundation of Fujian Univer-
sity of Technology under the grant GY-Z13104, and Scientific Research and Development Foun-
dation of Fujian University of Technology under the grant GY-Z17149.
References
1. Shi, L., Shi, Z., Yao, L., et al.: Research on the mechanism of cascading blackout accidents in
modern power system. Power Syst. Technol. 34(3), 48–54 (2010)
2. Xue, Y., Xie, Y., Wen, F., et al.: A review on the research of power system cascading failures.
Autom. Electr. Power Syst. 37(19), 1–9, 40 (2013)
3. Liu, Y., Hu, B., Liu, J., et al.: The theory and application of power system cascading failure
(a)—related theory and application. Power Syst. Prot. Control 41(9), 148–155 (2013)
4. Xiao, F., Leng, X., Ye, K., et al.: Research on fault diagnosis and prediction of chain trip based
on fault causal chain of finite state machine. Power Big Data 21(08), 48–57 (2018)
5. Liu, Y., Huang, S., Mei, S., et al.: Analysis on patterns of power system cascading failure based
on sequential pattern mining. Power Syst. Autom. 1–7 (2019). http://kns.cnki.net/kcms/detail/
32.1180.TP.20190124.1036.036.html
6. Xu, D., Wang, H.: High risk cascading outage assessment in power systems with large-scale
wind power based on stochastic power flow and value at risk. Power Grid Technol. 43(02),
400–409 (2019)
7. Huang, P., Zhang, Y., Zeng, H.: Improved particle swarm optimization algorithm for power
economic dispatch. J Huazhong Univ. Sci. Technol. (Natural Science Edition) 38(3), 121–124
(2010)
8. Cai, G.: Branch transient potential energy analysis method for power system transient stability.
Harbin Institute of Technology (1999)
Chapter 25
Research on Intelligent Hierarchical
Control of Large Scale Electric Storage
Thermal Unit
Tong Wang, Gang Wang, Kai Gao, Jiajue Li, Yibo Wang and Hao Liu
Abstract Through the control of thermal storage units, local and remote control
strategies including the thermal storage unit are realized and incorporated into
the day-ahead power generation plan, so that an electric-thermal comprehensive
scheduling model of a power system with large-scale thermal storage units is
established.
25.1 Introduction
Due to the random fluctuation of wind power generation, the grid operation of wind
power generation brings great challenges to the traditional power system. In order
to ensure the safe and reliable operation of the whole system, the phenomenon of
T. Wang · G. Wang · J. Li
State Grid Liaoning Electric Power Company Limited, Electric Power Research Institute,
Shenyang 110006, Liaoning, China
e-mail: 18159335@qq.com
G. Wang
e-mail: wangg_ldk@ln.sgcc.com.cn
J. Li
e-mail: en-sea@163.com
K. Gao
State Grid Liaoning Electric Power Supply Co., Ltd., Shenyang 110006, Liaoning, China
e-mail: gk@ln.sgcc.com.cn
Y. Wang · H. Liu (B)
Northeast Electric Power University, Jilin 132012, Jilin Province, China
e-mail: 1282625960@qq.com
Y. Wang
e-mail: 469682939@qq.com
© Springer Nature Singapore Pte Ltd. 2020 237
J.-S. Pan et al. (eds.), Advances in Intelligent Information Hiding and Multimedia
Signal Processing, Smart Innovation, Systems and Technologies 157,
https://doi.org/10.1007/978-981-13-9710-3_25
wind abandoning (wind curtailment) often occurs. In order to effectively solve the consumption problem of wind power, literature [1–4] uses forecasting methods to study the prediction of wind power output and reports some research results. Literature [5] proposed a dual-time-scale coordinated control method using a battery energy storage system to reduce wind power fluctuations. Literature [6] established a day-ahead scheduling model based on day-ahead heating load prediction, wind power output prediction and the operation mechanism of the heat-storage device, and solved it. Literature [7] demonstrated that large-capacity heat storage can effectively solve the problems of renewable energy consumption and peak regulation. In literature [8], heat storage was incorporated into the active power scheduling system of power systems with wind power. However, the most critical problem of various energy storage technologies is that their capacity is too small to absorb wind power on a large scale.
This paper proposes a strategy of accepting dispatching power generation instructions on the power plant side and rationally arranging the switching between power generation and heat storage. On the power grid dispatching side, the heat storage load is incorporated into the daily dispatching plan, and direct control by the power grid is realized through automatic generation control, thus forming a new method for large-scale acceptance of wind power.
The large-capacity thermal storage system built on the power plant side is connected with the urban heating network and becomes another coupling point between the heat network system and the power grid system, forming a new power-thermal coupling system. The schematic diagram of the power-thermal coupling system is shown in Fig. 25.1.
In this paper, the thermal storage unit body device, power plant heat storage system
and thermo-electric coupling system are taken as the research object, and the unit-
collection-cluster hierarchical control strategy of unit level, power plant level and
system level is constructed, as follows.
Unit control refers to the control method that considers the operation constraints of
the heat storage unit body device. Unit control is the basis of the layered control
strategy, which is only limited by the working state of the heat storage unit itself.
The specific operational constraints are modeled as follows:
$H_t = \eta H_{t-1} + S_t, \quad t = 1, 2, \ldots, 24$  (25.1)
Fig. 25.1 Schematic diagram of the power-thermal coupling system (power grid, power plant, thermal storage system, heating network and user load; unit control, collection control and cluster control levels)
$-h^{out}_{\max} \le S_t \le h^{in}_{\max}, \quad t = 1, 2, \ldots, 24$  (25.3)

$\sum_{t=1}^{24} S_t = 0$  (25.4)
Among them, Ht is the thermal storage capacity of the heat storage device at the end of time t; η is the thermal storage tank efficiency; Hmax and Hmin are the upper and lower bounds of the thermal storage capacity of the device; h_max^in and h_max^out are the upper limits of the input and output heat power. Eq. (25.1) characterizes the heat balance state of the thermal storage device; (25.2) and (25.3) are the heat absorption and release constraints of the energy storage system; Eq. (25.4) indicates that the heat capacity of the thermal storage device remains unchanged over one cycle, i.e., it is balanced within one cycle.
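As an illustration, constraints (25.1)-(25.4) can be checked for a candidate 24-hour schedule with a few lines of code. This is a minimal sketch; the function name and all numeric values in the usage below are illustrative, not taken from the chapter.

```python
def check_storage_schedule(S, H0, eta, H_min, H_max, h_in_max, h_out_max, tol=1e-9):
    """Check a 24-h storage schedule S (positive = storing heat, MW)
    against the constraints of Eqs. (25.1)-(25.4). Illustrative sketch."""
    H = H0
    for S_t in S:
        # Eq. (25.3): input/output heat-power limits
        if not (-h_out_max - tol <= S_t <= h_in_max + tol):
            return False
        # Eq. (25.1): heat balance with tank efficiency eta
        H = eta * H + S_t
        # stored heat must stay within its capacity bounds
        if not (H_min - tol <= H <= H_max + tol):
            return False
    # Eq. (25.4): net stored heat over the daily cycle is zero
    return abs(sum(S)) < tol
```

For example, a schedule that stores 10 MW for 12 hours and releases 10 MW for the remaining 12 hours is feasible under generous bounds, while one that stores 30 MW all day violates the input power limit.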
In order to make full use of the capacity margin of the thermal storage unit group, each thermal storage unit needs to be maintained at a certain energy level so that it can obtain a reasonable balance between charging and discharging, thereby realizing an unrestricted fast response under different switching instructions.
According to its operating characteristics, the operating state of the thermal storage unit can be divided into three intervals: a normal input interval, a heat storage switching limit interval and a minimum heat storage interval, as shown in Fig. 25.2.
Fig. 25.2 Operating intervals of the thermal storage unit (normal input interval between Sa-min and Sa-max, heat storage switching limit interval up to Smax, and minimum heat storage interval)
It can be seen from Fig. 25.2 that, among the three operating intervals of the thermal storage unit, it is most reasonable to operate in the Sa-min < S < Sa-max interval, that is, where the thermal storage system has both a certain charging margin and a certain discharging margin. The storage unit has higher flexibility at this energy level and can more easily meet the daily dispatching demand.
Normal input interval. When the thermal storage unit operates in this interval, effective control of the unit can provide sufficient heat for the heat network system and, at the same time, a certain adjustable load for the power system; that is, the thermal storage unit has the best adjustment capacity margin. When the thermal storage unit is in the normal input interval, its control logic is:
$\begin{cases} P_{ft} > P_{ct} = 0, & \frac{1}{2}S_{\max} < S < S_{a\text{-}\max} \\ P_{ct} = P_{ce} > P_{ft} \ge 0, & S_{a\text{-}\min} < S < \frac{1}{2}S_{\max} \end{cases}$  (25.5)
Among them, Pft represents the heat release power of the thermal storage unit; Pct is the heat storage power of the thermal storage unit; Pce is the rated heat storage power of the unit. In this state, the thermal storage unit remains operating near 50% of its capacity.
Heat storage switching limit interval and minimum heat storage interval. When
the thermal storage unit is in these two intervals, its control logic is:
$\begin{cases} P_{ft} = P_{f\text{-}\max} > P_{ct} = 0, & S_{a\text{-}\max} \le S \le S_{\max} \\ P_{ct} = P_{ce} > P_{ft} = 0, & 0 \le S \le S_{a\text{-}\min} \end{cases}$  (25.6)
Among them, Pf −max is the maximum heat release power of the thermal storage
unit. At this time, the thermal storage unit in the thermal storage switching restriction
interval does not have the ability to realize further thermal storage, and the thermal
storage unit can only perform the heat release control. Similarly, the heat dissipation
capacity of the heat storage unit in the minimum heat storage interval is limited, and
only the heat storage operation control can be performed.
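The interval logic of Eqs. (25.5) and (25.6) can be sketched as a simple state selector. This is only an illustrative reading of the rules: in the normal input interval the release magnitude is not fixed by (25.5), so the sketch assumes the maximum release power there, and all names and values in the usage are hypothetical.

```python
def unit_control(S, S_max, S_a_min, S_a_max, P_ce, P_f_max):
    """Pick release power P_ft and storage power P_ct for one unit from
    its current heat level S, following Eqs. (25.5)-(25.6). Sketch only:
    the release magnitude in the normal interval is an assumed choice."""
    if S_a_max <= S <= S_max:
        # heat storage switching limit interval: release only (Eq. 25.6)
        return {"P_ft": P_f_max, "P_ct": 0.0}
    if 0 <= S <= S_a_min:
        # minimum heat storage interval: store only (Eq. 25.6)
        return {"P_ft": 0.0, "P_ct": P_ce}
    if S > 0.5 * S_max:
        # normal interval, upper half: release heat (Eq. 25.5, first case)
        return {"P_ft": P_f_max, "P_ct": 0.0}
    # normal interval, lower half: store at rated power (Eq. 25.5, second case)
    return {"P_ft": 0.0, "P_ct": P_ce}
```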
The heat storage system should have sufficient heat storage capacity:
$\sum_{i=1}^{n} P_{ct}^{i} + \sum_{j=1}^{m} P_{cf}^{j} = P_{quota} \quad (n + m = N)$  (25.7)
Among them, Pct^i represents the heat storage power of the i-th unit; Pcf^j represents the heat release power of the j-th unit; Pquota represents the system power quota; n and m represent the numbers of units in the heat storage and heat release states in the power plant, respectively; N represents the total number of thermal storage units configured in the power plant.
When receiving the grid dispatching instruction, the control strategy is as follows:
The overall thermal storage system power is equal to the dispatching command:
$\sum_{i=1}^{n} \sum_{j=1}^{m} P_{ij} = P_{dispatch}$  (25.8)
Among them, Pij represents the power of the j-th thermal storage unit in the i-th group, taken as positive for heat release and negative for heat storage; Pdispatch represents the value of the system dispatch command; n and m respectively represent the number of thermal storage unit groups installed in the power plant and the number of units in each group.
$MAX = \sum_{i=1}^{n} \sum_{j=1}^{m} N_{ij}$  (25.9)

Among them, Nij is the state indicator of the j-th heat storage unit in the i-th group, equal to 1 when the condition is satisfied and 0 otherwise; n and m have the same meaning as in (25.8).
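A simple way to picture how the plant meets a dispatch command as in Eq. (25.8) is a greedy assignment of each unit to the release, storage or idle state. This is only a sketch under assumed unit ratings; the chapter does not specify the plant's actual switching algorithm.

```python
def dispatch_units(ratings, P_dispatch):
    """Greedily assign each storage unit (given its rated power) to the
    release (+rating), storage (-rating) or idle (0) state so that the
    plant total approaches the dispatch command of Eq. (25.8).
    A simplified illustration, not the chapter's algorithm."""
    total, states = 0.0, []
    for r in sorted(ratings, reverse=True):
        # pick the state that brings the running total closest to the command
        best = min((r, -r, 0.0), key=lambda p: abs(P_dispatch - (total + p)))
        states.append((r, best))
        total += best
    return states, total
```

With the four unit ratings used later in the chapter (70, 70, 80 and 80 MW), a 160 MW release command is matched exactly by switching the two 80 MW groups to release.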
$f(\theta) = R - C = \sum_{t=1}^{24} [\lambda_c(t) \cdot P_c(t) + \lambda_w(t) \cdot P_w(t) + \lambda_h(t) \cdot H_L(t)] - \sum_{t=1}^{24} F_t$  (25.10)
$F_t = a_i [P_c(t) + C_V (H_c(t) + S_t)]^2 + b_i [P_c(t) + C_V (H_c(t) + S_t)] + c_i$  (25.11)
Among them: R represents the total revenue, which includes the revenue from
selling electricity and heating revenue; C represents the total cost, including the cost
of power generation and heating; Pc and Pw respectively represent the output of
thermal power plants and wind farms; HL is the thermal load of the thermal power
plant; λc and λw respectively represent the on-grid price of thermal power plants
and wind farms; λh represents the heating price of the thermal power plant; Ft is
the operating cost of power generation and thermal storage units in thermal power
plants. ai , bi and ci are the operating cost coefficients of the thermal power plant; CV
is the operating parameter of the unit; St is the heat storage/exothermic power of the
heat storage device at time t, which is positive at heat storage and negative at heat
release.
System power balance constraint:

$P_{el}(t) + P_w(t) - P_{ex}(t) = P_{D,el}(t)$  (25.12)

Among them, Pel(t) represents the output of the thermal power units in the region; Pw(t) is the wind power connected to the grid at time t in the system; Pex(t) indicates the exchange power between the region and the external system at time t, which is positive when power is delivered outward and negative when the external system supplies power to the region; PD,el(t) is the electrical load value at time t in the system.
System heating constraint:
Among them: k is the total number of heating zones; PDhk (t) is the total heat load
that the k-th district thermal power plant needs to bear at time t; Shk (t) is the heat
storage of the heat storage device in the k-th partition at time t.
The unit constraint. Upper and lower limit constraints of unit thermal output:
0 ≤ Ph ≤ Ph,max (25.14)
Among them, Ph,max is the maximum limit of the heat output of the unit i, which
mainly depends on the capacity of the heat exchanger.
Unit ramp rate constraint:

$P(t) - P(t-1) \le P_{up}, \quad P(t-1) - P(t) \le P_{down}$  (25.15)

Among them, Pup and Pdown are the upward and downward ramp rate limits of unit i, respectively.
Operating constraint of thermal storage device. Constraints on the stor-
age/discharge capacity of the heat storage device:
$S_{h,k}^{t} - S_{h,k}^{t-1} \le P_{h,k,c\max}, \quad S_{h,k}^{t-1} - S_{h,k}^{t} \le P_{h,k,f\max}$  (25.16)
Among them, Ph,k,c max and Ph,k,f max are the maximum storage and release power
of the thermal storage device, respectively.
Capacity constraints of thermal storage devices:

$S_{h,k}^{t} \le S_{h,k,\max}$  (25.17)

Among them, Sh,k,max is the maximum thermal storage capacity of the thermal storage device.
The regional system is shown in Fig. 25.3. The thermal storage units in the regional
system are separately analyzed in terms of local and remote control modes, and the
benefits brought by the thermal storage units are analyzed.
Only the interval limits of the heat storage units themselves are considered, and the simulation calculation is carried out with the goal of maximum wind power consumption. The simulation results are shown in Fig. 25.4.
It is learned from the historical operation of the power system in Liaoning Province that, in practice, the real load trough and the most difficult wind-acceptance period of the system is [00:00–04:00]. In order to better respond to the needs of the power grid, the heat storage units are divided into groups and switched. The specific switching strategy of the heat storage units is shown by the dotted line in Fig. 25.4. The local control strategy adjusts the heat storage units to maximize the space for the system to absorb wind power during the peak wind-curtailment period.
Fig. 25.3 Regional system diagram (units G1 and G2; thermal storage groups S1–S4 rated 70 MW, 70 MW, 80 MW and 80 MW; 220/35 kV and 35/0.69 kV transformers)
Fig. 25.4 Operation curve of thermal power plant active output in local control
When receiving a dispatching command from the power grid, the directly controlled electric heat storage load can meet both the peak-regulation needs of the power network and the users' heating demand. The adjustable range of the power plant's generation limit is [0, 600 MW]. When the output is less than 300 MW, the adjustment principle is shown in Fig. 25.5.
Fig. 25.5 Adjustment principle during the load trough (the thermoelectric units start storing heat in sequence from t = 21 h; initial PGmin = 300 MW)
In the figure, PGmin represents the minimum output value of thermal power unit
during the low valley load period. Due to the limit of the heat load, the wind power
consumption capacity is restricted, and the wind abandonment phenomenon occurs.
According to the operation strategy of the direct-controlled heat storage device proposed in this paper, the switching of the heat storage devices is completed in the low valley period when the output of the unit is limited. The switching sequence is: 0 MW, 70 MW, 2 × 70 MW, 2 × 70 + 80 MW, 2 × 70 + 2 × 80 MW. When the heat storage system is fully put into operation, the output value of the thermal power unit will be 0 MW, which means that 300 MW of capacity can be provided for the system to receive wind power.
It can be seen from Fig. 25.6 that during the low load period, the heat storage
operation curve is positive, and this is also the peak period of the grid wind abandon-
ment. Therefore, on the one hand, the heat storage system operation increases the
load value; on the other hand, the thermal power plant output decreases. This increases the wind power consumption space. Owing to the operation of the heat storage device, the daily load curve of the system is corrected from y1 to y2, reducing the peak-to-valley difference of the system to 1679.1 MW and making the system run more smoothly.
In order to facilitate the dispatching organization to prepare the power generation
plan, firstly, the heat storage system operation curve is obtained according to the
remote control strategy, and then the heat storage control strategy is used to correct
daily load curve y2 and formulate a dispatch plan. Figure 25.7 shows the output curve
of the thermal power plant unit. It can be seen from the figure that when the unit is operated in the remote control mode proposed in this paper, the maximum output is 600 MW and the minimum output is 0 MW, which reduces the number of starts and stops of the unit and reserves more space for receiving wind power during the trough; at the same time, economic operation and deep peak shaving of the grid can be achieved.
The use of local control and remote control during the low valley period can effec-
tively raise the trough load and provide a larger capacity margin for the grid to
consume more wind power.
Under safe operating conditions of the heat storage system, the additional wind power consumed by the power grid owing to the heat storage system is:
$E_{Gwind} = \sum_{k=1}^{365} \int_{t_1}^{t_2} f_{HS}(t)\,dt$  (25.18)
Among them, t1 and t2 are the start and end times of direct control heat storage
during the low valley period; fHS (t) is the heat storage unit power at time t, which is
a step function.
For the selected Liaoning regional power grid, the control strategy can increase the adjustable load capacity on the grid side by 300 MW, improving the wind power consumption capacity by 300 MW. The wind power consumption was calculated under the condition that the heat storage device operates for 7 h every day and 5 months every year, giving an additional wind power consumption of 315 million kWh.
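The annual figure can be reproduced directly: since f_HS(t) is a step function, the integral in Eq. (25.18) reduces to power multiplied by duration per day. The 150-day season length below is an assumption standing in for "5 months every year".

```python
# f_HS(t) is a step function, so the yearly sum of integrals in
# Eq. (25.18) reduces to power x hours x days. The inputs are the
# chapter's round figures; 150 days stands in for "5 months".
P_hs_mw = 300.0        # adjustable direct-control heat-storage load, MW
hours_per_day = 7
days = 150

extra_energy_mwh = P_hs_mw * hours_per_day * days
print(extra_energy_mwh)   # 315000.0 MWh, i.e. 315 million kWh
```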
25.5 Conclusion
The effective utilization of the heat storage power source is realized by constructing
the local control and remote control strategy of the heat storage unit. At the same time,
the grid optimization scheduling model with large-scale electric thermal storage unit
is constructed with the goal of maximizing the operating efficiency of the system.
Finally, the rationality of the model was verified by using the actual data of Liaoning
Power Grid, and the power efficiency under the model was analyzed.
References
1. Peng, X., Xiong, L., Wen, J., et al.: A summary of methods for improving short-term and ultra-
short-term power forecast accuracy. Chin. Soc. Electr. Eng. 36(23), 6315–6326 (2016)
2. Lu, M.S., Chang, C.L., Lee, W.J., et al.: Combining the wind power generation system with
energy storage equipment. IEEE Trans. Ind. Appl. 45(6), 2109–2115 (2009)
3. Heming, Y., Xiangjun, L., Xiufan, M., et al.: Wind energy planning output control method
for energy storage system based on ultra-short-term wind power prediction power. Power Grid
Technol. 39(2), 432–439 (2015)
4. Zhao, S., Wang, Y., Xu, Y.: Fire storage combined related opportunity planning and scheduling
based on wind power prediction error randomness. Chin. Soc. Electr. Eng. 34(S1), 9–16 (2014)
5. Jiang, Q., Wang, H.: Two-time-scale coordination control for a battery energy storage system to
mitigate wind power fluctuations. IEEE Trans. Energy Convers. 28(1), 52–61 (2013)
6. Yu, J., Sun, H., Shen, X.: Joint optimal operation strategy for Wind-Thermal power units with
heat storage devices. Power Autom. Equip. 37(6), 139–145 (2017) (in Chinese)
7. Xu, F., Min, Y., Chen, L., et al.: Electrical-thermal combined system with large capacity heat
storage. Chin. J. Electr. Eng. 34(29), 5063–5072 (2014) (in Chinese)
8. Chen, T.: Research on wind power scheme for thermal power plant based on heat storage. Dalian
University of Technology (2014)
Chapter 26
Global Maximum Power Point Tracking
Algorithm for Solar Power System
Ti Guan, Lin Lin, Dawei Wang, Xin Liu, Wenting Wang, Jianpo Li and
Pengwei Dong
Abstract The P-U curve of the PV (photovoltaic) system has multi-peak charac-
teristics under non-uniform irradiance conditions (NUIC). The conventional MPPT
algorithm can only track the local maximum power points, therefore, PV system fails
to work at the global optimum, causing serious energy loss. How to track its global
maximum power point is of great significance for the PV system to maintain an
efficient output state. Artificial Fish Swarm Algorithm (AFSA) is a global maximum
power point tracking (GMPPT) algorithm with strong global search capability, but
the convergence speed and accuracy of the algorithm are limited. To solve the mentioned problems, a Hybrid Artificial Fish Swarm Algorithm (HAFSA) for GMPPT is proposed in this paper, using the formulation of Particle Swarm Optimization (PSO) to reformulate the AFSA and improving the principal parameters of the algorithm. Simulation results show that, under NUIC, the proposed algorithm performs better than the PSO and AFSA algorithms in convergence speed and accuracy.
26.1 Introduction
Solar energy is an important sort of renewable energy and MPPT algorithm is one of
the key technologies in PV power generation system. Under uniform irradiance, there
is only one maximum power point on the P-U output curve where the PV module can
operate at maximum efficiency and produce maximum output power [1]. But when part of the PV array receives lower solar irradiance due to occlusion by objects such as clouds, trees and buildings, a condition known as non-uniform irradiance conditions (NUIC), the output of the PV system will be affected [2].
In order to keep the PV system operating at the maximum power point, many MPPT algorithms have been proposed, such as Perturb and Observe (P&O) [3] and Incremental Conductance (INC) [4]. Under uniform irradiance, P&O
and INC show good tracking efficiency and speed. However, under NUIC, conven-
tional MPPT techniques fail to track the global peak and instead converge onto one
of the local maximum power points, resulting in considerable underutilization of the
PV power [5]. Reference [6] points out that under NUIC, the conventional MPPT
algorithm may cause a decrease in output power of the PV array by about 70%. There-
fore, under NUIC, GMPPT technology is crucial for tracking the global maximum
power point (GMPP).
To solve the problems of tracking GMPP under NUIC, the intelligent algorithm
is introduced into the GMPPT technology, and the GMPPT is achieved by using the
global search capability of the intelligent algorithm like Particle Swarm Optimization
(PSO) [7], Back Propagation (BP) Neural Network [8], and Cat Swarm Optimization
(CSO) [9]. PSO has been proposed as a GMPPT algorithm based on the behavior
of birds flocking [10]. In this technique, particles collectively solve a problem by
sharing information to find the best solution. The technique is limited by the presence
of random variables in its implementation, and it requires several parameters to be
defined for each system. Another GMPPT algorithm based on simulated annealing
(SA) optimization [11] has been proposed recently. However, this method incurs
more PV voltage variations during searching process and needs higher convergence
time.
The intelligent AFSA algorithm is introduced into GMPPT technology, and this paper proposes a Hybrid Artificial Fish Swarm Algorithm (HAFSA) that includes: (1) using the formulation of the PSO to reformulate the AFSA, (2) extending the AFSA with memory behavior and communication behavior, and (3) improving the principal parameters of the algorithm so that their values adapt to the requirements of different search stages.
The equivalent circuit with series and parallel resistance of each PV cell is shown in
Fig. 26.1.
where Iph is the PV current; Id is the current of parallel diode; Ish is the shunt
current; I is the output current; U is the output voltage; Rs is series resistance; Rsh is
shunt resistance.
According to the equivalent circuit of Fig. 26.1, the relationship between the
output current and the voltage of the PV cell is described as:
$I = I_{ph} - I_0 \left[ \exp\left( \frac{q(U + R_s I)}{nKT} \right) - 1 \right] - \frac{U + R_s I}{R_{sh}}$  (26.1)
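Because Eq. (26.1) is implicit in I, it is usually solved numerically. The sketch below uses simple bisection; all parameter values in the usage are illustrative, not taken from the chapter.

```python
import math

def pv_current(U, Iph, I0, n, T, Rs, Rsh, q=1.602e-19, K=1.381e-23):
    """Solve the implicit single-diode equation (26.1) for the output
    current I at terminal voltage U by bisection. The residual f(I) is
    monotonically decreasing in I, so a sign change brackets the root."""
    Vt = n * K * T / q   # thermal voltage n*K*T/q

    def f(I):
        return (Iph - I0 * (math.exp((U + Rs * I) / Vt) - 1.0)
                - (U + Rs * I) / Rsh - I)

    lo, hi = -Iph, Iph   # bracket of the root for 0 <= U < open-circuit voltage
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

At short circuit (U = 0) the solution is close to the photogenerated current; at a working voltage it drops slightly due to the diode and shunt terms.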
The principle of the artificial fish swarm algorithm is to simulate the foraging, swarming and following behaviors of fish in nature, together with the mutual assistance within the swarm, so as to find the global optimum.
Define the maximum moving distance of an artificial fish as Step, its perception distance as Visual, the retry number as Try_Number and the crowding factor as η. The state of an artificial fish can be described by the vector X = (X1, X2, ..., Xn), and the distance between fish i and fish j is dij = ‖Xi − Xj‖.
(1) Prey
The artificial fish perceives food within its visual range. Let the current state be Xi, and randomly select a state Xj within the perception range:

$X_j = X_i + Visual \times rand()$  (26.2)

where rand() is a random number between 0 and 1. If Yi > Yj, the fish moves forward in this direction; otherwise a new state Xj is randomly chosen and tested against the move condition. If it is satisfied:
$X_i^{t+1} = X_i^t + \frac{X_j - X_i^t}{\|X_j - X_i^t\|} \times Step \times rand()$  (26.3)
If the move condition cannot be satisfied after Try_Number attempts, the fish moves randomly:

$X_i^{t+1} = X_i^t + Visual \times rand()$  (26.4)
(2) Swarm
In order to avoid overcrowding, let the current state of the artificial fish be Xi. Search for the number of companions nf and their center Xc within the neighborhood (namely dij < Visual). The fish can then move toward its companions' center location:
$X_i^{t+1} = X_i^t + \frac{X_c - X_i^t}{\|X_c - X_i^t\|} \times Step \times rand()$  (26.5)
(3) Follow
If a companion Xj within the neighborhood has a better state, the artificial fish moves toward it:

$X_i^{t+1} = X_i^t + \frac{X_j - X_i^t}{\|X_j - X_i^t\|} \times Step \times rand()$  (26.6)
(4) Random
The random behavior allows the artificial fish to find food and companions in a larger area: a state is randomly selected, and the artificial fish moves toward it.
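The prey behavior described above can be sketched in one dimension (for GMPPT, the search variable would be the operating voltage). This is an illustrative toy implementation written for maximization; the fitness function and all parameters in the demo are invented.

```python
import random

def prey(x, fitness, visual, step, try_number):
    """One prey move of AFSA in one dimension (sketch, maximization):
    probe random states within Visual; advance toward a better probe,
    or make a random move after Try_Number failed probes."""
    y = fitness(x)
    for _ in range(try_number):
        xj = x + visual * random.uniform(-1.0, 1.0)   # Eq. (26.2)-style probe
        if fitness(xj) > y:                           # better state found
            direction = 1.0 if xj > x else -1.0
            return x + direction * step * random.random()   # Eq. (26.3)
    return x + visual * random.uniform(-1.0, 1.0)     # random move fallback

# Demo on an invented concave fitness with its peak at x = 3
random.seed(1)
f = lambda v: -(v - 3.0) ** 2
x, best = 0.0, float("-inf")
for _ in range(300):
    x = prey(x, f, visual=1.0, step=0.5, try_number=5)
    best = max(best, f(x))
```

Over repeated moves the fish drifts toward the peak, and the best fitness seen approaches the optimum.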
In order to improve the convergence speed and accuracy of the algorithm, this paper introduces several features of the PSO, such as the velocity inertia factor, the memory factor and the communication factor, into the AFSA. The HAFSA algorithm makes the artificial fish move with velocity inertia, and the behavior patterns of the artificial fish expand to include memory behavior and communication behavior. The HAFSA algorithm also reduces the blindness of the artificial fish search process.
(1) The paper uses the formulation of the PSO to reformulate the AFSA.
The introduction of a velocity inertia weight can reduce the blindness of the artificial fish movement. Taking the update of the swarm behavior as an example, if Yc/nf < η × Yi, the update Eqs. (26.7) and (26.8) are:
$V_{t+1} = \omega V_t + rand() \times \frac{Step \times (X_t^c - X_t)}{\|X_t^c - X_t\|}$  (26.7)

$X_{t+1} = X_t + V_{t+1}$  (26.8)
(2) It introduces the memory factor and the communication factor of the PSO into the AFSA so as to add memory behavior and communication behavior.
First, the algorithm introduces the memory behavior pattern, in which the artificial fish refers to the best position it has visited while moving. If Ypbest/nf < η × Yi, this indicates that the location has abundant food and is not crowded. The update Eq. (26.9) is:

$V_{t+1} = \omega V_t + rand() \times \frac{Step \times (X_t^{pbest} - X_t)}{\|X_t^{pbest} - X_t\|}$  (26.9)

where Xt^pbest is the best location vector of the artificial fish at the t-th iteration.
Second, in the communication behavior pattern the artificial fish refers to the optimal position of the entire fish swarm while moving. If Ygbest/nf < η × Yi, this indicates that the location has abundant food and is not crowded. The update Eq. (26.10) is:

$V_{t+1} = \omega V_t + rand() \times \frac{Step \times (X_t^{gbest} - X_t)}{\|X_t^{gbest} - X_t\|}$  (26.10)

where Xt^gbest is the best location vector of all artificial fish on the bulletin board at the t-th iteration.
In order to meet the requirement that the fish swarm runs at high speed in the early stage to explore the search space effectively, while searching accurately at low speed within the neighborhood of the optimal solution in the later stage, this paper proposes a new nonlinear decrement method based on the linearly decreasing inertia weight ω, as shown in Eq. (26.11):
In order to further improve the performance of the algorithm, this paper proposes
an improved way to meet the expectations of changes of Step and Visual. As shown
in Eqs. (26.12) and (26.13):
$Visual(t) = \frac{VIS_{\max}}{(VIS_{\min}/VIS_{\max})^{1/(t_{\max}-1)}} \times \left[ (VIS_{\min}/VIS_{\max})^{1/(t_{\max}-1)} \right]^t$  (26.12)
where VISmin and VISmax are the lower and upper limits of Visual, respectively.
$Step(t) = Visual(t) \times \frac{X_E - X}{\|X_E - X\|} \times \left( 1 - \frac{Y}{Y_E} \right)$  (26.13)
where, X is the current situation of artificial fish, XE is the next situation that artificial
fish X explores in various behaviors, Y and YE are the fitness value corresponding to
situations X and XE respectively.
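Equation (26.12) can be checked numerically: it is a geometric decay that starts at VIS_max and ends at VIS_min. The sketch below assumes illustrative bounds and an illustrative iteration count.

```python
def visual_schedule(t, t_max, vis_min, vis_max):
    """Perception-range schedule of Eq. (26.12): exponential decay with
    Visual(1) = VIS_max and Visual(t_max) = VIS_min."""
    r = (vis_min / vis_max) ** (1.0 / (t_max - 1))
    return vis_max / r * r ** t

# Illustrative bounds: Visual shrinks from 2.0 to 0.1 over 30 iterations
vals = [visual_schedule(t, 30, 0.1, 2.0) for t in range(1, 31)]
print(round(vals[0], 6), round(vals[-1], 6))   # 2.0 0.1
```

The large early Visual supports broad exploration, while the small final Visual confines the search to the neighborhood of the optimum, matching the design goal stated above.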
$U = \begin{cases} \frac{nKT}{q} \ln\left( \frac{I_{ph2} - I}{I_0} + 1 \right) - \frac{n_b KT}{q} \ln\left( \frac{I - I_{ph1}}{I_{0b}} + 1 \right) - I R_s, & I_{ph1} < I \le I_{ph2} \\ \frac{nKT}{q} \ln\left( \frac{I_{ph1} - I}{I_0} + 1 \right) + \frac{nKT}{q} \ln\left( \frac{I_{ph2} - I}{I_0} + 1 \right) - 2 I R_s, & 0 \le I < I_{ph1} \end{cases}$  (26.14)
where nb is the diode influence factor, I0b is the saturation leakage current of the
bypass diode under standardized testing conditions.
Simulation results are presented in this section, and several performance measures of HAFSA are compared with those of PSO and AFSA under the same conditions.
In the experiment, a PV array with three modules connected in series is taken
as an example. Under standardized conditions, G = 1000 W/m2 , T = 25 ◦ C, the
parameters of a single PV module are shown in Table 26.2.
When the irradiance condition is G = 1000, 800, 400 W/m2 , the GMPP of
the PV array is tracked by the proposed HAFSA GMPPT algorithm. The tracking
process is shown in Fig. 26.2.
It can be seen from Fig. 26.2 that the HAFSA GMPPT algorithm can accurately
track the GMPP with a high efficiency. After the 21st iteration of the algorithm, the
result tends to be smooth, the GMPP is Pmax = 894.3010 W when Impp = 6.7685 A.
When T = 25 ◦ C and G = 1000, 800, 400 W/m2 , PSO, AFSA, and HAFSA are
used for GMPPT. The tracking results are shown in Fig. 26.3.
As can be seen from the tracking process shown in Fig. 26.3, the three algorithms
can track the GMPP, and the proposed algorithm can track the GMPP with fewer
iterations. It can be seen from Fig. 26.3 that the HAFSA algorithm shows better
convergence speed and stability than the other two algorithms when tracking the
GMPP.
Figure 26.4 shows the population distribution of the three algorithms after the 30th iteration. It can be seen that, after the 30th iteration, the populations of the three algorithms are all distributed in the neighborhood of the optimal solution, and the distribution of the HAFSA population is the most concentrated.
Fig. 26.2 Tracking process of the HAFSA GMPPT algorithm (power and current versus iteration, with GMPP reference)
Fig. 26.3 Tracking curves of AFSA, PSO and HAFSA (power versus iteration)
Fig. 26.4 Results of the three algorithms after the 30th iteration
26.6 Conclusion
Acknowledgements This work was supported by “Research on Lightweight Active Immune Tech-
nology for Electric Power Supervisory Control System”, a science and technology project of State
Grid Co., Ltd in 2019.
References
1. Li, C., Cao, P., Li, J., Zhao, B.: Review on reactive voltage control methods for large-scale
distributed PV integrated grid. J. Northeast Electr. Power Univ. 37(2), 82–88 (2017)
2. Wang, H., Chen, Y., Li, G., Zhuang, G.: Solution of voltage beyond limits in distribution
network with large scale distributed photovoltaic generators. J. Northeast Electr. Power Univ.
37(6), 8–14 (2017)
3. Femia, N., Petrone, G., Spagnuolo, G.: Optimization of perturb and observe maximum power
point tracking method. IEEE Trans. Power Electron. 20(4), 963–973 (2005)
4. Sera, D., Mathe, L., Kerekes, T.: On the perturb-and-observe and incremental conductance
MPPT methods for PV systems. IEEE J. Photovolt. 3(3), 1070–1078 (2013)
5. Roman, E., Alonso, R., Ibanez, P.: Intelligent PV module for grid-connected PV systems. IEEE
Trans. Ind. Electron. 53(4), 1066–1073 (2006)
6. Manickam, C., Raman, G.R., Raman, G.P.: A hybrid algorithm for tracking of GMPP based on
P&O and PSO with reduced power oscillation in string inverters. IEEE Trans. Ind. Electron.
63(10), 6097–6106 (2016)
7. Oliveira, F.M.D., Silva, S.A.O.D., Durand, F.R.: Grid-tied photovoltaic system based on PSO
MPPT technique with active power line conditioning. IET Power Electron. 9(6), 1180–1191
(2016)
8. Yang, M., Huang, X., Su, X.: Study on ultra-short term prediction method of photovoltaic
power based on ANFIS. J. Northeast Electr. Power Univ. 38(4), 14–18 (2018)
9. Yin, L., Lv, L., Lei, G.: Three-step MPPT algorithm for photovoltaic arrays with local shadows.
J. Northeast Electr. Power Univ. 37(6), 15–20 (2017)
10. Miyatake, M., Veerachary, M., Toriumi, F.: Maximum power point tracking of multiple pho-
tovoltaic arrays: a PSO approach. IEEE Trans. Aerosp. Electron. Syst. 47(1), 367–380 (2011)
11. Lyden, S., Haque, M.E.: A simulated annealing global maximum power point tracking
approach for PV modules under partial shading conditions. IEEE Trans. Power Electron. 31(6),
4171–4181 (2016)
Chapter 27
A Design of Electricity Generating
Station Power Prediction Unit with Low
Power Consumption Based on Support
Vector Regression
Abstract During the process of electricity generating station operation, its output
power will be affected by environmental factors, so there will be a large fluctuation.
If we can monitor the environmental data and the output power of the electricity
generating station in real time, we can make an accurate and effective estimation of
the operation status of the electricity generating station. To meet this demand, we
designed an electricity generating station power prediction unit based on the support vector regression algorithm. The power consumption of the unit is very low, and by using
machine learning, the characteristics and rules of each index can be learned from
the environmental data collected by sensors. By processing and analyzing the newly
collected data, the real-time operation status of the electricity generating station can
be monitored.
27.1 Introduction
With the continuous development of machine learning and deep learning technology, the concept and knowledge system of machine learning have improved day by day. It can be predicted that, for a long time to come, machine learning will develop toward edge and terminal devices. The feature-extraction ability of machine learning algorithms makes them very suitable for analyzing and predicting systems that are greatly affected by environmental factors. Therefore, many machine learning algorithms can be applied in practical production.
In practical applications, it is more common to upload data and computing tasks to
the cloud, and then the results are returned to the local after data collection and model
training are completed in the cloud. This way of machine learning can be called cloud
computing. The advantages of cloud computing lie in the large amount of data stored
in the server, high accuracy and strong computing power. Although cloud computing
is powerful, it also has disadvantages. Many computational scenarios need to be done
locally, such as driverless vehicles. If the collected data is uploaded to the cloud for
processing and calculation during the driving process, the time delay caused by
this process is likely to lead to safety accidents. Comparatively speaking, running
machine learning algorithms on terminal devices has the advantages of real-time and
low latency, and is more suitable for many practical scenarios.
Although machine learning algorithms perform well in solving many specific
problems, running them consumes substantial computing resources. In this paper,
we need to run the machine learning algorithm on a terminal device, so we process
and optimize the data before the algorithm runs, in order to reduce the running time
required by the algorithm and implement it efficiently under low power consumption.
The output power of a power plant is affected by environmental factors to a great
extent. In order to monitor the operation of the power plant in real time, we design a
power prediction unit and use the classical support vector regression algorithm in
machine learning to extract the characteristics of each environmental data index,
so as to accurately estimate the operation status of the power plant.
We use the FRDM-KW019032 as the embedded platform in the experimental part.
This series of development boards adopts an ARM Cortex-M0+ core, which has very
low power consumption and supports ISM band wireless communication. It is very
suitable for many practical application scenarios. Therefore, this type of development
board is selected to verify and debug the algorithm.
In Sect. 27.2 we introduce the workflow of the power prediction unit and the SVR
algorithm. In Sect. 27.3 we mainly discuss the methods of data preprocessing and
precision evaluation. In Sect. 27.4 we introduce the implementation of the algorithm
on the embedded platform. In Sect. 27.5 we will summarize the content of the article.
In this section we mainly introduce the operation mode of the electricity generating
station we monitored and the structure of the power prediction unit, and briefly
introduce the support vector regression algorithm and some key parameters involved
in its operation.
The combined cycle power plant consists of gas turbine (GT), steam turbine (ST)
and heat recovery steam generator. In the process of power generation in electricity
262 B. Liu et al.
generating station, electricity is generated by the gas and steam turbines, which are
combined in one cycle and transfer energy from one turbine to another. The exhaust
vacuum is collected from the steam turbine and affects its performance, while three
ambient variables (temperature, relative humidity and atmospheric pressure) affect
the performance of the gas turbine. Therefore, the output power of the power plant
is mainly related to these three environmental variables and the exhaust vacuum.
According to the data
we collected in the factory through sensors, we can get the numerical range of these
four indicators. The range of values is shown in Table 27.1.
By acquiring the range of these parameters, we can normalize the data to the interval
between 0 and 1 before the algorithm runs, so as to avoid the inconsistent feature
weights that would result from using the raw data directly, and improve the efficiency
of the algorithm on the embedded platform.
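The normalization step can be sketched as below. The min/max values here are illustrative placeholders standing in for the ranges of Table 27.1, which depend on the collected data.

```python
def normalize(sample, ranges):
    """Min-max scale each feature of a sample to [0, 1] using known ranges."""
    return [(x - lo) / (hi - lo) for x, (lo, hi) in zip(sample, ranges)]

# Placeholder ranges (not the actual values of Table 27.1) for
# temperature, relative humidity, pressure and exhaust vacuum.
RANGES = [(1.8, 37.1), (25.6, 100.2), (992.9, 1033.3), (25.4, 81.6)]

sample = [20.0, 60.0, 1010.0, 50.0]
scaled = normalize(sample, RANGES)
assert all(0.0 <= v <= 1.0 for v in scaled)
```

On the MCU the same scaling would be done in fixed point with the ranges stored as constants.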
The main purpose of the power prediction unit designed in this paper is to obtain
real-time operation information of the power plant: the unit acquires environmental
data, learns its characteristics, predicts the ideal output power value under the current
environment, and compares it with the measured output power value, so as to monitor
the working state of the power plant.
Firstly, the sensors on the power prediction unit measure the environmental data,
and the KW01 reads the data through its ADC. After the support vector regression
algorithm completes, the power prediction unit outputs an ideal value for the current
state. This ideal value is then compared with the measured power value, so as to
judge whether there are problems in the operation of the power plant. The structure
and design flow of the prediction unit are shown in Fig. 27.1.
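The comparison step can be sketched as a simple relative-deviation check; the 5% tolerance below is an illustrative assumption, not a value from the chapter.

```python
def station_ok(predicted_kw, measured_kw, tolerance=0.05):
    """Return True when the measured output stays within `tolerance`
    (relative) of the SVR-predicted ideal output; False flags a
    potential operating problem. The 5% threshold is an assumption."""
    return abs(measured_kw - predicted_kw) / predicted_kw <= tolerance

print(station_ok(450.0, 445.0))  # True: within 5 percent of the ideal value
print(station_ok(450.0, 400.0))  # False: deviation too large, raise an alarm
```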
Support Vector Regression (SVR) is a classical machine learning algorithm. Its task
is, for given data, to find a regression hyperplane such that as many data points as
possible lie within an ε-insensitive margin around it, and to apply the fitted model
to target prediction and analysis.
Fig. 27.1 Structure and design flow of the power prediction unit
Fig. 27.2
This section mainly introduces the accuracy evaluation method and parameter selec-
tion of training algorithm. Appropriate parameters can speed up code running and
improve the efficiency of the algorithm on embedded platform. The part of parameter
optimization is completed on PC.
For the evaluation index of model accuracy, we choose root mean square error
(RMSE) at the beginning of the experiment. The formula is as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \qquad (27.1)$$
Through the formula, we can find that the index can describe the relationship
between the predicted value and the actual value very well. However, in the specific
experiments, we found that in the process of data preprocessing we only scale the
values of the four input dimensions but not the label values, so this index cannot
describe the accuracy performance of the same model in different scale data sets very
well. Therefore, this paper chooses a statistical index to describe the goodness of fit,
and the formula is as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \qquad (27.2)$$
When the value of this index approaches 1, the accuracy of the model is better.
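Both metrics can be computed directly; the toy example below also shows why the goodness-of-fit index was preferred: multiplying the labels by a constant scales the RMSE but leaves R² unchanged.

```python
import math

def rmse(y, y_hat):
    # Root mean square error, Eq. (27.1)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

def r_squared(y, y_hat):
    # Goodness of fit, Eq. (27.2); invariant to the scale of the labels
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

y, y_hat = [450.0, 460.0, 470.0, 480.0], [451.0, 459.0, 471.0, 479.0]
y10, y_hat10 = [v * 10 for v in y], [v * 10 for v in y_hat]
assert rmse(y10, y_hat10) == 10 * rmse(y, y_hat)                   # RMSE scales
assert abs(r_squared(y10, y_hat10) - r_squared(y, y_hat)) < 1e-12  # R^2 does not
```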
When training the model on computer, we achieved good prediction results, and the
output predicted values are in good agreement with the target values. However, when
running on the KW01, we face a different situation. It has far fewer available
resources: the main frequency is only 48 MHz, less than 1/50 of that of the PC, and
the computing ability
is poor. Therefore, it is necessary to optimize from the level of hyperparameters to
minimize the number of iterations, so as to shorten the runtime of the algorithm. For
the three hyperparameters of the support vector regression algorithm, Epsilon needs
to be selected manually according to the scale of the target data. For the other two
parameters, C and
Gamma, the increase of C and Gamma can improve the accuracy of the model in the
training process, but at the same time, it will make the model more complex, resulting
in more iterations and longer runtime. In theory, when Gamma is large enough, the
model can fit all the known data points, but correspondingly, the model will fall into
over-fitting, which will make the generalization effect worse. At the same time, the
number of iterations required for the operation of the algorithm will become very
large, and the efficiency of the algorithm will become low. Therefore, considering
the practical application needs to run on a single-chip computer, we will transform
the optimization goal into how to reduce the number of iterations on the basis of
ensuring the accuracy of the model. In each training, we first determine the value of
Epsilon according to the scale of the target data, and determine an expected accuracy
(with R² as the evaluation index) that we want the algorithm to achieve. Then the
exhaustive method is used to find the optimal C and Gamma in a certain interval. In
evaluating the advantages and disadvantages of current C and Gamma, we divide the
data into ten parts by ten-fold cross-validation. Nine of them are taken as training
data sets at a time, and the remaining one is used as the test data set; we record the
mean R² obtained with each parameter combination. When R² reaches the given
value, C and Gamma are not increased any further, so that we obtain several sets of
values of C and Gamma, and then choose the optimal parameters according to
the number of iterations. Table 27.2 shows intuitively the selection of parameters
when the expected accuracy takes different values.
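The selection procedure can be sketched as follows. `evaluate` stands in for training an SVR with 10-fold cross-validation and returning the mean R² together with the iteration count; the stub used here is purely illustrative.

```python
def select_params(evaluate, c_grid, gamma_grid, target_r2):
    """Try every (C, gamma) pair, keep those reaching the target mean R^2,
    then pick the pair that needs the fewest iterations."""
    candidates = []
    for c in c_grid:
        for g in gamma_grid:
            r2, n_iter = evaluate(c, g)
            if r2 >= target_r2:
                candidates.append((n_iter, c, g))
    if not candidates:
        return None
    n_iter, c, g = min(candidates)
    return c, g, n_iter

# Illustrative stub: accuracy and iteration cost both grow with C * gamma.
def fake_evaluate(c, g):
    return min(0.99, 0.80 + 0.05 * c * g), int(100 * c * g)

best = select_params(fake_evaluate, [1, 2, 4], [0.5, 1.0], target_r2=0.90)
print(best)  # (2, 1.0, 200): the cheapest pair that reaches the target
```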
Firstly, we store the data of the existing data sets on chip, and after the MCU runs
the support vector regression algorithm, the resource occupancy can be checked on
IDE. We took 300 sets of data for training and five sets for prediction, then computed
R² for evaluation. The achieved R² is 0.911, indicating a good fit. The running time
of the code can be viewed through IAR software
simulation. Viewing the map file exported from the project can obtain information
about memory usage. The project code takes up 28.2 KB Flash space and 9.6 KB
SRAM space. From this, we can see that the design makes reasonable and effective
use of on-chip resources on the basis of completing the purpose of the algorithm and
ensuring the accuracy of the algorithm.
Fig. 27.3 Workflow of the prediction unit with external sensors attached
We connect sensors to MCU to obtain external data and store the latest 20 sets of data
for running support vector regression algorithm. When external sensors are attached,
the workflow of the prediction unit is shown in Fig. 27.3.
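Keeping only the latest 20 data sets maps naturally onto a fixed-size ring buffer; a sketch in Python (on the MCU this would be a static array):

```python
from collections import deque

window = deque(maxlen=20)   # latest 20 sensor readings, oldest dropped first

for i in range(25):         # 25 incoming samples: 5 get evicted
    window.append((float(i), float(i), float(i), float(i)))

assert len(window) == 20
assert window[0][0] == 5.0  # samples 0-4 were discarded
```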
27.5 Conclusion
This paper mainly introduces the design of a power prediction unit with low power
consumption, and briefly introduces the support vector regression algorithm. In prac-
tical applications, we can run machine learning algorithms on low-power and low-
cost platforms, which can extract the characteristics of environmental data and realize
real-time monitoring of power plant operation status. At the same time, we designed
a complete set of parameter optimization methods and corresponding optimization
strategies. While minimizing resource occupation, the runtime of the algorithm can
be shortened as much as possible by tuning the hyperparameters.
References
1. Chen, X., Peng, X., Li, J.-B., Peng, Yu.: Overview of deep kernel learning based techniques and
applications. J. Netw. Intell. 1(3), 83–98 (2016)
2. Kuang, F.-J., Zhang, S.-Y.: A novel network intrusion detection based on support vector machine
and tent chaos artificial bee colony algorithm. J. Netw. Intell. 2(2), 195–204 (2017)
3. Fan, R.-E., Chen, P.-H., Lin, C.-J.: Working set selection using second order information for
training support vector machines. J. Mach. Learn. Res. 6, 1889–1918 (2005)
4. Burges, C.J.C.: A tutorial on support vector machines for pattern recognition. Data Min. Knowl.
Discov. (2) (1998)
5. Kim, E., Lee, J., Shin, K.G.: Real-time prediction of battery power requirements for electric
vehicles. In: ACM/IEEE 4th International Conference on Cyber-Physical Systems, ACM, New
York, NY, USA, pp. 11–20 (2013)
Chapter 28
Design of Power Meter Calibration Line
Control System
Liqiang Pei, Qingdan Huang, Rui Rao, Lian Zeng and Weijie Liao
Abstract Aiming at the problem that the manual calibration in the power meter cal-
ibration is inefficient and error-prone, and the existing automatic calibration equip-
ment is cumbersome, this paper proposes a pipelined automatic calibration solution
for the instrument. By combining the automatic instrument calibration device with
the assembly line equipment, pipelined calibration of instruments is realized, and
multiple power meters can be calibrated at the same
time. This paper introduces the structure of the power meter automatic calibration
assembly line system and the design of hardware and software. The experimental
results show that the designed system can realize the fully automated calibration
operation of the instrument.
28.1 Introduction
Manual calibration of power meters suffers from low calibration efficiency and
cumbersome operation, and the calibration personnel are extremely prone to fatigue
after prolonged operation and are prone to error [1].
Therefore, it is necessary to develop automated calibration technology for power
meters.
At present, some power meter automatic calibration devices have appeared at
home and abroad, using DSP or computer as the processor, and using machine
vision to obtain the instrument representation number [2]. These devices basically
realize the automation of meter reading acquisition and calibration data processing,
which improves the calibration efficiency and calibration accuracy [3]. However,
these instrument calibration devices still have the following deficiencies: firstly,
the instruments must be classified and placed manually, and automatic handling
has not yet been realized; secondly, manual connection and disconnection opera-
tions are still required; finally, the system has poor versatility and can calibrate
only a few types of meters. Moreover, only one instrument can be verified at a
time, and pipeline operation is not realized [4]. In order to solve the above
problems, this paper designs a power meter calibration pipeline system that
realizes pipelined calibration operation, and automatically completes the transport
meter, disconnection, calibration and range adjustment operations of the instrument [5]. It
can significantly shorten the meter calibration time and improve the meter calibration
efficiency.
The overall structure of the power meter calibration pipeline control system is shown
in Fig. 28.1. It is mainly composed of four parts: system main control unit, instrument
calibration unit, instrument identification and grabbing unit and pipeline conveyor
belt.
In the power meter calibration pipeline control system, the system main control
unit controls the operation of the whole system; the instrument calibration unit
realizes the automatic calibration operation of the power meter [6]; the instrument
identification and grabbing unit grabs the power meter to be verified from the
instrument warehouse, places it on the assembly line, and identifies the instrument model.
The assembly line conveyor is responsible for transporting the instrument between
the instrument calibration unit and the instrument storage warehouse. The power
meter calibration pipeline is a distributed system. The main control unit, the instru-
ment identification and capture unit and the instrument calibration unit are connected
in the same control LAN. In actual use, different numbers of instrument calibration
units can be connected according to actual needs. The more the number of meter cal-
ibration units used, the more instruments that can be simultaneously calibrated, and
the higher the calibration efficiency. The instrument calibration unit is the core equip-
ment of the system, and is designed with independent control computer, automatic
transport meter device, automatic disconnecting device and automatic calibration
device. Through the mutual cooperation of these devices, the automatic calibration
operation of the instrument is completed [7].
The basic workflow of the power meter calibration pipeline system is shown in
Fig. 28.2. After the system is started, the main control unit sends a status query
command to each meter calibration unit. If there is an idle calibration unit, the main
control system will send a grab instrument instruction to the meter identification and
capture unit, and carry the instrument to be verified from the warehouse to the assem-
bly line. After the operation is completed, the operation completion signal and the
instrument model information will be returned to the main control unit. After receiv-
ing the operation completion signal, the main control unit sends a start calibration
command and meter model information to the meter calibration unit. After receiving
the command, the instrument calibration unit will first read the instrument calibration
plan and instrument parameters from the main control unit database. Then verify the
instrument according to the calibration plan and instrument parameter information.
After the meter calibration is completed, the meter calibration unit will return the
calibration completion signal to the system main control unit. After receiving the cal-
ibration completion signal, the main control unit will start the pipeline to transport
the meter that has been verified.
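The status-query / start-calibration handshake amounts to assigning waiting meters to idle calibration units; a minimal sketch (unit and meter identifiers are hypothetical):

```python
def dispatch(meters, units):
    """Assign each waiting meter to the first idle calibration unit, as the
    main control unit does after its status query; meters left over stay
    queued until a unit becomes idle again."""
    assignments, queued = [], []
    idle = [u for u in units if not u["busy"]]
    for meter in meters:
        if idle:
            unit = idle.pop(0)
            unit["busy"] = True
            assignments.append((unit["id"], meter))
        else:
            queued.append(meter)
    return assignments, queued

units = [{"id": 1, "busy": False}, {"id": 2, "busy": True}, {"id": 3, "busy": False}]
done, waiting = dispatch(["meter-A", "meter-B", "meter-C"], units)
assert done == [(1, "meter-A"), (3, "meter-B")]
assert waiting == ["meter-C"]
```

Adding more calibration units enlarges `units` and directly increases how many meters can be verified at once, matching the scalability claim above.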
The instrument calibration unit is responsible for completing the transport meter,
disconnection, range adjustment and automatic calibration operation of the power
meter in this system, which is the main component of the system. The schematic
diagram of the hardware structure of its control system is shown in Fig. 28.3. It
consists of a smart camera, a displacement control box, a standard source, a standard
source channel switchboard, a digital I/O board, a motor driver control board, and
a plurality of electric actuators. The meter calibration unit completes the transport
meter, disconnection and range adjustment operation of the power meter through the
cooperation of multiple electric actuators [8].
If the control system of the meter calibration unit is divided by function, it can
be divided into the calibration device, the meter loading and unloading control
circuit, the disconnection device control circuit and the range adjustment circuit.
The instrument
calibration unit control system is connected to the control LAN through a network
interface with the computer as the core, and realizes the communication connection
with the main control unit.
A digital I/O board with a PCI interface installed on the main control computer
is used to control the action of the electric actuator [9]. A motor driver control board
was designed to control two servo motor drives, three stepper motor drives and a
steering gear.
The motor driver control board is connected to the computer via a USB interface. In
order to realize the automation of the calibration standard source channel switching,
a standard source channel switching board is designed to switch the standard source
channel output channel. The standard source channel switch board is connected to
the host computer via a USB interface.
The power meter calibration pipeline system software is designed to follow the
principles of reliability, modifiability, readability and testability [10], using multi-
threading, network communication and database technology. The system software
adopts the client/server structure, and the system software is divided into the main
control unit software and the instrument calibration unit software. The main con-
trol unit software is used as the server software, and the instrument calibration unit
software is used as the client software. A database is built in the main control unit
computer, and the instrument calibration plan, instrument parameters and instrument
calibration results are uniformly stored in it, which is conducive to unified manage-
ment of data [11].
The main control unit software is used to control the running status of the
entire instrument automatic calibration unit, and realize functions such as instrument
scheduling, status monitoring and data processing. The main control unit software
After the connection line is attached to the instrument terminal and the wiring is
completed, the instrument calibration operation is started; after the calibration is
completed, the instrument disconnection control module is executed to remove the
connection line from the instrument terminal; then the instrument's transport meter
control module is executed to move the instrument from the
calibration station to the assembly line. This completes the instrument calibration.
28.5 Experiments
In order to verify the function of the designed power meter calibration pipeline
control system, the main control unit, the instrument calibration unit and the pipeline
conveyor belt are combined, and the system function calibration environment shown
in Fig. 28.6 is built to verify the system function.
The key point of the experiment is whether the system can realize the whole process of
the transport meter, disconnection, range adjustment and automatic calibration of the
power meter under the control of the main control unit. The designed experimental
scheme is as follows: put the instruments to be verified on the pipeline, use the
system main control unit to start the whole system, record the time required for
each operation, and then analyze the experimental results.
A total of five meters were used for testing throughout the process. In all five
experiments, all the operational procedures of the instrument calibration were suc-
cessfully completed. The experimental results are shown in Table 28.1. Excluding
the time spent on meter calibration, the average time of the transport meter, dis-
connection, range adjustment and instrumentation operation of each meter is about
260 s. The functional design basically meets the expected design goals of the system.
28.6 Conclusion
Aiming at the problem that the existing power meter calibration equipment has poor
versatility and low calibration efficiency, this paper designs a power meter cali-
bration pipeline control system. The system combines computer technology, infor-
mation management technology and digital control technology. Under the synergy
of multiple motors, the automatic transport meter, automatic disconnection,
automatic calibration and other functions of the power meter are realized. When
the system is equipped with multiple instrument calibration units, one system can
simultaneously verify multiple instruments, which can reduce the calibration time of
the power meter, reduce the labor intensity of the calibration personnel, and improve
the calibration efficiency of the instrument.
References
1. Li, Q., Fang, Y., He, Y.: Automatic reading system based on automatic alignment control for
pointer meter. In: Industrial Electronics Society, IECON 2014—40th Annual Conference of
the IEEE, pp. 3414–3418 (2014)
2. Yue, X.F., Min, Z., Zhou, X.D., et al.: The research on auto-recognition method for analogy
measuring instruments. In: International Conference on Computer, mechatronics, Control and
Electronic Engineering, pp. 207–210 (2010)
3. Zhang, J., Wang, Y., Lin, F.: Automatic reading recognition system for analog measuring
instruments based on digital image processing. J. Appl. Sci. (13), 2562–2567 (2013)
4. Chen, C., Wang, S.: A PC-based adaptive software for automatic calibration of power trans-
ducers. IEEE Trans. Instrum. Meas. (46), 1145–1149 (1997)
5. Pang, L.S.L., Chan, W.L.: Computer vision application in automatic meter calibration. In:
Fourteenth IAS Annual Meeting. Conference Record of the 2005, pp. 1731–1735 (2005)
6. Smith, J.A., Katzmann, F.L.: Computer-aided DMM calibration software with enhanced AC
precision. IEEE Trans. Instrum. Meas. 36, 888–893 (1987)
7. Wang, S.C., Chen, C.L.: Computer-aided transducer calibration system for a practical power
system. In: IEE Proc. Sci. Measure. Technol. (6), 459–462 (1995)
Abstract On the basis of the traditional gray level co-occurrence matrix (GLCM)
and the 8-neighborhood motif matrix, a novel twenty-neighborhood color motif co-
occurrence matrix (TCMCM) is proposed and used to extract the foreground in color
videos. The process of extracting the foreground is briefly described as follows.
First, the background is constructed by averaging the first M frames of the considered
video. Following this, the TCMCM of each point is computed in the current frame
and the background frame respectively. Next, based on the TCMCM, the entropy,
moment of inertia and energy in each color channel are introduced to represent
color texture features. Finally, Euclidean distance is used to measure the similarity of
color texture features between the foreground and background. Experimental results
show that the presented method can be effectively applied to foreground extraction
in color video, and can get better performance on the foreground extraction than the
traditional method based on GLCM.
29.1 Introduction
With the development of the Internet and the wide application of visual sensors,
people have entered an era of information explosion. How to accurately and quickly
extract the interesting foreground or target from a large amount of visual infor-
mation will directly affect subsequent tracking and positioning, and it is also a
key preprocessing step for the prediction of target behavior and scene understand-
ing. At present, the classical methods of foreground extraction include the optical flow
method, the frame difference method and the background difference method [1],
etc. However, the optical flow method requires multiple iterative operations, which
makes its computation complex and time-consuming. Moreover, the optical flow
method has poor anti-noise ability and is rarely applied in real scenarios [2]. The
frame difference method easily produces holes and ghosting for rapidly moving
foreground objects, and its accuracy is low [3]. The background difference method
depends on the updating model of the background. The shadow generated by light
is detected by most foreground detection methods, because it has the same motion
property as the target, which affects the accuracy of extraction.
As an important perception cue on the surface of objects, texture is widely used
in feature extraction. Therefore, this paper starts with texture features and looks
for a method of foreground extraction according to the texture similarity between
foreground and background. At present, methods of texture feature extraction
mainly fall into statistical methods and structural methods [4]. Gray level co-occurrence
matrix (GLCM) is used as a classical statistical method [5], and motif matrix is
commonly used in structural method [6].
GLCM and its derivative matrices (e.g., the gray motif co-occurrence matrix [7, 8])
compute feature statistics mainly from gray level information, and to the best of our
knowledge, few studies have been presented on color images [7–10]. In
fact, color features can provide abundant information of color, which is conducive
to the extraction and detection of image features. Therefore, this paper intends to
mix the color features of an image with GLCM, and presents the color motif co-
occurrence matrix. However, GLCM is mainly based on the 8-neighborhood motif
matrix of each pixel, so the extraction of moving objects in an image is often
incomplete and small moving objects may be missed entirely. Therefore,
by expanding the 8-neighborhood motif matrix, the twenty-neighborhood color
motif co-occurrence matrix (TCMCM) is proposed in this paper. A new algorithm
based on the proposed TCMCM is applied to extract the foreground from the color
video, which obtains more accurate information from neighborhood pixels, and dis-
tinguishes foreground target points and background points according to different
texture features of foreground and background, so as to extract the interesting fore-
ground.
GLCM was proposed by Haralick et al. [11]. It characterizes texture statistics
according to the spatial correlation and gray level relationship between paired
pixels of an image, and has been widely used in various fields in recent years [12].
29 Foreground Extraction Based on 20-Neighborhood Color … 283
GLCM is used to present the occurrence probability of paired pixels. Let L be the
gray level of image, i and j denote the respective gray values of any paired pixels,
which are between 0 and L − 1. More notations are shown as follows. θ is the angle
between the line determined by paired pixels and horizontal plane, which reflects the
direction of the paired pixels. Usually, the value of θ is 0, 45, 90 or 135 with unit of
degree. λ denotes the distance between two pixels of any pair. Thus, the element of
a GLCM is expressed with the above notations as follows [13]:

$$p(i, j \mid \lambda, \theta) = \#\{((x_1, y_1), (x_2, y_2)) : G(x_1, y_1) = i,\; G(x_2, y_2) = j\} \qquad (29.1)$$

where the pairs are counted over all pixel pairs separated by distance λ in direction θ,
and # denotes the cardinality of a set. When the direction and distance between the
paired pixels are determined, the corresponding GLCM is the matrix of Eq. (29.2).
$$P_\lambda^\theta = \begin{bmatrix} p(0,0) & \cdots & p(0,j) & \cdots & p(0,L-1) \\ \vdots & & \vdots & & \vdots \\ p(i,0) & \cdots & p(i,j) & \cdots & p(i,L-1) \\ \vdots & & \vdots & & \vdots \\ p(L-1,0) & \cdots & p(L-1,j) & \cdots & p(L-1,L-1) \end{bmatrix} \qquad (29.2)$$
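The counting behind Eq. (29.2) can be sketched as follows; dividing each count by the total number of pairs gives the probabilities p(i, j). The 0-degree, distance-1 offset used here is one of the four usual choices of θ and λ.

```python
def glcm(img, levels, offset=(0, 1)):
    """Gray level co-occurrence counts for pixel pairs separated by
    `offset` (row, column), here distance 1 at 0 degrees."""
    dy, dx = offset
    h, w = len(img), len(img[0])
    P = [[0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                P[img[y][x]][img[y2][x2]] += 1
    return P

img = [[0, 0, 1],
       [0, 1, 1],
       [1, 1, 0]]
assert glcm(img, levels=2) == [[1, 2], [1, 2]]  # counts of (i, j) pairs
```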
GLCM requires heavy computation, and methods based only on GLCM describe
texture imprecisely and thus yield poor extraction results.
Motif matrix is composed of motif values, and the value at one pixel is based on its 4
or 8 neighborhood pixels. A motif value presents the torque of neighborhood pixels
to their corresponding central pixel [14].
Suppose that the non-boundary pixel point (x, y) is considered, and G(x, y) is the
gray value of each pixel of an image. When the torque of 4 neighborhood pixels is
used to measure the motif value of the considered pixel [14], m(x, y) can be gotten
by

$$m(x, y) = \mathrm{INT}\{G(x-1, y) + G(x, y+1) + G(x+1, y) + G(x, y-1)\}, \quad x = 1, \ldots, L_x - 2,\; y = 1, \ldots, L_y - 2 \qquad (29.3)$$

Here, L_x and L_y are the numbers of pixels along the x and y directions
respectively.
Similarly, the motif value can be calculated as Eq. (29.4) when 8 neighborhood
pixels are considered for a central pixel (x, y) [7].
284 C.-F. Guo et al.
$$m(x, y) = \mathrm{INT}\{\sqrt{2}\,[G(x-1, y-1) + G(x-1, y+1) + G(x+1, y-1) + G(x+1, y+1)] + [G(x-1, y) + G(x, y+1) + G(x+1, y) + G(x, y-1)]\}, \quad x = 1, \ldots, L_x - 2,\; y = 1, \ldots, L_y - 2 \qquad (29.4)$$
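Eq. (29.4) for an interior pixel can be sketched as:

```python
import math

def motif8(img, x, y):
    """Motif value of Eq. (29.4): sqrt(2)-weighted diagonal neighbours
    plus the four direct neighbours, truncated to an integer (INT)."""
    diag = img[x-1][y-1] + img[x-1][y+1] + img[x+1][y-1] + img[x+1][y+1]
    direct = img[x-1][y] + img[x][y+1] + img[x+1][y] + img[x][y-1]
    return int(math.sqrt(2) * diag + direct)

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
assert motif8(img, 1, 1) == 48   # sqrt(2)*20 + 20, truncated
```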
GLCM and its derivative matrices are mainly used to present statistical feature
quantities of an image based on gray level information. For small foreground targets,
or when the difference between foreground and background colors is small, the
aforementioned matrices easily lead to incomplete extraction of the target. For color
videos, each
color channel has texture information [15]. To improve the extraction performance
in color videos, we construct TCMCM on the basis of GLCM, color feature and
20-neighborhood motif matrix. The element of the constructed TCMCM matrix is
expressed as
where L 1 denotes the maximum value of color co-occurrence matrix on each channel
of RGB, and L 2 is the maximum motif value of 20-neighborhood.
CP(i, j, r, t|λ, θ ) is the number of paired pixels when the r-th channel value is i
and the motif value is j under the conditions of direction θ and distance λ in the color
image at time t in video.
In order to reduce the computation, the values in this paper are compressed and
quantized into 16 levels before constructing the color motif co-occurrence matrix.
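The 16-level compression is a plain integer quantization; a sketch assuming 8-bit channel values:

```python
def quantize(value, levels=16, max_value=256):
    """Map an 8-bit channel value (0-255) onto one of 16 levels before
    building the co-occurrence matrix, shrinking the matrix size."""
    return value * levels // max_value

assert quantize(0) == 0
assert quantize(128) == 8
assert quantize(255) == 15
```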
The color motif co-occurrence matrix cannot be directly regarded as the features;
its elements are used for further statistics. The entropy, energy, contrast, correlation,
moment of inertia, inverse difference moment, angular second moment and other
features (14 in total) are usually considered as texture statistics [16, 17]. In order to
reduce the computation and combine the features of foreground and background,
our work selects entropy, moment of inertia and energy as texture statistics, which
are shown as Eqs. (29.8)–(29.10) respectively. These quantities have strong
descriptive ability as features of foreground and background texture.
Entropy:

$$H(t, r, \lambda, \theta) = -\sum_i \sum_j CP(i, j, r, t \mid \lambda, \theta)\, \log CP(i, j, r, t \mid \lambda, \theta) \qquad (29.8)$$

Energy:

$$E(t, r, \lambda, \theta) = \sum_i \sum_j CP(i, j, r, t \mid \lambda, \theta)^2 \qquad (29.9)$$

Moment of inertia:

$$I(t, r, \lambda, \theta) = \sum_i \sum_j (i - j)^2\, CP(i, j, r, t \mid \lambda, \theta) \qquad (29.10)$$
Considering that the color motif co-occurrence matrix represents the spatial
dependence between image pixels and the comprehensive information of the color
space, we construct the color texture feature vector from the three texture features
of each of the R, G and B channels (nine parameters in total) as

$$V = [H_R, I_R, E_R, H_G, I_G, E_G, H_B, I_B, E_B] \qquad (29.11)$$
In order to describe the similarity between the current foreground region and the background region at time t in the video, the Euclidean distance of Eq. (29.12) is introduced to measure the similarity of the foreground and background texture features.
$$d(t, x, y) = \bigl(V_t^{f} - V^{b}\bigr)^{T} \bigl(V_t^{f} - V^{b}\bigr) \qquad (29.12)$$
Here, V is obtained by Eq. (29.11), and the superscripts f and b denote foreground and background respectively.
A smaller Euclidean distance means a higher similarity between the current pixel texture and the background texture.
(1) The first M (≥100) frames of a video are input and their average is calculated to build the background model. The value of M depends on the video size and complexity; it is usually set to 100, 200, 300 or more.
(2) Calculate the TCMCM of each point in the background model, and measure the texture feature quantity in the neighborhood around each pixel of the background image as V^b according to Eq. (29.11).
(3) Input the frame at time t and calculate the TCMCM of each pixel in the current image. According to Eq. (29.11), the texture feature quantity in the neighborhood around each pixel is measured as V_t^f.
(4) Based on Eq. (29.12), the distance between the texture feature quantities of each pixel (x, y) in the current frame and in the background frame is calculated.
(5) If the distance for pixel (x, y) is less than the threshold T, the pixel (x, y) belongs to the background; otherwise, it belongs to the foreground.
(6) Input the next frame and repeat the processing from step (3) until all frames of
the video have been processed.
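Steps (1)–(6) can be sketched as follows. `feature_fn` is a placeholder for the Eq. (29.11) feature extractor, which is not reproduced here, and the frame format and threshold semantics are our assumptions:

```python
import numpy as np

def detect_foreground(frames, M, T, feature_fn):
    """Sketch of steps (1)-(6): average the first M frames as the background
    model, then compare each later frame's per-pixel texture features with it.

    feature_fn(image, x, y) -> feature vector; it stands in for the
    nine-component TCMCM feature extractor of Eq. (29.11)."""
    background = np.mean(frames[:M], axis=0)          # step (1)
    h, w = background.shape[:2]
    masks = []
    for frame in frames[M:]:                          # steps (3)-(6)
        mask = np.zeros((h, w), dtype=bool)
        for y in range(h):
            for x in range(w):
                v_b = feature_fn(background, x, y)    # step (2)
                v_f = feature_fn(frame, x, y)         # step (3)
                d = v_f - v_b
                # steps (4)-(5): Eq. (29.12) distance vs. threshold T
                if float(d @ d) >= T:
                    mask[y, x] = True                 # foreground point
        masks.append(mask)
    return masks
```

In practice the background features would be precomputed once rather than inside the frame loop; the inline call keeps the sketch short.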
29.5 Experiment
[Figures 29.2–29.5 each compare a an original frame (the 530th, 654th, 89th, 176th, 189th and 115th frames respectively), b the result of the traditional method, and c the result of the proposed method.]
The proposed method is able to detect small targets, as in Fig. 29.2c, and incomplete extraction of individual moving objects does not occur, as shown in Figs. 29.2 and 29.4. For videos containing motion shadows generated by target motion, as in Figs. 29.3 and 29.5, the proposed method achieves higher foreground-extraction accuracy and less noise than the traditional method.
29.6 Conclusions
Acknowledgements This work is supported by Educational Research Project for Young and
Middle-aged Teachers of Fujian No. JAT-170667 and Teaching Reform Project of Fuqing Branch
of Fujian Normal University No. XJ14010.
References
1. Lin, G., Wang, C.: Moving target detection algorithm combining an improved three-frame difference method with the background difference method. Equip. Manuf. Technol. 3, 172–173 (2018)
2. Fu, D.: Vehicle detection algorithm based on background modeling. University of Science and
Technology of China, Hefei (2015)
3. Guo, C.: Target tracking algorithm based on improved five-frame difference and mean shift. J.
Langfang Normal Univ. 18(1), 21–24 (2018)
4. Jian, C., Hu, J., Cui, G.: Texture feature extraction method of camouflage effect evaluation
model. Comm. Control Simul. 39(3), 102–105 (2017)
5. Gao, C., Hui, X.: GLCM-Based texture feature extraction. Comput. Syst. Appl. 19(6), 195–198
(2010)
6. Liu, X.: ROI digital watermarking based on texture characteristics. Hangzhou Dianzi Univer-
sity, Hangzhou (2011)
7. Wang, L., Ou, Z.: Image texture analysis by grey-primitive co-occurrence matrix. Comput.
Eng. 30(23), 19–21 (2004)
8. Hou, J., Chen, Y., He, S., et al.: New definition of image texture feature. Comput. Appl. Softw.
24(9), 157–158 (2007)
9. Song, L., Wang, X.: An image retrieval algorithm integrating color and texture features. Comp.
Eng. Appl. 47(34), 203–206 (2011)
10. Yu, S., Zeng, J., Xie, L.: Image retrieval algorithm based on multi-feature fusion. Comput. Eng.
38(24), 216–219 (2012)
11. Haralick, R.M., Shanmugam, K., Dinstein, I.: Textural features for image classification. IEEE
Trans. Syst. Man Cybern. 3(6), 610–621 (1973)
12. Ghulam, M., Mohammed, A., Hossain, M., et al.: Enhanced living by assessing voice pathology
using a co-occurrence matrix. Sensors 17(2), 267 (2017)
13. Wang, H., Li, H.: Classification recognition of impurities in seed cotton based on local binary
pattern and gray level co-occurrence matrix. Trans. Chin. Soc. Agric. Eng. 31(3), 236–240
(2015)
14. Wang, L., Ou, Z., Su, T., et al.: Content-based image retrieval in database using SVM and gray
primitive co-occurrence matrix. J. Dalian Univ. Technol. (4), 475–478 (2003)
15. Xu, F.: Classification of texture features based on color symbiosis matrix. J. Zhejiang Ind.
Trade Vocat. Coll. 16(4), 54–58 (2016)
16. Gui, W., Liu, J., Yang, C., et al.: Color co-occurrence matrix based froth image texture extraction
for mineral flotation. Miner. Eng. 60–67 (2013)
17. Jiao, P., Guo, Y., Liu, L., et al.: Implementation of gray level co-occurrence matrix texture
feature extraction using Matlab. Comput. Technol. Dev. 22(11), 169–171 (2012)
Chapter 30
Deformation Analysis of Crude Oil
Pipeline Caused by Pipe Corrosion
and Leakage
Yuhong Zhang, Gui Gao, Hang Liu, Qianhe Meng and Yuli Li
Abstract In this paper, pipeline corrosion and leakage models were built with Ansys software. Computational Fluid Dynamics (CFD) simulation and unidirectional fluid–solid coupling simulation were carried out for the corrosion and leakage conditions of the pipeline. The results show that when the pipe is corroded by 2 mm, the deformation of the pipe increases to 5.2 × 10⁻⁹ m. When the pipe leaks, the deformation near leak holes of different shapes changes, and the region near the leak hole deforms the most. This conclusion provides an effective basis for studying pipeline corrosion and leak detection technology.
30.1 Introduction
At present, crude oil and natural gas are mainly transported through pipelines, and energy consumption grows with the rapid development of the national economy. The pipelines built in the Sinopec system include a number of refined oil pipelines such as the Southwest Oil Products Pipeline, the Pearl River Delta Pipeline, and the Lusong Pipeline. However, most existing crude oil pipelines were built about 30 years ago, so the oil pipeline network has entered a period of accident risk because of spiral weld defects and corrosion. There are many risk factors, and the production safety situation is severe [1]. Therefore, understanding the operational status of oil pipelines and finding problems in the pipeline transportation process is very important.
Pipeline detection and monitoring techniques have been widely investigated; pipeline inspection can be divided into two aspects, corrosion detection and leakage detection [2]. Detection technologies that can be used for both corrosion and leakage monitoring include magnetic flux leakage detection, acoustic emission detection and optical fiber sensing [3]. Fiber-optic sensing technology has obvious advantages in safety, measurement accuracy and long-distance transmission, and can meet the requirements of pipeline corrosion and leakage monitoring [4].
The deformation of a pipeline changes as the pipe corrodes or leaks. In order to analyze this deformation, we simulated the stress distribution of the pipeline under different states with Ansys software. The simulation results provide a theoretical reference for selecting a suitable stress sensor to detect pipeline deformation.
30.2 Principle
The flow of fluids follows the standard conservation equations of fluid mechanics: conservation of mass, momentum, energy, and chemical components [6]. Neglecting local fluid disturbances and heat transfer at the pipe joint, the fluid density can be assumed constant and the fluid incompressible. The governing differential equations of the fluid in this simulation are shown in Eqs. (30.1), (30.2), and (30.3) respectively:
Continuity equation:
$$\frac{\partial \rho}{\partial t} + \frac{\partial (\rho u_j)}{\partial x_j} = 0 \qquad (30.1)$$
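Since the density is assumed constant here, the unsteady term in Eq. (30.1) vanishes and ρ factors out, leaving the incompressible (divergence-free) form:

```latex
% With \rho constant: \partial\rho/\partial t = 0, so Eq. (30.1) reduces to
\frac{\partial(\rho u_j)}{\partial x_j}
  = \rho\,\frac{\partial u_j}{\partial x_j} = 0
  \quad\Longrightarrow\quad
  \frac{\partial u_j}{\partial x_j} = 0
```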
where ρ (kg/m³) is the medium density; t (s) is time; u_i (m/s) is the velocity in the i direction; x_i (m) and x_j (m) are the displacements in the i and j directions; μ is the molecular viscosity and μ_t the turbulent viscosity; G_k (J) is the turbulence kinetic energy generated by the mean velocity gradients; G_b (J) is the turbulence kinetic energy generated by buoyancy; Y_M is the contribution of the fluctuating dilatation of the compressible fluid to the overall dissipation rate; C_1ε = 1.44, C_2ε = 1.92 and C_3ε = 0.99 are empirical constants; and S_k, S_ε are user-defined source terms.
We took a land open-air crude oil pipeline as the Ansys model. The pipeline model was built with the DesignModeler tool of Ansys software (see Fig. 30.1). The length of the pipeline is 2 m, the outer diameter is 220 cm, and the wall thickness is 7.5 mm. Pipeline models were established for leakage and corrosion conditions, respectively. The leaking pipe models were built by placing the leak location in the middle of the pipe and varying the shape of the leak hole. The corrosion pipe models were built with different thicknesses (0.5, 1.0 and 1.5 mm) of pipeline corrosion.
294 Y. Zhang et al.
The pipeline material is 20# steel; its elastic modulus is 210 GPa, Poisson's ratio 0.3, density 7800 kg/m³, and yield strength 245 MPa. The inflow medium is liquid product oil with a density of 1200 kg/m³ and a dynamic viscosity of 1.3 × 10⁻³ Pa·s.
The k–ε model was adopted for the steady-state simulation. The pressure–velocity coupling in the iteration adopted SIMPLEC to improve the convergence speed, and a two-dimensional unsteady flow model was adopted to improve the calculation speed. In the unidirectional fluid–solid coupling simulation, zero-displacement constraints were applied at the inlet and outlet of the pipeline.
Boundary conditions:
• Pipeline inlet flow rate is 1 m/s;
• The pressure at the outlet of the pipe is equal to 1000 Pa;
• The pressure at the leak hole of the pipe is equal to 0 Pa [9].
where σ_s (MPa) is the yield stress, and σ_1, σ_2, σ_3 (MPa) are the principal stresses in the three directions.
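The "Valid"/"Invalid" pipe states in Tables 30.1 and 30.2 are presumably obtained by comparing an equivalent stress with the yield stress σ_s. A sketch using the von Mises (fourth strength theory) form, which is our assumption since the chapter does not print its criterion:

```python
import math

def von_mises(s1, s2, s3):
    """Equivalent stress (MPa) from the three principal stresses."""
    return math.sqrt(0.5 * ((s1 - s2) ** 2 + (s2 - s3) ** 2 + (s3 - s1) ** 2))

def pipe_is_valid(s1, s2, s3, yield_stress=245.0):
    """Compare the equivalent stress with the 20# steel yield strength."""
    return von_mises(s1, s2, s3) < yield_stress
```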
Figure 30.2 shows the deformation distribution for leaking circular holes of different diameters and for the non-leaking pipeline. The deformation and pressure conditions are listed in Table 30.1. The comparison shows that the deformation of the pipeline under normal working conditions is uniform. Under the same internal pressure, when the pipeline leaks, the deformation and the pressure near the leak hole increase sharply, and the deformation near leak holes of different shapes differs.
Fig. 30.2 Crude oil pipeline deformation distribution cloud map, a normal, b round holes, c square
holes, d elliptical holes
Table 30.1 Effect of leak holes of different shapes on the pipeline

  Leak hole type   Pipeline pressure (Pa)   Total deformation of pipeline (m)   Pipe state
  No leakage       24,393                   8.921 × 10⁻⁹                        Valid
  Round hole       3.16 × 10⁹               0.0064                              Invalid
  Square hole      9.212 × 10¹⁰             0.03487                             Invalid
  Oval hole        1.46 × 10¹⁰              0.02779                             Invalid
Figure 30.3 shows the deformation distribution of the pipeline for different corrosion depths and for no corrosion. The deformation and pressure conditions are listed in Table 30.2. The deformation of the pipeline is uniform when there is no corrosion, but the total deformation changes when the pipe wall becomes thin or corroded, and under the same internal pressure the deformation grows with increasing corrosion depth.
Fig. 30.3 Deformation distribution of crude oil pipelines with different degrees of corrosion, e no
corrosion, f inner wall corrosion 1 mm, g inner wall corrosion 2 mm, h inner wall corrosion 3 mm
Table 30.2 Effects of different degrees of corrosion on the pipeline

  Corrosion state             Pipeline pressure (Pa)   Total deformation of pipeline (m)   Pipe state
  No corrosion                24,393                   8.921 × 10⁻⁹                        Valid
  Inner wall corrosion 1 mm   25,431                   1.284 × 10⁻⁸                        Valid
  Inner wall corrosion 2 mm   27,810                   1.373 × 10⁻⁸                        Valid
  Inner wall corrosion 3 mm   31,024                   1.510 × 10⁻⁸                        Valid
Based on the above simulation results, the pipeline deforms when it leaks or corrodes, and the deformation in the corrosion state is on the order of 10⁻⁸ m (see Table 30.2). The resolution of a fiber Bragg grating strain sensor can reach 1 pm, with the advantages of high safety and high measurement accuracy. Therefore, fiber Bragg gratings and other high-accuracy sensors can be used to monitor the running status of the pipeline.
30.5 Conclusion
The simulation results show that the deformation of the pipeline is close to zero under normal operation, but the deformation near the leak hole increases sharply if the pipeline leaks. When the pipeline is corroded, its deformation increases with increasing corrosion depth. So we can judge the working state of the oil pipeline by detecting changes in the pipeline's deformation. The minimum deformation of the pipeline is on the order of 10⁻⁸ m in the corrosion state. The result provides a reference for selecting the sensor used to monitor the running state of the pipeline.
Acknowledgements This work was supported by National Natural Science Foundation of China
(NSFC) (Grant No: 61705077), Science Foundation of Jilin Province Education Department (No:
92001001).
References
1. Baoqun, W., Yanhong, L., Yibin, D., Xinyu, C.: Current situation and prospect of China’s crude
oil pipeline. Pet. Plan. Des. 8–11 (2012)
2. Yanhui, Z., Tao, Z., Yigui, Z., Qu, H., Penghu, Z.: Numerical simulation of erosion and corrosion
in T-tube of gathering pipeline. Contemp. Chem. Ind. 43(11), 2457–2459 (2014)
3. Guozhong, W., Dong, L., Yanbin, Q.: Numerical simulation of surface temperature field of
underground oil stealing pipeline and buried oil pipeline. J. Pet. Nat. Gas 10, 815–817 (2005)
4. Jingcui, L., Kub, B., Dongmei, D., Qing, H.: Simulation of micro-leakage flow field detection
in natural gas pipeline. Comput. Simul. 10, 361–366 (2017)
5. Fuxing, Z., Pengfei, Z., Yinghao, Q.: Stress analysis of pipeline deformation based on ANSYS.
Chem. Equip. Technol. 37(2), 47–49 (2016)
6. Hongjun, Z.: Ansys 14.5 practical guide for thermo-fluid-solid coupling, pp. 147–156. People's Posts and Telecommunications Press, Beijing (2014)
7. Hongchi, H., He, Q., Jingcui, L., Zhibing, C.: Analysis of the influence of leakage hole shape
on leakage characteristics. Electr. Power Sci. Eng. 34(1), 73–78 (2018)
8. Jianming, F., Hongxiang, Z., Guoming, C., Xiaoyun, Z., Yuan, Z, Ting, R.: Effect of geometric
shape of cracks on leakage of small holes in gas pipelines. Nat. Gas Ind. 34(11), 128–133
(2014)
9. Sousa, C.A.D., Romero, O.J.: Influence of oil leakage in the pressure and flow rate behaviors
in pipeline (2017)
10. Hongyu, L.: Leakage detection technology for long distance natural gas pipeline. Chem. Manag.
10, 97–103 (2018)
11. Yingliang, W.: Leakage test and numerical simulation study of pipeline orifice. Zhejiang Univ.
45(2), 14–19 (2015)
12. Hongbing, H.: Analysis of the research status of natural gas pipeline leakage. Contemp. Chem.
Ind. 352–354 (2016)
Chapter 31
Open Information Extraction
for Mongolian Language
31.1 Introduction
For the past decade, Open IE has been developed using various methods and for many languages. These methods yield different results across languages because every language has its own peculiarities [1–3].
Mongolian is a language spoken by 5.2 million people all over the world. Officially, in Mongolia it is written in Cyrillic, even though in some other places, for instance the Inner Mongolia Autonomous Region,2 the traditional Mongolian script is used. Mongolian is classified into the Altaic language family, and it has been believed to be related to Turkish and Korean. Similar to these languages, the basic word order in Mongolian is subject–object–verb (SOV) [4]: the subject, object and verb of a sentence usually appear in that order. For instance, if English had SOV structure, the sentence "John plays guitar" would be expressed as "John guitar plays".
Compared with English, the Mongolian language has different grammatical tagging due to its highly agglutinative nature. In Mongolian, postpositions are considered a very important factor in understanding the syntax of sentences; almost every object is attached to a postposition. Therefore, identifying appropriate tags for Mongolian is significant in both preprocessing and the recognition of noun phrases. Preprocessing tools for Mongolian are scarce. For example, we could not find any freely available tokenizer or sentence splitter. We tried the English tokenizer and sentence splitter from the NLTK [5] library and achieved acceptable results. As for POS (part-of-speech) tagging, to the best of our knowledge, currently only TreeTagger [6] is freely available for Mongolian. In our experience it works poorly because it was trained on a small Mongolian corpus.
Correct recognition of the association between arguments and relations plays an important role in Open IE [7–9]. For Mongolian, as far as we surveyed, there are still no established rules for what can be identified as noun and verb phrases. In terms of noun phrases, [10] was published most recently (unfortunately written in Mongolian). The contribution of that work is three rules for recognizing noun phrases, as well as a dataset3 in which noun phrases were manually annotated in about 834 sentences. Thus it can be exploited to build a noun phrase chunker and Open IE methods.
Recently, some researchers (e.g. MiLab4 at the National University of Mongolia) have contributed to natural language processing (NLP) for Mongolian. Their solutions are still preliminary and not yet adequate as a preprocessing step for other tasks [11].
What we observed for Mongolian, which we think will also be a problem for other languages with limited resources, is that developing Open IE is quite challenging for the following reasons:
1. Preprocessing tools such as tokenizers and part-of-speech taggers either have not emerged yet, or their performance is insufficient;
2. Lack of available datasets;
3. Complexity of the language structure, grammar, etc.
In this paper we discuss rule-based and classification methods for the Mongolian language, implemented in MongoIE, an Open Information Extraction system. Under the circumstances for Mongolian mentioned above, we consider these two approaches the most applicable. Additionally, we compare their performance on a parallel dataset. In the rest of the paper, we evaluate their results and give a brief analysis of errors.
The paper is organised as follows. Section 31.2 presents the methods in the MongoIE system. The experiment and a brief analysis of errors are described in Sect. 31.3. Section 31.4 draws the conclusions and outlines future work.
3 http://172.104.34.197/brat//np-chunk/test2.
4 http://milab.num.edu.mn/.
31.2 Methods
This section describes two approaches for Open IE for the Mongolian language,
namely Rule-based and Classification.
Since there is no dependency parser for Mongolian, we were not able to implement approaches similar to TextRunner [16], WOE(pos) and WOE(parse) [17].
302 G. Lkhagvasuren and J. Rentsendorj
Therefore we also exploit TreeTagger in this module. This approach consists of two modules:
1. Candidate Extractor: In order to extract candidates, we use a similar procedure to the previous (rule-based) approach. The difference is that no expression is used to identify verb and noun phrases, because we found that the rule-based method sometimes eliminates correct sentences. To avoid discarding correct triples, we do not apply special syntactic constraints in this module. After a sentence is tagged by TreeTagger, a verb is searched for first; once it is found, two nouns are searched for to its left. The goal of this module is to feed the classifier module with the extracted candidate triples.
2. Classifier: Candidate tuples extracted by the previous module are labelled as either trustworthy or not by a Naive Bayes classifier. To train the classifier, we manually annotated 100 sample triples. We use 26 features in total; examples include the presence of POS-tag sequences in noun and verb phrases, grammatical case, the number of tokens, the number of stopwords, and whether or not the subject is a proper noun.
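A minimal sketch of the Naive Bayes labelling step, hand-rolled to stay self-contained, with three hypothetical binary features standing in for the 26 real ones:

```python
import math

def train_bernoulli_nb(X, y):
    """Fit per-class Bernoulli feature probabilities with Laplace smoothing.
    X: list of binary feature vectors, y: list of labels (1 = trustworthy)."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        probs = [(sum(r[f] for r in rows) + 1) / (len(rows) + 2)
                 for f in range(len(X[0]))]
        model[c] = (prior, probs)
    return model

def classify(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    best_c, best_lp = None, float("-inf")
    for c, (prior, probs) in model.items():
        lp = math.log(prior)
        for xi, p in zip(x, probs):
            lp += math.log(p if xi else 1.0 - p)
        if lp > best_lp:
            best_c, best_lp = c, lp
    return best_c

# Hypothetical binary features per candidate triple, e.g.
# [subject_is_proper_noun, relation_contains_stopword, arguments_are_short]
```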
31.3 Experiment
As a testing dataset for the methods presented in the previous section, we randomly labeled 100 sentences from the web.5 The results of the two approaches and the error statistics are presented in Table 31.1. As shown there, the classification method had better recall and F1-score. Having thoroughly examined the failed sentences, we found that most errors are caused by incorrect POS tagging. The expressions used to identify verb and noun phrases also contribute to errors.
31.4 Conclusion
In this paper, we have presented two basic methods, rule-based and classification, for Open IE in the Mongolian language. To the best of our knowledge, this is the first attempt at building an Open IE system for Mongolian. We believe that the results are promising, and the latter method shows better results. Having thoroughly examined the failed sentences, we found that most errors are caused by incorrect POS tagging. Thus we believe that the results can be improved considerably by using appropriate preprocessing tools, especially a better POS tagger. In the future, we plan to exploit Wikipedia for Open IE in the Mongolian language. Another way to improve the results is a larger dataset; translating a dataset from another language could be a promising direction for building it.
References
1. Banko, M., Etzioni, O.: The tradeoffs between open and traditional relation extraction. In: Proceedings of ACL-08: HLT (2008)
2. Horn, C., Zhila, A., Gelbukh, A., Kern, R., Lex, E.: Using factual density to measure informa-
tiveness of web documents. In: Proceedings of the 19th Nordic Conference on Computational
Linguistics (2013)
3. Mausam, Schmitz, M., Soderland, S., Bart, R., Etzioni, O.: Open language learning for infor-
mation extraction. In: Proceedings of the 2012 Joint Conference on Empirical Methods in
Natural Language Processing and Computational Natural Language Learning (2012)
4. Lin, T., Mausam, Etzioni, O.: Identifying functional relations in web text. In: Proceedings of
the 2010 Conference on Empirical Methods in Natural Language Processing (2010)
5. Bird, S., Loper, E., Klein, E.: Natural Language Processing with Python. O'Reilly Media Inc. (2009)
6. Schmid, H.: Improvements in Part-of-Speech Tagging with an Application to German, pp. 13–25. Springer Netherlands, Dordrecht (1999)
7. Sangha, N., Younggyun, N., Sejin, N., Key-Sun, C.: SRDF: Korean open information extraction
using singleton property. In: Proceedings of the 14th International Semantic Web Conference
(2015)
8. Sidorov, G., Velasquez, F., Stamatatos, E., Gelbukh, A., Chanona-Hernández, L.: Syntactic
dependency-based n-grams as classification features. In: Gonzalez-Mendoza, M., Batyrshin, I.
(eds.) Advances in Computational Intelligence. Proceedings of MICAI 2012 (2012)
9. Sidorov, G., Velasquez, F., Stamatatos, E., Gelbukh, A., Chanona-Hernández, L.: Syntactic
dependency-based n-grams: more evidence of usefulness in classification. In: Gelbukh, A. (ed.)
Computational Linguistics and Intelligent Text Processing. Proceedings of International Con-
ference on Intelligent Text Processing and Computational Linguistics, CICLing 2013 (2013)
10. Bayartsatsral, C., Altangerel, C.: Annotating noun phrases for Mongolian language and using
it in machine learning. In: Proceedings of the Mongolian Information Technology—2018,
Ulaanbaatar, Udam Soyol, pp. 12–15 (2018)
11. Davidov, D., Rappoport, A.: Unsupervised discovery of generic relationships using pattern
clusters and its evaluation by automatically generated sat analogy questions. In: Proceedings
of the ACL-08 (2008)
12. Fader, A., Soderland, S., Etzioni, O.: Identifying relations for open information extraction.
In: Proceedings of the Conference on Empirical Methods in Natural Language Processing,
EMNLP’11 (2011)
13. Alisa, Z., Alexander, G.: Open information extraction for Spanish language based on syn-
tactic constraints. In: Proceedings of the ACL2014 Student Research Workshop, Baltimore,
Maryland, USA, pp. 78–85 (2014)
14. Gamallo, P., Garcia, M., Fernández-Lanza, S.: Dependency-based open information extraction.
In: Proceedings of the Joint Workshop on Unsupervised and SemiSupervised Learning in NLP,
ROBUS-UNSUP ’12 (2012)
15. Van Durme, B., Schubert, L.: Open knowledge extraction using compositional language pro-
cessing. In: Proceedings of the STEP ’08 Proceedings of the 2008 Conference on Semantics
in Text Processing (2008)
16. Michele, B., Michael, J.C., Stephan, S., Matt, B., Oren, E.: Open information extraction from the
web. In: Proceedings of the Twentieth International Joint Conference on Artificial Intelligence
(2007)
17. Wu, F., Weld, D.S.: Open information extraction using wikipedia. In: Proceedings of the 48th
Annual Meeting of the Association for Computational Linguistics, ACL ’10 (2010)
Chapter 32
Colorful Fruit Image Segmentation
Based on Texture Feature
Chunyan Yang
Abstract The recognition of colorful fruit is one of the important research topics for agricultural machine vision systems. At present, popular color-model-based image segmentation methods are generally suitable when the difference between fruit and background colors is large. When this difference is not obvious, color-model-based segmentation cannot meet actual needs. Therefore, this paper uses the gray-level co-occurrence matrix to analyze the texture features of fruit and background, finds the texture feature parameters that distinguish fruit from background, and segments images in which the fruit and background colors are similar. The experimental results show that texture features can not only successfully separate the red apple from the background but also work very well for segmenting green apple images with complex backgrounds.
32.1 Introduction
C. Yang (B)
Baicheng Normal University, Baicheng 137000, Jilin, China
e-mail: bcsyycy@163.com
Generally, texture refers to the regular gray-level variation of image pixels as observed by people, and it exists widely in nature. The object identified in this paper is the apple; the background consists mainly of leaves and branches. Obviously, whether the apple is red or green, its texture is completely different from that of leaves and branches. In the experiment, texture features are therefore introduced to segment color apple images with complex backgrounds based on texture eigenvalues.
where (x, y) is a pixel coordinate in the image, with range [0, N − 1]; i and j are gray values, with range [0, L − 1]. Usually, the directions of the gray-level co-occurrence matrix are 0°, 45°, 90°, and 135°. If these four directions are not combined, a variety of features is obtained in each direction, so that there are too many texture features, which is not conducive to use. Therefore, the eigenvalues of the four directions can be averaged, and in this paper the averages over the four directions are taken as the final eigenvalues of the co-occurrence matrix. For the different values of θ, the elements of the matrix are defined as follows:
$$P(i, j, d, 0^{\circ}) = \#\{((k,l),(m,n)) \in (L_y \times L_x) \times (L_y \times L_x) : |k-m| = 0,\ |l-n| = d,\ I(k,l) = i,\ I(m,n) = j\} \qquad (32.2)$$
$$P(i, j, d, 45^{\circ}) = \#\{((k,l),(m,n)) \in (L_y \times L_x) \times (L_y \times L_x) : (k-m = d,\ l-n = -d) \text{ or } (k-m = -d,\ l-n = d),\ I(k,l) = i,\ I(m,n) = j\} \qquad (32.3)$$
$$P(i, j, d, 90^{\circ}) = \#\{((k,l),(m,n)) \in (L_y \times L_x) \times (L_y \times L_x) : |k-m| = d,\ l-n = 0,\ I(k,l) = i,\ I(m,n) = j\} \qquad (32.4)$$
$$P(i, j, d, 135^{\circ}) = \#\{((k,l),(m,n)) \in (L_y \times L_x) \times (L_y \times L_x) : (k-m = d,\ l-n = d) \text{ or } (k-m = -d,\ l-n = -d),\ I(k,l) = i,\ I(m,n) = j\} \qquad (32.5)$$
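The four directional matrices can be built as in this hand-rolled sketch; the (row, column) offset convention is one common reading of Eqs. (32.2)–(32.5), and symmetric counting of both displacements is an assumption:

```python
import numpy as np

def glcm(img, levels, d=1):
    """Build one co-occurrence matrix per direction (Eqs. 32.2-32.5).

    Offsets are (row, column) displacements; counting both the forward
    and the opposite displacement makes each matrix symmetric."""
    offsets = {0: (0, d), 45: (-d, d), 90: (d, 0), 135: (d, d)}
    h, w = img.shape
    mats = {}
    for angle, (dy, dx) in offsets.items():
        P = np.zeros((levels, levels), dtype=np.int64)
        for y in range(h):
            for x in range(w):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    P[img[y, x], img[ny, nx]] += 1
                    P[img[ny, nx], img[y, x]] += 1  # opposite displacement
        mats[angle] = P
    return mats
```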
The ASM energy (angular second moment) is
$$ASM = \sum_{i=1}^{k} \sum_{j=1}^{k} \bigl(G(i,j)\bigr)^{2} \qquad (32.6)$$
The contrast is
$$CON = \sum_{n=0}^{k-1} n^{2} \sum_{|i-j| = n} G(i,j) \qquad (32.7)$$
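ASM and CON can then be computed per direction and averaged over the four directions, as the text describes; normalizing each matrix to probabilities G first is our assumption:

```python
import numpy as np

def asm_con(P):
    """ASM energy (Eq. 32.6) and contrast (Eq. 32.7) of one co-occurrence
    matrix, normalized to probabilities G first."""
    G = P / P.sum()
    asm = np.sum(G ** 2)
    i, j = np.indices(G.shape)
    con = np.sum((i - j) ** 2 * G)
    return float(asm), float(con)

def averaged_features(mats):
    """Average ASM and CON over the directional matrices
    (e.g. the four directions 0, 45, 90 and 135 degrees)."""
    pairs = [asm_con(P) for P in mats]
    asms, cons = zip(*pairs)
    return sum(asms) / len(asms), sum(cons) / len(cons)
```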
The experiments again used the previously selected 200 color apple images, sampling the textures of fruit and leaves for red apples with complex backgrounds and for green apples with complex backgrounds, respectively. For the original images in Fig. 32.2a, b, the experimental method is to divide the image into small blocks of equal size N × N with N = 5 and to calculate two eigenvalues of the gray-level co-occurrence matrix in the four directions (0°, 45°, 90°, 135°): the ASM energy and the contrast. The averages of these features over the four directions are taken as the discriminant texture features ASM and CON. After a large number of data tests, the average values of the features are shown in Table 32.1.
As can be seen from Table 32.1, the ASM energy and contrast of leaves are very different from those of red apples and can be selected as features to achieve image segmentation and a texture feature map (shown in Fig. 32.3a). The difference between the ASM energy and contrast of leaves and those of green apples is also large, and the corresponding texture feature map is shown in Fig. 32.3b. When segmentation based on color feature values is not satisfactory, texture features are used instead.
Fig. 32.2 Texture samples: a red apple and leaves, b green apple and leaves
By analyzing Fig. 32.2a, b, it is found that the ASM energy and contrast can distinguish leaves from red apples and from green apples, so these two quantities can be used to segment green apple images with complex backgrounds. Experiments on 200 such images show that, regardless of the apple's color, the segmentation success rate is above 95%. Because the difference in ASM energy and contrast between green apples and red apples is not obvious, these two characteristic quantities cannot be used to distinguish red apples from green apples; new methods will continue to be explored in future research.
Finally, through the two texture parameters, the ASM energy and the contrast CON, we can realize the segmentation of red apples in complex backgrounds as well as the segmentation of green apples in complex backgrounds. The final segmented images are shown in Figs. 32.4 and 32.5.
32.4 Conclusion
Considering that the texture features of leaves and apples are completely different, two texture features based on the gray-level co-occurrence matrix, the ASM energy and the contrast, are introduced in the experiment and selected as the eigenvalues to segment the red apple image and the green apple image, respectively. The experimental results show that these texture features not only successfully separate the red apple from the background but also work very well for segmenting green apple images with complex backgrounds. Therefore, this paper argues that target recognition in color apple images should combine color features and texture features, so that the two complement each other and the best recognition effect is achieved.
Chapter 33
Real-Time Emotion Recognition
Framework Based on Convolution
Neural Network
Hanting Yang, Guangzhe Zhao, Lei Zhang, Na Zhu, Yanqing He
and Chunxiao Zhao
33.1 Introduction
Emotion is the cognitive experience that human beings produce during intense psychological activity. As a vital signaling system, facial expressions can convey a person's psychological state and are one of the effective means of analyzing emotions. Establishing an automatic expression recognition model is a popular research topic in the field of computer vision.
The first research on emotion recognition was published in 1978; it tracked the positions of key points across a continuous set of image frames to analyze the expression on the face [1]. However, due to poor face detection and face registration algorithms and limited computational power, progress in this field was slow until the publication of the first facial expression dataset, Cohn-Kanade [2], changed the situation. The mainstream approach detects either the underlying expression or the action units defined by the Facial Action Coding System (FACS) as the recognition target.
33.2 Background
Inspired by previous work, our emotion recognition system has four main processes: face detection, face registration, feature extraction, and expression recognition [4]. Depending on the definition of the expression space or the modality of the data, the method for each process differs (Fig. 33.1).
The purpose of face localization is to find faces in the image and mark them. There are two main approaches: detection and segmentation. Detection methods find the face in the raw data and return its bounding box. The AdaBoost algorithm with Haar-like features proposed by Viola and Jones [7] is the most commonly used; it is computationally fast but handles occlusion and head-pose changes poorly.
Support vector machines (SVM) applied over HOG features improve the accuracy but sacrifice calculation speed [8]. Convolutional neural network methods can deal with varied data distributions and achieve high accuracy, but they require a large amount of training data and take a long time to train [9]. Segmentation approaches, on the other hand, assign a binary label to each pixel of the image.
Face registration addresses the false and missed detections caused by head-pose changes and improves the accuracy of the subsequent procedures. For both 2D and 3D data, the purpose of face registration is to rotate or frontalize the face. In 2D face registration, the active shape model (ASM) [10] and its extension, the active appearance model (AAM) [11], find facial landmarks by encoding standard facial geometry and grayscale information.
The fully connected neural network [12] constructs a learning model by specifying the number of layers and the number of neurons, and then chooses a learning strategy suited to the sample distribution, including the activation function, the loss function, and the optimization method. However, the training process of a neural network is essentially a black box: a helper function must be added to monitor the learning curve and detect whether the network converges. In addition, fully connected networks are not well suited to image data.
Support vector machine (SVM) [13] is a traditional and widely used machine learning method. Its disadvantage is that overfitting is governed mainly by the choice of kernel function rather than by its parameters, and that choice is very sensitive. Random forest [14] is an ensemble method that essentially combines the outputs of several decision trees into the final result. Each individually trained decision tree is only weakly discriminative, but integrating their outputs with weights can achieve high accuracy. The disadvantage of random forests is that adding training samples does not yield a proportional improvement in accuracy. At present, deep learning, especially the deep convolutional network, is the mainstream of the vision field. Weight sharing makes it possible to extract features that are invariant to pose, illumination, and occlusion when facing new problems; its shortcomings were explained in the previous paragraph.
This section introduces the proposed expression recognition framework. We utilize a deep convolutional network to integrate feature extraction and emotion recognition into one pipeline.
In the face detection part, we use the SVM method applied to HOG features [8], which constructs feature vectors by computing histograms of gradients over local regions of the image and then feeds them into the classifier. If the result is positive, the position of the detection area is returned as the coordinates of the upper left corner of the bounding box (Xl, Yl) and of the lower right corner (Xr, Yr).
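The gradient-histogram idea behind HOG can be illustrated with a deliberately simplified sketch: per-cell orientation histograms weighted by gradient magnitude, concatenated and normalized. This omits the block normalization of the real descriptor [8], and all names and parameters are our own:

```python
import numpy as np

def hog_features(img, cell=8, bins=9):
    """Minimal HOG sketch: one orientation histogram per cell,
    concatenated into a single L2-normalized feature vector."""
    img = img.astype(np.float64)
    gy, gx = np.gradient(img)                      # gradients along y then x
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    h, w = img.shape
    feats = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            m = mag[y:y + cell, x:x + cell].ravel()
            a = ang[y:y + cell, x:x + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    v = np.concatenate(feats)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

In a full pipeline this vector would be fed to a linear SVM whose positive responses yield the bounding boxes described above.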
In the face alignment part, we use the millisecond ensemble method proposed in [14] to train several regression trees with gradient boosting, and then regress 68 landmark points, including the eye contours, the bridge of the nose, and the mouth contour, with the ensemble of decision trees (Fig. 33.2).
This article describes the dense network in four aspects: architecture, convolutional layer, transition layer, and training strategy.
Architecture. The dense convolutional neural network contains a total of 37 convolutional layers, three pooling layers, and a softmax layer. The input is a 48 × 48 × 1 grayscale image, which first passes through a 3 × 3 convolution layer, followed by three dense blocks, each containing 12 convolution layers. At the end of each dense block, a transition layer consists of an average pooling layer, a bottleneck layer, and a compression layer. The purpose of the softmax layer is to map the outputs of multiple neurons into the interval (0, 1), calculated as
$$p\left(y^{(i)}=j \mid x^{(i)};\,\theta\right)=\frac{e^{\theta_j^{T} x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T} x^{(i)}}} \qquad (33.1)$$
where y(i) represents the label of a certain type of expression, x(i) represents the input feature, and θ denotes the weights of the network. The function's output is the confidence of a specific type of expression (Fig. 33.3).
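The softmax mapping of Eq. (33.1) can be written directly; subtracting the maximum score before exponentiating is a standard numerical-stability trick that the text does not mention, and it leaves the result unchanged:

```python
import numpy as np

def softmax(z):
    """Map raw scores z to confidences in (0, 1) that sum to 1 (Eq. 33.1).
    Subtracting max(z) prevents overflow without altering the output."""
    e = np.exp(z - np.max(z))
    return e / e.sum()
```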
Convolution Layer. Unlike ResNet [15], a vertical expansion of deep networks that uses the identity function to extend the effective training depth, and unlike Inception [16], a lateral expansion that uses convolution filters of different sizes to extract features at different scales, the dense network highly reuses feature maps: any layer in the network can simultaneously use its own feature map and all feature maps of the preceding layers, which makes the network more efficient and greatly reduces the number of parameters.
In addition, the described convolutional layer includes not only the convolution
calculation of the filtering window but also the activation function ReLU and Batch
Normalization [17]. The generalized calculation in the convolutional layer is shown
in Eq. (33.2).
$$
\begin{cases}
f_1(x_i) = \max(0,\, x_i)\\[2pt]
f_2(x_i) = \operatorname{conv}_{3\times3}\bigl(f_1(x_i)\bigr)\\[2pt]
f_3(x_i) = \dfrac{f_2(x_i) - E\bigl[f_2(x_i)\bigr]}{\sqrt{\operatorname{Var}\bigl[f_2(x_i)\bigr]}}\\[6pt]
F_{\text{output}} = f_3\bigl([x_1, x_2, x_3, \ldots, x_{l-1}]\bigr)
\end{cases}
\qquad (33.2)
$$
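Assuming single-channel feature maps and a "valid" (unpadded) convolution, the composite mapping of Eq. (33.2) might look like the following sketch. The helper names are ours, and a real DenseNet layer uses padded multi-channel convolutions, so this only illustrates the ReLU → conv → batch-norm chain applied to all earlier feature maps at once:

```python
import numpy as np

def relu(x):                              # f1 in Eq. (33.2)
    return np.maximum(0.0, x)

def conv3x3(x, k):                        # f2: 'valid' 3x3 convolution
    h, w = x.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(x[i:i + 3, j:j + 3] * k)
    return out

def batch_norm(x, eps=1e-5):              # f3: zero mean, unit variance
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def dense_layer(prev_maps, kernels):
    """One layer of a dense block: reuse ALL earlier feature maps at once,
    summing their convolved responses before normalizing."""
    y = sum(conv3x3(relu(m), k) for m, k in zip(prev_maps, kernels))
    return batch_norm(y)
```

Feeding the whole list of earlier maps into each layer is the "feature reuse" that the text credits for the parameter savings.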
Transition layer. The transition layer sits between two dense blocks and has two purposes: reducing the network parameters and facilitating the computation of the next dense block. The average pooling layer computes the average value in each subregion and passes it to the next layer. The bottleneck layer is essentially a 1 × 1 convolutional layer whose main purpose is not to extract features but to reduce the number of feature maps.
Fig. 33.3 Proposed DenseNet with three dense block and 7-Softmax layer
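The two transition-layer operations can be illustrated on a (channels, height, width) stack of feature maps. This is a sketch with our own function names, not the paper's code:

```python
import numpy as np

def avg_pool2x2(x):
    """2x2 average pooling over a (C, H, W) stack of feature maps:
    each output pixel is the mean of a 2x2 subregion."""
    c, h, w = x.shape
    h2, w2 = h - h % 2, w - w % 2          # drop any odd trailing row/column
    x = x[:, :h2, :w2]
    return x.reshape(c, h2 // 2, 2, w2 // 2, 2).mean(axis=(2, 4))

def bottleneck_1x1(x, W):
    """1x1 convolution: per-pixel channel mixing with a (C_out, C_in)
    matrix, used to compress the channel count, not to extract
    spatial features."""
    return np.einsum('oc,chw->ohw', W, x)
```

With C_out < C_in, the 1 × 1 bottleneck shrinks the stack handed to the next dense block, which is exactly the parameter reduction the transition layer is for.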
However, blindly following the accelerated gradient update also brings instability. Nesterov momentum obtains approximate information about the gradient trend by evaluating the optimization function at θ − βv_{t−1}: if the gradient is increasing, the update rate is sped up; if it is decreasing, the update rate is slowed down, as shown in Eq. (33.4).
$$
\begin{aligned}
v_t &= \beta v_{t-1} + \alpha \nabla_{\theta} L\left(\theta - \beta v_{t-1}\right)\\
\theta &= \theta - v_t
\end{aligned}
\qquad (33.4)
$$
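One update of Eq. (33.4) can be written directly. Here `grad_fn` computes ∇L, and the default learning rate and momentum values are illustrative, not the paper's settings:

```python
import numpy as np

def nesterov_step(theta, v, grad_fn, lr=0.1, beta=0.9):
    """One Nesterov momentum update (Eq. 33.4): evaluate the gradient at
    the look-ahead point theta - beta*v, then step with the new velocity."""
    g = grad_fn(theta - beta * v)     # gradient at the anticipated position
    v = beta * v + lr * g             # v_t = beta * v_{t-1} + alpha * grad
    return theta - v, v               # theta = theta - v_t
```

Iterating this on a simple quadratic loss drives the parameter to the minimum while the velocity term damps oscillation, which is the stabilizing effect described above.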
This section presents our experimental environment and results.
Hardware Devices. All model training in this paper was done on a GTX 1060 graphics card, which has 1280 CUDA cores, 6 GB of GDDR5 memory, a core frequency of 1506 MHz, and 4.4 TFLOPS of single-precision floating-point performance. The test device uses a screen-integrated 2-megapixel camera, which is sufficient for facial expression recognition in images.
Dataset. FER2013 contains 35,887 gray images of 48 × 48 pixels. At first publication, the dataset labels were divided into seven categories: 4953 cases of "anger", 547 cases of "disgust", 5121 cases of "fear", 8989 cases of "happy", 6077 cases of "sadness", 4002 cases of "surprise", and 6198 cases of "neutral". This labeling was later shown to be inaccurate, so we trained on this dataset as well as on the improved FER PLUS dataset [20] and on FERFIN, a modification of FER PLUS.
Training on the FER2013 database. For this dataset, the hyperparameters are set as follows: L2 regularization with coefficient λ = 0.0001; a compression layer with compression factor θ = 0.5; the Nesterov momentum learning rate ε set to 0.1; and the momentum parameter α set to 0.1. The accuracy of the network on the validation set reaches 67.01%, as shown in Fig. 33.4.
Training on the FER PLUS database. Among the challenges noted by Goodfellow et al. [21] for neural-network classification problems is the performance degradation caused by the low accuracy of human labelers. We therefore trained a second model on the FER PLUS dataset, which used crowdsourcing to improve label accuracy. The original paper proposed four predesigned ways to handle the objective-function setting problem; we use only the majority vote for preprocessing, as our main focus is the framework itself. The accuracy of the network on the validation set reaches 81.78%, as shown in Fig. 33.5. The work of [20] uses VGG13 to achieve an average accuracy of 83.97% with the majority-vote loss strategy; their network has 8.7 million parameters, about 147 times as many as ours.
33.5 Conclusion
There are classic hand-designed feature methods and emerging deep learning methods in the field of expression recognition. The former involve more prior knowledge and generalize less well; early deep learning methods can achieve top-level accuracy but require millions of parameters. In order to train the expression recognition network with a deep convolutional model, this paper proposes the dense convolutional network as the new training network. Its multilevel connections and feature reuse reduce the network parameters while enhancing representation capability, so the expected accuracy can be achieved with as few trainable parameters as possible.
References
1. Suwa, M., Sugie, N., Fujimora, K.: A preliminary note on pattern recognition of human emo-
tional expression. In: Proceedings of the 4th International Joint Conference on Pattern Recog-
nition 1978, IAPR, pp. 408–410, Kyoto, Japan (1978)
2. Tian, Y.I., Kanade, T., Cohn, J.F.: Recognizing action units for facial expression analysis. IEEE
Trans. Pattern Anal. Mach. Intell. 23(2), 97–115 (2002)
3. Yin, L., Chen, X., Sun, Y., Worm, T., Reale, M.: A high-resolution 3D dynamic facial expression
database. In: 8th IEEE International Conference on Automatic Face & Gesture Recognition
2008, pp. 1–6. Amsterdam, Netherlands (2008)
4. Corneanu, C.A., Simon, M.O., Cohn, J.F., et al.: Survey on RGB, 3D, thermal, and multimodal
approaches for facial expression recognition: history, trends, and affect-related applications.
IEEE Trans. Pattern Anal. Mach. Intell. 38(8), 1548–1568 (2016)
5. Ji, Q., Lan, P., Looney, C.: A probabilistic framework for modeling and real-time monitoring human
fatigue. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 36(5), 862–875 (2006)
6. Ashraf, A.B., Lucey, S., Cohn, J.F.: The painful face—pain expression recognition using active
appearance models. Image Vis. Comput. 27(12), 1788–1796 (2009)
7. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In:
IEEE Conference Computer Vision Pattern Recognition 2001, vol. 1, pp. I–511 (2001)
8. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer
Society Conference on Computer Vision and Pattern Recognition 2005, CVPR, pp. 886–893,
San Diego, USA (2005)
9. Osadchy, M., Miller, M., Lecun, Y.: Synergistic face detection and pose estimation. J. Mach.
Learn. Res. 8(1), 1197–1215 (2006)
10. Cootes, T.F., Taylor, C.J., Cooper, D.H., et al.: Active shape models-their training and appli-
cation. Comput. Vis. Image Underst. 61(1), 38–59 (1995)
11. Cootes, T.F., Edwards, G.J., Taylor, C.J.: Active appearance models. IEEE Trans. Pattern Anal.
Mach. Intell. 23(6), 681–686 (2001)
12. Tian, Y.L., Kanade, T., Cohn, J.F.: Recognizing action units for facial expression analysis.
IEEE Trans. Pattern Anal. Mach. Intell. 23(2), 97–115 (2001)
13. Lemaire, P., Ardabilian, M., Chen, L., et al.: Fully automatic 3D facial expression recognition
using differential mean curvature maps and histograms of oriented gradients. In: 10th IEEE
International Conference and Workshops on Automatic Face and Gesture Recognition 2013,
(FG), pp. 1–7, Shanghai, China (2013)
14. Dapogny, A., Bailly, K., Dubuisson, S.: Dynamic facial expression recognition by joint static
and multi-time gap transition classification. In: 11th IEEE International Conference and Work-
shops on Automatic Face and Gesture Recognition 2015, (FG), pp. 1–6, Ljubljana, Slovenia
(2015)
15. He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: Proceedings
of Computer Vision—ECCV 2016, vol. 9908, pp. 770–778. Springer, Cham (2016)
16. Szegedy, C., Liu, W., Jia, Y., et al.: Going deeper with convolutions. In: IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), pp. 1–9 (2015)
17. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing
internal covariate shift. In: International Conference on Machine Learning. JMLR, pp. 448–456
(2015)
18. Su, W., Boyd, S., Candes, E.J.: A differential equation for modeling Nesterov’s accelerated
gradient method: theory and insights. Adv. Neural Inf. Process. Syst. 3(1), 2510–2518 (2015)
19. FER2013 Dataset. https://www.kaggle.com/c/challenges-in-representation-learning-facial-
expression-recognition-challenge. Accessed 25 Jan 2019
20. Barsoum, E., et al.: Training deep networks for facial expression recognition with crowd-
sourced label distribution. In: ACM International Conference on Multimodal Interaction ACM,
pp. 279–283 (2016)
Chapter 34
Facial Expression Recognition Based
on Regularized Semi-supervised Deep
Learning
Taiting Liu, Wenyan Guo, Zhongbo Sun, Yufeng Lian, Shuaishi Liu
and Keping Wu
Abstract In the field of facial expression recognition, deep learning has attracted
more and more researchers’ attention as a powerful tool. The method can effectively
train and test data by using a neural network. This paper mainly uses the semi-
supervised deep learning model for feature extraction and adds a regularized sparse
representation model as a classifier. The combination of deep learning features and
sparse representations fully exploits the advantages of deep learning in feature learn-
ing and the advantages of sparse representation in recognition. Experiments show
that the features obtained by deep learning have certain subspace features, which
accord with the subspace hypothesis of face recognition based on sparse representa-
tion. The method of this paper has a good recognition accuracy in facial expression
recognition and has certain advantages in small sample problems.
34.1 Introduction
In recent years, facial expression recognition has been used as a biometric recognition
technology. It has become an important research topic in the fields of multimedia
information processing, human–computer interaction, image processing, and pattern
recognition. Labels play an important role in facial expression recognition but are not readily available. The semi-supervised learning method can utilize both labeled and unlabeled samples in the training set; the aim is to construct a learning model from a small number of labeled samples and a large number of unlabeled samples. Early research on semi-supervised deep learning was conducted by Weston et al. [1], who introduced the Laplacian regularization term from graph-based semi-supervised learning into the objective function of a neural network and performed semi-supervised training on multilayer neural networks. Lee [2] proposed a network trained in a supervised fashion with labeled and unlabeled data simultaneously: for unlabeled data, the class with the maximum predicted probability is simply picked as a pseudo-label. Combined with a denoising autoencoder and dropout, this simple method outperforms conventional approaches to semi-supervised learning.
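Lee's pseudo-labeling step can be sketched in a few lines. The confidence threshold below is our own addition for illustration, not part of the original method:

```python
import numpy as np

def pseudo_labels(probs, threshold=0.0):
    """For each unlabeled sample (one row of class probabilities), pick the
    class with maximum predicted probability as its pseudo-label; optionally
    keep only predictions at or above a confidence threshold."""
    labels = probs.argmax(axis=1)
    keep = probs.max(axis=1) >= threshold
    return labels, keep
```

The retained (sample, pseudo-label) pairs are then mixed into the labeled batch for the next round of supervised training.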
On the other hand, inspired by sparse coding [3] and subspace methods [4], Wright et al. [5] proposed a classification method based on sparse representation: the original training face images are used as a dictionary, the sparse coefficients of the test sample are solved by norm minimization, and the classification result is obtained from the minimum reconstruction residual. Building on Wright's work, a series of studies on sparse-representation classification has made progress, including work on dictionary learning [6]. The literature [7] creatively introduces a compensation dictionary into sparse-representation face recognition and achieves a breakthrough on the small-sample face recognition problem. The literature [8, 9] points out that sparse representation also has a significant effect on facial expression recognition.
This paper uses the semi-supervised deep learning model for feature extraction
and adds a regularized sparse representation model as a classifier. The combination
of deep learning features and sparse representations fully exploits the advantages
of deep learning in feature learning and the advantages of sparse representation in
recognition.
Sparse representation-based classification (SRC) [5] assumes that face images lie in a linear subspace and that a test sample can be linearly represented by the training samples (dictionary atoms) of all classes; the class to which the sample belongs can represent it more sparsely (fewer atoms give a better reconstruction). After adding the sparsity constraint on the representation coefficients, the nonzero entries of the solved coefficient vector should correspond mainly to the dictionary atoms of the class to which the test sample belongs. Test samples can therefore be classified according to which class dictionary yields the smaller reconstruction error; this is how SRC works. The algorithmic process of SRC is as follows:
(1) The test sample is represented as a linear combination of the dictionary A, and the sparse coefficient is obtained by the L1-norm-regularized minimization:

$$\hat{\alpha} = \arg\min_{\alpha} \|y - A\alpha\|_2^2 + \lambda\|\alpha\|_1 \qquad (34.1)$$
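Equation (34.1) together with the minimum-residual rule can be sketched as follows. We use ISTA (iterative soft thresholding) as a stand-in for the unspecified L1 solver; the dictionary, λ, and iteration count are illustrative choices of ours:

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def src_classify(A, labels, y, lam=0.01, n_iter=500):
    """SRC sketch: solve min ||y - A a||_2^2 + lam*||a||_1 by ISTA, then
    assign y to the class whose atoms give the smallest residual."""
    L = np.linalg.norm(A, 2) ** 2          # step-size scale (spectral norm^2)
    a = np.zeros(A.shape[1])
    for _ in range(n_iter):
        a = soft_threshold(a + A.T @ (y - A @ a) / L, lam / (2 * L))
    best, best_r = None, np.inf
    for c in sorted(set(labels)):
        mask = np.array([lab == c for lab in labels])
        r = np.linalg.norm(y - A[:, mask] @ a[mask])  # class-wise residual
        if r < best_r:
            best, best_r = c, r
    return best
```

Each column of `A` is one training sample; the class whose atoms reconstruct `y` with the smallest error wins, as in step (2) of the SRC procedure.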
The classification method based on sparse representation can effectively utilize the subspace characteristics of face images, does not require a large number of samples for classifier learning, and is robust to noise.
However, it assumes that the training samples (dictionary) of each class are complete, i.e., that each class dictionary has sufficient expressive power. This assumption generally fails in small-sample problems with large disturbances such as illumination, pose, and occlusion. In such face recognition problems, test images are often misclassified into classes with similar intra-class variations rather than into classes with the same appearance changes.
To improve the robustness of facial recognition, the classifier adopts sparse representation based on regularized coding. Figure 34.1 gives the overall process.
The main steps in implementing the classifier are as follows:
(1) The original spatial data is embedded into the feature space, and different weights are assigned to each pixel of the facial expression image to be tested.
[Fig. 34.1 Classifier process: full-connection weight initialization → weights after convergence → sparse representation → classification]
The algorithm uses both labeled and unlabeled data, so it is a semi-supervised learning algorithm. The structure of facial feature extraction by the regularized semi-supervised deep learning algorithm is shown in Fig. 34.2.
The steps of the facial expression recognition method based on the regularized semi-supervised deep learning framework are as follows:
(1) Train an autoencoder with the unlabeled training data to obtain W and b.
(2) Remove the last layer of the autoencoder to obtain the encoding function f(x).
(3) Feed the labeled training data x into the trained encoder to obtain new data x′ = f(x), and use x′ in place of the raw data x for subsequent training. We call this new data the replacement input.
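Steps (1)–(3) can be sketched with a one-hidden-layer autoencoder in NumPy. This is a toy stand-in for the paper's convolutional autoencoder; all sizes, names, and hyperparameters are our own:

```python
import numpy as np

def train_autoencoder(X, hidden=16, lr=0.1, epochs=200, seed=0):
    """Step (1): train a one-hidden-layer autoencoder on unlabeled data X
    (n_samples x n_features) by gradient descent on squared reconstruction
    error. Returns only the encoder weights (step (2) drops the decoder)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # encoder activations
        Xr = H @ W2 + b2                    # linear decoder output
        err = Xr - X
        dW2 = H.T @ err / n; db2 = err.mean(0)
        dH = err @ W2.T * (1 - H ** 2)      # backprop through tanh
        dW1 = X.T @ dH / n; db1 = dH.mean(0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return W1, b1

def replacement_input(X_labeled, W1, b1):
    """Step (3): encode the labeled data, x' = f(x), for supervised training."""
    return np.tanh(X_labeled @ W1 + b1)
```

The supervised stage then trains on the encoded `x'` instead of the raw pixels, which is what the text calls the replacement input.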
Fig. 34.2 The expression recognition structure of the semi-supervised deep learning method (unlabeled training data → autoencoder feature learning → CNN network initialization → classifier, with regularized semi-supervised feature extraction; labeled training data from the facial expression database is used for fine-tuning)
The feature extraction process used in this paper is based on an autoencoder-initialized semi-supervised convolutional neural network. The parameters of each layer of the network are shown in Table 34.1. The dropout probability used in training is 50%, and the activation function is ReLU.
This paper uses the FER2013 facial expression dataset for training. The database contains 35,887 face images: 28,709 training, 3,589 validation, and 3,589 test images. The images are grayscale, 48 × 48 pixels, and are divided into seven categories: 0 = angry, 1 = disgust, 2 = fear, 3 = happy, 4 = sad, 5 = surprised, and 6 = neutral, with the distribution of the types basically uniform. The FC2 layer is used as the face feature, and an L1-regularized sparse representation classifier identifies the facial expression.
Three hundred images of each expression type were randomly selected from the FER2013 training set, giving 2100 facial expression images that serve as the unlabeled training data for training the semi-supervised learning model. The following are the facial expression recognition results with different classifiers.
Softmax is the most commonly used classifier in deep learning. The facial expression recognition results of softmax classification are shown in Table 34.2. The recognition rate of happy is significantly higher than that of the other expressions, while fear is the most difficult to distinguish.
The facial expression recognition results of the sparse representation classifier are shown in Table 34.3.
After replacing the softmax classifier with a sparse representation classifier, the recognition rate of happy increased by 0.59%, the recognition rate of fear increased by 1.22%, and the error rate of mistaking fear for sad decreased by 0.66%. These results show that the proposed algorithm improved the recognition accuracy not only of easily distinguishable categories but also of categories that are hard to distinguish.
Table 34.3 The facial expression recognition results of sparse representation classification via deep
learning features
Angry Disgust Fear Happy Sad Surprised Neutral
Angry 59.42 0.69 9.97 5.27 13.73 2.43 8.49
Disgust 11.98 74.36 3.21 2.12 2.14 3.85 2.34
Fear 14.42 0.61 53.65 4.14 10.27 6.63 10.28
Happy 2.13 0 2.06 90.43 1.79 1.32 2.27
Sad 8.28 0.53 5.78 3.16 55.96 1.75 24.54
Surprised 2.39 0.41 6.68 4.02 1.37 82.67 2.46
Neutral 5.13 0.37 3.87 3.98 10.18 1.69 74.78
Table 34.4 The facial expression recognition results of L1 regularized sparse representation classification via deep learning features
Angry Disgust Fear Happy Sad Surprised Neutral
Angry 59.84 0.67 9.95 5.23 13.54 2.41 8.36
Disgust 11.72 74.71 3.19 2.12 2.11 3.85 2.3
Fear 14.17 0.6 54.13 4.14 10.13 6.59 10.24
Happy 2.07 0 2.03 90.67 1.72 1.26 2.25
Sad 8.19 0.54 5.73 3.08 56.52 1.75 24.19
Surprised 2.28 0.39 6.56 3.97 1.39 83.04 2.37
Neutral 5.06 0.33 3.78 3.94 9.86 1.71 75.32
Table 34.5 shows the recognition accuracy of different classifiers on the FER2013 dataset. By changing the classifier, the recognition rate is increased by 0.69% and 0.42%, respectively, when using the sparse representation classifier and the L1 sparse representation classifier. With the L1 algorithm, the average facial expression recognition rate reaches 70.60%.
To verify the validity of the proposed method, it is compared with other facial expression recognition methods. Table 34.6 compares the recognition rates of the proposed system and other algorithms on the FER2013 database.
As can be seen from Table 34.6, the proposed algorithm has an advantage in recognition rate on the FER2013 dataset. DNNRL [11] improves local feature recognition through Inception layers and updates the model according to sample difficulty. The FC3072 algorithm of [12] uses a fully connected layer with 3072 units, which requires a great deal of computation. The algorithm proposed in this paper instead uses a sparse representation classifier: the features obtained by deep learning have linear subspace characteristics, and classifiers based on sparse representation have outstanding advantages on small-sample problems.
34.5 Conclusion
This paper improves the semi-supervised deep learning algorithm and introduces a regularization term into sparse representation classification. The experimental results show that introducing regularization improves the facial expression recognition rate. Future work will further study and analyze the characteristics of deep learning features; by improving the network structure and the loss function, the learned features should better satisfy the linear subspace constraint, and the recognition effect should improve further.
Acknowledgements This paper is supported by Jilin Provincial Education Department “13th five-
year” Science, Technology Project (No. JJKH20170571KJ), National Natural Science Foundation
of China under Grant 61873304, The Science & Technology Plan Project Changchun City under
Grant No. 17SS012, and the Industrial Innovation Special Funds Project of Jilin Province under
Grant No. 2018C038-2 & 2019C010.
References
1. Weston, J., Ratle, F., Mobahi, H., et al.: Deep learning via semi-supervised embedding. Neural
Networks: Tricks of the Trade, pp. 639–655. Springer, Berlin, Heidelberg (2012)
2. Lee, D.H.: Pseudo-label: the simple and efficient semi-supervised learning method for deep
neural networks. In: Workshop on Challenges in Representation Learning, ICML, vol. 3, p. 2
(2013)
3. Huang, K., Aviyente, S.: Sparse representation for signal classification. Advances in Neural
Information Processing Systems, pp. 609–616 (2007)
4. Lee, K.C., Ho, J., Kriegman, D.J.: Acquiring linear subspaces for face recognition under vari-
able lighting. IEEE Trans. Pattern Anal. Mach. Intell. 5, 684–698 (2005)
5. Wright, J., Yang, A.Y., Ganesh, A., et al.: Robust face recognition via sparse representation.
IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 210–227 (2009)
6. Yang, M., Zhang, L., Feng, X., et al.: Fisher discrimination dictionary learning for sparse repre-
sentation. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 543–550.
IEEE (2011)
7. Deng, W., Hu, J., Guo, J.: Extended SRC: undersampled face recognition via intraclass variant
dictionary. IEEE Trans. Pattern Anal. Mach. Intell. 34(9), 1864–1870 (2012)
8. Fan, Z., Ni, M., Zhu, Q., et al.: Weighted sparse representation for face recognition. Neuro-
computing 151, 304–309 (2015)
9. Guo, Y., Zhao, G., Pietikäinen, M.: Dynamic facial expression recognition with atlas construc-
tion and sparse representation. IEEE Trans. Image Process. 25(5), 1977–1992 (2016)
10. Goodfellow, I.J., Erhan, D., Carrier, P.L., et al.: Challenges in representation learning: a report on
three machine learning contest. In: International Conference on Neural Information Processing,
pp. 117–124. Springer, Berlin, Heidelberg (2013)
11. Guo, Y., Tao, D., Yu, J., et al.: Deep neural networks with relativity learning for facial expres-
sion recognition. In: 2016 IEEE International Conference on Multimedia & Expo Workshops
(ICMEW), pp. 1–6. IEEE (2016)
12. Kim, B.K., Roh, J., Dong, S.Y., et al.: Hierarchical committee of deep convolutional neural
networks for robust facial expression recognition. J Multimodal User Interfaces 10(2), 173–189
(2016)
Chapter 35
Face Recognition Based on Local Binary
Pattern Auto-correlogram
Abstract Face recognition mainly includes face feature extraction and recognition. Color is an important visual feature, and the color correlogram (CC) is commonly used as a feature descriptor in color-based image retrieval, but most existing CC-based methods suffer from high computational complexity and low retrieval accuracy. To address this problem, this paper proposes an image retrieval algorithm based on the color auto-correlogram (CAC). A new color feature vector describing the global and spatial distribution relations among different colors is obtained from the CC feature matrix, which reduces the computational complexity, and inter-feature normalization is applied in the CAC to enhance retrieval accuracy. The experimental results show that this integrated method reduces computational complexity and improves real-time response speed and retrieval accuracy.
35.1 Introduction
Face recognition has been widely used in many fields, and many face recognition algorithms have achieved encouraging performance. Face recognition mainly includes two parts: face feature extraction and recognition. Feature extraction maps face data from the original input space to a new feature space, extracting discriminative information such as size, location, and profile. Face recognition methods can be broadly classified into the following categories [1]: image-based methods [2], such as the integral projection method, the mosaic image method [3], and the symmetry analysis method; template-based methods, such as the deformable template method and the active contour model method [4]; and statistical learning-based methods, such as the eigenface method [5], the visual learning method [6], and the neural network method [7]. At present, face feature extraction methods fall into two categories: global features and local features. Global features represent complete structural information, such as the facial contour, skin color, and the overall layout of facial features. To extract such features, a linear subspace of the training set is constructed, and the image to be recognized is reproduced by projecting it onto this subspace. Typical subspace-based methods include principal component analysis, linear discriminant analysis, and independent component analysis. Local features are robust to changes in lighting conditions, expressions, and poses. To adapt to local changes, local feature methods train recognition parameters based on the geometric relationship between facial organs and feature parts. Local feature methods mainly include the Gabor transform [8], the local binary pattern (LBP) [9], and the histogram of oriented gradients (HOG). Gabor-based methods extract multi-direction, multi-scale information and are robust to lighting and expression changes, but the Gabor transform is computationally inefficient. LBP captures fine image details and has strong discriminative power, but it adapts poorly to random noise. Effective face recognition therefore does not rely on a single method but combines several methods organically: it maximizes the information obtained from the image itself and from a large number of samples, and fully exploits prior knowledge. To improve recognition accuracy, this paper proposes a face recognition algorithm based on the LBP auto-correlogram and SVM. After the LBP auto-correlogram texture feature is extracted from the original face image, it is used as the input of an SVM classifier. Experiments on the ORL and AR databases verify the validity of the proposed algorithm.
Fig. 35.1 Example of the basic LBP operator: the 3 × 3 neighborhood (51, 98, 198; 34, 80, 204; 67, 189, 251) is thresholded by the center value 80, giving the binary pattern (01111100)₂ = 124
Ojala et al. [10] introduced the LBP texture operator in 1996, originally working on a 3 × 3 neighborhood. Each of the eight neighboring pixels is thresholded by the value of the center pixel, and the resulting binary values are weighted by powers of two and summed to obtain the LBP code of the center pixel. Figure 35.1 shows an example of the LBP operator. Formally, let g_c and g_0, …, g_7 denote the gray values of the center pixel and its eight neighbors, respectively. The LBP code of the center pixel at coordinate (x, y) is calculated by (35.1)

LBP(x, y) = Σ_{p=0}^{7} s(g_p − g_c) · 2^p        (35.1)

where the threshold function s is defined as

s(x) = 1 if x ≥ 0, and s(x) = 0 otherwise        (35.2)
In practical tasks, the statistical representation of LBP codes, the LBP histogram (LBPH), is usually used. That is, the LBP codes of all pixels of an input image are collected into a histogram as a texture descriptor, i.e.,

LBPH(i) = Σ_{x,y} δ{i, LBP(x, y)},    i = 0, …, 2⁸ − 1        (35.3)
For the computation of the LBPH, uniform patterns are used such that each uniform pattern has an individual bin and all nonuniform patterns are assigned to a single separate bin. With 8 neighbors, the number of bins is therefore 256 for the standard LBPH and 59 for the uniform-pattern LBPH; with 16 neighbors, the numbers of bins are 65,536 and 243, respectively. Clearly, uniform patterns reduce the length of the histogram vectors [12].
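A minimal Python sketch of Eqs. (35.1) and (35.3) (function names and the clockwise, most-significant-bit-first neighbor order are illustrative choices; the bit order is a convention and is picked here so the example reproduces the code 124 of Fig. 35.1):

```python
import numpy as np

def lbp_code(patch):
    """LBP code of the center pixel of a 3x3 patch (Eq. 35.1):
    each neighbor is thresholded against the center and the
    resulting bits are weighted by powers of two."""
    center = patch[1, 1]
    # clockwise neighbor order starting at the top-left corner
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    # first neighbor takes the most significant bit
    return sum(int(g >= center) << (7 - p) for p, g in enumerate(neighbors))

def lbp_histogram(image):
    """Standard 256-bin LBP histogram of a grayscale image (Eq. 35.3)."""
    h, w = image.shape
    hist = np.zeros(256, dtype=np.int64)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            hist[lbp_code(image[y - 1:y + 2, x - 1:x + 2])] += 1
    return hist

# the example neighborhood of Fig. 35.1
patch = np.array([[51, 98, 198],
                  [34, 80, 204],
                  [67, 189, 251]])
print(lbp_code(patch))  # 124
```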
The support vector machine (SVM) minimizes both the empirical risk and the confidence interval by seeking the minimum structural risk, which gives it good generalization ability. The basic idea of SVM is to map the data into a high-dimensional space and then build the optimal separating hyperplane in the new space. In this paper, the radial basis kernel function is selected:

K(x, y) = exp(−r ‖x − y‖²)        (35.4)
70% of the data were selected as training data, with the expert interpretation results in the database taken as classification labels; the training data and labels were input into the SVM classifier to obtain the classification model. The remaining 30% of the data were selected as testing data, with the expert interpretation results taken as test labels. The testing data were then input into the SVM classification model, the classification results were compared with the test labels, and the classification accuracy was calculated. Based on the above analysis, the face recognition process is designed as in Fig. 35.3.
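The 70/30 protocol described above can be sketched with scikit-learn's SVC and the RBF kernel of Eq. (35.4) (the library choice and the synthetic feature data are assumptions; the chapter does not name an implementation):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 59))      # e.g. uniform-pattern LBPH features
y = rng.integers(0, 2, size=200)    # expert interpretation labels

# 70% training / 30% testing split, as in the text
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0)

# RBF kernel K(x, y) = exp(-gamma * ||x - y||^2), Eq. (35.4)
clf = SVC(kernel="rbf", gamma="scale")
clf.fit(X_tr, y_tr)

# compare predictions against the test labels to get the accuracy
accuracy = (clf.predict(X_te) == y_te).mean()
print(f"classification accuracy: {accuracy:.2f}")
```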
The main steps of color face recognition based on color auto-correlogram and LBP
are as follows:
(1) Sample selection. Select the training sample image from the face database.
(2) Color auto-correlogram. Obtain the color auto-correlogram image set using the
calculation method in Sect. 2.1.
(3) LBP feature. First, each image in the training set is segmented into blocks of the same size, and the feature vectors of the training images are obtained using the calculation method in Sect. 2.2. Then, 2DPCA is used to reduce the dimension of the feature vectors. Finally, the dimension-reduced results are taken as a basis set of vectors, and the training images are projected onto this set of vectors to obtain the LBP features of the training set images.
(4) Face recognition. First, the color auto-correlogram histogram and LBP features
are integrated to obtain the final features of the training set image. Second, the
remaining images in the face database are taken as the testing set, and the final
features of the testing set images are obtained through the same steps as the
training set. Finally, SVM classifier is used for face recognition.
To evaluate the robustness of the algorithm, its recognition rate on color face recognition is compared while changing the proportion of the training set in the color face database. In the experiment, the proportion of the training set is set to 30%, 40%, 50%, 60%, and 70%, respectively. Color features are extracted from the color face image; the image is then converted to gray level, and grayscale texture features are acquired. Finally, the color features and grayscale texture features are combined by a proportional distribution method for color face image recognition. Through the experiment, we find that the choice of the proportional allocation parameters has a certain impact on the recognition accuracy of color face recognition. By reasonably allocating the weights of the color features and gray features and continually adjusting the proportional allocation, an optimal combination of color features and gray texture features can be realized, yielding higher recognition accuracy in color face image recognition, as shown in Table 35.1.
In the field of face recognition, texture features are a very important representation of the face, while color face recognition is inseparable from the representation of color features. According to the data in Table 35.1, the larger the proportion of color features, the lower the recognition accuracy. The reason is that the color feature cannot express the key facial information of the color face image; it only describes the distribution of colors and the spatial correlation between colors in the image. Table 35.1 also shows that when the proportional distribution of the color auto-correlogram and the LBP operator is 3:7, the accuracy of color face recognition is highest. We therefore set the proportional allocation parameter to 3:7 in the following experiments.
The data in Table 35.2 show that the combined algorithm is superior to the single algorithms.
35.4 Conclusion
In this paper, the application of the color auto-correlogram combined with the LBP method to color face recognition is presented. The color auto-correlogram expresses the color feature of the face image well, and the LBP method describes its texture feature well. Therefore, by combining the advantages of the color auto-correlogram and the LBP method, the color and texture features of a color face image can be extracted effectively and recognized by an SVM classifier. Experiments show that this method is suitable for color face image recognition and improves its accuracy.
Acknowledgements This work is supported by the Science and Technology Department Research
Project of Jilin Province (No. 20190302115GX).
References
1. Ling, X., Yang, J., Ye, C.: Face detection and recognition system in color image series. Acta Electronica Sinica 31(4), 544–547 (2003)
2. Moghaddam, B., Pentland, A.: Probabilistic visual learning for object representation. IEEE
Trans. PAMI 19(7), 696–710 (1997)
3. Schneiderman, H., Kanade, T.: Object detection using the statistics of parts. Int. J. Comput. Vis. 56(3), 151–177 (2004)
4. Huang, C.L., Chen, C.W.: Human facial feature extraction for face interpretation and recogni-
tion. Pattern Recognit. 25(12), 1435–1444 (1992)
5. Turk, M., Pentland, A.: Eigenfaces for recognition. J. Cogn. Neurosci. 3(1), 71–86 (1991)
6. Sung, K.-K., Poggio, T.: Example-based learning for view-based human face detection. IEEE Trans. PAMI 20(1), 39–50 (1998)
7. Hinton, G.E., Salakhutdinov, R.: Reducing the dimensionality of data with neural networks.
Science 313(9), 504–507 (2006)
8. Yoo, C.H., Kim, S.W., Jung, J.Y., et al.: High-dimensional feature extraction using bit-plane decomposition of local binary patterns for robust face recognition. J. Vis. Commun. Image Represent. 45(C), 11–19 (2017)
9. Zhao, Z., Jiao, L., Zhao, J., et al.: Discriminant deep belief network for high-resolution SAR
image classification. Pattern Recognit. 61, 686–701 (2017)
340 Z. Li et al.
10. Ojala, T., Pietikainen, M., Harwood, D.: A Comparative study of texture measures with clas-
sification based on feature distributions. Pattern Recognit. 29, 51–59 (1996)
11. Shen, X., Wang, X., Du, J.: Image retrieval algorithm based on color autocorrelogram and mutual information. Comput. Eng. 40(2), 259–262 (2014)
12. Ojala, T., Pietikainen, M., Maenpaa, T.: Multiresolution gray-scale and rotation invariant texture
classification with local binary patterns. IEEE Trans. PAMI 24(7), 971–987 (2002)
13. Huang, J., Kumar, S.R., Mitra, M., et al.: Spatial color indexing and applications. In: 6th International Conference on Computer Vision. IEEE Press, Bombay, India (1998)
14. http://www.anefian.com/research/face_reco.htm
Chapter 36
Saliency Detection Based
on the Integration of Global Contrast
and Superpixels
Abstract In the field of computer vision, the detection of salient objects is an important step and one of the preconditions for salient object extraction. The outcomes of some existing salient object detection methods differ considerably from the Ground Truth. In view of the shortcomings of existing methods, this paper proposes a saliency detection method based on the integration of global contrast and superpixels. The salience value of each pixel is measured according to the global contrast of the pixels in the image. A histogram optimization technique is used to highlight the low-contrast pixels of the salient region and to suppress the high-contrast pixels of the background. To improve the quality of the saliency map, superpixel image segmentation based on the K-means clustering algorithm is applied, and a more accurate saliency map is generated through integration with the superpixels. Experiments are performed on the public dataset MSRA10K. The results show that histogram optimization improves the contrast of the salient pixels and that integration with superpixels yields a better saliency map. Compared with other classical algorithms, the proposed method performs best.
36.1 Introduction
The human eye can quickly and accurately find the target object in a complex scene based on the degree of stimulation it exerts on the eyes. Saliency detection is mainly used to extract salient targets from digital images, to simulate human recognition of salient objects, and to identify the most attractive targets or features in natural images. Saliency detection has been one of the research hotspots of computer vision in recent years. How to enable computers to quickly and accurately extract valuable information from large image sets has become one of the challenges in the field.
In recent years, saliency detection has been widely used in many fields such as image segmentation, image compression, intelligent image retrieval, image matching, and target recognition. More and more detection methods have appeared, based variously on biology, information theory, the frequency domain, and contrast. Contrast-based detection methods further divide into global contrast and local contrast. However, the results generated by many saliency detection algorithms lack sufficient similarity to the Ground Truth. In this paper, the initial saliency map is obtained by a saliency detection method based on global contrast. A histogram optimization technique is then adopted to improve the display of the saliency map. Finally, superpixel image segmentation is integrated with the saliency map to generate the final one.
The paper is organized as follows. Section 36.2 describes related work. In
Sect. 36.3, we present a detailed description of our method, including global color
contrast, histogram optimization, and the integration with superpixels. Section 36.4
shows the experimental results. Section 36.5 gives the conclusion.
In the 1990s, experts and scholars began to study saliency detection and applied
saliency detection to biology. In the early stage, the methods of saliency detection
were relatively simple and had some noticeable errors. In recent years, many experts
and scholars have committed to the study of saliency detection and proposed a variety
of methods, and some of them are widely used in face recognition [1], image seg-
mentation [2], video fusion [3] and other fields. Many experts have proposed some
evaluation indicators and verification methods for the results of saliency detection
[4, 5]. The saliency detection model is divided into two categories: “bottom-up” and
“top-down”. The former is driven by data and does not require any prior knowledge;
the latter is task-driven and needs to rely on prior knowledge.
At the current stage, many scholars widely use bottom-up models for research. Wang et al. suggested that using global information for saliency detection is an effective method and adopted a new bottom-up model combined with multiscale global cues. Yun et al. proposed the LC algorithm using global pixel differences [6]. The HC algorithm proposed by Cheng et al. used color differences between global pixels to produce a saliency map [7]. Niu et al. [8] employed the K-means method to cluster images and proposed an improved clustering-and-fitting (CF) algorithm for saliency detection, which also achieved good results. Ishikura et al. [9] measured locally perceived color differences by multiscale extrema for saliency detection. Singh et al. [10] used global contrast together with local information to improve the color contrast of the overall image; this method has certain limitations when extracting saliency maps. Cuevas-Olvera et al. [11] integrated image information with superpixels to extract saliency maps, but this method did not use histograms for optimization, and the experimental results show considerable noise in the final saliency map. To the best of our knowledge, many saliency detection methods do not integrate the information inherent in the image well with histogram information and superpixels.
In this section, we introduce the methods and steps of our saliency detection. In the first stage, the salience value of each pixel is measured by calculating the global contrast of the pixels so that the salient object can be separated from its surroundings. For images with complex texture, errors may occur in the salience values calculated by global contrast; for instance, the pixel contrast of the salient region may be low while the pixel contrast of the background is high. Second, histogram optimization is performed on the saliency map to correct such unreasonable contrast distributions. Third, the original image is segmented by superpixels to form multiple pixel blocks with clear boundaries. Finally, the superpixel segmentation result is integrated with the histogram-optimized saliency map to generate the final saliency map.
Image contrast is one of the key factors affecting human visual perception. Color images are usually composed of multiple color channels, and computing global color contrast over all channels is time-consuming. Since the gray-contrast feature is sufficient to extract salient information, this paper adopts the gray channel for the global contrast calculation of the saliency value. Liu et al. [12] also adopted the gray channel when extracting the salient features of infrared images and achieved good experimental results.
In this paper, to calculate the global color contrast of the pixel Ic in image I, it
is necessary to traverse all the pixels and to calculate the sum of the color distances
344 Y. Huang et al.
Fig. 36.1 Original images (top) and saliency maps processed by global color contrast (bottom)
between I_c and all the other pixels. The global contrast of I_c can be regarded as the salience value of the pixel, recorded as S(I_c); the formula is as follows:

S(I_c) = Σ_{∀ I_i ∈ I} ‖I_c − I_i‖        (36.1)
Image I is a grayscale image, and the value of I_i lies between 0 and 255. The histogram is the statistical distribution of the image pixels and directly shows the count of each gray level in the image. Since all pixels of the same gray value have the same total distance to the rest of the image, the histogram is used to collect prior statistics on the image; its results are stored in an array, which improves the efficiency of computing the global contrast of the image. Formula (36.1) is then rewritten as:

S(a_m) = Σ_{n=0}^{255} f_n ‖a_m − a_n‖        (36.2)

where a_m is a gray level and f_n is the number of pixels with gray value a_n.
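The histogram trick of Eq. (36.2) can be sketched in numpy (function name is illustrative): each of the 256 gray levels is scored once against the histogram, and the per-pixel salience is then a simple lookup.

```python
import numpy as np

def global_contrast_saliency(gray):
    """Per-pixel global-contrast salience (Eqs. 36.1/36.2): precompute
    the gray-level histogram so each of the 256 levels is scored once
    instead of summing over every pixel pair."""
    freq = np.bincount(gray.ravel(), minlength=256)  # f_n in Eq. (36.2)
    levels = np.arange(256)
    # S(a_m) = sum_n f_n * |a_m - a_n|, one value per gray level
    level_salience = np.abs(levels[:, None] - levels[None, :]) @ freq
    return level_salience[gray]                      # look up per pixel

gray = np.array([[0, 0, 255],
                 [0, 128, 255]], dtype=np.uint8)
print(global_contrast_saliency(gray))
```

For an image with N pixels this costs O(256² + N) instead of the O(N²) of the pairwise formula.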
The result in Fig. 36.1 shows that processing the image with the method of Sect. 36.3.1 yields a high-resolution saliency map. However, some pixels in the salient regions have low contrast and some in the background have high contrast. To solve this problem, we propose a histogram optimization method that improves the overall display of the saliency map by enhancing the pixel contrast of the salient regions and lowering that of the background regions.
The processing result of Fig. 36.1 is displayed as a histogram in Fig. 36.2b. A large number of pixels fall in the range 0–50, and some pixels are distributed between 50 and 250 to varying degrees. In an ideal saliency map, the color values of the pixels should be concentrated near 0 or 255 after the salient object is extracted. We therefore optimize the histogram so that the salient regions are pushed as close as possible to the color value 255 and the background regions as close as possible to 0. We set two thresholds, minlevel and maxlevel, which indicate the minimum and maximum retained gray values in the saliency map, respectively. A pixel's value is changed to 0 when its gray value is at most minlevel and to 255 when its gray value is at least maxlevel; the color values in between are assigned by the region contrast, as shown in formula (36.3).
a_n′ = 0,                                                 if a_n ≤ minlevel
a_n′ = 255,                                               if a_n ≥ maxlevel        (36.3)
a_n′ = 255 · (a_n − minlevel)/(maxlevel − minlevel),      if minlevel < a_n < maxlevel

(the factor 255 rescales the stretched ratio back to the gray range)
We experimented on the MSRA1000 public dataset and achieved good results with minlevel = 85 and maxlevel = 170. The optimized histogram is shown in Fig. 36.2c.
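A sketch of the stretch in Eq. (36.3) with the reported thresholds (the function name is illustrative, and rescaling the middle band to [0, 255] is an assumption made so the output remains an 8-bit gray image):

```python
import numpy as np

# the thresholds reported for the MSRA1000 experiments
MINLEVEL, MAXLEVEL = 85, 170

def optimize_histogram(sal):
    """Eq. (36.3): push background gray values to 0, salient ones to
    255, and linearly stretch the levels in between."""
    sal = sal.astype(np.float64)
    out = (sal - MINLEVEL) / (MAXLEVEL - MINLEVEL) * 255.0
    out[sal <= MINLEVEL] = 0.0   # background band
    out[sal >= MAXLEVEL] = 255.0  # salient band
    return out.astype(np.uint8)

sal = np.array([[10, 85, 120, 170, 240]], dtype=np.uint8)
print(optimize_histogram(sal))  # [[  0   0 105 255 255]]
```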
In 2018, Niu et al. [13] used the simple linear iterative clustering (SLIC) algorithm, based on color similarity and spatial distance, to achieve superpixel segmentation in the process of salient object segmentation and obtained good results. The method converts the original image into the CIELAB color space and measures each pixel in a five-dimensional space formed by the l, a, b color channels and the two spatial coordinates (x, y).
Fig. 36.2 a Saliency map processed by global color contrast; b histogram before optimization; c optimized histogram; and d saliency map optimized by histogram
Set the number of superpixel blocks to be generated as k, and use the k-means clustering method to generate the superpixels. Set the cluster center C_k = [l_k, a_k, b_k, x_k, y_k]^T, and move each cluster center C_k to the lowest-gradient position in its 3 × 3 neighborhood to avoid the center falling on an edge.
For an image of w × h pixels, after superpixel segmentation the number of pixels in each region is w × h/k, and the side length of each superpixel is S ≈ √((w × h)/k), where w and h are the numbers of pixels along the width and height of the image, respectively. The spatial distance and color distance between a pixel and the cluster center are calculated for pixels within the 2S × 2S neighborhood of C_k, as shown in formulas (36.4)–(36.6).
d_c = √((l_k − l_i)² + (a_k − a_i)² + (b_k − b_i)²)        (36.4)

d_s = √((x_k − x_i)² + (y_k − y_i)²)        (36.5)

D = √(d_c² + (d_s/S)² · m²)        (36.6)
In formula (36.6), the threshold m adjusts the weight of d_s, and its value range is [1, 40]. Using formula (36.6), each pixel updates its region and the clustering centers, and these steps are iterated until the algorithm converges.
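Equations (36.4)–(36.6) can be combined into a single distance function; a minimal numpy sketch (function name and sample values are illustrative):

```python
import numpy as np

def slic_distance(c1, c2, S, m=10.0):
    """Combined SLIC distance of Eqs. (36.4)-(36.6) between two
    [l, a, b, x, y] vectors; S is the superpixel side length and m
    weights the spatial distance against the color distance."""
    c1, c2 = np.asarray(c1, float), np.asarray(c2, float)
    dc = np.linalg.norm(c1[:3] - c2[:3])  # CIELAB color distance, Eq. (36.4)
    ds = np.linalg.norm(c1[3:] - c2[3:])  # spatial distance, Eq. (36.5)
    return np.sqrt(dc ** 2 + (ds / S) ** 2 * m ** 2)  # Eq. (36.6)

center = [50.0, 10.0, 10.0, 100.0, 100.0]
pixel  = [52.0, 12.0, 11.0, 103.0, 104.0]
print(slic_distance(center, pixel, S=16, m=10.0))
```

A larger m makes the spatial term dominate, producing more compact, regular superpixels; a smaller m lets superpixels follow color boundaries more closely.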
In this algorithm, k = 400 is set. After the above processing, the original image is superpixel-segmented into k superpixels with obvious dividing lines at the edges of the salient object, which clearly separate the foreground from the background, as shown in Fig. 36.3b.
Fig. 36.3 a Saliency map optimized by histogram; b the original image after superpixel segmen-
tation; c our Saliency map; and d Ground Truth
During the histogram optimization of Sect. 36.3.2, the edges of the saliency map may be impaired, or pixels that originally belonged to the foreground may become background pixels, making the saliency map notably inaccurate. We therefore integrate the superpixel image with the histogram-optimized image and map the region range of each superpixel to the histogram. The salience value after integration is denoted S̄, as shown in (36.7).
S̄ = (G × G′)/255        (36.7)

where G is the average gray value of the block in the saliency map optimized by the histogram, with value range [0, 255], and G′ is the average gray value of the corresponding region of the superpixel map, also with range [0, 255]. If the average gray value of a region of the optimized histogram is 0, then S̄ = 0; from formula (36.7), the value range of S̄ is [0, 255]. We set a threshold δ: if S̄ is smaller than δ, the superpixel region is a background region; if S̄ is larger than δ, it is a salient region. For a salient region, the gray values of the pixels of that region in Fig. 36.3a are updated to 255; otherwise they are updated to 0. The final result is shown in Fig. 36.3c.
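The integration step of Eq. (36.7) can be sketched as follows (the function name, the use of the grayscale image for G′, and the value of δ are assumptions; the text does not fix δ):

```python
import numpy as np

def integrate_superpixels(sal_map, gray, labels, delta=64):
    """Eq. (36.7): per superpixel region, S_bar = G * G' / 255, where G
    is the region's mean gray value in the histogram-optimized saliency
    map and G' its mean gray value in the superpixel image; regions with
    S_bar > delta are kept as salient (255), the rest dropped (0)."""
    out = np.zeros_like(sal_map, dtype=np.uint8)
    for lab in np.unique(labels):
        mask = labels == lab
        g = sal_map[mask].mean()      # G  in [0, 255]
        g_prime = gray[mask].mean()   # G' in [0, 255]
        s_bar = g * g_prime / 255.0   # S_bar in [0, 255]
        out[mask] = 255 if s_bar > delta else 0
    return out

# toy 1x4 image split into two superpixels
sal_map = np.array([[255, 255, 0, 0]], dtype=np.uint8)
gray    = np.array([[200, 180, 90, 80]], dtype=np.uint8)
labels  = np.array([[0, 0, 1, 1]])
print(integrate_superpixels(sal_map, gray, labels))
```

Because the decision is made per region rather than per pixel, isolated misclassified pixels inside a superpixel are absorbed into the region's majority, which is what repairs the edges damaged by the histogram stretch.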
The experiment was performed on the Windows 10 operating system with an Intel(R) Core(TM) i5-7400 processor and 8 GB of memory. The algorithm was implemented in the Python programming language.
At present, there are many evaluation metrics for saliency detection, and some scholars have even proposed their own. To compare with typical saliency detection algorithms, we use precision-recall (PR) curves to evaluate the saliency maps. When calculating the PR curve, adaptive-threshold binarization of the saliency map is employed; the ordinate and abscissa are the precision and the recall rate, respectively. The PR curve is obtained by comparing the saliency map with the Ground Truth, as shown in Fig. 36.4. The experimental results show that the proposed method achieves higher accuracy than the other algorithms.
36.5 Conclusions
Acknowledgements This work is supported by the 2018 Program for Outstanding Young Scientific
Researcher in Fujian Province University, Education and Scientific Research Project for Middle-
aged and Young Teachers in Fujian Province (No: JZ170367).
References
1. Karczmarek, P., et al.: A study in facial features saliency in face recognition: an analytic hierarchy process approach. Soft. Comput. 21(24), 7503–7517 (2017)
2. Hui, B., et al.: Accurate image segmentation using Gaussian mixture model with saliency map. Pattern Anal. Appl. 2, 1–10 (2018)
3. Huang, Y.: Simulation of parallel fusion method for multi-feature in double channel video image. Comput. Simul. 35(4), 154–157 (2018)
4. Niu, Y., Chen, J., Guo, W.: Meta-metric for saliency detection evaluation metrics based on application preference. Multimed. Tools Appl. 4, 1–19 (2018)
5. Xue, X., Wang, Y.: Using memetic algorithm for instance coreference resolution. IEEE Trans. Knowl. Data Eng. 28(2), 580–591 (2016)
6. Yun, Z., Shah, M.: Visual attention detection in video sequences using spatiotemporal cues. In: ACM International Conference on Multimedia (2006)
7. Cheng, M.M., et al.: Global contrast based salient region detection. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 569–582 (2015)
8. Niu, Y., Lin, W., Ke, X.: CF-based optimisation for saliency detection. IET Comput. Vis. 12(4), 365–376 (2018)
9. Ishikura, K., et al.: Saliency detection based on multiscale extrema of local perceptual color differences. IEEE Trans. Image Process. 27(2), 703 (2018)
10. Singh, A., Yadav, S., Singh, N.: Contrast enhancement and brightness preservation using global-local image enhancement techniques. In: Fourth International Conference on Parallel (2017)
11. Cuevas-Olvera, M., et al.: Salient object detection in digital images based on superpixels and intrinsic features. IEEE (2018)
12. Liu, S., Jiang, N., Liu, Z.: Saliency detection of infrared image based on region covariance and global feature. J. Syst. Eng. Electron. 29(3), 483–490 (2018)
13. Niu, Y., Su, C., Guo, W.: Salient object segmentation based on superpixel and background connectivity prior. IEEE Access 6, 56170–56183 (2018)
Chapter 37
Mosaic Removal Algorithm Based
on Improved Generative Adversarial
Networks Model
37.1 Introduction
Removing an overall mosaic from an image is a challenging problem. Early mosaic removal algorithms include nearest-neighbor interpolation, bilinear interpolation, and cubic spline interpolation. Such algorithms are simple and fast, but images recovered by interpolation are not ideal; the distortion is especially obvious at image edges, where edge blurring and color diffusion occur. Later demosaicing algorithms mainly include the VCD (variance of color difference) algorithm published by Chung et al. [1] in 2006, the SA (successive approximation) algorithm published by Li [2] in 2005, and the DLMMSE (directional linear minimum mean-square-error estimation) algorithm published by Zhang et al. [3] in 2005. These demosaicing algorithms all utilize the correlation between an interpolated pixel and its neighboring pixels, but for fully or deeply mosaicked images their removal effect is not obvious. With the development of deep learning, more and more fields have introduced this method, such as deep kernel learning [4] and control approaches [5]. This paper attempts to use unsupervised generative adversarial networks to restore deeply mosaic-processed face photos. Building on the convolutional neural network (CNN), which can acquire deep image features, and on generative adversarial networks, which can generate realistic HD faces, this paper constructs a new Generative Adversarial Networks (GANs) model based on deep learning.
Fig. 37.1 Full convolution flowchart of the new GANs Generator network
Facial repair via generative networks was proposed by Li et al. [11] in 2017. We used the DCGAN network in the discriminant-loss calculation and the WGAN optimizer in the model optimization, which achieved good results. However, the new GANs network was very unstable and prone to a diverging generation loss and a negative discriminant loss. This paper incorporates the generation-loss calculation of the PixelCNN model, after which the model gradually becomes stable.
Due to the significant modification of the generation-loss calculation, the generation model of the new GANs removes the Gaussian-distribution input of traditional GANs. The generator directly takes 16 mosaic-processed photos of size 64 × 64, i.e., a 16 × 64 × 64 × 3 tensor, whose dimensions represent the number of pictures, picture height, picture width, and image channels. The tensor is enlarged to 16 × 128 × 128 × 3 before the full convolution operation begins. Each convolution kernel in Fig. 37.1 has four parameters: the height of the kernel, the width of the kernel, the number of image channels, and the number of kernels; the value of convolution kernel 1 is 8 × 8 × 3 × 256. In the convolution, the stride of each layer is 1 × 1 × 1 × 1. Since the stride is 1, the padding layer does not participate in the calculation. The generator model structure is shown in Fig. 37.1.
The convolution formula is defined as follows (out_height and out_width denote the convolution output height and width):

out_height = (in_height + 2 × padding − kernel_height)/stride + 1
out_width = (in_width + 2 × padding − kernel_width)/stride + 1

According to the parameter values in Table 37.1, combined with the convolution formula, the output height and width can be calculated: (128 + 2 × 0 − 8)/1 + 1 = 121. The number of input images in the first layer is 16 and the number of convolution kernels is 256, so the first convolution output is 16 × 121 × 121 × 256. The outputs of the other layers can be determined from the convolution kernels of Fig. 37.1 and the convolution formula.
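The output-size calculation above can be checked with a small helper (name is illustrative):

```python
def conv_output_size(in_size, kernel, stride=1, padding=0):
    """Standard convolution output-size formula used in the text:
    out = (in + 2*padding - kernel) // stride + 1."""
    return (in_size + 2 * padding - kernel) // stride + 1

# first generator layer: 128x128 input, 8x8 kernel, stride 1, no padding
print(conv_output_size(128, 8))   # 121, matching (128 + 2*0 - 8)/1 + 1
```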
The Discriminator network has the same structure as the Generator network; its input is the output of the Generator network, and its output is the discrimination result for the generated image.
According to Ref. [8], the main improvement of WGAN over the original GANs is that the generation-loss and discriminant-loss calculations do not take the logarithm. The generation loss of WGAN is calculated as shown in formula (37.3), where X represents the output of the discriminant model. For the mosaic restoration in this paper, however, Eq. (37.3) performs poorly.
To this end, this paper studies the loss calculation of PixelCNN network, adjusts the
calculation method of generating loss, and puts the calculation focus on the distance
between the generated model output and the learning target feature, as shown in the
following formula (37.4). In this paper, there are three parameter inputs for the loss,
namely the discriminant model output (defined as d_out), the generated model output
(defined as g_out), and the learning target feature (defined as t_feature).
$$L_{loss}(X) = \frac{1}{16 \times 16 \times 16 \times 3} \sum_{s=1}^{16} \sum_{i=1}^{16} \sum_{j=1}^{16} \sum_{k=1}^{3} \left( -x^{s}_{i,j,k} \right) \qquad (37.3)$$

$$L1_{loss}(X, Y) = \frac{1}{16 \times 16 \times 16 \times 3} \sum_{s=1}^{16} \sum_{i=1}^{16} \sum_{j=1}^{16} \sum_{k=1}^{3} \left| x^{s}_{i,j,k} - y^{s}_{i,j,k} \right| \qquad (37.4)$$
$$C(y, a) = -\frac{1}{n} \sum_{x} \left[ y \ln a + (1 - y) \ln(1 - a) \right] \qquad (37.5)$$
Although Ref. [8] recommends avoiding logarithms, experiments show that cross entropy works very well for mosaic restoration, as shown in Eq. (37.5).
37 Mosaic Removal Algorithm Based on Improved Generative … 355
In this paper, the inputs X, Y of the first part of the generation loss, L1loss, are, respectively, g_out and t_feature defined above. The inputs a, y of the second part, the cross entropy, are, respectively, d_out defined above and the 1-valued tensor with the shape of d_out (defined as d_out_one). Finally, our generation loss gene_loss is defined as follows:
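A NumPy sketch of this generation loss, under the assumption that gene_loss is simply the sum of the two parts described, i.e., the L1 term of Eq. (37.4) plus the cross-entropy term of Eq. (37.5) with the all-ones target d_out_one:

```python
import numpy as np

def l1_loss(x, y):
    # Eq. (37.4): mean absolute difference over the whole tensor.
    return np.mean(np.abs(x - y))

def cross_entropy(a, y, eps=1e-8):
    # Eq. (37.5): C(y, a) = -mean(y ln a + (1 - y) ln(1 - a)).
    a = np.clip(a, eps, 1.0 - eps)
    return -np.mean(y * np.log(a) + (1.0 - y) * np.log(1.0 - a))

def gene_loss(g_out, t_feature, d_out):
    # Assumed combination: L1 distance between generator output and target
    # features, plus cross entropy pushing d_out toward the all-ones target.
    d_out_one = np.ones_like(d_out)
    return l1_loss(g_out, t_feature) + cross_entropy(d_out, d_out_one)
```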
Although Ref. [8] recommends avoiding logarithms, experiments show that cross entropy also works well for the discriminant loss in mosaic restoration. This retains a distinctive feature of the DCGAN network. The discriminant loss in this paper is equal to the generated-image loss (defined as f_loss) minus the real-image loss (defined as t_loss), which mirrors the improvement made to the generation loss. The real-image loss is the average cross entropy of the real feature image (defined as t_feature) at the output of the discriminant model (defined as t_out); its inputs a, y are, respectively, t_out and the 1-valued tensor with the shape of t_out (defined as t_out_one). The generated-image loss is the average cross entropy of the generated result (defined as g_out) through the discriminant model output (defined as d_out); its inputs a, y are, respectively, d_out and the 1-valued tensor with the shape of d_out (defined as d_out_one). The final discriminator network loss d_loss is defined as follows:
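Analogously, a sketch of the discriminant loss d_loss = f_loss − t_loss, assuming both cross-entropy terms use the 1-valued targets described in the text:

```python
import numpy as np

def cross_entropy(a, y, eps=1e-8):
    # Eq. (37.5): C(y, a) = -mean(y ln a + (1 - y) ln(1 - a)).
    a = np.clip(a, eps, 1.0 - eps)
    return -np.mean(y * np.log(a) + (1.0 - y) * np.log(1.0 - a))

def d_loss(t_out, d_out):
    # Real-image loss: cross entropy of the discriminator output on real
    # features against the all-ones target (t_out_one).
    t_loss = cross_entropy(t_out, np.ones_like(t_out))
    # Generated-image loss: cross entropy of the discriminator output on
    # generated results against the all-ones target (d_out_one).
    f_loss = cross_entropy(d_out, np.ones_like(d_out))
    # Discriminant loss = generated-image loss - real-image loss.
    return f_loss - t_loss
```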
In order to minimize the generation loss and the discriminant loss, optimizers are needed to optimize the weight parameters; the generator and the discriminator each use a separate optimizer.
The Adam optimization algorithm proposed by Kingma et al. [12] in 2014 is used
to optimize the gradient direction of the generated model, and then use this gradient
to minimize the next loss by updating the weight parameter values. According to
the characteristics of the DCGAN network, the fixed learning rate is 0.0002. The
RmsProp optimization algorithm proposed by Hinton et al. [13] in 2012 is used to
optimize the gradient direction of the discriminant model, and then the gradient is
used to minimize the next loss by updating the weight parameter values. According
to the characteristics of WGAN network, the fixed learning rate is 0.0002 in the
experiment, and other parameters are default.
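The two update rules can be sketched for a single weight (a minimal illustration of one Adam step and one RMSProp step with the fixed learning rate 0.0002; the remaining hyperparameters shown are the commonly used defaults, not values taken from the paper):

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.0002, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update (Kingma et al. [12]) with the fixed lr = 0.0002."""
    m = b1 * m + (1 - b1) * g            # first-moment estimate
    v = b2 * v + (1 - b2) * g * g        # second-moment estimate
    m_hat = m / (1 - b1 ** t)            # bias correction
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def rmsprop_step(w, g, s, lr=0.0002, rho=0.9, eps=1e-8):
    """One RMSProp update (Hinton et al. [13]) with the fixed lr = 0.0002."""
    s = rho * s + (1 - rho) * g * g      # running average of squared gradient
    return w - lr * g / (np.sqrt(s) + eps), s
```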
Although the original WGAN paper optimizes both the generative model and the discriminative model with the RMSProp algorithm, for mosaic restoration the experimental results show that the generative model performs better with the Adam algorithm. The generative and discriminative models then minimize their loss values by updating the weight parameters. At the same
356 H. Wang et al.
time, the updated weight parameters are used for the next convolution operation. Thus, through the backpropagation algorithm, each cycle the gradients scaled by the learning rate update the model weights so that the model loss of the next cycle is minimized.
WGAN pointed out that in order to solve the GANs network crash problem,
each time the parameter of the discriminator is updated, its absolute value needs
to be truncated to no more than one constant. The constant in this paper is defined
as 0.008. Therefore, after the optimization of the model weight parameters in the previous step, a truncation step is added. The truncation algorithm is as follows: parameter values greater than 0.008 are set to 0.008, and values less than −0.008 are set to −0.008. This ensures the stability of the updates to a certain extent.
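The truncation (weight clipping) step can be sketched as:

```python
import numpy as np

C = 0.008  # clipping constant used in this paper

def clip_weights(w, c=C):
    # WGAN-style weight clipping: values above c become c,
    # values below -c become -c, everything else is unchanged.
    return np.clip(w, -c, c)

print(clip_weights(np.array([0.01, -0.02, 0.005])))
# [ 0.008 -0.008  0.005]
```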
We refer to the semi-supervised learning with ladder networks proposed by Rasmus et al. [14] in 2015, batch normalization proposed by Ioffe and Szegedy [15] in 2015 to accelerate deep network training by reducing internal covariate shift, and the discriminative unsupervised feature learning with exemplar convolutional neural networks by Dosovitskiy et al. [16] in 2015.
In this paper, the fixed learning rate is specified as 0.0002, 200,000 CELEBA pictures
are used for training features, and 16 pictures are randomly loaded in a single cycle.
Each training iteration first computes the generation loss and the discriminant loss through a forward pass, and then minimizes the losses and updates the weight parameters through gradient descent. Every 200 training iterations, the current weight parameters are used as a restoration model applied to the test feature image, and the generation result for the test picture is saved. Finally, 43 sets of training feature pictures are obtained, and
the definition of these feature pictures gradually becomes better as the training time
increases.
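The training procedure described above can be sketched as a skeleton loop (the callables are stand-ins for the actual model code, not the paper's implementation):

```python
def train(forward, backward, num_cycles, save_every=200, load_batch=None):
    """Training skeleton assumed from the text: each cycle loads 16 random
    images, computes the generation and discriminant losses by a forward
    pass, updates the weights by gradient descent, and saves the current
    result every `save_every` cycles."""
    saved = []
    for cycle in range(1, num_cycles + 1):
        batch = load_batch()                 # 16 randomly loaded pictures
        g_loss, d_loss = forward(batch)      # forward-pass loss computation
        backward(g_loss, d_loss)             # gradient-descent weight update
        if cycle % save_every == 0:
            saved.append(cycle)              # checkpoint this cycle's output
    return saved

# Dry run with stand-in callables: 600 cycles -> checkpoints at 200, 400, 600.
print(train(lambda b: (0.0, 0.0), lambda g, d: None, 600,
            load_batch=lambda: [None] * 16))
# [200, 400, 600]
```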
The complete structure of the new GANs is shown in Fig. 37.2.
The experimental operating system in this paper is Windows 10, 64-bit, and the implementation is based on the TensorFlow framework, version 0.10. The programming language is Python 3.5; the core extension MoviePy is version 0.2.2.11, NumPy is version 1.11.1, SciPy is version 0.18.0, and Six is version 1.10.0. The data set for face generation uses the
public face data set CELEBA, which has a total of 200,000 face photos of 178 ×
218 size.
The experimental learning rate of this group is 0.0002, 200,000 CELEBA images
are used for training features, and 16 images are randomly loaded in a single cycle.
The test picture is shown in Fig. 37.3 and has a size of 178 × 218. The result of the
mosaic restoration is shown in Fig. 37.4.
Before the experiment starts, the images are compressed to 64 × 64 and mosaics are added; training then begins. The goal of the experiment is to reduce the difference between the mosaic photos and the real features, and finally restore the mosaic photos. Each cycle first produces an output through the generative model, then passes it to the discriminative model for discrimination, and then uses the outputs of the generative and discriminative models to compute the generation loss and the discriminant loss. Finally, the weight parameters are optimized by the backpropagation algorithm to start the next cycle. The result is output directly every 200 cycles. The output of the 200th
Fig. 37.5 Comparison of the results of 200, 800, and 15,000 cycles
cycle is shown in Fig. 37.5a, and the effect is very poor. The output image of the
800th cycle is gradually improved as shown in Fig. 37.5b. The result of 15,000 cycles
is shown in Fig. 37.5c, which is basically close to the real face.
The stability of the entire experimental process can be seen from the loss curve shown in Fig. 37.6.
For overall mosaic restoration, the best previous results come from the pixel recursive super-resolution algorithm proposed in 2017 by Google Brain [17]. In Fig. 13, the right side is the real character avatar on a 32 × 32 grid, the left side is the same avatar compressed to an 8 × 8 grid, and the middle photo is the result of Google Brain's reconstruction based on the low-resolution samples.
This paper also draws on the principled methods for training generative adversarial networks proposed by Arjovsky and Bottou [18] in 2016. The final experimental comparison results are shown in Table 37.2.
37.5 Conclusion
In this paper, overall mosaic restoration using generative adversarial networks is studied, and the calculation method of the GANs generation loss is improved so that the target of the generative model can be controlled. At the same time, the deep-convolution characteristics of the DCGAN network and its discriminant-loss calculation method are introduced. Experiments first produce realistic restorations of the mosaic image. Second, the proposed improvement mitigates the instability of the WGAN network. Finally, the comparison results show that the proposed algorithm outperforms the existing algorithm.
Acknowledgements This work was supported by National Natural Science Foundation of China
(No. U1536121, 61370195).
References
1. Chung, K.H., Chan, Y.H.: Color demosaicing using variance of color differences. IEEE Trans.
Image Process. 15(10), 2944–2955 (2006)
2. Li, X.: Demosaicing by successive approximation. IEEE Trans. Image Process. A Publ. IEEE
Signal Process. Soc. 14(3), 370–379 (2005)
3. Zhang, L., Wu, X.: Color demosaicking via directional linear minimum mean square-error
estimation. IEEE Press (2005)
4. Chen, X., Peng, X., Li, J.-B., Peng, Yu.: Overview of deep kernel learning based techniques
and applications. J. Netw. Intell. 1(3), 83–98 (2016)
5. Xia, Y., Rong, H.: Fuzzy neural network based energy efficiencies control in the heating energy
supply system responding to the changes of user demands. J. Netw. Intell. 2(2), 186–194 (2017)
6. Goodfellow, I.J., Pougetabadie, J., Mirza, M., et al.: Generative adversarial nets. Adv. Neural.
Inf. Process. Syst. 3, 2672–2680 (2014)
7. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolu-
tional generative adversarial networks. Comput. Sci. (2015)
8. Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein GAN (2017). arXiv:1701.07875
9. Gulrajani, I., Ahmed, F., Arjovsky, M., et al.: Improved training of Wasserstein GANs (2017).
arXiv:1704.00028
10. Oord, A., Kalchbrenner, N., Vinyals, O., et al.: Conditional image generation with PixelCNN
decoders (2016). arXiv:1606.05328
11. Li, Y., Liu, S., Yang, J., et al.: Generative face completion (2017). arXiv:1704.05838
12. Kingma, D.P., Ba, J.L.: Adam: a method for stochastic optimization (2014). arXiv:1412.6980
13. Tieleman, T., Hinton, G.: Lecture 6.5—RmsProp: divide the gradient by a running average of
its recent magnitude. In: COURSERA: Neural Networks for Machine Learning (2012)
14. Rasmus, A., Valpola, H., Honkala, M., Berglund, M., Raiko, T.: Semi-supervised learning with ladder networks (2015). arXiv:1507.02672
15. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing
internal covariate shift (2015). arXiv:1502.03167
16. Dosovitskiy, A., Fischer, P., Springenberg, J.T., Riedmiller, M., Brox, T.: Discriminative unsu-
pervised feature learning with exemplar convolutional neural networks. IEEE Trans. Pattern
Anal. Mach. Intell. 99 (2015)
17. Dahl, R., Norouzi, M., Shlens, J.: Pixel recursive super resolution (2017). arXiv:1702.00783
18. Arjovsky, M., Bottou, L.: Towards principled methods for training generative adversarial net-
works. NIPS 2016 Workshop on Adversarial Training
Chapter 38
Xception-Based General Forensic
Method on Small-Size Images
38.1 Introduction
Since new editing operations are frequently developed and incorporated into editing software such as Photoshop and other popular mapping tools, image manipulations such as median filtering and contrast enhancement are often applied without authorization. These operations alter the inherent statistics of original natural images without changing their content. The use of image operations changes the style and information of the image itself, seriously affecting people's judgment of the truth. In this context, image manipulation detection is proposed to verify the authenticity of an image and to detect its processing history through analysis, making it an indispensable part of multimedia forensics.
In the early stage, most forensic algorithms were designed to detect a single targeted manipulation, so only binary classification was considered [1]. The inherent statistics of the original image change with the type of image operation, so most forensic methods work by detecting changes in certain inherent statistical attributes of the original image. Forensic methods based on this consideration have a significant drawback: they usually produce misleading results when an irrelevant classifier is used. Hence, forensic algorithms need to detect various image manipulations while maintaining high accuracy.
To address these issues, Li et al. [2] found that the powerful steganalysis features known as the Spatial Rich Model (SRM) [3] could be used to simultaneously identify multiple image operations, distinguishing 11 typical image processing operations. However, such traditional methods rely on difficult and time-consuming human analysis to design forensic detection features.
This issue was quickly addressed by using CNNs, which can learn features from images and perform classification automatically. However, forensic tasks differ from traditional computer vision tasks: classification tasks tend to extract features from image content, while forensic tasks tend to extract traces left by image operations, which have nothing to do with image content. Therefore, a traditional convolutional neural network is not directly applicable to image forensics problems. To solve this problem, a preprocessing layer is usually added before the neural network. Bayar et al. suppressed the content of the images with a constrained convolution layer and then classified the images with the Constrained CNN [4]. While CNNs provide a way to automatically learn the traces of image processing operations, most of the existing methods are no longer effective for small-size or highly compressed images. Recently, Tang et al. proposed the Magnified CNN to detect six image operations, especially for small-size images [5]. But for some operations, Tang's method is not very satisfactory.
In this paper, we continue to aim at detecting operations for small-size images and
are motivated by using magnified layer as preprocessing layer. Compared with the
current state of the art of image forensic methods, this paper contains the following
differences and new insights.
38 Xception-Based General Forensic Method … 363
On the one hand, the nearest-neighbor interpolation algorithm is extended as the magnification method to enlarge the differences between images after various operations; it preserves the properties of image operations better than other magnification tools.
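A minimal sketch of such a nearest-neighbor magnified layer (the magnification factor of 2 is our illustrative choice, not a value fixed by the text here):

```python
import numpy as np

def magnify_nearest(img, factor=2):
    """Nearest-neighbor magnification layer: each pixel is repeated `factor`
    times along height and width, enlarging the differences between operated
    images without introducing new pixel values."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

patch = np.array([[1, 2],
                  [3, 4]])
print(magnify_nearest(patch))
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```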
On the other hand, with the rapid development of deep learning, many classical
network structures have emerged [6, 7]. In order to improve the classification perfor-
mance of the network, we compared some typical frameworks such as Xception [8],
Densenet-121 [9], Resnet-50 [10], and Resnext-50 [11]. Based on extensive exper-
iments and analysis, Xception performed best in our comprehensive experimental
settings.
Xception is based on the depthwise separable convolution module. The network also uses residual connections [10] to reduce information loss. In the last pooling layer, we replaced global average pooling with an adaptive average pooling function so that the method can be applied to input pictures of any size. The results show that our proposed network achieves 97.71% accuracy on six different tampering operations when the image size is 64 × 64.
This paper is organized as follows: In Sect. 38.2, we present an overview of the
proposed architecture while Sect. 38.3 shows the results and performance compari-
son. Finally, Sect. 38.4 concludes our work.
Most image operations are carried out in local areas, so it is difficult to locate the operation position directly in a large image. To solve this issue, researchers can detect a large image block by block to locate the operation position in actual processing. Thus, the smaller the size of the detected block, the higher the final positioning accuracy will be. Here, we propose a general method to improve detection accuracy, especially on small-size images.
A CNN model can automatically extract features while iterating and updating its parameters, so it has become more and more popular in forensic methods. Convolutional neural networks usually contain convolution layers, pooling layers, and a classification layer. The convolution layer mainly performs feature extraction, capturing local dependencies between adjacent pixels and outputting feature maps. The pooling layer reduces the dimensionality of the features by fusing the features extracted by the convolution layer to obtain global information. Xception uses global average pooling to replace the traditional fully connected layers, and the resulting vector is fed directly into the softmax of the classification layer.
364 L. Yang et al.
Xception is a deep convolutional neural network structure inspired by Inception, in which the Inception module is replaced by the depthwise separable convolution module (Fig. 38.2). The depthwise separable convolution module can be divided into two parts: depthwise convolution and pointwise convolution.
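The two parts of the depthwise separable convolution module can be sketched naively in NumPy (depthwise spatial filtering per channel followed by a 1 × 1 pointwise channel mix; real implementations use optimized library kernels):

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Depthwise separable convolution, no padding, stride 1.
    x: (H, W, C) input; dw_kernels: (k, k, C) one spatial kernel per channel;
    pw_weights: (C, C_out) 1x1 pointwise mixing weights."""
    h, w, c = x.shape
    k = dw_kernels.shape[0]
    oh, ow = h - k + 1, w - k + 1
    # Part 1, depthwise: each input channel is convolved with its own kernel.
    dw = np.zeros((oh, ow, c))
    for ch in range(c):
        for i in range(oh):
            for j in range(ow):
                dw[i, j, ch] = np.sum(x[i:i + k, j:j + k, ch] * dw_kernels[:, :, ch])
    # Part 2, pointwise: a 1x1 convolution mixes information across channels.
    return dw @ pw_weights
```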
38.3 Experiment
Our database consists of 13,800 images, mainly taken from three widely used image databases: BOSSbase 1.01 [13], the UCID database [14], and the NRCS Photo Gallery database [15, 16]. The BOSSbase database contributes 10,000 images, the UCID database and the NRCS Photo Gallery database contribute 1338 images each, and [16] contributes 1124 natural images. Before
any further processing, we converted the images to grayscale. We test our proposed method as a multiclass detector with six types of image processing operations, as shown in Table 38.1.
Then, image blocks were cropped from the center of each full-resolution image with sizes 32 × 32 and 64 × 64, respectively. We randomly selected three-fifths of the images as the training set, one-fifth as the validation set, and the remaining one-fifth as the testing set. The image data were converted to grayscale, amplified by the magnification layer, and then input to the network. The proposed CNN model was implemented using PyTorch. All the experiments were run on two GeForce GTX Titan X GPU cards manufactured by Nvidia. The training parameters of stochastic gradient descent were set as follows: momentum = 0.9, weight decay = 0.0005; the learning rate was initialized to 0.1 and multiplied by 0.1 every 30 epochs. As training progressed, the learning rate decreased gradually, reducing the step length so that the optimization oscillates only slightly in a small range around the minimum and approaches it continuously.
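The step-decay schedule described above can be sketched as:

```python
def learning_rate(epoch, base_lr=0.1, gamma=0.1, step=30):
    """Step decay used in training: the learning rate starts at 0.1 and is
    multiplied by 0.1 every 30 epochs (momentum = 0.9 and weight decay =
    0.0005 are set separately in the optimizer)."""
    return base_lr * gamma ** (epoch // step)

# Over the 76-epoch run: epochs 0-29 -> 0.1, 30-59 -> 0.01, 60-75 -> 0.001.
print([learning_rate(e) for e in (0, 29, 30, 59, 60, 75)])
```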
In each experiment, we trained each CNN for 76 epochs, where an epoch is the total number of iterations needed to pass through all the data samples in the training set. Additionally, while training our CNNs, the testing accuracy on a separate testing dataset was recorded every epoch to produce the tables and figures in this section. The accuracies reported in the tables are the maximum accuracies achieved on the test dataset.
Table 38.2 Confusion matrix about the detection accuracy of our method with magnified layer;
the size of testing image size is 64 × 64 (%)
A/P CE GF5 JPEG70 MeaF5 MF5 ORG RES2
CE 89.57 0 0.04 0 0.07 10.07 0.25
GF5 0.04 99.64 0 0.22 0 0.07 0.04
JPEG70 0.07 0 99.86 0 0 0 0.07
MeaF5 0.04 0.11 0 99.67 0.14 0 0.04
MF5 0.07 0.18 0 0.54 99.17 0 0.04
ORG 3.70 0.07 0 0.04 0.11 96.05 0.04
RES2 0 0 0 0 0 0 100
Table 38.3 Confusion matrix about the detection accuracy of our method with magnified layer;
the size of testing image is 32 × 32 (%)
A/P CE GF5 JPEG70 MeaF5 MF5 ORG RES2
CE 86.56 0 0 0 0.43 12.17 0.83
GF5 0 98.91 0.04 0.69 0.22 0.11 0.04
JPEG70 0.07 0 99.89 0 0 0 0.04
MeaF5 0 0.36 0.04 98.84 0.69 0 0.07
MF5 0.25 0.65 0.04 1.49 97.17 0.36 0.04
ORG 10.29 0.11 0 0.07 1.05 88.44 0.04
RES2 0 0 0 0 0 0 100
Table 38.4 The detection average accuracy of our method, Magnified CNN, and Constrained CNN
Image size | Proposed network (magnified) | Proposed network (without magnified) | Magnified CNN | Constrained CNN
Here “MeaF5”, “CE”, “GF5”, “JPEG70”, “MF5”, and “RES2” denote mean filter-
ing, contrast enhancement, Gaussian filtering, JPEG compression, median filtering,
and up-sampling, respectively.
form. For a fair comparison, the magnified layer was added to all networks. The average detection accuracy is presented in Table 38.5; our proposed strategy significantly outperforms these traditional networks in terms of effectiveness.
38.4 Conclusion
Acknowledgements This work was supported in part by the National Key Research and Devel-
opment of China (2018YFC0807306), National NSF of China (61672090, 61532005), and Funda-
mental Research Funds for the Central Universities (2018JBZ001).
References
1. Stamm, M.C., Wu, M., Liu, K.J.R.: Information forensics: an overview of the first decade.
IEEE Access 1, 167–200 (2013)
2. Li, H., Luo, W., Qiu, X., Huang, J.: Identification of various image operations using residual-
based features. IEEE Trans. Circuits Syst. Video Technol. 1–1 (2016)
3. Fridrich, J., Kodovsky, J.: Rich models for steganalysis of digital images. IEEE Trans. Inf.
Forensics Secur. 7(3), 868–882 (2011)
4. Bayar, B., Stamm, M.C.: Constrained convolutional neural networks: a new approach towards
general purpose image manipulation detection. IEEE Trans. Inf. Forensics Secur. 1–1 (2018)
5. Tang, H., Ni, R., Zhao, Y., Li, X.: Detection of various image operations based on CNN. In: Asia-
Pacific Signal and Information Processing Association Summit and Conference, pp. 1479–1485
(2017)
6. Chen, X., Peng, X., Li, J., Peng, Y.: Overview of deep kernel learning based techniques and
applications. J. Netw. Intell. 1(3), 83–98 (2016)
7. Xia, Y., Hu, R.: Fuzzy neural network based energy efficiencies control in the heating energy
supply system responding to the changes of user demands. J. Netw. Intell. 2(2), 186–194 (2017)
8. Chollet, F.: Xception: Deep Learning with Depthwise Separable Convolutions, pp. 1800–1807
(2016)
9. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional
networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recogni-
tion, pp. 4700–4708 (2017)
10. He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition (2015)
11. Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep
neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, pp. 1492–1500 (2017)
12. Tang, H., Ni, R., Zhao, Y., Li, X.: Median filtering detection of small-size image based on
CNN. J. Vis. Commun. Image Represent. 51, 162–168 (2018)
13. Bas, P., Filler, T., Pevný, T.: “Break our steganographic system”: the ins and outs of organiz-
ing BOSS. In: International Workshop on Information Hiding, pp. 59–70. Springer, Berlin,
Heidelberg (2011)
14. Schaefer, G., Stich, M.: UCID: an uncompressed color image database. In: Storage and Retrieval
Methods and Applications for Multimedia, vol. 5307, pp. 472–481. International Society for
Optics and Photonics (2004)
15. http://photogallery.nrcs.usda.gov
16. Luo, W., Huang, J., Qiu, G.: JPEG error analysis and its applications to digital image forensics.
IEEE Trans. Inf. Forensics Secur. 5(3), 480–491 (2010)
Chapter 39
Depth Information Estimation-Based
DIBR 3D Image Hashing Using SIFT
Feature Points
Abstract Image hashing has been widely used for traditional 2D image authentication, content-based identification, and retrieval. Unlike the traditional 2D image system, a virtual image pair is generated from the center image according to the corresponding depth image in the DIBR process. In one of the communication models for the DIBR 3D image system, the content consumer side only receives the virtual images without performing the DIBR operation. In this way, only copies of the virtual image pairs can be distributed. This paper designs a novel DIBR 3D image hashing scheme based on depth information estimation using local feature points: the matched feature points in the virtual image pair are detected and divided into different groups according to the estimated depth information to generate the hash vector. As the experiments show, the proposed DIBR 3D image hashing is robust against most content-preserving operations.
C. Cui
School of Information Science and Technology, Heilongjiang University,
Harbin, Heilongjiang, China
e-mail: 2018012@hlju.edu.cn
S. Wang (B)
School of Computer Science and Technology, Harbin Institute of Technology, Harbin,
Heilongjiang, China
e-mail: shen.wang@hit.edu.cn
39.1 Introduction
39.2 Background
DIBR is a process generating the virtual images from the center image according to
its corresponding depth image [12]. As shown in Fig. 39.1, P represents a pixel in
the center image, Z is the depth value of P, f represents the focal length of the center
viewpoint, Cl and Cr are the left viewpoint and the right viewpoint, respectively. The
value of the baseline distance tx is consistent with the distance between the left and
right viewpoints. Formula 39.1 shows the geometric relationships of generating the
virtual image pair in the DIBR process.
$$x_l = x_c + \frac{t_x}{2} \cdot \frac{f}{Z}, \qquad x_r = x_c - \frac{t_x}{2} \cdot \frac{f}{Z}, \qquad d = x_l - x_r = \frac{t_x f}{Z} \qquad (39.1)$$
where $x_l$, $x_c$, and $x_r$ represent the x-coordinates of pixels in the left virtual image, the center image, and the right virtual image, respectively; $d$ represents the disparity between the left and right virtual images; and the value of $f$ is set to 1 without loss of generality.
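Formula (39.1) can be sketched directly:

```python
def dibr_projection(xc, Z, tx, f=1.0):
    """Virtual-pair x-coordinates and disparity from formula (39.1);
    f = 1 without loss of generality."""
    xl = xc + (tx / 2.0) * (f / Z)   # left virtual image coordinate
    xr = xc - (tx / 2.0) * (f / Z)   # right virtual image coordinate
    d = xl - xr                      # disparity, equals tx * f / Z
    return xl, xr, d

print(dibr_projection(10.0, 2.0, 4.0))
# (11.0, 9.0, 2.0)
```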
The proposed image hashing scheme consists of three steps. In the first step, virtual
image pair is generated from the original center image with a fixed baseline distance.
In the second step, the matched feature points extracted from virtual image pair are
374 C. Cui and S. Wang
Fig. 39.1 The relationship of pixel in left image, center image, and right image
divided into different groups according to the estimated depth information, which can be computed from formula (39.1) as:
$$Z = \frac{t_x f}{d} \qquad (39.2)$$
In the third step, the descriptors of matched feature points in different groups are
utilized to generate the final hash vector. The proposed image hashing scheme will
be illustrated in the following subsections.
As shown in Fig. 39.2, the matched feature point pairs are divided into L groups according to their estimated depth information. Let P represent the set of feature point pairs in the different groups:

$$P = \{ p_1, p_2, \ldots, p_L \} \qquad (39.3)$$
After computing the disparity, the matched feature point pairs can be divided into L groups, where $l_1 = \frac{d_{\max} - d_{\min}}{L}$ and $L = L_1 - L_2$; $d_{\max}$ and $d_{\min}$ represent the maximum and minimum disparity, respectively.
and max disparity, respectively.
– Step 2: Pseudorandom weights $\{a_k\}_{k=1}^{L}$ drawn from the normal distribution $N(u, \sigma^2)$ are generated with a secret key to ensure the security of the proposed image hashing. The vector length of each $a_k$ is 128, consistent with the dimension of the feature descriptor.
– Step 3: The image hash vector $H = \{h_k\}_{k=1}^{L}$ is generated by computing each component $h_k$ as

$$h_k = \sum_{p_i, p_j \in b(k)} \left( \langle a_k, d_{p_i} \rangle + \langle a_k, d_{p_j} \rangle \right) \qquad (39.6)$$

$$\langle a_k, d_{p_i} \rangle = \frac{1}{128} \sum_{m=1}^{128} a_k(m) \, d_{p_i}(m) \qquad (39.7)$$

$$\langle a_k, d_{p_j} \rangle = \frac{1}{128} \sum_{m=1}^{128} a_k(m) \, d_{p_j}(m) \qquad (39.8)$$
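Steps 2 and 3 can be sketched together in NumPy (the `key` argument stands in for the secret key; seeding a generator with it is our illustrative choice):

```python
import numpy as np

def generate_hash(groups, key, u=0.0, sigma=1.0):
    """Hash vector per Eqs. (39.6)-(39.8): keyed pseudorandom 128-dim weights
    a_k ~ N(u, sigma^2), one per group; h_k sums the (1/128)-scaled inner
    products of a_k with both descriptors of every pair in group k."""
    rng = np.random.default_rng(key)          # secret key seeds the weights
    L = len(groups)
    a = rng.normal(u, sigma, size=(L, 128))   # pseudorandom weights {a_k}
    h = np.zeros(L)
    for k, group in enumerate(groups):
        for d_pi, d_pj in group:
            # Eqs. (39.7)/(39.8): normalized inner products; Eq. (39.6): sum.
            h[k] += (a[k] @ d_pi + a[k] @ d_pj) / 128.0
    return h
```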
virtual images belong to, then the identification accuracy is finally calculated as
shown in Table 39.2.
Ideally, the virtual image pairs attacked by different kinds of content-preserving operations should still be correctly classified to the corresponding original center image, while two distinct pairs of virtual images should have different hash values. As the experiments show, the proposed DIBR 3D image hashing is robust against common signal distortion attacks such as JPEG compression, noise addition, and gamma correction.
39.5 Conclusion
In this paper, a novel DIBR 3D image hashing scheme has been proposed. The image hash is generated from the virtual image pair of the corresponding center image instead of from the center image directly. First, the SIFT algorithm is used to extract and select matched feature points of the virtual image pair. These feature points are then divided into different groups according to their estimated depth information, and the image hash is generated from the feature descriptors. As the experiments show, our DIBR 3D image hashing is robust against most signal distortion attacks, such as noise addition and JPEG compression. However, the proposed hashing still has limitations regarding geometric distortions such as rotation. Future work will focus on improving the robustness against geometric distortion attacks and on localizing tampered content in images.
Acknowledgements This work is supported by the National Natural Science Foundation of China
(Grant Number: 61702224).
Chapter 40
Improved Parity-Based Error Estimation Scheme in Quantum Key Distribution

H. Mao · Q. Li (B)
Information Countermeasure Technique Institute, School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150080, China
e-mail: qiongli@hit.edu.cn

40.1 Introduction
for a QKD system. Once the estimated QBER is beyond the given threshold, there
may exist an attacker, so-called Eve. Second, it can be predicted that few secure
keys will be obtained when the estimated QBER is too high. In such a case, unnec-
essary subsequent processing steps can be avoided. Third, error estimation affects
the performance of error correction which is often called reconciliation in QKD. For
instance, in Cascade reconciliation [7], the knowledge of QBER is helpful to set an
optimal block length that decreases the amount of information leakage. In LDPC rec-
onciliation [8, 9], an appropriate matrix and other optimum parameters can be chosen
with the help of estimated QBER, improving the efficiency and convergence speed of
reconciliation. Although blind reconciliation [10–12] and reconciliation-based error
estimation [13] have been proposed, the traditional error estimation before reconcil-
iation is still an essential stage. That is because the protocols above are more suitable
for stable QKD systems. However, the QBER of a practical QKD system might vary
significantly between two consecutive frames. In that case, the protocols without
prior error estimation are not effective.
The optimization target of error estimation is to improve the accuracy of QBER
estimation with minimum information leakage. In order to realize the target, some
improved methods have been proposed. An improved random sampling method was
proposed to improve the performance of QKD systems [14]. The connection between
sampling rate and erroneous judgment probability was analyzed first. Then the cal-
culating method of the optimal sampling rate was presented to maximize the final
secure key rate. The issue of how the sampling rate affected the final secure key
rate in a decoy state QKD was fully discussed. However, limited by the inherent
capacity of random sampling, the performance is not good enough. In order to fur-
ther improve error estimation performance, a Parity Comparison Method (PCM) was
proposed [15]. The parities of blocks were analyzed to estimate QBER instead of ran-
dom sampling. Simulation results showed that PCM outperformed random sampling
in most realistic scenarios. However, the calculating method of the optimal block
length, which is the key parameter of PCM, was insufficiently studied. In addition,
all blocks are sampled for error estimation, leaking too much information.
An improved parity-based error estimation scheme is proposed in this research.
The main contributions of our work are as follows. The optimal block length is
obtained through theoretical analysis. In addition, an effective error estimation
scheme is proposed. Simulation results show that the proposed scheme is able to
leak less information than random sampling with the same accuracy level.
The rest of the paper is organized as follows. The mathematical relationship among
parity error rate, QBER, and block length is described in Sect. 40.2. The theoretical
analysis of the optimal block length is presented in Sect. 40.3. The complete error
estimation scheme is detailed in Sect. 40.4. Finally, brief conclusions are provided
in Sect. 40.5.
In this section, the calculation formula of the QBER in Discrete-Variable QKD (DV-
QKD) is derived. For a DV-QKD system, the quantum channel can be viewed as
a Binary Symmetric Channel (BSC) whose error probability is the QBER. Hence, the
probability of a specific number of errors can be calculated by using the binomial
distribution [15].
Let e_parity be the parity error rate, L be the block length, n be the number of errors in a block, E_odd be the set of odd n, and e be the QBER. It is obvious that an odd number of errors in a block will lead to a parity error. Then e_parity can be calculated by using Eq. 40.1:

e_parity = \sum_{n \in E_odd} C_L^n e^n (1 - e)^{L-n}    (40.1)
Let x = 1 - e and y = e; then Eq. 40.4 can be obtained by combining Eqs. 40.2 and 40.3:

(x + y)^L = \sum_{n=0}^{L} C_L^n x^n y^{L-n}    (40.2)

(x - y)^L = \sum_{n=0}^{L} C_L^n x^n (-y)^{L-n}    (40.3)

e_parity = (1 - (1 - 2e)^L) / 2    (40.4)

The inverse function of Eq. 40.4 is presented in Eq. 40.5:

e = (1 - (1 - 2 e_parity)^{1/L}) / 2    (40.5)
It is obvious that the QBER can be calculated by using Eq. 40.5 with the statistical e_parity and the preset L. In particular, the QBER equals e_parity when L is 1, which indicates that random sampling is only a special case of PCM.
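Equations 40.4 and 40.5 are easy to check numerically. The following sketch (ours, not from the paper; the block length, QBER, and block count are illustrative) simulates a BSC and estimates the QBER from disclosed block parities:

```python
import random

def parity_error_rate(e, L):
    """Eq. 40.4: probability that a length-L block contains an odd number of errors."""
    return (1 - (1 - 2 * e) ** L) / 2

def qber_from_parity(e_parity, L):
    """Eq. 40.5: invert Eq. 40.4 to recover the QBER from the parity error rate."""
    return (1 - (1 - 2 * e_parity) ** (1 / L)) / 2

def estimate_qber(n_blocks, L, e, seed=0):
    """Simulate a BSC with error probability e and estimate e from block parities."""
    rng = random.Random(seed)
    odd = sum(
        sum(rng.random() < e for _ in range(L)) % 2
        for _ in range(n_blocks)
    )
    return qber_from_parity(odd / n_blocks, L)

e_hat = estimate_qber(n_blocks=20000, L=16, e=0.03)  # close to the true 3% QBER
```

With L = 1 the estimate reduces to plain random sampling, matching the remark above.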
A rough performance analysis of PCM is given here. Let N be the data size and α be the sampling rate. The amount of information leakage and the amount of involved data are both Nα when random sampling is applied. In PCM, however, the amount of involved data is LNα when the information leakage is the same as that of random sampling. The increased amount of involved data benefits error estimation, while the estimation accuracy within a block decreases with increasing L. Thus, there may exist an optimal L achieving the best overall performance of error estimation.
In this section, the calculating method of the optimal block length is proposed. The
calculation formula of the optimal block length is given first through theoretical
analysis and then verified with simulations.
In addition to the theoretical analysis, the relevant simulations have been carried out, and the results coincide with the theoretical deduction. The estimation efficiency f_est is defined in Eq. 40.9, where D_parity_based and D_random_sampling are the variances of the QBER estimated by the parity-based and random sampling methods respectively, and the mathematical expectation is the actual QBER. L_sl is the optimal block length obtained through simulation.

f_est = D_random_sampling / D_parity_based    (40.9)
The simulations are conducted with the data size being 1000, 100, and 10 kb for Simulations 1, 2, and 3 respectively, and each simulation is repeated 1000 times. Simulation results are
depicted in Table 40.1. As can be seen from the table, Ltheory decreases with increas-
ing QBER and drops to 1 when QBER is 25%. If QBER further increases, random
sampling instead of parity-based will be a better error estimation method. In addi-
tion, the simulation results show that fest decreases with the increasing QBER. Hence,
though the proposed estimation method is always effective in the QBER region of a
DV-QKD system, it is more suitable for low-QBER situations. Nowadays, the QBER
of DV-QKD systems is typically less than 3% [15]. Hence, the advantage of the pro-
posed method is obvious in this situation. In addition, Eq. 40.8 is deduced without
considering the effect of finite block length. As depicted in Table 40.1, Lsl is always
a little smaller than the corresponding theoretical result. Moreover, the length gap
becomes wider with the decreasing QBER. Thus, an adjustment factor α, which can
be fixed or adjustable with the varying QBER, is supplemented to narrow the gap.
The modified formula of the actual optimal length L_actual is depicted in Eq. 40.10:

L_actual = -1 / ln(1 - 2e) - α    (40.10)
Table 40.1 The optimal block lengths and the estimation efficiency obtained through theory and simulation for typical QBERs

QBER (%)   Ltheory   Simulation 1       Simulation 2       Simulation 3
                     Lsl     fest       Lsl     fest       Lsl     fest
2          24        20      8.8        18      9.4        16      8.9
3          16        15      5.9        14      6.0        14      5.3
5          9         8       3.3        7       3.7        7       3.3
8          5         4       2.2        4       2.2        4       2.1
25         1         1       ≈1.0       1       ≈1.0       1       ≈1.0
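Assuming the theoretical optimum is the integer part of -1/ln(1 - 2e) (i.e., Eq. 40.10 with the adjustment α = 0 — an assumption consistent with the table, not an equation stated explicitly in this excerpt), the Ltheory column of Table 40.1 can be reproduced in a few lines:

```python
import math

def optimal_block_length(e):
    """L_theory, taken here as the integer part of -1/ln(1 - 2e) (an assumption)."""
    return math.floor(-1.0 / math.log(1.0 - 2.0 * e))

lengths = [optimal_block_length(e) for e in (0.02, 0.03, 0.05, 0.08, 0.25)]
# lengths == [24, 16, 9, 5, 1], matching the Ltheory column of Table 40.1
```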
An efficient and convenient error estimation scheme based on the obtained optimal
block length is proposed in this section. The application scenarios of error estima-
tion are divided into three categories: blind, semi-blind, and non-blind. The blind
scenario indicates the QKD systems whose QBER are completely unknown to error
estimation. This situation usually occurs in the system debugging process. The QKD
systems with high fluctuation of QBERs are the typical examples of the semi-blind
scenario. In this scenario, the gain of estimation accuracy obtained from previous
(already corrected) frame is low. Most commonly used QKD systems belong to the
third category. The probability distribution of QBER is stable and known to error
estimation. Hence, a rough error estimation is sufficient, leaking only a small amount
of information. Since most blind QKD systems can be converted to non-blind ones
after several rounds of reconciliation, we concentrate only on semi-blind and non-blind systems.
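A plausible end-to-end flow for the semi-blind and non-blind cases is to pick the block length from the prior QBER and then estimate from parities. The sketch below follows that idea under our own assumptions about rounding and the fallback to random sampling; it is not the paper's exact scheme:

```python
import math
import random

def choose_block_length(prior_qber, alpha=0.0):
    """Pick L from a prior QBER estimate via Eq. 40.10; fall back to L = 1."""
    if prior_qber is None or prior_qber <= 0 or prior_qber >= 0.25:
        return 1                      # random sampling is preferable beyond 25% QBER
    return max(1, round(-1.0 / math.log(1.0 - 2.0 * prior_qber) - alpha))

def parity_estimate(alice, bob, L):
    """Disclose one parity per length-L block and invert Eq. 40.5."""
    n_blocks = len(alice) // L
    odd = sum(
        sum(a != b for a, b in zip(alice[i * L:(i + 1) * L],
                                   bob[i * L:(i + 1) * L])) % 2
        for i in range(n_blocks)
    )
    e_parity = odd / n_blocks
    return (1 - (1 - 2 * e_parity) ** (1 / L)) / 2

rng = random.Random(3)
alice = [rng.randrange(2) for _ in range(200_000)]
bob = [a ^ (rng.random() < 0.03) for a in alice]     # BSC with a true QBER of 3%

L = choose_block_length(prior_qber=0.03)             # non-blind: prior QBER known
qber_hat = parity_estimate(alice, bob, L)
```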
Fig. 40.1 Estimation efficiency (upper panel) and leakage ratio (lower panel) for Simulation 1, Simulation 2, and random sampling
40.5 Conclusions
Acknowledgements This work is supported by the Space Science and Technology Advance
Research Joint Funds (6141B06110105) and the National Natural Science Foundation of China
(Grant Number: 61771168).
References
1. Bennett, C.H., Brassard, G.: Quantum cryptography: public key distribution and coin tossing.
Theor. Comput. Sci. 560, 7–11 (2014)
2. Chen, C.M., Wang, K.H., Wu, T.Y., Wang, E.K.: On the security of a three-party authenticated
key agreement protocol based on chaotic maps. Data Sci. Pattern Recognit. 1(2), 1–10 (2017)
3. Pan, J.S., Lee, C.Y., Sghaier, A., Zeghid, M., Xie, J.: Novel systolization of subquadratic space
complexity multipliers based on Toeplitz matrix-vector product approach. IEEE Trans. Very
Large Scale Integr. (VLSI) Syst. (2019)
4. Renner, R.: Security of quantum key distribution. Int. J. Quantum Inf. 6(1), 1–127 (2008)
5. Li, Q., Yan, B.Z., Mao, H.K., Xue, X.F., Han, Q., Guo, H.: High-speed and adaptive FPGA-
based privacy amplification in quantum key distribution. IEEE Access 7, 21482–21490 (2019)
6. Li, Q., Le, D., Wu, X., Niu, X., Guo, H.: Efficient bit sifting scheme of post-processing in
quantum key distribution. Quantum Inf. Process. 14(10), 3785–3811 (2015)
7. Yan, H., Ren, T., Peng, X., Lin, X., Jiang, W., Liu, T., Guo, H.: Information reconciliation
protocol in quantum key distribution system. In: Fourth International Conference on Natural
Computation, ICNC’08, vol. 3, pp. 637–641. IEEE (2008)
8. Li, Q., Le, D., Mao, H., Niu, X., Liu, T., Guo, H.: Study on error reconciliation in quantum
key distribution. Quantum Inf. Comput. 14(13–14), 1117–1135 (2014)
9. Mao, H., Li, Q., Han, Q., Guo, H.: High throughput and low cost LDPC reconciliation for
quantum key distribution. arXiv:1903.10107 (2019)
10. Martinez-Mateo, J., Elkouss, D., Martin, V.: Blind reconciliation. Quantum Inf. Comput. 12(9–
10), 791–812 (2012)
11. Kiktenko, E., Trushechkin, A., Lim, C., Kurochkin, Y., Fedorov, A.: Symmetric blind infor-
mation reconciliation for quantum key distribution. Phys. Rev. Appl. 8(4), 044017 (2017)
12. Li, Q., Wen, X., Mao, H., Wen, X.: An improved multidimensional reconciliation algorithm
for continuous-variable quantum key distribution. Quantum Inf. Process. 18(1), 25 (2019)
13. Kiktenko, E., Malyshev, A., Bozhedarov, A., Pozhar, N., Anufriev, M., Fedorov, A.: Error
estimation at the information reconciliation stage of quantum key distribution. J. Russ. Laser
Res. 39(6), 558–567 (2018)
14. Lu, Z., Shi, J.H., Li, F.G.: Error rate estimation in quantum key distribution with finite resources.
Commun. Theor. Phys. 67(4), 360 (2017)
15. Mo, L., Patcharapong, T., Chun-Mei, Z., Zhen-Qiang, Y., Wei, C., Zheng-Fu, H.: Efficient error
estimation in quantum key distribution. Chin. Phys. B 24(1), 010302 (2015)
Chapter 41
An Internal Threat Detection Model
Based on Denoising Autoencoders
Abstract Internal user threat detection is an important research problem in the field
of system security. Existing approaches to analyzing the abnormal behaviors of users are divided into supervised learning methods (signature-based intrusion detection, SID) and unsupervised learning methods (anomaly detection, AD). However, supervised learning methods rely on domain knowledge and user background, which means they cannot detect previously unknown attacks and are not suitable for multi-detection-domain scenarios. Most existing AD methods apply clustering algorithms directly; however, for threat detection on internal users' behavior, which mostly involves high-dimensional cross-domain log files, there are, as far as we know, few methods that apply effective feature extraction to multi-domain audit log data. An effective feature extraction method can not only reduce the testing cost greatly but also detect the abnormal behavior of users more accurately. We propose a new unsupervised log
abnormal behavior detection method which is based on the denoising autoencoders
to encode the user log file, and adopts the integrated method to detect the abnormal
data after encoding. Compared with the traditional detection method, it can analyze
the abnormal information in the user behavior more effectively, thus playing a pre-
ventive role against internal threats. In addition, the method is completely data driven
and does not rely on relevant domain knowledge and user’s background attributes.
Experimental results verify the effectiveness of the integrated anomaly detection
method under the multi-domain detection scenario of user log files.
41.1 Introduction
Internal user threat detection is an important research problem in the field of system
security. In many recent security incidents, internal user attack has become one of
the main reasons [1]. Internal users usually refer to the internal personnel of an
organization. They are usually the users of information systems in the organization
such as government employees, enterprise employees, or the users of public services
such as the users of digital libraries, etc. [2, 3]. The records of the various activities of users or user processes in a computer system (also known as the user audit log) are an important basis for analyzing user behavior, such as command execution records and file search records. Therefore, we explore the anomaly detection of cross-domain log files.
A lot of work has been done to propose user behavior analysis methods for internal
threat detection. The existing internal threat detection and prediction algorithms are
divided into two types: (i) anomaly detection based on unsupervised learning (AD)
and (ii) signature-based intrusion detection (SID) [4]. However, supervised learning-
based SID methods can only detect known attacks [5]. Most existing AD methods
use the clustering algorithm directly, but for threat detection on internal users, as far
as we know, there are few methods of multi-domain audit log data with effective
feature extraction. An effective feature extraction method can not only reduce the testing cost greatly but also detect the abnormal behavior of users more accurately.
Therefore, we adopt the method based on deep learning to extract the progressive
features of high-dimensional cross-domain log files and then detect the abnormal
behaviors of users.
In this paper, the one-hot encoding that describes the user's multi-domain behavior is fed into the denoising autoencoder to train a low-dimensional vector representation.
Finally, we analyze the abnormal behavior of users based on unsupervised learning
technology. Traditionally, AD methods are bound to generate many false alarms [6]. Some studies suggest using intent models and other models [7], but these methods involve human intervention and expert experience. In our model, robust covariance
[8], OCSVM [9], isolation forest [10], and Local Outlier Factor [11] are used to
integrate with GMM to obtain the final results, which can effectively reduce the false
alarm rate while ensuring a high recall rate. Our final experimental results show that
with our method, the experimental recall rate reaches 89%, and the false alarm rate
is only 20%.
Our goal is to propose an internal threat detection and prediction algorithm. The
method in our model includes three main steps (as shown in Fig. 41.1).
In the data preprocessing based on the statistical method, the user’s multi-domain
behavior description is constructed. First, the normalized data characteristics of audit
logs of users in each domain are extracted respectively. After obtaining the single-domain behavior characteristics of users, we statistically combine all single-domain behavior descriptions of each user that fall within the same time window.
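As an illustration of this preprocessing step (the field names, domains, and window size are our assumptions, not taken from a specific dataset):

```python
from collections import defaultdict
from datetime import datetime

# Toy multi-domain audit events: (user, timestamp, domain).
events = [
    ("u1", "2010-01-04 09:12", "logon"),
    ("u1", "2010-01-04 09:40", "file"),
    ("u1", "2010-01-04 11:05", "http"),
    ("u2", "2010-01-04 09:01", "logon"),
]
DOMAINS = ["logon", "file", "http", "email", "device"]
WINDOW_HOURS = 4                                  # assumed time-window size

def window_key(ts):
    """Map a timestamp onto its (day, window index) bucket."""
    t = datetime.strptime(ts, "%Y-%m-%d %H:%M")
    return (t.date(), t.hour // WINDOW_HOURS)

# Combine the single-domain counts of the same user inside one time window.
features = defaultdict(lambda: [0] * len(DOMAINS))
for user, ts, domain in events:
    features[(user, window_key(ts))][DOMAINS.index(domain)] += 1

# Each value is a per-user, per-window multi-domain behavior description.
vec = features[("u1", window_key("2010-01-04 09:00"))]   # the 08:00-11:59 window
```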
Usually, we can minimize the negative log-likelihood -log p_decoder(x | h) with a gradient-based method, such as mini-batch gradient descent. As long as the encoder is deterministic, the denoising autoencoder is a feedforward network and can be trained in exactly the same way as other feedforward networks. Therefore, we can regard the DAE as performing stochastic gradient descent on the following expectation:

-E_{x ~ \hat{p}_data(x)} E_{x̃ ~ C(x̃|x)} log p_decoder(x | h = f(x̃)),

where \hat{p}_data(x) is the training distribution, C(x̃|x) is the corruption process, and f is the encoder.
The number of parameters of the clustering model can be chosen with the Bayesian information criterion, BIC = k ln(n) - 2 ln(L), where k is the number of model parameters, L is the likelihood function, and n is the number of samples. When training the model, increasing the number of parameters increases the likelihood, but the k ln(n) term penalizes the added complexity.
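A minimal numpy sketch of such a denoising autoencoder, trained by batch gradient descent (the tied weights, layer sizes, corruption rate, and learning rate are our illustrative choices, not the paper's four-layer architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class DenoisingAutoencoder:
    """Single-hidden-layer DAE with tied weights and squared-error reconstruction loss."""

    def __init__(self, n_vis, n_hid, noise=0.3, lr=0.5):
        self.W = rng.normal(0.0, 0.1, (n_vis, n_hid))   # tied encoder/decoder weights
        self.b_h = np.zeros(n_hid)
        self.b_v = np.zeros(n_vis)
        self.noise, self.lr = noise, lr

    def encode(self, x):
        return sigmoid(x @ self.W + self.b_h)           # h = f(x)

    def decode(self, h):
        return sigmoid(h @ self.W.T + self.b_v)         # reconstruction r

    def train_step(self, x):
        # Corruption process C(x~|x): randomly zero out input entries.
        x_tilde = x * (rng.random(x.shape) > self.noise)
        h = self.encode(x_tilde)
        r = self.decode(h)
        # Backpropagate the squared reconstruction error against the CLEAN input x.
        d_r = (r - x) * r * (1 - r)
        d_h = (d_r @ self.W) * h * (1 - h)
        self.W -= self.lr * (x_tilde.T @ d_h + d_r.T @ h) / len(x)
        self.b_h -= self.lr * d_h.mean(axis=0)
        self.b_v -= self.lr * d_r.mean(axis=0)
        return float(((r - x) ** 2).mean())

# Toy one-hot-style binary data: 60-dimensional inputs reduced to a 20-dim code.
X = (rng.random((256, 60)) < 0.1).astype(float)
dae = DenoisingAutoencoder(60, 20)
losses = [dae.train_step(X) for _ in range(200)]
codes = dae.encode(X)        # low-dimensional features for the downstream detectors
```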
Finally, we use the user behavior pattern analysis method introduced above to
design a new detection method of camouflage attack for multiple detection domains.
The detection of camouflage attack mainly includes two aspects: “abnormal behavior
pattern detection” and “normal behavior pattern interference detection”. Generally,
the frequency of attack behavior is much less than that of normal behavior. In GMM
model, it is sparse and small clusters. In the detection process, we set a threshold
of abnormal behavior mode to distinguish the normal behavior mode and abnormal
behavior mode of users. In GMM model, this threshold is the lower limit of cluster
size, and clusters below this threshold are abnormal behavior patterns. The user
behavior contained in the exception pattern is considered to be aggressive behavior.
In the aspect of behavioral pattern interference detection, we examine whether the
influence of each behavioral feature vector on the Gaussian distribution of its mode
is beneficial. Similarly, because the frequency of attack behavior is far less than
that of normal behavior, the normal behavior is more consistent with the Gaussian
distribution of the pattern, while the attack behavior will weaken the conformity of
the Gaussian distribution of the pattern.
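The two detection aspects above can be sketched with scikit-learn's GaussianMixture (the synthetic data, component count, and both thresholds are illustrative assumptions, not the paper's settings):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Synthetic behavior vectors: two dense "normal" clusters plus a sparse attack cluster.
normal = np.vstack([rng.normal(0, 1, (480, 2)), rng.normal(6, 1, (480, 2))])
attack = rng.normal(3, 0.3, (12, 2))                 # rare, camouflage-like behavior
X = np.vstack([normal, attack])

gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
labels = gmm.predict(X)

# 1) Abnormal behavior pattern detection: clusters below a size threshold.
min_cluster = 0.05 * len(X)                          # assumed lower limit of cluster size
sizes = np.bincount(labels, minlength=3)
pattern_anom = np.isin(labels, np.flatnonzero(sizes < min_cluster))

# 2) Interference detection: points whose support for the fitted Gaussians is low,
#    i.e. points that weaken the conformity of their pattern's distribution.
support = gmm.score_samples(X)                       # per-sample log-likelihood
interference_anom = support < np.quantile(support, 0.02)   # assumed threshold

anomalies = pattern_anom | interference_anom
```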
We now describe the integrated abnormal data detection. In our model, robust covariance,
OCSVM, isolation forest and local outlier factor are adopted to integrate with GMM
to get the final results. Figure 41.2 is the calculation diagram of the detection process.
Because GMM-based detection method examines whether the influence of each
behavior feature vector on the Gaussian distribution of its mode is favorable, it is con-
ducive to the detection of camouflage attacks in internal threats (camouflage attacks
are often less different from normal behaviors, and thus hidden in a large number
of normal behaviors). However, the disadvantage of GMM-based detection method
is that it requires manual control of detection threshold (a too high threshold will
miss hidden camouflage attack, and a too low threshold will lead to high false alarm
rate), which means manual intervention is required and it is difficult to guarantee
the accuracy of detection. The robust covariance, OCSVM, isolation forest and local
outlier factor are all based on different detection methods, thus leaving out different
camouflage attacks.
Therefore, we combine the abnormal behaviors of users detected by the four methods (robust covariance, OCSVM, isolation forest, and local outlier factor) into a union
396 Z. Zhang et al.
(expecting to get the maximum recall rate), and then take the intersection with the abnormal behaviors obtained by the GMM model (expecting to reduce the false alarm rate of the detection results) to obtain the final detection results. The GMM model in this method can be fixed at a lower threshold to ensure a higher recall rate (Fig. 41.3).
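The union-then-intersection combination can be sketched with scikit-learn (the detectors' hyperparameters, the GMM threshold, and the synthetic data are our assumptions; the paper does not name its implementation):

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.svm import OneClassSVM
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (500, 5)),
               rng.normal(5, 1, (10, 5))])           # 10 planted anomalies

detectors = [
    EllipticEnvelope(contamination=0.05, random_state=0),   # robust covariance
    OneClassSVM(nu=0.05),
    IsolationForest(contamination=0.05, random_state=0),
    LocalOutlierFactor(contamination=0.05),
]

# Union of the four detectors' anomaly sets -> maximize recall.
union = np.zeros(len(X), dtype=bool)
for det in detectors:
    union |= det.fit_predict(X) == -1                # -1 marks an outlier in sklearn

# GMM anomaly set with a deliberately loose (low) threshold -> high recall.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
support = gmm.score_samples(X)
gmm_anom = support < np.quantile(support, 0.10)

# Intersection with the GMM set -> cut false alarms while keeping recall.
final = union & gmm_anom
```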
41.3 Experiments
This section describes the experiments in detail. First, data preprocessing
is used to obtain the characteristic expression of user behavior. The dataset for this
article is from the Insider Threat Test Dataset of Carnegie Mellon University. It should
be noted that this data set is synthesized [14]. Because malicious insiders are, first and
foremost, insiders, to collect real data, some organization must directly monitor and
record the behavior and actions of its own employees. Confidentiality and privacy
concerns create barriers to the collection and use of such data for research purposes.
Thus, it is sometimes preferable to proceed with synthetic data. This dataset had
been proved to be effective for abnormal behavior detection and was shared within a
large community of researchers engaged in the DARPA ADAMS program to develop
techniques for insider threat detection [14]. We extract all behavior data of 21 users over 90 days and quantize it with one-hot encoding.
Then feature extraction of user behavior is carried out. In our model, the user
behavior characteristics of one-hot encoding are re-encoded and reduced to a low
dimension through the four-layer denoising autoencoder network. Figure 41.4 shows the feature extraction process based on denoising autoencoders.

Table 41.1 Detection results based on GMM

                    Recall rate (%)   Accuracy (%)   F1-score
Outlier detection   66.7              85.7           0.75
CA detection        100               22.5           0.37
The following detection method based on Gaussian mixture model is adopted.
Figure 41.5 is the classification result of the Gaussian mixture model when the
feature dimension is 20. We use the Gaussian mixture model and compare the proportion of each category. According to the above anomaly detection algorithm, the data with the
smallest proportion is regarded as the first detected anomaly. Then the second outlier
detection is carried out based on the first detection result. Because the frequency of
attack behavior is far less than that of normal behavior, the normal behavior is more
consistent with the Gaussian distribution of the pattern, while the attack behavior
will weaken the conformity of the Gaussian distribution of the pattern. Therefore,
we examine whether the influence of each behavior eigenvector on the Gaussian
distribution of its mode is favorable. In our model, a threshold of abnormal behavior
pattern is set to distinguish normal behavior pattern from abnormal behavior pattern.
In GMM model, this threshold is the lower limit of the support degree of each behavior
eigenvector to the Gaussian distribution of its mode, and the behavior eigenvector
below this threshold is the abnormal behavior pattern. The user behavior contained
in the exception pattern is considered to be aggressive behavior.
The two detected abnormal behaviors were combined to obtain the preliminary
test results. Table 41.1 shows the test results based on the GMM method. It can be
seen that all abnormal data were detected with a recall rate of 100%, but the accuracy
rate was only 22.5%, indicating a high false alarm rate. In the next section, we will
effectively reduce the false alarm rate.
Since the GMM model cannot guarantee a low false alarm rate, we use four
other detection methods to further detect user behavior characteristics. Figure 41.6
shows the test results of robust covariance, OCSVM, isolation forest, and local outlier
factor. Among them, the yellow dots are detected abnormal points, and the ones with
numbers (1932–1940) are real abnormal points.
The test results of these four methods are integrated. Table 41.2 shows the test results of robust covariance, OCSVM, isolation forest, and local outlier factor. The obtained integrated test results are intersected with the preliminary test results of GMM.

Table 41.2 Test results based on robust covariance, OCSVM, isolation forest, and LOF

                    Recall rate (%)   Accuracy (%)   F1-score
Robust covariance   44.0              26.6           0.33
OCSVM               77.8              44.6           0.57
Isolation forest    55.6              33.3           0.42
LOF                 66.7              40.0           0.5
Finally, we compared our method with the detection results obtained without
using the denoising autoencoder feature extraction, and found that compared with
the original method, the F1-score of our method was improved by 42% (Table 41.3).
Fig. 41.6 Robust covariance, OCSVM, isolation forest, and local outlier factor detection results
41.4 Conclusion
Internal user behavior analysis is an important research problem in the field of system
security. We propose a new unsupervised log abnormal behavior detection method.
This method is based on the denoising autoencoders to encode the user log file, and
adopts the integrated method to detect the abnormal data after encoding. Compared
with the traditional detection method, it can analyze the abnormal information in
the user behavior more effectively, thus playing a preventive role against internal
threats. In the experiment, we used the method in our model to analyze all the
behavior data of 21 users in the real scene in 90 days. The experimental results
verified the effectiveness of the analysis method in the multi-detection domain scene
to analyze the multiple patterns of user behavior. The detection method in our model
is superior to the traditional method for detecting abnormal user behavior based on
matrix decomposition.
References
1. Mayhew, M., Atighetchi, M., Adler, A., et al.: Use of machine learning in big data analytics
for insider threat detection. In: Military Communications Conference. IEEE (2015)
2. Gheyas, I.A., Abdallah, A.E.: Detection and prediction of insider threats to cyber security: a
systematic literature review and meta-analysis. Big Data Anal. 1(1), 6 (2016)
3. Evolving insider threat detection stream mining perspective. Int. J. Artif. Intell. Tools 22(05),
1360013 (2013)
4. Chen, C.-M., Huang, Y., Wang, E.K., Wu, T.-Y.: Improvement of a mutual authentication
protocol with anonymity for roaming service in wireless communications. Data Sci. Pattern
Recogn. 2(1), 15–24 (2018)
5. Chen, C.-M., Xu, L., Wu, T.-Y., Li, C.-R.: On the security of a chaotic maps-based three-party
authenticated key agreement protocol. J. Netw. Intell. 1(2), 61–66 (2016)
6. Chen, Y., Nyemba, S., Malin, B.: Detecting anomalous insiders in collaborative information
systems. IEEE Trans. Dependable Secure Comput. 9(3), 332–344 (2012)
7. Young, W.T., Goldberg, H.G., Memory, A., et al.: Use of domain knowledge to detect insider
threats in computer activities (2013)
8. Rousseeuw, P.J., Driessen, K.V.: A fast algorithm for the minimum covariance determinant
estimator. Technometrics 41(3), 212–223 (1999)
9. Manevitz, L.M., Yousef, M.: One-class SVMs for document classification. J. Mach. Learn.
Res. 2(1), 139–154 (2002)
10. Liu, F.T., Ting, K.M., Zhou, Z.H.: Isolation forest. In: Eighth IEEE International Conference on
Data Mining (2008)
11. Lee, J., Kang, B., Kang, S.H.: Integrating independent component analysis and local outlier
factor for plant-wide process monitoring. J. Process Control 21(7), 1011–1021 (2011)
12. Li, D., Chen, D., Goh, J., et al.: Anomaly detection with generative adversarial networks for
multivariate time series (2018)
13. Dilokthanakul, N., Mediano, P.A.M., Garnelo, M., et al.: Deep unsupervised clustering with
Gaussian mixture variational autoencoders (2016)
14. Glasser, J., Lindauer, B.: Bridging the gap: a pragmatic approach to generating insider threat
data. In: 2013 IEEE Security and Privacy Workshops (SPW). IEEE (2013)
Chapter 42
The Para-Perspective Projection
as an Approximation of the Perspective
Projection for Recovering 3D Motion
in Real Time
42.1 Introduction
spective projection are used. In general, nonlinear minimization methods have been proposed for solving the 3D motion estimation problem in the perspective model. The approach of solving nonlinear equations requires some form of initial approximate solution and may fail if that solution is far away from the true solution. Since the perspective projection can be approximated by the para-perspective projection, which models both the scaling and the position effects [1], we initialize the proposed nonlinear equations using the para-perspective factorization method, which recovers the geometry of the scene and the motion of either the camera or the object from image sequences.
Tomasi and Kanade [1] first introduced a factorization method to recover 3D
shape of the object and the motion of the camera simultaneously under orthographic
projection and obtained accurate results. Aloimonos described the approximations of
perspective projection based on para-perspective and ortho-perspective projections
[2].
The methods of 3D motion estimation problem are developed by researchers based
on various types of corresponding features on the rigid objects from two (or more)
images of sequences at different times. The main mathematical problem for determin-
ing the location and orientation of one camera was formulated as nonlinear equations omitting depth information by Huang and Tsai [3–5]. Such approaches to solving nonlinear equations are viable only if a good initial guess of the solution is available. Among others, Zhuang et al. [6], Longuet-Higgins [7], and Faugeras [8] have
shown that the motion parameters of rigid body can be estimated from point corre-
spondences by solving linear equations. However, it has been found empirically that
linear algorithms are usually more sensitive to measurement noise than nonlinear
algorithms [3].
The contribution of this work lies in several aspects. First, this 3D motion estimation from a single camera is fast, without requiring any additional hardware. Second, the applicability of the factorization method is normally limited to offline computation, recovering shape and motion only after all the input images are given; although it is therefore difficult to apply to real-time cases, we use the para-perspective factorization method in real time to initialize the nonlinear system of equations. Third, linear techniques are very sensitive to noise, and the para-perspective factorization method is formulated with linear properties. The best approach is therefore to first use a linear algorithm, assuming a sufficient number of feature correspondences, to find an initial guess, and then to use a nonlinear formulation to refine the solution iteratively to obtain accurate motion parameters.
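The "linear initialization, nonlinear refinement" strategy can be sketched as follows: a toy Gauss-Newton refinement of the six motion parameters under a unit-focal-length perspective model with synthetic data. This illustrates the idea only; the rotation parameterization and data are our assumptions, not the paper's algorithm:

```python
import numpy as np

def rotation(a, b, g):
    """Rotation matrix parameterized by angles (a, b, g) about the x-, y-, z-axes."""
    ca, sa, cb, sb = np.cos(a), np.sin(a), np.cos(b), np.sin(b)
    cg, sg = np.cos(g), np.sin(g)
    return np.array([
        [ca * cb,                sa * cb,                -sb],
        [ca * sb * sg - sa * cg, sa * sb * sg + ca * cg, cb * sg],
        [ca * sb * cg + sa * sg, sa * sb * cg - ca * sg, cb * cg],
    ])

def project_all(params, pts):
    """Perspective projection of all points for motion params (a, b, g, tx, ty, tz)."""
    R, t = rotation(*params[:3]), np.asarray(params[3:])
    d = pts - t
    return np.column_stack([d @ R[0], d @ R[1]]) / (d @ R[2])[:, None]

def gauss_newton(params, pts, obs, iters=20, eps=1e-6):
    """Refine the motion parameters by minimizing the reprojection error."""
    x = np.asarray(params, dtype=float)
    for _ in range(iters):
        r = (project_all(x, pts) - obs).ravel()
        J = np.empty((r.size, 6))                    # numerical Jacobian
        for k in range(6):
            dx = np.zeros(6); dx[k] = eps
            J[:, k] = ((project_all(x + dx, pts) - obs).ravel() - r) / eps
        x = x - np.linalg.lstsq(J, r, rcond=None)[0]
    return x

true = np.array([0.1, -0.2, 0.3, 0.5, -0.3, -4.0])
pts = np.random.default_rng(4).uniform(-1, 1, (12, 3))   # N = 12 feature points
obs = project_all(true, pts)                             # "tracked" image coordinates
init = true + 0.05        # stands in for a nearby para-perspective initial guess
est = gauss_newton(init, pts, obs)
```

Starting close to the true parameters, the refinement recovers them to high accuracy; with a poor initial guess the iteration may diverge, which motivates the linear initialization above.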
The paper is organized as follows. The problem statement and general motion
model of this proposed method are described in Sect. 42.2. The perspective approxi-
mation as para-perspective projection for obtaining initial guess value is summarized
in Sect. 42.3. The implementation of the proposed method is presented in Sect. 42.4.
The experiment of the proposed method is discussed in Sect. 42.5.
Consider a rigid body viewed by a pinhole camera imaging system. A 3D point P on the surface of the object, expressed in object-space coordinates, is projected to a point p in image space under perspective projection. We consider that there are
image sequences of the object that is moving relative to a static camera. The moving
system O is attached to the object, and a static system C is attached to the camera
as shown in Fig. 42.1.
Each image is taken with the object orientation defined by the orthonormal unit vectors i_f, j_f, and k_f corresponding to the x-, y-, and z-axes of the camera. We represent the position of the object frame in each image by the vector t. We assume that N feature points are extracted in the first image and tracked through each of the F images. The N feature points P_n = (X_n, Y_n, Z_n)^T on the object are projected into each of the F images with coordinates p_fn = (x_fn, y_fn), f = 1, ..., F, n = 1, ..., N. Our goal is to estimate the 3D motion of the moving object from the tracked feature correspondences over the image sequence. We first formulate the equations based on the rigidity constraint of the object.
The 3D point in the object space coordinate system is represented in the camera coordinate system by a rotation matrix $R_f$, whose rows are $\mathbf{i}_f = (i_{xf}, i_{yf}, i_{zf})$, $\mathbf{j}_f = (j_{xf}, j_{yf}, j_{zf})$, $\mathbf{k}_f = (k_{xf}, k_{yf}, k_{zf})$, and a translation $\mathbf{t}_f = (t_{xf}, t_{yf}, t_{zf})^T$:

$$P_n^c = R_f P_n + \mathbf{t}_f, \qquad (42.1)$$

where

$$R_f = \begin{bmatrix} \mathbf{i}_f \\ \mathbf{j}_f \\ \mathbf{k}_f \end{bmatrix}
= \begin{bmatrix}
\cos\alpha\cos\beta & \sin\alpha\cos\beta & -\sin\beta \\
\cos\alpha\sin\beta\sin\gamma - \sin\alpha\cos\gamma & \sin\alpha\sin\beta\sin\gamma + \cos\alpha\cos\gamma & \cos\beta\sin\gamma \\
\cos\alpha\sin\beta\cos\gamma + \sin\alpha\sin\gamma & \sin\alpha\sin\beta\cos\gamma - \cos\alpha\sin\gamma & \cos\beta\cos\gamma
\end{bmatrix}. \qquad (42.2)$$
The rotation matrix R_f in Eq. (42.2) is specified as three independent rotations around the x-, y-, and z-axes by angles α, β, and γ.
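As an illustrative sketch (not the authors' implementation), Eq. (42.2) can be written directly in code; the composition order is inferred from the matrix entries, and the rows of the returned matrix are the unit vectors i_f, j_f, k_f:

```python
import numpy as np

def rotation_matrix(alpha, beta, gamma):
    """Build R_f of Eq. (42.2) from rotations about the x-, y-, z-axes."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    return np.array([
        [ca * cb,                sa * cb,                -sb],
        [ca * sb * sg - sa * cg, sa * sb * sg + ca * cg, cb * sg],
        [ca * sb * cg + sa * sg, sa * sb * cg - ca * sg, cb * cg],
    ])

R = rotation_matrix(0.1, 0.2, 0.3)
# rows are the orthonormal unit vectors i_f, j_f, k_f
assert np.allclose(R @ R.T, np.eye(3))
```

The assertion confirms the matrix is orthonormal, as a rotation matrix must be.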
Assuming that the camera intrinsic parameters are known and the focal length is unity, the relationship between the image space and object space coordinates, using the property of similar triangles, can be written as

$$x_{fn} = \frac{\mathbf{i}_f \cdot (P_n - \mathbf{t}_f)}{\mathbf{k}_f \cdot (P_n - \mathbf{t}_f)} \qquad y_{fn} = \frac{\mathbf{j}_f \cdot (P_n - \mathbf{t}_f)}{\mathbf{k}_f \cdot (P_n - \mathbf{t}_f)} \qquad (42.3)$$

$$x_{fn} = \frac{\mathbf{i}_f \cdot P_n + t_{xf}}{\mathbf{k}_f \cdot P_n + t_{zf}} \qquad y_{fn} = \frac{\mathbf{j}_f \cdot P_n + t_{yf}}{\mathbf{k}_f \cdot P_n + t_{zf}}, \qquad (42.4)$$

where

$$t_{xf} = -\mathbf{i}_f \cdot \mathbf{t}_f; \quad t_{yf} = -\mathbf{j}_f \cdot \mathbf{t}_f; \quad t_{zf} = -\mathbf{k}_f \cdot \mathbf{t}_f. \qquad (42.5)$$
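A minimal sketch of the projection in Eqs. (42.4)–(42.5), assuming unit focal length as stated in the text; the function and example values are illustrative, not the authors' code:

```python
import numpy as np

def perspective_project(P, R, t):
    """Project an object-space point P via Eq. (42.4).

    The rows of R are i_f, j_f, k_f; t = (t_x, t_y, t_z) is the
    translation of Eq. (42.5).  Focal length is unity.
    """
    i_f, j_f, k_f = R
    tx, ty, tz = t
    x = (i_f @ P + tx) / (k_f @ P + tz)
    y = (j_f @ P + ty) / (k_f @ P + tz)
    return x, y

# identity rotation, object frame 5 units in front of the camera
x, y = perspective_project(np.array([1.0, 2.0, 0.0]),
                           np.eye(3), np.array([0.0, 0.0, 5.0]))
# → (0.2, 0.4)
```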
Since we are given the N point correspondences p_fn = (x_fn, y_fn) for each of the F images, a total of 2FN equations can be solved to determine the six motion parameters (t_xf, t_yf, t_zf, α, β, γ) per frame and three shape parameters per point (P_n = [P_n1, P_n2, P_n3]), for a total of 6F + 3N unknowns. As mentioned before, a good initial guess is essential for converging to the right solution of Eq. (42.6). In the next section, we introduce the para-perspective projection model and discuss how it is used in the current approach. We then refine the results of the para-perspective projection through the perspective projection model of Eq. (42.6) iteratively.
Para-perspective projection has been used in the solution of various problems. It closely approximates perspective projection by modeling image distortions, as illustrated in Fig. 42.2.
First, the points P_n are projected onto an auxiliary plane G that is parallel to the image plane and contains the mass center c of the object. The projection rays are parallel to the line connecting the mass center with the camera focal point. This step captures the foreshortening distortion and the position effect.
Then, the points on the plane G are projected onto the image plane using perspective projection. Since the plane G is parallel to the image plane, this amounts to scaling the point coordinates by the distance between the camera focal point and the auxiliary plane G. This step captures both the distance and position effects.
The para-perspective projection is a first-order approximation of the perspective projection, derived from Eq. (42.3). The projection of a point P_n onto the image plane under this approximation, p_fn = (x_fn, y_fn), is given by

$$x_{fn} = \mathbf{m}_f \cdot P_n + t_{xf}; \qquad y_{fn} = \mathbf{n}_f \cdot P_n + t_{yf}, \qquad (42.7)$$

where

$$t_{zf} = -\mathbf{t}_f \cdot \mathbf{k}_f \qquad (42.8)$$

$$t_{xf} = -\frac{\mathbf{t}_f \cdot \mathbf{i}_f}{t_{zf}}; \qquad t_{yf} = -\frac{\mathbf{t}_f \cdot \mathbf{j}_f}{t_{zf}}; \qquad (42.9)$$

$$\mathbf{m}_f = \frac{\mathbf{i}_f - t_{xf}\,\mathbf{k}_f}{t_{zf}}; \qquad \mathbf{n}_f = \frac{\mathbf{j}_f - t_{yf}\,\mathbf{k}_f}{t_{zf}}. \qquad (42.10)$$
Since we have tracked N feature points over F frames in the image stream, we can collect all these measurements into a single matrix by combining the equations as follows:

$$\begin{bmatrix}
x_{11} & \cdots & x_{1N} \\
\vdots &        & \vdots \\
x_{F1} & \cdots & x_{FN} \\
y_{11} & \cdots & y_{1N} \\
\vdots &        & \vdots \\
y_{F1} & \cdots & y_{FN}
\end{bmatrix}
= \begin{bmatrix} \mathbf{m}_1 \\ \vdots \\ \mathbf{m}_F \\ \mathbf{n}_1 \\ \vdots \\ \mathbf{n}_F \end{bmatrix}
[P_1 \cdots P_N]
+ \begin{bmatrix} t_{x1} \\ \vdots \\ t_{xF} \\ t_{y1} \\ \vdots \\ t_{yF} \end{bmatrix}
[1 \cdots 1] \qquad (42.11)$$
406 T. Tumurbaatar and N. Sengee
or, in brief form,

$$W = MS + T[1 \cdots 1], \qquad (42.12)$$

with the per-frame translations obtained directly from the image measurements as

$$t_{xf} = \frac{1}{N}\sum_{n=1}^{N} x_{fn}; \qquad t_{yf} = \frac{1}{N}\sum_{n=1}^{N} y_{fn}. \qquad (42.13)$$
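The registration and factorization implied by Eqs. (42.11)–(42.13) can be sketched via a truncated SVD, in the spirit of the factorization method [1]; the synthetic measurements below are illustrative, not real tracking data:

```python
import numpy as np

def factor_measurements(x, y):
    """Factor the 2F x N measurement matrix of Eqs. (42.11)-(42.13).

    x and y are F x N arrays of tracked image coordinates.  The per-frame
    translations are the row means (Eq. 42.13); subtracting them registers
    the measurements, and the registered matrix is rank 3 in the noise-free
    case, so a truncated SVD recovers the motion matrix M and the shape
    matrix S up to a linear ambiguity.
    """
    tx = x.mean(axis=1)                      # Eq. (42.13)
    ty = y.mean(axis=1)
    W = np.vstack([x, y])                    # 2F x N, Eq. (42.11)
    T = np.concatenate([tx, ty])
    W_reg = W - T[:, None]                   # registered measurements
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    M = U[:, :3] * s[:3]                     # 2F x 3 motion matrix
    S = Vt[:3]                               # 3 x N shape matrix
    return M, S, tx, ty

# Noise-free synthetic check (F = 4 frames, N = 6 points)
rng = np.random.default_rng(0)
M_true = rng.normal(size=(8, 3))
S_true = rng.normal(size=(3, 6))
T_true = rng.normal(size=8)
W_full = M_true @ S_true + T_true[:, None]
M, S, tx, ty = factor_measurements(W_full[:4], W_full[4:])
```

In the noise-free case the product MS reproduces the registered measurement matrix exactly.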
In this section, we explain the implementation steps of our proposed method in detail. We recover the shape and motion parameters for every five frames under para-perspective projection because, to compute the L matrix in Sect. 42.3, we obtain 2F + 1 equations for six unknown parameters; thus, we need at least three frames. The first four frames are captured initially, and the fifth frame is captured subsequently at a different time.
First, we extract a sub-image containing only part of the foreground moving object from the current video sequence. Then, feature points to be matched to the next frames are computed for the extracted sub-image using the SIFT feature extractor. After these steps, the first frame is captured from the current frame, and the corresponding features are matched between the first frame and the extracted sub-image. The best matches are found by the Random Sample Consensus (RANSAC)-based robust method, which eliminates outliers among the points matched with the brute-force matcher. Similarly, the second, third, fourth, and fifth frames are captured in order from the image sequence. All processing steps used in capturing the first frame are repeated when capturing the other four frames. Since the input of the initialization step by para-perspective projection requires the exact number of correspondences tracked from one frame to another, we computed the
42.6 Conclusion
In this paper, we obtained the 3D motion parameters from rigid transformation equations when features in 3D space and their perspective projections on the camera plane are known. The solution equations were formulated as a nonlinear least-squares problem over the tracked feature correspondences in the image sequences. These equations require a good initial approximation and 3D features; to avoid these difficulties, the para-perspective projection is used to approximate the perspective projection and to find the 3D features in Euclidean space. We then solved the proposed equations using the results of the para-perspective approximation as initial values. The results
Fig. 42.3 The comparison results. a Comparison of the rotations around the x-axis. b Comparison
of the rotations around the y-axis. c Comparison of the rotations around the z-axis
of this method are accurate, and the produced errors between the estimated and the
measured motion parameters are negligibly small.
Acknowledgements The work in this paper was supported by the grant of National University of
Mongolia (No. P2017-2469) and MJEED, JICA (JR14B16).
References
1. Poelman, C.J., Kanade, T.: A paraperspective factorization method for shape and motion recovery. IEEE Trans. Pattern Anal. Mach. Intell. 19(3) (1997)
2. Aloimonos, J.Y.: Perspective approximations. Image Vis. Comput. 8(3) (1990)
3. Huang, T.S., Netravali, A.N.: Motion and structure from feature correspondences: a review. Proc. IEEE 82(2) (1994)
4. Huang, T.S., Tsai, R.Y.: Image sequence analysis: motion estimation. In: Image Sequence Analysis. Springer-Verlag, New York (1981)
5. Huang, T.S.: Determining three dimensional motion and structure from two perspective views. In: Young, T.Y., Fu, K.S. (eds.) Handbook of Pattern Recognition and Image Processing. Academic Press, New York (1986)
6. Zhuang, X., Huang, T.S., Ahuja, N., Haralick, R.M.: A simplified linear optic flow motion algorithm. Comput. Vis. Graph. Image Process. 42, 334–344 (1988)
7. Longuet-Higgins, H.C.: A computer program for reconstructing a scene from two projections. Nature 293, 133–135 (1981)
8. Faugeras, O.: Three-Dimensional Computer Vision: A Geometric Viewpoint. MIT Press, Cambridge, MA (1993)
Chapter 43
Classifying Songs to Relieve Stress Using
Machine Learning Algorithms
Abstract Music has a great impact on stress relief for humans. We have become very stressed by society and the times. Stress that accumulates and is not relieved daily will have an adverse effect on our physical and mental health, contributing to obesity, heart attacks, insomnia, and so on. Therefore, this study offers an ensemble approach combining machine learning algorithms such as K-NN, naïve Bayes, multilayer perceptron, and random forest for stress relief based on musical genres.
43.1 Introduction
Everybody listens to music and sounds every day in some way. Music affects vital organs such as the human brain and heart, as well as stress, psychological state, behavior, and even child education. Choosing the right music, for whom and for what purpose, can positively affect health and relationships. On the other hand, selecting inappropriate music may have a negative effect, increasing levels of depression and stress.
K. Munkhbat
Database/Bioinformatics Laboratory, School of Electrical and Computer Engineering, Chungbuk
National University, Cheongju, South Korea
e-mail: khongorzul@dblab.chungbuk.ac.kr
K. H. Ryu (B)
Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City 700000,
Vietnam
e-mail: khryu@tdtu.edu.vn; khryu@chungbuk.ac.kr
Department of Computer Science, College of Electrical and Computer Engineering, Chungbuk
National University, Cheongju 28644, South Korea
Therefore, the feeling evoked by music can be different for everyone, as well as for different stages of human life [1].
In recent years, the music industry has been widely applied in health care in collaboration with the health sector. In particular, research has been conducted mainly on mental health and heart disease, and machine learning techniques have achieved high accuracy in the music industry [2]. Thus, we created an ensemble model and obtained results by comparing machine learning algorithms such as K-Nearest Neighbors (K-NN), Naive Bayes (NB), Multilayer Perceptron (MLP), and Random Forest (RF) [3–6]. K-NN is examined in the context of genre classification in [7]. Fu et al. [8] applied an NB classifier to both music classification and retrieval, comparing it with alternative methods. While Cemgil and Gürgen [9] obtained results with an MLP, a new RF method was introduced in [10] for music genre classification.
43.2.1 Stress
43.2.2 Music
Music cannot completely relieve stress, but it helps to reduce and control stress levels, as shown in Myriam V. Thoma's research work [12]. Listening to the right music that suits your mood can directly affect the mood, productivity,
43 Classifying Songs to Relieve Stress … 413
and attitude. For example, fast rhythmic songs increase focus and concentration, while upbeat music makes one more optimistic and positive. Slow rhythmic music can relax the mind and the body's muscles and relieve stress. Recent studies mention that music at 60 beats per minute leads the brain to produce alpha waves at 8–14 cycles per second, which indicates that the brain is relaxed yet conscious [13].
In the presence of music, the increase of serotonin and endorphins in the human brain creates positive effects such as relief of stress and irritation, relaxation, improved concentration, a stronger immune system, reduced blood pressure, and a lifted mood. We do not need much time for listening to or playing music to relieve stress. The main thing is to make it a habit, in other words, part of the daily rhythm. For example, stand up in the morning and listen to your favorite songs and music, or listen after hard work, in the car, or while walking. Forcing oneself to listen to or play music, however, will only add stress. So one needs to just listen, take deep breaths, and calm down. However, everyone's taste in music is different, so it is best to listen to one's favorite and personally matching music.
We tested using the Free Music Archive (FMA) [14], an easily and openly accessible dataset relevant for evaluating several tasks in MIR. It contains MP3-encoded audio data of various sizes along with metadata. We used the metadata file named tracks.csv, which has a total of 39 features, such as ID, title, artist, genres, tags, and play counts, for all 106,574 tracks. Some attributes and tuples are presented in Table 43.1.
Each track in this dataset is legally free to download, as the artists decided to release their works under permissive licenses. Since the purpose of our research is to predict and classify songs which reduce and relieve stress, the dataset genres [14] were used to divide the tracks into two labels, stressed and stressed out, as described in the study (see Fig. 43.1).
Data preprocessing is performed in two steps using encodings: ordinal and one-hot. String values in the dataset need to be converted to numerical values in order to apply machine learning algorithms. Ordinal encoding, one of the model encoding methods, is used in the first step of data preprocessing; it replaces each original value with a sequential number. After that, one-hot encoding, the most common way to encode categorical variables, is used. This method creates more than 200 new columns, which makes the training process slower.
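The two encoding steps can be sketched with pandas; the column values below are hypothetical, not the actual tracks.csv contents:

```python
import pandas as pd

# Illustrative metadata; the values are hypothetical stand-ins.
tracks = pd.DataFrame({
    "genre":  ["Rock", "Jazz", "Rock", "Classical"],
    "artist": ["a1", "a2", "a1", "a3"],
})

# Step 1: ordinal encoding -- each distinct string becomes a sequential
# integer, in order of first appearance.
ordinal = tracks.apply(lambda col: pd.factorize(col)[0])

# Step 2: one-hot encoding -- one binary column per category value; on
# the full dataset this is what produces the 200+ new columns.
onehot = pd.get_dummies(tracks)
print(ordinal["genre"].tolist())  # → [0, 1, 0, 2]
print(onehot.shape)               # → (4, 6)
```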
We applied the NB, K-NN, MLP, and RF algorithms in this work. K-NN is one of many supervised learning algorithms used in data mining and machine learning. It can be used for both classification and regression problems, though in industry it is more broadly used for classification. NB is a classification technique based on Bayes' theorem with an assumption of independence among predictors. The model is easy to build, with no complicated iterative parameter estimation, which makes it particularly useful for very large datasets. MLP, often applied to supervised learning problems, is a feedforward artificial neural network model composed of more than one perceptron. RF is an ensemble algorithm which combines more than one algorithm of the same or different kinds for classifying objects. This classifier creates a set of decision trees from randomly selected subsets of the training set and then aggregates the votes from the different decision trees to decide the final class of the test object. The ensemble approach is a technique that combines several machine learning techniques into one predictive model in order to decrease variance (bagging) or bias (boosting), or to improve predictions (stacking) [15]. The above-mentioned algorithms are used to propose a stacking ensemble approach for classifying songs in this study; Fig. 43.2 indicates the ensemble approach architecture.
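A stacking ensemble over the four classifiers can be sketched with scikit-learn; the synthetic features and the logistic-regression meta-learner are assumptions for illustration (the chapter does not name a meta-learner):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the encoded track features; two classes, as in
# the chapter's stressed / stressed-out labeling.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier()),
        ("nb", GaussianNB()),
        ("mlp", MLPClassifier(max_iter=1000, random_state=0)),
        ("rf", RandomForestClassifier(random_state=0)),
    ],
    # Meta-learner choice is an assumption, for illustration only.
    final_estimator=LogisticRegression(),
)
score = stack.fit(X_tr, y_tr).score(X_te, y_te)
print(f"hold-out accuracy: {score:.3f}")
```

Out-of-fold predictions of the base classifiers (5-fold by default) become the features of the meta-learner, which is what distinguishes stacking from simple voting.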
43.4 Results
Table 43.2 shows the comparison of the classification algorithms with the ensemble model. We applied machine learning algorithms including naïve Bayes, K-NN, random forest, and multilayer perceptron in this study. Among the classifiers, RF gives the highest accuracy, 0.801, while MLP provides the lowest, 0.532. From this result, we conclude that the ensemble approach is not suitable for classifying songs from metadata; the best results were reached using the RF classifier. The AUC-ROC curve is shown in Fig. 43.3.
43.5 Conclusion
We built a model which can predict stress-relieving songs from the metadata set of the FMA. Experimental results were obtained using the MLP, NB, K-NN, and RF machine learning algorithms. Based on our experimental results, we recommend the RF algorithm for building a model that predicts stress-reducing songs. Using only the metadata of a song has drawbacks for predicting stress-reducing songs. Thus, in future work we will use an audio file dataset, which provides detailed musical objects such as chords, trills, and mordents, to create a model that can predict stress-relieving songs.
Acknowledgements This research was supported by Basic Science Research Program through the
National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future
Planning (No. 2017R1A2B4010826).
References
1. Trappe, H.J.: Music and medicine: the effects of music on the human being. Appl. Cardiopulm.
Pathophysiol. 16, 133–142 (2012)
2. Scaringella, N., Zoia, G., Mlynek, D.: Automatic genre classification of music content: a survey.
IEEE Signal Process. Mag. 23(2), 133–141 (2006)
3. Peterson, L.E.: K-nearest neighbor. Scholarpedia 4(2), 1883 (2009)
4. McCallum, A., Nigam, K.: A comparison of event models for naive Bayes text classification.
In: AAAI-98 Workshop on Learning for text Categorization, vol. 752, no. 1, pp. 41–48 (1998)
5. Gardner, M.W., Dorling, S.R.: Artificial neural networks (the multilayer perceptron)—a review
of applications in the atmospheric sciences. Atmos. Environ. 32(14–15) (1998)
6. Liaw, A., Wiener, M.: Classification and regression by randomForest. R News 2(3), 18–22
(2002)
7. Tzanetakis, G., Cook, P.: Musical genre classification of audio signals. IEEE Trans. Speech
Audio Process. 10(5), 293–302 (2002)
8. Fu, Z., Lu, G., Ting, K.M., Zhang, D.: Learning naive Bayes classifiers for music classification
and retrieval. In: 20th International Conference on Pattern Recognition, pp. 4589–4592. IEEE
(2010)
9. Cemgil, A.T., Gürgen, F.: Classification of musical instrument sounds using neural networks. In: Proceedings of SIU'97 (1997)
10. Jin, X., Bie, R.: Random forest and PCA for self-organizing maps based automatic music genre
discrimination. In: DMIN, pp. 414–417 (2006)
11. Syed, S.A., Nemeroff, C.B.: Early life stress, mood, and anxiety disorders. Chronic Stress
(Thousand Oaks, Calif.) 1 (2017). https://doi.org/10.1177/2470547017694461
12. Thoma, M.V., La Marca, R., Brönnimann, R., Finkel, L., Ehlert, U., Nater, U.M.: The effect of
music on the human stress response. PLoS One. 8(8), e70156 (2013). https://doi.org/10.1371/
journal.pone.0070156
13. University of Nevada, Reno Homepage. https://www.unr.edu/counseling/virtual-relaxation-
room/releasing-stress-through-the-power-of-music
14. Defferrard, M., Benzi, K., Vandergheynst, P., Bresson, X.: FMA: a dataset for music analysis.
arXiv:1612.01840 (2016)
15. Dietterich, T.G.: Ensemble methods in machine learning. In: International Workshop on Mul-
tiple Classifier Systems, pp. 1–15. Springer, Berlin (2000)
16. Song, Y., Simon, D., Marcus, P.: Evaluation of musical features for emotion classification. In:
ISMIR, pp. 523–528 (2012)
17. McCraty, R., Barrios-Choplin, B., Atkinson, M., Tomasino, D.: The effects of different types
of music on mood, tension, and mental clarity. Altern. Ther. Health Med. 4, 75–84 (1998)
Chapter 44
A Hybrid Model for Anomaly-Based
Intrusion Detection System
44.1 Introduction
sion detection system is assigned to shield the system from malicious attacks and network vulnerabilities [1].
Over the last few years, network and Internet technologies have been widely applied in industry and other sectors. Consequently, network intrusions have increased, with their types and forms constantly changing, so network intrusion and information security have become challenges of Internet use. Although many information security technologies such as encryption, authentication, authorization, intrusion detection, and deception can protect network systems, they are unable to detect novel attacks. There are also many undetected anomalies and intrusions, known as zero-day attacks. Intrusion detection systems are applied to detect network intrusions and anomalies. Signature-based network intrusion detection systems (NIDS) capture and analyze network traffic to detect known attacks by comparing the signatures of the attacks; NIDS also capture the packets passing through network devices [2]. Intrusion detection mechanisms are divided into two types, anomaly detection and misuse detection (signature-based systems) [3, 4], and also into host-based and network-based IDS [3]. Misuse detection is an approach where each suspected attack is compared to a set of known attack signatures [5]. It is used for detecting known attacks [3]: it detects the attacks present in its database, but it cannot be used for the detection of unknown attacks [3], which are mostly zero-day attacks. Anomaly detection systems are divided into two types: supervised and unsupervised [6]. In the supervised anomaly detection method, a model of the normal behavior of the system or network is established by training on a labeled dataset. These behavior models are used to classify new network connections and to distinguish malign or anomalous behaviors from normal ones. Unsupervised anomaly detection approaches work without any labeled training data, and most of them detect malign activities by clustering or outlier detection techniques [6]. The role of anomaly detection is the identification of data points, substances, events, observations, or attacks that do not conform to the expected pattern of a given collection; this technique is based on defining network behavior [3, 7].
Data preprocessing and classification are important tasks in machine learning. Most of the proposed techniques try to improve overall classification accuracy. Even though many models for dealing with network intrusion behavior have been introduced by researchers, most of them fall short in addressing dangerous and rare attacks and have several other problems. The authors therefore decided to utilize data mining methods for solving the problem of network anomalies and intrusions, for the following reasons:
• High-speed processing of networks' big data using several features (near-real-time classification);
• Detection accuracy increases with dataset preprocessing;
• It is appropriate for discovering hidden and unseen information from novel network attacks;
• It prevents a single point of failure.
In this paper, the authors propose a novel hybrid model to detect network attacks and to address the reasons above. The main contribution of this work against other
44 A Hybrid Model for Anomaly-Based Intrusion Detection System 421
existing models can be summarized as follows. In this model, the authors focus on data mining as a data preprocessing technique and subsequently on machine learning to increase the detection rate and accuracy; both signature-based and anomaly-based systems are used. In previous work by the authors [4], data on some novel attacks after 2010 were collected with the Backtrack system and a test network environment. In this research, the KDD 99 dataset is used, and the collected dataset [4] is named NUM15. At the outset, the KDD 99 dataset was preprocessed before the experiment. KDD 99 is widely used in computer network anomaly detection; it consists of nearly 5 million training connection records labeled as intrusions or not, and a separate testing dataset consisting of known and unseen attacks [8]. In the methodology section, the training and testing model and the information gain in the NUM15 and KDD 99 datasets are presented. Finally, the accuracy and attack detection of the proposed hybrid model are calculated.
Signature-based [9] and anomaly-based [10] network IDS have been studied since 1980. There are many research papers on IDS using several algorithms and data mining techniques that improve accuracy and decrease false alarms. IDS was first suggested by Anderson [9], based on applying statistical methods to analyze users' behavior. In 1987, a prototype of IDS was proposed [11], and the idea of IDS spread progressively. A couple of research papers [10, 12] focus on data mining for network intrusion and anomaly detection. Their idea is to apply data mining programs for classification and frequent episodes to training data, computing misuse and anomaly detection models that accurately capture normal and anomalous behavior. Packet-based and flow-based methods are distinguished in network IDSs based on the source of data. Packet-based IDS mostly provides signature-based systems with valuable information to detect attacks, while flow-based data supports anomaly-based IDS in detecting anomalies [13]. A network IDS uses packets or flows in the network to detect anomalies or network attacks. The authors of [14] presented three layers of multiple classifiers for intrusion detection that were able to improve the overall accuracy. They applied naive Bayes, fuzzy K-NN, and backpropagation NN for generating the decision boundary, and tested the model on the KDD 99 dataset with 30 features. Neelam et al. [15] proposed a layered approach for improving the efficiency of the attack detection rate, using domain knowledge and sequential search to decrease the feature sets, and applied a naive Bayes classifier for classifying four classes of attack types. Various works used the DARPA and KDD 99 IDS evaluation datasets in their experiments. Various hybrid network IDSs have been proposed for detecting novel attacks. Gómez et al. [16] presented a hybrid IDS anomaly preprocessor extending Snort, named H-Snort. Cepheli et al. [17] introduced a novel hybrid intrusion detection preprocessor for DDoS attacks named H-IDS; benchmarked on the DARPA and a commercial bank dataset, H-IDS increased the true positive rate by 27.4%.
422 N. Ugtakhbayar et al.
Patel et al. [3] designed a hybrid IDS consisting of six components, using the Snort IDS with the Aho–Corasick algorithm.
Hussein et al. [18] combined a signature-based Snort and an anomaly-based naïve Bayes method hierarchically. They used the KDD 99 dataset and the Weka program for testing the proposed system, adopting Bayes Net, J48 graft, and naïve Bayes in the anomaly-based system and comparing the results of the anomaly-based systems. Their system achieved about a 92% detection rate with naïve Bayes and required about 10 min to build the model.
Dhakar et al. [19] combined two classifiers, tree-augmented naïve Bayes (TAN) and reduced error pruning (REP). The TAN classifier is used as the base classifier, while the REP classifier is the meta-classifier, which learns from the TAN classifier. Their proposed hybrid model shows 99.96% accuracy on the KDD 99 dataset.
MIT's AI2 model [20] introduced both supervised and unsupervised methods and combined them with a security analyst in the detection system. The features used in that work include a big data behavioral analytics platform, an ensemble of outlier detection methods, a mechanism to obtain feedback from security analysts, and a supervised learning module. They tested the proposed system on a real-world dataset consisting of 3.6 billion log lines.
44.3 Methodology
The KDD 99 dataset has four categories of attacks, viz., DoS, Probe, U2R, and R2L. Each data instance contains 41 features, and the dataset is separated into training and testing sets. The benchmarking dataset consists of different components. The researchers used KDD 99's 10% labeled dataset and the NUM15 dataset for training. The NUM15 dataset has four categories of attacks and 300 thousand instances, each containing 41 features. Information gain is a method that measures the expected reduction in entropy.
The naive Bayes classifier is based on a directed acyclic graph and is a broadly utilized method for classification purposes [21]. The network consists of nodes and arcs representing variables and the interrelationships among the variables. A Bayesian network evaluates the relationships between these features; the constructed network is called the profile of the system and is used to determine support. The profile gives a description of the current state of the system through its variables. If the probability of occurrence is less than the threshold, an alarm is raised.
The naive Bayes classifier combines the probability model with a decision rule. The corresponding classifier, a Bayes classifier, is the function that assigns a class label $\hat{y} = C_k$ for some k as follows:

$$\hat{y} = \underset{k \in \{1,\ldots,K\}}{\arg\max}\; p(C_k) \prod_{i=1}^{n} p(x_i \mid C_k) \qquad (44.1)$$

In this work, the dataset has been classified into only two classes: normal = C_1 and attack = C_2.
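A minimal sketch of the decision rule of Eq. (44.1) for the two-class case; the probability tables below are toy values for illustration, not estimates from KDD 99:

```python
import numpy as np

def bayes_classify(x, priors, likelihoods):
    """Decision rule of Eq. (44.1) for discrete features.

    priors[k] = p(C_k); likelihoods[k][i] maps feature value x_i to
    p(x_i | C_k).  Log-probabilities avoid underflow with many features.
    """
    scores = {
        k: np.log(priors[k]) + sum(np.log(likelihoods[k][i][v])
                                   for i, v in enumerate(x))
        for k in priors
    }
    return max(scores, key=scores.get)   # arg max over C_k

# Toy conditional probability tables (hypothetical numbers)
priors = {"normal": 0.8, "attack": 0.2}
likelihoods = {
    "normal": [{"tcp": 0.7, "udp": 0.3}, {"low": 0.8, "high": 0.2}],
    "attack": [{"tcp": 0.4, "udp": 0.6}, {"low": 0.1, "high": 0.9}],
}
print(bayes_classify(["udp", "high"], priors, likelihoods))  # → attack
```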
Let S be a training set of samples with their corresponding labels. Assume that there are m classes, that the training set contains $S_i$ samples of class i, and that S is the total number of samples. The expected information needed to classify a given sample is calculated using the following formula:

$$I(S_1, S_2, \ldots, S_m) = -\sum_{i=1}^{m} \frac{S_i}{S} \log_2 \frac{S_i}{S} \qquad (44.2)$$

A feature F with values $\{f_1, \ldots, f_v\}$ can divide the training set into v subsets $\{S_1, \ldots, S_v\}$, where $S_j$ is the subset in which each sample has the value $f_j$ for feature F. The information gain for F can then be calculated as
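The entropy of Eq. (44.2) and the gain obtained by splitting on a feature can be sketched as follows (the protocol/label data are toy values for illustration):

```python
import math
from collections import Counter

def entropy(labels):
    """I(S_1, ..., S_m) of Eq. (44.2): -sum (S_i/S) log2 (S_i/S)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(values, labels):
    """Expected entropy reduction when splitting on a feature F whose
    observed values partition the samples into subsets S_1 ... S_v."""
    total = len(labels)
    split_entropy = 0.0
    for v in set(values):
        subset = [l for x, l in zip(values, labels) if x == v]
        split_entropy += len(subset) / total * entropy(subset)
    return entropy(labels) - split_entropy

# protocol-type feature vs. normal/attack label (toy data)
proto = ["tcp", "tcp", "udp", "udp", "icmp", "icmp"]
label = ["normal", "normal", "normal", "attack", "attack", "attack"]
print(round(information_gain(proto, label), 3))  # → 0.667
```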
The model can be divided into two parts. One is the offline training module named "Research Phase", shown in Fig. 44.1; the second is the online classification module named "Testing Phase", shown in Fig. 44.2.
In the research phase, data preprocessing and feature selection are carried out. Subsequently, it was found that one of the feature sets resulted in better accuracy. Later, the machine was trained using naive Bayes with the training dataset. Using this design (Fig. 44.1), feature selection, data cleaning (removing duplicated records), and conversion of the collected traffic to the arff format required by the Weka program were performed. Before converting the data, all discrete data were converted to continuous values (data conversion), which is used for normalization. Subsequently, the dataset was split into training and testing datasets. For the training dataset, all normal traffic was chosen, while the testing dataset consists of KDD 99 attacks and the collected dataset.
The proposed design uses both anomaly- and signature-based methods in parallel, as shown in Fig. 44.2. The Snort IDS [22] was implemented as the signature-based system, and its results were compared to the machine classification results. The results of the detectors are collected into a database, labeled signature (S) or anomaly (A) according to the detecting system. The premise of this design is that signature-based IDS has high accuracy for known attacks compared to the anomaly-based system, so the trained machine system is benchmarked against the signature-based IDS. Apart from reducing detection delay, this has increased the detection accuracy over time. The system compares the outputs of the anomaly-based and signature-based systems. If the results are the same, the packet is not saved into the analysis table, whereas if the results differ, the packet is saved into the analysis table. Outputs of the signature- and anomaly-based systems can then be examined by the network administrator. All differing results were collected into the analysis table, followed by training and repetition of the first phase. The model compares the databases of results of both the signature and anomaly systems using the time stamp, session ID, and source and destination IP addresses, so that we can reassess the result and train the machine on it when the results differ.
For training and classification, a machine with an Intel second-generation i5 2.4 GHz processor, 8 GB of DDR3 RAM, and a 1 TB SATA hard disk was used. The analysis table was created on MySQL version 5.7 in Ubuntu 16.
The KDD 99 dataset consists of several types of features, discrete and continuous, with varying resolutions and ranges. The authors therefore mapped the symbolic features to integers from 1 to N, where N is the number of symbols. Afterward, each symbol and each value were linearly scaled to the range [0, N]. The dataset includes five classes: four attack types and normal. The dataset is used in many methods, such as semi-supervised algorithms [23, 24], IDS benchmarking [4, 24], and so on.
The KDD 99 dataset has nine discrete features: protocol type, service, flag, land, logged in, root shell, su attempted, host login, and guest login [4, 25]. Euclidean distance was used to normalize the dataset. Normalization is required because the scales of most numerical features in the dataset are not the same.
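The preprocessing described above can be sketched in two small helpers: mapping each symbolic feature to an integer 1..N, and linearly rescaling numeric values. First-appearance ordering for the symbol table is an assumption; the paper does not specify how symbols are ordered:

```python
def encode_symbols(values):
    # Map each distinct symbol to an integer 1..N (order of first appearance).
    table = {}
    for v in values:
        table.setdefault(v, len(table) + 1)
    return [table[v] for v in values], len(table)

def min_max_scale(values, hi):
    # Linearly scale numeric values into [0, hi].
    lo_v, hi_v = min(values), max(values)
    if hi_v == lo_v:
        return [0.0 for _ in values]
    return [(v - lo_v) / (hi_v - lo_v) * hi for v in values]
```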
Consequently, the authors selected an optimal feature set using information gain ranking; the next step is to train the machine with the training dataset.
The first step of the proposed model is to extract 41 features from the real network traffic. For this, custom code was written for feature extraction, and the results are shown in Fig. 44.3.
– True positive (TP)—An event in which an actual attack is correctly identified.
– True negative (TN)—An event in which no attack has taken place and no detection is made.
– False positive (FP)—An event in which the IDS raises an alarm when no attack has taken place.
– False negative (FN)—An event in which the IDS allows an actual intrusive action to pass as nonintrusive behavior.
– Accuracy—Measures the overall correctness of the IDS detection rate; for the existing system it was calculated using the following formula:
Accuracy = (TN + TP) / (TN + TP + FN + FP)    (44.4)
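Eq. (44.4) can be computed directly from the four event counts defined above; a one-line sketch:

```python
def accuracy(tp, tn, fp, fn):
    # Eq. (44.4): fraction of all events the IDS classified correctly.
    return (tn + tp) / (tn + tp + fn + fp)
```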
This section evaluates the performance of the proposed model. The first step is information gain ranking, by which 19 features were selected, as shown in Table 44.1.
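Information gain ranking as used above can be sketched for discrete features as follows. The helper names are illustrative; in practice a library routine (such as Weka's information-gain attribute evaluator) would be used:

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy H(Y) of a label list.
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(feature, labels):
    # IG = H(Y) - sum_v p(v) * H(Y | feature == v), for a discrete feature.
    n = len(labels)
    cond = 0.0
    for v in set(feature):
        sub = [l for f, l in zip(feature, labels) if f == v]
        cond += len(sub) / n * entropy(sub)
    return entropy(labels) - cond

def rank_features(columns, labels, top_k):
    # columns: dict feature_name -> list of values; returns top_k names.
    gains = {name: info_gain(col, labels) for name, col in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)[:top_k]
```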
The objective of this research phase is to create a trained machine for network attack classification to be used in the next phase. This phase consists of three steps, as shown in Fig. 44.4. The first step is network packet capture with Wireshark in promiscuous mode; the packets are stored in a pcap-formatted file.
44 A Hybrid Model for Anomaly-Based Intrusion Detection System 427
The next step is to preprocess the pcap file and extract the selected features.
The accuracy of the feature selection and classifiers is given in Table 44.1. The results show that, when classifying the dataset with all 41 features, an average accuracy rate of 98% is obtained for naive Bayes, versus 95.8% when using the selected features. These results were obtained after three rounds of training. Table 44.2 compares the accuracy of the 41 and 19 feature sets; as shown there, the full feature set achieved better results than the selected features. The network IDS research used Weka (Waikato Environment for Knowledge Analysis) [26].
Dependency ratio = HVF/TIN − OTH/TON    (44.5)
In the following experiment, we finally selected 25 features for use and summarized the experimental results in two classes, as shown in Table 44.4. Overall, the 25 features give the best results after three rounds of machine training (comparable to all 41 features). Moreover, applying the proposed model further increases execution speed and decreases training time by about 25%.
In the following experiment, we measured the detection time for both the signature- and anomaly-based systems. The detection time of the anomaly-based system depends on the feature count; in this experiment, we adopted the selected 25 features. The result is shown in Fig. 44.5.
As shown in the figure, the signature-based system has a slower detection time for probe-type attacks, while the anomaly-based system is slower than the signature-based system for DoS, U2R, and R2L attacks. Consequently, the proposed hybrid model can classify traffic more quickly than other hierarchical models.
To better evaluate the performance of the proposed method, we compared the proposed model with other researchers' results. Table 44.5 shows the detection ratio comparison between the proposed model and state-of-the-art methods. The detection ratio of our model is higher than that of two of the compared methods and lower than that of one.
In summary, we adopted the naïve Bayes technique for the anomaly-based system, which made it possible to train new types of attacks with the NUM15 dataset. We conclude from our results and empirical data that combining signature- and anomaly-based systems is more effective. Because Snort can detect known intrusions, it can be used to benchmark the trained machine. Besides, some features have no relevance in an anomaly-based intrusion detection system, while other features can increase the accuracy. In this regard, we postulate that the combination of these solutions can save intrusion detection time when they are used in parallel. The strengths of the proposed model are improved accuracy compared with some methods and models, as well as quick training time and easy retraining.
44.6 Conclusion
This study has proposed a hybrid approach to train the machine in an effective way using a processed dataset and a signature-based IDS. The proposed system is a new kind of hybrid system that combines signature-based and anomaly-based detection approaches. The signature-based detection system is Snort, a widely used real-time network IDS. Snort is applied first to detect known intrusions in real time and has a low false positive rate. The proposed hybrid system combines the advantages of these two detection methods. In other words, because Snort can detect known intrusions, it can be used to benchmark the trained machine. Also, some features have no relevance in an anomaly-based intrusion detection system, while other features can increase the accuracy. Our feature selection process is used to reduce the number of features to 25; the naive Bayes testing model then works on those features. The advantages of our hybrid model are as follows:
• The model is easy to install and maintain.
• Re-modeling of the naïve Bayes classifier is easy.
• The model increases the accuracy.
• The model is designed using a fault-tolerant architecture.
The experimental results show that the proposed hybrid system can increase the accuracy and can detect novel intrusions after multiple trainings using the corrected analysis table. The evaluation results show that the accuracy rate of the proposed model is 97.5%. Moreover, it can increase execution speed and decrease processing time. The next task is to study the performance and classification in terms of computation speed and to compare with other methods.
References
1. Reazul Kabir, Md., Onik, A.R., Samad, T.: A network intrusion detection framework based on
Bayesian network using wrapper approach. Int. J. Comput. Appl. 166(4), 13–17 (2017)
2. Ashoor, A.S., Gore, S.: Importance of intrusion detection system (IDS). Int. J. Sci. Eng. Res.
1–7 (2005)
3. Patel, K.K., Buddhadev, B.V.: An architecture of hybrid intrusion detection system. Int. J. Inf.
Netw. Secur. 2(2), 197–202 (2013)
4. Ugtakhbayar, N., Usukhbayar, B., Nyamjav, J.: Improving accuracy for anomaly based IDS
using signature based system. Int. J. Comput. Sci. Inf. Secur. 14(5), 358–361 (2016)
5. Pathan, A.K.: The State of the Art in Intrusion Prevention and Detection. CRC Press (2014)
6. Pajouh, H.H., Dastghaibyfard, G.H., Hashemi, S.: Two-tier network anomaly detection model:
a machine learning approach. J. Intell. Inf. Syst. 61–74 (2017)
7. Naga Surya Lakshmi, M., Radhika, Y.: A complete study on intrusion detection using data
mining techniques. IJCEA IX(VI) (2015)
8. Stampar, M., et al.: Artificial Intelligence in Network Intrusion Detection
9. Anderson, J.P.: Computer security threat monitoring and surveillance. In: Technical report,
James P. Anderson Co., Fort Washington, Pennsylvania (1980)
10. Yorozu, Y., Hirano, M., Oka, K., Tagawa, Y.: Electron spectroscopy studies on magneto-optical
media and plastic substrate interface. IEEE Trans. J. Mag. Jpn. 2, 740–741 (1987) [Digests 9th
Annual Conference on Magnetics Japan, p. 301, 1982]
11. Zenghui, L., Yingxu, L.: A data mining framework for building Intrusion detection models
based on IPv6. In: Proceedings of the 3rd International Conference and Workshops on Advances
in Information Security and Assurance. Seoul, Korea, Springer-Verlag (2009)
12. Young, M.: The Technical Writer’s Handbook. University Science, Mill Valley, CA (1989)
13. Androulidakis, G., Papavassiliou, S.: Improving network anomaly detection via selective flow-
based sampling. Commun. IET 399–409 (2008)
14. Te-Shun, C., Fan, J., Kia, M.: Ensemble of machine learning algorithms for intrusion detection,
pp. 3976–3980
15. Neelam, S., Saurabh, M.: Layered approach for intrusion detection using Naive Bayes classifier.
In: Proceedings of the International Conference on Advances in Computing, Communications
and Informatics, India (2012)
16. Gómez, J., Gil, C., Padilla, N., Baños, R., Jiménez, C.: Design of Snort-based hybrid intrusion
detection system. In: IWANN 2009, pp. 515–522 (2009)
17. Cepheli, Ö., Büyükçorak, S., Kurt, G.K.: Hybrid intrusion detection system for DDoS attacks.
J. Electr. Comput. Eng. 2016 (2016). Article ID 1075648
18. Hussein, S.M., Mohd Ali, F.H., Kasiran, Z.: Evaluation effectiveness of hybrid IDS using Snort
with Naïve Bayes to detect attacks. In: IEEE DICTAP 2nd International Conference, May 2012
19. Dhakar, M., Tiwari, A.: A novel data mining based hybrid intrusion detection framework. J.
Inf. Comput. Sci. 9(1), 37–48 (2014)
20. Veeramachaneni, K., Arnaldo, I., Cuesta-Infante, A., Korrapati, V., Bassias, C., Li, K.: AI2:
training a big data machine to defend. In: 2nd IEEE International Conference on Big Data
Security (2016)
21. Aburomman, A.A., Reaz, M.B.I.: Review of IDS development methods in machine learning.
Int. J. Electr. Comput. Eng. (IJECE) 6(5), 2432–2436 (2016)
22. Snort. http://www.snort.org
23. Pachghare, V.K., Khatavkar, V.K., Kulkarni, P.: Pattern based network security using semi-
supervised learning. Int. J. Inf. Netw. Secur. 1(3), 228–234 (2012)
24. Hlaing, T.: Feature selection and fuzzy decision tree for network intrusion detection. Int. J.
Inform. Commun. Technol. 1(2), 109–118 (2012)
25. Wang, Y., Yang, K., Jing, X., Jin, H.L.: Problems of KDD Cup 99 dataset existed and data
preprocessing. Appl. Mech. Mater. 667, 218–225 (2014)
26. Weka. http://weka.sourceforge.net
27. Olusola, A.A., Oladele, A.S., Abosede, D.O.: Analysis of KDD’99 intrusion detection dataset
for selection of relevance features. In: Proceedings of the WCECS 2010, USA (2010)
28. Aslahi-Shahri, B.M., Rahmani, R., Chizari, M., Maralani, A., Eslami, M., Golkar, M.J.,
Ebrahimi, A.: A hybrid method consisting of GA and SVM for intrusion detection system.
Neural Comput. Appl. 27(6), 1669–1676 (2016)
29. Maxion, R.A., Roberts, R.R.: Proper use of ROC curves in intrusion/anomaly detection. Tech-
nical report CS-TR-871 (2004)
Chapter 45
A Method for Precise Positioning
and Rapid Correction of Blue License
Plate
Jiawei Wu, Zhaochai Yu, Zuchang Zhang, Zuoyong Li, Weina Liu
and Jiale Yu
Abstract To alleviate the problems of slow speed and weak correction ability of existing license plate correction methods under complex conditions, this paper presents a faster license plate positioning method based on color component combination and color region fusion, and develops a more accurate correction algorithm for the blue license plate using probabilistic Hough transform and perspective transform. The proposed methods exploit the characteristic white-on-blue layout of the Chinese license plate. Color component combination in the HSV and RGB color spaces and image thresholding are first performed to obtain the background region of the blue license plate and its character region. Then, both regions are fused to obtain the complete and accurate license plate region. Finally, edge detection, probabilistic Hough transform, and perspective transform are performed to achieve rapid license plate correction. Experimental results show that the average correction time of the blue license plate obtained by the proposed method is 0.023 s, and the average correction rate is 95.0%.
45.1 Introduction
The license plate recognition system is a key part of an intelligent traffic system and has a wide range of application scenarios, such as car theft prevention, traffic flow control, parking fee management, red-light electronic police, and highway toll stations. The general steps of a license plate recognition system are rough license plate positioning, license plate correction, accurate license plate positioning, and license plate character recognition. Each step is closely related to the others, and the quality of the license plate correction has an important impact on the subsequent steps: good correction results can greatly reduce the difficulty of subsequent processing and improve the accuracy of license plate character recognition. Therefore, license plate correction is an important step in the license plate recognition system.
In practice, there are three kinds of slant that need to be corrected: horizontal slant, vertical slant, and mixed slant. Chinese researchers have put forward many correction methods for these three situations, which can mainly be divided into two groups: (1) methods based on the traditional Hough transform [1] and (2) Radon transform-based methods [2]. The traditional Hough transform method relies on the license plate frame to determine the slant angle and therefore cannot handle characters sticking to the frame or plates with no frame. The Radon transform-based method suffers from heavy computation and slow speed, and cannot adapt to some complex conditions. Some researchers have improved the above two methods [3–6], but most of them still struggle to achieve real-time correction under complex conditions.
To solve the above problems, we propose an accurate positioning algorithm for roughly located license plates based on color component combination and color region fusion, and a rapid correction algorithm based on probabilistic Hough transform [7] and perspective transform [8], which can rapidly and accurately complete blue license plate correction under complex conditions. We use SSD [9] to obtain rough license plate images as the input of this research.
45.2 Methods
We first scaled the license plate images to the same scale and studied them at a width of 250 px. Considering that it is difficult to accurately segment the complete license plate with the RGB color model alone, and that splitting the license plate has a great influence on the subsequent correction, we propose a more accurate license plate positioning method, which integrates more low-level information and is more robust. We have also improved the license plate correction process, greatly improving correction speed and accuracy. The flowchart is shown in Fig. 45.1.
45 A Method for Precise Positioning and Rapid Correction … 435
Fig. 45.1 Flowchart of the entire algorithm. First, construct the channel combination diagram and use threshold segmentation to get the blue region. Next, threshold segmentation is used to obtain the white character region. Then, the two regions are fused to obtain the complete, accurate license plate region, and finally the region is corrected
An appropriate component combination of different color models can enlarge the difference between foreground and background and simplify the background, so as to facilitate image segmentation. The traditional algorithm [10] exploits the color features of the blue license plate, such as the blue component gray value being larger than the red component gray value in the RGB model. This can be described as

Ib = Max{0, B − R}    (45.1)

where Ib is the result, B is the blue component of the RGB color model, and R is the red component of the RGB color model. The Otsu algorithm [11] is then used to binarize the combination image to obtain the blue region of the license plate.
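The component combination of Eq. (45.1) followed by Otsu binarization can be sketched as below. A small NumPy Otsu implementation stands in for OpenCV's `cv2.threshold(..., cv2.THRESH_OTSU)` so the sketch is self-contained; the RGB channel order is an assumption:

```python
import numpy as np

def color_combination(img_rgb):
    # I_b = Max{0, B - R} (Eq. 45.1): enlarges blue-region contrast.
    r = img_rgb[..., 0].astype(int)
    b = img_rgb[..., 2].astype(int)
    return np.maximum(0, b - r).astype(np.uint8)

def otsu_threshold(gray):
    # Minimal Otsu: pick the threshold maximizing between-class variance.
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var, w0, sum0 = 0, -1.0, 0.0, 0.0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / (total - w0)
        var_between = w0 * (total - w0) * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

A binary blue-region mask is then `color_combination(img) > otsu_threshold(color_combination(img))`.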
However, this algorithm fails in some cases. As shown in Fig. 45.2, it is not robust to complex conditions such as character adhesion to the border and dim illumination: it cannot obtain the complete license plate region, and the selected blue region is also incomplete.
To solve the problems of the traditional algorithm, we converted the preprocessed license plate image into the HSV color model and obtained single-channel images of the hue, saturation, and value components through channel separation. After careful observation of the three components of the RGB and HSV color models, we found that the gray value obtained by subtracting the red component from the value component is larger in the blue region, as shown in Fig. 45.3. Therefore, it is
436 J. Wu et al.
Fig. 45.2 Example of RGB component subtraction. a Original license plate images, b the results of subtracting the red component from the blue component of the RGB color model
Fig. 45.3 Representative component combination results, from left to right: original image, value component of the HSV color model, red component of the RGB color model, gray image obtained by subtracting the red component from the value component, and the thresholding result of that gray image
Fig. 45.4 Example of the LCRS method. The left side is the original image, the middle is the blue region binary image, and the right side is the result of the LCRS method
easier to obtain the blue region of the license plate by constructing a composite image from the value component of the HSV color model and the red component of the RGB color model. It can be described as

Ib = Max{0, V − R}    (45.2)

where V is the value component of the HSV color model and R is the red component of the RGB color model.
However, a new problem arises, as shown in Fig. 45.4: for a license plate image taken from a car whose body is blue, the blue background around the license plate must be removed. We propose a large connected region screening (LCRS) method to obtain the binary map of the blue region. The LCRS method is defined as follows:
Step 1: Find the outer contours of all connected regions in the binary image.
Step 2: Find the minimum bounding rectangle corresponding to each outer contour.
Step 3: Screen each region by the size of its minimum bounding rectangle:

IR = { IR, if wrect < K ∗ w and hrect < K ∗ h; IBG, otherwise }    (45.3)
where K is a ratio relative to the width and height of the binary image; in this paper, the value of K is 0.9. wrect and hrect are the width and height of the minimum bounding rectangle. The rationale for this K value is that the background region comes from the color of the car body, which is usually a solid color, so the minimum bounding rectangle of the background region takes up a large proportion of the image width and height.
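The LCRS screening rule of Eq. (45.3) reduces to filtering bounding rectangles by size. A sketch over (x, y, w, h) rectangles (as returned by, e.g., OpenCV's `cv2.boundingRect`); `lcrs_filter` is an illustrative name:

```python
def lcrs_filter(rects, img_w, img_h, k=0.9):
    """Keep rects that plausibly belong to the plate (Eq. 45.3): a region
    whose minimum bounding rectangle spans >= k of the image width or
    height is treated as car-body background and discarded."""
    return [(x, y, w, h) for (x, y, w, h) in rects
            if w < k * img_w and h < k * img_h]
```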
However, some license plate characters adhere to the license plate frame. The blue region alone cannot solve the problem of characters sticking to the edge of the license plate, and corner artifacts will occur. If the character region is added, a more complete license plate region is obtained. The rule for obtaining the white region can be formulated as
Iw = { 255, if Hmin < h < Hmax and Smin < s < Smax and Vmin < v < Vmax; 0, otherwise }    (45.4)
Fig. 45.5 Example of two-region fusion. a Original image, b the result of subtracting the red component from the value component, c the white region thresholding result, d the results of two-region fusion
If = { 255, if Ib + Iw > 255; Ib + Iw, otherwise }    (45.5)
where I b is the blue region binary image and I w is the white region binary image.
The results of two-region fusion are shown in Fig. 45.5.
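Eqs. (45.4) and (45.5) can be sketched directly with NumPy, assuming the image is already in HSV and the threshold ranges are supplied; the function names are illustrative:

```python
import numpy as np

def white_region(hsv, h_rng, s_rng, v_rng):
    # Eq. (45.4): per-channel in-range test -> 255/0 mask.
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    m = ((h_rng[0] < h) & (h < h_rng[1]) &
         (s_rng[0] < s) & (s < s_rng[1]) &
         (v_rng[0] < v) & (v < v_rng[1]))
    return np.where(m, 255, 0).astype(np.uint16)

def fuse(i_b, i_w):
    # Eq. (45.5): saturating add of the blue and white binary masks.
    total = i_b.astype(np.uint16) + i_w.astype(np.uint16)
    return np.minimum(total, 255).astype(np.uint8)
```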
The binary image after the two-region fusion is processed with a morphological closing operation to remove black holes in the license plate region. The Canny operator is then applied to the closed image for edge detection [12]. From the edge detection result, the largest contour is retained, which is the outer contour binary image of the license plate region. Then, the probabilistic Hough transform [7] is applied to this contour binary image to fit the contour line segments. The probabilistic Hough transform [7] is an improvement on the traditional Hough transform: it is much faster and can detect the end points of line segments. After the line segments are detected, because the number of endpoints is small, the four corner points of the license plate are easily found by iterating over the endpoints.
Perspective transformation [8] is defined as the projection of an image onto a new viewing plane, also called projective mapping. After the four corner points of the license plate are obtained, they are used as the four source points. Then the positions of the four corrected target points are calculated: the distances between the upper-left source point and its two adjacent source points are taken as the length and width of the target rectangle, and the upper-left source point is taken as the upper-left target point. After the four pairs of corresponding points are obtained, the perspective transformation matrix is calculated, and the original license plate image is then perspective-transformed with this matrix to obtain the corrected license plate image. In this paper, OpenCV was used for the perspective transformation, and the average time spent on 40 images with a width of 250 px was 0.002 s.
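The perspective step solves a 3×3 homography from the four source/target point pairs. The sketch below sets up the same 8×8 linear system that OpenCV's `cv2.getPerspectiveTransform` solves; the function names are illustrative:

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve the 3x3 homography mapping 4 src points to 4 dst points
    (h33 is fixed to 1, giving 8 unknowns for 8 equations)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, x, y):
    # Apply the homography to one point (homogeneous divide).
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]
```

Warping the whole image with this matrix corresponds to `cv2.warpPerspective`.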
To verify the effectiveness of the proposed method, a test set of 40 real blue license plate images was built from the vehicle images of a real license plate coarse positioning system and real-scene segmentation. It contains complex conditions such as night scenes, uneven illumination, large lateral tilt angles, blurred characters, character adhesion to the border, and a variety of horizontal slant, vertical slant, and mixed slant images.
The experiments were run on an Intel Core i5 4210M 2.60 GHz processor with 8 GB RAM under the Windows 7 operating system. The Eclipse integrated development environment was used, the code was written in Python, and the open-source library OpenCV assisted the programming. Only one thread was used in the tests.
To verify that the proposed algorithm has advantages over traditional methods, three comparison experiments were designed using these 40 images as the test dataset. The first algorithm is based on the traditional Hough transform, and the second on the Radon transform.
The comparison results of the three algorithms on some images are shown in Fig. 45.6. On rough license plate samples with a large slant angle, the proposed algorithm has higher correction accuracy and better robustness. The comparison results of the 40
Fig. 45.6 Comparison of three license plate correction methods. a Original license plate image, b the correction result of the traditional Hough transform-based method, c the correction result of the traditional Radon transform-based method, d the correction result of the proposed method
Table 45.2 Results obtained by different methods

Number of tests | Method                  | Time (s) | Accuracy (%)
40              | Traditional Hough-based | 5.122    | 77.5
40              | Radon-based             | 0.320    | 85.0
40              | Proposed method         | 0.023    | 95.0
complete test set images are shown in Table 45.2. The test set images are coarse license plate positioning pictures scaled to a width of 250 px, and the same test set was used for all three algorithms. The experimental results show that the proposed algorithm outperforms the first two algorithms in both time and accuracy. Compared with most other existing license plate correction algorithms, it still has advantages.
To verify that the algorithm generalizes well, 50 real license plate pictures were collected and corrected by this method. 46 were corrected successfully and 4 failed; the success rate is 92.0%, and the average correction time is 0.016 s. The failures were caused by white metal rods occluding, and green paint interfering with, the license plates. The experimental results show that the proposed algorithm has good generalization ability and can complete blue license plate correction quickly in most cases.
45.4 Conclusions
A precise positioning and correction method for the blue license plate is proposed in this paper. The proposed method first uses color component combination and color region fusion to accurately localize the license plate, and then uses the probabilistic Hough transform and perspective transformation to quickly correct it. Experimental results show that the proposed method has good real-time performance and a high correction rate, which satisfies the requirements of real-time monitoring in real-world scenes. Meanwhile, the license plate correction method has good robustness and a large range of corrective adaptability. In the future, we will extend the proposed method to other types of Chinese license plates.
References
1. Rui, T., Shen, C., Zhang, J.: A fast algorithm for license plate orientation correction. Comput.
Eng. 30(13), 122–124 (2004)
2. Ge, H., Fang, J., Zhang, X.: Research on license plate location and tilt correction algorithm in
License plate recognition system. J. Hangzhou Dianzi Univ. 27(2) (2007)
3. Wang, S., Yin, J., Xu, J., Li, Z., Liang, J.: A fast algorithm for license plate recognition
correction. J. Chang. Univ. (Nat. Sci. Ed.) 30(04), 76–86 (2018)
4. Wang, N.: License plate location and slant correction algorithm. Ind. Control. Comput. 27(11),
25–26 (2014)
5. Ji, J., Cheng, Y., Wang, J., Luo, J., Chang, H.: Rapid correction of slant plate in license plate
recognition. Technol. Econ. Guid. 26(35), 68 (2018)
6. Lu, H., Wen, H.: License plate positioning and license plate correction method under different
degrees of inclination. Mod. Ind. Econ. Inf. 6(05), 69–71 (2016)
7. Stephens, R.S.: Probabilistic approach to the Hough transform. Image Vis. Comput. 9(1), 66–71
(1991)
8. Niu, Y.: Discussion about perspective transform. J. Comput.-Aided Des. Comput. Graph. 13(6),
549–551 (2001)
9. Liu, W., Anguelov, D., Erhan, D., et al.: SSD: single shot multibox detector. In: European
Conference on Computer Vision, pp. 21–37. Springer, Cham (2016)
10. Zheng, K., Zheng, C., Guo, S., Cheng, K.: Research on fast location algorithm of license plate
based on color difference. Comput. Appl. Softw. 34(05), 195–199 (2017)
11. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9(1), 62–66 (1979)
12. Canny, J.: A computational approach to edge detection. In: Readings in Computer Vision, pp. 184–203. Morgan Kaufmann (1987)
Chapter 46
Preliminary Design and Application
Prospect of Single Chinese Character
Calligraphy Image Scoring Algorithm
Abstract This paper improves on the image classification task based on deep learning and proposes a new font grading system to help calligraphy lovers practice calligraphy. The basic model of the proposed framework is ResNet, on top of which dilated convolution, deformable convolution, and deformable pooling are used to improve performance. Experimental results show that the proposed algorithm can make a reasonable judgment on handwriting.
46.1 Introduction
With the extensive development of MOOCs, Chinese calligraphy courses have emerged on the major MOOC platforms in China [1]. The 2018 National Online Open Course Evaluation of China places more emphasis on students' intensive and quantitative homework exercises [2]. The authors' research team found that the number of students taking calligraphy courses online exceeds the average enrollment of other courses, and that evaluating students' calligraphy homework is mentally demanding work in which it is difficult for scorers to remain objective over the long term. The research team therefore looked for a program that could automatically score calligraphy works, but found only a few simple tools that collect expert scores and perform simple mathematical processing, and no software that can directly score calligraphy as art. In the field of education research, timely feedback and evaluation are of great significance in training students' calligraphy skills [3]. At present, intelligent recognition of Chinese characters has made considerable progress, and various kinds of Chinese character recognition software can be seen everywhere. The development of deep learning technology makes a more scientific intelligent evaluation of Chinese calligraphy possible.
46.2 Method
This paper proposes a new algorithm for calligraphy grading based on deep learning, which can be used to help students with calligraphy exercises. Calligraphy font recognition based on a convolutional neural network can extract features automatically and avoids the defects of manually designed features. The network structure proposed in this paper is based on the ResNet [4] image classification network combined with the recently proposed dilated convolution [5] kernel and deformable convolution [6] design.
The proposed scoring algorithm for Chinese calligraphy is an improvement on a classification algorithm. Generally speaking, the goal of a classification task is to find the probabilities of the different target categories and select the category with the highest probability as the classification result [7–11]. Our scoring algorithm modifies this: we still take the category with the highest probability as the classification result, and then use that category's probability value as the reference value for scoring. As mentioned above, the proposed algorithm is based on the image classification task; we therefore refer to ResNet, a very effective image classification network.
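The scoring rule described above (take the arg-max class, use its softmax probability as the grading reference) can be sketched as follows; `score_handwriting` and the 0–100 rescaling are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1D logit vector.
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def score_handwriting(logits, scale=100.0):
    """The class with the highest probability is the recognized character;
    its probability, rescaled, is the grading reference value."""
    p = softmax(np.asarray(logits, float))
    cls = int(np.argmax(p))
    return cls, float(p[cls] * scale)
```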
The advantage of the deep residual network is that it can make the network very deep, thanks to the building block, whose structure is shown in Fig. 46.1. ResNet addresses the degradation problem by introducing a deep residual learning framework: instead of hoping that every few stacked layers directly fit a desired underlying mapping, these layers are explicitly made to fit a residual mapping. In this paper, the
46 Preliminary Design and Application Prospect … 445
Text images differ from regular images: a large part of the image is blank, so the
features extracted from most regions are invalid. We therefore choose dilated
convolution to extract features. Figure 46.2 shows a schematic diagram of dilated
convolution, where the left side is an ordinary convolution and the right side is a
dilated convolution. As can be seen from Fig. 46.2, dilated convolution does not
extract features from all pixels but samples features across them. This way of feature
extraction prevents the network from extracting too many invalid features. The purpose
of this structure is to provide a larger receptive field, at the same computational cost,
without pooling (pooling layers cause information loss). Dilated convolution is therefore
well suited to the calligraphy font grading task. We convert the original feature extraction
method of ResNet into this method, which contributes substantially to the final result.
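The sampling pattern described above can be sketched in a few lines of NumPy. This is an illustration only (the kernel and input values are arbitrary and not taken from the network in this paper); it shows that a dilation rate of 2 enlarges a 3 × 3 kernel's receptive field to 5 × 5 while keeping nine multiplications per output pixel:

```python
import numpy as np

def dilated_conv2d(image, kernel, dilation=1):
    """Valid 2-D convolution with a dilated kernel (illustrative, not optimized)."""
    kh, kw = kernel.shape
    # The effective kernel extent grows with the dilation rate,
    # but the number of multiplications per output pixel does not.
    eh = (kh - 1) * dilation + 1
    ew = (kw - 1) * dilation + 1
    H, W = image.shape
    out = np.zeros((H - eh + 1, W - ew + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Strided slicing skips pixels between the sampled positions.
            patch = image[i:i + eh:dilation, j:j + ew:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out

img = np.arange(49, dtype=float).reshape(7, 7)
k = np.ones((3, 3))
plain = dilated_conv2d(img, k, dilation=1)    # 3x3 receptive field, 5x5 output
dilated = dilated_conv2d(img, k, dilation=2)  # 5x5 receptive field, 3x3 output
```

With dilation = 2, each output pixel aggregates the same nine samples' worth of computation while covering a 5 × 5 neighborhood, which is the "greater receptive field without pooling" property described above.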
In this work, we adopt two new modules to enhance the CNN's transformation modeling
capability, namely, deformable convolution and deformable RoI pooling. Both are based
on the idea of augmenting the spatial sampling locations in the module with additional
offsets and learning those offsets from the target task without extra supervision. These
modules can easily replace their ordinary counterparts in existing CNNs and be trained
end to end through standard backpropagation, thus producing a deformable
convolutional network. The schematic diagram of deformable convolution is shown
in Fig. 46.3: (a) the regular sampling grid (green points) of standard convolution;
(b) deformed sampling positions (dark blue points) with augmented offsets (blue
arrows) in deformable convolution; (c) and (d) special cases of (b), showing that
deformable convolution generalizes various transformations of scale,
aspect ratio, and (anisotropic) rotation.
In fact, the offsets added in the deformable convolution unit are part of the network
structure: they are computed by another, parallel standard convolution unit, and can
be learned end to end through gradient backpropagation. With the learned offsets, the
size and position of the deformable convolution kernels can be adjusted dynamically
according to the image content to be recognized. The visual effect is that the sampling
locations of convolution kernels at different positions change adaptively with the image
content, so as to fit the shape, size, and other geometric deformations of objects.
Deformable convolution and deformable pooling are shown in Fig. 46.4.
Our task is very sensitive to the direction of strokes, and an excellent
calligraphy font must handle these details well. Therefore, it is worthwhile
to extract features by means of deformable convolution; in this way, the network is
more sensitive to the orientation of fonts.
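The core sampling idea behind deformable convolution can be illustrated as follows: each point of a regular 3 × 3 grid is shifted by a learned fractional offset, and the feature map is read at the shifted position by bilinear interpolation. This is a minimal sketch of the mechanism, not the authors' implementation; in a real network the offsets come from a parallel convolution branch, whereas here they are passed in directly:

```python
import numpy as np

def bilinear_sample(fmap, y, x):
    """Bilinearly interpolate fmap at a fractional location (y, x)."""
    H, W = fmap.shape
    y = float(np.clip(y, 0, H - 1))
    x = float(np.clip(x, 0, W - 1))
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * fmap[y0, x0] + (1 - wy) * wx * fmap[y0, x1]
            + wy * (1 - wx) * fmap[y1, x0] + wy * wx * fmap[y1, x1])

def deformable_sample(fmap, center, offsets):
    """Sample a 3x3 grid around `center`, each point shifted by its own
    (dy, dx) offset, as in deformable convolution."""
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    cy, cx = center
    return [bilinear_sample(fmap, cy + gy + oy, cx + gx + ox)
            for (gy, gx), (oy, ox) in zip(grid, offsets)]

fmap = np.arange(16, dtype=float).reshape(4, 4)
# Zero offsets reduce to ordinary convolution sampling.
vals = deformable_sample(fmap, (1.0, 1.0), [(0.0, 0.0)] * 9)
```

With all offsets zero the module degenerates to standard convolution sampling, which matches case (a) in Fig. 46.3; nonzero offsets produce case (b).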
The rest of the network follows the traditional ResNet, which is itself a very strong
network.
The scoring algorithm of Chinese calligraphy proposed in this paper is based on the
improvement of the classification algorithm. The goal of the classification task is to
find out the probability of different types of targets and then select the category with
the highest probability as the classification result. As mentioned above, the algorithm
we proposed is based on the image classification task. We refer to ResNet [1], a
very effective image classification network. Different from traditional ResNet, our
proposed network adds dilated convolution, deformable convolution, and deformable
pooling to the basic ResNet.
46.3 Experiment
46.3.1 Dataset
Since Chinese calligraphy appears in few international character recognition tasks,
public databases are difficult to obtain, so we use the print fonts provided in Windows
as the database source. This source contains many fonts, but we experimented on only
two, namely, regular script and Song script. We study only these two fonts because the
main purpose of this paper is to provide a method for scholars who need to identify
characters in calligraphy, rather than to completely solve all problems in this field or
complete a particular project. Our follow-up work will expand the range of fonts.
Our data does not start from single-character images but from text images containing
multiple characters. Each image contains only Chinese characters and nothing else. We
first cut the original image into images each containing a single Chinese character;
because the images are generated by computer software, the characters are laid out
very regularly, so with some knowledge of the image dimensions it is easy to cut
them perfectly. After cutting, we get the dataset we
448 S. Liu et al.
need. Figure 46.5 shows the schematic diagram of the dataset acquisition method.
As can be seen from Fig. 46.5, this is a very neat image, so it is very easy to get the
image data we need from this image.
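The cutting step can be sketched as follows, assuming (as the text notes) that the rendered page is an exact grid of equally sized characters; the grid dimensions and image size below are illustrative, not the paper's actual values:

```python
import numpy as np

def split_into_cells(page, rows, cols):
    """Cut a regularly laid-out text image into single-character tiles.

    Assumes the page is an exact rows x cols grid, which holds for
    computer-rendered print fonts like those used here."""
    H, W = page.shape
    ch, cw = H // rows, W // cols
    return [page[r * ch:(r + 1) * ch, c * cw:(c + 1) * cw]
            for r in range(rows) for c in range(cols)]

page = np.zeros((128, 256))          # e.g. 2 rows x 4 columns of 64x64 characters
tiles = split_into_cells(page, 2, 4)
```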
46.3.2 Training
In order to verify the feasibility of the improved model for handwriting
style recognition, we trained and tested the model on the dataset of the two fonts, as
well as on the standard dataset. The final recognition accuracy
reached above 0.99. This shows that our network can complete
the task of font recognition almost perfectly; since the core of our task is in fact
identification, the proposed network is effective.
We used the TensorFlow deep learning framework to build and train the
model and conducted many experiments with various learning rates. The training
set contained 2400 samples and the test set 800. Mini-batch gradient descent was used
to update the model parameters, with batch = 50, and the training set was iterated
48 times. Too small a learning rate makes the convergence rate
too slow, while too large a learning rate skips over the optimum and prevents
convergence. Finally, the learning rate of this
experiment was set to 0.001.
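The training scheme described above (mini-batch updates with batch = 50, learning rate 0.001, 48 passes over a 2400-sample training set) can be sketched on a toy least-squares problem. This is a schematic NumPy stand-in for the actual TensorFlow training; the synthetic linear-regression data and model are assumptions of the sketch, not the paper's network:

```python
import numpy as np

# Hyperparameters reported in the text: batch size 50, learning rate 0.001,
# 2400 training samples, 48 passes over the training set.
rng = np.random.default_rng(0)
X = rng.normal(size=(2400, 8))        # stand-in for the 2400 training samples
true_w = rng.normal(size=8)
y = X @ true_w                        # noise-free synthetic targets
w = np.zeros(8)
batch, lr = 50, 0.001

for epoch in range(48):
    order = rng.permutation(len(X))   # reshuffle each pass
    for start in range(0, len(X), batch):
        idx = order[start:start + batch]
        # MSE gradient on the current mini-batch.
        grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / batch
        w -= lr * grad

loss = float(np.mean((X @ w - y) ** 2))
```

Doubling the learning rate here shortens training, while raising it much further makes the updates overshoot, which is the trade-off the paragraph describes.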
The application of this software is not limited to MOOC calligraphy courses; it can
be released separately and applied to calligraphy teaching for children and
adolescents. Applying it to the evaluation of calligraphy art contests will require
further research and development. Although Song script is a printed font, it plays an
important role in the evaluation of writing norms because of its simple strokes.
Clerical script, a font on the evolutionary step before regular script, has not yet
developed much genre differentiation and is easy to identify intelligently. Because
of the differentiation of regular scripts and running-hand scripts, their scoring
indicator system requires a larger sample for deep learning training, which is the
main work of the future. An entire calligraphy work also requires a macro layout,
so sample collection of entire calligraphy works is another major future task.
References
1. Mi, W.: The e-curriculum development: a new way for current primary and secondary school
calligraphy teaching. Curric., Teach. Mater. Method 38(07), 87–91 (2018)
2. Ministry of Education of the People’s Republic of China official website, http://www.moe.gov.
cn/srcsite/A08/s5664/s7209/s6872/201807/t20180725_343681.html. Last accessed 24 July
2018
3. Zhou, Y.: Thoughts on the construction of online open courses for art. Art Educ. 336(20),
136–137 (2018)
4. He, K.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision
and Pattern Recognition 2016, pp. 770–778 (2016)
5. Yu, F.: Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122 (2015)
6. Dai, J.: Deformable convolutional networks. In: Proceedings of the IEEE International Con-
ference on Computer Vision (2017)
7. Fanello, S.R.: Keep it simple and sparse: real-time action recognition. J. Mach. Learn. Res.
14(1), 2617–2640 (2017)
8. Lu, C.: Two-class weather classification. IEEE Trans. Pattern Anal. Mach. Intell. (99), 1 (2017)
9. Woitek, R.: A simple classification system (the Tree flow chart) for breast MRI can reduce the
number of unnecessary biopsies in MRI-only lesions. Eur. Radiol. 27(9), 3799–3809 (2017)
10. Cicero, M.: Training and validating a deep convolutional neural network for computer-aided
detection and classification of abnormalities on frontal chest radiographs. Investig. Radiol.
52(5), 281 (2017)
11. Yuan, Y.: Hyper spectral image classification via multitask joint sparse representation and
stepwise MRF optimization. IEEE Trans. Cybern. 46(12), 2966–2977 (2017)
Chapter 47
Adaptive Histogram Thresholding-Based
Leukocyte Image Segmentation
47.1 Introduction
In the medical fields, the analysis and cytometry of white blood cells (WBCs) in
blood smear images is a powerful diagnostic tool for many types of diseases, such
as infections, anemia, malaria, syphilis, heavy metal poisoning, and leukemia. A
X. Zhou
College of Mathematics and Computer Science, Fuzhou University, Fuzhou, People’s Republic of
China
e-mail: xiaogenzhou@126.com
X. Zhou · C. Wang · Z. Li (B) · F. Zhang (B)
Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang
University, Fuzhou, People’s Republic of China
e-mail: fzulzytdq@126.com
F. Zhang
e-mail: 8528750@qq.com
C. Wang
School of Computer Science and Technology, Harbin University of Science and Technology,
Harbin, People’s Republic of China
Fig. 47.1 The WBC image segmentation process of the proposed method. a Original WBC image,
b the color component combination image, c the grayscale histogram of (b), where P1 , P2 , and
P3 are three peaks of the histogram, and T is a threshold for image binarization, d segmentation
result of the leukocyte’s nucleus, e the result of (a) after removing the RBCs and background, f the
maximum object contour in the leukocyte’s edge detection result, g the leukocyte’s segmentation
result, h segmentation result of the leukocyte’s cytoplasm
computer-aided automatic cell analysis system not only saves manpower and time
cost but also reduces the effects of human error. WBC segmentation is the basis
of automatic cell image analysis, and the precision of WBC segmentation directly
influences the reliability of the blood smear image analysis.
A typical human blood smear image, which consists of WBCs, red blood cells
(RBCs, or erythrocytes), platelets, and the background, is conventionally prepared with
Wright-Giemsa stain to visualize and identify WBCs microscopically. The goal of
cell segmentation is to extract WBCs from a complex scene for subsequent analysis;
however, this is hampered by uneven staining, varying illumination conditions, the
variability of cell properties such as size, color, and shape, and the adhesion between
WBCs and RBCs. Accurate and robust WBC segmentation therefore remains a
challenging task.
The primary objective of this paper is to present a method to segment the entire leukocyte,
the nucleus, and the cytoplasm from blood smear images under standard staining
conditions; an example of leukocyte image segmentation is shown in Fig. 47.1.
Various types of segmentation methods have been proposed for cell images over
the past several decades. Thresholding, based on histogram analysis, is a widely used
technique in cell segmentation. Threshold-based
methods [1, 2] mainly include the region growing method, the watershed method [3,
4], and Otsu's method [5]. Lim et al. [6] proposed a WBC segmentation method using
image thresholding and watershed techniques. In addition, learning-based methods
comprise supervised methods, such as support vector machines (SVM) [7] and deep
47 Adaptive Histogram Thresholding … 453
neural networks, and unsupervised methods such as k-means clustering [8] and fuzzy
c-means. Zhang et al. proposed a novel method for the nucleus and cytoplasm of
leukocyte segmentation based on color space decomposition and k-means clustering.
In this paper, we propose a method to segment the nucleus and cytoplasm of leuko-
cytes in blood smear images. We employ AHT and components combination in color
space (CCCS) to segment the nucleus of the leukocyte, obtain the entire leukocyte
using Canny edge detection, and obtain the cytoplasm region by subtracting
the nucleus region from the entire leukocyte region.
The rest of the paper is structured as follows. Section 47.2 briefly introduces the
proposed method. The experimental results are shown and discussed in Sect. 47.3.
The conclusion is drawn in the final section.
We introduce a novel method to accurately segment the nucleus from the leukocyte
image, which contains two main steps. First, a novel color component combination
image (see Fig. 47.1b) is constructed by the saturation component in HSI color space,
the green component and the blue component in RGB color space, respectively.
Second, the nucleus segmentation result is obtained based on the AHT method. The
detailed process of nucleus segmentation is as follows:
(1) Components combination in color space: Construct a color component combi-
nation image I from the saturation component of the hue, saturation, and intensity
(HSI) color space and the green and blue components of RGB, using the following
formulae:
I(i, j) = S + k1 B − k2 G, (47.1)

k1 = 1 if B0 ≥ S0, and k1 = ⌈S0/B0⌉ otherwise. (47.2)
In Eq. (47.1), S denotes the normalized saturation component in HSI color space, and
G and B denote the green and blue components in RGB color space, respectively.
Symbols k1 and k2 are the weights of B and G, respectively, and k1 is adaptively set
according to Eq. (47.2), in which ⌈·⌉ indicates rounding upward and S0 and B0 are
the thresholds of the saturation and blue components, respectively, determined by our
proposed adaptive histogram thresholding.
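Equations (47.1) and (47.2) can be sketched directly in NumPy. Since the excerpt does not give the rule for k2, it is left here as a free parameter; that, and the example component values, are assumptions of this sketch:

```python
import numpy as np

def combination_image(S, B, G, S0, B0, k2=1.0):
    """Color component combination I = S + k1*B - k2*G (Eq. 47.1).

    k1 follows Eq. (47.2); k2 is left as a free weight because the
    excerpt does not specify how it is chosen."""
    k1 = 1.0 if B0 >= S0 else float(np.ceil(S0 / B0))  # Eq. (47.2)
    return S + k1 * B - k2 * G

# Illustrative components (already normalized to [0, 1]).
S = np.full((2, 2), 0.5)
B = np.full((2, 2), 0.3)
G = np.full((2, 2), 0.1)
I_cc = combination_image(S, B, G, S0=0.2, B0=0.5)  # B0 >= S0, so k1 = 1
```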
(2) Extraction of nucleus region: We first suppress image noise using the median
filter, then extract candidate nucleus regions by our proposed method AHT,
and finally remove small regions for obtaining final nucleus regions. The AHT
method includes the following steps.
Step 1: Construct a grayscale histogram, H, of the above color component com-
bination image.
Step 2: Find the peaks in H using the Matlab function “findpeaks”, and denote
their corresponding gray levels as g1 , g2 , . . . , gN , where N is the number of peaks.
Figure 47.1c shows all the three peaks of the image histogram.
Step 3: Calculate two gray levels, gM and gSM corresponding to the highest
peak and the second highest peak among the peaks, respectively, via the following
formulae:
where T is the gray level corresponding to the minimum value of H among gray
levels between the highest peak and the second highest peak.
Step 5: Obtain the nucleus segmentation result using the following equation:

BT(i, j) = 1 if I(i, j) > T, and BT(i, j) = 0 otherwise. (47.6)
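Steps 1 to 5 can be sketched as follows, with a simple local-maximum scan standing in for Matlab's “findpeaks”; the 256-bin histogram layout and the toy image are illustrative assumptions of this sketch:

```python
import numpy as np

def aht_threshold(gray):
    """AHT sketch: T is the gray level of the histogram minimum between
    the two highest peaks; returns T and the binarized image (Eq. 47.6)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))  # Step 1
    # Step 2: local maxima stand in for Matlab's findpeaks.
    peaks = [i for i in range(1, 255)
             if hist[i] > hist[i - 1] and hist[i] > hist[i + 1]]
    # Step 3: gray levels of the highest and second-highest peaks.
    top2 = sorted(peaks, key=lambda p: hist[p])[-2:]
    lo, hi = min(top2), max(top2)
    # T: minimum of H between the two peaks.
    T = lo + int(np.argmin(hist[lo:hi + 1]))
    return T, (gray > T).astype(np.uint8)                   # Step 5, Eq. (47.6)

# Toy bimodal gray-level data with peaks at 50 and 200.
gray = np.array([50] * 10 + [200] * 5)
T, bw = aht_threshold(gray)
```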
This section presents a novel method to segment cytoplasm. Specifically, the pro-
posed method first removes image background and RBCs by a preprocessing oper-
ation based on image color features, and then performs Canny [9] edge detection
to detect the contour of entire leukocyte, which is then utilized to obtain the binary
image of leukocyte. Finally, cytoplasm segmentation is achieved by subtracting the
nucleus region from the leukocyte region. The detailed steps of cytoplasm segmen-
tation are described as follows.
(1) Remove the background based on prior knowledge of image color via the fol-
lowing formula:
Ib(i, j, :) = [255, 255, 255] if I(i, j, 2) ≥ t1, and Ib(i, j, :) = I(i, j, :) otherwise, (47.7)

t1 = (I(i, j, 1) + I(i, j, 3))/2, (47.8)
where I (i, j, :) and Ib (i, j, :) denote three color component values of the pixel (i, j)
in the original image and the background removal result, respectively.
(2) Remove red blood cells (RBCs) from the image Ib by the following image
thresholding:
Ibr(i, j, :) = [255, 255, 255] if Ib(i, j, 1) ≥ t2, and Ibr(i, j, :) = Ib(i, j, :) otherwise, (47.9)

t2 = (Ib(i, j, 2) + Ib(i, j, 3))/2, (47.10)
where Ibr (i, j, :) denotes the image after removing the red blood cells.
(3) Perform median filter to smooth Ibr and remove impurities.
(4) Perform Canny edge detection to obtain the leukocyte contour.
(5) Obtain the maximum connected region from the edge detection result. The
corresponding result is shown in Fig. 47.1f.
(6) Fill the leukocyte contour to obtain leukocyte region by Matlab function “imfill”,
and then further perform the morphological operation by the Matlab function
“imopen” to obtain the final leukocyte segmentation result, which is shown in
Fig. 47.1g.
(7) Cytoplasm segmentation is achieved by subtracting the WBC nucleus region
from the leukocyte region, and the corresponding result is shown in Fig. 47.1h.
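Steps (1) and (2), the color-based removal of background and RBCs via Eqs. (47.7)–(47.10), can be sketched as below; channel order R, G, B for indices 1, 2, 3 is an assumption of this sketch:

```python
import numpy as np

def remove_background_and_rbc(img):
    """Steps (1)-(2): white out background pixels (green channel above the
    red/blue mean, Eqs. 47.7-47.8), then RBC pixels (red channel above the
    green/blue mean, Eqs. 47.9-47.10). img is an HxWx3 array, channels RGB."""
    out = img.astype(float).copy()
    t1 = (out[..., 0] + out[..., 2]) / 2.0   # Eq. (47.8), per pixel
    out[out[..., 1] >= t1] = 255.0           # Eq. (47.7): background -> white
    t2 = (out[..., 1] + out[..., 2]) / 2.0   # Eq. (47.10), per pixel
    out[out[..., 0] >= t2] = 255.0           # Eq. (47.9): RBCs -> white
    return out

# One background-like, one RBC-like, and one nucleus-like pixel.
img = np.array([[[100, 200, 100], [200, 50, 50], [60, 30, 120]]], dtype=float)
out = remove_background_and_rbc(img)
```

The remaining steps (median filtering, Canny edge detection, filling, and morphological opening) are standard operations available in most image libraries and are omitted here.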
In this paper, to validate the effectiveness of the proposed method, we used an image
database of 60 WBC images of size 260 × 260, each containing a single WBC under
standard staining conditions, provided by The People's Hospital Affiliated to Fujian
University of Traditional Chinese Medicine. There are also color differences between
images due to unstable illumination, different types of leukocytes, and so
on.
To demonstrate the superiority of the proposed method, we compared it with other
existing WBC image segmentation methods, i.e., those of Zheng et al. [10] and Gu
and Cui [11]. Segmentation results on several typical images are first evaluated
qualitatively. Then, segmentation results on the image dataset were quantitatively
evaluated using four common segmentation measures, i.e., misclassification error
(ME) [12], false positive rate (FPR), false negative rate (FNR) [13], and kappa index
(KI) [14]. Their definitions are as follows:
ME = 1 − (|Bm ∩ Ba| + |Fm ∩ Fa|)/(|Bm| + |Fm|), (47.11)

FPR = |Bm ∩ Fa|/|Bm|, (47.12)

FNR = |Fm ∩ Ba|/|Fm|, (47.13)

KI = 2|Fm ∩ Fa|/(|Fm| + |Fa|), (47.14)
where Bm and Fm are the background and foreground of the manual ideal seg-
mentation result (ground truth), respectively; Ba and Fa are the background and
foreground of the automatic segmentation result obtained by a given algorithm,
respectively; and |·| denotes the cardinality of a set. Lower values of ME, FPR, and FNR
indicate better segmentation, while higher values of KI indicate better segmentation.
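The four measures can be computed from boolean foreground masks as a direct transcription of Eqs. (47.11)–(47.14); the two tiny masks below are illustrative only:

```python
import numpy as np

def segmentation_metrics(gt, pred):
    """ME, FPR, FNR, KI (Eqs. 47.11-47.14) from boolean foreground masks:
    gt is the manual ground truth, pred the automatic result."""
    Fm, Fa = gt.astype(bool), pred.astype(bool)
    Bm, Ba = ~Fm, ~Fa
    ME = 1 - ((Bm & Ba).sum() + (Fm & Fa).sum()) / (Bm.sum() + Fm.sum())
    FPR = (Bm & Fa).sum() / Bm.sum()
    FNR = (Fm & Ba).sum() / Fm.sum()
    KI = 2 * (Fm & Fa).sum() / (Fm.sum() + Fa.sum())
    return ME, FPR, FNR, KI

gt = np.array([1, 1, 0, 0], dtype=bool)
pred = np.array([1, 0, 1, 0], dtype=bool)
ME, FPR, FNR, KI = segmentation_metrics(gt, pred)
```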
To quantitatively compare the segmentation accuracy of the three methods (i.e.,
Zheng’s method [10], Gu’s method [11], and the proposed method), we have a dataset
composed of 60 blood smear images with standard staining condition. The segmen-
tation results were quantitatively evaluated by four measures of ME, FPR, FNR,
and KI. Tables 47.1 and 47.2 show the quantitative evaluation results of leukocyte
and nuclear segmentation results on the standard staining dataset, respectively (the
best results are highlighted in bold). Figure 47.2 shows segmentation results on eight
WBC images under standard staining condition. As for the average segmentation per-
formance on the standard-stained images, Tables 47.1 and 47.2 demonstrate that the
proposed method has the lowest values of ME, FPR, and FNR and the highest KI
value, indicating that our method performs best among the three approaches.
47.4 Conclusions
Acknowledgements This work is partially supported by the National Natural Science Founda-
tion of China (61772254 and 61202318), Fuzhou Science and Technology Project (2016-S-116),
Program for New Century Excellent Talents in Fujian Province University (NCETFJ), Key Project
of College Youth Natural Science Foundation of Fujian Province (JZ160467), Young Scholars in
Minjiang University (Mjqn201601), and Fujian Provincial Leading Project (2017H0030).
Fig. 47.2 Visual segmentation results under standard staining condition with columns from left to
right: original images, ground truths, segmentation results obtained by Gu’s method [11], Zheng’s
method [10], and the proposed method, respectively
References
1. Huang, D.C., Hung, K.D., Chan, Y.K.: A computer assisted method for leukocyte nucleus
segmentation and recognition in blood smear images. J. Syst. Softw. 85(9) (2012)
2. Putzu, L., Di Ruberto, C.: White blood cells identification and counting from microscopic
blood images. In: Proceedings of the WASET International Conference on Bioinformatics,
Computational Biology and Biomedical Engineering 2013, vol. 7(1). Guangzhou, China
3. Arslan, S., Ozyurek, E., Gunduz-Demir, C.: A color and shape based algorithm for segmentation
of white blood cells in peripheral blood and bone marrow images. Cytom. Part A 85(6), 480–490
(2014)
4. Zhi, L., Jing, L., Xiaoyan, X., et al.: Segmentation of white blood cells through nucleus mark
watershed operations and mean shift clustering. Sensors 15(9), 22561–22586 (2015)
5. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man
Cybern. 9(1), 62–66 (1979)
6. Lim, H.N., Mashor, M.Y., Hassan, R.: White blood cell segmentation for acute leukemia bone
marrow images. In: Proceedings of the 2012 IEEE International Conference on Biomedical
Engineering (ICoBE) 2012. Penang, Malaysia, IEEE (2012)
7. Zheng, X., Wang, Y., Wang, G., Liu, J.: Fast and robust segmentation of white blood cell images
by self-supervised learning. Micron 107, 55–71 (2018)
8. Zhang, C., Xiao, X., Li, X., et al.: White blood cell segmentation by color-space-based k-means
clustering. Sensors 14(9), 16128–16147 (2014)
9. Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell.
8, 679–698 (1986)
10. Zheng, X., Wang, Y., Wang, G.: White blood cell segmentation using expectation-maximization
and automatic support vector machine learning. J. Data Acquis. Process. 28(5), 217–231 (2013)
11. Gu, G., Cui, D.: Flexible combination segmentation algorithm for leukocyte images. Chin. J.
Sci. Instrum. 29(9), 1977–1981 (2008)
12. Yasnoff, W.A., Mui, J.K., Bacus, J.W.: Error measures for scene segmentation. Pattern Recogn.
9(4), 217–223 (1977)
13. Fawcelt, T.: An introduction to ROC analysis. Pattern Recognit. Lett. 27(8), 861–874 (2006)
14. Fleiss, J.L., Cohen, J., Everitt, B.S.: Large sample standard errors of kappa and weighted kappa.
Psychol. Bull. 72(5), 323–327 (1969)
Chapter 48
Simulation Study on Influencing Factors
of Flyer Driven by Micro-sized PbN6
Abstract In order to guide the structural design of the micro-explosive train, the
JWL equation of state parameters of the primer explosive PbN6 is fitted first, and then
the simulation model of the flyer driven by micro-charge and the flyer impacting the
explosion-proof component is established using AUTODYN software. The effects of
charge height, flyer thickness, and shear plate aperture on flyer velocity and kinetic
energy are obtained by simulation calculation. When the charge diameter is fixed, the
flyer velocity increases first with the increase of charge height, and then gradually
tends to a fixed value. When the charge size is fixed, the maximum flyer kinetic
energy corresponds to an optimal flyer thickness. When the shear plate aperture is
smaller than the charge diameter, the flyer velocity will be improved. The relationship
between the thickness of nickel, copper, and silicon explosion-proof component
and shock wave attenuation is studied quantitatively, and the safe explosion-proof
thickness of initiating JO-9C acceptor charge is given.
48.1 Introduction
Miniaturization of explosive train can reduce the volume of ammunition fuze, which
saves more space for the circuit design of weapon system and the main charge,
thus improving the power of weapon, and it is a research hotspot of explosive train
technology. It is possible to further miniaturize the explosive train by integrating
X. He (B) · N. Yan
Beijing Institute of Technology, Beijing 100081, China
e-mail: 716280128@qq.com
W. Wu
The 53rd Research Institute of CETC, Tianjin 300161, China
L. Zhang
School of Information and Science and Technology, Peking University, Beijing 100871, China
© Springer Nature Singapore Pte Ltd. 2020 461
J.-S. Pan et al. (eds.), Advances in Intelligent Information Hiding and Multimedia
Signal Processing, Smart Innovation, Systems and Technologies 157,
https://doi.org/10.1007/978-981-13-9710-3_48
462 X. He et al.
Fig. 48.3 Parameter fitting process of JWL equation of state for detonator
P(V, E) = A(1 − ω/(R1 V))e^(−R1 V) + B(1 − ω/(R2 V))e^(−R2 V) + ωE/V (48.1)
us = c0 + s·up (48.2)

In the formula, us and up are the shock wave velocity in the solid medium and the
particle velocity at the wavefront, respectively. c0 is the elastic wave velocity in the
medium and s is an experimentally determined constant. The material parameters of
the constraints, shear plates, and flyers are shown in Table 48.2.
The air region is described by the equation of state of ideal gas:
P = (γ − 1)ρ E g (48.3)
In the formula, γ is an adiabatic index, and for ideal gases, γ = 1.4. The initial
density ρ 0 of air is 1.225 × 10−3 g cm−3 , and the specific internal energy of gas E g
= 2.068 × 105 .
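Both equations of state can be evaluated directly. With the quoted air values, the ideal-gas form of Eq. (48.3) recovers roughly atmospheric pressure (about 101.3 in the kPa-consistent AUTODYN-style unit system the text implies, with density in g/cm³); the zeroed JWL arguments below are illustrative only, not fitted PbN6 parameters:

```python
import math

def jwl_pressure(V, E, A, B, R1, R2, omega):
    """JWL pressure, Eq. (48.1); V is relative volume, E energy density."""
    return (A * (1 - omega / (R1 * V)) * math.exp(-R1 * V)
            + B * (1 - omega / (R2 * V)) * math.exp(-R2 * V)
            + omega * E / V)

def ideal_gas_pressure(rho, Eg, gamma=1.4):
    """Ideal-gas pressure, Eq. (48.3)."""
    return (gamma - 1) * rho * Eg

# Quoted air values: rho0 = 1.225e-3 g/cm^3, Eg = 2.068e5.
p_air = ideal_gas_pressure(1.225e-3, 2.068e5)   # ~101.3, about 1 atm in kPa
```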
The Relation between Flyer Speed and Displacement. After the shock wave passes
through the air gap, its pressure drops rapidly and often fails to
detonate the lead charge. The shearing process of the flyer sheet is shown in Fig. 48.5.
The flyer is first accelerated by the shock wave, maintains its speed over a certain
distance after reaching a maximum, and then decelerates slowly, so the flyer
can transfer energy more effectively. The velocity–time data of a Ti flyer driven by
a charge of ϕ0.9 mm × 1.8 mm PbN6 are obtained by simulation calculation.
Fig. 48.7 The relationship between charge height and flyer speed and kinetic energy
The shock wave output pressure of the detonator tends to a fixed value, and the flyer
speed is positively correlated with the output pressure of the primer [4]. We simulate
a ϕ0.9 mm PbN6 charge and obtain the maximum speed and kinetic energy of the flyer
as the charge height increases from 0.6 to 3 mm, as shown in Fig. 48.7.
As can be seen from Fig. 48.7, once the charge height exceeds 1.8 mm, the increase
in flyer velocity and kinetic energy levels off, so the charge height need not exceed
1.8 mm. When the kinetic energy of the flyer is greater than the critical initiation energy
EC of the explosive, the lead charge can be detonated. According to Ref. [5], the critical
initiation energy EC of JO-9C is 164.6 mJ. According to GJB1307A [6], the minimum
output energy of the detonator should be at least 25% higher than the minimum input
energy required by the detonation transfer train or terminal device. The minimum
charge height that delivers 1.25 EC is 0.85 mm, so a detonator charge height that meets
the requirements of reliable detonation transfer and margin design should be more than
0.85 mm.
Effect of Flyer Thickness on Flyer Velocity and Kinetic Energy. The process of
flyer impact initiation is high-pressure short-pulse initiation. The initiation ability is
affected by shock wave pressure and action time. The duration of shock wave pulses
in explosives τ is related to the thickness of flyer plates. The formula for calculating
τ is as follows:
2δ
τ= (48.4)
Df
In the formula, Df is the velocity of the shock wave in the flyer and δ is the thickness
of the flyer. With the size of the PbN6 charge fixed, the velocity and kinetic energy of
a titanium flyer with thickness from 0.02 to 0.1 mm are calculated by simulation. The
results are shown in Fig. 48.8.
Fig. 48.8 The relationship between flyer thickness and flyer velocity and kinetic energy
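Equation (48.4) is a one-line computation; the flyer thickness and shock speed below are illustrative values assumed for the example, not results from the paper:

```python
def pulse_duration(delta, Df):
    """Shock pulse duration tau = 2*delta/Df (Eq. 48.4):
    twice the flyer thickness divided by the shock speed in the flyer."""
    return 2.0 * delta / Df

# Example: a 0.08 mm flyer and an assumed shock speed of 5 mm/us.
tau = pulse_duration(0.08, 5.0)   # 0.032 us
```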
The simulation results show that the flyer velocity decreases linearly with increasing
flyer thickness, while the flyer kinetic energy first increases and then decreases. Except
for the flyer with a thickness of 0.02 mm, whose kinetic energy does not meet the
initiation energy requirement, flyers of the other thicknesses meet the energy margin
requirement. There exists an optimal flyer thickness with the largest kinetic energy,
which is also the preferred thickness of the flyer in design.
Effect of Shear Plate Aperture on Flyer Speed. The shear plate and the initiating
explosive together make the flyer shear forming. The aperture of the shear plate is the
diameter of the flyer. Three series of shear plate aperture are simulated and designed,
that is, the aperture of the shear plate is larger than the diameter of the charge,
close to the diameter of the charge, and smaller than the diameter of the charge. The
relationship between the aperture of the shear plate and the velocity of the flyer plate
is studied. In the simulation, the thickness of the PbN6 charge and the flyer is unchanged,
and the calculated results are shown in Fig. 48.9.
When the aperture of the shear plate (0.2, 0.3, 0.6 mm) is smaller than the charge
diameter, the smaller the aperture, the shorter the time for the flyer velocity
to reach its maximum, and the final flyer velocities tend to be the same. When the
aperture of the shear plate (0.9, 1 mm) is close to the charge diameter, the flyer can also
accelerate to a speed close to that of the small-aperture cases, but its speed then decreases
rapidly. When the aperture of the shear plate (1.2, 1.5 mm) is larger than that of the
charge, the intrusion of lateral rarefaction waves significantly affects the shear forming
of the flyer sheet [7]. The maximum velocity of the flyer sheet is obviously
48 Simulation Study on Influencing Factors of Flyer Driven … 469
Fig. 48.9 The velocity–displacement curve of flyer under different shear plate apertures
smaller than that of the small-aperture flyer sheets, its velocity begins to attenuate
earlier, and the attenuation range is more pronounced.
Therefore, in the design of the shear plate aperture, the flyer diameter should be smaller
than the charge diameter, so as to improve the flyer's detonation transfer ability.
48.3 Conclusion
When the diameter of PbN6 charge and the size of titanium flyer are fixed, the height
of charge increases from 0.6 to 3 mm. When the charge height h = 0.6 mm, the
energy requirement for initiating JO-9C is met; when h = 0.85 mm, the energy margin
requirement for initiating JO-9C is met; and when h > 1.8 mm, the increase in flyer
velocity and kinetic energy levels off. The simulation provides quantitative guidance
for designing the minimum charge height of the primer explosive.
When the diameter of PbN6 charge and titanium flyer is constant and the thickness
of flyer is increased from 0.02 to 0.1 mm, the velocity of flyer decreases linearly,
and the kinetic energy of flyer increases first and then decreases. When the flyer is
greater than 0.044 mm, the energy margin of JO-9C initiation is satisfied, and when
the flyer is equal to 0.08 mm, the kinetic energy of flyer is the largest.
When the size of the PbN6 charge and titanium flyer is fixed and the aperture of the
shear plate varies from 0.2 to 1.5 mm, the flyer velocities eventually converge when
the aperture is smaller than the charge diameter, and the smaller the aperture, the faster
the flyer reaches its maximum velocity. The larger the aperture, the smaller the maximum
flyer velocity, the earlier the velocity begins to attenuate, and the greater the attenuation
range. Therefore, the aperture of the shear plate should be smaller than the charge
diameter.
References
1. Wu, X., Tan, D.: Polytropic index calculation of condensed explosives. Explosives 2, 1–9 (1981)
2. Shen, F., Wang, H., Yuan, J.: A simple algorithm for determining the parameters of JWL equation
of state. Vib. Shock. 9, 107–110 (2014)
3. Lao, Y.: Pyrotechnics Pharmaceutics. North University of science and Technology Press, Beijing
(2011)
4. He, A.: Design Principle of Miniature Detonating Sequence Based on MEMS Fuze. Beijing
Institute of Technology, Beijing (2012)
5. Zhang, B., Zhang, Q., Huang, F.: Detonation Physics. Weapons Industry Press, Beijing (2001)
6. GJB1307A-2004: General Design Code for Aerospace Pyrotechnics. National Defense Science
and Technology Industry Committee (2004)
7. Lim, S., Baldovi, P.: Observation of the velocity variation of an explosively-driven flat flyer
depending on the flyer width. Appl. Sci. 9, 97–109 (2019)
Chapter 49
Identifying Key Learner on Online
E-Learning Platform: An Effective
Resistance Distance Approach
Abstract The teacher is never the only teacher in a class, especially in an online e-learning environment. A key learner, who is more active and eager to spread knowledge and motivation to classmates, has huge potential to improve the quality of teaching. However, identifying such a key learner is challenging and requires substantial human experience, especially since the contact channels between teachers and students are far more limited in an online e-learning environment. Inspired by resistance distance theory, in this paper we apply resistance distance and centrality to an interactive network of learners to identify the key learner who can most effectively motivate the whole class's discussion on an e-learning platform. First, we define the interactive network of learners in terms of its nodes, edges, and graph. Then the distance between nodes is replaced with the effective resistance distance to better capture propagation among the learners. Afterward, Closeness Centrality is used to measure the centrality of each learner in the interactive network. Experimental results show that the centrality we use covers and depicts the learners' discussion activities well, and that the key learner identified by our approach, under an apposite stimulus, can effectively improve the whole class's learning performance.
C. Lu (B)
School of Electronic and Information Engineering, Anshun University, Guizhou 561000, China
e-mail: firethree123@163.com
F. Zhang
Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang
University, Fuzhou 350121, China
e-mail: 8528750@qq.com
Y. Li
Computer School, Beijing Information Science and Technology University, Beijing 100101, China
e-mail: Leeyunpengs@163.com
49.1 Introduction
In many application scenarios, resistance distance is used to mine effective information from a specific graph. Balaban and Klein [4] proposed an approach to construct the co-authorship graph of mathematicians and to calculate the Erdős number (EN) for collaborative papers; computing it through resistance distances leads to rational Erdős numbers (REN). Wang and Hauskrecht [5] conducted document retrieval experiments with effective query expansion based on the resistance distance and improved retrieval performance. Resistance network models have also been used in recommender systems with collaborative filtering methods [6–8]. Guo et al. [9] proposed a data clustering method that uses a similarity metric derived from electrical resistance networks. Aporntewan et al. [10] constructed an indexing algorithm based on the electrical resistance between two vertices of a graph; their experiments show that it produces a unique index for every simple connected graph with ≤10 vertices, with a simple calculation method and good performance.
Sociologist Ritzer [11] regarded influence as the ability to transform other individuals' thoughts, feelings, and attitudes through communication with other people or groups. Current research on the influence of entities in social networks focuses mainly on two aspects: static user attributes and social network topology. For user influence based on static attributes, the most intuitive indicator is the number of a user's fans. However, Cha et al. [12] found that users with a large number of fans do not necessarily have high influence in terms of retweets or mentions. Pal et al. [13] integrated users' numbers of tweets, responses, retweets, and fans on Twitter, and then calculated the users' communication influence, mention influence, and retweeting influence. Boyd et al. [14] selected users' retweets, replies, and likes as features and obtained their influence through a weighted calculation.
In describing user influence based on the social network topology, Freeman [15] measured the importance of nodes through shortest paths in the network topology, i.e., betweenness centrality and closeness centrality. Weng et al. [16] extended the PageRank algorithm and proposed the TwitterRank algorithm, which calculates users' influence on different topics according to the network structure of the users' follower relationships and the similarity of users' interests. Ding et al. [17] jointly considered microblog publishing time, comment content, and network topology to study users' influence.
Properties of learners and their behavior form the basis of data mining in online education. In this paper, we design an interactive network of learners to extract graph-based interactive activities from their online learning behaviors. Specifically, we built a web-based discussion application into a previously implemented e-learning platform. In this application, all learners are encouraged to ask and answer questions during and after lessons. A question is usually proposed by one learner and may be answered by several different learners. We also provide an "agree" button for each answer so that other learners can give feedback on its quality (i.e., an answer with more "agree" clicks is considered a better answer to the question). In other words, this application works much like a question-and-answer site such as Quora (https://www.quora.com/) or Zhihu (https://www.zhihu.com/).
where λ is an exponential decay constant and t represents the time span since the post was released. Afterward, the resistance matrix R can be defined via the reciprocals of the elements of W, i.e.,
$$
R = \begin{bmatrix}
0 & r_{1,2} & \cdots & r_{1,n} \\
r_{2,1} & 0 & \cdots & r_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
r_{n,1} & r_{n,2} & \cdots & 0
\end{bmatrix} \tag{49.2}
$$
where r_{i,j} = c/w_{i,j}, with c a constant. For instance, Fig. 49.1 gives two examples. In Fig. 49.1a, learner A responds to two questions from B and D, respectively, and then B and D respond to the same question, which is proposed by C. Let r1, r2, r3, r4 denote the resistance values of the propagations A → D, D → C, A → B, and B → C, respectively. In Fig. 49.1b, we remove the path through learner B, keeping only the paths from learner A to D and C. Most traditional graph-based social network analyses consider only the shortest path while ignoring the other possible pathways in a connected subgraph. For
instance, using Freeman's scheme, we would derive the spreading resistance between learners A and C as min{r1 + r2, r3 + r4}, while in reality learner C may benefit from knowledge propagation through both A → D → C and A → B → C, which makes propagation easier than in Fig. 49.1b. Therefore, we introduce the resistance distance to depict this process of knowledge propagation.
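To make the construction of W and R concrete, the following sketch aggregates decayed interaction strengths into edge weights and inverts them into pairwise resistances. It is only an illustration: Eq. (49.1) is not reproduced in this excerpt, so the per-interaction decay term exp(−λt) and the scaling constant c are assumptions based on the surrounding text, and all function and variable names are ours.

```python
import math

def interaction_weights(interactions, lam):
    """Aggregate decayed interaction strengths into edge weights.

    interactions: (i, j, t) triples meaning learners i and j interacted
    t time units ago. The exp(-lam * t) decay follows the decay constant
    described for Eq. (49.1); the exact aggregation is an assumption.
    """
    w = {}
    for i, j, t in interactions:
        key = (min(i, j), max(i, j))              # undirected edge
        w[key] = w.get(key, 0.0) + math.exp(-lam * t)
    return w

def resistance_matrix(n, w, c=1.0):
    """r_ij = c / w_ij as in Eq. (49.2); no interaction means infinite resistance."""
    R = [[0.0 if i == j else float("inf") for j in range(n)] for i in range(n)]
    for (i, j), wij in w.items():
        R[i][j] = R[j][i] = c / wij
    return R
```

Frequent and recent interactions yield large weights and hence small resistances, so knowledge propagates easily along such edges.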
Assuming G is a connected graph, we replace all edge weights W with resistances R. We can then use Ohm's law (series–parallel reduction) to calculate the actual effective resistance between any two nodes in the network. For example, the resistance between A and C in Fig. 49.1a can be calculated by
$$
r_{A,C} = \frac{(r_{A,D} + r_{D,C}) \times (r_{A,B} + r_{B,C})}{r_{A,D} + r_{D,C} + r_{A,B} + r_{B,C}} = \frac{(r_1 + r_2) \times (r_3 + r_4)}{r_1 + r_2 + r_3 + r_4} \tag{49.3}
$$
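For graphs that are not simple series–parallel compositions, the effective resistance between every pair of nodes can be obtained from the Moore–Penrose pseudoinverse of the graph Laplacian, via the standard identity R⁺_ij = L⁺_ii + L⁺_jj − 2L⁺_ij. The chapter does not spell this computation out, so the following sketch, with illustrative resistance values, is ours; it reproduces Eq. (49.3) for the network of Fig. 49.1a.

```python
import numpy as np

def effective_resistance_matrix(n, edges):
    """All-pairs effective resistances from a list of (i, j, r) edges."""
    L = np.zeros((n, n))
    for i, j, r in edges:
        c = 1.0 / r                    # edge conductance
        L[i, i] += c
        L[j, j] += c
        L[i, j] -= c
        L[j, i] -= c
    Lp = np.linalg.pinv(L)             # pseudoinverse of the Laplacian
    d = np.diag(Lp)
    return d[:, None] + d[None, :] - 2.0 * Lp

# Fig. 49.1a with A=0, B=1, C=2, D=3 and illustrative resistance values
r1, r2, r3, r4 = 1.0, 2.0, 3.0, 4.0
Rp = effective_resistance_matrix(
    4, [(0, 3, r1), (3, 2, r2), (0, 1, r3), (1, 2, r4)])
# Rp[0, 2] equals the series-parallel value of Eq. (49.3)
```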
Using the above method, the resistance matrix R can be transformed into the effective resistance matrix R⁺. We then suppose that the key learner is located at the center of the graph. In this paper, Closeness Centrality is used to measure the centrality of our graph, where the centrality of node u_i ∈ U is derived by
$$
C(u_i) = \sum_{u_j \in U \setminus \{u_i\}} \frac{1}{d(i,j)} \tag{49.5}
$$
where d(i, j) denotes the effective resistance distance between u_i and u_j taken from R⁺. In practical use, it is not guaranteed that G is connected. Therefore, we use the sum of the reciprocals of the distances, instead of the reciprocal of the sum of distances, with the convention 1/∞ = 0. The time complexity is O(n³) and the space complexity is O(n²).
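A direct transcription of Eq. (49.5) with the 1/∞ = 0 convention might look as follows (a sketch; the function name and the selection step in the note below are ours):

```python
import numpy as np

def harmonic_closeness(R_eff):
    """C(u_i) = sum over j != i of 1 / d(i, j), from Eq. (49.5).

    Infinite entries of the effective-resistance matrix (unreachable
    pairs) contribute 0, implementing the convention 1/inf = 0.
    """
    n = R_eff.shape[0]
    C = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i != j and np.isfinite(R_eff[i, j]):
                C[i] += 1.0 / R_eff[i, j]
    return C
```

The key learner is then the node maximizing C, e.g. `int(np.argmax(harmonic_closeness(R_eff)))`. The double loop makes the O(n²) cost of this step explicit; the O(n³) term comes from computing R⁺ itself.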
49.4.1 Participants
109 rural labor workers (84 males; age range 29–51) in Anshun City, China, sponsored by the Guizhou Provincial Department of Science and Technology, China, entered the experiment. All of them had enrolled in an online course named "Designing and Implementation of Web Pages" on our e-learning platform. They were evenly divided into four groups (i.e., classes) according to the gender distribution (Table 49.1). The Pearson coefficient of age between any two groups shows no statistically significant difference. Participants were promised extra credit if they participated actively in the aforementioned discussion application.
answers' endorsement. Second, we give the same stimulus to one learner in each group to let him/her try to motivate the whole class's learning progress. The learners' mastery of knowledge is then examined by an additional quiz, and the statistics of this quiz are the other measurement of our method.
First, the discussion application is introduced to all groups before the online lecture. At mid-term, four interactive networks of learners are constructed from the previous discussion activities. Afterward, the same stimulus is applied to one learner in each group, selected by a different scheme: (1) in group G1, the interactive network of learners with the effective resistance distance (i.e., R⁺_G1) is utilized, and the learner with the highest centrality is selected; (2) in group G2, a similar method is used, the only difference being that the distance is the normal distance (i.e., R_G2); (3) in group G3, to account for possible effects of age, the eldest learner is selected as the key one; (4) in group G4, the control group, one learner is selected randomly as the key learner. We perform the stimulus by sending a message to each of the four key learner candidates, thanking him/her for contributing to the discussion and confirming a scholarship to encourage him/her to motivate the whole class's discussion. In addition, the target key learner is appointed monitor of the class.
The remaining half of the semester is then given to the four key learners and their classmates. To avoid potential cheating in the final exam, we organized a quiz before the final exam under the name of a pre-examination review.
49.4.4 Results
Table 49.2 The correlation between centrality and the number of “agree” or the number of answers
Correlation G1 G2 G3 G4 Overall
Centrality and number of “agree” 0.789* 0.791* 0.770* 0.762* 0.778*
Centrality and number of answers 0.644* 0.639* 0.613* 0.623* 0.630*
Number of “agree” and answers 0.576* 0.568* 0.538* 0.501* 0.546*
*denotes statistical significance (p-value < 0.05)
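The correlations in Table 49.2 are plain Pearson coefficients between per-learner quantities. A sketch with hypothetical values (not the chapter's data; the significance test behind the * marks would additionally need, e.g., scipy.stats.pearsonr):

```python
import numpy as np

# Hypothetical per-learner values, for illustration only
centrality = np.array([0.9, 0.7, 0.5, 0.4, 0.2, 0.1])
num_agree = np.array([30.0, 26.0, 15.0, 14.0, 6.0, 3.0])

# Pearson correlation coefficient between centrality and "agree" counts
r = np.corrcoef(centrality, num_agree)[0, 1]
```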
After the whole online lecture process, which ran from March 2018 to July 2018, we calculated the differences among the four classes' quiz results, shown in Table 49.3; the mid-term quiz results are also presented for comparison. It is worth mentioning that, using the Pearson correlation between the four groups' quiz results and their final exam results, a statistically significant correlation is found in all groups. This preliminary result demonstrates that our approach of identifying and motivating key learners contributes to improving the quality of the online lecture for the class.
49.5 Conclusion
In this paper, we utilized resistance distance and centrality to construct an interactive network of learners, in order to identify the key learner and further improve the performance of an online lecture. As demonstrated by our experiments, centrality based on the effective resistance distance covers and depicts learners' discussion activities well and achieves a visible improvement in the results of the final quiz.
Acknowledgements This research is supported by Major Project of the Tripartite Joint Fund of
the Science and Technology Department of Guizhou Province under grant (LH[2015]7701).
References
1. Rovai, A., Ponton, M., Wighting, M., Baker, J.: A comparative analysis of student motivation
in traditional classroom and e-learning courses. Int. J. E-Learn. 6, 413–432 (2007)
2. Blagojević, M., Živadin, M.: A web-based intelligent report e-learning system using data mining
techniques. Comput. Electr. Eng. 39(2), 465–474 (2013)
3. Chu, T.H., Chen, Y.Y.: With good we become good: understanding e-learning adoption by
theory of planned behavior and group influences. Comput. Educ. s92–s93, 37–52 (2016)
4. Balaban, A.T., Klein, D.J.: Co-authorship, rational Erdős numbers, and resistance distances in
graphs. Scientometrics 55(1), 59–70 (2002)
5. Wang, S., Hauskrecht, M.: Effective query expansion with the resistance distance based
term similarity metric. In: Proceedings of the 33rd International ACM SIGIR Conference
on Research and Development in Information Retrieval, Geneva, Switzerland, pp. 715–716
(2010)
6. Schmidt, S.: Collaborative filtering using electrical resistance network models. In: The 7th
Industrial Conference on Advances in Data Mining: Theoretical Aspects and Applications,
Leipzig, Germany, pp. 269–282 (2007)
7. Fouss, F., Pirotte, A., Saerens, M.: The application of new concepts of dissimilarities between
nodes of a graph to collaborative filtering. In: Workshop on Statistical Approaches for Web
Mining (SAWM), Pisa, Italy (2004)
8. Kunegis, J., Schmidt, S., Albayrak, Ş., Bauckhage, C., Mehlitz, M.: Modeling collaborative similarity with the signed resistance distance kernel. In: ECAI 2008: European Conference on Artificial Intelligence, Patras, Greece, pp. 261–265 (2008)
9. Guo, G.Q., Xiao, W.J., Lu, B.: Similarity metric based on resistance distance and its applications
to data clustering. Appl. Mech. Mater. 556–562, 3654–3657 (2014)
10. Aporntewan, C., Chongstitvatana, P., Chaiyaratana, N.: Indexing simple graphs by means of
the resistance distance. IEEE Access 4(99), 5570–5578 (2017)
11. Ritzer, G.: The Blackwell encyclopedia of sociology. Math. Mon. 107(7), 615–630 (2007)
12. Badashian, A.S., Stroulia, E.: Measuring user influence in GitHub: the million follower fallacy.
In: IEEE/ACM International Workshop on Crowdsourcing in Software Engineering, Austin,
USA, pp. 15–21 (2016)
13. Pal, A., Counts, S.: Identifying topical authorities in microblogs. In: ACM International Con-
ference on Web Search and Data Mining, Hong Kong, China, pp. 45–54 (2011)
14. Boyd, D., Golder, S., Lotan, G.: Tweet, Tweet, Retweet: conversational aspects of retweeting
on Twitter. In: Hawaii International Conference on System Sciences, Hawaii, USA, pp. 1–10
(2010)
15. Freeman, L.C.: Centrality in social networks conceptual clarification. Soc. Netw. 1(3), 215–239
(1978)
16. Weng, J., Lim, E.P., Jiang, J., He, Q.: Twitterrank: finding topic-sensitive influential twitterers.
In: The Third ACM International Conference on Web Search and Data Mining, New York,
USA, pp. 261–270 (2010)
17. Ding, X., Liu, B., Yu, P.S.: A holistic lexicon-based approach to opinion mining. In: The 2008
International Conference on Web Search and Data Mining, Palo Alto, USA, pp. 231–240 (2008)
18. Kong, S., Feng, L., Sun, G., Luo, K.: Predicting lifespans of popular tweets in microblog. In:
International ACM SIGIR Conference on Research and Development in Information Retrieval,
Portland, USA, pp. 1129–1130 (2012)
19. Kwak, H., Lee, C., Park, H., Moon, S.: What is Twitter, a social network or a news media? In:
International Conference on World Wide Web, Raleigh, USA, pp. 591–600 (2010)
20. Bozzo, E., Franceschet, M.: Resistance distance, closeness, and betweenness. Soc. Netw. 35(3),
460–469 (2013)
Chapter 50
A User Study on Head Size of Chinese
Youth for Head-Mounted EEG Products
Xi Yu and Wen Qi
Abstract Head-mounted EEG products are wearable devices that detect the voltage fluctuations generated by the ionic currents of neurons in the brain, which change with a person's brain state. Because EEG products collect the physiological signals of the brain directly from the head, the better an EEG headset fits the wearer's head, the more accurate the obtained EEG signals are. At present, most EEG headsets are designed for European and American users; few are suitable for Chinese users. In addition, there has been no specific study measuring the head size of Chinese people for the purpose of designing an EEG headset. This study aims to collect head size information of Chinese users. The results provide an important reference for designing EEG headsets.
50.1 Introduction
X. Yu · W. Qi (B)
Donghua University, 200051 Shanghai, China
e-mail: design_wqi@sina.com
Daniel Lacko found that available head-mounted EEG devices cannot fit every user's head [2]; the mismatch often leads to poor contact between electrodes and scalp. He proposed an adjustable EEG headset (Fig. 50.1) and verified it experimentally. Comparing his design with the existing 14-electrode Emotiv EPOC cap, he found that the modified EEG headset performed slightly better and was easier to use.
Thierry Ellena et al. noted that helmets that do not fit the user's head increase safety concerns [3], and that no current helmet fits every user. Based on 3D anthropometry, they proposed a method to improve helmet fit. Two parameters, SOD and GU, were used to evaluate the helmet fit index (HFI) in different cases, and helmets with better fit data were found to be better accepted by their users. Their study also showed that men and Europeans felt more comfortable with the experimental helmet than women and Asians, respectively.
Hong Kong Polytechnic University and Delft University compared the 3D head scans of 50 European and Chinese participants, whose ages ranged from 17 to 77 with an average of 39. The results indicated that Chinese heads are more rounded than European and American ones, with flatter foreheads and backs of the head (Fig. 50.2). These differences mean that wearable products such as helmets and masks designed for Europeans and Americans cannot fully fit Chinese users [4].
The national standard GB10000-88 of the People's Republic of China provides basic human body dimensions for Chinese adults (males 18–50 years old, females 18–55 years old) [5]. In that survey, seven kinds of head data were measured, and the data were divided into two groups by gender (Fig. 50.1).
Fig. 50.1 The head size of Chinese adult. (Image comes from the National Standard of the People’s
Republic of China GB10000-88)
The purpose of this study is to provide head size data of Chinese youth in order to help design head-mounted EEG products customized for Chinese users. There are two reasons for carrying out such a study. First, the existing head size data is outdated and unsuitable as a reference. Second, the data samples from other studies, for example in Fig. 50.1, cover a wide range of age groups; there are no specific measurements of the head size of Chinese youth. In this study, six parameters are measured, as shown in Fig. 50.2:
1. Maximum Head Breadth: the linear distance between the left and right cranial points (eu).
2. Maximum Head Length: the linear distance from the eyebrow point (g) to the back of the head (op).
3. Head Sagittal Arc: the arc length in the median sagittal plane from the eyebrow point (g) to the occipital bulge point (i). Note that, with the final design size of an EEG headset in mind, the sagittal arc is divided into a front part and a back part by the apex of the head, providing the measurement for the EEG product design.
4. Head Transversal Arc: the arc length from the tragus point (t) on one side, over the head vertex (v), to the tragus point (t) on the other side.
5. Head Circumference: the perimeter measured from the eyebrow point (g), through the back of the head (op), and back to the starting point.
6. Head Auricular Height: the vertical offset from the apex (v) to the tragus point (t).
Fig. 50.2 The definition of six sizes of the human head. (Image from National Standard of the
People’s Republic of China GB10000-88)
Four types of tools are used to measure the six parameters mentioned above: an Anthroscan Bodyscan Color 3D body scanner, a Martin-style body shape measuring ruler, a soft ruler, and a nylon cap (Fig. 50.3). The 3D body scanner, produced by Human Solutions, Germany, is used to scan each participant's head and to extract related data, such as the maximum head length and the maximum head breadth, from the three-dimensional model. The Martin ruler is used to measure the maximum head length, the maximum head breadth, and the head auricular height. The soft ruler (Fig. 50.3) is used to manually measure the head circumference. A nylon cap (Fig. 50.3) is worn by each participant to avoid interference from hair. Unlike traditional measurements, red markers are pasted on the nylon cap at the electrode positions FP1, FP2, F3, F4, T7, T8, P7, and P8 of the international 10–20 system. An online questionnaire collects each participant's personal information, including name, gender, age, education, birth province, ethnicity, student number, and contact information, along with questions about their opinions on EEG products.
Fig. 50.3 The measurement tools (left up: anthroscan bodyscan; right up: Martin-style ruler; left
down: soft ruler right down: nylon cap)
First, each participant filled in the name–number registration form and answered the online questionnaire. They were informed that the data would be used only for research purposes and would not be shared with others. Each participant then took off his/her shoes, put on the nylon cap, and entered the Anthroscan Bodyscan Color 3D body scanner for scanning.
Following that, the authors used the antenna gauge of the Martin ruler to measure the maximum head breadth (the linear distance between the left and right cranial points) and the maximum head length (the linear distance from the eyebrow point to the back of the head). The cross gauge of the Martin ruler was then used to measure the head auricular height (the offset from the apex at the tragus point). After the measurements with the Martin ruler, the soft ruler was used to measure three parameters: the head sagittal arc, the head transversal arc, and the head circumference.
The last step was to measure height and weight, and to check the correctness of each participant's information and whether any measurements were missing. After the experiment, the full-body 3D model data of each participant was processed by the experimenter with the Anthroscan Bodyscan software, and the head data was extracted from the 3D model using Rhino.
50.4 Results
This study measured 20 Chinese young undergraduate and postgraduate students in total, including 10 males and 10 females. They come from both northern and southern parts of China, such as Liaoning, Jiangsu, and Guangdong Provinces, so the geographical distribution of the samples is quite wide. The average age is 23.
The average height of the 20 samples is 170 cm, and the median is 168 cm. The average height of the male participants is 177 cm, and the median is 176 cm. The average height of the female participants is 162 cm, and the median is 162 cm. The average body weight of the whole sample is 61 kg, and the median is 62 kg. The average weight of the male students is 70 kg, and the median is 70 kg. The average weight of the female students is 52 kg, and the median is 52 kg.
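These means and medians are straightforward to reproduce; a sketch with hypothetical height samples (the actual per-participant values are not listed in the chapter):

```python
import numpy as np

# Hypothetical height samples in cm, standing in for the two gender groups
male_height = np.array([180, 176, 175, 179, 174, 178, 177, 176, 178, 177])
female_height = np.array([160, 163, 162, 161, 164, 162, 163, 160, 163, 162])

def summarize(x):
    """Mean and median, rounded to whole units as in the chapter's reporting."""
    return round(float(np.mean(x))), round(float(np.median(x)))
```

For example, `summarize(male_height)` returns the (mean, median) pair for the male group.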
The results of this experiment include the maximum head breadth, maximum
head length, sagittal arc length, coronal arc length, head auricular height, and head
circumference.
It should be noted that, in addition to the sagittal arc length itself, the sagittal arc is divided into a front part and a back part by the apex of the head, and this measurement is provided for the design of the EEG product. Figure 50.4 shows a summary of the experimental data statistics; the specific data are elaborated and analyzed in this chapter.
50.5 Conclusion
In this study, the authors measured and analyzed the head size of Chinese youth in order to provide reference data for designing head-mounted EEG headsets for Chinese users. We found that the average head breadth, average head length, and average head circumference of the Chinese samples are smaller than those of European users, and that for the same head length, the head breadth of a Chinese person is larger than that of a European person. A person's head circumference is affected by personal attributes such as height, weight, and age, while the maximum head breadth and maximum head length are comparatively less affected. In terms of product appearance, the number of electrodes is not the primary factor considered by Chinese youth when selecting an EEG headset.
Acknowledgements The authors would like to thank the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (No. TP2015029) for financial support. The study is also supported by the Fundamental Research Funds for the Central Universities.
References
1. Zhang, H., Wang, H.: Study on classification and recognition of multi-lead EEG signals. Com-
put. Eng. Appl. 24, 228–230 (2008)
2. Lacko, D.: Ergonomic design of an EEG headset using 3D anthropometry. J. Appl. Ergon. 58,
128–136 (2017)
3. Ellena, T., Subic, A.: The helmet fit index-an intelligent tool for fit assessment and design
customization. J. Appl. Ergon. 55, 194–207 (2016)
4. Roger, B., Shu, C.: A comparison between Chinese and Caucasian head shapes. J. Appl. Ergon. 41, 832–839 (2010)
5. National Standard—Anthropometric Terminology (GB 3975–1983). China Standard Press,
Beijing (1984)
6. China’s National Development and Reform Commission, The outline of the 13th five-year plan
for national economic and social development of the People’s Republic of China, Xinhua News
Agency 6(1) (2016)
7. Chinese Academy of Sciences, Brain Science and Brain-Like Intelligence Technology. Shen-
zhen International Genomics Conference, Institute of Neuroscience (2015)
8. Xiao, H., Xia, D.: Research on head and face size of Chinese adults. J. Ergon. 4(4) (1998)
9. Roger, B.: Size China: a 3D anthropometry survey of the Chinese head. Dissertation, Delft
University of Technology (2011)
10. Yan, L., Roger, B.: The 3D Chinese head and face modeling. J. Comput. Aided Des. 44(1),
40–47 (2012)
11. Yu, X., Qi, W.: A user study of wearable EEG headset products for emotion analysis. In:
ACM International Conference Proceeding Series, December 21, 2018, ACAI 2018 Confer-
ence Proceeding—2018 International Conference on Algorithms, Computing and Artificial
Intelligence; ISBN-13: 9781450366250. https://doi.org/10.1145/3302425.3302445
Author Index