
This article has been accepted for inclusion in a future issue of this journal.

Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS 1

Complex Network Construction of Multivariate Time Series Using Information Geometry
Jiancheng Sun, Senior Member, IEEE, Yong Yang, Senior Member, IEEE, Neal N. Xiong, Senior Member, IEEE,
Liyun Dai, Xiangdong Peng, and Jianguo Luo

Abstract—A cyber physical system (CPS) is a tightly coupled integration of, and interaction between, computational and physical components. In many cases, information collection in a CPS is provided by a group of distributed sensors, all of which change continuously with time. Thus, the sensor information is usually in the form of time series. One particularly interesting application in time series analysis is the use of complex networks to represent and study the behavior of a system. Complex networks have been playing an important role in analyzing complex systems, as they help in understanding the topological structure of systems with different interacting units. In this paper, we propose a reliable method for constructing complex networks from multivariate time series (MTSs) in the single-sensor and multisensor cases based on information geometry theory, which allows the information in the time series to be extracted by analyzing the associated complex network. We first estimate covariance matrices, and then a geodesic-based distance between the covariance matrices is introduced. Consequently, the network can be constructed on a Riemannian manifold where the nodes and edges correspond to the covariance matrices and the geodesic-based distances, respectively. The proposed method provides us with a nonlinear relationship and intrinsic geometry viewpoint to understand the MTSs, and also an alternative approach to fuse, model, represent, and visualize the multisensor data in CPS. A number of experimental studies and numerical examples are presented to demonstrate the generality and the effectiveness of our approach with both synthetic and real datasets.

Index Terms—Complex networks, information geometry, time series.

Manuscript received February 25, 2017; revised June 5, 2017; accepted September 6, 2017. This work was supported in part by the National Natural Science Foundation of China under Grant 61362024, Grant 61662026, and Grant 61661022, and in part by the Natural Science Foundation of Jiangxi Province, China under Grant 20161BAB202056. This paper was recommended by Associate Editor A. Trappey. (Corresponding author: Jiancheng Sun.)
J. Sun, L. Dai, X. Peng, and J. Luo are with the School of Software and Communication Engineering, Jiangxi University of Finance and Economics, Nanchang 330013, China (e-mail: sunjc73@gmail.com; lyundai@sohu.com; pxdfj@163.com; luojianguo65@163.com).
Y. Yang is with the School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China (e-mail: greatyangy@126.com).
N. N. Xiong is with the Department of Mathematics and Computer Science, Northeastern State University, Tahlequah, OK 74464 USA (e-mail: xiongnaixue@gmail.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSMC.2017.2751504
2168-2216 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

Many real systems can be modeled as networks (or graphs), where the elements of the system are nodes and interactions between elements are edges. A cyber physical system (CPS) is a typical such system, composed of computing elements, communication components, and physical resources [1]. In most cases, these units are connected by a multitude of wired or wireless communication or sensor networks. It is insufficient to study a single unit or sensor since a CPS is not their union, but their intersection. Thus, it is necessary to build an intelligent gateway architecture or seamless data processing framework for CPS communication, control, and data management. In addition to engineered systems, CPS has been used to address social and environmental issues in recent years (e.g., environmental monitoring and health care) [2], [3]. Commonly, sensor information is expressed in the form of time series, and time series analysis is a classic means of data analysis [4]. Recently, there has been growing interest in applying complex network theory to time series analysis. The time series is first transformed into a network and then analyzed with various complex network tools [5]–[7]. Networks play an important role in numerous complex systems, and have attracted considerable attention both from the theoretical side [8]–[10] and in various application fields [11]–[13]. This strategy provides us with a new viewpoint to understand the behavior and character of the time series, and also provides a promising way toward seamless digital engineering in CPS.

In general, a time series is obtained from a dynamical system, which describes how one state develops into another state over the course of time. A CPS can also be seen as a dynamical system where time series sensor data are shared among individuals to fuse information of mutual interest [14]. For a dynamical system, the basis of network-based analysis is to find recurrent points along a trajectory in the phase space. Two recurrent points are likely to be similar as they share common features [15]. Consequently, a distance metric to measure the similarity between the recurrent states is required. The selection of the distance metric is important because different metrics result in different network structures. Many metrics can be used for measuring the similarity, such as the Pearson correlation coefficient [16], [17], Euclidean distance [18], dynamic time warping (DTW) [19], and longest common subsequence [20]. How does one find the best metric? The answer depends on what one wants to do with the network.

Multivariate time series (MTSs) are widely available in different fields including medicine, finance, science, and engineering [21], [22]. The MTS mining technologies are
usually needed to study the multisensor time series data. Multisensor time series analysis is the process of fusing observations from many different sensors to find a robust and appropriate description of a system or process of interest [23]. In many areas, such as environment mapping, medical and health care, and behavioral biometrics, a variety of sensor data are collected over a period of time. For example, in weather forecasting, multivariate data such as air temperature, precipitation, and humidity are collected from various climate-observing stations (namely, multiple sensors). To reach a reasonable decision or action, we need to analyze these multisensor time series carefully.

Compared with univariate data, the analysis of multivariate data from multiple sensors is a challenging task because of its huge volume and the complex interactions between different sensors. The classical approach is to use machine learning algorithms to analyze the MTS. Chakraborty [24] proposed a Bayesian classifier and a decision tree to analyze MTS data, and also developed a temporal naive Bayesian model and a temporal decision tree. For time series classification, Zhang et al. [25] proposed a novel method that transforms the MTS data to a lower dimensional representation by extracting characteristic features. Recently, a parametric derivative DTW distance was proposed for MTS classification [26].

It is worthwhile to note that most of the aforementioned methods can only be used in cases of univariate time series (UTSs). In this paper, based on information geometry [27], we utilize a geodesic-based distance metric for constructing a complex network from MTS lying on a Riemannian manifold; its advantages are presented in the following description. That is, we first estimate covariance matrices to represent the MTS, and then use a geodesic-based metric to measure the similarity between the covariance matrices. Finally, a network can be constructed on a Riemannian manifold. The main contributions of the proposed method have three aspects.

1) Geodesic Distance Can Effectively Capture the Nonlinear Relationship Between Two Variables: Generally, the relationship among points (variables) on the Riemannian manifold can be studied in two ways: the first is to study the embedding structure in Euclidean space, which is an extrinsic approach; the second uses an intrinsic approach to capture the relationship between any two points by the distance along curves on the surface. The proposed geodesic distance provides us with the nonlinear relationship and an intrinsic geometry viewpoint to understand the MTS.

2) Our Approach Can Be Used for MTS to Construct Networks: Investigating the interaction of multiple variables has always been a hot topic in time series analysis. To achieve a more effective and valid representation of the MTS system, based on a covariance matrix, we can quantitatively capture the multivariate relationships between different variables and then construct the networks.

3) Our Approach Is an Alternative Way to Study the Temporal Dynamics of MTS: The eigenvalues and corresponding eigenvectors can be used to describe the properties of a covariance matrix. Consequently, studying the variation of these eigenvalues and eigenvectors, i.e., the way they evolve over time, is an effective way to investigate the system dynamics.

The remainder of this paper is organized as follows. Section II briefly discusses related work; Section III describes the basic strategy and concrete steps for constructing the complex network, and also presents the properties of the proposed distance; Section IV shows numerical examples using both simulated data and real data, which include the Lorenz system and climate data; Section V gives concluding remarks.

II. RELATED WORK

Three distinct bodies of work are associated with and motivate this paper: 1) computation application of sensor data in CPS; 2) complex networks of time series; and 3) information geometry. In general, a CPS consists of three parts: 1) sensor; 2) computation application; and 3) actuator, which correspond to three basic functions: 1) sensing characteristics of the physical world; 2) analyzing based on sensor input; and 3) generating response actions through actuation. In this paper, we focus on the second part, namely computation application, and the objective here is to analyze the interrelationships of the sensors through the time series generated by the sensors. The typical application is the integration and fusion of sensor data in CPS [28], [29]. Finding the relationship between sensors is an important step of data fusion and mining in CPS.

The second one relates to the study of complex network theory and its applications for analyzing time series. Drawing on concepts borrowed from graph theory and statistical physics, complex network theory has attracted much attention in recent years, with the aim of studying various possible information of a complex system [30]–[33]. More recently, the network-based approach to time series analysis has been developed and applied in many fields such as biology, sociology, physics, climatology, and neuroscience [6], [18], [34]–[39]. In this approach, a time series is first mapped into a network and then analyzed from a complex network perspective. The theoretical basis is that the characteristics of the time series are inherited by the resulting network. Based on the type of edge between nodes, these methods can be roughly divided into three classes, which are based on: 1) mutual proximity of different segments of a time series (proximity networks) [6], [17], [34], [40]; 2) convexity of successive observations (visibility graphs) [35], [41]; and 3) transition probabilities between discrete states (transition networks) [36], [42]. Among these three classes, the vast majority of the literature has focused on proximity networks, since the similarity of different segments of a trajectory can be measured in different ways. The most important concept is the metric that is used to measure the mutual closeness or similarity of different segments of a trajectory. Generally, different metrics will result in different
types of such proximity networks. Consequently, the metric plays a key role in transforming a time series into a network.

The third one related to this paper is information geometry, which is the study of stochastic manifolds. Each point on the manifold is a hypothesis about some state of events. Amari and Nagaoka [27] have made a tremendous contribution to the establishment of the theoretical background of this field, which has been popularized in recent years. This field is usually considered a branch of differential geometry and statistics, and has wide applications in machine learning [43], image processing [44], and diffusion tensor imaging [45]. Thanks to information geometry, we are able to define a metric enjoying the sought properties. More recently, Amari [46] also proposed a general and unique class of decomposable divergence functions in the manifold of positive definite matrices, which can be used for clustering and related pattern matching problems.

The methods discussed above have enabled compelling analysis and led to novel insights into time series behavior. However, they are restricted to the case of UTSs. In fact, Steinhaeuser et al. [47] have pointed out that there are three challenges in the network representation of time series: 1) nonlinear relationships exist between different variables; 2) multivariate relationships must be considered and integrated with the networks to achieve a more realistic representation; and 3) the relationships in the system are constantly changing in both space and time (namely, dynamics), which is hard to detect with a basic network model. Though these challenges were proposed in the context of climate data, they are also open problems in the complex networks literature.

Aiming at the three challenges mentioned above, our motivations here are three-fold: first, we utilize a geodesic-based distance based on information geometry to estimate link strength in the network, and the geodesic distance is a nonlinear relationship metric; second, the covariance matrix is used to capture the multivariate relationships between variables; last, a covariance matrix can be described as an ellipsoid when the random variables are Gaussian. In particular, the eigenvectors of the covariance matrix point along the principal axes of this ellipsoid, and the eigenvalues say how stretched out the ellipsoid is in each direction. Consequently, studying the variations of the eigenvectors and eigenvalues is an alternative way to investigate the dynamics of the variables.

III. APPROACH DESCRIPTION

In general, a time series is the measured data of the sensors. The MTS in these cases means that a single sensor or integrated sensors can measure multiple signals simultaneously. One example of MTSs comes from smart mobile phones, which are equipped with location and acceleration sensors. In the case of MTS, there are usually two scientific questions that need to be addressed, according to the specific problem to be solved. The first question focuses on a single sensor, which captures the dynamic observation of multiple variables over a certain period of time. The second question focuses on multiple sensors, and models and explains the interactions and co-movements between different sensors. To analyze the MTS in these two situations, the method for constructing the complex networks is presented in this section.

A. Basic Strategy of Complex Network Construction

We start from the system model that is employed to generate the MTS. Generally, an MTS data item is represented by a c × n matrix, where n is the number of time samples and c is the number of observed variables. Let us denote by $X \in \mathbb{R}^{c \times n}$ an MTS item which is assumed to follow a multivariate normal model with zero mean and with a covariance matrix $\Sigma \in \mathbb{R}^{c \times c}$, that is:

$$ X \sim \mathcal{N}(0, \Sigma). \tag{1} $$

For the first question mentioned above, our approach is first to separate an MTS into segments with a sliding window; the complex networks can then be obtained by analyzing the relationship between the segments. In this paper, the Lorenz system [48] is used to simulate the MTS of a single sensor. The model is a system of three ordinary differential equations, described as

$$ \frac{dx}{dt} = \sigma (y - x), \qquad \frac{dy}{dt} = x(\rho - z) - y, \qquad \frac{dz}{dt} = xy - \beta z \tag{2} $$

where x, y, and z make up the system state, t is time, and σ, ρ, and β are the system parameters. That is to say, the data of a single sensor has three variables, which are x, y, and z.

The proposed strategy for network construction from MTS is illustrated in Fig. 1. One realization of the Lorenz system is illustrated in Fig. 1(a), where the three time series correspond to the variables x, y, and z. The thick frame here is a sliding window, which is used for separating the time series into segments along the time axis (here time is represented by the sequence number of the data). In this paper, MTS are assumed to follow a multivariate normal distribution with zero mean, which is a common assumption in many applications. So we use the covariance matrix as the feature of the MTS. That is to say, a covariance matrix $\Sigma_i$ of x, y, and z of the Lorenz system can be estimated from the sub time series inside the ith sliding window. Many covariance matrices can then be obtained by moving the sliding window. The next task is to analyze the relationship between the different $\Sigma_i$. There are several key advantages of using covariance matrices as feature descriptors.

1) They provide a natural way of combining multiple variables that might be correlated.
2) A covariance matrix extracted from a sub window is usually enough to capture the multiple features of a time series.
3) Individual noisy samples are largely filtered out, since the covariance computation naturally acts as an averaging filter.
4) They can be used in the case of variable length data, namely, sliding windows with variable length can be used for separating the time series.
5) Due to the symmetry of covariance matrices, the descriptors are low-dimensional.
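As a minimal sketch of the single-sensor strategy described above, the following Python code integrates the Lorenz system (2) and estimates one covariance matrix per sliding window. The function names, the fixed-step Runge-Kutta integrator, the parameter values (σ = 10, ρ = 28, β = 8/3), the window width and step, and the per-window mean removal are all illustrative assumptions, not settings taken from the paper.

```python
import numpy as np


def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system (2); parameter values are assumed."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])


def simulate_lorenz(n_steps=2000, dt=0.01, start=(1.0, 1.0, 1.0)):
    """Integrate with classical fourth-order Runge-Kutta; returns a c x n array (c = 3)."""
    states = np.empty((3, n_steps))
    s = np.array(start, dtype=float)
    for k in range(n_steps):
        states[:, k] = s
        k1 = lorenz_rhs(s)
        k2 = lorenz_rhs(s + 0.5 * dt * k1)
        k3 = lorenz_rhs(s + 0.5 * dt * k2)
        k4 = lorenz_rhs(s + dt * k3)
        s = s + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return states


def window_covariances(X, width, step):
    """Empirical (maximum likelihood) covariance of each sliding window of X."""
    c, n = X.shape
    covs = []
    for first in range(0, n - width + 1, step):
        segment = X[:, first:first + width]
        # Remove the window mean so that the zero-mean model is a fair
        # approximation (an assumption of this sketch).
        centered = segment - segment.mean(axis=1, keepdims=True)
        covs.append(centered @ centered.T / width)
    return np.array(covs)


X = simulate_lorenz()
covs = window_covariances(X, width=200, step=100)
print(covs.shape)  # (number of windows, 3, 3)
```

Each 3 × 3 slice of `covs` plays the role of one node in the network. A variable window length, as mentioned in advantage 4), would only change how the segments are cut, not the shape of each descriptor.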
Fig. 1. Strategy for network construction from MTSs. (a) Time series of trajectory of the Lorenz system. (b) Network construction lying on Riemannian manifold.

The notion of relationship implies the ability to measure dissimilarity between the data, i.e., the definition of a metric. Finding an appropriate metric is a challenging problem, and the choice should be made according to the intrinsic character of the data. By means of information geometry, we can define a metric that satisfies the sought properties. Information geometry is a field that combines differential geometry with probability theory. In other words, it applies the methods of differential geometry to probability theory [27], [46], where the probability distributions (PDFs) of a statistical model are taken as the points of a Riemannian manifold. The Fisher information metric is introduced as the definition of a distance between PDFs. With a definition of the geodesic-based distance on the Riemannian manifold, we are now able to measure the similarity between covariance matrices.

To construct the complex network, the basic idea here is to regard each $\Sigma_i$ as a node of the network, with the edge between any two nodes determined by the geodesic-based distance. As shown in Fig. 1(b), the constructed network lies on a Riemannian manifold (here the sphere is just an example of a Riemannian manifold, and the curve between nodes is a segment of a great circle). The different colors of the nodes denote the different covariance matrices, and the links between nodes denote the edges of the network. The geodesic-based distance provides us with an intrinsic geometry viewpoint to analyze the time series, which will be explained in the following section.

In the multisensor case, namely the second question mentioned above, the complex network construction is more straightforward. Each $\Sigma_i$ is estimated using all the observed data from one sensor, where the subscript i denotes the label of the sensor. This differs from the network of a single sensor, where the subscript i of $\Sigma_i$ corresponds to the segment of MTS in a sliding window. From this point of view, the constructed network of multiple sensors can be used to analyze the interactions between the different sensors. In contrast, besides analyzing the topological structure of the MTS, the constructed network of a single sensor can be used to study the evolution of multiple parameters over time.

B. Concrete Steps for Constructing the Complex Network

In summary, the procedure of the network construction can be formulated as the following steps.

Step 1: Estimate the covariance matrix of MTS.
Step 2: Calculate the geodesic-based distance between the covariance matrices.
Step 3: Construct the network by the geodesic-based distance.

1) Estimation of Covariance Matrix and Its Characteristics: In step 1, as shown in Fig. 1(a), the covariance matrices are estimated inside a sliding window. For a time stamp i, the sliding window is given, and the number of observed empirical values of the variables is equal to the width of the sliding window. Considering c variables, the MTS data in the sliding window are represented as a matrix $X_i \in \mathbb{R}^{c \times n}$, where n is the number of time samples. In this paper, the covariance matrix $\Sigma_i$ of $X_i$ is approximated with the classical maximum likelihood estimator (or "empirical covariance"). The maximum likelihood estimator is asymptotically unbiased, i.e., it converges to the true (population) covariance when given enough observations.

Covariance matrices are positive definite symmetric matrices, which are often encountered in image and signal processing, for example in characterizing statistics on deformations and in encoding the principal diffusion directions in diffusion tensor imaging. Every positive definite symmetric matrix defines an ellipsoid [49]. The principal axes are given by the eigenvectors of the matrix, and the square roots of the eigenvalues are the radii of the corresponding axes. To illustrate this property from the geometric view, we introduce a 3 × 3 covariance matrix as

$$ \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \Sigma_{13} \\ \Sigma_{21} & \Sigma_{22} & \Sigma_{23} \\ \Sigma_{31} & \Sigma_{32} & \Sigma_{33} \end{bmatrix}. \tag{3} $$

By diagonalizing $\Sigma \in \mathrm{Sym}^+_3$, we get three positive eigenvalues $\lambda_1 > \lambda_2 > \lambda_3 > 0$ and their corresponding eigenvectors $e_1, e_2, e_3$. The shape of an ellipsoid is inherently related to the eigenvalues and eigenvectors of the covariance matrix: the three principal radii and three directions of axes are determined
by the eigenvalues and orthogonal eigenvectors of the covariance matrix, respectively. In other words, the eigenvalues and eigenvectors regulate the shape and direction of the ellipsoid, respectively. Fig. 2 shows three basic ellipsoids representing covariance matrices, namely, linear, planar, and isotropic. That is to say, the linear, planar, and isotropic shapes correspond to the cases of $\lambda_1 \gg \lambda_2 \approx \lambda_3$, $\lambda_1 \approx \lambda_2 \gg \lambda_3$, and $\lambda_1 \approx \lambda_2 \approx \lambda_3$, respectively. All other shapes are interpolations of these three basic shapes. In the following section, we will use the ellipsoids to illustrate the dynamic feature of covariance matrices. Each $\Sigma$ is regarded as one node in the network, and a metric is then necessary to measure the similarity between covariance matrices, which will be discussed in step 2.

Fig. 2. Ellipsoids of covariance matrices. (a) Linear ellipsoid, λ1 = 10, λ2 = λ3 = 1. (b) Planar ellipsoid, λ1 = λ2 = 10, λ3 = 1. (c) Isotropic ellipsoid, λ1 = λ2 = λ3 = 10.

2) Distance on Riemannian Manifold: In step 2, we introduce some notions of Riemannian geometry before defining the geodesic-based distance between covariance matrices. A manifold is a topological space that locally resembles a Euclidean space, which means there is a neighborhood around each point that is homeomorphic to the open unit ball in $\mathbb{R}^n$. To understand this idea, consider that people once believed the world was flat because they observed it only on small scales. A Riemannian manifold is a differentiable manifold equipped with an inner product on the tangent space, which varies smoothly from point to point. The minimum length curve connecting two points on the manifold is called the geodesic, and the distance between the points is given by the length of this curve. For the surface of a sphere, as shown in Fig. 1(b), the geodesic is a segment of a great circle.

As mentioned above, the d × d dimensional covariance matrices are always symmetric positive definite matrices $\mathrm{Sym}^+_d$, which can be formulated as a connected Riemannian manifold $\mathcal{M}$. It is possible to define the derivatives of the curves on the differentiable manifold. The derivatives at a point $\Sigma$ on the manifold lie in a vector space $T_\Sigma$, which is the tangent space at that point. An inner product $\langle \cdot, \cdot \rangle_\Sigma$, $\Sigma \in \mathcal{M}$, is introduced in the tangent space, which changes smoothly from point to point. Then a norm is introduced by the inner product for the tangent vectors $w \in T_\Sigma$ such that

$$ \|w\|_\Sigma^2 = \langle w, w \rangle_\Sigma. \tag{4} $$

From $\Sigma$, a unique geodesic exists on the manifold $\mathcal{M}$ starting with the tangent vector w. The exponential map $\exp_\Sigma : T_\Sigma \to \mathcal{M}$ maps the vector w to the point reached by this geodesic, and the distance of the geodesic is given by $d(\Sigma, \exp_\Sigma(w)) = \|w\|_\Sigma$. Also, there exists the inverse mapping $\log_\Sigma : \mathcal{M} \to T_\Sigma$, which is uniquely defined in a small neighborhood of the point $\Sigma$. Then $\log_{\Sigma_1}(\Sigma_2)$ is given by the tangent vector with the smallest norm between two points $\Sigma_1$ and $\Sigma_2$.

An invariant Riemannian metric on the tangent space $T_\Sigma$ of $\mathrm{Sym}^+_d$ is given by [45]

$$ \langle w_1, w_2 \rangle_\Sigma = \mathrm{tr}\left( \Sigma^{-\frac{1}{2}} w_1 \Sigma^{-1} w_2 \Sigma^{-\frac{1}{2}} \right) \tag{5} $$

where $w_i \in T_\Sigma$, $T_\Sigma$ is the tangent space at $\Sigma \in \mathrm{Sym}^+_d$, and tr is the trace operator for matrices. The exponential map associated with the Riemannian metric is given by

$$ \exp_\Sigma(w) = \Sigma^{\frac{1}{2}} \exp\left( \Sigma^{-\frac{1}{2}} w \Sigma^{-\frac{1}{2}} \right) \Sigma^{\frac{1}{2}}. \tag{6} $$

Then the logarithm map exists uniquely at all the points on the manifold

$$ \log_{\Sigma_1}(\Sigma_2) = \Sigma_1^{\frac{1}{2}} \log\left( \Sigma_1^{-\frac{1}{2}} \Sigma_2 \Sigma_1^{-\frac{1}{2}} \right) \Sigma_1^{\frac{1}{2}}. \tag{7} $$

Note that exp and log are the ordinary matrix exponential and logarithm operators, whereas $\exp_\Sigma$ and $\log_\Sigma$ are manifold-specific operators, which are also point dependent, $\Sigma \in \mathrm{Sym}^+_d$.

Both the manifold and the tangent spaces are $m = d(d+1)/2$ dimensional, as the tangent space of $\mathrm{Sym}^+_d$ is the space of d × d symmetric matrices. By substituting (7) into (5), the distance between two points on $\mathrm{Sym}^+_d$ can be given by

$$ d(\Sigma_1, \Sigma_2) = \sqrt{ \left\langle \log_{\Sigma_1}(\Sigma_2), \log_{\Sigma_1}(\Sigma_2) \right\rangle_{\Sigma_1} } = \sqrt{ \mathrm{tr}\left( \log^2\left( \Sigma_1^{-\frac{1}{2}} \Sigma_2 \Sigma_1^{-\frac{1}{2}} \right) \right) } = \left( \sum_{k=1}^{c} \log^2 \lambda_k \right)^{1/2} \tag{8} $$

where $\lambda_k$, $k = 1, \ldots, c$, are the real eigenvalues of $\Sigma_1^{-1/2} \Sigma_2 \Sigma_1^{-1/2}$ and c is the number of variables. We have mentioned that the distance in (8) can give us an intrinsic point of view to understand the time series. In the case of intrinsic geometry, the geometric structure is investigated within the geometric object itself, without any help from the ambient space. This point of view is more flexible. For example, intrinsic geometry is used in relativity, as we cannot understand what would be "outside" of space-time.

3) Complex Network Construction: In step 3, we first calculate a distance matrix $D = (d(\Sigma_i, \Sigma_j))$ by (8), where $d(\Sigma_i, \Sigma_j)$ is the distance between each pair of covariance matrices. Next, by choosing a critical threshold $r_c$, D is converted into an adjacency matrix $A = (a(\Sigma_i, \Sigma_j))$, where $a(\Sigma_i, \Sigma_j) = 1$ if $d(\Sigma_i, \Sigma_j) \le r_c$ and $a(\Sigma_i, \Sigma_j) = 0$ if $d(\Sigma_i, \Sigma_j) > r_c$. How to find an optimal $r_c$ is still an open problem. In this paper, an appropriate threshold is chosen experimentally. In order to avoid too dense a graph, only the 20% of edges with the highest values are preserved. The topological structure of the network is determined by the adjacency matrix A, where $a(\Sigma_i, \Sigma_j) = 1$ if node i connects to node j, and $a(\Sigma_i, \Sigma_j) = 0$ if the edge (i, j) does not exist.
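Steps 2 and 3 can be sketched in a few lines of Python. The sketch below computes the distance (8) from the eigenvalues of $\Sigma_1^{-1/2} \Sigma_2 \Sigma_1^{-1/2}$ and thresholds the distance matrix into an adjacency matrix. The function names are hypothetical, the test matrices are synthetic, and the threshold rule is an interpretation: it links the 20% of node pairs with the smallest distances, which is one way to read the 20% rule above and is not confirmed by the text.

```python
import numpy as np


def geodesic_distance(S1, S2):
    """Distance (8): sqrt(sum_k log^2 lambda_k), with lambda_k the eigenvalues
    of S1^{-1/2} S2 S1^{-1/2}."""
    vals, vecs = np.linalg.eigh(S1)
    inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T  # S1^{-1/2}
    M = inv_sqrt @ S2 @ inv_sqrt
    lam = np.linalg.eigvalsh((M + M.T) / 2.0)         # symmetrize against round-off
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))


def distance_matrix(covs):
    """Symmetric matrix D of pairwise geodesic distances."""
    n = len(covs)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = geodesic_distance(covs[i], covs[j])
    return D


def adjacency_matrix(D, keep_fraction=0.2):
    """a_ij = 1 if d_ij <= r_c, else 0. Here r_c is taken as the keep_fraction
    quantile of the pairwise distances (an assumption of this sketch)."""
    n = D.shape[0]
    upper = D[np.triu_indices(n, k=1)]
    r_c = np.quantile(upper, keep_fraction)
    A = (D <= r_c).astype(int)
    np.fill_diagonal(A, 0)  # no self-loops
    return A, r_c


# Toy usage with random symmetric positive definite matrices (synthetic data).
rng = np.random.default_rng(0)
covs = []
for _ in range(8):
    B = rng.standard_normal((3, 3))
    covs.append(B @ B.T + 3.0 * np.eye(3))

D = distance_matrix(covs)
A, r_c = adjacency_matrix(D)
print(A.sum() // 2, "edges at threshold", round(float(r_c), 3))
```

Choosing $r_c$ as a quantile keeps the edge density fixed across datasets; a fixed numeric threshold would instead keep the meaning of "close" fixed and let the density vary.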
C. Properties of Distance on Riemannian Manifold


1) Relation to Fisher Information Distance: We try to
find a relationship between the proposed distance d(,)
and Fisher information metric which is the foundation of
information geometry. This relationship also illustrates that
d(,) can be explained within the information geometry
framework.
First, it is necessary to comprehend the notion of statisti-
cal manifolds. The set $\mathcal{M}$ is called a statistical manifold when its elements are PDFs on a set $\mathcal{X}$. This manifold construction is defined by its own properties and metrics and cannot exist in Euclidean space [27]. Each point on $\mathcal{M}$ is a PDF that can be parameterized by $\theta = [\theta^1, \ldots, \theta^n]$, and $\mathcal{M}$ is then known as a statistical model on $\mathcal{X}$. To measure the similarity between points on $\mathcal{M}$, the Fisher information distance is introduced. The Fisher information distance between two distributions $p(x; \theta_1)$ and $p(x; \theta_2)$ is

$$d_F(\theta_1, \theta_2) = \min_{\substack{\theta(\cdot):\ \theta(0)=\theta_1 \\ \theta(1)=\theta_2}} \int_0^1 \sqrt{\left(\frac{d\theta}{dt}\right)^T [J(\theta)] \left(\frac{d\theta}{dt}\right)}\, dt \tag{9}$$

where $\theta = \theta(t)$ is the parameter path along the manifold [27], [50] and $J(\theta)$ is the Fisher information matrix whose elements are

$$[J(\theta)]_{i,j} = \int f(X; \theta)\, \frac{\partial \log f(X; \theta)}{\partial \theta^i}\, \frac{\partial \log f(X; \theta)}{\partial \theta^j}\, dX. \tag{10}$$

Theoretically, the Fisher information distance $d_F(\cdot,\cdot)$ is the best candidate to measure similarity between PDFs, as it is an exact measure of the geodesic distance between points along the manifold. However, an exact $d_F(\cdot,\cdot)$ is hard to obtain without knowing the parameterization of the manifold or in the case of multivariate data. Fortunately, $d_F(\cdot,\cdot)$ can be approximated with the Kullback–Leibler (KL) divergence, and the distance between PDFs $p_1$ and $p_2$ can be approximated as [50]

$$KL(p_1 \| p_2) = \int p_1(x) \log \frac{p_1(x)}{p_2(x)}\, dx. \tag{11}$$

As the KL divergence is not symmetric, namely, $KL(p_1 \| p_2) \neq KL(p_2 \| p_1)$, it is not a distance metric. To overcome this weakness, the symmetric KL divergence is defined as

$$d_{KL}(p_1, p_2) = KL(p_1 \| p_2) + KL(p_2 \| p_1). \tag{12}$$

Let $d_{SKL}(p_1, p_2) = [d_{KL}(p_1, p_2)]^{1/2}$; then $d_{SKL}(\cdot,\cdot)$ approximates $d_F(\cdot,\cdot)$ as $p_1 \to p_2$

$$d_{SKL}(p_1, p_2) \to d_F(p_1, p_2). \tag{13}$$

Suppose both $p_1$ and $p_2$ are the PDFs of multivariate normal distributions with means $\mu_1$ and $\mu_2$ and covariances $\Sigma_1$ and $\Sigma_2$, respectively. The KL divergence in (11) can be formulated as [51]

$$KL(p_1 \| p_2) = \frac{1}{2}\left[\log\frac{|\Sigma_2|}{|\Sigma_1|} - c + \mathrm{tr}\left(\Sigma_2^{-1}\Sigma_1\right) + (\mu_2 - \mu_1)^T \Sigma_2^{-1} (\mu_2 - \mu_1)\right] \tag{14}$$

where $c$ is the number of variables. In this paper, we assume the MTS data have zero mean, namely $\mu_1 = \mu_2 = 0$. Substituting (14) into (12) and replacing $d_{KL}(p_1, p_2)$ with $d_{KL}(\Sigma_1, \Sigma_2)$, we get

$$d_{KL}(\Sigma_1, \Sigma_2) = \mathrm{tr}\left(\Sigma_1^{-1}\Sigma_2\right) + \mathrm{tr}\left(\Sigma_2^{-1}\Sigma_1\right) - 2c. \tag{15}$$

Let $\lambda_k$, $k = 1, \ldots, c$, be the real eigenvalues of $\Sigma_1^{-1}\Sigma_2$. Then $d_{SKL}(p_1, p_2)$ in (13) can be reformulated as

$$d_{SKL}(\Sigma_1, \Sigma_2) = \left(\sum_{k=1}^{c} \lambda_k + \sum_{k=1}^{c} \frac{1}{\lambda_k} - 2c\right)^{1/2}. \tag{16}$$

As $\mathrm{tr}(\Sigma_1^{-1}\Sigma_2) = \mathrm{tr}(\Sigma_1^{-1/2}\Sigma_2\Sigma_1^{-1/2})$, we reformulate (8) for comparing $d_{SKL}(\cdot,\cdot)$ with $d(\cdot,\cdot)$

$$d(\Sigma_1, \Sigma_2) = \frac{1}{\sqrt{2}}\left(\sum_{k=1}^{c} \log^2 \lambda_k + \sum_{k=1}^{c} \log^2 \frac{1}{\lambda_k}\right)^{1/2}. \tag{17}$$

From (16) and (17), we can see the relationship between $d_{SKL}(\cdot,\cdot)$ and $d(\cdot,\cdot)$: apart from the constants, the difference lies in the logarithm applied to $\lambda_k$ and $1/\lambda_k$. To understand the difference intuitively, we show $d_{SKL}(\cdot,\cdot)$ and $d(\cdot,\cdot)$ in Fig. 3 by letting $c = 1$ and $\lambda \in [0.1, 19.9]$. Loosely speaking, $d(\cdot,\cdot)$ is the logarithmic version of $d_{SKL}(\cdot,\cdot)$.

Fig. 3. Comparing $d(\cdot,\cdot)$ with $d_{SKL}(\cdot,\cdot)$.

2) Invariant by Inversion: One advantage of intrinsic geometry is the property of affine invariance [52]. The distance in (8) has two important invariance properties. First, the distance is invariant by inversion, that is

$$d(\Sigma_1, \Sigma_2) = d\left(\Sigma_1^{-1}, \Sigma_2^{-1}\right). \tag{18}$$

We give an intuitive explanation for (18) from two aspects. On the one hand, $\Sigma$ is a measure of how the variables are dispersed around the mean (the diagonal elements) and how they co-vary with other variables (the off-diagonal elements). On the other hand, $\Sigma^{-1}$ is the inverse of the covariance matrix, also called the precision. It is a measure of the partial correlations among the variables. In other words, the diagonal elements say how closely clustered the variables are around the mean, while the off-diagonal elements measure the extent to which they do not co-vary with the other variables.
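The relationship between (16) and (17) and the inversion property (18) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names (`relative_eigenvalues`, `d_riemann`, `d_skl`, `random_spd`) and the random test matrices are illustrative choices, not part of the paper. It follows (16) and (17) as printed and does not track any constant factor carried over from (14).

```python
import numpy as np

def relative_eigenvalues(S1, S2):
    """Eigenvalues of S1^{-1} S2; real and positive when S1 and S2 are SPD."""
    return np.linalg.eigvals(np.linalg.solve(S1, S2)).real

def d_riemann(S1, S2):
    """Distance (17); it simplifies to sqrt(sum_k log^2 lambda_k)."""
    lam = relative_eigenvalues(S1, S2)
    total = np.sum(np.log(lam) ** 2) + np.sum(np.log(1.0 / lam) ** 2)
    return float(np.sqrt(0.5 * total))

def d_skl(S1, S2):
    """Distance (16) as printed: sqrt(sum lambda_k + sum 1/lambda_k - 2c)."""
    lam = relative_eigenvalues(S1, S2)
    value = np.sum(lam) + np.sum(1.0 / lam) - 2 * lam.size
    return float(np.sqrt(max(value, 0.0)))  # guard against round-off below zero

def random_spd(c, rng):
    """Hypothetical helper: a well-conditioned random SPD matrix."""
    A = rng.standard_normal((c, c))
    return A @ A.T + c * np.eye(c)

rng = np.random.default_rng(0)
S1, S2 = random_spd(4, rng), random_spd(4, rng)

# Inversion invariance (18): inverting both matrices maps lambda_k to 1/lambda_k.
gap_inv = abs(d_riemann(S1, S2) - d_riemann(np.linalg.inv(S1), np.linalg.inv(S2)))

# For nearby matrices the two distances agree to first order, as in (13).
S_near = S1 + 1e-3 * np.eye(4)
ratio_near = d_skl(S1, S_near) / d_riemann(S1, S_near)
```

With `c = 1` the two functions reduce to `abs(log(lam))` and `sqrt(lam + 1/lam - 2)`, which are the two curves compared in Fig. 3.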
Fig. 4. Lorenz system and the corresponding distance matrix. (a) Lorenz system with the parameters σ = 8, ρ = 28.1, and β = 8/3. (b) Graphical
representation of the distance matrix.
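A distance matrix of the kind shown in Fig. 4(b) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the standard Lorenz equations are assumed for (2), a fixed-step RK4 integrator is assumed, and the distance is taken as the simplified form of (17), namely the square root of the sum of squared log-eigenvalues of $\Sigma_1^{-1}\Sigma_2$. The parameter values, initial state, sampling time, and window width are those reported for this experiment; the window step is coarsened to 10 samples only to keep the example fast.

```python
import numpy as np

def lorenz_rhs(s, sigma, rho, beta):
    """Right-hand side of the standard Lorenz equations (assumed form of (2))."""
    x, y, z = s
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def simulate_lorenz(n, dt=0.02, s0=(0.0, -0.8, 0.0),
                    sigma=8.0, rho=28.1, beta=8.0 / 3.0):
    """Integrate with fixed-step RK4 (the integrator is an assumption)."""
    orbit = np.empty((n, 3))
    s = np.array(s0, dtype=float)
    for i in range(n):
        orbit[i] = s
        k1 = lorenz_rhs(s, sigma, rho, beta)
        k2 = lorenz_rhs(s + 0.5 * dt * k1, sigma, rho, beta)
        k3 = lorenz_rhs(s + 0.5 * dt * k2, sigma, rho, beta)
        k4 = lorenz_rhs(s + dt * k3, sigma, rho, beta)
        s = s + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return orbit

def window_covariances(X, width, step):
    """Covariance matrix of each sliding window over the rows of X."""
    starts = range(0, X.shape[0] - width + 1, step)
    return np.array([np.cov(X[s:s + width].T) for s in starts])

def riemann_distance(S1, S2):
    """sqrt(sum_k log^2 lambda_k), lambda_k the eigenvalues of S1^{-1} S2."""
    lam = np.linalg.eigvals(np.linalg.solve(S1, S2)).real
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

orbit = simulate_lorenz(1500)[300:]                  # drop the first 300 points
covs = window_covariances(orbit, width=30, step=10)  # the text uses a step of one sample
n = len(covs)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = riemann_distance(covs[i], covs[j])
```

Each row of `D` corresponds to one window, so plotting `D` as an image gives a time-versus-time picture comparable in layout to Fig. 4(b).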

3) Invariant by Congruent Transformation: The second invariance of the distance $d(\cdot,\cdot)$ is invariance under congruent transformation, which is another term for an isometry in mathematics. That is, for any invertible square matrix $P$ we have

$$d(\Sigma_1, \Sigma_2) = d\left(P^T \Sigma_1 P,\ P^T \Sigma_2 P\right). \tag{19}$$

This property is very important in the case of time series, as it ensures that any linear operation on the time series that can be modelled by a $c \times c$ invertible matrix $P$ has no effect on the Riemannian distance $d(\cdot,\cdot)$. This type of transformation includes rescaling and normalization of the time series, whitening, spatial filtering or source separation, etc.

IV. RESULTS AND DISCUSSION

In order to evaluate the performance of the proposed method, we have carried out simulations for a single sensor and also for multisensor. First, the Lorenz system is used to simulate the signal of a single sensor, and then the historical climate data from NOAA’s National Centers for Environmental Information (NCEI) are used for investigating the performance in the multisensor case.

A. Synthetic Data

1) Complex Network Construction for Single Sensor: Fig. 4(a) shows one realization of the Lorenz system described in (2), which is characterized by a double-scroll topology of the attractor with pronounced chaotic oscillations. Here, the parameters used are σ = 8, ρ = 28.1, and β = 8/3, with the initial state $[0.0, -0.8, 0.0]^T$ and sampling time $\Delta t = 0.02$. With these parameters, the Lorenz system is in a chaotic state. The time series, or the orbit to be tracked, consists of the 1500 data points by removing the first 300 data points. The color bar from blue to red denotes the temporal order of observations. The width and sliding step of the sliding window are $30\Delta t$ and $\Delta t$, respectively. Thus, from the point of view of temporal evolution, the dynamics of the time series are studied here. With the definition in (8), the distance matrix can be obtained, which is shown in Fig. 4(b). It shows distinct structural properties of the system, which can be characterized in terms of small-scale as well as large-scale features. In addition, a periodic regime is reflected by the grid pattern in Fig. 4(b). This phenomenon shows that, as time evolves, the similarity between two points in the system changes periodically. In other words, even if two points are far apart in time, they are likely to be very similar. This is the widely observed phenomenon of recurrences in chaotic systems. Although the distance matrix can reflect some of the details of the system, the global topology cannot be displayed.

As mentioned above, only the 20% edges with the highest values are preserved. Based on this criterion, the threshold rc = 0.7 is derived from the distance matrix, and we get a network of 1170 nodes from the Lorenz system, which is shown in Fig. 5. The graph has been embedded into an abstract 2-D space using the algorithm in [53] and is drawn by the software NodeXL [54]. On the whole, the figure exhibits a pronounced community structure with two major groups, which correspond to the double-scroll structure of the Lorenz system shown in Fig. 4(a). This double-scroll topology is an inherent feature of the Lorenz system. Consequently, compared with the distance matrix, the network shows the global structure of the system more intuitively. In addition, nodes with similar time indices tend to live in the same community. This is easily understood, since nodes with similar time indices generally have similar characteristics. On the other hand, nodes with very different time indices can also be in one community, which reflects the recurrence feature of the Lorenz system.

2) Evaluation of Similarity Measure: To evaluate the performance of the distance in (8), we compare it with classical similarity metrics such as the Frobenius distance [55], Pearson correlation coefficient [12], DTW [19], and Euclidean distance [18].

In the case of MTS from multisensor, we compare the Riemannian distance with the Frobenius distance, which can be considered as the Euclidean distance between two matrices $\Sigma_k$ and $\Sigma_l$

$$d_F(\Sigma_k, \Sigma_l) = \sqrt{\mathrm{tr}\left[(\Sigma_k - \Sigma_l)(\Sigma_k - \Sigma_l)^T\right]}. \tag{20}$$
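To make the contrast between the Riemannian distance and the Frobenius distance in (20) concrete, the sketch below applies one congruent transformation to a pair of covariance matrices and evaluates both distances before and after. It assumes NumPy; the matrix `P`, the helper `random_spd`, and the random inputs are hypothetical illustrations, and the Riemannian distance is written in its eigenvalue form, the square root of the sum of squared log-eigenvalues of $\Sigma_1^{-1}\Sigma_2$.

```python
import numpy as np

def riemann_distance(S1, S2):
    """sqrt(sum_k log^2 lambda_k), lambda_k the eigenvalues of S1^{-1} S2."""
    lam = np.linalg.eigvals(np.linalg.solve(S1, S2)).real
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def frobenius_distance(S1, S2):
    """Frobenius distance (20): Euclidean distance between the two matrices."""
    diff = S1 - S2
    return float(np.sqrt(np.trace(diff @ diff.T)))

def random_spd(c, rng):
    """Hypothetical helper: a well-conditioned random SPD matrix."""
    A = rng.standard_normal((c, c))
    return A @ A.T + c * np.eye(c)

rng = np.random.default_rng(1)
S1, S2 = random_spd(3, rng), random_spd(3, rng)

# An arbitrary invertible matrix (upper triangular, determinant 3),
# standing in for rescaling or mixing of the channels.
P = np.array([[2.0, 1.0, 0.0],
              [0.0, 0.5, 1.0],
              [0.0, 0.0, 3.0]])
T1, T2 = P.T @ S1 @ P, P.T @ S2 @ P

riem_before, riem_after = riemann_distance(S1, S2), riemann_distance(T1, T2)
frob_before, frob_after = frobenius_distance(S1, S2), frobenius_distance(T1, T2)
```

The Riemannian values agree up to round-off because the transformed pair has the same relative eigenvalues, whereas the Frobenius value depends on the scaling introduced by `P`.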

Fig. 5. Graphical representation of the network of the Lorenz system based on the distance matrix. The color of the nodes corresponds to their temporal order (from blue to bright red).

Here, we use an alternative way, hierarchical clustering, to investigate the behavior of the distances in (8) and (20) in the network construction. The reason is that hierarchical clustering is one method for finding community structures in a network. In the hierarchical clustering algorithm, a weight is assigned to each pair of vertices in the network to indicate how closely related the vertices are. Here, the distances in (8) and (20) can be used for deducing the weights. The hierarchy of clusters is represented as a tree (or dendrogram). In particular, the dendrograms can vividly display the effectiveness of the distance.

The Lorenz system in (2) has highly complex behaviors as the system parameters vary. For simplicity, we generate time series by keeping σ = 8 and β = 8/3 while varying ρ. Specifically, 5 ≤ ρ < 13.926, 13.926 ≤ ρ < 24.06, and 24.06 ≤ ρ < 34.79 correspond to three groups of stable fixed points, transient chaos, and strange attractor (chaos), respectively [15]. Gaussian noise with mean 0 and variance 1 is added to the MTS to verify the robustness of the method. Furthermore, MTS of different lengths are generated to simulate the case of varying length (in the range of 2000–3500 time points) or missing data, and finally we use 45 MTS samples to carry out the clustering. That is, these samples are treated as vertices in the network, and the distance is used for measuring similarity between the vertices.

To perceive the pattern of the MTS intuitively, three typical MTS samples are shown in the upper half of Fig. 6, which correspond to the three groups mentioned above. The lower half of the figure is produced from the upper one by adding the noise. It can be seen that the original patterns are heavily disrupted by the noise, and the noisy data is then used to carry out the clustering. Since chaotic time series cover a wide spectral domain, it is not trivial to filter the noise with common filter techniques. So it is reasonable to use the noisy data to evaluate the robustness of our approach.

Fig. 7 shows the results of clustering with different distance measures. In this figure, the distance matrix, a symmetric one, contains the Riemannian distance and the Frobenius distance between all pairs of the covariance matrices, respectively. It is plotted as a color-encoded matrix, namely a heat map. The red, yellow, and green labels above and to the left of the heat map denote the groups of stable fixed points, transient chaos, and strange attractor, respectively. Based on the distance matrix, the average linkage method is used for realizing the hierarchical clustering, and the results are shown in the dendrograms. For the dendrogram on top of the heat map (the left dendrogram is the same as the dendrogram above), the vertical and horizontal directions of the dendrogram represent the distance and the clusters, respectively. The splitting of a vertical line (distance threshold) represents the joining of two clusters. As mentioned earlier, the samples are divided into three groups according to the system parameter ρ, and this is verified in Fig. 7(a). The three clusters appear as three branches that occur at about the same distance in Fig. 7(a). Taking note of the colored labels, one can see that the MTS samples are clustered into their own groups correctly. In addition, the distance threshold for clustering is well-distributed, and the heat map of the distance matrix shows a few parts with distinct boundaries. However, in Fig. 7(b), with any distance threshold, the data cannot be correctly divided into three clusters, and several samples are also misclassified. In addition, the distribution of the distance threshold is not uniform. The results show that the Frobenius distance is inappropriate in the space of the Riemannian manifold.

In order to further evaluate the performance of the Riemannian distance, we compare it with the Pearson correlation coefficient, DTW, and Euclidean distance in the case of UTS from multisensor. Although this paper focuses on the analysis of MTS, the Riemannian distance can be directly extended to UTS. The x component of the Lorenz system is used as the UTS, and the values of the parameters are as described above. For similar reasons, Gaussian noise with mean 0 and variance 1 is added to the UTS to verify the robustness of the methods. The metrics mentioned here can be applied directly to UTS, except the Riemannian distance. To calculate the Riemannian distance, we first transform the UTS into the Riemannian manifold by estimating the covariance matrix. The key step of representation can be realized by utilizing the idea of phase space reconstruction. With Takens’ embedding theorem [56], the goal of phase space reconstruction is to reconstruct the original space of the system by unfolding the scalar time series into a higher dimensional phase space. With an embedding dimension $m$ and a time delay $\tau$, a UTS $x_1, x_2, \ldots, x_n$ can be transformed into state vectors in a reconstructed phase space as

$$\mathbf{x}_i = \left[x_i, x_{i+\tau}, x_{i+2\tau}, \ldots, x_{i+(m-1)\tau}\right]^T \tag{21}$$

where the embedding dimension $m$ is known in advance, and the time delay $\tau$ is determined by the mutual information method [57]. So a UTS $x_1, x_2, \ldots, x_n$ can be described by an $m \times (n - \tau)$ matrix $X$ whose rows are time-delayed versions of the UTS. Thus the sample $X$ can be represented with an $m \times m$ covariance matrix, and then the Riemannian distance can be obtained using (8). Finally, we use 145 UTS samples (the number of network nodes) to carry out the network construction.

Fig. 8 shows the results of network construction using different similarity metrics. According to the value of the parameter ρ (5 ≤ ρ < 13.926, 13.926 ≤ ρ < 24.06, and 24.06 ≤ ρ < 34.79), we know that the samples are divided into three groups. As can be seen from the figure,

Fig. 6. Lorenz system with different parameters in 3-D space. The upper half and lower half are the clean and noisy data, respectively.

Fig. 7. Clustering using different distance measure. (a) Riemannian distance. (b) Frobenius distance.
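The clustering procedure behind Fig. 7(a), average-linkage hierarchical clustering on a precomputed Riemannian distance matrix, can be sketched as follows. This is a minimal illustration assuming NumPy and SciPy. The three diagonal `bases` matrices are hypothetical stand-ins for the three dynamical regimes (Gaussian samples are drawn instead of simulating the noisy Lorenz MTS), while the number of samples and the range of sample lengths follow the values reported for this experiment.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def riemann_distance(S1, S2):
    """sqrt(sum_k log^2 lambda_k), lambda_k the eigenvalues of S1^{-1} S2."""
    lam = np.linalg.eigvals(np.linalg.solve(S1, S2)).real
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

rng = np.random.default_rng(2)
# Hypothetical "true" covariances, one per group.
bases = [np.diag([1.0, 1.0, 1.0]),
         np.diag([10.0, 1.0, 0.1]),
         np.diag([0.1, 10.0, 5.0])]

covs, labels = [], []
for g, base in enumerate(bases):
    for _ in range(15):                            # 45 samples in total
        length = int(rng.integers(2000, 3501))     # varying sample length
        X = rng.multivariate_normal(np.zeros(3), base, size=length)
        covs.append(np.cov(X.T))                   # one covariance per sample
        labels.append(g)
labels = np.array(labels)

# Pairwise distance matrix between all covariance matrices.
n = len(covs)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = riemann_distance(covs[i], covs[j])

# Average linkage on the condensed distance vector, then cut into three clusters.
Z = linkage(squareform(D, checks=False), method="average")
clusters = fcluster(Z, t=3, criterion="maxclust")
```

Because each sample is reduced to a covariance matrix before any distance is computed, samples of different lengths can be compared directly, which is the situation the varying-length experiment is designed to test.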

only the network in Fig. 8(a) reveals the structure of the three communities, which reflects the powerful performance of the Riemannian distance. In fact, the relevant analysis of chaotic time series is a challenging task, as such series usually have highly complex behaviors. In chaos theory, the “Butterfly Effect” is a well-known phenomenon, the concept that small causes can have large effects. Initially, two time series are very consistent, but the difference becomes larger and larger as time evolves. Thus it is very hard to capture the similarity between chaotic time series with a general metric. Fig. 8(c) shows that Euclidean distance is not appropriate for measuring the similarity of time series in complex cases. The reason is that Euclidean distance is widely known to be very sensitive to distortion in the time axis, noise, and outliers. In many cases, DTW is a powerful technique to find an optimal nonlinear alignment between two time series, since it can overcome the problem of distortion in the time axis. However, noise and the “Butterfly Effect” still have a great impact on the performance of DTW. Thus, Fig. 8(d) shows that DTW does not reflect the essential structure of the time series samples well. In contrast, in
Fig. 8. Networks constructed from the x-coordinate of the Lorenz system using different similarity metrics. The graphs have been embedded into an abstract
2-D space using edge-weighted spring-embedded layout. For panels (a)–(d), the node color indicates the system parameter ρ (from blue to bright red).
(a) Riemannian distance (embedding dimension m = 3 with delay τ = 8 time steps). (b) Pearson correlation coefficient. (c) Euclidean distance. (d) DTW.
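The route from a univariate series to a covariance matrix used for panel (a), delay embedding as in (21) followed by covariance estimation, can be sketched as below. This assumes NumPy; the function names and the noisy sine test signal are hypothetical. The sketch keeps every complete state vector, so the embedded matrix has $n - (m-1)\tau$ columns.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Delay embedding as in (21).

    Row k holds x_{i + k*tau}; column i is the state vector x_i.
    Only complete state vectors are kept, giving n - (m - 1) * tau columns.
    """
    x = np.asarray(x, dtype=float)
    n_vec = x.size - (m - 1) * tau
    if n_vec < 2:
        raise ValueError("series too short for this embedding")
    return np.stack([x[k * tau : k * tau + n_vec] for k in range(m)])

def embedded_covariance(x, m=3, tau=8):
    """m x m covariance matrix of the delay-embedded series."""
    return np.cov(delay_embed(x, m, tau))

# Hypothetical test signal: a noisy sine wave.
rng = np.random.default_rng(4)
t = np.arange(2000)
x = np.sin(0.05 * t) + 0.1 * rng.standard_normal(t.size)
S = embedded_covariance(x, m=3, tau=8)   # m and tau as in panel (a)
```

The resulting `S` is a symmetric positive definite matrix, so it can be passed to the same Riemannian distance that is used for multivariate samples.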

the current case, the performance of the Pearson correlation coefficient is better than that of Euclidean distance and DTW, as shown in Fig. 8(b). The Pearson correlation coefficient is invariant to scaling, i.e., to multiplying all elements by a positive constant or adding any constant to all elements. In addition, it focuses on capturing the co-variation between two time series, which is similar to the Riemannian distance. However, compared with the Pearson correlation coefficient, the Riemannian distance can capture more detailed information in time series by transforming the time series into a high-dimensional space.

B. Weather and Climate Data

In this paper, we use weather and climate data to evaluate the performance of the proposed method. In fact, weather sensing is a typical CPS that collects reliable information from every weather station in different geographical locations [2]. It has become a comprehensive system that includes information collection, analysis of sensor data, and control and protection of the environment. NCEI is the world’s largest provider of weather and climate data. In this paper, we analyze the daily data of 2014 from quality controlled datasets, which are collected from climate-observing stations in the U.S. (http://www1.ncdc.noaa.gov/pub/data/uscrn/products/daily01/). We selected five types of data from each station: mean air temperature (MAT), total amount of precipitation (TAP), total solar energy (TSE), average infrared surface temperature (AIST), and average relative humidity (ARH). That is to say, there are five variables in this case. Considering the data integrity, we finally used data from 116 stations to form the multisensor for carrying out the simulation.

To investigate the evolution of the network for realistic data, we first study the observed data from Moose (a town in Wyoming, western U.S.), namely a single sensor. Similar to the practice with artificial data, we still use a sliding window to intercept the time series in order to study its dynamics. As visualization cannot be realized in more than three dimensions, we select just three variables, namely MAT, TAP, and ARH, to estimate the 3 × 3 covariance matrices. Finally, 168 covariance matrices are estimated with sliding step 2 (two days) and sliding window width 30. As mentioned in Section III, the covariance matrix can be described by an ellipsoid. To understand the evolutionary process, we plot the ellipsoid series in Fig. 9(a). The color of the ellipsoids corresponds to their temporal order in the year 2014 (from blue to bright red); namely, the first one and the last one are on the bottom left and upper right in the figure, respectively. The orientation and sharpness of the glyphs depend on the eigenvectors and eigenvalues of the covariance matrix, respectively. Over time, Fig. 9(a) shows a gradual change in the shape and orientation of the ellipsoids, which reflects the evolution of the three variables. That is, the variation of eigenvectors and eigenvalues reflects a composite, or group, of the changing variables MAT, TAP, and ARH. We show the variation of the eigenvalues in Fig. 9(b). In most cases, it can be seen that λ3 is larger than the other two, which leads to the linear shape of the ellipsoids.
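The windowing and eigen-decomposition described above can be sketched as follows, assuming NumPy. The `data` array is a random, hypothetical stand-in for one year of daily (MAT, TAP, ARH) readings, so only the bookkeeping is meaningful here: 365 days with window width 30 and step 2 give 168 covariance matrices, and each matrix yields the axes of one ellipsoid.

```python
import numpy as np

def window_covariances(X, width, step):
    """Covariance matrix of each sliding window over the rows of X."""
    starts = range(0, X.shape[0] - width + 1, step)
    return np.array([np.cov(X[s:s + width].T) for s in starts])

rng = np.random.default_rng(3)
days = 365
# Hypothetical stand-in for daily (MAT, TAP, ARH); the scales are arbitrary.
data = rng.standard_normal((days, 3)) * np.array([10.0, 2.0, 5.0])

covs = window_covariances(data, width=30, step=2)

# Batched eigen-decomposition; eigenvalues are returned in ascending order,
# so evals[:, 2] plays the role of the largest eigenvalue.
evals, evecs = np.linalg.eigh(covs)
semi_axes = np.sqrt(evals)      # ellipsoid semi-axis lengths (up to a common scale)
directions = evecs              # columns are the ellipsoid axis directions
```

A window whose largest eigenvalue dominates the other two gives an elongated, nearly linear ellipsoid, which is the shape described for most windows in Fig. 9.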
Fig. 9. Evolution of ellipsoids. (a) 3-D glyphs visualizing ellipsoids. (b) Variation of eigenvalues.

Fig. 10. Complex network structure of Moose based on the distance matrix. The color of the nodes corresponds to their temporal order (from blue to
bright red).
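Turning a distance matrix into a network such as the one in Fig. 10 requires a threshold rc. The sketch below shows one possible reading of the 20% rule, and that reading is an assumption: the 20% of node pairs with the smallest distance (that is, the highest similarity) are connected, so rc is the 20th percentile of the pairwise distances. It assumes NumPy, and the toy distance matrix built from random points on a line is hypothetical.

```python
import numpy as np

def threshold_network(D, keep=0.20):
    """Adjacency matrix that connects the `keep` fraction of closest pairs.

    Assumption: an edge joins nodes i and j when D[i, j] <= rc, where rc is
    the `keep` quantile of all pairwise distances.
    """
    n = D.shape[0]
    upper = np.triu_indices(n, k=1)          # each unordered pair once
    rc = float(np.quantile(D[upper], keep))
    A = (D <= rc).astype(int)
    np.fill_diagonal(A, 0)                   # no self-loops
    return A, rc

# Hypothetical toy input: distances between random points on a line.
rng = np.random.default_rng(5)
pts = rng.random(50)
D = np.abs(pts[:, None] - pts[None, :])

A, rc = threshold_network(D, keep=0.20)
degree = A.sum(axis=1)                       # node degrees of the resulting network
edge_fraction = A[np.triu_indices(50, k=1)].mean()
```

Deriving rc from a quantile rather than fixing it by hand keeps the edge density comparable across distance measures with different scales.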

To construct the network for the station in Moose, we estimated the 5 × 5 covariance matrices for all five variables, and Fig. 10 shows the constructed network graph. Each node corresponds to the covariance matrix estimated from the data in the sliding window over time. Therefore, this network can be considered a dynamic one. The edges are determined by the distance between covariance matrices, and here the threshold is rc = 2.0, obtained by preserving the 20% edges with the highest values. The nodes are mainly divided into two communities, encircled by two ovals. Generally, nodes in the same community share similar features. Note that the color of the nodes denotes the temporal order. So the nodes in the community at the bottom left approximately correspond to January, February, March, November, and December. Meanwhile, the second community corresponds to the time period from April to October. The results are reasonable, since the first community represents the winter and spring, and the second one corresponds to summer and autumn. In addition, substantial nontrivial topological features, such as the degree, clustering coefficient, and assortativity, can also be obtained from the network to understand the climate behavior [58]. We have not investigated these topics, since this paper focuses on the construction method of a network.

To compare the proposed method with the classical techniques, we also use the correlation distance to construct the network in the case of univariate data. In the following experiment, the threshold rc is also determined by preserving the 20% edges with the highest values. The correlation distance between variables u and v is defined as d(u, v) = 1 − ρ, where ρ is the Pearson correlation coefficient of u and v. Five networks are created separately based on MAT, TAP, TSE, AIST, and ARH, which are presented in Fig. 11(a)–(e). Each node corresponds to a climate-observing station, and the edges are determined by the correlation distance. The size of the sphere corresponds to the degree of the node (station). To illustrate the results more clearly, the network is overlaid with the map
Fig. 11. Network structure for different variables. (a) MAT, rc = 1.0. (b) TAP, rc = 0.85. (c) TSE, rc = 0.3. (d) AIST, rc = 0.9. (e) ARH, rc = 0.6.
(f) Combination of five variables, rc = 1.6.
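The correlation distance used for panels (a)–(e), d(u, v) = 1 − ρ with ρ the Pearson correlation coefficient, can be computed for all station pairs at once. The sketch below assumes NumPy; the three short series are hypothetical and only illustrate the range of the distance and its insensitivity to positive rescaling and offsets.

```python
import numpy as np

def correlation_distance_matrix(series):
    """Pairwise correlation distance d(u, v) = 1 - rho.

    `series` has shape (n_stations, n_days): one row per station,
    holding a single variable such as MAT.
    """
    return 1.0 - np.corrcoef(series)

# Hypothetical data: a seasonal cycle, a rescaled and shifted copy, and its negative.
days = np.arange(365)
u = np.sin(2.0 * np.pi * days / 365.0)
series = np.vstack([u, 2.0 * u + 3.0, -u])

D_corr = correlation_distance_matrix(series)
```

The distance ranges from 0 for perfectly correlated series to 2 for perfectly anti-correlated ones, so a threshold rc on `D_corr` selects station pairs whose single-variable records move together.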

of the U.S. except for Alaska and Hawaii, and each sphere denotes the location of a station. In order to investigate the effectiveness of the network, the Wakita–Tsurumi algorithm is used to cluster the stations [59]. Spheres of the same color belong to the same cluster, which means that these stations have similar features. In addition, color
Fig. 12. Climate data. Items are MAT, TAP, TSE, AIST, and ARH from top to bottom, respectively. (a) Moose. (b) Lander.

of background represents the interpolated value of the variable at different stations.

The network of univariate data can only reflect a partial relationship, and different variables lead to different network structures. Fig. 11(a) shows the network of MAT. We can see that the network is mainly divided into three parts, which are west (green nodes), heartland (light blue nodes), and east (dark blue nodes). Fig. 11(a) is similar to Fig. 11(d), which corresponds to the infrared surface temperature (AIST). This is explained by the fact that there is a very close relation between MAT and AIST; namely, both are associated with temperature. The network structure in Fig. 11(b), which is for TAP, is nearly homogeneous. It should be noted that homogeneity does not mean that the nodes have similar TAP but rather that the transition is smooth. Nonetheless, there is an exception: the station in Everglades has similar TAP to others that are located far away from it (denoted by the red edge). Fig. 11(c) and (e) illustrate the networks for TSE and ARH, respectively. Generally, TSE has a close relationship with ARH, which can be seen in the two figures. In particular, in the West, high TSE corresponds to low ARH, and the network is dense in this area. As a generalization, the climate of the West can be described as overall semiarid; however, parts of the West get extremely high amounts of rain and/or snow, and still other parts are true desert and get less than 130 mm of rain per year.

In most cases, the interaction between two objects is complex and is a comprehensive effect of various elements. Similarly, the weather is often affected by various factors such as air temperature, atmospheric pressure, humidity, etc. Consequently, it is necessary to investigate the behavior in the case of multivariate data. Fig. 11(f) shows the network constructed by the proposed method, which considers all five variables. The covariance matrix $\Sigma_i$ of the data from each station is first estimated, where the subscript $i$ denotes the label of the $i$th station. The threshold rc = 1.6 is used to determine the adjacency matrix, and then the network is deduced.

The U.S. includes areas with a very great range of weather and climatic conditions around the year. Some interesting phenomena can be immediately found in Fig. 11(f). Here, a larger sphere represents a larger degree, which usually means that the node shares common features with its neighbors. In contrast, a small sphere denotes that the node has an independent character. In Fig. 11(f), there are some isolated nodes located in the western U.S., which reveals that these regions have special weather conditions. It is easy to understand this situation, as we all have seen the cowboy movies that showed the climate conditions in that area.

In general, the weather and climate conditions in related areas are similar. However, it is worthwhile to note the weather of Darrington and Port Aransas in Fig. 11(f). The thick red curves in the figure show the connections between Darrington and other areas. Though Darrington is located in the north-western U.S., the figure reveals that it has weather conditions comparable with the southeast. Note that this situation is also found in Port Aransas (the edges are denoted as purple curves). This phenomenon may be explained by “teleconnections,” a term from climate science [60]. By teleconnections, we mean the relationship between weather changes arising in widely separated regions. The main reason for this is that information is transferred between the distant points through the atmosphere. Generally, the information transmission leads to a significant positive or negative correlation in these places.

From Fig. 11(f), we can also find another exception, the weather in Moose (orange sphere). Strangely, Moose has no connection to its neighbors even though they are close together geographically. Perhaps the reason is that it is near the Grand Teton National Park, which belongs to the Greater Yellowstone Ecosystem. To illustrate this phenomenon from the point of view of data, we compare the climate data in Moose and Lander, which are close together. Fig. 12 shows five types of data including MAT, TAP, TSE, AIST, and ARH. We noted that the big difference lies in the
MAT, TAP, and AIST. Moose has lower MAT and AIST as well as higher TAP, which can explain its particular weather conditions. In fact, more information can be mined from the constructed network. For example, it can provide a viewpoint by focusing on information flow within a network over time [39]. Since we mainly address the construction method of the complex network, further topics were not investigated in this paper. With regard to the dynamics analysis, we have given an approach using eigenvectors and eigenvalues in this paper. Though the proposed approach is elementary (or naive), it provides us with a novel viewpoint to study the dynamics of variables in the case of MTS. In fact, eigenvectors and eigenvalues have always been an important method to analyze the dynamics of a system [61], [62]. In short, the complex networks of time series are capable of capturing complex relationships, discovering spatial and temporal structure, and incorporating predictive modeling into a single framework. This network approach has already led to novel insights for time series analysis, and we believe it will hold even greater potential.

V. CONCLUSION

In CPS, a large number of sensors are often used to collect information, which requires further processing and analysis. In general, the information is presented in time series form. In this paper, we proposed an effective method for constructing complex networks for MTSs from a single sensor and multisensor. This method extends existing techniques of network construction from time series to support time series analysis problems and multisensor data fusion. The primary aim is to represent the time series with covariance matrices among the variables and then construct the network. We showed that the proposed method is intrinsic, that is, the constructed network lies on a Riemannian manifold. In many cases, the time series form a non-Euclidean geometric structure, and hence the proposed method may be suitable for real-world applications. In particular, this method provides a new idea and method for modeling, representation, visualization, and big data analytics in CPS.

The proposed method converts time series into complex networks, and the basic motivation for this conversion is to use complex network science to analyze and quantify the time series. In future work, we intend to develop classification and clustering algorithms on a Riemannian manifold for time series mining. The classification problem plays an important role in time series data mining, such as speech recognition, financial market analysis, or climate prediction. The proposed method provides a novel form (complex network) to deal with these problems. In particular, we intend to apply the presented approach to functional magnetic resonance imaging applications, such as the decoding of human visual systems.

REFERENCES

[1] S. K. Khaitan and J. D. McCalley, “Design techniques and applications of cyberphysical systems: A survey,” IEEE Syst. J., vol. 9, no. 2, pp. 350–365, Jun. 2015.
[2] G. Mois, T. Sanislav, and S. C. Folea, “A cyber-physical system for environmental monitoring,” IEEE Trans. Instrum. Meas., vol. 65, no. 6,
[3] I. Lee et al., “Challenges and research directions in medical cyber–physical systems,” Proc. IEEE, vol. 100, no. 1, pp. 75–90, Jan. 2012.
[4] O. Yazdanbakhsh and S. Dick, “Forecasting of multivariate time series via complex fuzzy logic,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 47, no. 8, pp. 2160–2171, Aug. 2017.
[5] I. V. Bezsudnov and A. A. Snarskii, “From the time series to the complex networks: The parametric natural visibility graph,” Phys. A Stat. Mech. Appl., vol. 414, pp. 53–60, Nov. 2014.
[6] Y. Zhao, T. Weng, and S. Ye, “Geometrical invariability of transformation between a time series and a complex network,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 90, no. 1, 2014, Art. no. 12804.
[7] M. Small, “Complex networks from time series: Capturing dynamics,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), Beijing, China, 2013, pp. 2509–2512.
[8] A. Clauset, C. Moore, and M. E. J. Newman, “Hierarchical structure and the prediction of missing links in networks,” Nature, vol. 453, no. 7191, pp. 98–101, May 2008.
[9] A.-L. Barabasi, “Scale-free networks: A decade and beyond,” Science, vol. 325, no. 5939, pp. 412–413, Jul. 2009.
[10] D. J. Watts and S. H. Strogatz, “Collective dynamics of ‘small-world’ networks,” Nature, vol. 393, no. 6684, pp. 440–442, Jun. 1998.
[11] A. E. Motter and R. Albert, “Networks in motion,” Phys. Today, vol. 65, no. 4, p. 43, 2012.
[12] J. McGloin and D. Kirk, “Social network analysis,” in Handbook of Quantitative Criminology, A. R. Piquero and D. Weisburd, Eds. New York, NY, USA: Springer, 2010, pp. 209–224.
[13] M. Kárný and R. Herzallah, “Scalable harmonization of complex networks with local adaptive controllers,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 47, no. 3, pp. 394–404, Mar. 2017.
[14] N. Pham, T. Abdelzaher, and S. Nath, “On bounding data stream privacy in distributed cyber-physical systems,” in Proc. IEEE Int. Conf. Sensor Netw. Ubiquitous Trustworthy Comput., Newport Beach, CA, USA, 2010, pp. 221–228.
[15] S. H. Strogatz, Nonlinear Dynamics and Chaos With Applications to Physics, Chemistry and Engineering. Cambridge, MA, USA: Westview Press, 1994.
[16] Z. Gao and N. Jin, “Flow-pattern identification and nonlinear dynamics of gas-liquid two-phase flow in complex networks,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top., vol. 79, no. 6, 2009, Art. no. 66303.
[17] Y. Yang and H. Yang, “Complex network-based time series analysis,” Phys. A Stat. Mech. Appl., vol. 387, nos. 5–6, pp. 1381–1386, 2008.
[18] X. Xu, J. Zhang, and M. Small, “Superfamily phenomena and motifs of networks induced from time series,” Proc. Nat. Acad. Sci. USA, vol. 105, no. 50, pp. 19601–19605, 2008.
[19] S. Salvador and P. Chan, “FastDTW: Toward accurate dynamic time warping in linear time and space,” Intell. Data Anal., vol. 11, no. 5, pp. 561–580, 2007.
[20] L. Bergroth, H. Hakonen, and T. Raita, “A survey of longest common subsequence algorithms,” in Proc. 7th Int. Symp. String Process. Inf. Retrieval. (SPIRE), 2000, pp. 39–48.
[21] M. G. Baydogan and G. Runger, “Learning a symbolic representation for multivariate time series classification,” Data Min. Knowl. Disc., vol. 29, no. 2, pp. 400–422, 2014.
[22] H. Ferreira and M. Ferreira, “Extremes of scale mixtures of multivariate time series,” J. Multivariate Anal., vol. 137, pp. 82–99, May 2015.
[23] H. M. La, W. Sheng, and J. Chen, “Cooperative and active sensing in mobile sensor networks for scalar field mapping,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 45, no. 1, pp. 1–12, Jan. 2015.
[24] B. Chakraborty, “Feature selection and classification techniques for multivariate time series,” in Proc. 2nd Int. Conf. Innov. Comput. Inf. Control (ICICIC), Kumamoto, Japan, 2007, p. 42.
[25] X. Zhang, J. Wu, X. Yang, H. Ou, and T. Lv, “A novel pattern extraction method for time series classification,” Optim. Eng., vol. 10, no. 2, pp. 253–271, 2009.
[26] T. Górecki and M. Łuczak, “Multivariate time series classification with parametric derivative dynamic time warping,” Expert Syst. Appl., vol. 42, no. 5, pp. 2305–2312, Apr. 2015.
[27] S. I. Amari and H. Nagaoka, Methods of Information Geometry. Providence, RI, USA: Oxford Univ. Press, 2000.
[28] M. Ma, J. An, Z. Huang, and Z. Cao, “Sensor data fusion based on an improved dempaster-shafer evidence theory in vehicular cyber-physical systems,” in Proc. IEEE Int. Symp. Intell. Control (ISIC), Sydney, NSW,
pp. 1463–1471, Jun. 2016. Australia, 2015, pp. 683–687.
[29] A. Miloslavov and M. Veeraraghavan, “Sensor data fusion algorithms for vehicular cyber-physical systems,” IEEE Trans. Parallel Distrib. Syst., vol. 23, no. 9, pp. 1762–1774, Sep. 2012.
[30] A.-L. Barabási, Linked: The New Science of Networks. Cambridge, MA, USA: Perseus Books Group, 2002, pp. 9–25.
[31] A. Barrat, M. Barthélemy, and A. Vespignani, Dynamical Processes on Complex Networks. New York, NY, USA: Cambridge Univ. Press, 2008.
[32] R. Pastor-Satorras and A. Vespignani, Evolution and Structure of the Internet. Cambridge, U.K.: Cambridge Univ. Press, 2007.
[33] P. Manshour and A. Montakhab, “Contagion spreading on complex networks with local deterministic dynamics,” Commun. Nonlin. Sci. Numer. Simulat., vol. 19, no. 7, pp. 2414–2422, Jul. 2014.
[34] J. Zhang and M. Small, “Complex network from pseudoperiodic time series: Topology versus dynamics,” Phys. Rev. Lett., vol. 96, no. 23, Jun. 2006, Art. no. 238701.
[35] L. Lacasa, B. Luque, F. Ballesteros, J. Luque, and J. C. Nuño, “From time series to complex networks: The visibility graph,” Proc. Nat. Acad. Sci. USA, vol. 105, no. 13, pp. 4972–4975, 2008.
[36] A. H. Shirazi et al., “Mapping stochastic processes onto complex networks,” J. Stat. Mech. Theory Exp., vol. 2009, no. 7, Jul. 2009, Art. no. P07046.
[37] R. V. Donner and J. F. Donges, “Visibility graph analysis of geophysical time series: Potentials and possible pitfalls,” Acta Geophys., vol. 60, no. 3, pp. 589–623, 2012.
[38] C. J. Stam, “Modern network science of neurological disorders,” Nat. Rev. Neurosci., vol. 15, no. 10, pp. 683–695, Sep. 2014.
[39] I. Ebert-Uphoff and Y. Deng, “A new type of climate network based on probabilistic graphical models: Results of boreal winter versus summer,” Geophys. Res. Lett., vol. 39, no. 19, pp. 157–159, 2012.
[40] Z. Gao and N. Jin, “Complex network from time series based on phase space reconstruction,” Chaos, vol. 19, no. 3, 2009, Art. no. 33137.
[41] C. Liu, W.-X. Zhou, and W.-K. Yuan, “Statistical properties of visibility graph of energy dissipation rates in three-dimensional fully developed turbulence,” Phys. A Stat. Mech. Appl., vol. 389, no. 13, pp. 2675–2681, Jul. 2010.
[42] K. Padberg, B. Thiere, R. Preis, and M. Dellnitz, “Local expansion concepts for detecting transport barriers in dynamical systems,” Commun. Nonlin. Sci. Numer. Simulat., vol. 14, no. 12, pp. 4176–4190, Dec. 2009.
[43] K. Sun and S. Marchand-Maillet, “An information geometry of statistical manifold learning,” in Proc. 31st Int. Conf. Mach. Learn., Beijing, China, 2013, pp. 1–9.
[44] F. Barbaresco, “Innovative tools for radar signal processing based on Cartan’s geometry of SPD matrices & information geometry,” in Proc. IEEE Radar Conf., Rome, Italy, 2008, pp. 1–6.
[45] X. Pennec, P. Fillard, and N. Ayache, “A Riemannian framework for tensor computing,” Int. J. Comput. Vis., vol. 66, no. 1, pp. 41–66, 2006.
[46] S.-I. Amari, “Information geometry of positive measures and positive-definite matrices: Decomposable dually flat structure,” Entropy, vol. 16, no. 4, pp. 2131–2145, 2014.
[47] K. Steinhaeuser, N. V. N. Chawla, and A. R. A. Ganguly, “Complex networks in climate science: Progress, opportunities and challenges,” in Proc. Conf. Intell. Data Understand., 2010, pp. 16–26.
[48] E. N. Lorenz, “Deterministic nonperiodic flow,” J. Atmos. Sci., vol. 20, no. 2, pp. 130–141, Mar. 1963.
[49] P. B. Kingsley, “Introduction to diffusion tensor imaging mathematics: Part I. Tensors, rotations, and eigenvectors,” Concepts Magn. Reson. A, vol. 28A, no. 2, pp. 101–122, Mar. 2006.
[50] R. E. Kass and P. W. Vos, Geometrical Foundations of Asymptotic Inference. New York, NY, USA: Wiley, 1997.
[51] J. E. Contreras-Reyes and R. B. Arellano-Valle, “Kullback–Leibler divergence measure for multivariate skew-normal distributions,” Entropy, vol. 14, no. 9, pp. 1606–1626, Sep. 2012.
[52] V. Arsigny, P. Fillard, X. Pennec, and N. Ayache, “Geometric means in a novel vector space structure on symmetric positive-definite matrices,” SIAM J. Matrix Anal. Appl., vol. 29, no. 1, pp. 328–347, 2007.
[53] D. Harel and Y. Koren, “A fast multi-scale method for drawing large graphs,” J. Graph Algorithms Appl., vol. 6, no. 3, pp. 179–202, 2002.
[54] S. Matei, “Analyzing social media networks with NodeXL: Insights from a connected world by Derek Hansen, Ben Shneiderman, and Marc A. Smith,” Int. J. Human Comput. Interaction, vol. 27, no. 4, pp. 405–408, 2011.
[55] K. Petersen and M. Pedersen. (May 10, 2012). The Matrix Cookbook. [Online]. Available: https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf
[56] F. Takens, “Detecting strange attractors in turbulence,” in Dynamical Systems and Turbulence, Warwick 1980. Berlin, Germany: Springer-Verlag, 1981, pp. 366–381.
[57] A. M. Fraser and H. L. Swinney, “Independent coordinates for strange attractors from mutual information,” Phys. Rev. A, vol. 33, no. 2, pp. 1134–1140, Feb. 1986.
[58] M. E. J. Newman, Networks: An Introduction. Oxford, U.K.: Oxford Univ. Press, 2010.
[59] K. Wakita and T. Tsurumi, “Finding community structure in mega-scale social networks,” in Proc. 16th Int. Conf. World Wide Web, Banff, AB, Canada, 2007, pp. 1275–1276.
[60] A. A. Tsonis, K. L. Swanson, and G. Wang, “On the role of atmospheric teleconnections in climate,” J. Climate, vol. 21, no. 12, pp. 2990–3001, Jun. 2008.
[61] S. Adhikari, “Rates of change of eigenvalues and eigenvectors in damped dynamic system,” AIAA J., vol. 37, no. 11, pp. 1452–1458, Nov. 1999.
[62] P. Gonçalves, “Behavior modes, pathways and overall trajectories: Eigenvector and eigenvalue analysis of dynamic systems,” Syst. Dyn. Rev., vol. 25, no. 1, pp. 35–62, Jan./Mar. 2009.

Jiancheng Sun (SM’17) received the B.S. and M.S. degrees in nuclear science and technology from Harbin Engineering University, Heilongjiang, China, in 1997 and 2000, respectively, and the Ph.D. degree in information and communication engineering from the School of Electronics and Information Engineering, Xi’an Jiaotong University (XJTU), Xi’an, China, in 2005.
He is currently a Professor with the School of Software and Communication Engineering, Jiangxi University of Finance and Economics, Nanchang, China. From 2007 to 2009, he held a Post-Doctoral Fellowship in the Key Laboratory of Biomedical Information Engineering of Ministry of Education, XJTU. From 2013 to 2014, he was an Academic Visitor with Neuralimage Centre, York University, York, U.K. His current research interests include machine learning and nonlinear time-series analysis.

Yong Yang (SM’16) received the Ph.D. degree in biomedical engineering from Xi’an Jiaotong University, Xi’an, China, in 2005.
He is currently a Full Professor with the School of Information Technology, Jiangxi University of Finance and Economics, Nanchang, China. From 2009 to 2010, he was a Post-Doctoral Research Fellow with Chonbuk National University, Jeonju, South Korea. His current research interests include image and signal processing, medical image processing and analysis, and pattern recognition.
Dr. Yang has been a recipient of the Jiangxi Province Young Scientist Award since 2012. He is a member of ACM.

Neal N. Xiong (SM’12) received the Ph.D. degree in sensor system engineering from Wuhan University, Wuhan, China, and the Ph.D. degree in dependable sensor networks from the Japan Advanced Institute of Science and Technology, Nomi, Japan.
He was with Georgia State University, Atlanta, GA, USA, Wentworth Technology Institution, Boston, MA, USA, and Colorado Technical University, Colorado Springs, CO, USA, for about 10 years. He is currently an Associate Professor (3rd year) with the Department of Mathematics and Computer Science, Northeastern State University, OK, USA. His current research interests include cloud computing, security and dependability, parallel and distributed computing, networks, and optimization theory. He has published over 280 international journal papers and over 120 international conference papers. Some of his works were published in IEEE Journal on Selected Areas in Communications, IEEE or ACM Transactions, ACM Sigcomm Workshop, IEEE International Conference on Computer Communications, International Conference on Distributed Computing Systems, and International Symposium on Parallel and Distributed Processing.
Dr. Xiong was a recipient of the Best Paper Award in the 10th IEEE International Conference on High Performance Computing and Communications (HPCC-08) and the Best Student Paper Award in the 28th North American Fuzzy Information Processing Society Annual Conference (NAFIPS2009).
Liyun Dai was born in 1974. She received the M.S. and Ph.D. degrees in communication and information system from the Beijing University of Posts and Telecommunications, Beijing, China, in 2003 and 2012, respectively.
In 2005, she was a Research Fellow of the Modern Communication Institute, Jiangxi University of Finance and Economics, Nanchang, China, where she is currently an Associate Professor. Her current research interests include coded modulation, error-correcting code, Multiple-Input Multiple-Output, and Code Division Multiple Access.

Xiangdong Peng was born in 1975. He received the M.S. degree in electrical and communication engineering from the Huazhong University of Science and Technology, Wuhan, China, in 2004 and the Ph.D. degree in mechatronic engineering from Nanchang University, Nanchang, China, in 2015.
He is currently a Lecturer with the Jiangxi University of Finance and Economics, Nanchang. His current research interests include machine learning, compressed sensing, healthcare service robot, and body area sensor network.

Jianguo Luo received the B.S. degree in communication engineering from Xidian University, Xi’an, China, and the M.S. degree in software engineering from the Huazhong University of Science and Technology, Wuhan, China, in 1991 and 2006, respectively.
He is currently an Associate Professor with the School of Software and Communication Engineering, Jiangxi University of Finance and Economics, Nanchang, China. His current research interests include signal processing and machine learning.
