You are on page 1of 10

Received November 12, 2019, accepted December 16, 2019, date of publication December 20, 2019,

date of current version December 31, 2019.


Digital Object Identifier 10.1109/ACCESS.2019.2961243

Opportunistic Networks Link Prediction Method


Based on Bayesian Recurrent Neural Network
YULIANG MA AND JIAN SHU
School of Software, Nanchang Hangkong University, Nanchang 330063, China
Corresponding author: Jian Shu (shujian@nchu.edu.cn)
This work was supported in part by the National Natural Science Foundation of China under Grant 61762065 and Grant 61962037, in part
by the Natural Science Foundation of Jiangxi Province under Grant 20171ACB20018 and Grant 20171BBH80022, and in part by the
Innovation Foundation for Postgraduate Student of Jiangxi Province under Grant YC2018093.

ABSTRACT The goal of link prediction is to estimate the possibility of future links among nodes using
known network information and the attributes of the nodes. According to the time-varying characteristics
and the node’s mobility of opportunistic networks, this paper proposes a novel link prediction method
based on the Bayesian recurrent neural network (BRNN-LP) framework. The time series data of a dynamic
opportunistic networks is sliced into snapshots in which there exist the correlation information and spatial
location information. A vector of a snapshot is constructed based on such information, which represents
the link information. Then, the vectors of multiple network snapshots constitute a spatiotemporal vector
sequence. Benefiting from the BRNN’s ability of extracting the features of time series data, the correlation
between spatiotemporal vector sequence and node connection states is learned, and the law of the link
evolution is captured to predict future links. The results on the MIT Reality dataset show that compared
with methods such as the similarity-based indices, the support vector classifier, linear discriminant analysis
and recurrent neural network, the proposed prediction method is more accurate and stable.

INDEX TERMS Link prediction, opportunistic networks, Bayesian recurrent neural network.

I. INTRODUCTION
Opportunistic network (ON) [1] is an ad hoc network based
on Mobile Ad Hoc Networks (MANETs) and Delay Tolerant
Networks (DTNs). ON utilizes node mobility and the ‘‘store-
carry-forward’’ routing pattern to realize network commu-
nications, and the network topology changes over time [2].
Figure 1 shows the process of sending messages from the
FIGURE 1. ON communication.
source node S to the destination node D. The network topol-
ogy of ON changes constantly. ON does not require the
full connectivity of the network and is more suitable for nodes by analysing the known link information, which
the actual requirements of a self-organizing network, which is helpful to obtain the evolutional law of the network
can meet the needs of network communications under bad structure [9], and provide decisions for ON routing and
conditions [3]. ON is one of the hot fields in network, and forwarding, so as to provide better services for the upper-layer
it is widely applied in many fields such as Location-based application.
Service [4], Media Services [5], Mobile Data Offloading [6], According to the time-varying characteristics and the
and Intelligent Traffic [7] and so on. nodes’ mobility of ON, a link prediction method based on
Among the various aspects, the opportunistic forwarding a Bayesian recurrent neural network (BRNN) [11], [12] is
mechanism is the first and foremost issue in ON field. Link proposed.
prediction can dig out the potential relationships between This paper makes two research contributions. Firstly,
it proposes a representation method for the link information in
The associate editor coordinating the review of this manuscript and ON. We construct a vector to represent the link information,
approving it for publication was Bilal Alatas . which includes the correlation information and the spatial

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
185786 VOLUME 7, 2019
Y. Ma, J. Shu: ONs Link Prediction Method Based on BRNN

location information of node pairs. Secondly, this paper puts A successful similarity indices-based prediction method
forward a deep learning model, BRRN-LP, to predict the links depends upon the exact representation of the network’s struc-
of ON. It converts prediction problems into classification tural features. This method can be effective for networks with
problems, and a better result of link prediction is achieved static topology or topology changing slowly, but it has a poor
by the proposed model. predictive effect on ON with sparse and topological changes
The rest of the paper is organized as follows. Section 2 anal- because it cannot depict the changes of ON [22].
yses the related work. Section 3 describes the problem of
link prediction. Section 4 explains the representation methods B. PROBABILISTIC MODEL-BASED
for link information. Section 5 introduces the construction PREDICTION METHODS
of the BRRN-LP model. The experiments and analyses are Probabilistic model-based prediction methods constructs a
described in Section 6. The last part is the conclusion. probabilistic model that can describe the network structure
and the relational features primarily by optimizing of the
II. RELATED WORK parameter. Wang et al. [23] design a local probability model
The link prediction methods for ON can be grouped into with two-hop neighbours, which utilize Markov random field
the following categories: link prediction methods based on and undirected graphs to solve the link prediction problem in
similarity indices, probabilistic models, machine learning and large graphs. Das and Das [24] considered the impacts of fine-
time series. grained and coarse-grained time on links, and established
a Markov prediction model based on time-varying graphs
A. SIMILARITY INDICES-BASED PREDICTION METHODS of social networks, which can better predict the connection
Similarity indices-based prediction methods predict links by of the node pairs in a dynamic network. Shalforoushan and
constructing similarity indices which are used to describe the Jalali [25] use a bayesian network to learn the relation-
similarity between nodes. The similarity between nodes is ships between variables, and propose friend recommendation
characterized by attribute similarity and structure similarity. method based on Bayesian network for a social network.
Usually, structure similarity is employed to predict links, Probabilistic model-based prediction methods describe the
because the attributes of node are difficult to obtain and not entire network (including the node attribute information and
accurate. Researchers have proposed around 30 traditional network structural information), which allows the method to
algorithms [13] that are based on different network structures. obtain more accurate prediction result. However, this method
The structure similarity indices-based prediction is further is computationally complex. The model parameters need to
categorized as local, global, and quasi-local methods [14]. be adjusted according to the requirements of different net-
Local similarity methods use local information of an edge works, causing poor applicability.
to calculate the similarity matrix, such as the degree of a
node, common neighbour (CN) [15], etc. The CN similarity C. MACHINE LEARNING AND TIME SERIES-BASED
index considers the impact of a one-hop neighbour on the PREDICTION METHODS
node pairs, and assumes that the more common neighbours Machine learning based prediction methods predict links by
are, the more possibility of connection between nodes is, and learning the more essential features of the network. Link
thus deriving a series of new similarity index, such as Salton, information is the input of the machine learning model.
Jaccard, preferential attachment (PA) [16], and Adamic-Adar Output is 0 or 1, where 0 means that the node pairs are not
(AA) [17]. Shang et al. [18] points out that the performance connected, and 1 means that the node pairs are connected. The
of these algorithms on highly sparse or treelike networks is commonly used supervision classification algorithms, such
poor, and proposes heterogeneity index, homogeneity index as the support vector machine (SVM), kernel regression [26],
and heterogeneity adaption index for their heterogeneity. decision tree and linear discriminant analysis (LDA) [27], can
Global similarity method uses total information on net- solve link prediction problems, and the SVM is better than the
work to calculate the similarity matrix between nodes. most classification algorithms on the link prediction for Sina
Katz [19] takes into account all paths and distributes different Weibo [28]. Yang et al. [29] propose a link prediction method
weights to paths with different lengths. The similarity indices based on clustering and decision trees. After clustering two
based on random walks uses the node random walk to estab- types of objects, the method constructs decision trees to solve
lish the probability transfer matrix, so as to obtain the prob- the link prediction problem of heterogeneous networks.
ability from one node going to another node, thus obtaining With further studies on link prediction for ON, schol-
the probability that the node pairs establish a connection. ars have paid more attention to the regulation of network
Quasi-Local similarity Methods do not require global topo- changes, and predict links using time series. There are usually
logical information but use more information than local three steps. First, the similarity indices is chosen. Second,
index. The local path (LP) [20] takes into account the contri- the time series method is employed to analyse the law of
butions of path information of third-order neighbour. In [21], the similarity indices changes. Finally, the predicted scores
SimRank is defined in a self-consistent way, according to the of the future connections of the node pairs is obtained [30].
assumption that two nodes are similar if they are connected Güneş et al. [31] used time series prediction models to
to similar nodes. weight different similarity indices (such as the CN, PA, AA,

VOLUME 7, 2019 185787


Y. Ma, J. Shu: ONs Link Prediction Method Based on BRNN

and Jaccard), which showed that the traditional similarity


indices are suitable for link prediction in dynamic networks.
Wang et al. [32] proposed a time series link prediction method
based on evolution. In this method, the tensor decomposition
is used to find the time factor in the network, and the time
factor is analysed using time series to obtain the time factor
value of the next time. This allows the network structure of
the next time slice to be obtained and the link prediction
problem to be solved. Ozcan et al. proposed NARX neural
networks [33] and Multivariate Time Series Link Prediction
method [34] to solve the link prediction problem in het-
erogeneous dynamic networks. K-K Shang et al. proposed
the concepts of current or history network and the future
network to analyse the link evolution, then proposed the new FIGURE 2. Snapshot of ON.
link prediction metric and algorithms for evolving networks,
and they found that the direct link algorithm performs better
than the indirect link algorithm for a range of time-varying
networks [35]. Furthermore, they found that geographical
location and link weight both have significant influence on
the transport network [36].
For a dynamic ON with more sparse connections, it is
FIGURE 3. Connection diagram of the node pairs in time slices.
necessary for time series analysis by deep learning method.
Using convolutional neural networks (CNNs), [37] extracted
ON’s structural features from the evolutionary process of
the state diagram, and predicted the links between multi- the correlation information and location information within
ple nodes. Considering the historical connection informa- each slice is defined using the contact information and spatial
tion, [38] constructed a sequence vector, and extracted the information respectively. Finally, we construct the vector with
features of the changes of the historical information using correlation information and location information to represent
a recurrent neural network (RNN). Then, link prediction is the link information.
performed.
In this paper, we turn link prediction problems into clas- A. DATA CONVERSION
sification problems. We construct a vector to represent the This paper employs the time series analysis method to seg-
link information, which includes the correlation information ment the continuous data in the data set and divide ON into
and the spatial location information of node pairs. We also several continuous network snapshots.
construct the spatiotemporal vector sequence to represent Figure 3 shows the connection information of the node
historical link information. We propose a link prediction pairs (A, B) in a single network snapshot [TS , TN ]. T denotes
method based on the Bayesian recurrent neural network the time slice length. TS and TN represent the moments when
model (BRNN-LP) to extract the spatiotemporal features the time slice begin and end, respectively. T1S and T2S rep-
from the evolutionary process of ON. The method achieves resent the moments when the node pairs connect during the
a better link prediction. time slice. T1N and T2N represent the moments when the node
pairs disconnect during the time slice. tcon1 and tcon2 represent
III. PROBLEM DESCRIPTION the duration of connection between two nodes. If the time
The topology of ON changes over time. The network topol- slice length T is too large or too small, the information may
ogy is divided into several continuous network snapshots. be blurry or insufficient. Thus, the link prediction results
As shown in Figure 2, the network snapshot set is G = will be affected according to their different values. Later,
{G1 , G2 , . . . , Gt , . . . , GN }, where Gt = (Vt , Et ), Gt repre- we determine the value of T by experiments.
sents the network snapshot at t − th moment, and Vt and
Et respectively represent the sets of nodes and edges in the B. INFORMATION REPRESENTATION
network snapshot at t − th moment. The link prediction in The generation of links depends on the encounter chance that
ON is to predict connection state in the (t + 1) − th network is determined by two factors: the location and the reason
snapshot according to the historical link information of the for encounter. The location factors are influenced by the
node pairs in the previous t network snapshots [38]. habits of the nodes’ movement. The reason for the encounter
seems to be random, but it is more likely that there is some
IV. LINK INFORMATION REPRESENTATION correlation between the nodes. Thus, this paper represents the
The ON dataset are stored in chronological order. Firstly, ON link information with respect to the correlation information
is sliced into several continuous network snapshots. Then and spatial location information.

185788 VOLUME 7, 2019


Y. Ma, J. Shu: ONs Link Prediction Method Based on BRNN

1) REPRESENTATION OF THE CORRELATION INFORMATION


In ON, there are often potential correlations between nodes,
such as social relationships among people in social networks
and race relationships among animals in animal tracking
networks. However, reliable data are often difficult to obtain.
FIGURE 4. Input sample set composition process of the BRNN-LP model.
The contact information of nodes can reflect their correlation.
The contact information includes the frequency and duration
of the connection [39]. The bigger the frequency and duration
V. THE CONSTRUCTION OF PREDICTION MODEL
of connections are, the stronger the correlation between the
To learn the deep relationship between the node connection
nodes is. We believe that the correlation of the nodes is
states and link information, BRNN is employed to capture the
proportional to the frequency and duration of connections.
spatiotemporal features of the connections between nodes in
Thus, we define the correlation of the node pairs at t − th
ON. The trained link prediction model can predict the connec-
time slice as wt , as shown in (1).
 tion states of the node pairs at the next moment. We construct
L the spatiotemporal vector sequence, which consists of vectors
L
 X
etconi , i ∈ (1, 2, . . . , L)

wt = T (1) in several time slices, and take it as the input of the link
 i=1 prediction model BRNN-LP. The output of the model is the

 0, (L = 0) connection states of the node pairs at the next moment. The
where T denotes the time slice length, L denotes the number BRNN-LP model is constructed with the following sections:
of links that are generated by node pairs, tconi denotes the network structure, hyper-parameter setting, and training opti-
duration corresponding to the i − th connection, and L/T mization algorithm.
denotes the connection frequency of the node pairs within
the time slice. The more connections that exist in a time slice, A. INPUT AND OUTPUT OF THE MODEL
the bigger the connection frequency is. The exponential func- We construct the spatiotemporal vector sequence and take
tion etconi represents that the duration impacts on correlation. it as the input of the BRNN-LP model. The spatiotemporal
When wt is 0, there is no connection between nodes in the vector sequence consists of vectors of several time slices,
time slice. where the length of the spatiotemporal vector sequence is the
prediction window size. The vector sequence is defined in (4).
2) REPRESENTATION OF THE SPATIAL
Seq = [VE1 , VE2 , . . . , VEn ] (4)
LOCATION INFORMATION
The spatial location information of the node pairs is the In order to reduce computational complexity, we don’t use all
location information when the nodes meet. This paper marks historical link information. It is commonly assumed that the
the location of the node pairs using the communication area. present connection state is only related to link information
If two nodes are in the same communication area within a of previous n time slices. Selecting an appropriate input
certain period of time, it is more likely that two nodes will sequence length is helpful for feature extraction and link
connect [40]. M is the set of whole communication area prediction. We determine the optimal value of the length of
marks to indicate the locations where the link was generated the vector sequence by experiments.
in the network, where M = {x|x ∈ Z +}. The spatial location Assuming that the length of the space-time vector sequence
information metric of the node pairs at t − th time slice is the is 3, construction process of the input sample set for
location pt , as shown in (2). BRNN-LP model is shown in Figure 4.
X j We turn link prediction problems into classification prob-
 2, m⊂M lems. The input of the BRNN-LP is spatiotemporal vector
pt = j∈m (2) sequence in a link prediction window. The model’s output of

0, m=∅ the node pairs at the next moment is 0 or 1. 0 means that the
where m denotes the subset of M to indicate the locations node pairs are not connected, and 1 means that the node pairs
where the link was generated in the time slice, j is the element are connected.
in the m. If the j is repeated, it is only calculated once. If the
pt value is 0, there is no connection between nodes in the time B. NETWORK STRUCTURE
slice. Recurrent neural network (RNN) [41] are a kind of neural
network, which can be used to analyse time series data.
3) REPRESENTATION OF THE LINK INFORMATION BRNN uses Bayesian method to express the uncertainty of
Comprehensively considering the correlation information neural network parameters. These parameters are treated as
and location information, we construct a vector VEt to repre- random variables. Furthermore, the posterior distribution of
sent link information at t − th time slice, as shown in (3). the parameters based on the given data can be inferred, so that
the BRNN model does not need much data training which is
VEt = (wt , pt ) (3) suitable for solving classification problems with smaller and

VOLUME 7, 2019 185789


Y. Ma, J. Shu: ONs Link Prediction Method Based on BRNN

hyper-parameters. The construction of the BRNN in this


paper involves the following hyper-parameters: the activation
function of the hidden layer, the initialization weight and the
length of the input sequence.

1) THE ACTIVATION FUNCTION OF THE HIDDEN LAYER


The performance of the BRNN-LP model is affected by the
activation function. To ensure successful training of the deep
network structures, one must select a reasonable activation
function. For example, the sigmoid function has double-end
saturation characteristics, which makes it subject to the ‘‘van-
FIGURE 5. Network structure. ishing gradient’’ problem when training deep networks, and
the convergence rate of the model becomes very slow. In view
of this issue, the rectified linear unit (ReLU) was proposed by
sparser samples. We use the model BRNN trained by vector some scholars. The ReLU is a linear activation function with
sequence to extract potential features of the link information the characteristics of simple calculations, high efficiency, and
changes of the node pairs over time. The construction process no gradient saturation. Therefore, we use the ReLU activation
of the BRNN for the link prediction classification task is function to accelerate network training.
shown in formulas (5), (6) and (7).
θ ∼ N (0, I ) (5) 2) THE INITIALIZATION WEIGHT
π = fNN (Seq; θ ) (6) Too bigger or too small initial weights in the RNN may
lead to gradient exploding or gradient vanishing in back
y ∼ Cat(logistic(π )) (7) propagation. The BRNN learns the distribution of weights
where θ is the weight of the neural network fNN , which is from the training samples using Bayesian backpropagation,
the potential feature between nodes, π is the non-normalized which avoids the gradient vanishing and gradient exploding
log probability for predicting all classes, which indicates the problems. Thus, the initial weight values are generated by
connection probability of the node pairs, y is a category label randomly sampling.
indicating whether the node pairs generates a connection at
the next moment. Cat denotes the classification distribution. 3) THE LENGTH OF THE INPUT SEQUENCE
When fNN is a recurrent neural network, the above process The input of the BRNN-LP model is a spatiotemporal vector
describes a BRNN. sequence Seq, and the length of sequence is n which indicates
Long short-term memory (LSTM) improves the traditional the link information of the n network snapshots. We use
RNN through the ‘‘gate’’ limit mechanism, which can memo- Seq to predict the connection status of the node pairs in
rize useful information and solve the problem of gradient van- the (n + 1) − th network snapshot. If the length of the input
ishing. The goal of Bayesian LSTM is to obtain the posterior sequence is too long, it will make the BRNN-LP model have
distribution of weights, which reduces the training parameters too many features at one input, leading to increasing the diffi-
and shortens the training time. The BRNN model is shown culty of model training. If the length of the input sequence is
in Figure 5 [42]. too short, the BRNN-LP model cannot learn feature of link
(t )
We use the current input, vector Vi i , of BRNN and the evolution. In this paper, an appropriate length of the input
(ti −1) sequence is determined contrast experiments.
potential spatiotemporal feature hi−1 at the last moment
(t )
to generate the potential spatiotemporal features hi i at the
(t ) D. OPTIMIZATION ALGORITHM FOR TRAINING
current time, and hi i is taken as the input of BRNN for the
next moment. Bayesian LSTM uses input gates to control This paper chooses Bayes by backpropagation (BBB) to learn
the inflow of spatiotemporal features at the last moment and the weights of the BRNN. BBB is a variational inference
updates the implicit spatiotemporal features of the previous scheme for learning the posterior distribution of the weights
moment using the forget gate. The implicit spatiotemporal θ of neural networks. The posterior distribution generally
features are obtained from the output gate. The input of follows the Gaussian distribution with a mean of µ and a
Bayesian LSTM is the vector sequence, and the output layer standard deviation of σ , and it is denoted as N(θ |µ, σ 2 ).
is a classifier using a Logistic Regression, the result of which p(y|θ, Seq) is the log-likelihood of the model. Then we train
is whether the node pairs’ is connected or not. the network by minimizing the variational free energy which
is shown in (8).
C. HYPER-PARAMETER SETTING 
q(θ )

We adjust the hyper-parameters of the model when training 8(θ) = Eq(θ) (8)
p(y|θ, Seq)p(θ )
the deep learning model. Due to there is no good theoret-
ical support, it is often difficult to determine the optimal where p(θ ) is the prior probability of the weight.

185790 VOLUME 7, 2019


Y. Ma, J. Shu: ONs Link Prediction Method Based on BRNN

TABLE 1. Information of dataset. 2) EVALUATION INDICES OF ACCURACY


Accuracy assesses whether the prediction results are accurate
using the ratio of the number of correctly predicted linked and
unlinked to the total number of predictions. The accuracy is
defined as equation (11).
TP + TN
Accuracy = (11)
P+N
where TP is the number of correctly predicted links, TN is
The variational free energy minimization is equivalent to the number of correctly predicted disconnections, P is the
the log likelihood p(y|θ, Seq) maximization. Equation (8) can total number of predicted links and N is the total number of
be converted to equation (9). predicted disconnections.

8(θ ) = −Eq(θ ) p(y|θ, Seq) + KL [q(θ )||p(θ )]


 
(9) 3) EVALUATION INDICES OF PRECISION
The Precision only focuses on the L links with the top
In the case of a zero θ , the KL term can be seen as a form ranks or the highest scores [13]. The precision is defined as
of weight decay on the mean parameters, where the rate of equation (12).
weight decay is automatically tuned by the standard deviation
m
parameters of the prior and posterior [12]. precision = (12)
L
VI. EXPERIMENTS AND ANALYSIS Here, m is the number of correctly predicted links. When
We select the MIT Reality dataset for the experiments. L does not change, we can use a prediction method with
Different comparison experiments are designed to verify the comparatively better precision. Then, the larger m is, the more
effectiveness of the proposed link prediction method. The probable that a correct link can be found.
evaluation indices are AUC, Accuracy and Precision.
C. EXPERIMENTAL RESULTS AND ANALYSIS
A. EXPERIMENTAL DATASET This experiment is divided into two steps. The first step is
The parameters of the MIT Reality dataset are shown to compare the results of BRNN with different parameters
in Table 1. using the validation set and test set, and to choose the optimal
The MIT Reality dataset records the communications of parameters for BRNN. The second step is to compare BRNN
Bluetooth cell phones that are carried by 97 MIT experi- with other link prediction methods, and to verify the validity
menters over 246 days. This dataset includes cell phone com- of the proposed model by results. The proportion of the
munication records and Bluetooth connection records. This training set to the testing set for all experiments was 4:1.
dataset also has recorded the location information and activity
information of the experimenters. The sampling period is 1) PERFORMANCE OF THE BRNN-LP MODEL UNDER
300s. DIFFERENT PARAMETERS
The experiments in this section verify the influences of dif-
B. EVALUATION INDICES ferent parameters on our method. By experiments of differ-
In this paper, the area under the receiver operating character- ent the number of training iterations, time slice length, and
istic curve (AUC), accuracy and precision are used to evaluate input sequence lengths for BRNN-LP, the optimal number
the effect of the BRNN-LP model. of iterations, time slice length and sequence length can be
determined.
1) EVALUATION INDICES OF AUC
a: THE EFFECT OF THE TRAINING ITERATIONS
The AUC evaluates the algorithm’s performance according to
The number of iterations will affect the generation of the
the whole list. Then, at each time we randomly pick a missing
model. Too many or too few iterations will lead to over-
link and a non-existent link to compare their scores. The AUC
fitting or under-fitting. Hence, we use different numbers of
takes a test edge from the test set and a non-existent edge
iterations of 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 in
from the non-existing edge set [13]. T he AUC is defined as
experiments to choose optimum number of iterations. The
equation (10).
loss value and accuracy curve of the objective function are
n0 + 0.5n" shown in Figure 6 and Figure 7.
AUC = (10) Figure 6 and figure 7 show that the connection of MIT
n
data set is relatively sparse, the accuracy at the beginning
where n is the number of times of independent comparisons, is relatively high. When the number of iterations is less
there are n0 times the missing link having a higher score and than 10, the loss value of the BRNN-LP model is reduced
n" times they have the same score. dramatically, and the accuracy is increased slowly. When the

VOLUME 7, 2019 185791


Y. Ma, J. Shu: ONs Link Prediction Method Based on BRNN

FIGURE 6. The loss value curve of the training process. FIGURE 8. The accuracy of different time slice lengths.

FIGURE 7. The accuracy curve of the training process. FIGURE 9. The precision of different time slice lengths.

TABLE 2. Prediction results for different time slice lengths.


number of iterations increased from 10 to 40, the loss value
of the BRNN-LP model is gradually reduced, and the accu-
racy is increased slowly. This phenomenon indicates that
the BRNN-LP model is under-fitting. When the number of
iterations increased from 40 to 90, the loss value does not
change much, and the change of accuracy is not obvious.
When the number of iterations is more than 90, the loss
value of the BRNN-LP model begins to fluctuate up and
down slightly and its accuracy of the model is basically the
same as the number of iterations at 40, which reflect that the
BRNN-LP model is over-fitting. The analysis of the above
experimental results show that the BRNN model has strong
feature extraction ability and can obtain better results when
the number of iterations is approximately 40. Considering the and average precision of the BRNN-LP model are the best,
performance and training time of model, the optimal number comparing to other slice lengths. When the time slice length
of iterations of the BRNN-LP model is 40. is too short, the information in the slice will be less, and the
model cannot further extract the features of the link infor-
b: THE EFFECT OF THE TIME SLICE LENGTH mation. When the slice length is too long, the information in
The length of the time slice has a great impact on the the slice will be too much, and the model feature extraction
BRNN-LP model. We use different time slice lengths includ- ability will be affected. Therefore, 360s is employed as the
ing 180s, 240s, 300s, 360s, 480s and 600s [38], to obtain the time slice length in this paper.
most suitable time slice length for this model on the MIT
dataset. Then, we randomly sample 30 times to assess the c: THE EFFECT OF THE INPUT SEQUENCE LENGTH
accuracy and precision of the prediction model for each time From the selected time slice lengths, it can be seen that the
slice length. The results are shown in Figure 8 and Figure 9, amount of link information will affect the performance of
and the average values are shown in Table 2. the BRNN-LP model. If the length of the input sequence is
It can be seen from TABLE 2 that when the slice length is too long, the amount of information and training complexity
360s with 30 times random sampling, the average accuracy will increase, which will lead to model over-fitting. If the

185792 VOLUME 7, 2019


Y. Ma, J. Shu: ONs Link Prediction Method Based on BRNN

TABLE 3. Prediction results for different sequence lengths.

FIGURE 10. The accuracy of different sequence lengths.

FIGURE 12. The comparison experiment with link prediction methods


based on similarity indices.
FIGURE 11. The precision of different sequence lengths.

a: COMPARISON WITH PREDICTION METHODS BASED ON


length of the input sequence is too short, the amount of SIMILARITY INDEX
information will be insufficient, which will lead to under- On the condition that the slice length is 360s, the number of
fitting. This paper sets different input sequence lengths of 1, BRNN-LP iterations is 40, and the input sequence length is
2, 3, 4, 5 and 6. We randomly extract 30 samples to calculate 2, we compare with similarity indices based link prediction
the accuracy and precision of the prediction of the model methods. The samples are chosen from the test sample set,
under different sequence lengths. The results are shown in the and the sample sizes are 50, 100, 150, 200, and 250. Five
Figures 10 and 11. The average values are shown in Table 3. total rounds of tests are conducted. The AUC is used to
The experimental results show that the length of the input evaluate the performance of the link prediction methods. The
sequence has a certain impact on the model predictive per- experimental results are shown in Figure 12.
formance. When the length of the input sequence is 2, its The experimental results show that the prediction effect
predictive performance is best, and the average prediction of the similarity indices is affected by the number of test
accuracy and precision are 99.45% and 99.79%, respectively, samples. Since the information that is obtained from the path-
which represents the BRNN-LP model can better capture the based similarity indices is more comprehensive, the path-
hidden features of links. Thus, the optimal length of the input based similarity indices (Katz and LP) perform better than
sequence of the BRNN-LP model is 2. the structure-based similarity indices (AA, RR and CN).
Furthermore, the prediction performance of the above sim-
2) PERFORMANCE COMPARISON OF DIFFERENT ilarity indices is not stable. The performance of the proposed
PREDICTION METHODS BRRN-LP model is comparatively stable and better than that
The experiments compare the performance of BRNN-LP with of the above similarity indices.
ones of other prediction models. Firstly, we compare the link
prediction methods of the BRNN-LP model with the similar- b: COMPARISON WITH PREDICTION METHODS BASED ON
ity indices (CN, AA, RA, PA, Katz and LP). Then, this model MACHINE LEARNING
is compared with the link prediction model (LDA) [18], On the condition that the slice length is 360s, the number
and the link prediction model based on the support vector of BRNN-LP iterations is 40, and the input sequence length
classifier (SVC). Finally, this model is compared with the link is 2, we compare with the machine learning based link predic-
prediction model (RNN) [23]. tion methods. All the experiments were conducted 30 times.

VOLUME 7, 2019 185793


Y. Ma, J. Shu: ONs Link Prediction Method Based on BRNN

considering the correlation information and the spatial loca-


tion information of node pairs. Taking BRNN advantages
of extracting the spatiotemporal features in the sparse net-
work, a better prediction performance is achieved. In the MIT
dataset, the experimental results show that the BRNN-LP
model achieves better predictive precision and accuracy com-
pared with existing methods of link prediction including the
CN, AA, RA, LP, Katz, SVC, LDA, and RNN.
The method proposed in this paper is to predict link for
single node pair. We will further study the link prediction
method of multi-node pairs in the future.
FIGURE 13. The comparison experiment with link prediction methods
based on machine learning (AUC).
REFERENCES
[1] P. Jadhav and R. Satao, ‘‘A survey on opportunistic routing protocols for
wireless sensor networks,’’ Procedia Comput. Sci., vol. 79, pp. 603–609,
2016.
[2] C. Y. Aung, I. W.-H. Ho, and P. H. J. Chong, ‘‘Store-carry-cooperative
forward routing with information epidemics control for data delivery in
opportunistic networks,’’ IEEE Access, vol. 5, pp. 6608–6625, 2017.
[3] E. K. Wang, Y. Li, and Y. Ye, ‘‘A dynamic trust framework for opportunistic
mobile social networks,’’ IEEE Trans. Netw. Service Manag., vol. 15, no. 1,
pp. 319–329, Nov. 2017.
[4] H. T. Cheng, F. T. Sun, and S. Buthpitiya, ‘‘Sensorchestra: Collabo-
rative sensing for symbolic location recognition,’’ in Proc. Int. Conf.
Mobile Comput., Appl., Services. Berlin, Germany: Springer, Oct. 2010,
pp. 195–210.
[5] E. Toledano, D. Sawada, and A. Lippman, ‘‘CoCam: A collaborative
content sharing framework based on opportunistic P2P networking,’’ in
FIGURE 14. The comparison experiment with link prediction methods Proc. IEEE 10th Consum. Commun. Netw. Conf. (CCNC), Las vegas,
based on machine learning (Precision). Nevada, USA, Jan. 2013, pp. 158–163.
[6] B. Han, P. Hui, V. S. A. Kumar, M. V. Marathe, J. Shao, and
The test samples are taken from the test set, and the number A. Srinivasan, ‘‘Mobile data offloading through opportunistic communi-
cations and social participation,’’ IEEE Trans. Mobile Comput., vol. 11,
of samples is 200. AUC and precision are employed to com- no. 5, pp. 821–834, May 2012.
prehensively evaluate the performance of the link prediction [7] E. Koukoumidis, L. S. Peh, and M. R. Martonosi, ‘‘Signal guru: Leveraging
methods. The experimental results are shown in Figure 13 and mobile phones for collaborative traffic signal schedule advisory,’’ in Proc.
9th Int. Conf. Mobile Syst., Appl., Services, Bethesda, MA, USA, Jun. 2011,
Figure 14, respectively. pp. 127–140.
The experimental results show that the SVC-based method [8] H. D. Ma, P. Y. Yuan, and D. Zhao, ‘‘Research progress on routing problem
is not ideal because it cannot extract deep features of the links in mobile opportunistic networks,’’ J. Softw., vol. 26, no. 3, pp. 600–634,
Mar. 2015.
in ON. The LDA-based approach portray link information, [9] L. Lue, L. Pan, T. Zhou, Y.-C. Zhang, and H. E. Stanley, ‘‘Toward link
and its performance is better than those of the SVC-based predictability of complex networks,’’ Proc. Nat. Acad. Sci. USA, vol. 112,
methods. However, it can be seen that the SVC and LDA are pp. 2325–2330, Feb. 2015.
[10] C. Dai, L. Chen, and B. Li, ‘‘Link prediction based on sampling in complex
not as good as the RNN and BRNN prediction methods which
networks,’’ Appl. Intell., vol. 47, no. 1, pp. 1–12, Jul. 2017.
consider the spatiotemporal features of ON. The AUC and [11] H. Wang and D.-Y. Yeung, ‘‘Towards Bayesian deep learning: A frame-
precision for the RNN and BRNN-LP methods are all above work and some existing methods,’’ IEEE Trans. Knowl. Data Eng., vol. 28,
0.98. It can be seen From Figure 14 that the precision of the no. 12, pp. 3395–3408, Dec. 2016.
[12] M. Fortunato, C. Blundell, O. Vinyals, ‘‘Bayesian recurrent neu-
BRNN prediction method is 0.9972, which is better than the ral networks,’’ Apr. 2017, arXiv:1704.02798. [Online]. Available:
RNN. This results show that the proposed BRNN-LP model https://arxiv.org/abs/1704.02798
can better characterize the features of ON and then make more [13] L. Lü and T. Zhou, ‘‘Link prediction in complex networks: A survey,’’
Phys. A, Stat. Mech. Appl., vol. 390, no. 6, pp. 1150–1170, 2011.
effective link predictions for ON. [14] B. Pandey, P. K. Bhanodia, and A. Khamparia, ‘‘A comprehensive survey
The above experiments show that the precision, AUC, of edge prediction in social networks: Techniques, parameters and chal-
and accuracy of the proposed BRNN-LP model are better lenges,’’ Expert Syst. Appl., vol. 124, pp. 164–181, Jun. 2019.
[15] D. Liben Nowell and J. Kleinberg, ‘‘The link prediction problem for social
compared with link prediction methods based on similarity networks,’’ J. Amer. Soc. Inf. Sci. Technol., vol. 58, no. 7, pp. 1013–1019,
indices. In addition, compared with the machine learning- Feb. 2007.
based link prediction models (SVC, LDA) and deep learning [16] M. E. J. Newman, ‘‘Clustering and preferential attachment in growing
networks,’’ Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip.
method RNN, the prediction performance of the BRNN-LP Top., vol. 64, no. 2, 2001, Art. no. 025102.
model is netter as well. [17] L. A. Adamic and E. Adar, ‘‘Friends and neighbors on the Web,’’ Soc.
Netw., vol. 25, no. 3, pp. 211–230, 2003.
VII. CONCLUSION [18] K.-K. Shang, ‘‘Link prediction for tree-like networks,’’ Chaos, An Inter-
discipl. J. Nonlinear Sci., vol. 29, no. 6, Apr. 2019, Art. no. 061103.
This paper proposes a BRNN-LP method for ON. We con- [19] L. Katz, ‘‘A new status index derived from sociometric analysis,’’ Psy-
struct a vector to represent the link information after chometrika, vol. 18, no. 1, pp. 39–43, 1953.

185794 VOLUME 7, 2019


Y. Ma, J. Shu: ONs Link Prediction Method Based on BRNN

[20] T. Zhou, L. Lü, and Y.-C. Zhang, ‘‘Predicting missing links via local [37] J. Shu, X. Zhang, L. Zhiyong, and L. Yang, ‘‘Multi-nodes link predic-
information,’’ Eur. Phys. J. B, vol. 71, no. 4, pp. 623–630, Oct. 2009. tion method based on deep convolution neural networks,’’ Acta Electron.
[21] G. Jeh and J. Widom, ‘‘SimRank: A measure of structural-context similar- Sinica, vol. 46, no. 12, pp. 2970–2977, Dec. 2018.
ity,’’ in Proc. 8th Acm Sigkdd Int. Conf. Knowl. Discovery Data Mining, [38] X. Cai, J. Shu, and M. AlkAli, ‘‘Link prediction approach for opportunis-
New York, NY, USA, Jul. 2002, pp. 271–279. tic networks based on recurrent neural network,’’ IEEE Access, vol. 7,
[22] C. Ma, Z. K. Bao, and H. F. Zhang, ‘‘Improving link prediction in pp. 2017–2025, 2018.
complex networks by adaptively exploiting multiple structural features of [39] E. Bulut and B. K. Szymanski, ‘‘Exploiting friendship relations for effi-
networks,’’ Phys. Lett. A, vol. 381, no. 39, pp. 3369–3376, Oct. 2017. cient routing in mobile social networks,’’ IEEE Trans. Parallel Distrib.
[23] C. Wang, V. Satuluri, and S. Parthasarathy, ‘‘Local probabilistic models for Syst., vol. 23, no. 12, pp. 2254–2265, Dec. 2012.
link prediction,’’ in Proc. 7th IEEE Int. Conf. Data Mining (ICD), Omaha, [40] B. Burns, O. Brock, and B. Levine, ‘‘MV routing and capacity building
NE, USA, Oct. 2007, pp. 322–331. in disruption tolerant networks,’’ in Proc. IEEE INFOCOM, Miami, FL,
[24] S. Das and S. K. Das, ‘‘A probabilistic link prediction model in time- USA, Mar. 2005, pp. 398–408.
varying social networks,’’ in Proc. IEEE Int. Conf. Commun. (ICC), Paris, [41] Y. LeCun, Y. Bengio, and G. Hinton, ‘‘Deep learning,’’ Nature, vol. 521,
France, May 2017, pp. 1–6. no. 7553, p. 436, 2015.
[25] S. H. Shalforoushan and M. Jalali, ‘‘Link prediction in social networks [42] J. Shi, J. Chen, J. Zhu, S. Sun, Y. Luo, Y. Gu, and Y. Zhou, ‘‘Zhusuan:
using Bayesian networks,’’ in Proc. Int. Symp. Artif. Intell. Signal Process. A library for Bayesian deep learning,’’ Sep. 2017, arXiv:1709.05870.
(AISP), Mashhad, Iran, Mar. 2015, pp. 246–250. [Online]. Available: https://arxiv.org/abs/1709.05870
[26] D. Huang, S. Zhang, and P. Hui, ‘‘Link pattern prediction in opportunistic
networks with kernel regression,’’ in Proc. 7th Int. Conf. Commun. Syst.
Netw. (COMSNETS), Chengdu, China, Jun. 2015, pp. 1–8.
[27] F. Aghabozorgi and M. R. Khayyambashi, ‘‘A new similarity measure for
link prediction based on local structures in social networks,’’ Phys. A, Stat.
Mech. Appl., vol. 501, pp. 12–23, Jul. 2018.
[28] Y. Li, K. Niu, and B. Tian, ‘‘Link prediction in sina microblog using YULIANG MA was born in Jiujiang, Jiangxi,
comprehensive features and improved SVM algorithm,’’ in Proc. IEEE 3rd in 1995. He is currently pursuing the master’s
Int. Conf. Cloud Comput. Intell. Syst. (CCIS), Shenzhen, China, Nov. 2014, degree with Nanchang Hangkong University.
pp. 18–22. His main research interest includes opportunity
[29] N. Yang, T. Peng, and L. Liu, ‘‘Link prediction method based on clustering networks.
and decision tree,’’ J. Comput. Res. Develop., vol. 54, no. 8, pp. 1795–1803,
Aug. 2017.
[30] Y. Li, X. Zhao, and H. Tang, ‘‘Dynamics nature and link prediction methods
in opportunistic networks,’’ in Proc. IEEE 17th Int. Conf. Comput. Sci. Eng.
(CSE), Chengdu, China, Dec. 2014, pp. 1362–1368.
[31] I. Güneş, S. Gündüz-Ögüdücü, and Z. Çataltepe, ‘‘Link prediction using
time series of neighborhood-based node similarity scores,’’ Data Mining
Knowl. Discovery, vol. 30, no. 1, pp. 147–180, Jan. 2016.
[32] S.-H. Wang, H.-T. Yu, R.-Y. Huang, and Q.-Q. Ma, ‘‘A temporal link
prediction method based on motif evolution,’’ Acta Automatica Sinica,
vol. 45, no. 5, pp. 735–745, May 2016.
[33] A. Ozcan and S. G. Oguducu, ‘‘Link prediction in evolving heterogeneous
networks using the NARX neural networks,’’ Knowl. Inf. Syst., vol. 55,
JIAN SHU was born in 1964. He received
no. 2, pp. 333–360, 2018.
[34] A. Ozcan and S. G. Oguducu, ‘‘Multivariate time series link prediction the M.Sc. degree in computer networks from
for evolving heterogeneous network,’’ Int. J. Inf. Technol. Decis. Making, Northwestern Polytechnical University. He is cur-
vol. 18, no. 1, pp. 241–286, Nov./Jan. 2019. rently a Professor with Nanchang Hangkong
[35] K.-K. Shang, M. Small, X.-K. Xu, and W.-S. Yan, ‘‘The role of direct University. His main research interests include
links for link prediction in evolving networks,’’ Europhys. Lett., vol. 117, wireless sensor networks, embedded systems, and
Jan. 2017, Art. no. 28002. software engineering. He is a Senior Member of
[36] K. K. Shang, W. S. Yan, and M. Small, ‘‘Evolving networks using past CCF.
structure to predict the future,’’ Phys. A, Stat. Mech. Appl., vol. 455,
pp. 120–135, Aug. 2016.

VOLUME 7, 2019 185795

You might also like