https://doi.org/10.1007/s12652-017-0630-1
ORIGINAL RESEARCH
Received: 28 July 2017 / Accepted: 14 November 2017 / Published online: 27 November 2017
© Springer-Verlag GmbH Germany, part of Springer Nature 2017
Abstract
Network embedding is an important pre-processing step for analysing large-scale information networks. Several network embedding algorithms have been proposed for unsigned social networks. However, these methods cannot simply be migrated to signed social networks, which contain both positive and negative relationships. In this paper, we present a signed social network embedding model based on the word embedding model. To deal with the two kinds of links, we define two relationships, the neighbour relationship and the common neighbour relationship, and design a biased random walk procedure. To further improve the interpretability of the representation vectors, the follow-the-proximally-regularized-leader online learning algorithm is introduced into the traditional word embedding framework to acquire sparse representations. Extensive experiments were carried out to compare our algorithm with three state-of-the-art methods on community detection and sign prediction tasks. The experimental results demonstrate that our algorithm performs better than the comparison algorithms on most signed social networks.
Keywords Signed social network · Network embedding · Word embedding · Sparse representation · Follow-the-proximally-regularized-leader
consider the sign information of links have been proposed and perform effectively in different tasks. These works represent nodes in signed social networks using dense low-dimensional vectors.

In this paper, we propose a network embedding method which can not only capture the sign information of links but can also represent nodes using sparse low-dimensional vectors. The major contributions of this paper are as follows:

– We define an improved biased random walk procedure to acquire node sequences which incorporate both positive and negative relationships;
– We add a sparsity constraint to network embedding and employ the follow-the-proximally-regularized-leader (FTRL-Proximal) algorithm instead of SGD (stochastic gradient descent) in the original word embedding framework to obtain sparse network embeddings;
– We conduct extensive experiments on signed social networks to demonstrate the effectiveness of our proposed framework.

The rest of the paper is organized as follows: Sect. 2 reviews related work. Section 3 introduces the theory of word embedding from the natural language processing area, and our proposed model is presented in Sect. 4. Section 5 shows the experimental analysis, and conclusions are finally drawn in Sect. 6.

2 Related work

2.1 Sign network analysis

Signed social network research has attracted wide attention (Kunegis et al. 2010; Leskovec et al. 2010a; Chiang et al. 2013; Zheng and Skillicorn 2015). There are two basic theories of signed social networks: the structure balance theory (Heider 1946) and the status theory (Leskovec et al. 2010b). The balance theory is the key assumption that is widely used to analyse signed social networks.

Community structure detection and sign prediction are two fundamental tasks in signed social networks, and the widely adopted method is the Laplacian spectral method. Kunegis et al. (2010) extended spectral algorithms to signed social networks and proposed a spectral method based on the signed Laplacian. They showed that dividing signed social networks into two groups with the signed Laplacian kernel is similar to the ratio cut in unsigned social networks. Chiang et al. (2013) gave a definition of social imbalance (MOIs) based on l-cycles in signed social networks and proposed a community detection and sign prediction method based on the spectral Laplacian. Zheng and Skillicorn (2015) developed two spectral approaches to model and analyse signed graphs based on the random walk normalized Laplacian. However, the time complexity of spectral methods is at least quadratic in the number of vertices, which is very expensive for networks with millions of nodes.

2.2 Network embedding

With the boom in deep learning, a remarkable word embedding framework called word2vec was proposed. It has been shown that the embedding vectors derived from the model preserve the syntactic and semantic relations between words under simple linear operations (Mikolov et al. 2013; Li et al. 2014; Levy et al. 2015).

Word embedding conforms well to the representation requirements of large-scale social networks and provides a new perspective for the study of social networks. DeepWalk (Perozzi et al. 2014), LINE (Tang et al. 2015), node2vec (Grover and Leskovec 2016) and SDNE (Wang et al. 2015) were proposed in succession. The first three algorithms are based on shallow neural networks, which are easy to train. SDNE is based on a deep neural network, which can keep more information. All four works are designed for unsigned social networks. Meanwhile, some works (Liao et al. 2017) learn the network embedding by means of node attribute information for specific tasks.

There are two network embedding methods based on deep learning (Wang et al. 2017; Yuan et al. 2017) which are designed for signed social networks. The first (Yuan et al. 2017) adopts the log-bilinear model to combine the edge sign information and acquire the node representations. The second (Wang et al. 2017) introduces virtual nodes into the signed social network and only considers the nodes whose 2-hop networks consist entirely of positive links, which inserts a large number of additional nodes that must be trained. Both are based on deep networks, which are hard to train. The representation vectors of the aforementioned network embedding algorithms are dense and lack interpretability.

In this paper, we apply sparsity constraints and a shallow neural network to signed social network embedding.
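For concreteness, the signed Laplacian on which the spectral methods of Sect. 2.1 rely replaces the ordinary node degrees with absolute degrees. A minimal sketch (the construction follows Kunegis et al. 2010; the function and variable names are our own):

```python
def signed_laplacian(A):
    """Signed Laplacian L = D - A, where D is the diagonal matrix of
    absolute degrees d_i = sum_j |A[i][j]| (Kunegis et al. 2010)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        d = sum(abs(w) for w in A[i])          # absolute degree of node i
        for j in range(n):
            L[i][j] = (d if i == j else 0.0) - A[i][j]
    return L

# Toy signed network: positive edge (0,1), negative edge (1,2).
A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, -1.0],
     [0.0, -1.0, 0.0]]
L = signed_laplacian(A)
```

Unlike the unsigned Laplacian, the off-diagonal entries corresponding to negative edges become positive here, which is what makes the resulting kernel sign-aware.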
word w ∈ V_w with its surrounding words. The surrounding words which fall in a fixed-size window centred at w are called context words, and each context word is denoted c ∈ V_c.

#(w, c) denotes the number of occurrences of the word-context pair (w, c) in the corpus D. The counts #(w, c) of the different pairs (w, c) constitute the word-context matrix, where #(w, c) is the entry in the cth row and wth column. We further define #(w) = Σ_c #(w, c), #(c) = Σ_w #(w, c), and |D| = Σ_w Σ_c #(w, c).

3.2 SGNS model

4 Proposed model

We aim at representing nodes in signed social networks using sparse low-dimensional vectors, which can intuitively represent some kinds of network structure such as community structure and neighbour relationships. More formally, let G = (V, E), where G is a signed social network, V is the node set of the network, and E is its edge set, whose edges have positive or negative weights. Let f : V → R^d be the mapping function from nodes to vectors, where d is a parameter specifying the number of dimensions of our vector representation. Our goal is to learn the mapping function f which maximally keeps the structure information and sign information of the signed social network. A biased random walk is executed on the signed social network, based on its structure information, to produce node sequences which become the input of the word embedding framework. In the embedding process, the SGNS model and the FTRL-Proximal optimization algorithm are adopted to acquire the objective sparse low-dimensional network embedding vectors. In the following sections we introduce the network structure information in signed social networks, the biased random walk procedure, and the optimization algorithm of our proposed method.

Fig. 1 Undirected signed triads relations (Heider 1946)

…but also an enemy's enemy may be selected as the next node. However, propagation of distrust, unlike trust, is a tricky issue. An enemy's enemy is not necessarily a friend, while a
friend of a friend can be considered trustworthy. So we cannot select the next hop in this crude way, and should instead utilize the common neighbour relation described below.

Common neighbour relationship Common neighbour relationships involve two kinds of sets: the friend set F(w) and the enemy set E(w). Intuitively, if two individuals' friend sets have a larger overlap, they tend to be close friends. Conversely, if one individual's enemy set covers a large part of another individual's friend set, the relationship between them will be weak or even negative. This is consistent with the definition of community in signed social networks: there are more positive links within the same community and more negative links between different communities (Gómez et al. 2009).

Based on common neighbour relationships, we define a local common relation similarity between node s and its neighbour node h as follows:

ls(s, h) = Σ_{r ∈ F(s) ∩ F(h)} w_sr − Σ_{m ∈ E(s) ∩ F(h)} w_sm    (3)

where F(s) is the friend set of node s, E(s) is its enemy set, and w_ij is the weight of the link from i to j. The first part of Eq. 3 is the overlapping degree between the friend sets. The second part is the overlapping degree between the enemy set and the friend set.

4.2 Biased random walk procedure on signed social networks

Here we define a biased random walk procedure, based on the neighbour relations and the common neighbour relationship described in Sect. 4.1, to acquire node sequences.

Formally, given a source node S, every random walker walks on the network with a fixed path length l. Let t_i denote the ith node in the path, walking from source node S with t_0 = S. Node t_i is selected as the next node with the following probability:

P(t_i = x | t_{i−1} = v) = π_vx / Z if (v, x) ∈ E, and 0 otherwise    (4)

where π_vx is the transition probability between nodes v and x, and Z is the normalizing constant. Because links have positive and negative weights, we define a biased random walk procedure with two parameters p and q to guide the random walker. As shown in Fig. 2, a walker currently resides at node s and needs to decide on the next node h, so it evaluates the transition probability P_sh on the edges (s, h) leading from s.

Based on the neighbour relationships in signed social networks, when the walker steps from node s it can select not only its positive neighbours a or b+ but also an enemy of its enemy (e.g. b′). The choice of b′ as the next node depends on the value of ls(s, b′) in Eq. 3. If ls(s, b′) > 0, the walker chooses b′ as the next hop and adds a virtual edge (s, b′) with w_sb′ > 0 to E. Conversely, if ls(s, b′) < 0, b′ is abandoned and the next node from the current node s is reselected.

In signed social networks, a walker can select a neighbour node guided by the extended transition probability p̃_ik as follows:

p̃_ik = |w_ik| / Σ_{j ∈ nbs(i)} |w_ij|    (5)

However, k is an enemy of i when w_ik < 0. In this case, we choose an enemy of k as the next node. We define the unnormalized transition probability π_sh from node s to h as follows:
π_sh =
  p · w_sh / Σ_{k ∈ nbs(s)} |w_sk|,  if w_sh > 0;
  q · (|w_sk| / Σ_{i ∈ nbs(s)} |w_si|) · (|w_kh| / Σ_{j ∈ negative nbs(k)} |w_kj|),  if w_sk < 0 and w_kh < 0 and w_sh = 0 and ls(s, h) > 0;
  0,  otherwise.    (6)
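A minimal sketch of this transition rule in Python (the variable names are our own; `ls` is assumed to implement Eq. 3, and the enemy-of-enemy branch is summed over all intermediaries k, which is one reading of Eq. 6):

```python
def transition_weight(w, s, h, ls, p=1.0, q=1.0):
    """Unnormalized transition weight pi_sh of Eq. (6).
    w[i] maps neighbour j -> signed weight w_ij (absent key = no edge);
    ls(s, h) is the local similarity of Eq. (3); p and q bias the walk."""
    def abs_deg(i):
        return sum(abs(x) for x in w[i].values())
    wsh = w[s].get(h, 0.0)
    if wsh > 0:                                   # branch 1: direct positive link
        return p * wsh / abs_deg(s)
    if wsh == 0 and ls(s, h) > 0:                 # branch 2: enemy of an enemy
        total = 0.0
        for k, wsk in w[s].items():
            wkh = w[k].get(h, 0.0)
            if wsk < 0 and wkh < 0:
                neg_deg_k = sum(abs(x) for x in w[k].values() if x < 0)
                total += q * (abs(wsk) / abs_deg(s)) * (abs(wkh) / neg_deg_k)
        return total
    return 0.0                                    # branch 3: otherwise

# Toy graph: node 0 has friend 1 and enemy 2; node 3 is an enemy of 2.
w = {0: {1: 1.0, 2: -1.0},
     1: {0: 1.0},
     2: {0: -1.0, 3: -1.0},
     3: {2: -1.0}}
ls = lambda s, h: 1.0   # stub similarity; the paper uses Eq. (3)
```

Normalizing these weights over all candidate next hops yields the probability of Eq. (4); note that a direct negative link (e.g. 0 to 2 above) receives weight 0, so the walker never steps onto an enemy.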
where p and q are adjustable parameters which guide the walker in selecting either one of its friends or an enemy of its enemy as the next hop.

4.3 Sparse network embedding

The traditional SGNS model encodes every word into a low-dimensional dense vector. The reason for this is that SGNS employs SGD (stochastic gradient descent) as the optimization strategy: the approximate gradient used at each SGD update is very noisy, and the value of each entry in the vector can easily be moved away from zero by those fluctuations. We therefore need another online optimization algorithm in place of SGD to obtain sparse representations. Fortunately, there have been several studies of online optimization algorithms that target such l1-norm objectives (Xiao 2010; Mcmahan 2011; Wang et al. 2015; Liu et al. 2016). It has been shown that the follow-the-proximally-regularized-leader (FTRL-Proximal) model offers high efficiency and stability compared with the other algorithms. In this paper, we propose to employ the FTRL-Proximal algorithm (Mcmahan 2011) in the word2vec framework to produce the sparse representations.

Given a sequence of gradients g_t ∈ R^d, SGD performs the update:

W_{t+1} = W_t − η_t g_t    (7)

where η_t is a non-increasing learning rate.

The FTRL-Proximal algorithm instead uses the update:

W_{t+1} = argmin_W ( g_{1:t} · W + (1/2) Σ_{s=1}^{t} σ_s ‖W − W_s‖²₂ + λ₁ ‖W‖₁ )    (8)

where σ_s is the learning-rate schedule such that σ_{1:t} = 1/η_t, and λ₁ is the hyperparameter that controls the degree of regularization.

If we let Z_{t−1} = g_{1:t−1} − Σ_{s=1}^{t−1} σ_s W_s, then at the beginning of round t we update Z_t = Z_{t−1} + g_t + (1/η_t − 1/η_{t−1}) W_t and solve for W_{t+1} in closed form on a per-coordinate basis:

w_{t+1,i} = 0 if |z_{t,i}| ≤ λ₁, and −η_t (z_{t,i} − sgn(z_{t,i}) λ₁) otherwise.    (9)
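A per-coordinate sketch of the FTRL-Proximal update of Eqs. (8)-(9), using the common adaptive schedule η_{t,i} = α/√(n_i) where n_i accumulates squared gradients (a sketch under our own naming, following Mcmahan 2011):

```python
import math

class FTRLProximal:
    """Per-coordinate FTRL-Proximal (Eqs. 8-9) with the adaptive
    learning-rate schedule eta_{t,i} = alpha / sqrt(n_i)."""
    def __init__(self, dim, alpha=0.1, lam1=1.0):
        self.alpha, self.lam1 = alpha, lam1
        self.z = [0.0] * dim   # z_t of the closed-form update
        self.n = [0.0] * dim   # accumulated squared gradients
        self.w = [0.0] * dim

    def update(self, grad):
        for i, g in enumerate(grad):
            # sigma_i = 1/eta_t - 1/eta_{t-1}, computed incrementally
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * self.w[i]
            self.n[i] += g * g
            if abs(self.z[i]) <= self.lam1:      # Eq. (9): exact zero inside the l1 ball
                self.w[i] = 0.0
            else:
                eta = self.alpha / math.sqrt(self.n[i])
                self.w[i] = -eta * (self.z[i] - math.copysign(self.lam1, self.z[i]))
        return self.w
```

Coordinates whose accumulated gradient stays within the l1 threshold λ₁ are held exactly at zero, which is what yields the sparse embedding vectors that plain SGD cannot produce.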
munity, and δ(C_i, C_j) = 0 when i and j are in different communities.

Given two community structures A and B of the same network, let C be the confusion matrix whose element C_ij is the number of common nodes between community i in structure A and community j in structure B. Then

NMI(A, B) = −2 Σ_{i=1}^{C_A} Σ_{j=1}^{C_B} C_ij log( C_ij N / (C_i. C_.j) ) / ( Σ_{i=1}^{C_A} C_i. log(C_i. / N) + Σ_{j=1}^{C_B} C_.j log(C_.j / N) )    (11)

where N is the number of nodes, C_A (C_B) is the number of communities in structure A (B), and C_i. (C_.j) is the sum of the elements of C in row i (column j).

5.2.1 Testing on real-world signed social networks

Six real-world signed social networks have been employed to test the community detection performance of the proposed algorithm and the comparison algorithms. The statistics of each network are given in Table 2; only the ground truths of SPP and GGS are known. Because the networks are all small, d is set to 16 in the experiments, and each algorithm has been run independently 30 times.

From Table 3 we can see that, for the SPP and the GGS networks, all the algorithms can find the best community structure. These two signed social networks are small in size, so it is easy for the algorithms to find the optimal solutions. However, the algorithms based on deep learning sometimes perform worse than the three spectral methods. The reason is that our algorithm is based on word embedding technology, in which nodes are represented in a space of at least 16 dimensions, and this is inefficient for very small networks. Figures 3 and 4 show the worst community structures obtained by our proposed algorithm when experimenting on the SPP and the GGS networks.

For the other four networks, we can observe that the maximum values and the average values obtained by our proposed algorithm are higher than those obtained by the rest of the algorithms, which indicates that the discovered community structures are better than those discovered by the comparative algorithms. All the experimental results demonstrate that the proposed optimization model is effective for signed social network community detection.
Table 3 Statistical results over 30 runs on the signed social networks

Index  Algorithm            SPP     GGS     EGFR    Macrophage  Yeast   E. coli
SQmax  L̄rw                 0.4547  0.4530  0.2753  0.3010      0.5838  0.3696
       Lsns                 0.4547  0.4530  0.2731  0.2985      0.5879  0.3673
       Lbns                 0.4547  0.4530  0.2848  0.3028      0.5969  0.3753
       SNE                  0.4547  0.4530  0.2875  0.3240      0.6036  0.4100
       node2vec-SN          0.4547  0.4530  0.2870  0.3236      0.6006  0.4034
       sparse-node2vec-SN   0.4547  0.4530  0.2878  0.3244      0.6038  0.4032
SQavg  L̄rw                 0.4532  0.4520  0.2678  0.2738      0.5321  0.3652
       Lsns                 0.4540  0.4518  0.2635  0.2787      0.5434  0.3301
       Lbns                 0.4543  0.4526  0.2791  0.2921      0.5741  0.3631
       SNE                  0.4539  0.4516  0.2785  0.3164      0.5971  0.3883
       node2vec-SN          0.4533  0.4421  0.2764  0.3043      0.5975  0.3864
       sparse-node2vec-SN   0.4436  0.4428  0.2835  0.3189      0.5979  0.3908
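The NMI measure of Eq. (11) can be computed directly from the confusion matrix; a small sketch (the function name and argument layout are our own):

```python
import math

def nmi(confusion, N):
    """Normalized mutual information of Eq. (11).
    confusion[i][j] = number of nodes shared by community i of
    structure A and community j of structure B; N = total node count."""
    row = [sum(r) for r in confusion]            # C_i.
    col = [sum(c) for c in zip(*confusion)]      # C_.j
    num = 0.0
    for i, r in enumerate(confusion):
        for j, cij in enumerate(r):
            if cij > 0:
                num += cij * math.log(cij * N / (row[i] * col[j]))
    den = sum(ci * math.log(ci / N) for ci in row if ci > 0) \
        + sum(cj * math.log(cj / N) for cj in col if cj > 0)
    return -2.0 * num / den

# Identical partitions of 6 nodes into two communities of 3 give NMI = 1;
# nmi([[3, 0], [0, 3]], 6)
```

NMI equals 1 when the two partitions are identical and 0 when they are statistically independent, which is why it is suitable for the SPP and GGS networks, whose ground-truth communities are known.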
Fig. 5 NMI of SN (1000, 15, 40, 2, 1, 10, 50, 0.7, P− , P+) obtained by different detection algorithms
cross-validation methodology. The sign prediction methods of the four algorithms are as follows:

Sparse-node2vec-SN, node2vec-SN and SNE The sign prediction of these three algorithms is based on the cosine similarity of the vector representations:

Similarity(u, v) = f(u) · f(v) / (|f(u)| · |f(v)|)    (12)

The resulting similarity ranges from −1, meaning an enemy relationship, to 1, meaning a friend relationship, with 0 indicating no relationship, and in-between values indicating intermediate similarity or dissimilarity. Here, we set d = 128, r = 15, l = 80, k = 10.

Spectral methods L̄rw, Lsns and Lbns To evaluate the signed spectral approach, we use three graph kernels based on the signed Laplacian matrices L̄rw, Lsns and Lbns, labelled collectively as L̄. These kernels are computed with the reduced eigenvalue decomposition L̄(d) = U Λ U^T, in which only the top d eigenvectors are kept, so that L̄(k) ≈ U(k) Λ U(k)^T. While the top k eigenvectors contain zero entries for links not present in the original graph, the approximation L̄(k) is nonzero at these entries, and the signs of these entries can be used as predictions for the signs of missing links. The value of k is set to 128.

5.3.2 Experimental results

To measure the prediction accuracy, we use three kinds of accuracy: accuracy over all signs (Accuracy-all), accuracy on positive links (Accuracy+) and accuracy on negative links (Accuracy−). The experimental results are shown in Fig. 6.

From the sign prediction results in Fig. 6, the general observation is that the sparse network embedding for signed social networks performs better than the other algorithms, especially where overall accuracy is concerned. Accuracy for predicting positive links and negative links varies widely, particularly for the three spectral methods. This is because the ratio of negative links is much lower than that of positive links in the three datasets. Meanwhile, the reduction and reconstruction operations in spectral methods may lose information and add noise. Our algorithm is based on random walks in which the neighbour relationships are taken into account, which eases the information loss. SNE performs best when predicting positive links, but performs worse than our algorithm when predicting negative links, especially on Slashdot. The reason is that sparse representation is traditionally viewed as a typical denoising method: the sparsification in our algorithm decreases the difference between predicting positive links and predicting negative links.

6 Discussion and conclusion

Positive and negative links in signed social networks give us more information about the users, which is very important for recommendation and prediction tasks. The scale of these complex networks is usually large. In this paper, we design a signed social network embedding model based on the word embedding model. To obtain the corpus, we define a biased random walk procedure based on the two relationships defined in this paper. Furthermore, we incorporate a sparsification process into our model to improve the quality and efficiency of the representations. Empirically, we evaluated the generated network representations on a variety of signed social networks and applications. The results demonstrate substantial gains of our method compared with the state-of-the-art methods currently available.

In spite of the overall higher performance of our algorithm, it is still more sensitive to negative noisy information than to positive noisy information. In our future work, more effort will be made to reduce this kind of sensitivity in community detection and sign prediction on signed social networks.

Acknowledgements This work is partly funded by the National Nature Science Foundation of China (nos. 61672329, 61373149, 61472233, 61572300, and 81273704), Shandong Provincial Project for Science and Technology Development (no. 2014GGX101026), Shandong Provincial Project of Education Scientific Plan (no. ZK1437B010), Taishan Scholar Program of Shandong Province (nos. TSHW201502038 and 20110819), and Shandong Provincial Project of Exquisite Course (nos. 2012BK294, 2013BK399, and 2013BK402).