IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS 1

Rumor Identification in Microblogging Systems Based on Users’ Behavior
Gang Liang, Wenbo He, Chun Xu, Liangyin Chen, and Jinquan Zeng

Manuscript received June 22, 2015; revised December 21, 2015; accepted January 03, 2016. This work was supported in part by the National Natural Science Foundation of China under Grant 61373091, Grant 91338107, and Grant 11102124; by the Ph.D. Program Foundation of Ministry of Education of China under Grant 20130181110095; by the Provincial Key Science, Technology Research and Development Program of Sichuan, China under Grant 2013SZ0002 and Grant 2014SZ0109; by the Sichuan Provincial Department of Science and Technology Project (No. 2014JY0036); and by the Scientific Research Fund of Sichuan Provincial Education Department (No. 13TD0014). (Corresponding author: Chun Xu.)
G. Liang was with Sichuan University, Chengdu 610064, China. He is now with the School of Computer Science, McGill University, Montreal, QC H3A 0E9, Canada (e-mail: lianggang@scu.edu.cn).
W. He is with the School of Computer Science, McGill University, Montreal, QC H3A 0E9, Canada (e-mail: wenbo@mcgill.ca).
C. Xu and L. Chen are with the College of Computer Science, Sichuan University, Chengdu 610064, China (e-mail: xuchun@scu.edu.cn; chenliangyin@scu.edu.cn; huxiaoqin@scu.edu.cn).
J. Zeng is with the School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China.
Digital Object Identifier 10.1109/TCSS.2016.2517458
2329-924X © 2016 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Abstract—In recent years, microblog systems such as Twitter and Sina Weibo have averaged multiple millions of active users. On the other hand, the microblog system has become a new platform for spreading rumors. In this paper, we investigate machine-learning-based rumor identification approaches. We observed that feature design and selection have a stronger impact on rumor identification accuracy than the selection of machine-learning algorithms. Meanwhile, rumor publishers’ behavior may diverge from normal users’, and a rumor post may receive different responses from a normal post. However, mass behavior on rumor posts has not been explored adequately. Hence, we investigate rumor identification schemes by applying five new features based on users’ behaviors, and combine the new features with existing, proven user behavior-based features, such as followers’ comments and reposting, to predict whether a microblog post is a rumor. Experiment results on real-world data from Sina Weibo demonstrate the efficacy and efficiency of our proposed method and features. From the experiments, we conclude that rumor detection based on mass behaviors is more effective than detection based on microblogs’ inherent features.

Index Terms—Microblog, rumor identification, users’ behavior.

I. INTRODUCTION

NOWADAYS, microblog systems such as Twitter and Sina Weibo are becoming more and more popular since they facilitate fast dissemination and acquisition of information. Individuals can freely share information with microblogs, especially in emergency situations, such as earthquakes, floods, and hurricanes [1]. Microblog systems have grown quickly, and people tend to use them as an important source to share and obtain information in everyday life [2]. On the other hand, the abuse of microblogs to spread rumors (or unreliable information) has also been widely reported [1]–[4]. Rumors often refer to information whose truth and source are unreliable; they are likely to be generated in emergency situations, causing public panic, disrupting the social order, decreasing government credibility, and even endangering national security. For example, in March 2011, a few days after a powerful earthquake rocked Japan and triggered a tsunami and, later, a nuclear crisis, a rumor that iodized salt was capable of protecting people from nuclear radiation spread widely across China via Sina Weibo and other microblog platforms. This made people flock to stores, supermarkets, and dispensaries to buy salt. As a result, the price of iodized salt increased 5 to 10 times during that period. To limit the wide spreading of rumors, it is essential for microblog systems to detect rumors as soon as possible.

A popular approach to discriminating rumor posts from normal ones is to treat rumor identification as a binary classification problem in machine learning. This is based on the hypothesis that rumor posts are statistically similar [4], [5]. If the data fail to exhibit this statistical similarity, the learning will fail. Research has demonstrated that no single learning method is overwhelmingly superior in all scenarios, and different learning algorithms may yield similar results [6]. It is also reported that identifying the most salient features of data for machine learning usually has an enormous impact [7] in classification problems. Therefore, choosing a representative set of features is a crucial step for rumor identification. We observe that existing research efforts on rumor identification [8]–[17] have focused on using microblogs’ inherent features, such as the content-based, multimedia-based, propagation-based, and topic-based features listed in Table I. In contrast, features based on users’ behaviors are less explored, though it has been shown that there exists a close correlation between users’ behaviors and the information credibility of microblogs [1], [16]. In [1], Mendoza et al. found that rumor microblogs tend to be questioned by more users than normal microblogs. In [16], Shirai et al. reported that 14.7% of people or organizations would publish rumor correction microblogs as soon as they found the rumor microblogs. Besides, mass behaviors on microblog posts, such as number of posts and number of followers, can be exploited to help determine whether a microblog post is a rumor.

In this paper, we investigate the problem of rumor identification in microblog systems. By observing that the rumor publishers’ behaviors may diverge from normal users’ and a rumor post may have different responses from a normal post, we
propose a user behavior-based rumor identification scheme, in which users’ behaviors are treated as hidden clues to indicate who are likely to be rumormongers or which posts are possibly rumor microblogs. Our approach to rumor identification consists of three phases. 1) Based on the collected microblogs and users’ profiles, we gather the features of users’ behaviors from each microblog post. In total, nine features of users’ behaviors are adopted in this paper, and five of them have not been studied before. 2) We apply five of the most popular machine-learning algorithms to train classifiers for rumor identification. 3) We use the trained classifiers from the second phase to predict whether a microblog post is a rumor.

Experiments are conducted on real-world data from Sina Weibo, which is a gigantic Chinese microblog platform with over 500 million users, to demonstrate the efficacy and efficiency of our proposed method and features. The experiment results show that the precision, recall, and F-score of our approach reach 0.8645, 0.8535, and 0.8590, respectively, and these three metrics achieve improvements of 13.14%, 18.13%, and 16.68% on average compared with the baseline approach.

This paper is organized as follows. In Section II, we review related work on rumor identification. In Section III, we define the rumor identification problem. In Section IV, we propose our approach for rumor identification. In Section V, we present experimental results. Finally, we conclude this paper in Section VI.

TABLE I
COMMONLY USED FEATURES FOR RUMOR IDENTIFICATION

II. RELATED WORK

In this section, we briefly summarize the results of existing rumor identification research.

To identify the rumors spreading in microblog systems, several attempts have been made by microblog service providers. Sina Weibo maintains an official account, @Weibopiyao [20], operated by senior journalists 24 × 7. It regularly publishes microblogs on new rumors, so that Weibo users who follow this account can be alerted. In addition, Sina Weibo adopts a crowdsourcing technique [32] to provide a service named Weibo Misinformation Declaration [21]. Any Weibo user can report suspicious rumors through this service. A team of journalists will judge whether the reported microblogs are really rumors, and publish the results on Weibo. Both methods curb the spreading of rumors on Weibo to some extent. In these methods, however, the credibility of the information is assessed entirely manually by Weibo journalists, which is costly and labor intensive. Moreover, there is a large delay in rumor detection in these methods. From December 2010, when @Weibopiyao posted its first microblog on a rumor, to January 2015, a total of 480 rumor microblogs had been published. It is obvious that @Weibopiyao can only catch a small portion of the rumors on Weibo. As for Weibo Misinformation Declaration, the average delay between the time a suspicious rumor is reported and the time it is decided is more than 24 h. It is crucial to design and develop a system which can automatically identify the information credibility of microblog systems.

Automatic rumor identification in microblog systems is a relatively new field. So far, only a few works have addressed this problem, and most of them primarily focus on using microblogs’ inherent features. In [9], Castillo et al. extracted 68 features from Twitter posts and categorized them into four types: 1) content-based features, which consider characteristics of the tweet content, such as the length of a message and the number of positive/negative sentiment words in a message; 2) user-based features, which consider traits of Twitter users, such as registration age, number of followers, and number of followees; 3) topic-based features, which are aggregates computed from message-based features and user-based features, such as the fraction of tweets that contain URLs, the fraction
of tweets with hashtags, and the fraction of positive and negative sentiment in a set; and 4) propagation-based features, which consider features related to the propagation tree of a post, such as the depth of the retweet tree, or the number of initial tweets of a topic. After the research of Castillo et al., research efforts have focused on exploiting new features for rumor detection. Qazvinian et al. [10] extracted attributes related to the contents of tweets, features about the network, and specific memes of Twitter to build different Bayes classifiers to detect rumors spreading on Twitter. Yang et al. [11] proposed two new features, 1) a client-based feature and 2) a location-based feature, and trained a support vector machine classifier to identify misinformation and disinformation on Sina Weibo. In [12], Sun et al. first proposed multimedia-based features for event rumor identification. Cai et al. [13] proposed text features from retweets and comments to construct a rumor classifier. Wang et al. [14] proposed graph-based features and applied them in spam bot detection. Zhang et al. [33] mined the deep information of microblog contents and extracted implicit features, such as popularity, sentiment or viewpoint of message contents, and user historical information, to detect rumors in microblogs. In [34], Wu et al. studied message propagation patterns of Sina Weibo and used them as high-order features to construct a graph-kernel-based SVM classifier for rumor identification. Table I lists the features commonly used for rumor identification in existing research.

As shown in Table I, mass user behavior has not been explored adequately in existing rumor identification research. Unlike previous studies, in this paper we treat the features of users’ behaviors as fairly important clues to indicate who are likely to be rumormongers or which posts are possibly rumor microblogs. We propose several new user behavior-based features to predict whether a microblog post is a rumor.

III. BACKGROUND

In this section, we introduce the background and give a general model of rumor identification. We define users’ behaviors of microblogs as a set of vectors, in which every vector $m^{(i)} = \langle b_1^{(i)}, b_2^{(i)}, \ldots, b_n^{(i)}, c^{(i)} \rangle$ contains the user behavior features of microblog $i$, where $n$ is the number of features of users’ behavior, $b_j^{(i)}$ represents the $j$th feature of users’ behaviors of microblog $i$, and $c^{(i)}$ is the type (rumor or normal) of microblog $i$. Given a set of users’ behaviors of microblogs with known types, the problem addressed in this paper is to find a method to predict the type of microblogs whose types are unknown based on their users’ behavior vectors.

A microblog system is a network made up of users and their relationships. Therefore, we can represent a microblog system by a directed graph $(N, R)$, which consists of a set of users $N = \{1, 2, 3, \ldots, n\}$ and an $n \times n$ matrix $R = [r_{ij}]_{i,j \in N}$, where $r_{ij} \in \{0, 1\}$ represents whether user $i$ follows user $j$. If $r_{ij} = 1$, user $i$ is a follower of user $j$ and user $j$ is a followee of user $i$.

There are three ways to share or deliver information among users in microblog systems: 1) a user publishes a microblog post; 2) a user publishes someone else’s post (reposting); 3) a user reposts someone’s post, adding his or her comments (commenting). Reposting and commenting help microblog users quickly and widely share posts with all of their followers.

Fig. 1 illustrates the relations among four users, where a hollow circle represents a user, a solid circle denotes a post, a solid line with an arrow shows the following relation between two users, and a dashed line with an arrow indicates the direction of post transmission. From Fig. 1, we can see that user A is a follower of user B and both user C and user D are followers of user A, while user B is a followee of user A and user A is a followee of both user C and user D. In Fig. 1, user B publishes a post b2. User A, who is user B’s follower, receives and reads post b2, and then publishes post b2 as post a1 through reposting or commenting. Although user C and user D cannot read post b2 from user B directly, they are able to read this post from user A through post a1.

Fig. 1. Illustration of the relation among microblog users.

IV. FEATURES FOR RUMOR DETECTION

As shown in [7], feature design and selection play a key role in rumor detection. The detection performance is heavily dependent on which features are adopted. By analyzing the characteristics of microblog users’ behaviors, we extract nine user behavior features from microblog posts. The users’ behaviors considered in this paper include the behaviors of the author and the readers of a microblog. There are big differences in usage patterns between normal authors and rumormongers. For example, very few rumormongers will use authenticated accounts to publish rumor microblogs, in order to escape the possible corresponding responsibilities, while many normal users will use authenticated accounts to improve their reputation. It is obvious that readers will respond differently when reading rumor microblogs and normal microblogs. For example, rumor microblogs tend to be questioned more than normal microblogs. In this section, we describe in detail the features based on user behaviors that are used to represent a microblog.

A. Behavior Features Based on Microblog’s Authors

Behavior features based on microblog’s authors refer to the features extracted from behaviors of authors who publish
microblog posts, including verified user or not, number of followers, average number of followees per day, average number of posts per day, and number of possible microblog sources. Among these features, verified user or not and number of followers have been studied in previous works [8]–[17], while we propose three new features: 1) average number of followees per day; 2) average number of posts per day; and 3) number of possible microblog sources.

1) Verified User or Not: This feature indicates whether a user is verified by microblog service providers. In order to improve reputation and social influence, some microblog users apply to microblog service providers for user identity authentication. If a user’s identity is verified by the service provider, a verified tag will be shown after the user name, and other users can judge whether a user is an authenticated user by this tag. Strictly speaking, the feature of verified user or not generally belongs to features based on the user’s profile. However, this feature can also be described as a behavior of microblog authors, for few rumormongers will post their rumor microblogs using an authenticated account, in order to escape the possible corresponding responsibilities. Therefore, we choose the feature of verified user or not to describe a choice behavior of the microblog’s author.

2) Number of Followers: A follower is a person who follows or subscribes to an account. The microblogs published by this account will appear on followers’ home timelines, which display a stream of posts from the accounts they have chosen to follow. The more followers a user has, the more people will receive his or her posts. Therefore, to spread their rumors widely and rapidly, rumormongers usually publish rumor microblogs after their number of followers reaches a high value.

3) Average Number of Followees Per Day: A followee is someone whose microblogs are subscribed to and followed by other people. Unlike other social networks, such as Facebook and Wechat, a microblog user can follow any other microblog user without permission. Generally speaking, the more people a person follows, the more followers he or she will get. In order to attract more followers rapidly, rumormongers follow many people in a very short time. Therefore, the value of this feature is usually higher for rumormongers than for normal users. The average number of followees per day is calculated as the number of followees divided by the number of days since the user registered; both quantities can be extracted directly from the user’s profile.

4) Average Number of Posts Per Day: Average number of posts per day refers to how many microblogs a user posts per day on average. Unlike normal users, who share information with their friends, rumormongers use microblogs just to spread their fictitious information. In order to escape the possible responsibilities, rumormongers will rarely or never log in to the same account again once they have posted rumor microblogs. Thus, the average number of posts per day of rumormongers is probably far less than that of normal users.

5) Number of Possible Microblog Sources: In this paper, the number of possible microblog sources refers to the number of persons who post a specific microblog or similar microblogs instead of forwarding it. Rumor microblogs are usually initiated by one person or a small number of people, while authentic microblogs can be witnessed and originated by a large number of unrelated individuals. Therefore, there will be only one information source of the rumor in the network if a rumor microblog is initiated by one person, and the possible information sources are no more than the size of the group if a rumor is initiated by a small colluding group of people. Conversely, if the content of a microblog is authentic, there are probably many sources of the information [18]. The value of this feature can be obtained from a microblog post by applying the following four steps.

Step 1) For a given microblog post, represent it as a set of keywords by using the Tf-Idf method [19].
Step 2) Construct the search keywords from the keywords generated in step 1) and collect the same or similar original microblogs using the search function provided by the microblog service. (The contents of forwarded microblogs usually contain keywords such as “Re”, so we can get rid of forwarded microblogs by checking whether such forwarding keywords exist in their contents.)
Step 3) Compute the similarity between the given microblog and every collected similar microblog based on the Jaccard coefficient, as shown in (1), and remove from the collected set the irrelevant microblogs whose similarity values are below the threshold. In this paper, if the similarity value is bigger than 0.75, we consider that two microblogs are similar

$$\mathrm{sim}(t_i, t_j) = \frac{|t_i.\mathrm{keywords} \cap t_j.\mathrm{keywords}|}{|t_i.\mathrm{keywords} \cup t_j.\mathrm{keywords}|} \tag{1}$$

where $|\cdot|$ is the number of elements of a set, and $t_i.\mathrm{keywords}$ is the set of keywords extracted from the text of microblog $i$.
Step 4) Count the number of elements in the remaining similar microblog set and assign it as the value of the number of possible microblog sources of the given microblog.

B. Behavior Features Based on Microblog’s Readers

In this paper, features based on microblog’s readers mainly refer to the features extracted from the behaviors of users after they read microblogs, including number of reposts, number of comments, ratio of questioned comments, and number of corrections. Number of reposts and number of comments have been studied in previous investigations [8]–[17], whereas ratio of questioned comments and number of corrections are proposed in this paper.

1) Numbers of Reposts and Comments: Almost all microblog services allow their users to repost and comment on the posts they have read. Both reposts and comments can be seen as a response behavior which reflects a kind of judgment on microblogs. Number of reposts indicates how many people repost a microblog, and number of comments describes how many people express their opinions and attitudes toward a microblog post. These two features are usually used to evaluate the popularity of a post. The larger the values of these features are, the
more popular the post is in microblog systems. As for rumor posts, although their truth and sources are unreliable, their contents are usually hot topics on microblogs, for they describe events of interest to others, not only to the friends of the author of each message [1]. Therefore, the values of these two features are usually far higher than those of normal microblogs.

2) Ratio of Questioned Comments: Rumor microblogs are prone to be challenged in their dissemination process, for their truth and sources are unreliable. Mendoza et al. found that information which turned out to be false was much more questioned than information which ended up being true [1]. Currently, almost all microblog platforms provide a comment service through which users can express their views on any microblog post. According to Mendoza et al., we can conclude that a post with many questioned comments has a high likelihood of being a rumor. Therefore, we use the ratio of questioned comments to describe the questioning behavior of readers. The value of the ratio of questioned comments is defined as

$$r(m_i) = \frac{|\text{questioned comments of } m_i|}{|\text{comments of } m_i|} \tag{2}$$

where $|\text{comments of } m_i|$ is the number of comments of microblog $m_i$, and $|\text{questioned comments of } m_i|$ is the number of comments which questioned microblog $m_i$.

In order to calculate the value of $r(m_i)$, we first need to judge whether a comment is a questioned comment. In this paper, we use the Bayesian method [25] for this task. The details of the judgment can be summarized as follows.

Step 1) Collect a set of comments and manually label each of them as questioned or not.
Step 2) Extract keywords from the collected comments and calculate the conditional probability of every keyword $w_i$ for each class. The calculation is as follows:

$$\Pr(w_i \mid c) = \frac{\sum_{j=1}^{n_c} u(w_i, m_j)}{n_c} \tag{3}$$

where $c$ represents the type (questioned comment or not) of a comment, $u(w_i, m_j)$ is a function whose value is 1 if the questioned comment $m_j$ contains keyword $w_i$ and zero otherwise, and $n_c$ is the number of comments of class $c$.
Step 3) For a given unmarked comment $m_i$, calculate its likelihood for each class based on the data calculated in step 2) and choose the class which maximizes this likelihood as the target class. The calculation of the probability is defined as

$$C_{map} = \arg\max_{c \in C} \left[ \prod_{i=1}^{n_c} \Pr(w_i \mid c) \right] \Pr(c) \tag{4}$$

where $C = \{\text{questioned comment}, \text{normal comment}\}$, $\Pr(w_i \mid c)$ is the conditional probability calculated in step 2), and $\Pr(c)$ is the prior probability of class $c$, i.e., the fraction of the comments with a target classification of $c$ to all the collected comments. In (4), the value of $\Pr(w_i \mid c)$ may be problematic, since it would be 0 for comments with unknown keywords. To eliminate zeroes, we use Laplace smoothing [26] to calculate the conditional probabilities for unknown keywords. The calculation is defined by

$$\Pr(w_i \mid c) = \frac{1}{n_c + |v| + 1} \tag{5}$$

where $|v|$ is the number of keywords extracted from the collected comment set in step 1) and $n_c$ is the number of comments of class $c$.

3) Number of Corrections: There are a number of microblog posts which try to correct misinformation and disinformation; these posts are called corrections. Shirai et al. [16] reported that 14.7% of people or organizations would publish a rumor correction as soon as they found the rumor microblogs. Obviously, rumor microblogs are involved in more corrections than common microblogs. The detailed steps of this feature extraction method can be summarized as follows.

Step 1) Construct the correction keyword dictionary, which contains keywords such as “rumor” and “refute.”
Step 2) For a given microblog, get its corresponding keyword vector by using the Tf-Idf method.
Step 3) Combine the keywords generated in step 2) with keywords from the correction keyword dictionary to construct search keywords, and issue a search request to the search service provided by the microblog system. The number of returned results is the value of this feature.

V. EXPERIMENT

In this section, we describe the collection of the experiment dataset, the evaluation of the proposed user behavior-based features, and the experiment results.

A. Dataset

We collect the microblog data from Sina Weibo, China’s leading microblog service provider, to test the performance of the method proposed in this paper. Sina Weibo provides two Web services, 1) @Weibopiyao [20] and 2) Weibo Misinformation Declaration [21], to publish rumor microblogs. In order to construct the training set accurately and efficiently, we use this published rumor information to label the rumor microblogs, rather than labeling them manually.

The rumor announcement instance pages in @Weibopiyao and Weibo Misinformation Declaration are shown in Figs. 2 and 3, respectively. To construct the dataset correctly, we need to map a rumor announcement to its original rumor microblogs, and then obtain the user behaviors from the original microblogs. However, the URL of the original microblogs is not given directly on these announcement pages. Hence, we need to find a way to obtain the URL of rumor microblog posts from the rumor announcement pages.

We noticed that the rumor announcement pages usually follow a fixed format, so we are able to obtain the necessary information to acquire the URL of the original rumor
Fig. 2. Microblog instance of @Weibopiyao.

using a crawler program. The collected microblogs belong to two categories: 1) labeled rumor microblogs obtained from @Weibopiyao and Weibo Misinformation Declaration and 2) unlabeled microblogs gathered from rumormongers’ followers and followees. For the unlabeled microblogs, we ask two annotators to label them. Meanwhile, we use Cohen’s kappa coefficient [22] to measure the consistency between the two annotators. Cohen’s kappa coefficient is defined by

$$\kappa = \frac{p_{observed} - p_{chanced}}{1 - p_{chanced}} \tag{6}$$

where $p_{observed} = \frac{|A \cap B| + |C \cap D|}{|E|}$ and $p_{chanced} = \frac{|A| \times |B|}{|E|^2} + \frac{|C| \times |D|}{|E|^2}$. $A$ is the microblog set labeled by the first annotator. $B$ is the microblog set labeled by the second annotator. $C$ is the set of microblogs for which the first annotator cannot decide whether they are rumors or not. $D$ is the set of microblogs for which the second annotator cannot decide whether they are rumors or not. $E$ is the set of all the collected microblogs, and $|\cdot|$ represents the size of a set. In our case, Cohen’s kappa coefficient is $\kappa = 0.9635$, which demonstrates that the two annotators reach high agreement in data annotation. Finally, the dataset constructed in this paper contains 9199 microblogs, which include 1608 rumor microblogs and 7591 normal microblogs.
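As a minimal sketch of the agreement statistic in (6), assuming each annotator's decisions are available as Python sets of microblog IDs, the computation could look as follows. The function name `set_based_kappa` and the toy ID sets are illustrative assumptions and are not taken from the original experiment; only the formula follows the definitions of A, B, C, D, and E given for (6).

```python
def set_based_kappa(a, b, c, d, e):
    """Agreement statistic of (6), computed from sets of microblog IDs.

    a, b: microblogs labeled by the first / second annotator
    c, d: microblogs the first / second annotator could not decide on
    e:    all collected microblogs
    """
    total = len(e)
    if total == 0:
        raise ValueError("the set of collected microblogs is empty")
    # Observed agreement: both labeled, or both undecided.
    p_observed = (len(a & b) + len(c & d)) / total
    # Chance agreement, from each annotator's marginal counts.
    p_chanced = (len(a) * len(b) + len(c) * len(d)) / total ** 2
    if p_chanced == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chanced) / (1.0 - p_chanced)


# Hypothetical toy data: ten microblogs with IDs 0..9.
all_posts = set(range(10))
first_labeled = {0, 1, 2, 3, 4, 5, 6, 7}
second_labeled = {0, 1, 2, 3, 4, 5, 6, 8}
first_undecided = all_posts - first_labeled      # {8, 9}
second_undecided = all_posts - second_labeled    # {7, 9}

kappa = set_based_kappa(first_labeled, second_labeled,
                        first_undecided, second_undecided, all_posts)
print(f"kappa = {kappa:.4f}")
```

Sets are used here because (6) is stated in terms of set intersections and set sizes, so the code maps onto the formula term by term.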
Fig. 3. Microblog instance of Weibo-misinformation-declaration.

microblogs through text structure analysis on announcement pages.

Fig. 2 illustrates a rumor announcement page of @Weibopiyao. As shown in Fig. 2, the value of the user name is shown immediately after the keyword “ ” (means “user”), and the rumor’s content is summarized between beginning-words “ ” (means “post microblog and say”) and ending-words “ ” (means “after inspection”). After extracting the value of the user name and the keywords from rumor announcement pages, we can construct a request URL: http://s.weibo.com/wb/” ”+” ”+” ”+” ” &xsort=time &userscope=custom:”Mickel” &Refer=g and send it to the Sina Weibo server, which issues a search request to Sina Weibo, and the original rumor page will be returned. In the request URL generated above, “ ” (a place of China), “ ” (means to demolish compulsively), “ ” (means children), and “ ” (means death) are keywords extracted from the rumor contents and “Mickel” is the user name of the rumormonger.

B. Evaluation Metrics

In order to assess the performance of the approach proposed in this paper, we use conventional precision, recall, and F-score [31] as evaluation metrics. The precision $Pr$ is the fraction of correctly predicted rumor microblogs among all the microblogs identified as rumors. Recall $Re$ is the proportion of correctly predicted rumor microblogs among all the rumor microblogs. F-score can be considered as the harmonic mean of recall and precision. Precision, recall, and F-score are defined in (7), (8), and (9), respectively

$$Pr = \frac{|\text{correctly predicted rumor microblogs}|}{|\text{rumor microblogs identified}|} \tag{7}$$

$$Re = \frac{|\text{correctly predicted rumor microblogs}|}{|\text{rumor microblogs}|} \tag{8}$$

$$F = \frac{2\,Pr \times Re}{Pr + Re} \times 100\% \tag{9}$$

where $|\ast|$ is the number of elements in set $\ast$.
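The metrics in (7)–(9) can be sketched as follows, assuming the predicted and the true rumor microblogs are represented as Python sets of post IDs. The names `harmonic_f` and `rumor_metrics` and the toy IDs are illustrative assumptions, and the F-score is returned as a fraction rather than multiplied by 100%.

```python
def harmonic_f(precision, recall):
    """F-score of (9) as a fraction in [0, 1] (not multiplied by 100%)."""
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def rumor_metrics(predicted_rumors, true_rumors):
    """Return (precision, recall, F-score) following (7)-(9).

    predicted_rumors: IDs of microblogs the classifier identified as rumors
    true_rumors:      IDs of microblogs that really are rumors
    """
    correct = len(predicted_rumors & true_rumors)
    # Guard against empty sets so the ratios stay defined.
    precision = correct / len(predicted_rumors) if predicted_rumors else 0.0
    recall = correct / len(true_rumors) if true_rumors else 0.0
    return precision, recall, harmonic_f(precision, recall)


# Hypothetical toy data: four posts flagged as rumors, three of them correctly.
predicted = {1, 2, 3, 4}
actual = {2, 3, 4, 5, 6}
precision, recall, f_score = rumor_metrics(predicted, actual)
print(f"precision={precision:.4f} recall={recall:.4f} F={f_score:.4f}")

# Consistency check against the values reported in the introduction.
print(f"F from reported precision and recall: {harmonic_f(0.8645, 0.8535):.4f}")
```

With the precision (0.8645) and recall (0.8535) reported in the introduction, `harmonic_f` evaluates to about 0.8590, which agrees with the reported F-score.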
Fig. 3 shows a microblog instance of the Weibo Misinformation Declaration Web service. Different from “@Weibopiyao”, it provides a hyperlink named “ ” (means “original text”), which links to the original rumor microblog. Therefore, we can extract the URL of the original rumor microblog from the HTML text of a page published in Weibo Misinformation Declaration.

In our experiment, we collect the microblogs published by @Weibopiyao and Weibo Misinformation Declaration from December 18, 2010 to December 24, 2014. We extract the profiles of the rumormongers, collect the microblogs posted by their followers and followees, and build their profiles by

C. Experiment Results

In this section, we show experiment results in two aspects. 1) We analyze the value distribution of users’ behavior features to illustrate the discriminative capacity of every feature. 2) We conduct a comparative experiment with a baseline approach to test the performance of our approach.

1) Discriminative Capacity of the Features: In order to test the discriminative capacity of features based on users’ behavior, we analyze the distribution of feature values in Fig. 4. It can be observed from Fig. 4 that there exists a significant difference between rumor and normal microblogs when we employ
LIANG et al.: RUMOR IDENTIFICATION IN MICROBLOGGING SYSTEMS BASED ON USERS’ BEHAVIOR 7

Fig. 4. Value distribution of users’ behavior features between rumor microblogs and normal ones.

the features, such as number of followers, average number of followees per day, average number of posts per day, number of retweets, number of comments, and ratio of questioned comments. For features such as user type, number of information sources, and number of corrections, Fig. 4 shows no significant difference in the median. However, this does not mean that these three features are ineffective for rumor identification, since normal microblogs and rumors differ significantly in the upper 50% of the values of those features. To achieve good performance with these three features, we usually apply other features first to filter out the noise values, and then apply these three features. For example, many microblog posts describe the user’s mood or are used to communicate with friends. The number of information sources for these microblogs is similar to that of rumor microblogs in the lower half of the feature values. If we use the number of reposts and the number of comments to filter out these types of microblogs, the number of information sources shows promising discriminative capacity in rumor identification.

2) Rumor Identification Evaluation: Since users behave differently when they publish or read normal and rumor microblogs, we represent a microblog by the corresponding behavior features of its author and readers, and identify whether the microblog is a rumor based on these features. To test the efficiency and general applicability of the proposed user behavior features, we train five classifiers: 1) logistic regression [24]; 2) SVM with RBF kernel function [25]; 3) Naïve Bayes [26]; 4) decision tree [27]; and 5) K-nearest neighbors [28], using tenfold cross-validation and the open-source machine learning library Scikit-learn [29]. We compare the previously proposed features [9]–[18] with the user behavior features, and employ the feature-selection method given in [29] and [30] to choose the best eleven features from those listed in Table II. They are 1) number of sentiment words; 2) number of URLs; 3) user type; 4) number of comments; 5) registration age; 6) number of followers; 7) number of posts; 8) number of reposts; 9) number of followees; 10) user name type; and 11) is reposted?. These eleven features together serve as the baseline for comparison with the proposed user behavior features.

TABLE II
SELECTED BEST 11 FEATURES BASED ON DATA FROM SINA WEIBO

3) Rumor Identification Evaluation: Fig. 5 illustrates the experiment result of the logistic regression algorithm. The precision, recall, and F-score of the rumor classifier using the users’ behavior features are 0.8333, 0.6, and 0.6977, respectively, and those using the selected best eleven features are 0.7143, 0.6, and 0.6521. The precision thus improves by 11.9 percentage points, while the recall is unchanged.

Fig. 5. Comparison between our approach and baseline approach based on logistic regression algorithm.

Fig. 6. Comparison between our approach and baseline approach based on SVM algorithm (C = 1, gamma = 0).

Fig. 7. Comparison between our approach and baseline approach based on Naive Bayes algorithm.

Fig. 6 shows the result of the SVM algorithm. The precision, recall, and F-score of the rumor classifier built on users’ behavior features are 0.8333, 0.7, and 0.7687, respectively, and those based on the selected best eleven features are 0.8182, 0.6, and 0.6923. The precision, recall, and F-score increase by 1.5, 10, and 7.64 percentage points, respectively, which means that our approach and feature set not only improve the prediction accuracy but also detect more rumors than the baseline approach.
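The precision, recall, and F-score compared in these experiments follow the set-based definitions in (7)–(9). The following is a minimal Python sketch of those definitions, assuming that each microblog is represented by a hypothetical string identifier and that the classifier output and the ground truth are available as two sets. The function name and the sample identifiers are illustrative and do not come from the paper.

```python
def precision_recall_f(predicted_rumors, actual_rumors):
    """Return (precision, recall, F-score) as fractions in [0, 1].

    predicted_rumors: IDs the classifier labeled as rumors (hypothetical input).
    actual_rumors:    IDs that are rumors according to the ground truth.
    """
    predicted = set(predicted_rumors)
    actual = set(actual_rumors)
    correct = predicted & actual  # correctly predicted rumor microblogs

    # Eq. (7): correct predictions over all microblogs identified as rumors.
    precision = len(correct) / len(predicted) if predicted else 0.0
    # Eq. (8): correct predictions over all true rumor microblogs.
    recall = len(correct) / len(actual) if actual else 0.0
    # Eq. (9) without the 100% factor: harmonic mean of precision and recall.
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Toy data (made up for illustration): 4 posts flagged, 5 true rumors, 3 in common.
predicted = {"m1", "m2", "m3", "m4"}
actual = {"m1", "m2", "m3", "m5", "m6"}
pr, rc, f = precision_recall_f(predicted, actual)
print(f"Pr={pr:.4f} Re={rc:.4f} F={f:.4f}")
```

The guards against empty sets are a design choice of this sketch; the definitions in (7)–(9) do not say how to handle a classifier that flags no microblog at all.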
The experiment result of the Naïve Bayes algorithm is shown in Fig. 7. The precision, recall, and F-score of the rumor classifier using the selected best eleven features are 0.4, 0.4, and 0.4, while those using users’ behavior features improve to 0.7143, 0.8532, and 0.7776. The precision, recall, and F-score increase by 0.3143, 0.4532, and 0.3776, respectively.

Fig. 8. Comparison between our approach and baseline approach based on decision tree algorithm.

Fig. 8 describes the experiment result of the decision tree algorithm. The rumor classifier trained with the decision tree performs best among the five rumor classifiers constructed in this paper: its precision, recall, and F-score reach 0.8645, 0.8535, and 0.8590, respectively, and those using the selected best eleven features are 0.6667, 0.6, and 0.6316, respectively. The precision, recall, and F-score increase by 0.1978, 0.2535, and 0.2274, respectively.
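As a worked check of the arithmetic, the F-score of each classifier can be recomputed from its reported precision and recall as their harmonic mean, and the improvements can be recomputed as differences between the two feature sets. The sketch below is only an illustration: the dictionary layout and function names are assumptions, and the Naïve Bayes baseline is read as 0.4 for all three metrics, which is the value consistent with the stated increases.

```python
# Reported (precision, recall, F-score) triples for the first four classifiers.
REPORTED = {
    "logistic regression": {"behavior": (0.8333, 0.6, 0.6977),
                            "baseline": (0.7143, 0.6, 0.6521)},
    "SVM (RBF kernel)":    {"behavior": (0.8333, 0.7, 0.7687),
                            "baseline": (0.8182, 0.6, 0.6923)},
    # Baseline read as 0.4 for each metric (assumption, see lead-in).
    "Naive Bayes":         {"behavior": (0.7143, 0.8532, 0.7776),
                            "baseline": (0.4, 0.4, 0.4)},
    "decision tree":       {"behavior": (0.8645, 0.8535, 0.8590),
                            "baseline": (0.6667, 0.6, 0.6316)},
}


def f_score(precision, recall):
    """Harmonic mean of precision and recall, as a fraction in [0, 1]."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def improvement(behavior, baseline):
    """Per-metric difference (behavior minus baseline), rounded to 4 decimals."""
    return tuple(round(b - a, 4) for b, a in zip(behavior, baseline))


for name, scores in REPORTED.items():
    p, r, reported_f = scores["behavior"]
    print(f"{name}: reported F={reported_f:.4f}, recomputed F={f_score(p, r):.4f}, "
          f"improvement={improvement(scores['behavior'], scores['baseline'])}")
```

Recomputed values need not match the printed ones exactly. For example, a score averaged over the ten folds generally differs from the harmonic mean of the averaged precision and recall, and the printed figures are rounded.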

Fig. 9. Comparison between our approach and baseline approach based on K-nearest neighbors algorithm (k = 30).

Fig. 9 shows the experiment result of the K-nearest neighbors algorithm. The precision, recall, and F-score of the rumor classifier using users’ behavior features reach 0.9, 0.4, and 0.5538, and those using the best eleven features are 0.8889, 0.3, and 0.4485, respectively. Although the precision of both KNN-based classifiers is high, this does not mean that they perform best among the ten rumor classifiers trained with the five algorithms, because the recall of both classifiers is low; that is, many rumor posts cannot be identified by these two classifiers.

From Figs. 5–9, we can observe that the rumor classifier using users’ behavior features performs better than the baseline approach. Compared with the baseline approach, the precision, recall, and F-score of our approach increase by 13.14%, 18.13%, and 16.68% on average, which demonstrates the effectiveness of our method and the proposed features in rumor identification.

VI. CONCLUSION

Microblog systems have become a new platform for information sharing, but they can also easily be used to spread rumors. It is of great importance to develop an automatic tool to assess the credibility of information spreading on microblogs.

In this paper, we investigate the rumor identification problem in microblog systems. We propose a user behavior-based rumor identification scheme, in which users’ behaviors are treated as hidden clues to indicate who is likely to be a rumormonger or which posts are likely to be rumor microblogs. The experiment results on real-world data from Sina Weibo demonstrate the efficacy of the method and features proposed in this paper. The precision, recall, and F-score of our approach increase by 13.14%, 18.13%, and 16.68% on average compared with the baseline result. The proposed new features will enrich the rumor identification feature database and benefit the design of automatic rumor identification systems.

REFERENCES

[1] M. Mendoza, B. Poblete, and C. Castillo, “Twitter under crisis: Can we trust what we RT?,” in Proc. 1st Workshop Social Media Anal. (SOMA’10), 2010, pp. 71–79.
[2] A. Friggeri, L. A. Adamic, D. Eckles, and J. Cheng, “Rumor cascades,” in Proc. 8th Int. AAAI Conf. Weblogs Social Media, 2014, pp. 101–110.
[3] J. Kostka, Y. A. Oswald, and R. Wattenhofer, “Word of mouth: Rumor dissemination in social networks,” in Structural Information and Communication Complexity, New York, NY, USA: Springer, 2008, pp. 185–196.
[4] F. Chierichetti, S. Lattanzi, and A. Panconesi, “Rumor spreading in social networks,” in Automata, Languages and Programming, New York, NY, USA: Springer, 2009, pp. 375–386.
[5] L. Hang, “Overview of statistical learning methods,” in The Study Method of Statics. Beijing, China: Tsinghua Express, 2012, pp. 7–24.
[6] T. Shirai et al., “Estimation of false rumor diffusion model and estimation of prevention model of false rumor diffusion on twitter (in Japanese),” in 26th Annu. Conf. Jpn. Soc. Artif. Intell., 2012, vol. 26, pp. 1–4.
[7] M. A. Hall, “Correlation-based feature selection for machine learning,” Ph.D. dissertation, Dept. Comput. Sci., The University of Waikato, Hamilton, New Zealand, 1999.
[8] P. Langley and H. A. Simon, “Applications of machine learning and rule induction,” Commun. ACM, vol. 38, pp. 54–64, 1995.
[9] C. Castillo, M. Mendoza, and B. Poblete, “Information credibility on twitter,” in Proc. 20th Int. Conf. World Wide Web, 2011, pp. 675–684.
[10] V. Qazvinian, E. Rosengren, D. R. Radev, and Q. Mei, “Rumor has it: Identifying misinformation in microblogs,” in Proc. Conf. Empirical Methods Nat. Lang. Process., 2011, pp. 1589–1599.
[11] F. Yang, Y. Liu, X. Yu, and M. Yang, “Automatic detection of rumor on Sina Weibo,” in Proc. ACM SIGKDD Workshop Min. Data Semant., 2012, p. 13.
[12] S. Sun, H. Liu, J. He, and X. Du, “Detecting event rumors on Sina Weibo automatically,” in Web Technologies and Applications, New York, NY, USA: Springer, 2013, pp. 120–131.
[13] G. Cai, H. Wu, and R. Lv, “Rumors detection in Chinese via crowd responses,” in Proc. IEEE/ACM Int. Conf. Adv. Social Netw. Anal. Min. (ASONAM’14), 2014, pp. 912–917.
[14] G. Wang, S. Xie, B. Liu, and P. S. Yu, “Review graph based online store review spammer detection,” in Proc. IEEE 11th Int. Conf. Data Min. (ICDM), 2011, pp. 1242–1247.
[15] C. Liang, Z. Liu, and M. Sun, “Expert finding for microblog misinformation identification,” in COLING (Posters), 2012, pp. 703–712.
[16] T. Takahashi and N. Igata, “Rumor detection on twitter,” in Proc. Joint 6th Int. Conf. Soft Comput. Intell. Syst. (SCIS); 13th Int. Symp. Adv. Intell. Syst. (ISIS), 2012, pp. 452–457.
[17] D. Trpevski, W. K. Tang, and L. Kocarev, “Model for rumor spreading over networks,” Phys. Rev. E, vol. 81, p. 056102, 2010.
[18] E. Seo, P. Mohapatra, and T. Abdelzaher, “Identifying rumors and their sources in social networks,” in SPIE Defense, Security, and Sensing, 2012, pp. 83891I–83891I-13.
[19] J. Ramos, “Using TF-IDF to determine word relevance in document queries,” in Proc. 1st Instruct. Conf. Mach. Learn., 2003.
[20] Sina Weibo. (2015). @Weibopiyao [Online]. Available: http://www.weibo.com/weibopiyao?from=myfollow_all
[21] Sina Weibo. (2015). Weibo-Misinformation-Declaration [Online]. Available: http://service.account.weibo.com/?type=5&status=0
[22] A. J. Viera and J. M. Garrett, “Understanding interobserver agreement: The kappa statistic,” Fam. Med., vol. 37, pp. 360–363, 2005.
[23] S. Lemeshow and D. W. Hosmer, “A review of goodness of fit statistics for use in the development of logistic regression models,” Amer. J. Epidemiol., vol. 115, pp. 92–106, 1982.
[24] J. A. Suykens and J. Vandewalle, “Least squares support vector machine classifiers,” Neural Process. Lett., vol. 9, pp. 293–300, 1999.
[25] B. Cestnik, “Estimating probabilities: A crucial task in machine learning,” in Proc. Eur. Conf. Artif. Intell., 1990, pp. 147–149.
[26] D. A. Field, “Laplacian smoothing and Delaunay triangulations,” Commun. Appl. Numer. Methods, vol. 4, pp. 709–712, 1988.
[27] R. Safavian and D. Landgrebe, “A survey of decision tree classifier methodology,” IEEE Trans. Syst. Man Cybern., vol. 21, no. 3, pp. 660–674, May/Jun. 1991.
[28] K. Fukunaga and P. M. Narendra, “A branch and bound algorithm for computing k-nearest neighbors,” IEEE Trans. Comput., vol. 100, no. 7, pp. 750–753, 1975.
[29] Scikit-learn. (2015). Scikit-Learn Packages [Online]. Available: http://scikit-learn.org/dev/install.html

[30] I. Guyon and A. Elisseeff, “An introduction to variable and feature selection,” J. Mach. Learn. Res., vol. 3, pp. 1157–1182, 2003.
[31] C. Goutte and E. Gaussier, “A probabilistic interpretation of precision, recall and F-score, with implication for evaluation,” in Advances in Information Retrieval, New York, NY, USA: Springer, 2005, pp. 345–359.
[32] D. C. Brabham, “Crowdsourcing as a model for problem solving: An introduction and cases,” Convergence: Int. J. Res. New Media Technol., vol. 14, pp. 75–90, 2008.
[33] Q. Zhang, S. Zhang, J. Dong, J. Xiong, and X. Cheng, “Automatic detection of rumor on social network,” in Natural Language Processing and Chinese Computing, New York, NY, USA: Springer, 2015, pp. 113–122.
[34] K. Wu, S. Yang, and K. Q. Zhu, “False rumors detection on Sina Weibo by propagation structures,” in Proc. IEEE Int. Conf. Data Eng. (ICDE), 2015, pp. 651–662.

Gang Liang received the Ph.D. degree in computer science from Sichuan University, Chengdu, China, in 2007.
He was an Assistant Professor with the College of Computer Science, Sichuan University, Chengdu, China. He is currently a Visiting Scholar with McGill University, Montreal, QC, Canada. His research interests include network security, social networks, and machine learning.

Wenbo He received the Ph.D. degree in computer science from the University of Illinois at Urbana-Champaign, Urbana, IL, USA, in 2008.
She was an Assistant Professor with the Department of Computer Science, University of New Mexico, Albuquerque, NM, USA, from 2008 to 2010. From 2010 to 2011, she was an Assistant Professor with the Department of Electrical Engineering, University of Nebraska-Lincoln, Lincoln, NE, USA. She is currently an Assistant Professor with the School of Computer Science, McGill University, Montreal, QC, Canada. Her research interests include big data systems, mobile and pervasive computing, security and privacy, and cloud computing.

Chun Xu received the Ph.D. degree in computer science from Sichuan University, Chengdu, China, in 2008.
He is currently an Associate Professor with Sichuan University. His research interests include network security, machine learning, and autonomous and applied data mining.

Liangyin Chen received the Ph.D. degree in computer science from Sichuan University, Chengdu, China, in 2008.
He is currently a Professor with Sichuan University. His research interests include machine learning, autonomous agents, and applied data mining.

Jinquan Zeng received the Ph.D. degree in computer science from Sichuan University, Chengdu, China, in 2008.
He is currently an Associate Professor with the School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China. His research interests include network security and computer network.
