
Online Behavioral Analysis

Suspicious Behavior Detection: Current Trends and Future Directions

Meng Jiang and Peng Cui, Tsinghua University
Christos Faloutsos, Carnegie Mellon University

Applications have different definitions of suspicious behaviors. Detection methods often look for the most suspicious parts of the data by optimizing scores, but quantifying the suspiciousness of a behavioral pattern is still an open issue.

As Web applications such as Hotmail, Facebook, Twitter, and Amazon have become important means of satisfying working, social, information-seeking, and shopping tasks, suspicious users (such as spammers, fraudsters, and other types of attackers) are increasingly attempting to engage in dishonest activity, such as scamming money out of Internet users and faking popularity in political campaigns. Fortunately, commercially available suspicious behavior-detection techniques can eliminate a large percentage of spam, fraud, and sybil attacks on popular platforms. Naturally, the owners of these platforms want to ensure that any behavior happening on them involves a real person interested in interacting with a specific Facebook page, following a specific Twitter account, or rating a specific Amazon product.

Here, we describe detection scenarios that use different techniques to ensure security and long-term growth of real-world systems; we also offer an overview of the various methods in use today. As we move into the future, it's important that we continue to identify successful methods of suspicious behavior detection at analytical, methodological, and practical levels, especially those methods that can be adapted to real applications.

Detection Scenarios
We surveyed more than 100 advanced techniques for detecting suspicious behaviors that have existed over the past 10 years, dividing suspicious behaviors into four categories: traditional spam, fake reviews, social spam, and link farming. Figure 1 shows the percentages of research works that focus on these categories. The various works gather different aspects of information, such as content (C), network (N), and behavioral (B) patterns, from behavioral data. Table 1 summarizes several experimentally successful detection techniques.1-23 From the figure and table, we can see tremendous progress in link-farming detection systems and trends in information integration.

Traditional Spam
A variety of spam-detection methods filter false or harmful information for traditional systems such as email or short message service (SMS).
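Content-based spam filtering, which this section surveys for email and SMS, can be illustrated with a toy sketch. Everything below is invented for illustration (the keyword vocabulary, messages, and hyperparameters); it is not the AFSD or SMSF feature set, just a minimal bag-of-words scorer trained with logistic regression in the spirit of the keyword-based filters discussed here.

```python
# Toy content-based spam scorer: bag-of-words keyword counts fed to a
# logistic regression trained by stochastic gradient descent.
# The vocabulary, messages, and hyperparameters are invented for
# illustration; they are not the feature sets of AFSD or SMSF.
import math
import re

VOCAB = ["free", "gift", "card", "winner", "cheap",
         "meeting", "lunch", "report"]

def features(text):
    """Count occurrences of each vocabulary word in the message."""
    words = re.findall(r"[a-z]+", text.lower())
    return [words.count(v) for v in VOCAB]

def train(messages, labels, epochs=500, lr=0.5):
    """Fit one weight per vocabulary word plus a bias term."""
    w, b = [0.0] * len(VOCAB), 0.0
    for _ in range(epochs):
        for text, y in zip(messages, labels):
            x = features(text)
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted spam probability
            g = p - y                        # gradient of the log loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def spam_probability(text, w, b):
    z = sum(wi * xi for wi, xi in zip(w, features(text))) + b
    return 1.0 / (1.0 + math.exp(-z))

messages = ["free gift card winner", "cheap gift offer free",
            "lunch meeting today", "quarterly report attached"]
labels = [1, 1, 0, 0]                        # 1 = spam, 0 = legitimate
w, b = train(messages, labels)
```

On this separable toy data the scorer learns positive weights for the spam keywords and negative weights for the work-related ones; the systems surveyed below add network and behavioral features precisely because keyword lists alone are easy to evade.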

january/february 2016 1541-1672/16/$33.00 © 2016 IEEE 31


Published by the IEEE Computer Society


Figure 1. Percentage of recent research works on four main categories of suspicious behaviors: traditional spam, fake review,
social spam, and link farming. Lately, we’ve seen tremendous progress in link-farming detection systems.

Table 1. Experimentally successful suspicious behavior-detection techniques and their gathered information from data.*

| Information aspects | Traditional spam | Fake reviews | Social spam | Link farming |
| C | AFSD1 | — | Astroturf2 and Decorate3 | — |
| N | MailRank4 | — | SybilLimit5 and Truthy6 | OddBall7 |
| B | SMSF8 | — | — | CopyCatch9 |
| C+N | — | — | SSDM10 and SybilRank11 | Collusionrank12 |
| C+B | — | ASM,13 GSRank,14 OpinSpam,15 and SBM16 | URLSpam17 and Scavenger18 | — |
| N+B | — | FraudEagle19 | — | Com220 and LockInfer21 |
| C+N+B | — | LBP22 | — | CatchSync23 |

* C: content, N: network, and B: behavioral pattern.

Email spam can include malware or malicious links from botnets with no current relationship to the recipient, wasting time, bandwidth, and money. A content-based approach called Adaptive Fusion for Spam Detection (AFSD) extracts text features (bag-of-words) from an email's character strings, develops a spam detector for a binary classification task (spam versus regular message), and shows promising accuracy in combating email spam.1 People don't want legitimate email blocked, so to take false-positive rates (FPRs) into consideration, AFSD gives the accuracy score measured by the area under the receiver-operating-characteristics curve (AUC) of 0.991 (with the highest score being 1) on a dataset from NetEase, one of the largest email service providers in China, indicating an almost-perfect performance in stopping text-rich spam for email services.

The MailRank system studies email networks using data gathered from the log files of a company-wide server and the email addresses that users prefer to communicate with (typically acquaintances rather than other unknown users).4 The system applies personalized PageRank algorithms with trusted addresses from different information sources to classify emails. It achieves stable, high-quality performance: the FPR is close to zero, smaller than 0.5 percent when the network is 50 percent sparser.

Text message spam often uses the promise of free gifts, product offers, or debt relief services to get users to reveal personal information. Due to the low cost of sending them, the resulting massive amount of SMS spam seriously harms users' confidence in their telecom service providers. Content-based spam filtering can use indicative keywords in this space, such as "gift card" or "Cheap!!!," but obtaining a text message's content is expensive and often infeasible. An algorithm for SMS filtering (SMSF) detects spam using static features such as the total number of messages in seven days and temporal features such as message size every day.8 On SMS data from 5 million senders on a Chinese telecom platform, we can use static features to train support vector machine (SVM) classifiers and get the AUC to 0.883; incorporating temporal features gives an additional 7 percent improvement.

Researchers have developed various data mining approaches to detect email, SMS, and Web spam. High accuracy (near-1 AUC and near-0 FPR) makes these methods applicable in real systems, and some achieve a certain degree of success. But with rising new platforms such as shopping and social networking sites, researchers have new challenges to tackle.

Fake Reviews
Helpful customer reviews and ratings promise many benefits in e-commerce systems such as Amazon, Yelp, and Google Play. But in recent years, these companies have had to crack down on people selling fake reviews to bolster products, restaurants, or apps. Fake reviews give undeserving positive opinions to promote a target object, or conversely, malicious negative opinions that damage an object's reputation.

The first comprehensive study on trustworthiness in online reviews investigated 5.8 million reviews and 2.1 million reviewers on Amazon.15 This work had to define text features from reviews (length, opinion-bearing words), attribute features from products (price, sales rank), and rating-related features from reviewers (average rating, standard deviation in rating); it also had to use a logistic regression model to detect fake reviews (4,488 duplicate reviews served as ground truth). Using only text features gives only 0.63 on the AUC, indicating that it's difficult to identify fake reviews using text content alone. Combining all the features gives the best result: 0.78 AUC. A scoring method called Spamming Behavior Regression Model (SBM) simulates the behavioral patterns of fake review spammers: they target specific products to maximize their impact, and they tend to deviate from other reviewers in their ratings.16 SBM correctly identified the top and bottom 10 ranked reviewers as spammers and nonspammers, respectively, in a small labeled Amazon dataset of 24 spammers and 26 nonspammers.

In the past five years, researchers have focused on discovering fake reviewers' behavioral patterns and combining these findings with text content to improve detection performance. GSRank studies fake review detection in a collaborative setting and uses a frequent itemset-mining method to find a set of fake reviewer groups.14 Using group features can improve the AUC from 0.75 to 0.95. The Author Spamicity Model (ASM) reviews spammers' features in a latent space, receiving a 7 percent improvement in accuracy.13 The intuition is that spammers have different behavioral distributions than nonspammers, creating a divergence between the latent population distributions of the two reviewer clusters. FraudEagle spots fraudsters and fake reviews in online review datasets by exploiting the network effect among reviewers and products,19 and Loopy Belief Propagation (LBP) exploits the bursty nature of reviews to identify review spammers.22 Although review bursts can be due to either sudden product popularity or spam attacks, spammers tend to work with other spammers, and genuine reviewers tend to work with other genuine reviewers. LBP incorporates review content, co-occurrence networks, and reviewer burstiness into a probabilistic graphical model (PGM). The "content + network + behavior" combination significantly increases accuracy from 0.58 to 0.78 in the binary classification task.

Social Spam
Social spam is unwanted user-generated content (UGC) such as messages, comments, or tweets on social networking services (SNSs) such as Facebook, MySpace, or Twitter. Successfully defending against social spammers is important for improving the quality of experience for SNS users.

The deployment of social honeypots harvests deceptive spam profiles from an SNS; from here, statistical analysis on these spam profiles creates spam classifiers that actively filter social spammers. Decorate, an ensemble learner of classifiers, uses features from profiles (such as sexual or advertisement content) to classify spammers and legitimate users.3 It obtains an accuracy of 0.9921 and an FPR of 0.007 on a MySpace dataset of 1.5 million profiles, and an accuracy of 0.8898 and an FPR of 0.057 on a Twitter dataset of 210,000 profiles. URLSpam focuses on detecting Twitter spammers who post tweets containing words found in trending topics and URLs that lead users to unrelated websites.17 It uses both content-based features (such as the number of hashtags or URLs) and behavioral features (number of tweets posted per day or time between tweets) as attributes of an SVM classifier, correctly classifying 70 percent of spammers and 96 percent of nonspammers. Scavenger is a clustering technique to group Facebook wall posts that show strong similarities in advertised URL destinations or text descriptions.18 Specifically, it characterizes static and temporal

properties of malicious clusters and identifies spam campaigns such as "Get free ringtones" in 30,000 posts and "Check out this cool video" in 11,000 posts from a large dataset composed of more than 187 million posts. The proposed Social Spammer Detection in Microblogging (SSDM) approach is a matrix factorization-based model that integrates both social network information and content information for social spammer detection.10,24,25 The unified model achieves 9.73 percent higher accuracy than those with only one kind of information on a Twitter dataset of 2,000 spammers and 10,000 legitimate users.

Social sybils refer to suspicious accounts creating multiple fake identities to unfairly increase a single user's power or influence. With social networking information of n user nodes, SybilLimit accepts only O(log n) sybil nodes per attack edge.5 The intuition is that if malicious users create too many sybil identities, the graph will have a small set of attack edges whose removal disconnects a large number of nodes (all the sybil identities). SybilRank relies on social graph properties to rank users according to their perceived likelihood of being fake sybils.11 SybilRank found 90 percent of 200,000 accounts as most likely being fake on Tuenti, an SNS with 11 million users.

Social media has rapidly grown in importance as a forum for political, advertising, and religious activism. Astroturfing is a particular type of abuse disguised as spontaneous "grassroots" behavior, but that is in reality carried out by a single person or organization. Using content-based features such as hashtags, mentions, URLs, and phrases can help determine astroturfing content with 0.96 accuracy.2 The Truthy system includes network-based information (degree, edge weight, and clustering coefficient) to track political memes in Twitter and detect astroturfing campaigns in the context of US political elections.6

Link Farming
Link farming previously referred to a form of spamming on search engine indexes that connected all of a webpage's hyperlinks to every other page in a group. Today, it's grown to include many graph-based applications with millions of nodes and billions of edges. For example, in Twitter's "who-follows-whom" graph, fraudsters are paid to make certain accounts seem more legitimate or famous by giving them additional followers (zombies). In Facebook's "who-likes-what-page" graph, fraudsters create ill-gotten page likes to turn a profit from groups of users acting together, generally liking the same pages at around the same time. Unlike spam content that can be caught via existing antispam techniques, link-farming fraudsters can easily avoid content-based detection: zombie followers don't have to post suspicious content, they just distort the graph structure. Thus, the problem of combating link farming is rather challenging.

With a set of known spammers and a Twitter network, a PageRank-like approach can give high Collusionrank scores to zombie followers.12 LockInfer uncovers lockstep behaviors in zombie followers and provides initialization scores by reading the social graph's connectivity patterns.21 CatchSync exploits two of the tell-tale signs left in graphs by fraudsters: they're often required to perform some task together and have "synchronized" behavioral patterns, meaning their patterns are rare and very different from the majority.23 Quantifying these concepts and using a distance-based outlier detection method, CatchSync can achieve 0.751 accuracy in detecting zombie followers on Twitter and 0.694 accuracy on Tencent Weibo, one of the biggest microblogging platforms in China. CatchSync works well with content-based methods: combining content and behavioral information can improve accuracy by 6 and 9 percent, respectively. So far, it has found 3 million suspicious users who connect to around 20 users from a set of 1,500 celebrity-like accounts on the 41-million-node Twitter network, creating a big spike on the out-degree distribution of the graph. Furthermore, removing the suspicious user nodes leaves a smooth power law distribution on the remaining part of the graph, strong evidence that recall on the full dataset is high.

CopyCatch detects lockstep Facebook page-like patterns by analyzing the social graph between users and pages and the times at which the edges in the graph (the likes) were created.9 Specifically, it searches for temporally coherent "bipartite cores," where the same set of users like the same set of pages, and adds constraints on the relationship between edge properties (like times) in this core. CopyCatch is actively in use at Facebook, searching for attacks on its social graph. Com2 leveraged tensor decomposition on (caller, callee, and day) triplets and a minimum description length (MDL)-based stopping criterion to find time-varying communities in a European Mobile Carrier dataset



Table 2. Recent suspicious behavior-detection methods in three main categories.

| Methods | Traditional spam | Fake reviews | Social spam | Link farming |
| Supervised methods | AFSD1 and SMSF8 | LBP,22 OpinSpam,15 and SBM16 | Astroturf,2 Decorate,3 SSDM,10 and URLSpam17 | — |
| Clustering methods | — | ASM13 and GSRank14 | Scavenger18 and Truthy6 | — |
| Graph-based methods | MailRank4 | FraudEagle19 | SybilLimit5 and SybilRank11 | CatchSync,23 Collusionrank,12 Com2,20 CopyCatch,9 LockInfer,21 and OddBall7 |

of 3.95 million users and 210 million phone calls over 14 days.20 One observation was that five users received, on average, 500 phone calls each, on each of four consecutive days, from a single caller. Similarly, OddBall spots anomalous donator nodes whose neighbors are very well connected ("near-cliques") or not connected ("stars") on a large graph of political donations.7

Solving link-farming problems in the real world, such as hashtag hijacking, retweet promoting, or false news spreading, requires deep understanding of their specific suspicious behavioral patterns. Only through the integration of content, network, and behavioral information can we find multi-aspect clues for effective and interpretable detection.

Detection Methods
Suspicious behavior detection problems can be formulated as machine learning tasks. Table 2 presents three main categories of detection methods: supervised, clustering, and graph-based methods. Supervised methods infer a function from labeled email content, webposts, and reviews. Labeling the training data manually is often hard, so large-scale real systems use clustering methods on millions of suspicious users and identify spammer and fraudster clusters. Social network information and behavioral information are often represented as graph data, so graph-based methods have been quite popular in detecting suspicious behavioral links (injected following links, ill-gotten Facebook likes, and strange phone calls). Supervised methods are often applied to detect suspicious users (fake reviewers, sybil accounts, and social spammers). Because labeling data is difficult and graph data is emerging, unsupervised methods (which include both clustering and graph-based methods) nicely overcome the limitations of labeled data access and generalize to the real world.

Supervised Methods
The major approaches in supervised detection methods are linear/logistic regression models, naive Bayesian models, SVM, nearest-neighbor algorithms (such as k-NN), least squares, and ensembles of classifiers (AdaBoost).

We start with suspicious users, such as social spammers, {u1, ..., uN}, where the ith user is ui = (xi, yi), xi ∈ R^(D×1) (D is the number of features) is the feature representation of a certain user (a training example), and yi ∈ {0, 1} is the label denoting whether the user is a spammer (1) or a legitimate user (0). A supervised method seeks a function g: X → Y, where X is the input feature space and Y is the output label space. Regressions and naive Bayes are conditional probability models, where g takes the form of g(x) = P(y|x). Linear regression in SBM models the relationship between a scalar dependent variable (label) y and one or more independent variables (features) x. Logistic regression applied in AFSD, Decorate, and OpinSpam assumes a logistic function to measure the relationship between labels and features. Naive Bayes classifiers assume strong independence between features. Gaussian, multinomial, and Bernoulli naive Bayes have different assumptions on distributions of features. An SVM constructs a hyperplane that represents the largest margin between the two classes (spammers and legitimate users). SMSF applies Gaussian kernels and URLSpam applies a radial basis function kernel to maximum-margin hyperplanes to efficiently perform a nonlinear classification. According to the distance measure, instead of the margin, the k-NN algorithms find the top k nearest neighbors of training instances from test instances. The least squares method in SSDM learns a linear model to fit the training data. This model can use an l1-norm penalization to control the sparsity. The classification task can be performed by solving the optimization problem

min_W (1/2) ||X^T W − Y||_F^2 + λ ||W||_1,

where X ∈ R^(D×N) denotes the feature matrix, Y ∈ R^(N×C) denotes the label matrix (C is the number of categories/labels, and C = 2 if we focus on classifying users as spammers or legitimate users), and λ is a positive regularization parameter. AdaBoost is an algorithm for constructing a strong classifier as a linear combination of simple classifiers. Statistical analysis has demonstrated that ensembles of classifiers can improve the performance of individual classifiers.

The key process in making supervised methods work better is feature engineering, that is, using domain knowledge of the data to create features. Feature engineering is much more difficult and time-consuming than feature selection (returning a subset of relevant features). Logistic regression models in OpinSpam, for example, were fed with 36 proposed features of content and behavior information from three aspects (review, reviewer, and product), all of which required knowledge from experts who are familiar with Amazon and its review dataset.

Figure 2. Three future directions for suspicious behavior detection: (a) behavior-driven suspicious pattern analysis, (b) multifaceted behavioral information integration, and (c) general metrics for suspicious behaviors. Detecting retweet hijacking requires comprehensively analyzing suspicious behavioral patterns (such as lockstep or synchronized). Detection techniques should integrate multifaceted behavioral information such as user, content, device, and time stamp.

The bottleneck in supervised methods is the lack of labeled training data in large-scale, real-world applications. As CopyCatch says, unfortunately, there's no ground truth about whether any individual Facebook page "like" is legitimate. Similarly, CatchSync argues that labeling Twitter accounts as zombie followers or normal users can be difficult with only subtle red flags: each account, on its own, generates a few small suspicions, but collectively, they raise many more. Researchers have come to realize the power of unsupervised methods such as clustering and graph-based methods that look for suspicious behaviors from millions of users and objects.

Clustering Methods
Clustering is the task of grouping a set of objects so that those in the same cluster are more similar to each other than to those in other clusters. To detect suspicious users in this manner, Scavenger extracts URLs from Facebook wall posts, builds a wall post similarity graph, and then clusters wall posts that share similar URLs.18 The next step is to identify which clusters are likely to represent the results of spam campaigns. Clustering methods can reveal hidden structures in countless unlabeled data and summarize key features.

Latent variable models use a set of latent variables to represent unobserved variables such as the spamicity of Amazon reviewers, where spamicity is the degree of something being spam. The Author Spamicity Model (ASM) is an unsupervised Bayesian inference framework that formulates fake review detection as a clustering problem.13 Based on the hypothesis that fake reviewers differ from others on behavioral dimensions, ASM looks at the population distributions of two clusters, fake reviewers and normal reviewers, in a latent space. Its accurate classification results give good confidence that unsupervised spamicity models can be effective.

Graph-Based Methods
Graphs (directed/undirected, binary/weighted, static/time-evolving) represent interdependencies through the links or edges between objects, via network information in social spam and behavioral information in link-farming scenarios. Graph-based suspicious detection methods can be categorized into PageRank-like approaches and density-based methods.

PageRank-like approaches solve the suspicious node detection problem in large graphs from the ranking perspective, such as MailRank for spam ranking, SybilRank and Collusionrank for sybil ranking, and FraudEagle for fraud ranking. For example, given a graph G = (U, E), where U = {u1, ..., uN} is the set of nodes, Eij is the edge from node ui to uj, and the initial spamicity scores (PageRank values) of the set of nodes are R(0) = [R(u1), ..., R(uN)]^T, find the nodes that are most likely to be suspicious. The iterative equation of the solution is

R(t+1) = d T R(t) + ((1 − d)/N) 1,

where T ∈ R^(N×N) denotes the adjacency matrix of the graph or transition matrix. Note that the initial scores could be empty, but if they were determined with heuristic rules or former observations, the performance would improve. LockInfer works on Twitter's "who-follows-whom" graph and infers strange connectivity patterns from subspace plots. With seed nodes of high scores from observations on the plots, the PageRank-like (trust propagation) algorithm accurately finds suspicious followers and followees who perform lockstep behaviors.

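The l1-regularized least-squares objective given under Supervised Methods, min_W (1/2)||X^T W − Y||_F^2 + λ||W||_1, can be solved with proximal gradient descent (ISTA). The sketch below uses invented toy data and an invented step size; SSDM itself is a richer matrix-factorization model, so this only illustrates the penalized least-squares core:

```python
# Proximal-gradient (ISTA) sketch of the l1-penalized least-squares
# objective: min_W 0.5 * ||X^T W - Y||_F^2 + lam * ||W||_1.
# The toy features and labels are invented; SSDM is a matrix-
# factorization model, so this only shows the penalized core.
def soft_threshold(v, t):
    """Proximal operator of t * |v| (elementwise shrinkage)."""
    return max(v - t, 0.0) + min(v + t, 0.0)

def ista(X, Y, lam=0.01, step=0.01, iters=2000):
    """X is D x N (features), Y is N x C (labels); returns W (D x C)."""
    D, N, C = len(X), len(X[0]), len(Y[0])
    W = [[0.0] * C for _ in range(D)]
    for _ in range(iters):
        # residual = X^T W - Y, shape N x C
        R = [[sum(X[d][n] * W[d][c] for d in range(D)) - Y[n][c]
              for c in range(C)] for n in range(N)]
        # gradient of the smooth part = X * residual, shape D x C
        G = [[sum(X[d][n] * R[n][c] for n in range(N)) for c in range(C)]
             for d in range(D)]
        W = [[soft_threshold(W[d][c] - step * G[d][c], step * lam)
              for c in range(C)] for d in range(D)]
    return W

# Two features, four users: feature 0 marks spammers, feature 1 marks
# legitimate users; Y one-hot encodes the two classes (C = 2).
X = [[1, 1, 0, 0],
     [0, 0, 1, 1]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1]]
W = ista(X, Y)
```

On this separable toy data, the l1 term shrinks the spam-class weight on the "spammer" feature to 1 − λ/2 and keeps the irrelevant cross weights exactly at zero, which is the sparsity control the text describes.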


Density-based detection methods in graphs share the same idea with density-based clustering: they're looking for areas of higher density than the remainder of the graphs/data. The task is, given a graph G, to find all subgraphs Gsub (near-cliques, bipartite cores, and communities) that have anomalous patterns (unexpectedly high density). OddBall extracts features such as the number of node neighbors (degrees), number of subgraph edges, total subgraph weight, and principal eigenvalue of the subgraph's weighted adjacency matrix. With these features, OddBall uses traditional outlier detection methods for near-cliques that indicate malicious posts and fake donations. Com2 applies incremental tensor decomposition on a "caller-callee-time" dataset, that is, a phone call time-evolving graph, to find anomalous temporal communities. CopyCatch offers a provably convergent iterative algorithm to search temporally coherent bipartite cores (dense "user-page" subgraphs) that indicate suspicious lockstep behaviors (such as group attacks or ill-gotten likes) in a million-node Facebook graph. CatchSync finds synchronized behavioral patterns of zombie followers on Twitter-style social networks and reports anomalous "who-follows-whom" subgraphs. Many graph-based approaches assume that the data exhibits a power-law distribution. CatchSync successfully removes spikes on the out-degree distributions by deleting the subgraphs.

Future Directions
For more than a decade, there has been tremendous growth in our understanding of suspicious behavior detection, with most research in this domain focusing on spam content analysis. Figure 2 offers three ideas for future directions that involve answering the following questions: What's the nature of suspicious behaviors? How can we model behavioral information from real data? How can we quantify the suspiciousness of strange behavioral patterns?

Behavior-Driven Suspicious Pattern Analysis
Several approaches have been proposed for detecting fake accounts in social networks. Most learn content-based features from duplicate tweets, malicious URLs, misleading content, and user profiles. Although they try to capture different aspects of spam behaviors, the patterns in these fake accounts can easily be varied by changing the scripts that create them, the updating speed of which is almost always faster than learning-based spam-detection algorithms. These detection methods also rely heavily on side information (such as, say, tweet content) that isn't always available and is mostly available after the fact (after the account publishes malicious information). We need to change the focus from understanding how these fake accounts appear to behave (publishing duplicate tweets, malicious URLs, and so on) to conceal their fake identities or launch attacks to discovering how they must behave (follow or be followed) for monetary purposes.23 The simple fact is that fake accounts consistently follow a group of suspicious followees so the companies hosting these fake accounts can earn money from them.

As Figure 2a shows, existing retweet hijacking-detection methods analyze text features in tweets and classify them as spam or normal content. To comprehensively capture hijacking patterns in these fake behaviors, we need behavior-driven approaches that analyze retweeting behavioral links, assuming that retweet hijacking often forms "user-tweet" bipartite cores.

The nature of suspicious behaviors isn't content but monetary incentives, for fraudsters, fake accounts, and many other kinds of suspicious users and their behavioral patterns. Behavior-driven suspicious pattern analysis can become a driving force in suspicious behavior detection.

Multifaceted Behavioral Information Integration
User behavior is the product of a multitude of interrelated factors. Some of these factors, such as physical environment, social interaction, and social identity, can affect how the behavior is related to monetary incentives or other motivations. For example, if a group of Twitter accounts engage in retweet hijacking, they operate together on a cluster of machines (maybe in the same building or city), promoting a small group of tweets during the same time period (retweeting every night in one week, for example). Figure 2b represents retweeting behaviors as "user-tweet-IP-..." multidimensional tensors. User behaviors naturally evolve with the changing of both endogenous (intention) and exogenous (environment) factors, resulting in different multifaceted, dynamic behavioral patterns. However, there's a lack of research to support suspicious behavior analysis with multifaceted and temporal information.

Flexible Evolutionary Multifaceted Analysis (FEMA) uses a flexible and dynamic high-order tensor factorization scheme to analyze user behavioral data sequences, integrating various bits of knowledge embedded in multiple aspects of behavioral information.26 This method sheds light on behavioral pattern discovery in real-world applications, and integrating multifaceted behavioral information provides a deeper understanding of how to distinguish suspicious and normal behaviors.

General Metrics for Suspicious Behaviors
Suppose we use "user-tweet-IP," a three-order tensor, to represent a retweeting dataset. Figure 2c gives us three subtensors: the first two are dense three-order subtensors of different sizes, and the third is a two-order subtensor that takes all the values on the third mode. For example, the first subtensor has 225 Twitter users, all retweeting the same 5 tweets, 10 to 15 times each, using 2 IP addresses; the second has 2,000 Twitter users, retweeting the same 30 tweets,

If our job at Twitter is to detect when fraudsters are trying to manipulate the most popular tweets, given time pressure and considering facet (user, tweet, and IP), size, and density, which subtensor is more worthy of our investigation?

Dense blocks (subtensors) often indicate suspicious behavioral patterns in many detection scenarios. Purchased page likes on Facebook result in dense "user-page-time" three-mode blocks, and zombie followers on Twitter create dense "follower-followee" two-mode blocks. Density is worth inspecting, but how do we evaluate the suspiciousness? In other words, can we find a general metric for the suspicious behaviors?

A recent fraud detection study provides a set of basic axioms that a good metric must meet to detect ... experimental results show that a search algorithm based on this metric can catch hashtag promotion and retweet hijacking.

Different real-world applications have different definitions of suspicious behaviors. Detection methods often look for the most suspicious parts of the data by optimizing (maximizing) suspiciousness scores. However, quantifying the suspiciousness of a behavioral pattern is still an open issue.

References
1. C. Xu et al., "An Adaptive Fusion Algorithm for Spam Detection," IEEE Intelligent Systems, vol. 29, no. 4, 2014, pp. 2–8.
2. J. Ratkiewicz et al., "Detecting and Tracking Political Abuse in Social Media," Proc. 5th Int'l AAAI Conf. Weblogs and Social Media, 2011; www.aaai.org/ocs/index.php/ICWSM/ICWSM11/paper/view/2850.
3. K. Lee, J. Caverlee, and S. Webb, "Uncovering Social Spammers: Social Honeypots + Machine Learning," Proc. 33rd Int'l ACM SIGIR Conf. Research and Development in Information Retrieval, 2010, pp. 435–442.
4. P.-A. Chirita, J. Diederich, and W. Nejdl, "MailRank: Using Ranking for Spam Detection," Proc. 14th ACM Int'l Conf. Information and Knowledge Management, 2005, pp. 373–380.
5. H. Yu et al., "SybilLimit: A Near-Optimal Social Network Defense against Sybil Attacks," IEEE/ACM
3 to 5 times each, using 30 IP ad- dense blocks in multifaceted data Trans. Networking, vol. 18, no. 3,
dresses; and the third has 10 Twitter (if two blocks are the same size, the 2010, pp. 885–898.
users, retweeting all the tweets, 5 to denser one is more suspicious).27 It 6. J. Ratkiewicz et al., “Truthy: Mapping
10 times each, using 10 IP addresses. demonstrates that while simple, meet- the Spread of Astroturf in Microblog
If our job at Twitter is to detect when ing all the criteria is nontrivial. The Streams,” Proc. 20th Int’l Conf. Comp.
fraudsters are trying to manipu- authors derived a metric from the World Wide Web, 2011, pp. 249–252.
late the most popular tweets, given probability of “dense-block” events 7. L. Akoglu, M. McGlohon, and
time pressure and considering facet that meet the specified criteria. Their C. Faloutsos, “Oddball: Spotting
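To make the subtensor comparison above concrete, here is a minimal sketch that scores each of the three example blocks with a simple Poisson surprise measure, in the spirit of (but not identical to) the dense-block metric in reference 27. The dataset totals (TOTAL_MASS, TOTAL_VOLUME), the 100,000-tweet dimension for the third block, and the per-pair retweet counts (midpoints of the stated ranges) are all assumed purely for illustration.

```python
import math

def block_surprise(mass, volume, total_mass, total_volume):
    """Approximate negative log-likelihood of observing `mass` events in a
    block of `volume` cells, when events fall uniformly over the tensor at
    the global density total_mass / total_volume (a Poisson model)."""
    expected = volume * (total_mass / total_volume)  # n * p
    return expected - mass + mass * math.log(mass / expected)

# Hypothetical dataset totals, assumed only for this toy computation.
TOTAL_VOLUME = 1_000_000 * 100_000 * 50_000  # user x tweet x IP cells
TOTAL_MASS = 300_000_000                     # total retweet events

# The three subtensors from Figure 2c: (mass = retweet events, volume = cells).
blocks = {
    "225 users x 5 tweets x 2 IPs":       (225 * 5 * 12.5,     225 * 5 * 2),
    "2,000 users x 30 tweets x 30 IPs":   (2000 * 30 * 4.0,    2000 * 30 * 30),
    "10 users x 100,000 tweets x 10 IPs": (10 * 100_000 * 7.5, 10 * 100_000 * 10),
}

for name, (mass, volume) in blocks.items():
    s = block_surprise(mass, volume, TOTAL_MASS, TOTAL_VOLUME)
    print(f"{name}: mass={mass:,.0f}, volume={volume:,.0f}, surprise={s:,.0f}")
```

This naive score satisfies the density axiom quoted above (for equal volume, the block with more events scores higher), but it rewards raw mass heavily; balancing mass, density, and the number of contracted modes is exactly what the axioms in reference 27 are designed to formalize.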



References

1. C. Xu et al., "An Adaptive Fusion Algorithm for Spam Detection," IEEE Intelligent Systems, vol. 29, no. 4, 2014, pp. 2–8.
2. J. Ratkiewicz et al., "Detecting and Tracking Political Abuse in Social Media," Proc. 5th Int'l AAAI Conf. Weblogs and Social Media, 2011; www.aaai.org/ocs/index.php/ICWSM/ICWSM11/paper/view/2850.
3. K. Lee, J. Caverlee, and S. Webb, "Uncovering Social Spammers: Social Honeypots + Machine Learning," Proc. 33rd Int'l ACM SIGIR Conf. Research and Development in Information Retrieval, 2010, pp. 435–442.
4. P.-A. Chirita, J. Diederich, and W. Nejdl, "Mailrank: Using Ranking for Spam Detection," Proc. 14th ACM Int'l Conf. Information and Knowledge Management, 2005, pp. 373–380.
5. H. Yu et al., "Sybillimit: A Near-Optimal Social Network Defense against Sybil Attacks," IEEE/ACM Trans. Networking, vol. 18, no. 3, 2010, pp. 885–898.
6. J. Ratkiewicz et al., "Truthy: Mapping the Spread of Astroturf in Microblog Streams," Proc. 20th Int'l Conf. Comp. World Wide Web, 2011, pp. 249–252.
7. L. Akoglu, M. McGlohon, and C. Faloutsos, "Oddball: Spotting Anomalies in Weighted Graphs," Advances in Knowledge Discovery and Data Mining, LNCS 6119, Springer, 2010, pp. 410–421.
8. Q. Xu et al., "SMS Spam Detection Using Noncontent Features," IEEE Intelligent Systems, vol. 27, no. 6, 2012, pp. 44–51.
9. A. Beutel et al., "Copycatch: Stopping Group Attacks by Spotting Lockstep Behavior in Social Networks," Proc. 22nd Int'l Conf. World Wide Web, 2013, pp. 119–130.
10. X. Hu et al., "Social Spammer Detection in Microblogging," Proc. 23rd Int'l Joint Conf. Artificial Intelligence, 2013, pp. 2633–2639.
11. Q. Cao et al., "Aiding the Detection of Fake Accounts in Large Scale Social Online Services," Proc. 9th Usenix Conf. Networked Systems Design and Implementation, 2012, pp. 15–28.
12. S. Ghosh et al., "Understanding and Combating Link Farming in the Twitter Social Network," Proc. 21st Int'l Conf. World Wide Web, 2012, pp. 61–70.
13. A. Mukherjee et al., "Spotting Opinion Spammers Using Behavioral Footprints," Proc. 19th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2013, pp. 632–640.
14. A. Mukherjee, B. Liu, and N. Glance, "Spotting Fake Reviewer Groups in Consumer Reviews," Proc. 21st Int'l Conf. World Wide Web, 2012, pp. 191–200.
15. N. Jindal and B. Liu, "Opinion Spam and Analysis," Proc. 1st Int'l Conf. Web Search and Data Mining, 2008, pp. 219–230.
16. E.-P. Lim et al., "Detecting Product Review Spammers Using Rating Behaviors," Proc. 19th ACM Int'l Conf. Information and Knowledge Management, 2010, pp. 939–948.
17. F. Benevenuto et al., "Detecting Spammers on Twitter," Proc. Collaboration, Electronic Messaging, Anti-Abuse and Spam Conf., vol. 6, 2010, p. 12.
18. H. Gao et al., "Detecting and Characterizing Social Spam Campaigns," Proc. 10th ACM SIGCOMM Conf. Internet Measurement, 2010, pp. 35–47.
19. L. Akoglu, R. Chandy, and C. Faloutsos, "Opinion Fraud Detection in Online Reviews by Network Effects," Proc. Int'l Conf. Web and Social Media, 2013; www.aaai.org/ocs/index.php/ICWSM/ICWSM13/paper/viewFile/5981/6338.
20. M. Araujo et al., "Com2: Fast Automatic Discovery of Temporal ('Comet') Communities," Advances in Knowledge Discovery and Data Mining, LNCS 8444, Springer, 2014, pp. 271–283.
21. M. Jiang et al., "Inferring Strange Behavior from Connectivity Pattern in Social Networks," Advances in Knowledge Discovery and Data Mining, LNCS 8443, Springer, 2014, pp. 126–138.
22. G. Fei et al., "Exploiting Burstiness in Reviews for Review Spammer Detection," Proc. 7th Int'l AAAI Conf. Weblogs and Social Media, 2013; www.aaai.org/ocs/index.php/ICWSM/ICWSM13/paper/viewFile/6069/6356.
23. M. Jiang et al., "Catchsync: Catching Synchronized Behavior in Large Directed Graphs," Proc. 20th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2014, pp. 941–950.
24. X. Hu et al., "Social Spammer Detection with Sentiment Information," Proc. Int'l Conf. Data Mining, 2014, pp. 180–189.
25. X. Hu, J. Tang, and H. Liu, "Online Social Spammer Detection," Proc. 28th AAAI Conf. Artificial Intelligence, 2014; www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/viewFile/8467/8399.
26. M. Jiang et al., "FEMA: Flexible Evolutionary Multi-Faceted Analysis for Dynamic Behavioral Pattern Discovery," Proc. 20th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2014, pp. 1186–1195.
27. M. Jiang et al., "A General Suspiciousness Metric for Dense Blocks in Multimodal Data," to be published in Proc. IEEE Int'l Conf. Data Mining, 2015; www.meng-jiang.com/pubs/crossspot-icdm15/crossspot-icdm15-paper.pdf.

The Authors

Meng Jiang is a PhD candidate in the Department of Computer Science and Technology of Tsinghua University, Beijing. His research interests include behavioral modeling and social media mining. Jiang has a BS in computer science from Tsinghua University. Contact him at mjiang89@gmail.com.

Peng Cui is an assistant professor in the Department of Computer Science and Technology of Tsinghua University, Beijing. His research interests include social network analysis and social multimedia computing. Contact him at cuip@tsinghua.edu.cn.

Christos Faloutsos is a professor in the School of Computer Science at Carnegie Mellon University. His research interests include large-scale data mining with emphasis on graphs and time sequences. Faloutsos has a PhD in computer science from University of Toronto. He's an ACM Fellow. Contact him at christos@cs.cmu.edu.

