
Information Security Journal: A Global Perspective, 21:206–215, 2012. Copyright © Taylor & Francis Group, LLC. ISSN: 1939-3555 print / 1939-3547 online. DOI: 10.1080/19393555.2012.660677

Managing Trust in Social Networks


G. Canfora and C. A. Visaggio University of Sannio, Department of Engineering, Benevento, Italy


ABSTRACT The huge diffusion of Web-based social networks increases the risks for users, including identity theft, personalized phishing, and malware. Effective mechanisms for trust evaluation can help to identify malicious users and counter these risks. In this paper, we propose an approach to evaluate trust based on the history of each user's activity. This guarantees robustness (the history is difficult for an attacker to reproduce artificially) and reliability (the approach does not rely on user-generated tags or keywords but on an analysis of users' conversations).
KEYWORDS application security, trust management, Web-based social networks

1. INTRODUCTION
Web-based social networks (WBSN) are online communities that allow users to share resources (files, video, audio, and software applications) and to establish relationships that may serve different purposes, including entertainment and business. Recently, the number of WBSN and of their users has grown considerably. The diffusion of WBSN has increased risks, which mainly concern information stealing and session hijacking. WBSN are now leveraged as a principal vector for malware, as the well-known case of Koobface (Ciubotariu, 2008) demonstrated. The adoption of semantic Web technologies such as Friend of a Friend (FOAF; Brickley & Miller, 2007; Ding et al., 2005) has simplified information access and dissemination over multiple WBSN. This entails that information owners have less control over its diffusion. Solutions to limit attacks on WBSN rely basically on controlling the trustworthiness of a user asking to be accepted into the victim's community. Two main approaches have been proposed in the literature: one based on profile similarity and the other assigning a trustworthiness value to users based on a voting mechanism. The first method is weak, as a profile can be easily reproduced. This was demonstrated by researchers who cloned profiles that were recognized as real by many users of the network and included in their communities (Bilge et al., 2009). The vote system could be efficient and scalable, but there is no guarantee that users will vote on the trustworthiness of a user, or that an attacker will not drug, manipulate, or hamper the voting process in favor of a fake profile used for accomplishing attacks.

Address correspondence to C. A. Visaggio, University of Sannio, Department of Engineering, Viale Traiano 1, Benevento, 82100, Italy. E-mail: visaggio@unisannio.it.


In this paper we propose a trust mechanism that is based on the conversations that each user has had with the other members of the community. It is reasonable to assume that users will tend to talk about issues representing their own interests; thus, the dynamics of the conversations in the network could be a dependable indicator of the dynamics of trust. The strength of this approach is that an attacker cannot easily reproduce the history of conversations, so it has an intrinsic robustness. The paper is organized as follows. First, we discuss the risks WBSN users are exposed to, reporting some well-known attacks carried out in recent years. Then, we compare and contrast the weaknesses and strengths of the current mechanisms for assuring trust in WBSN. Finally, we present our solution and a case study conducted on Facebook for evaluating its effectiveness.

2. ATTACKS THROUGH WEB SOCIAL NETWORKS


Most attacks that have been accomplished through social networks were aimed at stealing sensitive information or spreading malware. For stealing information, some methods make use of email messages that take advantage of some shared context among friends on a social network, such as birthday celebrations, living in the same town, or participating in a common event. Brown et al. (2008) identified three kinds of context-aware spam attacks:

- Relationship-based attacks use only friend-to-friend relationship information. No other attributes from users' profiles are required to carry out an attack.
- Unshared-attribute attacks use friend-to-friend relationships, along with an attribute from only one of the parties in the relationship. An example is the use of a birthday attribute from one of the parties for devising an attack.
- Shared-attribute attacks use friend-to-friend relationships, along with an attribute that is visible to both parties in the relationship. Usually, if the attributes share a value (e.g., a common hometown), that can help devise an attack.

Narayanan and Shmatikov (2009) discuss how users' privacy can be violated in social networks. A social network consists of nodes, edges, and information associated with each node and edge. The existence of an edge between two nodes can be sensitive; for instance, in a sexual-relationship network with gender information attached to nodes (Bearman et al., 2004) it can reveal sexual orientation. Edge privacy was considered by Backstrom et al. (2007). In most online social networks, however, edges are public by default, and few users change the default settings (Gross et al., 2005). While the mere presence of an edge may not be sensitive, edge attributes may reveal more information (e.g., a single phone call vs. a pattern of calls indicative of a business or romantic relationship). Phone-call patterns of the disgraced NBA referee Tim Donaghy were used in an investigation (Winter, 2008). In online networks, such as LiveJournal, there is much variability in the semantics of edge relationships (Fono & Raynes-Goldie, 2007).

With regard to malware, there are basically three vehicles of diffusion. Some social networks, such as Facebook, give users the possibility to develop applications that other users can run through the social network Web page. Of course, this opens the way to the diffusion of malicious behaviors. A second way leverages social engineering techniques: the attacker creates a relationship with the victim (e.g., by chat or email), trying to lead the victim to click on a link that will execute malicious code. A third way of diffusion exploits the social network structure itself, as in the case of Koobface. The Koobface worm (Ciubotariu, 2008) sends messages to friends of infected MySpace and Facebook users, using social engineering techniques to coerce the friends into visiting a malicious Website to watch a video and thus infect their own computers. Other observed attacks, including worms such as Samy, use cross-site scripting techniques to spread across social networks. MySpace was attacked in 2005 by a JavaScript worm that copied itself to the viewer's profile along with the text "Samy is my hero" (Kamkar, 2005).

3. TRUST IN WEB SOCIAL NETWORKS


Jøsang et al. (2007) define trust as the extent to which a given entity A considers another entity B trustworthy; thus, it expresses a personal opinion of A about B and is consequently a subjective, or local, evaluation of trustworthiness. In contrast, reputation denotes the trustworthiness of a given entity for all the entities in a network; thus, it expresses a collective opinion of a community on one of its members: it is an objective, or global, evaluation. A trust relationship is usually modeled as a directed edge, from the entity A to the entity B, labeled with the information about how trustworthy A considers B. The directed edge reflects the asymmetric nature of trust; in fact, if A trusts B, this does not necessarily imply that B trusts A. While asymmetry occurs in all types of human relationships, it is documented more in situations where the two people are not of equal status. For example, employees typically say they trust their supervisors more than the supervisors trust the employees. This is seen in a variety of hierarchies (Yaniv & Kleinberger, 2000). Even outside hierarchies, social situations can arise with asymmetric trust relationships (Hardin, 2002; Cook, 2001). While trust is not necessarily transitive (that is, if A trusts B and B trusts C, it does not necessarily mean that A trusts C), the trustworthiness existing between entities not directly connected can be inferred by trust paths. In short, because Alice expects Bob to make good recommendations about music and Bob trusts Jane on music, Alice could merge those values to develop an idea of Jane's trustworthiness. This does not necessarily mean that Alice will trust Jane, as trust does not transfer directly to Alice. Computationally, this idea of propagating trust along chains of connections has been widely studied and implemented (Jøsang, 1996; Richardson et al., 2003; Ziegler & Lausen, 2004a).
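To make path-based inference concrete, the following is a minimal sketch of our own (not taken from the cited works): trust values in [0, 1] label directed edges, and the trust inferred between two unconnected users is the maximum, over all acyclic paths, of the product of the edge values. This mirrors the highest-trust-path rule of the D-FOAF system described in Section 3.1; the graph and names are illustrative.

```python
# Directed trust graph: trust[a][b] is how much a trusts b, in [0, 1].
# Asymmetric by construction: trust["alice"]["bob"] need not equal
# trust["bob"]["alice"].
trust = {
    "alice": {"bob": 0.9},
    "bob":   {"jane": 0.8},
    "jane":  {},
}

def inferred_trust(graph, source, target, path=None):
    """Best product of edge trusts over all acyclic source->target paths."""
    path = path or [source]
    if source == target:
        return 1.0
    best = 0.0
    for neighbor, level in graph.get(source, {}).items():
        if neighbor in path:  # skip cycles
            continue
        sub = inferred_trust(graph, neighbor, target, path + [neighbor])
        best = max(best, level * sub)
    return best

print(inferred_trust(trust, "alice", "jane"))  # 0.9 * 0.8 = 0.72
```

Note that the inferred value only suggests how much Alice might trust Jane; as discussed above, it is a recommendation, not a transfer of trust.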

3.1. Algorithms and technologies


Golbeck (2005) advocates the adoption of either binary or scalar trust relationships, arguing that the semantics of trust must be clear to average Web users, as WBSN members are, and that the task of trust specification must be as simple as possible. The D-FOAF system (Kruk et al., 2006; Choi et al., 2006) assumes that the trust level of a relationship between two WBSN members is determined by the path having the highest trust level among those of a given maximum length. The trust level of a path is computed as the product of the trust levels of its edges.

Marsh (1994) describes a system for modeling trust based on social and psychological factors. It consists of interacting agents that could maintain information about history and observed behaviors. Unfortunately, in many WBSN that implement trust, users assign a single rating without explicit context or history to their neighbors, and thus much of the information needed for a system such as Marsh's is missing. Castelfranchi and Falcone (1998, 2002) present an analysis of trust in multiagent systems. They identify the components forming trust, including the beliefs that must be held to develop trust and how belief is connected to previous experience; they also discuss the rational and irrational nature of trust. Their work relies on the psychological literature and includes many psychological factors in developing a model for trust in multiagent systems. In Web-based social networks, however, most information required to build the model is not explicit or easy to acquire.

In the recommender systems literature, several studies deal with the relationship between similarity and trust. Sinha and Swearingen (2001) and Swearingen and Sinha (2001) show that users trust recommendations from friends more than from systems. Ziegler and Golbeck (2006) show that there is a strong and significant correlation between trust and user similarity: the more similar two people are, the greater the trust between them.

There are a number of works on algorithms for inferring trust in social networks. While designed for peer-to-peer systems, one of the most widely cited trust algorithms is EigenTrust (Kamvar et al., 2003). It computes trust as a function of corrupt versus valid files that a peer provides, using a variation on the PageRank algorithm (Page et al., 1998) used by Google for rating the relevance of Web pages to a search. Advogato is a Website (http://advogato.org) where software developers freely discuss and share resources. It is also the testbed for Raph Levin's trust metrics research (Levin & Aiken, 1998). Each user on the site has a single trust rating computed from the perspective of designated seeds (authoritative nodes). Many algorithms to compute trust exploit principal eigenvalues (Richardson et al., 2003); this entails that trust must first be normalized to work within a matrix. However, socially, trust is not a finite resource: it is possible to have very high trust for a large number of people, and that trust is not any weaker than the trust held by a person who only trusts one or two others. Golbeck (2005) defines the TidalTrust algorithm for computing personalized trust values in social networks. Unlike the eigenvalue-based approaches, it outputs a trust recommendation in the same scale in which users assign trust values. Robertson et al. (2010) leverage social networks to help make decisions about the presence or lack of malicious intent in Web content. The approach applies social networking mechanisms to malware detection: the underlying assumption is that a link sent to you by someone you talk with often, or have many mutual contacts with, is much less likely to be malicious than a link you receive from someone you rarely talk to or who is at the far reaches of your social network. These algorithms are effective at predicting trust and for improving recommendations in certain cases. However, they can only be used when there are paths in the social network connecting individuals and when the trust values on those paths are accessible.
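As an illustration of the eigenvector family of algorithms, here is a minimal power-iteration sketch in the spirit of EigenTrust (Kamvar et al., 2003). It omits refinements such as pre-trusted peers, and the matrix values are made up; it is a sketch of the idea, not the published algorithm in full.

```python
import numpy as np

# Local trust matrix: C[i, j] = peer i's normalized trust in peer j.
# Rows must sum to 1 (this is the normalization step criticized above).
C = np.array([
    [0.0, 0.7, 0.3],
    [0.5, 0.0, 0.5],
    [0.9, 0.1, 0.0],
])

def eigentrust(C, epsilon=1e-6, max_iters=100):
    """Global trust vector: the principal left eigenvector of C."""
    n = C.shape[0]
    t = np.full(n, 1.0 / n)      # start from uniform trust
    for _ in range(max_iters):
        t_next = C.T @ t         # aggregate trust opinions from all peers
        t_next /= t_next.sum()   # renormalize to a probability vector
        if np.linalg.norm(t_next - t, 1) < epsilon:
            break
        t = t_next
    return t_next

print(eigentrust(C))  # one global (reputation-style) score per peer
```

The output is a single global score per peer, which is exactly the reputation-style, non-personalized evaluation that TidalTrust was designed to avoid.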

4. MODEL FOR TRUST IN WEB-BASED SOCIAL NETWORKS



Our model aims at assigning a level of trust to each network member, relying on the history of the member's social activity in the network. By social activities we mean the conversations that a member has had with other members. The underlying idea is that the community's members will talk about a certain topic more intensively with the persons they consider trustworthy on that topic than with the people they consider untrustworthy on that issue. Consequently, we evaluate trust for a specific semantic domain; this entails that Joan can be trustworthy for the theme music but untrustworthy for the theme soccer. This happens because people tend to talk mainly about their own interests, and to do so with people who share those interests. Thus, if Joan is fond of music, we can expect that a large part of her conversation will regard music, and many of her friends will talk with her about music. The assumption underlying this model is that the trust of a member about a certain theme in a WBSN is proportional to how much and how long other network members talk with that member about that theme, conversation being the main social activity in a WBSN.

Each user is associated with a couple (theme, if), where theme identifies a semantic domain and if is the evaluation of how much the user is trusted on the domain theme. if stands for information flow and is a function of the number of messages the user exchanged with the other members of the network on the argument identified by the theme and of the relevance of those messages to the theme. For instance, if the user Mark is characterized by the vector {(music, 4); (soccer, 1); (computer science, 12)}, this means that in the history of his discussions on the social network Mark discussed mainly three classes of arguments: computer science, music, and soccer, in descending order of importance. In other words, the model is saying that Mark talks mostly about computer science; his second favorite argument is music, and, last, soccer. In such a case, we claim that the community considers Mark more trustworthy with regard to music than to soccer.

Let us assume that the users $u_1$ and $u_2$ share the interest theme $k$, with these corresponding ifs: $u_1: (k, if_{u_1,k})$ and $u_2: (k, if_{u_2,k})$. To evaluate whether two users could trust each other, we check whether they have comparable levels of trust on a same theme: this should indicate that the two users share common interests. For each theme $k$, the similarity $S_{u_1,u_2,k}$ between user $u_1$ and user $u_2$ is computed as:

$$S_{u_1,u_2,k} = |if_{u_1,k} - if_{u_2,k}|$$

The closer to zero this difference, the stronger the trust relationship between the two users: it means that they talked about the same topic (theme $k$) with similar intensity.

Let us now consider the case of comparable themes. Two themes coincide when they represent the same concept and are expressed with the same label (soccer, soccer). Two themes are comparable when they are linked to each other by relationships of synonymy or hyper(/hypo)nymy. A hyponym is a word or phrase whose semantic field is included within that of another word, its hypernym (sometimes spelled hyperonym outside of the natural language processing community). In simpler terms, a hyponym shares a type-of relationship with its hypernym. For example, scarlet, vermilion, carmine, and crimson are all hyponyms of red (their hypernym), which is, in turn, a hyponym of color. Comparable themes are connected by relationships of synonymy and hyponymy. For instance, soccer and football are synonyms, while drums and guitar are two hyponyms of musical instruments. Words connected by a hyponymy/hypernymy relationship are linked in a hierarchical structure, as shown in Figure 1.

FIGURE 1 Example of hyponymy/hypernymy tree.

It could happen that the two users $u_1$ and $u_2$ discuss themes that belong to the same hierarchy, as in the case of Mark = {(sport, 10)} and Jean = {(soccer, 8)}.
In this case, with $u_1: (s, if_{u_1,s})$ and $u_2: (t, if_{u_2,t})$, we compute the similarity $S_{u_1,u_2,s,t}$ between user $u_1$ and user $u_2$ on themes $s$ and $t$ as:

$$S_{u_1,u_2,s,t} = |if_{u_1,s} - if_{u_2,t} + 1| \cdot a$$

where $a = k \cdot n$, $k$ is a constant, and $n$ is the number of segments of the minimum path between $s$ and $t$ that passes through the root of the hypernymy/hyponymy hierarchical tree. In our experimentation we set $k = 0.1$. In the case of the example in Figure 1, if $s$ is soccer and $t$ is sport, then $a = 0.1$; if $s$ is 5-players soccer and $t$ is basketball, then $a = 0.3$. The themes that are not common between the two users under analysis and that do not belong to the same hierarchy are not included in the computation of trust.

Let us see how the if of a conversation is computed. A conversation between the user $u_1$ and the user $u_2$ is composed of a set of messages, and each message contains a set of terms. The value of if expresses how much that set of messages is consistent with a certain theme $t$. In other words, if $t$ is soccer, if measures how much that conversation concerns soccer. Each theme is associated with a thesaurus that contains the key terms for that theme. First, the if for a message $i$ of a conversation is computed as:

$$if_{i,t} = \frac{OccurrencesOfKeyTerms}{MeaningfulWords}$$

where OccurrencesOfKeyTerms is the sum of the occurrences of each key term belonging to the thesaurus of the theme $t$ and present in the message, and MeaningfulWords is the total number of words, excluding stop words. The ifp, that is, the if of a single conversation ($p$ stands for partial) for a certain theme $t$ for a user, is the arithmetic mean of the $if_{i,t}$ of all the $K$ messages pertinent to theme $t$ and belonging to the same conversation. The $K$ messages are both the ones written by the user under analysis and the ones received by the user in the same conversation. The total information flow of the user $u_1$ on the theme $t$ is the arithmetic mean of the partial ifs computed on the $M$ conversations of the user $u_1$ concerning the theme $t$:

$$if_{u_1,t} = \frac{1}{M}\sum_{j=1}^{M} ifp_j$$

We now compute the trust between two users, $u_1$ and $u_2$:

$$u_1: \{(s, if_{u_1,s}); (t, if_{u_1,t}); (v, if_{u_1,v}); \ldots; (k, if_{u_1,k})\}$$
$$u_2: \{(s, if_{u_2,s}); (j, if_{u_2,j}); (v, if_{u_2,v}); \ldots; (n, if_{u_2,n})\}$$

We define the set of all themes of the users $u_1$ and $u_2$, namely the interest themes of a user, as follows:

$$themes_{u_1} = \{s, t, v, \ldots, k\}$$
$$themes_{u_2} = \{s, j, v, \ldots, n\}$$

The number of themes can vary among different users; some themes can appear in both sets, while other themes can be part of a same hypernymy/hyponymy hierarchical tree. We compute the common themes of users $u_1$ and $u_2$ as:

$$CT_{u_1,u_2} = \{t, s \mid t \in themes_{u_1} \wedge s \in themes_{u_2} \wedge ((t \equiv s) \vee (t \text{ and } s \text{ belong to the same hypernymy/hyponymy tree}))\}$$

The cumulative trust for a couple of users is computed by considering all the $m$ themes in $CT_{u_1,u_2}$:
$$CTRUST_{u_1,u_2} = \frac{\sum_{j,k=1}^{m} |if_{u_1,t_j} - if_{u_2,t_k}|}{m}, \quad \text{where } t_j \text{ and } t_k \text{ are equal themes;}$$

$$CTRUST_{u_1,u_2} = \frac{\sum_{j,k=1}^{m} |if_{u_1,t_j} - if_{u_2,t_k} + 1| \cdot a}{m}, \quad \text{where } t_j \text{ and } t_k \text{ are comparable themes.}$$



For a community of $K$ users, it is possible to define the trustworthiness of each user $u_i$ as:

$$Trust_{u_i} = \sum_{j=1,\, j \neq i}^{K} CTRUST_{u_i,u_j}$$
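To make the definitions above concrete, here is a minimal sketch of the model's computations. It is ours, not the authors' tool: the thesauri, stop-word list, and hypernymy tree are illustrative placeholders, and user profiles are plain dictionaries mapping themes to if values.

```python
# Sketch of the trust model of Section 4 (ours, not the authors' tool).
# Thesauri, stop words, and the hypernymy tree are illustrative placeholders.
STOP_WORDS = {"the", "a", "of", "and", "to", "is", "in"}
THESAURI = {
    "soccer": {"goal", "striker", "match", "referee"},
    "music":  {"guitar", "drums", "concert", "album"},
}
PARENT = {"soccer": "sport", "basketball": "sport", "sport": None}  # child -> parent
K_CONST = 0.1  # the constant k used in the paper's experimentation

def message_if(words, theme):
    """if_{i,t}: occurrences of the theme's key terms over meaningful words."""
    meaningful = [w for w in words if w not in STOP_WORDS]
    hits = sum(1 for w in meaningful if w in THESAURI[theme])
    return hits / len(meaningful) if meaningful else 0.0

def conversation_ifp(messages, theme):
    """ifp: arithmetic mean of if_{i,t} over the messages of one conversation."""
    scores = [message_if(m, theme) for m in messages]
    return sum(scores) / len(scores) if scores else 0.0

def total_if(conversations, theme):
    """if_{u,t}: arithmetic mean of the partial ifp over the M conversations."""
    partials = [conversation_ifp(c, theme) for c in conversations]
    return sum(partials) / len(partials) if partials else 0.0

def path_segments(s, t):
    """n: segments of the path between themes s and t through the tree root."""
    def depth(x):
        d = 0
        while PARENT.get(x) is not None:
            x, d = PARENT[x], d + 1
        return d
    return depth(s) + depth(t)  # e.g., soccer-sport: 1; soccer-basketball: 2

def ctrust(profile1, profile2):
    """CTRUST: mean similarity over equal or comparable themes of two users."""
    terms = []
    for s, if1 in profile1.items():
        for t, if2 in profile2.items():
            if s == t:                         # equal themes
                terms.append(abs(if1 - if2))
            elif s in PARENT and t in PARENT:  # same hierarchical tree
                a = K_CONST * path_segments(s, t)
                terms.append(abs(if1 - if2 + 1) * a)
    return sum(terms) / len(terms) if terms else 0.0

def community_trust(profiles, user):
    """Trust_{u_i}: sum of CTRUST against every other member of the community."""
    return sum(ctrust(profiles[user], p) for u, p in profiles.items() if u != user)
```

With the single tree assumed above, `path_segments` reproduces the paper's examples: one segment between soccer and its hypernym sport, and a longer path, through the root, between two hyponyms of sport.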

5. A CASE STUDY
We performed a case study that addresses two research questions:
RQ1: Is the model able to correctly identify the trust relationships within a network of friends?
RQ2: Does the thesaurus size increase the accuracy of the model's measures?


5.1. Case study design

The study consisted of observing the behavior of a group of WBSN members and comparing the trust relationships computed by the model with the real-world relationships. To determine the actual trust relationships among the experimental subjects, we selected a group of friends belonging to the Facebook personal social network of one of the authors (in the following, the experimenter). In this way, we were able to select the semantic domains most appropriate to the profiles of the subjects. We used five domains, namely soccer, music, politics, shoes, and cinema, selected for convenience. Not all the subjects were fond of all the arguments; we knew that the experimenter considered some of them trustworthy for some arguments and not for others. Of course, given the subjectivity of trust, the subjects the experimenter considered trustworthy were not necessarily considered trustworthy by the other subjects. An assessment questionnaire helped to establish the actual trust among subjects. Two runs were performed. The first was an in-vitro run and considered conversations that were stimulated by the authors. In particular, as the experimenter knew that certain arguments could stir the curiosity of certain subjects, he started specific conversations with some of them in different ways: posting messages on his own home page and tagging those subjects, or posting those messages on the friends' home pages. Such conversations are called drugged in the remainder of the paper. The second run was in-vivo. We analyzed the home pages of the subjects for a certain timeframe and identified the messages concerning the five topics of observation. Such conversations are called undrugged in the remainder of the paper. Each run lasted five months, for a total observation of ten months.

5.1.1. Assessment questionnaire

At the end of the observation period, each subject received a questionnaire aimed at evaluating the actual trust relationships. An excerpt of it, properly anonymized, is shown in Figure 2. Trust was evaluated in a binary form: I trust or do not trust a buddy of mine. This is consistent with the context of this paper: if I accept a friend in my network then I trust her, and if I do not trust her I do not accept her. However, our model is able to evaluate trust as a continuous value for application in other contexts.

FIGURE 2 An excerpt of the assessment questionnaire. For each buddy in the friends list (e.g., Mark Smith, Vera Jackson, Noam Hancock) and for each theme (soccer, music, politics, shoes, cinema), the respondent puts a cross on either "a) very much" or "b) not at all."

5.1.2. Sampling of subjects and preparation


Twenty-three subjects took part in the study. They were not informed about the study before the observation but only after it, to avoid that awareness of the study could alter their behavior, and thus the validity of the study. At the end of the study, we collected the subjects' approval to use the collected data set. Personal knowledge of each subject helped to reinforce confidence in the questionnaire results.

5.1.3. Metrics
The metrics applied were the ones introduced in Section 4. For each metric, the false positives, the false negatives, and the correct values were computed. As the metrics vary in a continuous interval, a threshold was established to decide which values correspond to a trust relationship and which do not. The authors developed a software tool that collects information from the social network pages of the subjects and computes the values.
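A hedged sketch of this evaluation step follows; the threshold value and data layout are our assumptions, since the paper does not report the threshold it used.

```python
# Sketch of the evaluation step (threshold and data layout are assumptions).
# CTRUST is distance-like: values close to zero indicate a trust relationship.
THRESHOLD = 0.5  # hypothetical cut-off; the paper does not report its value

def evaluate(computed, actual, threshold=THRESHOLD):
    """computed: {pair: CTRUST value}; actual: {pair: True if really trusted}."""
    fp = fn = correct = 0
    for pair, value in computed.items():
        predicted = value <= threshold  # small distance => predict trust
        if predicted == actual[pair]:
            correct += 1
        elif predicted:
            fp += 1  # model predicts trust, questionnaire says otherwise
        else:
            fn += 1  # model rejects a person who is actually trusted
    return fp, fn, correct
```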

5.2. Analysis of results


With regard to RQ1, from the results of the study (Tables 1 and 2) it emerges that the proposed method is able to correctly evaluate trust, as it identifies the actual trust relationships among the community's members. False negatives are kept consistently low: constantly at 0% in the drugged conversations, and between 0% and 18% in the undrugged conversations. False negatives can be a problem, but they are less serious than false positives. In the case of a false negative, one risks rejecting a person who is potentially trustworthy; this is inconvenient, especially for WBSN management, which should encourage socialization and the exploration of new friends, but it is not dangerous. In the case of a false positive, the problem is more serious, as a person who is potentially risky is classified as trustworthy. Unfortunately, the model produces a number of false positives that is relatively high, with a peak of 47% in the worst case. The number of correct values is more than 50% of the sample; in two cases it is around 90%, while in one case it is 100%.
TABLE 1 Results of the drugged conversations (FP = false positives, FN = false negatives)

Thesaurus Size   CINEMA (FP / FN / Correct)   MUSIC (FP / FN / Correct)   SHOES (FP / FN / Correct)
30               20% / 0% / 80%               30% / 0% / 70%              30% / 0% / 70%
60               10% / 0% / 90%               47% / 0% / 53%              45% / 0% / 55%
90               10% / 0% / 90%               47% / 0% / 53%              45% / 0% / 55%
TABLE 2 Results of the undrugged conversations (FP = false positives, FN = false negatives)

Thesaurus Size   CINEMA (FP / FN / Correct)   POLITICS (FP / FN / Correct)   SHOES (FP / FN / Correct)   SOCCER (FP / FN / Correct)
30               10% / 0% / 90%               30% / 0% / 70%                 32% / 0% / 68%              20% / 10% / 70%
60               10% / 0% / 90%               19% / 0% / 81%                 32% / 0% / 68%              19% / 10% / 71%
90               10% / 0% / 90%               0% / 0% / 100%                 30% / 0% / 70%              10% / 18% / 72%
The accuracy of the method can be improved by increasing the thesaurus size, as shown in Tables 1 and 2. The method evaluates that a certain conversation regards the argument soccer by counting the occurrences of keywords collected in the thesaurus of the soccer theme. Consequently, having a larger number of keywords in the thesaurus increases the possibility of correctly classifying the conversations. We believe that the low number of false positives obtained in the soccer domain is explained by the fact that the corresponding thesaurus included many nouns, such as names of players, teams, and stadiums, which have no synonyms in other domains (synonyms are a main cause of false positives). The length of messages also seems to affect results: a likely reason why the cinema theme has the highest number of correct values is that posts falling in this category contained the largest number of words. For the same reason, politics also produced a high number of correct values. Conversely, in the other domains, posts were usually short sentences about the performances of a certain player or about how users appreciated certain kinds of shoes. The performance of the method is better when we observe a higher number of messages, as emerges from Figure 3: the smaller the number of messages, the higher the probability that the measure is affected by error, that is, that the subject is talking about the argument soccer incidentally and not because soccer is a favorite argument of discussion. There are several other dimensions that could influence results, including the objectiveness or subjectiveness of the content, whether one theme is more factual, and the amount of passion/interest generated by those involved in the discussions. Further research is needed to find suitable ways to measure and assess the influence of these factors.

To answer RQ2, we computed correlations between the number of messages and the number of correct values evaluated at different thesaurus sizes. We performed the Spearman test in order to evaluate the correlation between the number of correct values and the number of messages analyzed; we performed this test with three different thesaurus sizes: 30, 60, and 90 terms.
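Such a rank correlation could be computed, for example, with SciPy; the arrays below are made-up placeholders for the per-subject counts, not the study's data.

```python
from scipy.stats import spearmanr

# Illustrative placeholders: x = number of messages analyzed per subject,
# y = number of correct values obtained for that subject (made-up data).
x = [12, 30, 45, 7, 22, 51]
y = [5, 14, 25, 2, 10, 28]

rho, p_level = spearmanr(x, y)
print(f"Spearman R = {rho:.2f}, p-level = {p_level:.3f}")
```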

TABLE 3 Correlation between the number of messages and the number of correct values: results of the Spearman R-test

Thesaurus size   Spearman R   p-level
30               0.31         0.491
60               0.48         0.268
90               0.82         0.023
All data         0.54         0.010

As shown in Table 3, only when the thesaurus size is 90 do we have a correlation with statistical significance. The results of the study produced another interesting outcome: activity can be a proxy measure to identify trustworthy people. Activity is intended as the number of messages sent and received in a timeframe by the user. There may be only two reasons why a member has low activity in the social network: either that member does not want to be involved in conversation with other members (the member does not want to build a trust relationship with the community), or other members do not want to involve that member in conversation (the community does not recognize that member as trustworthy). In both cases, and this is our conjecture, a low level of activity is related to a low level of trust. To test this conjecture, we performed a Spearman test (p-level < .05) for evaluating the correlation between activity and if. Table 4 does not provide any statistical evidence for this conjecture in each sample segmentation, but evidence is obtained if we execute the test on all the observed data: this fact makes us believe that more data points are needed.

TABLE 4 Correlation between the if and the activity metrics: results of the Spearman test

Thesaurus size   Spearman R   p-level
30               0.40         0.315
60               0.55         0.666
90               0.23         0.301
All data         0.36         0.005

FIGURE 3 The number of correct values increases with the number of messages. (Color figure available online.)

6. CONCLUSIONS

The main weakness of the method is that it relies on human language, which presents some properties that make it hard for a software agent to analyze: ambiguity, the use of slang, the use of irony or sarcasm, and, of course, spelling or typing errors (which in many cases are easily corrected by the human brain but not, with the same precision, by a software program). The main strength is that the method could offer a reliable measure of trust, based on the history of members' activities in the social network, and consequently cannot be hampered easily.

There are several directions in which the work presented in this paper can be extended. A first area of investigation is concerned with the identification of mechanisms for determining trust directionality, that is, which user is more trusted; for example, a teacher communicating with a student, or an expert communicating with a novice. A factor to investigate is whether one individual is regularly more dominant in conversations for a given theme. If several people converse with Mark about soccer, and he always contributes far more or far longer messages, it might be a sign that he is a trusted expert. A second area of investigation is aimed at the definition of mechanisms for protecting the system against automated messages, which could alter the real trust relations. There are different lines of action that could be explored, including checking timestamps of messages to determine whether they were posted faster than humanly possible and detecting posts that contain exactly the same text sent to different people (see the sketch below). A key aspect to be improved in future work regards the determination of the context and meaning of a conversation, with the aim of establishing whether messages are a legitimate part of it or just users/bots trying to conduct keyword bombing to raise their trust level on a given theme. We are developing a method based on Bayesian nets and thesauri for assigning a post to a semantic context.
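The two checks just mentioned could look like the following minimal sketch of ours; the time threshold and the post structure are illustrative assumptions, not part of the paper.

```python
from collections import defaultdict

MIN_SECONDS_BETWEEN_POSTS = 2.0  # illustrative "humanly possible" bound

def flag_automated(posts):
    """posts: list of (author, timestamp_seconds, text, recipient) tuples.
    Flags authors who post faster than humanly possible or who send the
    exact same text to different people."""
    flagged = set()
    last_time = {}
    recipients_of = defaultdict(set)  # (author, text) -> set of recipients
    for author, ts, text, recipient in sorted(posts, key=lambda p: p[1]):
        if author in last_time and ts - last_time[author] < MIN_SECONDS_BETWEEN_POSTS:
            flagged.add(author)  # posted faster than a human could type
        last_time[author] = ts
        recipients_of[(author, text)].add(recipient)
        if len(recipients_of[(author, text)]) > 1:
            flagged.add(author)  # identical text sent to different people
    return flagged
```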

REFERENCES
Backstrom, L., Dwork, C., and Kleinberg, J. (2007). Wherefore art thou R3579X? Anonymized social networks, hidden patterns, and structural steganography. World Wide Web Conference, May 2007.
Bearman, P., Moody, J., and Stovel, K. (2004). Chains of affection: The structure of adolescent romantic and sexual networks. American Journal of Sociology, 110(1), 44–91.
Bilge, L., Strufe, T., Balzarotti, D., and Kirda, E. (2009). All your contacts are belong to us: Automated identity theft attacks on social networks. 18th International World Wide Web Conference, April.
Brickley, D. and Miller, L. (2007). FOAF vocabulary specification 0.91. Namespace document. Retrieved from http://xmlns.com/foaf/0.1
Brown, G., Howe, T., Ihbe, M., Prakash, A., and Borders, K. (2008). Social networks and context-aware spam. Proceedings of the ACM 2008 Conference on Computer Supported Cooperative Work, pp. 403–412.
Castelfranchi, C. and Falcone, R. (1998). Principles of trust for MAS: Cognitive anatomy, social importance, and quantification. Proceedings of the 3rd International Conference on Multi Agent Systems, Paris, France.
Castelfranchi, C. and Falcone, R. (2002). Social trust: A cognitive approach. In C. Castelfranchi and Y.-H. Tan (Eds.), Trust and deception in virtual societies. Dordrecht, Holland: Kluwer Academic Publishers.
Choi, H.-C., Kruk, S. R., Grzonkowski, S., Stankiewicz, K., Davis, B., and Breslin, J. G. (2006). Trust models for community-aware identity management. Proceedings of the Identity, Reference, and the Web Workshop (IRW06). Retrieved from http://www.ibiblio.org/hhalpin/irw2006/skruk.pdf
Ciubotariu, M. (2008). W32.Koobface.A. Symantec, August 2008.
Cook, K. (Ed.). (2001). Trust in society. New York, NY: Russell Sage Foundation.
Ding, L., Zhou, L., Finin, T. W., and Joshi, A. (2005). How the semantic Web is being used: An analysis of FOAF documents. Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS'05), IEEE, Los Alamitos, CA.
Fono, D. and Raynes-Goldie, V. (2007). Hyperfriends and beyond: Friendship and social norms on LiveJournal. Internet Research Annual, Vol. 4: Selected papers from the Association of Internet Researchers Conference.
Golbeck, J. A. (2005). Computing and applying trust in Web-based social networks. Unpublished doctoral dissertation, Graduate School of the University of Maryland, College Park.
Gross, R., Acquisti, A., and Heinz, H. (2005). Information revelation and privacy in online social networks. Workshop on Privacy in the Electronic Society, Alexandria, VA.
Hardin, R. (2002). Trust & trustworthiness. New York, NY: Russell Sage Foundation.
Jøsang, A. (1996). The right type of trust for distributed systems. Proceedings of the 1996 New Security Paradigms Workshop, Lake Arrowhead, CA.
Jøsang, A., Ismail, R., and Boyd, C. (2007). A survey of trust and reputation systems for online service provision. Decision Support Systems, 43(2), 618–644.
Kamkar, S. (2005). Technical explanation of the MySpace worm.
Kamvar, S. D., Schlosser, M. T., and Garcia-Molina, H. (2003). The EigenTrust algorithm for reputation management in P2P networks. Proceedings of the 12th International World Wide Web Conference, Budapest, Hungary.
Kruk, S. R., Grzonkowski, S., Choi, H.-C., Woroniecki, T., and Gzella, A. (2006). D-FOAF: Distributed identity management with access rights delegation. Proceedings of the Asian Semantic Web Conference (ASWC06), Berlin: Springer, pp. 140–154.
Levin, R. and Aiken, A. (1998). Attack-resistant trust metrics for public key certification. 7th USENIX Security Symposium, San Antonio, TX.
Marsh, S. (1994). Formalising trust as a computational concept. Unpublished doctoral dissertation, Department of Mathematics and Computer Science, University of Stirling.
Narayanan, A. and Shmatikov, V. (2009). De-anonymizing social networks. Retrieved from http://www.cs.utexas.edu/~shmat/shmat_oak09.pdf
Page, L., Brin, S., Motwani, R., and Winograd, T. (1998). The PageRank citation ranking: Bringing order to the Web. Tech. rep., Stanford University, Stanford, CA.
Richardson, M., Agrawal, R., and Domingos, P. (2003). Trust management for the semantic Web. Proceedings of the 2nd International Semantic Web Conference, Sanibel Island, FL.
Robertson, M., Pan, Y., and Yuan, B. (2010). A social approach to security: Using social networks to help detect malicious Web content. Proceedings of the 2010 International Conference on Intelligent Systems and Knowledge Engineering (ISKE2010), November, Hangzhou, China.
Sinha, R. and Swearingen, K. (2001). Comparing recommendations made by online systems and friends. Proceedings of the DELOS-NSF Workshop on Personalization and Recommender Systems in Digital Libraries, Dublin, Ireland.
Swearingen, K. and Sinha, R. (2001). Beyond algorithms: An HCI perspective on recommender systems. Proceedings of the ACM SIGIR 2001 Workshop on Recommender Systems, New Orleans, LA.
Winter, J. (2008). Disgraced former NBA referee Tim Donaghy's phone calls to second ref raise questions. Retrieved from http://www.foxnews.com/story/0,2933,381842,00.html
Yaniv, I. and Kleinberger, E. (2000). Advice taking in decision making: Egocentric discounting and reputation formation. Organizational Behavior and Human Decision Processes, 83(2), 260–281.
Ziegler, C. N. and Lausen, G. (2004a). Spreading activation models for trust propagation. Proceedings of the IEEE International Conference on E-Technology, E-Commerce, and E-Service, Taipei, Taiwan.
