
Modeling and Aggregating Social Network Data

IFETCE\M.E CSE\III SEM\NE7012-SNA\UNIT 2-PPT 1


Outline

 Introduction
 Network Data Representation
 Semantic-based Representation
 Ontological Representation of Social Individuals
 Ontological Representation of Social Relationships
 Aggregating and Reasoning with Social Network Data



Introduction

 Two fundamental reasons for developing semantic-based representations:
 Maintaining the semantics of social network data is crucial for aggregating social network information
 Semantic-based representations facilitate the exchange and reuse of case study data in the academic field of Social Network Analysis
 The current state of the art in network analysis ignores the semantics of data
 As a result, it is difficult to verify results independently, to carry out secondary analysis, and to compare results across different studies
State-of-the-art in Network Data Representation

 The most common representation is the graph
 Matrices are used for attribute data
 A number of formats exist for serializing such graphs and attribute data in machine-processable electronic documents
 Text-based formats: Pajek and UCINET
 dot (Graphviz) and GraphML (XML for graphs)
 These formats do not support the aggregation and reuse of network data
 Key challenges: identification and disambiguation



A simple graph described in Pajek, UCINET and GraphML formats
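As a rough illustration of two of these serializations, the sketch below writes the same small graph in Pajek's text layout and in GraphML. The three-person network and the helper names `to_pajek` and `to_graphml` are assumptions made for this example. Only the basic `*Vertices` / `*Edges` sections of Pajek and the core `graph`, `node` and `edge` elements of GraphML are used.

```python
import xml.etree.ElementTree as ET

# Hypothetical three-person network, used only for illustration.
nodes = ["alice", "bob", "carol"]
edges = [("alice", "bob"), ("bob", "carol")]

def to_pajek(nodes, edges):
    """Serialize an undirected graph in Pajek's text-based layout."""
    index = {name: i for i, name in enumerate(nodes, start=1)}  # vertex ids start at 1
    lines = [f"*Vertices {len(nodes)}"]
    lines += [f'{index[name]} "{name}"' for name in nodes]
    lines.append("*Edges")
    lines += [f"{index[a]} {index[b]}" for a, b in edges]
    return "\n".join(lines)

def to_graphml(nodes, edges):
    """Serialize the same graph with the core GraphML elements."""
    root = ET.Element("graphml", xmlns="http://graphml.graphdrawing.org/xmlns")
    graph = ET.SubElement(root, "graph", id="G", edgedefault="undirected")
    for name in nodes:
        ET.SubElement(graph, "node", id=name)
    for a, b in edges:
        ET.SubElement(graph, "edge", source=a, target=b)
    return ET.tostring(root, encoding="unicode")

pajek_text = to_pajek(nodes, edges)
graphml_text = to_graphml(nodes, edges)
print(pajek_text)
print(graphml_text)
```

Both outputs identify vertices only by local ids or labels, which is why files from different studies cannot be merged without further knowledge about who the labels refer to.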



Motivation for data aggregation

 One of the common reasons to use multiple data sources is to perform triangulation
 Triangulation uses a variety of data sources and/or methods of analysis to verify the same conclusion
 We need to be able to recognize matching instances in the different data sources and merge the records before we can proceed with the analysis
 We need a representation that allows us to capture and compare the identity of instances and relationships
 Maintaining the identity of individuals and relationships is also crucial
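The merge step can be sketched in a few lines. This is a minimal illustration, assuming that a normalized e-mail address is an acceptable identifier for a person. The record layout, the sample data and the helper names `normalize_email` and `merge_sources` are hypothetical.

```python
def normalize_email(email):
    """Normalize an address so that trivially different spellings match."""
    return email.strip().lower()

def merge_sources(*sources):
    """Merge person records from several sources that share an e-mail address."""
    merged = {}
    for source in sources:
        for record in source:
            key = normalize_email(record["email"])
            person = merged.setdefault(key, {"email": key, "names": set(), "ties": set()})
            person["names"].add(record["name"])
            person["ties"].update(record.get("ties", []))
    return merged

# Two hypothetical data sources describing overlapping people.
survey = [
    {"name": "John Smith", "email": "J.Smith@example.org", "ties": ["mary@example.org"]},
]
email_log = [
    {"name": "J. Smith", "email": "j.smith@example.org ", "ties": ["ann@example.org"]},
    {"name": "John Smith", "email": "john@example.com", "ties": []},
]

people = merge_sources(survey, email_log)
for key, person in people.items():
    print(key, sorted(person["names"]), sorted(person["ties"]))
```

Note that the two records named "John Smith" stay separate because their addresses differ, while "John Smith" and "J. Smith" are merged. Matching on names alone would get both cases wrong.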
Example of a Case of Identity Reasoning



Semantic-based Representation

 Goal: a rich semantic-based representation of the primary objects in social network data
 A semantic-based representation allows us to wield the power of ontology languages and tools in aggregating data sets through domain-specific knowledge about identity
 An additional advantage is that, at the same time, we can easily enrich our data set with specific domain knowledge
 Key problems in aggregating social network data are:
 Identification and disambiguation of social individuals
 Aggregation of information about social relationships



What is Ontology?



Ontological Representation of Social Individuals

 FOAF (Friend of a Friend) is an example of an ontological representation of individuals
 It is an OWL-based format for representing personal information and an individual’s social network
 FOAF greatly surpasses graph description languages in expressivity by using the powerful OWL vocabulary to characterize individuals
 It eliminates the drawbacks of early social networking services such as Friendster and Orkut
 The early social networks had centralized control and were difficult to manage
 FOAF is distributed and has a rich ontology to characterize individuals
Drawbacks to centralized social networking services

 The information is under the control of the database owner
 Profiles cannot be exported in machine-processable formats
 Data cannot be transferred from one system to the next
 Centralized services do not allow users to control the information they provide on their own terms



FOAF Profiles

 FOAF profiles are created and controlled by the individual user and shared in a distributed fashion
 FOAF profiles are typically posted on the personal website of the user and linked from the user’s homepage, by convention through an HTML link element with rel="meta"
 Distributed nature
 FOAF uses the rdfs:seeAlso mechanism to link individual profiles and thus allow the discovery of related profiles
 FOAF addresses the issues of identification and aggregation with the foaf:Person class
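Discovery through rdfs:seeAlso amounts to a graph traversal over linked profiles. The sketch below shows the idea as a breadth-first crawl. It assumes an in-memory dictionary in place of real HTTP fetching and RDF parsing, and the URLs and the function name `discover_profiles` are hypothetical.

```python
from collections import deque

# Hypothetical stand-in for the web: profile URL -> rdfs:seeAlso targets found in it.
SEE_ALSO = {
    "http://example.org/alice/foaf.rdf": ["http://example.org/bob/foaf.rdf"],
    "http://example.org/bob/foaf.rdf": [
        "http://example.org/carol/foaf.rdf",
        "http://example.org/alice/foaf.rdf",
    ],
    "http://example.org/carol/foaf.rdf": [],
    "http://example.org/dave/foaf.rdf": [],  # not linked from anyone
}

def discover_profiles(seed, see_also):
    """Breadth-first crawl that follows seeAlso links starting from one profile."""
    seen = {seed}
    queue = deque([seed])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for target in see_also.get(url, []):
            if target not in seen:  # avoid revisiting profiles that link back
                seen.add(target)
                queue.append(target)
    return order

found = discover_profiles("http://example.org/alice/foaf.rdf", SEE_ALSO)
print(found)
```

A profile that nobody links to, like the hypothetical `dave` entry, is never reached. This is the practical cost of having no central registry.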



An FOAF Profile
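A minimal sketch of what such a profile can contain, written as a Python helper that emits Turtle. The terms foaf:Person, foaf:name, foaf:mbox, foaf:knows and rdfs:seeAlso come from the FOAF and RDFS vocabularies. The person, the URLs and the helper name `foaf_profile` are assumptions for illustration, and the helper does not escape quotes in names.

```python
def foaf_profile(person_uri, name, mbox, knows=()):
    """Return a small FOAF profile as a Turtle string.

    knows is an iterable of (friend_name, friend_profile_url) pairs.
    """
    statements = [
        "a foaf:Person",
        f'foaf:name "{name}"',
        f"foaf:mbox <mailto:{mbox}>",
    ]
    for friend_name, friend_profile in knows:
        # Each acquaintance is a blank node pointing to that person's own profile.
        statements.append(
            f'foaf:knows [ a foaf:Person ; foaf:name "{friend_name}" ; '
            f"rdfs:seeAlso <{friend_profile}> ]"
        )
    body = " ;\n    ".join(statements)
    return (
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n"
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n\n"
        f"<{person_uri}>\n    {body} .\n"
    )

profile = foaf_profile(
    "http://example.org/alice#me",
    "Alice Example",
    "alice@example.org",
    knows=[("Bob Example", "http://example.org/bob/foaf.rdf")],
)
print(profile)
```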



Benefits of FOAF

 An advantage of FOAF in terms of sharing FOAF data is the relative stability of the ontology
 To facilitate adoption, terms are not added to the vocabulary; rather, authors are encouraged to create extensions using the mechanisms of RDF, e.g. creating subclasses and subproperties and adding new properties to existing classes
 The terms of the FOAF vocabulary, and foaf:Person in particular, are also often referenced in other ontologies
 SIOC (Semantically-Interlinked Online Communities) project
 DOAP (Description of a Project) ontology
 BuRST format
FOAF limitations

 FOAF has a poor vocabulary for describing relationships
 There is a single foaf:knows relationship defined between Persons, and this relationship is only loosely defined
 The remedy is to use the extensibility of the RDF/OWL language to define more precise notions of relationships
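One way to picture such an extension is as a small property hierarchy under foaf:knows, linked with rdfs:subPropertyOf. The sketch below applies the usual subproperty inference by hand. The `ex:` properties and the helper names are hypothetical, and the hierarchy is assumed to be acyclic with a single parent per property.

```python
# Hypothetical extension vocabulary: property -> its direct rdfs:subPropertyOf parent.
SUB_PROPERTY_OF = {
    "ex:friendOf": "foaf:knows",
    "ex:colleagueOf": "foaf:knows",
    "ex:closeFriendOf": "ex:friendOf",
}

def super_properties(prop, hierarchy):
    """Yield every ancestor of prop by following subPropertyOf links upward."""
    while prop in hierarchy:
        prop = hierarchy[prop]
        yield prop

def infer(triples, hierarchy):
    """Add the triples implied by the subproperty hierarchy."""
    inferred = set(triples)
    for subject, prop, obj in triples:
        for parent in super_properties(prop, hierarchy):
            inferred.add((subject, parent, obj))
    return inferred

triples = {
    ("ex:alice", "ex:closeFriendOf", "ex:bob"),
    ("ex:bob", "ex:colleagueOf", "ex:carol"),
}
closure = infer(triples, SUB_PROPERTY_OF)
for triple in sorted(closure):
    print(triple)
```

Tools that only understand foaf:knows still see both ties after inference, while tools aware of the extension keep the finer distinction.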



Ontological Representation of Social Relationships
 Social network vocabularies such as FOAF need to be extended to support relationships
 To support the integration of social information
 To integrate and aggregate multiple social networks
 Properties of relationships:
 Sign: positive or negative relationships
 Strength (e.g., frequency of contact)
 Provenance (different ways of viewing relationships)
 Relationship history
 Relationship roles
 Conceptual models for social data: semantic net, RDF
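The properties listed above suggest treating a relationship as an object in its own right instead of a bare edge. A minimal sketch under that assumption follows. The class `Relationship`, its field names and the sample reports are hypothetical and do not come from any published ontology.

```python
from dataclasses import dataclass, field

@dataclass
class Relationship:
    """A relationship as a first-class object with its own properties."""
    source: str
    target: str
    sign: int = 1                   # +1 for a positive tie, -1 for a negative one
    strength: float = 0.0           # e.g. normalized frequency of contact
    provenance: str = "unknown"     # who or what reported the tie
    roles: dict = field(default_factory=dict)    # participant -> role
    history: list = field(default_factory=list)  # (date, event) pairs

def views_by_provenance(relationships, source, target):
    """Collect the signed strength of one tie as seen by each data source."""
    views = {}
    for rel in relationships:
        if (rel.source, rel.target) == (source, target):
            views[rel.provenance] = rel.sign * rel.strength
    return views

reports = [
    Relationship("ex:alice", "ex:bob", sign=1, strength=0.8, provenance="survey",
                 roles={"ex:alice": "mentor", "ex:bob": "student"},
                 history=[("2005", "met at a workshop")]),
    Relationship("ex:alice", "ex:bob", sign=-1, strength=0.5, provenance="email-log"),
]
views = views_by_provenance(reports, "ex:alice", "ex:bob")
print(views)
```

Keeping provenance on each report lets conflicting views of the same tie coexist until the analyst decides how to reconcile them.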
Aggregating and Reasoning with Social Network Data
 Representing identity
 URI (Uniform Resource Identifier)
 Disambiguation (A and B are the same; there are two people called John Smith)
 OWL has the owl:sameAs property
 Equality
 The property owl:sameAs is reflexive, symmetric and transitive
 Description logic vs. rule-based reasoners
 Rule-based reasoners use forward chaining and backward chaining
 Description logic reasoners are used for classification and for checking ontology consistency
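Because owl:sameAs is reflexive, symmetric and transitive, a set of sameAs statements partitions identifiers into equivalence classes. One common way to compute those classes is a union-find structure, sketched below. This is an illustrative technique, not the method of any particular reasoner, and the `ex:` identifiers are hypothetical.

```python
def same_as_classes(statements):
    """Group identifiers into equivalence classes from (a, b) sameAs pairs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)            # reflexivity: every id is its own class
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in statements:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b        # union: order of a and b does not matter

    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), set()).add(x)
    return sorted(sorted(members) for members in classes.values())

statements = [
    ("ex:jsmith-survey", "ex:jsmith-email"),
    ("ex:jsmith-web", "ex:jsmith-email"),
    ("ex:mary-survey", "ex:mary-web"),
]
classes = same_as_classes(statements)
print(classes)
```

Here `ex:jsmith-survey` and `ex:jsmith-web` end up in the same class although no statement links them directly, which is the transitive part of the closure.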
The benefits: modelling & aggregation

 Explicit
 RDF/OWL allow us to express and reason with what it means for two things to be the same (smushing)
 Extensible
 Designed to be distributed both in terms of schema and data
 Mappings between different schemas can also be expressed in the language
 Flexible
 Mappings can be partial, which provides robustness
 Standard
 Standard languages (RDFS, OWL, SPARQL)
 Standard vocabularies (DC, PRISM, SWRC)
 Standard protocols (SPARQL)
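The point about partial mappings can be illustrated with a small translation step. In the sketch below, properties with a known mapping are rewritten and everything else is carried along unchanged, so nothing is lost. The `src:` property names, the mapping table and the function `translate` are hypothetical, and a plain dictionary stands in for a mapping that would normally be expressed in RDF/OWL itself.

```python
# Hypothetical partial mapping from a source schema to FOAF terms.
MAPPING = {
    "src:fullName": "foaf:name",
    "src:email": "foaf:mbox",
}

def translate(record, mapping):
    """Rewrite mapped properties and keep unmapped ones unchanged."""
    translated, unmapped = {}, {}
    for prop, value in record.items():
        if prop in mapping:
            translated[mapping[prop]] = value
        else:
            unmapped[prop] = value  # kept as-is, so a partial mapping loses nothing
    return translated, unmapped

record = {
    "src:fullName": "Alice Example",
    "src:email": "alice@example.org",
    "src:shoeSize": 38,
}
translated, unmapped = translate(record, MAPPING)
print(translated)
print(unmapped)
```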
The drawbacks

 Limited expressivity
 e.g. complex inverse functional properties
 e.g. swrc:page, prism:startingPage and prism:endingPage
 Ontology-based interchange is still partly social engineering
 Scalability
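One possible reading of the "complex inverse functional properties" limitation is that identity sometimes depends on a combination of properties, which a single inverse functional property cannot state. Under that assumption, the sketch below matches publication records on a composite key in ordinary code. The field names, the sample records and the helper names are hypothetical.

```python
def composite_key(pub):
    """Identify a publication by journal, volume and starting page together."""
    return (pub["journal"].strip().lower(), pub["volume"], pub["starting_page"])

def match_publications(left, right):
    """Pair up records from two sources that agree on the composite key."""
    index = {composite_key(pub): pub for pub in left}
    return [
        (index[composite_key(pub)], pub)
        for pub in right
        if composite_key(pub) in index
    ]

source_a = [
    {"title": "Paper One", "journal": "Journal X", "volume": 3, "starting_page": 211},
]
source_b = [
    {"title": "Paper one.", "journal": "journal x ", "volume": 3, "starting_page": 211},
    {"title": "Other Paper", "journal": "Journal X", "volume": 3, "starting_page": 250},
]
matches = match_publications(source_a, source_b)
print(matches)
```

No single field in these records is unique on its own. Only the three values taken together identify the publication, so a rule of this kind has to live outside the ontology or in a rule language.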
