Modeling Personal and Social Network Context for Event Annotation in Images
Bageshree Shevade, Hari Sundaram (Arts Media and Engineering, Arizona State University)
Lexing Xie (IBM TJ Watson Research Center)
{bageshree.shevade, hari.sundaram}@asu.edu, xlx@us.ibm.com
ABSTRACT

This paper describes a framework to annotate images using personal and social network contexts. The problem is important as the correct context reduces the number of image annotation choices. Social network context is useful as real-world activities of members of the social network are often correlated within a specific context. The correlation can serve as a powerful resource to effectively increase the ground truth available for annotation. There are three main contributions of this paper: (a) development of an event context framework and definition of quantitative measures for contextual correlations based on concept similarity in each facet of event context; (b) recommendation algorithms based on spreading activations that exploit personal context as well as social network context; (c) experiments on real-world, everyday images that verified both the existence of inter-user semantic disagreement and the improvement in annotation when incorporating both the user and social network context. We have conducted two user studies, and our quantitative and qualitative results indicate that context (both personal and social) facilitates effective image annotation.
Categories and Subject Descriptors

H.3.3 [Information Search and Retrieval]: Information filtering, search process

General Terms

Algorithms, Experimentation, Human Factors

Keywords

Social networks, context, event annotation, images, content management, multimedia
1 INTRODUCTION
In this paper, we develop a novel collaborative annotation system that exploits the correlation between user context and the social network context. This work enables members of a social network to effectively annotate images. The problem is important since online image sharing frameworks such as Flickr [1] have become extremely popular, yet user-supplied tags are relatively scarce compared to the number of images. The same tag can additionally be used in different senses, making the problem even more challenging. In such systems, text tags are the primary means used to search for photos; hence robust annotation schemes are very much desired. The social network context is important when different users' annotations and their corresponding semantics are highly correlated.

The annotation problem in social networks has several unique characteristics that differ from the traditional annotation problem:

- The participants in the network are often family, friends, or co-workers, and know each other well. They participate in common activities, e.g. traveling, attending a seminar, going to a film, or parties. There is a significant overlap in their real-world activities.

- Social networks involve multiple users. This implies that each user may have a distinctly different annotation scheme, and different mechanisms for assigning labels. There may be significant semantic disagreement amongst the users over the same set of images.

The traditional image annotation problem is a very challenging one: only a small fraction of the images are annotated by the user, severely restricting the ground truth available. This makes the problem of developing robust classifiers hard. It may seem fruitless to develop annotation mechanisms for social networks, where the problems seem to have multiplied.

Counter-intuitively, the annotation problem is helped by the formation of the social network. The key observation is that members of the social network have highly correlated real-world activities, i.e. they participate in common activities together, and often repeatedly. For example, two users may be good friends who always do things together, e.g. shop together at the McAllister mall. For user1, "shopping", "user2" and "McAllister" go together. The shopping event may also recur many times over their friendship. Since their activities are correlated, such correlation shows up in the data: they will often take images of the same event / activity, thus effectively increasing the ground truth available. Detecting correlation amongst members of the social network, and the specific context in which these common activities occur, can greatly help the annotation algorithms.

In our approach we define event context: the set of facets / attributes (image, who, when, where, what) that support the understanding of everyday events. We then develop measures of similarity for each event facet, and compute event-event and user-user correlations. The user context is obtained by aggregating event contexts and is represented as a graph. Given a query event attribute, recommendations are generated using a spreading activation algorithm on the user context. For social network based recommendations, we first find the optimal recommender by computing the correlations between the personal context models of the network members. We then perform activation spreading on the recommender, but filter the recommendations with a salient subset of the current user's context.
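To make the recommendation step concrete, the sketch below shows generic spreading activation over a weighted co-occurrence graph. It is an illustration under assumed names and parameters (graph layout, decay factor, hop limit), not the authors' implementation; section 4 presents the actual algorithms.

```python
from collections import defaultdict

def spread_activation(graph, seeds, decay=0.5, max_hops=2):
    """Rank candidate annotations by propagating activation from the
    query nodes through a weighted co-occurrence graph.

    graph: node -> list of (neighbor, edge_weight) pairs
    seeds: node -> initial activation, e.g. {"shopping": 1.0}
    """
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(max_hops):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            edges = graph.get(node, [])
            total = sum(w for _, w in edges)
            for neighbor, weight in edges:
                # Energy flows in proportion to edge weight, damped per hop.
                next_frontier[neighbor] += decay * energy * weight / total
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    # Exclude the query terms themselves from the recommendations.
    return sorted((n for n in activation if n not in seeds),
                  key=lambda n: -activation[n])
```

Seeding with a query attribute such as {"shopping": 1.0} would then surface strongly co-occurring nodes such as "user2" or "McAllister" as candidate annotations.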
We have conducted two experiments: the first verifies the empirical observation that semantic disagreement exists, and the second evaluates our proposed personal and social context based annotation system. Our first user experiment on real-world personal images indicates that semantic diversity is greater on everyday personal photos than on the Corel dataset. The second experiment indicates that context (both personal and social) can significantly help event annotation when compared to baseline recommendation systems.

In the next section we review related work in this area. In section 3 we present the event context framework. In section 4, we present our recommendation algorithms that use personal and social context. We discuss two experiments in sections 5 and 6: the former presents the experimental evidence for inter-user disagreement, and the latter shows the improvement in annotation when the user and social network context are incorporated.
2 RELATED WORK
There has been recent interest in 'folksonomy' [8,15,17]. It has been noted that a large number of ordinary untrained folk are tagging media as part of their everyday encounters with the web (http://del.icio.us), or with media collections (http://www.flickr.com). The attraction of folksonomy lies in the idea that collective tagging can significantly reduce the time to determine media that are semantically relevant to the users, for example as part of a search. Our research for image annotation broadly falls under this umbrella; the key issue is that we believe the annotations to be recommended are only interesting / relevant provided the context is correct.

There has been prior work in using groups for the purposes of image annotation / labeling [2,16]. In the ESP game [2], the authors develop an ingenious online game in which people play against each other to label the image. In [16] the authors take into account browsing history with respect to an image search for determining the sense associated with the image. Both works aim at recovering one correct sense, either shared by common knowledge or the user's own history. The context in which the annotation is used / labeled is not taken into account. In [19] the authors explore a collaborative annotation system for mobile devices; there they used appearance based recommendations as well as location context to suggest annotations to mobile users. In [13], the authors provide label suggestions for identities based on patterns of re-occurrence and co-occurrence of different people in different locations and events. However, they do not make use of user context, or of commonsensical and linguistic relationships and group semantics.

In [4,10], the authors use sophisticated classification techniques for image annotation. However, they do not investigate collaborative annotation within a social network. The image based classifier schemes run into two broad problems: (a) scalability, since each tag requires its own classifier, and (b) the fact that people may use a tag in very different senses, which makes the classifiers difficult to build.

A key limitation of prior work is the implicit assumption that there is one correct semantic that needs to be resolved through group interaction / classification. In social networks the assumption of consistent labeling of images (thus implying semantic agreement) over the dataset may not hold over a diverse set of concepts. In prior work [14], we have observed that there is non-negligible disagreement among users, particularly on concepts that are more abstract rather than concrete. For example, people are more likely to disagree on abstract concepts such as "love", "anger", "anxiety" etc. as compared to everyday concepts such as "pen", "light bulb", "ball" etc.

Secondly, the context in which the annotation is used / labeled is not taken into account. We argue that for effective annotation we need to extend the feature / text based approaches to annotation, as they do not exploit the context in which the annotations have been made. Users may annotate very similar images (say from the workplace) with very different tags, while they may use the same tags to describe very different activities. Thus, in order to understand these differences, we need to understand the context in which these annotations were used.

We believe that in general, both traditional taxonomies (as implied by traditional image based classifiers) as well as folksonomies are needed in semantic annotation. However, both frameworks will benefit through the incorporation of event context. We next present our event context framework and its relationship to user context.
3 EVENT CONTEXT
An event refers to a real-world occurrence, which is described using attributes such as images, and facets such as who, where, when, what. We refer to these attributes as the event context: the set of attributes / facets that support the understanding of everyday events. This event model definition draws upon recent work by Jain and Westermann [18]. While the notion of an event can be abstract in general, in this paper we restrict our discussion of events to being associated with a single time and place only.

The notion of "context" has been used in many different ways across applications [6]. Note that the set of contextual attributes is always application dependent [7]. For example, in ubiquitous computing applications, location, identity and time are critical aspects of context [6]. In describing everyday events, the who, where, when, what are among the most useful attributes, just as news reporting 101 would teach the w's (who, when, where) as the basic background context elements for reporting any real-world event.
Figure 1: Context plane graphs for the who, where, when, what and images facets of a context slice. The nodes in each context plane graph are the annotations. The strong (black) links denote association, i.e., nodes in different planes are connected if they co-occur in one image; the weak (gray) links denote similarity, i.e., edge strength obtained by evaluating the similarity function s(.,.) between two nodes in the same facet (two words, locations, activities, etc.).
3.1 The user context model
In our approach the user context is derived through aggregation over the contexts of the events in which the user has participated. This can be conceptualized as a graph, where the semantics of the nodes are from each different event facet (who, where, what, when and image), and the value of each node is the corresponding image feature / text annotation. The edges of the graph encode the co-occurrence relationship as weights. So if "Mary" and "Mall" co-occur twice, then the strength of the edge between the nodes is 2. Figure 1 shows the user context.

ConceptNet [11] is used to get contextual neighborhood nodes for the what facet nodes that are already present in the graph. This enables us to obtain additional relevant recommendations for the user. For every what node in the graph, the system introduces the top five most relevant contextual neighborhood concepts obtained from ConceptNet as new nodes in the graph. These nodes are connected to the existing nodes with an edge strength of 1, and now become a part of the context model.
3.2 Concept similarity
We now discuss the similarity measures for the different event facets. We first derive a new ConceptNet based similarity measure for a pair of concepts. We then extend this similarity measure to two sets of concepts. Similarity measures over the context facets are then defined using the two above measures.
3.2.1 The ConceptNet based semantic distance
In this section, we shall determine a procedure to compute semantic distance between any two concepts using ConceptNet, a popular commonsense reasoning toolkit [11].

ConceptNet has several desirable characteristics that distinguish it from the other popular knowledge network, WordNet [12]. First, it expands on pure lexical terms to include higher order compound concepts ("buy food"). Secondly, it greatly expands the three relations found in WordNet to twenty; the repository represents semantic relations between concepts like "effect-of", "capable-of", "made-of", etc. Finally, ConceptNet is powerful because it contains practical knowledge: it will make the association that "students are found in a library", whereas WordNet cannot make such associations. Since our research is focused on recommending annotations to images from everyday events, ConceptNet is very useful.

The ConceptNet toolkit [11] allows three basic functions on a concept node (illustrated in the stub after this list):
 
- GetContext(node): finds the neighboring relevant concepts using spreading activation around the node. For example, the neighborhood of the concept "book" includes "knowledge", "library", "story", "page", etc. ConceptNet terms this operation the "contextual neighborhood" of a node.

- GetAnalogousConcepts(node): two nodes are analogous if they derive incoming edges (note that each edge is a specific relation) from the same set of concepts. For example, analogous concepts for the concept "people" are "human", "person", "man", etc.

- FindPathsBetweenNodes(node1, node2): finds paths in the semantic network graph between two concepts. For example, the path between the concepts "apple" and "tree" is given as apple [isA] fruit, fruit [oftenNear] tree.
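For illustration, the stub below mirrors these three operations. Only the operation names and the example outputs come from the text above; the class shape and return formats are assumptions.

```python
class ConceptNetToolkit:
    """Stand-in for the ConceptNet toolkit interface described above;
    the canned return values are the paper's own examples."""

    def get_context(self, node):
        # Contextual neighborhood via spreading activation, e.g. "book":
        return ["knowledge", "library", "story", "page"]

    def get_analogous_concepts(self, node):
        # Concepts that draw incoming relations from the same set of
        # concepts, e.g. "people":
        return ["human", "person", "man"]

    def find_paths_between_nodes(self, node1, node2):
        # Each path is a list of (concept, relation, concept) hops,
        # e.g. apple [isA] fruit, fruit [oftenNear] tree:
        return [[("apple", "isA", "fruit"), ("fruit", "oftenNear", "tree")]]
```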
Neighbors of Concepts: Given two concepts e and f, the system determines all the concepts in the contextual neighborhood of e, as well as all the concepts in the contextual neighborhood of f. Let us assume that the toolkit returns the sets C_e and C_f containing the contextual neighborhood concepts of e and f respectively. The context-based semantic similarity s_c(e,f) between concepts e and f is now defined as follows:

    s_c(e,f) = \frac{|C_e \cap C_f|}{|C_e \cup C_f|},    <1>

where |C_e \cap C_f| is the cardinality of the set of concepts common to C_e and C_f, and |C_e \cup C_f| is the cardinality of their union.
 
Analogous Concepts: Given concepts e and f, the system determines all the analogous concepts of concept e as well as concept f. Let us assume that the returned sets A_e and A_f contain the analogous concepts for e and f respectively. The semantic similarity s_a(e,f) between concepts e and f based on analogous concepts is then defined as follows:

    s_a(e,f) = \frac{|A_e \cap A_f|}{|A_e \cup A_f|},    <2>

where |A_e \cap A_f| is the cardinality of the set of concepts common to A_e and A_f, and |A_e \cup A_f| is the cardinality of their union.
Number of paths between two concepts: Given concepts e and f, the system determines the paths between them. The system extracts the total number of paths between the two concepts as well as the number of hops in each path. The path-based semantic similarity s_p(e,f) between concepts e and f is then given as follows:

    s_p(e,f) = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{h_i},    <3>

where N is the total number of paths between concepts e and f in the semantic network graph of ConceptNet, and h_i is the number of hops in path i.
The final semantic similarity between concepts e and f is then computed as the weighted sum of the above measures. We use equal weights on each of the measures (in the absence of a strong reason to support otherwise), and write the concept similarity CS as follows:

    CS(e,f) = w_c s_c(e,f) + w_a s_a(e,f) + w_p s_p(e,f),    <4>

where w_c = w_a = w_p = 1/3.
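Equations <1> through <4> transcribe directly into code; the sketch below runs against the stub interface from section 3.2.1.

```python
def concept_similarity(e, f, toolkit, weights=(1/3, 1/3, 1/3)):
    """CS(e, f): equally weighted sum of the three measures <1>-<3>."""
    # Contextual-neighborhood overlap, equation <1>.
    Ce, Cf = set(toolkit.get_context(e)), set(toolkit.get_context(f))
    s_c = len(Ce & Cf) / len(Ce | Cf) if Ce | Cf else 0.0

    # Analogous-concept overlap, equation <2>.
    Ae = set(toolkit.get_analogous_concepts(e))
    Af = set(toolkit.get_analogous_concepts(f))
    s_a = len(Ae & Af) / len(Ae | Af) if Ae | Af else 0.0

    # Path-based similarity, equation <3>: mean reciprocal hop count,
    # with len(path) playing the role of h_i.
    paths = toolkit.find_paths_between_nodes(e, f)
    s_p = sum(1.0 / len(p) for p in paths) / len(paths) if paths else 0.0

    w_c, w_a, w_p = weights
    return w_c * s_c + w_a * s_a + w_p * s_p  # equation <4>
```

With the canned stub values, concept_similarity("apple", "tree", ConceptNetToolkit()) evaluates to (1 + 1 + 1/2)/3, roughly 0.83; a real toolkit would of course return different neighborhoods and paths per concept pair.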
In the next subsections, we use ConceptNet distances to compute distances in the where and what facets of the user and event context, since these two facets are described with a free-form natural vocabulary on which ConceptNet similarities are meaningful, while other facets such as who and when use quantitative distances on time, or intersection on proper nouns.
3.2.2 Similarity between two sets of concepts
An event usually contains a number of concepts in a facet; therefore we also need a similarity measure between sets of concepts, based on that between two individual concepts. We define the set similarity between two sets of concepts A and B, where A: {a_1, a_2, …} and B: {b_1, b_2, …}, given a similarity measure m(a,b) on any two set elements a and b, in the following manner.
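The set-level formula itself falls outside this excerpt. Purely as a labeled assumption, the sketch below uses one common symmetric construction, averaging each element's best match in both directions, on top of an element measure m(a, b) such as the concept similarity CS above.

```python
def set_similarity(A, B, m):
    """Similarity between concept sets A and B given element measure m.

    NOTE: this averaging scheme is an assumption for illustration; the
    paper's own definition does not appear in this excerpt.
    """
    if not A or not B:
        return 0.0
    # For each element take its best match on the other side, then
    # average the two directions to keep the measure symmetric.
    a_to_b = sum(max(m(a, b) for b in B) for a in A) / len(A)
    b_to_a = sum(max(m(a, b) for a in A) for b in B) / len(B)
    return 0.5 * (a_to_b + b_to_a)
```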
