
Title:

The Genealogy of Knowledge in Wikipedia – Method Development and Application

Master’s Thesis

at the Chair for Information Systems and Inter-Organizational Systems

Supervisor: Prof. Dr. Stefan Klein


Tutors: Dr. Uri Gal
Dr. Kai Riemer

Presented by: Friedrich Chasin


Bismarckallee 49
48151 Münster
0177 7229890
f_chas01@uni-muenster.de

Date of Submission: 2013-08-01



Content

Figures ............................................................................................................................. IV
Tables .............................................................................................................................. VI
Abbreviations ................................................................................................................ VII
1 Introduction .................................................................................................................. 1
2 Literature Review ......................................................................................................... 4
2.1 The Theory of Social Representations .................................................................. 4
2.1.1 The Cradle of SRT: A Historical Perspective .............................................. 4
2.1.2 Concepts and Processes ............................................................................... 6
2.1.3 Previous Applications and Ongoing Research ........................................... 12
2.2 The Wikipedia Project ......................................................................................... 14
2.2.1 Wikipedia Outline ...................................................................................... 14
2.2.2 Subject of Scientific Studies ...................................................................... 15
2.3 Social Representations on Wikipedia .................................................................. 19
3 Methodology of Studying Social Representations on Wikipedia ............................... 20
3.1 Wikipedia Structure ............................................................................................. 20
3.2 Suitability of Wikipedia for Studying Social Representations ............................ 23
3.3 Wikipedia Article as a Central Element of Study ............................................... 25
3.3.1 Social Representation Concepts in the Article ........................................... 25
3.3.2 Social Representation Processes in the Article Evolution ......................... 27
3.4 Case Studies......................................................................................................... 30
3.4.1 Case Study Anchoring Analysis ................................................................ 31
3.4.2 Case Study Objectification Analysis.......................................................... 33
3.5 Towards a Quantitative Analysis of Wikipedia Articles ..................................... 33
4 WikiGen: A Statistical Tool for Quantitative Analysis of Wikipedia Articles .......... 35
4.1 Architecture ......................................................................................................... 35
4.2 Statistical Features ............................................................................................... 38
4.2.1 Editing Statistics ........................................................................................ 38
4.2.2 Links Statistics ........................................................................................... 42
4.2.2.1 Anchor Maps .................................................................................. 42
4.2.2.2 Anchor Snapshots ........................................................................... 43
4.2.2.3 Anchor Dynamics ........................................................................... 45
4.2.3 Reference Statistics .................................................................................... 49
4.2.4 Integrated Tools ......................................................................................... 50
5 Case Studies of Social Representations on Wikipedia ............................................... 52
5.1 Cloud Computing: From Utility Computing to a Jargon Term ........................... 52
5.1.1 General Description ................................................................................... 52
5.1.2 Anchoring Analysis.................................................................................... 56
5.1.2.1 Employed Anchor Statistics ........................................................... 56
5.1.2.2 Anchor Coding ............................................................................... 59
5.1.2.3 Narrative Interpretation .................................................................. 63
5.1.3 Objectification Analysis ............................................................................. 74
5.2 The iPad: From a Big Smartphone to a New Market .......................................... 75
5.2.1 General Description ................................................................................... 75
5.2.2 Anchoring Analysis.................................................................................... 79
5.2.2.1 Employed Anchor Statistics ........................................................... 79
5.2.2.2 Anchor Coding ............................................................................... 81
5.2.2.3 Narrative Interpretation .................................................................. 83
5.2.3 Objectification Analysis ............................................................................. 90
6 Discussion ................................................................................................................... 92
7 Conclusions and Directions for Further Research ...................................................... 99
References ..................................................................................................................... 102
Appendix ....................................................................................................................... 108
A Page and Article Distributions According to (Anderka and Stein 2012) .......... 108
B Anchor Coding for Cloud Computing ............................................................... 110
C Anchor Coding for iPad..................................................................................... 113
D Disregarded Anchors in the iPad Case Study .................................................... 115
E Interpretation Scheme for Trends in the Collaboration Process ........................ 116
F Statistics for Cloud Computing Evolution Phases............................................. 117
G Statistics for iPad Evolution Phases .................................................................. 118
H Cloud Computing Case Study Statistics ............................................................ 119
I iPad Case Study Statistics ................................................................................. 135

Figures

Fig. 1 Models of representation. Source: MOSCOVICI (2000/1984, p.71)................. 8


Fig. 2 Breakdown of the First Part of the Research Question 1 ............................. 25
Fig. 3 Internal Wikipedia Links in the Definition section of an Article ................. 26
Fig. 4 Breakdown of the Second Part of the Research Question ............................ 28
Fig. 5 Process of Deriving Categories from a Generated Anchor List ................... 32
Fig. 6 The Infrastructure of the WikiGen Solution................................................. 36
Fig. 7 Internal Organisation of Modules in the WikiGen Application ................... 37
Fig. 8 WikiGen Bar Charts for Edits Statistics ....................................................... 39
Fig. 9 WikiGen Text Output for Edits Statistics..................................................... 39
Fig. 10 WikiGen Edits per Editor Chart ................................................................... 40
Fig. 11 Exemplary Effect of an External Event on the Editing Activity ................ 40
Fig. 12 Exemplary Application of Interpretation Scheme (Appendix E) on
Editing Data .................................................................................................. 41
Fig. 13 Example of Revision Map with Illustrated Navigation Functionality ......... 42
Fig. 14 Anchoring Map for User Interface Anchor in iPad Article .......................... 42
Fig. 15 Extract from the Historical iPad Article dated 15.04.2013 .......................... 43
Fig. 16 Anchor Snapshot for iPad Social Representation in 2010 ........................... 44
Fig. 17 Example for New and Obsolete Anchors Chart in WikiGen ....................... 45
Fig. 18 Monthly and Yearly Anchor Dissimilarity Measures in WikiGen ............... 46
Fig. 19 Monthly and Yearly Average Anchor Durability in WikiGen ..................... 47
Fig. 20 Monthly and Yearly Edit War Level Measures in WikiGen ........................ 48
Fig. 21 Exemplary Distribution of Referencing Articles in WikiGen ...................... 50
Fig. 22 Exemplary Output of Article Traffic Statistics Tool .................................... 50
Fig. 23 Exemplary Output of Contributors Tool ...................................................... 51
Fig. 24 Extrapolated Distribution of Cloud Computing Article Revisions Based
on Monthly Data ........................................................................................... 53
Fig. 25 Interest for Cloud Computing according to Google Trends ......................... 54
Fig. 26 Number of Edits for Cloud Computing Article on Wikipedia ..................... 54
Fig. 27 Daily Views for Cloud Computing and Facebook Articles ......................... 55
Fig. 28 Standardised Monthly Edits, Editors and Edits per Editors Statistics .......... 55
Fig. 29 Amount of New and Obsolete Cloud Computing Anchors per Month ........ 57
Fig. 30 Anchors Dissimilarity for Cloud Computing per Month ............................. 58
Fig. 31 Average Anchor Durability for Cloud Computing ....................................... 58
Fig. 32 Anchor Edit-War Level for Cloud Computing............................................. 59
Fig. 33 Anchors for Utility Computing as of 3rd of March 2007 ........................... 64
Fig. 34 Dissimilarity and Anchor Movements for Cloud Computing
in Sep 07-Oct 08 ........................................................................................... 66
Fig. 35 Collaboration Dynamics for Cloud Computing in Sep 07-Oct 08 ............... 66
Fig. 36 Dissimilarity and Anchor Movements for Cloud Computing
in Nov 08-Jan 10 .......................................................................................... 68
Fig. 37 Dissimilarity Measure for Cloud Computing in Feb 10-Feb 11 .................. 69
Fig. 38 Dissimilarity Measure for Cloud Computing in Mar 11-Jan 12 .................. 70
Fig. 39 Cloud Computing Anchors as of Dec 2011 ................................................ 71
Fig. 40 Interest Decrease in Collaboration in the Period Sep 11-Dec 11 ................. 71
Fig. 41 Strongest Anchors for Cloud Computing in 2013 ........................................ 73
Fig. 42 Distribution of Articles that Contain References to the Cloud Computing
Article (without indirect references) ............................................................ 75
Fig. 43 Extrapolated Distribution of iPad Article Revisions .................................... 76
Fig. 44 Interest for the iPad Search Term According to Google Trends .................. 77
Fig. 45 Number of Edits for iPad Article on Wikipedia Including Events ............... 77
Fig. 46 Daily Views for iPad and Facebook Articles ............................................... 78
Fig. 47 Standardised Monthly Edits, Editors and Edits per Editors Statistics
for the iPad Article ....................................................................................... 78
Fig. 48 Amount of New and Obsolete Anchors for iPad per Month ........................ 80
Fig. 49 Anchors Dissimilarity for iPad per Month ................................................... 80
Fig. 50 Title Picture of the iPad Ads Section on Apple’s Website ..................... 83
Fig. 51 Anchor Movements for the iPad SR in Feb 10-Aug 10 ............................... 85
Fig. 52 High Centralisation of the Collaboration for the iPad SR
in Feb 10 -Aug 10 ......................................................................................... 85
Fig. 53 Anchor Dissimilarity for iPad in 2010 ......................................................... 87
Fig. 54 Strongest Anchors for iPad in Sep 2010 – Sep 2012 ................................... 87
Fig. 55 Anchor Dissimilarity for the iPad SR in Sep 10-Sep 12 .............................. 88
Fig. 56 Anchor Movements for the iPad SR in Oct 12-Jun 13 ................................. 89
Fig. 57 Usage and Similarities Categories for the iPad SR in 2013 ..................... 90
Fig. 58 Distribution of Articles that Contain References to the iPad Article (without
indirect references) ....................................................................................... 91
Fig. 59 Evolution Phases and Anchor Perspectives for the CC Case Study............. 93
Fig. 60 Evolution Phases and Anchor Categories for the iPad Case Study .............. 94
Fig. 61 Interpretation Scheme for Collaboration Evolution on Wikipedia ............ 116

Tables

Tab. 1 Namespaces of English Wikipedia (Source: wikipedia.org) ........................ 21


Tab. 2 Anchor Categories for the Cloud Computing Social Representation ........... 60
Tab. 3 Anchor Categories for the iPad Social Representation ................................ 81
Tab. 4 Distribution of Pages among Namespaces as of January 15, 2011
(Anderka and Stein 2012)........................................................................... 108
Tab. 5 Distribution of Articles among Topics as of January 15, 2011
(Anderka and Stein 2012)........................................................................... 109
Tab. 6 Strongest Anchors for Cloud Computing Including Categories ................ 112

Abbreviations

CC Cloud Computing
IS Information Systems
MUSE Mutually Exclusive and Collectively Exhaustive
SR Social Representation
SRT Social Representations Theory
URL Uniform Resource Locator
WF Wikimedia Foundation
WP Wikipedia

1 Introduction

The quest for the origins and the nature of human knowledge has a history as long as
that of the self-reflective human mind. In our lives, we are constantly reminded of the
imperfection of our knowledge: Our senses deceive us, our logical conclusions are often
contradictory and facts we never question turn into fabrications. It was arguably
Copernicus who most clearly exposed the potentially illusive nature of what we call
reality by discovering the heliocentric model of the universe. Abandoning the centricity
of the Earth was a reorientation not only in the physical sense but, more importantly, a
radical change in the very way humankind perceived its role and significance in the
universe.

From the philosophy of the Greek classics to modern philosophers such as
Descartes, however, the phenomenon of change was not a main concern. Although the
possibility of change was accepted, the essence of each phenomenon was considered
timeless (Marková 1996, p.178). As a consequence, “throughout the history of the
natural and social science, there has been a long-lasting difficulty in conceptualising
phenomena that are inherently dynamic” (Marková 1996, p. 178). Knowledge, being not
a simple description of what is ‘out there’ but a product of human interactions and
communication involving different interests (Duveen 2000, p.2), is one of those
inherently dynamic phenomena. A ‘lay knowledge’ (Marková 1996; Wagner et al.
1999) is correspondingly not a simple deviation from the true understanding of the
phenomenon but a social construct (Berger and Luckmann 1966). Sharing this socially
constructed knowledge is equal to constituting the common reality (Moscovici 1990,
p.164) - a reality for an individual which is “to a high degree [...] determined by what is
socially accepted as reality” (Lewin 1948).

Given the dynamic nature of knowledge, the focus of a researcher who is interested in
social phenomena, and therefore the focus of this thesis, shifts towards understanding
the processes through which knowledge is generated and projected into the social world
(Duveen 2000, p.2). By these processes, knowledge acquires a historical dimension.
Past ideas and experiences continue to exist by changing and infiltrating the present
(Moscovici 2000/1984, p.24). Subjects of study in this process become changing
representations of phenomena collectively held by individuals. SERGE MOSCOVICI,
whose work is fundamental for the given thesis, states that “in order to understand and
to explain a representation, it is necessary to start with that, or those, from which it was
born” (Moscovici 2000/1984, p.27). Following the offspring metaphor, the aim of this
work is to explore the genealogy of knowledge. That is, to explore the origins and thus
the kinship of socially procreated knowledge.

Once we agree that we live in a world full of socially manufactured representations
which constitute our reality, the importance of applying some form of ‘reverse
engineering’ to them is apparent. First, representations are in general prerequisites for
human actions (Weber et al. 1978) and, according to the French philosopher Michel
Foucault, are controlling entities in our lives as they exclude and include what is
permissible and what is not (Munslow 1997, p.120). Second, the manufacturing of
social representations is asymmetric in the power of their manufacturers. The role of
professions such as media, political and social specialists is often decisive in the process
of creating social representation (Moscovici 1988, p.225). This is above all true where
individuals are unaware of the representations they hold in that “invisible is inevitably
harder to overcome” (Moscovici 2000/1984, p.26).

That said, the importance of making single representations explicit resides in the
potential it opens to understand and even influence group processes in particular and
human conduct in general. To become aware of conventional aspects in our life is to
evade some of the constraints they impose on our perceptions and thoughts (Moscovici
2000/1984, p.23). If the social sciences took account of this aspect, they would
provide richer contextual explanation for social phenomena (Bauer and Gaskell 2008).
For natural sciences, understanding the representations that are related to the research
promises a higher efficiency when advisory services are required (Farr 1993). To give a
vivid example of benefits resulting from understanding the origins of our knowledge,
problems in the information systems (IS) field such as overly static analysis techniques for
studying organisational phenomena (Boland 1999) can be addressed, as shown by
GAL AND BERENTE (2008).

The heterogeneity within modern Western societies increases the relevance of exploring
the origins of knowledge. The absence of powerful centralised institutions as well as the
decreasing role of traditions and religion creates a diversity of representations (Duveen
2000, p.7). This modern world is characterised by very heterogeneous practices in
politics, philosophy, religion and arts (Moscovici 2008/1976, p.5). The resulting effects
are amplified by changes in the communication towards cross-space and cross-culture
dialogs and, more recently, by the increasing use of electronic media (Gervais 1997,
p.192). In this context, social media, with its influence on representations which freely
circulate in the digital world, serves as an illustration of the aforementioned tendency.

The study of the genealogy of knowledge is however difficult. High complexity and
interdependencies between social and individual factors in the process of forming a
representation make any attempt to trace its lineage a challenging task in itself (László
1997). According to MOSCOVICI, various forms of knowledge, as “outgrowths of long
mutation chains”, can only be understood when “reimmersed” in the social setting of
communication (Moscovici 1988, p. 214). Tackling such collective processes requires a
suitable theoretical framework (Wagner et al. 1999). For the purpose of this thesis, the
theory of social representations (SRT) was chosen – a theory which has become canonical
in social psychology over the last 50 years (Jodelet 2008, p. 427).

To study the genealogy of knowledge in the sense of the social representations theory,
two criteria must be satisfied. First, the data used must represent socially constructed
knowledge. Second, the amount of data must be extensive in order to capture the
historical dimension of knowledge. The difficulty in achieving the latter is emphasised
in studies in which the social representations theory is adopted as a framework for
studying group phenomena (Vaast 2007). Considering the increasing role of social
media in communication and collaboration, the Wikipedia platform, as a popular
collaboratively edited online encyclopaedia, provides a unique opportunity to study
social representations in the process of their creation. Historical data on every article
change along with the transparency of its collaboration processes make Wikipedia an
ideal choice for exploring the genealogy of knowledge. The selection of the Wikipedia
platform as the data source leads to the research question of the given thesis:

How can the evolution of social representations be studied on Wikipedia?

The aim is therefore to develop a method capable of revealing the genealogy of social
representations on Wikipedia and to apply it by way of example in several case studies. It is
argued that both research on social representations and research on Wikipedia will
benefit from the introduction of this method.

The remainder of the thesis is organised as follows: First, an outline for the research on
both SRT and Wikipedia is provided in the literature review section. On the basis of
identified gaps in the research, a method for studying social representations on
Wikipedia which combines quantitative and qualitative techniques is developed in
section 3. Section 4 introduces a tool for a quantitative analysis of Wikipedia articles
that was developed to support the qualitative analysis. The latter is demonstrated in
section 5 as case studies of cloud computing and iPad social representations on
Wikipedia. Results of the method application in general and of the case studies in
particular are discussed in section 6. The thesis is concluded with a summary and
suggestions for further research on genealogy of knowledge.

2 Literature Review

The literature review is divided into two major parts reflecting the two pillars the thesis
is based upon: the theory of social representations and the Wikipedia platform. A
thorough analysis of previous research on both phenomena is necessary in order to
identify research gaps and to situate the method of studying social representations on
Wikipedia in the context of ongoing research.

2.1 The Theory of Social Representations

The central idea in the theory is that people’s knowledge of the world is mediated.
Objects in this world, whether physical or not, acquire meaning only through
representations. The role of representations in the relationship between people and their
world is apparent as there is no single true representation of an object. Instead, social
groups create and change different representations of the same object in a continuous
process. Therefore, social representations are semiotic mediating devices (Valsiner
2003, p.7.2) that members of a social group invariably use to render their world
meaningful (Wagner et al. 1999). They are formed and transformed in this process and
are partially distributed among the individuals comprising the social group (Moscovici
1994, p.168). But social representations are more than this; they stand for a theory with
over 50 years of history since SERGE MOSCOVICI published his famous book “La
Psychanalyse, son Image et son Public”1. Its presence in the canon of social psychology
is justified by thousands of published papers associated with the theory, a specially
devoted journal2, PhD programs and multitudinous followers (Howarth 2006). The long
history of the theory requires a historical perspective on the phenomenon itself and its
roots to gain a deeper understanding of its characteristics (Marková 1996; Rosa 2013).

2.1.1 The Cradle of SRT: A Historical Perspective

For someone who lives in the 21st century and is not familiar with the history of social
psychology, it can be difficult to believe that only 60 years ago there was no theory
capable of accounting for the dynamic nature of social phenomena. Those were,
however, precisely the circumstances in which MOSCOVICI found himself in the mid-20th
century when he started to work on SRT (Moscovici 1988, p.214; Rosa 2013, p.3). To
clarify these circumstances, the section will explain the historical background of the
theory as well as situate the theory in the epistemological context.

1 (from French) Psychoanalysis: Its Image and Its Public (Moscovici 2008)
2 Cf. http://www.psych.lse.ac.uk/psr/

Knowledge and the outside world

Whenever the sociology of knowledge is the subject of a theory, implicitly taken
epistemological stances must be revealed (Kurzman 1994). The social representations
theory is no exception and has a specific perspective on knowledge. Although the
theory sees knowledge as a social construct which constitutes individuals’ reality
(Moscovici 1990, p.164), the theory should not be considered as purely anti-positivistic.
The latter denies any form of pre-existing reality ‘out there’ (Mills et al. 2006, p.3). For
MOSCOVICI there is, however, a world beyond social representations in which social
representations “’correspond’ to something we call the outside world” (Moscovici
2000/1984, p.20). What matters is that the outside world can enter individuals’ reality only
indirectly, through social representations (McKinlay and Potter 1987, p.477). This
epistemology, in which any unmediated knowledge is rejected, goes back to HEGEL
(1998/1807). A theory which presupposes that realities are socially constructed is called
constructivist (Mills et al. 2006, p.2). Accordingly, the social representations theory can
be classified as a constructivist theory (Marková 2000; Moloney and Walker 2002;
Moscovici 1988; Wagner et al. 1999).

Individual and collective

MOSCOVICI’s understanding of the relation between social and individual aspects in the
development of knowledge must be seen in the context of Hegelian views on the
interdependence between the universal and the particular (Marková 1996, p.179). For
HEGEL, something universal was co-developed in interaction with the particular (Hegel
1998/1807). Similarly, the collective and the individual in the social representations
theory are interdependent. In fact, the interdependence between individual and collective
aspects of knowledge distinguishes SRT from other theories of knowledge in social
psychology (Rose et al. 1995, p.3).

European theories in social psychology preceding MOSCOVICI’s work are based on
Kantian philosophy, which does not account for interdependence between the collective
and the individual. The collective is seen as a given social fact which is independent
from individuals (Marková 1996, p.179). Other approaches such as cognitive
psychology focus on information processing and operate similarly to the natural sciences by
disregarding any social influences on the individuals’ mind (Duveen 2000, p.12). In a
similar vein, the theory of attitudes, which is sometimes considered a North
American counterpart of SRT, lost most of its social elements as a result of the
individualisation of social psychology (Farr 1993; Wagner et al. 1999). A theory which
takes account of collective elements is the theory of social cognition. The concept of
social schemata used within the theory is related to SRT but is too static (BARTLETT
1995/1932, p.201) and lacks theoretical explanations of the schemata origins (Wagner
et al. 1999). That said, MOSCOVICI’s (1984b) attempt to integrate both individual and
social factors into a theory without losing its claim to be an explanatory device for
social phenomena is characteristic. Höijer (2011, p.4) described this characteristic as
follows: “giving the individual some room the theory of social representations avoids
social determinism and opens for processes of transformation. [...] the individual is
mainly embedded in and formed by social structures”.

Social representations and DURKHEIM’s social psychology

Contrary to widely held beliefs that MOSCOVICI’s theory is derived from DURKHEIM’s
social psychology (Gillespie 2008; Glăveanu 2009; Ju and Gluck 2011; Vaast 2007),
DURKHEIM’s influence on the theory is limited. First, a similar approach aiming to
include social dimensions into psychology was already introduced by WUNDT’s
“Völkerpsychologie”3 (Wagner et al. 1999). Second, DURKHEIM’s concept of collective
representations was not suited to what MOSCOVICI considered a new era and a
new society (Duveen 2000, p.7-9). It was poorly equipped for the diversity of
representations as well as tensions and conflicts in the modern world (Rose et al. 1995,
p.3). For MOSCOVICI (1988, p.219), the focus, when analysing modern societies, had to
be on “innovation rather than tradition” and on “social life in the making rather than a
preestablished one”.

Hybrid nature of social representations theory

MOSCOVICI (1984b) insisted on the role of social psychology as a new science in
between sociology and psychology, instead of psychology with sociological elements or
vice versa. The impulses to create this hybrid science came from cybernetics in which
explanatory power is derived from a combination of sciences - neither of which could
explain the phenomenon on its own (Jodelet 2008, p.426). Hence, social representations
theory combines sociology and psychology into a new science to explain social
phenomena.

2.1.2 Concepts and Processes

To provide a detailed summary of the theory, its concepts and processes will be
elaborated upon in this section starting with the concept of unfamiliarity. The
unfamiliarity is, in a sense, life-giving for the remaining theory elements as it triggers
the subsequent processes, which lead to the formation of social representations.

3 German for ‘folk psychology’

Unfamiliarity

For the process of forming or changing a social representation to come into being,
something disruptive must threaten the reality of a social group (Moscovici 2000/1984,
p.38). Disruptive in this context is a strange unfamiliar phenomenon, or a strange
unfamiliar characteristic of a familiar phenomenon. This human reaction to the
unknown is fundamental if one recalls that Aristotle had identified the very reason to do
philosophy in a similar vein: “human beings began to do philosophy [...] because they
wondered about the strange things right in front of them” (Met. 982b12)4. The
unfamiliar leaves a human being with a sense of “incompleteness and randomness” and
emphasises the “actuality of something absent” (Moscovici 2000/1984, p.38). The
necessity of dealing with the unfamiliar is even accompanied by fear (Moscovici 1988,
p.234). GERARD DUVEEN has given the most striking metaphor for the naturalness of
this process by comparing a mind which “abhors” an absence of meaning with
nature, which abhors a vacuum (Duveen 2000, p.8). To overcome unfamiliarity, to make
something unfamiliar familiar, is the purpose of any social representation (Moscovici
2000/1984, p.37).

Social Representation

The social representation phenomenon is difficult to compress into a single definition.
The author of the theory himself refuses to provide an exact definition, arguing that the
phenomenon, as an element of a theory within the social sciences, does not require one
and that no definition could do justice to its manifold nature (Moscovici 1988, p.213).
An alternative to a precise definition is a thorough elaboration of its characteristics
(Marková 2000, p.11). This thesis follows this strategy by starting with a sketchy
definition and then emphasising the key aspects of the theory.

“Social representations are collaborative elaborations of a social object
by the community for the purpose of behaving and communicating”
(Wagner et al. 1999, p.95)

The first aspect to be clarified about social representations is their relation to both
individuals and social groups. Social representations exist not only in people’s minds
but also in the culture, being collectively realised (Rose et al. 1995). This means that, once
created, a social representation continues to exist on its own, on a trans-individual level
(Farr 1993, p.194; Moscovici 1988, p.231). However, this does not imply that social
representations are completely shared by everyone. Instead, they are partially distributed
as pieces of knowledge shared by some people yet possibly unknown to others
(Moscovici 1994, p.168). Different social representations of the same social object
exist simultaneously across social groups and, what is more, even within the same group
(Howarth 2006, p.68). Their lack of uniformity is amplified by the fact that distinct
social representations can be inconsistent by incorporating conflicting concepts
(Moscovici 1988, p.233). Similarities and differences in representations across and
within groups are indeed essential, as they enable communication. In a felicitous
manner, Gillespie (2008, p.379) noted that “the possibility of communication is born out
of similarity, while the necessity of communication is born out of difference”. The
permanent dialogue between individuals is furthermore the only driving force behind the
continuous change in social representations (McKinlay and Potter 1987, p.473).

4 Citation is taken from the Stanford Encyclopedia of Philosophy
(http://plato.stanford.edu/entries/aristotle/, accessed 2013.05.29)

Besides communication, social representations have a second function, which is to
enable individuals to orientate themselves in their social and material world (Moscovici
2008/1976, p.xiii). This world is only meaningful because social representations fill it
with meaning (McKinlay and Potter 1987, p.473). Figure 1 illustrates the prescriptive
nature of social representations (on the right) as opposed to the traditional view (on the
left). It is important to note that, consequently, stimuli are not independent of the
representations an individual holds. Instead, stimuli are interpreted in accordance with
common definitions that are shared in the corresponding society.

Fig. 1 Models of representation. Source: MOSCOVICI (2000/1984, p.71)

The power of social representations lies in their mediating role between individuals and
the outside world, which operates already at the level of stimuli. Each stimulus is thus
interpreted according to the social representations an individual holds (cf. Fig. 1). This
aspect was already highlighted in the previous section when the outside world was
identified as something that enters social life only through social representations.
Especially when a social representation is not recognised as such, an individual takes it
to be the reality on which he or she acts (Moscovici 1985, p.91).

Given the above perspective, it is apparent why social representations are considered to
be mediating devices which regulate human conduct (Valsiner 2003, p.7.6). MOSCOVICI
(2000/1984, p.23) uses a strong image to highlight the prescriptive role of social
representations: “They impose themselves upon us with an irresistible force”. However,
even if MOSCOVICI’s words sound as if social representations would determine every
course of human action, it is certainly not what he is trying to stress. Clearly, his
intention is to emphasise that individuals cannot eradicate all social
conventions, even though it is possible to consciously evade some of their constraints
(Moscovici 2000/1984, p.23). Individuals are actors in this process and are capable of
changing their realities constituted by social representations through acting on them
with awareness (Marková 1996, p.180).

Another important aspect of the theory is the distinction between the reified world of
science and the consensual world of common sense (Bauer and Gaskell 1999, p.167). For
MOSCOVICI, social representations arise at the transition from science into common
sense (Farr 1993, p.195). In this world of common sense, people appropriate only a
fraction of the information about the objects they encounter, and they do so by forming
social representations of those objects through communication (Moscovici 1988, p.215).
And yet people are still able to orient themselves successfully on the basis of this
incomplete body of representations - this is the paradox which lies at the root of social
representations theory. Analogously to anthropology and child psychology, which “trace
the genealogy of mythic thought to scientific thought”, the aim of social psychology and
thus of SRT is to explore the transition “from science to representations” (Moscovici
1988, p.217). To portray this important aspect of the theory more vividly, a longer
quotation from the work of BAUER AND GASKELL (1999) is helpful:

“Consider the following analogy: throwing a stone (genetic research) into a pond
(public) creates ripples. We are more interested in the ripples (representations of
genetics) and what they tell us about the invisible depths of the pond (local
concerns and sensitivities), than the stone itself (theories of genetics). Equally,
we assume that the stone throwers (geneticists and bio-technologists), while
starting the ripples, cannot control them. The very unpredictability of common
sense is the problematic of social representations theory”
(Bauer and Gaskell 1999, p.166-167)

In fact, most knowledge and ideas communicated by the media and verbally among
individuals have a scientific origin according to MOSCOVICI (1988, p.215). He
differentiates science from common sense on the basis of the notions of the consensual
and reified universes of knowledge. The reified universe is a domain of “rationality,
intellectual precision and independent judgement” which is neutral to individual values
(Marková 1996, p.182) while the consensual world is characterised by men being “the
measure of all things” (Moscovici 2000/1984, p.33). MOSCOVICI sees human cognition
as a product of an interrelation between two cognitive systems: while the first system
corresponds to the consensual universe and is based on associations and
discriminations, the function of the second lies in verification and control based on
logical rules (Moscovici 2008/1976, p.256).

The last characteristic of social representations to be mentioned is that they can be either
implicit or explicit. A social representation is explicit only if it becomes the subject of a
discussion itself or when communication is interpreted in terms of the underlying
representations (Gillespie 2008, p.377). Apart from these cases, social representations
are “buried under the layers of words and images” (Moscovici 1994, p.168).

As outlined previously, the very focus of SRT is not on describing what social
representations are, but on how they are formed. They are formed by two processes:
anchoring and objectification (Moscovici 2000/1984, p.41).

Anchoring

Whenever something unfamiliar and disturbing is experienced, the process of
overcoming unfamiliarity is triggered. In order to come to an understanding of the
phenomenon and allow communication, the unfamiliar must be named and classified
(Wagner et al. 1999). This process is called anchoring. Classification in this context is a
reduction of something unknown to familiar concepts (Moscovici 2000/1984, p.42). The
choice of suitable classes is based on comparing the new phenomenon to prototypes
generally considered to represent the corresponding class (Moscovici 2000/1984, p.42).
As an example, in its early stage, the unfamiliar phenomenon of HIV/AIDS, before
acquiring this name, was anchored in terms of a ‘gay plague’ or ‘gay cancer’ (Farr
1993, p.201). To name the unfamiliar, on the other hand, is a distinct act without which
neither effective communication nor a sense of familiarity is possible even if the
unfamiliar can be classified (Moscovici 2000/1984, p.43).

It is important to note that anchoring is more than simply naming and classifying the
unfamiliar. According to MOSCOVICI (2000/1984), the main aim of the process can be
seen in allowing interpretation of the characteristics associated with the unfamiliar. When
the unknown is compared to a prototype, it acquires characteristics of the prototype’s
category and is even adapted to fit within this category. Simultaneously, a positive or
negative relation with the unfamiliar is established, since anchoring is never neutral. This
happens on a subconscious level, resulting in the “priority of verdict over the trial”
(Moscovici 2000/1984, p.44). Accordingly, a person wearing dark glasses and using a
white cane is often instantaneously classified as a blind person without much effort being
made to determine the degree of the person’s visual impairment.

However, even after the unfamiliar phenomenon has initially been anchored, the process
does not stop. In fact, anchoring never stops (Höijer 2011, p.7). Anchors are thus an
integral part of thinking in general. As MOSCOVICI expresses it, there is no thought or
perception without an anchor (Moscovici 2000/1984, p.48).

Objectification

The second process in forming social representations is objectification. It is a
complementary process which often occurs in parallel to anchoring (Rosa 2013, p.20).
However, to distinguish the two processes one should note that every objectified
phenomenon is necessarily anchored but not every anchored phenomenon is objectified.
In the introduction to this thesis, the reality of an individual was described as the sum of
the social representations he or she holds. In this reality, every object corresponds to a
social representation which has gone through the process of objectification. The result of
objectification is thus a concept which ceases to be a sign in the individual’s mind and
becomes a replica of reality - something which the philosopher Hume called “the
mind’s property to spread itself on external objects” (Moscovici 1988, p.214). It is a
transfer from what is in the mind to something existing in the physical world
(Moscovici 2000/1984, p.42). Objectified phenomena have the ability to appear in front
of the eye since they acquire an “iconic quality” in addition to the purely intellectual
perception they had before (Moscovici 2000/1984, p.49).

However, some concepts cannot be directly ‘converted’ into images. For a concept such
as greed, the resulting image after objectification is another object which is created
using the objectified concept; a greedy man for example. MOSCOVICI found in his
famous La psychanalyse study that psychoanalytical concepts such as the complex
were objectified as men with complexes instead of producing an image for the complex
itself (Moscovici 2000/1984, p.52). In the case of a taboo object or an object for which
no image can be found, different images are integrated into a ‘figurative nucleus’
(Moscovici 2000/1984, p.50).

To realise the iconic power of objectified representations it is sufficient to recall the
concept of HIV/AIDS. The thought alone spawns fierce images in our minds. The
substitutive nature of social representation, where representation becomes reality, is
vividly illustrated by MOSCOVICI through the tale of Sinbad the Sailor (Moscovici
1988, p.220). An island on which Sinbad and his fellow sailors build a camp turns out
to be a giant fish. In this tale, a phenomenon is objectified as an island to such a degree
that it appears absolutely familiar until the fish suddenly dives into the deep sea.

2.1.3 Previous Applications and Ongoing Research

Alongside intensive research on the theory itself, the social representations
community has generated a considerable amount of empirical research over the last 50
years (László 1997). This section provides a summary of theory alterations, an
overview of its heterogeneous applications and an outline of the main criticisms
of the theory.

Theory alterations

Within the growing community of ‘social representations scholars’, many researchers
have contributed to the theory itself. Some of them have focused on the structural
aspects of social representations. The structural approach aims at identifying the roles of
different elements within social representations. A research group around JEAN-CLAUDE
ABRIC, for example, has advocated identifying a central core within social
representations which is essential to any social representation (Wagner et al. 1996,
p.332). A considerable number of studies have adopted this structural perspective on
social representations including works by DUVEEN (1996), JODELET (2008), PSALTIS
(2012), SÁ (1996) and WAGNER ET AL. (1996) to mention only a few. For example,
WAGNER ET AL. (1996) demonstrated the existence of central nuclei by asking for word
associations with concepts such as ‘war’ and ‘peace’ while changing the context of the
question. As a result, one part of the anchors – the “hot” stable core – was observed
despite the changing character of the question, while another part was identified to be
sensitive to those changes.

Gillespie (2008) suggested an extension for the theory based on the developed concept
of alternative representations. He argued that the concept is necessary to analyse how
social groups account for other groups’ representations. To guide the research on social
representations, BAUER AND GASKELL (1999, 2008) have repeatedly tried to formulate a
progressive research program providing different models of social representations such
as the Toblerone Model (1999) and the Wind Rose Model (2008). HOWARTH (2006)
encouraged the development of a more critical social representations theory that is
aimed at tackling a broader spectrum of relevant problems within society. Addressing
the challenge of predictability, Valsiner (2003) derived a theory of enablement from the
theory of social representations. The new theory aims at a better explanation of the
transition from present to future. Having discovered a growing body of sub-theories
within SRT, JODELET (2008) advocated a dialogue between them to increase the
explanatory power of the combined theory.

Theory applications

The original work of MOSCOVICI on SRT included the social representation analysis of
psychoanalysis in French society (Moscovici 2008/1961,1976). Following MOSCOVICI’s
initial work, a number of studies examined the public understanding of science and
technology (Bauer and Gaskell 1999; Farr 1996). Another body of research was
dedicated to representations of physical and mental health issues including works by
JODELET (1991) on madness, JOFFE (2009) on AIDS, MOLONEY AND WALKER (2002) on
transplants and WAGNER ET AL. (1995) on conception. Typical research topics for
social representations also include studies of gender (Duveen 1996; Psaltis 2012),
disasters (Gervais 1997) and human rights (Doise et al. 1998).

Today, applications of the theory, apart from the traditional areas, range from analysing
social representations of food (Backstrom et al. 2003) to the perception of wolves in
Scandinavia (Figari and Skogen 2011) and to social representations of burnout in IT
professions (Pawlowski et al. 2007). Especially in the latter area of information systems
(IS) and IT there is a growing body of research based on the theory. Alongside the
study on burnout in IT, VAAST (2007) introduced one of the first studies in the IS field
by investigating the social representation of IS-security within the healthcare domain. It
was followed by an introduction of the SRT perspective for studying socio-cognitive
processes during IS-implementation (Gal and Berente 2008). More recent studies within
information systems have adopted a social representations research perspective for
studying online privacy (Oetzel 2011), information relevance (Ju and Gluck 2011) and
social representation of social media in organisations (Kaganer 2010).

The social representations theory is known for the multiplicity of research methods one
can apply to study social representations (Moscovici 2000/1984). In an overview of
several studies which have adopted the theory, WAGNER ET AL. (1999) have identified
the use of methods including ethnography, interviews, focus-groups, content analysis of
media, statistical analysis of word associations, questionnaires and experiments. While
the majority of the techniques are qualitative, there is an increasing use of quantitative
methods for supporting analysis of social representations (Doise et al. 1993; Ju and
Gluck 2011; Breakwell and Canter 1993). Furthermore, the use of triangulation
techniques, where quantitative methods support the qualitative analysis, is considered
valuable (Gervais 1997; László 1997; Wagner et al. 1999).

Theory criticism

Just as the aforementioned researchers devoted themselves to enhancing and applying
the theory, others formulated aggressive criticism of MOSCOVICI’s work (Marková
2000, p.419). According to the critics, the theory is lacking clear definitions (Jahoda
1988), is conceptually incoherent (McKinlay and Potter 1987), flawed in its major
concepts (Bangerter 1995) and confused (Billig 1986).

Some of the concerns are repeatedly observed across the critical elaborations of the
theory. The explanatory power of SRT, which is based on the interdependence between
individual and collective processes, is difficult to make use of in the context of the
empirical research (László 1997, p.156). An example of the high complexity of this
interdependence is the interplay between the constituting role of social representations
and the capability of individuals to evade the imposed constraints5.

Another recurring point of criticism is directed towards the sharp distinction between
the reified world of science and the consensual world of common sense (Bangerter 1995;
Bauer and Gaskell 1999, 2008; Duveen 1990; Howarth 2006; Potter and Edwards
1999). It is argued that there is only a vague border between those worlds, if any. Social
representations can penetrate science as much as science enters the world of common
sense through social representations. Some voices within academia even call to abandon
the very distinction between science and non-science: “Go to a laboratory, any lab will
do, and hang around [...] do you see anything beyond ordinary discourse and situated
action?” (Lynch 1997/1993).

Despite all the criticism, the theory is successfully spreading across researchers and
different sciences – a fact which is acknowledged even by its most vehement critics
such as JAHODA (1988) and POTTER AND EDWARDS (1999).

2.2 The Wikipedia Project

The second pillar of this thesis is the Wikipedia platform. An overview of the
project and the associated research in academia is necessary to establish a connection
between the social representations theory and the Wikipedia project.

2.2.1 Wikipedia Outline

Wikipedia is one of the best known websites ever created. Its popularity is reflected in
the internet traffic in which the online encyclopaedia is ranked as the 6th most visited
web page in the world6. The definition of Wikipedia can be reduced to the definition of
its properties: free, collaboratively edited, consensus-based and multilingual. Wikipedia
is free due to the use of the GNU Free Documentation License7 (“Wikipedia License
Information”). Collaborative editing is achieved by allowing any individual with
internet access to contribute to any non-restricted8 article. Regarding the editorial
policy, Wikipedia has a unique position among encyclopaedias by preferring consensus
over credentials in the process of creating articles9 (Yasseri et al. 2012, p.2). The mix of
the aforementioned characteristics is available in 284 languages currently supported by
Wikipedia.

5 Cf. section 2.1.2
6 Wikipedia is the 6th most visited site according to Alexa.com
(http://alexa.com/siteinfo/wikipedia.org, accessed 26.05.2013)

The Wikipedia project is run by the American non-profit organisation Wikimedia
Foundation (WF) headquartered in San Francisco. Besides Wikipedia, the WF portfolio
includes projects such as Wiktionary, Wikiquote, Wikinews and MediaWiki. The latter
is the software behind all WF projects. The complete list of projects is maintained at the
WF website10.

The history of Wikipedia goes back to January 2001, when the encyclopaedia was
created by Jimmy Wales and Larry Sanger with the aim that any internet user could edit
any article of the encyclopaedia at any time (Olleros 2008). Since then it has grown
rapidly in both the number of articles and the number of participating users. According
to the official Wikipedia statistics, the encyclopaedia has a total of over 4 million articles
and over 18 million users, of whom around 130 thousand are considered to be active11.
Consequently, even within academia there are voices claiming Wikipedia to be
unquestionably the number one reference in practice (Yasseri et al. 2012). Others
claim Wikipedia to be one of the most complex and relevant data sets humanity has ever
produced (Martin 2011).

2.2.2 Subject of Scientific Studies

Apart from being a starting point for intellectual curiosities, Wikipedia is also known as
a platform providing data for various research projects. Its popularity within academia is
rooted in the fact that every article revision and every discussion post ever made are
saved and available on the platform. This makes Wikipedia a suitable subject of a broad
research spectrum including epistemological studies and studies regarding collaborative
processes (Martin 2011). In the following, research related to the Wikipedia project is
introduced. The section is structured according to the different aspects of the platform
which are researched by academia.
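
To make the availability of this revision data more concrete, the following minimal sketch shows how a request for the edit history of an article could be composed against the public MediaWiki web API. It is an illustration only and not part of the method developed in this thesis: the endpoint address, the helper names revision_history_url and talk_page_title, and the chosen revision properties are assumptions made for this example. The sketch only builds the request URL and does not contact the server.

```python
from urllib.parse import urlencode

# Assumed endpoint of the English Wikipedia; other language versions use other hosts.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def revision_history_url(title, limit=50, continue_token=None):
    """Build a MediaWiki API URL that lists revisions of one page, oldest first.

    Hypothetical helper for illustration; it performs no network access.
    """
    params = {
        "action": "query",        # generic query module
        "prop": "revisions",      # ask for revision metadata of the page
        "titles": title,
        "rvprop": "ids|timestamp|user|comment",
        "rvlimit": limit,         # number of revisions per response
        "rvdir": "newer",         # chronological order, oldest revision first
        "format": "json",
    }
    if continue_token is not None:
        # Token returned by a previous response, used to fetch the next batch.
        params["rvcontinue"] = continue_token
    return API_ENDPOINT + "?" + urlencode(params)


def talk_page_title(title):
    """Return the title of the discussion page belonging to a main-namespace article."""
    return "Talk:" + title
```

Calling revision_history_url(talk_page_title("Psychoanalysis")) would, under the same assumptions, compose the corresponding request for the discussion page of an article, which is where the negotiation among editors takes place.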

7 http://www.gnu.org/copyleft/fdl.html
8 Some Wikipedia articles are limited to contributions only by users with certain rights. Cf.
http://en.wikipedia.org/wiki/Wikipedia:User_rights for details
9 http://en.wikipedia.org/wiki/Wikipedia:Consensus
10 http://wikimediafoundation.org/wiki/Our_projects
11 http://en.wikipedia.org/wiki/Wikipedia:Statistics as of 12.04.2013

Distributional aspects

A number of studies were devoted to the category structure of Wikipedia as a form of
structuring knowledge. The category network was found to be well maintained (Holloway
et al. 2007) and mostly stable (Suchecki et al. 2012). Appendix A contains the distribution
of pages across Wikipedia namespaces and the thematic distribution of encyclopaedia
articles identified by ANDERKA AND STEIN (2012). The latter shows cultural and
thematic biases towards western cultures and historical content (Bellomi and Bonato
2005). Similarly, distributional aspects on the contributor level were researched at the
United Nations University MERIT, showing a correlation between the number of
contributors and the number of articles within a thematic field, as well as cultural
differences in the contributory patterns when looking at different language versions of
Wikipedia (Glott et al. 2010).

Weaknesses and potentials

One of the major research areas within the Wikipedia research community is dedicated
to questions regarding weaknesses of the platform. They reflect the waves of criticism
Wikipedia has experienced during its years of existence. The main criticism is directed
at the quality and reliability of information on Wikipedia (Anderka
and Stein 2012; Magnus 2009; Martin 2011; Stross 2006). Among the most discussed
issues in this area are concerns regarding vandalism (Potthast et al. 2008) and the lack
of credibility in the process of creating encyclopaedia articles on Wikipedia
(Kubiszewski et al. 2011).

While the aforementioned works draw attention to the question of Wikipedia’s general
reliability, other studies call for these concerns to be tempered. BARTON’S (2005)
analysis has identified the long-term potential of the Wikipedia platform if its democratic
and decentralised nature is preserved. In this context, Wikipedia is seen as a
demonstration of the potential of information systems to create more emancipatory forms
of communication (Hansen et al. 2009). This perspective appears even more optimistic
given that the quality of Wikipedia articles has been found to be comparable with that of
the proprietary Britannica encyclopaedia (Giles 2005). OLLEROS (2008) goes further in
defending the ‘Wikipedia principle’ by questioning the applicability of the traditional
quality criteria to the platform. According to OLLEROS, the quality dimensions for
encyclopaedias undergo changes as a result of the disruptive influence that collaborative
projects such as Wikipedia are exercising. An example of such a reorientation in
assessing the quality of knowledge in general is an essay by ROSENZWEIG (2006) in which
the way of producing historical knowledge on Wikipedia is contrasted with the traditional,
rather individualistic, approaches of historians. Furthermore, both WF and academia
continue proposing and introducing different instruments to improve the overall quality
of Wikipedia such as the reputation system suggested by KORSGAARD AND JENSEN
(2009).

Related to the question of quality is the vandalism threat Wikipedia is constantly
exposed to. Nevertheless, Wikipedia users do not often observe vandalism,
especially in the more popular articles, owing to the high number of editors and the
resulting fast reactions to delinquents’ activities. In this context, MARTIN (2011) stresses
the superior performance of human group supervision compared to machine
capabilities. It is important to note that Wikipedia has recognised the need for protection
against vandalism early on and introduced a number of instruments to counter it such as
watchlists12, protected articles, revision reverts13 and IP blocking (Martin 2011).
Additionally, there is a great variety of approaches proposed by academia to
automatically detect vandalism, a part of which is already in operation (Adler et al.
2011; Belani 2010; Chin et al. 2010; Ghazal et al. 2007; Javanmardi et al. 2011;
Potthast et al. 2008; Smets et al. 2008). As a result, Wikipedia appears to be able to
sustain control over delinquents’ activities.

Semantic aspects

Researchers concerned with the development of the Semantic Web take as their starting
point the limits of Wikipedia’s search capabilities and the inconsistencies caused by the
duplication of information on different pages (Morsey et al. 2012). Their efforts are
directed towards a better utilisation of the information contained in Wikipedia. The
DBpedia project is a successful example of these efforts. BIZER ET AL. (2009) introduce
DBpedia as an extraction mechanism to convert structural information contained in
Wikipedia infoboxes into Linked Data14. In related research, Siorpaes and Bachlechner
(2006) address the challenge of creating and maintaining ontologies for knowledge
systems. They advocate using the millions of Uniform Resource Identifiers (URIs)15
available on Wikipedia to improve knowledge management in organisations by using
socially maintained consensual Wikipedia vocabularies.

Content analysis

A considerable part of the Wikipedia research is concentrated around the
encyclopaedia’s content itself. The analysis of Wikipedia articles reveals correlations
and relationships which are important for understanding content generation mechanisms.

12 Users can put pages into their watchlist in order to be updated on any changes those pages undergo
13 Cf. http://en.wikipedia.org/wiki/Wikipedia:Edit_warring for details
14 Cf. www.w3.org/DesignIssues/LinkedData.html for Tim Berners-Lee’s essay on Linked Data
15 Cf. section 3.1 for Wikipedia structure

WILKINSON AND HUBERMAN (2007) discovered a high correlation between the number
of edits to an article and its quality. A correlation between the popularity of an article,
indicated by a high number of views, and its editing intensity was identified by
RATKIEWICZ ET AL. (2010). Consequently, one could assume a correlation between the
quality of a collaborative work and the popularity of its content. Interestingly, another
correlation was found between the quality of an article and its size16. This speaks in
favour of a rather smooth process of article evolution in which additional edits lead to
a constructive extension and correction of the existing text (Wilkinson and Huberman
2007). According to YASSERI ET AL. (2012), the majority of articles in the English
Wikipedia follow this smooth evolution process.

Epistemological and collaboration aspects

While the aforementioned studies are concerned with either questions related to the WP
project as a phenomenon or with explanations of its properties such as the quality of the
articles, another branch of research comes from the epistemological direction. In light of
the thesis’ focus, this perspective on Wikipedia as a knowledge system and on the
evolution of this knowledge is more relevant. Following SUCHECKI ET AL. (2012) it is
possible to consider Wikipedia as a “proxy for knowledge in general”. The emphasis on
temporality of knowledge is characteristic for this type of research. Accordingly,
KALTENBRUNNER AND LANIADO (2012a) analysed changes in talk pages over time in
order to understand patterns of the collaborative process associated with content
creation. In this study, a measure of discussion growth is used as an instrument for
detecting controversies and assessing discussion maturity. RATKIEWICZ ET AL. (2010)
analysed measurable effects of external events on the related Wikipedia articles. They
found that exogenous factors, such as an Oscar nomination for an actor, influence
the content of the corresponding articles and the process behind the creation
of this content.

The process of approaching a consensus on Wikipedia is particularly interesting for the
given work. YASSERI ET AL. (2012) have studied conflict dynamics and controversy
within WP articles. They come to the conclusion that if a consensus is not reached, it is
due to either new editors arriving at the platform or new events happening. However, the
researchers emphasise that there must be a theoretical explanation for their observations.

16 Blumenstock, J. E. (2008), “Size matters: Word count as a measure of quality on wikipedia”, in
Proceedings of the 17th international conference on World Wide Web (WWW) 2008, pp. 1095–1096.

2.3 Social Representations on Wikipedia

The conducted literature review reveals that although Wikipedia is seen as a proxy for
knowledge in general (Suchecki et al. 2012), there have been no attempts to study the
genesis of this knowledge. In terms of the social representations theory, no studies
explore the genealogy of social representations circulating on Wikipedia. The latter is
striking since core aspects, namely temporality and a strong emphasis on communication,
are characteristic of both the Wikipedia platform and the social representations theory.
The transparency of the collaboration process on Wikipedia makes an application of the
SRT framework to the encyclopaedia platform especially appealing.

The potential benefits of a method for studying social representations on Wikipedia are
evident when recapitulating the research problems outlined in both sections of the
literature review. On the one hand, a branch of Wikipedia research is concerned with
patterns in collaboration and consensus making (Kaltenbrunner and Laniado 2012a) -
processes which are difficult to study without a proper theoretical framework.
Considering that the application of qualitative techniques to Wikipedia is rare, the
explanatory power of the current analyses conducted on Wikipedia remains limited. On
the other hand, applications of social representations theory are often limited by the lack
of data required to analyse the genesis of social representations (Vaast 2007). The theory
would furthermore directly benefit from a large scale study of anchoring and
objectification processes on Wikipedia since those processes are still barely understood
and rarely investigated (Duveen and De Rosa 1992, p.106). Critics of the theory
challenge the very existence of anchoring and objectification by arguing that tests for
existence of those hypothetical mechanisms are virtually impossible (McKinlay and
Potter 1987). A successful application of the method proposed in this thesis can provide
evidence for the existence of anchoring and objectification processes within the social
representations theory.

In summary, researchers within the Wikipedia community work with unique historical
data regarding knowledge genesis but lack a theory to explain it, while the research on
social representations theory provides an excellent explanatory device for studying
genealogy of knowledge but lacks the required data to identify any patterns. The given
thesis is a step towards overcoming this gap by introducing a method for studying the
genesis and evolution of social representations on the Wikipedia platform. In a world
in which social media plays an increasing role in communication and collaboration
within and across social groups, both Wikipedia and social representation research
communities will benefit from this ‘marriage’.

3 Methodology of Studying Social Representations on Wikipedia

In the following, the research methodology employed by the thesis will be introduced.
The aim of this section is twofold. First, the concepts and processes of social
representation theory need to be mapped to Wikipedia. Second, the quantitative analysis
on the basis of the established link between SRT and Wikipedia must be integrated into
a case study. Case studies are required to demonstrate and to verify the applicability of
the method. Furthermore, they should illustrate the necessity of a qualitative analysis
which supplements the quantitative data that can be derived from the historical data on
the Wikipedia platform.

The section begins with an analysis of the Wikipedia structure followed by a discussion
regarding the suitability of Wikipedia for studying social representations. The link
between Wikipedia and SRT elements is established in section 3.3, while section 3.4
presents the structure of the case studies. Implications of the required analysis
techniques for the case studies are discussed in section 3.5.

3.1 Wikipedia Structure

Any research based on Wikipedia’s historical data requires an understanding of the
platform’s internal structure (Suchecki et al. 2012, p.1). In the following, Wikipedia
elements are explained in detail.

Page

When considering technical details of Wikipedia from the user’s perspective, its central
element is the page. Every Wikipedia page is an instance with a Uniform Resource
Locator (URL) leading to it. An example of a page is the article “Thesis” with a
corresponding URL: http://en.wikipedia.org/wiki/Thesis.

Not every Wikipedia page is an encyclopaedia article. There are different page types
represented by so-called namespaces – sets of pages containing a special prefix that is
recognised by the MediaWiki software17. It is important to note that every instance of
MediaWiki software such as the English Wikipedia can have its own namespaces. Table
1 indicates the namespaces of the English Wikipedia.

Basic namespaces       Talk namespaces
(Main/Article)         Talk
User                   User talk
Wikipedia              Wikipedia talk
File                   File talk
MediaWiki              MediaWiki talk
Template               Template talk
Help                   Help talk
Category               Category talk
Portal                 Portal talk
Book                   Book talk
Education Program      Education Program talk
TimedText              TimedText talk
Module                 Module talk
Tab. 1 Namespaces of English Wikipedia (Source: wikipedia.org)

17
http://en.wikipedia.org/wiki/Wikipedia:Namespace

Consequently, URLs that start with http://en.wikipedia.org/wiki/ and end with Editor or
User:Editor lead to different pages. While the former leads to an article in the Main/Article
namespace, the destination of the latter is a page of the user “Editor” in the user
namespace. It is apparent that pages in different namespaces are semantically different.
Pages within the user namespace, for example, represent data about the user. In
contrast, the file namespace contains pages dedicated to uploaded files and their metadata.
All 26 namespaces have their own functions, the majority of which might be less known
to occasional Wikipedia users. Importantly, it is possible on the platform to have 26
pages with the same name but located in different namespaces, each of which is used to
represent a particular Wikipedia resource type such as an article, user, book or category.
The existence of different namespaces must be taken into account when conducting
quantitative analysis of articles. This will be demonstrated in section 3.

As illustrated by Table 1, namespaces are divided into basic and talk namespaces. One
of the central collaboration aspects of Wikipedia is this distinction according to which
every page in any of the basic namespaces has a corresponding talk page in which users
can discuss the content of the actual page. While the ‘Alice’ encyclopaedia article can
be discussed on the ‘Talk:Alice’ page, the page for discussing the user ‘Alice’ is
‘User_talk:Alice’. The same logic applies to all other basic / talk namespace pairs.
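
The namespace logic described above can be illustrated with a minimal Python sketch. It is not part of the MediaWiki software; the function names (split_title, talk_page) are hypothetical, the namespace list is taken from Tab. 1, and the matching is simplified (for instance, it is case-sensitive, whereas real namespace prefixes are matched more leniently).

```python
# Basic namespaces of the English Wikipedia as listed in Tab. 1.
# The empty string stands for the Main/Article namespace, which has no prefix.
BASIC_NAMESPACES = [
    "", "User", "Wikipedia", "File", "MediaWiki", "Template", "Help",
    "Category", "Portal", "Book", "Education Program", "TimedText", "Module",
]


def talk_namespace(basic: str) -> str:
    """Return the talk namespace that belongs to a basic namespace."""
    return "Talk" if basic == "" else f"{basic} talk"


ALL_NAMESPACES = set(BASIC_NAMESPACES) | {talk_namespace(ns) for ns in BASIC_NAMESPACES}


def split_title(title: str) -> tuple[str, str]:
    """Split 'User:Editor' into ('User', 'Editor'); unprefixed titles are Main."""
    prefix, sep, rest = title.partition(":")
    prefix = prefix.replace("_", " ")
    if sep and prefix in ALL_NAMESPACES:
        return prefix, rest
    # A colon that does not follow a known prefix is part of the page name.
    return "", title


def talk_page(title: str) -> str:
    """Return the title of the talk page that corresponds to a basic page."""
    namespace, name = split_title(title)
    if namespace not in BASIC_NAMESPACES:
        raise ValueError(f"{title!r} is already a talk page")
    return f"{talk_namespace(namespace)}:{name}".replace(" ", "_")
```

Under these assumptions, talk_page("Alice") yields "Talk:Alice" and talk_page("User:Alice") yields "User_talk:Alice", mirroring the example in the text.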

The majority of Wikipedia pages in the main/article namespace have several
subsections for structuring the content of the encyclopaedia article. The existence of
different sections is indicated by a table of contents at the beginning of the article, and
by corresponding section headlines in the article text. The first section is typically the
definition section for the subject of interest, while the subsequent sections are
content-specific. Short or newly created articles often comprise only one section.

The Wikipedia platform allows the content of entire pages to be used within other
pages. This is achieved by using so-called templates. Templates reduce formatting
efforts by including extensive amounts of formatted content in a page through a simple
embedding technique. A typical example of a template is an infobox – a table often
displayed at the beginning of articles about cities and countries containing general
statistics for the latter. The syntax for inclusion of a template using the Wikipedia
markup language is illustrated below.

{{Template:<template name>|<template parameters>}}

Every Wikipedia article is assigned to at least one category. The Wikipedia category
structure is hierarchical, with 26 main categories18. Each category includes a list of the
pages it contains, a list of its subcategories and a list of the supercategories that this
category is a part of. Page categories are displayed at the bottom of every page in the
main namespace. This allows for navigation between thematically related encyclopaedia
articles. Navigation between Wikipedia pages is also enabled by internal links entitled
wikilinks, which appear within the content of Wikipedia pages and point to other pages.
It is important to note that for encyclopaedia content, only links to articles relevant for
the understanding of the phenomenon are welcomed by the Wikipedia community19.

Any change to a Wikipedia page results in a new version of this page being permanently
stored on the Wikipedia servers as a revision which is made accessible for any user at
any time. Additionally, Wikipedia stores meta-data for the new revision. This includes
the timestamp, a user description of the change, an indication of the change in the page
size, a user indication of whether the change is minor or major and information about
the user who made the change. The latter includes the name of the user, or IP address in
the case of an anonymous user, and an indication of whether or not the user is a bot.
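
As an illustration, the revision meta-data listed above could be held in a simple record type. The following Python sketch is an assumption made for illustration only: the class Revision, its field names and the helper human_revisions are hypothetical and do not reproduce Wikipedia's actual storage schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Revision:
    """One stored version of a page together with its meta-data."""
    timestamp: datetime   # when the change was saved
    comment: str          # user description of the change
    size_delta: int       # change in the page size (bytes)
    minor: bool           # user indication: minor (True) or major (False)
    user: str             # user name, or IP address for anonymous users
    anonymous: bool       # True if the change was made without an account
    bot: bool             # True if the change was made by a bot user
    text: str = ""        # page content of this revision (optional here)


def human_revisions(revisions: list[Revision]) -> list[Revision]:
    """Return the revisions not made by bots, in chronological order."""
    return sorted((r for r in revisions if not r.bot), key=lambda r: r.timestamp)
```

Filtering out bot edits in this way is one possible preprocessing step; whether it is appropriate depends on the question under study.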

User

The user concept on Wikipedia requires elaboration. Wikipedia users are generally
divided into anonymous and registered users. Anonymous users are known by their IP
address, and have minimal rights such as permission to read and to edit unrestricted
articles. To obtain the rights to perform additional actions, a registered user must have
the required user access level20. Advanced user access levels allow blocking other users,
moving or renaming articles and applying restrictions to prevent pages from being
edited.

Registered users are divided into standard users, higher-privileged users such as
administrators, bureaucrats and stewards, and bot users. The privileged users possess
additional access rights as mentioned before, while bot users are a group of non-human
automata in the form of programs, scripts and macros. Bot users perform automated
tasks for the purpose of both maintaining Wikipedia quality and gathering statistics.

18
http://en.wikipedia.org/wiki/Category:Main_topic_classifications
19
http://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Linking
20
http://en.wikipedia.org/wiki/Wikipedia:User_access_levels

3.2 Suitability of Wikipedia for Studying Social Representations

Additional remarks regarding the suitability of the Wikipedia platform for studying
social representations are required before establishing the link between the theory and
Wikipedia elements in the next section.

The claim by MARKOVÁ (1996, p.183) that “society forms its opinions by consensus
amongst members of the public who are all equal” can be applied to both the process of
forming representations and the process of creating Wikipedia articles. Although she
refers to the former, the similarity in the temporal consensus-making within SRT and on
Wikipedia is striking. This becomes clearer when the consensus-making on Wikipedia
is compared to the process of forming social representations as described by
MOSCOVICI. To illustrate the comparison, an extensive citation is employed:

“[...] we could think of social representations as being produced by a
collective decision committee. Its members cast their votes and can express a
broad range of opinions. Each one knows how the others have voted so that he
can change his mind, combine opinions. The final decision is the joint effort of
the participants and expresses a sense of the meeting. There is no need to
reach an explicit consensus or to submit to a rite; as long as the individual
initiatives are in line with the social flow, nothing more is needed. Each
individual proposition is thus tied in with the action of the group, which can
give it a shape that is acceptable and comprehensible for all concerned.”
(Moscovici 1988, p.220)

When the aforementioned process is projected onto the Wikipedia platform, its users
form the collective decision committee. They vote and express opinions by changing the
Wikipedia content and participating in discussions on the corresponding talk pages.
They know how others have “voted” by accessing the revision history and by observing
the previous discussions regarding the topic of interest. In doing so, Wikipedia users
build upon the efforts of others. Consequently, the natural way of reaching temporal
consensus, as observed on the Wikipedia platform, is exactly the type of consensus
MOSCOVICI refers to in his words about forming social representations. In fact,
MOSCOVICI’s citation might well be used as an official description of the collaboration
process on Wikipedia.

As outlined in the literature review, the main weaknesses of Wikipedia are identified
around the quality of the encyclopaedia content. Those weaknesses are however less
relevant for answering the research question. This is due to the thesis focusing on the
process of forming social representations, rather than on the ‘correctness’ of this
content. The very idea of ‘correctness’ itself is abandoned in social representation
theory as was explained in section 2.1.2. The dynamic nature of Wikipedia and the
SRT, together with the growing number of social representations circulating in the digital
world, make Wikipedia an appropriate means for investigating the concept of social
representations.

However, several aspects of the theory do not immediately match with the collaborative
processes on Wikipedia. First, the social representations theory assumes plurality of
representations within a single social group (Moscovici 1988, p.219). The Wikipedia
platform does not provide direct means for tracing different social representations of the
same phenomenon that exist simultaneously within a single social group. Rather, there
is only ever one representation at any given point in time that can be analysed. This is
one of the Wikipedia specifics. Consequently, possible effects this aspect might have on
the evolution of social representations must be discussed. Another potentially
problematic aspect is the fact that all representations on Wikipedia seem to be of a
rather formalised nature: There are continuous efforts towards improving the quality of
Wikipedia by making it more neutral, objective and trustworthy. An example of such
efforts is the provision of extensive information regarding the sources from which the
content has been acquired. In MOSCOVICI’s terminology one would say that the
representations are of a ‘scientific’ nature. The above implies that the social representations
found on Wikipedia are not situated in the consensual world as is supposed by the original
formulation of SRT. Instead, they tend to be “indifferent to individuality and [to] lack
identity” – a characteristic for the opposite reified world (Moscovici 2000/1984, p.32).
This incoherence seems to support the criticism that the theory has experienced
regarding the distinction between the reified world of science and the consensual world
of common sense. Consequently, the work supports the view that the interrelation
between these two worlds is more complex than originally formulated by MOSCOVICI.

In light of the aforementioned arguments, Wikipedia is considered suitable for studying
the evolution of social representations that circulate on the platform. Nevertheless, the
specifics of the platform must be explicitly taken into account in the discussion part of
the work.

3.3 Wikipedia Article as a Central Element of Study

The theory of social representations was presented in section 2.1.2, while the structure
of the Wikipedia platform was discussed in section 3.1. Section 3.2 was concerned with
the suitability of the Wikipedia platform for studying social representations. It is now
possible to address the research question of how social representations on Wikipedia
can be studied.

3.3.1 Social Representation Concepts in the Article

To answer the research question it is necessary to break it down into several sub
questions as illustrated in Fig. 2. The first part of the question (Q1.1) will be addressed
in the following, while the second part (Q1.2) comprises the content of the subsequent
section.

Fig. 2 Breakdown of the First Part of the Research Question 1

In order to study the evolution of social representations on Wikipedia, the first step must
be the clarification of what is understood by ‘social representation’ on the platform
(Q1.1.1), as well as what the equivalent of an anchor is on Wikipedia (Q1.1.2). The
answers to both questions establish a theoretical link between concepts within the social
representations theory and Wikipedia elements.

According to MOSCOVICI (2008/1976, p.xiii), social representations must provide a code
for naming various aspects of the world unambiguously. Furthermore, they must help
individuals to orientate themselves and allow communication. Those characteristics are
inherent to every encyclopaedia article. Each article gives a unique name to a particular
aspect of the world and aims to provide meaning to it (Siorpaes and Bachlechner 2006).
These functions are clearly of importance to individuals’ orientation and
communication. In this context, each socially created encyclopaedia article reflects a
corresponding social representation. In the case of Wikipedia, only articles in the Main
namespace are encyclopaedia entries, and thus are the only entries containing social
representations (cf. Tab. 1 in section 3.1). Pages in other namespaces are limited to
representing the specific Wikipedia resources. They are used to organise Wikipedia’s
content, and are therefore not considered to contain social representations.

Unlike with the social representation concept, the correspondence between the anchor
concept and Wikipedia elements is less apparent. There are different candidates among
Wikipedia elements that can be interpreted as anchors. In order to identify elements that
are meaningful and coherent with the theory, a number of different articles were
analysed in the initial phase of the research project. The most natural ‘candidate’ for an
anchor role within an article is an internal link to another article. The theory requires
that anchors are social representations themselves (Moscovici 2000/1984, p.42). An
internal link to an article in the main namespace satisfies this requirement as per
definition. However, not every internal link contained in an article plays the role of an
anchor: There are internal links that are not related to the social representation under
study. It was found that different sections of the article contain different numbers of
relevant anchors. After the examination of multiple Wikipedia articles, the anchors from
the first section of the article, known as the ‘definition part’21, were identified as the
most relevant. The reason is that the process of defining internal links
directly corresponds to the social representations theory. This process aims at
identifying concepts relevant to understanding the phenomenon of interest. Such
concepts are linked by the Wikipedia users to the corresponding articles. Figure 3
shows, as an example, potential anchors in the definition section of the “Thesis” article.

Fig. 3 Internal Wikipedia Links in the Definition section of an Article
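
The extraction of potential anchors from the definition section can be sketched in Python as follows. This is a simplified illustration, not the tooling used in the thesis: the function names are hypothetical, the regular expression covers only plain [[Target]] and [[Target|label]] links, and links hidden inside templates or nested inside file captions are not handled.

```python
import re

# [[Target]], [[Target|label]] or [[Target#Section|label]]; group 1 is the target.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")
# A section headline such as "== Structure ==" on a line of its own.
HEADING = re.compile(r"^==.*==\s*$", re.MULTILINE)

# Prefixes of all non-main namespaces from Tab. 1, in lower case.
_BASIC = ["user", "wikipedia", "file", "mediawiki", "template", "help", "category",
          "portal", "book", "education program", "timedtext", "module"]
NON_MAIN_PREFIXES = set(_BASIC) | {f"{ns} talk" for ns in _BASIC} | {"talk"}


def definition_section(wikitext: str) -> str:
    """Return the text before the first section headline."""
    match = HEADING.search(wikitext)
    return wikitext[:match.start()] if match else wikitext


def definition_anchors(wikitext: str) -> set[str]:
    """Return the main-namespace link targets found in the definition section."""
    anchors = set()
    for match in WIKILINK.finditer(definition_section(wikitext)):
        target = match.group(1).strip().replace("_", " ")
        if not target:
            continue
        if ":" in target and target.partition(":")[0].strip().lower() in NON_MAIN_PREFIXES:
            continue  # links to files, categories, users etc. are not anchors
        # The first letter of a page title is case-insensitive on Wikipedia.
        anchors.add(target[0].upper() + target[1:])
    return anchors
```

Applied to every stored revision of an article, such a function would yield one anchor set per revision, which is the input assumed by the measures discussed in section 3.3.2.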

21
The definition is the first part of the article before the table of contents or any other article sections.

Further candidates for the anchor role are the categories the article is assigned to. As
outlined in section 3.1, every Wikipedia article in the main namespace is assigned to
at least one category. However, not every category has a corresponding encyclopaedia
article in the main namespace. Most categories have as their single function the
organising of articles into a category. For example, the article “IPad” is assigned to the
category “Products introduced in 2010”. The latter has no corresponding article in the
main namespace meaning that there is no encyclopaedia article for “Products introduced
in 2010”. This contradicts the requirement of an anchor to be a social representation
itself. Furthermore, analysing the categories of an article has proven to be a
tedious task. The ‘category’ elements on Wikipedia are hence disregarded as possible
anchors due to both theoretical and technical challenges.

In addition to internal links in the definition section and categories, any word in the
plain text of the article can potentially be an anchor. It is, however, difficult to identify
which of them are possible anchors and which are not. Additionally, there are reasons
for excluding unlinked components of the plain text from the list of potential anchors.
The process of linking concepts as such is an important factor when it comes to
interpretation of an element as an anchor. Internal linking on Wikipedia is a purposeful
act by the social group that indicates an intended reference to a related concept. Such
intention emphasises the importance of the referenced social representation when it
comes to giving meaning to the social representation under study.

3.3.2 Social Representation Processes in the Article Evolution

Similarly to the link between the concepts of social representation theory and Wikipedia
elements, a connection between SRT processes and those observable on Wikipedia must
be established in this section. A match between theoretical processes within the theory
and those observed on Wikipedia is required to interpret changes in the social
representation on Wikipedia in accordance with MOSCOVICI’s theory. This is the second
part of the research question and can be broken down into sub questions as shown in
Fig. 4. Answers to the sub questions will provide a foundation for recognising patterns
in the evolution of social representations and in the collaboration processes
accompanying this evolution.

Regarding social representation evolution, there are two different groups of aspects to
be considered when searching for means of studying these processes on Wikipedia. The
first group consists of aspects regarding the change in the representation while the
second group focuses on collaboration dynamics within the social group. Figure 4
illustrates the distinction.

Fig. 4 Breakdown of the Second Part of the Research Question

i. Representation evolution aspects

A change in a social representation is marked either through a change in the anchoring
or through a change in the objectification level. Answers to sub questions (1) to (3)
will provide a way of studying the anchoring process on Wikipedia. The evolution
of social representations is impossible without the introduction of new anchors and the
removal of old anchors. To identify new and obsolete anchors in Wikipedia (1), all
article revisions within two different periods of time are considered. New anchors are
those found in revisions of the newer period of time that are absent in revisions of the
previous period of time. Conversely, obsolete anchors are those that were present in the
previous period, but are absent in the current period of time.
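
The identification of new and obsolete anchors amounts to two set differences. The Python sketch below assumes that each revision has already been reduced to the set of anchors found in its definition section; the function names are hypothetical.

```python
def anchors_in_period(revision_anchors: list[set[str]]) -> set[str]:
    """Return every anchor that occurs in at least one revision of a period."""
    return set().union(*revision_anchors)


def anchor_changes(previous: list[set[str]],
                   current: list[set[str]]) -> tuple[set[str], set[str]]:
    """Return (new anchors, obsolete anchors) between two periods of time."""
    prev = anchors_in_period(previous)
    curr = anchors_in_period(current)
    new = curr - prev        # found in the newer period, absent in the previous one
    obsolete = prev - curr   # found in the previous period, absent in the newer one
    return new, obsolete
```

For example, if "IPhone" occurs only in revisions of the previous period and "IOS" only in revisions of the current period, the former is reported as obsolete and the latter as new.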

Similarity of anchor states (2) is another important aspect, as anchors are not considered
to possess equal strength or significance for the corresponding social representation.
Even after the introduction of many new anchors, the similarity between anchors from
two different periods can stay on the same level due to the relative weakness of the new
anchors. To measure the similarity of anchoring in two different time periods, the
cumulative time during which the anchors were present in those periods of time can be
compared. A higher similarity value should indicate that the time in which the strongest
anchors were present in one period is comparable with the time the same anchors were
present in the other period of time. It is important to note that to assess similarity, a purely
quantitative analysis as described above is not sufficient. Only a qualitative analysis of
anchors and their context can reveal whether they are indeed similar or not. Quantitative
data can only indicate changes that will then trigger the subsequent qualitative analysis.
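
One way to operationalise the comparison of cumulative presence times is sketched below in Python. The text above does not fix a formula, so the cosine similarity used here is an assumption chosen for illustration, as are the function names. The sketch further assumes that each revision remains in force until the next revision is saved, and it ignores a revision that was made before the period began.

```python
from datetime import datetime
from math import sqrt


def presence_time(revisions: list[tuple[datetime, set[str]]],
                  period_end: datetime) -> dict[str, float]:
    """Cumulative time in days that each anchor was present in a period.

    `revisions` holds (timestamp, anchors) pairs sorted by timestamp.
    """
    presence: dict[str, float] = {}
    for i, (start, anchors) in enumerate(revisions):
        # A revision is visible until the next one replaces it.
        end = revisions[i + 1][0] if i + 1 < len(revisions) else period_end
        days = (end - start).total_seconds() / 86400
        for anchor in anchors:
            presence[anchor] = presence.get(anchor, 0.0) + days
    return presence


def anchoring_similarity(p: dict[str, float], q: dict[str, float]) -> float:
    """Cosine similarity of two presence-time profiles (assumed measure)."""
    dot = sum(p[a] * q[a] for a in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0
```

With such a measure, a strong anchor that is present throughout both periods dominates the result, so that short-lived new anchors reduce the similarity only slightly, which matches the behaviour described above.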

Identifying new/obsolete anchors (1) and measuring the similarity of anchoring (2) does
not however completely describe the anchoring evolution process: Data about the
stability of anchoring (3) is also of importance. It is possible that anchoring in two
different time periods is very similar according to the above measures, and yet none of
the anchors stays in the article for a longer time. Therefore, it is required to assess the
average time that anchors are present in the article. A value that is considerably less
than the timeframe would indicate that the anchoring in this period is unstable, because
anchors are on average present only for a short amount of time.
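
The stability measure can be sketched as the average presence time of the anchors relative to the length of the timeframe. The normalisation to a value between 0 and 1 and the function name are assumptions made for this illustration.

```python
def anchoring_stability(presence_days: dict[str, float], period_days: float) -> float:
    """Average anchor presence time as a share of the period length.

    `presence_days` maps each anchor to the number of days it was present.
    Values near 1 indicate stable anchoring, values near 0 unstable anchoring.
    """
    if not presence_days or period_days <= 0:
        return 0.0
    mean_presence = sum(presence_days.values()) / len(presence_days)
    return mean_presence / period_days
```

For a 30-day period, anchors present for 30 and 20 days give a value of about 0.83, whereas anchors present for 2, 1 and 3 days give about 0.07 and would point to unstable anchoring.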

The last aspect in the representation evolution is the indication of the objectification
process (4). According to the SRT, objectification is a process through which the
unfamiliar becomes familiar. Consequently, indications for objectification can be found
by looking at other social representations using the corresponding social representation
as an anchor. This is because only something already familiarised to a certain degree
can be used as an anchor. Thus, social representations of more familiar phenomena are
used more often to anchor other phenomena than those that have not yet reached a
higher degree of objectification.
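
An indication of objectification can be sketched by counting how many other articles use the article under study as an anchor. The Python sketch below assumes a snapshot that maps article titles to the anchors of their definition sections; the data structure, the function name and the article titles in the example are hypothetical.

```python
def inbound_anchor_count(target: str, anchors_by_article: dict[str, set[str]]) -> int:
    """Number of other articles that use `target` as an anchor."""
    return sum(
        1
        for article, anchors in anchors_by_article.items()
        if article != target and target in anchors
    )


# Two hypothetical snapshots of the same articles at different points in time.
snapshot_early = {
    "IPad": {"Tablet computer", "Apple Inc."},
    "Tablet computer": {"Computer"},
}
snapshot_late = {
    "IPad": {"Tablet computer", "Apple Inc."},
    "Tablet computer": {"Computer", "IPad"},
    "Galaxy Tab": {"Tablet computer", "IPad"},
}
```

Comparing the counts for two snapshots, here 0 and 2 for "IPad", would hint at a growing degree of objectification, subject to the qualitative interpretation of the linking contexts.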

ii. Collaboration evolution aspects

Another group of aspects regarding the evolution of social representations on Wikipedia
is important for understanding the collaboration dynamics. In the social representations
theory, any change in the representation is accompanied by social interactions.
Consequently, the evolution of the collaboration process is an important factor in
understanding how social representations change over time. Three different aspects of
collaboration on Wikipedia are considered. Similar to the representation aspects, they
are based on the historical revision data.

The intensity of the collaboration process that is associated with a social representation
under study (5) is derived by comparing the number of article edits and editors in
different time periods. Given that each encyclopaedia article on Wikipedia is interpreted
as a social representation, this is the most natural and the only available means of
accessing changes in the collaboration process. Correspondingly, an increase in the
number of article edits in a period of time indicates higher collaboration intensity.
Similarly, the number of participating editors in different time periods is another
possible indication of change in the intensity of the collaboration process. Disagreement
among participants (6) can be measured by looking at the relative number of anchor
introductions and anchor disappearances, in relation to the number of anchors in a given
period of time. This measure is a justified means of assessing disagreement on
Wikipedia in that it indicates periods of time in which a user group repeatedly
reintroduces anchors that are (repeatedly) removed by another user group. The last
aspect in assessing the collaboration process, which is associated with a social
representation on Wikipedia under study, is measuring asymmetries (7). The latter are
indicated by centralisation or decentralisation in the collaboration. Centralisation
implies a higher average participation of contributors, meaning that each user would
make more edits on average compared to the previous time period. In a decentralised
collaboration, on the other hand, edits are split between a larger number of different
contributors so that every user contributes a smaller share of edits to the total sum of
edits. The scheme in Appendix E facilitates the interpretation of intensity and
asymmetries in the collaboration process by presenting possible collaboration trends
based on the change in the number of edits and editors in a time period.
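
The three collaboration aspects can be sketched in Python as follows. The exact normalisation of the disagreement measure is not specified above, so dividing the number of anchor introductions and disappearances by the number of distinct anchors in the period is an assumption, as are all function names.

```python
def collaboration_intensity(edit_authors: list[str]) -> tuple[int, int]:
    """Return (number of edits, number of distinct editors) for a period.

    `edit_authors` lists the author of every edit made in the period.
    """
    return len(edit_authors), len(set(edit_authors))


def disagreement(anchor_states: list[set[str]]) -> float:
    """Anchor introductions plus disappearances per distinct anchor.

    `anchor_states` holds the anchor set of each revision in chronological order.
    """
    changes = 0
    for before, after in zip(anchor_states, anchor_states[1:]):
        changes += len(before ^ after)  # anchors added or removed by this edit
    distinct = set().union(*anchor_states)
    return changes / len(distinct) if distinct else 0.0


def edits_per_editor(edit_authors: list[str]) -> float:
    """Average number of edits made by each editor in a period."""
    edits, editors = collaboration_intensity(edit_authors)
    return edits / editors if editors else 0.0


def asymmetry_trend(previous: list[str], current: list[str]) -> str:
    """Compare average participation between two periods."""
    before, after = edits_per_editor(previous), edits_per_editor(current)
    if after > before:
        return "centralisation"
    if after < before:
        return "decentralisation"
    return "unchanged"
```

An anchor that is repeatedly removed and reintroduced raises the disagreement value above 1, because it contributes several changes but counts only once among the distinct anchors.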

3.4 Case Studies

Case studies in this work aim to demonstrate and verify the applicability of the method
developed to study social representations on Wikipedia. Furthermore, they should
illustrate the use of quantitative data that is available on Wikipedia to support the
analysis of the social representation under study.

It must be noted that not every Wikipedia article is suitable for a case study. The article
should satisfy the following criteria. First, it should have a high number of edits and
distinct editors in order to represent the social aspect of representations. Second, its
historical data should cover a longer period of time to provide enough data for pattern
recognition. The phenomenon described by the Wikipedia article, however, should not
be older than the encyclopaedia itself. Otherwise, the initial phase in the process of
forming a social representation cannot be observed. Finally, the phenomenon should
have a certain level of ambiguity or novelty which is required to demonstrate the
complexity of the corresponding collaboration processes.

Following the concepts of SRT, all case studies should feature the following three parts.
The first part will introduce the social representation under study to provide a context
for the subsequent analysis. The second part will focus on the evolution of the
corresponding social representation in terms of anchoring. Integral parts of the
anchoring analysis are described in section 3.4.1. The objectification process of a social
representation under study will be analysed in the third part of the case study.
Correspondingly, section 3.4.2 provides details about how the analysis of the
objectification process is conducted.

3.4.1 Case Study Anchoring Analysis

Changing anchors in a Wikipedia article over time provide information that is required
to interpret the direction of the change in the social representation under study. In order
to identify patterns in this process, it is useful to group anchors in categories, which
reflect distinct aspects of the representation. The categorisation requires a qualitative
analysis of the anchors. Wikipedia’s historical data, which includes all relevant internal
links (cf. Section 3.3.2), will provide the foundation for this qualitative analysis.

By considering the overall time-frame used for the case study, the “stability” of the
anchoring is examined as the first step of the anchoring analysis. The analysis includes
interpretation of data about the stability and similarity of anchoring as described in
section 3.3.2. Interpretation of this data provides first indications for the intensity of the
change that a particular social representation is undergoing without going into details.

After the interpretation of indications provided by the quantitative data about internal
links, a detailed qualitative anchor analysis should be conducted on the level of single
anchors. For this purpose, monthly and yearly anchor data are used. Out of all anchors
in the given time period only the strongest anchors are considered. Anchors that fall
under the threshold, which is set individually for each case study, are not accounted for
in the analysis. In order to reveal the meaning of each anchor, it is necessary to observe
anchors in the context of the historical revisions they are a part of. The exploration of
different historical revisions is a foundation for coding anchors and categorising them
into different groups. The coding is, in turn, a prerequisite for the construction of the
narrative that describes the evolution of the anchoring for the social representation
analysed in a case study. Identified anchor categories help to trace significant changes in
the social representation, in contrast to changes given by anchor changes within
categories. In order to derive meaningful categories from the list of generated anchors,
the steps illustrated in Fig. 5 are required.
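
The selection of the strongest anchors by means of a threshold can be sketched as follows. The text above does not define anchor strength formally, so this Python sketch assumes that strength is the share of the period during which an anchor was present; the function name and the example values are hypothetical.

```python
def strongest_anchors(presence_days: dict[str, float],
                      period_days: float,
                      threshold: float) -> set[str]:
    """Return the anchors whose presence share reaches the threshold.

    `threshold` is a share between 0 and 1 and is set per case study.
    """
    if period_days <= 0:
        return set()
    return {
        anchor
        for anchor, days in presence_days.items()
        if days / period_days >= threshold
    }
```

With a threshold of 0.5 and a 30-day period, an anchor present for 25 days is retained, while one present for 2 days is left out of the qualitative analysis.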

Although anchors identified in the definition section of the article are suitable
candidates for representing first order codes as defined by DACIN ET AL. (2010, p.16),
redundant or unrelated anchors may exist among them. In the first step (1), the context
of every anchor is uncovered. During the analysis of the historical revisions, all
different text passages in which the anchor appears are examined and documented, and the
anchor’s relationship type to the phenomenon is formulated. For example, the relationship of the
anchor ‘iPhone’ to the social representation of the Apple iPad can be “a device from
which iPad inherits”. Depending on the context, it can also be “a device iPad is opposed
to”. In case there are different contexts among historical revisions for the same anchor,
all different relationship types are formulated.

Fig. 5 Process of Deriving Categories from a Generated Anchor List

Given the context for every anchor, in the form of relevant citations and relationship
types the anchors form with the phenomenon under study, it is possible to merge
redundant and to delete unrelated anchors in step (2). The reasons for an anchor to be
considered unrelated are manifold. For example, the definition section of an article can
contain a reference to a pdf document in which ‘pdf’ is linked to the article about the pdf
file format. This article is not an anchor for the phenomenon. Redundancy, on the other
hand, is always due to either the renaming of an article during the timeframe of the
analysis or to the existence of multiple articles that have different names, yet describe
the same phenomenon22. In accordance with MILLS ET AL. (2006), the resulting list of
relevant anchors corresponds to open codes and is subject to theoretical coding, which
is done by applying steps (3) and (4) in several iterations. The aim is to derive a
minimal set of mutually exclusive and collectively exhaustive categories. As long as the
current set of categories does not satisfy the condition, both categorisation steps are
applied to each anchor that is not yet assigned to a category.

22
This is a rare case resulting from the quality issue on Wikipedia where two articles which describe the
same phenomenon coexist for a certain amount of time before they are merged.

In step (3), an anchor is assigned to a category. The anchor will either be assigned to an
existing category, or to a new category in the case that a category corresponding to the
relationship type is yet to be defined.

Step (4) is required to account for adding a new category. The function of the step is
twofold: First, verification of whether or not there are anchors that can be assigned to
the new category, and second, reconsideration of the existing categories in case they are
concurrent with the newly added category in the sense that some of the anchors can be
assigned to either of them. In case of a conflict, categories are redefined accordingly
and all their anchors are reassigned to one of the now unambiguous categories.

It is important to note that the anchoring analysis includes more than the categorisation
of anchors. The identified categories are subsequently used to divide the overall analysis
time frame for the evolution of the corresponding social representation on Wikipedia
into distinct phases. Each phase should represent a significant change in the social
representation under study, which is reflected in the increasing or decreasing importance
of the identified anchor categories in the corresponding phase.

3.4.2 Case Study Objectification Analysis

The last part of every case study will shed light on the evolution of the social
representation under study regarding the objectification process. As identified in section
3.3.2, the level of objectification can be assessed by identification of the Wikipedia
articles that contain a reference to the article under study. Out of all references, only the
references to articles in the main namespaces are considered. Furthermore, indirect
references resulting from articles using templates containing the analysed article are
disregarded. The remaining articles are thus the encyclopaedia articles with a reference
to the article under study in the text or within categories.

Using the results from anchor analysis, objectification evolution will be compared with
the anchoring dynamics. In this context, the aim is to discover how changes in
anchoring influence the objectification process. This comparison can potentially provide
an additional perspective on the evolution of the social representation
under study.

3.5 Towards a Quantitative Analysis of Wikipedia Articles

Case studies of social representations on Wikipedia depend on data that Wikipedia
provides for the historical revisions of its articles. First, the quantitative data are
required to identify possible changes in social representations, allowing for the intended
look into the causes of those changes. Second, the qualitative analysis of anchors
requires lists of anchors for different time periods, which must be isolated from the
revisions of the article. Third, analyses such as the identification of articles pointing at
the social representation of interest are impossible to conduct manually, because the task
requires analysing all encyclopaedia articles on Wikipedia. Lastly, quantitative data
are required for identifying common patterns in the evolution of social representations
on Wikipedia.

However, the existence of thousands of revisions for an article of average popularity
makes a manual data analysis impossible. To answer the research question, a tool
capable of providing the necessary statistics for the case studies is required. The use of
computer-aided analysis tools is already advocated within the social representations
community (László 1997). In fact, as outlined in the literature review, a number of
researchers have already adopted quantitative analysis techniques for studying social
representations (Breakwell and Canter 1993; Pawlowski et al. 2007; Wagner et al.
1996). To this end, the next section will outline the blueprint for analysis software that
allows tracking of the SRT phenomenon with Wikipedia data. However, considering
that social representations derived solely from quantitative analysis are rather limiting
(Bauer and Gaskell 1999), this thesis advocates always embedding any quantitative
analysis in a corresponding qualitative analysis.

4 WikiGen: A Statistical Tool for Quantitative Analysis of Wikipedia Articles

WikiGen is an online tool created to support case studies in this thesis. The tool is a web
application which can be accessed via a standard web browser. By connecting to
Wikipedia databases over Wikipedia’s Application Programming Interface (API),
WikiGen generates multiple statistics and navigation maps based on the historical
revisions of the chosen article. The web application supports 196 Wikipedia language
versions from which the user can select articles.

In the next section, design decisions made during the initial phase of the tool
development are explained and the resulting tool architecture is presented. This is
followed by a detailed elaboration on statistical features implemented in the tool. The
corresponding section 4.2 provides both formulas for the measures and
exemplary graphical and textual outputs of the corresponding statistics.

4.1 Architecture

Three different approaches can be used to programmatically access Wikipedia’s
historical data. The first of these approaches is to download database dumps23 of single
Wikipedia language instances. A database dump of the English Wikipedia, for example,
contains all Wikipedia resources in the form of database tables that can be accessed via
SQL queries. The next approach is to access the data via the API. The Wikipedia API was
created by the WF specifically as a two-way communication channel between third-party program
environments and Wikipedia databases by means of the HTTP request-response
protocol. The last method to access Wikipedia’s historical data is to use the so-called
Wikimedia Toolserver24. Researchers and volunteer programmers can use this cluster of
servers to access databases of all Wikimedia projects, including the Wikipedia projects,
to perform various performance critical tasks on the replicated data.

Selection of one of the aforementioned ways to access Wikipedia’s data is determined
by their suitability for the intended task. The use of database dumps is associated with
high performance and is encouraged by the Wikimedia Foundation for the reason of
decoupling workload caused by third party applications from the live databases.
However, there are multiple drawbacks to be considered regarding the usage of database
dumps. First and foremost, the size of the data dumps25 is known to render computations
intractable in many scenarios (Martin 2011, p.10). Furthermore, dumps imply the

23
Cf. http://en.wikipedia.org/wiki/Wikipedia:Database_download
24
https://toolserver.org/
25
Full dumps amount to several terabytes when expanded. Cf. http://dumps.wikimedia.org/enwiki/latest/

impossibility of working with live data by definition. Updating dumps on a regular
basis complicates project maintenance and adds complexity to the infrastructure.
The Toolserver addresses the latter problems by taking over the maintenance of the
replicated Wikipedia databases. It allows third-party applications to access the data over
a specially created interface. In exchange, the Toolserver adds an additional dependency
to the project: a complex component outside the programmer’s control. Additionally,
the Toolserver has a long and tedious registration process for anyone who intends to use
the service. The delay caused by registration is amplified by the learning curve associated
with the complexity of the Toolserver’s interface. Given the limited time available
for the thesis, the use of the Toolserver services is therefore not feasible.

Consequently, the API approach was chosen for accessing Wikipedia’s historical data.
This approach allows for the most compact and transparent infrastructure. Moreover, it
is the only way to access live data and, apart from using the Toolserver, the only way to
access different Wikipedia language databases efficiently. The latter utilise the same
API, which allows for inherent multilingual support in the WikiGen tool. The only
restriction of the API solution in the context of this work is its limited performance, a
drawback which is only relevant for performance-critical tasks. Our analysis has
shown that for the quantitative analyses required by the case studies, API performance is
sufficient. In particular, shifting the calculation procedures from the server to the clients
provides the scalability required to overcome possible performance issues.

The overall infrastructure for the WikiGen web application is illustrated in Fig. 6.

Fig. 6 The Infrastructure of the WikiGen Solution



The web application is stored on a web server26 and can be accessed by any computing
device with browser functionality such as desktop computers, mobile phones or tablet
computers (1). Using the statistics functions provided by the tool results in HTTP
requests that are sent to the Wikipedia API. Each request contains specifications of the
data required for the corresponding statistical analysis (2). Note that every language
version of Wikipedia has its own API. Figure 6, however, depicts the API element
outside the Wikipedia elements. This is done in order to highlight the fact that all those
different API instances provide the same functionalities. The actual HTTP requests
from WikiGen are sent to different URLs that correspond to the Wikipedia language
platform that the user has chosen for the analysis. Requested data are then sent to the
requesting device in the JSON format (3) and are directly processed on the device.
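
The request step can be illustrated with a short sketch. It only builds the URL of a revision query against the MediaWiki action API of a given language version and does not send it. The function name and the chosen set of revision properties are assumptions made for illustration; the exact requests issued by WikiGen are not documented here, and WikiGen itself is written in JavaScript rather than Python.

```python
from urllib.parse import urlencode


def build_revisions_url(lang: str, title: str, limit: str = "max") -> str:
    """Build a MediaWiki API URL listing revisions of one article.

    `lang` selects the language version, since every language version
    exposes the same API under its own host name.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        # Revision attributes requested here are an illustrative choice.
        "rvprop": "ids|timestamp|user|flags",
        "rvlimit": limit,
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?{urlencode(params)}"


url = build_revisions_url("en", "Cloud computing")
```

Switching the first argument from "en" to, for example, "de" is all that is needed to address another language version, which mirrors the multilingual support described above.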

The WikiGen web application is a pure HTML, CSS and JavaScript solution. The visual
elements are controlled by HTML/CSS, while JavaScript controls the application
navigation flow, triggers the data requests and performs the subsequent
calculations. Figure 7 illustrates both front- and backend layers of the application.

Fig. 7 Internal Organisation of Modules in the WikiGen Application

The front layer consists of multiple HTML documents containing styled web
application elements as well as place holders for the statistical elements such as graphs
and data tables. As for the JavaScript backend, it utilises jQuery27 libraries to control
web application elements through the application scripts and to control statistics
elements through the calculation and rendering script modules.
26
Current address of the server is http://wikigen.info.net.ua/
27
Cf. http://jquery.com/

The application scripts consist of scripts for navigation between different HTML pages
of the application (1), scripts for animation of the HTML elements (2) and scripts used
for application configuration (3). The latter includes control over appearance of HTML
elements as well as data source related configurations. Statistical elements in WikiGen
require dedicated modules for their calculation and rendering. The calculation module
includes scripts for data requests to the Wikipedia API (4), the implementation of data
processing algorithms (5) as well as multiple utility functions for data converting, text
parsing and other supporting algorithms (6). Rendering algorithms (7) as well as open
source libraries they utilise to visualise diagrams (8)28 and data tables (9)29 are part of the
rendering module. Technical characteristics of WikiGen include full parallelisation of
its functionalities, scalability due to efficient client calculations, broad language support
and extensive in-tool help.

4.2 Statistical Features

Statistical features in the WikiGen web application are divided into editing statistics
(section 4.2.1), link statistics (section 4.2.2), reference statistics (section 4.2.3) and
statistics provided by already existing tools which were integrated into the WikiGen
application (section 4.2.4).

4.2.1 Editing Statistics

Editing statistics in WikiGen correspond to collaboration aspects of the evolution of social
representations on Wikipedia30. The tool provides aggregated data about the
number of edits and editors in different periods of time as well as a combined measure,
edits per editor.

To capture the collaboration processes in greater detail, different types of editors are
distinguished according to user attributes provided by the Wikipedia platform:
anonymous, registered and bot editors31. Overall editors include all three different types.
In a similar vein, overall edits are divided into distinct and major distinct edits. The
distinction is based on the data provided by Wikipedia regarding major or minor edits as
well as observations made during the revision analysis of different articles: A typical
pattern was discovered in which a number of consecutive revisions are created by the same
user within a short period of time. This pattern arises when an editor saves the article
several times during the editing procedure. In this case, all resulting revisions can be
considered as a single distinct edit.
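
A minimal sketch of how such runs of revisions could be collapsed into distinct edits is given below. The one-hour window is an assumption for illustration only, since the length of the "short period of time" is not specified, and the input format (user name and timestamp per revision) is hypothetical.

```python
from datetime import datetime, timedelta


def count_distinct_edits(revisions, window=timedelta(hours=1)):
    """Count runs of consecutive revisions by one user as a single edit.

    `revisions` is a chronologically sorted list of (user, timestamp)
    pairs. A revision continues the current run if it comes from the
    same user as the previous revision and follows it within `window`.
    """
    distinct = 0
    previous = None
    for user, stamp in revisions:
        same_run = (
            previous is not None
            and previous[0] == user
            and stamp - previous[1] <= window
        )
        if not same_run:
            distinct += 1
        previous = (user, stamp)
    return distinct


revisions = [
    ("Alice", datetime(2010, 3, 15, 18, 30)),
    ("Alice", datetime(2010, 3, 15, 18, 39)),  # same run as the line above
    ("Bob", datetime(2010, 3, 15, 19, 0)),
    ("Alice", datetime(2010, 3, 17, 9, 0)),
]
print(count_distinct_edits(revisions))  # 3
```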
28
Cf. http://www.flotcharts.org/
29
Cf. http://www.datatables.net/
30
Cf. section 3.3.2 and question Q1.2
31
Cf. section 3.1

Different types of edits and editors can be combined into multiple versions of the edits per
editor statistic. However, from all possible combinations, only the major distinct edits
per corresponding non-bot users are considered. It is the only meaningful combination
in the sense that minor or intermediate edits as well as edits made by bots are irrelevant
from the perspective of the social representations theory.
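
The chosen combination can be sketched as follows, assuming that distinct edits are already available as records with the hypothetical fields `user`, `minor` and `bot`. The sketch divides the number of major distinct edits made by non-bot users by the number of those users.

```python
def edits_per_editor(distinct_edits):
    """Major distinct edits by non-bot users divided by those users."""
    relevant = [e for e in distinct_edits if not e["minor"] and not e["bot"]]
    editors = {e["user"] for e in relevant}
    # Returning 0.0 for a period without relevant edits is an assumption.
    return len(relevant) / len(editors) if editors else 0.0


edits = [
    {"user": "Alice", "minor": False, "bot": False},
    {"user": "Alice", "minor": False, "bot": False},
    {"user": "Bob", "minor": False, "bot": False},
    {"user": "Bob", "minor": True, "bot": False},          # minor: ignored
    {"user": "CleanupBot", "minor": False, "bot": True},   # bot: ignored
]
print(edits_per_editor(edits))  # 1.5
```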

Figure 8 illustrates an exemplary output of the WikiGen tool for the editing statistics.
By default, the data are visualised in the form of monthly and yearly bar charts. The
chart appearance can, however, be changed in the WikiGen settings.

Fig. 8 WikiGen Bar Charts for Edits Statistics

Furthermore, data for the overall period of time are provided in text form (Fig. 9).

Fig. 9 WikiGen Text Output for Edits Statistics

The output for editor statistics is analogous to that of the edits statistics. It displays yearly and
monthly data for the number of editors that were active in the corresponding time
frames. The numbers of edits and editors are furthermore combined into a joint measure,
edits per editor. An exemplary chart for edits per editor data is shown in Fig. 10.

Fig. 10 WikiGen Edits per Editor Chart

Editing statistics can be interpreted in accordance with the research question Q1.2
presented in section 3.3.2. An increase in the intensity of the editing activity might
indicate external events such as, for example, the murder charge brought in February 2013
against Oscar Pistorius, a South African sprint runner with double below-knee amputations. A
clear rise of editing activity in the corresponding article is shown in Fig. 11.

Fig. 11 Exemplary Effect of an External Event on the Editing Activity



Using the interpretation scheme from Appendix E, it is possible to use the edits per editor
statistics to identify an increase or decrease in interest and asymmetries, as well as to find indications
of a higher unfamiliarity level of the underlying phenomenon.

Fig. 12 Exemplary Application of Interpretation Scheme (Appendix E) on Editing Data

Figure 12 indicates a rising interest together with a centralisation of the collaboration in
October (1), with a possible rise in the unfamiliarity of the underlying
phenomenon. Similar interpretations can be made for every other month. Note that the data in the
figure are standardised, a feature provided by WikiGen for better comparison of
tendencies among statistics with different scales.

One of the most useful WikiGen features for the qualitative analysis of social
representation on Wikipedia is the interactive revision map illustrated in Fig. 13. Every
horizontal line in the chart corresponds to a historical revision of the article. Besides
providing insight into the distribution of revisions over time, which helps to identify gaps and
possible periodicity in the editing activity, it allows for fast navigation between different
revisions. By choosing any revision, the content of the corresponding historical version
of the article is displayed.

Fig. 13 Example of Revision Map with Illustrated Navigation Functionality

4.2.2 Links Statistics

The major part of the WikiGen tool focuses on the anchoring analysis. Since
Wikipedia’s internal links are interpreted as anchors, all measures in this section are
based on internal links to Wikipedia articles, which can be found among revisions of the
article under study.
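
How internal links might be isolated from a single revision can be sketched as follows. The sketch assumes that the revision is available as wikitext, that the definition part is everything before the first section heading, and that any link target containing a colon (for example `File:` or `Category:`) can be skipped. The last rule is a simplification that would also drop article titles containing a colon; the function name is hypothetical.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]].
LINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def extract_anchors(wikitext: str) -> list[str]:
    """Return the internal link targets found before the first heading."""
    lead = re.split(r"^==", wikitext, maxsplit=1, flags=re.MULTILINE)[0]
    anchors = []
    for match in LINK.finditer(lead):
        target = match.group(1).strip()
        if ":" in target:  # simplification: skip namespaced links
            continue
        if target not in anchors:
            anchors.append(target)
    return anchors


sample = (
    "The '''iPad''' is a [[tablet computer]] by [[Apple Inc.|Apple]] "
    "with a new [[user interface]]. [[File:IPad.png|thumb]]\n"
    "== History ==\n"
    "It followed the [[iPhone]].\n"
)
print(extract_anchors(sample))
# ['tablet computer', 'Apple Inc.', 'user interface']
```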

4.2.2.1 Anchor Maps

The anchor map visualises the periods of time in which a chosen anchor was present in or
absent from the article and links to the historical revisions in which the anchor
disappeared or (re)appeared. In that way it is possible to navigate between the relevant
historical revisions of the article in which the analysed anchor is contextualised. Figure 14
depicts an extract from the anchor map for the user interface anchor in the iPad article. There are
two time periods in which the user interface anchor was present in the article.

Fig. 14 Anchoring Map for User Interface Anchor in iPad Article



Every point in the anchor map corresponds to a revision in which an anchor was
introduced or removed. Clicking on a point opens the corresponding revision
and shows the content of the historical article, similar to the revision map functionality
explained in the previous section.
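
The points of an anchor map can be derived from the revision history with a simple scan. The following sketch assumes a chronological list of revisions, each given as a date and the set of anchors found in it; the dates in the example are invented for illustration and do not reproduce the actual iPad history.

```python
from datetime import date


def presence_periods(revisions, anchor):
    """Return (introduced, removed) pairs for one anchor.

    `revisions` is a chronological list of (date, set_of_anchors).
    `removed` is None if the anchor is still present in the last revision.
    """
    periods = []
    start = None
    for stamp, anchors in revisions:
        present = anchor in anchors
        if present and start is None:
            start = stamp                    # anchor (re)appeared
        elif not present and start is not None:
            periods.append((start, stamp))   # anchor disappeared
            start = None
    if start is not None:
        periods.append((start, None))
    return periods


revisions = [
    (date(2010, 1, 28), {"tablet computer", "user interface"}),
    (date(2010, 3, 15), {"tablet computer", "user interface"}),
    (date(2010, 6, 1), {"tablet computer"}),
    (date(2010, 9, 10), {"tablet computer", "user interface"}),
]
periods = presence_periods(revisions, "user interface")
# Two periods: 2010-01-28 to 2010-06-01, and 2010-09-10 onwards.
```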

Figure 15 shows the article which is opened after clicking on a point corresponding to a
revision dated 15.03.2010 18:39.

Fig. 15 Extract from the Historical iPad Article dated 15.04.2013

The user interface anchor is clearly introduced as a key feature of the iPad, an
improvement when compared to related Apple products such as iPhone and iPod touch.
The context of each anchor can be analysed using the anchor map and navigating
through revisions in which this anchor was observed. This is an essential step for
deriving anchor categories that can then be used to describe changes in the social
representation under study (cf. the coding procedure in section 3.4.1).

4.2.2.2 Anchor Snapshots

Any comparison between anchoring in two different periods of time requires data about
all anchors present in those periods. Statistics regarding anchor evolution are based on
these data in the form of snapshots. A snapshot in this context is a list of all anchors in a
given period of time including all relevant attributes. Figure 16 shows a snapshot for the
iPad social representation in the year 2010. The table has five columns:

Anchor: name of the anchor which is an internal Wikipedia link within the definition
part of the article.

Days survived: cumulative number of days an anchor was present in the definition part
of the article. Note that this number is not the difference between the dates on which the
anchor was first and last seen. Instead, it adds up all time periods in which the anchor was present in the
article and thus corresponds to the graph in the Anchor map section.

Fig. 16 Anchor Snapshot for iPad Social Representation in 2010

(Re)Introductions: indicates how many times an anchor was introduced to the
definition part of the article. A number greater than one results from the anchor being
removed and reintroduced.

Revisions survived: number of major distinct edits an anchor survived. The definition
of a major distinct edit is the one given in section 4.2.1. Alongside Days survived, this
measure offers another perspective on how strong an anchor is.

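Anchor Strength: a linear combination of Days survived and Revisions survived (cf.
formula 1).

$$S_{a,t} = w_d \cdot \frac{d_{a,t}}{D_t} + w_r \cdot \frac{r_{a,t}}{R_t}$$ (1)

, where

$S_{a,t}$: strength of anchor $a$ in period $t$

$d_{a,t}$: Number of days anchor $a$ survived in period $t$

$D_t$: Number of days in period $t$

$r_{a,t}$: Number of revisions anchor $a$ survived in period $t$

$R_t$: Number of revisions in period $t$

$w_d, w_r$: non-negative weights of the linear combination with $w_d + w_r = 1$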

This rating between 0 and 1 indicates the strength of an anchor in the sense that an
anchor is strong (1) if it both survived all revisions and stayed in the article for the
whole period of time. All anchors are sorted in the snapshot table according to the rating
column to show the strongest anchors for the corresponding timeframe. Note that only
anchors which survived at least one day enter the table, in order to filter out
unimportant or junk data.
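
The two central snapshot attributes can be sketched in a few lines. The sketch sums the presence periods of an anchor to obtain Days survived and combines the share of days and the share of revisions survived into a strength rating. The equal weighting (`weight=0.5`) is an assumption made for this illustration, as are the function names and the example figures.

```python
from datetime import date


def days_survived(periods):
    """Sum the lengths of all (start, end) presence periods in days."""
    return sum((end - start).days for start, end in periods)


def anchor_strength(days, days_in_period, revisions, revisions_in_period,
                    weight=0.5):
    """Weighted combination of the share of days and revisions survived.

    `weight` is the weight of the day share; the revision share receives
    1 - weight. Equal weights are an assumption of this sketch.
    """
    day_share = days / days_in_period
    revision_share = revisions / revisions_in_period
    return weight * day_share + (1 - weight) * revision_share


periods = [
    (date(2010, 1, 1), date(2010, 3, 2)),    # 60 days
    (date(2010, 7, 1), date(2010, 7, 31)),   # 30 days
]
days = days_survived(periods)  # 90, not the 211 days between first and last seen
strength = anchor_strength(days, 365, revisions=30, revisions_in_period=40)
```

An anchor that stayed in the article for the whole period and survived all revisions receives the rating 1 under any choice of weight.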

4.2.2.3 Anchor Dynamics

This section corresponds directly to aspects 1-3 of the research question Q1.2 in
section 3.3.2. For the purpose of measuring dynamics in the anchoring process, the
WikiGen tool provides a number of different statistics, which are explained
below.

i. New and Obsolete Anchors

Figure 17 shows the bar chart that WikiGen generates to indicate the number of newly
introduced and removed anchors in a particular period of time. The data directly
correspond to the first aspect of the research question Q1.2 (cf. section 3.3.2).

Fig. 17 Example for New and Obsolete Anchors Chart in WikiGen
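
One way to obtain the two bar heights is to compare the anchor sets of two consecutive periods. This reading, in which "new" means present now but not in the previous period and "obsolete" means the opposite, is an assumption of the sketch; the source does not state how WikiGen counts them.

```python
def new_and_obsolete(previous, current):
    """Return (new anchors, obsolete anchors) between two periods."""
    previous, current = set(previous), set(current)
    return sorted(current - previous), sorted(previous - current)


new, obsolete = new_and_obsolete(
    {"iPhone", "tablet computer"},
    {"tablet computer", "user interface"},
)
# new == ['user interface'], obsolete == ['iPhone']
```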

ii. Anchor Dissimilarity

The anchor dissimilarity measures how dissimilar the anchors in time period $t$ are to the
anchors in the previous period $t-1$. The basis for the measure is the anchor attribute
days survived, which is the cumulative time an anchor stayed in the definition part of the
article in the given period of time $t$. Note that in order to remove the influence of anchors
that enter the article due to vandalism, only anchors that were present in the
corresponding period for at least one day are accounted for.

The dissimilarity values range from 0 (anchors are absolutely dissimilar) to 1 (anchors
are absolutely similar). The logic behind the calculation is to take the sum of the least
common days survived for all anchors present in periods $t-1$ and $t$ and set it in
relation to the total sum of the maximum days survived for every anchor in periods
$t-1$ and $t$. The calculation is done according to formula 1.

$$\mathrm{Dis}_t = \frac{\sum_{a \in A} \min(d_{a,t-1}, d_{a,t})}{\sum_{a \in A} \max(d_{a,t-1}, d_{a,t})}$$ (1)

, where

$A = A_{t-1} \cup A_t$: union of anchor name vectors $A_{t-1}$ and $A_t$ in periods $t-1$ and $t$,

with $A_t$ being a vector of anchor names in period $t$,

$d_{a,t}$: days survived of anchor $a$ in period $t$ (0 if $a$ is not in $A_t$).

For example, given anchor names $A_{t-1}$ = (a, b) with corresponding days survived in
period $t-1$ of (5, 10) and anchor names $A_t$ = (b, c) with corresponding days survived in
period $t$ of (10, 5), the dissimilarity measure is calculated according to formula 2.

$$\mathrm{Dis}_t = \frac{\min(5, 0) + \min(10, 10) + \min(0, 5)}{\max(5, 0) + \max(10, 10) + \max(0, 5)} = \frac{10}{20} = 0.5$$ (2)

The WikiGen tool provides monthly and yearly dissimilarity data in the form of charts.
Examples of dissimilarity measure outputs are shown in Fig. 18.
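
The calculation described in this section, the sum of the smaller days survived values divided by the sum of the larger ones over all anchors of both periods, can be sketched as follows. The dictionary input format, the function name and the return value for two empty periods are assumptions of the sketch.

```python
def anchor_dissimilarity(previous, current, min_days=1):
    """Ratio of summed minimum to summed maximum days survived.

    `previous` and `current` map anchor names to days survived in the
    two periods. Anchors below `min_days` are ignored, and an anchor
    missing from a period counts as 0 days there.
    """
    previous = {a: d for a, d in previous.items() if d >= min_days}
    current = {a: d for a, d in current.items() if d >= min_days}
    names = set(previous) | set(current)
    shared = sum(min(previous.get(a, 0), current.get(a, 0)) for a in names)
    total = sum(max(previous.get(a, 0), current.get(a, 0)) for a in names)
    # Returning 0.0 when neither period has anchors is an arbitrary choice.
    return shared / total if total else 0.0


print(anchor_dissimilarity({"a": 5, "b": 10}, {"b": 10, "c": 5}))  # 0.5
```

The printed value reproduces the worked example with anchors (a, b) and (b, c): only anchor b is shared, contributing 10 of the 20 maximum days.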

Fig. 18 Monthly and Yearly Anchor Dissimilarity Measures in WikiGen



iii. Average Anchor Durability

Average anchor durability measures the average time every anchor was present in the
definition part of the article in a particular time frame $t$. However, only those anchors
are considered which stayed in the article for at least one day. This is done in order to
remove the influence of weak anchors on the average value. The measure can shed light on
how stable the anchoring is during a particular time frame $t$.

The calculation is done according to formula 3.

$$\mathrm{Dur}_t = \frac{1}{n} \sum_{a \in A_t} d_{a,t}$$ (3)

, where

$A_t$: vector of anchor names in period $t$ (survived more than 1 day),

$n$: number of anchors in period $t$ (survived more than 1 day),

$d_{a,t}$: days survived of anchor $a$ in period $t$.

The WikiGen tool provides monthly and yearly average anchor durability data in the form
of charts. Examples of the charts are shown in Fig. 19.
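
As a brief illustration, the average can be computed over the days survived values of one period after dropping anchors below the one-day threshold. The input format and the function name are assumptions of the sketch.

```python
def average_anchor_durability(days_by_anchor, min_days=1):
    """Mean days survived over anchors that stayed at least `min_days`."""
    kept = [d for d in days_by_anchor.values() if d >= min_days]
    # Returning 0.0 for a period without qualifying anchors is an assumption.
    return sum(kept) / len(kept) if kept else 0.0


snapshot = {"tablet computer": 30, "user interface": 10, "spam link": 0.2}
print(average_anchor_durability(snapshot))  # 20.0
```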

Fig. 19 Monthly and Yearly Average Anchor Durability in WikiGen



iv. Anchor edit-war level

Edit-war level statistics measure the level of disagreement in the collaboration process
by relating the number of introductions and disappearances of anchors to the total
number of unique anchors in a period of time. The higher the value of the statistic, the
more disagreement is expected to be observed in the corresponding period of time, since
the same anchors would be introduced and removed several times.

The calculation is done according to formula 4.

$$\mathrm{EW}_t = \frac{I_t + O_t}{N_t}$$ (4)

, where

$I_t$: Number of anchor introductions in period $t$ (cf. anchor snapshots)

$O_t$: Number of anchor disappearances in period $t$ (cf. anchor snapshots)

$N_t$: Number of anchors in period $t$

WikiGen provides monthly and yearly data for the edit-war level in the form of charts,
as shown in Fig. 20.
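
A small sketch of the measure, under the assumption that introductions and disappearances are simply added and divided by the number of unique anchors in the period. The event format and the function name are hypothetical.

```python
def edit_war_level(events, unique_anchor_count):
    """(introductions + disappearances) / unique anchors in the period.

    `events` is a list of (anchor, action) pairs, where action is either
    "introduced" or "removed".
    """
    introductions = sum(1 for _, action in events if action == "introduced")
    removals = sum(1 for _, action in events if action == "removed")
    if unique_anchor_count == 0:
        return 0.0  # assumption: no anchors means no disagreement
    return (introductions + removals) / unique_anchor_count


events = [
    ("iPhone", "introduced"),
    ("iPhone", "removed"),
    ("iPhone", "introduced"),
    ("iPhone", "removed"),
    ("tablet computer", "introduced"),
]
print(edit_war_level(events, unique_anchor_count=2))  # 2.5
```

The repeated introduction and removal of a single anchor drives the value up, which is the behaviour the measure is meant to capture.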

Fig. 20 Monthly and Yearly Edit War Level Measures in WikiGen



4.2.3 Reference Statistics

The reference statistics part of the WikiGen tool reveals how many articles contain
references to the analysed article. For example, if a link to ‘Einstein’ was introduced to
the ‘Physics’ article in the year 2009, WikiGen would add one reference for the year
2009. If an article contains several links pointing at the article of interest, it is
counted as one reference. Note that an article containing a reference to the article
under study is called a ‘backlink’.
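
The counting rule can be sketched as follows, assuming the input is a list of link events of the hypothetical form (source article, linked article, year). Each source article is counted once, in the year in which its first link to the article under study appeared.

```python
from collections import Counter


def references_per_year(link_events, target):
    """Count referencing articles per year of their first link to `target`."""
    first_year = {}
    for source, linked, year in link_events:
        if linked != target:
            continue
        # Several links from one article count as a single reference.
        first_year[source] = min(year, first_year.get(source, year))
    return Counter(first_year.values())


link_events = [
    ("Physics", "Einstein", 2009),
    ("Physics", "Einstein", 2011),      # second link from the same article
    ("Relativity", "Einstein", 2009),
    ("Chemistry", "Curie", 2010),       # points at a different article
]
counts = references_per_year(link_events, "Einstein")
# counts[2009] == 2 and counts[2011] == 0
```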

In terms of social representations theory, this statistic mirrors the process of
objectification, during which a social representation is increasingly used as an anchor for
other social representations.

The result of the statistical analysis is the distribution of different reference types over
time. Reference types correspond to the location of the reference in the identified
article. Correspondingly, there are four different article types according to the reference
types:

Backlinks in definitions: articles where a reference is found in the definition section of
the article.

Backlinks in text: articles where a reference is found in the rest of the article following
the definition section.

Backlinks among categories: all articles assigned to a category which corresponds to
the encyclopaedia article of the same name in the article namespace.

Indirect backlinks: articles which point at the article indirectly by using Wikipedia
templates32.

Additionally, it is possible to display the combined distribution of references where all
article types are taken into consideration. The combined w/o indirect backlinks option
displays the combined number of articles without those with indirect links.

Figure 21 illustrates the graphical output of the reference statistics in WikiGen. The blue
line indicates the cumulative number of referencing articles over time. The higher the
cumulative number, the higher the objectification level of the corresponding social
representation.

32
Cf. template definition in section 3.1

Fig. 21 Exemplary Distribution of Referencing Articles in WikiGen

4.2.4 Integrated Tools

In addition to the statistics developed specifically for this
thesis, WikiGen integrates two existing tools for analysing Wikipedia articles.
The first tool provides Wikipedia article traffic statistics33. The tool visualises the
viewership of Wikipedia articles over time. Several articles can be compared with each
other in a combined graph as shown in Fig. 22.

Fig. 22 Exemplary Output of Article Traffic Statistics Tool

The second tool integrated in the WikiGen web application is a tool called
contributors34. It allows for a detailed analysis of the contributor structure displaying the

33
Cf. http://toolserver.org/~emw/wikistats/
34
Cf. http://toolserver.org/~daniel/WikiSense/Contributors.php

number of revisions created by every single contributor to the article. An exemplary
output of the contributors tool is shown in Fig. 23.

Fig. 23 Exemplary Output of Contributors Tool



5 Case Studies of Social Representations on Wikipedia

This section contains two case studies. As outlined in section 3, the case studies aim to
demonstrate and verify the applicability of the method developed to study social
representations on Wikipedia. Furthermore, they illustrate the use of the
WikiGen statistical tool, as they supplement the quantitative data provided by the tool with the
necessary qualitative analysis.

The case study of ‘cloud computing’ is introduced in section 5.1. Section 5.2 contains
the case study of ‘iPad’. Results of both case studies including the evaluation of the
method are discussed in section 6.

5.1 Cloud Computing: From Utility Computing to a Jargon Term

Cloud computing (CC) is arguably one of the most popular yet ambiguous phenomena
in the information technology field today. As the title of the case study suggests, its
representation on Wikipedia is subject to change. The account of the evolution
of the cloud computing social representation begins with the context for the case study.
This includes a brief description of the phenomenon in its present form as well as
several characteristics of the CC social representation on Wikipedia. The latter include
collaboration aspects and are therefore necessary for the analysis of the anchoring and
objectification processes in sections 5.1.2 and 5.1.3, respectively.

5.1.1 General Description

To begin the case study of cloud computing social representation on Wikipedia, the
current35 definition of the phenomenon on the Wikipedia platform is required.

Cloud computing is a colloquial expression used to describe a variety of
different computing concepts that involve a large number of computers that are
connected through a real-time communication network (typically the Internet).
Cloud computing is a jargon term without a commonly accepted non-ambiguous
scientific or technical definition. In science, cloud computing is a synonym for
distributed computing over a network and means the ability to run a program on
many connected computers at the same time. The popularity of the term can be
attributed to its use in marketing to sell hosted services in the sense of application
service provisioning that run client server software on a remote location.

English Wikipedia, formatting in original

35
https://en.wikipedia.org/wiki/Cloud_computing, accessed 14.06.2013 11:35

In this definition, cloud computing is represented as a phenomenon that has acquired a
high level of independence from its technical background, and which has become a
jargon term. Given the definition of jargon as “language used by people who work in a
particular area or who have a common interest [...] Much like slang [...]”36, the
representation of cloud computing is particularly interesting to study as it echoes one of
the central stances in MOSCOVICI’s theory. It demonstrates the evolution of a social
representation during which it acquires a ‘life of its own’ in the world of the common
sense (Moscovici 1988, p.231). In the following, this evolution process is examined in
greater detail.

The first version of the cloud computing article on Wikipedia is dated 3 March 2007.
This date is therefore where the case study analysis commences. The standardised
distribution of the subsequent 7264 revisions over time is sketched in Fig. 24, with the
value 100 indicating the peak revision number.

Fig. 24 Extrapolated Distribution of Cloud Computing Article Revisions Based on Monthly Data

Note that the evolution time frame for the social representation of cloud computing
corresponds to the ‘lifetime’ of the phenomenon outside of Wikipedia. The latter can be
approximated by the Google search index graph illustrated in Fig. 25. Consequently,
the evolution of the CC social representation on Wikipedia is concurrent with the
corresponding evolution of the cloud computing phenomenon outside of the platform.
This would not be the case if the phenomenon had existed longer than the online
encyclopaedia itself.

36 http://en.wikipedia.org/wiki/Jargon, accessed 14.06.2013

Fig. 25 Interest for Cloud Computing according to Google Trends37

A detailed view of the monthly and yearly revision distribution in Fig. 26 shows that the
major distinct edits38 measure is more stable over time than the overall number of
edits. For example, the overall number of edits in February 2010 increases by 62%,
compared to a 21% increase in major distinct edits. This effect can be observed across
all periods. Consequently, peaks in the number of overall edits are mainly due to an
increase either in the number of minor edits or in the number of subsequent edits by the
same user. Neither of these edit types is counted in the major distinct edits measure.
This is an important observation for interpreting changes in the collaboration process.
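
The difference between overall edits and major distinct edits can be illustrated with a minimal sketch. The formal definition of the measure is given in section 4.2.1. The sketch only assumes what is stated above, namely that minor edits and subsequent edits by the same user are not counted. The Revision type, the function name and the sample history are hypothetical and are not taken from WikiGen.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Revision:
    user: str    # contributor name or IP address
    minor: bool  # True if the revision is flagged as a minor edit


def count_major_distinct(revisions):
    """Count runs of consecutive edits by one user that contain a non-minor edit.

    Assumption for illustration only: a run of subsequent edits by the same
    user counts once, and runs consisting solely of minor edits are ignored.
    """
    count = 0
    for _user, run in groupby(revisions, key=lambda rev: rev.user):
        if any(not rev.minor for rev in run):
            count += 1
    return count


# Hypothetical revision history in chronological order.
history = [
    Revision("A", minor=False),
    Revision("A", minor=False),  # subsequent edit by the same user
    Revision("A", minor=True),
    Revision("B", minor=True),
    Revision("B", minor=False),
    Revision("C", minor=True),   # minor edit only
    Revision("A", minor=False),
]

overall_edits = len(history)
major_distinct_edits = count_major_distinct(history)
```

In this hypothetical history, seven overall edits reduce to three major distinct edits. Bursts of minor or repeated edits therefore inflate the overall count far more than the stricter measure.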

Fig. 26 Number of Edits for Cloud Computing Article on Wikipedia

To underline the claimed popularity of the cloud computing article on Wikipedia, the
number of views for the cloud computing article is compared to the number of views for
the most popular Wikipedia article in the year 2012 – “Facebook”39. Fig. 27 indicates
that the average number of views per day for the cloud computing article is 11140. This
is 14.9% of the views for the ‘Facebook’ article40. In this thesis, a power law
distribution with a long tail is assumed for the popularity of Wikipedia articles.
Accordingly, the majority of articles on Wikipedia are assumed to have less than 1% of
the views of the most popular article. Therefore, the cloud computing article can be
considered rather popular.

37 Source: http://www.google.com/trends/explore?hl=en#q=Cloud%20computing&cmpt=q
38 Cf. definition of major distinct edits in section 4.2.1
39 Source: http://toolserver.org/~johang/2012.html

Fig. 27 Daily Views for Cloud Computing and Facebook Articles

The last aspect required for reconstructing the context of the cloud computing social
representation on Wikipedia, before discussing the anchoring and objectification
processes, is the collaboration dynamics. Fig. 28 illustrates the application of the
interpretation scheme from Appendix E. The orange line indicates the number of edits
per month, the blue line indicates the number of contributors per month and the green
line indicates the combined monthly measure edits per editor. All data are standardised
in the interval from 0 to 1 to capture the tendencies rather than the absolute values.
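
A minimal sketch of such a standardisation is given here for illustration. It assumes min-max scaling, which is one common way to map a series onto the interval from 0 to 1; the exact procedure used for the figure is not restated in this section. The monthly numbers and function names are hypothetical.

```python
def min_max(values):
    """Scale a series linearly so that its minimum is 0 and its maximum is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no tendency to show.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical monthly figures for three months.
edits = [40, 120, 80]
editors = [20, 30, 40]

# Combined measure: edits per editor in each month.
edits_per_editor = [e / n for e, n in zip(edits, editors)]

standardised_edits = min_max(edits)
standardised_editors = min_max(editors)
standardised_ratio = min_max(edits_per_editor)
```

After scaling, the three series share one axis, so a month in which edits rise while editors fall is visible as diverging lines regardless of the absolute numbers.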

Fig. 28 Standardised Monthly Edits, Editors and Edits per Editors Statistics
40 Data is visualised by Wikistats: http://toolserver.org/~emw/wikistats/

The graph illustrates 69 months of a volatile collaboration process with a total of 3182
users having participated in it. According to the interpretation scheme, 11 of the 13
possible collaboration trends are observed for cloud computing. No single trend is
observed for more than two consecutive months. As a result, the collaboration
process consists of several alternating centralisation and decentralisation phases. Green
areas on the graph indicate centralisation phases that coincide with an increase in
the number of revisions. These phases potentially reflect a high degree of unfamiliarity
associated with the cloud computing phenomenon among the Wikipedia users. In the
following anchoring analysis, it will be verified whether or not the centralisation phases
are caused by the increasing editing efforts of users to resolve conflicting
representations. Consequently, the statistics from Fig. 28 help to relate changes in the
social representation of CC to the corresponding changes in the collaboration process.

5.1.2 Anchoring Analysis

The analysis of the cloud computing anchoring ranges from the first article revision
dated 3rd of March 2007 until 25th of May 2013. First, anchor statistics from section
4.2.2.3 are introduced for this time frame. Second, all relevant anchors according to the
coding process introduced in section 3.4.2 are categorised. Finally, different phases in
the evolution of the social representation anchoring are identified and described using
both introduced statistics and identified categories.

5.1.2.1 Employed Anchor Statistics

Anchor statistics for cloud computing generated by the WikiGen tool include new and
obsolete anchors in each period, the dissimilarity between anchoring states, the average
anchor durability in different time periods and the anchor edit-war level for each
period41. The statistics indicate a dynamic anchoring process. The following introduces
the statistics and explains the indications they provide. The actual
narrative interpretation of the data will follow in the subsequent section.

New and obsolete anchors

The search for changes in the social representation is difficult without the data for the
number of new and obsolete anchors for each time period in the analysis. New aspects
of the representation are necessarily accompanied by the introduction of anchors that
were not used in the previous period or by the removal of anchors from the last period.
Fig. 29 illustrates these data for the cloud computing social representation together
with examples of time periods that contain the most significant changes. For
example, in August 2008, 49 new anchors for cloud computing were introduced.
This was followed by the removal of 44 anchors two months later (cf. leftmost area
marked grey). Similarly, further marked areas show periods with extensive introduction
of new anchors or removal of obsolete anchors. In subsequent sections, such indications
are used as a starting point for a qualitative analysis with a twofold aim. The data help
to reveal the nature of the changes in the representation and facilitate the division of
the evolution process into distinct phases.

41 Cf. section 4.2.2.3 for the definition of the measures and Appendix F for statistic details

Fig. 29 Amount of New and Obsolete Cloud Computing Anchors per Month
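
The counting of new and obsolete anchors can be sketched with two set differences: anchors present in the current period but not in the previous one are new, and anchors present in the previous period but not in the current one are obsolete. The function name and the two anchor sets below are hypothetical and serve only as an illustration.

```python
def anchor_movements(previous, current):
    """Return the (new, obsolete) anchors between two consecutive periods."""
    new = current - previous        # not used in the previous period
    obsolete = previous - current   # removed since the previous period
    return new, obsolete


# Hypothetical anchor sets for two consecutive months.
earlier_month = {"utility computing", "grid computing", "internet"}
later_month = {"internet", "virtualization", "google", "software as a service"}

new_anchors, obsolete_anchors = anchor_movements(earlier_month, later_month)
```

Here three anchors are new and two are obsolete, while internet is carried over unchanged.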

Anchor dissimilarity

The effect of introducing new anchors and removing obsolete ones can be
observed in the monthly anchor dissimilarity data in Fig. 30. The introduction of the 49
aforementioned anchors results in a dissimilarity of the anchoring in August 2008,
peaking at 0.97. In this case, 0.97 indicates that 97% of the anchoring is different from
that of the previous month42. However, the dissimilarity data can also indicate that some
introductions and removals of anchors are only temporary. For example, the introduction
of 21 new and removal of 16 obsolete anchors in February 2010 corresponds to a low
dissimilarity measure of 0.33, indicating that those changes have not endured.
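
How a value between 0 and 1 can express the share of anchoring that differs between two periods is sketched below. The measure used in this thesis is defined in section 4.2.2.3 and is not reproduced here. As a stand-in, the sketch uses the Jaccard distance between two anchor sets, which is an assumption made purely for illustration and may differ from the WikiGen computation, for example by ignoring anchor strength. All names and sample sets are hypothetical.

```python
def jaccard_dissimilarity(previous, current):
    """Share of anchors in either period that are not shared by both periods."""
    union = previous | current
    if not union:
        # Two empty anchorings are treated as identical.
        return 0.0
    shared = previous & current
    return 1.0 - len(shared) / len(union)


# Hypothetical anchor sets for two consecutive months.
earlier_month = {"utility computing", "grid computing", "internet"}
later_month = {"internet", "virtualization", "google", "software as a service"}

dissimilarity = jaccard_dissimilarity(earlier_month, later_month)
```

With one shared anchor out of six distinct anchors, the stand-in measure is close to 1. Identical sets yield 0 and completely disjoint sets yield 1.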

42 Cf. definition of the dissimilarity measure in section 4.2.2.3

Fig. 30 Anchors Dissimilarity for Cloud Computing per Month

It must be reiterated at this point that the quantitative data only indicate possible
changes. A high dissimilarity measure might, for example, be misleading. Old anchors
can be replaced by new anchors with a similar meaning. Therefore, it is necessary to
code anchors and to explain the dissimilarity in terms of changes among different
categories rather than within them. The anchor coding is provided in the next section.

Average anchor durability

The average anchor durability measure provides additional support for the overall high
dynamics of the cloud computing anchoring process. The year 2011 in particular
demonstrates the anchor ‘instability’, with anchors being present in the article for an
average of only 52.96 days.

Fig. 31 Average Anchor Durability for Cloud Computing



For the majority of periods, decreases in average anchor durability follow increases in
anchor dissimilarity. However, exceptions from this observation indicate cases in which
the decrease in the measure is due to anchors being present in the article only for a short
period of time. These indications are used to correctly identify phases in the anchoring
evolution of cloud computing.

Anchor edit-war level

The last anchor statistic employed in the analysis is the anchor edit-war level. The
graph depicted in Fig. 32 is especially valuable when compared to the collaboration
dynamics in Fig. 28. A combination of an intense collaboration and a high edit-war level
is a clear indication of a high level of disagreement between contributors. Therefore,
this becomes an additional instrument for interpreting changes in the social
representation of cloud computing.

Fig. 32 Anchor Edit-War Level for Cloud Computing

5.1.2.2 Anchor Coding

The quantitative WikiGen analysis identified a total of 325 anchors for cloud computing.
Following the analysis requirements explained in section 3.4, anchors that are identified
by the WikiGen tool need to be analysed qualitatively and grouped into categories in
order to reflect different aspects of the cloud computing social representation. In
accordance with section 3.4.2, an anchor strength threshold of 0.15 was chosen. The
context analysis of the remaining 122 strongest anchors identified 15 anchors to be
excluded. Three anchors (pdf, nist, gartner) were removed as they were unrelated to the
social representation of cloud computing. A further 12 anchors were found to be
duplicates due to different spellings, for example “yahoo” and “yahoo!”.
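
The selection steps can be sketched as follows. The threshold of 0.15 and the three excluded anchors are taken from the text. Everything else is an assumption for illustration: the function names, the sample strengths, the treatment of the threshold as inclusive, and the merging of spelling variants by stripping punctuation. In the thesis, duplicates and unrelated anchors were identified through qualitative context analysis, not automatically.

```python
import string

THRESHOLD = 0.15                       # anchor strength threshold from the text
EXCLUDED = {"pdf", "nist", "gartner"}  # anchors reported as unrelated


def normalise(label):
    """Lower-case a label and drop surrounding whitespace and punctuation."""
    return label.lower().strip().strip(string.punctuation)


def strongest_unique(anchors, threshold=THRESHOLD, excluded=EXCLUDED):
    """Keep anchors at or above the threshold, merging spelling variants."""
    kept = {}
    for label, strength in anchors.items():
        key = normalise(label)
        if strength < threshold or key in excluded:
            continue
        # Spelling variants share one key; the stronger value is kept.
        kept[key] = max(strength, kept.get(key, 0.0))
    return kept


# Hypothetical anchors with hypothetical strengths.
sample = {
    "yahoo": 0.20,
    "yahoo!": 0.18,
    "pdf": 0.40,
    "internet": 0.72,
    "weak anchor": 0.05,
}
selected = strongest_unique(sample)

# Counts reported in the text: 122 strongest anchors, 3 unrelated, 12 duplicates.
remaining_anchors = 122 - 3 - 12
```

The final line reproduces the arithmetic of the text: removing 15 anchors from the 122 strongest leaves 107 anchors for coding.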

The coding of the remaining 107 anchors revealed 9 different groups of anchors
presented in Tab. 2. The complete list of the strongest anchors for the cloud computing
social representation including categorisation is provided in Appendix B.

Concepts from which cloud computing has departed (1) | Figurative aspects of cloud computing (2) | Technical aspects of cloud computing (3)
Sub concepts of cloud computing (4) | Origins of cloud computing (5) | Cloud computing solutions and providers (6)
Benefits of cloud computing (7) | Broader concepts related to cloud computing (8) | Means to interact with cloud computing (9)

Tab. 2 Anchor Categories for the Cloud Computing Social Representation

1. The category “Concepts from which cloud computing has departed” (7 anchors)
describes concepts that cloud computing is distinguished from. Those anchors also
emphasise how cloud computing influenced typical practices in IT. For example, the
anchor client-server43 describes the paradigm shift from using rich clients towards using
thin clients or web browsers in order to access IT resources. In the same context,
anchors such as product44 and software45 are used to emphasise that something that was
previously delivered as a software product became a service within cloud computing.

2. Six anchors in the category “Figurative aspects of cloud computing” use metaphors,
abstraction or analogies to describe cloud computing. Accordingly, electrical grid46 and
electricity47 anchors are analogies for resource delivery within cloud computing. The
analogies draw a similarity between cloud computing and how electricity is delivered
through the electricity network. Further anchors in this category such as computer
network diagram48 and cloud49 serve as metaphors for the internet as such. Among the
anchors in this category are also abstraction and metaphor themselves. Abstraction50
points at the hidden complexity of the internet, as implied by cloud computing, while
metaphor51 points at how the cloud symbolises the internet.

3. The “Technical aspects of cloud computing” category (17 anchors) represents anchors
that are either integral technical components of cloud computing, or technologies used
within cloud computing. This group includes anchors such as server, multitenancy,
virtualization, remote server, data, parallel computing and computer cluster.

43 Cf. cloud computing revision from 22:38, 27 April 2010
44 Cf. cloud computing revision from 08:12, 19 August 2011
45 Cf. cloud computing revision from 09:35, 9 October 2007
46 Cf. cloud computing revision from 13:35, 18 August 2011
47 Cf. cloud computing revision from 22:52, 1 August 2008
48 Cf. cloud computing revision from 12:49, 14 October 2008
49 Cf. cloud computing revision from 16:52, 30 July 2008
50 Cf. cloud computing revision from 16:37, 23 April 2009
51 Cf. cloud computing revision from 07:02, 11 April 2009

4. The twenty anchors that comprise the “Sub concepts of cloud computing” category
either represent a part of cloud computing, or can be considered as synonyms for the
entire phenomenon of cloud computing. Each anchor in this category can be used in one of
the following two wordings: a) Cloud computing is <anchor> or b) <anchor> is a part of
cloud computing. Consider the first anchor used for cloud computing on Wikipedia:
utility computing. In this case, cloud computing is substituted with utility computing. It
is possible to formulate: “Cloud computing is utility computing”. An example of an
anchor representing a part of cloud computing is infrastructure as a service52. In
addition to software as a service, data as a service and the other types of services cloud
computing can deliver, infrastructure is one of the integral components of cloud
computing. It is thus possible to construct the wording: “Infrastructure as a service is a
part of cloud computing”. The same logic applies to all anchors in the category.

5. The small “Origins of cloud computing” category comprises three anchors. Each of
these is linked to a reference to possibly the first scientist who used the term ‘cloud
computing’ in a scientific paper53. The name of this Brazilian professor is Ramnath
Chellappa. His affiliation with the Goizueta Business School of Emory University is
labelled in the two corresponding anchors.

6. The category “Cloud computing solutions and providers” (33 anchors) is formed by
examples of cloud computing instantiations, and examples of corporations that either
provide or use cloud computing. Among the corporations providing cloud computing
solutions are: Google, NetSuite and Salesforce54. Corporations such as General
Electric, L'Oréal and Procter & Gamble are known for adopting cloud computing
solutions55. Implementations of cloud computing, on the other hand, are represented by
anchors such as Google apps56, Amazon web services57, and azure service platform58.

7. The “Benefits of cloud computing” category (5 anchors) focuses on economic advantages
and expectations regarding the quality of service associated with CC. The economies of
scale benefit is inherent in cloud computing59. The anchor capital expenditure
emphasises the possibility to forego capital expenditures and pay only for the use of
resources60. The flexibility of cloud computing also allows for payments based on
subscriptions61. The remaining anchors, quality of service and service level agreement,
stress the quality of commercial cloud computing solutions, which is guaranteed by legal
agreements62.

52 Cf. cloud computing revision from 04:50, 25 February 2009
53 Cf. cloud computing revision from 22:20, 10 June 2009
54 Cf. cloud computing revision from 08:24, 31 July 2008
55 Cf. cloud computing revision from 22:13, 5 August 2008
56 Cf. cloud computing revision from 06:08, 8 August 2008
57 Cf. cloud computing revision from 22:13, 5 August 2008
58 Cf. cloud computing revision from 18:19, 3 December 2010
59 Cf. cloud computing revision from 10:36, 23 April 2012
60 Cf. cloud computing revision from 22:52, 1 August 2008

8. The category “Broader concepts related to cloud computing” (13 anchors) consists of
anchors pointing out the broad scope of cloud computing. While this category might
appear to be heterogeneous, the context of the anchors enables them to be combined
into one category. Two main examples of anchors in this category are internet and
computing. Both anchors generalise the cloud computing concept to “any computations
in the internet”. Other anchors such as shared services or converged infrastructure
reflect general tendencies in IT. They thus broaden the concept of cloud computing
as they present it as a consequence of global trends in the IT field63. Anchors such as
services64 or utility65 extend the concept of cloud computing even further by generalising
it to any possible service or resource delivery type over the network. A similar rationale
can be applied to all anchors in this category.

9. The “Means to interact with cloud computing” category includes five anchors
corresponding to the manifestations of cloud computing. These are the different visual
interfaces for cloud computing. It can be a business application66 or application
software67 in general that uses resources from the cloud. The means of accessing
resources from the cloud are nevertheless not restricted to software in the traditional
sense. Further examples are web applications in a web browser68 or mobile apps69 – both
of which provide an interface for triggering computation in the cloud and displaying the
results.

The high number of resulting categories has the potential to reduce the clarity of the
analysis. While some of the categories play a central role in understanding the anchor
evolution, others have a limited significance due to the small number of weak anchors
in the categories. To avoid fragmentation of the analysis, the categories were grouped
into five distinct perspectives.

The generalising perspective (Anchors that extend the scope of CC) comprises three
categories: Concepts from which CC has departed (1), Figurative aspects of CC (2) and
Broader concepts related to CC (8). Each of these categories broadens the scope of the
cloud computing phenomenon. Anchors in the category “Broader concepts related to
cloud computing” do so by definition. “Figurative aspects of cloud computing” anchors
have a similar effect by ‘blurring the borders’ of cloud computing. As for “Concepts
from which cloud computing has departed” anchors, they do not describe what CC is.
Rather, they point at what cloud computing is not, and thus potentially include new
emerging phenomena into its scope.

61 Cf. cloud computing revision from 17:23, 5 August 2008
62 Cf. cloud computing revision from 22:52, 1 August 2008
63 Cf. cloud computing revision from 08:20, 9 January 2012
64 Cf. cloud computing revision from 13:35, 18 August 2011
65 Cf. cloud computing revision from 22:52, 1 August 2008
66 Cf. cloud computing revision from 12:58, 9 June 2009
67 Cf. cloud computing revision from 09:54, 9 January 2012
68 Cf. cloud computing revision from 06:05, 8 August 2008
69 Cf. cloud computing revision from 12:47, 10 July 2011

The usage perspective (Use cases for cloud computing) combines the “Benefits of
cloud computing” (7) and the “Means to interact with cloud computing” (9) categories.
Both categories make the user their focus through introducing different interfaces that
the user can operate in order to benefit from utilising cloud computing services.

The remaining three perspectives comprise only one category each. The “Technical
aspects of cloud computing” category constitutes the technical perspective. Similarly,
the “Cloud computing solutions and providers” category makes up the example
perspective. The sub concept perspective incorporates anchors from the category of
“Sub concepts of cloud computing”. Anchors from the category “Origins of cloud
computing” are candidates for a sixth perspective. However, origin aspects do not
provide any evolutionary insights. The category has three anchors, which appear in a
single period, and is therefore of limited significance. For the clarity of the case study,
the origins perspective is disregarded in the analysis, which leaves five perspectives to
focus on in the next section.
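
The grouping of the nine categories into five perspectives can be summarised as a lookup table. The category numbers and perspective names follow Tab. 2 and the description above. The variable and function names are hypothetical and serve only as an illustration.

```python
# Perspective name -> category numbers as listed in Tab. 2.
PERSPECTIVES = {
    "generalising": [1, 2, 8],
    "usage": [7, 9],
    "technical": [3],
    "example": [6],
    "sub concept": [4],
}


def perspective_of(category):
    """Return the perspective of a category, or None if it is disregarded."""
    for name, categories in PERSPECTIVES.items():
        if category in categories:
            return name
    return None


# Category 5 (Origins of cloud computing) is left out of the analysis.
covered = sorted(c for cats in PERSPECTIVES.values() for c in cats)
```

Looking up category 8 returns the generalising perspective, while category 5 returns None because the origins perspective is disregarded.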

5.1.2.3 Narrative Interpretation

In the following, anchors in different time periods and corresponding collaboration
processes are compared in order to reconstruct the anchoring for the social
representation of cloud computing. As a result, the overall anchoring time frame is
divided into phases reflecting different stages in the evolution of the CC representation.
A summary of WikiGen statistics for each evolution phase is provided in Appendix F.

Infancy (Mar 2007 – Aug 2007)

The first attempt to familiarise the cloud computing phenomenon on the Wikipedia
platform is dated to the 3rd of March 2007. In the first revision of the CC article, the
phenomenon was anchored in terms of utility computing (1.00)70 by redirecting to the
article of the latter. The social representation of utility computing on Wikipedia has
existed since the 3rd of June 2005. By the time that the CC article was created, utility
computing had already been defined on Wikipedia as: “[a] business model whereby
computer resources are provided on-demand and on pay-per-use basis”71.

70 Numbers in the brackets indicate anchor strength in the corresponding period as defined in 4.2.2.2

Anchors of utility computing as of 3rd of March 2007 are shown in Fig. 33. The majority of
anchors such as computer, hardware and grid computing have a technical character. An
interesting feature of this context is the anchor “natural gas”. This employs the analogy
of a gas provider to illustrate the utility computing provider who delivers computing
resources on demand. Anchoring utility computing in terms of grid computing, which is
a “computing model that distributes processing across a parallel infrastructure”72,
emphasises the distributed nature of the technology.

Consequently, the historical anchors of utility computing depicted in Fig. 33 can be seen
as anchors for the, at that time, rather unfamiliar phenomenon of cloud computing.

Fig. 33 Anchors for Utility Computing as of 3rd of March 2007

Initial familiarisation attempts (Sep 2007 – Oct 2008)

Following the infancy stage, the next phase is characterised by the initial introduction of
different perspectives on cloud computing so as to distinguish it from the utility computing
concept and make it more tangible. The phase begins with an introduction of new
anchors in September 2007. In comparison to the single utility computing anchor in the
infancy stage, there are 74 new anchors to be considered in this time frame.

The usage perspective is introduced during the rest of the year 2007 and is represented
by the anchors for web application (0.27), web browser (0.21) and rich internet
application (0.31). The corresponding concepts allow cloud services to be accessed, and
are therefore representative of how the user understands cloud computing.

In the year 2008, the sub concept perspective gains importance. Concepts that can be
seen as ‘close’ to cloud computing start to appear as anchors. Each of these anchors
comes from the same category as utility computing, namely “Sub concepts of cloud
computing”. While grid computing (0.56) is one of the already introduced anchors of
utility computing, two other concepts introduce additional characteristics to the concept
of cloud computing. On the one hand, autonomic computing (0.21) puts an emphasis on
the self-management aspect of cloud computing networks. On the other hand,
distributed computing (0.27) underlines the distribution of the computational nodes that
are involved in a cloud network. One particular CC sub concept anchor – software as a
service (0.7) (SaaS) – appears to integrate the most important aspects of the cloud
computing social representation of the time. The representation of SaaS summarises the
paradigm shift that includes the service orientation and the software migration from
clients’ devices into the web.

71 http://en.wikipedia.org/w/index.php?title=Utility_computing&oldid=113214987
72 http://en.wikipedia.org/w/index.php?title=Grid_computing&oldid=112784741

The next perspective on cloud computing during this phase is an attempt to make the
cloud computing phenomenon familiar through specifying its technological scope.
Among the 20 strongest anchors in this period are anchors such as virtualization (0.47),
data (0.45), computer cluster (0.29), multi-core (0.29), and parallel computing (0.29).

Similarly, there is another attempt to familiarise cloud computing in 2008 through the
example perspective. Anchor instances in this perspective are corporations, which at
this point in time have already introduced some forms of cloud computing: Google
(0.23) and their Google apps (0.45), Salesforce (0.23), IBM (0.23), Microsoft (0.23),
General Electric (0.22) and many others (cf. Appendix B).

The data in Fig. 34 reflect the introduction of different perspectives on cloud computing.
The phase is distinguishable from both the infancy phase, in which no anchoring
activities are observed, and from the subsequent phase. The end of the phase is marked
by the disappearance of a high number of anchors in October 2008, and by the change
of the dissimilarity pattern from peaking (due to the introduction of different
perspectives) to more ‘stable’ in the next phase.

Fig. 34 Dissimilarity and Anchor Movements for Cloud Computing in Sep 07-Oct 08

The collaboration process during the phase is marked by the steadily increasing interest
that occurs before the first peak in August 2008 (cf. Fig. 35). The centralisation trend of
the collaboration between April 2008 and August 2008 together with the rising editing
activity mirrors the competition between the technological, sub concept and example
perspectives that is observed in this time frame. The disagreement between the
contributors is also confirmed by the rising edit-war level in this period.

Fig. 35 Collaboration Dynamics for Cloud Computing in Sep 07-Oct 08

In a nutshell, the time period between September 2007 and October 2008 consists of the
initial search for different ways by which the phenomenon of cloud computing can be
made more tangible. A social representation, which was previously solely dependent
upon utility computing, starts to acquire new characteristics through the new anchors in
different categories. It is a period characterised by attempts to find a distinct and more
independent representation of cloud computing, rather than that of utility
computing in the infancy phase. The anchors from the technological perspective have
their strongest position in this phase.

Establishment of the generalising core (Nov 2008 – Jan 2010)

One group of cloud computing anchors from the previous phase has not yet been
elaborated upon. The group comprises generalising aspects of cloud computing. The
phase between November 2008 and January 2010 is marked by the ‘generalising core’
that starts to emerge against the background of the disappearing perspectives of the last
period.

The establishment of the strong generalising perspective had already begun in 2008.
During this year, cloud computing was anchored in terms of internet (0.72 – the
strongest anchor in 2008), where cloud is used as a synonym for the internet73, in terms
of computing (0.53) as a particular style of computing in this case74 and in terms of web
2.0 (0.47). Anchors from this perspective remain strong over the entire anchoring period
that is analysed in the case study. In the year 2009, the representation of cloud
computing became even more generalised through the introduction of three strong
anchors belonging to the category of “Figurative aspects of cloud computing”. The first
anchor is metaphor (0.69), where cloud is understood as a metaphor for the complexity
of the internet75. The second anchor in this category is abstraction (0.66), where cloud is
an abstraction from the internet complexity. In a similar vein, the computer network
diagram (0.99) anchor is nothing more than another metaphor for the internet.

The generalising anchors appear to be responsible for the decline of other more concrete
perspectives that were construed in the previous phase. Given the strong position of the
anchors internet and computing, the core meaning of cloud computing can be
generalised to “calculating something in the internet”. For such a generalised
phenomenon it is difficult to define a more specific scope. A concrete example for the
effect of anchoring cloud computing in terms of generalising social representations can
be observed in the year 2008. Besides the specific software as a service anchor, a more
general one – everything as a service (0.52) – is introduced in order to account for other
types of computing in the internet. This thus facilitates a representation that is coherent
with the generalisation level already set by anchors such as internet and computation.

The anchoring statistics in this period reveal an interesting observation. While the
number of edits reaches its peak in this period, the dissimilarity of anchoring (Fig. 36) is
around 30%, which is relatively low when compared to the previous periods. Similar
effects, indicating a decrease in the intensity of the anchoring process, can be observed in
the anchor durability graph. It appears that after the intensive anchoring phase that
occurred in response to the novelty of the cloud computing phenomenon, this phase
introduces relative temporary anchor stability.

73 http://en.wikipedia.org/w/index.php?title=Cloud_computing&oldid=190252349
74 http://en.wikipedia.org/w/index.php?title=Cloud_computing&oldid=227985578
75 http://en.wikipedia.org/w/index.php?title=Cloud_computing&oldid=283087883

Fig. 36 Dissimilarity and Anchor Movements for Cloud Computing in Nov 08-Jan 10

The most convincing explanation for this temporary relative stability is a widening of the
scope of the phenomenon. Attempts to anchor cloud computing in terms of a close set of
anchors within the technological, sub concept, example or usage perspectives are no
longer observed in this timeframe. Instead, during the period, generalising anchors such
as computer network diagram (0.99), data (0.99), and software (0.99)76 gained strength.

The phase between November 2008 and January 2010 can thus be seen as the period that
established the generalising nature of the cloud computing phenomenon for the upcoming
years. This will have significant implications for the subsequent anchoring
process. The effect will be demonstrated in the following course of the case study.

76 Anchors data and software change their context to “points of departure” in this period.

First concretisation attempt (Feb 2010 – Feb 2011)

In this phase, the social representation of cloud computing on Wikipedia reached a
degree of generalisation that appears to have triggered a reverse process. This process
aims at once again making cloud computing more specific. The changes, however, do
not occur through the abandonment of the generalising perspective of cloud computing.
Rather, they occur through the search for new anchors capable of making the
representation more tangible – a process similar to the one in the ‘initial familiarisation
attempts’ phase. The difference, however, lies in the limiting power of generalising
anchors.

The increase in the number of anchors in this period to 80, and the decrease in the average
anchor durability, indicate that a more intense anchoring process is once again
occurring. However, since the generalising perspective remains strong, all anchor
changes occur against the background of the anchors that have been preserved from the
generalising perspective. This results in the dissimilarity measure showing the lowest
values in this period (Fig. 37).

Fig. 37 Dissimilarity Measure for Cloud Computing in Feb 10-Feb 11

Examination of individual anchors provides qualitative support for the observed data.
The example perspective is once again added to the cloud computing representation.
Every example is a step towards familiarising the generalising social representation of
cloud computing. In the year 2010, the list of corporations implementing cloud
computing and the corresponding solutions includes: Google (0.6), salesforce (0.52),
Microsoft (0.46), IBM (0.45), vmware (0.2), amazon web services (0.39) and many
others (cf. Appendix B). The usage perspective additionally regains strength. Two
anchors, service level agreement (0.84) and quality of service (0.84) from the benefits
category, emphasise the reliability of cloud computing services as a characteristic to be
expected when using business applications (0.99) over web browser (0.99).

Simultaneously, the generalising perspective is strengthened by emphasising the
paradigm shift (0.82) implied by the use of cloud computing. Anchors from the category
points of departure have the strongest position in this period of time. Consequently,
cloud computing is opposed to the traditional ways of handling data (0.99) and software
(0.99), as well as using mainframe (0.6) or client-server (0.67) architectures. In the
same context, the data center (0.34) anchor changes its meaning in 2010. In contrast to
being a technological aspect during the initial familiarisation attempts phase, it is now
seen as a traditional way of allocating resources, which involves humans. Cloud
computing is thus marked by an automation of this process.

In summary, this phase is characterised by the search for suitable ways through which
the social representation of cloud computing can be made more tangible, while
simultaneously preserving its generalising nature. Attempts to find more specific
representation for cloud computing include the use of a high number of examples and an
emphasis on the user benefits.

Collapse and reestablishment of the generalising core (Mar 2011 – Jan 2012)

The attempt in the previous phase to anchor the phenomenon in both generalising and
specific terms fails. Consequently, the ‘generalising core’ begins
to collapse. Anchors such as internet (0.35), computing (0.54), metaphor (0.22),
computer network diagram (0.22), abstraction (0.23) and paradigm shift (0.00) lose
their strength in this period. The anchoring changes completely several times between
March 2011 and August 2011, as Figure 38 illustrates.

Fig. 38 Dissimilarity Measure for Cloud Computing in Mar 11-Jan 12



However, it is not only the generalising anchors that are affected during the collapse. In this
period, every perspective present in the representation is threatened. Thus, the anchoring
in September 2011 starts ‘from scratch’ with a new attempt to redefine cloud
computing. As a result, the generalising core is once again restored by the end of the
phase. The anchors present at the end of the period are illustrated in Fig. 39. All of
them, except utility computing, belong to the generalising perspective.

Fig. 39 Cloud Computing Anchors as of Dec 2011

Interestingly, the reestablishment of the generalising core after the collapse is
accompanied by decreasing interest and a decentralisation of the collaboration process.
It appears that the lowest participation level is due to the absence of competing
perspectives. Figure 40 illustrates the effect.

Fig. 40 Interest Decrease in Collaboration in the Period Sep 11-Dec 11

Second concretisation attempt (Feb 2012 – May 2013)

The domination of a single generalising perspective continues until January 2012. The
new phase is characterised by repeated attempts to combine the generalising
perspective with more specific ones. This phase is therefore analogous to the first
concretisation attempt. It appears that cloud computing, when anchored in terms of only
generalising phenomena, remains unfamiliar. It therefore triggers the introduction of
more specific anchors.

Two perspectives are reintroduced to the social representation of cloud computing in
this period. The usage perspective has its strongest position when compared to previous
phases. The ‘resurrection’ of the usage perspective is most likely triggered by the
introduction of the mobile app (0.92) anchor in the year 2012. This can be interpreted as
a reaction to the increase in market share of smartphones and tablets during this
historical period of time77. Similarly to the interface for using cloud computing services,
typical anchors from the usage perspective such as web browser (0.97), business
application (0.63) and application software (0.97) also strengthen their position.
Additionally, the anchor economies of scale (0.67) emphasises the economic benefits,
which cloud computing inherently builds upon78.

In the period from June 2012 until March 2013, the sub concept perspective dominates
the social representation of cloud computing. The observed changes in this period
appear as a “desperate” attempt to list all different forms of cloud computing:

• Software as a service (0.74)
• Platform as a service (0.53)
• Infrastructure as a service (0.52)
• Data as a service (0.38)
• Storage as a service (0.38)
• Security as a service (0.38)
• Test environment as a service (0.38)
• API as a service (0.38)
• Desktop as a service (0.25)
• Backend as a service (0.16)
• IT as a service (0.15)

Even business process – a term that means much more than the sum of its
non-obligatory IT components – is included among the types of service that cloud computing
can provide79.

In a nutshell, the second concretisation attempt is marked by three aspects. The first is the
strengthening of the generalising perspective of the social representation. The second is the
attempt to anchor cloud computing in terms of all possible forms of service delivery,
and thus a strong sub concept perspective. The third is the introduction of a mobile
aspect, through the inclusion of the mobile app anchor in the usage perspective, together with an
emphasis on the additional means of accessing cloud computing and its resultant benefits.

77
http://readwrite.com/2010/03/16/mobile_app_marketplace_175_billion_by_2012
78
http://en.wikipedia.org/w/index.php?title=Cloud_computingt&oldid=488792493
79
This anchor has a strength below the threshold.

A final observation of the strongest anchors in the first months of 2013 is insightful. In
addition to the utility computing (1.00) anchor, which has influenced the social
representation of cloud computing from the beginning of its evolution, Fig. 41 illustrates
two dominating perspectives. The generalising perspective is the strongest perspective
in both the present period and previous periods. The usage perspective, on the other
hand, is a temporary attempt to make the representation of cloud computing more
tangible. The appearance of the anchor business model is symbolic. It demonstrates the
movement of cloud computing from its underlying technological background, towards
means of doing business in general. This is coherent with the generalising nature of the
basic characteristics such as internet and computing that are assigned to cloud
computing by anchoring it in terms of corresponding concepts.

Fig. 41 Strongest Anchors for Cloud Computing in 2013

The displacing influence of the generalising perspective is confirmed when looking at
the current definition of the cloud computing phenomenon on Wikipedia. This
definition was introduced at the beginning of the study and was captured two months
after the anchor analysis had been conducted. In this definition, the usage perspective
has once again lost its position, while the generalising nature of the social
representation is explicitly presented by labelling cloud computing as an “ambiguous”
and “jargon” expression “to describe a variety of computing concepts”.

Anchor Evolution Resume

The evolution of the cloud computing social representation on Wikipedia in terms of
anchoring is marked by high dynamics. Only a small number of anchors have ‘survived’
over the time period that has been analysed in the case study.

As outlined several times, the strongest anchors for cloud computing reflect the
generalising nature of the phenomenon. Nevertheless, throughout the anchoring process
it is apparent that the high level of generalisation in the social representation of cloud
computing, which can be simplified to “computing something in the internet”, does not
satisfy the social group of Wikipedia users. There have been numerous attempts
throughout the history of cloud computing social representation on Wikipedia to
establish more specific anchors. All of them appear to have failed in the long term due
to incompleteness and a resulting lack of coherence between the introduced anchors and
the more generalising ones.

Consequently, it is logical to suggest that the anchoring process for cloud computing
will remain intense, and that it will be marked by successive alternations between
more and less generalising anchors in terms of which the phenomenon of
cloud computing is anchored.

5.1.3 Objectification Analysis

The objectification process began six months after the beginning of the anchoring
process. The first relevant article identified is a revision of the cloud applications
article dated 05.09.2007, in which cloud computing was referenced as a platform on
which to run cloud applications80. The exact date of the last reference to the cloud
computing article is difficult to identify, although it is known that there were 367 new
references to cloud computing in 2013.

Objectification analysis revealed a total of 2405 articles (footnote 81) with a reference to
cloud computing that had been added during the time frame analysed in the case study.
Only 1363 of these were identified as encyclopaedia articles and entered the
analysis. The remaining 1042 pages from talk, file and other namespaces were
disregarded (cf. section 3.4.2). Figure 42 shows the distribution of referencing articles
over time.
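
The namespace filter described above can be sketched as follows. This is only a minimal illustration and not the WikiGen implementation: the function names, the prefix list and the sample titles are assumptions made for the example. The sketch relies solely on the MediaWiki convention that pages outside the main (encyclopaedia) namespace carry a namespace prefix in their title.

```python
# Hypothetical sketch: keep only main-namespace (encyclopaedia) pages.
# The prefix list is an assumption and is not exhaustive.
NON_ARTICLE_PREFIXES = (
    "Talk:", "User:", "User talk:", "Wikipedia:", "Wikipedia talk:",
    "File:", "File talk:", "Template:", "Template talk:",
    "Category:", "Category talk:", "Portal:", "Help:",
)


def is_encyclopaedia_article(title: str) -> bool:
    """Return True if the title carries no known non-article namespace prefix."""
    return not title.startswith(NON_ARTICLE_PREFIXES)


def split_references(titles):
    """Split referencing pages into (articles, disregarded)."""
    articles = [t for t in titles if is_encyclopaedia_article(t)]
    disregarded = [t for t in titles if not is_encyclopaedia_article(t)]
    return articles, disregarded


# Illustrative titles only, not data from the case study.
sample = ["Utility computing", "Talk:Cloud computing", "File:Cloud.svg"]
articles, disregarded = split_references(sample)
```

Applied to the counts reported above, such a split corresponds to 2405 - 1042 = 1363 encyclopaedia articles entering the analysis.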

The steadily increasing number of references reflects a typical process of objectification
for new phenomena. It starts with (or shortly after) the beginning of the anchoring
process, and evolves in parallel with the anchoring process (Rosa 2013, p.20).

80
http://en.wikipedia.org/w/index.php?title=Cloud_Applications&oldid=155846389
81
As of 05.05.2013

Fig. 42 Distribution of Articles that Contain References to the Cloud
Computing Article (without indirect references)

5.2 The iPad: From a Big Smartphone to a New Market

The subject of the second case study is the social representation of the iPad on Wikipedia.
The iPad is a tablet computer that was originally introduced in January 2010 by the
American IT Corporation ‘Apple Inc.’. In terms of the social representations theory, the
iPad has shifted from being an unfamiliar phenomenon to, arguably, the symbol of a
new market. Companies such as Samsung, HTC, Motorola, RIM, Sony, HP, Microsoft,
Archos and many others have entered this market with their tablet computers82. The aim
of the case study is to analyse the change in the representation of the iPad since its
introduction.

The structure of this section is similar to that of the first case study. The general
description of the iPad device is provided in the next section. Sections 5.2.2 and 5.2.3
contain the anchoring and the objectification analysis, respectively.

5.2.1 General Description

To provide a basic understanding of the phenomenon, the current83 iPad definition on
the Wikipedia platform is provided.

The iPad (/ˈaɪpæd/ EYE-pad) is a line of tablet computers designed and
marketed by Apple Inc., which runs Apple's iOS. The first iPad was released on
April 3, 2010; the most recent iPad models, the fourth-generation iPad and iPad
Mini, were released on November 2, 2012. The user interface is built around the
device's multi-touch screen, including a virtual keyboard. The iPad has built-in
Wi-Fi and, on some models, cellular connectivity.

82
http://smartmediatech.in/?page_id=61, accessed 11:15 2013.06.22
83
iPad definition as of 18 June http://en.wikipedia.org/w/index.php?title=IPad&oldid=560439228

An iPad can shoot video, take photos, play music, and perform Internet functions
such as web-browsing and emailing. Other functions—games, reference, GPS
navigation, social networking, etc.—can be enabled by downloading and
installing apps; as of 2013, the App Store offered more than 800,000 apps by
Apple and third parties.[14] […]
English Wikipedia, formatting in original

According to this definition, the iPad is more than a single product. Rather, it is a line
of tablet computers, which is defined by entertainment, social networking and
navigational functions. The fact that different iPad versions are used as anchors for the
phenomenon is particularly interesting when considering it in terms of the theoretical
perspective. It shows a high objectification level of the phenomenon and therefore
supports one of the central stances of the social representations theory. The latter states
that, through the process of forming social representations, objects become part of the
social reality (Moscovici 1988, p.214). In the case study of cloud computing, the focus
was more on dynamic changes in the anchoring process, rather than on the process,
through which a representation becomes a part of reality. Consequently, the iPad case
study is valuable in understanding the transition of objects from being unfamiliar to
becoming ‘iconic’ (Moscovici 2000/1984, p.49).

Similar to the cloud computing case study, the evolution of the iPad social
representation is concurrent with the evolution of the phenomenon outside of Wikipedia.
Figures 43 and 44 sketch the same time frame, during which the Wikipedia article
underwent editing and Google search queries were initiated. The value of 100 in both
graphs represents the highest number of edits and search queries, respectively.
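
The scaling used in both graphs, in which the highest observed value is set to 100, can be sketched as follows. This is a minimal illustration under the assumption that each series is simply divided by its own maximum; the function name and the sample counts are hypothetical and are not data from the case study.

```python
def scale_to_peak(counts):
    """Rescale a series so that its largest value becomes 100."""
    peak = max(counts)
    if peak == 0:
        # A series without any activity stays at zero.
        return [0.0 for _ in counts]
    return [100.0 * c / peak for c in counts]


# Illustrative monthly edit counts only.
monthly_edits = [50, 200, 100]
scaled = scale_to_peak(monthly_edits)
```

Because each series is scaled by its own peak, the two graphs can be compared in shape but not in absolute volume.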

Fig. 43 Extrapolated Distribution of iPad Article Revisions



Fig. 44 Interest for the iPad Search Term According to Google Trends 84

The fact that the iPad is a physical device allows such events as its release and
announcement dates to be related to the distribution of article edits. Figure 45 illustrates
the monthly number of edits for the iPad article together with six important events that
occurred within the period.

Fig. 45 Number of Edits for iPad Article on Wikipedia Including Events

The editing pattern differs from the one observed in the cloud computing
article. The latter consisted of a long increasing trend towards the peak, followed by a
decreasing trend. The peak in editing activity for the iPad article, by
contrast, is observed at the beginning. This appears to reflect the increase in interest
in the phenomenon following Apple’s official announcement of the release date
for the iPad in January 2010 (1). The subsequent increases in editing activity
correspond to further release and announcement dates for related Apple products (2-6).

84
Source: http://www.google.com.au/trends/explore?q=iPad#q=iPad&cmpt=q,

The effects from these events are also observed in the number of views for the iPad
article (Fig. 46). For example, the initial interest for the iPad phenomenon leads to a
higher number of views for the article, when compared to the ‘Facebook’ article.

Fig. 46 Daily Views for iPad and Facebook Articles

The average number of views for the iPad article is about 11.4% of the respective value
of the most popular article from the year 2012. According to the assumption made in
5.1.1, the popularity of the iPad article on Wikipedia is considered to be high.

Similarly to the cloud computing case study, Fig. 47 illustrates the collaboration process
for the iPad article. This includes data about the number of edits per month (orange), the
number of contributors per month (blue) and the combined monthly measure edits per
editor (green). The data is the foundation for the interpretation of the anchoring in terms
of the changes made in the collaboration process.
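
The three collaboration measures can be sketched as follows. The sketch assumes that "standardised" refers to a z-score transformation (subtracting the mean and dividing by the standard deviation); the thesis defines the actual procedure elsewhere, so this is an assumption, as are the function names and the sample figures.

```python
from statistics import mean, pstdev


def edits_per_editor(edits, editors):
    """Monthly ratio of edits to distinct editors (0.0 for months without editors)."""
    return [e / n if n else 0.0 for e, n in zip(edits, editors)]


def standardise(values):
    """Z-score standardisation (assumed here; the thesis may use another scaling)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Illustrative monthly figures only, not data from the case study.
edits = [120, 40, 40]
editors = [30, 20, 20]
ratio = edits_per_editor(edits, editors)
z_ratio = standardise(ratio)
```

In this toy series, the first month combines the most edits with a less than proportional number of editors, so its standardised edits-per-editor value lies above the other two months. This is the pattern the text refers to as centralisation.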

Fig. 47 Standardised Monthly Edits, Editors and Edits per Editor Statistics
for the iPad Article

The collaboration process for the iPad article appears to be less dynamic than the
collaboration process for the cloud computing article. The centralisation phases (marked
green in the graph), which coincide with an increasing interest, appear to correspond with the
aforementioned release and announcement events for the Apple products. With the
exception of these peaks, the collaboration intensity decreases and remains on a
relatively similar level in the time frames between the events. In the following
anchoring analysis, the correspondence between centralisation phases and changes in
the anchoring will be verified.

5.2.2 Anchoring Analysis

The anchoring time frame ranges from the first revision of the iPad article dated 26th of
December 2009 until 20th of June 2013, the date on which the anchor analysis was
conducted. The structure of the section is similar to the cloud computing study. First,
statistics for the iPad anchors are introduced. Second, the anchors identified by the tool
are coded in section 5.2.2.2. Finally, the statistics and the anchor categories are
employed in section 5.2.2.3 to reconstruct the evolution of the iPad’s social
representation on Wikipedia.

5.2.2.1 Employed Anchor Statistics

Similarly to the cloud computing study, the employed anchor statistics include new and
obsolete anchors in each period, the dissimilarity between anchoring states, the average
anchor durability in different time periods and the anchor edit war level for each
period. Since the anchor evolution interpretation in section 5.2.2.3 is focused on the first
two measures, they are introduced below. The remaining anchor statistics
can be found in Appendix G.

New and obsolete anchors

The analysis of new and obsolete anchors for the iPad social representation on
Wikipedia reveals several time periods containing significant anchor movements (Fig.
48). The first period is represented by the months immediately after the article was
created. A high anchor movement corresponds to the high editing activity during the
period. Furthermore, the months October and December in the year 2012 contain a
higher number of anchor introductions. The qualitative nature of these changes will be
explored in section 5.2.2.3. The anchor movements are also used to identify different
phases in the evolution of the social representation of the iPad.

Fig. 48 Amount of New and Obsolete Anchors for iPad per Month

Anchor dissimilarity

The anchor dissimilarity across subsequent time periods in Fig. 49 indicates that the
anchoring of the iPad social representation is more stable than that observed in the
cloud computing case study. Apart from the anchoring occurring in the first four
months, and the peak in November 2012, the dissimilarity measure predominantly
remains below 20 percent. This means that the differences in the anchoring in these
periods are rather small. Even without a qualitative analysis on the level of single
anchors, it is apparent that the social representation of the iPad cannot have experienced
significant change within those periods. For example, the transition from July 2011 to
August 2011 corresponds to an anchor dissimilarity measure of 0.01. Consequently, only
one percent of the anchoring was changed, while 99% of the anchors remained for the
same amount of time in the article. However, the periods with a high anchor
dissimilarity require a qualitative analysis in order to reveal the exact nature of the
corresponding change.
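
To make the interpretation of the measure concrete, the following minimal sketch computes a dissimilarity between two monthly anchor sets. It assumes a Jaccard-style distance (one minus the share of anchors common to both months). The measure actually implemented in WikiGen is defined in the methodology chapter and may differ, for example by weighting anchors by the time they remain in the article. The function name and the sample anchor sets are hypothetical.

```python
def anchor_dissimilarity(previous, current):
    """Jaccard-style distance between two anchor sets (assumed stand-in measure)."""
    previous, current = set(previous), set(current)
    union = previous | current
    if not union:
        # Two empty anchor sets are treated as identical.
        return 0.0
    shared = previous & current
    return 1.0 - len(shared) / len(union)


# Illustrative anchor sets only, not data from the case study.
july = {"tablet computer", "apple inc", "multi-touch", "wi-fi"}
august = {"tablet computer", "apple inc", "multi-touch", "3g"}
d = anchor_dissimilarity(july, august)
```

Under this assumption, a value of 0 means that both months share all anchors and a value of 1 means that they share none. In the example, three of the five distinct anchors are shared.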

Fig. 49 Anchors Dissimilarity for iPad per Month



In comparison to the cloud computing case study, the anchoring of the iPad social
representation appears to be less dynamic. The actual reasoning behind this will be
conveyed in the subsequent qualitative analysis. However, it is reasonable to assume
that, because the younger iPad phenomenon is a physical object, its comparison
to other objects in terms of both specifications and functionalities is easier. This
therefore means that the iPad appears more familiar to the social groups from the
beginning. This is especially true when considering that at least a part of the iPad
characteristics are attributable to its most concrete technical specifications – something
that cloud computing, as a jargon term and a business model, does not possess. The
qualitative anchor analysis will help to verify this assumption.

5.2.2.2 Anchor Coding

A total of 143 anchors for the iPad social representation on Wikipedia were identified
by the WikiGen tool. Fifty anchors were disregarded due to their anchor strength falling
below the chosen threshold of 0.10. The context of the remaining 93 anchors was
analysed according to the defined procedure in section 3.4.2. As a result, 10 anchors
were removed and 20 anchors were merged. The majority of the removed anchors are
links to organisations such as “Pc World” magazine or “New York Times”, which
published documents sourced in the article. They therefore serve as anchors for the
corresponding documents, rather than for the iPad device. The merged anchors
comprise anchors that only differ by their spelling, such as “multitouch” and
“multi-touch”. Appendix D provides the full list of deleted and merged anchors.
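
The threshold step and the merging of spelling variants can be sketched as follows. The threshold of 0.10 is taken from the text; everything else is an assumption made for illustration: the function names, the normalisation rule (ignoring case, hyphens and spaces), the choice to keep the higher strength when merging, and the sample strengths. The manual, context-based removal of anchors is not modelled.

```python
THRESHOLD = 0.10  # anchor strength threshold stated in the text


def filter_by_strength(anchors, threshold=THRESHOLD):
    """Keep anchors whose strength does not fall below the threshold."""
    return {name: s for name, s in anchors.items() if s >= threshold}


def merge_spelling_variants(anchors):
    """Merge anchors that differ only by case, hyphens or spaces.

    Assumption: the variant with the higher strength is kept.
    """
    merged = {}
    for name, strength in anchors.items():
        key = name.lower().replace("-", "").replace(" ", "")
        if key not in merged or strength > merged[key][1]:
            merged[key] = (name, strength)
    return {name: strength for name, strength in merged.values()}


# Illustrative strengths only, not values from the case study.
raw = {"multitouch": 0.5, "multi-touch": 0.8, "pc world": 0.05}
coded = merge_spelling_variants(filter_by_strength(raw))
```

In terms of the counts reported above, the threshold step reduces 143 anchors to 143 - 50 = 93, and the subsequent removal and merging leave 93 - 10 - 20 = 63 anchors for coding.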

The coding of the remaining 63 anchors reveals five categories of anchors, as presented in
Tab. 3. The complete list of the strongest anchors for the iPad social representation, as
well as their categorisation, is provided in Appendix B. It must be noted that the
identified categories differ from those in the cloud computing case study, although three
of the categories are similar. The categories are the foundation for the qualitative
analysis in the subsequent section.

Products and companies that compete with iPad and Apple respectively (1) | Origins of the iPad (2) | Technical aspects of the iPad (3)
Products and technologies that are similar to the iPad (4) | Use cases for the iPad (5) |

Tab. 3 Anchor Categories for the iPad Social Representation

1. The category “Products and companies that compete with iPad and Apple
respectively” (Competitors) contains five anchors, which point to Apple’s competitors
and their iPad-like devices. The first two anchors are amazon and its eBook reader
‘kindle’85. The third anchor, ‘barnes & noble nook’86, is another e-reader device
developed by the American book retailer Barnes & Noble. Furthermore, the tablet computer
and stylus (computing) anchors87 point to a tablet computer definition by Microsoft,
which the majority of potentially competing tablet computers of that time satisfied.

2. Five anchors in the category “Origins of the iPad” (Origins) refer to companies,
people and events associated with the development and marketing of the device. Three
anchors in this category, entitled Steve Jobs88, apple inc89 and macintosh90, establish a
link to Apple Inc. Additionally, the iPad is anchored in terms of its
manufacturer foxconn and in terms of the conference center “yerba buena center for the
arts”91, which hosted the iPad introduction event.

3. Similarly to the cloud computing case study, the category “Technical aspects of the
iPad” (Technical aspects), which comprises 18 anchors, represents the device’s
specifications. One group of the anchors, including wi-fi, 3g and cellular network92,
focuses on the iPad’s ability to access computer networks. Anchors such as dock
connector93, pixel, and gigahertz94, on the other hand, shed light on the remaining
technical aspects of the iPad.

4. The majority of the fifteen anchors that comprise the category “Products and
technologies that are similar to the iPad” (Similarities) represent similar electronic
devices. These include laptops95, tablet computers96 and smartphones97. Also included in
this category are similar Apple devices such as iPod98, iPhone99, and different versions
of the iPad itself.

5. The largest category, entitled “Use cases for the iPad” (Usage), contains 23 anchors.
These anchors represent aspects regarding the usage of the iPad device. Examples of
anchors in this category are virtual keyboard100, video camera, video game, social

85
Cf. iPad revision from 20:05, 13 February 2010
86
Cf. iPad revision from 01:42, 23 March 2010
87
Cf. iPad revision from 18:53, 17 April 2010
88
Cf. iPad revision from 20:59, 30 October 2011
89
Cf. iPad revision from 18:53, 17 April 2010
90
Cf. iPad revision from 23:52, 24 January 2012
91
Cf. iPad revision from 02:04, 27 December 2009
92
Cf. iPad revision from 18:46, 17 April 2010
93
Cf. iPad revision from 20:05, 13 February 2010
94
Cf. iPad revision from 20:05, 13 February 2010
95
01:28, 16 April 2010
96
00:07, 28 January 2010
97
13:21, 8 March 2010
98
01:35, 3 February 2010
99
22:55, 1 March 2010
100
15:49, 28 January 2010

network service and many others101. The context of the anchors is manifold and,
additionally, is subject to change during the analysis time frame. The latter is part of
the following narrative analysis that describes changes in the anchoring of the iPad
social representation on Wikipedia.

5.2.2.3 Narrative Interpretation

The overall anchoring time frame consists of distinct phases; these can be separated
from each other according to the different patterns observed in those phases. However,
the social representation of the iPad differs from that of cloud computing in that it
contains anchors that ‘survived’ almost the entire anchoring period. Therefore, these
anchors are introduced before the reconstruction of the changes that the iPad
representation experienced in the different phases of its anchoring.

Three iPad anchors that have remained unquestioned by the Wikipedia users are tablet
computer, Apple Inc and multi-touch. The reasons for the strong position of these
anchors are apparent. While the corporation name is a historical and legal fact, tablet
computer and multi-touch anchors possess the ‘iconic quality’ for the representation
(Moscovici 2000/1984, p.49). Both refer to the physical form of the device, and the
physical means of interacting with it. The image of a human holding or touching a flat
electronic device is, likely, the most immediate and vivid image associated with the
iPad device. This is not least due to Apple’s marketing campaigns, which often
incorporate this kind of imagery. Figure 50 illustrates the title image of the iPad TV
Ads section on the Apple website102.

Fig. 50 Title Picture of the iPad Ads Section on Apple’s Website

101
22:16, 23 October 2012
102
http://www.apple.com/au/ipad/videos/#tv-ads-together, accessed on 23rd of June 2013

In contrast to the above anchors, the remaining anchors are subject to change and are
therefore analysed in the context of their corresponding time frame in the anchoring
process. A summary of WikiGen statistics for each evolution phase of the iPad social
representation on Wikipedia is provided in Appendix G.

Infancy (26th December 2009 – 27th January 2010)

The infancy phase for the iPad social representation is remarkable as it is based purely
on rumours and speculations about a new Apple device. The first article revision103 is
dated 26th of December, while Apple’s earliest announcement about the actual device
was made on 27th of January. In this early stage, the iPad article had a different name –
iSlate. This name was based on the rumours within the Wikipedia community and was
changed to iPad after the official announcement. It is important to note that neither iPad
devices nor official statements about the device had been made available to the
community at this time. Yet the Wikipedia users were able to form a social
representation of a non-existent device. This emphasises the nature and the power of
social representations to form a social reality, as opposed to objectively reflecting the
world ‘out there’ (Moscovici 1990, p.164).

The strongest anchors within the infancy phase are introduced in what follows. Based
on rumours, the Wikipedia community assumed that apple inc. was producing a new
tablet pc with the help of the manufacturer foxconn. The ‘iSlate’ device was supposed to be
presented in San Francisco at the yerba buena center for the arts. Its function was
supposed to be similar to an already existing ebook reader entitled barnes & noble nook,
with the prime difference being that the content would be purchased via Apple’s app
store.

The end of the infancy phase is clearly marked by the official Apple announcement,
which provided the community with technical specifications and typical use case
scenarios for the upcoming device.

Initial familiarisation attempts (February 2010 – August 2010)

Similarly to the cloud computing study, the phase following the infancy of the
representation is characterised by the introduction of different anchor groups. The aim
of this phase was to overcome the unfamiliarity felt towards the new iPad phenomenon
by approaching it from different angles. Within the next seven months, anchors from all
five different groups appeared for a limited amount of time. This is reflected in the

103
http://en.wikipedia.org/w/index.php?title=IPad&oldid=334876628

high number of introduced and removed anchors, as well as in the high dissimilarity
measure. Figure 51 highlights the anchor dynamics in this phase.

Fig. 51 Anchor Movements for the iPad SR in Feb 10-Aug 10

The initially high unfamiliarity level of the iPad phenomenon is also observed from the
corresponding collaboration dynamics. In the cloud computing case study, it was
emphasised that a high level of unfamiliarity appears together with an increase in the
intensity of the collaboration. Furthermore in such cases, the rise in the editing activity
is not proportional in the sense that a high increase in the edits is followed by a lesser
increase in the number of editors. In other words, the unfamiliarity of the phenomenon
on Wikipedia is accompanied by the centralisation of the collaboration processes with a
simultaneous rise of interest. Similarly, the initial unfamiliarity associated with the new
iPad device is accompanied by the aforementioned changes in the collaboration process.
Figure 52 illustrates a relatively high level of the centralisation in this phase using the
standardised data for edits, editors and edits per editor.

Fig. 52 High Centralisation of the Collaboration for the iPad SR in Feb 10 -
Aug 10

In the first months after the official announcement (February - March), the
representation of iPad was dominated by anchors from the technical aspect category.
Technical aspects in this period include anchors such as pixel, gigabytes flash memory,
Bluetooth 2.1, apple a4 and a number of anchors for the communication technologies
that have been integrated into the device.

Another perspective on the iPad phenomenon is provided by introducing the
competitors for the device. In addition to the barnes & noble nook anchor from the
previous phase, the competitors category is extended in this period to include the
amazon and kindle anchors. Furthermore, the anchors tablet computer and stylus
acquired a competitive or opposing connotation in this period. According to the early
article revisions, the iPad “unlike traditional tablet computers [...] doesn't use a pen”104.
This refers to the tablet computer definition on Wikipedia at this time: “According to a
2001 Microsoft definition [...] of the term, "Tablet PCs" are pen-based, fully functional
x86 PCs with handwriting and voice recognition functionality”105. A multitouch device
such as the iPad does not satisfy this Microsoft definition and is therefore viewed as a
subversion of what is traditionally understood to be a tablet computer.

Related to anchors from this group are those from the usage category. The first anchors
in this category are focused around the ebook use case scenario, which had already
acquired familiarity at that time. The anchors in this category between February and
July include ebook, ibooks and ibookstore. This pattern of anchoring an unknown
phenomenon in terms of already known products and usage scenarios is in full
accordance with the social representations theory.

In a similar vein, anchors from the similarities category in this phase include devices
such as iphone and ipod touch, which had already existed for a longer period of time.
Two other anchors in the category, entitled smartphone and laptop, emphasise the
position of the iPad somewhere in between these two types of devices. This highlights the
dependency of the iPad social representation on already established representations.

The end of the phase is marked by a transition to a more stable anchoring following the
initial attempts at familiarisation. Figure 53 illustrates the dissimilarity measure of the
last month of the period (August 2010) to preceding and subsequent months. While the
dissimilarity to the previous months is rising, the measure for the subsequent months
indicates that about 70% of the anchoring remains stable in the following months.

104 Cf. iPad definition from 23:16, 17 April 2010,
http://en.wikipedia.org/w/index.php?title=IPad&direction=next&oldid=356674575
105 Cf. Tablet PC definition from 23:01, 17 April 2010,
http://en.wikipedia.org/w/index.php?title=Tablet_computer&direction=next&oldid=355863073

Fig. 53 Anchor Dissimilarity for iPad in 2010

In a nutshell, the time period from February 2010 until August 2010 is characterised by
initial attempts to familiarise the unknown concept of the iPad. Anchoring attempts
range from stating the technical specifications of the device, to emphasising the
competitive and similar devices, and lastly to usage scenarios. Anchors from the last
three categories focus on already established representations, such as an old definition
of a tablet computer by Microsoft, competing devices such as Amazon’s kindle, and
usage scenarios such as ebook reading.

Objectification (September 2010 – September 2012)

The previous phase resulted in a set of anchors that appear to be accepted by Wikipedia
contributors for the period between September 2010 and September 2012. In this
representation, the majority of technical anchors had disappeared. Rather, the
representation is dominated by anchors from the usage and similarities categories.
Figure 54 illustrates the strongest anchors in this phase.

Fig. 54 Strongest Anchors for iPad in Sep 2010 – Sep 2012

The effect of the almost permanent presence of these anchors in the article is shown in
Fig. 55. During the period, the dissimilarity measure falls below 25% meaning that
more than 75 percent of the anchoring remains stable from period to period.
Consequently, only minor anchor fluctuations are observed during this time frame.
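The relation between dissimilarity and stability used here can be made concrete with a small sketch. It assumes that the dissimilarity of two months is the Jaccard distance between their anchor sets. The measure actually implemented in WikiGen is defined elsewhere in the thesis and may differ, for instance by weighting anchors by their strength. The function name and the anchor sets are hypothetical.

```python
def dissimilarity(anchors_a, anchors_b):
    """Jaccard distance between two anchor sets (an assumed definition):
    the share of anchors that are not common to both months."""
    a, b = set(anchors_a), set(anchors_b)
    union = a | b
    if not union:
        return 0.0  # two empty anchor sets are treated as identical
    return 1 - len(a & b) / len(union)


# Hypothetical anchor sets for two consecutive months.
september = {"tablet computer", "iphone", "ipod touch", "ebook"}
october = {"tablet computer", "iphone", "ipod touch", "ibooks"}

# 3 shared anchors out of 5 distinct ones: 40% dissimilar, 60% stable.
d = dissimilarity(september, october)
stable_share = 1 - d
```

Under this reading, a dissimilarity below 25% corresponds directly to more than 75% of the anchors being shared between the compared months.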

The name of the phase, however, indicates that an intense objectification process
takes place throughout the period. This was derived from two observations. First, as
demonstrated in the subsequent section, this period contains a growing number of
articles that reference the iPad social representation. Second, the end of the phase
indicates a major change towards a more independent representation of the iPad device.

Fig. 55 Anchor Dissimilarity for the iPad SR in Sep 10-Sep 12

Individualisation (Oct 2012 – Jun 2013)

The last period in the anchoring process of the iPad social representation on Wikipedia
can be described as a process of individualisation. During this process, the
independence of the iPad phenomenon from its roots becomes apparent. The
corresponding changes are explained in the following.

The phase starts with a higher number of new anchors being introduced to the article
(Fig. 51). The majority of the new anchors represent the different generations of iPad
devices. In this way, the iPad representation is now anchored in terms of different iPad
versions such as iPad1, iPad2, iPad3 and iPad mini. Anchors from the similarities
category take over the role that concepts such as iphone, ipod, laptop and smartphone
had in the previous phases106. More importantly, the social representation of the tablet
computer itself is changed. In its current definition on Wikipedia, the tablet computer is
no longer associated with an old “stylus-based Microsoft definition”. Instead, the
representation is anchored in terms of the iPad itself.

106 Anchors laptop and smartphone disappeared in October 2012

A tablet computer, or simply tablet, is a one-piece mobile computer. Devices
typically have a touchscreen […]

Among tablets available in 2012, the top-selling line of devices was Apple's iPad
with 100 million sold by mid October 2012 since it had been released on April 3,
2010 […]
English Wikipedia107, formatting in original

Fig. 56 Anchor Movements for the iPad SR in Oct 12-Jun 13

Further indications of the growing independence of the iPad representation are hidden in
a number of small but important details in the anchoring evolution. First, within the
usage category, a number of anchors such as jailbreak, itunes, file synchronisation and
usb previously represented limitations in iPad use. Jailbreak was viewed as a
necessary means to access additional, unofficial iPad features. With the growing
number of software applications for the iPad, the jailbreak anchor lost its importance in
this period. The growing range of software products for the iPad is also reflected in the
appearance of anchors that symbolise different usage scenarios. These include
camera video, portable media player, social network service, reference work and gps
navigation software. Moreover, the usage range appears to be extended by the
disappearance of anchors such as iTunes, usb and file synchronisation. These anchors
stand for requirements to operate the device that were eliminated in one of the iPad
software updates. Changes in the usage and the previously described similarities
categories are illustrated in Figure 57.

107 Tablet computer definition on Wikipedia,
http://en.wikipedia.org/w/index.php?title=Tablet_computer&oldid=561068568, accessed 23.06.2013

Fig. 57 Usage and Similarities Categories for the iPad SR in 2013

Further support for the individualisation process in the anchor evolution of this time
period is provided by the fact that anchors from the competitors group are no longer
observed. A comparison with other devices is rendered unnecessary for an objectified
representation of the iPad.

In summary, the social representation of the iPad does not fall in between those of a
laptop and a smartphone, nor is it a device that is considered to be similar to the iphone
or ipod. Instead, it appears to be a distinct device with its own use cases and its own
tablet computer market that is defined in terms of the iPad device itself. It appears that
the job of the marketing department of a corporation such as Apple Inc. is to
objectify its products as quickly and as effectively as possible so that they become a part of
the social reality. In the context of the iPad’s rather short yet successful history, Apple’s
marketing department appears to have been successful in achieving this goal. A further
explanation for the rather fast objectification process can also be
found in the physical nature of the phenomenon. The iPad is an electronic device
designed to be touched, to be seen and to be heard. It induces sensations that
distinguish it from abstract phenomena such as cloud computing.

5.2.3 Objectification Analysis

The objectification analysis conducted with the help of the WikiGen tool revealed a total
of 2817 articles on Wikipedia that have references to the iPad social representation.
Of these, 894 are not encyclopaedia articles and are therefore not accounted for in the
distribution of the remaining 1923 articles in the main namespace of Wikipedia.
Figure 58 illustrates the number of references over time since the creation of the iPad
article.
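The filtering and counting step described above can be sketched as follows. The sketch assumes that each referencing page is available as a record with its MediaWiki namespace number (0 is the main namespace) and the year in which it first referenced the iPad article. How WikiGen dates a reference is not stated in this section, so the record layout, the function names and the sample pages are hypothetical.

```python
from collections import Counter
from itertools import accumulate

MAIN_NAMESPACE = 0  # MediaWiki namespace number of encyclopaedia articles


def references_per_year(pages):
    """Count main-namespace pages by the year of their first reference."""
    years = Counter(p["year"] for p in pages if p["ns"] == MAIN_NAMESPACE)
    return dict(sorted(years.items()))


def cumulative(counts):
    """Running total of the yearly counts, in chronological order."""
    return dict(zip(counts, accumulate(counts.values())))


# Hypothetical records, made up for this illustration.
pages = [
    {"title": "IPhone", "ns": 0, "year": 2010},
    {"title": "Talk:IPad", "ns": 1, "year": 2010},
    {"title": "Tablet computer", "ns": 0, "year": 2011},
    {"title": "App Store (iOS)", "ns": 0, "year": 2011},
    {"title": "User:Example", "ns": 2, "year": 2012},
]

yearly = references_per_year(pages)
running = cumulative(yearly)

# The totals reported in the text are consistent with this filtering step:
# 2817 referencing pages minus 894 non-article pages leaves 1923 articles.
remaining = 2817 - 894
```

The yearly counts correspond to a distribution such as the one in Figure 58, while the running total gives the cumulative view of objectification.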

In accordance with the results of the anchoring analysis, there is a significant increase in
the number of references in the years 2011 and 2012. This period was identified as an
‘objectification’ phase. The trend is still rising, indicating that there is an
ongoing objectification process through which the iPad becomes manifested in the
social reality of the Wikipedia users.

Fig. 58 Distribution of Articles that Contain References to the iPad Article (without indirect references)

6 Discussion

The cloud computing and iPad case studies in the previous section demonstrated the
developed method for studying the evolution of social representations on Wikipedia.
The aims of the discussion section are to answer the research question, and to elaborate
upon the identified similarities and differences in the evolution of both analysed social
representations.

The method presented in this work answers the research question of how the evolution
of social representations can be studied on Wikipedia. The method is based on three
essential steps. First, a mapping between concepts of the social representations theory
and Wikipedia elements, such as the one
presented in section 3.3, is required. Second, the employment of a statistical tool, such
as the WikiGen tool presented in section 4, is essential to process thousands of article
revisions that are available on the platform. Finally, a qualitative analysis on the level of
identified anchors is required to capture the semantics of the changes in the social
representation under study. The case studies of cloud computing and the iPad in section 5
integrated all three of these steps and demonstrated the applicability of the method for
studying the evolution of social representations on Wikipedia.

Figure 59 summarises the evolution of the cloud computing social representation. In the
course of the case study, six consecutive evolution phases were distinguished. Each
phase is characterised by changes in the importance of five identified anchor categories.
The evolution of the cloud computing social representation is marked by the highly
dynamic nature of its anchoring process. In this process, the generalising category
exercises the most important influence on the representation by extending its scope.
According to anchors in this category, cloud computing refers to almost anything that is
calculated on the internet. However, this view is repeatedly challenged by more specific
anchors from the categories of technical aspects of CC, sub concept of CC, CC use
cases and CC solutions and providers. This tension represents two
incompatible views on cloud computing: a general and a more specific one. Without
more specific anchors, the representation appears to evade the grasp of Wikipedia users.
An inclusion of more specific anchors, on the other hand, appears to be incoherent with
a broader scope of the phenomenon, which is set by generalising anchors. The tension
leads to consecutive alternations between either more or less specific representations of
the CC phenomenon across the different phases of its evolution.

Fig. 59 Evolution Phases and Anchor Perspectives for the CC Case Study

The second case study in this thesis analysed the social representation of the popular
iPad tablet device. The corresponding evolution of the iPad representation is
summarised in Fig. 60. As in the cloud computing case study, distinct phases within
the overall evolution time frame were distinguished, four in this case. Changes in the social
representation of the iPad are described in terms of the varying importance of five
identified anchor categories.

Starting with rumours about the origins of a new tablet device, the social representation
of the iPad steadily moves towards a distinct device with its own use cases and its own
tablet computer market that is defined in terms of the iPad device itself. In the course of
this shift, anchors that represent iPad competitors, as well as technical specifications of
the device, lose their importance to anchors from the similarities and usage categories. The
latter comprise the variety of different iPad versions and the multiplicity of use cases
that the iPad is increasingly associated with.

Both case studies have indicated similarities and differences in the evolution process of
the cloud computing and the iPad social representation on Wikipedia.

Fig. 60 Evolution Phases and Anchor Categories for the iPad Case Study

Similarities in the case studies

In accordance with both Wikipedia research (Kaltenbrunner and Laniado 2012a) and
elaboration upon the social representations theory (Höijer 2011), the analysed articles
are experiencing continuous change. This fact emphasises the dynamic nature of
knowledge and encourages seeing the meaning of an object through the prism of a
continuous social process that forms this meaning. In the case of cloud computing and
the iPad, both representations are initially derived from phenomena that were similar to
cloud computing and the iPad respectively. In the course of the evolution, however, both
representations shift towards becoming more independent. Simultaneously,
both social representations are increasingly anchored in terms of their usage rather than
in terms of their technical characteristics. A similar pattern can be assumed to exist for
other phenomena in the IT field.

The case studies indicate that the unfamiliarity with a phenomenon, as defined in the
social representations theory, leads to a more intense and centralised collaboration
process among the Wikipedia users. Attempts to resolve this unfamiliarity, in turn, are
reflected in the changing anchoring of the corresponding social representation. For
example, the initially high unfamiliarity with the cloud computing and the iPad
phenomena is reflected in a similar phase in the evolution of both representations,
entitled initial familiarisation attempts. During these phases, an intense and centralised
collaboration process among the Wikipedia users is accompanied by significant changes
in the anchoring of the representations. The latter is characterised by the introduction of
all the different anchor categories that are associated with the corresponding social
representation, without any group of aspects dominating the others. At the end of this
phase, one or several anchor groups acquire a dominating role. In the cloud computing
case study, the standout category is that of generalising anchors, while in the iPad case
study this role was taken by anchors in usage and similarities categories. This initial
constellation, however, is subject to change, as the cloud computing case study in this
thesis has demonstrated.

The objectification analysis in both case studies showed a similar increasing trend in the
cumulative number of references among Wikipedia articles to the social representation
under study. This is a particularly valuable observation given the fact that the anchoring
processes in the two case studies are substantially different. While the anchoring of the iPad is
rather ‘smooth’ and ‘stable’, the corresponding process for cloud computing is
volatile. Yet the social representations of both phenomena increasingly become part of
the social reality. This is indicated by the rising number of other phenomena being
explained in terms of either cloud computing or the iPad. Figure 61 illustrates the
objectification of cloud computing and iPad social representations on Wikipedia.

Fig. 61 Objectification of Cloud Computing (left) and of iPad (right)

Differences in the case studies

It must be noted that some differences in the case studies might be grounded in the fact
that the divergent representations might belong to different social representation types.
Cloud computing and the iPad are recent technological phenomena, with the analysed
time frames on Wikipedia being concurrent with the history of both phenomena. However,
the iPad is a physical device, which contrasts with the abstract model of cloud computing.
Furthermore, it can be assumed that Apple Inc. and its most devoted customers have a
stronger personal interest in crafting the representation of the iPad. The relationship
between cloud computing and different social groups on the other hand appears to be
more neutral. Regardless of the correctness of these assumptions, differences in the level
of group interests towards phenomena under study can potentially be a source of
differences in the evolution of the corresponding social representations. A
categorisation of representations in further studies can therefore enable a selection of
more homogeneous social representations in order to study more specific evolution
patterns. In this manner, representations that are more likely to be deliberately
influenced by social groups can represent one possible sample.

The anchoring analysis in the thesis revealed distinct differences in the evolution of the
cloud computing and iPad representations. The cloud computing social representation is
marked by a more fluctuating anchoring process and therefore the representation of
cloud computing appears to be more ambiguous than the representation of the iPad. The
anchoring of cloud computing oscillates between a widening in scope of what cloud
computing means and a narrowing in scope. On the one hand, anchors in the
‘generalising’ category extend the scope of cloud computing by generalising it to
additional possible services or resource delivery modes. On the other hand, this view is
repeatedly challenged by more specific anchor categories. This effect can be observed
in the left part of Fig. 62, where peaks in the anchor dissimilarity of cloud
computing correspond to fluctuations between general and specific perspectives on
cloud computing.

Fig. 62 Monthly Dissimilarity of Cloud Computing (left) and iPad (right).

In addition to the aforementioned differences in the anchoring process, the collaboration
dynamics of the two articles differ substantially. The cloud computing article shows
six months of increasing editing activity until reaching its peak. In contrast, the highest
activity for the iPad article is observed at the very beginning of the article evolution.
The reason is that the collaboration on the iPad article clearly follows events
such as the official announcement of the device. The official announcement
of the first iPad device attracted the highest attention to the iPad article. The
subsequent peaks in the collaboration process are also due to further release or
announcement statements made by Apple Inc. Therefore, matching events with the
collaboration dynamics is an additional explanatory technique available for phenomena
that are similar to the iPad in this respect.

Another difference between the two case studies was observed in the evolution of the
iPad social representation. In the individualisation evolution phase, the iPad
representation is altered due to changes within categories rather than between
categories. In this phase, anchors from the similarities group are replaced by other
anchors from the same group that have changed the focus of the representation. This
implies that changes in the anchoring can happen even if the strength of the anchor
groups remains the same and only anchors within a category are replaced.

Last but not least, in the iPad case study it was possible to establish a connection
between the anchoring and objectification processes. During the objectification phase,
the anchors of the iPad stayed stable for a longer time period, indicating a temporary
consensus. The assumption that an increasing objectification would be observed in this
period was supported by the results of the objectification analysis. For the volatile
anchoring process of the cloud computing SR, no indications for interdependence
between anchoring and objectification were observed.

Method evaluation

Due to the meaningful interpretations provided by the case studies, the method is
considered to be a suitable means for studying the evolution of social representations on
Wikipedia. Some of the method’s advantages are emphasised in the following.
First, the Wikipedia structure with its page concept and transparent collaboration
processes welcomes the application of the social representations theory. The
encyclopaedia articles with their unique names correspond to the definition of social
representations, and the internal links on Wikipedia allow exploration of the
relationship between social representations. Second, the tool enabled analyses in the
case studies. Without the automated anchor identification and statistics, such as the
anchor strength, tracing changes in the social representations on Wikipedia would be
intractable. Finally, the choice of the case studies has proven itself to be useful. The
cloud computing phenomenon demonstrated the dynamic nature of knowledge with the
changing character of its social representation. On the other hand, the iPad social
representation appeared to exist even before the actual device was produced, being
entirely based on rumours and speculations. This emphasises MOSCOVICI’s central
argument that social representations form reality rather than reflect it (1990, p.164).

However, the method also has its pitfalls. First, not all WikiGen statistics have proven to
be equally useful. In the case studies, the anchor durability and anchor edit war level
statistics played only a peripheral role. Second, the mapping of anchors to internal links
in the definition part of the article appeared to be problematic for a number of reasons.
For example, some potentially important concepts, although present in the relevant
section of the article, remain unlinked to the corresponding encyclopaedia entries. Other
potential anchors might be present in the article outside of the definition section. This is
particularly true for controversial phenomena, in which the inclusion of controversial
aspects into the definition section contradicts the encyclopaedia’s requirements of
neutrality and objectivity. Correspondingly, such aspects often migrate into the
specialised sections of the article (Hansen et al. 2009). Furthermore, some aspects of the
representation can be found in the talk pages of the corresponding article (Schneider et
al. 2010). As a consequence, potential anchors in those scenarios are not accounted for
in the case study. To increase the range of identified anchors and therefore to facilitate
analysis of different representations of the same phenomenon, the tool must be
extended.
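The limitation described above follows directly from how anchors are mapped to internal links in the definition part of an article. A minimal sketch of such a mapping is given below, assuming that the definition part is the wikitext before the first level-2 section heading and that anchors are the targets of `[[...]]` links. This is an illustration only and not the WikiGen implementation. The function names and the sample wikitext are hypothetical, and the sketch does not filter out non-article links such as files or categories.

```python
import re

# Matches [[Target]] and [[Target|label]]; a section link such as
# [[Target#Section]] is reduced to the target title.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def definition_part(wikitext):
    """Return the text before the first level-2 heading (== Heading ==)."""
    return re.split(r"^==[^=].*?==\s*$", wikitext, maxsplit=1,
                    flags=re.MULTILINE)[0]


def candidate_anchors(wikitext):
    """Lower-cased link targets in the definition part, in order of first
    appearance and without duplicates."""
    seen = []
    for match in WIKILINK.finditer(definition_part(wikitext)):
        target = match.group(1).strip().lower()
        if target not in seen:
            seen.append(target)
    return seen


# Hypothetical wikitext, written for this example only.
sample = """The '''iPad''' is a [[tablet computer]] by [[Apple Inc.|Apple]].
It is used as an e-book reader and [[portable media player]].

== History ==
Critics compared it with the [[Amazon Kindle]].
"""

anchors = candidate_anchors(sample)
```

In the sample, the unlinked phrase "e-book reader" and the link to Amazon Kindle outside the definition part are both missed, which mirrors the two cases named in the text.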

7 Conclusions and Directions for Further Research

Research faces the challenge of accounting for the dynamic nature of knowledge.
This challenge historically belongs to the problem space of philosophy (Marková 1996),
and is typically explored within the social sciences (Berger and Luckmann 1966). However,
other fields such as information systems increasingly recognise the need to study
knowledge in its making in order to understand group phenomena (Boland 1999). This
epistemological problem of the origins of knowledge “becomes a social problem in the
world of today, with its permanent scientific and technological revolutions” (Moscovici
1988, p.216). Divergent representations of the same objects are observed across both
time and social space. Within this multiplicity, an increasing number of representations
are circulating in the digital world, especially in the realm of social media.

Instead of focusing on the evolution of single representations, this thesis adopted the
theoretical notion of social representations (Moscovici 2008 / 1966) to provide a means
for studying the process of knowledge evolution itself. Having identified the potential of
using Wikipedia’s historical data to study the genealogy of knowledge, the thesis
addressed the question of how the evolution of social representations can be studied on
Wikipedia. The answer to the question is presented in the form of the method for
studying social representations, which integrates a quantitative analysis of Wikipedia
data by means of a specially developed statistical tool into a case study.

The cloud computing and iPad case studies in this work demonstrated the applicability
of the method. The findings of the case studies are manifold. First, the Wikipedia data
provided evidence for the existence of the continuous anchoring and objectification
processes that are defined within the social representations theory. Second, the
quantitative data provided by the tool was found to be essential in order to indicate
changes in the social representations and, additionally, to derive distinct phases from the
overall evolution period that is analysed in a case study. Phases with an increasing
intensity and centralisation of the collaboration process have indicated a correlation
with significant changes in the corresponding representation. Third, the qualitative
analysis was found to provide meaningful interpretation of the changes that are
indicated by the quantitative data. Finally, similarities and differences in the patterns of
change were identified for the social representations of cloud computing and the
iPad. Both representations departed from their historical roots, which were represented
by anchoring both phenomena in terms of similar concepts, towards more independent
representations. Furthermore, the evolution of both phenomena is accompanied by the
strengthening of their usage aspects and the weakening role of technical specifications.
However, the actual patterns of change were found to be different for both phenomena.
While the evolution of the iPad representation is characterised by a ‘smooth’ change
towards a more individual and usage-centred representation, the different groups of
aspects of the cloud computing representation were found to be competitive. Despite
these differences in their evolution, cloud computing and iPad phenomena are found to
increasingly become a part of the social reality for the Wikipedia users.

The case studies also identified limitations of the method. The chosen mapping between
theoretical concepts and Wikipedia elements focuses on the dominating representation
of the phenomena, which is presented in the definition part of the article. Especially in
the case of controversial phenomena, this part might contain only ‘neutralised’ aspects
of the representation while more distinct representations can be hidden in other sections
of the article. Additionally, user discussions on the talk pages potentially contain further
aspects of the social representation or even distinct representations that are not
expressed in the actual article. These missing representation aspects limit the
applicability of the method for studying the evolution of different social representations
of the same phenomenon. Furthermore, the generalisability of results based on two
conducted case studies is limited. Although the case studies demonstrate the
applicability of the method and illustrate an effective employment of the developed
statistical tool, a larger and more specialised sample of social representations is required
to derive more specific patterns.

The aforementioned limitations constitute directions for further research. First, the
WikiGen tool can be extended to identify potentially missing anchors within the plain
text or among the internal links outside of the articles’ definition sections. Together
with an analysis of talk page discussions with a corresponding sentiment analysis
(Schneider et al. 2010), the method would be better suited for tracing different
representations of the same object. Another direction for further research suggests an
attempt to identify patterns in the evolution of social representations on Wikipedia
within a specific group of objects such as a set of communication and collaboration
software products. The results of such a study would be valuable for understanding how
people perceive new IT in enterprises and how this perception changes over time. The
use of the method is therefore promising for IS implementation research, a
direction which is already indicated within IS research (Gal and Berente 2008).

Given that a similar objectification process was observed despite ambiguities in the
representation of cloud computing, the thesis furthermore suggests that a fast
objectification process for a product or service is one of the main marketing aims.
The iPad commercials contain apparent objectification elements, which help to
establish a new device as a part of the customers’ social reality. Similarly, the success of
cloud computing is described by the Wikipedia users themselves in terms of successful
marketing campaigns of the biggest cloud computing providers. The role of the cloud
imagery in their marketing efforts is apparent. This connection between the social
representations theory and marketing appears to be a promising explanatory device for
understanding both the driving factors behind the success of marketing campaigns and
technology acceptance in general.

Having demonstrated the method for studying social representations on Wikipedia and
furthermore indicating directions for further research, this thesis is a step towards
understanding the dynamic nature of knowledge. It is the hope of the author that both
the method and the tool will encourage further studies on the genealogy of knowledge.
The importance of the latter is unquestionable in that each identified pattern in the social
process of forming representations that constitute our social reality is a step towards a
better understanding of human conduct.

References

Adler, B. T., De Alfaro, L., Mola-Velasco, S. M., Rosso, P., and West, A. G. 2011.
“Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and
Reputation Features.“, Lecture Notes in Computer Science (6609:PART 2), A.
Gelbukh (ed.), pp. 277–288.

Anderka, M., and Stein, B. 2012. “A breakdown of quality flaws in Wikipedia“, in


Proceedings of the 2nd Joint WICOWAIRWeb Workshop on Web Quality
WebQuality 12, ACM Press, p. 11.

Backstrom, A., Pirttila-Backman, A.-M., and Tuorila, H. 2003. “Dimensions of novelty:


a social representation approach to new foods“, Appetite (40:3), pp. 299–307.

Bangerter, A. 1995. “Rethinking the Relation Between Science and Common Sense: A
Comment on the Current State of SR Theory“, Papers on Social Representations
(4), pp. 61–78.

Bartlett, F. C. 1995. Remembering: A Study in Experimental and Social Psychology,


Cambridge University Press.

Barton, M. D. 2005. “The future of rational-critical debate in online public spheres“,


Computers and Composition (22:2), pp. 177–190.

Bauer, M., and Gaskell, G. 1999. “Towards a Paradigm for Research on Social
Representations“, Journal for the Theory of Social Behaviour (29:2).

Bauer, M. W., and Gaskell, G. 2008. “Social Representations Theory: A Progressive


Research Programme for Social Psychology“, Journal for the Theory of Social
Behaviour (38:4), pp. 335–353.

Belani, A. 2010. “Vandalism Detection in Wikipedia: a Bag-of-Words Classifier


Approach“, Changes, p. 15.

Bellomi, F., and Bonato, R. 2005. “Network analysis for Wikipedia“, in Proceedings of
Wikimania 2005, The First International Wikimedia Conference.

Berger, P. L., and Luckmann, T. 1966. The social construction of reality: a treatise in
the sociology of knowledge, Doubleday.

Billig, M. 1986. “Social representation, objectification and anchoring: A rhetorical


analysis.“, Social Behaviour (3:1), pp. 1–16.

Bizer, C., Lehmann, J., Kobilarov, G., Auer, S., Becker, C., Cyganiak, R., et al. 2009.
“DBpedia - A crystallization point for the Web of Data“, Web Semantics Science
Services and Agents on the World Wide Web (7:3), pp. 154–165.

Boland, R. J. 1999. “The Tyranny of Space in Organizational Analysis“, in New
Information Technologies in Organizational Processes, pp. 27–48.

Breakwell, G. M., and Canter, D. V. 1993. Empirical approaches to social
representations, Clarendon Press.

Chin, S.-C., Street, W. N., Srinivasan, P., and Eichmann, D. 2010. “Detecting
Wikipedia vandalism with active learning and statistical language models“, North,
p. 3.

Dacin, M. T., Munir, K., and Tracey, P. 2010. “Formal Dining at Cambridge Colleges:
Linking Ritual Performance and Institutional Maintenance“, Academy of
Management Journal (53:6), pp. 1393–1418.

Doise, W., Clémence, A., and Lorenzi-Cioldi, F. 1993. The quantitative analysis of
social representations, Harvester Wheatsheaf.

Doise, W., Staerkle, C., Clémence, A., and Savory, F. 1998. “Human rights and
Genevan youth: A developmental study of social representations“, Swiss Journal of
Psychology (57:2), pp. 86–100.

Duveen, G., and De Rosa, A. 1992. “Social Representations and the Genesis of
Social Knowledge“, Papers on Social Representations (1), pp. 94–108.

Duveen, G. 1990. Social representations and the development of knowledge, G. Duveen
and B.B. Lloyd (eds.), Cambridge University Press.

Duveen, G. 2000. “The Power of Ideas“, in Social Representations: Explorations in
Social Psychology, S. Moscovici and G. Duveen (eds.), Polity Press Cambridge.

Duveen, G. 1996. “The development of social representations of gender“, Japanese
Journal of Experimental Social Psychology (35), pp. 256–262.

Farr, R. M. 1993. “Common sense, science and social representations“, Public
Understanding of Science (2:3), pp. 189–204.

Farr, R. M. 1996. The Roots of Modern Social Psychology: 1872-1954, Wiley.

Figari, H., and Skogen, K. 2011. “Social representations of the wolf“, Acta Sociologica
(54:4), pp. 317–332.

Gal, U., and Berente, N. 2008. “A social representations perspective on information
systems implementation: Rethinking the concept of ‘frames’“, IT & People
(21:2), pp. 133–154.

Gervais, M. 1997. “Social representations of nature: the case of the ‘Braer’ oil spill in
Shetland“, (Doctoral thesis).

Ghazal, M., Vazquez, C., and Amer, A. 2007. Real-time automatic detection of
vandalism behavior in video sequences.

Giles, J. 2005. “Internet encyclopaedias go head to head.“, Nature (438:7070), p. 900.

Gillespie, A. 2008. “Social representations, alternative representations and semantic
barriers“, Journal for the Theory of Social Behaviour (38:4), pp. 375–391.

Glăveanu, V. P. 2009. “What differences make a difference?: a discussion of hegemony,
resistance and representation“, Papers on Social Representations (18), pp. 1–22.

Glott, R., Schmidt, P., and Ghosh, R. 2010. “Analysis of Wikipedia Survey Data.”

Hansen, S., Berente, N., and Lyytinen, K. 2009. “Wikipedia, Critical Social Theory, and
the Possibility of Rational Discourse“, The Information Society (25:1), pp. 38–59.

Hegel, G. W. F. 1998. Phenomenology Of Spirit, Motilal Banarsidass Publishers Pvt.
Limited.

Höijer, B. 2011. “Social Representations Theory“, Nordicom Review (32:2), pp. 3–16.

Holloway, T., Bozicevic, M., and Börner, K. 2007. “Analyzing and visualizing the
semantic coverage of Wikipedia and its authors: Research Articles“, Complex.
(12:3), pp. 30–40.

Howarth, C. 2006. “A social representation is not a quiet thing: Exploring the critical
potential of social representations theory“, British Journal of Social Psychology
(45:1), pp. 65–86.

Jahoda, G. 1988. “Critical notes and reflections on social representations“, European
Journal of Social Psychology (18:3), pp. 195–209.

Javanmardi, S., McDonald, D. W., and Lopes, C. V. 2011. “Vandalism Detection in
Wikipedia: A High-Performing, Feature-Rich Model and its Reduction Through
Lasso“, in Proceedings of the 7th International Symposium on Wikis and Open
Collaboration, ACM, pp. 82–90.

Jodelet, D. 1991. Madness and Social Representations: Living With the Mad in One
French Community, University of California Press.

Jodelet, D. 2008. “Social Representations: The Beautiful Invention“, Journal for the
Theory of Social Behaviour (38:4), pp. 411–430.

Joffe, H. 2009. “Social representations of AIDS: towards encompassing issues of
power“, Papers on Social Representations (4:1), pp. 29–40.

Ju, B., and Gluck, M. 2011. “Calibrating information users’ views on relevance: A
social representations approach“, Journal of Information Science (37:4),
pp. 429–438.

Kaganer, E. 2010. “Responding to the (almost) unknown: Social representations and
corporate policies of social media“, Information Systems Journal, Paper 163.

Kaltenbrunner, A., and Laniado, D. 2012a. “There is No Deadline - Time Evolution of
Wikipedia Discussions“, CoRR (abs/1204.3453).

Korsgaard, T. R., and Jensen, C. D. 2009. “Reengineering the Wikipedia for
Reputation“, Electronic Notes in Theoretical Computer Science (244:244),
pp. 81–94.

Kubiszewski, I., Noordewier, T., and Costanza, R. 2011. “Perceived credibility of
Internet encyclopedias“, Computers & Education (56:3), pp. 659–667.

Kurzman, C. 1994. “Epistemology and the Sociology of Knowledge“, in Philosophy of
the Social Sciences (24:3), pp. 267–290.

László, J. 1997. “Narrative organization of Social Representation“, Papers on Social
Representations, J. László and W.S. Rogers (eds.), pp. 155–172.

Lewin, K. 1948. Resolving Social Conflicts: Selected Papers on Group Dynamics, G.W.
Allport and G.W. Lewin (eds.), Harper & Row.

Lynch, M. 1997. Scientific Practice and Ordinary Action: Ethnomethodology and
Social Studies of Science, Cambridge University Press.

Magnus, P. D. 2009. “On Trusting Wikipedia“, Episteme (6:1), pp. 74–90.

Marková, I. 2000. “Amédée or How to get rid of it: Social Representations from a
Dialogical Perspective“, Culture & Psychology (6:4), pp. 419–460.

Marková, I. 1996. “Towards an Epistemology of Social Representations“, Journal for
the Theory of Social Behaviour (26:2), pp. 177–196.

Martin, O. S. 2011. “A Wikipedia Literature Review“, ArXiV (abs/1110.5).

McKinlay, A., and Potter, J. 1987. “Social representations: A conceptual critique“,
Journal for the Theory of Social Behaviour (17), pp. 471–488.

Mills, J., Bonner, A., and Francis, K. 2006. “The Development of Constructivist
Grounded Theory“, International Journal (5:March), pp. 1–10.

Moloney, G., and Walker, I. 2002. “Talking about transplants: Social representations
and the dialectical, dilemmatic nature of organ donation and transplantation“,
British Journal of Social Psychology (41:2), pp. 299–320.

Morsey, M., Lehmann, J., Auer, S., Stadler, C., and Hellmann, S. 2012. “DBpedia and
the live extraction of structured data from Wikipedia“, Program Electronic Library
And Information Systems (46:2), pp. 157–181.

Moscovici, S. 1985. “Comment on Potter & Litton“, British Journal of Social
Psychology (24:2), pp. 91–92.

Moscovici, S. 1990. “Social Psychology and Developmental Psychology: Extending the
Conversation“, in Social Representations and the Development of Knowledge, G.
Duveen and B. Lloyd (eds.), Cambridge University Press, pp. 164–186.

Moscovici, S. 2008. Psychoanalysis: Its Image and Its Public, G. Duveen (ed.), Wiley.

Moscovici, S. 2000. Social representations: Explorations in social psychology, G.
Duveen (ed.), Polity Press Cambridge.

Moscovici, S. 1988. “Notes towards a description of Social Representations“, European
Journal of Social Psychology (18:3), pp. 211–250.

Moscovici, S. 1994. “Social representations and pragmatic communication“, Social
Science Information (33:2), pp. 163–177.

Moscovici, S. 1984. “The phenomenon of social representations“, in Social
representations, R. M. Farr and S. Moscovici (eds.), p. 69.

Munslow, A. 1997. “Deconstructing History“, Routledge.

Oetzel, M. C. 2011. “The Online Privacy Paradox: A Social Representations
Perspective“, Public Policy, pp. 2107–2112.

Olleros, F. X. 2008. “Learning to Trust the Crowd: Some Lessons from Wikipedia“,
2008 International MCETECH Conference on eTechnologies mcetech 2008, pp.
212–216.

Pawlowski, S. D., Kaganer, E. A., and Cater, J. J. 2007. “Focusing the research agenda
on burnout in IT: social representations of burnout in the profession“, European
Journal of Information Systems (16:5), pp. 612–627.

Potter, J., and Edwards, D. 1999. “Social representations and discursive psychology:
From cognition to action“, Culture & Psychology (5), pp. 447–458.

Potthast, M., Stein, B., and Gerling, R. 2008. “Automatic Vandalism Detection in
Wikipedia“, Advances in Information Retrieval (4956), C. Macdonald, I. Ounis, V.
Plachouras, I. Ruthven and R.W. White (eds.), pp. 663–668.

Psaltis, C. 2012. “Social Representations of Gender in Peer Interaction and Cognitive
Development“, Social and Personality Psychology Compass (6:11), pp. 840–851.

Ratkiewicz, J., Fortunato, S., Flammini, A., Menczer, F., and Vespignani, A. 2010.
“Characterizing and modeling the dynamics of online popularity.“, Physical
Review Letters (105:15), p. 158701.

Rosa, A. S. de 2013. “Taking stock: a theory with more than half a century of history“,
in Social Representations in the “Social Arena”, A.S. de Rosa (ed.), Routledge.

Rose, D., Efraim, D., Gervais, M.-C., Joffe, H., Jovchelovitch, S., and Morant, N. 1995.
“Questioning consensus in social representations theory“, Papers on social
representations (4:2), pp. 150–176.

Rosenzweig, R. 2006. “Can History Be Open Source? Wikipedia and the Future of the
Past“, Journal of American History (93:1), pp. 117–146.

Sá, C. P. de 1996. “Determining the central nucleus of social representations“, LSE
Methodology Institute – Papers in Social Research Methods/Qualitative Series (2).

Schneider, J., Passant, A., and Breslin, J. 2010. “A Qualitative and Quantitative
Analysis of How Wikipedia Talk Pages Are Used.”, Web Science Conference.

Siorpaes, K., and Bachlechner, D. 2006. “Harvesting Wiki Consensus - Using
Wikipedia Entries as Ontology Elements“, in IEEE Internet Computing, pp. 54–65.

Smets, K., Goethals, B., and Verdonk, B. 2008. “Automatic Vandalism Detection in
Wikipedia: Towards a Machine Learning Approach“, in AAAI Workshop on
Wikipedia and Artificial Intelligence: An Evolving Synergy, AAAI Press, pp. 43–48.

Stross, R. 2006. “Anonymous Source Is Not the Same as Open Source“, in New York
Times.

Suchecki, K., Salah, A. A. A., Gao, C., and Scharnhorst, A. 2012. “Evolution of
Wikipedia’s Category Structure“, Advances in Complex Systems (15:supp01), p.
19.

Vaast, E. 2007. “Danger is in the eye of the beholders: Social representations of
Information Systems security in healthcare“, The Journal of Strategic Information
Systems (16:2), pp. 130–152.

Valsiner, J. 2003. “Beyond social representations: A theory of enablement“, Papers on
Social Representations (12), pp. 7.1–7.16.

Wagner, W., Duveen, G., Farr, R., Jovchelovitch, S., Cioldi, F. L., Marková, I., et al.
1999. “Theory and Method of Social Representations“, Asian Journal Of Social
Psychology (2:1), pp. 95–125.

Wagner, W., Elejabarrieta, F., and Lahnsteiner, I. 1995. “How the sperm dominates the
ovum - objectification by metaphor in the social representation of conception“,
European Journal of Social Psychology (25:6), pp. 671–688.

Wagner, W., Valencia, J., and Elejabarrieta, F. 1996. “Relevance, discourse and the
‘hot’ stable core of social representations - A structural analysis of word
associations“, British Journal of Social Psychology (35:3), pp. 331–351.

Weber, M., Roth, G., and Wittich, C. 1978. Economy and Society: An Outline of
Interpretative Sociology, University of California Press.

Wilkinson, D. M., and Huberman, B. A. 2007. “Assessing the Value of Cooperation in
Wikipedia“, Arxiv preprint cs0702140 (12:4), pp. 1–14.

Yasseri, T., Sumi, R., Rung, A., Kornai, A., and Kertész, J. 2012. “Dynamics of
conflicts in Wikipedia“, PLoS ONE (7:6), A. Szolnoki (ed.), p. e38869.

Appendix

A Page and Article Distributions According to (Anderka and Stein 2012)

Tab. 4 Distribution of Pages among Namespaces as of January 15, 2011
(Anderka and Stein 2012)

Tab. 5 Distribution of Articles among Topics as of January 15, 2011
(Anderka and Stein 2012)

B Anchor Coding for Cloud Computing

Anchor categories:
(1) Concepts from which cloud computing has departed
(2) Figurative aspects of cloud computing
(3) Technical aspects of cloud computing
(4) Sub concepts of cloud computing
(5) Origins of cloud computing
(6) Cloud computing solutions and providers
(7) Benefits of cloud computing
(8) Broader concepts related to cloud computing
(9) Means to interact with cloud computing

Anchor  Rank in 2008  Rank in 2009  Rank in 2010  Rank in 2011  Rank in 2012
cloud 0,27 0,01 0,00 0,00 0,41
computer network diagram 0,25 0,99 0,99 0,22 0,00
electricity 0,24 0,00 0,00 0,00 0,00
metaphor 0,00 0,69 0,99 0,22 0,00
abstraction 0,00 0,66 0,99 0,23 0,00
electrical grid 0,00 0,00 0,52 0,50 0,88
software as a service 0,70 0,90 0,11 0,09 0,74
grid computing 0,56 0,00 0,00 0,00 0,00
everything as a service 0,52 0,99 0,29 0,00 0,00
utility computing 0,47 0,04 0,26 0,54 1,00
software 0,45 0,99 0,99 0,22 0,33
distributed computing 0,27 0,00 0,00 0,00 0,00
autonomic computing 0,21 0,00 0,00 0,10 0,00
public utility 0,21 0,00 0,00 0,00 0,00
computational resource 0,21 0,00 0,00 0,00 0,00
platform as a service 0,00 0,73 0,11 0,00 0,53
infrastructure as a service 0,00 0,60 0,00 0,00 0,52
test environment as a service 0,00 0,00 0,00 0,00 0,38
storage as a service 0,00 0,00 0,00 0,00 0,38
security as a service 0,00 0,00 0,00 0,00 0,38
data as a service 0,00 0,00 0,00 0,00 0,38
api as a service 0,00 0,00 0,00 0,00 0,38
business model 0,00 0,00 0,00 0,00 0,33
desktop as a service 0,00 0,00 0,00 0,00 0,25
backend as a service 0,00 0,00 0,00 0,00 0,16
it as a service 0,00 0,00 0,00 0,00 0,15
google apps 0,45 0,33 0,00 0,00 0,00
salesforce 0,23 0,18 0,52 0,07 0,00
google 0,23 0,00 0,60 0,08 0,00
ibm 0,23 0,00 0,45 0,07 0,00
microsoft 0,23 0,00 0,46 0,00 0,00
hewlett packard 0,23 0,00 0,26 0,07 0,00
yahoo! 0,23 0,00 0,00 0,15 0,00
valeo 0,22 0,00 0,00 0,00 0,00
l'oreal 0,22 0,00 0,00 0,00 0,00
general electric 0,22 0,00 0,00 0,00 0,00
volunteer computing 0,20 0,00 0,00 0,00 0,00
skype protocol#protocol 0,20 0,00 0,00 0,00 0,00
peer to peer 0,20 0,00 0,00 0,00 0,00
bittorrent (protocol) 0,20 0,00 0,00 0,00 0,00
amazon web services 0,19 0,00 0,39 0,07 0,00
procter & gamble 0,19 0,00 0,00 0,00 0,00
intel 0,19 0,00 0,00 0,00 0,00
vmware 0,00 0,00 0,24 0,07 0,00
fujitsu 0,00 0,00 0,20 0,07 0,00
dell 0,00 0,00 0,17 0,07 0,00
skytap 0,00 0,00 0,15 0,07 0,00
amazon 0,00 0,00 0,19 0,00 0,00
web-based email 0,00 0,00 0,00 0,18 0,00
microsoft outlook 0,00 0,00 0,00 0,18 0,00
microsoft entourage 0,00 0,00 0,00 0,18 0,00
hotmail 0,00 0,00 0,00 0,18 0,00
gmail 0,00 0,00 0,00 0,18 0,00
hp 0,00 0,00 0,17 0,00 0,00
azure services platform 0,00 0,00 0,09 0,07 0,00
red hat 0,00 0,00 0,08 0,07 0,00
mozilla thunderbird 0,00 0,00 0,00 0,15 0,00
evolution mail 0,00 0,00 0,00 0,15 0,00
netapp 0,00 0,00 0,13 0,07 0,00
service level agreement 0,24 0,00 0,84 0,22 0,00
quality of service 0,24 0,00 0,84 0,22 0,00
capital expenditure 0,24 0,00 0,00 0,00 0,00
economies of scale 0,00 0,00 0,00 0,00 0,67
subscription 0,24 0,00 0,00 0,00 0,00
web application 0,48 0,00 0,00 0,04 0,00
web browser 0,45 0,99 0,99 0,39 0,97
business application 0,00 0,54 0,99 0,22 0,63
application software 0,00 0,00 0,00 0,04 0,97
mobile app 0,00 0,00 0,00 0,00 0,92
data center 0,66 0,00 0,34 0,09 0,07
virtualization 0,47 0,77 0,93 0,00 0,05
data 0,45 0,99 0,99 0,22 0,00
parallel computing 0,29 0,00 0,00 0,00 0,00
multi-core 0,29 0,00 0,00 0,00 0,00
computer cluster 0,29 0,00 0,00 0,00 0,00
multitenancy 0,24 0,00 0,00 0,00 0,00
vector processor 0,23 0,00 0,00 0,00 0,00
multi-tenant 0,23 0,00 0,00 0,00 0,00
loose coupling 0,20 0,00 0,00 0,00 0,00
cluster (computing) 0,20 0,00 0,00 0,00 0,00
self-management (computer science) 0,18 0,00 0,00 0,00 0,00
scalability 0,00 0,94 0,99 0,22 0,05
server (computing) 0,00 0,54 0,99 0,27 0,58
hardware virtualization 0,00 0,00 0,06 0,22 0,00
remote server 0,00 0,00 0,00 0,18 0,00
operating system 0,00 0,00 0,00 0,17 0,00
internet 0,72 0,81 0,99 0,35 1,00
computing 0,53 0,29 0,98 0,54 0,99
web 2.0 0,47 0,34 0,00 0,00 0,00
utility 0,24 0,00 0,00 0,00 0,00
open standards 0,21 0,00 0,00 0,00 0,00
open source software 0,21 0,00 0,00 0,00 0,00
computer 0,06 0,00 0,45 0,13 0,00
information technology 0,00 0,01 0,26 0,22 0,05
computer network 0,00 0,00 0,00 0,58 1,00
shared services 0,00 0,00 0,00 0,00 0,97
converged infrastructure 0,00 0,00 0,00 0,00 0,97
service (economics) 0,00 0,00 0,00 0,31 0,44
service-oriented architecture 0,00 0,00 0,15 0,22 0,00
paradigm shift 0,00 0,17 0,82 0,00 0,00
client–server 0,00 0,00 0,67 0,00 0,00
product (business) 0,00 0,00 0,00 0,31 0,33
mainframe computer 0,00 0,00 0,60 0,00 0,00
computer data storage 0,00 0,00 0,04 0,00 0,24
goizueta business school 0,00 0,26 0,00 0,00 0,00
emory university 0,00 0,26 0,00 0,00 0,00
ramnath chellappa 0,00 0,23 0,00 0,00 0,00
Tab. 6 Strongest Anchors for Cloud Computing Including Categories

C Anchor Coding for iPad

Anchor categories:
(1) Products and companies that compete with iPad and Apple respectively
(2) Origins of the iPad
(3) Technical aspects of the iPad
(4) Products and technologies that are similar to the iPad
(5) Use cases for the iPad

Anchor  Rank in 2010  Rank in 2011  Rank in 2012
gigabytes 0,15 0,00 0,00
flash memory 0,15 0,00 0,00
bluetooth 2.1 0,15 0,00 0,00
led backlit 0,56 0,00 0,00
dock connector 0,15 0,00 0,00
apple a4 0,12 0,00 0,00
wi-fi 0,77 1,00 1,00
3g 0,72 1,00 0,97
high speed packet access 0,00 0,99 0,57
evolution-data optimized 0,00 0,62 0,57
cellular network 0,15 0,00 0,59
4g 0,00 0,00 0,61
802.11n 0,15 0,00 0,00
hsdpa 0,15 0,00 0,00
assisted gps 0,15 0,00 0,00
2g 0,00 0,00 0,13
pixel 0,12 0,00 0,00
gigahertz 0,12 0,00 0,00
multitouch 0,86 1,00 0,99
itunes 0,68 1,00 0,83
app store (ios) 0,49 1,00 1,00
usb 0,52 1,00 0,83
file synchronization 0,52 1,00 0,83
jailbreak (ios) 0,31 1,00 0,83
web browser 0,00 0,48 0,99
user interface 0,08 0,00 0,16
virtual keyboard 0,08 0,00 0,16
internet 0,06 0,00 0,32
local area network 0,00 0,48 0,83
e-book 0,21 0,00 0,00
application software 0,01 0,00 0,16
portable media player 0,00 0,00 0,16
video camera 0,00 0,00 0,16
camera phone 0,00 0,00 0,16
video game 0,00 0,00 0,16
reference work 0,00 0,00 0,16
gps navigation software 0,00 0,00 0,16
social network service 0,00 0,00 0,16
ibookstore 0,16 0,00 0,00
ibooks 0,16 0,00 0,00
ebook 0,16 0,00 0,00
apple inc. 0,99 1,00 0,99
steve jobs 0,01 0,34 0,15
macintosh 0,01 0,00 0,18
foxconn 0,18 0,00 0,00
yerba buena center for the arts 0,11 0,00 0,00
tablet computer 1,00 1,00 1,00
ipod touch 0,77 1,00 0,83
iphone 0,71 1,00 0,83
smartphone 0,64 1,00 0,83
ios (apple) and ios 0,54 0,89 0,99
ipad 2 0,00 0,69 0,83
laptop 0,64 0,54 0,00
operating system 0,16 0,00 0,59
ipad (4th generation) 0,00 0,00 0,16
ipad mini 0,00 0,00 0,16
ipad (3rd generation) 0,00 0,00 0,16
list of ios devices#ipad 0,00 0,00 0,16
ipad (1st generation) 0,00 0,00 0,14
kindle 0,21 0,00 0,00
stylus (computing) 0,57 1,00 0,83
barnes & noble nook 0,17 0,00 0,00
amazon.com 0,16 0,00 0,00

D Disregarded Anchors in the iPad Case Study

Merged

ios with ios(apple)
ios with iphone os
app store with app store (ios)
jailbreak with ios jailbreaking
jailbreak with jailbreak (iphone os)
multitouch with multi-touch
wi-fi with wifi
wi-fi with wireless lan
tablet computer with tablet pc
3gpp long term evolution with 4g
liquid crystal display with tft lcd#in-plane switching (ips)
apple inc with apple. inc.
apple inc with apple inc.
ipod with ipod touch
kindle with amazon kindle
led with led backlit
led with tft lcd
led with backlight#led backlights
high speed packet access with high speed packet access
laptop with laptop computer

Deleted

fast company (magazine)
cnet networks (magazine)
engadget (magazine)
financial post (newspaper)
pc world (magazine)
wired (magazine)
the new york times (newspaper)
the baltimore sun (newspaper)
bbc news (news channel)
fiscal quarter (removed because it merely points to the definition of a quarter)
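
The effect of a merge on the yearly ranks can be illustrated with a short sketch. The rule used here is an assumption inferred from the tabulated values, not a procedure stated in this appendix: for several merged anchors, the ranks in Appendix C equal the per-year sum of the unmerged ranks in Appendix I, capped at 1.00 (for example, multitouch 0.43 / 1.00 / 0.83 and multi-touch 0.43 / 0 / 0.16 give 0.86 / 1.00 / 0.99). The function name `merge_anchor_ranks` and the `cap` parameter are hypothetical.

```python
def merge_anchor_ranks(*rank_series, cap=1.0):
    """Add the yearly ranks of anchor aliases year by year, capping each year at `cap`.

    Assumption: summation with a cap is inferred from the tables,
    not documented as the thesis's merge procedure.
    """
    if not rank_series:
        raise ValueError("at least one rank series is required")
    length = len(rank_series[0])
    if any(len(series) != length for series in rank_series):
        raise ValueError("all rank series must cover the same years")
    # zip(*...) groups the values of all aliases for the same year.
    return [round(min(cap, sum(year)), 2) for year in zip(*rank_series)]


# Unmerged yearly ranks (2010, 2011, 2012) taken from Appendix I.
multitouch = [0.43, 1.00, 0.83]
multi_touch = [0.43, 0.00, 0.16]

merged = merge_anchor_ranks(multitouch, multi_touch)
print(merged)  # [0.86, 1.0, 0.99], the multitouch row in Appendix C
```

The cap keeps a merged anchor within the 0 to 1 range of the rank scale, which matches rows such as wi-fi, where the 2011 value stays at 1,00 after merging.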



E Interpretation Scheme for Trends in the Collaboration Process

There are 13 possible trends in the collaboration process, each a combination of the
directions of change in the edits, editors, and edits-per-editor statistics.

Fig. 63 Interpretation Scheme for Collaboration Evolution on Wikipedia

The interpretation scheme is based on the assumption that the intensity of the editing
activity depends on the interest of individuals. The higher the interest in the social
group, the more editing activity is observed. However, in accordance with the theory,
there are two possible explanations for an increase in interest. It can be caused either
by a high level of ‘unfamiliarity’ or by a higher ‘involvement’. Involvement describes the
role a phenomenon plays in individuals’ lives, while unfamiliarity describes the extent
to which the phenomenon remains unfamiliar to the social group. If the unfamiliarity of
a phenomenon is high, the social group naturally responds with an attempt to familiarise
it (Moscovici 2000/1984, p. 37). In a similar vein, phenomena which are of greater
importance for the social group are subject to more intensive editing activity.

The interpretation scheme helps in giving meaning to changes in the evolution of
collaboration on Wikipedia. Such meaning is achieved by analysing data on the
number of edits and editors in two different periods. The scheme is capable of revealing
centralisation or decentralisation of the collaboration, increasing or decreasing interest,
and even indicating the probable cause of those interest changes. For example, more
centralised collaboration indicates an attempt to resolve conflicting representations.
Consequently, some collaboration patterns are more likely to correspond to a higher
degree of unfamiliarity associated with the underlying phenomenon than others.
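
The count of 13 trends can be checked with a minimal sketch. It assumes that a trend is the triple of directions (decrease, no change, increase) for edits, editors, and edits per editor between two periods. Because edits per editor is the quotient of the other two statistics, only 13 of the 27 conceivable triples can occur. The function names `classify_trend` and `feasible_trends` are hypothetical, and the mapping of each triple to an interpretation is defined by Fig. 63, which the sketch does not reproduce.

```python
from itertools import product


def sign(x):
    """Return -1, 0, or +1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def classify_trend(edits_before, editors_before, edits_after, editors_after):
    """Return the direction of change of edits, editors, and edits per editor."""
    if min(editors_before, editors_after) <= 0:
        raise ValueError("each period needs at least one editor")
    d_edits = sign(edits_after - edits_before)
    d_editors = sign(editors_after - editors_before)
    # Compare edits/editors by cross-multiplication to avoid float rounding.
    d_ratio = sign(edits_after * editors_before - edits_before * editors_after)
    return d_edits, d_editors, d_ratio


def feasible_trends(max_value=4):
    """Collect every trend triple reachable with small positive counts."""
    values = range(1, max_value + 1)
    return {
        classify_trend(e1, n1, e2, n2)
        for e1, n1, e2, n2 in product(values, repeat=4)
    }


trends = feasible_trends()
print(len(trends))  # 13
```

For example, `classify_trend(100, 10, 150, 10)` returns `(1, 0, 1)`: more edits from the same number of editors, hence more edits per editor.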

F Statistics for Cloud Computing Evolution Phases

All values in the table are average values for the corresponding monthly statistic in the corresponding evolution phase.

G Statistics for iPad Evolution Phases

All values in the table are average values for the corresponding monthly statistic in the corresponding evolution phase.

H Cloud Computing Case Study Statistics

Editing Statistics

Editors Statistics

Anchors Compare Table

Anchor  Rank in 2008  Rank in 2009  Rank in 2010  Rank in 2011  Rank in 2012  Total Rank  Time Adjusted Rank
internet 0.72 0.81 0.99 0.35 1.00 3.87 1.952
web browser 0.45 0.99 0.99 0.39 0.97 3.79 1.968
computing 0.53 0.29 0.98 0.54 0.99 3.33 1.860
software 0.45 0.99 0.99 0.22 0.33 2.98 1.322
data 0.45 0.99 0.99 0.22 0 2.65 1.047
software as a service 0.70 0.90 0.11 0.09 0.74 2.54 1.148
computer network diagram 0.25 0.99 0.99 0.22 0 2.45 1.013
server (computing) 0 0.54 0.99 0.27 0.58 2.38 1.338
utility computing 0.47 0.04 0.26 0.54 1.00 2.31 1.415
scalability 0 0.94 0.99 0.22 0.05 2.20 0.997
infoworld 0 0.89 0.99 0.22 0.05 2.15 0.980
virtualization 0.47 0.56 0.93 0 0.05 2.01 0.772
metaphor 0 0.69 0.99 0.22 0 1.90 0.872
abstraction 0 0.66 0.99 0.23 0 1.88 0.868
everything as a service 0.52 0.99 0.29 0 0 1.80 0.562
business application 0 0.54 0.99 0.22 0 1.75 0.822
computer network 0 0 0 0.58 1.00 1.58 1.220
platform as a service 0 0.73 0.11 0 0.53 1.37 0.740
quality of service 0.24 0 0.84 0.22 0 1.30 0.607
infrastructure as a service 0 0.60 0 0 0.52 1.12 0.633
electrical grid 0 0 0 0.28 0.81 1.09 0.862
application software 0 0 0 0.04 0.97 1.01 0.835
data center 0.66 0 0.34 0 0.01 1.01 0.288
paradigm shift 0 0.17 0.82 0 0 0.99 0.467
converged infrastructure 0 0 0 0 0.97 0.97 0.808
shared services 0 0 0 0 0.97 0.97 0.808
pdf 0.24 0.21 0.48 0 0 0.93 0.350
mobile app 0 0 0 0 0.92 0.92 0.767
google 0.23 0 0.60 0.08 0 0.91 0.392
salesforce 0.23 0 0.52 0.07 0 0.82 0.345
electricity grid 0 0 0.52 0.22 0.07 0.81 0.465
web 2.0 0.47 0.34 0 0 0 0.81 0.192
service level agreement 0.24 0 0.54 0 0 0.78 0.310
google apps 0.45 0.33 0 0 0 0.78 0.185
service (economics) 0 0 0 0.31 0.44 0.75 0.573
ibm 0.23 0 0.45 0.07 0 0.75 0.310
microsoft 0.23 0 0.46 0 0 0.69 0.268
economies of scale 0 0 0 0 0.67 0.67 0.558
amazon web services 0.19 0 0.39 0.07 0 0.65 0.273
product (business) 0 0 0 0.31 0.33 0.64 0.482
nist 0 0 0.42 0.22 0 0.64 0.357
business software 0 0 0 0 0.63 0.63 0.525
mainframe computer 0 0 0.60 0 0 0.60 0.300
hewlett packard 0.23 0 0.26 0.07 0 0.56 0.215
computer 0 0 0.41 0.13 0 0.54 0.292
service level agreements 0 0 0.30 0.22 0 0.52 0.297
web application 0.48 0 0 0.04 0 0.52 0.107
client–server 0 0 0.51 0 0 0.51 0.255
information technology 0 0.01 0.14 0.22 0.05 0.42 0.262
cloud 0 0 0 0 0.41 0.41 0.342
gartner 0 0.39 0 0 0 0.39 0.130
data as a service 0.00 0 0 0 0.38 0.38 0.317
storage as a service 0 0 0 0 0.38 0.38 0.317
security as a service 0 0 0 0 0.38 0.38 0.317
test environment as a service 0 0 0 0 0.38 0.38 0.317
api as a service 0 0 0 0 0.38 0.38 0.317
service-oriented architecture 0 0 0.15 0.22 0 0.37 0.222
grid computing 0.36 0 0 0 0 0.36 0.060
business model 0 0 0 0 0.33 0.33 0.275
vmware 0 0 0.24 0.07 0 0.31 0.167
autonomic computing 0.21 0 0 0.10 0 0.31 0.102
computer cluster 0.29 0 0 0 0 0.29 0.048
multi-core 0.29 0 0 0 0 0.29 0.048
parallel computing 0.29 0 0 0 0 0.29 0.048
computer data storage 0 0.00 0.04 0 0.24 0.28 0.220
hardware virtualization 0 0 0.06 0.22 0 0.28 0.177
the cloud 0.27 0.01 0 0 0 0.28 0.048
fujitsu 0 0 0.20 0.07 0 0.27 0.147
distributed computing 0.27 0 0 0 0 0.27 0.045
emory university 0 0.26 0 0 0 0.26 0.087
goizueta business school 0 0.26 0 0 0 0.26 0.087
desktop as a service 0 0 0 0 0.25 0.25 0.208
dell 0 0 0.17 0.07 0 0.24 0.132
capital expenditure 0.24 0 0 0 0 0.24 0.040
utility 0.24 0 0 0 0 0.24 0.040
electricity 0.24 0 0 0 0 0.24 0.040
subscription 0.24 0 0 0 0 0.24 0.040
multitenancy 0.24 0 0 0 0 0.24 0.040
ramnath chellappa 0 0.23 0 0 0 0.23 0.077
multi-tenant 0.23 0 0 0 0 0.23 0.038
vector processor 0.23 0 0 0 0 0.23 0.038
yahoo! 0.23 0 0 0 0 0.23 0.038
skytap 0 0 0.15 0.07 0 0.22 0.122
general electric 0.22 0 0 0 0 0.22 0.037
l'oréal 0.22 0 0 0 0 0.22 0.037
valeo 0.22 0 0 0 0 0.22 0.037
virtualisation 0 0.21 0 0 0 0.21 0.070
open standards 0.21 0 0 0 0 0.21 0.035
open source software 0.21 0 0 0 0 0.21 0.035
computational resource 0.21 0 0 0 0 0.21 0.035
public utility 0.21 0 0 0 0 0.21 0.035
autonomic computing#autonomic_systems 0.21 0 0 0 0 0.21 0.035
netapp 0 0 0.13 0.07 0 0.20 0.112
cluster (computing) 0.20 0 0 0 0 0.20 0.033
loose coupling 0.20 0 0 0 0 0.20 0.033
grid computing#grids versus conventional supercomputers 0.20 0 0 0 0 0.20 0.033
peer to peer 0.20 0 0 0 0 0.20 0.033
bittorrent (protocol) 0.20 0 0 0 0 0.20 0.033
skype protocol#protocol 0.20 0 0 0 0 0.20 0.033
volunteer computing 0.20 0 0 0 0 0.20 0.033
amazon 0 0 0.19 0 0 0.19 0.095
intel 0.19 0 0 0 0 0.19 0.032
procter & gamble 0.19 0 0 0 0 0.19 0.032
remote server 0 0 0 0.18 0 0.18 0.120
web-based email 0 0 0 0.18 0 0.18 0.120
gmail 0 0 0 0.18 0 0.18 0.120
hotmail 0 0 0 0.18 0 0.18 0.120
microsoft outlook 0 0 0 0.18 0 0.18 0.120
microsoft entourage 0 0 0 0.18 0 0.18 0.120
salesforce.com 0 0.18 0 0 0 0.18 0.060
2002 0.18 0 0 0 0 0.18 0.030
july 20 0.18 0 0 0 0 0.18 0.030
self-management (computer science) 0.18 0 0 0 0 0.18 0.030
operating system 0 0 0 0.17 0 0.17 0.113
hp 0 0 0.17 0 0 0.17 0.085
backend as a service 0 0 0 0 0.16 0.16 0.133
azure services platform 0 0 0.09 0.07 0 0.16 0.092
client-server 0 0 0.16 0 0 0.16 0.080
it as a service 0 0 0 0 0.15 0.15 0.125
yahoo 0 0 0 0.15 0 0.15 0.100
evolution mail 0 0 0 0.15 0 0.15 0.100
mozilla thunderbird 0 0 0 0.15 0 0.15 0.100
red hat 0 0 0.08 0.07 0 0.15 0.087
desktop virtualization 0 0 0 0 0.13 0.13 0.108
rackspace cloud 0 0 0.06 0.07 0 0.13 0.077
ieee computer society 0.13 0 0 0 0 0.13 0.022
information technology 0 0 0.12 0 0 0.12 0.060
seti@home 0.12 0 0 0 0 0.12 0.020
application programming interface 0 0 0 0 0.11 0.11 0.092
lan 0 0 0 0.11 0 0.11 0.073
platform virtualization 0 0.11 0 0 0 0.11 0.037
personal computer 0.06 0 0.04 0 0 0.10 0.030
database as a service 0 0 0 0 0.09 0.09 0.075
word processing 0 0 0 0.09 0 0.09 0.060
datacenter 0 0 0 0.09 0 0.09 0.060
cloud gaming 0 0 0 0.09 0 0.09 0.060
cloud provider 0 0 0 0.09 0 0.09 0.060
illustration 0.09 0 0 0 0 0.09 0.015
managed services#managed_services_provider 0 0 0.08 0 0 0.08 0.040
seti 0.08 0 0 0 0 0.08 0.013
service-level agreement 0 0 0.00 0 0.07 0.07 0.058
ajax 0 0 0 0 0.06 0.06 0.050
data centre 0 0 0 0 0.06 0.06 0.050
web browsers 0 0 0 0.01 0.05 0.06 0.048
wide area_network 0 0 0 0.06 0 0.06 0.040
website 0 0 0.01 0.05 0 0.06 0.038
real-time 0 0.06 0 0 0 0.06 0.020
application server 0.06 0 0 0 0 0.06 0.010
business process as a service 0 0 0 0 0.05 0.05 0.042
web server 0 0 0 0.05 0 0.05 0.033
web 0 0 0.05 0 0 0.05 0.025
processing 0 0 0.05 0 0 0.05 0.025
smartphones 0 0 0.05 0 0 0.05 0.025
it services & outsourcing 0 0 0 0 0.04 0.04 0.033
cisco 0 0 0 0.04 0 0.04 0.027
hitachi 0 0 0 0.04 0 0.04 0.027
servers 0 0 0 0.04 0 0.04 0.027
storage 0 0 0 0.04 0 0.04 0.027
computer software 0 0 0.04 0 0 0.04 0.020
obfuscation 0 0.04 0 0 0 0.04 0.013
user interaction 0 0.04 0 0 0 0.04 0.013
comet (programming)#horizontal scalability 0.04 0 0 0 0 0.04 0.007
sap ag 0.04 0 0 0 0 0.04 0.007
network as a service 0 0 0 0 0.03 0.03 0.025
huawei 0 0 0 0.03 0 0.03 0.020
richard stallman 0 0 0 0.03 0 0.03 0.020
the guardian 0 0 0 0.03 0 0.03 0.020
web service 0 0 0 0.03 0 0.03 0.020
#comparisons 0 0 0.03 0 0 0.03 0.015
#criticism of the term 0 0 0.03 0 0 0.03 0.015
computer technology 0 0.03 0 0 0 0.03 0.010
scalability#scale horizontally 0.03 0 0 0 0 0.03 0.005
self-management (computer_science) 0.03 0 0 0 0 0.03 0.005
d&b 0.03 0 0 0 0 0.03 0.005
cloud (disambiguation) 0 0 0 0.02 0 0.02 0.013
mainframe 0 0 0.02 0 0 0.02 0.010
common center 0 0 0.02 0 0 0.02 0.010
information management 0 0.02 0 0 0 0.02 0.007
sap business bydesign 0 0.02 0 0 0 0.02 0.007
ramnath chellappa 0 0.02 0 0 0 0.02 0.007
redhat 0.02 0 0 0 0 0.02 0.003
local area network 0 0 0 0.01 0 0.01 0.007
service provider 0 0 0 0.01 0 0.01 0.007
computer science 0 0 0.01 0 0 0.01 0.005
hard-wired 0 0 0.01 0 0 0.01 0.005
network cable 0 0 0.01 0 0 0.01 0.005
fat client 0 0 0.01 0 0 0.01 0.005
thin clients 0 0 0.01 0 0 0.01 0.005
handheld device 0 0 0.01 0 0 0.01 0.005
computer application 0 0 0.01 0 0 0.01 0.005
google maps 0 0 0.01 0 0 0.01 0.005
intranet 0 0 0.01 0 0 0.01 0.005
network switch 0 0 0.01 0 0 0.01 0.005
hardware 0 0 0.01 0 0 0.01 0.005
bluelock 0 0 0.01 0 0 0.01 0.005
web desktop 0 0.01 0 0 0 0.01 0.003
netsuite.com 0 0.01 0 0 0 0.01 0.003
midlandhr 0 0.01 0 0 0 0.01 0.003
windows azure 0 0.01 0 0 0 0.01 0.003
netsuite 0.01 0 0 0 0 0.01 0.002
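
The two summary columns can be reproduced with a short sketch. The Total Rank is the sum of the yearly ranks. The weighting behind the Time Adjusted Rank is not stated in this appendix; the tabulated values are consistent with weighting the rank of the i-th year by i / (n + 1), where n is the number of years, so that later years count more. This weighting is an inference from the numbers and should be read as an assumption, as should the function names `total_rank` and `time_adjusted_rank`.

```python
def total_rank(yearly_ranks):
    """Sum of the yearly ranks of an anchor."""
    return sum(yearly_ranks)


def time_adjusted_rank(yearly_ranks):
    """Weighted sum of yearly ranks with weight i / (n + 1) for year i.

    Assumption: this weighting is inferred from the tabulated values,
    not taken from a formula given in this appendix.
    """
    n = len(yearly_ranks)
    return sum(
        rank * position / (n + 1)
        for position, rank in enumerate(yearly_ranks, start=1)
    )


# Yearly ranks of the anchor "internet" for 2008 to 2012 from the table above.
internet = [0.72, 0.81, 0.99, 0.35, 1.00]

print(round(total_rank(internet), 2))          # 3.87
print(round(time_adjusted_rank(internet), 3))  # 1.952
```

With three years, as in the iPad case study, the same rule gives weights of 1/4, 2/4 and 3/4; for apple inc. (0.97, 1.00, 0.99) this yields 1.485, matching the corresponding table.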

Anchor Snapshots

New and Obsolete Anchors



Anchor Dissimilarity

Average Anchor Durability



Anchor Edit-War Level



I iPad Case Study Statistics

Editing Statistics

Editors Statistics

Anchors Compare Table

Anchor  Rank in 2010  Rank in 2011  Rank in 2012  Total Rank  Time Adjusted Rank
apple inc. 0.97 1.00 0.99 2.96 1.485
tablet computer 0.73 1.00 1.00 2.73 1.433
wi-fi 0.69 1.00 1.00 2.69 1.422
3g 0.73 1.00 0.97 2.70 1.410
ipod touch 0.76 1.00 0.83 2.59 1.313
iphone 0.71 1.00 0.83 2.54 1.300
itunes 0.68 1.00 0.83 2.51 1.292
smartphone 0.64 1.00 0.83 2.47 1.282
stylus (computing) 0.57 1.00 0.83 2.40 1.265
usb 0.52 1.00 0.83 2.35 1.252
file synchronization 0.52 1.00 0.83 2.35 1.252
multitouch 0.43 1.00 0.83 2.26 1.230
web browser 0 0.48 0.99 1.47 0.982
app store (ios) 0 0.44 1.00 1.44 0.970
ipad 2 0 0.69 0.83 1.52 0.967
high speed packet access 0 0.99 0.57 1.56 0.922
local area network 0 0.48 0.83 1.31 0.862
jailbreak (ios) 0.25 1.00 0.24 1.49 0.742
evolution-data optimized 0 0.62 0.57 1.19 0.738
ios (apple) 0.12 0.89 0.24 1.25 0.655
ios 0 0 0.75 0.75 0.563
operating system 0.16 0 0.59 0.75 0.482
cellular network 0.15 0 0.59 0.74 0.480
ios jailbreaking 0 0 0.59 0.59 0.443
fiscal quarter#united states 0 0 0.56 0.56 0.420
laptop 0.55 0.54 0 1.09 0.408
app store 0.49 0.56 0 1.05 0.403
4g 0 0 0.44 0.44 0.330
steve jobs 0.01 0.34 0.15 0.50 0.285
internet 0.06 0 0.32 0.38 0.255
multi-touch 0.43 0 0.16 0.59 0.227
virtual keyboard 0.08 0 0.16 0.24 0.140
user interface 0.08 0 0.16 0.24 0.140
macintosh 0.01 0 0.18 0.19 0.138
wireless lan 0 0.27 0 0.27 0.135
3gpp long term evolution 0 0 0.17 0.17 0.128
application software 0.01 0 0.16 0.17 0.122
portable media player 0 0 0.16 0.16 0.120
ipad (4th generation) 0 0 0.16 0.16 0.120
ipad mini 0 0 0.16 0.16 0.120
ipad (3rd generation) 0 0 0.16 0.16 0.120
video camera 0 0 0.16 0.16 0.120
camera phone 0 0 0.16 0.16 0.120
video game 0 0 0.16 0.16 0.120
reference work 0 0 0.16 0.16 0.120
gps navigation software 0 0 0.16 0.16 0.120
social network service 0 0 0.16 0.16 0.120
list of ios devices#ipad 0 0 0.16 0.16 0.120
iphone os 0.42 0 0 0.42 0.105
ipad (1st generation) 0 0 0.14 0.14 0.105
cnet networks 0.41 0 0 0.41 0.102
engadget 0.41 0 0 0.41 0.102
fast company (magazine) 0.41 0 0 0.41 0.102
2g 0 0 0.13 0.13 0.098
financial post 0.37 0 0 0.37 0.092
tablet pc 0.23 0 0 0.23 0.058
e-book 0.21 0 0 0.21 0.052
pc world 0.19 0 0 0.19 0.048
wired 0.19 0 0 0.19 0.048
foxconn 0.18 0 0 0.18 0.045
backlight#led backlights 0.17 0 0 0.17 0.043
ibookstore 0.16 0 0 0.16 0.040
ibooks 0.16 0 0 0.16 0.040
ebook 0.16 0 0 0.16 0.040
amazon.com 0.16 0 0 0.16 0.040
kindle 0.16 0 0 0.16 0.040
gigabytes 0.15 0 0 0.15 0.037
flash memory 0.15 0 0 0.15 0.037
802.11n 0.15 0 0 0.15 0.037
hsdpa 0.15 0 0 0.15 0.037
assisted gps 0.15 0 0 0.15 0.037
bluetooth 2.1 0.15 0 0 0.15 0.037
dock connector 0.15 0 0 0.15 0.037
tft lcd#in-plane switching (ips) 0.14 0 0 0.14 0.035
barnes & noble nook 0.12 0 0 0.12 0.030
apple a4 0.12 0 0 0.12 0.030
pixel 0.12 0 0 0.12 0.030
gigahertz 0.12 0 0 0.12 0.030
yerba buena center for the arts 0.11 0 0 0.11 0.028
led backlit 0.11 0 0 0.11 0.028
jailbreak (iphone os) 0.11 0 0 0.11 0.028
apple inc 0 0.05 0 0.05 0.025
laptop computer 0.09 0 0 0.09 0.022
liquid crystal display 0.08 0 0 0.08 0.020
wifi 0.08 0 0 0.08 0.020
the new york times 0.06 0 0 0.06 0.015
print 0.06 0 0 0.06 0.015
video 0.06 0 0 0.06 0.015
photo 0.06 0 0 0.06 0.015
audio 0.06 0 0 0.06 0.015
msnbc.com 0.06 0 0 0.06 0.015
cnbc 0.06 0 0 0.06 0.015
gsm 0.06 0 0 0.06 0.015
new york times 0.06 0 0 0.06 0.015
san francisco 0.05 0 0 0.05 0.013
the baltimore sun 0.05 0 0 0.05 0.013
amazon kindle 0.05 0 0 0.05 0.013
barnes & noble 0.05 0 0 0.05 0.013
us dollar 0.04 0 0 0.04 0.010
led 0.04 0 0 0.04 0.010
jonathan ive 0 0.02 0 0.02 0.010
multimedia 0.03 0 0 0.03 0.007
federal communications commission 0.03 0 0 0.03 0.007
u.s. dollar 0.03 0 0 0.03 0.007
wireless wan 0.03 0 0 0.03 0.007
hspa 0.01 0.01 0 0.02 0.007
yair reiner 0.02 0 0 0.02 0.005
tft lcd#in-plane_switching_.28ips.29 0.02 0 0 0.02 0.005
bbc news 0.02 0 0 0.02 0.005
apple, inc. 0.02 0 0 0.02 0.005
apple newton 0.02 0 0 0.02 0.005
daring fireball 0.02 0 0 0.02 0.005
john gruber 0.02 0 0 0.02 0.005
2010 0.01 0 0 0.01 0.003
march 0.01 0 0 0.01 0.003
ceo 0.01 0 0 0.01 0.003
ipod 0.01 0 0 0.01 0.003
mac osx 0.01 0 0 0.01 0.003
event (computing) 0.01 0 0 0.01 0.003
islate 0.00 0 0 0.00 0.000
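
The Total Rank and Time Adjusted Rank columns above can be reproduced from the yearly ranks. The following is a minimal sketch, assuming that the time adjustment weights year k of n by k/(n+1), so that later years count more. This weighting is inferred from the tabulated values (for example 0.25, 0.5 and 0.75 for the three iPad years) and is not a definition quoted from the thesis. The function names are hypothetical.

```python
def total_rank(yearly_ranks):
    """Plain sum of the yearly rank values of one anchor."""
    return sum(yearly_ranks)


def time_adjusted_rank(yearly_ranks):
    """Weighted sum of yearly ranks.

    Assumption (inferred from the tables, not stated in the source):
    year k out of n years receives the weight k / (n + 1).
    """
    n = len(yearly_ranks)
    return sum(rank * k / (n + 1) for k, rank in enumerate(yearly_ranks, start=1))


if __name__ == "__main__":
    # Yearly ranks of "apple inc." for 2010, 2011 and 2012, taken from the table.
    apple = [0.97, 1.00, 0.99]
    print(round(total_rank(apple), 2), round(time_adjusted_rank(apple), 3))
```

Under the same assumption, a five-year row such as "the guardian" (0, 0, 0, 0.03, 0) uses the weights 1/6 to 5/6, which is consistent with its tabulated value of 0.020. A few rows in the tables differ from this sketch in the last digit, presumably because of rounding.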

Anchor Snapshots

New and Obsolete Anchors



Anchor Dissimilarity

Average Anchor Durability



Anchor Edit-War Level



Declaration of Authorship

I hereby declare that, to the best of my knowledge and belief, this Master’s Thesis titled
“The Genealogy of Knowledge in Wikipedia – Method Development and Application” is
my own work. I confirm that each significant contribution to and quotation in this thesis
that originates from the work or works of others is indicated by proper use of citation
and references.
Münster, 29 July 2013
