What Makes Individual Is A Collective We Coordina
Hyejin Youn 1,2,3,*
1 Kellogg School of Management, Northwestern University, Evanston, IL
2 Northwestern Institute on Complex Systems, Evanston, IL
3 Santa Fe Institute, Santa Fe, NM
4 MIT Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA
* The order of authors has not yet been determined.
June 6, 2023
Abstract
For a collective to become greater than the sum of its parts, individuals’ efforts
and activities must be coordinated or regulated. Not readily observable and measur-
able, this particular aspect often goes unnoticed and understudied in complex systems.
Diving into the Wikipedia ecosystem, where people are free to join and voluntarily
edit individual pages with no firm rules, we identified and quantified three fundamen-
tal coordination mechanisms and found they scale with an influx of contributors in
a remarkably systematic way over three orders of magnitude. Firstly, we have found a
super-linear growth in mutual adjustments (scaling exponent: 1.3), manifested through
extensive discussions and activity reversals. Secondly, the increase in direct supervision
(scaling exponent: 0.9), as represented by the administrators’ activities, is dispropor-
tionately limited. Finally, the rate of rule enforcement exhibits the slowest escalation
(scaling exponent 0.7), reflected by automated bots. The observed scaling exponents
are notably robust across topical categories with minor variations attributed to the
topic complexity. Our findings suggest that as more people contribute to a project, a
self-regulating ecosystem incurs faster mutual adjustments than direct supervision and
rule enforcement. These findings have practical implications for online collaborative
communities aiming to enhance their coordination efficiency. These results also have
implications for how we understand human organizations in general.
Introduction
Coordination mechanisms lie at the heart of efficient and effective operations within complex
systems, including human organizations and biological entities. These mechanisms provide
the key to understanding how numerous individual components coalesce to form a cohesive,
operational whole across a wide range of scales [1, 2, 3]. To attain and maintain a function-
ing collective necessitates the inevitable and indispensable coordination cost of connecting
isolated individual I’s to a meaningful We [4, 5, 6]. Nevertheless, coordination mechanisms
and associated costs often remain concealed in theoretical and mathematical studies, re-
vealing themselves only upon closer examination for implementation. Take, for instance,
a bustling bee colony, an impressive superorganism that, at first glance, seems to operate
without any associated coordination costs. Yet, despite their genetic predeterminations,
these societies necessitate constant coordination of activities to foster a thriving colony [7],
including frequent role adjustments, communication through pheromones [8], dances [9], and
vibrations [10] in addition to resolving occasional conflicts [11]. In our daily life, these under-
lying principles manifest themselves when we find our emails and meetings far exceeding our
original plan, leading to moments of unexpected frustration. According to a recent survey,
faculty in academia spent almost 45% of their work week in meetings, emails, scheduling,
planning, and administrative tasks that are not traditionally thought of as part of the life of
an academic [12].
Socioeconomic systems, in their wide-ranging forms, are typically coordinated and reg-
ulated through rules and feedback mechanisms [13, 14]. Rules within social systems can
mechanisms are a vital part of the system, much like the additional adaptations bees make,
offering flexibility and adaptability in the face of changing circumstances. While rules gov-
ern the many-body system by their mere existence or through procedural tools to enforce
them, feedback mechanisms often occur at the level of interpersonal interactions. Author-
ities enforce, interpret, and carry out rules or agendas through directional feedback. This
top-down approach, however, often runs the risk of bureaucratic slowdowns due to limited
administrative capacity [15]. Alongside this direct oversight, continuous communication for
mutual feedback among participants is always required to reduce ambiguity and accommo-
date flexibility. Therefore, selecting the appropriate coordination mechanism is paramount
What are the determining factors behind the coordination costs within a collective? Sev-
eral elements come into play, including the size of the organisation [16], its structure [17, 18],
and the intricacy of the problems it grapples with [19]. These variables are intertwined
and often hard to isolate from each other. For a collective to successfully navigate a com-
plex objective, the size, structure, and diversity of the organization need to correspond to
the required complexity of the problem at hand. For instance, sociopolitical evolution is
predominantly marked by the expansion of the polity scale [20]. This often leads to a co-
while complex tasks involving considerable interdependence are often better handled through
personal modes via horizontal communication channels within large groups. Economies of
scale are another consideration when it comes to coordination costs. As an organization
grows, the relative proportion of administrators may decrease, thereby potentially reducing
costs [21, 22]. However, such economies of scale in coordination costs have been contested in
both theoretical accounts and empirical evidence [23, 24]. Therefore, it remains unclear
whether there exist endogenous mechanisms generating the various types of coordination as
more individual I’s join the collective We.
To unpack the underlying mechanism and empirically assess the associated costs, we have
chosen Wikipedia as our focal point. This is an ideal system for studying these dynamics
because of its vast collection of individual activity records and myriad small communities
associated with individual pages. Wikipedia embodies the essence of collective intelligence,
continually evolving through contributions from a diverse array of intellectual minds [25, 26].
Notably, what sets Wikipedia apart is that it is entirely authored and maintained by a de-
centralized community of volunteers. Individual pages are defined by groups of varying sizes,
giving us a perfect window into various scales. Yet, each page shares approximately the same
task of constructing a coherent knowledge structure of a subject. Within this community,
individuals contribute their knowledge not in isolation but in a concerted effort to construct
a cohesive knowledge structure for each project [27]. To achieve the construction of a com-
prehensive knowledge structure on a given topic, individuals must allocate extra time, effort,
and resources away from their substantive contributions to coordination among themselves.
This coordination effort entails communication in talk pages [28, 29, 30], decisions about
others’ edits [25, 31, 32, 33], the action of authorities [34, 35, 36], and even execution of
norms by automated bots [37, 38, 39].
In this paper, we investigate how coordination mechanisms evolve when the number of
contributors expands within each project by analyzing how various metrics scale with the
number of contributors. In almost all cases, we find good evidence of simple power-law
scaling (see Eq. 1 below) whose exponents quantitatively reveal that the nature of coordina-
tion mechanisms systematically shifts as the number of contributors on a project increases.
Furthermore, based on these scaling exponents, we are able to quantify comparisons be-
tween our results and other systems, including biological [40, 41], ecological [42], and urban
systems [43, 44]. Our study reveals that mutual adjustment exhibits a superlinear scaling
relationship with the number of contributors (i.e., mutual adjustment per contributor in-
creases as the size of the project increases), while direct supervision increases sublinearly as
contributor numbers rise (i.e., the amount of direct supervision per contributor decreases as
the size of the project increases, reflecting an economy of scale). Moreover, we found that co-
ordination by rule, measured by the automated bots, increases sublinearly, but more slowly
than coordination by feedback. This range in the pattern of coordination cost escalation
indicates nontrivial tradeoffs between mutual adjustment, supervision, and rule enforcement
Results
Coordination Mechanisms in Wikipedia
contributors edit segments such that the entire page is written in a coherent and consistent
structure. This often incurs disagreement with other contributors and thus ends up with
revising or reverting prior contributors’ work. Indeed, contributors do not edit in isolation—
they need to coordinate themselves with others on the platform to resolve disagreements and
conflicts. We categorized coordination activities on Wikipedia into four types: discussions,
disagreements, interventions, and rule enforcement as illustrated in Fig 1. By measuring the
volume of these coordination activities, we can better understand how Wikipedia functions
as a collaborative platform and quantify the cost of making I’s into We in a decentralized
system.
Given the vast intellectual diversity of individuals, differing views on any given topic are
inevitable. The true brilliance of Wikipedia lies in its ability to weave these disparate threads
into a single, cohesive narrative. However, pursuing structural cohesion from various people
comes with a price. Discord can arise when contributors’ opinions diverge significantly. Such
disagreements become visible in the frequency of reversals — a feature that empowers all
contributors to completely annul the efforts of others, thus reverting the article to a previous
state. This process is often considered to reveal a “norm violation” [45], and employing
reverts to map conflicts on Wikipedia is widely used [25, 31, 32, 33].
When simple reversals do not resolve the discords, or reversals continue back and forth
indefinitely, a lengthy discussion with other contributors is needed to make decisions about
an article’s content and potential improvements. These discussions are documented in the
“talk page” for each project [28, 29]. All contributors can utilize this page to deliberate on the
validity of sources, the page’s structure and organization, what content should be included,
and the conflicting opinions. In some cases, talk pages are also used to build consensus
or norms for the project, including documenting how previous conflicts were resolved. To
measure the level of discussion for each article, we use the length of the corresponding talk
page. In our analysis, both discussions (the talk page) and disagreements (reversals) are
forms of mutual adjustments toward consensus through tools available to all contributors.
Every participant in an article can leverage these tools in their interactions with each other
Figure 1: Schematic diagram of coordination mechanisms in Wikipedia. In Wikipedia,
discussions involve communicating with other contributors about the content of a project
(page/article). We measure the level of discussion by looking at the length of the corresponding
talk page, an accompanying page for discussing potential improvements to the article. Conflicting
opinions or ideas among contributors can lead to disagreements and disputes, proxied by frequent
reverts. To manage conflicts among contributors, administrators take actions known as interven-
tions, which we measure by administrators’ activities. Bots are automated to take action, enforcing
a set of rules.
Beyond the actions that all contributors can take to affect each other, some actions are
called Bureaucrats. They have powerful authority to block and unblock users, delete and
protect pages, and rename pages, often guided by the need to deal with vandalism, enforce
community policies and guidelines, and mediate disputes. Sysops often delegate their duties
and administrative privileges to trusted Wikipedia users. We include the actions taken by
those who require adminship approval as administrators’ interventions, that is, direct supervision [46,
47] as opposed to mutual interactions since they entail human intervention and because there
anti-vandalism bots like ClueBot NG, which are specifically designed to detect and undo
instances of vandalism swiftly. According to an empirical study, when ClueBot NG experi-
enced a breakdown, the number of revert edits nearly doubled [48], highlighting the bot’s
effectiveness in suppressing conflicts and upholding the quality of Wikipedia’s pages. Bots
can also generate standardized articles, such as those related to geography, by using statistical
data [49]. While bots are capable of making edits at a fast pace, if they are designed or
operated incorrectly, they have the potential to disrupt the smooth functioning of Wikipedia.
To address this concern, Wikipedia established a bot policy, which is primarily maintained
by Wikipedia administrators. In our analysis, we consider edits by bots as a proxy of rule
nation, we compare organizations of different sizes using a scaling perspective. Using the
number of unique contributors on each page, 𝑁, as the measure of organization size, we
consider power-law scaling of the form
$$Y = Y_0 N^{\beta}, \tag{1}$$
where 𝑌 represents the coordination cost and 𝑌0 is a normalization constant. The exponent
𝛽 quantifies the rate of increase in 𝑌 with relative increases in 𝑁 as being more (𝛽 > 1) or
ticles with at least one edit in their talk pages, one revert in their content, and one action
from an administrator. This resulting subset comprises 26,014 pages tuned to our ques-
tions. Fig. 2 shows the scaling curves of coordination costs. We observe that both talk page
size and the number of reverts exhibit a super-linear scaling relationship (𝛽 ≃ 1.3), Fig. 2
(a-b). This implies that discussions and disagreements in Wikipedia grow faster than the
number of contributors, reflecting the societal characteristics of coordination costs driven
by interactions, which may follow similar mechanisms to those driving urban features with
superlinear scaling [43, 44]. On the other hand, administrator activities and bot activities
increase sub-linearly, 𝛽 ≃ 0.9 and 𝛽 ≃ 0.7, respectively (Fig. 2 c-d). These results display
economies of scale, consistent with economies of scale of infrastructure in urban systems [43]
and administrative staff in companies [21]. Finally, our analysis reveals that the length of
Frequently Asked Questions (FAQ), which acts as a repository for past consensus in Talk
As discussed above, the coordination mechanisms are determined not only by the size 𝑁
but also by the complexity of functions 𝐶, which are often interrelated. For complex topics,
it is often the case that a larger group of individuals is required to bring a full
perspective. Similarly, the more complex the subject, the higher the potential for disagree-
ments and disputes, drawing a greater crowd into the discourse. This naturally suggests
[Figure 2: coordination costs versus number of contributors. (a) Talk page size (bytes), slope = 1.334 [1.255, 1.413]. (b) Number of reverts, slope = 1.312 [1.282, 1.342]. (c) Number of admin activities, slope = 0.906 [0.867, 0.945]. (d) Number of bot activities, slope = 0.692 [0.671, 0.713]. Horizontal axis in every panel: number of contributors.]
that we should consider how an article’s content might influence our conclusions. Table 1
summarizes the scaling exponents deconstructed by article category. Overall, the findings
are remarkably consistent with the previous integrated analysis: talk page size and the num-
ber of reverts exhibit a super-linear scaling relationship, while administrator activity shows
a sub-linear relationship. That said, one mustn’t overlook the minor variations across ar-
ticle categories. For instance, in the category of Everyday Life and Mathematics—a topic
generally devoid of controversy—the talk page size increases only mildly superlinearly with
the rise in contributors (𝛽 ≈ 1.1), which stands in stark contrast to more contentious topics
such as History or Philosophy and religion, where the talk page size rises at a far brisker
pace (𝛽 ≈ 1.5). This does not mean, however, that Mathematics is not complex enough to require
lengthy discussion. In fact, the average discussion volume for Mathematics is as substantial
as those of other topics (see SI). The results indicate that the complexity of a topic has to
be further differentiated into epistemological complexity and proneness to controversy. The
latter incurs frequent interpersonal conflicts among different ideologies, reflected in the fastest
growth in discussions and disagreements with the contributor influx in contentious topics, as
opposed to topics that require a significant amount of discussion regardless of the contributor
influx [4].
Table 1: Scaling exponents of coordination costs by article category. The scaling expo-
nent 𝛽 and corresponding confidence intervals for all categories of Vital Articles, in all four measures
of coordination costs. Talk page length and reverts correspond to mutual adjustment, administra-
tor activity corresponds to direct supervision, and Bot activity corresponds to rule enforcement.
Although there are minor differences in the scaling exponents across categories, the general trend
of super- or sub-linearity remains consistent across all categories.
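As an illustration of how an exponent and interval of the kind reported in Table 1 could be estimated, the following is a minimal sketch that fits Eq. 1 by ordinary least squares on log-transformed values. The fitting procedure and the interval construction (a normal approximation, 1.96 standard errors) are assumptions made for this sketch; the text does not specify the estimator used. The function name `fit_power_law` and the synthetic data are hypothetical.

```python
import math
import random


def fit_power_law(n_values, y_values):
    """Fit Y = Y0 * N**beta by least squares on (log N, log Y).

    Returns (beta, y0, (ci_low, ci_high)), where the interval is an
    approximate 95% interval for beta (normal approximation, 1.96 * SE).
    This is an assumed procedure, not necessarily the one used in the paper.
    """
    xs = [math.log(n) for n in n_values]
    ys = [math.log(y) for y in y_values]
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx
    log_y0 = y_mean - beta * x_mean
    # Residual variance with two fitted parameters (slope and intercept).
    sse = sum((y - (log_y0 + beta * x)) ** 2 for x, y in zip(xs, ys))
    se_beta = math.sqrt(sse / (m - 2) / sxx)
    return beta, math.exp(log_y0), (beta - 1.96 * se_beta, beta + 1.96 * se_beta)


# Synthetic pages, NOT the paper's data: true beta = 1.3 and Y0 = 10.
rng = random.Random(0)
contributors = [rng.randint(10, 10_000) for _ in range(2_000)]
costs = [10.0 * n ** 1.3 * math.exp(rng.gauss(0.0, 0.5)) for n in contributors]
beta, y0, ci = fit_power_law(contributors, costs)
print(f"beta = {beta:.3f}, Y0 = {y0:.2f}, CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
```

Fitting in log space treats multiplicative deviations symmetrically, which matches the logarithmic residual of Eq. 2; pages with a zero count would have to be excluded before taking logarithms.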
As the scaling relationship (Eq. 1) only provides an average description of the behavior of
coordination costs, differences between observed and predicted values may indicate that other
factors are influencing the required amount of coordination. A residual analysis provides
insights into the factors that affect coordination costs beyond the mean behavior predicted
by the power law scaling model. To investigate this, we calculate the residual, 𝜉𝑖 , [50] for
$$\xi_i \equiv \log \frac{Y_i}{Y(N_i)} = \log \frac{Y_i}{Y_0 N_i^{\beta}}, \tag{2}$$
where $Y_i$ is the observed value of coordination costs and $Y(N_i)$ is the expected value of
coordination costs given the number of contributors $N_i$. As shown in Table 2, the residuals
of talk page size, the number of reverts, and bot activity, denoted as $\xi^{\mathrm{talk}}$,
$\xi^{\mathrm{revert}}$, and $\xi^{\mathrm{bot\,act}}$ respectively, exhibit almost no correlation with one another
($\rho_{\xi^{\mathrm{talk}},\xi^{\mathrm{revert}}} = 0.092$, $\rho_{\xi^{\mathrm{talk}},\xi^{\mathrm{bot\,act}}} = 0.080$,
$\rho_{\xi^{\mathrm{revert}},\xi^{\mathrm{bot\,act}}} = 0.088$). In contrast, the residuals of admin activity,
$\xi^{\mathrm{adm\,act}}$, show relatively strong correlations with the other coordination costs
($\rho_{\xi^{\mathrm{adm\,act}},\xi^{\mathrm{talk}}} = 0.386$, $\rho_{\xi^{\mathrm{adm\,act}},\xi^{\mathrm{revert}}} = 0.229$,
$\rho_{\xi^{\mathrm{adm\,act}},\xi^{\mathrm{bot\,act}}} = 0.427$), implying that governance plays a central role in the
coordination process of collective intelligence [51, 52].
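The residual of Eq. 2 and the pairwise correlations can be sketched as follows. This is only an illustration: the four pages and their counts are invented, the natural logarithm is assumed (the Pearson correlation does not depend on the base), and the helper names `residuals` and `pearson` are hypothetical. The values of 𝑌0 and 𝛽 are the Vital Articles values listed for Table 1.

```python
import math


def residuals(n_values, y_values, y0, beta):
    """Scaling residuals xi_i = log(Y_i / (Y0 * N_i**beta)), as in Eq. 2."""
    return [math.log(y / (y0 * n ** beta)) for n, y in zip(n_values, y_values)]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    m = len(a)
    a_mean, b_mean = sum(a) / m, sum(b) / m
    cov = sum((x - a_mean) * (y - b_mean) for x, y in zip(a, b))
    var_a = sum((x - a_mean) ** 2 for x in a)
    var_b = sum((y - b_mean) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Toy values for four hypothetical pages (not taken from the paper).
n = [10, 100, 1_000, 10_000]
talk = [250.0, 3_500.0, 90_000.0, 1_500_000.0]
reverts = [1.0, 25.0, 300.0, 9_000.0]

# Y0 and beta as listed for all Vital Articles in Table 1.
xi_talk = residuals(n, talk, y0=11.5, beta=1.33)
xi_revert = residuals(n, reverts, y0=0.05, beta=1.31)
rho = pearson(xi_talk, xi_revert)
print(f"rho(xi_talk, xi_revert) = {rho:.3f}")
```

A positive residual marks a page that incurs more of a given coordination cost than its number of contributors alone would predict; correlating residuals across measures therefore asks whether pages that are over-coordinated in one way are also over-coordinated in another.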
The residual analysis shows that there is no clear trade-off among different coordination
mechanisms. Nevertheless, the values of exponents by category suggest that the level of con-
troversy or prior knowledge surrounding the topic does require more discussion for effective
contributions. Furthermore, the results suggest that understanding the coordination needs of
             𝜉 talk    𝜉 revert    𝜉 adm act    𝜉 bot act
𝜉 talk
𝜉 revert     0.092
𝜉 adm act    0.386     0.229
𝜉 bot act    0.080     0.088       0.427
Table 2: Correlation between residuals. The Pearson correlation among the scaling residuals
of the four coordination functions. All correlations in this table are significant
(p << 0.001).
specific categories of articles may be useful for developing targeted interventions to improve
coordination and reduce costs. For instance, for categories that require more intervention
by administrators, allocating more resources to administrative tasks may be necessary to
improve overall coordination. Overall, these findings highlight the importance of considering
the specific nature of the content when studying coordination costs and developing strategies
to improve coordination in complex systems such as Wikipedia.
Discussion
We investigated how coordination costs scale with the size of the Wikipedia communities and
how they vary across different categories of articles. By analyzing the mutual adjustment
and direct supervision costs of coordination, we found that mutual adjustment costs scale
superlinearly with the number of contributors. In contrast, direct supervision cost and rule
enforcement increase sublinearly with the number of contributors. Our residual analysis
also highlighted that article content and the interaction between contributors shape the
coordination costs of Wikipedia pages.
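As a rough illustration of what these exponents imply, using only Eq. 1 and the rounded exponents reported above (simple arithmetic, not an additional result), the cost per contributor and the effect of doubling the number of contributors are

$$\frac{Y}{N} = Y_0 N^{\beta - 1}, \qquad \frac{Y(2N)}{Y(N)} = 2^{\beta}.$$

For mutual adjustment ($\beta \simeq 1.3$), doubling the contributors multiplies the cost by $2^{1.3} \approx 2.46$, so the cost per contributor grows by a factor of about 1.23. For direct supervision ($\beta \simeq 0.9$) the factor is $2^{0.9} \approx 1.87$, and for rule enforcement ($\beta \simeq 0.7$) it is $2^{0.7} \approx 1.62$, so in both cases the cost per contributor falls as a project grows.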
Our findings contribute to the literature on coordination costs and scaling in complex
systems by providing insights into the coordination challenges online communities face. The
results also have practical implications for online communities, as they demonstrate the
size increases.
In our increasingly complex society, the efforts and expertise of highly specialized in-
dividuals are distributed across the globe. This necessitates their productive coordination
spanning vast multi-level networks to create a diverse array of complex goods and services
the urgent need for enhanced comprehension of coordination mechanisms and their latent
structures at many different scales.
Data
Wikipedia data
We used the XML dump of English Wikipedia from January 9, 2022, comprising the edit
history of 6,412,821 distinct pages in Wikipedia. Each edit includes a unique identifier of the
editor (username for registered contributors and IP address for unregistered contributors),
timestamp, comments, and edit size. Additionally, we extracted contributors’ roles from
the user group table of the SQL dump. Our primary focus lies on the crucial articles in
the curated collection, Vital articles. These articles have been carefully selected to cover
a broad spectrum of subjects and are considered essential for obtaining a comprehensive
understanding of human knowledge. Encompassing a wide range of disciplines, including
history, science, arts, geography, and more, these articles represent significant topics, events,
individuals, and concepts. They undergo regular evaluation and updating to ensure their ac-
curacy, relevance, and comprehensiveness, thus serving as indispensable pillars of knowledge
within Wikipedia’s vast and extensive encyclopedia.
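To make the per-page measures concrete, the sketch below aggregates edit records into the number of unique contributors 𝑁 and a count of bot edits. It is an assumption-laden illustration rather than the pipeline used here: the record layout, the function name `summarize_pages`, and the example editors are hypothetical, the group name "bot" is assumed to match the user group table, and whether bot accounts count toward 𝑁 is not stated in the text (this sketch counts every distinct editor identifier).

```python
from collections import defaultdict


def summarize_pages(edits, user_groups):
    """Aggregate per-page counts from an iterable of edit records.

    `edits` yields (page_title, editor_id) pairs; `user_groups` maps an
    editor_id to the set of user groups held by that editor. Both layouts
    are hypothetical simplifications of the dump contents.
    """
    contributors = defaultdict(set)
    bot_edits = defaultdict(int)
    for page, editor in edits:
        # Every distinct username or IP address counts as one contributor.
        contributors[page].add(editor)
        # An edit counts as bot activity if its author holds the "bot" group.
        if "bot" in user_groups.get(editor, set()):
            bot_edits[page] += 1
    return {
        page: {"n_contributors": len(editors), "bot_edits": bot_edits[page]}
        for page, editors in contributors.items()
    }


# Invented records for two pages (not taken from the dump).
edits = [
    ("Honey bee", "Alice"),
    ("Honey bee", "192.0.2.7"),
    ("Honey bee", "ExampleBot"),
    ("Honey bee", "Alice"),
    ("Prime number", "Bob"),
]
groups = {"ExampleBot": {"bot"}, "Bob": {"sysop"}}
summary = summarize_pages(edits, groups)
```

Reverts, talk page size, and administrator activity would need additional inputs (revision content, talk page text, and action records) that this sketch does not model.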
The exponents remain robust when the analysis is extended from Vital articles to all English Wikipedia
articles, with 𝛽talk = 1.379, 𝛽revert = 1.309, 𝛽adm act = 0.910, and 𝛽bot act = 0.679. Also, among the
48,864 vital pages on Wikipedia, 22,680 pages have not undergone any actions by administrators.
To delve deeper into this matter, we performed a robustness check by repeating the
scaling analysis on talk page size, number of reverts, and bots’ activity including pages that
have not received any intervention from administrators. We again confirm that our results
are consistent (Fig.S4), affirming our earlier observations.
Acknowledgements
The authors would like to acknowledge the support of the National Science Foundation Grant
Award Number 2133863. J.Y and H.Y thank Seoul Lee for the literature search and helpful
Author Contributions
Additional Information
Supporting Information is available for this paper. Correspondence and requests for materials
Data availability
Code Availability
References
[1] Malone, T. W. Superminds: The surprising power of people and computers thinking
[2] Tuomela, R. The philosophy of sociality: The shared point of view (Oxford University
Press, 2007).
[3] Weinberg, S. Dreams of a Final Theory: The Scientist’s Search for the Ultimate Laws
of Nature (Vintage, 1994).
[4] Van de Ven, A. H., Delbecq, A. L. & Koenig Jr, R. Determinants of coordination modes
[7] Huang, Z.-Y. & Robinson, G. E. Honeybee colony integration: worker-worker interac-
tions mediate hormonally regulated plasticity in division of labor. Proceedings of the
[9] Dyer, F. C. The biology of the dance language. Annual review of entomology 47,
917–949 (2002).
[10] Schneider, S. S. & Lewis, L. A. The vibration signal, modulatory communication and
the organization of labor in honey bees, apis mellifera. Apidologie 35, 117–131 (2004).
[11] Galbraith, D. A. et al. Testing the kinship theory of intragenomic conflict in honey
bees (apis mellifera). Proceedings of the National Academy of Sciences 113, 1020–1025
(2016).
[12] Ziker, J. The long, lonely job of homo academicus. Blue Review Post (2014).
[14] DiMaggio, P. J. & Powell, W. W. The iron cage revisited: Institutional isomorphism
and collective rationality in organizational fields. American sociological review 147–160
(1983).
[16] Camacho, A. Adaptation costs, coordination costs and optimal firm size. Journal of
Economic Behavior & Organization 15, 137–149 (1991).
[18] Powell, W. et al. Neither market nor hierarchy. The sociology of organizations: classic,
[20] Shin, J. et al. Scale and information-processing thresholds in holocene social evolution.
Nature communications 11, 2394 (2020).
[22] Blau, P. M. A formal theory of differentiation in organizations. American Sociological
[23] Coates, R. & Updegraff, D. E. The relationship between organizational size and the
[25] Yasseri, T., Sumi, R., Rung, A., Kornai, A. & Kertész, J. Dynamics of conflicts in
[26] Yun, J., Lee, S. H. & Jeong, H. Early onset of structural inequality in the formation of
[27] Zhu, H., Zhang, A., He, J., Kraut, R. E. & Kittur, A. Effects of peer feedback on
contribution: a field experiment in wikipedia. In Proceedings of the SIGCHI conference
[28] Kittur, A. & Kraut, R. E. Harnessing the wisdom of crowds in wikipedia: quality
[29] Kittur, A. & Kraut, R. E. Beyond wikipedia: coordination and conflict in online
production groups. In Proceedings of the 2010 ACM conference on Computer supported
cooperative work, 215–224 (2010).
[30] Aaltonen, A. & Lanzara, G. F. Building governance capability in online social produc-
[32] Zhang, A. F. et al. Participation of new editors after times of shock on wikipedia. In
Proceedings of the International AAAI Conference on Web and Social Media, vol. 13,
560–571 (2019).
[33] Halfaker, A., Kittur, A. & Riedl, J. Don’t bite the newbies: how reverts affect the quan-
tity and quality of wikipedia work. In Proceedings of the 7th international symposium
[34] Arazy, O., Lifshitz-Assaf, H. & Balila, A. Neither a bazaar nor a cathedral: The interplay
between structure and agency in wikipedia’s role system. Journal of the Association for
Information Science and Technology 70, 3–15 (2019).
[35] Danescu-Niculescu-Mizil, C., Lee, L., Pang, B. & Kleinberg, J. Echoes of power: Lan-
guage effects and power differences in social interaction. In Proceedings of the 21st
international conference on World Wide Web, 699–708 (2012).
[36] Greenstein, S., Gu, G. & Zhu, F. Ideology and composition among an online crowd:
Evidence from wikipedians. Management Science 67, 3067–3086 (2021).
[37] Geiger, R. S. The social roles of bots and assisted editing programs. In Proceedings of
the 5th International Symposium on Wikis and Open Collaboration, 1–2 (2009).
[38] Steiner, T. Bots vs. wikipedians, anons vs. logged-ins (redux) a global study of edit
[39] Tsvetkova, M., García-Gavilanes, R., Floridi, L. & Yasseri, T. Even good bots fight:
The case of wikipedia. PloS One 12, e0171774 (2017).
[40] Koonin, E. V., Wolf, Y. I. & Karev, G. P. Power Laws, Scale-free Networks and Genome
[41] Kempes, C. P. et al. Drivers of bacterial maintenance and minimal energy requirements.
[42] Cody, M. L., MacArthur, R. H., Diamond, J. M. et al. Ecology and Evolution of
[43] Bettencourt, L. M., Lobo, J., Helbing, D., Kühnert, C. & West, G. B. Growth, inno-
vation, scaling, and the pace of life in cities. Proceedings of the National Academy of
Sciences 104, 7301–7306 (2007).
[44] Youn, H. et al. Scaling and universality in urban economic diversification. Journal of
[45] Jan Piskorski, M. & Gorbatâi, A. Testing coleman’s social-norm enforcement mecha-
nism: Evidence from wikipedia. American Journal of Sociology 122, 1183–1222 (2017).
[46] Shachaf, P. & Hara, N. Beyond vandalism: Wikipedia trolls. Journal of Information
Science 36, 357–370 (2010).
[47] Klapper, H. & Reitzig, M. On the effects of authority on peer motivation: Learning
[48] Geiger, R. S. & Halfaker, A. When the levee breaks: without bots, what happens to
wikipedia’s quality control processes? In Proceedings of the 9th International Sympo-
sium on Open Collaboration, 1–6 (2013).
[49] Zheng, L., Albano, C. M., Vora, N. M., Mai, F. & Nickerson, J. V. The roles bots play
[50] Bettencourt, L. M., Lobo, J., Strumsky, D. & West, G. B. Urban scaling and its
deviations: Revealing the structure of wealth, innovation and crime across cities. PloS
one 5, e13541 (2010).
[51] O’mahony, S. & Ferraro, F. The emergence of governance in an open source community.
[52] Demil, B. & Lecocq, X. Neither market nor hierarchy nor network: The emergence of
[53] Hosseinioun, M., Neffke, F., Youn, H. et al. Deconstructing human capital to construct
hierarchical nestedness. arXiv preprint arXiv:2303.15629 (2023).
[54] van der Wouden, F. & Youn, H. The impact of geographical distance on learning through
collaboration. Research Policy 52, 104698 (2023).
[55] Wuchty, S., Jones, B. F. & Uzzi, B. The increasing dominance of teams in production
[57] Wu, L., Wang, D. & Evans, J. A. Large teams develop and small teams disrupt science
and technology. Nature 566, 378–382 (2019).
Supplementary Information: What
makes Individual I’s a Collective We;
Coordination mechanisms & costs
S1 Text. Who is the administrator in Wikipedia?: Bureaucracy in Wikipedia
The level of authority granted to users on Wikipedia is determined by their user access
level, which defines the actions they are allowed to perform. The hierarchical structure of
adminship flags is depicted in Fig. S1. The most prominent type of administrator is known
as a “Sysop” (system operator), and they make up only 0.001% of Wikipedia users. To become
a sysop, individuals must undergo the adminship process and receive official approval,
which grants them the authority to block and unblock users, protect and delete pages, and
rename pages. Sysops also have the potential to acquire higher or more specialized access
levels (Fig.S1 top). Due to the limited number of administrators, they often delegate specific
Category            Pages    𝛽talk [CI]          𝑌0talk    𝛽revert [CI]        𝑌0revert    𝛽adm act [CI]       𝑌0adm act    𝛽bot act [CI]       𝑌0bot act
Arts                1,808    1.23 [1.08,1.38]    14.33     1.29 [1.24,1.33]    0.05        0.91 [0.86,0.96]    1.46         0.73 [0.70,0.76]    0.46
Biology/Health      2,503    1.46 [1.29,1.62]              1.35 [1.30,1.40]    0.03        0.88 [0.83,0.93]    1.98         0.75 [0.71,0.80]    0.48
Everyday life       1,553    1.12 [0.99,1.25]    31.70     1.24 [1.21,1.28]    0.07        0.92 [0.85,0.99]    1.24         0.72 [0.67,0.77]    0.47
Geography           2,701    1.40 [1.27,1.53]    4.89      1.35 [1.31,1.39]    0.03        0.94 [0.90,0.99]    1.26         0.78 [0.75,0.82]    0.40
History             1,913    1.53 [1.42,1.63]    4.68      1.28 [1.21,1.34]    0.06        0.90 [0.83,0.97]    1.96         0.76 [0.70,0.83]    0.46
Mathematics         455      1.10 [0.96,1.24]    79.99     1.42 [1.34,1.50]    0.02        0.92 [0.85,0.99]    1.33         0.80 [0.73,0.87]    0.25
People              7,884    1.39 [1.31,1.48]    5.87      1.36 [1.32,1.41]    0.03        0.89 [0.83,0.94]    1.98         0.64 [0.62,0.66]    0.96
Philos./Religion    884      1.49 [1.38,1.60]    6.01      1.42 [1.37,1.47]    0.02        0.95 [0.88,1.02]    1.16         0.68 [0.63,0.74]    0.68
Physical sci.       1,936    1.44 [1.28,1.61]    6.06      1.35 [1.32,1.39]    0.4         0.96 [0.91,1.02]    1.09         0.76 [0.72,0.79]    0.47
Society/Social      2,698    1.38 [1.27,1.48]    6.98      1.27 [1.22,1.31]    0.05        0.99 [0.95,1.02]    0.75         0.76 [0.72,0.79]    0.41
Technology          1,679    1.21 [1.13,1.30]    17.54     1.30 [1.27,1.33]    0.05        0.97 [0.93,1.02]    0.84         0.81 [0.78,0.85]    0.26
Vital Articles      26,014   1.33 [1.26,1.41]    11.5      1.31 [1.28,1.34]    0.05        0.91 [0.87,0.95]    1.71         0.69 [0.67,0.71]    0.64
[Figure S1 heatmap; cell percentages are not recoverable from the extracted text. Both axes list the adminship flags with their user counts, under the group labels Administer and Delegated: Founder (1), Import (2), Interface-admin (11), Bureaucrat (19), Oversight (44), Checkuser (51), Abusefilter (149), Sysop (1066), Accountcreator (17), Abusefilter-helper (23), Massmessage-sender (58), Eventcoordinator (121), Templateeditor (191), Extendedmover (357), Filemover (403), Ipblock-exempt (666), Patroller (711), Autoreviewer (4504), Rollbacker (6561), Reviewer (7681), Bot (325).]
Figure S1: A hierarchical structure of adminship flags in Wikipedia. The axes of the
figure represent lists of adminship flags. Since a user can possess multiple adminship flags, the
colored squares with annotated numbers indicate the conditional probability that a user holding
an adminship flag on the y-axis also holds an adminship flag on the x-axis. This representation
reveals two distinct community structures within the adminship system: requested administrators
and delegated administrators.
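The conditional probabilities described in this caption can be illustrated with a minimal sketch. It assumes the input is a mapping from each user to the set of flags that user holds; the function name `conditional_flag_matrix` and the toy users are hypothetical and are not taken from the paper or from Wikipedia data.

```python
from collections import Counter
from itertools import product


def conditional_flag_matrix(user_flags):
    """Return {(y, x): P(user holds x | user holds y)} for every flag pair.

    user_flags maps a user name to the set of flags that user holds.
    """
    holders = Counter()      # number of users holding each flag
    co_holders = Counter()   # number of users holding both flags of a pair
    for flags in user_flags.values():
        for flag in flags:
            holders[flag] += 1
        for y, x in product(flags, repeat=2):
            co_holders[(y, x)] += 1
    # A Counter returns 0 for pairs that never co-occur.
    return {
        (y, x): co_holders[(y, x)] / holders[y]
        for y in holders
        for x in holders
    }


# Hypothetical toy data, for illustration only.
toy_users = {
    "alice": {"sysop", "bureaucrat"},
    "bob": {"sysop"},
    "carol": {"rollbacker", "reviewer"},
    "dave": {"sysop", "rollbacker"},
}
matrix = conditional_flag_matrix(toy_users)
print(matrix[("bureaucrat", "sysop")])  # share of toy bureaucrats who are also sysops
print(matrix[("sysop", "bureaucrat")])  # share of toy sysops who are also bureaucrats
```

The matrix is generally asymmetric: a row for a rare flag can show 100% overlap with a common flag while the reverse entry is small, which is the pattern that separates the two groups in the figure.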
[Figure S2 panels, both plotted against the number of contributors: a) number of administrators,
slope = 0.753 [0.747, 0.759]; b) number of bots, slope = 0.395 [0.367, 0.423].]
Figure S2: Coordination costs in Wikipedia. The figures depict the scaling relationship between
the number of contributors and the required a) number of administrators and b) number of bots. The
orange line represents the regression results, while the gray line indicates the baseline results
with a scaling exponent of 𝛽 = 1. Blue dots represent the average value of each bin, with error
bars denoting the standard error.
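A regression line of this kind can be sketched as an ordinary least-squares fit in log-log space, assuming the usual power-law form Y = Y0 · N^𝛽, so that log Y = log Y0 + 𝛽 log N. The function name, the synthetic data, and the normal-approximation 95% interval below are assumptions of this sketch; the source does not state how its intervals were computed.

```python
import numpy as np


def fit_scaling_exponent(n_contributors, cost):
    """Fit cost = y0 * n**beta by ordinary least squares in log-log space.

    Returns (beta, y0, (ci_low, ci_high)). The interval is a
    normal-approximation 95% interval for beta (an assumption of this sketch).
    """
    n = np.asarray(n_contributors, dtype=float)
    y = np.asarray(cost, dtype=float)
    keep = (n > 0) & (y > 0)  # logarithms need positive values
    log_n, log_y = np.log10(n[keep]), np.log10(y[keep])

    beta, intercept = np.polyfit(log_n, log_y, 1)
    residuals = log_y - (beta * log_n + intercept)
    dof = log_n.size - 2
    se_beta = np.sqrt(
        residuals @ residuals / dof / np.sum((log_n - log_n.mean()) ** 2)
    )
    return beta, 10 ** intercept, (beta - 1.96 * se_beta, beta + 1.96 * se_beta)


# Synthetic pages with a known exponent of 0.75 (hypothetical data).
rng = np.random.default_rng(0)
n = rng.integers(10, 10_000, size=2_000)
y = 2.0 * n ** 0.75 * rng.lognormal(sigma=0.3, size=n.size)
beta, y0, ci = fit_scaling_exponent(n, y)
print(f"beta = {beta:.3f} [{ci[0]:.3f}, {ci[1]:.3f}], Y0 = {y0:.2f}")
```

An exponent below the baseline 𝛽 = 1 means the quantity grows more slowly than the number of contributors; under the assumed form, doubling N multiplies Y by 2^𝛽.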
[Figure S3 panels, all plotted against the number of contributors: a) talk page size (bytes),
slope = 1.379 [1.324, 1.435]; b) number of reverts, slope = 1.309 [1.283, 1.335]; c) number of
administrator activities, slope = 0.910 [0.876, 0.944]; d) number of bot activities,
slope = 0.679 [0.663, 0.695].]
Figure S3: Coordination costs in Wikipedia – All pages. The figures depict the scaling
relationship between the measured coordination costs and the number of contributors for four
coordination cost metrics: a) talk page size, b) number of reverts, c) number of administrator
activities, and d) number of bot activities. The orange line represents the regression results,
while the gray line indicates the baseline results with a scaling exponent of 𝛽 = 1. Blue dots
represent the average value of each bin, with error bars denoting the standard error. As a
robustness check, we extend the main analysis to all Wikipedia pages.
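The binned averages and standard errors mentioned in the caption could be produced along the following lines. This is a sketch only: the logarithmic bin spacing, the number of bins, and the function name `log_binned_means` are assumptions, since the source does not specify the binning scheme.

```python
import numpy as np


def log_binned_means(n_contributors, cost, n_bins=10):
    """Average cost within logarithmically spaced bins of contributor count.

    Returns a list of (bin_center, mean, standard_error) for every bin that
    holds at least two pages. Contributor counts must be positive.
    """
    n = np.asarray(n_contributors, dtype=float)
    y = np.asarray(cost, dtype=float)
    edges = np.logspace(np.log10(n.min()), np.log10(n.max()), n_bins + 1)
    # Assign each page to a bin; clip so boundary pages stay in range.
    index = np.clip(np.digitize(n, edges) - 1, 0, n_bins - 1)

    rows = []
    for b in range(n_bins):
        values = y[index == b]
        if values.size < 2:  # a standard error needs at least two pages
            continue
        center = np.sqrt(edges[b] * edges[b + 1])  # geometric bin center
        sem = values.std(ddof=1) / np.sqrt(values.size)
        rows.append((center, values.mean(), sem))
    return rows


# Synthetic, noise-free pages following cost = 0.05 * n**1.3 (hypothetical).
rng = np.random.default_rng(1)
n = rng.integers(1, 5_000, size=3_000)
y = 0.05 * n ** 1.3
rows = log_binned_means(n, y)
for center, mean, sem in rows:
    print(f"{center:10.1f} {mean:12.2f} +/- {sem:.2f}")
```

Logarithmic bins keep roughly comparable spacing across several orders of magnitude on a log-log plot, which is why they are a common choice for this kind of figure.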
[Figure S4 panels, all plotted against the number of contributors: a) talk page size (bytes),
slope = 1.324 [1.250, 1.398]; b) number of reverts, slope = 1.311 [1.282, 1.341]; c) number of
bot activities, slope = 0.710 [0.690, 0.730].]
Figure S4: Robustness check of the scaling exponent. Out of the 48,822 vital pages in
Wikipedia, 22,071 pages have not received any intervention from administrators. As a robustness
check, we examine the scaling relationship between the number of contributors and a) talk page
size, b) number of reverts, and c) number of bot activities, including pages that have no edits
from administrators. We consistently find a super-linear scaling relationship for talk page size
and the number of reverts, and a sub-linear scaling relationship for the number of bot
activities. These results are in line with our previous observations.