the agents [21, 46–48]. From online social media, social networks can be reconstructed in different ways, where links represent social relationships or interactions. Since we are interested in capturing the possible exchange of opinion between users, we assume directed links to represent the substrate over which information may flow. For instance, if user i follows user j on Twitter, user i can see tweets produced by user j, thus there is a flow of information from node j to node i in the network. That is, when the reconstructed network is directed, we assume the link direction points to possible influencers (opposite of information flow). Actions such as mentions or retweets may convey similar flows. In some cases, direct relations between users are not available in the data, so one needs to assume some proxy for social connections, e.g., a link between two users if they comment on the same post on Facebook. Crucially, the two elements characterizing the presence of echo-chambers, polarization and homophilic interactions, should be quantified independently.
Implementation on social media
This section explains how we implement the operational definitions given above on different social media. For each medium, we detail (i) how we quantify the individual leaning of users, and (ii) how we reconstruct the interaction network on top of which the information spreads. Further details are provided in the Materials and Methods Section.
Twitter. We consider the set of tweets posted by user i that contain links to news organizations of known political leaning. Each news organization is associated with a political leaning score [49] ranging from extreme left to extreme right, in accordance with the classification reported in Materials and Methods. We infer the individual leaning of a user i, $x_i \in [-1, +1]$, by averaging the scores of the news organizations linked by user i according to Eq. (1). We analyze three different data sets collected on Twitter related to controversial topics: gun control, Obamacare, and abortion. For each data set, the social interaction network is reconstructed by using the following relation, so that there exists a directed link from node i to node j if user i follows user j. Henceforth we focus on the data set about abortion; the others are shown in the Supplementary Material (SM).
Facebook. The individual leaning of users is quantified by considering endorsements in the form of likes to posts. While other actions such as comments or shares could be taken into account, the written text may radically change the inferred leaning. Additionally, while a like is usually a positive feedback on a news item, comments and shares can be associated with different purposes [8]. A comment can have multiple features and meanings and can generate collective debate, while a share indicates a desire to spread a news item to friends. Posts are produced by pages that are labeled in a certain number of categories, and to each category we assign a numerical value (e.g., Anti-Vax (+1) or Pro-Vax (-1)). Each like to a post (only one like per post is allowed) represents an endorsement for that content, which is assumed to be aligned with the labeling of the page. Thus, the individual leaning of a user is defined as the average of the content leanings of the posts liked by the user, according to Eq. (1). We analyze three different data sets collected on Facebook, each regarding a specific topic of discussion: vaccines, science versus conspiracy, and news. The interaction network is defined by considering comments. In such an interaction network two users are connected if they co-commented on at least one post. Henceforth we focus on the data sets about vaccines and news; the others are shown in the SM.
Reddit. Here, the individual leaning of users is quantified similarly to Twitter, by considering the links to news organizations in the content produced by the users, i.e., submissions and comments. The interaction network is defined by considering comments and submissions, by reconstructing the information flow. There exists a directed link from node i to node j if user i comments on a submission or comment by user j (we assume that i reads the comment they are replying to, which is written by j). We analyze three data sets collected on different subreddits: the donald, politics, and news. In the following we focus on the data sets collected on the Politics and on the News subreddits; the others are shown in the SM.
Gab. The political leaning $x_i$ of user i is computed by considering the set of contents posted by user i that contain a link to news organizations of known political leaning, similarly to Twitter and Reddit. To obtain the leaning $x_i$ of user i, we averaged the scores of each link posted by user i according to Eq. (1). The interaction network is reconstructed by exploiting the co-commenting relationships under posts in the same way as for Facebook. Given two users i and j, an undirected edge between i and j exists if and only if they comment under the same post.
COMPARATIVE ANALYSIS
In the following we compare the presence or absence of echo-chambers across social media. We select one data set for each social medium: Abortion (Twitter), Vaccines (Facebook), Politics (Reddit), and Gab as a whole. Results for other data sets for the same medium are qualitatively similar, as shown in the SM. We first characterize echo-chambers in the topology of the networks, then look at their effects on information diffusion. Finally, we directly compare Facebook and Reddit on a common topic, news consumption, to highlight the differences in the behavior of users.
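The operational definitions of the implementation section (leaning as an average of content scores, a directed follow network, an undirected co-commenting network) can be illustrated with a minimal Python sketch. The function names and the toy data are hypothetical and are not taken from the paper's pipeline; Eq. (1) is assumed here to be a plain arithmetic mean, as the text describes.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean


def individual_leaning(content_scores):
    """Leaning x_i of one user: mean score of the contents they posted or liked.

    Returns None for users with no classified content (assumption: such
    users are simply left unclassified).
    """
    return mean(content_scores) if content_scores else None


def follow_network(follows):
    """Directed links i -> j when user i follows user j (i can see j's posts)."""
    out_neighbors = defaultdict(set)
    for follower, followed in follows:
        out_neighbors[follower].add(followed)
    return dict(out_neighbors)


def co_comment_network(comments):
    """Undirected edges between users who commented under the same post."""
    users_by_post = defaultdict(set)
    for user, post in comments:
        users_by_post[post].add(user)
    edges = set()
    for users in users_by_post.values():
        edges.update(frozenset(pair) for pair in combinations(sorted(users), 2))
    return edges


# Hypothetical toy data, not drawn from the data sets analyzed in the paper.
leaning_alice = individual_leaning([-1.0, -0.5, 0.0])
follows = follow_network([("alice", "bob"), ("alice", "carol"), ("bob", "carol")])
co_edges = co_comment_network(
    [("alice", "p1"), ("bob", "p1"), ("carol", "p2"), ("bob", "p2")]
)
```

Storing co-commenting edges as `frozenset` pairs keeps them undirected and deduplicated, so two users who meet under several posts are still joined by a single edge.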
FIG. 1: Joint distribution of the leaning of users $x$ and the average leaning of their neighborhood $x^N$ for different data sets. Colors represent the density of users: the lighter, the larger the number of users. Marginal distributions $P(x)$ and $P^N(x)$ are plotted on the x and y axis, respectively. [Panels recoverable from the extraction: (c) Facebook, (d) Gab.]
FIG. 2: Size and average leaning of communities detected in different data sets. [Axes: Community ID (x) and Community Size (y); panels recoverable from the extraction: (c) Facebook, (d) Gab.]
Homophily in the interaction networks
The topology of the interaction network can reveal the presence of echo-chambers, where users are surrounded by peers with similar leaning and thus are exposed with higher probability to similar contents. In network terms, this translates into a node i with a given leaning $x_i$ being more likely to be connected with nodes with a leaning close to $x_i$ [21]. This concept can be quantified by defining, for each user i, the average leaning of their neighborhood as
$$x^N_i \equiv \frac{1}{k_i^{\rightarrow}} \sum_j A_{ij} x_j ,$$
where $A_{ij}$ is the adjacency matrix of the interaction network, $A_{ij} = 1$ if there is a link from node i to node j, $A_{ij} = 0$ otherwise, and $k_i^{\rightarrow} = \sum_j A_{ij}$ is the out-degree of node i.
Fig. 1 shows the correlation between the leaning of a user i and the leaning of their neighbors, $x^N_i$, for the four social media under consideration. The probability distributions $P(x)$ (individual leaning) and $P^N(x)$ (average leaning of neighbors) are plotted on the x and y axis, respectively. All plots are color-coded contour maps, which represent the number of users in the phase space $(x, x^N)$: the brighter the area in the map, the larger the density of users in that area. The topics of vaccines and abortion, on Facebook and Twitter, respectively, clearly show two distinct groups whose leanings differ quite starkly, as indicated by the two bright areas characterized by a high density of users with like-minded neighbors. Similar behavior is found for different topics from the same social media platform, see SM. Conversely, Reddit and Gab show a different picture. The corresponding plots in Fig. 1 display a single bright area, indicating that users do not split into groups with opposite leaning but form a single community, biased to the left (Reddit) or the right (Gab). Similar results are found for different data sets on Reddit, see SM.
Homophilic interactions can be revealed by the community structure of the interaction networks. We detect communities by applying the Louvain algorithm for community detection [50]. We remove singleton communities with only one user and look at the average leaning of each community, determined as the average of individual leanings of its members.
Fig. 2 shows the communities emerging for each social medium, arranged by increasing average leaning on the x-axis (color-coded from blue to red), while the y-axis reports the size of the community. We find a picture that confirms the pattern observed before. On Facebook and Twitter, communities span the whole spectrum of possible leanings, but each community is formed by users with similar leaning. Some communities are characterized by very strong average leaning, especially in the case of Facebook. Conversely, communities on Reddit and Gab do not cover the whole spectrum, and all show similar average leaning. Furthermore, the almost total absence of communities with leaning very close to 0 is noticeable, confirming the polarized state of the systems. In addition, the number of communities identified is different among the four social media. The similar number of communities found in Gab and Reddit and the strong difference
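The two quantities used in this subsection, the average neighborhood leaning $x^N_i$ and the average leaning of each community, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the network is given as a hypothetical `out_neighbors` dictionary, the community assignment `membership` is assumed to come from any community detection routine (the Louvain step itself is not reproduced here), and neighbors without a known leaning are skipped, which the text does not specify.

```python
from collections import defaultdict
from statistics import mean


def neighborhood_leaning(out_neighbors, leaning):
    """x^N_i = (1 / k_i) * sum_j A_ij x_j over the out-neighbors j of node i.

    Assumption: neighbors without a classified leaning are ignored, so the
    average runs over classified out-neighbors only.
    """
    result = {}
    for node, neighbors in out_neighbors.items():
        scored = [leaning[j] for j in neighbors if j in leaning]
        if scored:
            result[node] = mean(scored)
    return result


def community_leaning(membership, leaning):
    """Map each non-singleton community to (size, average leaning of members)."""
    members = defaultdict(list)
    for node, community in membership.items():
        members[community].append(node)
    summary = {}
    for community, nodes in members.items():
        if len(nodes) < 2:  # singleton communities are removed, as in the text
            continue
        summary[community] = (len(nodes), mean(leaning[n] for n in nodes))
    return summary


# Hypothetical toy network with two like-minded pairs and one isolated user.
leaning = {"a": -1.0, "b": -0.5, "c": 1.0, "d": 0.5, "e": 0.0}
out_neighbors = {"a": {"b"}, "b": {"a"}, "c": {"d"}, "d": {"c", "a"}}
x_nn = neighborhood_leaning(out_neighbors, leaning)
communities = community_leaning(
    {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2}, leaning
)
```

Plotting the pairs (`leaning[i]`, `x_nn[i]`) as a density map gives the kind of joint distribution described for Fig. 1, while `communities` holds the size and average leaning reported per community in Fig. 2.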
Effects on information spreading
The presence of echo chambers can be gauged by simple models of information spreading: users are expected to exchange information more likely with peers sharing a similar leaning [21, 44, 51]. Classical epidemic models such as the SIR model describe users as susceptible (unaware of the circulating information), infectious (aware and willing to spread it further), or recovered (aware but not willing to transmit it anymore). Susceptible (unaware) users may become infectious (aware) upon contact with infected neighbors, with a certain transmission probability $\beta$. Infectious users can spontaneously become recovered with probability $\nu$. In order to measure the effects of the leaning of users on the diffusion of information, we run the SIR dynamics on the interaction networks, by starting the epidemic process with only one node i infected, and stopping it when no more infectious nodes are left.
The set of nodes in a recovered state at the end of the dynamics started with user i as seed of infection, i.e., those that become aware of the information initially propagated by user i, forms the set of influence of user i, $I_i$ [55]. The set of influence of a user thus represents those individuals that can be reached by a piece of content sent by them, depending on the effective infection ratio $\beta/\nu$. One can compute the average leaning of the set of influence of user i, $\mu_i$, as
$$\mu_i \equiv |I_i|^{-1} \sum_{j \in I_i} x_j . \tag{2}$$
The quantity $\mu_i$ indicates how polarized the users are that can be reached by a message initially propagated by user i [21].
FIG. 3: Average leaning $\langle\mu(x)\rangle$ of the influence sets reached by users with leaning $x$, for different data sets under consideration. Size and color of each point represent the average size of the influence sets. The parameters of the SIR dynamics are set to $\beta = 0.10\,\langle k\rangle^{-1}$ for panel (a), $\beta = 0.01\,\langle k\rangle^{-1}$ for panel (b), $\beta = 0.05\,\langle k\rangle^{-1}$ for panel (c) and $\beta = 0.05\,\langle k\rangle^{-1}$ for panel (d), while $\nu$ is fixed at 0.2 for all simulations. [Panels: (a) Twitter, (b) Reddit, (c) Facebook, (d) Gab; axes: Seed Leaning (x) and Influence Set Leaning (y).]
Fig. 3 shows the average leaning $\langle\mu(x)\rangle$ of the influence sets reached by users with leaning $x$, for the different data sets under consideration. The recovery rate $\nu$ is fixed at 0.2 for every data set, while the relationship between the infection rate $\beta$ and the average degree $\langle k\rangle$ varies from data set to data set and is reported in the caption of each figure. More details about the networks used for the SIR model are reported in Table I in the Materials and Methods Section. Again, one can observe a clear distinction between Facebook and Twitter, on one side, and Reddit and Gab on the other side. For the topics of vaccines and abortion, on Facebook and Twitter, respectively, users with a given leaning are much more likely to be reached by information propagated by users with similar leaning, i.e., $\langle\mu(x)\rangle \sim x$. Similar behavior is found for different topics from the same social media platform, see SM. Conversely, Reddit and Gab show a different behavior: the average leaning of the set of influence, $\langle\mu(x)\rangle$, does not depend on the leaning $x$.
These results indicate that in some social media, namely Twitter and Facebook, information diffusion is biased toward individuals that share similar leaning, while in others (Reddit and Gab in our analysis) this effect is absent. The quantity $\langle\mu(x)\rangle$, indeed, gauges the strength of the echo chamber effect: the closer $\langle\mu(x)\rangle$ is to $x$, the stronger the echo chamber effect, while if $\langle\mu(x)\rangle$ is independent of $x$, echo-chambers are not present. Our results are robust with respect to different values of the effective infection ratio $\beta/\nu$, see SM.
Furthermore, Fig. 3 shows that the spreading capacity, represented by the average size of the influence sets (color coded in Fig. 3), depends on the leaning of the users. On Twitter, pro-abortion users are more likely to reach larger audiences; the same is true for anti-vax users on Facebook, left-leaning users on Reddit, and right-leaning users on Gab (in this data set, left-leaning users are almost absent though).
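The spreading experiment of this subsection can be sketched as a single discrete-time SIR realization followed by the average of Eq. (2). This is a minimal sketch under stated assumptions, not the simulation code used for the paper: the update scheme (synchronous steps), the function names, and the toy network are hypothetical, and the seed is counted in its own influence set because it ends the run in the recovered state.

```python
import random


def influence_set(out_neighbors, seed, beta, nu, rng):
    """Run one SIR realization seeded at `seed`; return the recovered nodes."""
    # A link i -> j means i can see content by j, so information flows from
    # j to i: an infectious node can infect every node that points to it.
    audience = {}
    for i, targets in out_neighbors.items():
        for j in targets:
            audience.setdefault(j, set()).add(i)

    infectious, recovered = {seed}, set()
    while infectious:
        newly_infected = set()
        for node in infectious:
            for reader in audience.get(node, ()):
                susceptible = reader not in infectious and reader not in recovered
                if susceptible and rng.random() < beta:
                    newly_infected.add(reader)
        # Each infectious node recovers with probability nu at every step.
        newly_recovered = {node for node in infectious if rng.random() < nu}
        recovered |= newly_recovered
        infectious = (infectious | newly_infected) - newly_recovered
    return recovered


def influence_leaning(influence, leaning):
    """mu_i: mean leaning x_j over the nodes j of the influence set (Eq. 2)."""
    scored = [leaning[j] for j in influence if j in leaning]
    return sum(scored) / len(scored) if scored else None


# Hypothetical chain d -> a -> b -> c: content posted by c can reach b, a, d.
out_neighbors = {"a": {"b"}, "b": {"c"}, "d": {"a"}}
leaning = {"a": -1.0, "b": -0.5, "c": 0.5, "d": 1.0}
reached = influence_set(out_neighbors, "c", beta=1.0, nu=1.0, rng=random.Random(0))
mu_c = influence_leaning(reached, leaning)
isolated = influence_set(out_neighbors, "c", beta=0.0, nu=1.0, rng=random.Random(0))
```

With the parameters quoted in the caption of Fig. 3 one would set `nu=0.2` and `beta` to the stated fraction of the inverse average degree, then average `influence_leaning` over many runs and over seeds with similar leaning to estimate $\langle\mu(x)\rangle$.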
Fig. 4 shows a direct comparison of news consumption on Facebook and Reddit along the metrics used in the previous Sections to quantify the presence of echo-chambers: i) the correlation between the leaning of a user x and the [...] dynamics (bottom row). One can see that all three measures confirm the picture obtained for other data sets: On Facebook, we observe a clear separation among users de- [...] latter social media, even users displaying a more extreme [...] 4 panel b top row) tend to interact with the majority. Moreover, on Facebook the leaning of the seed user has an effect on who the final recipients of the information are, therefore indicating the presence of echo-chambers. On Reddit this effect is absent.
[Fig. 4 panels: (a) Facebook, (b) Reddit; axes recoverable from the extraction: Community ID vs. Community Size, and Seed Leaning vs. Influence Set Leaning.]
FIG. 5: Example of two news sources, namely New York Times and Breitbart, classified on mediabiasfactcheck.org. Notice that, although Breitbart is labeled as “Questionable”, an explicit leaning appears in its description.
FIG. 6: Distribution of the leanings assigned to each source, ranging from Extreme Left (numerical value: -1, colored in blue) to Extreme Right (numerical value: +1, colored in red).
[...] independent fact-checking organization that rates news outlets on the basis of the reliability and of the political bias of the contents they produce and share. The website provides the political bias related to a wide range of media outlets. The labeling provided by MBFC, retrieved in June 2019, ranges from Extreme Left to Extreme Right for what concerns the political bias. Certain media outlets are instead classified as ‘questionable’ sources or ‘conspiracy-pseudoscience’ sources if they tend to publish misinformation or false contents. However, most of the news outlets without an explicit political label reported by MBFC actually have a political bias (e.g., breitbart) that is reported in their description, as shown in Figure 5. These media outlets often have a political bias that is classified as extreme (either left or right). Considering the importance of including such media outlets in our analysis, we manually reported their classification from the description provided by MBFC, thus adding 468 outlets to the pool of 1722 news outlets that already have a clear political label. The total number of media outlets for which we have a political label is 2190 and the overall leaning is summarized in Figure 6.
Empirical data sets
Here we report details on data collection for different social media, summarized in Table I.
Twitter
Gun control. We consider C = 19M tweets spanning 14 days in June 2016, produced by N = 7506 users. We reconstruct a directed follow network formed by E = 1 053 275 directed edges. The largest weakly connected component includes more than 99% of nodes. We identify the individual leaning of Nc = 6994 users.
Obamacare. We consider C = 34M tweets spanning 7 days in June 2016, produced by N = 8773 users. We reconstruct a directed follow network formed by E = 3 797 871 directed edges. The largest weakly connected component includes more than 99% of nodes. We identify the individual leaning of Nc = 7899 users.
Abortion. We consider C = 34M tweets spanning 7 days in June 2016, produced by N = 3995 users. We reconstruct a directed follow network formed by E = 2 330 276 directed edges. The largest weakly connected component includes more than 99% of nodes. We identify the individual leaning of Nc = 3809 users.
Facebook
Science and Conspiracy. The dataset was built by downloading posts of selected Facebook pages divided into two groups, namely conspiracy news and science news. Conspiracy pages were selected based on their name, their self description and with the aid of debunking pages. The
TABLE I: For each data set, we report: the starting date of collection $T_0$, time span $T$ expressed in days (d) or years (y), number of unique contents $C$, number of users $N$, coverage $n_c$ (fraction of users with classified leaning), size of the giant component $G$ and average node degree $\langle k\rangle$.
News. To build this dataset, a set of Facebook pages of news outlets listed by the Europe Media Monitor was identified as a first step. By using the Facebook Graph API, all the posts and comments related to these pages in the period 2010-2015 were downloaded. Facebook pages are labelled according to the annotation provided by mediabiasfactcheck.org. The dataset, without annotations, has previously been explored [8]. We consider 15 540 posts by 180 pages categorized from Left to Right (Left (12), Left-Center (80), Least-Biased (42), Right-Center (33), Right (13)), and 38 663 active users (≥ 3 likes and 3 comments) that co-commented 13 525 230 times. The largest connected component of the co-interaction network has G = 38 594 nodes and E = 13 525 119 links.
The dataset, downloaded from https://files.pushshift.io/gab, spans from the first Gab post (which occurred in 2016) to late 2018 and includes data regarding post-reply relationships, the number of upvotes of posts, reposts or replies and their timestamps. We selected all the contents (post, reply, quote) from 11/2017 to 10/2018, that is C = 13 580 937 unique pieces of content created by N = 165 162 unique users. We consider all the posts that have a link to an external source, for a total of 3 302 621 posts (excluding youtube links). By extracting the domain from each link we obtain 75 436 unique domains. In this set, 1650 unique domains for a total of 1 454 502 URLs (44%) were labelled in the MBFC database. We were able to compute the political leaning of Nc = 31 286 users. We also reconstructed the interaction network using co-commenting as a proxy. The largest connected component includes G = 20 701 nodes, about 66% of the users with leaning, and E = 8 273 412 edges.
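The link-labelling step described for the Gab data (extract the domain of each external link, look it up among the labelled outlets, and measure how many links are covered) can be sketched as follows. This is a sketch under loudly stated assumptions: the evenly spaced numerical scores are hypothetical, since the text only fixes the endpoints Extreme Left = -1 and Extreme Right = +1, and the lookup table `domain_label` and the example URLs are invented for illustration.

```python
from urllib.parse import urlparse

# Assumption: intermediate labels are evenly spaced between the two endpoints
# given in the text (-1 for Extreme Left, +1 for Extreme Right).
LABEL_SCORE = {
    "Extreme Left": -1.0,
    "Left": -2 / 3,
    "Left-Center": -1 / 3,
    "Least-Biased": 0.0,
    "Right-Center": 1 / 3,
    "Right": 2 / 3,
    "Extreme Right": 1.0,
}


def domain_of(url):
    """Lower-case host of a URL without a leading 'www.' (empty if no host)."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def label_urls(urls, domain_label):
    """Return the scores of the URLs whose domain has a known label."""
    scores = []
    for url in urls:
        label = domain_label.get(domain_of(url))
        if label is not None:
            scores.append(LABEL_SCORE[label])
    return scores


# Hypothetical lookup table and links; real labels would come from MBFC.
domain_label = {"left.example": "Left", "right.example": "Extreme Right"}
urls = [
    "https://www.left.example/story",
    "http://right.example/a?b=1",
    "https://unknown.example/x",
]
scores = label_urls(urls, domain_label)
coverage = len(scores) / len(urls)  # fraction of links with a labelled domain
```

Averaging `scores` per user then gives that user's leaning, and `coverage` plays the role of the 44% of URLs that the text reports as labelled.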
[1] Walter Quattrociocchi. Part 2-social and political challenges: 2.1 western democracy in crisis? In Global Risk Report World Economic Forum, 2017.
[2] An Nguyen and Hong Tien Vu. Testing popular news discourse on the “echo chamber” effect: Does political polarisation occur among those relying on social media as their primary politics news source? First Monday, 24(5), 2019.
[3] Elizabeth Dubois and Grant Blank. The echo chamber is overstated: the moderating effect of political interest and diverse media. Information, Communication & Society, 21(5):729–745, 2018.
[4] Leticia Bode. Political news in the news feed: Learning politics from social media. Mass communication and society, 19(1):24–48, 2016.
[5] Nic Newman, Richard Fletcher, Antonis Kalogeropoulos, and Rasmus Nielsen. Reuters institute digital news report 2019, volume 2019. Reuters Institute for the Study of Journalism, 2019.
[6] Andrea Baronchelli. The emergence of consensus: a primer. Royal Society open science, 5(2):172189, 2018.
[7] Matteo Cinelli, Emanuele Brugnoli, Ana Lucia Schmidt, Fabiana Zollo, Walter Quattrociocchi, and Antonio Scala. Selective exposure shapes the facebook news diet. PloS one, 15(3):e0229129, 2020.
[8] Ana Lucía Schmidt, Fabiana Zollo, Michela Del Vicario, Alessandro Bessi, Antonio Scala, Guido Caldarelli, H Eugene Stanley, and Walter Quattrociocchi. Anatomy of news consumption on facebook. Proceedings of the National Academy of Sciences, 114(12):3035–3039, 2017.
[9] Matteo Cinelli, Walter Quattrociocchi, Alessandro Galeazzi, Carlo Michele Valensise, Emanuele Brugnoli, Ana Lucia Schmidt, Paola Zola, Fabiana Zollo, and Antonio Scala. The covid-19 social media infodemic. arXiv preprint arXiv:2003.05004, 2020.
[10] Michael D Conover, Jacob Ratkiewicz, Matthew Francisco, Bruno Gonçalves, Filippo Menczer, and Alessandro Flammini. Political polarization on twitter. In Fifth international AAAI conference on weblogs and social media, 2011.
[11] Christopher A Bail, Lisa P Argyle, Taylor W Brown, John P Bumpus, Haohan Chen, MB Fallin Hunzaker, Jaemin Lee, Marcus Mann, Friedolin Merhout, and Alexander Volfovsky. Exposure to opposing views on social media can increase political polarization. Proceedings of the National Academy of Sciences, 115(37):9216–9221, 2018.
[12] Nicola Perra and Luis EC Rocha. Modelling opinion dynamics in the age of algorithmic personalisation. Scientific reports, 9(1):1–11, 2019.
[13] Kazutoshi Sasahara, Wen Chen, Hao Peng, Giovanni Luca Ciampaglia, Alessandro Flammini, and Filippo Menczer. On the inevitability of online echo chambers. arXiv preprint arXiv:1905.03919, 2019.
[14] Michela Del Vicario, Alessandro Bessi, Fabiana Zollo, Fabio Petroni, Antonio Scala, Guido Caldarelli, H Eugene Stanley, and Walter Quattrociocchi. The spreading of misinformation online. Proceedings of the National Academy of Sciences, 113(3):554–559, 2016.
[15] Kathleen Hall Jamieson and Joseph N Cappella. Echo chamber: Rush Limbaugh and the conservative media establishment. Oxford University Press, 2008.
[16] R Kelly Garrett. Echo chambers online?: Politically motivated selective exposure among Internet news users. Journal of Computer-Mediated Communication, 14(2):265–285, 2009.
[17] Walter Quattrociocchi, Antonio Scala, and Cass R Sunstein. Echo chambers on Facebook. Available at SSRN 2795110, 2016.
[18] Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, and Michael Mathioudakis. Political discourse on social media: Echo chambers, gatekeepers, and the price of bipartisanship. In Proceedings of the 2018 World Wide Web Conference, WWW ’18, pages 913–922, Republic and Canton of Geneva, Switzerland, 2018. International World Wide Web Conferences Steering Committee.
[19] Alessandro Bessi, Mauro Coletto, George Alexandru Davidescu, Antonio Scala, Guido Caldarelli, and Walter Quattrociocchi. Science vs conspiracy: Collective narratives in the age of misinformation. PloS one, 10(2):e0118093, 2015.
[20] Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, and Michael Mathioudakis. The Effect of Collective Attention on Controversial Debates on Social Media. In WebSci ’17: 9th International ACM Web Science Conference, pages 43–52, 2017.
[21] Wesley Cota, Silvio C. Ferreira, Romualdo Pastor-Satorras, and Michele Starnini. Quantifying echo chamber effects in information spreading over political communication networks. EPJ Data Science, 8(1):35, Dec 2019.
[22] Duilio Balsamo, Valeria Gelardi, Chengyuan Han, Daniele Rama, Abhishek Samantray, Claudia Zucca, and Michele Starnini. Inside the echo chamber: Disentangling network dynamics from polarization. arXiv preprint arXiv:1906.09076, 2019.
[23] Alessandro Cossard, Gianmarco De Francisci Morales, Kyriaki Kalimeri, Yelena Mejova, Daniela Paolotti, and Michele Starnini. Falling into the echo chamber: the italian vaccination debate on twitter. arXiv preprint arXiv:2003.11906, 2020.
[24] Cass R Sunstein. Republic.com 2.0. Princeton University Press, 2009.
[25] Walter Quattrociocchi. Inside the echo chamber. Scientific American, 316(4):60–63, 2017.
[26] Chengcheng Shao, Giovanni Luca Ciampaglia, Onur Varol, Alessandro Flammini, and Filippo Menczer. The spread of fake news by social bots. arXiv preprint arXiv:1707.07592, pages 96–104, 2017.
[27] David MJ Lazer, Matthew A Baum, Yochai Benkler, Adam J Berinsky, Kelly M Greenhill, Filippo Menczer, Miriam J Metzger, Brendan Nyhan, Gordon Pennycook, David Rothschild, et al. The science of fake news. Science, 359(6380):1094–1096, 2018.
[28] Alexandre Bovet and Hernán A Makse. Influence of fake news in twitter during the 2016 us presidential election. Nature communications, 10(1):7, 2019.
[29] Itai Himelboim, Stephen McCreery, and Marc Smith. Birds of a feather tweet together: Integrating network and content analyses to examine cross-ideology exposure on twitter. Journal of computer-mediated communication, 18(2):154–174, 2013.
[30] Seth Flaxman, Sharad Goel, and Justin M Rao. Filter bubbles, echo chambers, and online news consumption. Public opinion quarterly, 80(S1):298–320, 2016.
[31] Dimitar Nikolov, Diego FM Oliveira, Alessandro Flammini, and Filippo Menczer. Measuring online social bubbles. PeerJ Computer Science, 1:e38, 2015.
[32] Cass R Sunstein. The law of group polarization. Journal of political philosophy, 10(2):175–195, 2002.
[33] Fabian Baumann, Philipp Lorenz-Spreen, Igor M. Sokolov, and Michele Starnini. Modeling echo chambers and polarization dynamics in social networks. Phys. Rev. Lett., 124:048301, Jan 2020.
[34] E.g., Obama foundation’s attempt to address the issue of echo chambers. https://www.engadget.com/2017/07/05/obama-foundation-social-media-echo-chambers Facebook’s CEO Mark Zuckerberg’s open letter. https://www.facebook.com/notes/mark-zuckerberg/building-global-community/10103508221158471/.
[35] Pablo Barberá, John T Jost, Jonathan Nagler, Joshua A Tucker, and Richard Bonneau. Tweeting from left to right: Is online political communication more than an echo chamber? Psychological science, 26(10):1531–1542, 2015.
[36] Axel Bruns. Echo chamber? what echo chamber? reviewing the evidence. 2017.
[37] Axel Bruns. Are Filter Bubbles Real? John Wiley & Sons, 2019.
[38] https://www.alexa.com/siteinfo/reddit.com.
[39] Savvas Zannettou, Barry Bradlyn, Emiliano De Cristofaro, Haewoon Kwak, Michael Sirivianos, Gianluca Stringini, and Jeremy Blackburn. What is gab: A bastion of free speech or an alt-right echo chamber. In Companion Proceedings of The Web Conference 2018, pages 1007–1014. International World Wide Web Conferences Steering Committee, 2018.
[40] Joseph T Klapper. The effects of mass communication. 1960.
[41] Raymond S Nickerson. Confirmation bias: A ubiquitous phenomenon in many guises. Review of general psychology, 2(2):175–220, 1998.
[42] Michela Del Vicario, Gianna Vivaldo, Alessandro Bessi, Fabiana Zollo, Antonio Scala, Guido Caldarelli, and Walter Quattrociocchi. Echo chambers: Emotional contagion and group polarization on facebook. Scientific reports, 6:37825, 2016.
[43] Leon Festinger. A theory of cognitive dissonance, volume 2. Stanford university press, 1962.
[44] Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, and Michael Mathioudakis. Quantifying Controversy in Social Media. In WSDM ’16: 9th ACM International Conference on Web Search and Data Mining, pages 33–42, 2016.
[45] Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, and Michael Mathioudakis. Quantifying controversy on social media. TSC: ACM Transactions on Social Computing, 1(1):3, 2018.
[46] Gueorgi Kossinets and Duncan J Watts. Origins of homophily in an evolving social network. American journal of sociology, 115(2):405–450, 2009.
[47] Luca Maria Aiello, Alain Barrat, Rossano Schifanella, Ciro Cattuto, Benjamin Markines, and Filippo Menczer. Friendship prediction and homophily in social media. ACM Transactions on the Web (TWEB), 6(2):9, 2012.
[48] Alessandro Bessi, Fabio Petroni, Michela Del Vicario, Fabiana Zollo, Aris Anagnostopoulos, Antonio Scala, Guido Caldarelli, and Walter Quattrociocchi. Homophily and polarization in the age of misinformation. The European Physical Journal Special Topics, 225(10):2047–2059, 2016.
[49] Eytan Bakshy, Solomon Messing, and Lada A Adamic. Exposure to ideologically diverse news and opinion on facebook. Science, 348(6239):1130–1132, 2015.
[50] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment, 2008(10):P10008, 2008.
[51] Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, and Michael Mathioudakis. Reducing Controversy by Connecting Opposing Views. In WSDM ’17: 10th ACM International Conference on Web Search and Data Mining, pages 81–90, 2017.
[52] R. M. Anderson and R. M. May. Infectious diseases in humans. Oxford University Press, Oxford, 1992.
[53] Laijun Zhao, Hongxin Cui, Xiaoyan Qiu, Xiaoli Wang, and Jiajia Wang. SIR rumor spreading model in the new media age. Physica A: Statistical Mechanics and its Applications, 392(4):995–1003, 2013.
[54] Clara Granell, Sergio Gómez, and Alex Arenas. Dynamical interplay between awareness and epidemic spreading in multiplex networks. Phys. Rev. Lett., 111:128701, Sep 2013.
[55] Petter Holme. Network reachability of real-world contact sequences. Phys. Rev. E, 71:046119, Apr 2005.
[56] https://mediabiasfactcheck.com.
[57] Alessandro Bessi, Fabiana Zollo, Michela Del Vicario, Michelangelo Puliga, Antonio Scala, Guido Caldarelli, Brian Uzzi, and Walter Quattrociocchi. Users polarization on facebook and youtube. PloS one, 11(8):e0159641, 2016.
[58] Ana Lucía Schmidt, Fabiana Zollo, Antonio Scala, Cornelia Betsch, and Walter Quattrociocchi. Polarization of the vaccination debate on facebook. Vaccine, 36(25):3606–3612, 2018.
Supplementary Information
Echo Chambers on Social Media: A comparative analysis
Here we show additional results not shown in the main paper: additional data sets in Section I and additional results for the SIR dynamics run with different parameters in Section II.
In this section we report the results obtained for four other data sets not shown in the main paper, namely “Science and Conspiracy” (Facebook), “Gun control” (Twitter), “Obamacare” (Twitter) and “The Donald” (Reddit). The techniques and the pipeline are the same as those used for the data sets analyzed in the main paper.
FIG. 7: Science vs Conspiracy. Panel (a): Individual leaning versus neighborhood leaning. Panel (b): Community detection. Panels (c) and (d): average leaning $\langle\mu(x)\rangle$ of the influence sets reached by users with leaning $x$, for infection probability $\beta = 0.01\,\langle k\rangle^{-1}$ and $\beta = 0.02\,\langle k\rangle^{-1}$, respectively, where $\langle k\rangle$ is the average degree of the network. [Legend: Conspiracy, Science; axes: Community ID vs. Community Size in panel (b), Seed Leaning vs. Influence Set Leaning in panels (c) and (d).]
Figure 7 displays the results obtained for the Facebook data set called “Science and Conspiracy”, described in
Materials and Methods of the main paper. Panel (a) shows the joint distribution of the leaning of users, $x$, against
the average leaning of their neighborhood, $x^N$. The community referred to as “Science”, which is assigned a leaning
of −1, is much smaller than the community called “Conspiracy”; for this reason it is not clearly visible in the density
plot, but only in the histograms at its margins. Panel (b) shows the size and average leaning of the communities
detected by the Louvain algorithm.
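The quantity plotted in panel (a) can be illustrated with a minimal sketch. It assumes that a user's neighborhood is the set of users they link to, and that the neighborhood leaning is the plain mean of those users' leanings; the function name, the edge-list format and the toy data are hypothetical and not taken from the paper.

```python
from collections import defaultdict

def neighborhood_leaning(edges, leaning):
    """Return {user: mean leaning of the users that user links to}.

    edges   -- iterable of (i, j) pairs, read as "i points to j" (assumed convention)
    leaning -- dict mapping user -> individual leaning x in [-1, 1]
    Users without any link to a user of known leaning are omitted.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for i, j in edges:
        if j in leaning:
            totals[i] += leaning[j]
            counts[i] += 1
    return {i: totals[i] / counts[i] for i in counts}

# Toy data (hypothetical, for illustration only)
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
leaning = {"a": -1.0, "b": 0.5, "c": 1.0}

x_N = neighborhood_leaning(edges, leaning)
# One (x, x^N) pair per user; a joint density plot is built from such pairs
pairs = [(leaning[u], x_N[u]) for u in sorted(x_N)]
```

Each entry of `pairs` corresponds to one point of the joint distribution; the density and marginal histograms are then a matter of plotting.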
Panels (c) and (d) show the results of the SIR dynamics: the average leaning $\langle \mu(x) \rangle$ of the influence sets reached by
users with leaning $x$, for two different values of the infection probability, with the recovery rate fixed at $\nu = 0.2$. The
size and color of each point indicate the average size of the influence sets.
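As a rough illustration of how an influence set and its average leaning could be obtained, the sketch below runs a single discrete-time SIR realization from one seed. The update rule, the exclusion of the seed from its own influence set, and the `spreads_to` mapping (who can receive content from whom) are assumptions made for this example; the exact procedure is the one described in the main paper.

```python
import random

def influence_set(seed, spreads_to, beta, nu, rng):
    """One discrete-time SIR realization started from `seed`.

    spreads_to -- dict: user -> users who can receive content from that user
                  (assumed to be built beforehand from the interaction network)
    beta, nu   -- per-step infection and recovery probabilities
    Returns the set of users ever infected, excluding the seed (an assumption).
    """
    if not 0.0 < nu <= 1.0:
        raise ValueError("nu must be in (0, 1] so that the process terminates")
    infected, recovered = {seed}, set()
    while infected:
        newly_infected = set()
        for u in sorted(infected):  # sorted only to make runs reproducible
            for v in spreads_to.get(u, ()):
                susceptible = v not in infected and v not in recovered
                if susceptible and rng.random() < beta:
                    newly_infected.add(v)
        healed = {u for u in sorted(infected) if rng.random() < nu}
        recovered |= healed
        infected = (infected - healed) | newly_infected
    return recovered - {seed}

def mean_leaning(users, leaning):
    """Average leaning of `users`; None if the set is empty."""
    values = [leaning[u] for u in users if u in leaning]
    return sum(values) / len(values) if values else None

# Toy network (hypothetical); beta = nu = 1 makes the outcome deterministic
spreads_to = {"a": ["b", "c"], "b": ["d"]}
leaning = {"a": -1.0, "b": -0.5, "c": -1.0, "d": 0.5}
reached = influence_set("a", spreads_to, beta=1.0, nu=1.0, rng=random.Random(0))
mu = mean_leaning(reached, leaning)
```

In practice one would average over many realizations per seed, with `nu=0.2` and the values of `beta` given in the figure captions.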
B. Gun control
[Figure 8, panels (a) to (d). Panel (b): Community Size versus Number of Communities; legend: Pro Gun control, Against Gun control. Panels (c) and (d): Influence Set Leaning versus Seed Leaning; point size and color: Influence Set Average Size.]
FIG. 8: Gun control. Panel (a): Individual leaning versus neighborhood leaning. Panel (b): Community detection. Panels (c)
and (d): average leaning $\langle \mu(x) \rangle$ of the influence sets reached by users with leaning $x$, for infection probability
$\beta = 0.1\,\langle k \rangle^{-1}$ and $\beta = 0.2\,\langle k \rangle^{-1}$, respectively, where $\langle k \rangle$ is the average degree of the network.
Figure 8 shows the results obtained for the Twitter data set “Gun control”, described in Materials and Methods of
the main paper. Panel (a) shows the joint distribution of the leaning of users, $x$, against the average leaning of their
neighborhood, $x^N$, in which two distinct regions are clearly visible. Panel (b) shows the size and average leaning of
the communities detected by the Louvain algorithm.
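The per-community summary shown in panel (b) can be sketched as follows. The example assumes that a partition has already been computed (for instance with a Louvain implementation) and is available as a user-to-community mapping; the function name and the toy data are hypothetical.

```python
from collections import defaultdict

def community_summary(membership, leaning):
    """Return {community: (size, average leaning)}.

    membership -- dict mapping user -> community label (assumed precomputed)
    leaning    -- dict mapping user -> individual leaning x in [-1, 1]
    Only users with a known leaning are counted (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for user, community in membership.items():
        if user in leaning:
            groups[community].append(leaning[user])
    return {c: (len(v), sum(v) / len(v)) for c, v in groups.items()}

# Toy partition (hypothetical, for illustration only)
membership = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1}
leaning = {"a": -1.0, "b": -0.5, "c": 1.0, "d": 0.5, "e": 0.0}
summary = community_summary(membership, leaning)
```

Sorting the communities by size and coloring each bar by its average leaning would give a plot of the same kind as panel (b).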
Panels (c) and (d) show the results of the SIR dynamics: the average leaning $\langle \mu(x) \rangle$ of the influence sets reached by
users with leaning $x$, for two different values of the infection probability, with the recovery rate fixed at $\nu = 0.2$. The
size and color of each point indicate the average size of the influence sets.
[Figure 9, panels (a) to (d). Panel (b): Community Size versus Number of Communities; recovered legend entry: Pro Obamacare. Panels (c) and (d): Influence Set Leaning versus Seed Leaning; point size and color: Influence Set Average Size.]
FIG. 9: Obamacare. Panel (a): Individual leaning versus neighborhood leaning. Panel (b): Community detection. Panels (c)
and (d): average leaning $\langle \mu(x) \rangle$ of the influence sets reached by users with leaning $x$, for infection probability
$\beta = 0.1\,\langle k \rangle^{-1}$ and $\beta = 0.2\,\langle k \rangle^{-1}$, respectively, where $\langle k \rangle$ is the average degree of the network.
C. Obamacare
Figure 9 shows the results obtained for the Twitter data set referred to as “Obamacare”, described in Materials
and Methods of the main paper. Panel (a) shows the joint distribution of the leaning of users, $x$, against the average
leaning of their neighborhood, $x^N$, in which two interconnected regions are clearly visible. Panel (b) shows the size
and average leaning of the communities detected by the Louvain algorithm.
Panels (c) and (d) show the results of the SIR dynamics: the average leaning $\langle \mu(x) \rangle$ of the influence sets reached by
users with leaning $x$, for two different values of the infection probability, with the recovery rate fixed at $\nu = 0.2$. The
size and color of each point indicate the average size of the influence sets.
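The infection probabilities are quoted as a coefficient divided by the average degree, for example $\beta = 0.1\,\langle k \rangle^{-1}$. A small helper makes the conversion explicit. It assumes that, for a directed network, $\langle k \rangle$ is the mean out-degree (edges divided by nodes), and twice that for an undirected one; the function name and the node and edge counts are hypothetical.

```python
def infection_probability(coefficient, n_nodes, n_edges, directed=True):
    """Return beta = coefficient / <k>.

    <k> is taken as n_edges / n_nodes for a directed network (mean out-degree)
    and 2 * n_edges / n_nodes for an undirected one (assumptions of this sketch).
    """
    if n_nodes <= 0 or n_edges <= 0:
        raise ValueError("the network needs at least one node and one edge")
    mean_degree = n_edges / n_nodes if directed else 2 * n_edges / n_nodes
    return coefficient / mean_degree

# Hypothetical network with 1000 nodes and 5000 directed edges, so <k> = 5
beta_low = infection_probability(0.1, n_nodes=1000, n_edges=5000)
beta_high = infection_probability(0.2, n_nodes=1000, n_edges=5000)
```

Scaling by $\langle k \rangle^{-1}$ keeps the expected number of transmission attempts per infected user comparable across networks of different density.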
D. The Donald
Figure 10 shows the results obtained for the Reddit data set “The Donald”, described in Materials and Methods of
the main paper. Panel (a) displays the joint distribution of the leaning of users, $x$, against the average leaning of their
neighborhood, $x^N$, showing a single region that spans most of the x-axis and is concentrated around 0.25 on
the y-axis. This region is also characterized by a few peaks of leaning (ranging mainly from Center to Extreme Right),
which are visible in the histogram on the top margin. Panel (b) shows the size and average leaning of the communities
detected by the Louvain algorithm.
Panels (c) and (d) show the results of the SIR dynamics: the average leaning $\langle \mu(x) \rangle$ of the influence sets reached by
users with leaning $x$, for two different values of the infection probability, with the recovery rate fixed at $\nu = 0.2$. The
size and color of each point indicate the average size of the influence sets.
[Figure 10, panels (a) to (d). Panel (b): Community Size versus Community ID; legend: Extreme Right, Extreme Left. Panels (c) and (d): Influence Set Leaning versus Seed Leaning; point size and color: Influence Set Average Size.]
FIG. 10: The Donald. Panel (a): Individual leaning versus neighborhood leaning. Panel (b): Community detection. Panels (c)
and (d): average leaning $\langle \mu(x) \rangle$ of the influence sets reached by users with leaning $x$, for infection probability
$\beta = 0.0067\,\langle k \rangle^{-1}$ and $\beta = 0.013\,\langle k \rangle^{-1}$, respectively, where $\langle k \rangle$ is the average degree of the network.
In this section, we provide additional results for the SIR dynamics run with different parameters on the six data
sets considered in the main paper, namely “Abortion” on Twitter, “Politics” and “News” on Reddit, “Vaccines” and
“News” on Facebook, and Gab.
The results, reported in Fig. 11, are qualitatively identical to those in the main paper; they are reported here rather
than in the main text for the sake of brevity. Details about the parameters used in the simulations are provided in the
caption of Fig. 11.
[Figure 11, six panels. Horizontal axis of each panel: Seed Leaning; point size and color: Influence Set Average Size.]
FIG. 11: Additional results of the SIR dynamics for the six data sets considered in the main paper. Average leaning $\langle \mu(x) \rangle$ of
the influence sets reached by users with leaning $x$, for infection probability $\beta = 0.05\,\langle k \rangle^{-1}$ (Abortion on Twitter, panel (a)),
$\beta = 0.005\,\langle k \rangle^{-1}$ (Politics on Reddit, panel (b)), $\beta = 0.02\,\langle k \rangle^{-1}$ (Vaccines on Facebook, panel (c)), $\beta = 0.025\,\langle k \rangle^{-1}$ (Gab, panel
(d)), $\beta = 0.025\,\langle k \rangle^{-1}$ (News on Facebook, panel (e)), $\beta = 0.01\,\langle k \rangle^{-1}$ (News on Reddit, panel (f)), with the recovery rate fixed at
$\nu = 0.2$. The size and color of each point indicate the average size of the influence sets.
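The points in these panels summarize many seeds at once. One way to produce such a summary is to group seeds by leaning and average both the influence-set leaning and the influence-set size within each group. The sketch below assumes equal-width bins over [−1, 1] and per-seed records of the form (seed leaning, $\mu$, influence-set size); the binning scheme, the number of bins and the toy records are assumptions, not details taken from the paper.

```python
from collections import defaultdict

def binned_influence(records, n_bins=10):
    """Group seeds by leaning and average mu and influence-set size per bin.

    records -- iterable of (seed_leaning, mu, influence_set_size) tuples
    Returns a list of (bin center, mean mu, mean size), ordered by leaning.
    """
    width = 2.0 / n_bins
    bins = defaultdict(list)
    for x, mu, size in records:
        # A leaning of exactly +1 falls into the last bin
        index = min(int((x + 1.0) / width), n_bins - 1)
        bins[index].append((mu, size))
    summary = []
    for index in sorted(bins):
        mus = [m for m, _ in bins[index]]
        sizes = [s for _, s in bins[index]]
        center = -1.0 + (index + 0.5) * width
        summary.append((center, sum(mus) / len(mus), sum(sizes) / len(sizes)))
    return summary

# Toy per-seed results (hypothetical, for illustration only)
records = [(-0.8, -0.6, 4), (-0.2, -0.4, 6), (0.5, 0.7, 10), (1.0, 0.9, 20)]
points = binned_influence(records, n_bins=2)
```

Each tuple in `points` corresponds to one marker: its position is given by the bin center and the mean $\mu$, and its size and color by the mean influence-set size.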