Sna Project

SUBJECT: SOCIAL NETWORK ANALYSIS
SUBJECT CODE: MCAE24
TITLE: SOCIAL NETWORK ANALYSIS ON INSTAGRAM NETWORK
By
1MS22MC017 KOSHAL GUPTA
1MS22MC036 SACHIN KUMAR

Description of the Case Study
Problem Statement:
The study revolves around the analysis of Instagram influencers through the lens of social network
analysis. Given a dataset containing information on these influencers, the challenge is to identify the
most influential individuals within the network while considering various centrality measures.
Additionally, the goal is to visually represent the network's structure to gain a deeper understanding
of its dynamics.
Objective:
The primary objective of this case study is twofold:
Centrality Ranking: To determine the most influential Instagram influencers within the network by
computing and comparing various centrality measures, including but not limited to degree centrality,
betweenness centrality, and eigenvector centrality. This objective aims to provide insights into who
holds the most significant influence within the Instagram ecosystem.
Network Visualization: To create informative visual representations of the Instagram influencer
network using Python libraries such as pandas for data management, networkx for network analysis,
and matplotlib for visualization. This objective seeks to visualize the connections and centrality of
influencers within the network, offering a clear and comprehensive view of its structure and influence
patterns.
Methodology
1. Data Collection: Our journey began with the collection of a rich dataset brimming with insights about
Instagram influencers. This treasure trove contained details about their follower relationships, user
profiles, and engagement metrics. It was the raw material upon which our analysis hinged.
2. Data Preprocessing: Before diving into analysis, we had to roll up our sleeves and clean up the data.
We combed through it to deal with any missing values, duplicates, and ensure uniform data formats.
We also selected the most pertinent attributes to make the dataset sleek and agile for analysis.
3. Centrality Computation: The heart of our study lay in computing various centrality measures for each
influencer in the dataset. These measures included degree centrality, which gauged the number of
connections an influencer had, betweenness centrality, which looked at how well an influencer acted
as a bridge in the network, and eigenvector centrality, which assessed an influencer's impact based on
their connections to other influential figures. All these measures were computed separately for each
node in the network.
4. Ranking Influencers: Once the centrality measures were in hand, it was time to rank the influencers.
This step separated the wheat from the chaff, identifying the top influencers for each centrality metric.
The goal was to spotlight those who wielded the most influence according to these rankings.
5. Network Visualization: To truly appreciate the intricate web of relationships among influencers, we
employed the nx.draw function from the networkx library. This nifty tool conjured up vivid
visualizations of the Instagram influencer network. In these visuals, each node represented an
influencer, and the edges between them depicted their connections. We spiced things up by using
different node sizes and colors to convey the influencers' centrality within the network.
6. Insights and Interpretation: Armed with our rankings and visualizations, it was time to play detective.
We scrutinized the results to extract meaningful insights. Who were the powerhouses of Instagram
influence? How were these influencers interconnected? Did some serve as crucial bridges between
different parts of the network? These questions guided our interpretation of the data.
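The ranking step (step 4) can be sketched as follows. This is a minimal illustration, not the notebook's own code: the helper name top_k is hypothetical, and the scores are a handful of degree-centrality values copied from the output reported later in this document.

```python
# Degree-centrality values for a few nodes, copied from the report's output.
scores = {0: 0.25, 1: 0.30, 2: 0.2, 3: 0.2, 4: 0.2, 5: 0.1, 48: 0.05, 87: 0.05}


def top_k(centrality, k=3):
    """Return the k (node, score) pairs with the highest score.

    Ties are broken by the smaller node ID so the result is deterministic.
    (top_k is a hypothetical helper introduced for this sketch.)
    """
    return sorted(centrality.items(), key=lambda item: (-item[1], item[0]))[:k]


ranking = top_k(scores, k=3)
print(ranking)
```

The same helper works for any of the centrality dictionaries that networkx returns, since they all map node IDs to scores.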
snaProject - Jupyter Notebook
In [44]:
import pandas as pd #For reading dataset files

import networkx as nx #For network creation/analysis
import matplotlib.pyplot as plt #For plotting graphs
%matplotlib inline
In [45]:
nodes = pd.read_csv('fromEdge.csv')
edges = pd.read_csv('toEdge.csv')
combined = pd.read_csv('combined.csv')
In [46]:
nodes.head()
Out[46]:
fromEdge
0 0
1 0
2 0
3 0
4 0
In [47]:
edges.head()
Out[47]:
toEdge
0 1
1 2
2 3
3 4
4 5
In [48]:
edges.shape
Out[48]:
(32, 1)
In [49]:
combined = combined.groupby(['fromEdge', 'toEdge']).sum().reset_index()
In [50]:
combined.shape
Out[50]:
(20, 2)
In [51]:
edges.head()
Out[51]:
toEdge
0 1
1 2
2 3
3 4
4 5
In [52]:
#Create undirected graph using edgelist

G = nx.from_pandas_edgelist(combined, source='fromEdge', target='toEdge')
In [53]:
nx.draw(G)
Inference: Built an undirected graph from the combined edge list with
nx.from_pandas_edgelist() and visualized it with nx.draw. Each row of the edge list becomes
one edge, so the same object can be used both for analysis and for a visual check of the
network's structure.
In [54]:
#Create directed graph using edgelist
G_directed = nx.from_pandas_edgelist(combined, source='fromEdge', target='toEdge', create_using=nx.DiGraph())
In [55]:
nx.draw(G_directed)
Inference: Created a directed graph using nx.from_pandas_edgelist() and visualized it with
nx.draw. Each edge now points from the 'fromEdge' node to the 'toEdge' node, so the
direction of every relationship is preserved for the later directed analysis.
In [56]:
nx.info(G)
C:\Users\shaik\AppData\Local\Temp\ipykernel_10864\1064119803.py:1: DeprecationWarning:
info is deprecated and will be removed in version 3.0.
nx.info(G)
Out[56]:
'Graph with 21 nodes and 20 edges'
In [57]:
nx.info(G_directed)
C:\Users\shaik\AppData\Local\Temp\ipykernel_10864\366499004.py:1: DeprecationWarning:
info is deprecated and will be removed in version 3.0.
nx.info(G_directed)
Out[57]:
'DiGraph with 21 nodes and 20 edges'
In [58]:
#Check nodes
G.nodes()
Out[58]:
NodeView((0, 1, 2, 3, 4, 5, 48, 53, 54, 73, 88, 20, 115, 116, 9, 25, 26, 78,
152, 181, 87))
In [59]:
#Check edges
G.edges()
Out[59]:
EdgeView([(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 48), (1, 53), (1, 54),
(1, 73), (1, 88), (2, 20), (2, 115), (2, 116), (3, 9), (3, 25), (3, 26), (4,
78), (4, 152), (4, 181), (5, 87)])
In [60]:
G.edges.data()
Out[60]:
EdgeDataView([(0, 1, {}), (0, 2, {}), (0, 3, {}), (0, 4, {}), (0, 5, {}),
(1, 48, {}), (1, 53, {}), (1, 54, {}), (1, 73, {}), (1, 88, {}), (2, 20,
{}), (2, 115, {}), (2, 116, {}), (3, 9, {}), (3, 25, {}), (3, 26, {}), (4, 78, {}),
(4, 152, {}), (4, 181, {}), (5, 87, {})])
In [61]:
print("Edges::")
print(list(G.edges))
Edges::
[(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 48), (1, 53), (1, 54), (1, 73),
(1, 88), (2, 20), (2, 115), (2, 116), (3, 9), (3, 25), (3, 26), (4, 78), (4,
152), (4, 181), (5, 87)]
In [62]:
print("Nodes::")
print(list(G.nodes))
Nodes::
[0, 1, 2, 3, 4, 5, 48, 53, 54, 73, 88, 20, 115, 116, 9, 25, 26, 78, 152,
181, 87]
In [119]:
#Visualization
nx.draw(G,node_color='r')
Inference: Passing node_color='r' to nx.draw renders every node in red. The graph itself
is unchanged; only its appearance differs from the default drawing.
In [118]:
#To add vertex labels

nx.draw(G, with_labels=True,node_color='r')
Inference: Adding with_labels=True prints each node's ID on the drawing, which makes it
possible to identify individual nodes in the plot, such as the well-connected Nodes 0
and 1.
In [117]:
#To add vertex labels and change color

nx.draw(G, with_labels=True, node_color='r')
In [120]:
nx.draw_networkx(G,node_color='r')
In [121]:
plt.figure(figsize=(8,8))
plt.show()
In [68]:
#Network Centrality Measures

#Degree centrality
In [122]:
G.nodes
Out[122]:
NodeView((0, 1, 2, 3, 4, 5, 48, 53, 54, 73, 88, 20, 115, 116, 9, 25, 26, 78,
152, 181, 87))
In [129]:
G.degree(0)
Out[129]:
5
In [130]:
#Degree plot for undirected and unweighted graph

degrees = [G.degree(n) for n in G.nodes()]
plt.hist(degrees)
Out[130]:
(array([15., 0., 1., 0., 0., 0., 3., 0., 1., 1.]),
array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. ]),
<BarContainer object of 10 artists>)
Inference: The degree histogram for the undirected and unweighted graph shows that 15 of
the 21 nodes have degree 1. Of the remaining six, one node has degree 2, three have
degree 4, one has degree 5, and one has degree 6. Most nodes therefore have a single
connection, while a small subset of nodes is far better connected.
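The histogram counts can be cross-checked without networkx. The sketch below tallies degrees directly from the edge list printed by G.edges() above; the variable names are chosen for illustration only.

```python
from collections import Counter

# Edge list copied from the G.edges() output in this report.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 48), (1, 53), (1, 54),
         (1, 73), (1, 88), (2, 20), (2, 115), (2, 116), (3, 9), (3, 25),
         (3, 26), (4, 78), (4, 152), (4, 181), (5, 87)]

# In an undirected graph each edge adds 1 to the degree of both endpoints.
degree = Counter()
for u, v in EDGES:
    degree[u] += 1
    degree[v] += 1

# Count how many nodes share each degree value.
degree_counts = Counter(degree.values())
for d in sorted(degree_counts):
    print(f"degree {d}: {degree_counts[d]} node(s)")
```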
In [131]:
degree_sequence = sorted([d for n, d in G.degree()], reverse=True)

plt.loglog(degree_sequence,marker='*')
plt.show()
Inference: The log-log plot shows the sorted degree sequence (degree against rank). A few
highly connected nodes (hubs) sit above a long tail of degree-1 nodes, a pattern seen in
many real-world networks. With only 21 nodes, however, the sample is too small to conclude
that the network is scale-free.
In [132]:
#Degree plot for undirected and weighted graph

degrees = [G.degree(n, weight='weight') for n in G.nodes()]
plt.hist(degrees)
Out[132]:
(array([15., 0., 1., 0., 0., 0., 3., 0., 1., 1.]),
array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. ]),
<BarContainer object of 10 artists>)
Inference: The histogram is identical to the unweighted one. The edges carry no 'weight'
attribute (G.edges.data() above returns empty dictionaries), so every edge is treated as
having weight 1 and the weighted degree equals the plain degree: 15 nodes still have a
degree of 1.
In [133]:
degree_sequence = sorted([d for n, d in G.degree(weight='weight')], reverse=True)

plt.loglog(degree_sequence,marker='*')
plt.show()
Inference: The log-log plot of the weighted degree sequence matches the unweighted plot,
again because the edges carry no 'weight' attribute. It shows the same few hubs above a
long tail of degree-1 nodes; as before, 21 nodes are too few to establish a scale-free
structure.
In [134]:
#Degree centrality for unweighted graph

degree_centrality = nx.degree_centrality(G)
degree_centrality
Out[134]:
{0: 0.25,
1: 0.30000000000000004,
2: 0.2,
3: 0.2,
4: 0.2,
5: 0.1,
48: 0.05,
53: 0.05,
54: 0.05,
73: 0.05,
88: 0.05,
20: 0.05,
115: 0.05,
116: 0.05,
9: 0.05,
25: 0.05,
26: 0.05,
78: 0.05,
152: 0.05,
181: 0.05,
87: 0.05}
Inference: The degree centrality analysis of the unweighted graph reveals that a few nodes,
such as Node 1, have high centrality values, signifying their significant connectivity
within the network. Meanwhile, the majority of nodes exhibit lower centrality values,
suggesting a more peripheral role in the graph's structure.
In [136]:
G.degree(0)/(len(G.nodes)-1) #Normalize by n-1, where n = 21 is the number of nodes
Out[136]:
0.25
In [137]:
#Sort for identifying most influential nodes using degree centrality

for node in sorted(degree_centrality, key=degree_centrality.get, reverse=True):
    print(node, degree_centrality[node])
1 0.30000000000000004
0 0.25
2 0.2
3 0.2
4 0.2
5 0.1
48 0.05
53 0.05
54 0.05
73 0.05
88 0.05
20 0.05
115 0.05
116 0.05
9 0.05
25 0.05
26 0.05
78 0.05
152 0.05
181 0.05
87 0.05
Inference: The code provided prints nodes sorted by their degree centrality in descending
order, along with their respective degree centrality values. This allows for a clear
ranking of nodes based on their centrality within the unweighted graph. Nodes 1, 0, 2, 3,
and 4 have the highest degree centrality values, indicating their central roles in the
network, while the remaining nodes exhibit lower centrality values, suggesting less
prominent positions within the graph.
In [138]:
#Calculating degree centrality from scratch

n_nodes = len(G.nodes)
for node in G.nodes():
    print(node, G.degree(node)/(n_nodes-1))
0 0.25
1 0.3
2 0.2
3 0.2
4 0.2
5 0.1
48 0.05
53 0.05
54 0.05
73 0.05
88 0.05
20 0.05
115 0.05
116 0.05
9 0.05
25 0.05
26 0.05
78 0.05
152 0.05
181 0.05
87 0.05
Inference: The code computes and displays the normalized degree centrality for nodes in
the graph. This centrality measure, scaled between 0 and 1, reveals the relative
importance of nodes in the network. Nodes 1, 0, 2, 3, and 4 maintain the highest
normalized degree centrality values, underscoring their significance within the graph.
In [139]:
#Degree centrality for weighted graph

degree = G.degree(weight='weight')
max_degree = max(dict(degree).values())
degree_centrality_weighted = [deg/max_degree for deg in dict(degree).values()]
degree_centrality_weighted
Out[139]:
[0.8333333333333334,
1.0,
0.6666666666666666,
0.6666666666666666,
0.6666666666666666,
0.3333333333333333,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666]
Inference: The code divides each node's weighted degree by the maximum weighted degree in
the network, so the values are relative to the best-connected node rather than to n-1.
Because the edges carry no 'weight' attribute, the weighted degree equals the plain degree
here. Node 1 (degree 6) scores 1.0, Node 0 (degree 5) scores 0.83, and the other nodes are
scaled accordingly.
In [173]:
#visualization
In [141]:
#Closeness centrality for undirected and unweighted graph

closeness_centrality = nx.closeness_centrality(G)
In [142]:
#Sort for identifying most influential nodes using closeness_centrality

for node in sorted(closeness_centrality, key=closeness_centrality.get, reverse=True):
    print(node, closeness_centrality[node])
0 0.5714285714285714
1 0.45454545454545453
2 0.4166666666666667
3 0.4166666666666667
4 0.4166666666666667
5 0.38461538461538464
48 0.31746031746031744
53 0.31746031746031744
54 0.31746031746031744
73 0.31746031746031744
88 0.31746031746031744
20 0.29850746268656714
115 0.29850746268656714
116 0.29850746268656714
9 0.29850746268656714
25 0.29850746268656714
26 0.29850746268656714
78 0.29850746268656714
152 0.29850746268656714
181 0.29850746268656714
87 0.28169014084507044
Inference: The code snippet prints nodes sorted by their closeness centrality values in
descending order. Closeness centrality measures how close a node is to all other nodes in
the network, with higher values indicating nodes that are more central and have shorter
average distances to other nodes. Node 0 has the highest closeness centrality, followed by
nodes 1, 2, 3, and 4. The remaining nodes exhibit lower closeness centrality values,
indicating their relatively more distant positions within the network.
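To make the definition concrete, closeness centrality for a connected, unweighted graph can be recomputed as (n - 1) divided by the sum of shortest-path distances from the node. The sketch below does this with a breadth-first search over the report's edge list; the function names are illustrative and the formula assumes the graph is connected, as it is here.

```python
from collections import deque

# Edge list copied from the G.edges() output in this report.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 48), (1, 53), (1, 54),
         (1, 73), (1, 88), (2, 20), (2, 115), (2, 116), (3, 9), (3, 25),
         (3, 26), (4, 78), (4, 152), (4, 181), (5, 87)]


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Return hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(adj, node):
    """(number of other reachable nodes) / (sum of distances to them)."""
    dist = bfs_distances(adj, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


adj = build_adjacency(EDGES)
for node in (0, 1, 87):
    print(node, closeness(adj, node))
```

For Node 0, five nodes are one hop away and fifteen are two hops away, so the value is 20 / 35, which agrees with the 0.5714 reported above.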
In [172]:
In [148]:
nx.closeness_centrality(G, distance='distance')
Out[148]:
{0: 0.5714285714285714,
1: 0.45454545454545453,
2: 0.4166666666666667,
3: 0.4166666666666667,
4: 0.4166666666666667,
5: 0.38461538461538464,
48: 0.31746031746031744,
53: 0.31746031746031744,
54: 0.31746031746031744,
73: 0.31746031746031744,
88: 0.31746031746031744,
20: 0.29850746268656714,
115: 0.29850746268656714,
116: 0.29850746268656714,
9: 0.29850746268656714,
25: 0.29850746268656714,
26: 0.29850746268656714,
78: 0.29850746268656714,
152: 0.29850746268656714,
181: 0.29850746268656714,
87: 0.28169014084507044}
Inference: The cell recomputes closeness centrality with distance='distance', which would
use an edge attribute of that name as the edge length. The edges have no 'distance'
attribute, so every edge counts as length 1 and the values match the unweighted closeness
centrality above: Node 0 remains the closest to all other nodes, and Node 87 the
farthest.
In [150]:
#betweenness_centrality
betweenness_centrality = nx.betweenness_centrality(G)
In [151]:
#Sort for identifying most influential nodes using betweenness_centrality

for node in sorted(betweenness_centrality, key=betweenness_centrality.get, reverse=True):
    print(node, betweenness_centrality[node])
0 0.8210526315789474
1 0.4473684210526316
2 0.28421052631578947
3 0.28421052631578947
4 0.28421052631578947
5 0.09999999999999999
48 0.0
53 0.0
54 0.0
73 0.0
88 0.0
20 0.0
115 0.0
116 0.0
9 0.0
25 0.0
26 0.0
78 0.0
152 0.0
181 0.0
87 0.0
Inference: The provided code snippet outputs nodes sorted by their betweenness centrality
values. Node 0 has the highest betweenness centrality (0.82), indicating its crucial role
in connecting different parts of the network, followed by Node 1 (0.45), Nodes 2, 3, and 4
(0.28 each), and Node 5 (0.10). The 15 remaining nodes have betweenness centrality values
of 0: each has a single connection, so none of them lies on a shortest path between two
other nodes.
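The betweenness values can be reproduced from first principles. The sketch below is a plain-Python version of Brandes' algorithm for unweighted, undirected graphs, normalized by the number of node pairs (n - 1)(n - 2) / 2. It is offered as an illustration under those assumptions, not as the code networkx runs internally.

```python
from collections import deque

# Edge list copied from the G.edges() output in this report.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 48), (1, 53), (1, 54),
         (1, 73), (1, 88), (2, 20), (2, 115), (2, 116), (3, 9), (3, 25),
         (3, 26), (4, 78), (4, 152), (4, 181), (5, 87)]

adj = {}
for u, v in EDGES:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def betweenness(adj):
    """Normalized betweenness centrality via Brandes' algorithm (unweighted)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Phase 1: BFS from s, counting shortest paths (sigma) and
        # recording each node's predecessors on those paths.
        order = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: accumulate pair dependencies in reverse BFS order.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    # Every unordered pair was counted twice (once from each endpoint), so
    # dividing by (n-1)(n-2) equals halving and dividing by the pair count.
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: b * scale for v, b in bc.items()}


bc = betweenness(adj)
for node in sorted(bc, key=bc.get, reverse=True)[:6]:
    print(node, bc[node])
```

Because this graph is a tree, the result can also be checked by hand: removing Node 0 splits the other 20 nodes into branches of sizes 6, 4, 4, 4, and 2, giving 156 pairs that must pass through Node 0, and 156 / 190 is the 0.8211 reported above.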
In [152]:
nx.draw(G, with_labels=True,node_color='r')
In [154]:
#Eigenvector centrality for unweighted graph
eigenvector_centrality = nx.eigenvector_centrality(G)

#Sort for identifying most influential nodes using eigenvector centrality

for node in sorted(eigenvector_centrality, key=eigenvector_centrality.get, reverse=True):
    print(node, eigenvector_centrality[node])
0 0.5289488503955594
1 0.458655701546643
2 0.2864984819230128
3 0.2864984819230128
4 0.2864984819230128
5 0.20830460149970237
48 0.15893238610193328
53 0.15893238610193328
54 0.15893238610193328
73 0.15893238610193328
88 0.15893238610193328
20 0.09927814565031089
115 0.09927814565031089
116 0.09927814565031089
9 0.09927814565031089
25 0.09927814565031089
26 0.09927814565031089
78 0.09927814565031089
152 0.09927814565031089
181 0.09927814565031089
87 0.07218202734269338
Inference: The provided code snippet outputs nodes sorted by their eigenvector centrality
values in descending order. Node 0 has the highest eigenvector centrality, indicating its
influence and connections to other highly influential nodes in the network. Nodes 1, 2, 3,
and 4 also exhibit significant eigenvector centrality. The remaining nodes, while less
central, still contribute to the network's overall structure, as indicated by their
eigenvector centrality values above zero.
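Eigenvector centrality is the leading eigenvector of the adjacency matrix, which can be approximated by power iteration. The sketch below is one way to do that in plain Python. It multiplies by (A + I) rather than A, an assumption of this sketch made because the graph is a tree (and so bipartite), where iterating with A alone can oscillate instead of converging. The result is scaled to unit Euclidean length.

```python
import math

# Edge list copied from the G.edges() output in this report.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 48), (1, 53), (1, 54),
         (1, 73), (1, 88), (2, 20), (2, 115), (2, 116), (3, 9), (3, 25),
         (3, 26), (4, 78), (4, 152), (4, 181), (5, 87)]

adj = {}
for u, v in EDGES:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def eigenvector_centrality(adj, max_iter=1000, tol=1e-12):
    """Power iteration on (A + I), normalized to unit Euclidean length."""
    x = dict.fromkeys(adj, 1.0 / len(adj))
    for _ in range(max_iter):
        # (A + I) x: keep the node's own value and add its neighbours' values.
        nxt = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        norm = math.sqrt(sum(val * val for val in nxt.values()))
        nxt = {v: val / norm for v, val in nxt.items()}
        # Stop once the vector no longer changes between iterations.
        if sum(abs(nxt[v] - x[v]) for v in adj) < tol:
            return nxt
        x = nxt
    raise RuntimeError("power iteration did not converge")


ec = eigenvector_centrality(adj)
for node in sorted(ec, key=ec.get, reverse=True)[:6]:
    print(node, round(ec[node], 4))
```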
In [155]:
eigenvector_centrality_weighted = nx.eigenvector_centrality(G, weight='weight')
In [156]:
#Sort for identifying most influential nodes using eigenvector centrality

for node in sorted(eigenvector_centrality_weighted, key=eigenvector_centrality_weighted.get, reverse=True):
    print(node, eigenvector_centrality_weighted[node])
0 0.5289488503955594
1 0.458655701546643
2 0.2864984819230128
3 0.2864984819230128
4 0.2864984819230128
5 0.20830460149970237
48 0.15893238610193328
53 0.15893238610193328
54 0.15893238610193328
73 0.15893238610193328
88 0.15893238610193328
20 0.09927814565031089
115 0.09927814565031089
116 0.09927814565031089
9 0.09927814565031089
25 0.09927814565031089
26 0.09927814565031089
78 0.09927814565031089
152 0.09927814565031089
181 0.09927814565031089
87 0.07218202734269338
Inference: The weighted eigenvector centrality values are identical to the unweighted
ones because the edges carry no 'weight' attribute, so every edge is treated as having
weight 1. The ranking is unchanged: Node 0 first, then Node 1, then Nodes 2, 3, and 4,
with all remaining nodes above zero.
In [158]:
#katz_centrality
nx.katz_centrality(G)
Out[158]:
{0: 0.2979457605762414,
1: 0.2983699195906719,
2: 0.2573510713664166,
3: 0.2573510713664166,
4: 0.2573510713664166,
5: 0.21798952770840102,
48: 0.1989415820285419,
53: 0.1989415820285419,
54: 0.1989415820285419,
73: 0.1989415820285419,
88: 0.1989415820285419,
20: 0.19483971256656643,
115: 0.19483971256656643,
116: 0.19483971256656643,
9: 0.19483971256656643,
25: 0.19483971256656643,
26: 0.19483971256656643,
78: 0.19483971256656643,
152: 0.19483971256656643,
181: 0.19483971256656643,
87: 0.1909035652101518}
Inference: The output represents the Katz centrality values for nodes in the graph. Katz
centrality measures the influence of a node based on both its direct connections and its
connections through other influential nodes. Node 1 has the highest Katz centrality, followed
closely by Node 0, indicating their significant influence and connections to other
influential nodes in the network. The remaining nodes also exhibit Katz centrality values,
reflecting their contributions to the network's overall centrality structure.
In [159]:
nx.katz_centrality(G_directed)
Out[159]:
{0: 0.19790730945405377,
1: 0.21769804039945917,
2: 0.21769804039945917,
3: 0.21769804039945917,
4: 0.21769804039945917,
5: 0.21769804039945917,
48: 0.2196771134939997,
53: 0.2196771134939997,
54: 0.2196771134939997,
73: 0.2196771134939997,
88: 0.2196771134939997,
20: 0.2196771134939997,
115: 0.2196771134939997,
116: 0.2196771134939997,
9: 0.2196771134939997,
25: 0.2196771134939997,
26: 0.2196771134939997,
78: 0.2196771134939997,
152: 0.2196771134939997,
181: 0.2196771134939997,
87: 0.2196771134939997}
Inference: The output shows the Katz centrality values for nodes in the directed graph
(`G_directed`). For a directed graph, networkx computes Katz centrality from incoming
edges, so a node scores higher when it is pointed to by nodes that themselves score well.
Node 0 has no incoming edges and therefore has the lowest value (0.198). Nodes 1 to 5
each receive one edge from Node 0 and share the value 0.218. The 15 remaining nodes each
receive one edge from a node in that second group and share the highest value (0.220).
The values are nearly equal because every node except Node 0 has exactly one incoming
edge. To rank nodes by outgoing edges instead, the graph can be reversed with
G_directed.reverse() before computing the centrality.
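The directed Katz values can be checked with a short fixed-point iteration of x_v = alpha * (sum of x_u over edges u -> v) + beta, followed by scaling to unit Euclidean length. The sketch below uses alpha = 0.1 and beta = 1.0, the default parameters of nx.katz_centrality; the function itself is a plain-Python illustration, not the library's implementation.

```python
import math

# Directed edge list (fromEdge -> toEdge), copied from this report.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 48), (1, 53), (1, 54),
         (1, 73), (1, 88), (2, 20), (2, 115), (2, 116), (3, 9), (3, 25),
         (3, 26), (4, 78), (4, 152), (4, 181), (5, 87)]
NODES = sorted({n for edge in EDGES for n in edge})


def katz_centrality(nodes, edges, alpha=0.1, beta=1.0, max_iter=1000, tol=1e-12):
    """Katz centrality over incoming edges, normalized to unit length."""
    preds = {v: [] for v in nodes}
    for u, v in edges:
        preds[v].append(u)  # u points to v, so u contributes to v's score
    x = dict.fromkeys(nodes, 0.0)
    for _ in range(max_iter):
        nxt = {v: alpha * sum(x[u] for u in preds[v]) + beta for v in nodes}
        converged = sum(abs(nxt[v] - x[v]) for v in nodes) < tol
        x = nxt
        if converged:
            break
    else:
        raise RuntimeError("Katz iteration did not converge")
    norm = math.sqrt(sum(val * val for val in x.values()))
    return {v: val / norm for v, val in x.items()}


katz = katz_centrality(NODES, EDGES)
print(katz[0], katz[1], katz[48])
```

Before normalization the three levels are 1.0 (Node 0), 1.1 (Nodes 1 to 5), and 1.11 (the rest), which shows why the scores differ so little. Passing each edge in both directions would give the undirected variant.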
In [169]:
#density
print(nx.density(G))
0.09523809523809523
Inference: The code outputs the density of the graph (`G`), which is approximately 0.0952.
Graph density quantifies the proportion of existing edges in relation to the total number
of possible edges in the graph. In this case, the density value suggests that the graph is
relatively sparse, with only around 9.52% of possible edges present in the network.
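For an undirected graph with n nodes and m edges, density is 2m / (n(n - 1)). The short check below applies that formula to the node and edge counts reported earlier (21 nodes, 20 edges); the function name is illustrative.

```python
def density(n_nodes, n_edges):
    """Density of an undirected simple graph: edges present / edges possible."""
    possible_edges = n_nodes * (n_nodes - 1) / 2
    return n_edges / possible_edges


d = density(21, 20)
print(d)  # 20 of 210 possible edges
```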
In [170]:
#eccentricity
print(nx.eccentricity(G))
{0: 2, 1: 3, 2: 3, 3: 3, 4: 3, 5: 3, 48: 4, 53: 4, 54: 4, 73: 4, 88: 4, 20: 4,
115: 4, 116: 4, 9: 4, 25: 4, 26: 4, 78: 4, 152: 4, 181: 4, 87: 4}
Inference: The output displays the eccentricity values for nodes in the graph. Eccentricity
represents the maximum distance from a node to any other node in the graph. In this case,
Node 0 has an eccentricity of 2, indicating it is relatively close to other nodes in the
network. Nodes 1, 2, 3, 4, and 5 have an eccentricity of 3, suggesting a slightly greater
distance to some nodes. The remaining nodes, from 48 onwards, have an eccentricity of 4,
implying they are farther from the rest of the nodes in the network.
In [171]:
#diameter
print(nx.diameter(G))
4
Inference: The code outputs the diameter of the graph (`G`), which is 4. The diameter is
the longest shortest path between any pair of nodes in the graph. In this case, the diameter
of 4 suggests that the longest path between any two nodes in the network consists of four
edges.
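Eccentricity and diameter both follow from breadth-first search: a node's eccentricity is its largest hop distance to any other node, and the diameter is the largest eccentricity. The sketch below recomputes both from the report's edge list, assuming a connected graph; the function name is illustrative.

```python
from collections import deque

# Edge list copied from the G.edges() output in this report.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 48), (1, 53), (1, 54),
         (1, 73), (1, 88), (2, 20), (2, 115), (2, 116), (3, 9), (3, 25),
         (3, 26), (4, 78), (4, 152), (4, 181), (5, 87)]

adj = {}
for u, v in EDGES:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def eccentricity(adj, source):
    """Largest shortest-path distance from source to any reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


ecc = {node: eccentricity(adj, node) for node in adj}
diameter = max(ecc.values())
print(ecc[0], ecc[1], ecc[87], diameter)
```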
Results:
Degree Centrality:
Node 1 has the highest degree centrality, indicating it has the most connections in the
network.
Nodes 0, 2, 3, and 4 also have relatively high degree centrality, suggesting they are
well-connected influencers within the Instagram network.
The remaining nodes have lower degree centrality values, signifying a less extensive
network of connections.
Closeness Centrality:
Node 0 has the highest closeness centrality, meaning it is relatively close to other
nodes in terms of average distance.
Nodes 1, 2, 3, 4, and 5 also have relatively high closeness centrality values, indicating
their proximity to other influencers in the network.
The rest of the nodes exhibit lower closeness centrality values, implying greater average
distances to other nodes.
Betweenness Centrality:
Node 0 stands out with the highest betweenness centrality, suggesting it plays a pivotal
role in connecting different parts of the network.
Nodes 1, 2, 3, and 4 also have significant betweenness centrality, indicating their
importance as intermediaries in the network; Node 5 has a small value (0.10).
The remaining 15 nodes have zero betweenness centrality, meaning they do not lie on the
shortest paths between other nodes.
Eigenvector Centrality:
Node 0 has the highest eigenvector centrality, highlighting its influence and connections
to other influential nodes.
Nodes 1, 2, 3, 4, and 5 also exhibit significant eigenvector centrality values, indicating
their prominence.
The remaining nodes, while less central, still contribute to the overall network
structure.
Katz Centrality:
In the undirected graph, Node 1 has the highest Katz centrality (0.2984), narrowly ahead
of Node 0 (0.2979).
Nodes 2, 3, and 4 (0.2574 each) and Node 5 (0.2180) follow, and the remaining nodes have
lower values (0.19 to 0.20).
In the directed graph the order is reversed: Node 0, which has no incoming edges, scores
lowest (0.198), and the nodes at the ends of the paths score highest (0.220).
Diameter:
The diameter of the network is 4, signifying that the longest shortest path between any
two nodes in the network consists of four edges.
Eccentricity:
Node 0 has an eccentricity of 2, indicating it is relatively close to other nodes in the
network.
Nodes 1, 2, 3, 4, and 5 have an eccentricity of 3, suggesting a slightly greater distance
to some nodes.
The remaining nodes have an eccentricity of 4, implying they are farther from the rest
of the nodes in the network.
Density:
The density of the graph is approximately 0.0952, suggesting that only around 9.52% of
possible edges are present in the network, indicating a relatively sparse network.
