

CMS NO-20869
SAP ID-16362


Task 1
Please attach your database that contains both of your YouTube channels in Gephi
format. This means that you have to transform the two .csv files into a single one with
only three columns.
A combined CSV file for both YouTube channels was created; a screenshot is attached below.
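The merge step can be sketched in a few lines of Python. This is only an illustration: the input layout (one user column and one video column per file), the sample rows, and the choice of `Source`, `Target`, `Type` as the three columns are assumptions, since the task only states that the result must have three columns.

```python
import csv
import io

def combine_for_gephi(*csv_texts):
    """Concatenate two-column (user, video) CSV texts into one Gephi-style edge list."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Assumed three-column layout; the original report does not name its columns.
    writer.writerow(["Source", "Target", "Type"])
    for text in csv_texts:
        reader = csv.reader(io.StringIO(text))
        next(reader, None)  # skip the header row (assumed to be present)
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed rows
            writer.writerow([row[0].strip(), row[1].strip(), "Directed"])
    return out.getvalue()

# Hypothetical sample data standing in for the two channel exports.
channel_a = "user,video\nchannelA,v1\nchannelA,v2\nchannelA,v1\n"
channel_b = "user,video\nchannelB,v2\nchannelB,v3\n"

combined = combine_for_gephi(channel_a, channel_b)
print(combined)
```

Repeated rows are kept as they are here, so that the repetition is still visible when the file is imported.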

Task 2
Attach a screenshot of your "Overview" tab in Gephi, which shows your network after
you ran the "Yifan Hu" Layout algorithm.
Task 3
Take a screenshot of the Data Table of Edges.

Task 4
Calculate the average Degree of your network. Display and analyze all three resulting networks:

A. Degree (Average Degree = 1.038)

B. In-Degree
C. Out-Degree
Take screenshot of your network

Answer the following questions:

1. What is the difference between them?

In this project we identify the relation between two YouTube users by analyzing their
interaction with videos. The three degree types (degree, in-degree and out-degree) differ
as follows.

Degree is the total degree of a node: the sum of its incoming and outgoing connections.

In-degree counts a node's incoming connections from other nodes. In this case the video
nodes are the ones being interacted with by the users, so all nodes with a non-zero
in-degree are videos. The categories therefore show users, videos and common videos.

Out-degree counts a node's outgoing connections to other nodes. In our scenario the users
are the ones interacting, so all nodes with a non-zero out-degree are YouTube users. The
categories therefore show the two users and all videos.
2. How many categories do you get for each?


1. Degree: 5 categories
2. In-Degree: 3 categories
3. Out-Degree: 3 categories

3. Can you make sense of the numbers? They indicate the number of degrees per category for
each of the three measures. Why or why not?
Yes, the number of degrees per category makes sense. Degree is the total degree of a node
and combines both directions. In-degree counts a node's incoming connections from other
nodes. Out-degree counts a node's outgoing connections to other nodes; in our scenario the
users are the ones interacting, so all nodes with a non-zero out-degree are YouTube users.
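The three measures can be illustrated on a toy directed network. The sketch below uses made-up node names (two users `A` and `B`, four videos) rather than the real data. It also takes the average degree as edges divided by nodes, which is an assumption about how the 1.038 figure was obtained for a directed graph.

```python
from collections import Counter

# Hypothetical user -> video edges (not the real channel data).
edges = [("A", "v1"), ("A", "v2"), ("A", "v4"), ("B", "v2"), ("B", "v3")]

nodes = sorted({n for edge in edges for n in edge})
out_degree = Counter(source for source, _ in edges)
in_degree = Counter(target for _, target in edges)

# Total degree is the sum of incoming and outgoing connections.
degree = {n: in_degree[n] + out_degree[n] for n in nodes}

# Number of categories = number of distinct values of each measure.
in_categories = {in_degree[n] for n in nodes}
out_categories = {out_degree[n] for n in nodes}

# Assumed directed-graph convention: average degree = edges / nodes.
average_degree = len(edges) / len(nodes)

print("degree:", degree)
print("in-degree categories:", sorted(in_categories))
print("out-degree categories:", sorted(out_categories))
print("average degree:", round(average_degree, 3))
```

In this toy network only videos have a non-zero in-degree and only users have a non-zero out-degree, which matches the interpretation given above.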

Task 5

1. How many nodes (videos) are shared by both YouTube channels? Count them or calculate

The number of shared nodes (videos) between the sub-networks of the two nodes (users) is 22.
They were identified as the videos with degree 2.

22 shared videos
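The same count can be calculated instead of read off the graph. The following minimal sketch uses hypothetical node names and a toy edge list: it collects the videos of each user and intersects the two sets, which gives the same result as selecting the videos with degree 2.

```python
from collections import defaultdict

# Hypothetical user -> video edges (not the real channel data).
edges = [("A", "v1"), ("A", "v2"), ("A", "v4"), ("B", "v2"), ("B", "v3")]

videos_by_user = defaultdict(set)
for user, video in edges:
    videos_by_user[user].add(video)

# A video is shared when both users link to it.
shared = videos_by_user["A"] & videos_by_user["B"]

print(len(shared), "shared video(s):", sorted(shared))
```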

2. Calculate the network Modularity and take screen shot of the network.

Network Modularity = 0.465

Task 6
Calculate the "Undirected Closeness Centrality" for your network, through "Average Path
Length" (attach network screen shot) and then answer the questions:

Average Path Length = 2.914
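To show what the average path length measures, here is a small sketch that treats a toy network as undirected and runs a breadth-first search from every node. The node names are hypothetical. Closeness is computed as the reciprocal of a node's mean distance to the others, which is one common definition and may differ from what a given Gephi version reports.

```python
from collections import deque

# Hypothetical edges, treated as undirected (not the real channel data).
edges = [("A", "v1"), ("A", "v2"), ("A", "v4"), ("B", "v2"), ("B", "v3")]

neighbors = {}
for a, b in edges:
    neighbors.setdefault(a, set()).add(b)
    neighbors.setdefault(b, set()).add(a)

def distances_from(start):
    """Breadth-first search returning the hop distance to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

total, pairs, closeness = 0, 0, {}
for node in neighbors:
    dist = distances_from(node)
    others = [d for n, d in dist.items() if n != node]
    total += sum(others)
    pairs += len(others)
    # Assumed definition: closeness = 1 / mean distance to reachable nodes.
    closeness[node] = len(others) / sum(others)

average_path_length = total / pairs
print("average path length:", round(average_path_length, 3))
print("closeness:", {n: round(c, 3) for n, c in sorted(closeness.items())})
```

In the toy network the user nodes and the shared video sit closest to everything else, while videos attached to a single user are further away.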

1. How many groups of nodes do you get?

5 groups. We get 5 groups of nodes by partitioning the network by undirected closeness centrality.

2. Please interpret the different groups. Which nodes are part of which group and
Two colors (dark green and orange) identify the users, i.e. the main nodes of each sub-network,
which are closest to the other nodes in their sub-network.
Two colors (light green and purple in the diagram above) identify the videos in each sub-network
that are connected to the main node mentioned above.
One color (blue) identifies the common videos shared between the two sub-networks and their
main nodes.

3. Calculate the "directed Closeness Centrality" for your network, through "Average Path
Length" (also attach a screenshot).

Task 7
Calculate PageRank for your network, a special version of Eigenvector Centrality. Then
answer the following questions: (attach screen shot of the network)

1. How many groups of nodes do you get for PageRank?

We get a total of 4 groups using the PageRank classification of the graph.
2. What do they measure?
The main nodes (users) are categorized in one group.
The common videos are categorized in another.
The remaining 2 groups show the non-common videos of their respective users.

3. Is this useful?
Yes, it is useful for identifying the subgraphs along with their main nodes, and it also
gives an idea of the common elements between subgraphs.
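The grouping described above can be reproduced on a toy network with a textbook power-iteration PageRank. This is a sketch under assumptions: the node names are hypothetical, the damping factor of 0.85 is the conventional default, and nodes without outgoing links spread their rank evenly over all nodes. It is not necessarily identical to Gephi's implementation.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank; dangling nodes share their rank with every node."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links (here: the videos).
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank

# Hypothetical user -> video edges (not the real channel data).
edges = [("A", "v1"), ("A", "v2"), ("A", "v4"), ("B", "v2"), ("B", "v3")]
pr = pagerank(edges)

for node, value in sorted(pr.items(), key=lambda item: -item[1]):
    print(node, round(value, 4))
```

In this toy case the scores also fall into four groups: the shared video ranks highest, the videos of each user form two middle groups, and the two users, which receive no incoming links, rank lowest.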

Attach a screenshot of each if any changes were made to the network.

Task 8
Please attach a screenshot of your "Data Laboratory" tab, now at the end, after you have done
the preceding analysis.

Go to "Data Table > Nodes" (not Edges) and make sure that the "Id" column is completely
readable (not cut off to its right):
1. Take a screenshot of the full size "Data Laboratory" window (not just the part shown in
this excerpt screen shot).
Attached above

2. What does it mean by weight in Data Table in edges Tab?

The weight of an edge shows how many times that link is repeated in the data, and hence
how important one node is to the other.
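As an illustration of this reading of the weight column, the sketch below counts how often each link occurs in a small made-up edge list. It assumes that repeated rows are merged into a single edge whose weight is the number of repetitions, as the answer above describes.

```python
from collections import Counter

# Hypothetical rows from the combined edge list; the first link appears twice.
rows = [
    ("channelA", "v1"),
    ("channelA", "v1"),
    ("channelA", "v2"),
    ("channelB", "v2"),
]

# Each distinct (source, target) pair becomes one edge; its weight is the row count.
weights = Counter(rows)

for (source, target), weight in sorted(weights.items()):
    print(source, "->", target, "weight", weight)
```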
Task 9
Attach a screenshot of your "Overview" tab in Gephi, which shows your network after
you ran the "Force Atlas" Layout algorithm with repulsion strength of 20000.
