YouTube Channel Data Management and Analysis System
INTRODUCTION
This project provides a data management and analysis system for the top 15 YouTube channels.
It uses Python with the Pandas, NumPy and Matplotlib libraries to handle the dataset, manipulate
the data and generate visualizations. The dataset, read from a CSV file named “project.csv”,
contains the rank, name, views, category, country, language and number of subscribers
for each channel.
Data Description
The “project.csv” file comprises a comprehensive dataset representing the top 15 YouTube
channels worldwide. The dataset encompasses the following columns:
Rank: Ranking position of the channel
Name: Name of the YouTube channel
Views: Number of views in billions
Category: Category or genre of the channel
Country: Country where the channel is based
Language: Primary language used in the channel's content
Subscribers: Number of subscribers in millions
Project.csv
Rank | Name                | Views | Category       | Country              | Language | Subscribers
1    | T-series            | 244   | Music and film | India                | Hindi    | 245
2    | MrBeast             | 163   | Entertainment  | United States        | English  | 164
3    | Cocomelon           | 161   | Education      | United States        | English  | 161
4    | Sony Ent. Tel. Ind. | 158   | Entertainment  | India                | Hindi    | 158
5    | Kids Diana Show     | 112   | Entertainment  | Ukraine              | English  | 112
6    | PewDiePie           | 111   | Entertainment  | Sweden               | English  | 111
7    | Like Nastya         | 106   | Entertainment  | Russia-United States | English  | 106
8    | Vlad and Niki       | 98.7  | Entertainment  | Russia               | English  | 98.7
9    | Zee Mu. Co.         | 96    | Music          | India                | Hindi    | 96.2
10   | WWE                 | 95.6  | Sports         | United States        | English  | 95.7
11   | Black               | 89.5  | Music          | South Korea          | Korean   | 89.5
12   | Goldmines           | 86.6  | Film           | India                | Hindi    | 86.6
13   | Sony SAB            | 82.3  | Entertainment  | India                | Hindi    | 82.4
14   | 5-Minute Crafts     | 80.1  | How-to         | Cyprus               | English  | 80.1
15   | BangtanTV           | 75.4  | Music          | South Korea          | Korean   | 75.5
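To make the column layout concrete, the following minimal sketch reads three rows copied from the table above out of an in-memory string and inspects the resulting DataFrame. The string is only a stand-in for “project.csv”; it is an assumption that the real file has a header row with exactly these column names and uses commas as separators.

```python
import io

import pandas as pd

# Three rows copied from the table above. The header names and the comma
# separator are assumptions about how "project.csv" is laid out.
SAMPLE_CSV = """Rank,Name,Views,Category,Country,Language,Subscribers
1,T-series,244,Music and film,India,Hindi,245
2,MrBeast,163,Entertainment,United States,English,164
3,Cocomelon,161,Education,United States,English,161
"""

sample = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(list(sample.columns))  # column names
print(sample.dtypes)         # Rank, Views and Subscribers parse as numeric columns
print(sample.shape)          # (number of rows, number of columns)
```

Checking the data types early matters because the sum, mean, maximum and minimum operations only make sense on the numeric columns.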
Functionality
The project offers a user-friendly and intuitive menu-driven interface, presenting various options
to interact with the dataset:
Display all records: This option allows users to view the complete dataset of YouTube channels,
providing a comprehensive overview.
Add a new record: Users can seamlessly add a new channel to the dataset by entering relevant
details such as rank, name, views, category, country, language and subscribers.
Delete a record: This feature empowers users to remove a specific channel from the dataset
based on its index, ensuring data accuracy and integrity.
Update a record: Users can conveniently update the details of any channel in the dataset,
ensuring that the information remains up-to-date and relevant.
Functioning on columns: This functionality enables users to perform various operations
on columns, including displaying a single column, displaying multiple columns, and
calculating the sum, mean, maximum or minimum value of a column.
Search: Users can easily search for channels based on their rank or name, facilitating efficient
data retrieval and analysis.
Data Visualization: This option allows users to visually represent the data using bar charts or
pie charts, facilitating better data understanding and insights.
Data Analytics: Users can delve into data analytics by accessing descriptive statistics and
correlation analysis for numeric columns in the dataset, providing valuable insights into the
YouTube channel landscape.
Exit: This option gracefully terminates the program.
Implementation
The project provides the following functionalities through a user-friendly menu-driven interface:
Retrieve Data: Users can retrieve and view the entire dataset, which displays information about
the top 15 YouTube channels.
DataFrame Statistics: Users can obtain various statistics about the dataset, including displaying
column names, indexes, data types, shapes, sizes, transpose, and dimensions. This allows users
to understand the structure and characteristics of the dataset.
Show the Records: Users can choose from options such as displaying the top 5 records, the
bottom 5 records, a specific number of records from the start or end or details of all the channels.
This enables users to explore and analyze specific subsets of the dataset.
Functioning on Records: Users can insert, delete and update records in the dataset. This
functionality allows users to modify and maintain the dataset with the latest information about
YouTube channels.
Functioning on Columns: Users can insert new columns or delete specific columns from the
dataset. This flexibility enables users to add additional information or remove irrelevant columns
based on their analysis needs.
Search Specific Row/Column: Users can search for specific rows or columns based on index
values or column names. This feature facilitates targeted data retrieval and analysis.
Data Visualization: Users can generate line charts, vertical bar charts, horizontal bar charts, and
histograms to visualize the data. These visualizations provide insights into views and subscribers
of YouTube channels and aid in data interpretation.
Data Analytics: Users can perform data analytics operations such as identifying the channel
with the most or fewest views and subscribers. This information helps in understanding channel
performance and making data-driven decisions.
Exit: Users can choose to exit the program when they have completed their analysis.
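The record operations and the search described above can be sketched with a few small helper functions. This is an illustrative arrangement rather than the project's own code: the helper names (`insert_record`, `delete_record`, `update_record`, `search_by_name`) are hypothetical, the column names are assumed to match the header of “project.csv”, and the updated value 162.0 is made up for the demonstration.

```python
import pandas as pd

# Two illustrative rows; the column names are assumed to match "project.csv".
df = pd.DataFrame(
    {
        "Rank": [1, 2],
        "Name": ["T-series", "MrBeast"],
        "Views": [244.0, 163.0],
        "Subscribers": [245.0, 164.0],
    }
)


def insert_record(frame, record):
    """Return a new frame with `record` (a dict) appended as the last row."""
    return pd.concat([frame, pd.DataFrame([record])], ignore_index=True)


def delete_record(frame, index):
    """Return a new frame without the row at the given index label."""
    return frame.drop(index=index).reset_index(drop=True)


def update_record(frame, index, column, value):
    """Return a copy of the frame with a single cell replaced."""
    updated = frame.copy()
    updated.loc[index, column] = value
    return updated


def search_by_name(frame, name):
    """Return the rows whose Name matches `name`, ignoring case."""
    return frame[frame["Name"].str.lower() == name.lower()]


df = insert_record(
    df, {"Rank": 3, "Name": "Cocomelon", "Views": 161.0, "Subscribers": 161.0}
)
df = update_record(df, 2, "Views", 162.0)  # hypothetical new value
print(search_by_name(df, "cocomelon"))
df = delete_record(df, 0)
print(df)
```

Each helper returns a new DataFrame instead of modifying its argument, which keeps the original data intact if a menu option is cancelled halfway through.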
Implementation Details
The project is implemented in Python, using the Pandas library to read the data from the
“project.csv” file and store it in a DataFrame called “df”. The code follows a modular structure and
uses a while loop to present the main menu to the user. The user's input is used to navigate
through the different functionalities and perform the desired operations on the dataset.
Pandas is used for data manipulation and analysis, NumPy for numerical computations,
and Matplotlib for data visualization.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv("project.csv")
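The while-loop menu described in this section can also be organised around a lookup table of handler functions. The sketch below is one possible arrangement, not the project's actual code: `MENU`, `EXIT_CHOICE`, `run_menu`, `show_all`, `show_columns` and the `read_input` parameter are illustrative names, and only two of the menu options are filled in.

```python
import pandas as pd


def show_all(frame):
    """Print the complete dataset."""
    print(frame)


def show_columns(frame):
    """Print the column names."""
    print(list(frame.columns))


# Label and handler for each menu choice; only two options are sketched here.
MENU = {
    1: ("Retrieve Data", show_all),
    2: ("Display All Column Names", show_columns),
}
EXIT_CHOICE = 9


def run_menu(frame, read_input=input):
    """Show the menu repeatedly until the user picks the exit option."""
    while True:
        print("Main Menu")
        for number, (label, _) in MENU.items():
            print(f"{number}. {label}")
        print(f"{EXIT_CHOICE}. Exit")
        raw = read_input("Enter your choice: ")
        try:
            choice = int(raw)
        except ValueError:
            print("Please enter a number.")
            continue
        if choice == EXIT_CHOICE:
            break
        entry = MENU.get(choice)
        if entry is None:
            print("Invalid choice.")
            continue
        entry[1](frame)


# Non-interactive demonstration: scripted answers replace keyboard input.
demo = pd.DataFrame({"Rank": [1], "Name": ["T-series"]})
scripted = iter(["2", "x", "9"])
run_menu(demo, read_input=lambda prompt: next(scripted))
```

In normal use the call would be `run_menu(df)`, which falls back to the built-in `input`. Passing the input function as a parameter is a design choice that lets the loop be exercised without a keyboard.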
Results and Outputs
The project provides the following outputs and results:
Display of Dataset: Users can view the complete dataset containing information about the top 15
YouTube channels.
DataFrame Statistics: Users can obtain statistics about the dataset, including column names,
indexes, data types, shapes, sizes, transpose and dimensions.
Record Display: Users can view specific subsets of records, such as the top or bottom records
or a specific number of records from start or end. They can also view details of all the channels.
Record Manipulation: Users can insert, delete and update records in the dataset, allowing for
data modification and maintenance.
Column Operations: Users can insert new columns or delete specific columns from the dataset,
enhancing the dataset's information based on their analysis requirements.
Data Visualization: Users can generate line charts, vertical bar charts, horizontal bar charts and
histograms to visualize views and subscribers of YouTube channels, gaining insights into channel
performance and trends.
Data Analytics: Users can extract information such as channels with the highest or least views
and subscribers, providing valuable insights for decision-making.
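The analytics and visualization outputs listed above can be illustrated with a short sketch. It uses four rows copied from the dataset, assumes the column names match the header of “project.csv”, and selects the non-interactive `Agg` backend so that it runs without a display; none of this is taken from the project's own code.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Four rows copied from the dataset; the column names are assumed.
df = pd.DataFrame(
    {
        "Name": ["T-series", "MrBeast", "Cocomelon", "Goldmines"],
        "Views": [244.0, 163.0, 161.0, 86.6],
        "Subscribers": [245.0, 164.0, 161.0, 86.6],
    }
)

# Channel with the most views and channel with the fewest subscribers.
most_viewed = df.loc[df["Views"].idxmax(), "Name"]
least_subscribed = df.loc[df["Subscribers"].idxmin(), "Name"]
print("Most views:", most_viewed)
print("Fewest subscribers:", least_subscribed)

# Descriptive statistics and pairwise correlation of the numeric columns.
print(df[["Views", "Subscribers"]].describe())
print(df[["Views", "Subscribers"]].corr())

# Vertical bar chart of views per channel.
fig, ax = plt.subplots()
ax.bar(df["Name"], df["Views"])
ax.set_xlabel("Channel")
ax.set_ylabel("Views (billions)")
ax.set_title("Views per channel")
plt.close(fig)
```

In an interactive session, dropping the `matplotlib.use("Agg")` line and calling `plt.show()` instead of `plt.close(fig)` would display the chart on screen.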
Coding
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv("project.csv")
# Main Menu
while True:
    print("Main Menu")
    print("1. Retrieve Data")
    print("2. Dataframe Statistics")
    print("3. Show the Records")
    print("4. Functioning on Records")
    print("5. Functioning on Columns")
    print("6. Search Specific Row/Column")
    print("7. Data Visualization")
    print("8. Data Analytics")
    print("9. Exit")
    a = int(input("Enter your choice: "))
    if a == 1:
        # Option 1: show the whole dataset
        print(df)
    elif a == 2:
        # Option 2: submenu for structural information about the DataFrame
        while True:
            print("Dataframe Statistics")
            print("1. Display All Column Names")
            print("2. Display All the Indexes")
            print("3. Display Datatypes of Columns")
            print("4. Display the Shape")
            print("5. Display the Size")
            print("6. Display the Transpose")
            print("7. Display the Dimension")
            print("8. Exit")