
DELHI PUBLIC SCHOOL

BOKARO STEEL CITY - 827004

Informatics Practices Project


entitled

Analysis of top Television Shows of India

Submitted By :
Name - AYUSH ANAND
Class - XII / H
Board Roll No -
Certificate
This is to certify that Ayush Anand of class XII – H
has worked successfully under the supervision of
Mrs. Rashmi Sinha during academic year 2023 – 24
on the project “Analysis of top Television Shows of
India” as per the guidelines issued by Central
Board of Secondary Education (CBSE).

Signature of Subject Teacher

Signature of External Examiner

Signature of Principal
Acknowledgment
I am thankful to my IP teacher Mrs. Rashmi Sinha
who helped and guided me while making this
project.

Without her guidance my project would have been
incomplete and imperfect.

The guidance and support received from everyone
who has contributed, and continues to contribute,
to this project were vital for its success.

This acknowledgment is a testament to the
collaborative spirit and support that has defined
the journey of this project. Thank you to everyone
involved for their significant contributions and
dedication.
Introduction
Television serials and family dramas hold a special
place in the hearts of Indians, embodying a rich
cultural heritage. The iconic "Dhum Ta Terenana"
score and enduring "Saas Bahu" dramatic
elements have left an indelible mark on the Indian
entertainment industry. From classics like "Kyunki Saas
Bhi Kabhi Bahu Thi" to modern entries like "Shark
Tank," this industry continuously evolves with
unique creativity. Upon discovering a dataset on
Hindi TV Serials, I felt compelled to analyze it,
aiming to extract intriguing insights from this
culturally significant domain.
Objective
This project aimed to conduct a meticulous
analysis of various aspects within the Indian
television landscape, utilizing curated datasets. The
ensuing insights included:

• Identifying the top 5 Indian TV shows based on
IMDB ratings.
• Assessing artists based on mean IMDB ratings,
spotlighting top performers.
• Profiling artists with the most extensive industry
experience.
• Analyzing genres based on mean IMDB ratings,
highlighting those consistently acclaimed.
• Examining genres based on content abundance.
• Investigating the release frequency of shows
over the years.
• Recognizing the longest-running shows,
acknowledging their enduring impact.
• Comparing the ratings and years of running for
two random TV shows.
In essence, this study provides a structured
exploration of key dimensions within the dynamic
realm of Indian television.
Input/Output Requirement
Hardware Requirements:
• Operating System: Windows 10 or above
• Processor: Pentium (any) or AMD Athlon (3800+ to 4200+ Dual Core)
• RAM: 512 MB+
• Hard Disk: SATA 40 GB or above
• Motherboard: 1.845 or 915,995 for Pentium, or MSI K9MM-V VIA
K8M800+8237R Plus chipset for AMD Athlon

Software Requirements:
• Operating System: Windows
• Programming Language: Python
• Platform: Google Colab
Source Code
Setting up the Environment
I start by importing the modules needed for this project:
• pandas
• numpy
• matplotlib
The dataset is then loaded into the environment with the pandas read_csv()
function.
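As a minimal sketch of this setup step: the snippet below imports the three modules and loads a table with read_csv(). The file name in the comment, the column names ("Name", "IMDB Rating", "Cast", "Genres", "Years") and the three placeholder rows are all assumptions made for illustration; they are not taken from the real dataset.

```python
import io

import matplotlib.pyplot as plt  # used for the charts later on
import numpy as np
import pandas as pd

# Placeholder rows standing in for the real file. In the project itself the
# call would look like pd.read_csv("tv_shows.csv") (hypothetical file name).
sample_csv = io.StringIO(
    "Name,IMDB Rating,Cast,Genres,Years\n"
    'Show A,8.9,"Artist P ,Artist Q","Comedy ,Family",2004-2006\n'
    'Show B,7.1,"Artist Q ,Artist R",Drama,2012-\n'
    "Show C,9.2,Artist P,Comedy,2015\n"
)

df = pd.read_csv(sample_csv)
print(df.shape)   # (rows, columns)
print(df.dtypes)  # column types as inferred by pandas
```

Checking the shape and the inferred column types right after loading is a quick way to confirm that the ratings were read as numbers and not as text.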

The IMDB ratings


The IMDB ratings are important throughout this analysis as a way to judge
the quality and popularity of a TV show wherever applicable. But before we
dive into how other parameters relate to and affect the IMDB rating of a
show, let us look at these ratings on their own.

Top 5 shows by IMDB ratings


We use the sort_values() function to get an output of the top shows according
to their IMDB ratings.
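A minimal sketch of this step, assuming the ratings sit in a column called "IMDB Rating" and the titles in a column called "Name" (both names are assumptions, and the rows are placeholders):

```python
import pandas as pd

# Placeholder data; the real project uses the loaded dataset instead.
df = pd.DataFrame({
    "Name": ["Show A", "Show B", "Show C", "Show D", "Show E", "Show F", "Show G"],
    "IMDB Rating": [8.9, 7.1, 9.2, 6.4, 8.0, 7.7, 5.9],
})

# Sort from highest to lowest rating and keep the first five rows.
top5 = df.sort_values("IMDB Rating", ascending=False).head(5)
print(top5[["Name", "IMDB Rating"]])
```

Passing ascending=False matters here, because sort_values() sorts from lowest to highest by default.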
The Cast and The Artists
Analyzing the cast column can provide some interesting statistics to look at,
but there is a serious problem that limits us from using it to any useful
extent.
The problem is the format in which these values are stored in the dataset.
For example, take the value of the "Cast" column in the row for Shobha
Somnath Ki:
Ashnoor Kaur ,Tarun Khanna ,Joy Rattan Singh Mathur ,Sandeep Arora
This value is troublesome because it is stored as a single string, so it is
not possible to calculate or discern any data for individual cast members.

Cleaning Data: Solving the Cast Problem


We need to convert each cell in the Cast column into a value based on list
syntax, i.e. ["a","b","c",...].
We can implement this by writing a function that takes input in the format
that we have, adds the square brackets and the quotation marks, and returns
it in the format that we need. This is my implementation of such a
function:
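One possible version of such a function is sketched below; it is an illustration and not necessarily the exact code used in the project. Instead of building the bracketed text by hand, it splits the string on commas and returns a real Python list directly. The function name split_cast is hypothetical.

```python
import pandas as pd

def split_cast(cell):
    """Turn 'A ,B ,C' into ['A', 'B', 'C']; missing values become []."""
    if pd.isna(cell):
        return []
    # Split on commas and trim the spaces left around each name.
    return [name.strip() for name in str(cell).split(",") if name.strip()]

df = pd.DataFrame({
    "Cast": [
        "Ashnoor Kaur ,Tarun Khanna ,Joy Rattan Singh Mathur ,Sandeep Arora",
        None,  # a row with no cast information
    ]
})
df["Cast"] = df["Cast"].apply(split_cast)
print(df["Cast"].iloc[0])
```

Returning an empty list for missing cells keeps every value in the column the same type, which makes the later steps simpler.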

Before proceeding, we also need a function that converts these 2D lists
(lists of lists) into 1D lists. For that we will use:
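A small sketch of such a helper, using itertools.chain from the standard library. The name flatten and the placeholder artist names are assumptions for illustration.

```python
from itertools import chain

def flatten(list_of_lists):
    """Join a list of lists into one flat list, keeping the order."""
    return list(chain.from_iterable(list_of_lists))

# Placeholder cast lists, one inner list per show.
casts = [["Artist P", "Artist Q"], ["Artist Q", "Artist R"], []]
all_artists = flatten(casts)
print(all_artists)
```

An artist appears in the flat list once for every show they were in, which is what makes counting appearances possible afterwards.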
Top Rated Artist
Now that we can use the Cast data properly, let's find out which artist has
the best average IMDB rating across the shows they worked in.
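One way to compute this, sketched under the assumption that the Cast column already holds lists and that the ratings column is called "IMDB Rating". It uses DataFrame.explode() to give each artist their own row, which may differ from the approach in the original code. All rows are placeholders.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Show A", "Show B", "Show C"],
    "IMDB Rating": [9.0, 7.0, 7.5],
    "Cast": [["Artist P", "Artist Q"], ["Artist Q"], ["Artist P", "Artist R"]],
})

# One row per (show, artist) pair, then the mean rating per artist.
per_artist = df.explode("Cast")
mean_rating = (
    per_artist.groupby("Cast")["IMDB Rating"]
    .mean()
    .sort_values(ascending=False)
)
print(mean_rating.head())
```

An artist with a single highly rated show can top this list, so it may be worth reading the result alongside the number of shows each artist appears in.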

Most Experienced Artist


Now we move to a more concrete measure: which actor has worked in the most
TV shows. Note that this dataset lists only the leading cast members in the
Cast column, so artists with minor roles are not properly represented in
this analysis.
The code for this step visualizes the distribution of the top ten TV show
cast members by creating a pie chart based on how often each one appears in
the dataset.
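A sketch of how such a chart could be produced, assuming the cast values are already lists. The artist names are placeholders, so the counts below say nothing about the real dataset.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder cast lists, one per show.
cast_lists = pd.Series([
    ["Artist P", "Artist Q"],
    ["Artist P", "Artist R"],
    ["Artist P", "Artist Q"],
    ["Artist S"],
])

# Count how many shows each artist appears in and keep the ten most frequent.
counts = cast_lists.explode().value_counts().head(10)

fig, ax = plt.subplots()
ax.pie(counts.values, labels=counts.index, autopct="%1.0f%%")
ax.set_title("Top cast members by number of shows")
```

In a notebook such as Google Colab the figure is displayed at the end of the cell; in a plain script, plt.show() would be needed.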

Ronit Roy, having worked in 9 shows, comes out as the most experienced
artist in this dataset. No wonder I see him in every other serious
father-type role.
Genre
It's either comedy (the family kind) or drama (also the family kind) with
Indian TV serials. But don't take my word for it; let us see the genre
dynamics of Indian TV for ourselves.

Cleaning Data: Genre


Genres face the same problem as the cast did above: several values are
stored together in one string. The same fix applies, with one small edit to
handle redundancies caused by white-space characters.
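A sketch of the genre cleaning, assuming a column called "Genres" with comma-separated values (the column name, the function name split_genres and the rows are assumptions). Stripping white-space is what stops "Drama" and " Drama" from being treated as two different genres.

```python
import pandas as pd

def split_genres(cell):
    """Split a comma-separated genre string and trim stray spaces."""
    if pd.isna(cell):
        return []
    return [g.strip() for g in str(cell).split(",") if g.strip()]

df = pd.DataFrame({"Genres": ["Comedy, Drama", "Drama ,Family", " Comedy"]})
df["Genres"] = df["Genres"].apply(split_genres)

# After stripping, each genre appears under exactly one spelling.
unique_genres = sorted(set(df["Genres"].explode()))
print(unique_genres)
```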

Most Acclaimed Genre


First let's look at which genre claims the best mean IMDB rating and so
garners the best response.
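The calculation can be sketched the same way as for artists, assuming the Genres column already holds lists. Column names and rows are placeholders.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Show A", "Show B", "Show C"],
    "IMDB Rating": [9.0, 7.0, 8.0],
    "Genres": [["Comedy"], ["Drama", "Family"], ["Comedy", "Drama"]],
})

# A show with two genres contributes its rating to both of them.
mean_by_genre = (
    df.explode("Genres")
    .groupby("Genres")["IMDB Rating"]
    .mean()
    .sort_values(ascending=False)
)
print(mean_by_genre)
```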
Bigger Genre
Next let's look at which genre the creators love the most, that is, the
genre with the most shows built around it.
A visual representation is more suitable here than text output, so we
generate a bar graph using the Series.plot() function.
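A sketch of the counting and plotting, again with placeholder genre lists. The axis labels and title are my own choices for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder genre lists, one per show.
genres = pd.Series([["Comedy", "Family"], ["Drama"], ["Comedy", "Drama"], ["Drama"]])

# Number of shows listed under each genre, largest first.
genre_counts = genres.explode().value_counts()

ax = genre_counts.plot(kind="bar", rot=0, title="Number of shows per genre")
ax.set_xlabel("Genre")
ax.set_ylabel("Number of shows")
plt.tight_layout()
```

Because value_counts() already sorts from most to least frequent, the bars come out in descending order without any extra sorting.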
Release Year
Shows like "Sarabhai vs Sarabhai" were definitely much ahead of their time.
But let's look at how time affected the rest of Indian TV.

Cleaning Data: Years


To make use of the data in the Years column, we need to convert it from its
haphazard original format into a usable form.
I created two new columns based on the Years column:
• First Year: This column tracks the year in which the show started
airing.
• Years Run: This column tracks how long a show ran.
These columns were created with the following code:
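The exact format of the Years column is not shown here, so the sketch below rests on an assumption: values look like "2004-2006" (a closed range), "2012-" (still airing) or "2015" (a single year). The function name parse_years and the current_year default are also hypothetical; treat this as one possible way to build the two columns, not the project's own code.

```python
import re

import pandas as pd

def parse_years(cell, current_year=2024):
    """Return (first_year, years_run) from a string such as '2004-2006'."""
    text = str(cell).strip()
    years = [int(y) for y in re.findall(r"\d{4}", text)]
    if not years:
        return None, None          # nothing recognisable in the cell
    first = years[0]
    if len(years) > 1:
        last = years[-1]           # closed range: use the final year
    elif text.endswith(("-", "–")):
        last = current_year        # open range: treated as still airing
    else:
        last = first               # single year: started and ended that year
    return first, last - first

df = pd.DataFrame({"Years": ["2004-2006", "2012-", "2015"]})
parsed = [parse_years(cell) for cell in df["Years"]]
df["First Year"] = [p[0] for p in parsed]
df["Years Run"] = [p[1] for p in parsed]
print(df)
```

With this definition a show that started and ended in the same year gets a Years Run of 0; adding 1 to count calendar years inclusively would be an equally reasonable choice.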
Longest Running Show
Indian shows like "Sasural Simar Ka" and "Kyunki Saas Bhi Kabhi Bahu Thi"
are infamous for running long enough to be part of a late teenager's life
since birth. So it is natural to ask which show has actually run the
longest.
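Given a "Years Run" column like the one described above, the longest-running show can be found by sorting on it. A minimal sketch with placeholder rows (the column names are assumptions):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Show A", "Show B", "Show C", "Show D"],
    "Years Run": [2, 12, 0, 5],
})

# Sort by run length, longest first, and take the top row.
ranked = df.sort_values("Years Run", ascending=False)
longest = ranked.iloc[0]
print(longest["Name"], longest["Years Run"])
```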

"C.I.D." is no-doubt part of every Indian's life. With iconic characters like ACP
Pradyuman, Abhijit, and Daya, and a premise revolving around crime in India,
it's not a surprise that it had a run of 20 years.
Comparative Insights: Ratings and Years of
Running for Two Random TV Shows
This visual comparison utilizes histograms to succinctly showcase the
ratings and years of running for two randomly selected TV shows.
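A sketch of how two shows could be picked at random and compared. Since each show contributes just one rating and one run length, this sketch draws a grouped bar chart and not a true histogram. The column names and rows are placeholders, and random_state is fixed only so that the same pair is picked on every run.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "Name": ["Show A", "Show B", "Show C", "Show D"],
    "IMDB Rating": [8.9, 7.1, 9.2, 6.4],
    "Years Run": [2, 12, 0, 5],
})

# Pick two distinct shows; random_state makes the pick repeatable.
pair = df.sample(n=2, random_state=1).set_index("Name")

# One group of bars per show, one bar per measure.
ax = pair[["IMDB Rating", "Years Run"]].plot(kind="bar", rot=0)
ax.set_ylabel("Value")
ax.set_title("Ratings and years of running for two random shows")
plt.tight_layout()
```

Ratings and run lengths are on different scales, so the two bars within a group are best compared across shows, not against each other.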
Future Scope
• Expanded Datasets:
Collecting a more extensive dataset with additional
details, such as episode counts, awards, and
viewer demographics, would provide a richer
source for analysis.

• Interactive Visualization:
Create interactive visualizations, perhaps using
tools like Tableau or Plotly, to engage users and
enable them to explore trends dynamically.

• Machine Learning Predictions:
Explore basic machine learning models to predict
IMDb ratings based on factors like cast, genre, and
show duration, providing an introduction to
predictive modeling.

These future enhancements can provide a more
comprehensive and interactive learning experience
for high school students interested in data analysis
and storytelling.
Conclusion
This data analysis provides a snapshot of the
multifaceted world of Hindi TV serials, uncovering
trends in ratings, artist contributions, genre
preferences, and more. Despite some limitations in
data quality, the project offers valuable insights
into the diverse landscape of Indian television. The
journey through cleaning data, exploring trends,
and drawing meaningful conclusions enhances the
understanding of data analysis and storytelling in
the context of entertainment data.
