
A project entitled

COVID-19 Pandemic Analysis System


IN
INFORMATICS PRACTICES (065)
For Session 2022-2023

GUIDED BY: Mrs. Mitali Bansal
SUBMITTED BY: Shubham Shandily
ROLL NO.:

SARVODYA BAL VIDYALYA
BT-BLOCK, SHALIMAR BAGH, DELHI-110088



INDEX
S.NO. TOPIC
1 Certificate
2 Acknowledgement
3 Aim & Problem Definition, Front-End, Back-End, Operating System
4 Hardware/Software configuration required
5 Introduction to Project
6 Overview of Python 3.7.3
7 Overview of MySQL 5.1
8 Database and Table Design & sample data
9 Source code and Output
10 Bibliography

CERTIFICATE
This is to certify that SHUBHAM SHANDILY, Roll No. ______,
of Class XII-F, Session 2022-23, has prepared the project
file on the topic

COVID-19 Pandemic Analysis System

as per the prescribed syllabus of
INFORMATICS PRACTICES (065), CLASS XII (C.B.S.E.),
under my supervision. I am completely satisfied with the
performance.
I wish him/her all the success in life.

Principal's signature

Subject teacher's signature

External's signature
ACKNOWLEDGEMENT
It gives me great pleasure to express my
gratitude towards our IP teacher,
Mrs. Mitali Bansal, for her guidance, support
and encouragement throughout the
development of this project (COVID-19 Pandemic
Analysis System).

This project is an original piece of work.

I would also like to thank my Principal, Mr.
Virender Yadav, for his motivation; without his
help this project could not have been brought
to its present form.

By: RNo-
AIM:
To develop a
COVID-19 Pandemic Analysis System
FRONT END:
Python 3.7.3

BACK END:
MySQL Server 5.1

OPERATING SYSTEM:
MS Windows 7

HARDWARE & SOFTWARE REQUIREMENTS :


Hardware Requirement
Pentium 3/4/Core 2 Duo/Dual Core/i3/i5/i7 with at least 256 MB RAM, 2 MB free
space on hard disk, and a colour monitor/LCD

Operating System & Compiler


MS Windows 7 or above
Python with related libraries used for Data Analysis

Open Source Software Being Used :


1. Python 3.7.3
• Pandas
• Matplotlib

PANDAS:
Pandas is a software library written for the Python programming language for data
manipulation and analysis. In particular, it offers data structures and operations for
manipulating numerical tables and time series.
To import this library:
import pandas as pd
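
As a minimal sketch of the table manipulation described above, the snippet below builds a small DataFrame and totals one column per group. The column names mirror those used later in this project; the numbers are made-up placeholders, not real COVID-19 figures.

```python
import pandas as pd

# Toy records (placeholder numbers, not real data)
df = pd.DataFrame({
    "location": ["A", "A", "B", "B"],
    "new_cases": [10, 20, 5, 15],
})

# Total new cases per location
totals = df.groupby("location")["new_cases"].sum()
print(totals)
```

The same group-then-sum pattern is what the charting functions in the source code rely on to turn daily rows into per-country totals.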

MATPLOTLIB
Matplotlib is a plotting library for the Python programming language and its
numerical mathematics extension NumPy.
To import this library:

import matplotlib.pyplot as plt

import numpy as np
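
A minimal sketch of how the two libraries work together is given below: NumPy supplies the x positions and Matplotlib draws the line. The values are placeholders, and the non-interactive "Agg" backend and the in-memory buffer are assumptions made so that the sketch runs without opening a window or writing a file.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(5)   # positions 0..4 on the x-axis
y = x ** 2         # placeholder values to plot

fig, ax = plt.subplots()
ax.plot(x, y, label="y = x squared", marker="o")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Render the figure to memory as a PNG image
buf = io.BytesIO()
fig.savefig(buf, format="png")
```

In an interactive session, plt.show() would be called instead of saving to a buffer, as the project's own functions do.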

2. MySQL Server 5.1

INTRODUCTION TO COVID-19 PANDEMIC ANALYSIS SYSTEM

Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered
coronavirus.

Most people infected with the COVID-19 virus will experience mild to moderate respiratory
illness and recover without requiring special treatment.  Older people, and those with
underlying medical problems like cardiovascular disease, diabetes, chronic respiratory
disease, and cancer are more likely to develop serious illness.

The best way to prevent and slow down transmission is to be well informed about the
COVID-19 virus, the disease it causes and how it spreads. Protect yourself and others
from infection by washing your hands or using an alcohol-based rub frequently and not
touching your face.
The COVID-19 virus spreads primarily through droplets of saliva or discharge from the
nose when an infected person coughs or sneezes, so it’s important that you also practice
respiratory etiquette (for example, by coughing into a flexed elbow).

When the data used in this project was collected (May 2020), there were no specific
vaccines or treatments for COVID-19, although many clinical trials were evaluating
potential treatments. Vaccines have since become available, and WHO continues to
provide updated information as clinical findings become available.

OVERVIEW OF PYTHON
Python is a high-level, interpreted, interactive and object-oriented scripting language.
Python is designed to be highly readable. It uses English keywords frequently, whereas
other languages use punctuation, and it has fewer syntactical constructions than other
languages.
• Python is Interpreted − Python is processed at runtime by the interpreter. You do
not need to compile your program before executing it. This is similar to Perl and
PHP.
• Python is Interactive − You can sit at a Python prompt and interact with
the interpreter directly to write your programs.
• Python is Object-Oriented − Python supports the object-oriented style or technique
of programming that encapsulates code within objects.
• Python is a Beginner's Language − Python is a great language for beginner-level
programmers and supports the development of a wide range of applications,
from simple text processing to WWW browsers to games.
History of Python
Python was developed by Guido van Rossum in the late eighties and early nineties at the
National Research Institute for Mathematics and Computer Science in the Netherlands.
Python is derived from many other languages, including ABC, Modula-3, C, C++, Algol-68,
SmallTalk, and Unix shell and other scripting languages.
Python is copyrighted. Its source code is freely available under the Python Software
Foundation License, an open-source licence that is compatible with the GNU General
Public License (GPL).
Python is now maintained by a team of core developers, and its intellectual property is
managed by the Python Software Foundation. Guido van Rossum, the language's creator,
directed its progress for many years.

Python Features
Python's features include −
• Easy-to-learn − Python has few keywords, a simple structure, and a clearly defined
syntax. This allows the student to pick up the language quickly.
• Easy-to-read − Python code is more clearly defined and visible to the eyes.
• Easy-to-maintain − Python's source code is fairly easy to maintain.
• A broad standard library − The bulk of Python's library is very portable and
cross-platform compatible on UNIX, Windows, and Macintosh.
• Interactive Mode − Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
• Portable − Python can run on a wide variety of hardware platforms and has the
same interface on all platforms.
• Extendable − You can add low-level modules to the Python interpreter. These
modules enable programmers to add to or customize their tools to be more
efficient.
• Databases − Python provides interfaces to all major commercial databases.
• GUI Programming − Python supports GUI applications that can be created and
ported to many system calls, libraries and windowing systems, such as Windows
MFC, Macintosh, and the X Window System of Unix.
• Scalable − Python provides a better structure and support for large programs than
shell scripting.
Apart from the above-mentioned features, Python has many other good features; a few are
listed below −
• It supports functional and structured programming methods as well as OOP.
• It can be used as a scripting language or can be compiled to byte-code for building
large applications.
• It provides very high-level dynamic data types and supports dynamic type
checking.
• It supports automatic garbage collection.
• It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.
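
Several of the features listed above (object-oriented, functional and structured styles, plus dynamic typing) can be seen side by side in a short sketch. The class name CaseCounter and the numbers are hypothetical and chosen only for illustration.

```python
from functools import reduce


class CaseCounter:
    """Object-oriented style: data and the code that uses it live together."""

    def __init__(self):
        self.daily = []

    def add(self, count):
        self.daily.append(count)

    def total(self):
        return sum(self.daily)


# Dynamic typing: the same name can refer to values of different types
value = 42
value = "forty-two"

# Functional style: map and reduce instead of explicit loops
doubled = list(map(lambda n: n * 2, [1, 2, 3]))
summed = reduce(lambda a, b: a + b, doubled)

# Structured style: a plain loop feeding the object
counter = CaseCounter()
for n in doubled:
    counter.add(n)

print(doubled, summed, counter.total(), type(value).__name__)
```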

OVERVIEW OF MYSQL
A database system is basically a computer-based record-keeping system. The
collection of data, usually referred to as the database, contains information
about one particular enterprise. In a typical file-processing system, permanent
records are stored in various files, and a number of different application programs
are written to extract records from, and add records to, the appropriate
files. This leads to problems such as duplicated and inconsistent data. A database
management system is the answer to all these problems, as it provides
centralized control of the data.

Various advantages of a database system are:

Database systems reduce data redundancy (data duplication) to a large extent.

Database systems control data inconsistency to a large extent.

Databases facilitate sharing of data.

Databases enforce standards.

Centralized databases can ensure data security.

Integrity can be maintained through databases.


MySQL is a freely available, open-source Relational Database Management System
(RDBMS) that uses Structured Query Language (SQL). It can be downloaded from the
site WWW.MYSQL.ORG. In a MySQL database, information is stored in
tables. MySQL provides a rich set of features that support a secure
environment for storing, maintaining and accessing data. MySQL is a fast,
reliable, scalable alternative to many of the commercial RDBMSs available
today.
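
The idea of storing information in tables and querying it with SQL can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for a MySQL server so that the sketch runs without one, the table name covid_summary is hypothetical, and the rows are placeholders rather than real figures. With mysql.connector the parameter marker is %s instead of ?.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A table with the same three columns this project exports
cur.execute(
    "CREATE TABLE covid_summary ("
    "location VARCHAR(30), total_cases INTEGER, total_deaths INTEGER)"
)

# Placeholder rows, passed as query parameters rather than pasted into the SQL
rows = [("CountryA", 100, 4), ("CountryB", 250, 9)]
cur.executemany("INSERT INTO covid_summary VALUES (?, ?, ?)", rows)
conn.commit()

cur.execute(
    "SELECT location, total_cases FROM covid_summary ORDER BY total_cases DESC"
)
result = cur.fetchall()
print(result)
conn.close()
```

Passing values as parameters keeps user input out of the SQL text, which is why the sketch avoids building the INSERT statement by string concatenation.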

MySQL was created by MySQL AB, a company based in Sweden (www.mysql.com). MySQL AB
became a subsidiary of Sun Microsystems in 2008, which then held the copyright
to most of the code base. On April 20, 2009, Oracle Corp., which develops and sells the
proprietary Oracle Database, announced a deal to acquire Sun Microsystems. The
acquisition was completed in January 2010, so MySQL is now owned by Oracle.

DATABASE & TABLE DESIGN


SAMPLE DATA:
SOURCE CODE
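import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import sys
import mysql.connector as conn
import xlrd   # pandas uses xlrd to read .xls files, so it must be installed

# Load the dataset once; the analysis menu works on this DataFrame
df=pd.read_csv("C:\\Users\\pc\\Desktop\\IP Classes XII 2020-21\\Project Covid-19\\full_data.csv")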

#----------------------------------------
# Function to display the main menu
#----------------------------------------
def MenuSet():
    # the menu repeats until the user confirms option 6
    while True:
        print()
        print("============================================")
        print(" COVID-19 PANDEMIC ANALYSIS SYSTEM")
        print("********************************************")
        print("1- Data Visualization\n")
        print("2- Analysis\n")
        print("3- Read csv/excel file\n")
        print("4- Export/Import to/from MySQL\n")
        print("5- Manipulation\n")
        print("6- Exit")
        print("============================================")
        opt=input("Enter your choice: ")
        if opt=='1':
            visuals()
        elif opt=='2':
            analysis()
        elif opt=='3':
            read_csv_excel()
        elif opt=='4':
            exp_imp_sql()
        elif opt=='5':
            manipulation()
        elif opt=='6':
            my_chance=input("Do you really want to exit?(y/n)")
            if my_chance=='y' or my_chance=='Y':
                print("Thank you. Exiting now.......")
                sys.exit()
        else:
            print("\nInvalid choice. Try again")

# Main Program
# MenuSet() uses functions that are defined further down, so it must be
# called only at the very end of the file, after all of them exist.
#------------------------------------------
# Function to Plot graphs
#------------------------------------------
def visuals():
    # the menu repeats until the user confirms option 6
    while True:
        print()
        print("============================================")
        print(" Data Visualization Menu-Top 10 Countries")
        print("============================================")
        print("1- Line Chart- Daily New Cases\n")
        print("2- Pie Chart-Death\n")
        print("3- Bar Chart Total Test vs Confirmed\n")
        print("4- Bar Chart Total cases vs Recovered vs Active\n")
        print("5- Bar Chart Total cases vs death\n")
        print("6- Exit\n")
        print("============================================")
        opt1=input("Enter your choice: ")
        if opt1=='1':
            line_chart()
        elif opt1=='2':
            pie_chart()
        elif opt1=='3':
            bar1()
        elif opt1=='4':
            bar2()
        elif opt1=='5':
            bar3()
        elif opt1=='6':
            chance=input("Do you really want to exit and go back to Main Menu?(y/n)")
            if chance=='y' or chance=='Y':
                print("Exiting.......")
                break
        else:
            print("\nInvalid input. Try again")

#--------------------------------------------------------
# Function to Plot a Line graph for Daily cases & daily
# deaths date wise
#--------------------------------------------------------
def line_chart():
    df=pd.read_csv("C:\\Users\\pc\\Desktop\\IP Classes XII 2020-21\\Project Covid-19\\full_data.csv")
    while True:
        # list the country names available in the file
        for c in df['location'].unique():
            print(c,end=' ')
        cname=input("Enter country Name : ")
        df1=df.loc[df['location']==cname]
        if not df1.empty:
            x=np.arange(len(df1))    # one point per date
            y1=df1['new_cases']
            y2=df1['new_deaths']
            plt.plot(x,y1,label="Daily Cases")
            plt.plot(x,y2,label="Daily Deaths")
            plt.title('COVID-19 Analysis\nTop 11 Countries as on 24 May 2020\n'+cname,
                      color='red', fontsize=10)
            plt.legend()
            plt.show()
            break
        else:
            print("Country Name is incorrect. Try again")

#-------------------------------------------------------
# Function to Plot Pie Chart of Total deaths country wise
#--------------------------------------------------------
def pie_chart():
    df=pd.read_csv("C:\\Users\\pc\\Desktop\\IP Classes XII 2020-21\\Project Covid-19\\full_data.csv")
    df=df.set_index(['location'])
    df1=df.iloc[:,[1,2]]
    # total cases and deaths per location (columns selected with a list)
    g=df1.groupby('location')[['new_cases','new_deaths']].sum()
    # keep the 12 largest totals, then drop the largest row (position 11)
    final_df=g.sort_values(by='new_cases').tail(12)
    final_df.reset_index(inplace=True)
    final_df.columns=['location','Total Cases','Total Deaths']
    final_df=final_df.drop(11,axis='index')
    countries=final_df['location']
    tdeath=final_df['Total Deaths']
    plt.pie(tdeath,labels=countries,explode=(0.1,0,0,0,0,0,0,0,0,0,0.2),
            shadow=True,autopct='%0.1f%%')
    plt.title("Covid-19 Death Analysis\nTop 10 Countries",
              color='b', fontsize=12)
    # the window title is set through the canvas manager
    plt.gcf().canvas.manager.set_window_title('Covid-19,Deaths')
    plt.show()

#-----------------------------------------------------------
# Function to Plot bar graph Total Tests vs Total Confirmed
#-----------------------------------------------------------
def bar1():
    df=pd.read_csv("C:\\Users\\pc\\Desktop\\IP Classes XII 2020-21\\Project Covid-19\\covid_28may.csv")
    x=np.arange(len(df))    # one bar group per row of the file
    countries=df['country']
    ttest=df['total_test']
    tconf=df['total_cases']
    plt.bar(x-0.25,ttest,label='Total Tests',width=0.5, color='k')
    plt.bar(x+0.25,tconf,label='Total Confirmed',width=0.5, color='r')
    plt.xticks(x,countries,rotation=45)
    plt.title('COVID-19 Analysis\nTop 11 Countries as on 28 May 2020',
              color='magenta', fontsize=10)
    plt.xlabel("Countries")
    plt.ylabel("No.of Cases")
    plt.grid()
    plt.legend()
    plt.gcf().canvas.manager.set_window_title('Covid-19,Total Tests vs Confirmed')
    plt.show()

#--------------------------------------------------------------
# Function to Plot bar graph Total Cases vs Recovered & Active
#--------------------------------------------------------------
def bar2():
    df=pd.read_csv("C:\\Users\\pc\\Desktop\\IP Classes XII 2020-21\\Project Covid-19\\covid_28may.csv")
    x=np.arange(len(df))    # one group per row of the file
    countries=df['country']
    trecover=df['recovered']
    tactive=df['active']
    tcase=df['total_cases']
    ans=input("Graph Type(line/bar) : ")
    if ans=='line':
        plt.plot(x,trecover,label='Total Recovered',ls='-',marker='o')
        plt.plot(x,tactive,label='Total Active',ls='-.',marker='^')
        plt.plot(x,tcase,label='Total Cases',ls='--',marker='s')
    else:
        # offsets equal to the bar width keep the three bars from overlapping
        plt.bar(x-0.33,tcase,label='Total Cases',width=0.33, color='m')
        plt.bar(x,trecover,label='Total Recovered',width=0.33, color='b')
        plt.bar(x+0.33,tactive,label='Total Active',width=0.33, color='c')
    plt.xlabel("Countries")
    plt.ylabel("No.of Cases")
    plt.xticks(x,countries,rotation=30)
    plt.title('COVID-19 Analysis\nTop 11 Countries as on 28 May 2020',
              color='magenta', fontsize=10)
    plt.grid()
    plt.legend()
    plt.gcf().canvas.manager.set_window_title('Covid-19,Total Case vs Recovered & Active cases')
    plt.show()

#-------------------------------------------------------
# Function to Plot bar graph Total cases vs Total deaths
#--------------------------------------------------------
def bar3():
    df=pd.read_csv("C:\\Users\\pc\\Desktop\\IP Classes XII 2020-21\\Project Covid-19\\full_data.csv")
    df=df.set_index(['location'])
    df1=df.iloc[:,[1,2]]
    # total cases and deaths per location (columns selected with a list)
    g=df1.groupby('location')[['new_cases','new_deaths']].sum()
    # keep the 12 largest totals, then drop the largest row (position 11)
    final_df=g.sort_values(by='new_cases').tail(12)
    final_df.reset_index(inplace=True)
    final_df.columns=['location','Total Cases','Total Deaths']
    final_df=final_df.drop(11,axis='index')
    print(final_df)
    #final_df.plot(kind='bar',x='location',legend=True,
    #              width=0.75,title='COVID-19 Analysis\nTop 11 Countries',
    #              rot=30,grid=True,figsize=(8,5))
    x=np.arange(len(final_df))
    countries=final_df['location']
    tcases=final_df['Total Cases']
    tdeath=final_df['Total Deaths']
    plt.bar(x-0.25,tcases,label='Total Cases',width=0.5)
    plt.bar(x+0.25,tdeath,label='Total deaths',width=0.5)
    plt.xticks(x,countries,rotation=45)
    plt.title('COVID-19 Analysis\nTop 11 Countries as on 24 May 2020',
              color='red', fontsize=10)
    plt.xlabel("Countries")
    plt.ylabel("No.of Cases")
    plt.grid()
    plt.legend()
    plt.gcf().canvas.manager.set_window_title('Covid-19,Total vs Deaths')
    plt.show()

#---------------------------------------------------
# Function to analyse data from a dataframe
#---------------------------------------------------
def analysis():
    # helper: read column names typed as a comma-separated list
    def ask_columns():
        names=input("Enter the column names separated by commas: ")
        return [c.strip() for c in names.split(',')]

    while True:
        print("Data Frame Analysis")
        print("********************")
        menu=''' 1. Top records
\n 2. Bottom records
\n 3. To print a particular column
\n 4. To print multiple columns
\n 5. To display complete statistics of the dataframe
\n 6. To display complete information about the dataframe
\n 7. To display the unique values of a column
\n 8. To apply and display the data using group by with count function
\n 9. To apply and display the data using group by with more functions
\n 10.To apply an aggregate function
\n 11.To apply pivoting
\n 12.To go back'''
        print(menu)
        ch_an=int(input("Enter your choice"))
        if ch_an==1:
            n=int(input("Enter the number of records to be displayed"))
            print("Top ", n," records from the dataframe")
            print(df.head(n))
        elif ch_an==2:
            n=int(input("Enter the number of records to be displayed"))
            print("Bottom ", n," records from the dataframe")
            print(df.tail(n))
        elif ch_an==3:
            print("Name of the columns\n",df.columns)
            col=input("Enter the column name to be displayed")
            print(df[[col]])
        elif ch_an==4:
            print("Name of the columns\n",df.columns)
            co=ask_columns()
            print(df[co])
        elif ch_an==5:
            print("Complete Statistics")
            print(df.describe())
        elif ch_an==6:
            print("Information about dataframe")
            df.info()    # info() prints its report itself and returns None
        elif ch_an==7:
            print("Displaying unique values of any column")
            print("Name of the columns\n",df.columns)
            co=input("Enter the column name")
            print("Distinct values of column ", co," are: ")
            print(*df[co].unique(),sep='\n')
        elif ch_an==8:
            print("Name of the columns\n",df.columns)
            co=ask_columns()
            print(df[co])
            co1=input("Enter the column name for grouping : ")
            print("Grouped column ",co1)
            dfgroup=df[co].groupby(co1).count()
            print(dfgroup)
        elif ch_an==9:
            print("Name of the columns\n",df.columns)
            co=ask_columns()
            print(df[co])
            co1=input("Enter the column name for grouping : ")
            print("Grouped column",co1,' max',' min',' count',' sum',' mean')
            # aggregate only the selected columns; apart from the grouping
            # column they should be numeric for sum and mean to work
            dfgroup=df[co].groupby(co1).agg(['max','min','count','sum','mean'])
            print(dfgroup)
        elif ch_an==10:
            print("Applying aggregate functions")
            print("Name of the columns\n",df.columns)
            co=ask_columns()
            print('Print the maximum values of the ',co,' columns')
            print(df[co].max()) #Any function can be applied
        elif ch_an==11:
            print("--: Total deaths date wise and Country wise :--")
            dfpivot=df.pivot_table(index='date',columns='location',
                                   values='total_deaths')
            print(dfpivot)
        else:
            break
# End of the function
#--------------------------------------------------------

#---------------------------------------------------
# Function to read csv file/excel into Data Frame
#---------------------------------------------------
def read_csv_excel():
    global df    # make the loaded file available to the analysis menu
    while True:
        print('''1- Read CSV file to create and display DataFrame\
\n2- Read Excel File and Display DataFrame\
\n3- Press 3 to go back''')
        choice=int(input("Enter your choice:"))
        if choice==1:
            df=pd.read_csv("C:\\Users\\pc\\Desktop\\IP Classes XII 2020-21\\Project Covid-19\\full_data.csv")
            print(df)
            print("File retrieved Successfully!!!")
        elif choice==2:
            filename=input("Enter filename with extension .xls/xlsx: ")
            df=pd.read_excel(filename)
            print(df)
            print("File retrieved Successfully!!!")
        elif choice==3:
            break

#-----------------------------------------------------------------
# Function to Export/Import to MySQL from a Dataframe and vice-versa
#-----------------------------------------------------------------
def exp_imp_sql():
    while True:
        print("\n\n"+"*"*60)
        print(" Data Transfer between DataFrame to MySQL")
        print("-"*60)
        print('''1- Import from MySQL to create and display DataFrame\
\n\n2- Export from DataFrame to MySQL\
\n\n3- Press 3 to go back''')
        print("-"*60)
        choice=int(input("Enter your choice:"))
        if choice==1:
            conn1=conn.connect(host="localhost",user="root",
                               password='tiger',database='d1')
            tablename=input("Enter table name : ")
            # a table name cannot be passed as a query parameter, so only
            # plain identifiers are accepted before building the query
            if not tablename.isidentifier():
                print("Invalid table name")
                conn1.close()
                continue
            cur=conn1.cursor()
            cur.execute("select * from {}".format(tablename))
            records=cur.fetchall()
            # column names are taken from the cursor description
            columns=[d[0] for d in cur.description]
            df1=pd.DataFrame(records,columns=columns)
            cur.close()
            conn1.close()
            print("Sql to Dataframe transfer=\n",df1)
            print("Transfer successful from mysql to Dataframe!!!!\n\n")

        elif choice==2:
            df=pd.read_csv("C:\\Users\\pc\\Desktop\\IP Classes XII 2020-21\\Project Covid-19\\full_data.csv")
            df=df.set_index(['location'])
            df1=df.iloc[:,[1,2]]
            g=df1.groupby('location')[['new_cases','new_deaths']].sum()
            final_df=g.sort_values(by='new_cases').tail(12)
            final_df.reset_index(inplace=True)
            final_df.columns=['location','Total_Cases','Total_Deaths']
            final_df=final_df.drop(11,axis='index')
            print(final_df)
            #Connectivity
            conn1=conn.connect(host="localhost",user="root",
                               password='tiger',database='d1')
            tablename=input("Enter table name to check :")
            if not tablename.isidentifier():
                print("Invalid table name")
                conn1.close()
                continue
            cur=conn1.cursor()
            cur.execute("SHOW TABLES LIKE %s",(tablename,))
            result=cur.fetchall()
            if result:
                # the table exists: delete all existing records
                cur.execute("delete from "+tablename)
            else:
                # no table with this name: create it
                cur.execute("create table "+tablename+
                            "(location varchar(30), total_cases int(15), total_deaths int(10))")
                print("Table Created!!!")
            # adding to mysql; values are passed as query parameters and
            # converted from NumPy types to plain Python types first
            qry="insert into "+tablename+" values(%s,%s,%s)"
            for rs in final_df.itertuples(index=False):
                cur.execute(qry,(str(rs[0]),int(rs[1]),int(rs[2])))
            conn1.commit()
            cur.close()
            conn1.close()
            print("Data transferred to MySQL database successfully\n\n")
        elif choice==3:
            break

import ast   # literal_eval safely parses a list typed by the user

#---------------------------------------------------
# Function to manipulate data in a dataframe
#---------------------------------------------------
def manipulation():
    df=pd.read_csv("C:\\Users\\pc\\Desktop\\IP Classes XII 2020-21\\Project Covid-19\\full_data.csv")
    while True:
        print("\n\nManipulation Menu")
        print("*****************")
        print('''\n1. Insert a row\n
2. Delete rows\n
3. Delete a column\n
4. Go back to main menu''')
        mch=int(input("Enter your choice"))
        if mch==1:
            print(df.columns)
            print(df.head(1))
            lst1=ast.literal_eval(input("Enter a list of values in the sequence of columns:"))
            print(lst1)
            # append the new row to the working DataFrame
            new_row=pd.DataFrame([lst1],columns=df.columns)
            df=pd.concat([df,new_row],ignore_index=True)
            print("New row inserted")
            print(df)
        elif mch==2:
            dt=input("Enter the date for deletion:")
            country=input("Enter country for deletion:")
            # keep every row except the one matching both date and country
            df=df[(df.location!=country) | (df.date!=dt)]
            # show the remaining rows for that country
            print(df.loc[df['location']==country])
        elif mch==3:
            print(df.columns)
            col=input("Enter column name from the above")
            ch=input("Do you really want to delete a column (y/n)?")
            if ch=='y' or ch=='Y':
                del df[col]
                print("Column - ",col,"deleted successfully!!!")
                print(df)
        else:
            break

# Main Program: started last, once every function above is defined
MenuSet()
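
The aggregation shared by pie_chart(), bar3() and the MySQL export can be sketched in isolation as follows. This is only an illustration: the helper name top_locations is hypothetical, the numbers are made-up placeholders, and the sketch assumes the aggregate row is labelled "World", which the project itself does not state. The project drops the largest row by position, whereas the sketch drops it by name.

```python
import pandas as pd

# Toy daily records (placeholder numbers, not real data)
df = pd.DataFrame({
    "location": ["World", "World", "A", "A", "B", "B", "C", "C"],
    "new_cases": [60, 90, 10, 20, 30, 40, 5, 5],
    "new_deaths": [6, 9, 1, 2, 3, 4, 0, 1],
})


def top_locations(frame, n, exclude="World"):
    """Return the n locations with the most cases, smallest first."""
    # Sum the daily figures per location
    totals = frame.groupby("location")[["new_cases", "new_deaths"]].sum()
    # Remove the aggregate row by name (assumed label)
    totals = totals.drop(index=exclude, errors="ignore")
    # Sort ascending and keep the last n rows, i.e. the n largest
    totals = totals.sort_values(by="new_cases").tail(n).reset_index()
    totals.columns = ["location", "Total Cases", "Total Deaths"]
    return totals


top2 = top_locations(df, 2)
print(top2)
```

Dropping by name avoids depending on the aggregate row always sorting last, at the cost of assuming how that row is labelled in the file.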

OUTPUT SCREEN SHOTS:


BIBLIOGRAPHY

1. www.google.com

2. www.google.com/Python project

3. www.wikepedia.com/Python and Pandas projects

4. www.data.world

5. www.youTube.com

6. Class notes.
