Professional Documents
Culture Documents
Particulars Description
Topic COVID-19 Outbreak Analysis
Class Trial
1
Parts Duration
Warm-Up 10 minutes
Wrap-Up 10 minutes
– Click on the Save a copy in Drive option. A duplicate copy will get created. It will
open up in the new tab on your web browser.
2
– In the duplicate copy, click on the Share button on the top right corner of the notebook.
A new dialogue box will appear.
3
– Click on the tiny downward arrow next to Anyone with the link can view text. A
drop-down list will appear.
– Click on the More… option. A new page on the dialogue box will appear.
4
– Click on the circle next to On - Anyone with the link option. Now, go down and
click on the tiny downward box next to Can edit text. A drop-down list will appear.
5
– Click on the Can edit option. Then click on the Save button. You will be directed
back to the first page of the dialogue box.
6
– Make sure that under the Link sharing on section, Anyone with the link can edit
option is selected.
7
– Then click on the Done button.
3. After creating the duplicate copy of the notebook, please rename it in the YYYY-MM-
DD_StudentName_Lesson0 format. This step is strongly recommended because in
future you will conduct the same class for many students on the same day. For each
student, you will have to create a duplicate copy of the original class copy. Hence, this
step will allow you to organise each duplicate copy for each student in your Google drive.
4. Click on the settings button, a settings dialogue box will appear.
5. Select the Editor option and set the parameter values as shown in the image below and
then click on the SAVE button.
8
• Now, finish the Warm-Up section. Afterwards, proceed with the class using the duplicate
copy of the notebook. You can share the link of the duplicate copy (named as YYYY-MM-
DD_StudentName_Lesson0) with the student through the chat window.
• When a student executes the code for the first time in the duplicate copy that you shared
with them, they may get the following warning:
Warning: This notebook was not authored by Google. This notebook was
authored by xyz@gmail.com. It may request access to your data stored
with Google such as files, emails and contacts. Please review the source
code and contact the creator of this notebook at xyz@gmail.com with any
additional questions.
9
problem statement.
– Triple hat signs, i.e., ^^^ denotes give hats-off for persistence. It is the ability of a
student to not give up on writing code and writing code with minimal teacher support.
• Every time you execute your code in the Google Colab notebook, please don’t forget to save
it by pressing the Ctrl + S keys (or Command + S keys if you are using a Mac). While
saving the notebook, you or student may encounter the following error:
Save failed
The notebook has been changed outside of this session. Would you like to
overwrite existing changes?
CANCEL YES
Click on the YES button.
• Occasionally, you may encounter some error such as NameError, ReferenceError,
ImportError or ModuleNotFoundError after executing the code in k th code cell (where
k > 0). Most likely, due to poor internet speed, the Colab notebook might have lost the
information that all the previous codes have been executed already. As a remedy, run codes
in the code cells beginning from the first code cell till the (k − 1)th code cell. For e.g., if the
code in the 5th code cell fails to run because of one of the aforementioned errors, execute the
codes in all the first four code cells again.
• For every Teacher Action, the teacher is supposed to share their screen with the student.
Similarly, for every Student Action, the student is supposed to share their screen with the
teacher.
1.0.3 Warm-Up
TEACHER
Hi <student_name>! My name is <teacher_name>.
I am going to be your teacher for this class. Let's get to know each
other a little bit before we start. Why don’t you tell me what are
your hobbies or what do you like to do in your free time?"
Notes:
- Get to know the student - his/her name, what do they like to do in
their free time, what are their ambitions etc.
- Find out whether the student has prior coding experience in Python or
or some other programming language.
- Get to know the student's grade. If the student is in 11th or
12th standard, then ask their stream (Maths/Biology/Arts/Commerce).
Depending on their streams connect the relevance of data science with
the streams.
- Get to know the school board (CBSE/ICSE/HSC etc) of the student.
10
EXPECTED STUDENT RESPONSE
Hello teacher! My name is <student_name> I am doing good. I love playing
football, reading books and watching movies. I want to be a politician
someday.
TEACHER
That's great! I love reading too. And it is good to have an ambition at
such an young age. It helps in finding the right direction to achieve your
goals.
Now, let me welcome you to the trial class with WhiteHat Jr. Have you ever
heard or read about artificial intelligence (AI) & machine learning (ML)?
If'yes', what do you know about them?
TEACHER
Yes. That's on the applications of AI. In general, Artificial intelligence
refers to the simulation of human intelligence in machines that are
programmed to think like humans and mimic their actions. For instance, a
human driver is replaced in a driverless car.
TEACHER
Great! In our trial class today, we will be doing the first step, that is
data visualisation. If you are not familiar with python or other coding
languages, don’t worry. You will be ready to build/code this on your own in
48 classes.
11
EXPECTED STUDENT RESPONSE
Yes. I am very excited.
TEACHER
Great. Let's get started. Now, I am going to share a link with you. Open
the link in the new tab of your web browser.
Note:
- Share the duplicate copy of the Colab notebook that you created with
the student through the chat window.
TEACHER
So, this is a Colab notebook developed by Google. We will write Python
codes in a Google Colab notebook which is actually a Jupyter notebook. It
is just a software to write Python codes. Many data analysts/data
scientists use Jupyter notebooks to solve data-related problems.
Google Colab allows multiple users to work on the same Jupyter notebook
simultaneously. It is loaded with all the tools required to solve data
analytics activities. Hence, you don’t have to install each tool manually.
At this point, the student should share/present their screen with the teacher.
12
Activity 1: Run Source Code This is the source code for the map to be created. You will
learn to write it after signing up for the applied tech course. Right now, you just have to execute
the code.
[ ]: # Student Action: Run the code below.
# Download data
!git clone https://github.com/CSSEGISandData/COVID-19.git
# Install 'geocoder'
!pip install geocoder
# Importing modules
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import datetime
import geocoder
import folium
from folium import plugins
conf_df = pd.read_csv(conf_csv)
grouped_conf_df = conf_df.groupby(by = ['Country/Region'], as_index = False).
,→sum()
13
us_conf_csv = '/content/COVID-19/csse_covid_19_data/csse_covid_19_time_series/
,→time_series_covid19_confirmed_US.csv'
us_conf_df = pd.read_csv(us_conf_csv)
us_conf_df = us_conf_df.dropna()
grouped_us_conf_df = us_conf_df.groupby(by = ['Combined_Key'], as_index =␣
,→False).sum()
total_cases_country.index = pd.to_datetime(total_cases_country.index)
return total_cases_country
dt_series = None
if country_name != 'global':
dt_series = get_total_confirmed_cases_for_country(country_name)
else:
dt_series = get_total_confirmed_global_cases()
plt.style.use(plot_background)
plt.figure(figsize = (fig_width, fig_height))
plt.title(f'{country_name.upper()}: Total Coronavirus Cases␣
,→Reported\nCreated by {your_name.upper()}\nPowered by WhiteHat Jr', fontsize␣
,→= 16)
plt.xticks(rotation = 45)
plt.ylabel("Total Cases")
plt.grid(linestyle='--', c='grey')
plt.show()
# Add minimap
def add_minimap(map_name):
# Plugin for mini map
minimap = plugins.MiniMap(toggle_display = True)
map_name.add_child(minimap) # Add minimap
14
plugins.ScrollZoomToggler().add_to(map_name) # Add scroll zoom toggler to␣
,→map
plugins.Fullscreen(position='topright').add_to(map_name) # Add full screen␣
,→button to map
# Function to create folium maps using for India, US and the world
def folium_map_with_circles(your_name, country, map_width, map_height,␣
,→left_margin, top_margin, map_tile, zoom, circle_color, minimap):
last_col = conf_df.columns[-1]
if country == 'India':
india_map = folium.Map(location = [22.3511148, 78.6677428],
width = map_width, height = map_height,
left = f"{left_margin}%", top = f"{top_margin}%",
tiles = map_tile, zoom_start = zoom)
if minimap == True:
add_minimap(india_map)
india_df.loc[i,␣
,→'Confirmed'],
india_df.loc[i,␣
,→'Last_Updated_Time']),
color = circle_color,
fill = True).add_to(india_map)
return india_map
15
elif country == 'US':
us_map = folium.Map(location = [39.381266, -97.922211],
width = map_width, height = map_height,
left = f"{left_margin}%", top = f"{top_margin}%",
tiles = map_tile, zoom_start = zoom)
if minimap == True:
add_minimap(us_map)
grouped_us_conf_df.
,→loc[i, last_col],
last_col),
color = circle_color,
fill = True).add_to(us_map)
return us_map
grouped_conf_df.
,→loc[i, last_col],
last_col),
color = circle_color,
fill = True).add_to(world_map)
return world_map
else:
print("\nWrong input! Enter either India, US or World.\n")
16
# Total confirmed cases in the descending order.
grouped_conf_df = conf_df.groupby(by='Country/Region', as_index=False).sum()
desc_grp_conf_df = grouped_conf_df.sort_values(by=conf_df.columns[-1],␣
,→ascending=False)
# Function to create a bar plot displaying the top 10 countries having the most␣
,→number of coronavirus confirmed cases.
fontsize = 16)
sns.barplot(desc_grp_conf_df[last_col].head(num_countries),␣
,→desc_grp_conf_df['Country/Region'].head(num_countries), orient = 'h')
total_daily_cases.index = pd.to_datetime(total_daily_cases.index)
return total_daily_cases
# Line plot for the daily (non-cumulative) confirmed cases in various countries.
def daily_cases_line_plot(your_name, num_countries, width, height):
plt.figure(figsize=(width, height))
plt.title(f'Non-Cumulative COVID-19 Confirmed Cases\nCreated by {your_name.
,→upper()}\nPowered by WhiteHat Jr', fontsize = 16)
plt.xticks(rotation=45)
plt.legend()
plt.grid('major', linestyle='--', c='grey')
17
plt.show()
Activity 2: Line Plot^ Let’s create a line plot to visualise the total number of confirmed cases
in India till yesterday. For the line plot, the dataset that we have on coronavirus is maintained at
18
Johns Hopkins University which gets according to the US time. Hence, we have data updated till
yesterday.
To view this dataset, write conf_df[conf_df['Country/Region'] == 'India'] in the code cell
below.
[ ]: # Student Action: Write conf_df[conf_df['Country/Region'] == 'India'] to view␣
,→the dataset for India that will be used to create a line plot.
conf_df[conf_df['Country/Region'] == 'India']
So, in this dataset, we have data for the total confirmed cases in India starting from January 22,
2020. The date given here is in the MM/DD/YY format where
• MM stands for month
• DD stands for day
• YY stands for year
Now, let’s create a line plot. To create a line plot, you need to use the line_plot() function which
takes the following inputs:
• Name of the person who is creating the line plot which should be a text value enclosed within
single-quotes ('') or double-quotes ("").
• The background style of the line plot which should be a text value enclosed within single-
quotes ('') or double-quotes ("").. Here is the list of most commonly used background styles:
1. 'dark_background' (most preferred)
2. 'ggplot'
3. 'seaborn'
4. 'fivethirtyeight'
and many more.
• Width of the line plot (numeric value).
• Height of the line plot (numeric value).
• Name of the country which should be a text value enclosed within single-quotes ('') or
double-quotes ("").
• Colour of the lines which should be a text value enclosed within single-quotes ('') or double-
quotes (""). Here’s the list of most commonly used colours:
1. 'red'
2. 'cyan'
19
3. 'magenta'
4. 'yellow'
5. 'green'
• The width of the line (numeric value)
• The marker style on the line plot which should be a text value enclosed within single-quotes
('') or double-quotes (""). Here is the list of the most commonly used marker styles:
1. 'o' for a circular marker
2. '*' for a starred marker
3. '^' for a upper triangular marker
[ ]: # Student Action: Create a line plot for the total confirmed cases in India␣
,→using the 'line_plot()' function.
# You can try the 'US', 'Russia', 'global', 'Brazil' values instead of 'India'.
Note: The line_plot() function is NOT a standard Python function. It is a user-defined function
created at WhiteHat Jr using Python to simplify the line plot creation process. You will learn to
create your own user-defined function in the subsequent classes in this course.
Activity 3: Map^^ Let’s create a map for India. For this, we are going to use a dataset showing
statewise data for India. To view the first five rows for the total confirmed cases in India, call the
head() function on the india_df variable which stores the data.
[ ]: # Student Action: List the first five rows of the dataset containing the total␣
,→number of confirmed cases in India.
india_df.head() # You may enter 10 to view the first 10 rows in the DataFrame.
20
[ ]: State Confirmed … Latitude Longitude
1 Maharashtra 284281 … 19.531932 76.055457
2 Tamil Nadu 160907 … 10.909433 78.366535
3 Delhi 120107 … 28.651718 77.221939
4 Gujarat 45567 … 22.385001 71.745263
5 Karnataka 51422 … 14.520390 75.722352
[5 rows x 14 columns]
Let’s now create a map for India to show the state-wise total confirmed cases of coronavirus. Using
the latitude and longitude values (which are numeric values with decimal), we can create circular
markers on a map. For this, you need to use the folium_map_with_circles() function which
takes the following inputs:
• Name of the person who is creating the map which should be a text value enclosed within
single-quotes ('') or double-quotes ("").
• Name of the country for which a map needs to be created. It should be a text value enclosed
within single-quotes ('') or double-quotes (""). For the map only three values are supported:
1. 'India'
2. 'US'
3. 'World'
• Width of the map (numeric value).
• Height of the map (numeric value).
• Left margin for the map (numeric value).
• Top margin for the map (numeric value).
• The background style of the line plot which should be a text value enclosed within single-
quotes ('') or double-quotes (""). Here is the list of most commonly used background styles:
1. 'OpenStreetMap'
2. 'Stamen Terrain'
3. 'Stamen Toner'
• Initial zoom in value (a numeric value)
• Colour of the circles on the map should be a text value enclosed within single-quotes ('') or
double-quotes (""). Here’s the list of most commonly used colours:
1. 'red'
2. 'blue'
3. 'magenta'
4. 'yellow'
5. 'green'
21
• Whether you want the map to have a minimap or not; True for yes and False for no.
[ ]: # Student Action: Create a map for India to show the state-wise total confirmed␣
,→cases of coronavirus.
[ ]: <folium.folium.Map at 0x7f046c4c02e8>
india_map.save('/content/index.html')
Activity 1: Bar Plot Let’s create a bar chart displaying the top 10 countries having the most
number of coronavirus confirmed cases. For this, you need to use the bar_plot() function. It
takes four inputs:
1. Name of the person who is creating the bar plot which should be a text value enclosed within
single-quotes ('') or double-quotes ("").
2. Number of countries to be listed in the bar plot (numeric value)
3. Width of the bar plot (numeric value)
4. Height of the bar plot (numeric value)
[ ]: # Student Action: Create a bar plot to find out the top 10 worst affected␣
,→countries due to coronavirus.
22
bar_plot('student name', 10, 15, 7)
Activity 2: Daily (Non-Cumulative) Confirmed Cases Line Plot Let’s create line plots
for daily confirmed cases for the top 4 worst affected countries. For this, you need to use the
daily_cases_line_plot() function. It takes four inputs:
1. Name of the person who is creating the line plot which should be a text value enclosed within
single-quotes ('') or double-quotes ("").
2. Number of countries (numeric value) for which the line plot needs to be created.
3. Width of the plot (numeric value)
4. Height of the plot (numeric value)
[ ]: # Student Action: Create a line plot for the non-cumulative (or daily)␣
,→confirmed cases for the top 4 worst affected countries.
23
Note:
1. daily_cases_line_plot() is not a standard Python function.
2. The figure shows total non-cumulative (or daily) confirmed cases starting from 15 March
because that is the time when most of the countries started testing for coronavirus in bulk.
Before that day, most of the countries were testing very few people.
Activity 3: Map for the USA Create a map for the United States for America.
[ ]: # Student Action: Create a map for the US.
folium_map_with_circles('student name', 'US', 1200, 500, 8, 2, 'OpenStreetMap',␣
,→4, 'magenta', True)
[ ]: <folium.folium.Map at 0x7f046be04f98>
Most number of the confirmed cases are recorded in the eastern region in the USA. Also, New York
is the most affected city in the USA.
[ ]: # Student Action: Export the US map as an HTML file.
us_map = folium_map_with_circles('student name', 'US', 1200, 500, 8, 2,␣
,→'OpenStreetMap', 4, 'magenta', True)
us_map.save('/content/usmap.html')
24
folium_map_with_circles('student name', 'World', 1200, 500, 8, 2,␣
,→'OpenStreetMap', 2, 'red', True)
[ ]: <folium.folium.Map at 0x7f04697685f8>
world_map.save('/content/worldmap.html')
1.0.6 Activities
Teacher Activities
1. COVID-19 Outbreak Analysis (Class Copy)
https://colab.research.google.com/drive/1O3WgznWzmLVKtuCiOfj-YNvF5jaMtSMh
2. COVID-19 Outbreak Analysis (Reference)
https://colab.research.google.com/drive/116fIgPAi3Is-f6kf7Vab8xnSaSurbesC
3. COVID-19 Live Dashboard by Johns Hopkins University
https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
4. Transit Principle Video
https://www.youtube.com/watch?v=mM3PYiyjn1o
5. Image for Star 1
https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/star_0_brightness_curve.png
6. Image for Star 2
https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/star_5085_brightness_curve.png
Student Activities
1. COVID-19 Live Dashboard by Johns Hopkins University
https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
2. Transit Principle Video
https://www.youtube.com/watch?v=mM3PYiyjn1o
3. Image for Star 1
https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/star_0_brightness_curve.png
4. Image for Star 2
https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/star_5085_brightness_curve.png
25
1.0.7 Wrap-Up Pitch
This pitch is to be presented after all the activities are covered.
Hope you enjoyed the class. Let me again give you a quick summary of what you will learn in this
course. This is the world’s first machine learning + artificial intelligence curriculum for grade 10 to
12 students having industry-grade projects that use real data sets; just like we used for COVID-19
outbreak analysis.
Students best learn the power of AI & ML through our project-based learning philosophy where
we solve cutting edge problems in life tech and space tech across each of the below modules:
• Python & Exploratory Data Analysis: We will put you into an accelerated Python
programming track as soon as you start and you will become a Certified Python Programmer
as early as in the 12th class. You will then use exploratory data analysis to visualize large
data sets and identify patterns. In one of the projects, we will be using the data collected by
the meteorology department to make weather predictions.
• Supervised Machine Learning: In this module, you will be deploying machine learning
algorithms which allows a machine to analyze more complex data sets and then make ad-
vanced predictions. In one of the Lifetech projects, we will be visualizing the data of various
biomarkers like cholesterol, blood pressure, pulse rate etc to predict the likelihood of a patient
having heart disease.
• Unsupervised Machine Learning: In this module, you will be deploying ML algorithms
which can work on unstructured datasets and make very accurate predictions.
In one of the Space Tech projects, we will be using the data captured by the Kepler telescope to
predict stars that have planets beyond our milky way galaxy.
Let me explain how? We will be using the data of the brightness of various stars beyond our
milky way galaxy recorded by the Kepler telescope. The transit principle will be used to train our
machine learning model to detect the stars having planets.
Let me first explain the Transit Principle to you.
https://www.youtube.com/watch?v=mM3PYiyjn1o
Note: Ask the student/parent to click on the link provided in the Student Activities section
under the title Transit Principle Video to watch the YouTube video on the Transit Principle.
Now let me show you the brightness over a period of time for 2 different Stars. Can you tell me
which one of them has a planet?
Image for Star 1
26
Image for Star 2
Note: Ask the student/parent to click on the links provided in the Student Activities under the
title Image for Star 1 and Image for Star 2 to view the images to answer your question.
Answer: Star 1 has a planet.
Great! We will use this principle to train the machine learning model and predict stars having
planets on a dataset of over 5000 stars.
Along the course, you will be doing various such space tech projects for predicting eclipses, near-
earth objects, etc
By 2025, it is predicted that 463 Exabytes (equivalent to 463 million Terabytes) of data will be
created each day globally.
With an app serving every need from shopping to food delivery, the next wave of digital transfor-
mation is going to be led by machine learning and artificial intelligence.
Regardless of which path or career we chose, we will need to work closely with data. Our goal is
to impart you all the skills to become a true tech entrepreneur and make you ready to lead this
transformation.
27
to have your kid as my student since your kid is exceptionally bright with true entrepreneurship
potential!
Congratulations <kid name> again on being awarded 3 hats off–you’re truly exceptional. Hope you
had fun. Thank you for your time today. Kindly stay on the panel and do not close this page when
I end the class–our entire curriculum along with the details will be displayed on the panel.
28