Data Analytics Assignment 1

Assignment – I
BFT - 6
(Deepening Specialisation 2: Apparel Production Management)
Name of the Subject: Data Analytics & R
Name: Radhika Chandak
Subject Code: BFT603DS2
Roll No.: BFT/19/21
Subject Id: 15250
Date of Submission: 27.04.2022
Assignment:
Using data collection methods and applying the principles of statistics, carry out the following:
Identify problems faced in industrial engineering and collect appropriate data.
Use any one of the following methods in Principle of Forecasting:
Time Series
Solution:
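We start by installing covid19.analytics, an R package that provides ready-made COVID-19 case data.

To begin, we load the aggregated data and the time series of confirmed cases, and then print
a summary report covering confirmed cases and deaths.
The code is as follows:
# install the package once, then load it
install.packages("covid19.analytics")
library(covid19.analytics)
# aggregated data and the time series of confirmed cases
ag <- covid19.data(case = 'aggregated')
tsc <- covid19.data(case = 'ts-confirmed')
# summary (graphical.output = TRUE so that the charts described below are drawn)
report.summary(Nentries = 10, graphical.output = TRUE)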
- The graphs and charts appear in the Plots pane on the right. On zooming in, we observe:
● The dates range from January 2020 to April 2022, and the report covers the top 10
countries.
● The pie chart and bar graph show the countries by confirmed cases and by deaths
respectively.
● The US has the highest number of confirmed cases, while Turkey has the lowest among the top 10.
● For deaths, the US is again the highest, while France is the lowest among the top 10.
TIME SERIES - CONFIRMED CASES

TIME SERIES - DEATH CASES
**** Time Series Worldwide TOTS ****
ts-confirmed    ts-deaths    ts-recovered
   511748975      6228621               0
                    1.22%              0%
**** Time Series Worldwide AVGS ****
  1801933.01     21931.76               0
                    1.22%              0%
**** Time Series Worldwide SDS ****
  6617130.29     86526.01               0
                    1.31%              0%
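The percentages in the report can be checked by hand. The sketch below assumes that each percentage is that column's share of the confirmed total. This is an interpretation, but it reproduces the 1.22% shown for deaths. The helper name pct_of_confirmed is hypothetical and not part of covid19.analytics.

```r
# Worldwide totals copied from the summary report above
confirmed <- 511748975
deaths    <- 6228621
recovered <- 0

# Hypothetical helper: share of the confirmed total, in percent
pct_of_confirmed <- function(x, total) round(100 * x / total, 2)

deaths_pct    <- pct_of_confirmed(deaths, confirmed)
recovered_pct <- pct_of_confirmed(recovered, confirmed)

cat("deaths:", deaths_pct, "%  recovered:", recovered_pct, "%\n")
```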
- Next, we compute the totals per location for India and for the country with the
most cases, i.e. the US.
# totals per location
tots.per.location(tsc, geo.loc = c('us', 'india'))
In the output, under "running models", we get the linear regression model.

● The top plot shows the number of cases on a log scale, with the number of days on the
x axis. Each line in the plot represents a fitted regression model. The values are
cumulative, and the curve is concave: the count rises steeply at first and then
flattens as growth slows.
● The bottom plot is a bar chart with the y axis on a log scale.
The same pair of plots is produced for the US.
LINEAR REGRESSION MODEL - India and US
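To illustrate what a linear regression on a log scale means, here is a minimal sketch in base R. It uses synthetic counts rather than the package's data, and it is not the package's internal code. The starting value of 100 cases and the 8% daily growth are assumptions chosen for illustration.

```r
# Synthetic cumulative counts (illustrative, not real data):
# 100 initial cases growing by 8% per day for 30 days
day   <- 0:29
cases <- 100 * 1.08^day

# Fit a straight line to log(cases) against day
fit <- lm(log(cases) ~ day)

# Back-transform the coefficients to the original scale
daily_growth  <- exp(coef(fit)[["day"]])          # growth factor per day
initial_cases <- exp(coef(fit)[["(Intercept)"]])  # fitted count on day 0

cat("estimated daily growth factor:", round(daily_growth, 3), "\n")
```

A straight line on the log scale corresponds to exponential growth, so a curve that bends downward on the log plot indicates that growth is slowing.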

- To see the growth rate for a specific country (India here), we run:
# growth rate
growth.rate(tsc, geo.loc = 'india')
We get two plots. The top plot has two y axes, one on a regular scale and the other on a
log scale. From it we can observe that during the second lockdown the cases were
increasing more rapidly than before the first lockdown.
The bottom plot shows the growth rate on a log scale.
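The idea behind a growth rate can be sketched in base R. This example assumes the growth rate is the ratio between the changes on consecutive days, which may differ in detail from what growth.rate computes internally. The counts are hypothetical.

```r
# Hypothetical cumulative confirmed counts for six consecutive days
cumulative <- c(100, 120, 150, 195, 240, 276)

# Daily changes: new cases reported on each day
daily_change <- diff(cumulative)

# Growth rate: each day's change divided by the previous day's change
growth_rate <- daily_change[-1] / daily_change[-length(daily_change)]

print(round(growth_rate, 2))
```

A value above 1 means the number of new cases per day is increasing, and a value below 1 means it is decreasing.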
- Next, we retrieve one more time series dataset, covering all case types, and save it in
a data frame named tsa.
tsa <- covid19.data(case = 'ts-ALL')
Then, using
# totals plot
totals.plt(tsa)
we can create an interactive plot of the time series cases.

In the linear graph and the log graph, we can see that there are around 511.79 million confirmed
cases, 505.520 million active cases, and so on.
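The active-case figure can be cross-checked against the worldwide totals from the summary report. The sketch below assumes that active cases are counted as confirmed minus deaths minus recovered. Under that assumption the result agrees with the roughly 505.52 million read from the plot.

```r
# Worldwide totals copied from the summary report
confirmed <- 511748975
deaths    <- 6228621
recovered <- 0

# Assumed definition: active = confirmed - deaths - recovered
active <- confirmed - deaths - recovered

cat("active cases (millions):", round(active / 1e6, 3), "\n")
```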
- To see the COVID-19 cases across the globe, we can use the live.map function
with the data frame tsa.
# live map
live.map(tsa)
In the Viewer pane, moving the pointer over a particular country shows its number of
cases.
- One of the models popular among researchers working on COVID-19 data is the
SIR model. It groups people into three categories:
● S - people who are healthy but susceptible to the disease
● I - people who are infected
● R - people who have recovered
We use the function generate.SIR.model:
# SIR model
generate.SIR.model(tsc, 'india', tot.population = 1383000000)
At the top we have two plots:
● On the left, the y axis represents the number of infected people on a regular
scale and the x axis represents the number of days, for the first 25 days.
● On the right, the y axis represents the number of infected people on a log scale and
the x axis represents the number of days, for the first 25 days.
● At the bottom, the number of subjects is shown on a log scale. The three lines are
the curves of the SIR model: blue shows susceptible people, red shows infected
people and green shows recovered people.
● We can observe that between day 0 and about day 90, the number of infected people
rises to its peak and the number of recovered people also rises to its maximum.
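The S, I and R curves described above can be reproduced in outline with a minimal SIR simulation in base R. This is a sketch using simple daily Euler steps, not the fitting procedure used by generate.SIR.model. The function name simulate_sir is hypothetical, and the values of beta, gamma, i0 and days are illustrative assumptions rather than fitted values.

```r
# Minimal SIR simulation with daily Euler steps.
# beta (infection rate) and gamma (recovery rate) are illustrative assumptions.
simulate_sir <- function(n, i0, beta, gamma, days) {
  s <- numeric(days + 1)
  i <- numeric(days + 1)
  r <- numeric(days + 1)
  s[1] <- n - i0
  i[1] <- i0
  r[1] <- 0
  for (t in seq_len(days)) {
    new_infections <- beta * s[t] * i[t] / n   # S -> I
    new_recoveries <- gamma * i[t]             # I -> R
    s[t + 1] <- s[t] - new_infections
    i[t + 1] <- i[t] + new_infections - new_recoveries
    r[t + 1] <- r[t] + new_recoveries
  }
  data.frame(day = 0:days, S = s, I = i, R = r)
}

# Population as used in the assignment; other parameters are assumed
sir <- simulate_sir(n = 1383000000, i0 = 1, beta = 0.5, gamma = 0.2, days = 200)

# Day on which the number of infected people is largest
peak_day <- sir$day[which.max(sir$I)]
cat("infections peak on day", peak_day, "\n")
```

In this sketch the infected curve rises, peaks and falls, while the recovered curve only increases and then levels off. The total S + I + R stays equal to the population throughout.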
This is a screenshot of the code.
