Apoorva Chhangani (p41071), Rahul Jha (p41098), Ram Sandeep Peddada (p41100)
Saloni Sharma (p41104), Shubhayan Modak (p41115)
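Statistically, cases per million (CPM) is modelled as a function of the independent variables:
CPM = f(SI, HWF, HB, GDP, CDR, DP, FS, MS, X, Y, LE)
where SI = Stringency Index, HWF = handwashing facilities, HB = hospital beds per million, GDP = GDP per capita, CDR = cardiovascular death rate, DP = diabetes prevalence, FS = female smokers, MS = male smokers, X = aged 65 and older, Y = aged 70 and older, LE = life expectancy.
Population model:
CPM = B0 + B1(SI) + B2(HWF) + B3(HB) + B4(GDP) + B5(CDR) + B6(DP) + B7(FS) + B8(MS) + B9(X) + B10(Y) + B11(LE) + U
Estimated model:
CPM^ = b*0 + b*1(SI) + b*2(HWF) + b*3(HB) + b*4(GDP) + b*5(CDR) + b*6(DP) + b*7(FS) + b*8(MS) + b*9(X) + b*10(Y) + b*11(LE)
Significance level: 5%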
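As an illustration of how estimates such as b*0, b*1, ... can be obtained, the following is a minimal sketch of ordinary least squares solved through the normal equations. The function name `ols_fit` and the toy data are hypothetical and are not taken from the report; the report's own coefficients come from its regression output.

```python
def ols_fit(rows, y):
    """Return least-squares coefficients [b0, b1, ..., bk] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend a column of ones for the intercept
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical toy data generated from y = 2 + 3*x1 - 1*x2 (not report data)
toy_rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
toy_y = [2 + 3 * x1 - x2 for x1, x2 in toy_rows]
coefficients = ols_fit(toy_rows, toy_y)
print([round(b, 6) for b in coefficients])
```

Because the toy response is an exact linear function of the two regressors, the fit recovers the generating coefficients up to floating-point error.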
Business Statistics & Analytics Group Assignment Report
July-2020
CPM^ = -1774.47 + 25.75(SI) - 0.013(HWF) - 0.004(HB) - 81.32(GDP) - 689.52(CDR) - 11.48(DP) + 736.45(FS)
+ 240.14(MS) + 131.86(X) - 166.36(Y) + 79.40(LE)
Regression Statistics
Multiple R 0.573470227
R Square 0.328868101
Adjusted R Square 0.267347677
Standard Error 1439.205169
Observations 132
ANOVA
df SS MS F Significance F
Regression 11 1.22E+08 11072555 5.345674 7.46E-07
Residual 120 2.49E+08 2071312
Total 131 3.7E+08
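The summary figures in this output are linked by simple arithmetic. The sketch below, using only numbers copied from the July-2020 tables above, recomputes the F-statistic, the adjusted R Square and the standard error; the variable names are illustrative.

```python
import math

# Values copied from the July-2020 regression output above
df_residual = 120
ms_regression = 11072555
ms_residual = 2071312
r_square = 0.328868101
n_observations = 132

# F = mean square of regression / mean square of residual
f_statistic = ms_regression / ms_residual

# Adjusted R Square penalises R Square for the number of regressors
adjusted_r_square = 1 - (1 - r_square) * (n_observations - 1) / df_residual

# Standard error of the regression = square root of the residual mean square
standard_error = math.sqrt(ms_residual)

print(round(f_statistic, 3), round(adjusted_r_square, 4), round(standard_error, 1))
```

The recomputed values agree with the reported F (5.345674), Adjusted R Square (0.267347677) and Standard Error (1439.205169) up to the rounding of the printed mean squares.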
• From ANOVA, the F-statistic for the model is 5.35 with a Significance F (p-value) of 7.46E-07, which is below the 5% significance level; hence the model as a whole is significant.
• The intercept of -1774.47 is the predicted value of cases per million (CPM) when every independent variable is zero. It is not a minimum value of CPM: a negative case count is impossible and an all-zero setting (for example, zero life expectancy) is not realistic, so the intercept has no practical interpretation on its own.
• For any independent variable, for example Stringency Index (SI), a 1 unit increase in SI is associated with an increase of 25.753 units in predicted CPM, keeping other factors constant. For GDP per capita (GDP), a 1 unit increase in GDP is associated with a decrease of 81.320 units in predicted CPM, keeping other factors constant. The other coefficients are read in the same way.
• Checking p-values individually: a variable is significant at the 5% level only when its p-value is below 0.05 (if the p-value is low, the null hypothesis must go). Not every variable in this model meets that threshold.
• R2 being 0.329 suggests that the model explains 32.9% of the variation in CPM. The remaining 67.1% is unexplained and is due to factors outside the model. Therefore, the factors SI, HWF, HB, GDP, CDR, DP, FS, MS, X, Y & LE together explain only 32.9% of the variation in CPM.
• The p-values of the intercept and of the coefficients of Average Stringency Index, cardiovascular death rate, diabetes prevalence, male smokers, female smokers and aged 65 older are less than 5%, so these are significant. The remaining factors have p-values above 5% and are not significant.
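The "keeping other factors constant" reading of a coefficient can be checked directly from the fitted July-2020 equation. In the sketch below the coefficients are copied from that equation, while the input values are hypothetical placeholders (not data for any country) and the function name `predict_cpm` is illustrative.

```python
# Coefficients copied from the July-2020 CPM^ equation
INTERCEPT = -1774.47
COEFFICIENTS = {
    "SI": 25.75, "HWF": -0.013, "HB": -0.004, "GDP": -81.32,
    "CDR": -689.52, "DP": -11.48, "FS": 736.45, "MS": 240.14,
    "X": 131.86, "Y": -166.36, "LE": 79.40,
}


def predict_cpm(values):
    """Evaluate the fitted equation for one set of regressor values."""
    return INTERCEPT + sum(coef * values[name] for name, coef in COEFFICIENTS.items())


# Hypothetical placeholder inputs; the units used in the report are not assumed here
baseline = {name: 1.0 for name in COEFFICIENTS}
shifted = dict(baseline, SI=baseline["SI"] + 1)  # raise SI by one unit only

effect_of_si = predict_cpm(shifted) - predict_cpm(baseline)
print(round(effect_of_si, 2))
```

Whatever baseline is chosen, the difference equals the SI coefficient, which is what the one-unit interpretation states.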
June-2020
CPM^ = -1788.61 + 8.69(SI) - 0.00925(HWF) - 0.0133(HB) - 45.75(GDP) - 878.32(CDR) - 14.1978(DP) +
966.05(FS) + 327.5(MS) + 188.8(X) - 295.5(Y) - 21.5(LE)
Regression Statistics
Multiple R 0.563116
R Square 0.3171
Adjusted R Square 0.254501
Standard Error 1587.827
Observations 132
ANOVA
df SS MS F Significance F
Regression 11 1.4E+08 12771266 5.06556 1.81E-06
Residual 120 3.03E+08 2521195
Total 131 4.43E+08
• For GDP per capita (GDP), a 1 unit increase in GDP is associated with a decrease of 45.75 units in predicted CPM, keeping other factors constant. The other coefficients are read in the same way.
• Checking p-values individually: a variable is significant at the 5% level only when its p-value is below 0.05 (if the p-value is low, the null hypothesis must go). Not every variable in this model meets that threshold.
• R2 being 0.317 suggests that the model explains 31.7% of the variation in CPM. The remaining 68.3% is unexplained and is due to factors outside the model. Therefore, the factors SI, HWF, HB, GDP, CDR, DP, FS, MS, X, Y & LE together explain only 31.7% of the variation in CPM.
• The p-values of the intercept and of the coefficients of hospital beds per million, cardiovascular death rate, diabetes prevalence, male smokers, female smokers, aged 65 older and aged 70 older are less than 5%, so these are significant. The remaining factors have p-values above 5% and are not significant.
May-2020
CPM^ = 250.86 + 9.15(SI) - 0.00438(HWF) - 0.0084(HB) + 7.22(GDP) – 2.59(CDR) + 132.27(DP) + 14.44(FS) -
88.90(Y) + 109.688(LE)
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.498329
R Square 0.248332
Adjusted R Square 0.192881
Standard Error 32.74248
Observations 132
ANOVA
df SS MS F Significance F
Regression 9 43210.57 4801.174 4.478416 4.31E-05
Residual 122 130792.5 1072.07
Total 131 174003.1
• From ANOVA, the Significance F (p-value) reported for the model is 0.000621, which is below the 5% significance level; hence the model as a whole is significant.
• The intercept of 250.86 is the predicted value of cases per million (CPM) when every independent variable is zero. It is not a minimum value of CPM, and since an all-zero setting is not realistic it has no practical interpretation on its own.
• For any independent variable, for example Stringency Index (SI), a 1 unit increase in SI is associated with an increase of 9.15 units in predicted CPM, keeping other factors constant. For cardiovascular death rate (CDR), a 1 unit increase in CDR is associated with a decrease of 2.59 units in predicted CPM, keeping other factors constant. The other coefficients are read in the same way.
• Checking p-values individually: a variable is significant at the 5% level only when its p-value is below 0.05 (if the p-value is low, the null hypothesis must go). Not every variable in this model meets that threshold.
• R2 being 0.207 suggests that the model explains 20.7% of the variation in CPM. The remaining 79.3% is unexplained and is due to factors outside the model. Therefore, the factors SI, HWF, HB, GDP, CDR, DP, FS, Y & LE together explain only 20.7% of the variation in CPM.
• The p-value of the coefficient of female smokers is less than 5%, so it is significant. The remaining factors have p-values above 5% and are not significant.
April-2020
CPM^ = -969.6 + 2.83(SI) - 0.018(HWF) - 0.0040(HB) – 24.84(GDP) – 3.22(CDR) + 24.90(DP) + 40.82(FS)-
39.78(Y) + 75.5(LE)
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.48883
R Square 0.238954
Adjusted R Square 0.196674
Standard Error 959.788
Observations 172
ANOVA
df SS MS F Significance F
Regression 9 46856539 5206282 5.651673 8.6E-07
Residual 162 1.49E+08 921193.1
Total 171 1.96E+08
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Female smokers 40.82922 12.42525 3.285986 0.001246 16.29287 65.36556 16.29287 65.36556
aged 70 older -39.7867 48.68063 -0.8173 0.414957 -135.917 56.34369 -135.917 56.34369
life expectancy 75.52295 43.32416 1.743206 0.083195 -10.0299 161.0759 -10.0299 161.0759
• From ANOVA, the F-statistic for the model is 5.65 with a Significance F (p-value) of 8.6E-07, which is below the 5% significance level; hence the model as a whole is significant.
• The intercept of -969.61 is the predicted value of cases per million (CPM) when every independent variable is zero. It is not a minimum value of CPM: a negative case count is impossible and an all-zero setting is not realistic, so the intercept has no practical interpretation on its own.
• For any independent variable, for example Stringency Index (SI), a 1 unit increase in SI is associated with an increase of 2.83 units in predicted CPM, keeping other factors constant. For cardiovascular death rate (CDR), a 1 unit increase in CDR is associated with a decrease of 3.22 units in predicted CPM, keeping other factors constant. The other coefficients are read in the same way.
• Checking p-values individually: a variable is significant at the 5% level only when its p-value is below 0.05 (if the p-value is low, the null hypothesis must go). Not every variable in this model meets that threshold; for example, aged 70 older (p = 0.415) and life expectancy (p = 0.083) do not.
• R2 being 0.239 suggests that the model explains 23.9% of the variation in CPM. The remaining 76.1% is unexplained and is due to factors outside the model. Therefore, the factors SI, HWF, HB, GDP, CDR, DP, FS, Y & LE together explain only 23.9% of the variation in CPM.
• The p-values of the intercept and of the coefficient of female smokers (p = 0.0012) are less than 5%, so these are significant. The remaining factors have p-values above 5% and are not significant.
March-2020
CPM^ = 654.76 - 0.426(SI) - 0.00147(HWF) - 0.0011(HB) + 80.94(GDP) - 1.44(CDR) - 1.50(DP) + 33.84(FS) -
78.12(Y) - 7.31(LE)
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.536895
R Square 0.288256
Adjusted R Square 0.233506
Standard Error 572.818
Observations 127
ANOVA
df SS MS F Significance F
Regression 9 15547952 1727550 5.264988 5.15E-06
Residual 117 38390091 328120.4
Total 126 53938043
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
GDP per Capita 80.94362 49.49364 1.635435 0.104645 -17.0759 178.9632 -17.0759 178.9632
Cardiovasc death rate -1.44579 0.529866 -2.72859 0.007342 -2.49516 -0.39641 -2.49516 -0.39641
Diabetes Prevalence -1.50334 13.85254 -0.10852 0.913766 -28.9376 25.93089 -28.9376 25.93089
Female smokers 33.84933 8.080418 4.189056 5.46E-05 17.84648 49.85217 17.84648 49.85217
aged 70 older -78.1284 37.55589 -2.08032 0.039679 -152.506 -3.75091 -152.506 -3.75091
life expectancy -7.31696 30.03836 -0.24359 0.807977 -66.8064 52.17244 -66.8064 52.17244
• From ANOVA, the F-statistic for the model is 5.26 with a Significance F (p-value) of 5.15E-06, which is below the 5% significance level; hence the model as a whole is significant.
• The intercept of 654.76 is the predicted value of cases per million (CPM) when every independent variable is zero. It is not a minimum value of CPM, and since an all-zero setting is not realistic it has no practical interpretation on its own.
• For any independent variable, for example Stringency Index (SI), a 1 unit increase in SI is associated with a decrease of 0.426 units in predicted CPM, keeping other factors constant. For GDP per capita (GDP), a 1 unit increase in GDP is associated with an increase of 80.94 units in predicted CPM, keeping other factors constant. The other coefficients are read in the same way.
• Checking p-values individually: a variable is significant at the 5% level only when its p-value is below 0.05 (if the p-value is low, the null hypothesis must go). Not every variable in this model meets that threshold; for example, GDP per capita (p = 0.105), diabetes prevalence (p = 0.914) and life expectancy (p = 0.808) do not.
• R2 being 0.288 suggests that the model explains 28.8% of the variation in CPM. The remaining 71.2% is unexplained and is due to factors outside the model. Therefore, the factors SI, HWF, HB, GDP, CDR, DP, FS, Y & LE together explain only 28.8% of the variation in CPM.
• The p-values of the intercept and of the coefficients of cardiovascular death rate, female smokers and aged 70 older are less than 5%, so these are significant. The remaining factors have p-values above 5% and are not significant.
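The 5% decision rule can be applied mechanically to the p-values listed in the March-2020 coefficient rows above. The sketch below copies those p-values and filters them; the variable names are illustrative, and rows not shown in the table (intercept, SI, HWF, HB) are omitted.

```python
ALPHA = 0.05  # significance level used throughout the report

# P-values copied from the March-2020 coefficient rows shown above
p_values = {
    "GDP per Capita": 0.104645,
    "Cardiovasc death rate": 0.007342,
    "Diabetes Prevalence": 0.913766,
    "Female smokers": 5.46e-05,
    "aged 70 older": 0.039679,
    "life expectancy": 0.807977,
}

# Reject the null hypothesis of a zero coefficient when p < alpha
significant = [name for name, p in p_values.items() if p < ALPHA]
not_significant = [name for name, p in p_values.items() if p >= ALPHA]

print("Significant:", significant)
print("Not significant:", not_significant)
```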
February-2020
CPM^ = 20.59 + 0.482(SI) + 0.3074(HWF) - 3.1175(GDP) - 0.03967(CDR) - 1.28201(DP) +
0.581109(Y) + 1.937(LE)
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.866625
R Square 0.751038
Adjusted R Square 0.654219
Standard Error 8.843517
Observations 26
ANOVA
df SS MS F Significance F
Regression 7 4246.699 606.6713 7.757172 0.000221
Residual 18 1407.74 78.2078
Total 25 5654.44
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Handwashing facilities 0.30749 0.259602 1.184463 0.251634 -0.23791 0.852894 -0.23791 0.852894
GDP per Capita -3.11755 1.299785 -2.39851 0.027511 -5.8483 -0.3868 -5.8483 -0.3868
Cardiovasc death rate -0.03967 0.024719 -1.60498 0.1259 -0.09161 0.012259 -0.09161 0.012259
Diabetes Prevalence -1.28201 0.758361 -1.6905 0.108175 -2.87527 0.311247 -2.87527 0.311247
aged 70 older 0.581109 0.83109 0.699213 0.493356 -1.16495 2.327164 -1.16495 2.327164
life expectancy 1.93717 0.795527 2.435078 0.025517 0.26583 3.60851 0.26583 3.60851
• From ANOVA, the F-statistic for the model is 7.76 with a Significance F (p-value) of 0.000221, which is below the 5% significance level; hence the model as a whole is significant.
• The intercept of 20.59 is the predicted value of cases per million (CPM) when every independent variable is zero. It is not a minimum value of CPM, and since an all-zero setting is not realistic it has no practical interpretation on its own.
• For any independent variable, for example Stringency Index (SI), a 1 unit increase in SI is associated with an increase of 0.48 units in predicted CPM, keeping other factors constant. For GDP per capita (GDP), a 1 unit increase in GDP is associated with a decrease of 3.12 units in predicted CPM, keeping other factors constant. The other coefficients are read in the same way.
• Checking p-values individually, GDP per capita (p = 0.028) and life expectancy (p = 0.026) are significant at the 5% level, while handwashing facilities (p = 0.252), cardiovascular death rate (p = 0.126), diabetes prevalence (p = 0.108) and aged 70 older (p = 0.493) are not.
• R2 being 0.751 suggests that the model explains 75.1% of the variation in CPM. The remaining 24.9% is unexplained and is due to factors outside the model. Therefore, the factors SI, HWF, GDP, CDR, DP, Y & LE together explain 75.1% of the variation in CPM. With only 26 observations, the Adjusted R Square of 0.654 is the more conservative measure of fit.
• The p-values of the intercept and of the coefficients of Average Stringency Index, GDP per capita and life expectancy are less than 5%, so these are significant. The remaining factors have p-values above 5% and are not significant.
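Each coefficient row in the February-2020 table is internally linked: the t Stat is the coefficient divided by its standard error, and the 95% limits are the coefficient plus or minus a critical t value times the standard error. The sketch below checks this for the life expectancy row. The critical value 2.100922 (two-sided 5%, 18 residual degrees of freedom) is taken from a standard t table and is an assumption of this sketch rather than a number stated in the report.

```python
# Life expectancy row copied from the February-2020 coefficient table
coefficient = 1.93717
standard_error = 0.795527

# Assumption: two-sided 5% critical value of Student's t with 18 degrees of freedom
T_CRITICAL_DF18 = 2.100922

# t Stat = coefficient / standard error
t_stat = coefficient / standard_error

# 95% confidence interval = coefficient +/- t_critical * standard error
margin = T_CRITICAL_DF18 * standard_error
lower_95 = coefficient - margin
upper_95 = coefficient + margin

print(round(t_stat, 4), round(lower_95, 4), round(upper_95, 4))
```

The interval excludes zero, which is the same conclusion as a p-value below 0.05 for this coefficient.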
March-2020
DPM^ = -2.089 + 0.279(SI) + 0.0000859(HWF) - 0.000028(HB) + 3.39(GDP) - 0.041(CDR) - 0.329(DP) + 0.692(FS) - 1.619(Y)
- 2.300(LE)
Regression Statistics
Multiple R 0.483585
R Square 0.233855
Adjusted R Square 0.174921
Standard Error 23.50812
Observations 127
ANOVA
df SS MS F Significance F
Regression 9 19735.89 2192.877 3.968063 0.000194
Residual 117 64657.89 552.6315
Total 126 84393.78
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Avg. Stringency Index 0.279127 0.156239 1.786537 0.076602 -0.0303 0.58855 -0.0303 0.58855
Handwashing facilities per million 8.59E-05 0.000195 0.440007 0.660744 -0.0003 0.000472 -0.0003 0.000472
Hospital beds per million -2.8E-05 6.98E-05 -0.40167 0.688658 -0.00017 0.00011 -0.00017 0.00011
GDP per Capita 3.390891 2.03119 1.669411 0.09771 -0.63177 7.413557 -0.63177 7.413557
Cardiovasc death rate -0.04124 0.021745 -1.89633 0.060382 -0.0843 0.001829 -0.0843 0.001829
Diabetes Prevalence -0.32882 0.5685 -0.5784 0.564107 -1.4547 0.797065 -1.4547 0.797065
Female smokers 0.691642 0.331616 2.085672 0.039182 0.034894 1.348389 0.034894 1.348389
aged 70 older -1.61868 1.541272 -1.05022 0.295779 -4.67109 1.433726 -4.67109 1.433726
life expectancy -2.29979 1.232757 -1.86556 0.064607 -4.7412 0.141624 -4.7412 0.141624
• The intercept of -2.08964 is the predicted value of deaths per million (DPM) when every independent variable is zero. It is not a minimum value of DPM: a negative death count is impossible and an all-zero setting is not realistic, so the intercept has no practical interpretation on its own.
• For any independent variable, for example female smokers, a 1 unit increase in female smokers is associated with an increase of 0.691642 units in predicted DPM, keeping other factors constant. For GDP per capita, a 1 unit increase in GDP is associated with an increase of 3.390 units in predicted DPM, keeping other factors constant. The other coefficients are read in the same way.
• Checking p-values individually, among the variables only female smokers (p = 0.039) is significant at the 5% level. All other variables have p-values above 0.05, so the null hypothesis of a zero coefficient cannot be rejected for them.
• R2 being 0.234 suggests that the model explains 23.4% of the variation in DPM. The remaining 76.6% is unexplained and is due to factors outside the model. Therefore, the factors SI, HWF, HB, GDP, CDR, DP, FS, Y & LE together explain only 23.4% of the variation in DPM.
April-2020
DPM^ = 51.82 + 0.21(SI) - 0.0003(HWF) - 9.8(HB) + 6.07(GDP) – 0.18(CDR) – 0.46(DP) + 4.05(FS) –
4.91(Y) – 0.72(LE)
Regression Statistics
Multiple R 0.517536
R Square 0.267843
Adjusted R Square 0.227168
Standard Error 76.70034
Observations 172
ANOVA
df SS MS F Significance F
Regression 9 348647 38738.56 6.584895 5.47E-08
Residual 162 953036.7 5882.942
Total 171 1301684
• The intercept of 51.82364 is the predicted value of deaths per million (DPM) when every independent variable is zero. It is not a minimum value of DPM, and since an all-zero setting is not realistic it has no practical interpretation on its own.
• For any independent variable, for example female smokers, a 1 unit increase in female smokers is associated with an increase of 4.054457 units in predicted DPM, keeping other factors constant. For GDP per capita, a 1 unit increase in GDP is associated with an increase of 6.071301 units in predicted DPM, keeping other factors constant. The other coefficients are read in the same way.
• Checking p-values individually: a variable is significant at the 5% level only when its p-value is below 0.05 (if the p-value is low, the null hypothesis must go). Each coefficient's p-value has to be compared with this threshold separately.
• R2 being 0.268 suggests that the model explains 26.8% of the variation in DPM. The remaining 73.2% is unexplained and is due to factors outside the model. Therefore, the factors SI, HWF, HB, GDP, CDR, DP, FS, Y & LE together explain only 26.8% of the variation in DPM.
May-2020
Regression Statistics
Multiple R 0.498329
R Square 0.248332
Adjusted R Square 0.192881
Standard Error 32.74248
Observations 132
ANOVA
df SS MS F Significance F
Regression 9 43210.57 4801.174 4.478416 4.31E-05
Residual 122 130792.5 1072.07
Total 131 174003.1
• The intercept of 19.22113 is the predicted value of deaths per million (DPM) when every independent variable is zero. It is not a minimum value of DPM, and since an all-zero setting is not realistic it has no practical interpretation on its own.
• For any independent variable, for example female smokers, a 1 unit increase in female smokers is associated with an increase of 0.976192 units in predicted DPM, keeping other factors constant. For GDP per capita, a 1 unit increase in GDP is associated with an increase of 2.797475 units in predicted DPM, keeping other factors constant. The other coefficients are read in the same way.
• Checking p-values individually: a variable is significant at the 5% level only when its p-value is below 0.05 (if the p-value is low, the null hypothesis must go). Each coefficient's p-value has to be compared with this threshold separately.
• R2 being 0.248 suggests that the model explains 24.8% of the variation in DPM. The remaining 75.2% is unexplained and is due to factors outside the model. Therefore, the factors SI, HWF, HB, GDP, CDR, DP, FS, Y & LE together explain only 24.8% of the variation in DPM.
June-2020
DPM^ = 10.73 + 0.116(SI) + 0.00000713(HWF) - 0.00012(HB) - 0.36(GDP) - 21.7(CDR) - 0.34(DP) + 21.3(FS) +
8.76(MS) + 1.194(X) - 1.455(Y) - 1.49(LE)
Regression Statistics
Multiple R 0.555506
R Square 0.308587
Adjusted R Square 0.245207
Standard Error 26.2231
Observations 132
ANOVA
df SS MS F Significance F
Regression 11 36828.88 3348.08 4.868865 3.4E-06
Residual 120 82518.11 687.6509
Total 131 119347
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Avg. Stringency Index 0.116113 0.16302 0.712266 0.477683 -0.20665 0.438881 -0.20665 0.438881
Handwashing facilities per million 7.13E-06 0.000195 0.036489 0.970953 -0.00038 0.000394 -0.00038 0.000394
Hospital beds per million -0.00012 8.08E-05 -1.50653 0.13456 -0.00028 3.83E-05 -0.00028 3.83E-05
GDP per Capita -0.36449 2.19629 -0.16596 0.868471 -4.71299 3.984014 -4.71299 3.984014
Cardiovasc death rate -21.7362 5.817333 -3.73646 0.000287 -33.2541 -10.2183 -33.2541 -10.2183
Diabetes Prevalence -0.3414 0.104149 -3.27798 0.001368 -0.5476 -0.13519 -0.5476 -0.13519
Female smokers 21.35626 6.93287 3.080435 0.002563 7.629655 35.08286 7.629655 35.08286
Male smokers 8.76054 1.791963 4.888795 3.18E-06 5.212578 12.3085 5.212578 12.3085
aged 65 older 1.194068 0.666279 1.792143 0.07563 -0.12512 2.513254 -0.12512 2.513254
aged 70 older -1.45451 1.915234 -0.75944 0.449078 -5.24654 2.337523 -5.24654 2.337523
life expectancy -1.49428 1.35375 -1.10381 0.271885 -4.17461 1.186051 -4.17461 1.186051
• The intercept of 10.73925 is the predicted value of deaths per million (DPM) when every independent variable is zero. It is not a minimum value of DPM, and since an all-zero setting is not realistic it has no practical interpretation on its own.
• For any independent variable, for example female smokers, a 1 unit increase in female smokers is associated with an increase of 21.35626 units in predicted DPM, keeping other factors constant. For male smokers, a 1 unit increase is associated with an increase of 8.76054 units in predicted DPM, keeping other factors constant. The other coefficients are read in the same way.
• Checking p-values individually, cardiovascular death rate (p = 0.0003), diabetes prevalence (p = 0.0014), female smokers (p = 0.0026) and male smokers (p = 3.18E-06) are significant at the 5% level. The other variables have p-values above 0.05, so the null hypothesis of a zero coefficient cannot be rejected for them.
• R2 being 0.309 suggests that the model explains 30.9% of the variation in DPM. The remaining 69.1% is unexplained and is due to factors outside the model. Therefore, the factors SI, HWF, HB, GDP, CDR, DP, FS, MS, X, Y & LE together explain only 30.9% of the variation in DPM.
July-2020
DPM^ = -8.99 + 0.54(SI) - 3.300(HWF) -8.89(HB) - 2.919(GDP) – 19.899(CDR) - 0.3183(DP) +
19.288(FS) + 8.09(MS) + 0.94(X) – 0.046(Y) + 1.64(LE)
Regression Statistics
Multiple R 0.577905027
R Square 0.33397422
Adjusted R Square 0.272921857
Standard Error 31.147874
Observations 132
ANOVA
df SS MS F Significance F
Regression 11 58379.45 5307.222 5.470291 5.04E-07
Residual 120 116422.8 970.1901
Total 131 174802.3
• From ANOVA, the F-statistic for the model is 5.47 with a Significance F (p-value) of 5.04E-07, which is below the 5% significance level; hence the model as a whole is significant.
• The intercept of -8.993 is the predicted value of deaths per million (DPM) when every independent variable is zero. It is not a minimum value of DPM: a negative death count is impossible and an all-zero setting is not realistic, so the intercept has no practical interpretation on its own.
• For any independent variable, for example female smokers, a 1 unit increase in female smokers is associated with an increase of 19.288 units in predicted DPM, keeping other factors constant. For male smokers, a 1 unit increase is associated with an increase of 8.099 units in predicted DPM, keeping other factors constant. The other coefficients are read in the same way.
• Checking p-values individually: a variable is significant at the 5% level only when its p-value is below 0.05 (if the p-value is low, the null hypothesis must go). Each coefficient's p-value has to be compared with this threshold separately.
• R2 being 0.334 suggests that the model explains 33.4% of the variation in DPM. The remaining 66.6% is unexplained and is due to factors outside the model. Therefore, the factors SI, HWF, HB, GDP, CDR, DP, FS, MS, X, Y & LE together explain only 33.4% of the variation in DPM.
1. Does Government Response to the Covid-19 crisis have an impact on how the pandemic has affected
each country?
We fitted a separate regression model for each month to study how the number of cases per million and the number of deaths per million reported in different countries vary with the government response, measured by the Stringency Index.
2. Do the development indicators tell a story on the effectiveness of a country’s response to the Covid-19
pandemic?
We fitted a separate regression model for each month to study how the number of cases per million and the number of deaths per million reported in different countries vary with handwashing facilities per million, hospital beds per million and GDP per capita.
3. Experts are of the opinion that lifestyle diseases and comorbidities are major risk factors in the fight
against the Covid-19 pandemic. Certain lifestyle choices and certain demographic factors also tend to
worsen the situation. Does the data give credence to these claims?
We fitted a separate regression model for each month to study how the number of cases per million and the number of deaths per million reported in different countries vary with the cardiovascular death rate, diabetes prevalence, the proportions of male and female smokers and life expectancy in those countries.
4. How has Covid-19 affected different age groups with respect to severity in terms of number of deaths
reported?
We fitted a regression model to study how the number of deaths in a country is related to the number of people aged above 65 and above 70.
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.478988
R Square 0.22943
Adjusted R Square 0.220417
Standard Error 13134.46
Observations 174
ANOVA
df SS MS F Significance F
Regression 2 8783290056 4391645028 25.45676 2.1E-10
Residual 171 29499880104 172513918.7
Total 173 38283170160
Dependent Variable: Total number of deaths for month January to July (D)
Independent Variable:
Total population older than 65, X
Total population older than 70, Y
5. How can we assess the overall performance of the world against Covid-19 from this data?
A box-and-whisker plot of cases per million or deaths per million across countries can be used to represent the overall performance of the world:
i. The box indicates the middle 50% range of the data, from the first quartile to the third quartile.
ii. The quartile points and the lower and upper whisker points are plotted; points that lie beyond the whiskers are the outliers.
The outliers on the high side show the countries that are most affected by the virus.
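The quantities behind such a plot can be computed without drawing it. The sketch below is one possible way to do so, assuming the common convention that whiskers extend to the most extreme observations within 1.5 times the interquartile range of the box; the report does not state which convention it intends. The country labels and values are hypothetical, and the function name `box_plot_summary` is illustrative.

```python
import statistics


def box_plot_summary(values_by_country):
    """Quartiles, whisker ends and outliers under the 1.5 * IQR convention (an assumption)."""
    values = sorted(values_by_country.values())
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # width of the box: the middle 50% of the data
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = {
        country: v
        for country, v in values_by_country.items()
        if v < lower_fence or v > upper_fence
    }
    return {
        "q1": q1, "median": median, "q3": q3,
        "lower_whisker": min(inside), "upper_whisker": max(inside),
        "outliers": outliers,
    }


# Hypothetical cases-per-million values for ten unnamed countries (not report data)
sample = {"A": 10, "B": 12, "C": 14, "D": 15, "E": 18,
          "F": 20, "G": 22, "H": 25, "I": 30, "J": 400}
summary = box_plot_summary(sample)
print(summary)
```

In this toy sample only country J lies beyond the upper whisker, so it is the one flagged as most affected.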