

A.J.M.V.P.S.s.
New Arts, Commerce & Science College, Ahmednagar.

STATISTICAL PROJECT

SY.B.SC
Under Government Scheme

DBT STAR

2020-2021

“Regression analysis is the hydrogen bomb of the statistics arsenal.”

― Charles Wheelan

TITLE OF PROJECT:

TO STUDY AND INTERPRET THE
INCREASE OF COVID-19 DISEASE IN THE
STATES OF MAHARASHTRA, GUJARAT AND
KERALA BY COLLECTING THE DATA IN
THE MONTHS OF APRIL AND MAY.

A.J.M.V.P.S.s.
New Arts, Commerce & Science College, Ahmednagar.

Certificate
This is to certify that Mr. Pandhare Prasad Asaram of the
F.Y.B.Sc. has successfully completed the project on the topic
“INCREASE OF COVID-19 DISEASE IN THE MONTHS
OF APRIL AND MAY”, guided by
Prof. K.B. Mane, during the academic year 2020-2021 as per the
guidelines issued by the Department of Statistics.

Prof. K.B. Mane          Examiner          Prof. M.S. Kasture
(Teacher In-charge)                        (H.O.D., Statistics Department)

Name of the Student

Sr. No. Name of the Students


1 Pandhare Prasad Asaram
2 Kalbhor Nikhil Sagar
3 Ghodake Mahesh Ashok
4 Jagtap Tanishq Jitendra
5 Shinde Abhijeet Maruti
6 Anarase Dnyaneshwari

INDEX

Sr. No.   Topic

1         Acknowledgement
2         Introduction of Data
3         Methodology
4         Tools Used in Project
5         Interpretation
6         Graphs
7         Limitations
8         Summary

ACKNOWLEDGEMENT

We take this opportunity to express our profound gratitude and deep
regards to our guide Prof. A.B. Bhasme for his guidance, monitoring and
constant support throughout the course of this project. We also take this
opportunity to express a deep sense of gratitude to our H.O.D.
Prof. M.S. Kasture for his cordial support, valuable information and
guidance, which helped us in completing this task through its various
stages.
We are also thankful to our group for their kind co-operation and
encouragement to complete this project.

INTRODUCTION OF DATA

Statistics has become an integral part of our daily lives. Statistics can
be considered as numerical statements of facts, which are a highly
convenient and meaningful form of communication. The subject of
statistics is primarily concerned with making decisions about various
properties of some population of interest, such as stock market trends,
unemployment rates, inflation rates over the years and so on.
In this project we study the increase of the COVID-19 disease in the
months of April and May, a middle period of the outbreak in which the
disease was spreading rapidly. We take the day-wise data of the two
months with three variables, namely No. of active cases, No. of
recoveries and No. of deaths, for three states. The largest no. of cases
is found in Maharashtra state, which also has the highest number of
deaths. We therefore compare the three states Maharashtra, Gujarat and
Kerala for a detailed study of the spread of the COVID-19 disease.
The variables, i.e. No. of active cases, No. of recoveries and No. of
deaths, depend on each other. Regression, correlation and time series
plots are therefore suitable statistical methods for studying their
interdependence and the nature of their relation, and they can be
implemented with the help of MS-EXCEL.

METHODOLOGY

The data collection was done in the month of March 2021. We collected
the data from the website of the World Health Organization. The data
give us information about how the number of corona patients was
increasing, as well as how many patients recovered from the disease and
how many patients fell victim to it.
We have tried to set up a suitable framework to study the data with the
help of a regression model, correlation, covariance and a time series
model.
The data describe the spread of the COVID-19 disease in the months of
April and May.
We entered the day-wise data, which were obtained online, in MS-EXCEL.

Tools Used in Project

• MS-EXCEL
• MS-WORD

Firstly, we consider the data of Maharashtra and review the COVID-19 condition in
the months of April and May using summary statistics.

Maharashtra State

Statistic            No. of Positive Cases   No. of Recoveries   Deaths
Mean                 4825.262295             4304.5082           194.04918
Standard Error       210.8116826             204.739195          18.136089
Median               4987                    4462                178
Mode                 #N/A                    2500                98
Standard Deviation   1646.491876             1599.06423          141.64738
Sample Variance      2710935.497             2557006.42          20063.981
Kurtosis             -1.094837226            -1.17330952         -0.234885
Skewness             -0.17870797             -0.15559698         0.8869693
Range                5855                    5610                543
Minimum              1693                    1333                5
Maximum              7548                    6943                548
Sum                  294341                  262575              11837
Count                61                      61                  61
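
The rows above follow the layout of Excel's Descriptive Statistics output. As a minimal sketch, assuming the daily counts are available as a Python list, the same quantities could be computed as shown below. The function name `describe` and the short `sample` list are hypothetical illustrations and are not the project data. The skewness and kurtosis formulas are the bias-corrected sample versions, which we assume correspond to what Excel reports.

```python
import math
import statistics


def describe(values):
    """Return Excel-style descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    # Sums of standardised deviations raised to the 3rd and 4th powers.
    z3 = sum(((x - mean) / sd) ** 3 for x in values)
    z4 = sum(((x - mean) / sd) ** 4 for x in values)
    skew = n / ((n - 1) * (n - 2)) * z3
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {
        "mean": mean,
        "standard_error": sd / math.sqrt(n),
        "median": statistics.median(values),
        "standard_deviation": sd,
        "sample_variance": statistics.variance(values),
        "kurtosis": kurt,
        "skewness": skew,
        "range": max(values) - min(values),
        "minimum": min(values),
        "maximum": max(values),
        "sum": sum(values),
        "count": n,
    }


# Hypothetical daily counts, for illustration only (not the project data).
sample = [2, 4, 4, 5, 7, 9, 11]
stats = describe(sample)
for name, value in stats.items():
    print(name, value)
```

The reported figures can be cross-checked the same way: for the Maharashtra positive cases, the standard deviation 1646.491876 divided by the square root of 61 is about 210.81, which agrees with the standard error in the table.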

Regression model

X (independent): No. of Positive Cases
Y (dependent):   No. of Recoveries

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.966403
R Square            0.933935
Adjusted R Square   0.932815
Standard Error      414.4774
Observations        61

ANOVA
             df   SS         MS         F          Significance F
Regression   1    1.43E+08   1.43E+08   834.0614   1.653E-36
Residual     59   10135700   171791.5
Total        60   1.53E+08

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%     Lower 95.0%    Upper 95.0%
Intercept      -224.317       165.5508         -1.35498   0.180591   -555.5839   106.949101    -555.5838697   106.9491
X Variable 1   0.938566       0.032499         28.88012   1.65E-36   0.873536    1.003595378   0.873535979    1.003595

Regression equation is,

Y = 0.9386 X - 224.317

There is a strong positive correlation between the two variables X and Y, where
X denotes No. of positive cases and Y denotes No. of recoveries.

CORRELATION COEFFICIENT

r = 0.966403183
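
The slope, intercept and correlation coefficient reported above come from Excel's regression tool. As a rough sketch of the underlying least-squares arithmetic, assuming the paired daily counts are held in two Python lists, they could be obtained as follows. The function `fit_line` and the two short lists are hypothetical and are used only for illustration; they are not the project data.

```python
import math


def fit_line(x, y):
    """Least-squares slope, intercept and Pearson r for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical paired counts, for illustration only (not the project data).
positive = [10, 20, 30, 40, 50]
recovered = [8, 19, 26, 41, 46]
slope, intercept, r = fit_line(positive, recovered)
r_squared = r ** 2
print(slope, intercept, r, r_squared)
```

In a simple regression with one X variable, R Square is the square of r. Squaring the reported r = 0.966403183 gives about 0.9339, which matches the R Square value in the Maharashtra output.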

Now, we consider the data of Gujarat and review the COVID-19 condition in the
months of April and May using summary statistics.

Gujarat State

Statistic            No. of Positive Cases   No. of Recoveries   Deaths
Mean                 268.6065574             170.4754098         94.13114754
Standard Error       27.13303875             13.24439476         18.99733193
Median               191                     163                 5
Mode                 190                     52                  0
Standard Deviation   211.9158071             103.4420299         148.3739055
Sample Variance      44908.30929             10700.25355         22014.81585
Kurtosis             0.340866295             -0.829522798        1.531762455
Skewness             1.074069226             0.367582669         1.668218688
Range                792                     415                 499
Minimum              20                      20                  0
Maximum              812                     435                 499
Sum                  16385                   10399               5742
Count                61                      61                  61

Regression model

X (independent): No. of Positive Cases
Y (dependent):   No. of Recoveries

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.801455
R Square            0.642331
Adjusted R Square   0.636269
Standard Error      62.38603
Observations        61

ANOVA
             df   SS         MS         F          Significance F
Regression   1    412386.2   412386.2   105.9569   8.6E-15
Residual     59   229629     3892.017
Total        60   642015.2

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept      65.39308       12.96219         5.044908   4.63E-06   39.45579    91.33037    39.45579      91.33037
X Variable 1   0.391213       0.038006         10.29354   8.6E-15    0.315164    0.467262    0.315164      0.467262

Regression equation is,

Y = 0.3912133 X + 65.39308

There is a strong positive correlation between the two variables X and Y, where
X denotes No. of positive cases and Y denotes No. of recoveries.

CORRELATION COEFFICIENT

r = 0.801455491
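
The entries of the Gujarat ANOVA table are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, and R Square is the regression sum of squares divided by the total. The sketch below recomputes these from the reported sums of squares. The variable names are our own, and the numbers are copied from the Gujarat output above.

```python
# Values copied from the Gujarat ANOVA table above.
ss_regression = 412386.2
ss_residual = 229629.0
df_regression = 1
df_residual = 59

# Total sum of squares is the sum of the two components.
ss_total = ss_regression + ss_residual

# Mean squares: sum of squares divided by degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F statistic and coefficient of determination.
f_statistic = ms_regression / ms_residual
r_squared = ss_regression / ss_total
multiple_r = r_squared ** 0.5

print(ss_total, ms_residual, f_statistic, r_squared, multiple_r)
```

Up to rounding, the recomputed values agree with the table (F about 105.96, R Square about 0.6423, Multiple R about 0.8015), which is a quick way to confirm that the output was transcribed correctly.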

Now, we consider the data of Kerala and review the COVID-19 condition in the
months of April and May using summary statistics.

Kerala State

Statistic            No. of Positive Cases   No. of Recoveries   Deaths
Mean                 187.5901639             173.9672131         21.18032787
Standard Error       10.80228111             10.73589982         2.59323502
Median               214                     209                 13
Mode                 77                      45                  10
Standard Deviation   84.3685125              83.85005808         20.25381297
Sample Variance      7118.045902             7030.83224          410.2169399
Kurtosis             -1.140442547            -1.400114904        -0.183110286
Skewness             -0.501753087            -0.284912453        1.001108503
Range                291                     275                 71
Minimum              21                      28                  0
Maximum              312                     303                 71
Sum                  11443                   10612               1292
Count                61                      61                  61

Regression model

X (independent): No. of Positive Cases
Y (dependent):   No. of Recoveries

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.914896
R Square            0.837035
Adjusted R Square   0.834273
Standard Error      34.13505
Observations        61

ANOVA
             df   SS         MS         F          Significance F
Regression   1    353103     353103     303.0402   6.44E-25
Residual     59   68746.91   1165.202
Total        60   421849.9

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept      3.396382       10.72894         0.316563   0.752693   -18.0722    24.86493    -18.0722      24.86493
X Variable 1   0.909274       0.052233         17.40805   6.44E-25   0.804756    1.013792    0.804756      1.013792

Regression equation is,

Y = 0.909274 X + 3.396382

There is a strong positive correlation between the two variables X and Y, where
X denotes No. of positive cases and Y denotes No. of recoveries.

CORRELATION COEFFICIENT

r = 0.914895994
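
The Lower 95% and Upper 95% columns of the Kerala output are confidence limits of the form estimate plus or minus a critical t value times the standard error. The sketch below reproduces them from the reported coefficients. It assumes a two-sided critical value of about 2.001 for 59 degrees of freedom, taken from a t table rather than computed, and the variable names are our own.

```python
# Coefficients and standard errors copied from the Kerala regression output.
coefficients = {
    "Intercept": (3.396382, 10.72894),
    "X Variable 1": (0.909274, 0.052233),
}

# Assumed two-sided 95% critical value of Student's t with 59 degrees of
# freedom (61 observations minus 2 estimated parameters), from a t table.
T_CRITICAL = 2.001

intervals = {}
for name, (estimate, std_error) in coefficients.items():
    margin = T_CRITICAL * std_error
    intervals[name] = (estimate - margin, estimate + margin)
    t_stat = estimate / std_error
    print(f"{name}: t = {t_stat:.3f}, 95% CI = "
          f"({intervals[name][0]:.4f}, {intervals[name][1]:.4f})")
```

The interval for the intercept contains zero, which is consistent with its large P-value (0.752693): the intercept is not significantly different from zero, while the slope clearly is.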

GRAPHS

[Figure: Percentage bar chart comparing number of positive cases, number of recoveries and deaths for Maharashtra, Gujarat and Kerala; vertical axis 0% to 100%.]

CONCLUSION:

From the above percentage bar chart, it is clearly seen that deaths as a
percentage of positive cases are highest in Gujarat (5742 of 16385,
about 35%), whereas in the other two states the death percentage is
lower (about 4% in Maharashtra and about 11% in Kerala).
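
The comparison in this conclusion rests on ratios of the two-month totals. As a small sketch, the death and recovery percentages can be recomputed from the Sum rows of the three summary tables. The dictionary layout and variable names are our own, and the totals are copied from the tables.

```python
# Two-month totals copied from the "Sum" rows of the summary statistics.
totals = {
    "Maharashtra": {"positive": 294341, "recovered": 262575, "deaths": 11837},
    "Gujarat": {"positive": 16385, "recovered": 10399, "deaths": 5742},
    "Kerala": {"positive": 11443, "recovered": 10612, "deaths": 1292},
}

ratios = {}
for state, t in totals.items():
    # Express deaths and recoveries as a percentage of positive cases.
    ratios[state] = {
        "death_pct": 100 * t["deaths"] / t["positive"],
        "recovery_pct": 100 * t["recovered"] / t["positive"],
    }
    print(f"{state}: deaths {ratios[state]['death_pct']:.1f}% of positive cases, "
          f"recoveries {ratios[state]['recovery_pct']:.1f}%")

highest_death_pct = max(ratios, key=lambda s: ratios[s]["death_pct"])
highest_recovery_pct = max(ratios, key=lambda s: ratios[s]["recovery_pct"])
```

Working with percentages rather than raw counts matters here because Maharashtra's totals are far larger than those of the other two states, so raw counts alone would hide the difference in death and recovery rates.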

GRAPHS

[Figure: Bar chart of number of positive cases, number of recoveries and deaths for Maharashtra, Gujarat and Kerala; vertical axis 0 to 350000.]

CONCLUSION:
From the above bar chart it is clearly seen that Maharashtra has by far
the largest no. of positive cases, but its recovery rate is also higher
than that of Gujarat. Kerala has the highest recovery rate, as its no. of
positive cases and no. of recoveries are nearly the same.

GRAPHS

[Figure: Scatter plot of No. of deaths (vertical axis, 0 to 600) for Maharashtra, Gujarat and Kerala; horizontal axis unlabelled, 0 to 70.]

CONCLUSION:
Comparing the number of deaths in the months of April and May, we see
that the largest no. of deaths occurred in the state of Maharashtra.
However, relative to its no. of positive cases, the no. of deaths in
Maharashtra is much smaller than in Gujarat and Kerala. At the end of
May, Gujarat has a large no. of deaths, close to that of Maharashtra.

GRAPHS

[Figure: Scatter plot of No. of positive cases (vertical axis, 0 to 8000) for Maharashtra, Gujarat and Kerala; horizontal axis unlabelled, 0 to 70.]

CONCLUSION:
Comparing the number of positive cases in the months of April and May,
we see that the number of cases in Maharashtra is far greater than the
number of cases in Gujarat and Kerala. Hence, we conclude that the
COVID-19 disease is most widely spread in the state of Maharashtra.

DEATHS

[Figure: Chart of total deaths by state: Maharashtra 11837, Gujarat 5742, Kerala 1292.]

Now we compare the no. of deaths that occurred in the three states
Maharashtra, Gujarat and Kerala.

CONCLUSION:
As seen from the data, Maharashtra has the largest no. of positive
cases, and the largest no. of deaths also occurred in Maharashtra.
However, relative to the no. of positive cases, the no. of deaths in
Maharashtra is much smaller than in Gujarat.

Limitations

1. Due to time and financial constraints we were not able
to take a sample of larger size.
2. As the sample is taken from an online website,
it cannot be called real (primary) data in the true
sense.
3. Any bias of government agencies in revealing official
information might have affected the authenticity of the
data.

SUMMARY

There is a strong positive correlation between the two variables X and Y,
where X denotes No. of positive cases and Y denotes No. of recoveries
(r = 0.966 for Maharashtra, 0.801 for Gujarat and 0.915 for Kerala).
After fitting the regression model and computing the correlation, it is
seen that the regression model fits well in all three states. The study
was done on a sample of daily data for two months for the three states
Maharashtra, Gujarat and Kerala.
It was concluded that the no. of positive cases in Maharashtra is larger
than in Gujarat and Kerala. This is a very critical situation for
Maharashtra, but its recovery rate is much better than that of Gujarat.
The no. of deaths in Maharashtra is also the greatest, but relative to
its no. of positive cases it is much smaller than in the other states.
In the data of April and May 2020, Kerala defended itself successfully
against the deadly coronavirus disease: it had the fewest positive cases
of the three states, the highest recovery rate and the fewest deaths.
