
Indian Institute of Management Kozhikode

Post Graduate Program

PGP 25 | Section B

Final Project Submission: Data Analysis (Term 1)

“EFFECT OF PANDEMIC ON ONLINE SHOPPING”

Submitted to Prof. Soumya Roy

Team Members – Group 9

Harshith Krishnan (PGP/25/087)


Naman Keshan (PGP/25/088)
Naveen T (PGP/25/090)
Radha Das (PGP/25/096)
Sree Anirudh Talabattula (PGP/25/117)
Table of Contents

Introduction
Motivation
Sampling Method
Facts and Assumptions about the Analysis
Statistical Methods used for Data Analysis
    Descriptive Statistics
        Pie-Chart
        Box Plot
        Histogram
    Two-Paired Test
    ANOVA and TUKEY table
    Regression Analysis
    Correlation
Conclusion
Limitations
Appendix

Introduction

Ecommerce was growing fast before COVID-19 hit. But the Pandemic pushed even more
consumers online. Ecommerce thrived in 2020 because of store closures and shoppers’ fear of
contracting the coronavirus in public. And figures from Q1 2021 show that the coronavirus is
still making an impact on retail spending.

The years 2020-21 will be etched in our history because of the COVID-19 Pandemic, which has
influenced every area of our lives. COVID-19 is caused by a novel coronavirus that emerged in
December 2019 in Wuhan, in China's Hubei province. The first case of COVID-19 in India was
reported in January 2020. Our country went into complete lockdown from March 25, 2020, wherein
almost all services and factories were suspended. A lethal, unrelenting second wave of
COVID-19, marked by an acute shortage of oxygen and an overburdened healthcare system
bursting at the seams, came to the fore in mid-February 2021.

This critical period of isolation has brought enormous, almost overnight changes to people's
shopping habits. People are changing what they purchase, and where and how they purchase it,
moving from conventional buying to online shopping through websites and mobile apps. Because
of the risk of contracting COVID-19, customers avoid public places, which makes online
shopping more attractive. Panic buying to fulfil the needs of autonomy, relatedness and
competence has added further traction to online shopping, amid considerable uncertainty and
a crowd mentality.

The COVID-19 Pandemic has accelerated the shift towards a more digital world. The changes
we make now will have lasting effects as the world economy begins to recover. The
acceleration of online shopping globally underscores the urgency of ensuring all countries can
seize the opportunities offered by digitalization as the world moves from pandemic response
to recovery.

Motivation

Since the COVID-19 Pandemic took hold in March 2020, the general day-to-day life of people
has come to a halt. Its consequences have included deaths, the unavailability of hospital
beds, and lockdowns. One of the most important among these is unemployment. Some people have
lost their jobs; for others, income levels have shifted due to a job change or remained
unaffected. Hence there has been a general shift in income levels. E-commerce platforms have
also grown, with 70% internet penetration across India, due to the lack of transport and
communication. The main aim of our project is to analyze the online spending patterns of
people aged 18 years and above. We looked at shifts in their income and in their online
spending before and during the Pandemic. This helped us understand whether online buying has
been affected by the lockdown and the Pandemic and, if so, to what degree. We targeted people
from all levels of society and all demographics to keep our data set diversified. We also
gathered data from people belonging to both the upper and lower classes of the community to
check the extent of the shift in online spending habits.

Sampling Method
The COVID-19 pandemic has hit the whole world and disrupted every sector of the economy. We
wanted to know how this has affected the online shopping patterns of people in India. For
background understanding, we relied on secondary data published by different organizations.
For the primary data collection, we used the cluster sampling method and circulated a form
that contained both open-ended and closed-ended questions.

A total of 100 responses were received from different geographic areas of the country, with
good demographic diversity. We mainly wanted to know the saving pattern, the change in
shopping frequency, and the amount spent on online shopping before and during the Pandemic.

Data description:
The online survey form collected the following details:

• Gender
• Age
• Geography
• Metro or Tier-1 city
• Monthly income
• Frequency of shopping before and during the Pandemic
• Average monthly spending before and during the Pandemic
• The payment method used for online shopping
• The online platform most frequently used for online shopping
• Impact on income due to the Pandemic
• Change in the saving pattern due to the onset of the Pandemic
• Impact of the COVID-19 pandemic on spending across different product categories

Facts and Assumptions about the Analysis:


• The value of α = 0.05 for all cases.
• The responses may not be exact, but they are assumed to be close to the actual
values.

Statistical Methods used for Data Analysis
❖ Descriptive Statistics
❖ Two-Paired Test
❖ ANOVA and TUKEY table
❖ Regression Analysis
❖ Correlation

Descriptive Statistics
A. Pie-Chart and Histogram

Fig. 1: We received an almost equal number of responses from males and females. No
response was obtained from the third gender

Fig.2: Most respondents were between 18 and 30, with a fair number of respondents from other
age groups. As income is a factor in our analysis, we excluded the <18 age group.

[Fig. 3: Bar chart of respondents by geography: North 19, South 39, West 14, East 17, Central 11.]
[Fig. 4: Pie chart, "Are you from metro or tier 1 cities?": Yes 69%, No 31%.]

Fig.3,4: The largest share of respondents is from South India (39%), and most respondents
(69%) are from Tier 1/Metro cities

[Fig. 5: Bar chart comparing "How frequently did you shop online per month before the pandemic?" with "How frequently did you shop online per month during the pandemic?"]
[Fig. 6: Pie chart, "Income change during pandemic": Yes 48%, No 52%.]

Fig.5,6: The overall frequency of shopping dropped during the Pandemic. This may be due
to changes in individuals' income levels

[Fig. 7: Bar chart of the number of respondents reporting Increase, Decrease, or No change in online spending for each product category.]

Fig.7: We can see a change in the spending pattern in online shopping across different
categories: grocery, healthcare and online entertainment spending increased,
whereas we can see a decreasing trend in other categories.

Fig.8,9: Shopping platforms and payment methods used. The majority preferred online
payment over cash on delivery.

B. Box Plot
Distribution of Amount spent before the Pandemic for three age groups

Observation:
Outliers for 18-30:
• 250000
• 20000
Outliers for 31-45:
• 50000
Outliers for >45:
• 40000
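
The report does not state which rule its box plots use to flag outliers. A minimal sketch, assuming the common 1.5 × IQR whisker rule, is shown below. It is applied to the 18-30 amounts listed in the during-Pandemic ANOVA data table of this report; the function name `iqr_outliers` is hypothetical. Under that assumption, the sketch flags 20000 and 250000 for this group.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying outside the k * IQR fences (box-plot whisker rule)."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles, default 'exclusive' method
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# 18-30 amounts as listed in the during-Pandemic ANOVA data table of this report
spend_18_30 = [0, 500, 1000, 1200, 1500, 2000, 2500, 3000, 3800, 4000,
               4500, 5000, 6000, 6500, 7000, 8000, 10000, 12000, 20000, 250000]

print(iqr_outliers(spend_18_30))  # [20000, 250000]
```

A different quartile convention (for example, the inclusive method) can move the fences slightly, so borderline points may be classified differently by other tools.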

Distribution of Amount spent before the Pandemic for three age groups

Observation:
Outliers for 18-30:
• 17425
Outliers for 31-45:
N/A
Outliers for >45:
• 40000
Conclusion: This shows that most of the age groups have outliers in the amount spent. From
this, we can conclude that even though there has been a general increase in the online
spending pattern, both in frequency and amount, for the general public, the number of very
large online purchases, involving larger amounts, has decreased.

C. Histograms
Count of "Has your income been impacted due to the pandemic?" by age group

Age group      No     Yes    Total
>45             8       9       17
18-30          31      23       54
31-45          13      16       29
Grand Total    52      48      100

[Figure: Bar chart of the counts above, showing No/Yes responses for each age group (>45, 18-30, 31-45).]

Count of "Are you from a Metro or Tier-1 city?" by geography

Geography      No     Yes    Total
Central         2       9       11
East            9       8       17
North           7      12       19
South           9      30       39
West            4      10       14
Grand Total    31      69      100

[Figure: Bar chart of the counts above, showing No/Yes responses for each geography (Central, East, North, South, West).]

Two-Paired Test
What was the impact of covid on average spending on online shopping?
Data was gathered on the amount of income spent on online shopping before and during the
pandemic time. So to determine whether the average amount of income spent on online
shopping reduced, increased or remained the same due to Pandemic, we performed t-Test:
Paired Two Sample for Means.
H0: The average spending per month on online shopping before Pandemic is less than or equal
to the average spending per month during Pandemic (i.e. mu_before <= mu_during)
H1: The average spending per month on online shopping before Pandemic is more than the
average spending per month during Pandemic (i.e. mu_before > mu_during)
To prove this:
t-Test: Paired Two Sample for Means
                                 Before Pandemic    During Pandemic
Mean                             6144.444444        6429.292929
Variance                         36172698.41        70218214.8
Observations                     99                 99
Pearson Correlation              0.849481266
Hypothesized Mean Difference     0
df                               98
t Stat                           -0.621946502
P(T<=t) one-tail                 0.267710662
t Critical one-tail              1.660551217
P(T<=t) two-tail                 0.535421325
t Critical two-tail              1.984467455

Conclusion: The one-tail P-value (0.268) is greater than 0.05, so we fail to reject the null
hypothesis. There is no statistically significant evidence that the average monthly spending
on online shopping before the Pandemic was higher than during the Pandemic. The sample mean
during the Pandemic (INR 6,429) is slightly higher than before it (INR 6,144), but the
two-tail P-value (0.535) shows that this difference is not statistically significant.
Why?
The slightly higher average during the Pandemic is consistent with people purchasing more
through online shopping platforms rather than at offline physical stores. It could be due to
increased health concerns during the Pandemic, restrictions on commuting due to lockdowns, or
the growth of work-from-home culture.
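
The t statistic in the output above can be reproduced from the summary figures alone, because the variance of the paired differences equals Var(before) + Var(during) - 2 r SD(before) SD(during). The sketch below is only a cross-check of the reported numbers, not the authors' spreadsheet procedure; the function name `paired_t_from_summary` is hypothetical.

```python
import math

def paired_t_from_summary(mean_a, mean_b, var_a, var_b, r, n):
    """Paired t statistic computed from means, variances, correlation and pair count."""
    # Variance of the paired differences A - B
    var_diff = var_a + var_b - 2 * r * math.sqrt(var_a * var_b)
    standard_error = math.sqrt(var_diff / n)
    return (mean_a - mean_b) / standard_error

# Values copied from the t-Test output in this report (before vs. during)
t_stat = paired_t_from_summary(
    mean_a=6144.444444, mean_b=6429.292929,
    var_a=36172698.41, var_b=70218214.8,
    r=0.849481266, n=99,
)
t_critical_one_tail = 1.660551217  # from the report, df = 98, alpha = 0.05

print(round(t_stat, 3))              # -0.622
print(t_stat > t_critical_one_tail)  # False: H0 is not rejected
```

Because the statistic is negative while H1 is right-tailed (mu_before > mu_during), it cannot exceed the positive critical value, which is why H0 is not rejected.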

ANOVA and TUKEY table
We have divided our data into three age groups:
1. 18 to 30
2. 31 to 45
3. >45
We will now analyze whether the amount spent by the three age groups was significantly
different before and during the Pandemic.

A. Analysis of online spending capacity of three age groups BEFORE the Pandemic
Ho: mu_1=mu_2=mu_3
Ha: There is a significant difference in the amount spent by the three groups

18-30     31-45     >45
0         0         0
500       500       1000
1000      1000      1500
1500      1500      2000
2000      2000      5000
3000      2500      8000
3500      3000      12000
4000      4000      15000
4500      5000      20000
5000      6000      25000
6000      7000
10000     7500
12000     7800
15000     8000
20000     9000
          10000
          12000
          14000
          18000
          40000

Anova: Single Factor

SUMMARY
Groups    Count    Sum       Average       Variance
18-30     15       88000     5866.66667    34195238.1
31-45     20       158800    7940          79682526.3
>45       10       89500     8950          76580555.6

ANOVA
Source of Variation    SS            df    MS            F             P-value       F crit
Between Groups         64881666.7    2     32440833.3    0.50803595    0.60532604    3.21994229
Within Groups          2681926333    42    63855388.9

Total                  2746808000    44

Analysis: Here, P-value>0.05. Hence, we fail to reject Ho. Thus, we can say that there was no
significant difference between the amount spent by the three age groups before the Pandemic.
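
The F statistic reported above can be recomputed directly from the three columns of amounts. The following is a minimal sketch of a one-way ANOVA in plain Python, intended only as a cross-check of the spreadsheet output; the function name `one_way_anova` is hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Amounts spent BEFORE the Pandemic, copied from the table in this section
age_18_30 = [0, 500, 1000, 1500, 2000, 3000, 3500, 4000, 4500, 5000,
             6000, 10000, 12000, 15000, 20000]
age_31_45 = [0, 500, 1000, 1500, 2000, 2500, 3000, 4000, 5000, 6000,
             7000, 7500, 7800, 8000, 9000, 10000, 12000, 14000, 18000, 40000]
age_over_45 = [0, 1000, 1500, 2000, 5000, 8000, 12000, 15000, 20000, 25000]

f_stat, df_b, df_w = one_way_anova([age_18_30, age_31_45, age_over_45])
print(round(f_stat, 3), df_b, df_w)  # 0.508 2 42
```

Since 0.508 is well below the reported F crit of 3.22, the sketch agrees with the decision not to reject Ho.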
To prove this:
TUKEY TABLE

Groups           Difference in Means    q        (1/n1+1/n2)/2    SE         ME         LL          UL
18-30 & 31-45    -2073.333333           2.858    0.05833          1930       5515.94    -7589.27    3442.60
18-30 & >45      -3083.333333           2.858    0.08333          2306.79    6592.81    -9676.14    3509.47
31-45 & >45      -1010                  2.858    0.075            2188.41    6254.49    -7264.49    5244.49

Conclusion: Since the lower limit is negative and the upper limit is positive for every pair
of groups, each interval contains 0. This means that the difference in the mean amount spent
between any two of the three groups could plausibly be zero. Hence, we can conclude that
there is no significant difference in the amount spent by any of the three age groups before
the Pandemic.
Why? From the data, we see that the income of all three groups lies in broadly the same
range. Of course, there are a few outliers in all three groups, as shown in the box plots in
the descriptive analysis, but keeping those aside, we can say that since income is similar
for all three groups before the Pandemic, their online spending capacity also lies in broadly
the same range.

B. Analysis of online spending capacity of three age groups DURING the Pandemic
Ho: mu_1=mu_2=mu_3
Ha: There is a significant difference in the amount spent by the three groups

18-30     31-45     >45
0         0         500
500       500       1000
1000      1000      1500
1200      2000      2000
1500      2500      2500
2000      3000      3000
2500      3500      4000
3000      4000      5000
3800      5000      7000
4000      6000      10000
4500      7000      11000
5000      8000      25000
6000      9000      30000
6500      10000     40000
7000      11000
8000      12000
10000     20000
12000     50000
20000
250000

Anova: Single Factor

SUMMARY
Groups    Count    Sum       Average       Variance
18-30     20       348500    17425         3019079868
31-45     18       154500    8583.33333    131919118
>45       14       142500    10178.5714    154484890

ANOVA
Source of Variation    SS             df    MS            F             P-value       F crit
Between Groups         836294505.5    2     418147253     0.33254454    0.71870241    3.18658235
Within Groups          61613446071    49    1257417267

Total                  62449740577    51

Analysis: Here P-value>0.05. Hence, we fail to reject Ho. Thus, we can say that there was no
significant difference between the amount spent by the three age groups during the Pandemic.
To prove this:
TUKEY TABLE

Groups           Difference in Means    q       (1/n1+1/n2)/2    SE           ME            LL           UL
>45 & 18-30      -7246.43               2.86    0.060714286      8737.4591    24971.6581    -32218.09    17725.23
>45 & 31-45      1595.24                2.86    0.063492063      8935.1003    25536.5166    -23941.28    27131.76
18-30 & 31-45    8841.67                2.86    0.052777778      8146.3912    23282.386     -14440.72    32124.05

Conclusion: Since the lower limit is negative and the upper limit is positive for every pair
of groups, each interval contains 0. This means that the difference in the mean amount spent
between any two of the three groups could plausibly be zero. Hence, we can conclude that
there is no significant difference in the amount spent by any of the three age groups during
the Pandemic.
Why? From the data, we see that the income of all three groups lies in broadly the same
range. Of course, there are a few outliers in all three groups, but keeping those aside, we
can say that since income is similar for all three groups during the Pandemic, their online
spending capacity also lies in broadly the same range.
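
The pairwise intervals in the Tukey table can be rebuilt from the group averages, the counts and the within-group mean square in the ANOVA output above. The sketch below takes the report's q value (2.858, shown rounded as 2.86) as given rather than deriving it, and the function name `tukey_interval` is hypothetical.

```python
import math
from itertools import combinations

# Group averages and counts from the DURING-Pandemic SUMMARY table
means = {"18-30": 17425.0, "31-45": 8583.33333, ">45": 10178.5714}
counts = {"18-30": 20, "31-45": 18, ">45": 14}
ms_within = 1257417267  # "Within Groups" MS from the ANOVA table
q = 2.858               # value used in the report's Tukey tables, taken as given

def tukey_interval(a, b):
    """Return (difference, lower limit, upper limit) for groups a and b."""
    diff = means[a] - means[b]
    se = math.sqrt(ms_within * (1 / counts[a] + 1 / counts[b]) / 2)
    margin = q * se
    return diff, diff - margin, diff + margin

for a, b in combinations(means, 2):
    diff, lower, upper = tukey_interval(a, b)
    verdict = "contains 0" if lower < 0 < upper else "excludes 0"
    print(f"{a} & {b}: diff={diff:.2f}, LL={lower:.2f}, UL={upper:.2f} ({verdict})")
```

An interval that contains 0 means the corresponding pair of group means is not significantly different at the chosen level.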

Regression Analysis
The dataset contains the responses from 100 individuals specifying their monthly income levels
and average spending per month on online shopping.

With regression analysis, we model how one variable, called the response variable
(average monthly spend), is influenced by another variable, called the explanatory variable
(monthly income).

• Before Pandemic

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.229601899
R Square             0.052717032
Adjusted R Square    0.04305088
Standard Error       6008.568695
Observations         100

ANOVA
              df    SS            MS             F              Significance F
Regression    1     196897119     196897119      5.453776047    0.02156505
Residual      98    3538083981    36102897.76
Total         99    3734981100

                  Coefficients    Standard Error    t Stat         P-value        Lower 95%      Upper 95%      Lower 90.0%    Upper 90.0%
Intercept         5505.606336     686.9064287       8.015074697    2.35586E-12    4142.462884    6868.749788    4364.96303     6646.249642
Monthly Income    0.006267282     0.002683679       2.335332106    0.02156505     0.000941608    0.011592956    0.001810895    0.010723669

Fitted Regression Equation

Average monthly spend = Beta_0 + Beta_1 * Monthly income + Error

Predicted average monthly spend = 5505.61 + 0.0063 * Monthly income

Beta 0 and Beta 1 are model coefficients or parameters

H0: No relationship between Monthly income and Average monthly spend

H0: Beta_1 = 0
HA: Beta_1 <> 0
P value (0.0216) < 0.05, so we reject H0.
There is a significant relationship between Monthly income and Average monthly spend.

Similarly, for Beta_0, P value < 0.05, so the intercept (the predicted average monthly spend
at zero income) is significantly different from zero.

Coefficients - At a monthly income of zero, the model predicts an average monthly spend of
around INR 5,506. An increase of INR 1 in the monthly income is associated with an increase
in average monthly spending of around INR 0.006.

Standard Errors - The accuracy of the estimated coefficients can be measured from their
respective standard errors.

95% CI - At a monthly income of zero, monthly spend will, on average, fall somewhere
between INR 4,142 and INR 6,869. For each INR 1 increase in monthly income, the average
increase in monthly spending lies between INR 0.0009 and INR 0.012.

Sample response analysis


                                       Sample 1    Sample 2
Income                                 85,000      45,000
Spend before Pandemic                  4,500       7,000
Predicted Spend                        6,038       5,788
Residual Error (predicted - actual)    1,538       -1,212
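
The predicted spends in the sample response table follow directly from the fitted equation. A minimal sketch, using the before-Pandemic coefficients reported above and the same sign convention as the table (predicted minus actual); the function name `predict_spend` is hypothetical.

```python
def predict_spend(income, intercept, slope):
    """Predicted average monthly spend from the fitted line."""
    return intercept + slope * income

# Coefficients from the before-Pandemic regression output in this report
BEFORE = (5505.606336, 0.006267282)

# (monthly income, actual spend before the Pandemic) for the two sample responses
samples = [(85000, 4500), (45000, 7000)]

for income, actual in samples:
    predicted = predict_spend(income, *BEFORE)
    # The report's table lists the error as predicted minus actual
    print(income, round(predicted), round(predicted - actual))
# 85000 6038 1538
# 45000 5788 -1212
```

The same function applies to any other fitted intercept and slope pair.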

• During Pandemic

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.53665982
R Square             0.288003762
Adjusted R Square    0.280738494
Standard Error       21833.66996
Observations         100

ANOVA
              df    SS             MS             F              Significance F
Regression    1     18897311415    18897311415    39.64117673    8.64517E-09
Residual      98    46717496085    476709143.7
Total         99    65614807500

                  Coefficients    Standard Error    t Stat         P-value        Lower 95%       Upper 95%      Lower 90.0%     Upper 90.0%
Intercept         1249.098371     2496.05006        0.500430016    0.617894301    -3704.231738    6202.428481    -2895.720594    5393.917337
Monthly Income    0.061398755     0.009751834       6.296123945    8.64517E-09    0.042046558     0.080750953    0.045205336     0.077592175

Fitted Regression Equation

Average monthly spend = Beta_0 + Beta_1 * Monthly income + Error

Predicted average monthly spend = 1249.10 + 0.061 * Monthly income

H0: No relationship between Monthly income and Average monthly spend

H0: Beta_1 = 0
HA: Beta_1 <> 0
P value (8.65E-09) < 0.05, so we reject H0.
There is a significant relationship between Monthly income and Average monthly spend.

For Beta_0, P value (0.618) > 0.05, so the intercept (the predicted average monthly spend at
zero income) is not significantly different from zero.

Coefficients - At a monthly income of zero, the model predicts an average monthly spend of
around INR 1,249. An increase of INR 1 in the monthly income is associated with an increase
in average monthly spend of around INR 0.061.

Standard Errors - The accuracy of the estimated coefficients can be measured from their
respective standard errors.

95% CI - At a monthly income of zero, monthly spending will, on average, fall somewhere
between INR -3,704 and INR 6,202. For each INR 1 increase in monthly income, the average
increase in monthly spending lies between INR 0.042 and INR 0.081.

Sample response analysis


                                       Sample 1    Sample 2
Income                                 85,000      45,000
Spend during Pandemic                  4,000       9,000
Predicted Spend                        6,468       4,012
Residual Error (predicted - actual)    2,468       -4,988

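Thus, it can be concluded from the fitted model that respondents with higher monthly income
tended to spend more on online shopping during the Pandemic, although individual responses,
such as the two samples above, can deviate considerably from the prediction.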
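
The headline figures in the during-Pandemic regression output are linked by simple identities: R Square is the regression sum of squares divided by the total, F is the ratio of the two mean squares, and in a one-predictor model the squared t statistic of the slope equals F. The sketch below checks the reported values against those identities and is not part of the authors' workflow.

```python
# Values copied from the during-Pandemic regression output in this report
ss_regression = 18897311415
ss_total = 65614807500
df_regression, df_residual = 1, 98
slope, slope_se = 0.061398755, 0.009751834

ss_residual = ss_total - ss_regression
r_squared = ss_regression / ss_total
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)
t_slope = slope / slope_se

print(round(r_squared, 4))     # 0.288
print(round(f_stat, 2))        # 39.64
print(round(t_slope ** 2, 2))  # 39.64
```

An R Square of about 0.29 means monthly income explains roughly 29% of the variation in spending during the Pandemic, compared with about 5% before it.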

Correlation
We computed the correlation between monthly income and average monthly spend for different
subsets of respondents to understand the impact of the Pandemic on income and spend.

Correlation between income and spend    Before Pandemic    During Pandemic
Overall                                 0.23               0.54
Male                                    0.43               0.62
Female                                  0.06               0.22
Metro & Tier 1                          0.21               0.57
Tier 2 & Tier 3                         0.14               0.54

Conclusion: It can be noted that the correlation between monthly income and monthly average
spend increased during the Pandemic for all categories of respondents, across genders and
city tiers. This suggests that shopping behaviour was more strongly tied to the availability
of income during the Pandemic.
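
Each entry in the correlation table is a Pearson correlation coefficient. A minimal sketch of the calculation is shown below. The `incomes` and `spends` lists are hypothetical illustrative values, not survey responses, and the function name `pearson_r` is an assumption. The last two lines check that the "Overall" correlations equal the square roots of the R Square values reported in the regression section.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustrative data (not taken from the survey)
incomes = [20000, 35000, 50000, 80000, 120000]
spends = [1500, 2000, 4000, 3500, 9000]
print(pearson_r(incomes, spends))

# With one explanatory variable and a positive slope, r = sqrt(R Square)
print(round(math.sqrt(0.052717032), 2))  # 0.23 (before the Pandemic)
print(round(math.sqrt(0.288003762), 2))  # 0.54 (during the Pandemic)
```

To reproduce a subgroup row such as "Male" or "Metro & Tier 1", the same function would be applied to the income and spend values of only those respondents.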

Conclusion
The study's goal was to understand the online purchasing behaviour of Indian customers during
COVID-19. The survey suggests that online purchasing will be a bright spot in India in the
coming years. Following COVID-19, people's attitudes towards online buying are improving,
especially in the basic needs and healthcare categories, and this may later spread to other
categories. Online shopping also helps reduce the fear of the coronavirus spreading from
person to person. The COVID-19 Pandemic has pushed more consumers to shop online, and
e-shopping has evolved into a more reliable source in this regard. During the coronavirus
situation, e-retailers have provided the things that shoppers frequently purchase in the
supermarket. As individuals' incomes improve, spending power increases, and online shopping
may become the main source of shopping in the future.

Limitations
• The sample size of the dataset is 100, which is considered a small dataset.
• The sample is skewed because more than 50% of the respondents are aged between
18 and 30.
• The dataset includes only people with a continuous income. It doesn’t consider people who
are unemployed or lost their jobs. As a result, the impact of the Pandemic on their online
shopping could not be analyzed.
• More than 60% of the respondents in the dataset belong to a Metro city or a Tier-1
city.

Appendix
• Google form Questions and Responses: Google Form Questions and Responses.xlsx
• Descriptive Statistics: Descriptive Statistics.xlsx
• Two Paired Analysis: Two Paired_Analysis.xlsx
• ANOVA analysis: ANOVA_Analysis.xlsx
• Regression and correlation Analysis: Regression_Correlation_Analysis.xlsx
