SMDM EXTENDED PROJECT

REPORT
By Leon D’Mello

[16/01/2022]
PGP-DSBA ONLINE
OCTOBER’ 21

Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited
Contents

Problem Statement

Sample of Data set

Let us check the types of variables in the data frame

Checking for missing values in the dataset

Descriptive statistics

Problem 1

1. Find mean cold storage temperature for Summer, Winter, and Rainy Season

2. Find the overall mean for the full year

3. Find Standard Deviation for the full year

4. Assume Normal distribution, what is the probability of temperature having fallen below 2º C?

5. Assume Normal distribution, what is the probability of temperature having gone above 4º C?

6. What will be the penalty for the AMC Company?

Problem Statement

Sample of Data set

Let us check the types of variables in the data frame

Checking for missing values in the dataset

Descriptive statistics

Problem 2

1. Which Hypothesis test shall be performed to check if corrective action is needed at the cold storage plant? Justify your answer

2. State the Hypothesis and do the necessary calculations to accept or reject the corresponding null hypothesis

3. Give your inference

Table 1

Table 2

Table 3

Table 4

Problem Statement

Cold Storage started its operations in Jan 2016. They are in the business of storing Pasteurized Fresh
Whole or Skimmed Milk, Sweet Cream, and Flavoured Milk Drinks. To ensure that there is no change of
texture, body appearance, or separation of fats, the optimal temperature to be maintained is between
2º - 4º C. In this problem statement we will explore the different

In the first year of business, they outsourced the plant maintenance work to a professional company
with stiff penalty clauses. It was agreed that if it was statistically proven that the probability of
the temperature going outside the 2º - 4º C range during the one-year contract was above 2.5% and
less than 5%, then the penalty would be 10% of the AMC (annual maintenance contract) fee. If it
exceeded 5%, then the penalty would be 25% of the AMC fee.

Sample of Data set

   Season  Month  Date  Temperature
0  Winter  Jan    1     2.3
1  Winter  Jan    2     2.2
2  Winter  Jan    3     2.4
3  Winter  Jan    4     2.8
4  Winter  Jan    5     2.5
5  Winter  Jan    6     2.4
6  Winter  Jan    7     2.8
7  Winter  Jan    8     3
8  Winter  Jan    9     2.4
9  Winter  Jan    10    2.9
Table 1

Let us check the types of variables in the data frame.


Season object
Month object
Date int64
Temperature float64
dtype: object
Season and Month are stored as object (string) columns, Date is int64, and Temperature is float64.

There are a total of 365 rows and 4 columns in the dataset.

Checking for missing values in the dataset


# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Season 365 non-null object
1 Month 365 non-null object
2 Date 365 non-null int64
3 Temperature 365 non-null float64
dtypes: float64(1), int64(1), object(2)
memory usage: 11.5+ KB
There are no missing values in the dataset.
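The row count and missing-value check can also be sketched with the Python standard library alone. This is only a dependency-free illustration: the inline CSV text holds a few rows from Table 1 rather than the full data file, and the helper name `count_missing` is hypothetical.

```python
import csv
import io

# Illustrative CSV text: the first rows of Table 1 (not the full 365-row file).
SAMPLE_CSV = """Season,Month,Date,Temperature
Winter,Jan,1,2.3
Winter,Jan,2,2.2
Winter,Jan,3,2.4
Winter,Jan,4,2.8
Winter,Jan,5,2.5
"""


def count_missing(csv_text):
    """Return (row_count, {column: number_of_empty_cells})."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {name: 0 for name in reader.fieldnames}
    rows = 0
    for row in reader:
        rows += 1
        for name in reader.fieldnames:
            value = row[name]
            # A cell counts as missing when it is absent or blank.
            if value is None or value.strip() == "":
                missing[name] += 1
    return rows, missing


rows, missing = count_missing(SAMPLE_CSV)
print(f"rows = {rows}, missing per column = {missing}")
```

A column with a non-zero count would need cleaning before any of the statistics below are computed.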

Descriptive statistics help describe and understand the features of a specific dataset by giving
short summaries about the sample and measures of the data.

             count  unique  top     freq  mean      std      min  25%  50%  75%  max
Season       365    3       Winter  123   NaN       NaN      NaN  NaN  NaN  NaN  NaN
Month        365    12      Jan     31    NaN       NaN      NaN  NaN  NaN  NaN  NaN
Date         365    NaN     NaN     NaN   15.72055  8.80832  1    8    16   23   31
Temperature  365    NaN     NaN     NaN   3.002466  0.46583  1.7  2.7  3    3.3  4.5
Table 2

1. Find mean cold storage temperature for Summer, Winter, and Rainy Season. (7 marks)

Mean cold storage temperature for Summer = (Sum of all Summer temperatures / Count of
Summer temperatures) = 377.7/120 = 3.1475

Mean cold storage temperature for Winter = (Sum of all Winter temperatures / Count of
Winter temperatures) = 341.5/123 = 2.7764

Mean cold storage temperature for Rainy = (Sum of all Rainy temperatures / Count of
Rainy temperatures) = 376.7/122 = 3.0877

2. Find the overall mean for the full year. (7 marks)

Overall Mean = (Sum of all temperature / Count of temperature) = 1095.9 /365 = 3.002466
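The arithmetic for the seasonal and overall means can be checked with a short sketch. It uses only the sums and day counts reported above; the variable names are illustrative.

```python
# Seasonal temperature sums and day counts as reported above.
season_totals = {
    "Summer": (377.7, 120),
    "Winter": (341.5, 123),
    "Rainy": (376.7, 122),
}

# Mean per season = sum of temperatures / number of days.
seasonal_means = {s: total / n for s, (total, n) in season_totals.items()}

year_total = sum(total for total, _ in season_totals.values())
year_days = sum(n for _, n in season_totals.values())
overall_mean = year_total / year_days

# The overall mean equals the day-weighted average of the seasonal means.
weighted = sum(seasonal_means[s] * n for s, (_, n) in season_totals.items()) / year_days

for season, mean in seasonal_means.items():
    print(f"{season}: {mean:.4f}")
print(f"Overall: {overall_mean:.6f}")
```

Because the seasons have different numbers of days, a plain average of the three seasonal means would not reproduce the overall mean; the weighting by day count is what makes the two agree.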

3. Find Standard Deviation for the full year.

To define standard deviation, you need to define another term called variance. In simple terms,
standard deviation is the square root of variance.

Standard Deviation = 0.465832
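The relationship between variance and standard deviation can be sketched as follows. The ten temperatures are the sample rows from Table 1, so the result will not match the full-year figure, and the use of the sample formula (dividing by n - 1) is an assumption, since the report does not state which convention was used.

```python
import math
import statistics

# First ten temperatures from Table 1; an illustrative subset, so the result
# will not match the full-year figure of 0.465832.
temps = [2.3, 2.2, 2.4, 2.8, 2.5, 2.4, 2.8, 3.0, 2.4, 2.9]

mean = sum(temps) / len(temps)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
variance = sum((t - mean) ** 2 for t in temps) / (len(temps) - 1)

# Standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(f"variance = {variance:.6f}, standard deviation = {std_dev:.6f}")
print(f"statistics.stdev cross-check = {statistics.stdev(temps):.6f}")
```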

4. Assume Normal distribution, what is the probability of temperature having fallen below 2º C?

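Under the Normal assumption, with mean 3.002466 and standard deviation 0.465832, the z-score for
2º C is z = (2 - 3.002466) / 0.465832 ≈ -2.152, so the probability of temperature having fallen
below 2º C = P(Z < -2.152) ≈ 0.0157

For comparison, the observed proportion of days below 2º C = (Number of temperatures below
2º C / Total number of temperatures) = 3/365 = 0.008219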

5. Assume Normal distribution, what is the probability of temperature having gone above 4º C?

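Under the Normal assumption, with mean 3.002466 and standard deviation 0.465832, the z-score for
4º C is z = (4 - 3.002466) / 0.465832 ≈ 2.141, so the probability of temperature having gone
above 4º C = P(Z > 2.141) ≈ 0.0161

For comparison, the observed proportion of days above 4º C = (Number of temperatures above
4º C / Total number of temperatures) = 7/365 = 0.019178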
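Questions 4 and 5 ask for probabilities under a Normal assumption, which can be computed from a Normal distribution fitted with the full-year mean and standard deviation reported above. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

# Full-year mean and standard deviation as reported above.
temperature = NormalDist(mu=3.002466, sigma=0.465832)

p_below_2 = temperature.cdf(2.0)        # P(X < 2)
p_above_4 = 1.0 - temperature.cdf(4.0)  # P(X > 4)
p_outside = p_below_2 + p_above_4       # P(X outside the 2-4 range)

print(f"P(X < 2) = {p_below_2:.4f}")
print(f"P(X > 4) = {p_above_4:.4f}")
print(f"P(outside 2-4) = {p_outside:.4f}")
```

The two tails are added because the events are mutually exclusive; their sum is the quantity the penalty clause refers to.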

6. What will be the penalty for the AMC Company?

As the probability of the temperature going outside 2º - 4º C is above 2.5% and less than 5%, the
penalty for the AMC company will be 10% of the annual maintenance fee.
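The penalty clause from the problem statement can be written as a small function. This is a sketch: the function name `penalty_rate` is hypothetical, and because the contract wording does not say what happens at exactly 5%, that boundary is treated here as falling in the 10% band, which is an assumption.

```python
def penalty_rate(p_outside):
    """Penalty as a fraction of the AMC fee for a given out-of-range probability."""
    if p_outside > 0.05:
        # Probability exceeded 5%: penalty is 25% of the AMC fee.
        return 0.25
    if p_outside > 0.025:
        # Above 2.5% and up to 5% (the exact 5% boundary is an assumption).
        return 0.10
    # At or below 2.5%: no penalty is specified in the contract.
    return 0.0


print(penalty_rate(0.0318))    # Normal-based probability of being outside 2-4
print(penalty_rate(10 / 365))  # observed proportion of days outside 2-4
```

Both the Normal-based probability and the observed proportion fall in the same band, so the penalty does not depend on which of the two is used.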

Problem Statement

In Mar 2018, Cold Storage started getting complaints from their clients that they have been getting
complaints from end consumers of the dairy products going sour and often smelling. On getting
these complaints, the supervisor pulls out data of the last 35 days’ temperatures. As a safety
measure, the Supervisor has been vigilant to maintain the mean temperature 3.9º C or below.
Assume 3.9º C as the upper acceptable mean temperature and at alpha = 0.1 do you feel that there
is a need for some corrective action in the Cold Storage Plant or is it that the problem is from the
procurement side from where Cold Storage is getting the Dairy Products. The data of the last 35 days
is in “Cold_Storage_Mar2018_.csv”

Sample of Data set

   Season  Month  Date  Temperature
0  Summer  Feb    11    4
1  Summer  Feb    12    3.9
2  Summer  Feb    13    3.9
3  Summer  Feb    14    4
4  Summer  Feb    15    3.8
Table 3

Let us check the types of variables in the data frame.


Season object
Month object
Date int64
Temperature float64
dtype: object
Season and Month are stored as object (string) columns, Date is int64, and Temperature is float64.

There are a total of 35 rows and 4 columns in the dataset.

Checking for missing values in the dataset


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 35 entries, 0 to 34
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Season 35 non-null object
1 Month 35 non-null object
2 Date 35 non-null int64
3 Temperature 35 non-null float64
dtypes: float64(1), int64(1), object(2)
memory usage: 1.2+ KB
There are no missing values in the dataset.

Descriptive statistics help describe and understand the features of a specific dataset by giving
short summaries about the sample and measures of the data.

             count  mean      std       min  25%  50%  75%   max
Date         35     14.4      7.389181  1    9.5  14   19.5  28
Temperature  35     3.974286  0.159674  3.8  3.9  3.9  4.1   4.6
Table 4

1. Which Hypothesis test shall be performed to check if corrective action is needed at the cold
storage plant? Justify your answer.

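A one-sample, right-tailed t-test should be performed to check if corrective action is
needed at the cold storage plant. A t-test is appropriate because the population standard
deviation is unknown, and the test is right-tailed because the concern is whether the mean
temperature exceeds 3.9º C.

We assume that the samples are randomly selected, independent and come from a
normally distributed population.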

2. State the Hypothesis and do the necessary calculations to accept or reject the corresponding
null hypothesis.

For Cold Storage, the null and alternative hypotheses to test whether the mean
temperature maintained is 3.9º C or below are given below.

Ho: mean temperature maintained ≤ 3.9

Ha: mean temperature maintained > 3.9

We assume that the samples are randomly selected, independent and come from a
normally distributed population with unknown standard deviation.

T-Test Right Tailed

Now we will perform the t-test on the sample data to test the hypothesis.

First decide the level of significance:

The level of significance is defined as the probability of rejecting a null hypothesis by the
test when it is actually true. A level of significance of 0.1 corresponds to the 90% confidence level.

Level of Significance α = 0.1

tstat = (x̄ − μ) / (s / √n)

x̄ = 3.974286
μ = 3.9
s = 0.159674
n = 35

tstat = 2.752369846

p-value = 0.004716 (right-tailed, 34 degrees of freedom)

p-value < α

Hence, we reject the null hypothesis.

3. Give your inference.

From the above hypothesis test we can conclude that the mean temperature over the last 35 days
was above 3.9º C, so corrective action is needed at the Cold Storage plant. The test gives no
reason to attribute the complaints to the procurement side from where Cold Storage is getting the
Dairy Products.
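The calculation in question 2 can be reproduced from the summary statistics in Table 4. For the hypotheses Ho: mean ≤ 3.9 against Ha: mean > 3.9, the evidence against Ho lies in the upper tail, so the p-value is P(T > tstat) with n - 1 = 34 degrees of freedom. The sketch below uses only the Python standard library and integrates the t density numerically with Simpson's rule; the helper names `t_pdf` and `t_upper_tail` are hypothetical, and a statistics library such as SciPy (`scipy.stats.t.sf`) would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T > t), via Simpson's rule on [0, |t|] and the symmetry of the density."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3
    return upper if t >= 0 else 1.0 - upper


# Summary statistics from Table 4 and the stated significance level.
x_bar, mu_0, s, n, alpha = 3.974286, 3.9, 0.159674, 35, 0.1

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)

print(f"t statistic = {t_stat:.4f}")
print(f"right-tailed p-value = {p_value:.6f}")
print("reject Ho" if p_value < alpha else "fail to reject Ho")
```

Taking the lower tail instead would give 1 minus this p-value, which is why the direction of the alternative hypothesis has to be fixed before the p-value is read off.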
