Professional Documents
Culture Documents
PROJECT
Table of Contents
1|Page
S. No Heading Page No
1 Problem 1 3
2 Problem 2 6
3 Problem 3 9
4 Problem 4 12
5 Problem 5 14
6 Problem 6 16
7 Problem 7 17
2|Page
Problem 1
Solution:
1.1 What is the probability that a randomly chosen player would suffer an injury?
Sol:
= 145/235
= 0.617
There is 61.7 % chance that out of total players, a randomly chosen player would suffer an
injury.
Sol:
3|Page
Let’s find the probability of a player being forward and a winger separately.
Since a player can either be Forward or Winger and can’t be both simultaneously, we can’t
have P(F W).
Thus, there is 52.3% chance that a randomly chosen player is a Forward or a Winger
1.3 What is the probability that a randomly chosen player plays in a striker position and has
a foot injury?
Sol:
= 45/235
= 0.1914
Therefore, there is 19.14% chance that a randomly chosen player plays in a Striker
position and has a foot injury.
1.4 What is the probability that a randomly chosen injured player is a striker?
Sol:
4|Page
To find this, we will first see the total number of injured players and see the probability of a
randomly chosen injured player being a Striker.
= 45/145
= 0.310
Therefore, there is 31% chance that a randomly chosen injured player is a Striker.
1.5 What is the probability that a randomly chosen injured player is either a forward or an
attacking midfielder?
Sol:
Let’s first find out the probability that an injured player chosen is a forward and mid-fielder
separately.
P( Fi Mi ) will give the probability that a randomly chosen injured player is either a forward
or an attacking mid-fielder.
P( Fi Mi ) = P(Fi) + P(Mi)
Since a player can either be a Forward or a Mid-Fielder and can’t be both. Thus we can’t find
P( Fi Mi ).
Therefore,
P( Fi Mi ) = P(Fi) + P(Mi)
= 0.386 + 0.165
5|Page
= 0.551
Therefore, there is 55% chance that a randomly chosen injured player is either a forward
or an attacking midfielder.
Problem 2:
2.1 What are the probabilities of a fire, a mechanical failure, and a human error
respectively?
2.2 What is the probability of a radiation leak?
2.3 Suppose there has been a radiation leak in the reactor for which the definite cause is not
known. What is the probability that it has been caused by:
A Fire.
A Mechanical Failure.
A Human Error.
Solution:
2.1 What are the probabilities of a fire, a mechanical failure, and a human error
respectively?
Sol:
6|Page
P(R | M) = 0.50
P(R | H) = 0.10
and also,
P( F R ) = 0.1% = 0.001
P( M R ) = 0.15% = 0.0015
P( H R ) = 0.12% = 0.0012
Since, it is given that all possible accidents cannot occur simultaneously, we can say that
these are independent events.
P( F ) = P( F R ) / P( R|F)
= 0.001/0.20
= 0.005
Thus, probability of a fire at the plant is 0.5%
P( M ) = P( M R ) / P( R|M)
= 0.0015/0.5
= 0.003
Thus, probability of a mechanical failure at the plant is 0.3%
P( H ) = P(H R ) / P( R|H)
= 0.0012/0.10
= 0.012
Thus, probability of a accident due to Human error at the plant is 1.2 %
Sol:
Since, fire, mechanical failure and human error are only possible accidents at the plant, we
can say that there will be no radiation leak if these accidents do not happen.
Now, probability that radiation leak will happen when there is no accident is zero, i.e
P(R|N)= 0
7|Page
P(R N) = P(R|N) x P(R)
= 0 x P(R) = 0
Since probability that radiation leak will happen totally depends on if any accident will
happen. This means that no leakage will happen if there is no accident i.e it is completely
dependent on occurrence of an accident.
P( R ) = P( R F ) + P( R M ) + P( R H ) + P( R N)
= 0.001 + 0.0015 + 0.0012 + 0
= 0.0037
2.3 Suppose there has been a radiation leak in the reactor for which the definite cause is not
known. What is the probability that it has been caused by:
A Fire.
A Mechanical Failure.
A Human Error.
Sol:
o P( F | R )
o P( M | R )
o P( H | R )
P( F | R ) = P ( R F ) / P( R )
= 0.001/0.0037
= 0.277027
Thus, probability that a radiation leak has happened due to fire is 27%
P( M | R ) = P ( R M ) / P( R )
= 0.0015/0.0037
= 0.4054
Thus, probability that a radiation leak has happened due to mechanical failure is 40.5%
P( H | R ) = P ( R H ) / P( R )
= 0.0012/0.0037
= 0.3243
Thus, probability that a radiation leak has happened due to human error is 32.4%
8|Page
Problem 3:
The breaking strength of gunny bags used for packaging cement is normally distributed with
a mean of 5 kg per sq. centimeter and a standard deviation of 1.5 kg per sq. centimeter. The
quality team of the cement company wants to know the following about the packaging
material to better understand wastage or pilferage within the supply chain; Answer the
questions below based on the given information; (Provide an appropriate visual
representation of your answers, without which marks will be deducted)
3.1 What proportion of the gunny bags have a breaking strength less than 3.17 kg per sq
cm?
3.2 What proportion of the gunny bags have a breaking strength at least 3.6 kg per sq cm.?
3.3 What proportion of the gunny bags have a breaking strength between 5 and 5.5 kg per
sq cm.?
3.4 What proportion of the gunny bags have a breaking strength NOT between 3 and 7.5 kg
per sq cm.?
Solution:
3.1 What proportion of the gunny bags have a breaking strength less than 3.17 kg per sq
cm?
Let us find the Z score which will give the Standard normal variable
X− X X= Observed value
Z=
σ X = Mean value = 5 kg per sq cm
As per question, = Standard Deviation = 1.5 kg/cm2
Z= (3.17-5)/1.5
= -1.22
In Python, We will use norm function of scipy.stats to calculate our cumulative density
function, which will give the area to the left of distribution below 3.17
From, Python calculation, using the code scipy.stats.norm.cdf(-1.22), we get the output as
0.111.
Therefore, we can say that 11% of the gunny bags have a breaking strength less than 3.17
kg per square cm.
3.2 What proportion of the gunny bags have a breaking strength at least 3.6 kg per sq cm.?
X− X
Z=
σ
9|Page
= (3.6 – 5) / 1.5
= - 0.933
Since we are asked to give a proportion of bags with breaking strength at least 3.6 Kg/ cm 2
This implies that we want to find the area under curve to the right of P( X >= 3.6 )
Therefore, we can say that 82.5% of the gunny bags have a breaking strength of at least
3.6 kg per square cm.
3.3 What proportion of the gunny bags have a breaking strength between 5 and 5.5 kg per
sq cm.?
To find this, we will find Z- score for both 5 and 5.5 Kg/cm 2 and calculate areas under curves
correspondingly. Then subtracting the area for both with each other to find the area
between these two points.
Therefore,
Z1= (5-5)/1.5 = 0
Z2= (5.5-5)/5 = 0.3333
Therefore, we can say that 13% of the gunny bags have breaking strength between 5 and
5.5 kg/cm2.
3.4 What proportion of the gunny bags have a breaking strength NOT between 3 and 7.5 kg
per sq cm.?
To find this, we will first find Z- score for both 3 and 7.5 Kg/cm 2 and calculate areas under
curves correspondingly. Then subtracting the area for both with each other to find the area
between these two points.
To calculate the proportion not between 3 and 7.5 kg/cm2, we will subtract the result from 1
10 | P a g e
Let Z1 be the Z score corresponding to the strength of 3 Kg per sq cm.
And Let Z2 be the Z score corresponding to the strength of 7.5 Kg per sq cm.
Therefore,
Therefore, we can say that 13.9% of the gunny bags have breaking strength between 3
and 7.5 kg/cm2.
11 | P a g e
Problem 4:
Grades of the final examination in a training course are found to be normally distributed,
with a mean of 77 and a standard deviation of 8.5. Based on the given information answer
the questions below.
4.1 What is the probability that a randomly chosen student gets a grade below 85 on this
exam?
4.2 What is the probability that a randomly selected student scores between 65 and 87?
4.3 What should be the passing cut-off so that 75% of the students clear the exam?
Solution:
4.1 What is the probability that a randomly chosen student gets a grade below 85 on this
exam?
Let us find the Z score which will give the Standard normal variable
X− X X= Observed value
Z=
σ X = Mean value = 77
As per question, = Standard Deviation = 8.5
Z= (85-77)/8.5
= 0.9411
In Python, We will use norm function of scipy.stats to calculate our cumulative density
function, which will give the area to the left of distribution below 85
From, Python calculation, using the code scipy.stats.norm.cdf(0.9411), we get the output as
0.8266.
Therefore, we can say that there is 82.6% probability that a randomly chosen student gets
a grade below 85 on this exam.
4.2 What is the probability that a randomly selected student scores between 65 and 87?
To find this, we will find Z- score for both 65 and 85 and calculate areas under curves
correspondingly. Then subtracting the area for both with each other to find the area
between these two points.
Therefore,
12 | P a g e
This will give us the area between these two points.
Therefore, we can say that there is 80% probability that a randomly chosen student gets a
grade between 65 and 87 on this exam.
4.3 What should be the passing cut-off so that 75% of the students clear the exam?
To calculate this, we will use ppf function (Percent point function) of scipy.stats library,
which will return a discrete value that is less than or equal to the asked probability.
Here, we want the cut off score above which 75% of students clear the exam. We want area
to the right side of the discrete value above which 75% students pass.
Thus, the passing cut off score so that 75% of students clear the exam is 71.26
13 | P a g e
Problem 5:
Zingaro stone printing is a company that specializes in printing images or patterns on
polished or unpolished stones. However, for the optimum level of printing of the image the
stone surface has to have a Brinell's hardness index of at least 150. Recently, Zingaro has
received a batch of polished and unpolished stones from its clients. Use the data provided
to answer the following (assuming a 5% significance level);
5.1 Earlier experience of Zingaro with this particular client is favorable as the stone surface
was found to be of adequate hardness. However, Zingaro has reason to believe now that
the unpolished stones may not be suitable for printing. Do you think Zingaro is justified in
thinking so?
5.2 Is the mean hardness of the polished and unpolished stones the same?
Solution:
5.1 Earlier experience of Zingaro with this particular client is favorable as the stone surface
was found to be of adequate hardness. However, Zingaro has reason to believe now that
the unpolished stones may not be suitable for printing. Do you think Zingaro is justified in
thinking so?
Step 1:
Defining Hypothesis:
Null Hypothesis
Ho: Adequate hardness of stone found >= 150
Alternate Hypothesis
Ha: Unpolished stone hardness not suitable for printing < 150
Here, Level of Significance ɑ = 0.05 and Sample size n= 75 (derived from dataset), x¯ =
134.11, σ = 33.04, μ = 150
Step 2: Define the test statistic based on the information in the question. Here, we are going
to use the Zstat .
From the value of the Zstat , we understand that this is a lower tailed-test.
Step 3: Let us check the critical value with respect to α for the test statistic.
Using norm.ppf for a alpha value of 0.05, the critical value is -1.64
14 | P a g e
Here, the calculated Zstat value is less than -1.64. Thus, this value falls in the rejection
region.
With 95% confidence, we can say that we have enough evidence to say that the hardness for
unpolished stones is less than 150. Thus Zingaro is right in its thinking that unpolished stones are
not fit for printing.
5.2 Is the mean hardness of the polished and unpolished stones the same?
From the table, we can see that the mean hardness of Polished stones is 134.11 whereas mean
hardness of Treated and Polished stone is 147.78 which is not same.
15 | P a g e
Problem 6:
Aquarius health club, one of the largest and most popular cross-fit gyms in the country has
been advertising a rigorous program for body conditioning. The program is considered
successful if the candidate is able to do more than 5 push-ups, as compared to when he/she
enrolled in the program. Using the sample data provided can you conclude whether the
program is successful? (Consider the level of Significance as 5%)
Note that this is a problem of the paired-t-test. Since the claim is that the training will make
a difference of more than 5, the null and alternative hypotheses must be formed
accordingly.
Solution:
Step 1:
Defining Hypothesis:
Let µ1 be the Mean push ups done by candidate before enrolment and µ2 be the Mean push ups
done by candidate after enrolment.
Null Hypothesis:
Ho: µ1=µ2
Alternate Hypothesis:
Here, Level of Significance ɑ = 0.05 and Sample size n= 100 (derived from dataset)
- But since the population standard deviation (Sigma) is unknown, we have to use a T-stat test.
- Degree of Freedom: Since the sample is the same for both Sampling tests, we have N-1 degrees
of freedom: 99
- Since the sole purpose of the test is to check whether the Program is successful, we would prefer
a One-sided T-test.
From the test, we observe that that p-value is less than alpha. Thus we reject the null hypothesis.
Hence we can say that we have statistical evidence to state that the Program was successful
16 | P a g e
Problem 7:
Dental implant data: The hardness of metal implant in dental cavities depends on multiple factors,
such as the method of implant, the temperature at which the metal is treated, the alloy used as well
as on the dentists who may favour one method above another and may work better in his/her
favourite method. The response is the variable of interest.
1. Test whether there is any difference among the dentists on the implant hardness. State the
null and alternative hypotheses. Note that both types of alloys cannot be considered
together. You must state the null and alternative hypotheses separately for the two types of
alloys.?
2. Before the hypotheses may be tested, state the required assumptions. Are the assumptions
fulfilled? Comment separately on both alloy types.?
3. Irrespective of your conclusion in 2, we will continue with the testing procedure. What do you
conclude regarding whether implant hardness depends on dentists? Clearly state your
conclusion. If the null hypothesis is rejected, is it possible to identify which pairs of dentists
differ?
4. Now test whether there is any difference among the methods on the hardness of dental
implant, separately for the two types of alloys. What are your conclusions? If the null
hypothesis is rejected, is it possible to identify which pairs of methods differ?
5. Now test whether there is any difference among the temperature levels on the hardness of
dental implant, separately for the two types of alloys. What are your conclusions? If the null
hypothesis is rejected, is it possible to identify which levels of temperatures differ?
6. Consider the interaction effect of dentist and method and comment on the interaction plot,
separately for the two types of alloys?
7. Now consider the effect of both factors, dentist, and method, separately on each alloy. What
do you conclude? Is it possible to identify which dentists are different, which methods are
different, and which interaction levels are different?
Solution:
1. Test whether there is any difference among the dentists on the implant hardness. State the null
and alternative hypotheses. Note that both types of alloys cannot be considered together. You must
state the null and alternative hypotheses separately for the two types of alloys?
Step 1:
Defining Hypothesis:
Case 1:
Null Hypothesis:
17 | P a g e
Alternate Hypothesis:
Ha: Mean Hardness is not same for at least one pair of Dentists for Alloy 1
Case 2:
Null Hypothesis:
Alternate Hypothesis:
Ha: Mean Hardness is not same for at least one pair of Dentists for Alloy 2
Here, Level of Significance ɑ = 0.05 and Sample size n= 90 (derived from dataset)
2. Before the hypotheses may be tested, state the required assumptions. Are the assumptions fulfilled?
Comment separately on both alloy types?
These are the assumptions that are required before the test:
Let’s see the boxplot of the Response variable to see the distribution.
We can clearly see that the distribution is not normal. The data clearly does not fulfill the
assumption.
18 | P a g e
3. Irrespective of your conclusion in 2, we will continue with the testing procedure. What do you
conclude regarding whether implant hardness depends on dentists? Clearly state your conclusion. If
the null hypothesis is rejected, is it possible to identify which pairs of dentists differ?
Now, we see that the corresponding p-value is greater than alpha (0.05). Thus, we fail to reject the
Null Hypothesis.
Now let us perform One way ANOVA for Response variable for Alloy 1 and Alloy 2 separately.
Alloy 1:
Alloy 2:
Now, we see that the corresponding p-value is greater than alpha (0.05). Thus, we fail to reject the
Null Hypothesis.
Thus, for both Alloy 1 and Alloy 2, the Mean hardness of Alloy1 and Alloy 2 is same across all dentists
4. Now test whether there is any difference among the methods on the hardness of dental implant,
separately for the two types of alloys. What are your conclusions? If the null hypothesis is rejected, is it
possible to identify which pairs of methods differ?
Step 1:
Defining Hypothesis:
Case 1:
Null Hypothesis:
19 | P a g e
Alternate Hypothesis:
Ha: Mean Hardness is not same for at least one pair of Methods for Alloy 1
Case 2:
Null Hypothesis:
Alternate Hypothesis:
Ha: Mean Hardness is not same for at least one pair of methods for Alloy 2
Now let us perform One way ANOVA for Response variable for Alloy 1 and Alloy 2 with respect to the
Method separately.
Alloy 1:
Alloy 2:
The p-value is lesser than alpha (0.05) for both alloy 1 and alloy 2. We will reject the null hypothesis.
We can say that the mean hardness for both Alloy 1 and Alloy 2 is different for at least one pair of
methods of Dental Implant.
5. Now test whether there is any difference among the temperature levels on the hardness of dental
implant, separately for the two types of alloys. What are your conclusions? If the null hypothesis is
rejected, is it possible to identify which levels of temperatures differ?
Step 1:
Defining Hypothesis:
Case 1:
Null Hypothesis:
Ho: Mean Hardness is same across all temperature levels for Alloy 1
20 | P a g e
Alternate Hypothesis:
Ha: Mean Hardness is not same across different temperature levels for Alloy 1
Case 2:
Null Hypothesis:
Ho: Mean Hardness is same across all temperature levels for Alloy 2
Alternate Hypothesis:
Ha: Mean Hardness is not same across different temperature levels for Alloy 2
Alloy 1:
Alloy 2:
Now, we see that the corresponding p-value is greater than alpha (0.05). Thus, we fail to reject the
Null Hypothesis.
Thus, we can say that Mean Hardness is same across all temperature levels for both Alloy 1 and Alloy
2.
21 | P a g e