
INDU6310

Assignment 3


Q1 (25 points) a) Construct a stem-and-leaf display for these data. Calculate the median and
quartiles of these data.
Stem | Leaf | Frequency
83 | 4 | 1
84 | 3 3 | 2
85 | 3 | 1
86 | 7 7 7 | 3
87 | 4 5 6 7 8 9 | 6
88 | 2 3 3 3 4 5 5 6 6 7 9 | 11
89 | 0 2 3 3 6 7 8 8 9 9 | 10
90 | 0 1 1 1 3 4 4 4 5 6 7 8 9 | 13
91 | 0 0 0 1 1 1 2 2 5 6 6 8 8 | 13
92 | 2 2 2 3 6 7 7 7 | 8
93 | 0 2 3 3 4 7 | 6
94 | 2 2 4 7 | 4
96 | 1 5 | 2
98 | 8 | 1
100 | 3 | 1

Median 90,4
Min 83,4
Q1 88,6
Q2 90,4
Q3 92,2
Max 100,3
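
As a cross-check, the summary above can be recomputed from the leaves of the display. This is a minimal Python sketch, assuming the quartiles were obtained with the inclusive interpolation rule (position (n - 1)p between order statistics), which is what many spreadsheets use. The names `stems` and `octane` are illustrative only.

```python
from statistics import median, quantiles

# Leaves copied from the stem-and-leaf display (stem = units, leaf = tenths).
stems = {
    83: "4", 84: "33", 85: "3", 86: "777", 87: "456789",
    88: "23334556679", 89: "0233678899", 90: "0111344456789",
    91: "0001112256688", 92: "22236777", 93: "023347",
    94: "2247", 96: "15", 98: "8", 100: "3",
}

# Rebuild the sorted sample: stem 83 with leaf 4 becomes 83.4.
octane = sorted((stem * 10 + int(leaf)) / 10
                for stem, leaves in stems.items() for leaf in leaves)

# Assumption: inclusive interpolation between order statistics.
q1, q2, q3 = quantiles(octane, n=4, method="inclusive")

print("n =", len(octane))
print("Min", min(octane), "Q1", q1, "Median", median(octane),
      "Q3", q3, "Max", max(octane))
```

With a different quartile convention (for example the exclusive rule), Q1 and Q3 could differ slightly from the values reported here.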

b) Construct a frequency distribution and histogram for the motor fuel octane
data. Use 8 bins.
No. | Bin (upper limit) | Frequency | Cumulative Relative Frequency % | Relative Frequency %
1 | 85,51 | 4 | 4,88% | 4,9%
2 | 87,63 | 6 | 12,20% | 7,3%
3 | 89,74 | 20 | 36,59% | 24,4%
4 | 91,85 | 30 | 73,17% | 36,6%
5 | 93,96 | 14 | 90,24% | 17,1%
6 | 96,08 | 4 | 95,12% | 4,9%
7 | 98,19 | 2 | 97,56% | 2,4%
8 | More | 2 | 100,00% | 2,4%
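
The bin limits and percentage columns can be reproduced with a few lines of Python. This is only a sketch: it assumes the 8 bins have equal width over the range from the minimum (83,4) to the maximum (100,3), which is consistent with the limits in the table, and it takes the frequencies directly from the table rather than recounting the data.

```python
# Values taken from the table above; equal-width bins over [min, max] are assumed.
low, high, n_bins = 83.4, 100.3, 8
freq = [4, 6, 20, 30, 14, 4, 2, 2]

width = (high - low) / n_bins
# Upper limits of the first seven bins; the eighth is the open "More" bin.
upper_limits = [low + width * k for k in range(1, n_bins)]

total = sum(freq)
relative = [100 * f / total for f in freq]

# Running total of frequencies, expressed as a percentage of the sample.
cumulative = []
running = 0
for f in freq:
    running += f
    cumulative.append(100 * running / total)

for k, (f, rel, cum) in enumerate(zip(freq, relative, cumulative), start=1):
    limit = f"{upper_limits[k - 1]:.2f}" if k < n_bins else "More"
    print(k, limit, f, f"{rel:.1f}%", f"{cum:.2f}%")
```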

[Figure: Histogram. Frequency (left axis, 0 to 35) and Cumulative % (right axis, 0,00% to 100,00%) by Bin (85,51 to More)]

c) Construct a normal probability plot of the octane rating data. Does it seem
reasonable to assume that octane rating is normally distributed?
Normal Probability Plot
J Xj Zj (j-0.5)/82 J Xj Zj (j-0.5)/82 J Xj Zj (j-0.5)/82
1 83,4 -2,51 0,006 28 89,3 -0,43 0,335 55 91,2 0,43 0,665
2 84,3 -2,09 0,018 29 89,6 -0,39 0,348 56 91,5 0,46 0,677
3 84,3 -1,87 0,030 30 89,7 -0,36 0,360 57 91,6 0,49 0,689
4 85,3 -1,72 0,043 31 89,8 -0,33 0,372 58 91,6 0,53 0,701
5 86,7 -1,60 0,055 32 89,8 -0,29 0,384 59 91,8 0,56 0,713
6 86,7 -1,50 0,067 33 89,9 -0,26 0,396 60 91,8 0,60 0,726
7 86,7 -1,41 0,079 34 89,9 -0,23 0,409 61 92,2 0,64 0,738
8 87,4 -1,33 0,091 35 90 -0,20 0,421 62 92,2 0,67 0,750
9 87,5 -1,26 0,104 36 90,1 -0,17 0,433 63 92,2 0,71 0,762
10 87,6 -1,20 0,116 37 90,1 -0,14 0,445 64 92,3 0,75 0,774
11 87,7 -1,14 0,128 38 90,1 -0,11 0,457 65 92,6 0,79 0,787
12 87,8 -1,08 0,140 39 90,3 -0,08 0,470 66 92,7 0,84 0,799
13 87,9 -1,03 0,152 40 90,4 -0,05 0,482 67 92,7 0,88 0,811
14 88,2 -0,98 0,165 41 90,4 -0,02 0,494 68 92,7 0,93 0,823
15 88,3 -0,93 0,177 42 90,4 0,02 0,506 69 93 0,98 0,835
16 88,3 -0,88 0,189 43 90,5 0,05 0,518 70 93,2 1,03 0,848
17 88,3 -0,84 0,201 44 90,6 0,08 0,530 71 93,3 1,08 0,860
18 88,4 -0,79 0,213 45 90,7 0,11 0,543 72 93,3 1,14 0,872
19 88,5 -0,75 0,226 46 90,8 0,14 0,555 73 93,4 1,20 0,884
20 88,5 -0,71 0,238 47 90,9 0,17 0,567 74 93,7 1,26 0,896
21 88,6 -0,67 0,250 48 91 0,20 0,579 75 94,2 1,33 0,909
22 88,6 -0,64 0,262 49 91 0,23 0,591 76 94,2 1,41 0,921
23 88,7 -0,60 0,274 50 91 0,26 0,604 77 94,4 1,50 0,933
24 88,9 -0,56 0,287 51 91,1 0,29 0,616 78 94,7 1,60 0,945
25 89 -0,53 0,299 52 91,1 0,33 0,628 79 96,1 1,72 0,957
26 89,2 -0,49 0,311 53 91,1 0,36 0,640 80 96,5 1,87 0,970
27 89,3 -0,46 0,323 54 91,2 0,39 0,652 81 98,8 2,09 0,982
82 100,3 2,51 0,994

[Figure: Normal Probability Plot. Zj (vertical axis, -3,00 to 4,00) against Xj (horizontal axis, 83 to 101)]

Since the plotted points fall close to a straight line, a normal distribution
adequately describes the data.

Q2 (15 points) The United States has an aging infrastructure, as witnessed by several recent
disasters, including the I-35 bridge failure in Minnesota. Most states inspect their
bridges regularly and report their condition (on a scale from 1-7) to the public.
Here are condition numbers from a sample of 30 bridges in New York State:

5.08 5.44 6.66 5.07 6.80 5.43 4.83 4.00 4.41 4.38
7.00 5.72 4.53 6.43 3.97 4.19 6.26 6.72 5.26 5.48
4.95 6.33 4.93 5.61 4.66 7.00 5.57 3.42 5.18 4.54

Using the data on bridge conditions,

(a) Find the quartiles and median of the data.
Median 5,22
Min 3,42
Q1 4,57
Q2 5,22
Q3 6,125
Max 7



(b) Select the correct boxplot from the provided choices below.
Option A.



(c) Find the median and the upper and lower quartiles.
Q1 4,57
Q2 (Median) 5,22
Q3 6,125

(d) The number of potential outliers.
Min Value 3,42
Lower Limit 2,24
Q1 4,57
IQR 1,56
Q3 6,13
Upper Limit 8,46
Max Value 7,00

There are no potential outliers, since the min (3,42) and max (7,00) values
are inside the outlier limits [2,24; 8,46], where Lower Limit = Q1 - 1,5 IQR
and Upper Limit = Q3 + 1,5 IQR.
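
The quartiles and outlier limits for the bridge data can be verified with the Python sketch below. It assumes the inclusive interpolation rule for quartiles (which reproduces 4,57 and 6,125) and the usual 1.5 x IQR fences. The variable names are illustrative.

```python
from statistics import quantiles

# Condition numbers for the 30 bridges, as listed in the question.
bridges = [
    5.08, 5.44, 6.66, 5.07, 6.80, 5.43, 4.83, 4.00, 4.41, 4.38,
    7.00, 5.72, 4.53, 6.43, 3.97, 4.19, 6.26, 6.72, 5.26, 5.48,
    4.95, 6.33, 4.93, 5.61, 4.66, 7.00, 5.57, 3.42, 5.18, 4.54,
]

# Assumption: inclusive interpolation between order statistics.
q1, q2, q3 = quantiles(bridges, n=4, method="inclusive")

iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr   # values below this are potential outliers
upper_limit = q3 + 1.5 * iqr   # values above this are potential outliers
outliers = [x for x in bridges if x < lower_limit or x > upper_limit]

print(f"Q1={q1:.3f} Median={q2:.3f} Q3={q3:.3f} IQR={iqr:.3f}")
print(f"Limits=[{lower_limit:.2f}, {upper_limit:.2f}] outliers={outliers}")
```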

Q3 (15 points) The following data are the numbers of cakes sold by a bakery on different days:
234, 246, 214, 267, 253, 220, 259, 264, 240, 251, 230, 247, 255.
Choose the correct normal probability plot of this data. Does it seem reasonable to
assume that the number of cakes sold is normally distributed?

Option A.
Mean 244,6
St Dev 16,3

Yes, it does seem reasonable to assume that the number of cakes sold is normally
distributed, since the plotted points are close to forming a straight line.

Normal Probability Plot
J Xj Zj (j-0.5)/13
1 214 -1,77 0,04
2 220 -1,20 0,12
3 230 -0,87 0,19
4 234 -0,62 0,27
5 240 -0,40 0,35
6 246 -0,19 0,42
7 247 0,00 0,50
8 251 0,19 0,58
9 253 0,40 0,65
10 255 0,62 0,73
11 259 0,87 0,81
12 264 1,20 0,88
13 267 1,77 0,96

[Figure: Zj and Linear (Zj) trend line against Xj; horizontal axis 210 to 270, vertical axis -3,00 to 2,00]
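
The Zj column is the standard normal quantile of the plotting position (j - 0.5)/n for the j-th smallest observation. The Python sketch below recomputes the columns for the cake data, along with the sample mean and standard deviation. The name `rows` is illustrative.

```python
from statistics import NormalDist, mean, stdev

cakes = [234, 246, 214, 267, 253, 220, 259, 264, 240, 251, 230, 247, 255]
n = len(cakes)
std_normal = NormalDist()  # mean 0, standard deviation 1

# One row per ordered observation: rank j, value Xj, normal score Zj,
# and plotting position (j - 0.5) / n.
rows = []
for j, x in enumerate(sorted(cakes), start=1):
    p = (j - 0.5) / n
    rows.append((j, x, std_normal.inv_cdf(p), p))

for j, x, z, p in rows:
    print(f"{j:2d} {x} {z:6.2f} {p:.2f}")

# Sample mean and sample (n - 1) standard deviation.
print("Mean", round(mean(cakes), 1), "St Dev", round(stdev(cakes), 1))
```

Plotting Zj against Xj and checking how closely the points follow a straight line is then a visual judgment, as in the answer above.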
Q4 (20 points) The data shown in the table are monthly champagne sales in France (1962–1969) in
thousands of bottles.

[Figure: ANNUAL SALES (1962–1969); horizontal axis 1962 to 1969, vertical axis 0 to 80.000; data labels 41.594, 46.370, 52.052, 60.079, 60.192, 64.447, 67.327, 68.562]

From the first plot, with annual sales, we can see that the number of bottles sold
increased from 1962 until 1967 (5 years), and then there was a drop in sales in 1968
of around 8.000 bottles. Sales recovered a year later.

From the second plot, where we graph the data by month, we can see that December
is always the month with the highest sales, and July and August have the lowest every
year. This clearly shows that champagne sales exhibit cyclic variability by month,
being low at the beginning of the year and peaking at the end.

[Figure: SALES BY MONTH (1962–1969); vertical axis 0 to 16.000; horizontal axis labeled by month (JAN. to DEC.) and year (1962 to 1969)]


Q5 (15 points) The amount of time that a customer spends waiting at an airport check-in counter
is a random variable with mean 8.2 minutes and standard deviation 1.5 minutes.
Suppose that a random sample of n = 49 customers is observed. Find the
probability that the average time waiting in line for these customers is:

Mean 8,2
Std 1,5
n 49
$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$


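a. Less than 10 minutes
$Z = \frac{10 - 8{,}2}{1{,}5/\sqrt{49}} = 8{,}4$, then $P(Z < 8{,}4) \approx 1 = \mathbf{100\%}$
b. Between 5 and 10 minutes
$Z_{(5)} = \frac{5 - 8{,}2}{1{,}5/\sqrt{49}} = -14{,}93$; $Z_{(10)} = \frac{10 - 8{,}2}{1{,}5/\sqrt{49}} = 8{,}4$; then $P(-14{,}93 < Z < 8{,}4) = P(Z < 8{,}4) - P(Z < -14{,}93) \approx 1 - 0 = 1 = \mathbf{100\%}$

c. Less than 6 minutes
$Z = \frac{6 - 8{,}2}{1{,}5/\sqrt{49}} = -10{,}267$, then $P(Z < -10{,}267) \approx 0 = \mathbf{0\%}$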
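
The three probabilities can be checked numerically. The Python sketch below assumes, as the solution does, that with n = 49 the sample mean is approximately normal (central limit theorem) with standard error σ/√n. The helper name `z_score` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.2, 1.5, 49
std_error = sigma / sqrt(n)      # standard deviation of the sample mean
phi = NormalDist().cdf           # standard normal CDF

def z_score(xbar):
    """Standardize a sample-mean value: (xbar - mu) / (sigma / sqrt(n))."""
    return (xbar - mu) / std_error

p_a = phi(z_score(10))                       # P(xbar < 10)
p_b = phi(z_score(10)) - phi(z_score(5))     # P(5 < xbar < 10)
p_c = phi(z_score(6))                        # P(xbar < 6)

print(f"a) z = {z_score(10):.2f}, P = {p_a:.4f}")
print(f"b) z = {z_score(5):.2f} and {z_score(10):.2f}, P = {p_b:.4f}")
print(f"c) z = {z_score(6):.3f}, P = {p_c:.4f}")
```

The probabilities are not exactly 1 and 0, but the differences are far smaller than the precision of any standard normal table.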

Q6 (12 points) A synthetic fiber used in manufacturing carpet has tensile strength that is normally
distributed with mean 75.5 psi and standard deviation 3.5 psi.
Find the probability that a random sample of n = 6 fiber specimens will have
sample mean tensile strength that exceeds 75.75 psi.
Mean 75,5
Std 3,5
n 6
$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{75{,}75 - 75{,}5}{3{,}5/\sqrt{6}} = 0{,}174963$, then $P(Z > 0{,}174963) = 1 - P(Z < 0{,}174963) \approx 1 - 0{,}569 \approx 0{,}43 = \mathbf{43\%}$
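
As a numerical check, the sampling distribution of the mean can be modeled directly. Because the population is stated to be normal, the sample mean is normal for any n, with standard deviation σ/√n. A minimal Python sketch (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 75.5, 3.5, 6

# Sampling distribution of the mean: Normal(mu, sigma / sqrt(n)).
sampling_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

z = (75.75 - mu) / (sigma / sqrt(n))
p_exceeds = 1 - sampling_dist.cdf(75.75)   # P(xbar > 75.75)

print(f"z = {z:.6f}, P(xbar > 75.75) = {p_exceeds:.4f}")
```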

Q7 (18 points) Scientists at the Hopkins Memorial Forest in western Massachusetts have been
collecting meteorological and environmental data in the forest for more than
100 years. In the past few years, sulfate content in water samples from Birch Brook
has averaged 7.48 mg/L with a standard deviation of 1.60 mg/L.
Mean 7,48
Std 1,60
a. What is the standard error of the sulfate in a collection of 10 water samples?
With n = 10,
$\text{Standard Error} = \frac{1{,}60}{\sqrt{10}} = \mathbf{0{,}506}$
b. If 10 students measure the sulfate in their samples, what is the probability that
their average sulfate will be between 6.49 and 8.47 mg/L?
$P(6{,}49 < \bar{x} < 8{,}47)$ with $n = 10$
$P(6{,}49 < \bar{x} < 8{,}47) = P(\bar{x} < 8{,}47) - P(\bar{x} < 6{,}49)$
$Z_{(6,49)} = \frac{6{,}49 - 7{,}48}{1{,}6/\sqrt{10}} = -1{,}956$; $Z_{(8,47)} = \frac{8{,}47 - 7{,}48}{1{,}6/\sqrt{10}} = 1{,}956$; then $P(Z < 1{,}956) - P(Z < -1{,}956) = 0{,}974 - 0{,}025 = 0{,}949 = \mathbf{94{,}9\%}$
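
The standard error from (a) and the probability from (b) can be checked with the short Python sketch below. It assumes the sample mean is approximately normal with mean 7.48 and standard deviation equal to the standard error. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 7.48, 1.60, 10

std_error = sigma / sqrt(n)                  # part (a)
sampling_dist = NormalDist(mu=mu, sigma=std_error)

# Part (b): P(6.49 < xbar < 8.47) = P(xbar < 8.47) - P(xbar < 6.49)
p_between = sampling_dist.cdf(8.47) - sampling_dist.cdf(6.49)

print(f"standard error = {std_error:.3f}")
print(f"P(6.49 < xbar < 8.47) = {p_between:.4f}")
```

The result may differ from 0,949 in the third decimal place, since the hand calculation uses table values with limited precision.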

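c. What do you need to assume for the probability calculated in (b) to be accurate?
We need to assume that the 10 water samples are independent random samples and that
the sample mean is approximately normally distributed. With n = 10, this requires
the sulfate content itself to be roughly normal. Under these assumptions, taking
2σ as approximately 95% confidence, the sample mean should fall within
7,48 ± 2(0,506), that is, between 6,468 and 8,492.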
