3.1 3.5 3.3 3.7 4.5 4.2 2.8 3.9 3.5 3.3
Find the sample mean and sample standard deviation for these 10 measurements.
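As a quick check on hand calculations, the two statistics can be computed with Python's standard library. This is a minimal sketch; the variable names are illustrative, and `statistics.stdev` is used because it applies the n − 1 divisor appropriate for a sample standard deviation.

```python
import statistics

# The 10 measurements listed in the problem.
measurements = [3.1, 3.5, 3.3, 3.7, 4.5, 4.2, 2.8, 3.9, 3.5, 3.3]

sample_mean = statistics.mean(measurements)
# statistics.stdev divides by n - 1 (sample standard deviation),
# unlike statistics.pstdev, which divides by n.
sample_sd = statistics.stdev(measurements)

print(f"mean = {sample_mean:.3f}, s = {sample_sd:.3f}")
```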
3. The nine measurements that follow are furnace temperatures recorded on successive
batches in a semiconductor manufacturing process (units are °F):
4. The minimum injection pressure (psi) for injection molding specimens of high amylose
corn was determined for eight different specimens (higher pressure corresponds to greater
processing difficulty), resulting in the following observations:
(a) Determine the values of the sample mean and sample median.
(b) By how much could the sample observation 8.0 be increased without affecting the
value of the sample median?
(c) Suppose we want the values of the sample mean and median when the observations
are expressed in kilograms per square inch (ksi) rather than psi (pounds per
square inch). Is it necessary to re-express each observation in ksi, or can the values
calculated in part (a) be used directly? Hint: 1 kilogram = 2.2 pounds.
6. In the casino game roulette, if a player bets $1 on red (or on black or on odd or on even),
the probability of winning $1 is 18/38 and the probability of losing $1 is 20/38. Suppose
that a player begins with $5 and makes successive $1 bets. Let Y equal the player’s
maximum capital before losing the $5. One hundred observations of Y were simulated on
a computer, yielding the following data:
25 9 5 5 5 9 6 5 15 45
55 6 5 6 24 21 16 5 8 7
7 5 5 35 13 9 5 18 6 10
19 16 21 8 13 5 9 10 10 6
23 8 5 10 15 7 5 5 24 9
11 34 12 11 17 11 16 5 15 5
12 6 5 5 7 6 17 20 7 8
8 6 10 11 6 7 5 12 11 18
6 21 6 5 24 7 16 21 23 15
11 8 6 8 14 11 6 9 6 10
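The simulation described above can be sketched as follows. This is only an illustration of how observations of Y might be generated, not the program that produced the data shown; the function name `simulate_max_capital` and the seed are assumptions made for this example.

```python
import random

def simulate_max_capital(rng, start=5, p_win=18 / 38):
    """Play successive $1 bets until the capital reaches 0.

    Returns the largest capital held at any point (including the start).
    """
    capital = start
    max_capital = start
    while capital > 0:
        # Win $1 with probability 18/38, otherwise lose $1.
        capital += 1 if rng.random() < p_win else -1
        max_capital = max(max_capital, capital)
    return max_capital

rng = random.Random(2024)  # arbitrary seed, chosen only for reproducibility
simulated_y = [simulate_max_capital(rng) for _ in range(100)]
print(sorted(simulated_y))
```

Because each bet is slightly unfavourable, the loop ends with probability 1, and every simulated value is at least 5, the starting capital.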
7. Noise is measured in decibels, denoted as dB. One decibel is about the level of the weakest
sound that can be heard in a quiet surrounding by someone with good hearing; a whisper
measures about 30 dB; a human voice in normal conversation is about 70 dB; a loud
radio is about 100 dB. Ear discomfort usually occurs at a noise level of about 120 dB.
The following data give noise levels measured at 36 different times directly outside of
Grand Central Station in Manhattan.
5.9 7.2 7.3 6.3 8.1 6.8 7.0 7.6 6.8 6.5 7.0 6.3 7.9 9.0
8.2 8.7 7.8 9.7 7.4 7.7 9.7 7.8 7.7 11.6 11.3 11.8 10.7
9. In a study of warp breakage during the weaving of fabric (Technometrics, 1982: 63), 100
specimens of yarn were tested. The number of cycles of strain to breakage was determined
for each yarn specimen, resulting in the following data:
(a) Construct a relative frequency histogram based on the class intervals [0,100), [100,200),
[200,300), . . ., and comment on the features of the histogram.
(b) Construct a histogram based on the following class intervals: [0,50), [50,100), [100,150),
[150,200), [200,300), [300,400), [400,500), [500,600), [600,900).
(c) If weaving specifications require a breaking strength of at least 100 cycles, what
proportion of the yarn specimens in this sample would be considered satisfactory?
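When the class intervals have unequal widths, as in part (b), bar heights are usually drawn on a density scale (relative frequency divided by class width) so that areas, not heights, represent proportions. The sketch below illustrates the calculation. The helper `density_histogram` is hypothetical, and `placeholder_cycles` holds made-up values for illustration only, since the yarn data is not reproduced here.

```python
from bisect import bisect_right

def density_histogram(data, edges):
    """Return (relative_frequencies, densities) for classes [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        k = bisect_right(edges, x) - 1  # index of the half-open class containing x
        if 0 <= k < len(counts):        # values outside all classes are not counted
            counts[k] += 1
    n = len(data)
    rel = [c / n for c in counts]
    # Density = relative frequency / class width.
    dens = [r / (hi - lo) for r, lo, hi in zip(rel, edges, edges[1:])]
    return rel, dens

# Class boundaries from part (b).
edges_b = [0, 50, 100, 150, 200, 300, 400, 500, 600, 900]
# Placeholder values only, NOT the data from the study.
placeholder_cycles = [10, 60, 60, 250, 700]

rel_freq, density = density_histogram(placeholder_cycles, edges_b)
for lo, hi, r, d in zip(edges_b, edges_b[1:], rel_freq, density):
    print(f"[{lo},{hi}): relative frequency = {r:.2f}, density = {d:.5f}")
```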
10. Ledolter and Hogg report that a manufacturer of metal alloys is concerned about customer
complaints regarding the lack of uniformity in the melting points of one of the firm’s alloy
filaments. Fifty filaments are selected and their melting points determined. The following
results were obtained:
320 326 325 318 322 320 329 317 316 331
320 320 317 329 316 308 321 319 322 335
318 313 327 314 329 323 327 323 324 314
308 305 328 330 322 310 324 314 312 318
313 320 324 311 317 325 328 319 310 324
(a) Construct a frequency table, and display the histogram, of the data.
(b) Calculate the sample mean and sample standard deviation.
(c) Locate x̄, x̄ ± s on your histogram. How many observations lie within one standard
deviation of the mean? How many lie within two standard deviations of the mean?
(d) Find the five-number summary for these melting points.
(e) Construct a box-and-whisker diagram.
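The numerical parts of this problem can be checked with a short script. This is a sketch under one assumption worth flagging: quartile conventions differ between textbooks, and `method="inclusive"` in `statistics.quantiles` is just one of them, so the quartiles may differ slightly from a hand calculation that uses another rule.

```python
import statistics

# The 50 melting points listed in the problem.
melting_points = [
    320, 326, 325, 318, 322, 320, 329, 317, 316, 331,
    320, 320, 317, 329, 316, 308, 321, 319, 322, 335,
    318, 313, 327, 314, 329, 323, 327, 323, 324, 314,
    308, 305, 328, 330, 322, 310, 324, 314, 312, 318,
    313, 320, 324, 311, 317, 325, 328, 319, 310, 324,
]

mean = statistics.mean(melting_points)
s = statistics.stdev(melting_points)  # n - 1 divisor

def count_within(k):
    """Number of observations within k sample standard deviations of the mean."""
    return sum(1 for x in melting_points if abs(x - mean) <= k * s)

# Assumption: the "inclusive" quartile convention; other conventions exist.
q1, median, q3 = statistics.quantiles(melting_points, n=4, method="inclusive")
five_number = (min(melting_points), q1, median, q3, max(melting_points))

print(f"mean = {mean:.2f}, s = {s:.2f}")
print("within 1s:", count_within(1), " within 2s:", count_within(2))
print("five-number summary:", five_number)
```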
4 Tutorial 12
11. A small part for an automobile rearview mirror was produced on two different punch
presses. In order to describe the distribution of the weights of those parts, a random
sample was selected, and each piece was weighed in grams, resulting in the following data
set:
(a) Using about 10 (say, 8 to 12) classes, construct a frequency distribution of the data.
(b) Draw a histogram of the data.
(c) Describe the shape of the distribution represented by the histogram.
12. A transformation of data values by means of some mathematical function, such as √x
or 1/x, can often yield a set of numbers that has “nicer” statistical properties than the
original data. In particular, it may be possible to find a function for which the histogram
of transformed values is more symmetric (or, even better, more like a bell-shaped curve)
than the original data.
For example, in an experiment designed to study the behaviour of certain individual cells
that had been exposed to beryllium, the interdivision times (IDTs) of cells were determined
for a large number of cells both in exposed (treatment) and unexposed (control)
conditions. Consider the following IDT data:
Construct a histogram of this data based on classes with boundaries 10, 20, 30, .... Then
calculate log10(x) for each observation, and construct a histogram of the transformed data
using class boundaries 1.1, 1.2, 1.3, .... What is the effect of the transformation?
ENG5001/ENG6001 – Advanced Engineering Data Analysis 5
13. A survey on knee injuries recorded the following data on type of injury (A = meniscal
tear, B = MCL tear, C = ACL tear, D = patella dislocation, E = PCL tear):
A B B A C A A D B A C E B
B A A C D C A C B C C C A
B B C A A B C C A C B B D
A B A C B A A C A B B E B
B B C C A C A A B D A A C
B C C A B B A D C A B
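For categorical data like this, a natural first step is a frequency tally. The sketch below counts each injury code and prints its relative frequency; it is offered as a starting point only, since the question for this problem is not stated above, and the variable names are illustrative.

```python
from collections import Counter

# Injury codes exactly as listed in the problem.
codes = """
A B B A C A A D B A C E B
B A A C D C A C B C C C A
B B C A A B C C A C B B D
A B A C B A A C A B B E B
B B C C A C A A B D A A C
B C C A B B A D C A B
""".split()

counts = Counter(codes)
n = len(codes)

# most_common() lists categories from most to least frequent.
for injury, count in counts.most_common():
    print(f"{injury}: {count:2d}  (relative frequency {count / n:.3f})")
```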
14. The National Highway Traffic Safety Administration has studied the use of rear-seat
automobile lap and shoulder seat belts. The number of lives potentially saved with the
use of lap and shoulder seat belts is shown for various percentages of use.
15. Blood cocaine concentration (mg/L) was determined both for a sample of individuals who
had died from cocaine-induced excited delirium and for a sample of those who had died
from a cocaine overdose without excited delirium; survival time for people in both groups
was at most 6 hours. The data is as follows.
(a) Determine the medians, quartiles and IQRs for the two samples.
(b) Are there any outliers in either sample? Any extreme outliers?
(c) Construct a side-by-side box plot, and use it as a basis for comparing and contrasting
the ED and non-ED samples.
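Parts (a) and (b) can be checked with a small helper. The sketch below assumes the common convention that an observation is an outlier if it lies more than 1.5 IQR beyond the nearest quartile, and an extreme outlier if it lies more than 3 IQR beyond it; check this against the definition used in the course. The function `classify_outliers`, the "inclusive" quartile method, and the placeholder values are all assumptions for illustration, since the cocaine concentration data is not reproduced here.

```python
import statistics

def classify_outliers(data):
    """Split data into (mild, extreme) outliers using 1.5*IQR and 3*IQR rules."""
    # Assumption: "inclusive" quartile convention; textbooks may use another.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    mild, extreme = [], []
    for x in data:
        # Distance beyond the nearest quartile (negative if x lies between them).
        distance = max(q1 - x, x - q3)
        if distance > 3 * iqr:
            extreme.append(x)
        elif distance > 1.5 * iqr:
            mild.append(x)
    return mild, extreme

# Placeholder values only, NOT the data from the study.
placeholder = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
mild, extreme = classify_outliers(placeholder)
print("mild outliers:", mild)
print("extreme outliers:", extreme)
```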
16. Specimens of three different types of rope wire were selected, and the fatigue limit (MPa)
was determined for each specimen, resulting in the accompanying data:
(a) Construct a side-by-side box plot, and comment on similarities and differences.
(b) Construct a stem-and-leaf plot for each of the three types. Comment on similarities
and differences.
(c) Does the side-by-side box plot in part (a) give an informative assessment of similarities
and differences? Explain your reasoning.
17. Wear resistance of certain nuclear reactor components made of Zircaloy-2 is partly determined
by properties of the oxide layer. The following data is from an article that proposed
a new nondestructive testing method to monitor thickness of the layer; the variables are
x = oxide-layer thickness and y = eddy current response:
(a) Construct a scatter plot of the data. How would you describe the nature of the
relationship between the two variables?
(b) Compute the sample correlation coefficient for the data. Does it confirm your impression
from the scatter plot?
18. Express the sample correlation coefficient r in terms of the following sums:
$\sum x_i$, $\sum y_i$, $\sum x_i^2$, $\sum y_i^2$, $\sum x_i y_i$
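One way to approach this is sketched below, starting from the usual definition of r in terms of deviations from the means and expanding each sum. The symbols $S_{xx}$, $S_{yy}$, $S_{xy}$ are notation introduced here for convenience; all sums run over i = 1, ..., n.

```latex
r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}, \qquad
S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n},

S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}, \qquad
S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n},

r = \frac{n \sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}
         {\sqrt{n \sum x_i^2 - \left(\sum x_i\right)^2}\;\sqrt{n \sum y_i^2 - \left(\sum y_i\right)^2}}.
```

The last line follows by multiplying the numerator and denominator of the first expression by n.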
19. Toughness and fibrousness of asparagus are major determinants of quality. An article
“Postharvest glyphosate application reduces toughening, fiber content, and lignification
of stored asparagus spears” reported the following data on x = shear force (kg) and y =
percent fiber dry weight:
x: 46 48 55 57 60 72 81 85 94
y: 2.18 2.10 2.13 2.28 2.34 2.53 2.28 2.62 2.63
x: 109 121 132 137 148 149 184 185 187
y: 2.50 2.66 2.79 2.80 3.01 2.98 3.34 3.49 3.26
(a) Using the formula obtained in Problem 18, calculate the sample correlation coefficient.
Based on this value, how would you describe the nature of the relationship
between the two variables?
(b) If a first specimen has a larger value of shear force than does a second specimen,
what tends to be true of percent fiber dry weight for the two specimens? Which one
would tend to be larger?
(c) If shear force is expressed in pounds, what happens to the value of r? Why?
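The calculation in part (a) and the unit change in part (c) can be checked numerically. The sketch below computes r from the five sums named in Problem 18, using the standard computational form of the correlation coefficient; the function name `correlation_from_sums` is hypothetical. It then repeats the calculation with shear force converted using 1 kg = 2.2 lb.

```python
import math

# Data from the problem: x = shear force (kg), y = percent fiber dry weight.
x = [46, 48, 55, 57, 60, 72, 81, 85, 94,
     109, 121, 132, 137, 148, 149, 184, 185, 187]
y = [2.18, 2.10, 2.13, 2.28, 2.34, 2.53, 2.28, 2.62, 2.63,
     2.50, 2.66, 2.79, 2.80, 3.01, 2.98, 3.34, 3.49, 3.26]

def correlation_from_sums(xs, ys):
    """Sample correlation coefficient computed from the five basic sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(v * v for v in xs)
    sum_y2 = sum(v * v for v in ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r_kg = correlation_from_sums(x, y)

# Part (c): express shear force in pounds (1 kg = 2.2 lb) and recompute.
x_lb = [2.2 * v for v in x]
r_lb = correlation_from_sums(x_lb, y)

print(f"r with x in kg: {r_kg:.4f}")
print(f"r with x in lb: {r_lb:.4f}")
```

Multiplying every x value by a positive constant scales the numerator and the denominator by the same factor, so the two printed values agree up to rounding error.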
20. An experiment was conducted to investigate how the behaviour of mozzarella cheese
varied with temperature. The following data was obtained, with x = temperature and y
= elongation (%) at failure of the cheese.
x: 59 63 68 72 74 78 83
y: 118 182 247 208 197 135 132
(a) Construct a scatter plot in which the axes intersect at (0, 0). Mark 0, 20, 40, 60, 80,
and 100 on the horizontal axis and 0, 50, 100, 150, 200, and 250 on the vertical axis.
(b) Construct a scatter plot in which the axes intersect at (55, 100). Does this plot seem
preferable to the one in part (a)?
(c) What do the plots suggest about the nature of the relationship between the two
variables?