H1 MATHEMATICS STATS REVISION
2011 Mr Teo  www.teachmejcmathsg.webs.com 1
Ex 1 Correlation & Regression
Q1. A statistician discovers the following part of an old research paper.
"... giving us mean values $\bar{x} = 21$ and $\bar{y} =$ [missing]. The line of regression of y on x may
be written as $y = -\frac{1}{7}x +$ [missing] and the line of regression of x on y may be written as
$7x + y = 163$. Treating y as the independent variable, we calculate that when y is
known to be 128, an estimate for the value of x would be [missing] ..."
(i) Show that $\bar{y} = 16$. [2]
(ii) Calculate the value of the missing term in the line of regression of y on x. [2]
(iii) Use the appropriate line of regression to calculate the missing estimate for the value of
x. [1]
(iv) The statistician works out that the product moment correlation coefficient r satisfies
$r^2 = \frac{1}{49}$. Show how this has been deduced, and give a reason why the correct value of
r is $-\frac{1}{7}$ and not $\frac{1}{7}$. [2]
(v) Use the value of the product moment correlation coefficient to comment on the
reliability of the estimated value of x . [1]
Q2. The equation of the estimated least squares regression line of y on x for a set of bivariate
data is $y = a_1 + b_1 x$, and the corresponding equation of x on y is $x = a_2 + b_2 y$, where
$b_1$ and $b_2$ are both positive. Show that the linear (product moment) correlation coefficient is
$\sqrt{b_1 b_2}$.
Systolic pressure is the blood pressure exerted by the heart when it is contracting and
diastolic pressure is the blood pressure exerted when the heart is dilating. The table shows the
systolic and the diastolic pressure, in suitable units, of a patient taken at two-hourly intervals
during a day.
Systolic pressure (x):  92 103 103 104 114 120 122 125 135 142 142 161
Diastolic pressure (y): 70  68  62  60  78  64  72  74  96  81  90 124
[ $\sum x = 1463$, $\sum x^2 = 182877$, $\sum y = 939$, $\sum y^2 = 77061$, $\sum xy = 117882$. ]
(i) Calculate the equation of the estimated least squares regression line of y on x, giving
the values of $a_1$ and $b_1$ correct to 3 significant figures.
(ii) Given that the linear (product moment) correlation coefficient for the data is 0.846,
correct to 3 decimal places, state what this indicates about the relation between x and
y over the data range.
(iii) By using your value for $b_1$ obtained in part (i) and the value of the correlation
coefficient, or otherwise, obtain the equation of the estimated least squares regression
line of x on y.
(iv) Obtain an estimate of the systolic pressure when the diastolic pressure is 110 units.
N97/IV/7 (FM)
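The calculation asked for in Q2 can be sketched in Python as below. This is only an illustrative check under the usual least squares formulas, using the summary totals quoted in the question; the function name `regression_from_sums` is an assumption made for this sketch and is not part of the worksheet.

```python
import math

def regression_from_sums(n, sx, sxx, sy, syy, sxy):
    """Return (a1, b1, r) for the y-on-x line y = a1 + b1*x from raw sums."""
    s_xx = sxx - sx * sx / n      # corrected sum of squares for x
    s_yy = syy - sy * sy / n      # corrected sum of squares for y
    s_xy = sxy - sx * sy / n      # corrected sum of products
    b1 = s_xy / s_xx
    a1 = sy / n - b1 * sx / n     # the line passes through the point of means
    r = s_xy / math.sqrt(s_xx * s_yy)
    return a1, b1, r

a1, b1, r = regression_from_sums(12, 1463, 182877, 939, 77061, 117882)
b2 = r * r / b1                   # part (iii): uses r^2 = b1 * b2
a2 = 1463 / 12 - b2 * 939 / 12    # the x-on-y line also passes through the means
x_at_110 = a2 + b2 * 110          # part (iv): y is the given variable
print(round(a1, 1), round(b1, 3), round(r, 3), round(b2, 3), round(x_at_110))
```

Part (iv) uses the line of x on y because the diastolic pressure y is the known quantity.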
Q3. The variables x and y are believed to be linearly related and a set of values of x and y are
obtained experimentally. Both variables are subject to experimental error. State the
circumstances under which the regression line of x on y should be used, rather than the
regression line of y on x , when the value of one variable is to be estimated from a given
value of the other.
When would it make little difference which line was used?
The heart rate ( x ) and the diastolic blood pressure ( y ), both in suitable units, were measured
for each of 10 hospital patients after being given a certain drug. The results were as follows.
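x: 49 51 54 58 63 64 68 70 75 78
y: 90 88 85 91 82 85 76 77 70 71
[ $n = 10$, $\sum x = 630$, $\sum (x - \bar{x})^2 = 890$, $\sum y = 815$, $\sum (y - \bar{y})^2 = 522.50$, $\sum (x - \bar{x})(y - \bar{y}) = -627.0$. ]
The scatter diagram below illustrates the data.
[Scatter diagram of y against x, with the x-axis marked from 50 to 80 and the y-axis from 70 to 90.]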
(i) State what the scatter diagram indicates about the correlation between x and y .
(ii) Calculate the equation of the estimated least squares regression line of y on x in the
form $y = a + bx$, giving the values of a and b correct to 3 significant figures.
(iii) Given that the estimated regression line of x on y has equation $x = 160.8 - 1.2y$,
obtain an estimate for the heart rate when the diastolic blood pressure is 80.
Give a reason why it might be unwise to use either of the regression lines to estimate
the diastolic blood pressure when the heart rate is 90.
(iv) Find the linear (product moment) correlation coefficient for the data and state, giving
a reason, whether its value confirms your statement in part (i).
N99/II/8(part) (FM)
Q4. (a) The amounts, x grams, of catalyst used in a chemical reaction and the resulting times,
y hours, taken to complete the reaction were recorded. The results are given in the
following table.
x:  2.0  2.5  3.0  3.5  4.0  4.5  5.0
y: 62.1 51.2 44.1 39.2 35.0 37.3 33.0
(i) Calculate the value of the linear (product moment) correlation coefficient for the data.
(ii) State what your value indicates about the relation between x and y .
(iii) Plot a scatter diagram for the data and explain how its shape is related to your answer
to part (ii).
(b) The scatter diagram shows a sample of four pairs of
values $(x, y)$. Give the coordinates of a fifth point
such that the linear (product moment) correlation
coefficient for all five points is
(i) negative,
(ii) positive,
(iii) zero.
N2000/II/6(FM)
Q5. A random sample of five students is taken from those sitting examinations in English and
History, and their marks, x and y , each out of 100, are given in the table.
English mark ( x ) 56 41 75 88 34
History mark ( y ) 32 24 70 65 47
Find, in any form, the equation of the regression line of
(i) y on x , (ii) x on y .
Use your answers to parts (i) and (ii) to deduce the product moment correlation coefficient of
the data.
[Scatter diagram for Q4(b): x and y axes with tick marks at 0, 1 and 2.]
A sixth student scored 55 in the History examination, but missed the English examination.
Use the appropriate regression line to estimate what his English mark would have been.
N2002/II/6 (FM)
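A short Python sketch of the Q5 working, offered as a check rather than a model answer. It assumes the standard least squares formulas; the variable names `b`, `c`, `d` and `english_estimate` are chosen only for this illustration.

```python
import math

x = [56, 41, 75, 88, 34]   # English marks
y = [32, 24, 70, 65, 47]   # History marks
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b = s_xy / s_xx            # slope of the line of y on x
d = s_xy / s_yy            # slope of the line of x on y
c = mean_x - d * mean_y    # intercept of the line of x on y

# r^2 equals the product of the two slopes; r takes the sign of the slopes
r = math.copysign(math.sqrt(b * d), s_xy)

# The History mark is known, so the line of x on y is the appropriate one
english_estimate = c + d * 55
print(round(b, 3), round(d, 3), round(r, 3), round(english_estimate, 1))
```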
Q6. A teacher gives two tests, X and Y, to her class of n pupils, and labels their marks
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ respectively. An extra pupil is added to her class and the
teacher gives her the provisional marks $(\bar{x}, \bar{y})$, the means of the marks already awarded.
Show that the regression lines of y on x before and after the extra pupil's arrival are
identical.
x: 24 38 41 57
y: 34 21 63 22
Given that $n = 4$ and that the marks are as in the table, find the regression line of y on x.
The extra pupil now sits test X and scores 46. Use the regression line to estimate her mark
for test Y .
Comment on the suitability of the method in this case, justifying your answer.
N2005/II/9 (FM)
Answer:
Q1. (ii) $c = 19$ (iii) $x = 5$
(iv) Since both regression coefficients are negative, the correlation coefficient is
negative.
(v) Since $|r|$ is not close to 1, the estimated value of x may not be
reliable.
Q2. (i) $y = -13.7 + 0.754x$ (ii) Possibility of linear correlation
(iii) $x = 47.7 + 0.949y$ (iv) 152
Q3. (i) High negative correlation (ii) $y = 126 - 0.704x$ (iii) 64.8
(iv) $-0.919$
Q4. (a) (i) $-0.925$ (ii) High negative correlation
Q5. (i) $y = 0.657x + 8.98$ (ii) $x = 0.841y + 18.8$; 65.0
Q6. $y = -0.271x + 45.8$; 33; since $x = 46$ is within the data range of x, the estimated
value is quite reliable.
Ex 2 Probability
Q1 (a). Events A and B are such that $P(A) = \frac{4}{7}$, $P(A' \cap B) = \frac{1}{3}$, $P(B' \mid A) = \frac{5}{12}$. A' and B' are the
complements of A and B respectively. Find
(i) $P(A \cup B)$, [2]
(ii) $P(B)$, [3]
State, with a reason, whether or not A and B are independent events. [1]
[NJC 06 Promo Q12]
Q2. Given that $P(A \mid B) = 0.75$, $P(A \mid B') = 0.2$ and $P(B) = 0.4$, find
(i) $P(A)$,
(ii) $P(B' \mid A')$,
where A' and B' denote the complements of A and B respectively. [5]
05 NYJC Q21
Q3. Whenever Sheila needs a taxi, she contacts one of three taxi companies, AverCab,
BestCab or CallCab. She calls AverCab 45% of the time, BestCab 20% of the time
and CallCab 35% of the time. A taxi from AverCab arrives late 5% of the time, a
taxi from BestCab arrives late 10% of the time and a taxi from CallCab arrives
late 7% of the time.
(i) Find the probability that, when Sheila calls for a taxi, it arrives on time. [2]
(ii) Suppose Sheila's taxi arrives late. What is the probability that she rang AverCab? [3]
05 MI
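The two parts of Q3 rest on the law of total probability and Bayes' rule. A minimal Python sketch under those rules is given below; the dictionary names `call` and `late` are assumptions made for illustration.

```python
# Probability of calling each company, and of that company's taxi being late
call = {"AverCab": 0.45, "BestCab": 0.20, "CallCab": 0.35}
late = {"AverCab": 0.05, "BestCab": 0.10, "CallCab": 0.07}

# Law of total probability: sum over the three companies
p_late = sum(call[c] * late[c] for c in call)
p_on_time = 1 - p_late                                  # part (i)

# Bayes' rule: P(AverCab | late) = P(AverCab and late) / P(late)
p_avercab_given_late = call["AverCab"] * late["AverCab"] / p_late   # part (ii)
print(round(p_on_time, 3), round(p_avercab_given_late, 3))
```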
Q4 The probabilities that a husband and wife will be alive 20 years from now are given by 0.8 and
0.9 respectively.
(i) Find the probability that in 20 years both will be alive. [1]
(ii) Find the probability that in 20 years at least one will be alive. [3]
Q5(a) The probability of a football team winning any match is $\frac{1}{2}$ and the probability of
losing any match is $\frac{1}{4}$. Three points are obtained for a win, one point for a draw and
no points for a defeat. If the team plays four matches, find the probability that the
team
(i) wins exactly two matches,
(ii) obtains exactly four points.
(b) The events A and B are such that $P(A) = \frac{1}{2}$, $P(A \mid B') = \frac{2}{3}$, $P(A \mid B) = \frac{3}{7}$,
where B' is the event "B does not occur".
Find (i) $P(B)$; (ii) $P(A \cap B)$.
Q6. Two boats, A and B, compete in a sailing competition that consists of a series of independent
races. Every race is won by either A or B and the first boat to win three races wins the
competition. The probability of winning is influenced by wind conditions and the probability of
having strong winds is 0.2. In strong winds, the probability that A will win is 0.9; in light
winds, the probability that A will win is 0.5. For each race, the wind condition is either strong
or light and the result for each race is independent of the result for any other races.
(i) Show that the probability of A winning a race is 0.58.
(ii) Calculate the probability of A winning not more than two races out of three.
(iii) Given that A won the first race, determine the conditional probability that A
will win the competition.
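The three parts of Q6 can be checked with the sketch below. It assumes, as the question states, that races are independent with the same win probability for A in each race; the helper `wins_needed` is a hypothetical name introduced for this illustration.

```python
from math import comb

p_strong = 0.2
p_win = p_strong * 0.9 + (1 - p_strong) * 0.5    # part (i): total probability
p_not_all_three = 1 - p_win ** 3                  # part (ii): at most two wins in three races

def wins_needed(a_needs, b_needs, p):
    """P(A collects a_needs wins before B collects b_needs wins)."""
    q = 1 - p
    # A's deciding win comes after exactly k wins by B, for k = 0 .. b_needs - 1
    return sum(comb(a_needs - 1 + k, k) * p ** a_needs * q ** k
               for k in range(b_needs))

# After winning the first race, A needs 2 more wins before B gets 3
p_comp_given_first = wins_needed(2, 3, p_win)     # part (iii)
print(round(p_win, 2), round(p_not_all_three, 3), round(p_comp_given_first, 3))
```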
Answer:
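Q1. (a) (i) $\frac{19}{21}$ (ii) $\frac{2}{3}$; not independent, since $P(A \cap B) = \frac{1}{3} \neq P(A)P(B) = \frac{8}{21}$
Q2. (i) 0.42 (ii) $\frac{24}{29}$
Q3. 0.933
Q4. 0.72, 0.98
Q5. (a) (i) $\frac{3}{8}$ (ii) $\frac{25}{256}$ (b) (i) $\frac{7}{10}$ (ii) $\frac{3}{10}$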
Q6. (i) 0.58 (ii) 0.805 (iii) 0.797
Ex 3 Binomial Distribution
Q1 Clinical evidence suggests that, on average, 35 out of 100 patients who are administered a
lumbar puncture for medical treatment will suffer from Severe Spinal Headache (SSH).
(i) In a group of 20 patients who are given the treatment, find the probability that at most
two of them will suffer from SSH. [2]
(ii) In another group of 60 patients who are given similar treatment, find,
using a suitable approximation, the probability that more than 15 of them will suffer
from SSH. [3]
05SAJC
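A Python sketch of both parts of Q1, assuming the number of SSH cases follows B(n, 0.35) and that part (ii) uses a normal approximation with continuity correction. The helper `binom_cdf` is a name chosen for this illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

p = 0.35
p_at_most_two = binom_cdf(2, 20, p)               # part (i), exact binomial

n = 60                                             # np = 21 and n(1 - p) = 39, both > 5
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
p_more_than_15 = 1 - approx.cdf(15.5)              # part (ii): P(X > 15) becomes P(Y > 15.5)
print(round(p_at_most_two, 4), round(p_more_than_15, 3))
```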
Q2 Mark has a large collection of books consisting of 60% Comics and the remainder being
Fiction books. It is also known that 5% of his Fiction books and 10% of his Comics are in
Chinese.
(i) Find the probability that a random collection of 20 books contains no Chinese books. [2]
(ii) Find the probability that a random pile of 6 books contains not more than 4 Comics. [2]
(iii) Using a suitable approximation, find the probability that a random collection of 50 books
contains at least 16 non-Chinese Fiction books. [4]
Q3 The staff of the hospital emergency unit observes that, out of 5 patients admitted, only 2
patients require major operation.
i) Find the probability that of a random sample of 15 patients admitted to the emergency
unit, at least 2 patients will require major operation.
ii) Find the smallest value of n if there is a probability of at least 0.98 that of a random
sample of n patients, at least 1 patient requires major operation.
iii) In a particular week, the emergency unit takes in 200 patients. Estimate, to three
significant figures, the probability that there will be at most 86 patients who require major
operation in that week.
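Part ii) of Q3 amounts to finding the least n with $1 - 0.6^n \geq 0.98$. A minimal Python sketch of a direct search, assuming each patient independently requires a major operation with probability 2/5:

```python
p = 2 / 5          # probability that a patient requires a major operation
n = 1
# P(at least one of n patients) = 1 - P(none of them)
while 1 - (1 - p) ** n < 0.98:
    n += 1
print(n)
```

The same value follows from taking logarithms: $n \geq \ln 0.02 / \ln 0.6$.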
Q4 At a firing exercise, each participant is required to fire five rounds. The probability that a
participant fires a successful shot at each round is 0.38, and it is assumed that the outcome of
each round is independent of each other.
Find the probability that a participant manages to obtain at least three successful shots.
[3]
A large number of participants is expected to take part in the firing exercise. Assuming a
suitable approximation, find the least number of participants required in this exercise such
that the probability of an average of at most two successful shots per participant is at least
0.99. [4]
05 NYJC
Q5 Alex undergoes 4 training sessions each week, on separate days. In each of his
training session, Alex has to run 10 laps of 400m. The probability of him meeting
his target time for each lap on a sunny day is 0.95. On a rainy day, the probability of
achieving his target time is 0.9. Each day is either sunny or rainy, and the
probability of a sunny day is 0.8.
(i) Calculate the probability of Alex achieving his target time for at least 8 of the laps on a
rainy day. [3]
(ii) Given that there was a week of rainy days, determine the conditional probability of Alex
being able to achieve his target time for 8 or more laps, in each of the sessions, on 3
separate days. [4]
(iii) There was sunny weather for 2 consecutive weeks. By using a suitable approximation,
determine the probability of Alex not meeting his target time for more than 4 laps during
the 2 weeks. [5]
05RJC
Answer:
Q1.(i) 0.0121 (ii) 0.932
Q2. (i) 0.189 (ii) 0.767 (iii) 0.846
Q3. (i) 0.995 (ii) n = 8 (iii) 0.826
Q4. 0.283; 638
Q5. (i) 0.930
(ii) 0.226
(iii) 0.371
Ex 4 Normal Distribution
Q1 The random variable X has a normal distribution with mean 10 and variance 4.
Find the largest integer value of a for which $P(a < X < 12) > 0.1565$. [4]
05PJC
Q2 The marks of the first test T are normally distributed with mean 54.3 and standard deviation
$\sigma$.
(i) If the probability that a randomly chosen mark for the test is within 5 marks of the
mean is 0.542, calculate the variance. [3]
(ii) For a second test, assuming that the variance of the marks is the same as the first test,
calculate the greatest probability that the mark of a randomly chosen student from the
second test is between 50 and 55. [2]
05YJC
Q3 It is found that the time required to roast a chicken may be taken to follow a normal
distribution with mean 30 minutes and standard deviation 8 minutes. The time required to
roast a duck may be taken to have an independent normal distribution with mean 40 minutes
and standard deviation 11 minutes. At any one time, only one bird can be roasted.
(i) Three randomly chosen chickens are roasted in succession in random order. Find the
probability that one of them will take more than 30 minutes to roast.
[2]
(ii) Find the time in which a duck is 90% roasted. [3]
(iii) Find the probability that the time taken to roast three randomly chosen chickens exceeds
twice the time taken to roast one randomly chosen duck. [4]
Q4 The weights of boys in a certain age group are normally distributed, with mean 48 kg and standard
deviation 4 kg. The weights of girls in the same age group are normally distributed, with mean
42 kg and standard deviation 5 kg. Boys and girls are randomly selected from this age group.
Find the probabilities that
i) the weight of a boy chosen is more than 52 kg,
ii) the mean weight of 10 girls chosen is less than 41 kg,
iii) the weight of a boy is at least 5 kg more than that of a girl,
iv) the total weight of 11 girls is more than twice the total weight of 5 boys.
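The four parts of Q4 all reduce to a single normal variable once the means and variances are combined. The Python sketch below assumes all weights are independent; variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

boy = NormalDist(48, 4)     # weight of one boy, kg
girl = NormalDist(42, 5)    # weight of one girl, kg

p_i = 1 - boy.cdf(52)

# Mean of 10 girls: same mean, standard deviation divided by sqrt(10)
p_ii = NormalDist(girl.mean, girl.stdev / sqrt(10)).cdf(41)

# B - G: means subtract, variances add
diff = NormalDist(boy.mean - girl.mean, sqrt(boy.variance + girl.variance))
p_iii = 1 - diff.cdf(5)

# (G1 + ... + G11) - 2(B1 + ... + B5): the factor 2 enters the variance as 2^2
total = NormalDist(11 * girl.mean - 2 * 5 * boy.mean,
                   sqrt(11 * girl.variance + 2 ** 2 * 5 * boy.variance))
p_iv = 1 - total.cdf(0)
print(round(p_i, 3), round(p_ii, 3), round(p_iii, 3), round(p_iv, 3))
```

Note the difference between the sum of 5 independent boys' weights (variance 5 × 16) and twice one boy's weight (variance 4 × 16); part iv) involves twice a sum, so both factors appear.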
Q5 A cutting machine produces steel rods which must not be more than 1 m in length. The
length of the steel rod, X cm, follows a normal distribution with mean $\mu$ and standard deviation $\sigma$.
(i) Given that $P(X > 100.1) = 0.0228$ and $P(X < 99.4) = 0.0038$, show that $\mu = 99.8$ cm
and find the value of $\sigma$, giving your answer correct to three significant figures. [4]
(ii) If any 3 rods are placed end to end, what is the probability that the total length
exceeds 299.30 cm? [3]
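Part (i) of Q5 gives two standardised equations in the unknown mean and standard deviation. A minimal Python sketch of solving them, assuming the inverse normal values are obtained numerically rather than from tables:

```python
from statistics import NormalDist

std = NormalDist()                     # standard normal Z
z_upper = std.inv_cdf(1 - 0.0228)      # equals (100.1 - mu) / sigma
z_lower = std.inv_cdf(0.0038)          # equals (99.4 - mu) / sigma

# Subtracting the two equations eliminates mu
sigma = (100.1 - 99.4) / (z_upper - z_lower)
mu = 100.1 - z_upper * sigma
print(round(mu, 1), round(sigma, 3))
```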
Q6 The mass of one bar of a certain chocolate is known to follow the normal distribution with
mean 71 grams and standard deviation 7 grams. The mass of a certain kind of lollipop is
known to follow the normal distribution with mean 57 grams and standard deviation 5 grams.
(i) What is the probability that the mass of 4 randomly chosen lollipops exceeds thrice the
mass of one such bar of chocolate by less than 10 grams? [4]
5 chocolate bars and 2 lollipops are packed into goodie bag A of negligible mass. The total
mass (in grams) of this goodie bag is denoted by X.
(ii) Show that E(X) = 469 and Var(X) = 295. [2]
3 chocolate bars and 5 lollipops are packed into goodie bag B of negligible mass. The total
mass (in grams) of this goodie bag is denoted by Y.
(iii) Find the probability that the mass of goodie bag A differs from the mass of goodie bag B
by less than 10 grams. [6]
05MI
Q7 An examination consists of a Verbal section and a Mathematics section. The marks obtained by
a large number of candidates taking the examination may be assumed to follow independent
normal distributions, with means and standard deviations as shown in the table below:
Mean Standard Deviation
Verbal 45 12
Mathematics 58 9
For a randomly chosen candidate, find the probability that
i) the difference in marks obtained in the two sections is less than 10;
ii) the average mark obtained in the two sections is more than 55.
Candidates A and B sat for the examination. Find the probability that the average mark
obtained in the two sections by candidate A is 5 more than that obtained by candidate B.
It is decided to calculate a candidate's total mark for the examination by adding three times the
mark in the Verbal section to twice the mark in the Mathematics section. Find the least total
mark to be obtained in order for the candidate to be among the top 10% in the examination.
Q8 The weekly income at three franchise bakeries may be assumed to be independent and
normally distributed with mean and standard deviation (in dollars) as shown in the table.
Bakery Mean Standard Deviation
A 6000 400
B 9000 800
C 5100 180
(i) Find the probability that the income at Bakery B in a week is more than the total
income at Bakery C in two weeks. [3]
(ii) Find the probability that the mean weekly income at the three bakeries is more than
$6500. [3]
The parent company receives a weekly levy from the income of each bakery as given in the table
below:
Bakery  A    B    C
Levy    12%  20%  8%
(iii) Find the probability that this levy exceeds $3000 in any given week. [4]
05 NYJC
Q9 A particular brand of baby powder is packed in boxes of two sizes: standard and large.
The mass of each standard box of baby powder may be regarded as a normal variable with
mean 150g and standard deviation 15g. The mass of each large box of baby powder may be
regarded to be another normal variable with mean 180g and standard deviation 25g.
(i) Find the probability that a randomly chosen standard box of powder is at most 140g.
[2]
(ii) Find the probability that the total mass of two randomly chosen standard boxes of
powder exceed twice the mass of a randomly chosen large box of powder by less than
5g. [3]
(iii) The random variable T denotes the mean mass (in grams) of a large box of powder in
a random sample of n large boxes of powder. Find the least value of n such that
$P(|T - 180| < 1) > 0.90$. [5] 05SRJC
Q10 A company, intending to employ a new project manager, sets a leadership assessment for all
its job applicants. The assessment consists of a test that computes the leadership potential
index, x, of the applicant, where $x \geq 0$. Extensive studies have shown that it can be assumed
that the leadership potential indexes are normally distributed with mean 15 and standard
deviation 6. An applicant is deemed suitable if he or she has a leadership potential index
above 20.
(i) Find the probability that a randomly chosen applicant is found suitable. [2]
There are 70 applicants for the job.
(ii) Write down, to the nearest whole number, the expected number of suitable applicants.
Using a suitable approximation, find the probability that the number of suitable applicants
found falls short of this number. [5]
(iii) Find the probability that the mean of the indexes of all applicants exceeds 16.5. [3]
05RJC
Q11 A randomly chosen glass jar can contain either Mocha drink or vanilla ice-cream or both.
Their weights, in grams (g), are normally distributed with the following means and standard
deviations:
                               Mean  Standard deviation
Empty glass jar                180   5
Mocha drink                    150   3
1 scoop of vanilla ice-cream    45   1.5
(i) Find the probability that the weight of a Mocha drink with 1 scoop of vanilla
ice-cream in a randomly chosen glass jar is greater than 385g. [3]
(ii) Find the probability that the total weight of 2 randomly chosen glass jars of Mocha
drink, each containing 1 scoop of vanilla ice-cream, differs from twice the weight of
another randomly chosen glass jar of Mocha drink by at least 110g. [5]
(iii) It is known that at least 97.5% of a randomly chosen glass jar containing n scoops of
vanilla ice-cream has weight exceeding (168 + 45n)g.
Find the largest possible value of n. [4]
05PJC
Answer:
Q1. $a = 10$
Q2. (i) $\sigma^2 = 45.4$ (ii) 0.2894
Q3. (i) 0.375 (ii) 25.9 mins (iii) 0.650
Q4. (i) 0.159 (ii) 0.264 (iii) 0.562 (iv) 0.230
Q5. (i) $\mu = 99.8$, $\sigma = 0.150$ (ii) 0.650
Q6. (i) 0.415 (iii) 0.162
Q7. 0.358; 0.320; 0.319; 261
Q8. (i) 0.0765 (ii) 0.745 (iii) 0.334
Q9. (i) 0.252 (ii) 0.8844 (iii) 1692
Q10. (i) 0.203 (ii) 14; 0.420 (iii) 0.0182
Q11. (i) 0.0484 (ii) 0.083 (iii) $n = 5$
Ex 5 Sampling/ Hypothesis Testing
Q1. A random sample of 40 observations is selected from a population with mean 20 and standard
deviation 2.9. Given that $\bar{X}$ is the sample mean, find
(a) $P(\bar{X} \leq 20.7)$ (b) $P(\bar{X} > 18.5)$
(c) $P(14 < \bar{X} \leq 19.5)$ (d) the value of a if $P(\bar{X} > a) = 0.05$
Q2. A random sample of 250 adult men undergoing a routine medical inspection had their heights
(x cm) measured to the nearest centimetre, and the following data were obtained:
$\sum x = 43205$ and $\sum x^2 = 7469107$.
Calculate an unbiased estimate of the population variance, correct to 5 decimal places.
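The unbiased estimate in Q2 follows from $s^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)$. A minimal Python sketch of that arithmetic, using the totals quoted in the question:

```python
n = 250
sum_x = 43205
sum_x2 = 7469107

mean = sum_x / n
# Unbiased estimate: divide the corrected sum of squares by n - 1, not n
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
print(round(mean, 2), round(s2, 5))
```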
Q3. For the following cases, comment on the sampling methods used. Write down the distribution
of each of the sample mean.
(i) A sample of 50 NJC students on the height of JC students in Singapore, given that the
height of JC students are normally distributed.
(ii) A sample of 30 people on the amount a Singaporean spent on entertainment in a
month.
(iii) 12 observations of the traffic along Hillcrest Road during peak hours 5 to 6pm for the
number of inconsiderate drivers.
(iv) A sample of 15 people on the type of burgers that Singaporeans like most.
(v) A sample of a carton of 24 Pink Dolphins bottle drinks for the sugar content in each
Pink Dolphin drink. The standard deviation for sugar content observed by the
manufacturer over the years is 0.01mg.
Q4. The random variable X has mean 25 and standard deviation 4.5. A sample of 500
observations is made. Find the value of a, correct to 3 decimal places, such that
$P(\bar{X} > a) = 0.75$. Does your answer depend on the Central Limit Theorem? Give an
explanation for your answer.
Q5. Sally wants to open an accessory shop at a shopping mall. She decides to make a survey on
the accessory preferences of the people visiting the mall. She gets three of her friends to help
her in conducting the survey. Each of them is told to question 20 ladies between the age of 16
to 25 and 20 ladies between the age of 26 to 35.
(i) What type of sampling is Sally using?
(ii) Is this sampling method random? Why?
(iii) Comment on the number of people that are surveyed.
Q6. A lift maintenance firm has a team of workmen who are available to deal with lift breakdowns
reported by residents of a large housing estate. The time (excluding traveling time), X, taken to
deal with a breakdown is assumed to follow a normal distribution with mean 65 minutes and
standard deviation 60 minutes.
(a) A random sample of 90 breakdowns is taken. Find the probability that the mean time taken
to deal with each of these breakdowns is less than 70 minutes. [3]
(b) A statistician stated that, as the times have a mean of 65 minutes and a standard deviation of
60 minutes, the normal distribution would not provide an adequate model.
(i) Explain the reason for the statistician's statement. [2]
(ii) Give a reason why, despite the statistician's statement, your answer to part (a) is
still valid. [1]
Q7. The owner of an old car park suspects that his employees are cheating on him by under-reporting
the duration the cars are left in the lots. He spent one day at his car park and kept a detailed
record of the 300 cars that were parked there that day.
From the owner's record, he made the following summary:
$\sum x = 900$ and $\sum x^2 = 5000$,
where x hours is the duration each car was left in the lot.
i) From his employees' reports, a mean of 2.7 hours was obtained.
State suitable null and alternative hypotheses, involving the population mean
duration $\mu$ hours, for a test to determine whether the employees were under-reporting the
duration the cars were left in the lots.
Carry out this test, using a 5% significance level.
ii) State, with a reason, whether it is necessary to assume that the duration each car was left
in the lot follows a normal distribution in using the above test.
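The test in Q7 i) can be sketched in Python as below. This assumes an upper-tail z-test of $H_0: \mu = 2.7$ against $H_1: \mu > 2.7$, with the sample mean treated as approximately normal because n = 300 is large; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sum_x, sum_x2 = 300, 900, 5000
mu0 = 2.7                                   # mean from the employees' reports

x_bar = sum_x / n
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)    # unbiased estimate of the variance
z = (x_bar - mu0) / sqrt(s2 / n)            # test statistic under H0

# Upper-tail test: small p-values support H1 (true mean exceeds 2.7)
p_value = 1 - NormalDist().cdf(z)
reject_h0 = p_value < 0.05
print(round(z, 3), round(p_value, 4), reject_h0)
```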
Q8.
(a) The random variable X measures the reliability of a certain electrical component. A
random sample of 100 independent observations of X is taken and the results are
summarized by:
$\sum X = 1163.2$ and $\sum X^2 = 13912.5$.
Calculate unbiased estimates of the mean and variance of X. Test, at the 5% level of
significance, whether the population mean of X differs from 12.
(b) Albert is always late for school. The discipline head has recorded his traveling time from home
to school every morning. Over a long period, he finds that Albert's mean travelling time is 24.5
minutes. He then recommends to Albert an alternative route to school. For the alternative
route, Albert's travelling time, t minutes, on each of the 72 randomly chosen mornings is again
noted. The results are summarised as follows:
$\sum (t - 20) = 215$; $\sum (t - 20)^2 = 3234$
i) Find the unbiased estimates of the population mean and population variance of the
travelling time for the alternative route.
ii) Using 5% significance level, test whether Albert's travelling time from home to school
has shortened after taking the alternative route to school.
Q9. A manufacturer of candles claimed that it produced birthday candles with a mean burning
time of 6 minutes. A random sample of 150 birthday candles was tested and the burning
times, X minutes, were summarized by
$\sum (x - 5) = 120$ and $\sum (x - 5)^2 = 638$.
(i) Calculate the unbiased estimates for the mean $\mu$ and variance $\sigma^2$. [3]
(ii) Perform a statistical test, at 6% level of significance, to determine whether the
manufacturer's claim was exaggerated. [4]
(iii) Find the largest level of significance $\alpha$ ($\alpha \in \mathbb{Z}$) which would result in the rejection of
the null hypothesis. [2] 05SAJC
Q10. A perfumery produces bottles of perfume whose content may be assumed to be normally
distributed. The owner of the perfumery claims that the mean content of each bottle of
perfume is 400 ml. To investigate the owner's claim, a random sample of 110 bottles was
taken and it was found that
$\sum (x - 400) = -220$ and $\sum (x - 400)^2 = 7416$,
where x is the content in ml of a bottle of perfume.
(i) Find an unbiased estimate of the population variance. [3]
(ii) Based on the above results, test at the 5% level of significance whether the owner has
overstated the mean content of all the bottles of perfume produced. [5]
Another random sample of size 20 is taken. The test in part (ii) is again repeated. State with
a reason whether this test is still valid given the smaller sample size. [2] 05 SRJC
Q11. A drug company tested a new slimming product called Slim Line on a random sample of
100 overweight individuals over a trial period of 3 months. The weight loss by each
individual within this trial period, x kg, is recorded and summarized as follows:
$\sum (x - 5) = 120$ and $\sum (x - 5)^2 = 250$.
(i) Test, at the 2% significance level, whether the mean weight loss by an individual is
more than 6 kg. [5]
(ii) In another 2-tail test at the 1% level of significance, the null hypothesis $\mu = \mu_0$,
where $\mu$ is the population mean weight loss by an individual, is rejected. Find the set
of possible values of $\mu_0$. [2]
05PJC
Q12. It has been observed that the mean time for the candidates to complete a specific test is 50
minutes. The time (in minutes) required for 20 randomly selected candidates from a large
cohort of candidates to complete a similar test are summarized as follows:
$\sum x = 850$ and $\sum x^2 = 36835$.
(i) Calculate the unbiased estimates for the population mean and variance of the time
required to complete the test. [3]
(ii) The examiner claims that on average, candidates take less than 50 minutes to complete
the test. Test, at 5% level of significance, whether the examiner's claim is justifiable. [5]
Answer:
Q1. (a) 0.937
(b) 0.999
(c) 0.138
(d) $a = 20.75$
Q2. $\bar{x} = 173$; $s^2 = 9.71$
Q3. (i) The sample could be biased and not representative of all JC students in Singapore.
$\bar{X} \sim N\left(\mu, \frac{s^2}{n}\right)$.
(ii) 30 people is too small a sample to be representative of the population of Singapore, because
there are many categories to consider, for example, age group, income, gender, race etc.
$\bar{X} \sim N\left(\mu, \frac{s^2}{n}\right)$.
(iii) The sample is too small and the distribution is discrete. There is no way to determine
the distribution of $\bar{X}$.
(iv) The sample is too small, and the variable is neither a (meaningful) discrete nor a continuous
random variable.
(v) The sample is not random because it comes from the same carton.
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$, where $\sigma = 0.01$.
Q4. $a = 24.864$; yes
Q5. (i) Quota sampling (ii) No. (iii) Sample size is small.
Q6. (a) 0.785
Q7. (i) p-value = 0.0305. Reject $H_0$. (ii) No
Q8. (a) 11.63; 3.86; p-value = 0.0597. Do not reject $H_0$.
(b) (i) 22.986; 36.507 (ii) p-value = 0.0335. Reject $H_0$.
Q9. (i) $\bar{x} = 5.8$; $s^2 = 3.637584$ (ii) $p = 0.09951$; do not reject $H_0$
(iii) $9.951\% \approx 10\%$
Q10. (i) 64 (ii) p-value = 0.00437. Yes, overstated; still valid
Q11. (i) $p = 0.0266$, do not reject $H_0$ (ii) $\mu_0 < 5.93$ or $\mu_0 > 6.47$
Q12. (i) $\bar{x} = 42.5$; $s^2 = 37.36842105 \approx 37.4$ (ii) $p = 0.0000000205$, reject $H_0$
The End