
# S1- REVISION WORKSHEET -1

Answers: 1) (a) r = -0.7690 (b) it shows -ve correlation, meaning fewer glove sales in higher temperatures
2(a) 0.5 2(b)

2(c)

4(a) y = 2.21 + 0.0529x 4(b) n = -14.8 + 0.0529v

4(c) 33 5(a) -86 5(b) mean = 369, SD=466 5(c) mean is raised by a few very large values, most weeks a
lot less is stolen; median is more typical but would suggest that the amount stolen is much less of a
problem than it really is. 6(a)

6(b)

6(c)

7(a) Q1 = 44.5, Q2 = 73.1, Q3 = 105.7
7(b) mean = median = 72, Q1 = …, Q3 = …
7(c) median and quartiles from model all slightly lower than in new results but reasonably close, so fairly suitable model
praveenglitters@gmail.com
Kothakonda Praveen Kumar
# S1- REVISION WORKSHEET -2
Answers : 1) B has greater probability of completing, equal to 0.454 2(a) p = 0.3 2(b)(i) 0.37
2(b)(ii) 0.52 2(c) 2.61 and 2.38 2(d) F(x) = 0.11, 0.28, 0.48, 0.61, 0.91, 1 3(a) 68 3(b) r = 0.736
3(c) y = … + …x 3(d) when x = 70, y = … 3(e) Not very reliable, as value of r shows only
moderate correlation. 4(a) 35, 15 (b) 40 (c) 18.91 (d) 7.26 (e) Q2 = 18, Q1 = 13.75, Q3 = 23
4(f) 0.376 (positive skewness)
6(a) Q1 = 23, Q2 = 29, Q3 = 34, IQR = 11 6(b) Q2 − Q1 > Q3 − Q2
6(c) recommend mean and SD as they take account of all values and there is little skew/ few extreme
values. 6(d) outliers are 0, 2 7(a) 43% 7(b) 0.37 8(a) 0.9441 8(b) 87.3% 8(c) 0.139
Answers : 9(a) 0.28 9(b) 0.3 9(c)0.76 9 (d) 0.59 10(a) 43 and 4
10(b) one value is 8 below mean and one value is 8 above mean, so mean is unchanged.
May/2014/Maldives
1. A company assembles drills using components from two sources. Goodbuy supplies 85% of the
components and Amart supplies the rest. It is known that 3% of the components supplied by
Goodbuy are faulty and 6% of those supplied by Amart are faulty.
(a) Represent this information on a tree diagram.
(3)
An assembled drill is selected at random.
(b) Find the probability that it is not faulty.
(3)
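The tree-diagram calculation in part (b) can be checked with a short sketch. The variable names are illustrative only; the probabilities are the ones given in the question.

```python
# Supplier shares and fault rates as stated in question 1.
p_goodbuy, p_amart = 0.85, 0.15
p_faulty_given_goodbuy, p_faulty_given_amart = 0.03, 0.06

# Multiply along each "not faulty" branch of the tree, then add the branches.
p_not_faulty = (p_goodbuy * (1 - p_faulty_given_goodbuy)
                + p_amart * (1 - p_faulty_given_amart))
print(round(p_not_faulty, 4))
```

The result agrees with the value 0.9655 quoted in the answers.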
2.
The number of caravans on Seaview caravan site on each night in August last year is summarised
in the following stem and leaf diagram.
Caravans (1 | 0 means 10) Totals
1 | 0 5 (2)
2 | 1 2 4 8 (4)
3 | 0 3 3 3 4 7 8 8 (8)
4 | 1 1 3 5 8 8 8 9 9 (9)
5 | 2 3 6 6 7 (5)
6 | 2 3 4 (3)
(a) Find the three quartiles of these data. (3)
During the same month, the least number of caravans on Northcliffe caravan site was 31. The
maximum number of caravans on this site on any night that month was 72. The three quartiles
for this site were 38, 45 and 52 respectively.
(b) On graph paper and using the same scale, draw box plots to represent the data for both
caravan sites. You may assume that there are no outliers. (6)
(c) Compare and contrast these two box plots.
(3)
(d) Give an interpretation to the upper quartiles of these two distributions.
(2)
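For part (a) of question 2, a minimal sketch of the quartile calculation is given below. It assumes the convention commonly used for S1 discrete data (take position n/4, n/2 or 3n/4; round up if it is not a whole number, otherwise average that value with the next). The function name `quartile` is hypothetical.

```python
def quartile(sorted_values, fraction):
    """Quartile by the assumed S1 rule: position = fraction * n;
    round up if not whole, else average that value with the next one."""
    n = len(sorted_values)
    pos = fraction * n
    if pos != int(pos):
        return sorted_values[int(pos)]  # 0-based index of the rounded-up position
    k = int(pos)
    return (sorted_values[k - 1] + sorted_values[k]) / 2

# Seaview data read from the stem and leaf diagram (31 nights).
seaview = [10, 15, 21, 22, 24, 28, 30, 33, 33, 33, 34, 37, 38, 38,
           41, 41, 43, 45, 48, 48, 48, 49, 49,
           52, 53, 56, 56, 57, 62, 63, 64]

q1, q2, q3 = (quartile(seaview, f) for f in (0.25, 0.5, 0.75))
print(q1, q2, q3)
```

Under this convention the values match the quoted answers Q1 = 33, Q2 = 41, Q3 = 52.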
3. The following table shows the height x, to the nearest cm, and the weight y, to the nearest kg, of
a random sample of 12 students.
x 148 164 156 172 147 184 162 155 182 165 175 152
y 39 59 56 77 44 77 65 49 80 72 70 52
(a) On graph paper, draw a scatter diagram to represent these data.
(b) Write down, with a reason, whether the correlation coefficient between x and y is positive or
negative.
The data in the table can be summarised as follows.
(3)
Answers : 1(b) 0.9655 2(a) Q1 = 33, Q2 = 41, Q3 = 52 2(c) Median of Northcliffe is greater than median of Seaview / Upper
quartiles are the same / IQR of Northcliffe is less than IQR of Seaview / Northcliffe positive skew, Seaview negative skew /
Northcliffe symmetrical, Seaview positive skew (quartiles) / Range of Seaview greater than range of Northcliffe
2(d) On 75% of the nights that month both had no more than 52 caravans on site.
3(b) Positive; as x increases, y increases 3(c) 1793 3(d) 1.028
S1 - REVISION WORKSHEET -3
Σx = 1962, Σy = 740, Σy² = 47 746, Σxy = 122 783, Sxx = 1745.
(e) Find, to 3 significant figures, the mean ȳ and the standard deviation s of the weights of
this sample of students.
(3)
(f) Find the values of ȳ ± 1.96s.
(g) Comment on whether or not you think that the weights of these students could be
modelled by a normal distribution.
(1)
(2)
4. The random variable X has probability function P(X = x) = kx, x = 1, 2, ..., 5.
(a) Show that k = 1/15. (2)
Find
(b) P(X < 4), (2)
(c) E(X), (2)
(d) E(3X − 4). (2)
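A short sketch for question 4, using exact fractions. It takes part (d) to be E(3X − 4), which is the reading consistent with the quoted answer of 7; the names are illustrative.

```python
from fractions import Fraction

xs = range(1, 6)
k = Fraction(1, sum(xs))            # probabilities kx must sum to 1
pmf = {x: k * x for x in xs}

p_less_than_4 = sum(p for x, p in pmf.items() if x < 4)
mean = sum(x * p for x, p in pmf.items())
mean_3x_minus_4 = 3 * mean - 4      # E(aX + b) = aE(X) + b

print(k, p_less_than_4, mean, mean_3x_minus_4)
```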
5. Articles made on a lathe are subject to three kinds of defect, A, B or C. A sample of 1000 articles
was inspected and the following results were obtained.
31 had a type A defect
37 had a type B defect
42 had a type C defect
11 had both type A and type B defects
13 had both type B and type C defects
10 had both type A and type C defects
6 had all three types of defect.
(a) Draw a Venn diagram to represent these data. (6)
Find the probability that a randomly selected article from this sample had
(b) no defects, (1)
(c) no more than one of these defects. (2)
An article selected at random from this sample had only one defect.
(d) Find the probability that it was a type B defect. (2)
Two different articles were selected at random from this sample.
(e) Find the probability that both had type B defects. (2)
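The Venn diagram regions in question 5 can be worked out by inclusion-exclusion. The sketch below uses the counts from the question; the variable names are illustrative.

```python
n = 1000
a, b, c = 31, 37, 42
ab, bc, ac, abc = 11, 13, 10, 6

# Regions with exactly one defect (single circle minus overlaps, adding back the centre).
only_a = a - ab - ac + abc
only_b = b - ab - bc + abc
only_c = c - bc - ac + abc
exactly_one = only_a + only_b + only_c

any_defect = a + b + c - ab - bc - ac + abc
none = n - any_defect

p_none = none / n                              # part (b)
p_at_most_one = (none + exactly_one) / n       # part (c)
p_b_given_one = only_b / exactly_one           # part (d)
p_both_b = (b / n) * ((b - 1) / (n - 1))       # part (e), without replacement

print(p_none, p_at_most_one, round(p_b_given_one, 3), round(p_both_b, 5))
```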
6. A discrete random variable is such that each of its values is assumed to be equally likely.
(a) Write down the name of the distribution that could be used to model this random variable. (1)
(b) Give an example of such a distribution. (1)
(c) Comment on the assumption that each value is equally likely. (2)
(d) Suggest how you might refine the model in part (a). (2)
(c) Find Sxy. (2)
The equation of the regression line of y on x is y = −106.331 + bx.
(d) Find, to 3 decimal places, the value of b. (2)
Answers : 3(e) 61.7, sd = 13.3 or 13.9 3(f) 34-36, 87-89 3(g) All values between their 35.7 and their
87.7 so could be normal. 4(b) 0.4 4(c) 11/3 4(d) 7 5(b) 0.918 5(c) 0.978 5(d) 0.316 or 0.317
5(e) 0.0013 or 0.00133 6(a) (discrete) Uniform 6(b) e.g. rolling a fair die / tossing a fair coin
6(c) Useful in theory: allows problems to be modelled; not necessarily true in practice
6(d) carry out an experiment to establish probabilities
1.
An athlete believes that her times for running 200 metres in races are normally distributed
with a mean of 22.8 seconds.
(a) Given that her time is over 23.3 seconds in 20% of her races, calculate the variance of
her times. (5 marks)
(b) The record over this distance for women at her club is 21.82 seconds. According to
her model, what is the chance that she will beat this record in her next race? (3 marks)
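A minimal sketch for question 1, using `statistics.NormalDist` from the Python standard library in place of printed tables. Values may differ slightly from table-based answers because z is not rounded.

```python
from statistics import NormalDist

mean = 22.8
# P(T > 23.3) = 0.20 means 23.3 sits at the 80th percentile.
z = NormalDist().inv_cdf(0.80)
sigma = (23.3 - mean) / z
variance = sigma ** 2                             # part (a)

p_record = NormalDist(mean, sigma).cdf(21.82)     # part (b): P(T < 21.82)
print(round(variance, 4), round(p_record, 4))
```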
2. The events A and B are such that P(A) = 5/16, P(B) = 1/2 and P(A | B) = 1/4.
Find
(a) P(A ∩ B),
(b) P(B′ | A),
(c) P(A′ ∪ B). (2 + 3 + 2 marks)
(d) Determine, with a reason, whether or not the events A and B are independent. (3 marks)
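The sketch below works through question 2 with exact fractions. The set symbols are missing from the extracted text, so it assumes the parts are P(A ∩ B), P(B′ | A) and P(A′ ∪ B); this reading is the one that reproduces the quoted answers 1/8, 3/5 and 13/16.

```python
from fractions import Fraction

p_a, p_b, p_a_given_b = Fraction(5, 16), Fraction(1, 2), Fraction(1, 4)

p_a_and_b = p_a_given_b * p_b                  # (a) multiplication rule
p_a_and_not_b = p_a - p_a_and_b
p_not_b_given_a = p_a_and_not_b / p_a          # (b) assumed to be P(B' | A)
p_not_a_or_b = 1 - p_a_and_not_b               # (c) assumed to be P(A' ∪ B)
independent = p_a_and_b == p_a * p_b           # (d) test P(A ∩ B) = P(A)P(B)

print(p_a_and_b, p_not_b_given_a, p_not_a_or_b, independent)
```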
3. A group of 60 children were each asked to choose an integer value between 1 and 9 inclusive.
Their choices are summarised in the table below.
Value chosen 1 2 3 4 5 6 7 8 9
Number of children 3 4 5 10 12 13 7 4 2
(a) Calculate the mean and standard deviation of the values chosen. (6 marks)
It is suggested that the value chosen could be modelled by a discrete uniform distribution.
(b) Write down the mean that this model would predict. (2 marks)
Given also that the standard deviation according to this model would be 2.58,
(c) explain why this model is not suitable and suggest why this is the case. (2 marks)
4. A six-sided die is biased such that there is an equal chance of scoring each of the numbers
from 1 to 5 but a score of 6 is three times more likely than each of the other numbers.
(a) Write down the probability distribution for the random variable, X, the score on a single
throw of the die.
(b) Show that E(X) = 33/8.
(c) Find E(4X − 1). (4 + 3 + 2 marks)
(d) Find Var(X). (4 marks)
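A sketch of the biased-die calculation in question 4, with exact fractions. Part (c) is taken to be E(4X − 1), which matches the quoted answer 31/2.

```python
from fractions import Fraction

# Scores 1 to 5 each have probability p and score 6 has 3p, so 8p = 1.
pmf = {x: Fraction(1, 8) for x in range(1, 6)}
pmf[6] = Fraction(3, 8)

e_x = sum(x * p for x, p in pmf.items())
e_x2 = sum(x * x * p for x, p in pmf.items())
e_4x_minus_1 = 4 * e_x - 1
var_x = e_x2 - e_x ** 2

print(e_x, e_4x_minus_1, float(var_x))
```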
5.
The number of patients attending a hospital trauma clinic each day was recorded over several
months, giving the data in the table below.
Number of patients   10 - 19   20 - 29   30 - 34   35 - 39   40 - 44   45 - 49   50 - 69
Frequency            2         18        24        30        27        14        5
These data are represented by a histogram.
Given that the bar representing the 20 - 29 group is 2 cm wide and 7.2 cm high,
(a) calculate the dimensions of the bars representing the groups
(i) 30 - 34
(ii) 50 - 69 (5 marks)
(b) Use linear interpolation to estimate the median and quartiles of these data. (6 marks)
The lowest and highest numbers of patients recorded were 14 and 67 respectively.
(c) Represent these data with a boxplot drawn on graph paper and describe the skewness of
the distribution. (6 marks)
GRADE - 11 S1 - REVISION WORKSHEET -4
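For question 5(b), linear interpolation within grouped data can be sketched as follows. The helper `interpolate` is hypothetical, and the class boundaries (9.5, 19.5, ...) are an assumption based on the counts being whole numbers.

```python
def interpolate(boundaries, freqs, target):
    """Value below which `target` observations lie, assuming observations
    are spread evenly within each class."""
    cumulative = 0
    for (low, high), f in zip(boundaries, freqs):
        if cumulative + f >= target:
            return low + (target - cumulative) / f * (high - low)
        cumulative += f
    raise ValueError("target exceeds total frequency")

boundaries = [(9.5, 19.5), (19.5, 29.5), (29.5, 34.5), (34.5, 39.5),
              (39.5, 44.5), (44.5, 49.5), (49.5, 69.5)]
freqs = [2, 18, 24, 30, 27, 14, 5]
n = sum(freqs)

q1, q2, q3 = (interpolate(boundaries, freqs, n * f) for f in (0.25, 0.5, 0.75))
print(round(q1, 1), round(q2, 1), round(q3, 1))
```

The rounded values agree with the quoted answers 31.6, 37.2 and 42.5.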
6. Penshop have stores selling stationery in each of 6 towns. The population, P, in tens of
thousands and the monthly turnover, T, in thousands of pounds for each of the shops are as
recorded below.
Town          Abberton   Bember   Claster   Deller   Edgeton   Figland
P (10 000s)   3.2        7.6      5.2       9.0      8.1       4.8
T (£000s)     11.1       12.4     13.3      19.3     17.9      11.8
(a) Represent these data on a scatter diagram with T on the vertical axis. (4 marks)
(b) (i) Which town's shop might appear to be underachieving given the populations of
the towns?
(ii) Suggest two other factors that might affect each shop's turnover. (3 marks)
You may assume that
ΣP = 37.9, ΣT = 85.8, ΣP² = 264.69, ΣT² = 1286, ΣPT = 574.25.
(c) Find the equation of the regression line of T on P. (7 marks)
(d) Estimate the monthly turnover that might be expected if a shop were opened in Gratton,
a town with a population of 68 000. (2 marks)
(e) Why might the management of Penshop be reluctant to use the regression line to
estimate the monthly turnover they could expect if a shop were opened in Haggin, a
town with a population of 172 000?
(1 mark)
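The regression line in question 6(c) and the estimate in 6(d) can be reproduced from the summary statistics. This is a sketch with illustrative variable names; the sums are those given in the question.

```python
n = 6
sum_p, sum_t = 37.9, 85.8
sum_pp, sum_pt = 264.69, 574.25

s_pp = sum_pp - sum_p ** 2 / n          # S_PP
s_pt = sum_pt - sum_p * sum_t / n       # S_PT

b = s_pt / s_pp                         # gradient
a = sum_t / n - b * sum_p / n           # intercept: mean(T) - b * mean(P)

# Gratton: population 68 000 is P = 6.8 in tens of thousands.
turnover_gratton = a + b * 6.8          # in thousands of pounds
print(round(a, 2), round(b, 2), round(turnover_gratton, 1))
```

Haggin (P = 17.2) lies well outside the range of P used here, which is why part (e) asks about extrapolation.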
7. The random variable X is normally distributed with mean 79 and variance 144.
Find
(a) P(X < 70), (3)
(b) P(64 < X < 96). (3)
It is known that P(79 − a ≤ X ≤ 79 + b) = 0.6463. This information is shown in the figure below.
[Figure: normal curve with the area between 79 − a and 79 + b marked as 0.6463]
Given that P(X ≥ 79 + b) = 2P(X ≤ 79 − a),
(c) show that the area of the shaded region is 0.1179. (3)
(d) Find the value of b. (4)
Answers : 1(a) 0.3530 (b) 0.0495 2(a) 1/8 (b) 3/5 (c) 13/16 (d) Not independent
3(a) 1.93 (b) (by symmetry) 5 (c) actual standard deviation much lower than in model; tendency to pick
numbers nearer the middle 4(b) 33/8 (c) 31/2 (d) 3.36 5(a)(i) w = 1 cm, h = 19.2 cm
(ii) w = 4 cm, h = 1 cm (b) 31.6, 37.2, 42.5 (c) Symmetrical (or slight +ve skew) 6(b)(i) Bember
6(b)(ii) e.g. how near to town centre; size of shop 6(c) T = 6.24 + 1.28P 6(d) £14 900
6(e) P = 17.2, which lies outside the set of values used to obtain the equation.
7(a) 0.2266 (b) 0.8166 (d) 8.64
Kothakonda Praveen Kumar - praveenglitters@gmail.com - 009607763399
1. An adult evening class has 14 students. The ages of these students have a mean of 31.2 years
and a standard deviation of 7.4 years.
A new student who is exactly 42 years old joins the class. Calculate the mean and standard
deviation of the 15 students now in the group.
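A sketch of question 1: recover Σx and Σx² from the old mean and standard deviation, add the new student, and recompute. It assumes the standard deviation is the divide-by-n form used in S1.

```python
from math import sqrt

n, mean, sd = 14, 31.2, 7.4
new_age = 42

sum_x = n * mean
sum_x2 = n * (sd ** 2 + mean ** 2)      # from variance = Σx²/n − mean²

new_n = n + 1
new_mean = (sum_x + new_age) / new_n
new_sd = sqrt((sum_x2 + new_age ** 2) / new_n - new_mean ** 2)
print(round(new_mean, 1), round(new_sd, 1))
```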
2. A tennis coach believes that taller players are generally capable of hitting faster serves.
To investigate this hypothesis he collects data on the 20 adult male players he coaches.
The height, h, in metres and the speed of each player's fastest serve, v, in miles per hour were
recorded and summarised as follows:
Σh = 36.22, Σv = 2275, Σh² = 65.7396, Σv² = 259853, Σhv = 4128.03.
(a) Calculate the product moment correlation coefficient for these data.
(b) Comment on the coach's hypothesis.
3. The events A and B are such that
P(A) = 0.2 and P(A ∪ B) = 0.6
Find
(a) P(A′ ∩ B′) (b) P(A′ ∩ B)
Given also that events A and B are independent, find (c) P(B) (d) P(A′ ∪ B′)
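For the tennis data in question 2(a), the product moment correlation coefficient follows directly from the summary sums. A minimal sketch, with illustrative variable names:

```python
from math import sqrt

n = 20
sum_h, sum_v = 36.22, 2275
sum_hh, sum_vv, sum_hv = 65.7396, 259853, 4128.03

s_hh = sum_hh - sum_h ** 2 / n
s_vv = sum_vv - sum_v ** 2 / n
s_hv = sum_hv - sum_h * sum_v / n

r = s_hv / sqrt(s_hh * s_vv)
print(round(r, 4))
```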
4. The discrete random variable X has the following probability distribution.
x          1     2      3   4      5
P(X = x)   0.1   0.35   k   0.15   k
Calculate
(a) k,
(b) F(2),
(c) P(1.3 < X ≤ 3.8),
(d) E(X),
(e) Var(3X + 2).
5. For a project, a student asked 40 people to draw two straight lines with what they thought was
an angle of 75° between them, using just a ruler and a pencil. She then measured the size of
the angles that had been drawn and her data are summarised in this stem and leaf diagram.
Angle (6 | 4 means 64) Totals
44 1 (1)
4 (0)
5 0 00 2 (3)
5 55 5 8 (3)
6 11 1 1 1 1 (5)
6 55 5 5 5 5 (5)
7 0 0 0 0 0 0 0 1 00 (9)
7 55 5 5 5 5 5 6 (7)
8 00 0 0 0 1 (5)
8 6 55 (2)
(a) Find the median and quartiles of these data.
Given that any values outside of the limits Q1 − 1.5(Q3 − Q1) and Q3 + 1.5(Q3 − Q1) are to be
regarded as outliers,
(b) determine if there are any outliers in these data,
(c) draw a box plot representing these data on graph paper,
(d) describe the skewness of the distribution and suggest a reason for it.
Grade - 11 S1 - Revision Worksheet-5
Answers : (1) mean = 31.9 years, SD = 7.6 years (2)(a) r = 0.6417 (b) r is fairly strongly +ve, supporting hypothesis
(3)(a) 0.4 (b) 0.4 (c) P(B) = 0.5 (d) 0.9 4(a) k = 0.2 (b) 0.45 (c) 0.55 (d) 3 (e) E(X²) = 10.7, Var(X) = 1.7,
Var(3X + 2) = 15.3 (5)(a) Q1 = 63, Q2 = 71.5, Q3 = 77 (b) 41 is an outlier

6. The individual letters of the word STATISTICAL are written on 11 cards which are then
shuffled. One card is selected at random. Find the probability that it is
(a) a vowel, (b) a T, given that it is a consonant.
The 11 cards are then shuffled again and the top three are turned over. Find the probability
that
(c) all three of the cards have a T on them,
(d) at least two of the cards show a vowel.
7. The volume of liquid in bottles of sparkling water from one producer is believed to be
normally distributed with a mean of 704 ml and a variance of 3.2 ml².
Calculate the probability that a randomly chosen bottle from this producer contains
(a) more than 706 ml, (b) between 703 and 708 ml.
The bottles are labelled as containing 700 ml.
(c) In a delivery of 1 200 bottles, how many could be expected to contain less than the
stated 700 ml?
The bottling process can be adjusted so that the mean changes but the variance is unchanged.
(d) What should the mean be changed to in order to have only a 0.1% chance of a bottle
having less than 700 ml of sparkling water? Give your answer correct to 1 decimal place.
5(c) [Box plot]
5(d) negative skewness, e.g. people know what 90° looks like, so are less likely to draw an angle much larger than 75°
6. (a) 4/11
(b) 3 Ts, 7 consonants, 3/7
(c) 3/11 × 2/10 × 1/9 = 1/165
(d) 3 vowels: 4/11 × 3/10 × 2/9 = 4/165
2 vowels: 3 × 4/11 × 3/10 × 7/9 = 14/55
P(at least 2 vowels) = 4/165 + 14/55 = 46/165
7. (a) P(Z > (706 − 704)/√3.2) = P(Z > 1.12) = 0.1314
(b) P((703 − 704)/√3.2 < Z < (708 − 704)/√3.2)
= P(−0.56 < Z < 2.24)
= P(Z < 2.24) − P(Z < −0.56)
= 0.9875 − 0.2877 = 0.6998
(c) P(Z < (700 − 704)/√3.2) = P(Z < −2.24) = 0.0125
expect 0.0125 × 1 200 = 15
(d) P(Z < (700 − μ)/√3.2) = 0.001
(700 − μ)/√3.2 = −3.0902
μ = 700 + (3.0902 × √3.2) = 705.5 ml (1 dp)
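The worked answers to question 7 can be cross-checked with `statistics.NormalDist` from the Python standard library. This is a sketch; small differences from the table values arise because the worked solution rounds z to 2 decimal places.

```python
from math import sqrt
from statistics import NormalDist

sigma = sqrt(3.2)                     # variance is 3.2 ml², so sd is its square root
bottles = NormalDist(704, sigma)

p_over_706 = 1 - bottles.cdf(706)                     # (a)
p_between = bottles.cdf(708) - bottles.cdf(703)       # (b)
expected_short = 1200 * bottles.cdf(700)              # (c)

# (d) choose the mean so that P(X < 700) = 0.001 with sigma unchanged.
new_mean = 700 - NormalDist().inv_cdf(0.001) * sigma
print(round(p_over_706, 4), round(p_between, 4),
      round(expected_short), round(new_mean, 1))
```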
# S1- REVISION WORKSHEET - 6
1. The students in a class were each asked to write down how many CDs they owned. The student with the
least number of CDs had 14 and all but one of the others owned 60 or fewer. The remaining student
owned 65. The quartiles for the class were 30, 34 and 42 respectively.
Outliers are defined to be any values outside the limits of 1.5(Q3 − Q1) below the lower quartile or
above the upper quartile.
On graph paper draw a box plot to represent these data, indicating clearly any outliers. (7 marks)
2. The random variable X is normally distributed with mean 177.0 and standard deviation 6.4.
(a) Find P(166 <X <185). (4 marks)
It is suggested that X might be a suitable random variable to model the height, in cm, of adult males.
(b) Give two reasons why this is a sensible suggestion. (2 marks)
(c) Explain briefly why mathematical models can help to improve our understanding of real-world
problems. (2 marks)
3. A fair six-sided die is rolled. The random variable Y represents the score on the uppermost face.
(a) Write down the probability function of Y. (2 marks)
(b) State the name of the distribution of Y. (1 mark)
Find the value of
(c) E(6Y +2), (4 marks)
(d) Var(4Y − 2). (5 marks)
4. The following grouped frequency distribution summarises the number of minutes, to the nearest
minute, that a random sample of 200 motorists were delayed by roadworks on a stretch of motorway.
Delay (mins)   Number of motorists
4 - 6          15
7 - 8          28
9              49
10             53
11 - 12        30
13 - 15        15
16 - 20        10
(a) Using graph paper represent these data by a histogram. (4 marks)
(b) Give a reason to justify the use of a histogram to represent these data. (1 mark)
(c) Use interpolation to estimate the median of this distribution. (2 marks)
(d) Calculate an estimate of the mean and an estimate of the standard deviation of these data.
(6 marks)
One coefficient of skewness is given by 3(mean − median) / standard deviation.
(e) Evaluate this coefficient for the above data. (2 marks)
(f) Explain why the normal distribution may not be suitable to model the number of minutes that
motorists are delayed by these roadworks. (2 marks)
5. The employees of a company are classified as management, administration or production. The following
table shows the number employed in each category and whether or not they live close to the company or
some distance away.
An employee is chosen at random.
Find the probability that this employee
(a) is an administrator, (2 marks)
(b) lives close to the company, given that the employee is a manager. (2 marks)
Of the managers, 90% are married, as are 60% of the administrators and 80% of the production employees.
(c) Construct a tree diagram containing all the probabilities. (3 marks)
(d) Find the probability that an employee chosen at random is married. (3 marks)
An employee is selected at random and found to be married.
(e) Find the probability that this employee is in production. (3 marks)
6. A local authority is investigating the cost of reconditioning its incinerators. Data from 10 randomly
chosen incinerators were collected. The variables monitored were the operating time x (in thousands of
hours) since last reconditioning and the reconditioning cost y (in £1000). None of the incinerators had been
used for more than 3000 hours since last reconditioning.
The data are summarised below,
Σx = 25.0, Σx² = 65.68, Σy = 50.0, Σy² = 260.48, Σxy = 130.64.
(a) Find Sxx, Sxy, Syy. (3 marks)
(b) Calculate the product moment correlation coefficient between x and y. (3 marks)
(c) Explain why this value might support the fitting of a linear regression model of the form
y =a +bx. (1 mark)
(d) Find the values of a and b. (4 marks)
(e) Give an interpretation of a. (1 mark)
(f) Estimate
(i) the reconditioning cost for an operating time of 2400 hours,
(ii) the financial effect of an increase of 1500 hours in operating time. (4 marks)
(g) Suggest why the authority might be cautious about making a prediction of the reconditioning cost
of an incinerator which had been operating for 4500 hours since its last reconditioning.
(2 marks)
Answers : 2(b) Height is a continuous random variable,
2(c) Simplifies a real world problem / quicker and cheaper.
3(a) P(Y = y) = 1/6; y = 1, 2, 3, 4, 5, 6 3(b) Discrete uniform distribution 3(c) 23 3(d) 46.6 or 46.7
4(b) The variable (minutes delayed) is continuous 4(c) 9.65 or 9.66 4(d) mean = 9.96, SD = 2.78 4(e) 0.329
4(f) For a normal distribution, coefficient of skewness must be zero. In this case coefficient of skewness is
0.329, so normal distribution may not be suitable. 5(a) 0.28 5(b) 0.3 5(d) 0.76 5(e) 0.589
6(a) 3.18, 5.64, 10.48 6(b) 0.977 6(c) since positive correlation 6(d) a = 0.566, b = 1.77 6(e) The cost of
reconditioning immediately after it has been reconditioned is 566 pounds.
6(f)(i) £4814 (ii) increase of £2655. 6(g) 4500 hours is well outside the range of values of x (x ≤ 3) and there is
no evidence that the model will apply.
             Live close   Live some distance away
Management   6            14
Production   45           25
1.
(a) Draw two separate scatter diagrams, each with eight points, to illustrate the relationship
between x and y in the cases where they have a product moment correlation coefficient
equal to
(i) exactly + 1,
(4 marks)
(b) Explain briefly how the conclusion you would draw from a product moment correlation
coefficient of + 0.3 would vary according to the number of pairs of data used in its
calculation. (2 marks)
2. A histogram was drawn to show the distribution of age in completed years of the participants
on an outward-bound course.
There were 32 people aged 30 - 34 years on the course. The height of the rectangle
representing this group was 19.2 cm and it was 1 cm in width.
Given that there were 28 people aged 35 - 39 years,
(a) find the height of the rectangle representing this group. (3 marks)
Given that the height of the rectangle representing people aged 40 - 59 years was 2.7 cm,
(b) find the number of people on the course in this age group. (3 marks)
3. The events A and B are such that P(A) = 7/12, P(A ∩ B) = 1/4 and P(A | B) = 2/3.
Find (a) P(B), (b) P(A ∪ B), (c) P(B | A′). (9 marks)
4. The owner of a mobile burger-bar believes that hot weather reduces his sales.
To investigate the effect on his business he collected data on his daily sales, P, and the
maximum temperature, T °C, on each of 20 days. He then coded the data, using x = T − 20 and
y = P − 300, and calculated the summary statistics given below.
Σx = 57, Σy = 2222, Σx² = 401, Σy² = 305576, Σxy = 3871.
(a) Find an equation of the regression line of P on T. (9 marks)
The owner of the bar doesn't believe it is profitable for him to run the bar if he takes less than
£460 in a day.
(b) According to your regression line at what maximum daily temperature, to the nearest
degree Celsius, does it become unprofitable for him to run the bar? (3 marks)
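A sketch of question 4: fit the line of y on x from the coded sums, then undo the coding. The operators in the coding are missing from the extracted text, so x = T − 20 and y = P − 300 are assumptions here, as is the reading of 460 as a daily takings threshold.

```python
n = 20
sum_x, sum_y = 57, 2222
sum_xx, sum_xy = 401, 3871

s_xx = sum_xx - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

b = s_xy / s_xx                         # gradient of y on x
a = sum_y / n - b * sum_x / n           # intercept of y on x

# Undo the assumed coding: P - 300 = a + b(T - 20)  =>  P = (300 + a - 20b) + bT
slope = b
intercept = 300 + a - 20 * b

break_even_temp = (460 - intercept) / slope   # temperature at which P falls to 460
print(round(intercept, 1), round(slope, 2), round(break_even_temp))
```

Because the gradient is negative, sales fall below 460 for temperatures above the break-even value.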
5. The discrete random variable X has the probability function shown below.
P(X = x) = kx for x = 2, 3, 4, 5, 6, and P(X = x) = 0 otherwise.
(a) Find the value of k. (2 marks)
(b) Show that E(X) = 9/2. (3 marks)
Find
(c) P[X > E(X)], (2 marks)
(d) E(2X − 5), (2 marks)
(e) Var(X). (4 marks)

S1- REVISION WORKSHEET -7
6. A geologist is analysing the size of quartz crystals in a sample of granite. She estimates that
the longest diameter of 75% of the crystals is greater than 2 mm, but only 10% of the crystals
have a longest diameter of more than 6 mm.
The geologist believes that the distribution of the longest diameters of the quartz crystals can
be modelled by a normal distribution.
(a) Find the mean and variance of this normal distribution. (9 marks)
The geologist also estimated that only 2% of the longest diameters were smaller than 1 mm.
(b) Calculate the corresponding percentage that would be predicted by a normal distribution
with the parameters you calculated in part (a).
(c) Hence, comment on the suitability of the normal distribution as a model in this
situation. (2 marks)
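Question 6 reduces to two simultaneous equations in μ and σ, one for each quoted percentage. The sketch below solves them with `statistics.NormalDist`; results may differ slightly from table-based working.

```python
from statistics import NormalDist

std = NormalDist()
z_low = std.inv_cdf(0.25)     # P(X > 2) = 0.75, so P(X < 2) = 0.25
z_high = std.inv_cdf(0.90)    # P(X > 6) = 0.10, so P(X < 6) = 0.90

# 2 = mu + z_low * sigma and 6 = mu + z_high * sigma
sigma = (6 - 2) / (z_high - z_low)
mu = 2 - z_low * sigma
variance = sigma ** 2                                  # part (a)

pct_below_1 = 100 * NormalDist(mu, sigma).cdf(1)       # part (b)
print(round(mu, 2), round(variance, 2), round(pct_below_1, 1))
```

The predicted percentage below 1 mm is far larger than the geologist's 2% estimate, which is the comparison part (c) asks for.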
7. Jane and Tahira play together in a basketball team. The list below shows the number of points
that Jane scored in each of 30 games.
39 19 28 30 18 21 23 15 34 24
29 17 43 12 24 25 41 19 26 40
45 23 21 32 37 24 18 15 24 36
(a) Construct a stem and leaf diagram for these data. (3 marks)
(b) Find the median and quartiles for these data. (4 marks)
(c) Represent these data with a boxplot. (3 marks)
Tahira played in the same 30 games and her lowest and highest points total in a game were 19
and 41 respectively. The quartiles for Tahira were 27, 31 and 35 respectively.
(d) Using the same scale draw a boxplot for Tahira's points totals. (2 marks)
(e) Compare and contrast the number of points scored per game by Jane and Tahira.
(3 marks)
(3 marks)
1. A zoologist is analysing data on the weights of adult female otters.
(a) Explain briefly what you understand by a statistical model.
(b) Name a distribution that you think might be suitable for modelling such data.
(c) Describe two features that you would expect to find in the distribution of the weights
of adult female otters and that led to your choice in part (b).
(d) Why might your choice in part (b) not be suitable for modelling the weights of all
2. For a geography project a student studied weather records kept by her school since 1993.
To see if there was any evidence of global warming she worked out the mean temperature in
degrees Celsius at noon for the month of June in each year.
Her results are shown in the table below.
Year                    1993   1994   1995   1996   1997   1998   1999   2000
Mean temperature (°C)   21.9   24.1   20.7   23.0   24.2   22.1   22.6   23.9
(a) Plot a scatter diagram showing these data.
The student wanted to investigate further whether or not her data provided evidence of an
increase in temperature in June each year. Using Y for the number of years since 1993 and T
for the mean temperature, she calculated the following summary statistics.
ΣY = 28, ΣT = 182.5, ΣY² = 140, ΣT² = 4173.93, ΣYT = 644.7.
(b) Calculate the product moment correlation coefficient for these data.
(c) Comment on your result in relation to the student's enquiry.
3. In a study of 120 pet-owners it was found that 57 owned at least one dog and of these 16 also
owned at least one cat. There were 35 people in the group who didn't own any cats or dogs.
As an incentive to take part in the study, one participant is chosen at random to win a
year's free supply of pet food.
Find the probability that the winner of this prize
(a) owns a dog but does not own a cat,
(b) owns a cat,
(c) does not own a cat given that they do not own a dog.
4. The time taken in minutes, T, for a mechanic to service a bicycle follows a normal distribution
with a mean of 25 minutes and a variance of 16 minutes².
Find (a) P(T < 28), (b) P(|T − 25| < 5).
One afternoon the mechanic has 3 bicycles to service.
(c) Find the probability that he will take less than 23 minutes on each of the three bicycles.
5. An internet service provider runs a series of television adverts at weekly intervals.
To investigate the effectiveness of the adverts the company recorded the viewing figures in
millions, v, for the programme in which the advert was shown, and the number of new
customers, c, who signed up for their service the next day.
The results are summarised as follows.
v̄ = 4.92, c̄ = 104.4, Svc = 594.05, Svv = 85.44.
(a) Calculate the equation of the regression line of c on v in the form c = a + bv. (4 marks)
(b) Give an interpretation of the constants a and b in this context.
(2 marks)
(c) Estimate the number of customers that will sign up with the company the day after an
advert is shown during a programme watched by 3.7 million viewers. (2 marks)
(d) State two other factors besides viewing figures that will affect the success of an advert
in gaining new customers for the company.
(2 marks)
6. The number of people visiting a new art gallery each day is recorded over a three-month
period and the results are summarised in the table below.
Number of visitors   Number of days
400 - 459            3
460 - 479            8
480 - 499            13
500 - 519            12
520 - 539            18
540 - 559            11
560 - 599            9
600 - 699            5
(a) Draw a histogram on graph paper to illustrate these data.
In order to calculate summary statistics for the data it is coded using
y = (x − 509.5)/10, where x is the mid-point of each class.
(b) Find Σfy.
You may assume that Σfy² = 2041.
(c) Using these values for Σfy and Σfy², calculate estimates of the mean and standard
deviation of the number of visitors per day.
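The coding in question 6 can be sketched as below, taking y = (x − 509.5)/10 with x the class mid-point. This coding reproduces the stated Σfy² = 2041, which supports that reading of the garbled formula.

```python
from math import sqrt

midpoints = [429.5, 469.5, 489.5, 509.5, 529.5, 549.5, 579.5, 649.5]
freqs = [3, 8, 13, 12, 18, 11, 9, 5]

ys = [(x - 509.5) / 10 for x in midpoints]        # coded mid-points
n = sum(freqs)
sum_fy = sum(f * y for f, y in zip(freqs, ys))
sum_fy2 = sum(f * y * y for f, y in zip(freqs, ys))

mean_y = sum_fy / n
sd_y = sqrt(sum_fy2 / n - mean_y ** 2)

# Undo the coding: x = 509.5 + 10y, so the sd scales by 10 and is unaffected by the shift.
mean_x = 509.5 + 10 * mean_y
sd_x = 10 * sd_y
print(sum_fy, sum_fy2, round(mean_x, 1), round(sd_x, 1))
```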
7. A bag contains 4 red and 2 blue balls, all of the same size. A ball is selected at random and
removed from the bag. This is repeated until a blue ball is pulled out of the bag.
The random variable B is the number of balls that have been removed from the bag.
(a) Show that P(B = 2) = 4/15.
(b) Find the probability distribution of B. (c) Find E(B).
The bag and the same 6 balls are used in a game at a funfair. One ball is removed from the
bag at a time and a contestant wins 50 pence if one of the first two balls picked out is blue.
(d) What are the expected winnings from playing this game once?
For £1, a contestant gets to play the game three times.
(e) What is the expected profit or loss from the three games?
END
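A sketch of the bag question (question 7), building the distribution of B draw by draw with exact fractions. It reads "For 1" as a stake of 100 pence for three games, which is an assumption because the currency symbol is missing from the extracted text.

```python
from fractions import Fraction

red, blue = 4, 2
pmf = {}
p_all_red_so_far = Fraction(1)
for draw in range(1, red + 2):              # a blue must appear by draw 5
    remaining = red + blue - (draw - 1)     # balls left before this draw
    reds_left = red - (draw - 1)
    pmf[draw] = p_all_red_so_far * Fraction(blue, remaining)
    p_all_red_so_far *= Fraction(reds_left, remaining)

expected_b = sum(b * p for b, p in pmf.items())

p_win = pmf[1] + pmf[2]                     # blue within the first two balls
expected_winnings = 50 * p_win              # pence per game
profit_three_games = 3 * expected_winnings - 100   # contestant's view, in pence
print(pmf[2], expected_b, expected_winnings, profit_three_games)
```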
Solomon - C
1. Joel buys a box of second-hand Jazz and Blues CDs at a car boot sale.
In the box there are 30 CDs, 8 of which were recorded live. 16 of the CDs are predominantly
Jazz and 13 of these were recorded in the studio. This information is shown in the following
table.
        Studio   Live   Total
Jazz    13              16
Blues
Total            8      30
(a) Copy and complete the table above.
Joel picks a CD at random to play first. Find the probability that it is
(b) a Blues CD that was recorded live,
(c) a Jazz CD, given that it was recorded in the studio.
2. The discrete random variable Q has the following probability distribution.
q          1     2     3     4     5
P(Q = q)   1/5   1/5   1/5   1/5   1/5
(a) Write down the name of this distribution.
The discrete random variable R has the following probability distribution.
r          14    24    34    44    54
P(R = r)   1/5   1/5   1/5   1/5   1/5
(b) State the relationship between R and Q in the form R = aQ + b.
(c) Given that E(Q) = 3 and Var(Q) = 2, find E(R) and Var(R).
3. The ages of 300 houses in a village are recorded giving the following table of results.
Age (years)   Number of houses
0 -           36
20 -          92
40 -          74
60 -          39
100 -         14
200 -         27
300 - 500     18
Use linear interpolation to estimate for these data
(a) the median,
(b) the limits between which the middle 80% of the ages lie.
An estimate of the mean of these data is calculated to be 86.6 years.
(c) Explain why the mean and median are so different and hence say which you consider
best represents the data.
GRADE-11 S1 REVISION WORKSHEET - 9
Answers : 1(b) 1/6 (c) 13/22 2(a) Discrete uniform (b) R = 10Q + 4 (c) 34 and 200
3(a) Cf's: 36, 128, 202, 241, 255, 282, 300; Q2 = 45.9/46.14 (b) limits are 17 and 256 years
(c) Data very skewed, some extreme high values / doesn't affect median but increases mean significantly / median is
better, most values below the mean.
4(a) 0.7611 (b) 0.1645 (c) 49
5(b) 0.3 (c) 2.7 (d) 9.4 (e) 1.21 6(a) 0.27 (b) 0.892 (c) 0.617 (d) 214
7(b) n = 195.9 − 4.93h (c) no. of clenches decreases by 4.93 per hour awake (d) ability likely to be roughly constant
during normal working hours / only decreases when awake for longer than usual (e) 18.6 hours
5. The discrete random variable Y has the following cumulative distribution function.
y 0 1 2 3 4
F(Y ) 0.05 0.15 0.35 0.75 1
6. A software company sets exams for programmers who wish to qualify to use their packages.
Past records show that 55% of candidates taking the exam for the first time will pass, 60% of
those taking it for the second time will pass, but only 40% of those taking the exam for the
third time will pass. Candidates are not allowed to sit the exam more than three times.
A programmer decides to keep taking the exam until he passes or is allowed no further
attempts. Find the probability that he will
(a) pass the exam on his second attempt,
(b) pass the exam.
Another programmer already has the qualification.
(c) Find, correct to 3 significant figures, the probability that she passed first time.
At a particular sitting of the exam there are 400 candidates.
The ratio of those sitting the exam for the first time to those sitting it for the second time to
those sitting it for the third time is 5 : 3 : 2
(d) How many of the 400 candidates would be expected to pass?
(a) Write down the probability distribution of Y.

(c) Show that E(Y) = 2.7 (d)Find E(2Y+4) (e) Find Var(Y).
4. The random variable X is normally distributed with a mean of 42 and a variance of 18.
Find
(a) P(X ≤ 45),
(b) P(32 ≤ X ≤ 38),
(c) the value of x such that P(X ≤ x) = 0.95
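The normal-distribution parts above can be checked with Python's standard library. This is a rough sketch that assumes the inequalities read P(X ≤ 45), P(32 ≤ X ≤ 38) and P(X ≤ x) = 0.95, which is the reading consistent with the listed answers 0.7611, 0.1645 and 49. The variable names are illustrative. Because printed tables round z to two decimal places, the computed values may differ from the table answers in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

# The question gives the variance (18), so the standard deviation is sqrt(18).
X = NormalDist(mu=42, sigma=sqrt(18))

p_a = X.cdf(45)               # (a) P(X <= 45)
p_b = X.cdf(38) - X.cdf(32)   # (b) P(32 <= X <= 38)
x_c = X.inv_cdf(0.95)         # (c) x with P(X <= x) = 0.95

print(round(p_a, 4), round(p_b, 4), round(x_c, 1))
```

A common slip here is to use 18 as the standard deviation. Passing `sqrt(18)` explicitly keeps the variance and the standard deviation distinct.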
7. A doctor wished to investigate the effects of staying awake for long periods on a person's
ability to complete simple tasks. She recorded the number of times, n, that a subject could
clench his or her fist in 30 seconds after being awake for h hours.
The results for one subject were as follows.
h (hours)   16   17   18   19   20   21   22   23   24
n           116  114  109  101  94   94   86   81   80
(a) Plot a scatter diagram of n against h for these results.
You may use
Σh = 180, Σn = 875, Σh² = 3660, Σhn = 17204.
(b) Obtain the equation of the regression line of n on h in the form n = a + bh.
(c) Give a practical interpretation of the constant b.
(d) Explain why this regression line would be unlikely to be appropriate for values of h
between 0 and 16.
Another subject underwent the same tests giving rise to a regression line of n = 213.4 − 5.87h.
(e) After how many hours of being awake together would you expect these two subjects to
be able to clench their fists the same number of times in 30 seconds?
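For question 7, the regression line can be obtained from the summary sums alone. The Python sketch below is one way to check parts (b) and (e). The function and variable names are assumptions made for this example, and the second subject's line is read here as n = 213.4 − 5.87h. Part (e) uses the coefficients rounded as in the answer key (195.9 and −4.93), and unrounded coefficients give a slightly different crossing point.

```python
def regression_from_sums(count, sum_x, sum_y, sum_xx, sum_xy):
    """Least-squares line y = a + b*x from summary sums."""
    s_xx = sum_xx - sum_x ** 2 / count        # Sxx
    s_xy = sum_xy - sum_x * sum_y / count     # Sxy
    slope = s_xy / s_xx
    intercept = sum_y / count - slope * sum_x / count
    return intercept, slope

# Nine readings with the sums quoted in the question (x = h, y = n).
a, b = regression_from_sums(9, 180, 875, 3660, 17204)
a_rounded, b_rounded = round(a, 1), round(b, 2)

# Second subject's line, coefficients taken from the question.
a_other, b_other = 213.4, -5.87

# Part (e): solve a_rounded + b_rounded*h = a_other + b_other*h for h.
h_equal = (a_other - a_rounded) / (b_rounded - b_other)

print(a_rounded, b_rounded, round(h_equal, 1))
```

The slope is negative, so each extra hour awake reduces the predicted number of clenches, which is the interpretation asked for in part (c).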
S1 REVISION WORKSHEET -10
Solomon -F
1. The weight in kilograms, w, of the 15 players in a rugby team was recorded and the results
summarised as follows.
Σw = 1145.3, Σw² = 88042.14.
(a) Calculate the mean and variance of the weight of the players.
Due to injury, one of the players who weighed 79.2 kg was replaced with another player who
weighed 63.5 kg.
(b) Without further calculation state the effect of this change on the mean and variance of
the weight of the players.
2. The discrete random variable X has the following probability distribution.
x          1    2    3     4    5
P(X = x)   a    b    1/4   2a   1/8
(a) Find an expression for b in terms of a.
(b) Find an expression for E(X) in terms of a.
3. The time it takes girls aged 15 to complete an obstacle course is found to be normally
distributed with a mean of 21.5 minutes and a standard deviation of 2.2 minutes.
(a) Find the probability that a randomly chosen 15 year-old girl completes the course in less
than 25 minutes.
A 13 year-old girl completes the course in exactly 19 minutes.
(b) What percentage of 15 year-old girls would she beat over the course?
Anyone completing the course in less than 20 minutes is presented with a certificate of
achievement. Three friends all complete the course one afternoon.
(c) What is the probability that exactly two of them get certificates?
4. Each child in class 3A was given a packet of seeds to plant. The stem and leaf diagram below
shows how many seedlings were visible in each child's tray one week after planting.
Number of seedlings (2 | 1 means 21) Totals
0 00 2 (2)
0 (0)
1 1 (1)
1 55 7 (2)
2 00 0 0 0 1 (5)
2 5 5 5 5 5 7 55 (7)
3 00 0 0 0 0 0 0 (7)
3 5 5 6 55 (4)
4 11 1 3 (3)
(a) Find the median and interquartile range for these data.
(b) Use the quartiles to describe the skewness of the data. Show your method clearly.
The mean and standard deviation for these data were 27.2 and 10.3 respectively.
(c) Explaining your answer, state whether you would recommend using these values or
the median and interquartile range to summarise these data.
Outliers are defined to be values outside of the limits Q1 − 2s and Q3 + 2s where s is the
standard deviation given above.
(d) Represent these data with a boxplot identifying clearly any outliers.
5. The events A and B are such that P(A) = 0.5, P(B) = 0.42 and P(A ∪ B) = 0.76
Find (a) P(A ∩ B), (b) P(A′ ∪ B), (c) P(B | A′).
(d) Show that events A and B are not independent.
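For question 5 (events A and B), the requested probabilities follow from the addition rule and complements. A minimal Python sketch follows. It reads parts (b) and (c) as P(A′ ∪ B) and P(B | A′), which is an assumption, chosen because it reproduces the answer-key values 0.66 and 0.52. All variable names are illustrative.

```python
from math import isclose

p_a, p_b, p_a_or_b = 0.5, 0.42, 0.76

# (a) Addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
p_a_and_b = p_a + p_b - p_a_or_b

# (b) A' ∪ B is everything except "A and not B" (assumed reading of part b).
p_a_only = p_a - p_a_and_b
p_not_a_or_b = 1 - p_a_only

# (c) P(B | A') = P(B ∩ A') / P(A') (assumed reading of part c).
p_b_and_not_a = p_b - p_a_and_b
p_b_given_not_a = p_b_and_not_a / (1 - p_a)

# (d) Independence would require P(A ∩ B) = P(A) * P(B).
independent = isclose(p_a_and_b, p_a * p_b)

print(round(p_a_and_b, 2), round(p_not_a_or_b, 2),
      round(p_b_given_not_a, 2), independent)
```

Comparing P(A ∩ B) with the product P(A) × P(B) is the standard test for part (d): the two values differ, so the events are not independent.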
6. A school introduced a new programme of support lessons in 1994 with a view to improving
grades in GCSE English. The table below shows the number of years since 1994, n, and the
corresponding percentage of students achieving A to C grades in GCSE English, p, for each
year.
n       1     2     3     4     5     6
p (%)   35.2  37.1  40.6  39.0  43.4  44.8
(a) Represent these data on a scatter diagram.
You may use the following values.
Σn = 21, Σp = 240.1, Σn² = 91, Σp² = 9675.41, Σnp = 873.
(b) Find an equation of the regression line of p on n and draw it on your graph.
(c) Calculate the product moment correlation coefficient for these data and comment on
the suitability of a linear model for the relationship between n and p during this period.
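The product moment correlation coefficient in question 6 needs only the five sums quoted in the question. The sketch below shows one way to organise the working in Python. The variable names are assumptions made for this example, and the sums are taken exactly as printed, with six data points.

```python
from math import sqrt

count = 6
sum_n, sum_p = 21, 240.1
sum_nn, sum_pp, sum_np = 91, 9675.41, 873

# Corrected sums of squares and products.
s_nn = sum_nn - sum_n ** 2 / count
s_pp = sum_pp - sum_p ** 2 / count
s_np = sum_np - sum_n * sum_p / count

# Regression line of p on n: p = a + b*n.
b = s_np / s_nn
a = sum_p / count - b * sum_n / count

# Product moment correlation coefficient.
r = s_np / sqrt(s_nn * s_pp)

print(round(a, 1), round(b, 2), round(r, 4))
```

A value of r close to +1 indicates a strong positive linear relationship, which is the basis for the comment asked for in part (c).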
Answers : 1(a) 76.4, 39.6 (b) mean lower as replacement weighs less / variance higher as replacement's
weight is further from mean
2(a) b = (5/8) − 3a (b) 3a + 21/8 (c) a = 1/16, b = 7/16
3(a) 0.9441 (b) 87.3% (c) P(2 of 3 < 20) = 3 × 0.2483² × 0.7517 = 0.139
4(a) n = 31, Q2 = 29, Q1 = 23, Q3 = 34, IQR = 11
(b) Q2 − Q1 = 6; Q3 − Q2 = 5; Q2 − Q1 > Q3 − Q2, so slight −ve skew (c) recommend mean and std. dev.
as they take account of all values and there is little skew / few extreme values
(d) Q1 − 2s = 2.4; Q3 + 2s = 54.6; outliers are 0, 2 [boxplot on an axis from 0 to 50]
5(a) 0.16 (b) 0.66 (c) 0.52 (d) P(A) × P(B) = 0.5 × 0.42 = 0.21 ≠ P(A ∩ B), so not independent
6(b) p = 33.5 + 1.87n
(c) Spp = 9675.41 − 240.1²/6 = 67.4083; r = 32.65 / √(17.5 × 67.4083) = 0.9506; r strongly +ve, supporting linear model