
GEN1008 / MED1018 / GED1008 Introduction to Statistics – Mid-term Test

Name: Student ID:

INSTRUCTIONS
1. Time allowed: 1.5 hours.
2. Answer all questions on this question-answer paper.
3. The use of HKEA approved calculator(s) is allowed.
4. The use of dictionaries and other electronic devices is prohibited.

DO NOT TURN OVER UNTIL YOU ARE TOLD TO DO SO

Question Score
1
2
3
4
5
Total / 50

GEN1008 / MED1018 / GED1008 Mid-term Test Page 1 of 17


Question 1 (10 marks)

A health care professional wishes to analyse the Body Mass Index (BMI) of teenagers
aged from 13 to 19. He conducted a survey and collected the BMI readings (in kg/m²)
of 24 teenagers. The data are shown in the table below. It is given that the mean of the
data below is 21.25 kg/m².

25 20 22 31 20 19 23 25
18 16 20 18 25 17 22 26
18 22 17 19 15 21 28 23

a) Construct a frequency distribution table for the above sample data using 6 classes.
Use 15 as the lower limit of the first class. Write a title and include the class limits,
the class boundaries and the frequency as columns in your table. [4]
b) What is the shape of the distribution of data in the sample? Compare the mean and
the median in the sample data. [2]
c) Given ∑x² = 11200, find the standard deviation of the sample data. [2]
d) How will the standard deviation in (c) be affected if the two largest values in the
sample are changed to numbers higher than 32 kg/m²? Explain your answer briefly. [2]

[Solution]
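One way to check the arithmetic for parts (a) to (c) is sketched below. It is not an official solution. It assumes the class width is the smallest whole number that lets six classes starting at 15 cover the largest reading, and that class boundaries sit 0.5 either side of the class limits because the readings are whole numbers. The function names `frequency_table` and `sample_sd` are illustrative only.

```python
import math
import statistics

# BMI readings (kg/m²) copied from the table in Question 1.
bmi = [25, 20, 22, 31, 20, 19, 23, 25,
       18, 16, 20, 18, 25, 17, 22, 26,
       18, 22, 17, 19, 15, 21, 28, 23]


def frequency_table(data, first_lower, num_classes):
    """Return (lower limit, upper limit, lower boundary, upper boundary, frequency) rows.

    Assumes whole-number data, so boundaries sit 0.5 either side of the limits.
    """
    # Smallest whole-number width for which num_classes classes cover the maximum.
    width = (max(data) - first_lower) // num_classes + 1
    rows = []
    for k in range(num_classes):
        lower = first_lower + k * width
        upper = lower + width - 1
        frequency = sum(1 for x in data if lower <= x <= upper)
        rows.append((lower, upper, lower - 0.5, upper + 0.5, frequency))
    return rows


def sample_sd(n, sum_x, sum_x2):
    """Sample standard deviation from the shortcut formula on the formula sheet."""
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


table = frequency_table(bmi, first_lower=15, num_classes=6)
mean = statistics.mean(bmi)
median = statistics.median(bmi)
s = sample_sd(len(bmi), sum(bmi), 11200)  # sum of squares as given in part (c)

for lower, upper, lb, ub, f in table:
    print(f"{lower}-{upper}   {lb}-{ub}   {f}")
print(f"mean = {mean}, median = {median}, s = {s:.2f}")
```

Under these assumptions the classes are 15-17, 18-20, 21-23, 24-26, 27-29 and 30-32 with frequencies 4, 8, 6, 4, 1 and 1, the median is 20.5 (below the given mean of 21.25), and s is about 3.97 kg/m².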

Question 2 (10 marks)

Consider a population of university graduates whose scores in an aptitude test are
analysed. There are two groups of graduates: Group A graduates have degrees in the
humanities and Group B graduates have degrees in science. The means and standard
deviations of the scores of both groups and of all graduates are shown in the table below.
It is found that the scores of all graduates follow a bimodal distribution.

             Mean    Standard Deviation
Group A      63.9    9.1
Group B      78.8    10.6
Overall      71.4    12.3

a) Is the variable in the study qualitative or quantitative? [1]


b) What are the coefficients of variation of the scores of Group A and Group B
graduates respectively? Compare the dispersion of scores of these two groups of
graduates. [3]
c) Using Chebyshev's theorem with the overall mean and standard deviation, find
the range of scores within which at least 80% of all graduates fall. [3]
d) If the passing mark of the aptitude test is 40, is it reasonable to say that at least
80% of all graduates get a pass in the test? [1]
e) Explain whether there is any difference in your answer in (d) if the distribution of
scores of all graduates is not bimodal, but is uniform. [2]

[Solution]
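The calculations behind parts (b) and (c) can be sketched as follows. This is an illustration, not an official solution. It treats the table values as population parameters and solves 1 − 1/k² = 0.80 for k in Chebyshev's theorem. The helper names are illustrative only.

```python
import math


def coefficient_of_variation(sd, mean):
    """Coefficient of variation as a percentage."""
    return sd / mean * 100


def chebyshev_interval(mean, sd, min_proportion):
    """Interval mean ± k·sd that holds at least min_proportion of the data.

    k solves 1 - 1/k**2 = min_proportion (Chebyshev's theorem, k > 1).
    """
    k = math.sqrt(1 / (1 - min_proportion))
    return mean - k * sd, mean + k * sd


cv_a = coefficient_of_variation(9.1, 63.9)
cv_b = coefficient_of_variation(10.6, 78.8)
low, high = chebyshev_interval(71.4, 12.3, 0.80)

print(f"CVar A = {cv_a:.2f}%, CVar B = {cv_b:.2f}%")
print(f"At least 80% of scores lie between {low:.1f} and {high:.1f}")
```

With the figures in the table, Group A has the larger coefficient of variation (about 14.24% against 13.45%), and the Chebyshev interval is roughly 43.9 to 98.9. Because Chebyshev's theorem makes no assumption about the shape of the distribution, the same interval applies whether the scores are bimodal or uniform.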

Question 3 (10 marks)

There is a study on the reading time (hours per week) of secondary school students in
Hong Kong. A recent report reveals that the population mean time is 3.2 hours per week
and the population standard deviation is 0.8 hours per week. It is assumed that the
reading time follows a normal distribution.

a) Find the proportion of secondary school students in Hong Kong whose reading
time is less than 2.4 hours per week. [2]
b) Suppose a researcher wishes to select students whose reading time is in the middle
50% of the reading time of the population. Find the range of the reading time of
students who may be selected for the research. [4]
c) Another researcher selected a sample of 45 students from the whole population.
What is the probability that the mean reading time of the sample is less than 2.8
hours per week? [2]
d) Explain whether there are any changes in your answer in (c) if the reading time
follows a left-skewed distribution, instead of a normal distribution. [2]

[Solution]
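A minimal sketch of parts (a) to (c), using `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of the printed z table. It is not an official solution, and answers read from a table rounded to two decimal places may differ slightly in the last digit.

```python
from math import sqrt
from statistics import NormalDist

# Reading time per week: mean 3.2 hours, standard deviation 0.8 hours.
reading = NormalDist(mu=3.2, sigma=0.8)

# (a) Proportion of students reading less than 2.4 hours per week.
p_below = reading.cdf(2.4)

# (b) Middle 50%: from the 25th to the 75th percentile.
q1 = reading.inv_cdf(0.25)
q3 = reading.inv_cdf(0.75)

# (c) Sample mean of n = 45 students: same mean, standard error sigma / sqrt(n).
sample_mean = NormalDist(mu=3.2, sigma=0.8 / sqrt(45))
p_mean_below = sample_mean.cdf(2.8)

print(f"(a) P(X < 2.4) = {p_below:.4f}")
print(f"(b) middle 50%: {q1:.2f} to {q3:.2f} hours")
print(f"(c) P(sample mean < 2.8) = {p_mean_below:.4f}")
```

For part (d), the sample size of 45 is large enough for the Central Limit Theorem to be invoked under the usual n ≥ 30 rule of thumb, so the sampling distribution in (c) is still treated as approximately normal.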

Question 4 (10 marks)

Suppose you work in a research team investigating whether cancer patients are satisfied
with their lives after receiving radiation therapy treatment. In a pilot study (i.e., a
smaller scale study), a sample proportion of 0.64 is used to determine a 90% confidence
interval, which is [0.5692, 0.7108]. The sample proportion is the proportion of patients
in the sample who are satisfied with their lives.

a) Based on the results from the pilot study, is it reasonable to conclude that more
than half of cancer patients are satisfied with their lives after receiving a radiation
therapy treatment? Explain your answer. [2]
b) What is the margin of error of the confidence interval in the pilot study? [1]
c) Suppose you want to conduct a larger scale study to find the confidence interval
of proportion. The desired level of confidence is 95% and the margin of error is
half of that in the pilot study result. By using the sample proportion in the pilot
study, find the minimum sample size required. [3]
d) You finally used a sample of 750 patients in your study and 492 of them are
satisfied with their lives after receiving the treatment. Find the 95% confidence
interval of the proportion using this sample data. Interpret your answer in the
context of the subject matter. [4]

[Solution]
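The sketch below works through parts (b) to (d) with the proportion formulae from the formula sheet. It is not an official solution. It assumes the critical value is taken from the standard normal distribution via `statistics.NormalDist` (about 1.96 for 95%) and that the sample size is always rounded up. The helper names are illustrative only.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def proportion_sample_size(p_hat, margin, confidence):
    """Minimum sample size for a proportion, rounded up to a whole number."""
    z = z_critical(confidence)
    return math.ceil(p_hat * (1 - p_hat) * (z / margin) ** 2)


def proportion_ci(successes, n, confidence):
    """Confidence interval for a proportion based on the normal approximation."""
    p_hat = successes / n
    margin = z_critical(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# (b) Margin of error of the pilot interval: half its width.
pilot_margin = (0.7108 - 0.5692) / 2

# (c) 95% confidence with half the pilot margin of error.
n_required = proportion_sample_size(0.64, pilot_margin / 2, 0.95)

# (d) 492 satisfied patients out of 750.
low, high = proportion_ci(492, 750, 0.95)

print(f"(b) E = {pilot_margin:.4f}")
print(f"(c) n = {n_required}")
print(f"(d) 95% CI = [{low:.3f}, {high:.3f}]")
```

With these inputs the pilot margin of error is 0.0708, the required sample size is 707, and the interval in (d) is roughly 0.622 to 0.690. For part (a), the whole pilot interval lies above 0.5.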

Question 5 (10 marks)

A treatment is given to patients with diabetes. To study the effectiveness of the
treatment, 56 patients with diabetes were chosen and took the treatment for one week.
The fasting glucose levels (in mmol/L) of these patients were then measured.
The sample mean is 5.3 mmol/L and the sample standard deviation is 0.7 mmol/L.

a) What is the level of measurement of the variable in the study? [1]


b) Find the 95% confidence interval for the variable in the study. [2]
c) Interpret your answer in (b) in the context of the subject matter. [2]
d) What are the effects on the confidence interval width in (b) if (i) the sample mean
is higher than 5.3 mmol/L, and (ii) the sample standard deviation is higher than
0.7 mmol/L, respectively? [2]
e) The sample is also used to test a hypothesis that the mean fasting glucose level of
patients after treatment is less than 6 mmol/L. Write down the null and the
alternative hypothesis for the test and identify the claim. [2]
f) Is the hypothesis test in (e) a left-tailed, right-tailed or two-tailed test? [1]

[Solution]
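Parts (b) and (d) can be explored with the sketch below, which is not an official solution. The critical value is passed in as a parameter because the formula sheet lists both a z interval and a t interval. The z value comes from `statistics.NormalDist`. The t value 2.004 (95%, 55 degrees of freedom) is typed in by hand as an assumption and should be checked against the printed t table.

```python
import math
from statistics import NormalDist


def mean_ci(sample_mean, sample_sd, n, critical_value):
    """Confidence interval: sample mean ± critical value × s / sqrt(n)."""
    margin = critical_value * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


def width(interval):
    """Width of an interval given as (lower, upper)."""
    return interval[1] - interval[0]


z = NormalDist().inv_cdf(0.975)             # about 1.96
z_interval = mean_ci(5.3, 0.7, 56, z)
t_interval = mean_ci(5.3, 0.7, 56, 2.004)   # assumed t table value, df = 55

# (d) The width does not depend on the sample mean, but grows with s.
higher_mean = mean_ci(5.8, 0.7, 56, z)      # 5.8 is a hypothetical value
higher_sd = mean_ci(5.3, 0.9, 56, z)        # 0.9 is a hypothetical value

print(f"z interval: [{z_interval[0]:.3f}, {z_interval[1]:.3f}]")
print(f"t interval: [{t_interval[0]:.3f}, {t_interval[1]:.3f}]")
print(f"widths: {width(z_interval):.3f}, {width(higher_mean):.3f}, {width(higher_sd):.3f}")
```

Both intervals are close to 5.1 to 5.5 mmol/L, and the t interval is slightly wider. Changing only the sample mean shifts the interval without changing its width, while a larger standard deviation widens it.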



THE END



Formulae

Sample mean for individual data:

$$\bar{X} = \frac{\sum X}{n}$$

Sample mean for grouped data:

$$\bar{X} = \frac{\sum f \cdot X_m}{n}$$

Population variance:

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} \quad \text{or} \quad \sigma^2 = \frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2$$

Sample variance:

$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \quad \text{or} \quad s^2 = \frac{n\left(\sum X^2\right) - \left(\sum X\right)^2}{n(n - 1)}$$
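As a quick check that the two forms of the sample variance above agree, the sketch below evaluates both on a small made-up data set and compares them with `statistics.variance` from the Python standard library. The data values and function names are hypothetical and are not taken from the test.

```python
import statistics


def variance_definition(xs):
    """Sample variance from squared deviations about the sample mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def variance_shortcut(xs):
    """Sample variance from the sum of X and the sum of X squared only."""
    n = len(xs)
    return (n * sum(x * x for x in xs) - sum(xs) ** 2) / (n * (n - 1))


sample = [4, 7, 9, 10, 15]  # made-up values for illustration

print(variance_definition(sample))
print(variance_shortcut(sample))
print(statistics.variance(sample))
```

The shortcut form is convenient when only ∑X and ∑X² are supplied, as in Question 1(c).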

Coefficient of Variation for population data:

$$CVar = \frac{\sigma}{\mu} \times 100\%$$

Coefficient of Variation for sample data:

$$CVar = \frac{s}{\bar{X}} \times 100\%$$

Standard score:

$$z = \frac{X - \mu}{\sigma}$$

Standard error of the mean:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Central Limit Theorem formula:

$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$



z confidence interval for means:

$$\bar{X} \pm z_{\alpha/2} \left(\frac{\sigma}{\sqrt{n}}\right)$$

t confidence interval for means:

$$\bar{X} \pm t_{\alpha/2} \left(\frac{s}{\sqrt{n}}\right)$$

Sample size for means:

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$
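The sample size formula for means above is not used by any of the five questions, so the sketch below illustrates it with hypothetical inputs: a population standard deviation of 0.8, a desired margin of error of 0.2 and 95% confidence. It assumes the result is always rounded up, and the function name is illustrative only.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence):
    """Minimum n for which z × sigma / sqrt(n) does not exceed the margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


# Hypothetical inputs chosen only for illustration.
n_required = sample_size_for_mean(sigma=0.8, margin=0.2, confidence=0.95)
print(n_required)
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.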

Sample proportion:

$$\hat{p} = \frac{x}{n}$$

Confidence interval for a proportion:

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

Sample size for a proportion:

$$n = \hat{p}(1 - \hat{p}) \left(\frac{z_{\alpha/2}}{E}\right)^2$$



Statistical tables


