
<NBHS1602>

TAKE HOME EXAMINATION

MAY 2020 SEMESTER

NBHS1602

BIOSTATISTICS

MATRICULATION NO : 930108035557001
IDENTITY CARD NO. : 930108-03-5557

PART A
QUESTION 1A

(i) To obtain a suitable number of classes and the class width for the given data, first
find the highest and lowest values in the data and calculate the range. Next, choose
the number of classes with the 2^k rule: take the smallest k such that 2^k ≥ n, where
n is the number of observations. The class width is then range / number of classes,
rounded up. Finally, set the class limits and the class boundaries.
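
The procedure in (i) can be sketched in Python as follows. This is only an illustration: the function name `suggest_classes` is hypothetical, n = 30 is the total frequency of the table in part (ii), and the lowest and highest values (46 and 74) are assumed from the outer class limits because the raw data are not reproduced here.

```python
import math

def suggest_classes(n, lowest, highest):
    """Return (number of classes, class width) using the 2^k rule."""
    # Smallest k such that 2**k >= n.
    k = 1
    while 2 ** k < n:
        k += 1
    data_range = highest - lowest
    # Class width = range / number of classes, rounded up.
    width = math.ceil(data_range / k)
    return k, width

# Assumed inputs: 30 observations, lowest value 46, highest value 74.
k, width = suggest_classes(30, 46, 74)
print(k, width)
```

Under these assumptions the sketch suggests 5 classes of width 6.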

(ii) Frequency Distribution Table

Class limits    Class boundaries    Frequency
46-51           45.5-51.5           6
52-57           51.5-57.5           8
58-63           57.5-63.5           6
64-68           63.5-68.5           5
69-74           68.5-74.5           5
Total                               30

(iii) Cumulative distribution Table


Class limits    Frequency    Cumulative frequency    Cumulative relative frequency    Cumulative %
46-51           6            6                       6/30 = 0.20                      20
52-57           8            14                      14/30 = 0.47                     47
58-63           6            20                      20/30 = 0.67                     67
64-68           5            25                      25/30 = 0.83                     83
69-74           5            30                      30/30 = 1.00                     100
Total           30
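
The cumulative columns can be checked with a short Python sketch. The variable names are illustrative; the class limits and frequencies are the ones from the frequency distribution table in part (ii).

```python
from itertools import accumulate

class_limits = ["46-51", "52-57", "58-63", "64-68", "69-74"]
frequencies = [6, 8, 6, 5, 5]

total = sum(frequencies)
# Running total of the frequencies, class by class.
cumulative = list(accumulate(frequencies))
# Cumulative frequency as a proportion of the total, to 2 decimal places.
cumulative_relative = [round(c / total, 2) for c in cumulative]

for limits, f, c, r in zip(class_limits, frequencies, cumulative, cumulative_relative):
    print(f"{limits}  {f:>2}  {c:>2}  {r:.2f}  {r * 100:.0f}%")
```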

QUESTION 1B

(i) Summary Table

Medicine    Frequency    Relative frequency    Angle (degrees)
A           4            4/50 = 0.08           (4/50) × 360 = 28.8
B           10           10/50 = 0.20          (10/50) × 360 = 72
C           8            8/50 = 0.16           (8/50) × 360 = 57.6
D           7            7/50 = 0.14           (7/50) × 360 = 50.4
E           11           11/50 = 0.22          (11/50) × 360 = 79.2
F           10           10/50 = 0.20          (10/50) × 360 = 72
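
As a cross-check of the summary table, the relative frequencies and sector angles can be recomputed with a few lines of Python. The dictionary names are illustrative; the frequencies are those listed for medicines A to F.

```python
frequencies = {"A": 4, "B": 10, "C": 8, "D": 7, "E": 11, "F": 10}
total = sum(frequencies.values())

# Relative frequency = frequency / total.
relative = {m: round(f / total, 2) for m, f in frequencies.items()}
# Sector angle = relative frequency x 360 degrees.
angles = {m: round(f / total * 360, 1) for m, f in frequencies.items()}

for m in frequencies:
    print(m, frequencies[m], relative[m], angles[m])
```

The angles should add up to 360 degrees, which is a quick way to catch an arithmetic slip before drawing the chart.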

(ii) Pie chart

[Pie chart of medicine preference. Sector angles in degrees: A = 28.8, B = 72, C = 57.6,
D = 50.4, E = 79.2, F = 72]

(iii) Conclusion
Based on the pie chart, the preferences among the six types of medicine (A-F) are easy
to compare. The most preferred medicine is E (22%), followed by B and F (20% each).
The least preferred medicine is A (8%).

QUESTION 2A
We start with the range because it is the most straightforward measure of variability to
calculate and the simplest to understand. The range of a dataset is the difference between the
largest and smallest values in that dataset. For the two datasets given, dataset A
has a range of 90 - 25 = 65 while dataset B has a range of 92 - 28 = 64. By this measure,
dataset A has slightly more variability than dataset B.

The interquartile range (IQR) measures the spread of the middle half of the data. To
visualize it, think about the median, which splits the ordered dataset in half. Similarly, we
can divide the ordered data into quarters. The three values that separate these quarters are
called quartiles and are denoted, from low to high, Q1, Q2 and Q3. A quarter of the values
lie below the lower quartile (Q1), and a quarter lie above the upper quartile (Q3). The
interquartile range is the distance between the upper and lower quartiles, IQR = Q3 - Q1,
so it covers the 50% of data points that fall between Q1 and Q3.

For dataset A, ordered as 25, 35, 48, 55, 55, 59, 60, 90:
Q1 = (35 + 48)/2 = 41.5, Q2 (median) = (55 + 55)/2 = 55 and Q3 = (59 + 60)/2 = 59.5.
Then the IQR is 59.5 - 41.5 = 18.

For dataset B, ordered as 28, 40, 40, 56, 65, 70, 75, 92:
Q1 = (40 + 40)/2 = 40, Q2 (median) = (56 + 65)/2 = 60.5 and Q3 = (70 + 75)/2 = 72.5.
Then the IQR is 72.5 - 40 = 32.5.
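
The quartile calculation can be sketched in Python as below. The helper `quartiles` is hypothetical and assumes the "median of each half" convention used in this answer; other conventions (for example the default of `statistics.quantiles`) can give different values for the same data.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median position
    upper = ordered[-half:]   # values above the median position
    return median(lower), median(ordered), median(upper)

dataset_a = [35, 90, 55, 25, 59, 60, 55, 48]
dataset_b = [40, 56, 40, 65, 75, 92, 28, 70]

for name, data in (("A", dataset_a), ("B", dataset_b)):
    q1, q2, q3 = quartiles(data)
    print(name, "range =", max(data) - min(data), "Q1 =", q1, "Q2 =", q2,
          "Q3 =", q3, "IQR =", q3 - q1)
```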

The mean value for dataset A is (35 + 90 + 55 + 25 + 59 + 60 + 55 + 48)/8 = 53.38, while
for dataset B the mean value is (40 + 56 + 40 + 65 + 75 + 92 + 28 + 70)/8 = 58.25. Both
datasets are unimodal: the mode is 55 for dataset A and 40 for dataset B. To determine the
coefficient of variation for both datasets, we need to calculate the standard deviation.

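\[
\sigma_A = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}
         = \sqrt{\frac{25385}{8} - (53.38)^2}
         = 17.99
\]

\[
\sigma_B = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}
         = \sqrt{\frac{30334}{8} - (58.25)^2}
         = 19.97
\]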

The coefficient of variation for dataset A is CV = σ / x̄ = 17.99/53.38 = 0.3370, while for
dataset B it is CV = σ / x̄ = 19.97/58.25 = 0.3428.
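
The following Python sketch recomputes the mean, the sum of squares, the population standard deviation and the coefficient of variation with the same computational formula, sqrt(Σx²/n - x̄²). The function name `summary` is illustrative. Because the sketch keeps the unrounded mean, the standard deviation of dataset A comes out near 18.01; the 17.99 in the text results from squaring the mean after rounding it to 53.38.

```python
import math

def summary(values):
    """Return (mean, sum of squares, population SD, coefficient of variation)."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum(x * x for x in values)
    # Computational formula for the population standard deviation.
    sd = math.sqrt(sum_sq / n - mean ** 2)
    return mean, sum_sq, sd, sd / mean

dataset_a = [35, 90, 55, 25, 59, 60, 55, 48]
dataset_b = [40, 56, 40, 65, 75, 92, 28, 70]

mean_a, sum_sq_a, sd_a, cv_a = summary(dataset_a)
mean_b, sum_sq_b, sd_b, cv_b = summary(dataset_b)
print(f"A: mean={mean_a:.2f} sum_sq={sum_sq_a} sd={sd_a:.2f} cv={cv_a:.4f}")
print(f"B: mean={mean_b:.2f} sum_sq={sum_sq_b} sd={sd_b:.2f} cv={cv_b:.4f}")
```

Either way, dataset B has the larger coefficient of variation, so it is relatively more variable than dataset A.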

QUESTION 2B

The Pearson's coefficient of skewness for dataset A, where M_0 is the mode and s is the
standard deviation:

\[
Sk_A = \frac{\bar{X} - M_0}{s} = \frac{53.38 - 55}{17.99} = -0.09
\]

The Pearson's coefficient of skewness for dataset B:

\[
Sk_B = \frac{\bar{X} - M_0}{s} = \frac{58.25 - 40}{19.97} = 0.91
\]
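
A minimal Python sketch of Pearson's mode-based skewness, (mean - mode) / standard deviation, is given below. The function name is hypothetical, and the population standard deviation (`statistics.pstdev`) is assumed, in line with the formula used for σ in Question 2A.

```python
from statistics import mean, mode, pstdev

def pearson_mode_skewness(values):
    """Pearson's first coefficient of skewness: (mean - mode) / SD."""
    return (mean(values) - mode(values)) / pstdev(values)

dataset_a = [35, 90, 55, 25, 59, 60, 55, 48]
dataset_b = [40, 56, 40, 65, 75, 92, 28, 70]

sk_a = pearson_mode_skewness(dataset_a)
sk_b = pearson_mode_skewness(dataset_b)
print(f"Sk_A = {sk_a:.2f}, Sk_B = {sk_b:.2f}")
```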

The coefficient measures how far the sample distribution departs from a symmetric
distribution such as the normal distribution. The larger the absolute value, the more skewed
the distribution. A value of zero means no skewness at all, a negative value means the
distribution is negatively skewed and a positive value means it is positively skewed. For
dataset A the coefficient is -0.09, which indicates a slight negative skewness, while for
dataset B the coefficient is 0.91, which indicates positive skewness.

PART B
QUESTION 1A
To conduct a hypothesis test in this case, we follow a process that uses sample statistics
to test a claim about the value of a population parameter. A verbal statement or claim about
a population parameter is called a statistical hypothesis. Here the parameter is μ, the
population mean age of those taking Vitamin C during the flu season. Two decisions are
possible: reject the null hypothesis (H0) or do not reject the null hypothesis (H0). We also
decide on the type of test (left-tailed, right-tailed or two-tailed) and find the critical
value, which determines whether the difference is significant enough for the null hypothesis
to be rejected.

We then state the hypotheses for the given sample. The data are assumed to be normally
distributed. The test statistic is calculated with the Z formula because the population
standard deviation is known, even though the sample is small (n < 30). Finally, we reach a
decision and a conclusion according to whether the test statistic falls in the critical
region at the 5% significance level.


QUESTION 1B
Population mean, μ = 45
Sample mean, X̄ = 47
Sample size, n = 25
Standard deviation, σ = 4
Test the mean difference at α = 0.05
Step 1: Name the tested parameter
The tested parameter is μ, the population mean age of those taking Vitamin C during the
flu season.
Step 2: The data are assumed to be normally distributed.
Step 3: Hypothesis statements
H0: μ = 45 (claim)
(The mean age of those taking vitamin C during the flu season is 45)
H1: μ > 45
(The mean age of those taking vitamin C during the flu season is greater than 45)
(Right-tailed test)
Step 4: The critical value for a right-tailed test at α = 0.05 is 1.711 from the t table
with v = 24 degrees of freedom (the standard normal value is 1.645).

[Figure: critical region in the right tail, beginning at 1.711]

Step 5: Compute the test statistic. Since the population standard deviation is known, we use
the Z-test even though n is small (n < 30):

\[
Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{47 - 45}{4 / \sqrt{25}} = 2.5
\]
Step 6: Since the test statistic value is 2.5, which is greater than 1.711, it falls in the
critical region, so we reject H0.

[Figure: test statistic 2.5 lies to the right of the critical value 1.711]

Step 7: Conclusion
Therefore, there is enough evidence at the 0.05 significance level to reject the claim
that the mean age of those taking vitamin C during the flu season is 45; the mean age
is greater than 45.
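
The right-tailed test can be sketched in Python with the standard library. The function name `z_statistic` is illustrative, and the inputs are the values substituted into the test statistic in Step 5 (sample mean 47, hypothesised mean 45, σ = 4, n = 25). The sketch assumes the standard normal critical value, obtained from `statistics.NormalDist`, rather than a table lookup.

```python
import math
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """Z = (sample mean - hypothesised mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

alpha = 0.05
z = z_statistic(47, 45, 4, 25)
# Right-tailed test: reject H0 when Z exceeds the upper alpha quantile.
critical = NormalDist().inv_cdf(1 - alpha)
reject = z > critical
print(f"Z = {z:.2f}, critical value = {critical:.3f}, reject H0: {reject}")
```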

QUESTION 2
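a) Population mean, μ = 10.2
Sample mean, X̄ = 11.8
Sample size, n = 60
Standard deviation, σ = 3.2
Standardised difference between the two means: (X̄ - μ)/σ = (11.8 - 10.2)/3.2 = 0.5
To calculate a 95% confidence interval for μ, we use the formula below with the critical
value z = 1.96:

\[
\bar{X} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}
  = 11.8 \pm 1.96 \times \frac{3.2}{\sqrt{60}}
  = 11.8 \pm 0.81
  = (10.99,\ 12.61)
\]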
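
A 95% confidence interval of the form X̄ ± z × σ/√n can be checked with the short Python sketch below. The function name `z_confidence_interval` is hypothetical; it assumes a known σ and takes the two-sided critical value (about 1.96 for 95%) from `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, level=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    # Critical value leaving (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

lower, upper = z_confidence_interval(11.8, 3.2, 60)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```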

b) Test the mean difference at α = 0.01
Population mean, μ = 11.8
Sample mean, X̄ = 10.2
Sample size, n = 60
Standard deviation, σ = 3.2

Step 1: Name the tested parameter
The tested parameter is μ, the population mean time, in months, that a patient takes to
fully recover from knee surgery.
Step 2: The data are assumed to be normally distributed.
Step 3: Hypothesis statements
H0: μ = 11.8
(The mean time, in months, that a patient takes to fully recover from knee
surgery is 11.8)
H1: μ < 11.8 (claim)
(The mean time, in months, that a patient takes to fully recover from knee
surgery has decreased below 11.8)
(Left-tailed test)
Step 4: The critical value for a left-tailed test at α = 0.01 is -2.33.

[Figure: critical region in the left tail, ending at -2.33]

Step 5: Compute the test statistic. Since the standard deviation is known and n is large
(n > 30), we use the Z-test:

\[
Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{10.2 - 11.8}{3.2 / \sqrt{60}} = -3.87
\]

Step 6: Since the test statistic value is -3.87, which is less than -2.33, it falls in the
critical region, so we reject H0.

Step 7: Conclusion
Therefore, there is enough evidence at the 0.01 significance level to support the claim
that the mean time a patient takes to fully recover from knee surgery has decreased
below 11.8 months.
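
The left-tailed test can be cross-checked with the Python sketch below, which also reports a p-value as an extra check that is not part of the table-based steps. The variable names are illustrative, and the critical value is taken from `statistics.NormalDist` instead of a printed table.

```python
import math
from statistics import NormalDist

sample_mean, mu0, sigma, n, alpha = 10.2, 11.8, 3.2, 60, 0.01

z = (sample_mean - mu0) / (sigma / math.sqrt(n))
# Left-tailed test: reject H0 when Z is below the lower alpha quantile.
critical = NormalDist().inv_cdf(alpha)
# p-value: probability of a Z at least this small if H0 is true.
p_value = NormalDist().cdf(z)
reject = z < critical
print(f"Z = {z:.2f}, critical value = {critical:.2f}, "
      f"p-value = {p_value:.5f}, reject H0: {reject}")
```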
