
Prof. Dr. Rainer Schwabe
M. Sc. Kerstin Reckrühm

July 18, 2016

Examination
“Biological Statistics”

1 2 3 4 5 6 Σ

Last name: First name:

Student ID: Study programme:

Please note the following:


• The exam consists of 6 problems. The points for each problem are indicated in
brackets next to the number of the problem. You do not need to solve the individual
problems completely; partial solutions will also be marked. It is not sufficient,
however, to state the result only. You should clearly show your approach and method
of solution. If you draw any conclusions from the data provided, you should state
those conclusions clearly.
• In the multiple choice questions (Problem 6) exactly one answer is correct for each
question. Cross-mark your answer clearly on the problem sheet. Multiple answers
to one question will be considered as being false.
• You can reach a maximum of 40 points. A total of 16 points is required for passing
the exam.
• You are allowed to use a pocket calculator, one hand-written sheet with your own
notes as well as the formula sheet and all statistical tables used in the tutorial (folder
“Tables”).
• Write your name, student ID number and study programme at the top of the front
page (this page) of your problem sheets. Return the problem sheets together with
your solutions when you finish the exam.
• Be cautious: All “data” provided on these problem sheets are modified. Hence,
conclusions drawn are not valid for real life.

Data set 1
A survey was conducted to investigate people's health behavior, which is defined
as any activity undertaken to improve health and well-being. A random sample of 15
students in Magdeburg produced the following data set.
No.  Age  Gender  Body weight  Self-rated  Physical  Healthy
                  (kg)         health      exercise  diet
1    19   m       80           2           1         no
2    21   m       85           3           5         yes
3    30   m       102          2           3         yes
4    23   f       54           2           2         yes
5    22   f       68           1           4         no
6    23   m       72           3           3         yes
7    26   f       73           2           2         yes
8    21   f       55           4           1         yes
9    24   f       67           2           0         yes
10   24   f       63           1           4         yes
11   25   m       92           3           0         no
12   27   m       75           1           2         yes
13   25   f       61           2           1         yes
14   28   m       87           3           2         yes
15   24   m       80           3           3         yes
The second and third columns give the age (in years) and the gender (“f” for female and
“m” for male) of the students. The body weight given in the fourth column is measured
in kg. The column “Self-rated health” shows the answer to the question:
“Would you say that your health is (1) excellent, (2) very good, (3) good, (4) fair or (5)
poor?”
Furthermore, the students were asked how often per week they engage in some form of
physical exercise. The answers can be found in the sixth column. The last column shows
whether the students pay attention to a healthy diet (“yes” or “no”).

Problem 1 (6 points)
Consider data set 1 and answer the following questions.
(a) What are the observational units?

(b) Specify the data type for each of the variables.

(c) Which type of plot is meaningful to represent the data for the variable “Self-rated
health”?

(d) Compute the sample mean, the variance and the sample standard deviation for the
variable “Body weight” in the subgroup of female participants.
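
The computation in part (d) can be sketched in Python. This is only an illustrative check, not part of the exam: the variable names are hypothetical, the weights are read off the rows of data set 1 marked “f”, and the sample variance is assumed to use the divisor n − 1.

```python
from statistics import mean, stdev, variance

# Body weights (kg) of the female participants in data set 1
# (rows 4, 5, 7, 8, 9, 10 and 13).
female_weights = [54, 68, 73, 55, 67, 63, 61]

sample_mean = mean(female_weights)
sample_variance = variance(female_weights)  # divisor n - 1
sample_sd = stdev(female_weights)           # square root of the sample variance

print(f"mean = {sample_mean}, variance = {sample_variance:.2f}, sd = {sample_sd:.2f}")
```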

Problem 2 (7 points)
To investigate the relationship between physical exercise undertaken on a regular basis and
the occurrence of high blood pressure (hypertension) a study was conducted. It appears
that 70% of the participants do sport regularly. Furthermore, it is known that only 10% of
the physically active people suffer from high blood pressure, while 40% of the participants
not engaging in physical activities regularly (physically non-active) suffer from the disease.
(a) What is the probability that a randomly selected participant suffers from high blood
pressure?
(b) What is the probability that a participant with high blood pressure does sport
regularly?
(c) What is the probability that at most 2 out of 9 physically non-active participants
suffer from high blood pressure?
(d) What is approximately the probability that more than 50 of 120 physically
non-active participants have high blood pressure?
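
The four parts of Problem 2 rely on the law of total probability, Bayes' rule, the binomial distribution and its normal approximation. A minimal Python sketch of these calculations follows. The variable names are hypothetical, and part (d) is computed without a continuity correction, which is an assumption; the exam does not say which variant is expected.

```python
from math import comb, erf, sqrt

p_active = 0.70        # P(physically active)
p_hyp_active = 0.10    # P(hypertension | active)
p_hyp_inactive = 0.40  # P(hypertension | non-active)

# (a) Law of total probability.
p_hyp = p_active * p_hyp_active + (1 - p_active) * p_hyp_inactive

# (b) Bayes' rule: P(active | hypertension).
p_active_given_hyp = p_active * p_hyp_active / p_hyp

# (c) X ~ Bin(9, 0.4): P(X <= 2).
p_at_most_2 = sum(
    comb(9, k) * p_hyp_inactive**k * (1 - p_hyp_inactive) ** (9 - k)
    for k in range(3)
)

# (d) X ~ Bin(120, 0.4), approximated by N(n*p, n*p*(1-p)).
def normal_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 120, p_hyp_inactive
mu = n * p
sigma = sqrt(n * p * (1 - p))
p_more_than_50 = 1 - normal_cdf((50 - mu) / sigma)  # no continuity correction

print(p_hyp, p_active_given_hyp, p_at_most_2, p_more_than_50)
```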

Problem 3 (6 points)
Physicians also claim that endurance training can help to decrease the resting heart rate
of people. The resting heart rate of 12 students was measured before and after a 6-week
course of endurance training. Here are the data (in beats per minute):
Student   1   2   3   4   5   6   7   8   9  10  11  12   mean
Before   67  69  71  65  74  67  78  64  68  59  74  84     70
After    63  65  62  65  64  67  70  66  61  62  62  73     65
Assume that the differences of rates are normally distributed.
(a) Is there a significant difference in the mean resting heart rate before and after
training (at level α = 0.05)?
Remember to state the model and the hypotheses. Also state your conclusion.
Hint: For your convenience, the standard deviation for the difference of the resting
heart rate was computed: s_d = 5.26.
(b) Determine a 95%-confidence interval for the difference in the mean resting heart
rate.
(c) In addition, the students were asked whether they are interested in sports. The
result with respect to gender was summarized in the following contingency table:
        no   yes   total
f        2     6       8
m        0     4       4
total    2    10      12
Why is it not reasonable to use the χ²-test for independence to check whether the
interest in sports depends on gender?
Fisher’s exact test can be used instead. The p-value for this test is 0.5152. State the
hypotheses and the test decision using a significance level of 0.05.
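
The paired t-test in parts (a) and (b) and the quoted p-value of Fisher's exact test in part (c) can be reproduced with a short Python sketch. It is an illustration under stated assumptions: the names are hypothetical, the critical value 2.201 is the tabulated 0.975-quantile of the t-distribution with 11 degrees of freedom, and the Fisher p-value is taken to be two-sided (the sum of the probabilities of all tables that are at most as likely as the observed one).

```python
from math import comb, sqrt
from statistics import mean, stdev

before = [67, 69, 71, 65, 74, 67, 78, 64, 68, 59, 74, 84]
after = [63, 65, 62, 65, 64, 67, 70, 66, 61, 62, 62, 73]

# Paired design: work with the differences "before - after".
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)           # matches the hint s_d = 5.26 up to rounding
se = s_d / sqrt(n)

t_stat = d_bar / se          # H0: mean difference = 0
T_CRIT = 2.201               # table value t_{11; 0.975} (assumed from tables)
reject_h0 = abs(t_stat) > T_CRIT
ci = (d_bar - T_CRIT * se, d_bar + T_CRIT * se)

# Fisher's exact test for the table [[2, 6], [0, 4]]:
# k = number of women among the 2 students answering "no".
def table_prob(k: int) -> float:
    """Hypergeometric probability of k women among the 2 'no' answers."""
    return comb(8, k) * comb(4, 2 - k) / comb(12, 2)

p_obs = table_prob(2)
p_fisher = sum(
    table_prob(k) for k in range(3) if table_prob(k) <= p_obs + 1e-12
)

print(t_stat, reject_h0, ci, p_fisher)
```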

Problem 4 (9 points)
Data from 9 climate stations located at different altitudes were collected to
analyze the linear dependence of air pressure (in hPa) on height (in meters). Here are the
data:
height x_i        1030  1270  1500  1580  1750  1940  2300  2800  3100
air pressure y_i   887   853   848   834   810   781   760   724   680

Hint:
mean height 1918.89
standard deviation height 693.46
mean air pressure 797.44
standard deviation air pressure 67.10
correlation between height and air pressure -0.9923

(a) Estimate the intercept and the slope of the regression line that describes the linear
dependence of air pressure on height.

(b) Give an interpretation of the estimated slope.

(c) Compute and interpret the coefficient of determination R².

(d) Predict the air pressure for a height of 1800 meters.

(e) Calculate a 95%-confidence interval for the slope of the regression line.
Test whether the slope differs significantly from zero (at significance level α = 0.05).
Hint: The standard error for the slope equals 0.0045, which you may use without
proof.
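
The regression quantities in Problem 4 can be obtained directly from the summary statistics in the hint. The Python sketch below is one way to do this, using the standard relations slope = r · s_y / s_x and intercept = ȳ − slope · x̄. The variable names are hypothetical, and the critical value 2.365 is the tabulated 0.975-quantile of the t-distribution with n − 2 = 7 degrees of freedom.

```python
# Summary statistics from the hint.
mean_x, sd_x = 1918.89, 693.46   # height (m)
mean_y, sd_y = 797.44, 67.10     # air pressure (hPa)
r = -0.9923                      # correlation
se_slope = 0.0045                # standard error of the slope (given)

# (a) Least-squares estimates.
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x

# (c) Coefficient of determination in simple linear regression.
r_squared = r**2

# (d) Prediction at a height of 1800 m.
pred_1800 = intercept + slope * 1800

# (e) 95% confidence interval and t-test for the slope.
T_CRIT = 2.365                   # table value t_{7; 0.975} (assumed from tables)
ci_slope = (slope - T_CRIT * se_slope, slope + T_CRIT * se_slope)
t_stat = slope / se_slope
reject_h0 = abs(t_stat) > T_CRIT  # H0: slope = 0

print(slope, intercept, r_squared, pred_1800, ci_slope, reject_h0)
```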

Problem 5 (7 points)
A clinical trial was conducted to compare four weight loss programs: (I) low calorie diet,
(II) low fat diet, (III) low carbohydrate diet and (IV) regular physical training. 20
participants were randomly assigned to one of the four programs. They followed the assigned
program for 10 weeks. The weight loss (in kg), i.e. the difference in weight measured at
the start of the study and weight measured at the end of the study (10 weeks), is given
in the following table.
program   I     II    III   IV
          7     3     2     7
          9     4     3     5
          5     3     1     6
          4     5     0     4
          7     2     5     8
mean      6.4   3.4   2.2   6.0
          df   sum of squares   mean squares   F
program   ?    61.8             ?              ?
error     ?    45.2             ?
total     ?    107.0
(a) Test (at level α = 0.05) whether there is a significant difference in the mean weight
loss between the different groups.
Remember to state the model and the hypotheses. Also state your conclusion.
(b) If there is a difference between the groups, which groups differ from each other? (α = 0.05)
Hint: If pairwise t-tests are performed (using pooled variance estimates), the following
(unadjusted) p-values are obtained.
H_0             p-value
µ_I = µ_II      0.0123
µ_I = µ_III     0.0011
µ_I = µ_IV      0.7116
µ_II = µ_III    0.2756
µ_II = µ_IV     0.0264
µ_III = µ_IV    0.0025
Note that under the Bonferroni adjustment an individual significance level of
α_ind = 0.0083 has to be used to keep a multiple level of α = 0.05.
(c) Find a 95 percent confidence interval for the mean weight loss of group I (low calorie
diet) based on the pooled variance estimate.
Test whether the mean weight loss of this group differs significantly from 5. Remember to
state the test problem and your conclusion.
Hint: The pooled (error) variance estimate s_e² can be found in the completed ANOVA
table.
Test statistic: t = √(n_j) · (µ̂_j − µ_0) / s_e ∼ t_f (under H_0), where f is the number of
degrees of freedom related to the pooled variance estimate.
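
The arithmetic behind Problem 5 (completing the ANOVA table, the Bonferroni comparison and the one-sample test for group I) can be sketched in Python as follows. This is an illustration only: the names are hypothetical, and the critical values 3.24 (0.95-quantile of F with 3 and 16 degrees of freedom) and 2.120 (0.975-quantile of t with 16 degrees of freedom) are assumed to be read from standard tables.

```python
from math import sqrt

k, n_total, n_j = 4, 20, 5         # groups, participants, group size
ss_program, ss_error = 61.8, 45.2  # sums of squares from the table

# (a) Completed ANOVA table.
df_program, df_error, df_total = k - 1, n_total - k, n_total - 1
ms_program = ss_program / df_program
ms_error = ss_error / df_error     # pooled variance estimate s_e^2
f_stat = ms_program / ms_error
F_CRIT = 3.24                      # table value F_{3,16; 0.95} (assumed)
reject_anova = f_stat > F_CRIT

# (b) Bonferroni: compare each unadjusted p-value with alpha / 6.
p_values = {
    ("I", "II"): 0.0123, ("I", "III"): 0.0011, ("I", "IV"): 0.7116,
    ("II", "III"): 0.2756, ("II", "IV"): 0.0264, ("III", "IV"): 0.0025,
}
alpha_ind = 0.05 / len(p_values)
significant_pairs = [pair for pair, p in p_values.items() if p < alpha_ind]

# (c) Confidence interval and test of H0: mu_I = 5 for group I.
mean_i, mu_0 = 6.4, 5.0
s_e = sqrt(ms_error)
T_CRIT = 2.120                     # table value t_{16; 0.975} (assumed)
half_width = T_CRIT * s_e / sqrt(n_j)
ci_group_i = (mean_i - half_width, mean_i + half_width)
t_stat = sqrt(n_j) * (mean_i - mu_0) / s_e
reject_mu_0 = abs(t_stat) > T_CRIT

print(f_stat, significant_pairs, ci_group_i, t_stat, reject_mu_0)
```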

Problem 6 (5 points)
Multiple choice questions: exactly one answer is correct in each question. Multiple answers
to one question will be considered as being false. Please indicate your answers clearly.

(a) It is known that a laboratory method for measuring the concentration of a certain
substance in the serum has a variance of 12. To reduce the variance of the
measurement error the same serum is analyzed repeatedly and the mean is calculated.
To reduce the variance resulting from the method of measuring to a value of 3 the
measurement has to be repeated

A  twice
B  nine times
C  sixteen times
D  four times
E  The variance of the mean does not depend on the number of repetitions.

(b) In animal testing you have measured concentrations of corticosterone [ng/ml] for
two different treatment groups. What is a suitable presentation of the data?

A  scatter plot
B  distribution function
C  data points with box-plots
D  pie chart
E  histogram

(c) For a test on the equality of two expected values a non-significant test result means:

A  Actually there is no difference between the expected values.
B  There is no objection to the assumption of equal expected values.
C  The equality of the two expected values is significant.
D  The equality of the two expected values is not significant.
E  Actually there is a difference between the expected values.

(d) Dr Wisenheimer always proceeds as follows: He starts his studies by investigating a
batch of n laboratory animals and performs a test at level α. If the null hypothesis
cannot be rejected, he investigates another batch of n animals and performs a test
based on the combined results at the same level α. How large is the significance level
(probability of a type I error) for any result published by Dr Wisenheimer?

A  equal to α²
B  equal to α/2
C  equal to α/n
D  equal to 1 − α
E  larger than or equal to α

(e) You want to conduct an animal experiment to investigate the influence of breeding
conditions on the development of the brain. What should be taken into account?

1 equal observational conditions
2 randomized assignment of the breeding conditions
3 equal breeding conditions for all animals
4 “blinding” of the investigator (if possible)
5 equal results of the observations

A  none
B  1, 2, and 4
C  1, 3, and 5
D  only 3
E  all
