1.
a)
Life expectancy vs TVs per person – positively correlated; as the
number of TVs per person increases, average life expectancy increases.
Life expectancy vs doctors per person – positively correlated, as better
access to healthcare services likely improves life expectancy.
TVs per person vs doctors per person – positively correlated. Higher
availability of both TVs and doctors is a sign of better socioeconomic
development.
b)
The coefficient is 0.76276, which indicates a strong positive correlation
between life expectancy and TVs per person.
c)
The coefficient is 0.87562, which indicates a very strong positive monotonic
correlation.
d) Yes, they are in agreement, as both indicate a positive correlation.
Looking at the scatterplot of life expectancy vs TVs per person, the
relationship appears to be monotonic but not linear, which means that
Spearman's coefficient is the more appropriate measure.
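To illustrate why a monotonic but curved relationship favours Spearman, here is a minimal sketch in plain Python. The data are hypothetical (not the assignment's dataset), and the helper names `pearson`, `ranks`, and `spearman` are introduced only for this illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: life expectancy rises with TVs per person, but flattens out.
tvs_per_person = [0.01, 0.05, 0.1, 0.2, 0.4, 0.8]
life_expectancy = [50 + 10 * math.log10(t * 100) for t in tvs_per_person]

print("Pearson: ", pearson(tvs_per_person, life_expectancy))
print("Spearman:", spearman(tvs_per_person, life_expectancy))
```

Because the hypothetical relationship is strictly increasing, the ranks agree perfectly and Spearman reaches its maximum, while Pearson stays below it because the points do not lie on a straight line.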
2.
a) Y = β0 + β1X + ϵ
Where,
Y – SBP
X – age
β0 – intercept (SBP at age 0)
β1 - slope (change in SBP per year of age)
ϵ - random error term
Assumptions for the coefficient estimates: the relationship between age and SBP is linear,
observations of SBP from different boys are independent, the variance of the errors is
constant across all ages, and the errors are normally distributed with mean 0.
Assumptions for the interval: all the assumptions for the coefficients plus correct model
specification. The normality assumption can be relaxed when the sample is large enough,
because the Central Limit Theorem makes the sampling distribution of the coefficient
estimates approximately normal in that case.
b)
So the fitted equation is SBP = 97.79 + 1.92 × Age
β0 = 97.79, statistically significant (p<.0001)
β1 = 1.92, statistically significant (p<.0001)
c) The intercept (β0) is the predicted SBP at age 0.
The slope (β1) shows that SBP increases by 1.92 on average for each additional year of age.
d)
Answer: the confidence interval for the slope (β1) is [1.79, 2.05]
e) Predicted SBP = 97.79 + 1.92 × 13 = 122.75
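The arithmetic in part e can be checked with a short sketch. The function name `predict_sbp` is hypothetical; the default coefficients are the rounded estimates reported in part b.

```python
def predict_sbp(age, intercept=97.79, slope=1.92):
    """Predicted SBP from the fitted line SBP = intercept + slope * age."""
    return intercept + slope * age

# Prediction for a 13-year-old, rounded to two decimals as in the answer.
print(round(predict_sbp(13), 2))
```

Using the rounded coefficients reproduces the value in part e; software working from unrounded estimates may differ slightly in the last digit.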
3. a) H0: ρ = 0
HA: ρ ≠ 0
The test is a t-test for the correlation coefficient, with n − 2 degrees of freedom.
t = r × sqrt(n − 2) / sqrt(1 − r^2)
b)
t = -3.84. Using df = 241 and a t-distribution table, p < 0.0002,
which is less than 0.05, so we reject H0: the correlation between
waist circumference and HDL-C is significantly different from zero
(there is a correlation).
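The test statistic can be sketched in Python as follows. The function names are hypothetical. Because r itself is not stated in the answer, the sketch back-calculates it from the reported t and df using the identity r = t / sqrt(t^2 + df). The p-value uses a normal approximation, which is an assumption made here to stay within the standard library: it slightly understates the exact t-distribution p-value, although with df = 241 the difference is small.

```python
import math
from statistics import NormalDist

def t_from_r(r, n):
    """t statistic for H0: rho = 0, given sample correlation r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def r_from_t(t, df):
    """Sample correlation implied by a t statistic with df = n - 2."""
    return t / math.sqrt(t ** 2 + df)

def approx_two_sided_p(t):
    """Two-sided p-value from a normal approximation (reasonable only for large df)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))

df = 241        # reported in the answer
n = df + 2      # so n = 243
r = r_from_t(-3.84, df)

print("implied r:", round(r, 3))
print("t from r: ", round(t_from_r(r, n), 2))
print("approx p: ", approx_two_sided_p(-3.84))
```

The round trip from t to r and back recovers t = -3.84, and the approximate p-value falls below 0.0002, in line with the table lookup.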
c)
d)
HDL-C = 29.89 − 0.13 × Waist Circumference
e) The intercept (β0) represents the predicted HDL-C value when the waist circumference is
0 cm, which is not realistic for a human body, so it does not have a practical or meaningful
interpretation.
The slope (β1) represents the change in HDL-C for every 1 cm increase in waist
circumference: HDL-C decreases by 0.126 mg/dL on average. The slope is meaningful
and indicates an inverse relationship between waist circumference and HDL-C.
f) HDL-C = 29.87 − 11.34 = 18.53 mg/dL
g) The predicted value is the same as in part f.
h) The 95% prediction interval for an individual’s HDL-C is wider than the 95% CI for the
mean HDL-C because the individual’s interval accounts for both the uncertainty in the
estimated regression line and the variability of individual observations around that line.
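The difference between the two intervals can be sketched as follows. The data are hypothetical (not the assignment's dataset) and the function name is introduced only for illustration. Both intervals use the same t multiplier, so comparing the two standard errors is enough to compare their widths: the individual's standard error carries an extra "1 +" term under the square root for the scatter of single observations.

```python
import math

def fit_and_standard_errors(x, y, x0):
    """Fit y = b0 + b1*x by least squares and return, at x0:
    (prediction, SE of the mean response, SE of an individual prediction)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual standard deviation with n - 2 degrees of freedom.
    residual_ss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    s = math.sqrt(residual_ss / (n - 2))

    # Uncertainty in the fitted line at x0.
    line_term = 1 / n + (x0 - mean_x) ** 2 / sxx
    se_mean = s * math.sqrt(line_term)
    # Adds the variability of a single observation around the line.
    se_individual = s * math.sqrt(1 + line_term)
    return intercept + slope * x0, se_mean, se_individual

# Hypothetical waist circumference (cm) and HDL-C (mg/dL) values.
waist = [70, 80, 90, 100, 110, 120]
hdl = [22, 19, 20, 16, 17, 13]

prediction, se_mean, se_individual = fit_and_standard_errors(waist, hdl, 90)
print(prediction, se_mean, se_individual)
```

The point prediction is identical for the mean and for an individual; only the standard error, and therefore the interval width, changes.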
Code:
1. A)
b)
c)
2. B)
3. B)
c)
d)