Department of methods and applied statistics
Research
(VetM 713) assignment
by
1. Redwan Anwar (DVM) = R0008/14
2. Iya Halake (BVSc) = R0007/14
3. Demeke Hailu (DVM) = R0003/14
January, 2022
Hawassa, Ethiopia
1) A researcher wants to find out whether there is a relationship between the height of a son and that of
his father. He took a random sample of 8 fathers and their sons and recorded their heights in cm, as
given in the table below.
Height of the father (X)   159  163  166  175  179  182  185  191
Height of his son (Y)      163  162  165  180  174  168  181  187
a. Fit a simple linear regression model representing the dependency of the son’s height on his father’s height.
Y = β0 + β1x + ε, where
Y = height of the son (dependent variable)
β0 = the intercept (constant, i.e. the value of Y when x = 0)
β1 = the slope of the line (change in the son’s height, in cm, for a 1 cm change in the father’s height)
x = height of the father (independent variable)
ε = random error
b. Estimate the regression coefficients and interpret your results.
Ŷ = a + bx. To estimate the regression coefficients (a and b), we first find the two means, x̄ and ȳ.
x̄ = 1400/8 = 175, ȳ = 1380/8 = 172.5
b = (Ʃxy − ƩxƩy/n) / (Ʃx² − (Ʃx)²/n) = (242137 − 1400∗1380/8) / (245902 − 1400²/8) = 637/902 = 0.706
a = ȳ − b∗x̄ = 172.5 − (637/902)∗175 = 48.914 (using the unrounded slope)
Ŷ = 48.914 + 0.706x
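As a cross-check of the hand calculation, the least-squares estimates can be recomputed from the raw data. This is a minimal sketch using only the Python standard library; the variable names (father, son, s_xy, s_xx) are illustrative and not part of the assignment.

```python
# Heights in cm, copied from the table in question 1.
father = [159, 163, 166, 175, 179, 182, 185, 191]
son = [163, 162, 165, 180, 174, 168, 181, 187]

n = len(father)
sum_x, sum_y = sum(father), sum(son)
sum_xy = sum(x * y for x, y in zip(father, son))
sum_xx = sum(x * x for x in father)

# Corrected sum of cross-products and corrected sum of squares of x.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_xx - sum_x ** 2 / n

b = s_xy / s_xx                    # slope
a = sum_y / n - b * sum_x / n      # intercept, computed with the unrounded slope
print(f"b = {b:.3f}, a = {a:.3f}")
```

Using the unrounded slope for the intercept is what gives 48.914; rounding b to 0.706 first would shift the intercept slightly.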
Interpretation: the intercept (48.914) is the predicted height of a son when the father’s height is 0 cm;
it lies far outside the observed range (159 to 191 cm) and has no practical meaning on its own. The slope
shows that, on average, each 1 cm increase in the father’s height is associated with a 0.706 cm increase
in the son’s height. The relationship is statistically significant at the 5% level: with r = 0.853 (part c),
t = r√(n−2)/√(1−r²) ≈ 4.01 on 6 degrees of freedom, which exceeds the critical value of 2.447.
c. Calculate correlation coefficient and interpret the result.
r = (Ʃxy − ƩxƩy/n) / √[(Ʃx² − (Ʃx)²/n)(Ʃy² − (Ʃy)²/n)]
  = (242137 − 1400∗1380/8) / √[(245902 − 1400²/8)(238668 − 1380²/8)]
  = 637 / √(902∗618) = 0.853
The type of relationship between the height of fathers and their sons is direct/positive and strong
(r=0.853).
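The correlation coefficient can be verified the same way. The sketch below applies the computational formula for Pearson's r to the data from question 1; the variable names are illustrative.

```python
import math

# Heights in cm, copied from the table in question 1.
father = [159, 163, 166, 175, 179, 182, 185, 191]
son = [163, 162, 165, 180, 174, 168, 181, 187]
n = len(father)

# Corrected sums of products and squares.
s_xy = sum(x * y for x, y in zip(father, son)) - sum(father) * sum(son) / n
s_xx = sum(x * x for x in father) - sum(father) ** 2 / n
s_yy = sum(y * y for y in son) - sum(son) ** 2 / n

r = s_xy / math.sqrt(s_xx * s_yy)
print(f"r = {r:.3f}")
```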
d. Obtain coefficient of determination and interpret the result.
R-square = 0.728, obtained by squaring the simple linear correlation coefficient (0.853). This implies that
72.8% of the variation in the height of sons is explained by the variation in the height of their
fathers; the remaining 27.2% is due to other factors.
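To illustrate what "explained variation" means, R-square can also be computed as 1 minus the ratio of residual to total variation, which should agree with r² for a simple linear regression. A minimal sketch, with illustrative variable names:

```python
# Heights in cm, copied from the table in question 1.
father = [159, 163, 166, 175, 179, 182, 185, 191]
son = [163, 162, 165, 180, 174, 168, 181, 187]
n = len(father)
mean_x, mean_y = sum(father) / n, sum(son) / n

# Least-squares fit in deviation form.
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(father, son))
     / sum((x - mean_x) ** 2 for x in father))
a = mean_y - b * mean_x
fitted = [a + b * x for x in father]

ss_total = sum((y - mean_y) ** 2 for y in son)               # total variation in Y
ss_resid = sum((y - f) ** 2 for y, f in zip(son, fitted))    # unexplained variation
r_squared = 1 - ss_resid / ss_total
print(f"R-square = {r_squared:.3f}")
```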
2) Suppose one of the objectives of a given study is to identify the dependency of blood sugar level
on exercise done (distance run) and to create a model that represents it. The following table contains data on
aerobic exercise levels (running distance in km) and blood sugar levels for 10 different days.
Distance (km)         2.1  2.3  2.5  2.5  3.2  3.5  3.5  3.8  4.1  4.5
Blood sugar (mg/dL)   136  146  131  125  120  116  116  104   95   85
Summary results (sums over i = 1 to 10): ∑xi = 32, ∑yi = 1174, ∑xiyi = 3624.6, ∑xi² = 108.44, ∑yi² = 140976
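One way the summary results above could be turned into a fitted line is sketched below, assuming the same least-squares formulas used in question 1. The variable names are illustrative, and the output is a computed check rather than a result reported in the assignment.

```python
# Summary sums for question 2 (n = 10 observations).
n = 10
sum_x, sum_y = 32, 1174
sum_xy = 3624.6
sum_xx = 108.44

# Corrected sum of cross-products and corrected sum of squares of x.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_xx - sum_x ** 2 / n

b = s_xy / s_xx                    # change in blood sugar (mg/dL) per km run
a = sum_y / n - b * sum_x / n      # predicted blood sugar at 0 km
print(f"Fitted line: y = {a:.2f} + ({b:.2f})x")
```

A negative slope here would indicate that blood sugar tends to fall as running distance increases.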
ANOVA (a)
Model          Sum of Squares   df   Mean Square   F        Sig.
1 Regression   60.601           2    30.30         33.290   .000 (b)
  Residual     4.550            5    .910
  Total        65.156           7
a. Dependent Variable: consumption level of households
b. Predictors: (Constant), monthly income, family size, and schooling cost of households
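The mean squares and F statistic in an ANOVA table follow from the sums of squares and degrees of freedom. The sketch below recomputes them from the printed values; small differences from the table are expected because the printed figures are rounded.

```python
# Values copied from the ANOVA table above.
ss_regression, df_regression = 60.601, 2
ss_residual, df_residual = 4.550, 5

ms_regression = ss_regression / df_regression   # mean square = SS / df
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual       # F = MS regression / MS residual

print(f"MS regression = {ms_regression:.2f}")
print(f"MS residual   = {ms_residual:.3f}")
print(f"F             = {f_statistic:.2f}")
```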
Based on the output displayed in the table above, answer the following questions.
a. Before interpreting the output regression coefficients, test the adequacy of the model using output on
Model summary (R-square Table 1) and ANOVA Table 2.
The model is considered adequate if the coefficient of determination (R-square), expressed as a
percentage, is greater than 50% and the p-value of the ANOVA F-test is less than 0.05. Here
R-square = 0.883 ∗ 100% = 88.3%, which is greater than 50%, and the p-value is 0.000, which is
less than 0.05. Therefore the model is adequate.
b. Write the model of multiple linear regression representing the problem and explain each component.
Y = β0 + β1x1 + β2x2 + β3x3 + ε, where
Y = consumption level of households (dependent variable)
x1, x2, x3 = monthly income, family size, and school cost, respectively (independent variables)
β0 = the intercept, i.e. the expected consumption level when monthly income, family size,
and school cost are all zero.
β1 = the change in consumption level for a one-unit increase in monthly income, holding family size and school cost constant.
β2 = the change in consumption level for one additional family member, holding monthly income and school cost constant.
β3 = the change in consumption level for a one-unit increase in school cost, holding monthly income and family size constant.
ε = random error
c. Write the fitted model of MLR using (substituting) the regression coefficients displayed.
Ŷ = -453.604 + 0.707x1 + 89.091x2 -0.329x3
d. Interpret the regression coefficients (considering type, magnitude and significance) displayed in the
output table.
β0 = -453.604 is the predicted consumption level of a household with no income, no family members,
and no school cost. A negative consumption level has no practical meaning, so the intercept serves only
to position the regression plane.
β1 = 0.707 (monthly income) is positive, so income is directly related to consumption: on average, a
one-unit increase in monthly income is associated with a 0.707-unit increase in consumption, holding
family size and school cost constant. Monthly income is a significant variable (p = 0.001).
β2 = 89.091 (family size) is positive, so family size is directly related to consumption: on average,
each additional family member is associated with an increase of 89.091 in consumption, holding monthly
income and school cost constant. It is also significantly associated with consumption (p = 0.022).
β3 = -0.329 (school cost) is negative, so school cost is inversely related to consumption: on average,
a one-unit increase in school cost is associated with a 0.329-unit decrease in consumption, holding
monthly income and family size constant. The effect is small in magnitude, but the variable is
significant (p = 0.041).
e. Test the significance of the regression parameters.
Solution: the significance of each regression coefficient is judged from its p-value. If the p-value is
less than 0.05, the variable is significant; otherwise it is not.
β1 is significant because 0.001 < 0.05.
β2 is significant because 0.022 < 0.05.
β3 is significant because 0.041 < 0.05.
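The decision rule above is a comparison of each p-value with the 0.05 significance level. A minimal sketch, where the dictionary labels are illustrative and the p-values are those quoted above:

```python
ALPHA = 0.05  # significance level used in the solution

# p-values of the regression coefficients, as quoted in the solution.
p_values = {
    "monthly income (b1)": 0.001,
    "family size (b2)": 0.022,
    "school cost (b3)": 0.041,
}

# A coefficient is declared significant when its p-value is below ALPHA.
significant = {name: p < ALPHA for name, p in p_values.items()}

for name, is_sig in significant.items():
    verdict = "significant" if is_sig else "not significant"
    print(f"{name}: p = {p_values[name]:.3f} -> {verdict}")
```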
f. Suppose a given household with monthly income and schooling cost of 8000 and 1200 respectively has a
family size of 5. What would be the expected monthly expenditure level of the household?
Solution: substitute in the fitted model equation, Ŷ = -453.604 + 0.707x1 + 89.091x2 -0.329x3.
Given x1 = 8000, x2 = 5, x3 = 1200,
Then Ŷ = -453.604 + 0.707*8000 + 89.091*5 -0.329*1200 = 5253.051.
So the household is expected to spend 5253.051 per month.
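The substitution can be checked with a small function that evaluates the fitted model. The function name predict_consumption and its parameter names are hypothetical, chosen for illustration; the coefficients are those of the fitted model in part c.

```python
def predict_consumption(income, family_size, school_cost):
    """Evaluate the fitted model Y-hat = -453.604 + 0.707*x1 + 89.091*x2 - 0.329*x3."""
    return -453.604 + 0.707 * income + 89.091 * family_size - 0.329 * school_cost


# Household from part f: income 8000, family size 5, school cost 1200.
expected = predict_consumption(income=8000, family_size=5, school_cost=1200)
print(f"Expected monthly expenditure: {expected:.3f}")
```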