
PROJECT 1 REPORT

MBA-652A

By:
Pritish Kohli 16125030
M Manjunath 16125023
Gughapriyan M 16125017
PROJECT REPORT
A) Research Problem: A Study of the Relations of the Brain Weight to the Size of the Head
B) Data Set Source: R.J. Gladstone (1905). "A Study of the Relations of the Brain
Weight to the Size of the Head", Biometrika, Vol. 4, pp. 105-123. Website:
http://www.stat.ufl.edu/~winner/datasets.html
C) Variables Involved
Dependent Variable: Brain Weight
Independent Variables: Head Size, Gender and Age

Gender- 1=Male, 2=Female
Age Range- 1=20-46 years, 2=46+ years
Head size- in cm^3
Brain weight- in grams

D) Descriptive Statistics: The means of Brain Weight and Head Size and the variance-covariance
matrix are given below:
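
As an illustration of how these summary figures are obtained, here is a minimal Python sketch. The five observations are hypothetical and are not taken from the Gladstone data set. The helper names (mean, sample_cov, cov_matrix) are also assumptions made for this example. The covariance uses the n - 1 denominator, which is what R's cov() does.

```python
# Hypothetical sample values, NOT taken from the Gladstone data set.
head_size = [3500.0, 3700.0, 3900.0, 4100.0, 4300.0]      # cm^3
brain_weight = [1200.0, 1280.0, 1310.0, 1400.0, 1410.0]   # grams


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_cov(x, y):
    """Sample covariance with the n - 1 denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def cov_matrix(columns):
    """Variance-covariance matrix for a list of equally long columns."""
    return [[sample_cov(ci, cj) for cj in columns] for ci in columns]


# Order of variables: Brain Weight first, Head Size second.
means = [mean(brain_weight), mean(head_size)]
vcov = cov_matrix([brain_weight, head_size])

print("means:", means)
for row in vcov:
    print(row)
```

The diagonal of the matrix holds the variances of Brain Weight and Head Size, and the off-diagonal entries hold their covariance.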

Scatter plots of the dependent variable (Brain Weight) against the various regressors are shown below.
E) Regression Results: The regression is run in brain.R. The script takes four inputs (1, 2, 3, 4),
and the following cases are evaluated.
Case-1: Brain Weight = Head Size
The summary of the regression output is given below.

Interpretation: 1) The coefficient of the regressor Head_Size is highly statistically significant.
2) The regressor has a positive relation to the dependent variable.
3) As the scatter plot above suggests, Head Size seems to be the most important regressor
for predicting brain weight.
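
The quantities read off a regression summary in Case 1 (slope, intercept, R^2, t-value of the slope and overall F-statistic) can be sketched as follows. This is only an illustration in Python, not the contents of brain.R. The data are hypothetical and the function name simple_ols is an assumption.

```python
import math

# Hypothetical observations, NOT the Gladstone data.
head_size = [3500.0, 3700.0, 3900.0, 4100.0, 4300.0]      # cm^3
brain_weight = [1200.0, 1280.0, 1310.0, 1400.0, 1410.0]   # grams


def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x with basic diagnostics."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx

    # Residual and total sums of squares.
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    tss = sum((b - my) ** 2 for b in y)
    r_squared = 1.0 - rss / tss

    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sigma2 = rss / (n - 2)
    t_slope = slope / math.sqrt(sigma2 / sxx)
    f_statistic = (tss - rss) / sigma2
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "t_slope": t_slope,
        "f_statistic": f_statistic,
    }


fit = simple_ols(head_size, brain_weight)
for name, value in fit.items():
    print(name, round(value, 4))
```

With a single regressor the F-statistic equals the square of the slope's t-value, so a highly significant slope and a large F-statistic are the same finding.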

Case-2: Brain Weight = Head Size + Gender


The summary of the regression output is given below.

Interpretation: 1) Using the available data, Gender is checked as a possible omitted variable.
2) Gender can be a factor that affects brain weight and could be correlated with head
size because of differences in body weight and height between the genders.
3) As per the regression results, the coefficient of Gender is not statistically significant
(the hypothesis that it is zero cannot be rejected), and hence Gender is not included in the regression.

Case-3: Brain Weight = Head Size + Age


The summary of the regression output is given below.

Interpretation: 1) Using the available data, Age is checked as a possible omitted variable.
2) Age can be a factor that affects brain weight and could be correlated with head size
because brain size and weight increase with age.
3) The coefficient of Age is statistically significant at the 5% level.
4) Comparing the R^2 values of Case 1 and Case 3, there is not much difference.
5) The value of the F-statistic is reduced from 416 to 214.
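
Points 4 and 5 are linked: the overall F-statistic can be written as F = (R^2 / k) / ((1 - R^2) / (n - k - 1)), where k is the number of regressors and n the number of observations. If R^2 barely changes when a regressor is added, F falls to roughly half. A minimal Python sketch, with hypothetical values for n and R^2 (the report's own values are not used here):

```python
def f_statistic(r_squared, n_obs, n_regressors):
    """Overall F-statistic of an OLS fit that includes an intercept."""
    explained = r_squared / n_regressors
    unexplained = (1.0 - r_squared) / (n_obs - n_regressors - 1)
    return explained / unexplained


# Hypothetical inputs chosen only for illustration.
n_obs = 200
r_squared = 0.64

f_one = f_statistic(r_squared, n_obs, 1)   # one regressor
f_two = f_statistic(r_squared, n_obs, 2)   # two regressors, same R^2

print(round(f_one, 2), round(f_two, 2), round(f_two / f_one, 3))
```

A lower F-statistic here therefore reflects the extra degree of freedom spent on a regressor that adds little explanatory power.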

Case-4: Brain Weight = Head Size + Gender + Age


The summary of the regression output is given below.

Interpretation: 1) After including both Age and Gender, the coefficients of both these
regressors are statistically significant at the 5% level.
2) There is not much difference between the R^2 values of Case 1 and Case 4.
3) The value of the F-statistic has reduced significantly, from 416 to 146.
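
A regression with several regressors, as in Case 4, can be sketched by solving the normal equations (X'X) b = X'y. The Python below is an illustration only and is not the code in brain.R. The data rows are hypothetical, the function names are assumptions, and coding Gender and Age as 0/1 dummies (Female = 1, 46+ years = 1) is a choice made for this example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(design, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(head_size, gender, age_range):
    """Intercept, head size, Female dummy (gender 2), 46+ dummy (age range 2)."""
    return [1.0, head_size,
            1.0 if gender == 2 else 0.0,
            1.0 if age_range == 2 else 0.0]


# Hypothetical rows: (head size in cm^3, gender code, age-range code, brain weight in g).
rows = [
    (3500.0, 1, 1, 1210.0), (3700.0, 1, 2, 1270.0),
    (3900.0, 1, 1, 1330.0), (4100.0, 1, 2, 1370.0),
    (3400.0, 2, 1, 1160.0), (3600.0, 2, 2, 1215.0),
    (3800.0, 2, 1, 1290.0), (4000.0, 2, 2, 1320.0),
]
design = [design_row(h, g, a) for h, g, a, _ in rows]
weights = [w for _, _, _, w in rows]

coefficients = fit_ols(design, weights)
print(dict(zip(["intercept", "head_size", "female", "age_46_plus"], coefficients)))
```

Dropping a dummy column from design_row gives the Case 2 and Case 3 specifications in the same way.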

F) Conclusion: Of the four cases observed, the best is the first one. Case 2
tests Gender as an omitted variable, but its coefficient is not statistically
significant. In Case 3 and Case 4 Age is included, but comparing the R^2 values
there is not much difference. Hence neither Age nor Gender is
included in the regression. The final regression equation is as below:
Equation: Brain Weight = 0.263*Head Size + 325.57
The Y-intercept only sets a baseline value for brain weight and has no physical meaning, as
a brain cannot have a weight when head size is zero.
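
As a quick illustration of how the final equation is used, the Python sketch below evaluates it for one head size. The coefficients are those of the equation above. The function name and the example head size of 3500 cm^3 are hypothetical.

```python
SLOPE = 0.263        # grams per cm^3, from the final equation
INTERCEPT = 325.57   # grams, from the final equation


def predict_brain_weight(head_size_cm3):
    """Predicted brain weight in grams for a head size in cm^3."""
    return SLOPE * head_size_cm3 + INTERCEPT


example_head_size = 3500.0  # hypothetical input, not a value from the data set
print(round(predict_brain_weight(example_head_size), 2))
```

Each additional cm^3 of head size raises the predicted brain weight by 0.263 grams. The prediction is only meaningful for head sizes within the range covered by the data.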
Omitted Variables:
1) Another factor that could be included is the origin of a person. A person's growth
factors depend on their place of origin, such as European, African, Asian etc.
2) Age is given as a category; if Age were given as a continuous variable, its
effect could have been observed more clearly.
3) Whether the person had been malnourished at any time in life up to the point of observation.