
Nanyang Business School

AB1202 – STATISTICS AND ANALYSIS

Tutorial : 11
Topics : Regression Analysis II

1.
(1) The estimated regression model is log(ŷ) = 3.64 + 0.5 log(x), where y represents
expenditure on food and x represents income. Interpret the slope coefficient, β₁ = 0.5.
(2) Suppose now the estimated regression is log(ŷ) = 7.6 + 0.05x. Interpret the slope
coefficient, β₁ = 0.05.
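As a hint for checking your interpretation numerically, here is a minimal Python sketch. It assumes the logs are natural logs, and the helper names (loglog_pct_change, loglinear_pct_change) are made up for illustration. It computes the exact percentage change in y implied by each fitted equation, which you can compare with the usual rule-of-thumb reading of the slope.

```python
import math

def loglog_pct_change(beta1, pct_change_x):
    """Exact % change in y under log(y) = b0 + b1*log(x) when x rises by pct_change_x percent."""
    return ((1 + pct_change_x / 100) ** beta1 - 1) * 100

def loglinear_pct_change(beta1, delta_x):
    """Exact % change in y under log(y) = b0 + b1*x when x rises by delta_x units."""
    return (math.exp(beta1 * delta_x) - 1) * 100

# Part (1): a 1% increase in income with slope 0.5
elasticity_effect = loglog_pct_change(0.5, 1.0)

# Part (2): a one-unit increase in x with slope 0.05
semi_elasticity_effect = loglinear_pct_change(0.05, 1.0)

print(round(elasticity_effect, 3), round(semi_elasticity_effect, 3))
```

The first number is close to 0.5 and the second is close to 5, which is where the approximate "percent" readings of the two slopes come from. The approximation is good only for small changes.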

2. Discuss:
(1) the reasons why the logarithmic scale should be considered in place of the linear scale in a
regression.
(2) how log transformation reduces the skewness of a right-skewed distribution.
(3) why R-squared can be used as a goodness-of-fit measure.
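To get a feel for part (2) before the discussion, the following Python sketch may help. It uses simulated data, not any dataset from this course: the values are drawn from a lognormal distribution, which is right-skewed by construction. The seed and sample size are arbitrary, and the skewness function is a simple moment-based version that may differ slightly from what your R package reports.

```python
import math
import random
import statistics

def skewness(values):
    """Moment-based skewness: the average of the cubed z-scores."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

random.seed(1202)  # arbitrary seed so the run is reproducible

# Simulated right-skewed data (assumption: lognormal with mu = 0, sigma = 1)
raw = [random.lognormvariate(0.0, 1.0) for _ in range(5000)]
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_log = skewness(logged)
print(f"skewness before log: {skew_raw:.2f}, after log: {skew_log:.2f}")
```

The log compresses large values much more than small ones, so the long right tail is pulled in and the skewness coefficient moves toward zero.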

3. Use the data in “kielmc” from the package “wooldridge” for this exercise. To study the
effects of the incinerator location on housing price, we consider the following regression
model
price = β₀ + β₁ log(dist) + ε
where price is housing price ($) and dist is the distance of the house from the incinerator (in
feet).

(1) Check the skewness of price. Considering the skewness coefficient alone, should we
log-transform price?
(2) What sign do you expect for β₁ if the presence of the incinerator depresses housing
prices?
(3) Following (1), estimate this equation and interpret the estimated coefficient β₁.
(4) Following (1), add the following IVs to the regression: log(area), log(land), and rooms, where
area is the square footage of the house, land is the lot size (in square feet), and rooms is the total
number of rooms. Now, what do you conclude about the effect of the incinerator?
(5) Based on the adjusted R² of the above two models, which model provides the better fit?
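For part (5), it may help to recall how the adjusted R² is computed. The Python sketch below uses the standard formula, 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k is the number of IVs. The R² values and sample size in the example are made up purely for illustration; they are not results from the kielmc data.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k IVs fitted to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than estimated coefficients")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical numbers: the larger model has a slightly higher R-squared
small_model = adjusted_r2(0.30, n=100, k=1)
large_model = adjusted_r2(0.31, n=100, k=4)

print(round(small_model, 4), round(large_model, 4))
```

In this made-up case the larger model has the higher R² but the lower adjusted R², because the small gain in fit does not offset the penalty for the three extra IVs. This is why the adjusted R², rather than the plain R², is used to compare models with different numbers of IVs.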

4. Use the data in “apple” from the package “wooldridge”.


(1) We plan to estimate the following model
ecolbs = β₀ + β₁ ecoprc + β₂ regprc + ε
where ecolbs is the quantity (in pounds) of “ecologically friendly” (eco-labeled) apples that a family would
demand, ecoprc is the price of eco-labeled apples, and regprc is the price of regular
apples. First check the skewness of ecolbs. Is a log transformation recommended? Can we
estimate the model with the log-transformed DV?
(2) Are the two price variables (ecoprc and regprc) statistically significant? Report the p-
values for the individual t tests.

5. One important decision in regression analysis is what “additional” independent
variables should be included. For example, in the analysis of how education is related to wages,
we included several independent variables other than “Education”:

E(Wage) = β₀ + β₁ Edu + β₂ Exp + β₃ Tenure + β₄ female

Why did we do that, since the primary interest is the effect of education on wages
(β₁)? The reason is that, when such “necessary” variables are not included, the coefficients of
the included variables will very likely be off (e.g., the estimates from R might average -2.5
while the true (but unknown) coefficient value is β₁ = 1.8). This difference is called
“bias” in statistics. A biased estimate could lead you to a biased conclusion. [1]

The purpose of this question is to introduce what variables are considered necessary.

The most important class of “necessary variables” consists of those that are correlated with both
the DV and your key IVs. These are the variables that generate bias when they are not included
in the regression. For example, if we have reason to believe that parents’ education level is
strongly correlated with both the education (Edu) and the wage of an individual [2], then “parents’ education
level” is a variable that we should try to include in the analysis.
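The idea can be illustrated with a small simulation. The Python sketch below is built entirely on made-up numbers: a simulated "parent_edu" variable affects both "edu" and "wage", the true coefficient on edu is set to 1.8 (borrowing the value from the example above), and the other coefficients are arbitrary assumptions. It then compares the slope on edu when parent_edu is left out with the slope when it is included.

```python
import random

def cov(a, b):
    """Covariance of two equal-length lists (divides by n)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

random.seed(5)  # arbitrary seed so the run is reproducible
n = 20000

# Simulated data: parent_edu drives both edu and wage (all coefficients are assumptions)
parent_edu = [random.gauss(0, 1) for _ in range(n)]
edu = [0.8 * p + random.gauss(0, 1) for p in parent_edu]
wage = [1.8 * e + 2.0 * p + random.gauss(0, 1) for e, p in zip(edu, parent_edu)]

# Regression of wage on edu only: the necessary variable is omitted
slope_short = cov(edu, wage) / cov(edu, edu)

# Regression of wage on edu and parent_edu: closed-form OLS slope for two IVs
sxx, szz, sxz = cov(edu, edu), cov(parent_edu, parent_edu), cov(edu, parent_edu)
sxy, szy = cov(edu, wage), cov(parent_edu, wage)
slope_long = (sxy * szz - szy * sxz) / (sxx * szz - sxz ** 2)

print(f"edu slope without parent_edu: {slope_short:.2f}")
print(f"edu slope with parent_edu:    {slope_long:.2f}")
```

With parent_edu omitted, the slope on edu also absorbs part of the effect of parent_edu, so it lands well above 1.8. Once parent_edu is included, the slope comes back close to the true value. The direction and size of the bias depend on the assumed correlations, so a different setup could just as well push the estimate below the true value.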

Please discuss with the class or in a breakout group: (1) Choose a DV and a key IV of interest to
your group. (2) Draw up a list of other necessary IVs and explain why they are considered
necessary.

[1] Recall that in Week 5 we said “correlation does not mean causality.” The difference between the observed
correlation and the actual relationship is one example of this problem. For instance, if we ignored “IQ” in an
analysis of GPA and salary, we would expect to see an inflated relationship between GPA and salary, that is, one
that looks stronger than it actually is.
[2] For example, parents with a higher education level may tend to pay more attention and give more
support to their children’s education and academic performance. Those parents may also be better
connected with VIPs, which may enhance their children’s prospects of securing higher-paying jobs.