
Class 10

Individual Assignment
(Solutions)

General instructions
• This assignment is covered under the Honor Code of the Fuqua School of Business at Duke University.
• This assignment is due by the deadline listed on the course website. Except for the submission deadline,
there is no time limit to complete the assignment.

• This is an individual assignment. The work for this assignment is to be your work alone, without
consultation or assistance from any other person.
• This assignment is open books and open notes. You can only access the material that is intended
for this course and that is made available through the course website and the course packet, as well
as your notes. You are not permitted to obtain any course materials (including handouts, readings,
assignments, etc.) or any solutions from other Fuqua students or any other source.
• You may use a laptop in performing calculations. However, when requested by the question, you are
expected to include intermediate steps as well as references to the formulas used.
• When you are done working on your assignment, make sure to submit your responses by clicking on
the “Submit” button. Answers without submission are not recorded.
• You can submit your answers as many times as you wish before the deadline. Only your last submission
will be graded. Please keep in mind that when you decide to take the assignment more than one
time, your previous answers will be deleted and no answer will be recorded until you submit your new
responses.

• Late or missing submissions will receive no credit.


• The questions that are worth 0 points will not be graded for correctness. However, you are strongly
encouraged to answer them as they might be relevant for later questions and/or for the in-class
discussion.

• Multiple choice questions: read them carefully. Only one answer is correct. You are not to provide
any reasoning on why you selected a particular choice.
• Multiple answer questions: you will get points for each correct selection and negative points for each
incorrect selection.

• Numeric answer questions: make sure to enter only numeric values with a point on the line as the
decimal separator (e.g. you must enter 0.2 and not 0,2 or other variations).
• File upload questions: the number and type of files for upload might be limited as specified by each
question. No other upload is allowed. Please keep in mind that the uploaded file(s) must detail your
work and the process you followed to reach your answer.
Pizza sales
The Waialua Pizza Company is a medium-sized chain of pizzerias located at beaches all over the South
Pacific. The company thinks that the income levels of the nearby community and the presence or absence
of competition might be major factors in determining sales.
The data in the file PizzaSales Data includes daily sales performance data (DS, in U.S. dollars) for some
of its stores, along with the weekly per capita income (PCI, in U.S. dollars) in the neighborhood where each
store is located, as well as whether each store has one or more competing pizzerias located within
half a mile (COMPETITION).

Solution. See also the auxiliary solution file PizzaSales Solution for more details on the answers to
questions below.

Measuring the additive effect of competition by constructing a dummy variable. In this part of
the assignment, we construct and use a dummy variable to estimate the impact of local competition on daily
sales.

• Construct a new dummy variable COMP, which will indicate whether a store faces local competition
or not: set COMP = 1 for each store with COMPETITION = Yes, and set COMP = 0 for each store with
COMPETITION = No.
• Run the simple linear regression DS ∼ COMP, with daily sales (DS) as the dependent variable, and the
dummy variable for competition (COMP) as the only independent variable.
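
As a rough illustration of the two steps above, the Python sketch below builds the dummy from Yes/No labels and fits the simple regression by least squares. The rows are hypothetical values made up for illustration (they are not the PizzaSales data), and `simple_ols` is a helper defined here, not part of any course tool. It also shows a general property of a regression on a single dummy: the intercept equals the mean of the COMP = 0 group, and the slope equals the difference between the two group means.

```python
from statistics import mean

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical rows for illustration only (NOT the PizzaSales data).
competition = ["Yes", "No", "Yes", "No", "No"]
ds = [600.0, 1100.0, 700.0, 1000.0, 1200.0]

# COMP = 1 when COMPETITION is Yes, COMP = 0 when it is No.
comp = [1 if c == "Yes" else 0 for c in competition]

intercept, slope = simple_ols(comp, ds)

# Group means, for comparison with the fitted coefficients.
mean_no = mean(d for d, c in zip(ds, comp) if c == 0)
mean_yes = mean(d for d, c in zip(ds, comp) if c == 1)

print(intercept, slope)              # fitted line DS = intercept + slope * COMP
print(mean_no, mean_yes - mean_no)   # same two numbers, up to rounding
```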

1. (1 point) Based on the regression model DS ∼ COMP, provide an estimate for the expected daily sales for
a store which faces local competition.

Solution. The regression line is

$$\widehat{\text{DS}} = 1{,}105.434 - 455.349\,\text{COMP}.$$

When there is local competition, the dummy variable COMP = 1, so we obtain that the estimate for
expected daily sales is 1,105.434 − 455.349 × (1) = 650.086.
The same estimate can be obtained using the forecasting tool and setting the value of the independent
variable COMP equal to 1.

2. (1 point) Based on the DS ∼ COMP regression model, how confident can one be that the expected daily
sales of a store that faces no competition are higher than the expected daily sales of a store that faces
local competition?
Less than 0.15%
At least 0.15% but less than 2.50%
At least 2.50% but less than 16.00%
At least 16.00% but less than 50.00%
At least 50.00% but less than 84.00%
At least 84.00% but less than 97.50%
At least 97.50% but less than 99.85%

At least 99.85%

Solution. The population model is DS = β0 + β1 COMP + ε, where ε is the normally distributed
error term with mean zero. When a store faces no competition, the dummy variable COMP = 0, so
that store’s expected daily sales are E[DS] = E[β0 + β1 × 0 + ε] = β0. When a store faces local
competition, the dummy variable COMP = 1, and that store’s expected daily sales are E[DS] =
E[β0 + β1 × 1 + ε] = β0 + β1.
The difference between expected sales without and with local competition is then given by

$$\underbrace{\beta_0}_{\text{expected sales without competition}} - \underbrace{(\beta_0 + \beta_1)}_{\text{expected sales with competition}} = -\beta_1,$$

which is the negative of the slope coefficient for the dummy variable COMP.


Based on the regression model DS ∼ COMP, the slope coefficient β1 has a point estimate β̂1 =
−455.349 and a standard error SE(β̂1) = 88.919. The confidence that −β1 > 0 is the same
as the confidence that β1 < 0. Since −455.349 is −455.349/88.919 = −5.121 standard errors away
from 0 (also reported under t-stat), one can conclude using the 1-2-3 rule for one-sided confidence
intervals that the expected daily sales of a store facing no competition are higher than the expected
daily sales of a store facing local competition with at least 99.85% confidence. (See also the
p-value for the COMP coefficient.)
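
The one-sided confidence level can also be computed directly instead of being bracketed with the 1-2-3 rule. The sketch below assumes a normal approximation for the coefficient estimate (the same approximation behind the 1-2-3 rule) and uses only Python's standard library; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

beta1_hat = -455.349   # point estimate of the COMP coefficient
se_beta1 = 88.919      # its standard error

# Distance of the estimate from zero, in standard errors (the t-stat).
t_stat = beta1_hat / se_beta1

# Confidence that beta1 < 0: the probability mass below zero of a normal
# distribution centred at the estimate with spread equal to the standard error.
confidence = NormalDist(mu=beta1_hat, sigma=se_beta1).cdf(0.0)

print(round(t_stat, 3))
print(confidence > 0.9985)
```

Under this approximation the confidence comfortably exceeds the 99.85% threshold of the last answer choice.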

Measuring the effect of per capita income and competition on daily sales. Run the multiple
linear regression model DS ∼ COMP + PCI with daily sales (DS) as the dependent variable, and the competition
dummy (COMP) and the per capita income (PCI) as the two independent variables.
3. (1 point) Based on the regression model DS ∼ COMP + PCI, provide an estimate for the expected daily
sales for a store in a neighborhood with weekly per capita income of $300 (PCI = 300) and in the presence
of local competition.

Solution. The estimated equation for the regression line is

$$\widehat{\text{DS}} = 331.998 - 383.408\,\text{COMP} + 2.490\,\text{PCI}.$$

In a neighborhood with local competition, we have the competition dummy COMP = 1. Furthermore,
since the per capita income in that neighborhood is PCI = 300, we obtain from the regression
equation that the estimate for expected daily sales is

$$\begin{aligned}
\widehat{\text{DS}} &= 331.998 - 383.408\,\text{COMP} + 2.490\,\text{PCI} \\
&= 331.998 - 383.408 \times 1 + 2.490 \times 300 \\
&= 695.590.
\end{aligned}$$

The same estimate can be obtained with the forecasting tool by setting the value of the independent
variables to COMP = 1 and PCI = 300.

4. (1 point) Based on the regression model DS ∼ COMP + PCI, how confident can one be that the expected
daily sales of a store that faces local competition are at least $350 less than the expected daily sales of a
store that faces no competition? (Hint. What is the confidence level for the COMP coefficient being less
than −350?)

Solution. The population model is DS = β0 + β1 COMP + β2 PCI + ε, so, for any given value of per
capita income and since E[ε] = 0, the impact of local competition is given by β1, the coefficient for
the competition dummy COMP.
Here, we are interested in the confidence level associated with a competition effect of $350 or more
(i.e., we are interested in the confidence that daily sales are, on average, at least $350 lower in
neighborhoods with local competition). In formulas, we are interested in the confidence level for
β1 ≤ −350.
Based on the regression model DS ∼ COMP + PCI, the coefficient β1 for the competition dummy COMP
has a point estimate β̂1 = −383.408 and a standard error SE(β̂1) = 33.938. Here, the cut-off −350
is (−350 − (−383.408))/33.938 = 0.984 standard errors above the point estimate β̂1 = −383.408.
As such, the 1-2-3 rule for one-sided confidence intervals tells us that one can be slightly less than
84% confident that β1 ≤ −350.
A more precise confidence level can be obtained by computing the appropriate normal probability.
For instance, with Excel one has NORM.DIST(-350, -383.408, 33.938, TRUE) = 0.838.
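
For readers working outside Excel, the same normal probability can be reproduced with Python's standard library. This is a minimal sketch that mirrors the NORM.DIST call above; the variable names are illustrative.

```python
from statistics import NormalDist

beta1_hat = -383.408   # point estimate of the COMP coefficient
se_beta1 = 33.938      # its standard error
cutoff = -350.0        # threshold of interest: beta1 <= -350

# Number of standard errors between the cut-off and the point estimate.
z = (cutoff - beta1_hat) / se_beta1

# Cumulative normal probability below the cut-off, i.e. the confidence
# that beta1 <= -350 (equivalent to NORM.DIST(cutoff, mean, sd, TRUE)).
confidence = NormalDist(mu=beta1_hat, sigma=se_beta1).cdf(cutoff)

print(round(z, 3), round(confidence, 3))
```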

Using dummy variables to estimate interaction effects. In this part of the assignment, we introduce
a slope dummy variable, which is constructed by multiplying the value of the per capita income variable (PCI)
by the value of the competition dummy (COMP) for each data point. This new variable, which we will refer
to as PCI COMP, allows us to estimate the interaction effect of local competition and per capita income.

• Construct the slope dummy variable PCI COMP, by setting PCI COMP = PCI × COMP for each data point.
Note that PCI COMP = 0 when COMP = 0, and PCI COMP = PCI when COMP = 1.
• Run the multiple linear regression DS ∼ PCI + PCI COMP, in which daily sales (DS) is the dependent
variable, and per capita income (PCI) and the newly constructed slope dummy (PCI COMP) are the two
independent variables.

5. (1 point) Based on the regression model DS ∼ PCI + PCI COMP, provide an estimate for the expected
daily sales of a store in a neighborhood with weekly per capita income of $300 (PCI = 300) and with no
competition.

Solution. The regression model DS ∼ PCI + PCI COMP has the estimated regression line

$$\widehat{\text{DS}} = 83.383 + 3.288\,\text{PCI} - 1.274\,\text{PCI COMP}.$$

For the store in the question, we have a neighborhood weekly per capita income of $300 (PCI = 300),
no competition (COMP = 0), and the slope dummy PCI COMP = PCI × COMP = 300 × 0 = 0. When we
plug these values of the independent variables into the regression equation, we obtain

$$\begin{aligned}
\widehat{\text{DS}} &= 83.383 + 3.288\,\text{PCI} - 1.274\,\text{PCI COMP} \\
&= 83.383 + 3.288(300) - 1.274(0) \\
&= 1{,}069.752.
\end{aligned}$$

The same estimate can be obtained with the forecasting tool in the regression output. One just
needs to set the values of the independent variables PCI = 300 and PCI COMP = 0.

6. (1 point) Based on the regression model DS ∼ PCI + PCI COMP, provide an estimate for the expected
daily sales of a store in a neighborhood with weekly per capita income of $300 (PCI = 300) and in the
presence of local competition.

Solution. The regression model DS ∼ PCI + PCI COMP has the estimated regression line

$$\widehat{\text{DS}} = 83.383 + 3.288\,\text{PCI} - 1.274\,\text{PCI COMP}.$$

For the store in the question, we have a neighborhood weekly per capita income of $300 (PCI = 300).
There is also local competition (COMP = 1), so the slope dummy PCI COMP = PCI × COMP = 300 × 1 =
300. When we plug these values of the independent variables into the regression equation, we
obtain

$$\begin{aligned}
\widehat{\text{DS}} &= 83.383 + 3.288\,\text{PCI} - 1.274\,\text{PCI COMP} \\
&= 83.383 + 3.288(300) - 1.274(300) \\
&= 687.444.
\end{aligned}$$

The same estimate can be obtained with the forecasting tool in the regression output. One just
needs to set the values of the independent variables PCI = 300 and PCI COMP = 300.
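
The plug-in calculations for the model DS ∼ PCI + PCI COMP can be sketched as a small Python function. The function name `predict_ds` and its parameter names are illustrative, and the coefficients are the rounded values printed in the regression line above, so the results may differ from the reported estimates in the first decimal place (the reported figures presumably come from unrounded coefficients in the regression output).

```python
def predict_ds(pci, comp, b0=83.383, b_pci=3.288, b_pci_comp=-1.274):
    """Estimated daily sales from the model DS ~ PCI + PCI_COMP.

    pci  : weekly per capita income in dollars
    comp : competition dummy (1 = local competition, 0 = none)
    """
    pci_comp = pci * comp  # slope dummy: 0 without competition, PCI with it
    return b0 + b_pci * pci + b_pci_comp * pci_comp

no_comp = predict_ds(300, 0)     # store with no competition
with_comp = predict_ds(300, 1)   # store facing local competition

print(round(no_comp, 3), round(with_comp, 3))
```

With the rounded coefficients, the two values land within a few tenths of a dollar of the 1,069.752 and 687.444 reported in the solutions.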

7. (1 point) Based on the regression model DS ∼ PCI + PCI COMP, what is the estimate for the expected
change in daily sales due to a $100 increase of weekly per capita income in a neighborhood in which
there is no competition (COMP = 0)?

Solution. Here, the population model is DS = β0 + β1 PCI + β2 PCI COMP + ε. The store in the
question faces no competition (so COMP = 0 and PCI COMP = PCI × COMP = 0), so when the per
capita income increases by $100, we have a new value for daily sales:

$$\text{DS}' = \beta_0 + \beta_1(\text{PCI} + 100) + \varepsilon.$$

To estimate the expected change in daily sales when the per capita income increases by $100, we
just need to compute the expected difference between DS′ and DS:

$$E[\text{DS}' - \text{DS}] = \beta_1 \times 100.$$

The regression output tells us that the estimate for the slope coefficient of PCI is β̂1 = 3.288. If we
now multiply this estimate by 100 (the dollar increase in per capita income), we obtain that the estimate
for the increase in expected daily sales is 3.288 × 100 = 328.8.

8. (0 points) Based on the regression model DS ∼ PCI + PCI COMP, what is the estimate for the expected
change in daily sales due to a $100 increase of weekly per capita income in a neighborhood in which
there is local competition (COMP = 1)?

Solution. Here, the population model is DS = β0 + β1 PCI + β2 PCI COMP + ε. The store in the
question faces local competition (so COMP = 1 and PCI COMP = PCI × COMP = PCI), so when the per
capita income increases by $100, we have a new value for daily sales:

$$\text{DS}' = \beta_0 + \beta_1(\text{PCI} + 100) + \beta_2(\text{PCI} + 100) + \varepsilon.$$

To estimate the expected change in daily sales when the per capita income increases by $100, we
just need to compute the expected difference between DS′ and DS:

$$E[\text{DS}' - \text{DS}] = (\beta_1 + \beta_2) \times 100.$$

The regression output tells us that the estimate for the slope coefficient of PCI is β̂1 = 3.288. It also
tells us that the estimate for the slope coefficient of PCI COMP is β̂2 = −1.274. If we now multiply
the sum of these estimates β̂1 + β̂2 = 3.288 − 1.274 = 2.014 by 100 (the dollar increase in per capita
income), we obtain that the estimate for the increase in expected daily sales is 2.014 × 100 = 201.4.
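
The two income-effect calculations (questions 7 and 8) follow one pattern: the effective PCI slope is β̂1 without competition and β̂1 + β̂2 with it. A minimal Python sketch, with illustrative names, makes the pattern explicit:

```python
b_pci = 3.288        # estimated slope coefficient of PCI
b_pci_comp = -1.274  # estimated slope coefficient of the slope dummy PCI_COMP

def income_effect(comp, delta_pci=100.0):
    """Estimated change in expected daily sales for a delta_pci increase in PCI.

    comp is the competition dummy: the slope dummy only contributes when comp = 1.
    """
    effective_slope = b_pci + b_pci_comp * comp
    return effective_slope * delta_pci

print(round(income_effect(0), 1))  # no competition
print(round(income_effect(1), 1))  # local competition
```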

9. (1 point) Compare the regression model DS ∼ PCI + PCI COMP (dependent variable: DS; independent
variables: PCI and PCI COMP) with the regression model DS ∼ COMP + PCI (dependent variable: DS;
independent variables: PCI and COMP). Select all of the correct statements from below. (Note. Keep in
mind that incorrect selections will be graded with negative points.)
The standard error of regression in the model DS ∼ PCI + PCI COMP is larger than the standard
error of regression in the model DS ∼ COMP + PCI

The standard error of regression in the model DS ∼ PCI + PCI COMP is smaller than
the standard error of regression in the model DS ∼ COMP + PCI
The slope coefficient for PCI COMP is not significant at a 95% confidence level

The slope coefficient for PCI COMP is significant at 95% confidence level
The slope coefficient COMP is not significant at 95% confidence level


The slope coefficient COMP is significant at 95% confidence level
Something is wrong with the model DS ∼ PCI + PCI COMP since the p-value of the intercept is
too high
Something is wrong since both models DS ∼ PCI + PCI COMP and DS ∼ COMP + PCI have one
independent variable with negative t-stat value

Solution. Based on the regression outputs available in the auxiliary file PizzaSales Solution, we
find that:

• The standard error of regression in the model DS ∼ PCI + PCI COMP is 94.275.
• The standard error of regression in the model DS ∼ COMP + PCI is 117.515.
• The slope coefficient for PCI COMP in the model DS ∼ PCI + PCI COMP is significant (different
from zero) with at least 99.9% confidence (p-value 0.000).

• The slope coefficient for COMP in the model DS ∼ COMP + PCI is significant (different from zero)
with at least 99.9% confidence (p-value 0.000).
• The p-value of the intercept in the model DS ∼ PCI + PCI COMP is 0.029. This tells us that
we are 97.1% confident that the intercept in the model is different from zero. However, even
if the intercept were not to be significant at such confidence level, there would be no problem
since the intercept coefficient does not describe a relationship between the dependent and an
independent variable.
• In the regression outputs, t-stat values can be negative. This usually occurs when the point
estimate is negative, and the t-stat represents distance from zero from the left (from the
negative numbers).

10. (0 points) Run a regression model with daily sales (DS) as the dependent variable, and with the
competition dummy (COMP), the weekly per capita income (PCI), and the slope dummy (PCI COMP) as the
three independent variables. Based on this regression model, which of the following are estimates of the
line $\widehat{\text{DS}} = a + m\,\text{PCI}$ for stores facing no competition (COMP = 0) and for stores facing local competition
(COMP = 1)?
The regression line estimate for stores facing no competition is $\widehat{\text{DS}} = 79.601 + 2.025\,\text{PCI}$; the
regression line estimate for stores facing local competition is $\widehat{\text{DS}} = 90.701 + 3.267\,\text{PCI}$
The regression line estimate for stores facing no competition is $\widehat{\text{DS}} = 79.601 + 3.267\,\text{PCI}$; the
regression line estimate for stores facing local competition is $\widehat{\text{DS}} = 90.701 + 2.025\,\text{PCI}$
The regression line estimate for stores facing no competition is $\widehat{\text{DS}} = 90.701 + 2.025\,\text{PCI}$; the
regression line estimate for stores facing local competition is $\widehat{\text{DS}} = 79.601 + 3.267\,\text{PCI}$

The regression line estimate for stores facing no competition is $\widehat{\text{DS}} = 90.701 + 3.267\,\text{PCI}$;
the regression line estimate for stores facing local competition is $\widehat{\text{DS}} = 79.601 + 2.025\,\text{PCI}$

Solution. For the model with daily sales (DS) as the dependent variable, and with the competition
dummy (COMP), the weekly per capita income (PCI), and the slope dummy (PCI COMP) as the three
independent variables, the regression line estimate is given by

$$\widehat{\text{DS}} = 90.701 - 11.100\,\text{COMP} + 3.267\,\text{PCI} - 1.242\,\text{PCI COMP}.$$

In a neighborhood with no competition, the dummy variable COMP = 0 and the slope dummy
PCI COMP = 0, so the regression equation is

$$\widehat{\text{DS}} = 90.701 + 3.267\,\text{PCI}.$$

In a neighborhood in which the store faces local competition, the dummy variable COMP = 1 and the
slope dummy PCI COMP = PCI, so the regression equation is

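$$\begin{aligned}
\widehat{\text{DS}} &= 90.701 - 11.100\,\text{COMP} + 3.267\,\text{PCI} - 1.242\,\text{PCI COMP} \\
&= 90.701 - 11.100(1) + 3.267\,\text{PCI} - 1.242\,\text{PCI}(1) \\
&= 79.601 + 2.025\,\text{PCI}.
\end{aligned}$$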
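
The collapse of the three-variable model into one line per segment can be sketched in Python as follows. The dictionary keys and the function name `segment_line` are illustrative choices, and the coefficients are the rounded estimates from the regression line above.

```python
# Rounded coefficient estimates of the model DS ~ COMP + PCI + PCI_COMP.
coef = {"const": 90.701, "COMP": -11.100, "PCI": 3.267, "PCI_COMP": -1.242}

def segment_line(comp):
    """Return (intercept, slope) of the line DS = a + m * PCI for a given COMP.

    With COMP fixed, the COMP term shifts the intercept and the slope dummy
    (PCI_COMP = PCI * COMP) shifts the slope on PCI.
    """
    intercept = coef["const"] + coef["COMP"] * comp
    slope = coef["PCI"] + coef["PCI_COMP"] * comp
    return intercept, slope

a0, m0 = segment_line(0)  # stores facing no competition
a1, m1 = segment_line(1)  # stores facing local competition

print(round(a0, 3), round(m0, 3))
print(round(a1, 3), round(m1, 3))
```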

11. (1 point) Run a regression model with daily sales (DS) as the dependent variable, and with the
competition dummy (COMP), the weekly per capita income (PCI), and the slope dummy (PCI COMP) as the three
independent variables. How confident can one be that the coefficient of the dummy variable COMP is not
zero?

Solution. The p-value for COMP is 0.888, so one can only be 1 − 0.888 = 0.112 = 11.2% confident
that the COMP coefficient is different from zero.
In other words, COMP is not statistically significant. The data do not support the inclusion of the
dummy variable COMP in the regression model DS = β0 + β1 COMP + β2 PCI + β3 PCI COMP + ε.

The stratified approach. We now stratify the data and build separate regression models for daily sales
of stores (0) facing no competition, and (1) facing local competition. To do so, you need to split the data
and separately treat observations that correspond to neighborhoods with no competition and observations
from neighborhoods with competition. There are several ways to do so; one is described below.

• Step 1: create the new variable PCI 0. In the Data spreadsheet, create a new variable with name
PCI 0 for the per capita income in neighborhoods where the pizza stores face no competition. Type
the variable name in cell F1, for instance.

• Step 2: populate the variable PCI 0. To select the values of per capita income in neighborhoods
where the stores face no competition, type the formula =IF(C2="No",A2,"") in cell F2, and copy
the formula down to the end of the dataset. As the formulas are calculated, you will see the per capita
income values appear in the rows in which the COMPETITION variable is set to No (that is, where the
competition dummy COMP = 0). The cells that refer to neighborhoods with local competition will be blank.

• Step 3: repeat this construction three more times. Now that the variable PCI 0 has been
created, you need to follow the same steps to create three more variables.
– DS 0: values of daily sales at stores that face no competition. Create the variable in a new column
(for instance, in column G), and type the formula =IF(C2="No",B2,"") in the respective cell
of row 2 (for instance, in cell G2). Copy the formula down to the end of the dataset and see the
values of daily sales appear in the rows where PCI 0 contains a number.
– PCI 1: per capita income in neighborhoods where the store faces local competition. Same
construction as PCI 0, but make sure to select the observations that correspond to “Yes” values of
COMPETITION.
– DS 1: daily sales for stores that face local competition. Same construction as DS 0, but make sure
to select the observations that correspond to “Yes” values of COMPETITION.

Note that this construction will create the variables PCI 0 and DS 0 with 29 empty cells, and the variables
PCI 1 and DS 1 with 21 empty cells. It is important for the analysis that these cells are indeed empty. The
regression add-in will simply skip these rows and treat them as “missing” observations. (It would be a
conceptual mistake to set these cells to zero: this would incorrectly describe non-existing stores with zero
sales in neighborhoods with zero income.)
Next, run two separate regressions, one for each of the two segments:
(0) Simple linear regression model DS 0 ∼ PCI 0 with DS 0 as the dependent variable and PCI 0 as the
independent variable;
(1) Simple linear regression model DS 1 ∼ PCI 1 with DS 1 as the dependent variable and PCI 1 as the
independent variable.
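
Outside a spreadsheet, the stratified approach amounts to filtering the rows by segment and fitting each segment on its own; rows from the other segment are dropped, never set to zero. The Python sketch below uses hypothetical rows invented for illustration (not the PizzaSales data), constructed to lie exactly on one line per segment so the recovered coefficients are easy to check. The helper `simple_ols` is defined here and is not part of any course tool.

```python
from statistics import mean

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical (PCI, DS, COMPETITION) rows for illustration only.
# "No" rows lie on DS = 100 + 3 * PCI; "Yes" rows lie on DS = 80 + 2 * PCI.
rows = [
    (200.0, 700.0, "No"),
    (300.0, 1000.0, "No"),
    (400.0, 1300.0, "No"),
    (200.0, 480.0, "Yes"),
    (300.0, 680.0, "Yes"),
    (400.0, 880.0, "Yes"),
]

# Stratify: keep only the rows of each segment (others are skipped, not zeroed).
segments = {
    label: [(pci, ds) for pci, ds, c in rows if c == label]
    for label in ("No", "Yes")
}

# One simple regression per segment.
fits = {
    label: simple_ols([p for p, _ in pts], [d for _, d in pts])
    for label, pts in segments.items()
}

for label, (a, m) in fits.items():
    print(label, round(a, 3), round(m, 3))
```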
12. (1 point) Consider the regression model DS 0 ∼ PCI 0 for the stores facing no competition, and the
regression model DS 1 ∼ PCI 1 for the stores facing local competition. Which of the following are the
estimates of the respective regression lines?
The regression line estimate for the stores facing no competition is $\widehat{\text{DS 0}} = 79.601 + 2.025\,\text{PCI 0}$;
the regression line estimate for the stores facing local competition is $\widehat{\text{DS 1}} = 90.701 + 3.267\,\text{PCI 1}$
The regression line estimate for the stores facing no competition is $\widehat{\text{DS 0}} = 79.601 + 3.267\,\text{PCI 0}$;
the regression line estimate for the stores facing local competition is $\widehat{\text{DS 1}} = 90.701 + 2.025\,\text{PCI 1}$
The regression line estimate for the stores facing no competition is $\widehat{\text{DS 0}} = 90.701 + 2.025\,\text{PCI 0}$;
the regression line estimate for the stores facing local competition is $\widehat{\text{DS 1}} = 79.601 + 3.267\,\text{PCI 1}$

The regression line estimate for the stores facing no competition is $\widehat{\text{DS 0}} = 90.701 + 3.267\,\text{PCI 0}$;
the regression line estimate for the stores facing local competition is $\widehat{\text{DS 1}} = 79.601 + 2.025\,\text{PCI 1}$

Solution. The regression outputs available in the auxiliary solution file PizzaSales Solution tell
us that the regression line estimate for the stores facing no local competition is

$$\widehat{\text{DS 0}} = 90.701 + 3.267\,\text{PCI 0},$$

and that the regression line estimate for the stores facing local competition is

$$\widehat{\text{DS 1}} = 79.601 + 2.025\,\text{PCI 1}.$$

Note that the regression lines above are the same as the ones you would obtain by choosing the
appropriate values of the independent variables in the regression model DS ∼ COMP + PCI + PCI COMP.
The estimated regression line for this model is

$$\widehat{\text{DS}} = 90.701 - 11.100\,\text{COMP} + 3.267\,\text{PCI} - 1.242\,\text{PCI COMP}.$$

If you set COMP = 0 and PCI COMP = 0 in the equation above, you then recover the regression line
for stores facing no competition. Similarly, if you set COMP = 1 and PCI COMP = PCI
in the equation above and rearrange, you then recover the regression line for stores facing local
competition.
