
Exercise 1

Table 1.1 contains some descriptive statistics of the prices of ten food products sold at four
different chains of supermarkets.
Table 1.1: Descriptive Statistics of 10 prices at four supermarkets (the last row
shows statistics for the difference between 10 prices at supermarkets 1 and 2)
Supermarket   N    Minimum   Maximum   Mean    Std. Deviation
1             10   1.69      7.99      2.920   2.020
2             10   0.79      4.99      2.211   1.350
3             10   1.29      5.99      2.240   1.513
4             10   0.99      5.99      2.282   1.484
1–2           10   -0.99     3.00      0.709   1.029

a. Test whether the mean price is significantly larger for supermarket 1 than for supermarket 2,
where you should use the fact that the two standard deviations for these two supermarkets do
not differ significantly. Use the following steps: (i) State H0 and Ha. (ii) Compute the relevant
test statistic and its degrees of freedom. (iii) Determine the relevant critical value. (iv) Draw
your conclusion, also in common speech.
(i)

(ii)

(iii)

(iv)
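
The pooled two-sample statistic requested in step (ii) can be sketched from the summary values in Table 1.1. This is a minimal illustration rather than a model answer; the variable names are illustrative, and the critical value quoted in the comment is the usual one-sided 5% t-table entry for 18 degrees of freedom.

```python
import math

# Summary statistics for supermarkets 1 and 2, taken from Table 1.1
n1, mean1, sd1 = 10, 2.920, 2.020
n2, mean2, sd2 = 10, 2.211, 1.350

# Pooled variance, justified by the stated equality of the two standard deviations
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df

# Standard error of the difference in means and the t statistic
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / se_diff

print(f"df = {df}, t = {t_stat:.3f}")
# Compare t with the one-sided 5% critical value t(18) = 1.734.
```

With these inputs t stays well below 1.734, so this test would not reject H0 at the 5% level.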

b. The prices are in fact paired, as they refer to the same ten products for each supermarket.
Perform a paired samples test to investigate whether the mean price is significantly larger for
supermarket 1 than for supermarket 2. Use the following steps: (i) State H0 and Ha. (ii)
Compute the relevant test statistic and its degrees of freedom. (iii) Determine an approximate
value for the relevant P-value. (iv) Draw your conclusion, also in common speech.
(i)

(ii)

(iii)

(iv)
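
For the paired test, the statistic in step (ii) uses only the last row of Table 1.1 (the differences between supermarkets 1 and 2). The sketch below is a hedged illustration with illustrative variable names; the t-table entries in the comment are the standard one-sided values for 9 degrees of freedom.

```python
import math

# Statistics of the 10 paired differences (row "1-2" of Table 1.1)
n = 10
mean_diff = 0.709
sd_diff = 1.029

# Paired-samples t statistic with n - 1 degrees of freedom
df = n - 1
se_mean_diff = sd_diff / math.sqrt(n)
t_stat = mean_diff / se_mean_diff

print(f"df = {df}, t = {t_stat:.3f}")
# t-table for df = 9: t(0.05) = 1.833 and t(0.025) = 2.262,
# so the one-sided P-value lies between 0.025 and 0.05.
```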

Table 1.2: Ranks and Test statistics for 10 prices of Supermarkets 1 and 2
Supermarket   N    Mean Rank   Sum of Ranks   Wilcoxon W   Z
1             10   12.70       127.00         127.00       1.67
2             10    8.30        83.00

Table 1.3: Sign of difference between supermarkets 1 and 2 for 10 prices


Supermarket   N    Positive   Negative   Equal   Exact Sig. (2-tailed)
1 minus 2     10   8          2          0       0.109

c. Tables 1.2 and 1.3 show results based on the rank scores for the prices of supermarkets 1 and
2. We wish to use these results to test whether the prices are larger for supermarket 1 than for
supermarket 2. (i) State the relevant H0 and Ha for Table 1.2. (ii) Compute the corresponding
P-value and draw your conclusion, with a brief motivation. (iii) State the relevant H0 and Ha
for Table 1.3. (iv) What is the relevant test distribution for Table 1.3? (v) Draw your
conclusion for Table 1.3, with a brief motivation.
(i)

(ii)

(iii)

(iv)

(v)
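
The two P-values needed in this part can be sketched as follows, assuming a one-sided alternative in the direction "supermarket 1 is more expensive". The rank-sum P-value uses the normal approximation for the Z-value in Table 1.2, and the sign test uses the binomial distribution with 10 trials and success probability 0.5. Variable names are illustrative.

```python
from math import comb
from statistics import NormalDist

# Table 1.2: one-sided P-value from the reported Z-value (normal approximation)
z = 1.67
p_rank_one_sided = 1 - NormalDist().cdf(z)

# Table 1.3: exact sign test, P(X >= 8) for X ~ Binomial(10, 0.5)
n, positives = 10, 8
p_sign_one_sided = sum(comb(n, k) for k in range(positives, n + 1)) / 2**n

print(f"rank-sum one-sided P = {p_rank_one_sided:.4f}")
print(f"sign test one-sided P = {p_sign_one_sided:.4f}")
print(f"sign test two-sided P = {2 * p_sign_one_sided:.3f}")  # matches the table's 0.109
```

Because the binomial distribution is symmetric under H0, the one-sided sign-test P-value is half of the reported two-sided value.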

Table 1.4: Test outcomes of ANOVA and Kruskal-Wallis tests
                   ANOVA             Kruskal-Wallis
Supermarket   N    Mean    F         Mean Rank   Chi-Square
1             10   2.920   0.442     26.95       4.253
2             10   2.211             18.30
3             10   2.240             17.25
4             10   2.282             19.50

d. Finally, we wish to use the results in Table 1.4 to test whether the price distributions differ
between the four supermarkets. (i) State the relevant H0 and Ha for the ANOVA test in Table
1.4. (ii) Compute the degrees of freedom of the corresponding F-test. (iii) Draw your
conclusion, with a brief motivation. (iv) State the relevant H0 and Ha for the Kruskal-Wallis
test in Table 1.4. (v) Compute the degrees of freedom of the corresponding Chi-Square test.
(vi) Draw your conclusion, with a brief motivation.
(i)

(ii)

(iii)

(iv)

(v)

(vi)

Exercise 2

A company is interested in the effect of radio and newspaper advertising on its sales. A sample of
22 cities with approximately equal populations is selected for study during a test period of one
month. Each city is allocated a specific expenditure level both for local radio and for local
newspaper advertising, and the sales during the test month are recorded. All variables are
expressed in thousands of dollars, and the combined advertisement expenditures for radio and
newspaper range from 40 to 100 thousand dollars per city. Table 2.1 shows results obtained for four
regression models (s² denotes the estimated value of the variance σ² of the error terms).

Table 2.1: Regression models, with Sales as dependent variable


                   Model 1             Model 2             Model 3             Model 4
Variable           Coeff     P-value   Coeff     P-value   Coeff     P-value   Coeff     P-value
(Constant)         731.250   .000      148.201   .316      418.677   .021      -433.545  .394
Radio              11.683    .002      13.733    .000      -2.892    .676      26.782    .024
Newspaper                              17.335    .000      14.737    .000      33.379    .023
Radio*Radio                                                .240      .020
Radio*Newspaper                                                                -.366     .237
Resid. Var. (s²)   77366               31447               24414               30647
R Square           .383                .762                .825                .780

a. Answer the following questions: (i) Write down the estimated regression equation of Model
2. (ii) Give an interpretation of the sign, size, and significance of the coefficients of Model 2.
(iii) Explain the large differences between the coefficients of Model 2 and Model 4, even
though the interaction term in Model 4 seems to be small and not significant. (iv) Use Model
2 to compute an approximate 95% prediction interval for the sales if the advertisement levels
for radio and newspaper are each equal to 50 thousand dollars. (v) Answer the same question
as in (iv), but now using Model 3 instead of Model 2. (vi) Provide a clear motivation of why
the prediction interval in (v) is to be preferred over that in (iv), and indicate how the
prediction could still be further improved by adjusting the regression model.
(i)

(ii)

(iii)

(iv)

(v)

(vi)
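
The approximate prediction intervals in (iv) and (v) can be sketched with the rule "prediction plus or minus 2s". The coefficient assignment below is an assumption about how the columns of Table 2.1 line up: Model 2 is taken to contain Radio and Newspaper (13.733 and 17.335), and Model 3 to contain Radio, Newspaper, and Radio*Radio (-2.892, 14.737, and .240). The helper name `approx_interval` is hypothetical.

```python
import math

def approx_interval(prediction, resid_var):
    """Approximate 95% prediction interval: prediction +/- 2 * s."""
    half_width = 2 * math.sqrt(resid_var)
    return prediction - half_width, prediction + half_width

radio, newspaper = 50, 50  # advertisement levels in thousands of dollars

# Model 2 (assumed columns): constant, Radio, Newspaper
pred2 = 148.201 + 13.733 * radio + 17.335 * newspaper
low2, high2 = approx_interval(pred2, 31447)

# Model 3 (assumed columns): constant, Radio, Newspaper, Radio*Radio
pred3 = 418.677 - 2.892 * radio + 14.737 * newspaper + 0.240 * radio**2
low3, high3 = approx_interval(pred3, 24414)

print(f"Model 2: {pred2:.1f}, interval ({low2:.1f}, {high2:.1f})")
print(f"Model 3: {pred3:.1f}, interval ({low3:.1f}, {high3:.1f})")
```

The rule ignores the uncertainty in the estimated coefficients, which is why the result is only approximate.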

Table 2.2: Scatter diagrams, sales vs radio (left) and residuals vs predicted of Model 2 (right)

b. Table 2.2 shows two scatter diagrams, on the left of sales against radio advertisement, and on
the right of the residuals against the predicted values for Model 2. Answer the following
questions: (i) Provide a thorough motivation for the specification of Model 3. (ii) Formulate
the relevant H0 and Ha to test Model 3 against Model 1. (iii) Compute the degrees of freedom
of the F-statistic for this test, and find the related relevant critical value. (iv) Compute the
value of this F-statistic. (v) Draw your conclusion, also in common speech.
(i)

(ii)

(iii)

(iv)

(v)
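
The F-statistic in (iii) and (iv) can be sketched from the residual variances in Table 2.1. The sketch assumes that Model 1 contains only the constant and Radio (2 parameters) and that Model 3 adds Newspaper and Radio*Radio (4 parameters), so that two restrictions are tested. Variable names are illustrative.

```python
n = 22  # number of cities

# Assumed numbers of estimated parameters, constant included
k_restricted, k_full = 2, 4

# Sums of squared residuals recovered from s^2 = SSE / (n - k)
sse_restricted = 77366 * (n - k_restricted)  # Model 1
sse_full = 24414 * (n - k_full)              # Model 3

# F statistic for the q restrictions that reduce Model 3 to Model 1
q = k_full - k_restricted
df2 = n - k_full
f_stat = ((sse_restricted - sse_full) / q) / (sse_full / df2)

print(f"F({q}, {df2}) = {f_stat:.2f}")
# The 5% critical value of F(2, 18) from the F-table is 3.55.
```

The same statistic computed from the rounded R Square values gives a slightly different number (about 22.7) because of rounding in the table.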

Table 2.3: Sales explained from two factors (radio and newspaper, with interaction)
Source              Sum of Squares   df    Mean Square   F     Sig.
Radio               xxx              df1   xxx           xxx   .000
Newspaper           xxx              df2   xxx           xxx   .000
Radio * Newspaper   247312           df3   xxx           ???   ???
Error               54396            df4   xxx
Corrected Total     xxx              21

c. Table 2.3 shows ANOVA results for sales, which are based on the fact that the number of
different values for the advertisement expenditures during the test period is limited to just
four, both for radio and for newspaper. Answer the following questions: (i) Compute each of
the four degrees of freedom denoted by df1-df4 in the table, and give a brief explanation of
your computations. (ii) Compute the degrees of freedom of the F-statistic to test for the
presence of an interaction effect. (iii) Compute the value of this F-statistic. (iv) Use the F-
table to establish that the corresponding P-value lies between 0.05 and 0.10. (v) The
regression Model 4 in Table 2.1 and the ANOVA test in Table 2.3 contain similar explanatory
factors for the mean level of Sales. Discuss the major difference between these two methods
for investigating the effects of Radio, Newspaper, and their interaction on (expected) Sales.
(i)

(ii)

(iii)

(iv)

(v)
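
The degrees of freedom and the interaction F-statistic of Table 2.3 can be sketched as below, using the stated fact that each factor takes four different values. Variable names are illustrative, and the F-table entries in the final comment are the standard values for (9, 6) degrees of freedom.

```python
levels_radio, levels_newspaper = 4, 4
df_total = 21  # from Table 2.3 (22 cities minus 1)

# Degrees of freedom in a two-factor ANOVA with interaction
df1 = levels_radio - 1                  # Radio
df2 = levels_newspaper - 1              # Newspaper
df3 = df1 * df2                         # Radio * Newspaper
df4 = df_total - (df1 + df2 + df3)      # Error

# F statistic for the interaction effect
ms_interaction = 247312 / df3
ms_error = 54396 / df4
f_interaction = ms_interaction / ms_error

print(f"df1..df4 = {df1}, {df2}, {df3}, {df4}")
print(f"F({df3}, {df4}) = {f_interaction:.3f}")
# F-table for (9, 6): the 10% point is 2.96 and the 5% point is 4.10.
```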

Exercise 3

The left diagram in Table 3.1 shows quarterly revenues (in billions of euros) of a large company,
with n = 32 observations for the years 2000-2007. The middle diagram shows the residuals of the
regression model where these revenues are explained in terms of a linear trend, and the right
diagram shows the boxplot of these residuals. Further, Table 3.2 shows the outcome of a
nonparametric test for the residuals that are divided into two groups, those for the fourth quarter
and those for the other three quarters.

Table 3.1: Time plot of revenues (left), and time plot (middle) and boxplot (right) of residuals

Table 3.2: Test outcome for residuals


Quarter   N    Mean Rank   Sum of Ranks   Z        Asymp. Sig. (2-tailed)
Q1-Q3     24   12.50       300            -4.178   .000
Q4        8    28.50       228

a. (i) List the four assumptions of the regression model. (ii) Which of these assumptions are
clearly violated for the linear trend model for the revenues? Provide a thorough motivation for
your answer. (iii) Provide suggestions of how to adjust the model to satisfy the required
assumptions, and motivate your suggestions.
(i)

(ii)

(iii)

b. (i) Which nonparametric test has been applied in Table 3.2? (ii) Provide a motivation for this
test for these residuals. (iii) Check the reported value of the Z-statistic by means of an explicit
computation. (iv) State your conclusion, with a clear motivation.

(i)

(ii)

(iii)

(iv)
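
The explicit check of the Z-statistic in (iii) can be sketched with the normal approximation for a rank-sum statistic, here applied to the rank sum of the fourth-quarter group. The sketch assumes no correction for ties or continuity; variable names are illustrative.

```python
import math

# Group sizes and rank sum from Table 3.2
n_q4, n_other = 8, 24
rank_sum_q4 = 228
n_total = n_q4 + n_other

# Mean and variance of the rank sum of the Q4 group under H0
mean_w = n_q4 * (n_total + 1) / 2
var_w = n_q4 * n_other * (n_total + 1) / 12

z = (rank_sum_q4 - mean_w) / math.sqrt(var_w)
print(f"mean = {mean_w}, variance = {var_w}, Z = {z:.3f}")
# Using the Q1-Q3 rank sum (300) instead gives the same value with a minus sign.
```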

Table 3.3 summarizes the results of three regression models for the revenues. Model 1 is the
linear trend model, and Models 2 and 3 contain also quarterly dummies (denoted by Q1-Q4),
where the reference quarter is quarter 4 in Model 2 and quarter 3 in Model 3.

Table 3.3: Regression models, with Revenues as dependent variable


                   Model 1                     Model 2                     Model 3
Variable           Coeff    t-value  P-value   Coeff    t-value  P-value   Coeff    t-value  P-value
(Constant)         36.647   20.073   .000      45.407   62.892   .000      34.433   49.011   .000
Trend (1-32)       1.601    16.582   .000      1.552    55.215   .000      1.552    55.215   .000
Q1                                             -11.932  -16.271  .000      -.959    -1.313   .200
Q2                                             -8.872   -12.142  .000      2.102    2.883    .008
Q3                                             -10.973  -15.051  .000
Q4                                                                         10.973   15.051   .000
Resid. Var. (s²)   25.44                       2.123                       2.123
R Square           .902                        .993                        .993

c. Provide clear and thorough explanations of the following facts. (i) The residual variance in
Models 2 and 3 is much smaller than that of Model 1. (ii) Models 2 and 3 have the same
residual variance and the same R Square. (iii) The coefficient of Q4 in Model 3 is the
opposite of the coefficient of Q3 in Model 2. (iv) For forecasting purposes, the choice of
Quarter 3 as reference quarter in Model 3 has an advantage over the choice of Model 2.
(i)

(ii)

(iii)

(iv)

Exercise 4

The quantity theory of money states that there is a direct relationship between the quantity of
money in the economy and the aggregate price level. Table 4.1 shows the outcomes of three
models for inflation, based on quarterly data for the USA from 1961 till 2005. Here DP denotes
the quarterly inflation and DM denotes the quarterly change in the quantity of money. Further,
DP(-1) denotes inflation in the previous quarter, with similar definitions for DP(-2), DM(-1), and
DM(-2). The variable P(-1)-M(-1) indicates deviations from equilibrium in the previous quarter.
Finally, DUM8605 is a dummy variable, with value 0 for the period 1961Q4-1985Q4 (97
observations) and value 1 for the period 1986Q1-2005Q4 (80 observations).

Table 4.1: Regression models, with Inflation (DP) as dependent variable


                          Model 1                       Model 2                       Model 3
Variable                  Coeff      t-value  P-value   Coeff      t-value  P-value   Coeff      t-value  P-value
(Constant)                -0.000662  -1.70    0.091     -0.001495  -2.50    0.013     -0.000818  -2.05    0.041
P(-1)-M(-1)               -0.018109  -2.39    0.018     -0.034967  -3.16    0.002     -0.021943  -2.80    0.006
DP(-1)                    0.438860   5.90     0.000     0.489716   5.91     0.000     0.439737   5.95     0.000
DP(-2)                    0.149119   1.99     0.049     0.160258   1.88     0.062     0.145513   1.94     0.054
DM(-1)                                                                                0.013800   0.55     0.584
DM(-2)                                                                                -0.051965  -2.02    0.045
DUM8605                                                 0.001354   1.72     0.087
DUM8605 × (P(-1)-M(-1))                                 0.029599   1.93     0.056
DUM8605 × DP(-1)                                        -0.308493  -1.63    0.106
DUM8605 × DP(-2)                                        -0.012568  -0.07    0.947
No. of Observations (n)   177                           177                           177
SE of Regression (s)      0.003317                      0.003286                      0.003297
R Square (R²)             0.297                         0.326                         0.314

a. Answer the following questions: (i) Which hypothesis can be tested by means of Models 1
and 2 in Table 4.1? (ii) State the distribution of the relevant test statistic, with its degrees of
freedom. (iii) Perform this test, show all relevant computations, and draw a clear conclusion.
(i)

(ii)

(iii)

b. Answer the following questions: (i) Which hypothesis can be tested by means of Models 1
and 3 in Table 4.1? (ii) State the distribution of the relevant test statistic, with its degrees of
freedom. (iii) Perform this test, show all relevant computations, and draw a clear conclusion.
(i)

(ii)

(iii)
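
The F-tests in parts a and b both compare Model 1 with a larger model, so they can be sketched with one helper based on the R Square values in Table 4.1. The column alignment is an assumption: Model 2 is taken to add the four DUM8605 terms (8 coefficients in total) and Model 3 to add DM(-1) and DM(-2) (6 coefficients in total). The helper name `nested_f` is hypothetical.

```python
def nested_f(r2_restricted, r2_full, q, n, k_full):
    """F statistic for q restrictions, computed from R Square values."""
    df2 = n - k_full
    f_stat = ((r2_full - r2_restricted) / q) / ((1 - r2_full) / df2)
    return f_stat, q, df2

n = 177

# Part a: Model 1 versus Model 2 (four added dummy terms, assumed)
f_a, q_a, df2_a = nested_f(0.297, 0.326, q=4, n=n, k_full=8)

# Part b: Model 1 versus Model 3 (two added money-growth lags, assumed)
f_b, q_b, df2_b = nested_f(0.297, 0.314, q=2, n=n, k_full=6)

print(f"Model 1 vs 2: F({q_a}, {df2_a}) = {f_a:.2f}")
print(f"Model 1 vs 3: F({q_b}, {df2_b}) = {f_b:.2f}")
```

Each value is then compared with the F-table entry for the corresponding degrees of freedom; at the 5% level these critical values are roughly 2.4 and 3.0 to 3.1.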

The values of the standardized residuals z = e/s of Model 1 (where z has sample mean 0 and
sample standard deviation 1) are split in five classes. These classes are shown in Table 4.2,
together with the observed counts.

Table 4.2: Five classes for standardized residuals of Model 1


Interval   z < -0.84   -0.84 < z < -0.25   -0.25 < z < 0.25   0.25 < z < 0.84   z > 0.84   Total
Count      32          36                  36                 42                31         177

c. Perform a goodness of fit test for the null hypothesis that the error terms are normally
distributed, by answering the following questions: (i) Compute the five expected counts under
the null hypothesis. (ii) Compute the value of the relevant test statistic. (iii) Use the Chi-
square distribution with 4 degrees of freedom to perform the test, and draw a clear conclusion.
(iv) Comment on the actual number of degrees of freedom of this test.
(i)

(ii)

(iii)

(iv)
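
The goodness of fit computation in (i) and (ii) can be sketched as follows. The class boundaries are approximately the quintiles of the standard normal distribution, so each expected count is close to 177/5 = 35.4; the sketch uses the normal probabilities of the stated boundaries instead of exactly 0.2. Variable names are illustrative.

```python
from statistics import NormalDist

# Class boundaries and observed counts from Table 4.2
cuts = [-0.84, -0.25, 0.25, 0.84]
observed = [32, 36, 36, 42, 31]
n = sum(observed)

# Class probabilities under the standard normal distribution
std_normal = NormalDist()
cdf_values = [0.0] + [std_normal.cdf(c) for c in cuts] + [1.0]
probabilities = [hi - lo for lo, hi in zip(cdf_values, cdf_values[1:])]

# Expected counts and the chi-square statistic
expected = [n * p for p in probabilities]
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print("expected:", [round(e, 1) for e in expected])
print(f"chi-square = {chi_square:.2f}")
# The 5% critical value of the chi-square distribution with 4 df is 9.49.
```

Since the residuals are standardized with an estimated mean and standard deviation, the number of degrees of freedom is arguably smaller than 4; the 5% critical value with 2 degrees of freedom is 5.99.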
