
Final Assignment Applied Econometrics & Introduction to a Statistical Software (6.1), WS 2019/2020
Lecturer: Simona Helmsmüller Due date: 01/26/2020

Final Assignment - WS 19-20 - Applied Econometrics & Introduction to a Statistical Software (6.1)

Start date: 22nd January, 2020


Due: 26th January, 2020, 23:59

This assignment tests your analytical skills in estimating and interpreting associations between
variables. The data you will use was collected as part of the Better Life Index (2017). The
original data can be found here: https://stats.oecd.org/Index.aspx?DataSetCode=BLI. Each variable name
field (row 4 in the Excel file) contains a hyperlink to the OECD website with more information about the
characteristics of the data.

Each student is assigned her/his own dataset! Please check in the Excel sheet Name_Dataset which dataset
is assigned to you (indicated by the suffix _NN, where NN corresponds to your initials) and use only this
dataset for your calculations. Note that I amended each dataset slightly, so please use only the Excel
dataset provided by me for your calculations and work on your own solutions. While you may exchange ideas
with your fellow students, exact copying of solutions will be counted as attempted deception and will lead to
failing this class.

To carry out the different tasks of this assignment (specified on the next pages), use all country
entries, i.e. the data from Australia up to United States, to run simple and multiple regression analyses. The
data analysis can be carried out with the Excel Data Analysis tool (Data > Data Analysis > Regression).
Make sure you interpret your results in the space provided on this sheet (type your text into the answer
boxes). For example, in linear regressions, an explicit interpretation of the model coefficients usually takes the
form of: “A one unit/percent increase in … is associated with a ... unit/percent change in …”. For some
variables, however, it may make more sense to report the effect of a 10,000 USD change instead of a unit change.
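
The rescaling mentioned above is plain multiplication of the per-unit coefficient. The following minimal Python sketch illustrates it; the slope value and the function name rescale_coefficient are hypothetical and are not taken from any assigned dataset.

```python
def rescale_coefficient(beta_per_unit, units):
    """Change in Y associated with a change of `units` in X, given the per-unit slope."""
    return beta_per_unit * units


# Hypothetical slope (an assumption for illustration only):
# years of life expectancy per 1 USD of personal earnings.
beta_per_usd = 0.0002

effect_per_10k = rescale_coefficient(beta_per_usd, 10_000)
print(f"A 10,000 USD increase is associated with a {effect_per_10k:.1f}-year change.")
```

In Excel the same rescaling amounts to multiplying the reported coefficient by 10,000 in a separate cell.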

In total there are four tasks in this assignment, each weighted equally with 30 points. You need 50 points
to pass the assignment. Although you can in principle achieve 120 points, I will grade your work as if only
100 points were possible. You need to submit your Excel file in addition to this document in order to show
your work. Answers without workings will score no marks. If your calculation is incorrect but your
interpretation is right, or vice versa, you will be awarded partial points. To submit your assignment, please
upload this document and the Excel file to the LEA exam folder for module 6.1 by January 26th,
23:59 at the latest.

Good Luck!

FILL IN THE FOLLOWING:

Name:
Matriculation number:

When saving your answers and Excel sheet, please also include your last name in the file names!


Task 1: Compare the average life expectancy between high-earning and low-earning countries.
(Chapter 3)

i. First, you need to divide the countries into two groups, one with high personal earnings and one
with low personal earnings. For this, generate a dummy variable indicating whether personal
earnings in the country are above 40,000 USD. That is, generate a variable (named PerEarDum
in the following) which takes the value 1 if personal earnings are above 40,000 USD in that
country, and the value 0 if personal earnings are below 40,000 USD in that country. How many
countries fall into each group?
Hint: Use the IF function.

ii. Then calculate the mean life expectancy for the two groups of countries, i.e.
E(LifeExpectancy | PerEarDum=0) and E(LifeExpectancy | PerEarDum=1).
What did you calculate?

iii. Calculate the standard deviation for life expectancy in the two groups. In general, what does the
standard deviation tell us about the distribution of a variable?
Hint: Use the STDEV.S command combined with the IF function in Excel. Remember to enter this formula with
CTRL+SHIFT+ENTER (instead of only pressing ENTER).

iv. Calculate the difference between these means and the associated standard error. Use the calculated
values to test whether the difference between the two means is significant. Interpret your result.
Hint: Read Chapter 3.4 in Stock & Watson and in particular the box on page 132 “The Gender Gap of Earnings of
College Graduates in the United States”.

v. What would be an alternative method to test for a significant difference between the two means?
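
For readers who want to cross-check their Excel workings, the steps in (i) to (iv) can be sketched in Python as follows. This is a minimal sketch, not a model solution: the six data points are hypothetical toy values, the names toy_data and compare_groups are invented for illustration, and treating a value of exactly 40,000 USD as group 0 is an assumption, since the task only speaks of "above" and "below". The standard error uses the unpooled formula for a difference in means.

```python
import math
import statistics

# Hypothetical toy values (NOT from the assignment dataset):
# (personal earnings in USD, life expectancy in years)
toy_data = [
    (52000, 82.0),
    (48000, 81.0),
    (45000, 83.0),
    (30000, 78.0),
    (25000, 77.0),
    (22000, 79.0),
]
THRESHOLD = 40000


def compare_groups(data, threshold):
    """Split countries by an earnings dummy and compare mean life expectancy."""
    # Dummy = 1 if earnings are above the threshold, else 0.
    # Assumption: a value exactly equal to the threshold goes to group 0.
    high = [life for earn, life in data if earn > threshold]
    low = [life for earn, life in data if earn <= threshold]

    mean_high, mean_low = statistics.mean(high), statistics.mean(low)
    # statistics.stdev is the sample standard deviation (like STDEV.S in Excel).
    sd_high, sd_low = statistics.stdev(high), statistics.stdev(low)

    diff = mean_high - mean_low
    # Standard error of the difference in means (unpooled variances).
    se = math.sqrt(sd_high ** 2 / len(high) + sd_low ** 2 / len(low))
    return {
        "n_high": len(high),
        "n_low": len(low),
        "mean_high": mean_high,
        "mean_low": mean_low,
        "sd_high": sd_high,
        "sd_low": sd_low,
        "diff": diff,
        "se": se,
        "t": diff / se,
    }


result = compare_groups(toy_data, THRESHOLD)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The t-statistic is then compared with the critical value for the chosen significance level, for example 1.96 for a two-sided test at the 5% level under the large-sample normal approximation.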


Answers Question 1:


Task 2: Use simple linear regression to estimate the relationship between two variables.
(Chapters 4 and 5)

i. What are the three key assumptions for the Ordinary-Least-Squares estimation?

ii. Run the following two linear regressions and interpret your findings: Is there evidence of an
association between years of education and personal earnings? What about between personal
earnings and life expectancy? Be as explicit as you can.

No.  Y-Variable          X-Variable
1    Personal Earnings   Years in Education
2    Life Expectancy     Personal Earnings
Hint: Also comment on the statistical significance of your results.

iii. Are there any outliers in any of the variables used in the above regressions? If yes, what does that
imply for your results? Re-run the respective linear regression excluding the outlying observation
and contrast your findings with those in (ii).
Hint: Check graphically for outliers.
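
As a cross-check on the Excel regression output, a simple regression with one regressor can be computed by hand. The sketch below is a minimal illustration, not a model solution: the data are hypothetical toy values, the function name ols_simple is invented, and the standard error is the homoskedasticity-only one, which is assumed here to correspond to what the Excel Regression tool reports.

```python
import math


def ols_simple(x, y):
    """OLS of y on an intercept and a single regressor x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar

    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    ssr = sum(u ** 2 for u in residuals)       # sum of squared residuals
    sst = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares

    # Homoskedasticity-only standard error of the slope (n - 2 degrees of freedom).
    se_slope = math.sqrt(ssr / (n - 2) / s_xx)
    return {
        "intercept": intercept,
        "slope": slope,
        "se_slope": se_slope,
        "t_slope": slope / se_slope,
        "r2": 1 - ssr / sst,
    }


# Hypothetical toy values (NOT from the assignment dataset):
years_in_education = [14, 15, 16, 17, 18]
earnings_thousand_usd = [25, 31, 34, 41, 44]

fit = ols_simple(years_in_education, earnings_thousand_usd)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

To see how much a single outlying observation matters, the same function can be called again on the lists with that observation removed, and the two sets of estimates compared.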


Answers Question 2:


Task 3: Test different functional forms of the relationship between two variables (Chapter 8)

i. Using lin-log and log-log transformations, test for a non-linear relationship between the following
variables:

No.  Y-Variable               X-Variable
1    Life Expectancy          Log of Personal Earnings
2    Log of Life Expectancy   Log of Personal Earnings

Hint: To log-transform a variable type “=ln(variable)”.

ii. How do these models compare to the linear specification estimated in Task 2 in terms of model
fit? As a researcher, which model would you choose?

iii. Interpret the coefficients estimated above. Be as explicit as you can. Which model do you
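
The mechanics of the two transformations in this task can be sketched in Python as follows. This is a minimal sketch on hypothetical toy values; the helper name slope is invented for illustration. It relies on the standard reading of the two specifications: in a lin-log model a 1% increase in X is associated with a change of roughly 0.01 times the coefficient in Y, and in a log-log model the coefficient is an elasticity.

```python
import math


def slope(x, y):
    """OLS slope from a regression of y on an intercept and x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


# Hypothetical toy values (NOT from the assignment dataset):
earnings = [20000, 30000, 40000, 50000, 60000]   # USD
life_exp = [76.0, 78.5, 80.0, 81.0, 82.0]        # years

# Natural logs, the equivalent of =ln(variable) in Excel.
log_earnings = [math.log(e) for e in earnings]
log_life_exp = [math.log(v) for v in life_exp]

# Regression 1 (lin-log): Y on ln(X).
beta_lin_log = slope(log_earnings, life_exp)
# A 1% increase in X is associated with about 0.01 * beta units of Y.
change_per_1pct = 0.01 * beta_lin_log

# Regression 2 (log-log): ln(Y) on ln(X); the slope is an elasticity.
beta_log_log = slope(log_earnings, log_life_exp)

print(f"lin-log slope: {beta_lin_log:.4f} (about {change_per_1pct:.4f} years per 1% of X)")
print(f"log-log slope: {beta_log_log:.4f} (about that many percent of Y per 1% of X)")
```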


Answers Question 3:


Task 4: Use multivariate regression to control for another variable. (Chapters 6 and 7)

For this task, choose whether to continue with the lin-log or the log-log specification (Regression 1 or 2 from
Task 3).

i. Run the same regression, adding a control variable for the country’s safety (Homicide Rate).
Interpret the coefficient on the homicide rate; be as explicit as you can.
Hint: To include multiple regressors in your calculations, copy and paste the column of the additional variable next to the
already included X-variable and enlarge the “Input X Range”.

ii. Why does the coefficient on log personal earnings change its value in comparison to the model
estimated in Task 3? Would you consider the Homicide Rate an omitted variable or a control
variable?
Hint: Read Chapter 7.5 for a distinction between omitted variables and control variables.

iii. Without estimating it, how would the R² of the regression in Task 3 compare to that of the
regression in Task 4?
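
A regression with two regressors can also be computed by hand, which may help when checking the Excel output for this task. The following is a minimal sketch on hypothetical toy values: the helper names demeaned_cross and ols_two_regressors are invented for illustration, and the closed-form solution of the normal equations shown here only covers the case of exactly two regressors plus an intercept.

```python
def demeaned_cross(a, b):
    """Sum of cross-products of deviations from the respective means."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


def ols_two_regressors(x1, x2, y):
    """OLS of y on an intercept, x1 and x2 via the 2x2 normal equations."""
    s11 = demeaned_cross(x1, x1)
    s22 = demeaned_cross(x2, x2)
    s12 = demeaned_cross(x1, x2)
    s1y = demeaned_cross(x1, y)
    s2y = demeaned_cross(x2, y)

    det = s11 * s22 - s12 ** 2  # zero if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det

    n = len(y)
    b0 = sum(y) / n - b1 * sum(x1) / n - b2 * sum(x2) / n

    ssr = sum((yi - b0 - b1 * a - b2 * b) ** 2 for a, b, yi in zip(x1, x2, y))
    sst = demeaned_cross(y, y)
    return {"b0": b0, "b1": b1, "b2": b2, "r2": 1 - ssr / sst}


# Hypothetical toy values (NOT from the assignment dataset):
log_earnings = [10.0, 10.3, 10.6, 10.8, 11.0, 11.2]
homicide_rate = [5.0, 1.2, 3.0, 0.8, 1.5, 0.5]
life_exp = [75.0, 79.0, 78.5, 81.0, 81.5, 82.5]

multi = ols_two_regressors(log_earnings, homicide_rate, life_exp)

# R-squared of the regression on log earnings alone, for comparison.
r2_single = demeaned_cross(log_earnings, life_exp) ** 2 / (
    demeaned_cross(log_earnings, log_earnings) * demeaned_cross(life_exp, life_exp)
)

print({name: round(value, 4) for name, value in multi.items()})
print(f"R-squared with log earnings only: {r2_single:.4f}")
```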


Answers Question 4:
