
ECON6067 Computation and Analysis of Economic Data

Stata (I) (II) Problem Set


Karen Xiaoting Mai

1. Use the Penn World Table 10.0 for the following exercises. Use the series “rgdpo” for real
GDP. Focus on the years 1980-2019. Drop the countries for which rgdpo is missing in 1980.

(a) Plot the human capital index over time for the United States.
(b) Generate real per capita GDP, rgdpo_pop. Label it properly.
(c) Find the mean, minimum, and maximum of rgdpo_pop for the years 1980 and 2019, separately.
(d) Based on rgdpo_pop in 1980, divide countries into two groups: those above the median
and those at or below the median. Generate a new variable highincome equal to 1
for the “higher income” group and 0 for the “lower income” group.
(e) Calculate for each country the average annual growth rate of real per capita GDP over
1980-2019. Generate separate scatter plots for the “higher income” and “lower income”
groups of this average growth rate against real per capita GDP in 1980.
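
One possible way to set up the sample and work through parts (a)-(e) is sketched below. It assumes the PWT 10.0 Stata file has been saved as pwt100.dta in the working directory and relies on the PWT variable names countrycode, year, hc, rgdpo and pop. The helper names miss1980, y1980, med1980 and growth are illustrative only. "Average annual growth rate" is read here as the log difference between the last and first sample years divided by the number of years between them, which is one common convention and not necessarily the intended one.

```stata
* Assumption: pwt100.dta (PWT 10.0, Stata format) is in the working directory
use pwt100.dta, clear
keep if inrange(year, 1980, 2019)

* Flag countries whose rgdpo is missing in 1980 and drop all of their years
bysort countrycode: egen byte miss1980 = max(year == 1980 & missing(rgdpo))
drop if miss1980 == 1
drop miss1980

* (a) Human capital index over time for the United States
twoway line hc year if countrycode == "USA"

* (b) Real GDP per capita (rgdpo and pop are both recorded in millions)
generate rgdpo_pop = rgdpo / pop
label variable rgdpo_pop "Real GDP per capita (output side)"

* (c) Mean, min and max in 1980 and 2019
tabstat rgdpo_pop if inlist(year, 1980, 2019), by(year) statistics(mean min max)

* (d) Indicator for being above the 1980 median
summarize rgdpo_pop if year == 1980, detail
scalar med1980 = r(p50)
bysort countrycode (year): generate y1980 = rgdpo_pop[1]
generate byte highincome = y1980 > med1980 if !missing(y1980)

* (e) Average annual log growth, then one scatter panel per income group
bysort countrycode (year): generate growth = ///
    (ln(rgdpo_pop[_N]) - ln(rgdpo_pop[1])) / (year[_N] - year[1])
twoway scatter growth y1980 if year == 2019, by(highincome)
```

Countries exactly at the median fall into the “lower income” group under this coding, and the restriction to year == 2019 in the last line simply keeps one observation per country in the plot.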

Data:
https://www.rug.nl/ggdc/productivity/pwt/
Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer (2015), “The Next Generation
of the Penn World Table,” American Economic Review, 105(10), 3150-3182, available for
download at http://www.ggdc.net/pwt.

2. Use the dataset “bpwide” (fictional blood pressure) installed with Stata for the following
exercises.

(a) What is the difference in the means of bp_before for those aged 30-59 and those aged
60+? Is the difference statistically significant at the 5% significance level?
(b) What is the difference in the means of bp_before and bp_after for female patients aged
46-59? Is the difference statistically significant?
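
A minimal sketch of the two comparisons follows. It assumes that agegrp is coded 1 = 30-45, 2 = 46-59, 3 = 60+ and that sex is coded 1 = Female; the codebook line is there so these codings can be checked before the tests are trusted. The variable age60plus is a hypothetical helper. Part (a) is treated as a two-sample comparison across patients and part (b) as a paired comparison within patients.

```stata
sysuse bpwide, clear

* Check the assumed value labels for sex and agegrp
codebook sex agegrp

* (a) Two-sample t test: ages 30-59 (agegrp 1 and 2) versus 60+ (agegrp 3)
generate byte age60plus = (agegrp == 3)
ttest bp_before, by(age60plus)

* (b) Paired t test of before versus after, female patients aged 46-59
ttest bp_before == bp_after if sex == 1 & agegrp == 2
```

In both cases the two-sided p-value reported by ttest can be compared with 0.05 to judge significance at the 5% level.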

3. Use the dataset “census” installed with Stata for the following exercises.

(a) Regress number of deaths on median age.
(b) Regress number of deaths on median age, controlling for region effects.
(c) Re-run the regression using pop as the weight. Test whether the coefficient on medage
equals 12000.
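
The three regressions could be run along the following lines. Two choices here are assumptions and not part of the exercise: the weighted regression in (c) is taken to be the specification from (b), and pop is applied as an analytic weight (aweight); another weight type may be intended.

```stata
sysuse census, clear

* (a) Deaths on median age
regress death medage

* (b) Add region effects through factor-variable notation
regress death medage i.region

* (c) Same specification, weighted by population (assumed aweight)
regress death medage i.region [aweight = pop]

* Wald test of H0: coefficient on medage equals 12000
test medage = 12000
```

The test command refers to the most recently estimated model, so it has to follow the weighted regression directly.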
