Computer Tools Course 2006

Stata Exercise

The dataset master.dta contains 240 students who have applied to a master's course.
It provides the following variables:
id        personal identification number
male      dummy variable, = 1 if the student is male
age       age of the student
language  numerical variable: 1 Asian, 2 Baltic, 3 English, 4 Hispanic, 5 Roman Language, 6 Scandinavian/German/Dutch, 7 Other
grade     grade of the undergraduate studies
toefl     score in the TOEFL exam
prof      dummy variable, = 1 if the professor who wrote the reference letter is known
master    dummy variable, = 1 if the student has been accepted to the master
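
The numeric coding of language listed above can be attached to the variable as value labels, so that tables and graphs show the language names instead of the codes 1 to 7. The following is only a sketch: it assumes master.dta is already in memory, and the label name langlbl is a hypothetical choice.

```stata
* Assumes master.dta is loaded; "langlbl" is a hypothetical label name.
label define langlbl 1 "Asian" 2 "Baltic" 3 "English" 4 "Hispanic" ///
    5 "Roman Language" 6 "Scandinavian/German/Dutch" 7 "Other"
label values language langlbl

* Frequencies now appear with the language names
tabulate language
```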

Please write a do-file for the following tasks and save your results in a log-file.
1. Read the dataset into Stata
2. Label the variables
3. Give summary statistics for all variables, using the appropriate commands for numeric
and categorical variables.
4. Generate dummies for the language and create a graph showing the distribution of
the different languages among the students.
5. Generate dummies that indicate whether a student is above the average age, the average
TOEFL score and the average grade in the undergraduate studies (use a loop).
6. Estimate how much the different characteristics of a student’s application explain
whether the student has been accepted to the master.
a. Include the variables male, age, grade, toefl and the dummies for the language.
b. Additionally include the information about the professor of the reference letter.
c. Now use the deviations from the mean for the variables age, grade and toefl.
7. Test whether the professor of the reference letter has a significant impact on the
acceptance to the master.
8. Compare the predicted acceptance to the master from the three different
estimations with the true variable master.
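
As a starting point, the do-file for the tasks above might be framed as follows. This is a sketch under stated assumptions, not a model solution: the log name exercise.log and the dummy names above_* and lang* are hypothetical, master.dta is assumed to be in the working directory, and probit is an assumption, since the exercise does not name an estimator.

```stata
* Hypothetical log name; closes any log left open by an earlier run.
capture log close
log using exercise.log, replace text

* Task 1: master.dta is assumed to be in the working directory
use master.dta, clear

* Task 4 pattern: one dummy per language category (lang1, lang2, ...)
tabulate language, generate(lang)

* Task 5 pattern: dummy = 1 if the value lies above the sample mean
foreach v of varlist age toefl grade {
    quietly summarize `v', meanonly
    generate byte above_`v' = (`v' > r(mean)) if !missing(`v')
}

* Tasks 6b and 7 pattern: probit is an assumption, not prescribed by the exercise.
* lang2-lang7 assumes all seven language codes occur in the data; lang1 is the base.
probit master male age grade toefl lang2-lang7 prof
test prof

log close
```

The condition if !missing(`v') keeps the dummy missing where the underlying variable is missing, because Stata treats missing values as larger than any number.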