
An Introduction to Event History Analysis

Dawn L. Teele

April 2008


This paper is meant to be a guide to using event history modeling. Topics include the formulation of the hazard rate and the survival function, parameterization of the model, and common problems such as right and left censoring and ties. The appendices contain commands for setting up event history models in Stata.

  • 1 Event History Modeling

Event history models, also known as survival-time analysis, duration models, or hazard models, take their origin in the natural sciences. These models estimate what is known as a hazard rate: the probability of an event occurring given that it has not already occurred, in other words the conditional probability of an event. Embedded in the hazard is the notion of a failure rate and a survival function. In the natural sciences the hazard was used to understand how medical treatments affected the probability of death of patients with terminal illnesses (the failure rate) given a distribution of longevity (the survival function). In the social sciences we might also be concerned with the conditional probability of an event occurring, and with some data (especially those that have variation in observable characteristics over time) an event history model might be appropriate.

  • 1.1 Why event history?

When the estimate of interest is the conditional probability of an event occurring, a linear regression specification will not suffice. If left in a linear model, the explanatory variables will yield coefficient estimates that are difficult to compute and interpret (Allison 1984), and little light will be shed on the conditions that lead to an "event" occurring.

There are two statistical techniques that can be used to understand the effects of the covariates on the response variable: the logit or log-odds of the event variable can be used in an OLS regression, 1 or survival time regressions can be performed. Logit regression has been a traditional method for analyzing events within social science. However, if the length of time leading up to an event is also of interest, then duration models hold a unique advantage over traditional logit models. For example, some subjects will have time-varying observable characteristics that need to be controlled for, such as GDP per capita, trade, and even an indicator of how democratic a country is. Event history models perform much better than traditional OLS regression with these types of independent variables (Allison 1984). 2

Issues with event history data such as truncation, late entry, and censoring are also better dealt with in survival analysis packages. In many situations, researchers won't have data for subjects prior to when their sample begins (essentially an incomplete life history). This is called "forward censoring" and can be handled quite well by survival time packages. Another type of censoring, "right censoring", occurs when a subject has not experienced the event by the time the sample ends. Mathematically, right censoring does not pose problems for estimating a hazard because subjects for which an event has not happened contribute to the survival function S(t) but not to the unconditional failure function f(t) (Box-Steffensmeier and Jones, 2004).

To add to the laundry list of concerns that event history packages handle well, some subjects have what is known as "delayed risk", meaning that they are not part of the relevant sample at the beginning of the history. Concretely, if we want to know the probability of a relapse into civil war, and the duration of time between those events, countries that have not yet had one civil war at the beginning, but that have at least one by the end of the history, have delayed risk.

Finally, "failure ties" occur when more than one subject experiences an event in the same time period. Ties are a statistical problem because it is impossible to determine which subject failed first, so we cannot specify the conditions that were instrumental at the time of failure. Ties are less of an issue with discrete than with continuous data, and filters can be applied to deal with these issues (Box-Steffensmeier and Jones, 2004).

  • 1.2 An example to work with

In the following discussion I will make reference to work that I conducted for my senior thesis to provide a practical example of the way in which event history analysis can be used (Teele 2006). The basic question I asked is, "what causes countries to ratify the child labor convention with the International Labor Organization?", taking ratification as the "event" to be studied. Bear in mind

1 Formally, the logit or "log-odds" transformation is constructed by: $\text{logit\_Depvar} = \ln\left(\frac{\text{Depvar}}{1 - \text{Depvar}}\right)$. The transformation is necessary to have standard errors that are normally distributed.

2 If all of the covariates used in the regression are constant over a country's event history, cross-sectional analysis of the dependent variable in the year of ratification would suffice.


Figure 1: Ratification of the ILO Child Labor Convention by Region, 1976-2004

that I use this example here to demonstrate the steps that one should take to implement an event history model, and am not purporting to have uncovered unfalsifiable truths in this demonstration. The sample contains 150 countries from 1976-2004; the periods are in years (discrete time), and the independent variables will be both time-varying and time-invariant. To give you a sense of the data, Figure 1 shows the ratification patterns by region. It is clear that there was a surge in ratification during the later part of the sample (1997-2003), and so there must be a story to tell about the conditions leading up to this time period. For this reason the event history methodology seems, conceptually, to fit the data.

  • 1.3 Duration variable (the dependent variable)

Event history analysis uses the duration of time (either discrete or continuous) before an event occurred as the dependent variable (though not the dependent variable in the conventional sense). In our example, Convention 138 was introduced to the ILO assembly in 1976, so the event history begins in this year and ends in 2004, the last year that data for this sample are available. However, there are 47 countries that were not members of the ILO in 1976, meaning that they have "delayed risk" in the sample. Practically speaking, a subject cannot experience an event if it is not part of the relevant sample (we cannot understand the effect that a blood pressure drug has on increasing the time between heart attacks if a subject has not in fact had a heart attack), and so we must be careful when constructing the dataset to deal with these cases


(read: get the Stata commands correct). 3 In this study the duration variable will not begin for each country until it is a member. Thus for all countries the duration variable begins either in 1976 or in the year that a country joined the ILO, and ends in the year that it ratifies C138. The late entries are classified as "left-censored". On the opposite end of the sample period we have many countries that have not ratified the ILO convention. In event history models these cases are classified as "right censored".

  • 1.4 Estimating the hazard rate

Event history models estimate a "hazard rate", or the conditional probability that an event will occur given that it hasn't already. The basic formulation of the hazard rate, $h(t)$, is similar to a cumulative distribution function of the probability of an event. The hazard rate is estimated by maximum likelihood. Mathematically, the hazard rate has two components: the probability of failure, $f(t)$, and the survival function, $S(t)$. As I mentioned earlier, the hazard rate originally hails from techniques used in the natural sciences to predict the probability that a subject will die, hence "survival" and "failure" functions (Allison 1984). Following Box-Steffensmeier and Jones (2004), let $T$ be a discrete random variable indicating the timing of an event. In our example the probability that a country will ratify C138 at time $t_j$ is found in the failure function, where $j$ represents the year in which a country ratifies C138:

$$f(t_j) = \Pr(T = t_j) \tag{1}$$


Because ratification (or "failure") can only occur once for each country, this is a "single spell" analysis. However, there may be more than one failure at a time if countries ratify in the same year; the Breslow method of estimation for "failure ties" is accounted for in Stata. The second component of the hazard, the survival function, is defined as:

$$S(t_j) = \Pr(T \geq t_j) = \sum_{i \geq j} f(t_i) \tag{2}$$



The basic hazard rate for discrete time is simply the ratio of the probability of failure to the probability of survival, or the conditional probability of failure given that failure has not already occurred. This relationship can be expressed in two ways:

$$h(t_j) = \frac{f(t_j)}{S(t_j)} \tag{3}$$

$$h(t_j) = \Pr(T = t_j \mid T \geq t_j) \tag{4}$$

3 I have run across some literature (specifically, Boockmann 2001) in which authors disregard the observations that are not in the entire sample. This seems to be a waste, both because it lowers the n and because it devalues any general conclusions that may be inferred from the results.
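To make the relationship between the failure function, the survival function and the hazard concrete, the following is a minimal Python sketch. The ratification counts are invented for illustration (they are not the ILO data), and exact fractions are used so that the arithmetic is easy to follow.

```python
from fractions import Fraction

# Hypothetical number of countries ratifying in each of four periods
# (every country in this toy sample ratifies eventually).
ratifications = [1, 2, 3, 4]
total = sum(ratifications)

# f(t_j) = Pr(T = t_j): share of all countries failing in period j
f = [Fraction(d, total) for d in ratifications]

# S(t_j) = Pr(T >= t_j): sum of f over period j and all later periods
S = [sum(f[j:], Fraction(0)) for j in range(len(f))]

# h(t_j) = f(t_j) / S(t_j): failure probability among those still at risk
h = [f_j / S_j for f_j, S_j in zip(f, S)]

for j, (f_j, S_j, h_j) in enumerate(zip(f, S, h), start=1):
    print(f"t_{j}: f = {f_j}, S = {S_j}, h = {h_j}")
```

In this toy sample the hazard rises from 1/10 to 1 even though the failure probabilities only rise from 1/10 to 4/10, because the pool of countries still at risk shrinks each period.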


Equation 4 gives the rate at which countries ratify C138 conditional on their survival until $t_j$. The hazard can be augmented to include the effect of a vector of time-varying explanatory variables $x_{ij}$:

$$h(t_j \mid x_{ij}) = \frac{f(t_j \mid x_{ij})}{S(t_j \mid x_{ij})} \tag{5}$$



In this specification it is assumed that the only things that affect the hazard over time are the elements of $x_{ij}$, the vector of time-varying covariates, but qualitative variables can be included in the model just as easily. Implicit in the hazard estimate is what is known as the "baseline hazard", $h_0(t)$, which is a parameter representing the distribution of the data with respect to time. The idea is that the hazard is a function of $h_0(t)$, an intercept, and a vector of explanatory variables, $x_{ij}$.

The effect of the baseline $h_0(t)$ on the hazard ratio depends on the restrictions that are placed on this parameter. Using a Cox "proportional hazards" specification is a way to allow the hazard to be estimated without placing any parametric assumptions on the baseline, meaning that the regression remains agnostic as to how time influences the probability of the event. Proportional hazards models, especially the Cox model, have become increasingly popular in the social science literature due to their flexibility. 4 However, if the distribution is known (or suspected), a functional form for $h_0(t)$ can, and should, be specified.

So long as the hazard is a multiplicative function of the covariates, the model is classified as proportional hazards, and most parametric models can be written in this notation. There are substantive differences in the estimates when other distributions are assumed, and alternatives to the Cox model include the exponential, Gompertz and Weibull distributions.

  • 1.5 Survivor and Hazard Functions

Two preliminary steps in event history analysis entail looking at the components of the hazard function for different sub-groups, and testing to see whether the differences are statistically significant. Given that the division between countries in the OECD and those that are not members generally corresponds to the level of development, I will compare these two sub-groups, but others (such as region of the world) could just as easily be chosen. The cumulative hazard estimates for OECD and non-OECD countries can be seen in Figure 2. No parametric assumptions were placed on the estimates of

4 Estimating the hazard rate was first done within a conditional logit model, and is attributed to Cox (1972). Over time, the "Cox Proportional Hazard" model has become more prevalent in empirical social science. For a discussion of the benefits of the proportional hazards assumption as opposed to other functional forms of the hazard model see Box-Steffensmeier and Jones (2004) and Yamaguchi (1991).


Figure 2: Cumulative Hazard Estimates for OECD and non-OECD Countries

the cumulative hazard functions, and at first glance the OECD countries appear to have a higher hazard, based on the higher intercept and steeper function toward the end of the sample.

Another common way to look at sub-groups within the event history panel is to calculate a Kaplan-Meier estimate of the survivor function, and then graph these results for different qualitative sub-groups. The Kaplan-Meier estimator also has no parametric assumptions and is given by:

$$\hat{S}(t) = \prod_{j \mid t_j \leq t} \frac{n_j - r_j}{n_j} \tag{6}$$


where $n_j$ represents the number of countries that are at risk of ratification at time $t_j$, and $r_j$ represents the number of countries that ratify C138 in period $t_j$. In other words, the Kaplan-Meier estimate is the product of the proportions of at-risk countries that survive past each period. This estimator is non-parametric in that it doesn't place any restriction on the shape that the survivor function takes. The Kaplan-Meier survival estimates, often referred to as the "Cumulative Survival" estimates, for the ratification of C138 can be seen in Figure 3.

Figure 3: Kaplan-Meier Survival Estimates for OECD and non-OECD Countries

The probability that countries within each group, here OECD and non-OECD countries, will make it to the next year without ratifying C138 is shown on the y-axis, whereas the x-axis shows the time that has passed since C138 was introduced in 1976. Figure 3 displays survival functions that look almost identical in shape (as opposed to the cumulative hazard graph), and it is clear that for both OECD and non-OECD countries the survival function is monotonically decreasing over the sample. Though the intercept for OECD countries is visibly lower than the intercept for non-OECD countries, a statistical test (the "log rank" test for the equality of survivor functions) demonstrates that the difference is not statistically significant.
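For readers who want to see what a two-group log-rank test computes, here is a minimal Python sketch of the standard statistic: observed minus expected events in one group, squared, divided by the hypergeometric variance, and compared against a chi-squared distribution with one degree of freedom. The per-period counts are invented and the function `log_rank` is a hypothetical helper; the figures in this paper come from Stata's `sts test`, not from this code.

```python
import math

def log_rank(periods):
    """periods: list of (n1, d1, n2, d2), the at-risk and event counts in
    group 1 and group 2 at each distinct event time."""
    observed1 = expected1 = variance = 0.0
    for n1, d1, n2, d2 in periods:
        n, d = n1 + n2, d1 + d2
        if n < 2 or d == 0:
            continue  # such periods carry no information about group differences
        observed1 += d1
        expected1 += n1 * d / n
        variance += n1 * n2 * d * (n - d) / (n * n * (n - 1))
    chi2 = (observed1 - expected1) ** 2 / variance
    # Upper tail of a chi-squared distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts for three event times
periods = [(5, 1, 5, 0), (4, 1, 5, 1), (3, 0, 4, 2)]
chi2, p_value = log_rank(periods)
print(f"chi2(1) = {chi2:.4f}, Pr>chi2 = {p_value:.4f}")
```

With these toy counts the statistic is small and the p-value large, which is the pattern one would expect when two survivor functions track each other closely.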

Taken together, the Cumulative Hazard and Kaplan-Meier graphs illustrate that though the hazard functions display some variation across groups, the survivor functions move very closely to one another. Solely looking at the cumulative hazard functions (or not looking at the log-rank tests) could lead to a misspecification of the model. Appendix A shows the output for the log-rank test in the scenario above, and lists a few other statistical tests that can be used to test the null hypothesis that the survivor functions for two (or more) sub-groups are the same.

In the literature on ILO convention ratification, the Cox proportional hazards model is the predominant specification, although little rationale other than its flexibility is given (see Boockmann 2001, Chau and Kanbur 2001, and Abu Sharkh 2002). The Cox model may have advantages when the shape of the baseline hazard is unknown; however, for these data the survivor functions are decreasing monotonically over time, which leads to a suspicion that the risk of ratification as represented by the hazard is increasing over time.

  • 1.6 The Weibull Regression Model

A parametric regression model that provides efficient coefficient estimates for monotonic functions is the Weibull specification (Allison 1984, Box-Steffensmeier and Jones 2004). The baseline hazard function that the Weibull estimates is given by:

$$h_0(t) = p t^{p-1} \exp(a) \tag{7}$$


where $p$ is a shape parameter that remains constant, and $a$ is a scale parameter that is estimated by the covariates. When the hazard is monotonically increasing with respect to time, $p > 1$. Given a set of covariates $x_{ij}$ the Weibull equation is:

$$h(t_j \mid x_{ij}) = h_0(t)\exp(x_{ij}\beta_x) = p t^{p-1}\exp(\beta_0 + x_{ij}\beta_x) \tag{8}$$


As can be seen from Equation 8, the covariates that influence the hazard do so as a multiple of the baseline hazard. For this reason the Weibull still falls into the category of a proportional hazards model, but is differentiated from the Cox model based on the ancillary parameters $a$ and $p$. The fitted Weibull model estimates $a$ (the intercept $\beta_0$ of Equation 8), $p$ and $\beta_x$ (Cleves, Gould & Gutierrez 2002).
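The two properties claimed for the Weibull, a hazard that rises with time when $p > 1$ and covariates that shift the hazard by a constant multiple, can be checked numerically. This is a minimal Python sketch; the parameter values (`p = 1.5`, `beta0 = -7.0`, and a linear index of 0.5) are assumptions chosen for illustration and are not estimates from this paper.

```python
import math

def weibull_hazard(t, p, beta0, xb=0.0):
    """h(t | x) = p * t^(p-1) * exp(beta0 + x*beta), as in the Weibull equation."""
    return p * t ** (p - 1) * math.exp(beta0 + xb)

p, beta0 = 1.5, -7.0   # hypothetical shape parameter and intercept
xb = 0.5               # hypothetical value of the linear index x*beta

for t in (1, 10, 20):
    base = weibull_hazard(t, p, beta0)
    shifted = weibull_hazard(t, p, beta0, xb)
    # The ratio shifted / base equals exp(xb) at every t: proportional hazards
    print(f"t = {t:2d}: baseline = {base:.5f}, with covariates = {shifted:.5f}, "
          f"ratio = {shifted / base:.4f}")
```

Both hazards grow with t because p exceeds one, while their ratio stays fixed at exp(0.5). That fixed ratio is what makes the exponentiated coefficients interpretable as hazard ratios.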

  • 2 Estimation Results

Table 1 lists the coefficients and standard errors for Weibull estimates of the ratification of Convention 138. The signs of the coefficients indicate the direction of the effect that the variable has on the probability that a country will ratify C138 given that it has not already. The coefficients themselves do not have meaningful interpretations, but because the Weibull model still fits into the family of proportional hazard models, the effect that each covariate has on the hazard ratio can be found by exponentiating the coefficients. 5
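As a small illustration of moving from a coefficient to a hazard ratio, and of why testing the coefficient against zero is the same as testing the hazard ratio against one, consider the Python sketch below. The coefficient and standard error are hypothetical round numbers, not values taken from Table 1, and the two-sided p-value assumes the usual normal approximation.

```python
import math

# Hypothetical coefficient and standard error (illustrative only)
coef, se = -0.005, 0.003

hazard_ratio = math.exp(coef)               # effect of a one-unit increase
percent_change = 100 * (hazard_ratio - 1)   # same effect as a percentage

# Wald test of H0: coef = 0, which is equivalent to H0: exp(coef) = 1
z = coef / se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation

print(f"hazard ratio = {hazard_ratio:.5f} ({percent_change:.2f}% per unit)")
print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
```

A hazard ratio just below one, as here, corresponds to a small negative coefficient. In this example the p-value falls between 0.05 and 0.10.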

The five regressions in Table 1 are not nested models, but have been included to demonstrate two things: first, as can be seen in regressions (1) and (3), per-capita GDP and its quadratic term do not have a statistically significant effect on the hazard ratio. This result is consistent with the results found by Chau and Kanbur (2001), and is intuitive given that many of the richest countries ratified C138 very late into the sample, if at all.

The effect of international trade on the propensity to ratify C138 is positive and statistically significant in all tested variations of the model (for other combinations of covariates see Teele (2006)). The negative coefficient on trade's quadratic term indicates that trade has a positive effect on ratification but at a decreasing rate. This result predicts that as trade increases, the probability of ratification increases as well. This result could be motivated by a desire to

5 For hazard ratios the null hypothesis tested is that the covariate has no effect on the hazard, such that $\exp(\beta_x) = 1$. This is equivalent to testing whether the non-exponentiated coefficients are equal to zero, as with standard linear regression. For this reason, traditional ways of assessing significance apply in this analysis.


Table 1: Weibull Regressions on Child Labor Convention

ln GDP per cap
GDP squared      -0.00287   -0.0262
                 (0.0536)   (0.0795)
Trade, % GDP
Trade squared
Aid per capita   -0.00494   -0.00485
                 (0.00296)  (0.00292)
Child Labor %
                 -7.427     -7.009
                 -7.908     -8.793
                 (0.0978)   (0.0978)

Standard errors in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001


"signal" to trade partners that the country has a similar set of values and restrictions placed on its labor standards, a move that is meant to facilitate trade negotiations.
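The phrase "a positive effect at a decreasing rate" can be made concrete with a short calculation. In the Python sketch below the linear and quadratic coefficients on trade are hypothetical placeholders (they are not the Table 1 estimates), and trade is measured as a share of GDP so that 0.7 means 70 percent.

```python
import math

# Hypothetical coefficients on trade and trade squared (illustrative only)
b_trade, b_trade_sq = 2.0, -0.5

def marginal_effect(trade):
    """Derivative of the log-hazard with respect to trade: b1 + 2 * b2 * trade."""
    return b_trade + 2 * b_trade_sq * trade

def hazard_ratio(trade_hi, trade_lo):
    """Hazard at trade_hi relative to trade_lo, other covariates held fixed."""
    log_ratio = (b_trade * (trade_hi - trade_lo)
                 + b_trade_sq * (trade_hi ** 2 - trade_lo ** 2))
    return math.exp(log_ratio)

# Level of trade at which the effect would stop being positive
turning_point = -b_trade / (2 * b_trade_sq)

print(f"marginal effect at 0.7: {marginal_effect(0.7):.2f}")
print(f"marginal effect at 1.1: {marginal_effect(1.1):.2f}")
print(f"hazard ratio, trade 1.1 vs 0.7: {hazard_ratio(1.1, 0.7):.4f}")
print(f"turning point: {turning_point:.2f}")
```

With these placeholder values the marginal effect shrinks as trade rises but stays positive until trade reaches twice GDP, so within a realistic range more trade always raises the hazard.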

Per-capita international aid is statistically significant at the 10 percent level and is found to have a negative effect on the hazard. This means that countries that receive more international aid per person are less likely to ratify the child labor convention. This result is contrary to sociological theories that "pressure" from international aid organizations and foreign governments affects ratification patterns for aid recipients. 6 However, it should be noted that countries that have higher per-capita aid are probably some of the poorest, and this variable may be picking up on other correlates of poverty.

The other result of interest in Table 1, found in Regression (5), is that child labor has a positive effect on the hazard ratio, and is statistically significant. 7 This result is particularly intriguing given that C138, which is technically the "minimum age to work" convention, specifies that a minimum age of 15 be required for economic participation. The regression shows that for higher levels of child labor (specifically defined as the percent of children 10-14 who are economically active) countries are more likely to ratify the child labor convention.

Regression (5) includes all of the covariates that have a statistically significant effect on the hazard. Hazard ratios have been calculated for these covariates and can be found in Table 2. For estimates of the hazard ratio that are larger than one, the covariate has a positive effect on the probability of ratification. Covariates whose hazard ratios are less than one have a negative effect on the probability of ratification. The hazard ratios reported in Table 2 mirror the previous discussion. Both child labor and trade have very large effects on the hazard.

In the next section I will construct a graph holding child labor constant at the sample mean and varying the level of trade in order to give a visual interpretation of the results above.

  • 2.1 Fitted Values of Survival Analysis

Because there are so many steps that go into the data collection even before the regression command can be hit, it can be tempting to let the regressions speak for themselves. However, regression coefficients or, in our case above, hazard rates, are not always easy to understand. The saying goes that pictures are worth a thousand words, so for the next page or so I will show you a picture that can help us interpret the results from above.

First, it is important to look at the data to see what a probable range for the

6 It could be argued that it is not aid per capita but the overall reliance on international aid that allows countries to be pressured into ratification. For this reason, international aid as a percent of government expenditures was also tested for its effect on the hazard but was not statistically significant.

7 Without the inclusion of regional dummies, child labor is significant at the 10 percent level.


Table 2: Hazard Estimates for Model (5)

Trade, % GDP
Trade squared
Aid per capita
Child Labor %

Exponentiated coefficients; t statistics in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001

variables of interest are. Table 3 below shows means and standard deviations for trade and child labor, the two variables that were statistically significant in the section above. Figure 4 presents the fitted values for Regression (5) evaluated at the average level of child labor, 17 percent, for different levels of trade. The groups with lower trade as a percent of GDP (one standard deviation below the mean) are less likely to ratify the convention than those with above-average trade (one standard deviation above the mean).

The slopes of the fitted value curves show that for higher levels of trade, the hazard rate is much higher. The direct interpretation of this is: holding child labor constant, countries with higher levels of trade as a percent of GDP are much more likely to ratify the ILO convention banning child labor.
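The kind of curves shown in Figure 4 can be sketched from the Weibull cumulative hazard, which integrates the Weibull hazard to $H(t \mid x) = t^p \exp(\beta_0 + x\beta_x)$. In the Python sketch below the trade and child labor levels are the ones passed to `stcurve` in Appendix B, while every coefficient and the shape parameter are hypothetical stand-ins for the Regression (5) estimates.

```python
import math

# Hypothetical Weibull estimates (illustrative only, not Regression (5))
p, beta0 = 1.5, -7.0
b_trade, b_trade_sq, b_clab = 2.0, -0.5, 1.0

def cumulative_hazard(t, trade, child_labor):
    """Weibull cumulative hazard H(t | x) = t^p * exp(beta0 + x*beta)."""
    xb = b_trade * trade + b_trade_sq * trade ** 2 + b_clab * child_labor
    return t ** p * math.exp(beta0 + xb)

child_labor = 0.15  # value held fixed in the Appendix B stcurve call
levels = {
    "one sd below": 0.295,
    "average": 0.701,
    "one sd above": 1.07,
}

for label, trade in levels.items():
    curve = [cumulative_hazard(t, trade, child_labor) for t in (10, 20, 28)]
    print(f"{label:>12}: " + ", ".join(f"{value:.3f}" for value in curve))
```

Because trade enters only through the exponential term, the three curves are vertical multiples of one another and keep the same ordering in every year.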

This result supports the hypothesis that countries may sign conventions to enhance their "reputational capital" within the global marketplace. If questions of labor standards arise in trade negotiations, countries can point to the convention as a law that they uphold, while knowing full well that the ILO has very little coercive power to punish their actions should they be discovered.

3 Conclusion

Finally we come to the end of this foray into event history modeling. The

Appendices that follow are meant to help you get started with setting up your

event history panel in Stata, and I have also included the commands that were


Table 3: Summary Statistics

                 (1)
Trade, % GDP     0.729
                 (0.406)
Child Labor %    0.136
                 (0.158)
Observations

mean coefficients; sd in parentheses

Figure 4: Fitted Values of Regression (5) Evaluated at Mean Level of Child Labor with Varying Levels of Trade

used to make each graph in this paper.


A ST commands used on this data

As a brief introduction to some of the survival time commands that are necessary

to set up an event history panel:

/* Convert snapshot data to time-span data */
snapspan panelid year dcnv138, gen(date0) replace
rename year date1
stset date1, id(panelid) failure(dcnv138) origin(ismember==1) exit(iloflake==1)
/* describes st set */
stdes

* obtain K-M survival estimate
sts generate kmS = s
label var kmS "K-M"
* obtain N-A cumulative hazard estimate
sts generate naH = na
label var naH "N-A"
* calculate N-A survivor estimate
g naS = exp(-naH)
label var naS "N-A"
* calculate K-M cumulative hazard estimate
g kmH = -log(kmS)
label var kmH "K-M"

* Cox regression (Breslow method for ties)
stcox lpcGDP lpcGDPsq, basesurv(s) basehc(h)

sts test OECD, logrank


         failure _d:  dcnv138
   analysis time _t:  (date1-origin)
             origin:  ismember==1
  exit on or before:  iloflake==1
                 id:  panelid

Log-rank test for equality of survivor functions

OECD  | observed
Total |
chi2(1) =
Pr>chi2 =


* or can use any of the following to test for the equality of survivor functions:
sts test OECD, wilcoxon /* Wilcoxon-Breslow Test */
sts test OECD, tware /* Tarone-Ware Test */
sts test OECD, peto /* Peto-Peto Test */
sts test OECD, logrank strata(legalsys) detail /* stratified log-rank tests */
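The commands above generate both the Kaplan-Meier (K-M) and Nelson-Aalen (N-A) estimates and convert each into the other's scale with `exp(-naH)` and `-log(kmS)`. The following Python sketch reproduces that arithmetic on a made-up three-period sample so that the relationship between the four quantities is visible; it is not a substitute for the Stata output.

```python
import math

# Hypothetical risk sets n_j and events r_j for three periods
at_risk = [10, 8, 5]
events = [2, 2, 1]

km_S, na_H = [], []
s, H = 1.0, 0.0
for n_j, r_j in zip(at_risk, events):
    s *= (n_j - r_j) / n_j   # Kaplan-Meier survivor: running product
    H += r_j / n_j           # Nelson-Aalen cumulative hazard: running sum
    km_S.append(s)
    na_H.append(H)

na_S = [math.exp(-H_j) for H_j in na_H]    # N-A survivor, as in g naS=exp(-naH)
km_H = [-math.log(S_j) for S_j in km_S]    # K-M cumulative hazard, as in g kmH=-log(kmS)

for j in range(len(at_risk)):
    print(f"period {j + 1}: kmS = {km_S[j]:.4f}, naS = {na_S[j]:.4f}, "
          f"naH = {na_H[j]:.4f}, kmH = {km_H[j]:.4f}")
```

The N-A survivor estimate is never below the K-M estimate, and the gap narrows as the risk sets get large relative to the number of events per period.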

B Graph commands used in this paper

The following are the Stata commands used to construct the graphs and tables.


For the bar chart in Figure 1:

graph bar (asis) Rat138 if Year!=1975, ///
    over(Year, label(angle(forty_five) labsize(small))) stack ///
    ytitle(, size(small)) ylabel(, angle(horizontal) labsize(small)) ///
    title(Ratification of ILO Convention 138) ///
    subtitle(by region and year) ///
    note(Source: International Labour Organization)

For the hazard rate estimates in Figure 2:

sts graph, hazard by(OECD) ///
    ytitle(Conditional Probability of Ratification) ///
    xtitle(Years since C138 introduced, margin(medium)) ///
    legend(order(1 "non-OECD" 2 "OECD") size(small)) ///
    xlabel(0 "1976" 10 "1986" 20 "1996" 30 "2006", valuelabel) ///
    title(Cumulative Hazard Estimates) ///
    subtitle(OECD and non-OECD countries) ///
    note(Source: International Labour Organization)


For the Kaplan-Meier survival estimates in Figure 3:

sts graph, by(OECD) ///
    ytitle(Conditional Probability of Ratification) ///
    xtitle(Years since C138 introduced, margin(medium)) ///
    legend(order(1 "non-OECD" 2 "OECD") size(small)) ///
    xlabel(0 "1976" 10 "1986" 20 "1996" 30 "2006", valuelabel) ///
    title(Kaplan-Meier Survival Estimates) ///
    subtitle(OECD and non-OECD countries) ///
    note(Source: International Labour Organization)

Finally, for the hazard graphs of fitted values evaluated at the mean level of child labor for different levels of trade in Figure 4:

stcurve, cumhaz ///
    at1(trade=.701 clabWDI=.15) ///
    at2(trade=1.07 clabWDI=.15) ///
    at3(trade=.295 clabWDI=.15) ///
    xlabel(0 "1976" 10 "1986" 20 "1996" 30 "2006", valuelabel) ///
    legend(order(1 "Average Trade (70%)" 2 "One standard deviation above" 3 "One standard deviation below")) ///
    title(Fitted values of Cumulative Hazard Function) ///
    subtitle(evaluated at average level of child labor)



[1] Allison, Paul D. 1984. Event History Analysis: regression for longitudinal

event data. Newbury Park, California: Sage Publications.

[2] Artecona,





A Database of Labor Market Indicators. World Bank.

[3] Beck, Thorsten; George Clarke; Alberto Groff; Phillip Keefer; Patrick Walsh. 2001. The Database of Political Institutions. World Bank.

[4] Boockmann, Bernhard. 2003. Mixed Motives: An empirical analysis of ILO

roll-call voting. Constitutional Political Economy 14(4), December 2003,


[5] Boockmann, Bernhard. 2001. The Ratification of ILO Conventions: A hazard rate analysis. Economics and Politics 13(3): 281-309.

[6] Box-Steffensmeier, Janet; Bradford Jones. 2004. Event History Modeling: a guide for social scientists. Cambridge, United Kingdom: Cambridge University Press.

[7] Chau, Nancy; Ravi Kanbur. 2002. The Adoption of International Labor

Standards Conventions: Who, when and why? Revised version published

in Brookings Trade Forum, 2002.

[8] Cleves, Mario; William Gould; Roberto Gutierrez. 2002. An Introduction

to Survival Analysis in Stata. College Station, Texas: Stata Press.

[9] Jaggers, Keith; Monty Marshall. 2002. Polity IV Project. University of


[10] Kopka, Helmut; Patrick Daly. 1999. A Guide to LaTeX. Dorchester, England: Dorset Press.

[11] StataCorp. 2005. Statistical Software: Release 9.0. College Station, Texas:

Stata Corporation.

[12] Teele, Dawn. 2006. Child Labor and the Minimum Age to Work Convention.

Reed College Thesis.