methodologies to evaluate early childhood development programs




How representative are the data of the population of interest? Can inferences be
made for some population beyond the sample, perhaps by weighting the observations
appropriately? Some potentially very interesting data, such as individual and
family histories (e.g. Watkins’s [2004] use of journals kept by four individuals on
HIV/AIDS in Malawi), preschool- or clinic-based data, and much (though not all)
qualitative data, may raise interesting questions and conjectures for more systematic
study but be difficult to interpret with regard to their implications for broader populations.

Power, Sample Size, and Sample Design

Power refers to the probability that a sample of a given size will detect an effect of
interest at a given significance level. Power calculations indicate how large the sample
needs to be to detect such an effect with a specified probability at a specified
significance level (e.g. the 5% level); standard software packages such as Stata can
facilitate power calculations (e.g. Behrman and Todd 1999b). For example, suppose that
the question of interest is whether spending the third year of life in a particular
comprehensive ECD program increases children’s access to resources as adults by at
least 3% at the 5% significance level.
The number of households needed to attain any particular level of statistical power
varies, of course, with the question being asked. For instance, the more finely the
question is tuned to particular demographic groups, the more households are required:
many more households are needed to investigate a given impact of ECD programs on
cognitive skills among three-year-old girls at a given significance level than to
investigate the same percentage impact, at the same significance level, of ECD programs
on school attendance for all 6-12-year-old children (even with correction of the
standard errors for clustering at the family level). If the sample design involves
clustering, the number of clusters and the intracluster correlations matter in addition
to the number of households (see the discussion of standard errors in Section III). It
is sensible for researchers to ask questions about power when they initiate analysis rather
than bemoan that the sample size is too small after they have invested a lot of resources
in the research project. Data that in other respects appear very promising for the
analysis of ECD interventions may not warrant analysis if the power is too low.
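
To make the role of sample size and clustering concrete, the following is a minimal
sketch of a textbook power calculation, not the procedure of any study cited here. It
assumes a two-sided comparison of means between a program group and a comparison group,
a normal approximation, equal group sizes, and equal cluster sizes. The function names,
the standardized effect size of 0.2, the cluster size of 20 households, and the
intracluster correlation of 0.05 are all hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Households per group needed to detect a standardized mean difference
    (effect_size, in standard-deviation units) with a two-sided test,
    using the normal approximation and equal group sizes."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the significance level
    z_power = z.inv_cdf(power)           # quantile for the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes:
    1 + (m - 1) * rho, where rho is the intracluster correlation."""
    return 1 + (cluster_size - 1) * icc


def n_per_group_clustered(effect_size, cluster_size, icc, alpha=0.05, power=0.80):
    """Households per group after inflating the simple-random-sample
    requirement by the design effect."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    base = 2 * (z_alpha + z_power) ** 2 / effect_size ** 2
    return math.ceil(base * design_effect(cluster_size, icc))


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(n_per_group(0.2))
    print(n_per_group_clustered(0.2, cluster_size=20, icc=0.05))
```

Under these assumptions, detecting a 0.2 standard-deviation effect with 80% power at
the 5% level requires 393 households per group under simple random sampling, and the
design effect of 1.95 nearly doubles that requirement. Restricting the analysis to a
subgroup, such as three-year-old girls, raises the number of households that must be
sampled further, because only a fraction of sampled households contribute observations.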


