
Economics of Education Review 72 (2019) 30–43


Center-based care for infants and toddlers: The aeioTU randomized trial✰,✰✰
Milagros Nores^a,⁎, Raquel Bernal^b, W. Steven Barnett^a

^a National Institute for Early Education Research, Graduate School of Education, Rutgers, The State University of New Jersey, 73 Easton Av, New Brunswick, NJ 08901, United States
^b Economics Department and Centro de Estudios sobre Desarrollo Económico, Universidad de Los Andes, Bogotá, Colombia

ARTICLE INFO

JEL codes: J13, I10, I20, H43

Keywords: Early childhood development; Early education; Poverty; Impact evaluation

ABSTRACT

Little is known about the effectiveness of center-based high quality educational programs for infants and toddlers, especially in low- and middle-income countries. This paper reports effects from a randomized trial of a high-quality center-based early intervention on infants and toddlers in two communities in northern Colombia. Just eight months into the program results indicate large positive effects on language, cognitive development and overall development, with girls benefitting the most. No effects were found on nutritional outcomes, socio-emotional development or the home environment.

1. Introduction

Global interest in public investments to improve the development of disadvantaged young children has risen exponentially in recent decades (Berlinski & Schady, 2016; Black et al., 2016; Nores & Barnett, 2010). Poverty compromises the development of hundreds of millions of children in the developing world, at great cost to individuals and their countries (Black et al., 2016). Some studies find early intervention can alter such developmental trajectories (Berlinski & Schady, 2016; Cunha, Heckman, Lochner, & Masterov, 2006; Engle et al., 2007, 2011). However, questions remain about whether and how best this might be done with large-scale public programs (Nores, Figueras-Daniel, López, & Bernal, 2018).

In developing and low-income countries, there have been few comprehensive educational interventions and more focus on less costly nutrition, parenting and stimulation, or cash transfer interventions (Engle et al., 2007; Nores & Barnett, 2010), especially for children under the age of three (Black et al., 2016). In addition, public policies have tended to favor increasing access over quality, resulting in relatively weak educational interventions (Araujo, López Bóo, & Puyana, 2013). As a result, little empirical information exists regarding the effects of comprehensive (integrating education, care, nutrition and health) interventions providing high-quality early education and care to infants and toddlers in the developing world (Behrman & Urzúa, 2013; Black et al., 2017; Britto et al., 2017). Infants and toddlers (less than three years old) have been previously underrepresented in studies of large-scale child care in the developing world (recent exceptions are Bernal, Attanasio, Peña, & Vera-Hernández, 2019 and Noboa-Hidalgo & Urzúa, 2012).

Most research on the impact of center-based care at scale for children under three years old has been conducted in the United States and other high-income countries. Effects appear to vary depending on the quality of care and family background. The U.S. literature includes many large-scale correlational studies, but only two large-scale randomized trials of high-quality infant-toddler programs, Early Head Start and the Infant Health and Development Program (Cunha et al., 2006; Camilli, Vargas, Ryan, & Barnett, 2010; McCormick et al., 2006; Vogel, Xue, Moiduddin, Carlson, & Kisker, 2010). Although studies reported positive effects on cognitive and non-cognitive outcomes, persistent gains beyond early childhood appear to be difficult to produce at scale (Yoshikawa et al., 2015). European studies found lasting positive effects on language and cognitive abilities from day care 0–2, with gains


✰ Trial Registry # AEARCTR-0001903 https://www.socialscienceregistry.org/trials/1903
✰✰ Supplemental materials for this paper can be found in the Online Appendix https://sites.google.com/view/raquelbernal/research/working-papers
⁎ Corresponding author.
E-mail address: mnores@nieer.org (M. Nores).

https://doi.org/10.1016/j.econedurev.2019.05.004
Received 27 September 2018; Received in revised form 4 April 2019; Accepted 13 May 2019
Available online 16 May 2019
0272-7757/ © 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/BY-NC-ND/4.0/).

concentrated among the educationally and economically disadvantaged (Dragne & Havnes, 2015; Felfe & Lalive, 2014).

Some studies have found negative effects of 0–2 day care. In an affluent Italian region, Fort et al. (2019) found negative effects for girls, but not boys, with effects concentrated among the most affluent. Baker et al. (2015) found that “very low-cost child care” (p. 2) for children aged 0–4 in Quebec had immediate and persistent negative effects on non-cognitive development (anxiety and aggression), no consistent evidence of effects on cognitive development, and negative effects on health. Additional evidence from studies of reductions in parental care time 0–2 also suggests that lower quality care can have negative effects, particularly for children in more educationally and economically advantaged families (Bernal & Keane, 2011; Carneiro, Løken, & Salvanes, 2015; Herbst, 2013).

In Latin America and the Caribbean, public investment in early childhood services has been increasing recently. However, there is vast heterogeneity in coverage, content, funding, quality, and staff qualifications for programs serving children under age three. Although there has been growing research in the region on the impacts of home-based child care including infants and toddlers (Bernal & Fernández, 2013; Behrman, Cheng, & Todd, 2004; Attanasio, Di Maro, & Vera‐Hernández, 2013), as well as some research on the effects of center-based early care for children older than age three (Berlinski & Galiani, 2007; Berlinski, Galiani, & Gertler, 2009), the evidence on the impacts of center-based child care for infants and toddlers is scarce and outcomes are mixed, possibly because of variations in quality.

Bernal et al. (2019) assessed the effects of the transition of children from home-based child care to center-based child care in Colombia for children from zero to five years of age. The study found positive effects on children's health but negative impacts on children's cognitive outcomes, possibly associated with the low process quality during the transition. The authors did not study the subsample of toddlers and infants separately. In Chile, an evaluation of JUNJI (a child care program serving children under the age of two in centers) found positive effects on most outcomes including emotional regulation, motor skills, expressive communication and adaptive behavior, particularly for children over six months of age. However, they also found statistically significant negative effects on reasoning, suggesting the quality may have been adequate to improve some outcomes while worsening others (Noboa-Hidalgo & Urzúa, 2012). In Ecuador, an evaluation of a government subsidized center-based day-care program for low-income children found null effects for a sample of children between zero and six years of age. The authors proposed poor quality of services as the explanation for the results (Rosero & Oosterbeek, 2011). Finally, Araujo, Dormal and Schady (2018) studied the effects of caregiver-child interactions on children younger than two years of age in center-based care in Peru. They found that children with caregivers who exhibited higher-quality interactions had better developmental outcomes relative to children in the same centers whose caregivers exhibited low-quality interactions. Like Noboa-Hidalgo and Urzúa (2012) and Rosero and Oosterbeek (2011), they highlighted the importance of quality.

In sum, estimated effects of infant-toddler programs vary from positive, to null, to negative in both higher- and lower-income countries across a relatively small number of studies. These variations can be explained post-hoc as due to variations in how they affected the quality of early care and education experiences relative to the home environment. However, outside the U.S., large-scale studies have been non-experimental, raising the possibility that results vary due to bias from self-selection and other hidden conditionals. As randomized trials are less susceptible to this problem, new studies using this research design would be particularly valuable additions to the literature.

In this paper, we investigate the effects of aeioTU, an intensive and comprehensive center-based early education intervention, on the development of disadvantaged infants and toddlers in Colombia using a randomized controlled trial with a sample of 848 children under the age of three in two communities in northern Colombia. To our knowledge, this is the only study in a developing country that would allow the comparison of outcomes of infants and toddlers randomly assigned to receive high-quality center-based care to a randomly defined control group. The relatively high structural quality of the aeioTU model and its provision to over 13,300 children in 25 cities throughout the country make it highly policy relevant. aeioTU is similar in design to the Abecedarian program (a very small, researcher-administered and controlled program), which was found to produce persistent cognitive gains for disadvantaged children in the U.S. in a randomized trial. Abecedarian was an intensive center-based early care intervention from eight weeks old to age 5. It provided 40–45 h of service per week during 50 weeks per year. Long-term impacts included improvements in high school graduation, higher education enrollment, skilled job employment, and adult health (Barnett & Masse, 2007; García, Heckman, & Ziff, 2018; Heckman; Muennig et al., 2011). However, the sample in our aeioTU study is roughly eight times as large as that in the Abecedarian study and aeioTU operates at scale, in multiple cities, with multiple communities in a developing country.

We measure anthropometry, language, cognitive, motor, and socio-emotional development, and parenting for the first year (eight months) of program participation for children under age 3. In developing countries, participation in early education and care interventions is often for eight months or less (Black et al., 2016; Nores & Barnett, 2010). As both the treatment and the counterfactual differ between children under and over age 3, and different measures are required at these different ages, clear interpretation of results requires that they be presented separately for younger and older children. The one-year follow-up for children under the age of three facilitates a clear presentation of policy-relevant findings for the age group at highest risk of inadequate early care and education (Black et al., 2016).¹

We find positive effects on language, cognitive development, motor development and overall development (0.09–0.20 standard deviations) eight months into the program. Effects are observed only for girls (up to 0.33) and not for boys. No intervention effects were observed for nutritional outcomes, socio-emotional development or the home environment.

The rest of this paper is organized as follows. Section 2 briefly describes the early childhood policy context in Colombia and how the aeioTU program fits in this framework. Section 3 describes the study design, the sample, the data and the empirical strategy. Results are shown in Section 4 and Section 5 discusses the results and offers conclusions.

2. Background

Of the 4.3 million children in Colombia younger than six years of age (2.8 million younger than three), about 65% are socioeconomically disadvantaged and 40% of those are served by government programs (Bernal & Camacho, 2014). Children from low-income households show developmental language and cognition gaps as early as 12 months of age and these are about one standard deviation by age 5, relative to high-income children (Bernal, Martínez, & Quintero, 2015; Rubio-Codina, Attanasio, Meghir, Varela, & Grantham-McGregor, 2015). The enrollment rate in public child care programs is close to 30% for children aged three or less. Close to 800,000 vulnerable children younger than six are served through Hogares Comunitarios de Bienestar (HCB), a public home-based child care program that provides home-based care and supplemental nutrition to low-income children. Recent evaluations of the HCB program (Attanasio, Di Maro, & Vera‐Hernández, 2013; Bernal & Fernández, 2013) have found positive impacts on children's

¹ In a complementary paper, we use a factor model to synthesize all instruments within a developmental domain in a single factor by taking advantage of instrument overlap in at least one occasion to “link” measures across time (Bernal, Giannola and Nores, in progress).


developmental outcomes of about 0.15 standard deviations (SD), despite the low quality of the service provided (Bernal & Fernández, 2013).² In 2011 the government launched a national early childhood strategy, “De Cero a Siempre” (From Zero to Forever), aimed at improving the quality of and access to comprehensive services³ for 1.2 million children (Bernal & Ramírez, 2018; Comisión Inter-Sectorial para la Primera Infancia, 2013). As a result, center-based care enrollments grew from 125,000 children in 2011 to about 380,000 children in 2016 (with most of the growth between 2010 and 2013). The aeioTU program is part of this strategy. aeioTU is an NGO operating 28 centers by 2016 providing comprehensive early childhood education to about 13,300 low-income children aged 0–5 throughout urban Colombia. The program, a Reggio Emilia “inspired” educational program, features characteristics relevant for quality in early care and education.⁴,⁵,⁶ aeioTU's curriculum conceives the learning process through strategies such as play, exploration, and projects, building upon child‐initiated activities and balancing these with teacher‐directed activities. A child in an aeioTU center experiences fully outfitted classrooms and experientially‐based classrooms.⁷ The model is explained in more detail in Nores et al. (2018). The evaluation of the Reggio model in its original context in Italy found that it supported child development, but no more so than other high quality models available in the region (Biroli et al., 2018). In Colombia, other public center-based programs for children aged zero to three did not use a structured curriculum or have clear pedagogical guidelines for teachers at the time of this study (Bernal, 2019).⁸ aeioTU was funded by a government stipend equivalent to USD 1500 per child per year,⁹ which aeioTU supplemented with an additional 20–30% from their own resources at the time of the study.¹⁰ The aeioTU program provides full-day (9 h per day) educational care during 11 months of the year, with relatively low child-to-teacher ratios (8:2 for infants, 12:2 for toddlers at the time of this study), high teacher qualification requirements (32% had a BA and the rest had a vocational degree in early childhood education when we started this evaluation), and concerted pre- and in-service training (120 h pre-service, over 130 h in-service). Bernal et al. (2019) reported that teacher training and coaching strategies were not common and varied significantly across service providers in most comparable public center-based programs; they also reported lower educational levels of teachers and a child-to-teacher ratio of 25:1 for toddlers around 2010–2011. In contrast, Nores et al. (2018) reported that continuous quality improvement strategies were effectively implemented by aeioTU. The aeioTU program also provided 70% of children's daily nutritional requirements (breakfast, two snacks, and lunch), which is mandatory in all public child care programs in Colombia,¹¹ and also provided regular nutritional monitoring.¹² aeioTU staffed centers with a team of experts including an atelier (in-site artist) and a pedagogical coordinator (director), who played a critical role in the planning of pedagogical guidelines in centers.¹³ In line with the Reggio Emilia approach, aeioTU emphasized family participation. In particular, aeioTU held regular workshops for parents, informed them about child activities and progress through weekly reports and photos, and had an open-door policy that included use of the centers’ recreational areas by families during weekends.

In sum, aeioTU's features such as pre- and in-service training, the use of a structured developmentally-oriented curriculum and of tools to assess progress in children, and high qualification requirements for staff were uncommon among service providers at the time of this study. These features are considered to be important influencers on early care and education quality and linked to better child developmental outcomes (Barnett & Boocock, 1998; Bernal, 2015; Bowman, Donovan, & Burns, 2001; Yoshikawa, Weiland, & Brooks-Gunn, 2016).¹⁴

3. Methods

3.1. Study design

We conducted a randomized controlled trial with families of young children assigned to treatment or control in two early care centers in one city in northern-coastal Colombia. In Fig. 1 we show the study's flow chart for sample selection. Children were first assessed in late 2010, prior to random assignment and the beginning of the intervention, and assessed again about eight months post-treatment. Site 1 received baseline testing in July–September 2010, started intervention in November 2010, and received post-testing between June and

² This seemingly contradictory result might be due to the fact that in low- and middle-income countries even very basic public pre-school, especially for children from deprived backgrounds, could have positive impacts, given the very low quality or unavailability of alternative options.
³ Comprehensive child care services here refers to programs that embed pedagogical contents aimed at stimulating cognitive and socio-emotional development and do not simply provide a safe environment to care for the child while the mother works. Specifically, comprehensive services would offer concurrently nutrition, health, care, and early education.
⁴ Since it started operations in 2009, aeioTU has grown through public-private partnerships with the national government and has recently been internationally highlighted as a successful innovative approach in a 2017 white paper from the World Economic Forum, and its CEO was awarded the 2018 Klaus J. Jacobs Awards for Social Innovation and Engagement.
⁵ The Reggio Emilia Approach is an education philosophy for pre-school and primary education. It is based on the notion that children are capable of constructing their own learning process through their innate curiosity to understand the world. The basic principle is that children learn about themselves and their context through interactions with others and their environment. Thus, adults are mentors and guides of this process rather than mere caregivers or providers of knowledge, in the sense of providing opportunities for children to explore their own interests. The approach recognizes many ways to understand the world and express thoughts, and aims at promoting these communication channels within the educational experience, including art, music, dance, movement, pretend play, and exploration.
⁶ The Reggio approach has been exported from Italy to the Nordic countries, North America, Argentina, Paraguay, Mexico, Peru, New Zealand, Australia, Korea, Germany, The Netherlands, the United Kingdom, Spain and Thailand. See www.reggioalliance.org.
⁷ In line with the Reggio Emilia philosophy, aeioTU emphasizes the collegial work of center personnel, the presence of the atelier (artist), and a pedagogical coordinating team.
⁸ It is important to note that there is no specific curricular guideline for early education in Colombia. The “De Cero a Siempre” strategy has emphasized the principle of curricular freedom, and national standards are intentionally broad.
⁹ Using the average COP/USD exchange rate in 2010, at the time this study began.
¹⁰ The additional funding provided by aeioTU was used for teacher training and the nutritional component of the program, which was underfunded by the government's stipend, and to provide nutritional supplementation over the holidays.
¹¹ The nutritional component is mandatory in all public child care programs. However, according to aeioTU, this component was underfunded by the government's stipend by about 20–25%, which was covered with aeioTU's own resources, and the program also provided nutritional supplementation (micronutrients) during holidays. We did not collect any data to monitor implementation of the nutritional component.
¹² An in-site nutritionist periodically monitored children's nutritional status. Children found to be at risk were referred to public health services and the center would adjust the nutritional supplement as recommended by the nutritionist. This was not mandatory for all public child care services at the time of this study.
¹³ At the time of this study, the hiring of these experts was not a national requirement.
¹⁴ We do not have data to confirm that aeioTU had better quality than comparable center-based child care programs targeting the same population. We can only specify that the inputs often linked to center-based quality were higher, on average.


Fig. 1. Study's flow chart for sample selection.

September 2011. Site 2 received baseline testing in October–December 2010, started intervention in March 2011, and received post-testing between October and December 2011.

3.2. Sampling and randomization

The sites were selected from the centers being built and opened by aeioTU around the time the study was planned and funded (2009–2010). Sites had to fulfill two criteria for inclusion: size (no small centers, so that we could power the study for children at different ages) and oversubscription (so that a lottery could be drawn). Two aeioTU early childhood centers were opening in 2010 in northern Colombia, in two different communities deemed suitable due to their socio-economic vulnerability.¹⁵ We identified all children under the age of five living in these two communities at the time the centers were being built through a door-to-door census and through community leaders, for a total of 1288 children (see Fig. 1). All families were income-eligible for aeioTU by SISBEN scores,¹⁶ and all families expressed interest in enrolling in aeioTU centers if offered a slot.

The evaluation study is a randomized controlled trial based on an oversubscription model. Seventy children were excluded from the sample for reasons detailed in Fig. 1. In particular, n = 66 children were directly offered a center slot for various reasons, including being sons and daughters of teachers, n = 2 children moved out of the city prior to the opening of centers, and n = 2 children outgrew the program between the census and center opening. This rendered a final sample of N = 1218 children, out of which 819 were infants and toddlers. All 1218 families agreed to participate in the study through written active informed consent.¹⁷

¹⁵ Other considerations for aeioTU's location choices included the political will of local mayors, who often provided the infrastructure, as well as approval of ICBF, which prioritized underserved areas.
¹⁶ The SISBEN instrument is based on a sociodemographic household survey used to generate SISBEN scores used for social policy targeting.
¹⁷ Ethics Committees at participating institutions approved the study's protocol in 2009.
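
The sample accounting in Section 3.2 can be tallied in a few lines of Python as a consistency check. This is only an illustrative sketch: the variable names are hypothetical, the counts are the ones quoted in the text and Fig. 1, and the script is not part of the study's own analysis code.

```python
# Sample-flow tally using only the counts reported in Section 3.2 / Fig. 1.
# All variable names are hypothetical and chosen for illustration.

census_total = 1288  # children identified through the census and community leaders

# Exclusions as described in the text (reason -> number of children).
exclusions = {
    "offered a center slot directly (e.g., children of teachers)": 66,
    "moved out of the city before centers opened": 2,
    "outgrew the program between census and center opening": 2,
}

reported_final_sample = 1218  # N reported in the text
reported_under_three = 819    # infants and toddlers reported in the text

excluded_total = sum(exclusions.values())
final_sample = census_total - excluded_total
share_under_three = reported_under_three / final_sample

print(f"Excluded: {excluded_total}")
print(f"Final sample: {final_sample} (reported: {reported_final_sample})")
print(f"Share under age three: {share_under_three:.1%}")
```

Under these figures the three exclusion categories sum to the seventy children mentioned in the text, the remainder matches the reported N = 1218, and infants and toddlers make up roughly two thirds of the final sample.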


Each aeioTU center in the study had capacity for about 320 children aged 0–5, with just over half of that for children up to age 3. Families were randomized to treatment and stratified by age (five groups), gender and neighborhood within site (three groups) right after baseline data were collected. Slots were randomized for 819 infants and toddlers, with 337 allocated to enroll in the center (the treatment group) and 482 allocated to the control group.¹⁸ Power analyses indicated a power of 0.85 for sample sizes of 700 with α = 0.05 and an expected effect size of 0.25 SD following Nores and Barnett (2010), allowing for an attrition rate of 17%.

Lottery winners were offered a slot to enroll in aeioTU centers. aeioTU followed up with lottery winners’ parents for effective registration of the children. However, some lottery winners decided not to enroll their children in centers even after several calls and visits. Given that our sample includes the universe of children residing in these communities, centers had to resort to children in the control group to complete enrollment.¹⁹ More details on compliance and cross-over are discussed in Section 4.

Child assessors and parent interviewers were blind to treatment status of participants. Realistically, parents could have communicated their status at post-testing, so there is the possibility that assessors learnt this information as children were being assessed.

3.3. Theory of change

We hypothesize that exposure to the aeioTU early education program, particularly in contrast to the existing supply of early childhood services and home learning environments in deprived communities, would have an impact on children's health and education outcomes.²⁰ In particular, components of quality in the program such as comprehensive pre- and in-service teacher training, the use of a structured developmentally-oriented curriculum, high qualification requirements for staff, low child-to-teacher ratios, the presence of a team of experts including the pedagogical coordinator and the atelier, and strong infrastructure and supports should provide enhanced learning opportunities for children, thus improving children's language, motor, cognitive and socio-emotional development.

The program offered the same amount of daily calorie intake as other child care services,²¹ provided nutritional supplementation during holidays, and had on-site nutritional monitoring to detect nutritional risks and adjust nutritional supplements, if necessary. As a result, we would also expect improvements in children's nutritional status.

Finally, the learning environment in the homes might have improved as a result of aeioTU's emphasis on active participation and education of families. The centers provided workshops in early childhood development and parenting on topics such as positive discipline. This component, continuous exposure to teachers, and the school's communication efforts may have had effects on parental knowledge and feasibly modified parenting behaviors. In addition, the marginal return of parental investments might increase with improvements in the quality of early care and education to the extent that complementarities exist between these sets of inputs in the early human capital production function. Therefore, we assessed whether change occurred in the home learning environment.

It is reasonable to assume though that parents might also switch resources and/or attention to other children (or themselves) as a response to the intervention, to equalize the allocation of resources across children (or household members). This may mitigate the program's impact.

3.4. Measures

The assessment instruments chosen have been used extensively in evaluations of early care and education including studies in developing countries (Fernald et al., 2017). They have adequate psychometrics and have detected program effects in other studies. Child developmental tests were collected by five graduates in psychology and three students in their senior year in psychology,²² who were trained to reliability standards (100% agreement with the trainer) by experienced research staff in a two-week training which included live reliability with infants and toddlers. Data collection for all children was conducted in spaces rented and adapted for that purpose, under identical conditions, with parental informed consent. Parental interviews were carried out alongside the child's assessment, in a separate room. Families and children were provided small incentives for participation and a snack.

Nutrition: As is standard practice in early intervention studies in developing countries (Fernald, Gertler, & Neufeld, 2008), we measured height, weight, and arm circumference following World Health Organization (WHO) standards (WHO, 2006, 2007) at baseline and follow-up.

Cognitive Development: We used Cognitive, Motor, and Language scales from the Bayley Scales of Infant Development III (BSID), the most commonly used assessment of infant development (Bayley, 2005), for all children younger than 36 months of age following guidelines for conducting this assessment. In particular, we used a translation provided under a license by the publisher (Pearson), that had been done for another study on a similar population in Colombia which reports a test-retest reliability of this translation of 0.95–0.98 (Attanasio et al., 2014). The BSID is predictive of later measures of cognitive ability (Blaga et al., 2009; Feinstein, 2003). We measured the BSID at baseline and follow-up (if still applicable). Infants and toddlers who outgrew the BSID at post-test were administered a commonly used measure of receptive vocabulary, the Peabody Picture Vocabulary Test in Spanish (Test de Vocabulario en Imágenes Peabody, TVIP) (Padilla, Lugo, & Dunn, 1986). An overall development score is drawn from the sum of the cognitive, motor, and language scales of the BSID. Raw scores are used in estimations of program impacts as there are no norms for the Spanish translation and the English version is normed with a sample of children in the United States.

Socio-emotional Development: The Ages and Stages Questionnaire for the Socio-Emotional domain (ASQ:SE) is a parent-completed assessment system for children from six to 60 months old. The ASQ:SE measures self-regulation, compliance, communication, adaptive

¹⁸ We used computer generated random lists to assign children to treatment and control. We did this in a public event, within the community, with all of the families present.
¹⁹ For this reason, children assigned to the control group were further randomized into ordered waiting lists (by cohorts) so that they were offered program participation, if necessary, in such order if children assigned to the treatment group declined an aeioTU slot. This makes cross-overs from control to treatment that follow the randomized list random. However, although centers reported following this list, we did not effectively monitor this process. In principle, this procedure would imply that children high up in the waiting list were also more likely to be assigned to treatment. In fact, the correlation between list order and enrollment in centers was −0.42.
²⁰ At baseline, only 12.5% of infants and toddlers had used child care services during the previous year (14% in the treatment group and 11% in the control group). Of these, 90% attended a public child care program such as Hogares Comunitarios, 7% attended a private or NGO-sponsored child care program, and 3% were cared for by caregivers in their own home. The rest of the children (87.5%) were being cared for at home by parents or unpaid relatives.
²¹ However, aeioTU added 20–25% of own resources to the nutritional component in order to be able to comply with the national nutritional guidelines. This implies that, presumably, other providers could not in practice fully comply with the calorie intake requirements given the government's stipend.
²² There was continuity in five assessors between baseline and post-test. All assessors were retrained at post-test regardless. Assessors were paired as teams that assessed children, surveyed parents and visited homes to assess the home environment. The training was provided by a seasoned trainer from the National Institute for Early Education Research.


functioning, autonomy, affect, and interactions with others (Squires, Bricker, & Twombly, 2009a). The ASQ:SE has high levels of reliability and validity (Squires et al., 2009b). It was collected at baseline and follow-up through parent interviews. Higher scores represent higher levels of socio-emotional risk or negative behaviors. We used raw scores in the statistical analyses.23
Home Environment: The Home Observation for Measurement of the Environment (HOME) measures the quality and extent of supports for child development in the home (Caldwell & Bradley, 1984). The infant and toddler inventory includes six subscales: (1) responsivity of the parent, (2) avoidance of restriction and punishment, (3) organization of the environment, (4) appropriate play materials, (5) parental involvement, and (6) variety in daily stimulation. We used raw scores in all analyses. The HOME was administered at follow-up only, by six psychologists trained to reliability standards by experienced research staff in a 1.5-week training that included live reliability. The inter-rater reliability was above 0.9 on the full scale. The instrument was collected during visits of 1–2 h to the child's household.24
Socio-economic characteristics: In addition to the outcome measures described, we surveyed primarily the mother or the head of the household in each home. We collected socio-economic information on families, including schooling attainment, maternal age at birth of the child, race, income and expenditures, employment, assets, health insurance, number of children in the household, and childcare experiences.

3.5. Statistical strategy

We hypothesized that the intervention would impact the six sets of outcomes assessed: nutritional, cognitive, language and motor development, socio-emotional development, and home environments. We estimate intention to treat (ITT) effects on the outcome measures using the following ordinary-least-squares (OLS) specification:

$$A_{it} = \beta_1 + \beta_2\,\mathit{ITT}_i + \beta_3 A_i^{\text{baseline}} + \beta_4 X_i + \varepsilon_i \qquad (1)$$

where $\mathit{ITT}_i$ equals 1 if child $i$ was randomly assigned to treatment and 0 otherwise. $A_{it}$ is an outcome variable for child $i$ in period $t$ (in this case, follow-up), $A_i^{\text{baseline}}$ is the same outcome for child $i$ or an available measure of the same developmental domain at baseline, and $X_i$ is a vector of baseline controls including the child's race, age and age squared, maternal education and marital status, and the household wealth index, as well as those sociodemographic characteristics that were unbalanced at baseline such as the number of children younger than five years old in the household and whether the child had attended child care before randomization. It also includes neighborhood, cohort and gender (randomization strata) fixed effects, as well as tester or interviewer fixed effects.25 We adjust the P-values of two-tailed tests for multiple hypotheses testing using the Romano and Wolf (2005) step-down procedure, where appropriate.26 We report both adjusted and unadjusted P-values.
We excluded from the analysis children with developmental outcomes whose internally age-standardized values were more than three standard deviations below the mean of the relevant standardized distribution at baseline, since we consider this to be an indication of potential disability.27
As ITT was randomly assigned, we can be confident that the exclusion restriction in Eq. (1) holds. Thus, $\hat{\beta}_2$ estimated by OLS captures the causal impact of random assignment to treatment on the outcome. Variations of this model inquire into heterogeneous effects by gender, initial developmental levels and maternal education.28
Given crossover between treatment and control groups as the program rolled out, we also estimated the impact of effective enrollment in centers (treatment-on-treated or TOT). In particular, 91 children (27%) assigned to the treatment group did not enroll in aeioTU centers and 80 children (16.5%) in the control group enrolled in the centers (see Fig. 1). TOT estimates adjust the ITT coefficients by the take-up rate. We estimated TOT effects using an instrumental variable approach. That is, we used random assignment (ITT) as an instrument for enrollment in the program. Enrollment is defined as a binary variable that equals 1 if child $i$ was effectively enrolled in the aeioTU center (i.e., registered in center rosters), and 0 otherwise. Enrollment is directly obtained from aeioTU's administrative records.29 This procedure provides an unbiased estimate of the program's effects among those who actually enrolled in the centers. The ITT indicator is a valid instrumental variable as it was randomly assigned and it significantly explains actual enrollment.30

4. Results

4.1. Baseline characteristics

Table 1 provides summary statistics at baseline by random assignment (ITT) for the subsample of children re-interviewed at follow-up (93%). We include various socio-demographic characteristics of children and their families, as well as baseline developmental outcomes. At baseline children were, on average, 20 months of age. Households had an average of 2.6 children under the age of five, and 26% were headed by single parents. Mothers averaged 8.5 years of education, and only 36% had high school degrees. Families in the sample are quite vulnerable, with a low average number of children's books (fewer than two) and low access to childcare at baseline.
Children were highly nutritionally vulnerable with height-for-age one standard deviation below average according to WHO standards (WHO 2006; WHO 2007). This translates into 21.6% of children stunted at baseline. In 2010, 15.2% of children in rural areas (with socioeconomic conditions comparable to those of children in our sample) and 12% in urban areas were stunted, according to data from the Colombian Longitudinal Household Survey (ELCA, 2010).
To assess the degree of vulnerability of children in the sample, we

23
We do not use the risk of socio-emotional development calculated as the fraction of children above a threshold of behavioral problems defined by the test developers given that the ASQ:SE has not been locally validated and it would be inaccurate to use these thresholds (see Frongillo et al., 2014).
24
Assessors previously agreed upon appointments with families by phone, and were trained to be unobtrusive, to the extent possible, in the home while observing and interviewing primary caregivers.
25
We also present the main results excluding the tester fixed effects in Table A9 in Appendix.
26
Romano and Wolf (2005) step-down procedures for multiple testing were run within developmental domains (receptive and expressive language, fine and gross motor, all the socio-emotional outcomes and all the subscales of the HOME) extracting t-statistics of effect sizes from Stata and using the Matlab algorithm written by D Wunderli (University of Zurich).
27
The exclusion of outliers is shown in the “Analysis” panel of the flow chart in Figure 1. The number of final observations varies by instrument and ranges from 459 for BSID III to a maximum of 748 for all other outcome variables. This standardization of scores was done only to assess outliers. All estimations, as noted earlier, are run using raw scores.
28
We did not find statistically significant differences by child's age when comparing impacts on children 0-1 years of age vs. children 1-3 years of age (results available upon request). Another interesting analysis would be to look into differential program impacts by the child's environment using the HOME assessment. However, we were only able to collect HOME data at follow-up.
29
Centers did not keep digital daily attendance records, so it was not possible to construct an indicator of average daily attendance. aeioTU indicates that by 2011 daily attendance rates (for all children and not exclusively infants and toddlers) varied between 60% and 78% in centers across the country. We did not find publicly available attendance rates for other center-based care providers in Colombia or the region with which to compare these rates.
30
A regression of enrollment in centers on random assignment yields a statistically significant positive coefficient with an F-statistic of 50.5.

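As a rough illustration of the estimation strategy in Section 3.5, the sketch below simulates a lottery with imperfect compliance and computes an ITT and a TOT estimate. It is a minimal sketch under stated assumptions, not the authors' estimation code: it omits the baseline outcome, the controls in X_i and all fixed effects, so the OLS coefficient on the assignment indicator reduces to a difference in means and the two-stage least squares estimate reduces to the Wald ratio. The take-up rates (0.73 and 0.17) come from the text; the true effect, sample size, seed and function names are hypothetical.

```python
import random
from statistics import fmean


def simulate_lottery(n=200_000, true_effect=0.2, seed=0):
    """Simulate (assigned, enrolled, outcome) triples for n children.

    Hypothetical data-generating process: the enrollment probability depends
    only on assignment, and enrollment shifts the outcome by `true_effect` SD.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        assigned = 1 if rng.random() < 0.5 else 0
        take_up = 0.73 if assigned else 0.17  # enrollment rates reported in the text
        enrolled = 1 if rng.random() < take_up else 0
        outcome = true_effect * enrolled + rng.gauss(0.0, 1.0)
        rows.append((assigned, enrolled, outcome))
    return rows


def difference_in_means(rows, column):
    """Mean of `column` among assigned children minus the mean among controls."""
    treated = [r[column] for r in rows if r[0] == 1]
    control = [r[column] for r in rows if r[0] == 0]
    return fmean(treated) - fmean(control)


def itt_and_tot(rows):
    """Return (ITT, first stage, TOT) for the no-covariate case."""
    itt = difference_in_means(rows, column=2)          # effect of assignment on the outcome
    first_stage = difference_in_means(rows, column=1)  # effect of assignment on enrollment
    return itt, first_stage, itt / first_stage         # Wald (instrumental variable) ratio


itt, first_stage, tot = itt_and_tot(simulate_lottery())
print(f"ITT = {itt:.3f}, first stage = {first_stage:.3f}, TOT = {tot:.3f}")
```

In this simplified setting the TOT estimate rescales the ITT estimate by the enrollment gap created by the lottery, which is why TOT effects are larger in magnitude than ITT effects whenever compliance is imperfect.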

Table 1
Descriptive statistics at baseline by randomization status.
Socio-demographics and outcomes at baseline N Control Treatment P-value* Stepdown
Mean (SD) Mean (SD) P-value†

Child's age in months 763 19.79 (8.83) 20.73 (9.69) 0.164 0.868
Child's gender (male) 763 0.53 (0.50) 0.53 (0.50) 0.927 0.997
Child's race (black) 763 0.59 (0.49) 0.63 (0.48) 0.290 0.931
Maternal marital status (single) 763 0.29 (0.45) 0.28 (0.45) 0.723 0.997
Health insurance for child 763 0.78 (0.42) 0.79 (0.41) 0.629 0.996
Mother secondary incomplete 763 0.63 (0.48) 0.64 (0.48) 0.787 0.997
Mother secondary complete and above 763 0.37 (0.48) 0.36 (0.48) 0.787 0.997
Wealth Index‡ 763 0.28 (4.90) −0.10 (3.96) 0.256 0.923
Children books at home 763 1.33 (2.24) 1.40 (3.15) 0.739 0.997
Mother education years 763 8.38 (3.33) 8.37 (3.12) 0.937 0.997
No. of children <=5 yr 763 2.60 (0.76) 2.78 (0.86) 0.004 0.051
Childcare by baseline 763 0.11 (0.31) 0.14 (0.34) 0.218 0.914
Neighborhood (La Paz) 763 0.60 (0.49) 0.49 (0.50) 0.003 0.041
Neighborhood (Alpes B) 763 0.04 (0.21) 0.07 (0.26) 0.133 0.818
Neighborhood (Timayui 1) 763 0.18 (0.38) 0.22 (0.41) 0.173 0.868
Neighborhood (Timayui 2) 763 0.18 (0.39) 0.23 (0.42) 0.123 0.813
Cohort 2008 763 0.39 (0.49) 0.37 (0.48) 0.680 0.997
Cohort 2009 763 0.42 (0.49) 0.33 (0.47) 0.008 0.099
Cohort 2010 763 0.12 (0.33) 0.17 (0.37) 0.098 0.749

Nutrition (Z scores)
Length/height-for-age 743 −1.06 (1.18) −1.10 (1.04) 0.625 0.861
BMI-for-age 731 0.57 (0.98) 0.48 (1.00) 0.223 0.511
Weight-for-age 743 −0.22 (1.07) −0.34 (1.01) 0.129 0.347
Weight-for-length 735 0.44 (0.97) 0.35 (1.00) 0.244 0.537
Arm circumference 734 0.26 (0.85) 0.24 (0.82) 0.733 0.861

Infant development: BSID III raw scores


Receptive vocabulary 737 18.79 (7.62) 20.10 (8.03) 0.025 0.057
Expressive vocabulary 739 19.24 (8.86) 20.23 (9.90) 0.157 0.266
Total language 729 38.02 (16.12) 40.50 (17.53) 0.049 0.106
Cognitive 743 48.02 (14.41) 49.36 (15.18) 0.221 0.290
Fine motor 739 32.18 (9.22) 33.17 (9.78) 0.162 0.266
Gross motor 742 46.80 (12.71) 47.49 (13.70) 0.483 0.488
Total motor 735 78.95 (21.54) 80.66 (23.09) 0.303 0.341
BSID III Total 715 164.95 (50.55) 171.27 (53.77) 0.109 0.195

Socio-emotional development (ASQ: SE)


Self-regulation 754 16.43 (14.82) 18.32 (16.16) 0.096 0.420
Compliance 564 2.07 (4.06) 2.77 (4.65) 0.058 0.323
Communication 754 1.54 (3.98) 1.90 (4.01) 0.217 0.626
Adaptive functioning 753 6.85 (8.65) 6.92 (8.47) 0.913 0.999
Autonomy 564 4.06 (5.27) 5.15 (5.78) 0.020 0.131
Affect 754 3.82 (4.76) 3.88 (4.92) 0.865 0.999
Interaction 754 5.46 (7.07) 5.42 (6.86) 0.945 0.999
ASQ:SE Total 753 39.79 (26.42) 43.50 (28.23) 0.065 0.323

Note: Comparison of baseline variables before randomization in 2010, by intent-to-treat for the sample of 789 children followed up in 2011. BSID denotes the raw
score from the Bayley Scales of Infant Development 3rd edition (Bayley, 2005), ASQ:SE denotes the raw score from the Ages and Stages Socio-emotional Ques-
tionnaire (Squires, Bricker & Twombly, 2009b). P-values for differences in means ≤ 0.10 between treatment and control children are in bold type. *Standard P-
values. †Stepdown P-values are for Romano and Wolf (2005) stepdown procedures applied by blocks of baseline variables. 2000 repetitions. ‡Wealth index calculated
through principal component of a set of variables including type and characteristics of dwelling (floors, walls, bathrooms, etc.), availability of public utilities and
durable goods.
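The P-values in Table 1 test whether group means differ. A complementary way to gauge the size of an imbalance is the standardized mean difference, which divides the treatment-control gap by a pooled standard deviation. This diagnostic is not reported in the paper; the sketch below applies it to three rows of Table 1 using only the means and SDs printed there. The function name and the equal-weight pooling formula are choices made for this illustration.

```python
from math import sqrt


def standardized_difference(mean_control, sd_control, mean_treat, sd_treat):
    """Treatment-minus-control gap in units of the equal-weight pooled SD."""
    pooled_sd = sqrt((sd_control ** 2 + sd_treat ** 2) / 2)
    return (mean_treat - mean_control) / pooled_sd


# (control mean, control SD, treatment mean, treatment SD), copied from Table 1
table1_rows = {
    "No. of children <=5 yr": (2.60, 0.76, 2.78, 0.86),
    "BSID III receptive vocabulary": (18.79, 7.62, 20.10, 8.03),
    "ASQ:SE autonomy": (4.06, 5.27, 5.15, 5.78),
}

for label, stats in table1_rows.items():
    print(f"{label}: {standardized_difference(*stats):.2f} SD")
```

For these three rows, which have the smallest unadjusted P-values in their blocks, the gaps come out between roughly 0.17 and 0.22 SD.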

standardized BSID III scores31 at baseline. These were 90.4 (SD = 13.3) for cognition, 88.9 (SD = 13.2) for language and 93.6 (SD = 13.6) for motor development. This implies that children in the sample were about 0.7 standard deviations below average relative to published norms (Feinstein, 2003), and slightly below scores reported in a recent study including a similar sample of low-income children aged 12–24 months in rural areas in Colombia (Attanasio et al., 2018). The latter reported standardized scores of 92.0 for cognition and 91.6 for language. In contrast, children between 18 and 36 months of age in Bogota (Colombia's capital) from average income households scored only within 0.1 below or above the norming sample in the three areas (Rubio-Codina et al., 2015). These results indicate that the children in our sample are significantly vulnerable.32
Average socio-emotional (ASQ:SE) scores were slightly elevated compared to the validation sample and quite comparable to those of children from low socioeconomic status urban households in the Colombian Longitudinal Household Survey in 2013 (ELCA, 2013).
Overall, there are few statistically significant and non-systematic

31
Bayley III composites computed based on published norms provided by test developers. Standardized scores have mean 100 and a standard deviation of 15. We only use standardized scores to understand vulnerability. Estimations use raw scores as stated earlier.
32
Distributions of outcomes are depicted in Figures A1a-A1c by ITT. Outcome distributions are as expected. As anticipated, cognitive, language and motor developmental outcomes are correlated with maternal education (Table A1h). Socio-emotional outcomes, on the other hand, are not. This may be due to the screening nature of the measure. Socioemotional development is, however, correlated with the number of books at home, which proxies for maternal behaviors not necessarily correlated with education.

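Footnote 31 states that Bayley III standardized scores have mean 100 and SD 15, so the baseline sample means quoted in the text can be converted into distances from the norm as (score − 100) / 15. The short sketch below does this arithmetic; the variable and function names are illustrative.

```python
NORM_MEAN = 100.0  # mean of Bayley III standardized scores (footnote 31)
NORM_SD = 15.0     # standard deviation of the standardized scores (footnote 31)

# Baseline sample means reported in the text
baseline_means = {"cognition": 90.4, "language": 88.9, "motor": 93.6}


def sd_units_from_norm(score, norm_mean=NORM_MEAN, norm_sd=NORM_SD):
    """Distance of a standardized score from the norm mean, in norm SD units."""
    return (score - norm_mean) / norm_sd


gaps = {domain: sd_units_from_norm(m) for domain, m in baseline_means.items()}
for domain, gap in gaps.items():
    print(f"{domain}: {gap:+.2f} SD")
```

By this arithmetic the sample sits about 0.64 SD below the norm for cognition, 0.74 SD for language and 0.43 SD for motor development, which is broadly in line with the "about 0.7" summary for cognition and language.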

differences between ITT groups at baseline, with treatment families having on average 0.18 more children under the age of five and being more likely to have attended childcare during the year prior to baseline (14% vs. 11%). We also report differences in the cohort of birth. In particular, there were fewer babies than toddlers in the treatment group by design, as class sizes for younger children were lower. Finally, we report differences in developmental outcomes at baseline. Children in the treatment group scored higher in BSID receptive vocabulary and total language than children in the control group. At the same time, parents of children in the treatment group reported higher prevalence of problematic behaviors in the areas of compliance, autonomy, and total ASQ:SE scores than parents of children in the control group.33 Some, but not all, of these differences remain significant after step-down adjustment of P-values. All analyses provided hereinafter control for these baseline measures to increase precision.34

4.2. Attrition and compliance

A total of 93%, that is, 763 of the 819 children in the sample, were assessed at post-test (see Fig. 1). In particular, 22 (6.5%) children in the treatment group and 34 (7%) in the control group were not re-interviewed due to migration to a municipality located too far from our sites.35 Attrition rates were not statistically significantly different between treatment and control groups, overall or for subpopulation groups.36
Compliance was high for the control and treatment groups: 73% of lottery winners enrolled in the centers and 83% of lottery losers did not enroll in the centers (see Fig. 1).37 Compliance was strongly correlated with ITT and the child's age, and varied significantly by site. In terms of enrollment in aeioTU centers at follow-up, 73% of children in the treatment group surveyed at follow-up were enrolled, as well as 17% of children in the control group. Enrollment was higher for older children, for boys, and for children of more educated working mothers.38
At post-testing, 82.2% of children in the treatment group were enrolled in early education services; of these, 91.5% were enrolled in aeioTU, 6.9% in other publicly provided child care services, and 1.6% in NGO-provided services or private center-based child care. In the treatment group, 16.8% of children did not attend any program, and an additional 1% did not respond. Of those children not attending any program, 92% were being cared for by the mother, 2% by the father, and the remainder were cared for by other relatives or non-relatives at the child's home. As for the control group, 37% of children were enrolled in early education programs at follow-up, out of which 48% were enrolled in aeioTU centers, 41.2% in other publicly provided alternatives, and the rest in NGO-provided services or private childcare. About 60% of children in the control group were not attending any child care program at follow-up; of these, 86% were being cared for by the mother, 2% by the father, and the rest by other relatives or non-relatives.39

4.3. Estimations of program impact

4.3.1. Intent-to-treat and treatment-on-treated on final outcomes

Table 2 reports treatment effects, standard errors, and step-down P-values by outcome. We first present ITT estimates controlling for baseline outcomes (Eq. (1)) along with their standard errors. Next, we present TOT estimates by two-stage least squares, using random assignment as an instrumental variable for enrollment in the centers, and their standard errors. In each case, we report step-down P-values, or unadjusted P-values when the adjustment is not required. Program effects are reported as fractions of a standard deviation relative to the control group. To put estimated effects in perspective, 0.75 SD is equivalent to the vocabulary ability gap between low- and high-income 3-year-olds in Colombia (Colombian Longitudinal Household Survey ELCA, 2013). We find statistically significant effects for language and cognitive development, but not for physical or socio-emotional development.
Most of the results on cognition are for the subsample of children who did not outgrow the BSID III between baseline and follow-up, that is, the sample of approximately 480 children (out of 763 infants and toddlers surveyed at both baseline and follow-up) who were under 36 months of age at post-test.40 The ITT effect on receptive language by BSID III was 0.111 SD (P-value = 0.035),41 the effect on expressive vocabulary was 0.114 SD (P-value = 0.035), and the combined language effect was 0.112 SD (P-value = 0.006). The TOT effect on receptive language was 0.204 SD (P-value = 0.033), the effect on expressive vocabulary was 0.208 SD (P-value = 0.033), and the combined language effect was 0.205 SD (P-value = 0.005).
To address the fact that BSID III is only observed at follow-up for a subsample of children, we internally age-standardized both BSID III receptive vocabulary and TVIP scores in the complete sample and then pooled both scores into a single estimation, controlling for the change of instrument in the regression. This allows us to estimate impacts on 710 children (last row in the second panel in Table 2). We find no statistically significant impact on this combined measure of receptive vocabulary for the full sample. In addition, we estimate effects on receptive language for the subsample of children that outgrew the BSID III between baseline and follow-up using TVIP scores. We do not observe significant effects on this subsample of 230 children older than 36 months of age at follow-up either.
The effect on BSID III cognition was 0.074 SD (P-value = 0.035) by ITT and 0.138 SD (P-value = 0.033) by TOT. The effect on fine motor development was 0.063 (P-value = 0.048) by ITT and 0.119 (P-value = 0.048) by TOT, and the effect on gross motor development was 0.047 (P-value = 0.076) by ITT and 0.085 (P-value = 0.074) by TOT. The effect on total motor development (aggregate of fine and gross motor) was 0.049 (P-value = 0.035) by ITT and 0.092 (P-value = 0.033) by TOT. The program effects on overall development, as measured by the aggregate of language, cognition and motor development, were 0.064 SD (P-value = 0.016) by ITT and 0.117 SD (P-value = 0.014) by TOT.42,43

33
When we re-create the same table for the complete sample interviewed at baseline, these imbalances remain but no others emerge.
34
Tables A1a-i in the online Appendix display demographics and outcomes at baseline by gender and for various subgroups of children relevant for the heterogeneous impacts results presented later.
35
Table A2 shows attrition rates for the sample and for subgroups of interest.
36
Table A3 in the online appendix assesses the determinants of attrition. For the overall sample, no observables were found to statistically significantly relate to attrition with the exception of neighborhood, which we control for in our estimations.
37
Table A4 in the online appendix reports compliance rates by subpopulation groups.
38
Tables A5a-c in the online appendix report the determinants of enrollment in the program in the complete sample and subgroups of interest.
39
See Table A6 in the online appendix.
40
For this reason, we show in the corresponding tables in the online appendix all analyses, including attrition, compliance, enrollment, and baseline equivalence, for this specific subsample of children as well (Tables A1i, A2, A4 and A5b).
41
We report here step-down adjusted P-values and unadjusted P-values in cases when multiple hypotheses testing does not apply (as in Table 2).
42
Given that program effects for the BSID were estimated for the subsample of children who did not outgrow the measure between baseline and follow-up, we also estimated program impacts on nutrition, socio-emotional development and the home environment on this subsample, i.e., children that were still younger than 36 months of age at follow-up. The results are virtually unchanged (see Table A10 in the online Appendix).
43
In Table A9 in the online appendix, we show results that exclude tester fixed effects from equation (1). The main results remain unchanged. Tester fixed effects are not correlated with ITT (available upon request).


Table 2
aeioTU intervention. ITT and TOT estimations of program effects by outcome.
Outcomes N ITT effect size (S.E.) P-value TOT Effect size (S.E.) P-value
A. Nutrition (Z scores)

Height-for-age 738 0.047 (0.039) 0.593 0.083 (0.067) 0.581


BMI-for-age 724 −0.038 (0.050) 0.818 −0.067 (0.088) 0.810
Weight-for-age 738 0.025 (0.041) 0.873 0.043 (0.071) 0.874
Weight-for-length 728 0.002 (0.047) 0.964 0.003 (0.083) 0.963
Arm circumference 726 −0.031 (0.055) 0.873 −0.053 (0.093) 0.874

B. Language, cognitive and motor development: Bayley Scales of Infant Development III (BSID III)
BSID receptive vocabulary 480 0.111** (0.044) 0.034 0.204** (0.080) 0.032
BSID expressive vocabulary 482 0.114** (0.047) 0.034 0.208** (0.084) 0.032
BSID language total 473 0.112*** (0.041) 0.006 0.205*** (0.073) 0.005
BSID cognitive 487 0.074** (0.030) 0.034 0.138** (0.055) 0.032
BSID fine motor 483 0.063** (0.028) 0.046 0.119** (0.053) 0.048
BSID gross motor 482 0.047* (0.027) 0.084 0.085* (0.048) 0.078
BSID motor total 478 0.049** (0.023) 0.035 0.092** (0.043) 0.033
BSID III Total 456 0.064** (0.026) 0.016 0.117** (0.047) 0.014
TVIP 710 0.100 (0.075) 0.184 0.173 (0.128) 0.177
Receptive vocabulary (Std) ‡ 230 0.027 (0.123) 0.826 0.042 (0.185) 0.819

C. Socio-emotional development: Ages and Stages Socio-Emotional (ASQ:SE)


Self-regulation 748 0.073 (0.074) 0.88 0.131 (0.130) 0.870
Compliance 559 0.103 (0.094) 0.863 0.171 (0.151) 0.838
Communication 748 0.01 (0.095) 0.985 0.017 (0.168) 0.985
Adaptive functioning 747 0.03 (0.070) 0.976 0.054 (0.122) 0.973
Autonomy 559 0.009 (0.074) 0.985 0.016 (0.119) 0.985
Affect 748 0.04 (0.078) 0.976 0.071 (0.136) 0.973
Interaction 747 −0.077 (0.092) 0.904 −0.138 (0.161) 0.894
ASQ:SE Total 746 0.078 (0.087) 0.373 0.140 (0.154) 0.364

Note: Sample sizes vary by instrument due to the collection method (with fewer households assessed with HOME than children nutritionally) and as children at post-
testing grow out of the BSID III. We show BSID III results for children still eligible for BSID at follow-up (456 ≤ N ≤ 487), as well as TVIP language results for
children who outgrew the BSID III at follow-up (N = 230). We use nutritional Z-scores by WHO standards, and raw scores of all other outcomes except for ‡. The
regression controls for the corresponding pre-test (except for HOME for which there was no pretest), age, age squared, male, black, mother marital status (single),
maternal years of education, household wealth index, household kids, childcare experience before randomization, neighborhood, cohort of birth and tester fixed
effects. Effect sizes are interpreted as fraction of SD in control group at baseline. *p < 0.10; **p < 0.05; ***p < 0.01; Significance according to standard or
stepdown P-values (where applicable). Stepdown P-values are for Romano and Wolf (2005) stepdown procedures applied to blocks of outcomes per type of
developmental dimension measured. For Bayley, motor scales are one block and cognitive and language are another block. Combined receptive vocabulary is excluded
from the block as it combines two measures of receptive vocabulary and is based on a different sample than other outcomes within that group. RW p-values are not
calculated for the total aggregate scores as these are aggregate measures across various dimensions or for one developmental domain. In these cases, we show
unadjusted p-values in italics in the corresponding columns. ‡ We internally age-standardized both BSID III receptive vocabulary and TVIP scores in the complete
sample and then pooled both in a single regression controlling for measure type.
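In a just-identified instrumental-variable model that uses the same controls in both stages, the TOT coefficient equals the ITT coefficient divided by the first-stage coefficient (the effect of assignment on enrollment). The sketch below backs out that implied first stage from a few rows of Table 2 and sets it beside the raw take-up gap implied by the enrollment rates in Section 4.2 (73% vs. 17%). It is a consistency check on rounded, published numbers under that assumption, not a re-estimation; the names are illustrative.

```python
# (ITT effect size, TOT effect size), copied from Table 2
table2_effects = {
    "Height-for-age": (0.047, 0.083),
    "BSID receptive vocabulary": (0.111, 0.204),
    "BSID expressive vocabulary": (0.114, 0.208),
    "BSID language total": (0.112, 0.205),
    "BSID cognitive": (0.074, 0.138),
    "BSID III Total": (0.064, 0.117),
}

# Raw enrollment gap between treatment and control at follow-up (Section 4.2)
raw_take_up_gap = 0.73 - 0.17


def implied_first_stage(itt, tot):
    """First-stage coefficient implied by the identity TOT = ITT / first stage."""
    return itt / tot


implied = {name: implied_first_stage(itt, tot)
           for name, (itt, tot) in table2_effects.items()}
for name, value in implied.items():
    print(f"{name}: implied first stage = {value:.3f} (raw gap = {raw_take_up_gap:.2f})")
```

The implied values cluster between about 0.54 and 0.57, close to the raw gap of 0.56. Small differences are to be expected from rounding, covariate adjustment and the different estimation samples.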

Table 3
aeioTU intervention. ITT and TOT estimations of program effects for intermediate outcomes: home observation and measurement of the environment (HOME).
Outcomes N ITT effect size (S.E.) P-value TOT effect size (S.E.) P-value

Responsivity 720 0.057 (0.076) 0.830 0.099 (0.130) 0.820


Acceptance 720 0.079 (0.077) 0.830 0.137 (0.133) 0.820
Organization 702 0.073 (0.073) 0.830 0.126 (0.124) 0.820
Learning materials 702 −0.001 (0.068) 0.991 −0.001 (0.116) 0.991
Involvement 702 −0.04 (0.076) 0.830 −0.070 (0.130) 0.820
Variety 720 −0.093 (0.073) 0.726 −0.162 (0.125) 0.709
HOME total 720 0.002 (0.068) 0.977 0.003 (0.117) 0.976

Note: All notes related to P-values and controls for Table 2 apply here. However, effect sizes are interpreted as fraction of SD in control group at post-test as there was
no baseline testing with HOME.

4.4. Indirect effects

We conducted the same series of analyses to investigate program effects on intermediate outcomes. In particular, we focus on the HOME, a measure of the environment and care of children in their own homes. Table 3 reports these results for the overall HOME score and its subscales. We find no evidence of statistically significant effects on the home environment for the full sample, suggesting that all effects are due to center-based experiences provided by the program.

4.5. Heterogeneous effects

We also conducted analyses separately for girls and boys, children of mothers with and without secondary education completion, and stunted and non-stunted children at baseline (as in Blaga et al., 2009, and Hoddinott, Alderman, Behrman, Haddad, & Horton, 2013). While the latter two groupings relate to degree of vulnerability, it is important to keep in mind that the whole sample is quite vulnerable. We again find no significant effects on overall physical development (nutrition effects) or on social-emotional development. Results for the remaining


Table 4
aeioTU intent-to-treat effects by outcome, for selected groups. Reports effect size (p-value†).
Outcome variables Females Males Less than high school High school or higher Stunted Non-stunted
A. Nutrition (Z scores)

Height-for-age 0.011 0.079 0.055 0.024 −0.172 0.094


(0.999) (0.770) (0.855) (0.976) (0.926) (0.409)
BMI-for-age −0.035 −0.036 −0.095 0.087 −0.109 −0.016
(0.993) (0.993) (0.597) (0.861) (0.930) (0.965)
Weight-for-age 0.007 0.038 0.002 0.074 −0.082 0.057
(0.999) (0.989) (0.994) (0.801) (0.949) (0.781)
Weight-for-length −0.002 0.011 −0.041 0.107 0.057 −0.008
(0.999) (0.999) (0.946) (0.744) (0.965) (0.979)
Arm circumference −0.047 −0.010 −0.035 −0.009 −0.171 0.006
(0.993) (0.999) (0.976) (0.994) (0.827) (0.979)

B. Cognitive and language development: Bayley Scales of Infant Development III (BSID III)
BSID receptive vocabulary 0.149 0.065 0.036 0.239 0.093 0.095
(0.130) (0.440) (0.524) (0.010) (0.698) (0.185)
BSID expressive vocabulary 0.179 0.025 0.069 0.176 −0.064 0.116
(0.078) (0.680) (0.418) (0.110) (0.698) (0.127)
BSID language total 0.171 0.032 0.056 0.205 −0.016 0.110
(0.013) (0.545) (0.272) (0.010) (0.896) (0.027)
BSID cognitive 0.075 0.072 0.049 0.134 0.129 0.046
(0.292) (0.225) (0.418) (0.056) (0.416) (0.416)
BSID fine motor 0.125 0.016 0.045 0.087 −0.023 0.055
(0.012) (0.718) (0.469) (0.190) (0.796) (0.200)
BSID gross motor 0.080 0.023 0.043 0.050 0.077 0.028
(0.213) (0.718) (0.469) (0.469) (0.675) (0.675)
BSID motor total 0.100 0.013 0.039 0.067 −0.020 0.040
(0.014) (0.663) (0.199) (0.154) (0.786) (0.183)
BSID III Total 0.101 0.029 0.032 0.129 0.003 0.055
(0.025) (0.388) (0.325) (0.008) (0.977) (0.082)
TVIP 0.445 −0.288 −0.004 0.042 0.140 −0.020
(0.060) (0.091) (0.977) (0.977) (0.852) (0.876)
Receptive vocabulary‡ 0.295 −0.083 −0.005 0.279 0.203 0.054
(0.026) (0.418) (0.959) (0.064) (0.386) (0.527)

C. Socio-emotional development: Ages and Stages Socio-Emotional (ASQ:SE)


Self-regulation 0.196 −0.016 0.106 −0.014 0.028 0.109
(0.731) (0.999) (0.971) (0.999) (0.999) (0.914)
Compliance 0.105 0.031 0.123 0.111 0.192 −0.004
(0.997) (0.999) (0.983) (0.999) (0.986) (0.999)
Communication −0.071 0.033 0.060 −0.183 0.203 −0.024
(0.999) (0.999) (0.999) (0.988) (0.990) (0.996)
Adaptive functioning 0.167 −0.097 0.017 0.053 0.035 0.026
(0.731) (0.980) (0.999) (0.999) (0.999) (0.996)
Autonomy −0.014 0.056 0.049 −0.054 0.138 −0.021
(0.999) (0.999) (0.999) (0.999) (0.996) (0.996)
Affect 0.031 −0.021 0.035 0.070 −0.001 0.037
(0.999) (0.999) (0.999) (0.999) (0.999) (0.996)
Interaction 0.004 −0.179 −0.013 −0.285 −0.083 −0.095
(0.999) (0.831) (0.999) (0.705) (0.996) (0.990)
ASQ:SE Total 0.245 −0.067 0.125 −0.034 0.152 0.077
(0.192) (0.513) (0.425) (0.835) (0.660) (0.660)

Note: Individual rows present the results of separate regressions for each subpopulation group. Sample sizes and controls notes from Table 2 apply here as well.
Results are shown as effect sizes (also known as Cohen's d), that is, βs interpreted as a fraction of the SD in the control group at baseline (except for HOME, for which there was no
baseline testing). *p < 0.10; **p < 0.05; ***p < 0.01 according to stepdown P-values using Romano and Wolf (2005). In this set of estimations, step-downs are
also calculated within subgroup categories in pairs of columns, that is: females and males, less than high school and high school or higher, and stunted and non-
stunted. Therefore, totals have estimated step-down p-values within the pair. ‡ We internally age-standardized both BSID III receptive vocabulary and TVIP scores in
the complete sample and then pooled both in a single regression controlling for measure type.

subgroup estimations are shown in Table 4 (ITT estimations) and Table 5 (TOT estimations).44 ITT and TOT estimates for subgroup analyses are virtually identical in terms of identified effects, and we describe only the TOT estimates in the text for brevity. As expected, TOT estimated effects are larger.

44 The results for the impacts on HOME are available in Tables A7 and A8 in the appendix.

We find significant and large positive effects on language, motor and cognitive development for girls, but not for boys. Specifically, effects for girls were 0.33 SD (P-value = 0.073) on expressive vocabulary by BSID III, 0.301 SD (P-value = 0.014) for the language composite of BSID III, 0.48 SD (P-value = 0.02) for the complete sample using age-standardized pooled TVIP and receptive language scores and also 0.61 SD (P-value = 0.051) for TVIP in the subsample of girls who outgrew the BSID III between baseline and follow-up. We also observed an effect of 0.228 SD (P-value = 0.014) on fine motor development, 0.183 for overall development by BSID III.

By maternal education, we observe most effects on language, motor and cognitive development for children of more educated mothers and not for children of less educated mothers. The TOT effect on BSID receptive language was 0.379 SD (P-value = 0.009) for children of mothers with high school attainment or above, 0.269 SD (P-value = 0.090) for BSID expressive vocabulary and 0.320 SD (P-


Table 5
aeioTU treatment-on-treated effects by outcome, for selected groups. Reports effect size (p-value†).
Outcome variables Females Males Less than high school High school or higher Stunted Non-stunted
A. Nutrition (Z scores)
Height-for-age 0.019 0.141 0.100 0.041 −0.247 0.174
(0.998) (0.755) (0.843) (0.974) (0.902) (0.397)
BMI-for-age −0.063 −0.065 −0.174 0.152 −0.159 −0.030
(0.990) (0.990) (0.561) (0.848) (0.902) (0.958)
Weight-for-age 0.012 0.069 0.004 0.126 −0.119 0.105
(0.998) (0.990) (0.994) (0.773) (0.929) (0.765)
Weight-for-length −0.004 0.020 −0.076 0.187 0.083 −0.015
(0.998) (0.998) (0.936) (0.700) (0.958) (0.981)
Arm circumference −0.083 −0.018 −0.060 −0.016 −0.244 0.010
(0.990) (0.998) (0.974) (0.994) (0.765) (0.981)

B. Cognitive and language development: Bayley Scales of Infant Development III (BSID III)
BSID receptive vocabulary 0.261 0.131 0.072 0.379 0.162 0.177
(0.128) (0.411) (0.505) (0.009) (0.587) (0.176)
BSID expressive vocabulary 0.330 0.047 0.140 0.269 −0.106 0.218
(0.073) (0.662) (0.384) (0.090) (0.587) (0.119)
BSID language total 0.301 0.062 0.110 0.320 −0.028 0.205
(0.014) (0.519) (0.251) (0.008) (0.872) (0.029)
BSID cognitive 0.137 0.142 0.102 0.209 0.218 0.088
(0.253) (0.198) (0.384) (0.041) (0.317) (0.317)
BSID fine motor 0.228 0.032 0.096 0.134 −0.039 0.106
(0.014) (0.695) (0.432) (0.168) (0.755) (0.193)
BSID gross motor 0.144 0.046 0.089 0.075 0.124 0.053
(0.175) (0.695) (0.432) (0.432) (0.581) (0.581)
BSID motor total 0.183 0.026 0.080 0.102 −0.033 0.076
(0.018) (0.649) (0.186) (0.125) (0.737) (0.176)
BSID III total 0.178 0.056 0.064 0.197 0.005 0.102
(0.018) (0.359) (0.316) (0.007) (0.967) (0.082)
TVIP 0.618 −0.477 −0.006 0.084 0.166 −0.034
(0.051) (0.064) (0.967) (0.965) (0.735) (0.884)
Receptive vocabulary‡ 0.486 −0.152 −0.008 0.480 0.277 0.100
(0.020) (0.406) (0.959) (0.057) (0.313) (0.515)

C. Socio-emotional development: Ages and Stages Socio-Emotional (ASQ:SE)
Self-regulation 0.354 −0.029 0.197 −0.025 0.041 0.208
(0.676) (0.999) (0.961) (0.999) (0.998) (0.894)
Compliance 0.174 0.052 0.205 0.187 0.269 −0.006
(0.996) (0.999) (0.975) (0.996) (0.952) (0.999)
Communication −0.129 0.061 0.112 −0.310 0.297 −0.046
(0.999) (0.999) (0.999) (0.980) (0.984) (0.998)
Adaptive functioning 0.298 −0.177 0.032 0.090 0.051 0.048
(0.672) (0.974) (0.999) (0.999) (0.991) (0.991)
Autonomy −0.023 0.095 0.081 −0.091 0.196 −0.037
(0.999) (0.999) (0.999) (0.999) (0.991) (0.991)
Affect 0.055 −0.038 0.066 0.118 −0.002 0.071
(0.999) (0.999) (0.999) (0.999) (0.999) (0.991)
Interaction 0.007 −0.327 −0.025 −0.485 −0.119 −0.180
(0.999) (0.801) (0.999) (0.633) (0.991) (0.984)
ASQ:SE Total 0.442 −0.122 0.232 −0.059 0.219 0.148
(0.180) (0.494) (0.402) (0.820) (0.639) (0.639)

Note: We estimate the effect of actual enrollment in centers by two-stage least squares, instrumenting enrollment with random assignment to treatment. All notes
from Table 4 apply here as well. *p < 0.10; **p < 0.05; ***p < 0.01.
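
The relation between ITT and TOT estimates described in the note above can be sketched as follows. This is a simplified illustration, not the authors' estimation code: it omits the controls used in the study, assumes (hypothetically) that no control-group child enrolled, and uses made-up data. With a single binary instrument and no covariates, two-stage least squares reduces to the ITT effect divided by the difference in enrollment rates between assignment groups, so the TOT estimate exceeds the ITT estimate in magnitude whenever take-up is incomplete.

```python
import numpy as np

def tot_2sls(y, enrolled, assigned):
    """Two-stage least squares with one binary instrument and an intercept."""
    y, d, z = (np.asarray(a, dtype=float) for a in (y, enrolled, assigned))
    Z = np.column_stack([np.ones_like(z), z])
    # First stage: predict enrollment from random assignment.
    first_stage, *_ = np.linalg.lstsq(Z, d, rcond=None)
    d_hat = Z @ first_stage
    # Second stage: regress the outcome on predicted enrollment.
    X = np.column_stack([np.ones_like(d_hat), d_hat])
    second_stage, *_ = np.linalg.lstsq(X, y, rcond=None)
    return second_stage[1]

def tot_wald(y, enrolled, assigned):
    """ITT effect divided by the difference in enrollment rates."""
    y, d, z = (np.asarray(a, dtype=float) for a in (y, enrolled, assigned))
    itt_effect = y[z == 1].mean() - y[z == 0].mean()
    take_up = d[z == 1].mean() - d[z == 0].mean()
    return itt_effect / take_up

# Hypothetical data: 4 children assigned to treatment (3 enroll), 4 controls.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
enrolled = [1, 1, 1, 0, 0, 0, 0, 0]
outcome = [0.6, 0.4, 0.5, 0.1, 0.1, 0.0, 0.2, 0.1]  # standardized scores

itt = np.mean(outcome[:4]) - np.mean(outcome[4:])
tot_a = tot_2sls(outcome, enrolled, assigned)
tot_b = tot_wald(outcome, enrolled, assigned)
print(f"ITT = {itt:.2f}, TOT (2SLS) = {tot_a:.2f}, TOT (Wald) = {tot_b:.2f}")
```

In this toy example the ITT effect is 0.30 and take-up is 75%, so both routes give a TOT of 0.40.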

value = 0.008) for BSID language total. Similarly, the effect on BSID cognition is 0.209 SD (P-value = 0.041). The effect on receptive vocabulary for the full sample is 0.48 SD (P-value = 0.057) for children of more educated mothers. There are no differences in motor development by maternal education.45

45 There are no differences in impacts on socio-emotional development (Table 5) or the home environment (Tables A7 and A8 in the appendix) by child's gender, initial developmental status or maternal education.

5. Discussion and conclusions

Infants and toddlers were randomly assigned to treatment in two economically disadvantaged sites in northern Colombia to estimate the impact of a high-quality center-based care intervention. At baseline, only 12.5% of infants and toddlers had previously used child care services, mostly low-quality home-based care. The rest of the children (87.5%) were cared for at home by parents or relatives. Children came from very deprived backgrounds as indicated by their socioeconomic conditions, home learning environments, maternal education and initial developmental levels. As a result, the counterfactual to the high-quality center-based care intervention studied was predominantly parental care or public home-based care with poor learning environments.

From only 130–150 days of intervention within the 8-month period between program roll-out and post-test, we find positive effects of 0.20 SD for BSID III total language, 0.14 SD for BSID III cognitive, 0.09 SD for BSID III motor development and 0.12 SD for BSID III overall development, for the subsample of children younger than 36 months of age at follow-up. These effects can be thought of as percentages of the development gap with respect to more advantaged children this age, as poorer children began about 0.75 SD behind; for example, the 0.20 SD effect on language corresponds to roughly 27% of that gap. The particularly strong findings for language effects, given the short duration of the program,


suggest that the program has been particularly effective in exposing children to richer language environments than they would have experienced otherwise and in encouraging the use of expressive language. The importance of introducing infants and toddlers to a rich language environment has been well documented in the literature (Weisleder & Fernald, 2013).

The positive findings, when added to the existing evidence, support the view that the quality of the experiences provided by a policy or program relative to that otherwise available in or out of the home is a critical variable. Where relative quality is higher, effects are positive (e.g., Drange & Havnes, 2015; Felfe & Lalive, 2014; Noboa-Hidalgo & Urzua, 2012). Where relative quality is lower, effects are negative (Baker et al., 2015; Fort et al., 2019; Rosero & Oosterbeek, 2011). The aeioTU program in this study was offered to highly disadvantaged families, with staff exceeding average education levels in the community and the program exceeding the quality of the few other alternatives and of the learning environment in the children's homes.

We find significant gender differences in impacts on language and cognitive development in favor of girls. Magnuson et al. (2016) report that the literature on the effects of early childhood interventions presents mixed evidence regarding which gender benefits the most. Gender differences seem to vary by context, type and quality of the intervention, children's ages and the specific developmental domains under study. García, Heckman and Ziff (2018), for example, report that girls benefitted more from the Abecedarian Program in the U.S. than boys did because girls came from more deprived households in which the learning environment was worse. On the other hand, Muschkin, Ladd, Dodge, and Bai (2018) report that boys benefitted more than girls in educational achievement gains from added public investments in early care and education.

In our study, we do not observe systematic baseline differences between boys and girls in socioeconomic characteristics. However, we do observe baseline differences in parental interactions with boys versus girls. In particular, we observe better parental-reported interactions of mothers with boys in play, feeding practices and reading.46 These differences in parent-child interactions suggest that girls experienced a greater increase in quality of care from the program.

46 Mothers reported more play interactions with boys (0.11 SD, P-value = 0.082) and also more feeding interactions for boys (0.15 SD, P-value = 0.032) at baseline.

On the other hand, developmental differences by gender could explain the gender outcomes. For example, recent studies reported a consistent advantage of girls during the first 30 months of life in early communicative gestures, early vocabulary growth, and vocabulary size and complexity (see Barbu et al., 2015, for a review). We observe that at baseline, girls outperform boys in expressive vocabulary, language, motor skills and self-regulation.47 Higher initial developmental levels may have allowed girls to reap greater benefits from the learning experience.

47 Girls outperform boys in receptive vocabulary (0.22 SD, P-value = 0.055), expressive vocabulary (0.15 SD, P-value = 0.048), fine motor skills (0.15 SD, P-value = 0.049), and total BSID scores (0.21 SD, P-value = 0.005) at baseline. They also outperformed boys in self-regulation (−0.17 SD, P-value = 0.022).

Our finding that cognitive gains are concentrated among children of mothers with more education seems at odds with the more typical finding that impacts of early care and education are higher for more disadvantaged children. However, all households in our study were disadvantaged, so our results apply only to low-income vulnerable households. Children of more educated parents had higher receptive vocabulary (0.18 SD, P-value = 0.022), cognition (0.13 SD, P-value = 0.050) and autonomy (0.14 SD, P-value = 0.097) at baseline, higher levels of parent-child interactions (reading and play), and higher parental educational expectations.48 That the higher level of baseline development is similar to what we found for girls is suggestive. Possibly there is some complementarity between home and center care for disadvantaged children. Children of the most disadvantaged families experienced a large number of adverse risks, with higher rates of food fragility (0.32 SD, P-value = 0.007), higher rates of single mothers (0.18 SD, P-value = 0.022), lower wealth index (0.30 SD, P-value = 0.000), lower household income (0.26 SD, P-value = 0.26) and a lower number of children's books at home (0.24 SD, P-value = 0.001). The literature on adverse childhood experiences (ACEs) has shown that the greater the number of adverse experiences, the lower the language, literacy and social skills in early childhood (Jimenez, Wade, Lin, Morrow, & Reichman, 2016). Possibly some children were too disadvantaged to benefit from only eight months of intervention.

48 More educated mothers at baseline reported reading to and playing with children more often (0.15 SD with P-value = 0.052, and 0.11 SD with P-value = 0.097, respectively). Fathers in homes of more educated mothers also reported reading to children more often (0.14 SD, P-value = 0.038) and parents reported more use of positive discipline methods (0.14 SD, P-value = 0.065).

We do not find statistically significant effects on socio-emotional development. The lack of effects might be due to several reasons. The ASQ:SE is parent-reported. The use of parent-reported measures poses the usual concerns: lack of sensitivity, biases, and/or the accurate measurement of underlying constructs. There are also documented differences in child socio-emotional reports across informants (Achenbach, McConaughy, & Howell, 1987; Renk & Phares, 2004), which could be related to differences in how children behave in centers versus at home. However, behavioral effects have been observed for children this young in other studies with parent-reported measures (Deater-Deckard, Pinkerton, & Scarr, 1996; Loeb, Bridges, Bassok, Fuller, & Rumberger, 2007). It is also possible that the aeioTU curriculum did not sufficiently emphasize socio-emotional development as opposed to language and cognition as it was initially implemented. Finally, because the ASQ:SE is a screening tool to assess risk rather than progression in socio-emotional skills, it may not capture modest differences in socio-emotional development (Yates, Ostrosky, Cheatham, LaShorage, & Santos, 2008).

We do not find evidence of impacts on nutritional outcomes. We propose two possible explanations for the lack of nutritional effects despite aeioTU investing additional resources in the nutritional component of the program. First, children in the sample at baseline seem to lag considerably in height-for-age, with stunting reaching almost 22% and an additional 30% of children being at risk of stunting given that their height-for-age z-scores are between −2 and −1 SD. On the other hand, the weight indicators appear to be above the population's mean. In particular, only 3% of children are underweight and less than 1% exhibit wasting. Height is a long-term indicator that depends not only on diet but also on health and sanitation conditions, and therefore it is more difficult to alter in the short term than weight. This means that a change in the children's diet for 8 months might not have been enough to affect this particular nutritional status indicator.

49 The specific question is whether the child skipped at least one meal during the previous week due to monetary constraints.

Second, as children received breakfast, lunch and snacks in the centers, parents might reallocate resources at home to other children in the household. We find a positive association between the program and parental-reported food insecurity49 of about five percentage points that is statistically significant at the 5% level. One interpretation of this result is that the food provided by the program heightened parental awareness of unmet nutritional needs in the home, and this could have contributed to reallocation of home resources. On the other hand, that the food provided by the program allowed reallocation at home might have been expected to have reduced parental reporting of food insecurity. Finally, provision of meals is not the same as consumption. For example, Andrew et al. (2016) report that children younger than 3


years of age in center-based care eat only about 40% of lunch portions without an adult's help.

Our initial positive results on language and cognitive development, eight months into the program, are quite important for those concerned with the development of disadvantaged infants and toddlers in developing countries. Alternative center-based early education programs in Colombia with several common features (Bernal et al., 2019), or innovations introduced to center-based care (Andrew et al., 2016), have demonstrated small or null effects. Other early years interventions have shown results similar to those reported in this study only after longer periods of exposure (Attanasio et al., 2014; Bernal & Fernández, 2013). Attanasio et al. (2014) reported effects of 0.22 SD on language for children 12–24 months of age at baseline of a researcher-implemented home visitation trial with weekly visits offered for 18 months.

Bernal et al. (2019) reported negative and null effects of offering children in home-based childcare the option to transition to center-based childcare in urban Colombia after 18 months of exposure. Centers in that study broadly complied with comprehensive operational and technical national guidelines (as did aeioTU centers) for structural service parameters such as the number of children per square meter, characteristics of physical areas, teachers' qualifications, food handling, bookkeeping, etc. However, they differed significantly from the aeioTU program in their pre- and in-service training strategies, the use of specific curricular guidelines for children's learning, professional development initiatives for teachers, and children's monitoring, among other features. Our findings add to the evidence regarding the elements of quality that matter for impacts on children.

Above all, our results highlight the importance of evaluating a wider range of potentially scalable early childhood programs in low-income countries. The findings suggest that policymakers should remain open to higher quality interventions in centers and not just to less expensive care or home-based interventions for infants and toddlers. Further studies of early childhood interventions are necessary to understand the effects of differences in quality, dosage and delivery platforms. Future reports from this study hope to provide additional information on the effects of continued exposure to the program, effects at older ages, and the persistence of effects as children progress in the aeioTU program and move on to primary schools. However, this first analysis is important as many interventions in developing regions do not surpass even the length of the pre-to-post-testing period of this study. This study also suggests that quality and curriculum deserve attention from both researchers and policy makers. The case of aeioTU also highlights the potential of private-public partnerships that can make it feasible to increase access and quality simultaneously on larger scales by combining public and private sector resources.

In interpreting the results of this study, it is important to keep in mind that at the time of the evaluation, the aeioTU program was still a very young program with only a couple of years of experience, while at the same time having a strong focus on growth in access. Since then, the program has put forth a continuous improvement cycle based on monitoring of classroom quality measures and detailed data on children's developmental trajectories (see Nores et al., 2018). With this improvement process in place, which is well aligned with what is known in terms of quality programs (Frede, 2005), we would expect the program to be able to produce even greater impacts on children.

In terms of external validity of our results, it is important to mention that while the program has been scaled across the country, reaching a significant number of children in urban, peri-urban and semi-rural communities, the extent to which we can generalize the results to these communities and other countries depends on how similar families may be. What is plausibly common across the rest of the country, and most of Latin America, is that the counterfactual to such an intervention for this age group is no care, or in a few instances, home-based care or stimulation interventions, which would support generalizability of our results.

Acknowledgments

This research was supported by the Jacobs Foundation, Grant No. 209-805-1, the UBS Optimus Foundation and the Inter-American Development Bank, PO 209-805-1. We are very thankful to aeioTU for their commitment to early childhood and opening the doors to our evaluation team; iQuartil for their excellent work managing data collection on site; and our data collectors who worked under very difficult conditions. We also gratefully acknowledge the valuable research assistance provided by Cynthia van der Werf, Roman Zárate, María de la Paz Ferro and Santiago Lacouture. Any views expressed are those of the authors and do not necessarily represent those of the funders.

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.econedurev.2019.05.004.

References

Achenbach, T. M., McConaughy, S. H., & Howell, C. T. (1987). Child/adolescent behavioral and emotional problems: Implications of cross-informant correlations for situational specificity. Psychological Bulletin, 101(2), 213.
Andrew, A., Attanasio, O., Bernal, R., Cardona, L., Krutikova, S., & Martínez, D. (2016). Evaluation of centers of infant development: An early years intervention in Colombia. Unpublished manuscript. Institute for Fiscal Studies.
Araujo, M. C., Dormal, M., & Schady, N. (2018). Child care quality and child development. Journal of Human Resources, March 2.
Araujo, M. C., López Bóo, F., & Puyana, J. M. (2013). Overview of early childhood development services in Latin America and the Caribbean. Washington, DC: Inter-American Development Bank.
Attanasio, O., Baker-Henningham, H., Bernal, R., Meghir, C., Pineda, D., & Rubio-Codina, C. (2018). Early stimulation and nutrition: The impacts of a scalable intervention. NBER Working Paper No. 25059.
Attanasio, O., Fernández, C., Fitzsimons, E., Grantham-McGregor, S., Meghir, C., & Rubio-Codina, M. (2014). Using the infrastructure of a conditional cash transfer program to deliver a scalable integrated early child development program in Colombia: Cluster randomized controlled trial. BMJ, 349, g6126.
Attanasio, O., Di Maro, V., & Vera-Hernández, M. (2013). Community nurseries and the nutritional status of poor children. Evidence from Colombia. The Economic Journal, 123(571), 1025–1058.
Barbu, S., Nardy, A., Chevrot, J., Guellai, B., Glas, L., Juhel, J., & Lemasson, A. (2015). Sex differences in language across early childhood: Family socioeconomic status does not impact boys and girls equally. Frontiers in Psychology, 6, 1–10.
Barnett, W. S., & Boocock, S. (Eds.). (1998). Early care and education for children in poverty: Promises, programs and long-term results (pp. 11–44). Albany, NY: SUNY Press.
Barnett, W. S., & Masse, L. (2007). Comparative benefit–cost analysis of the Abecedarian program and its policy implications. Economics of Education Review, 26(1), 113–125.
Baker, M., Gruber, J., & Milligan, K. (2015). Non-cognitive deficits and young adult outcomes: The long-run impacts of a universal child care program. NBER Working Paper No. 21571. https://www.nber.org/papers/w21571.
Bayley, N. (2005). Bayley scales of infant development. San Antonio, TX: The Psychological Corporation, Harcourt Brace & Company.
Behrman, J., & Urzúa, S. (2013). Economic perspectives on some important dimensions of early childhood development in developing countries. In P. R. Britto, P. Engle, & C. Super (Eds.), Handbook of early childhood development: Translating research to global policy (pp. 123–141). New York: Oxford University Press.
Behrman, J., Cheng, Y., & Todd, P. (2004). Evaluating preschool programs when length of exposure to the program varies: A nonparametric approach. Review of Economics and Statistics, 86(1), 108–132.
Berlinski, S., & Schady, N. (Eds.). (2016). The early years: Child well-being and the role of public policy. New York: Springer.
Berlinski, S., Galiani, S., & Gertler, P. (2009). The effect of pre-primary education on primary school performance. Journal of Public Economics, 93(1–2), 219–234.
Berlinski, S., & Galiani, S. (2007). The effect of a large expansion on pre-primary school facilities on preschool attendance and maternal employment. Labour Economics, 14, 665–680.
Bernal, R., Attanasio, O., Peña, X., & Vera-Hernández, M. (2019). The effects of the transition from home-based childcare to center-based childcare in Colombia. Early Childhood Research Quarterly, 47, 418–431.
Bernal, R., & Camacho, A. (2014). Early childhood policy in the context of equity and social mobility in Colombia. In A. Montenegro & M. Meléndez (Eds.), Equidad y movilidad social: Diagnósticos y propuestas para la transformación de la sociedad Colombiana. Editorial Uniandes.
Bernal, R., & Fernández, C. (2013). Subsidized childcare and child development in Colombia: Effects of Hogares Comunitarios de Bienestar as a function of timing and length of exposure. Social Science & Medicine, 97, 241–249.
Bernal, R., & Keane, M. P. (2011). Child care choices and children's cognitive achievement: The case of single mothers. Journal of Labor Economics, 29(3), 459–512.


Bernal, R., Martínez, M. A., & Quintero, C. (2015). Situación de niñas y niños colombianos menores de cinco años entre 2010 y 2013. Bogotá, Colombia: Editorial Kimpress.
Bernal, R. (2015). The impact of a vocational education program for childcare providers on children's well-being. Economics of Education Review, 48, 165–183.
Bernal, R., & Ramírez, C. (2018). Improving child care quality at scale: The effects of "From Zero to Forever". CEDE Working Paper No. 40.
Biroli, P., Del Boca, D., Heckman, J. J., Heckman, L. P., Koh, Y. K., Kuperman, S., et al. (2018). Evaluation of the Reggio approach to early education. Research in Economics, 72(1), 1–32.
Black, M., Walker, S., Fernald, L., Andersen, C., DiGirolamo, A., Lu, C., et al. (2016). Early childhood development coming of age: Science through the life course. Lancet, 389(10064), 77–90.
Blaga, O., Shaddy, J., Anderson, C., Kannass, K., Little, T., & Colombo, J. (2009). Structure and continuity of intellectual development in early childhood. Intelligence, 37(1), 106–113.
Black, M. M., Walker, S. P., Fernald, L. C., Andersen, C. T., DiGirolamo, A. M., Lu, C., et al. (2017). Early childhood development coming of age: Science through the life course. The Lancet, 389(10064), 77–90.
Bowman, B., Donovan, M., & Burns, M. (2001). Eager to learn: Educating our preschoolers. Washington, DC: National Academy Press.
Britto, P. R., Lye, S., Proulx, K., Yousafzai, A., Matthews, S., Vaivada, T., et al. (2017). Nurturing care: Promoting early childhood development. The Lancet, 389(10064), 91–102.
Caldwell, B., & Bradley, R. (1984). Home observation for measurement of the environment: Administration manual. Tempe, AZ: Family & Human Dynamics Research Institute, Arizona State University.
Camilli, G., Vargas, V., Ryan, S., & Barnett, W. S. (2010). Meta-analysis of the effects of early education interventions on cognitive and social development. Teachers College Record, 112(3), 579–620.
Carneiro, P., Løken, K. V., & Salvanes, K. G. (2015). A flying start? Maternity leave benefits and long-run outcomes of children. Journal of Political Economy, 123(2), 365–412.
Comisión Inter-Sectorial para la Primera Infancia CIPI. (2013). De Cero a Siempre: Atención Integral a la primera infancia. Estrategia de atención integral a la primera Infancia. Fundamentos políticos, técnicos y de gestión. Colombia: Presidencia de la República.
Cunha, F., Heckman, J., Lochner, L., & Masterov, D. (2006). Interpreting the evidence on life cycle skill formation. In E. Hanushek & F. Welch (Eds.), Handbook of the economics of education: Vol. 1 (pp. 697–812). Elsevier.
Deater-Deckard, K., Pinkerton, R., & Scarr, S. (1996). Child care quality and children's behavioral adjustment: A four-year longitudinal study. Journal of Child Psychology and Psychiatry, 37(8), 937–948.
Drange, N., & Havnes, T. (2015). Child care before age two and the development of language and numeracy: Evidence from a lottery. IZA Discussion Paper No. 8904. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2582539.
Encuesta Longitudinal Colombiana (ELCA). (2010). Bogotá, Colombia: Universidad de los Andes. Available at https://encuestalongitudinal.uniandes.edu.co/en/.
Encuesta Longitudinal Colombiana (ELCA). (2013). Bogotá, Colombia: Universidad de los Andes. Available at https://encuestalongitudinal.uniandes.edu.co/en/.
Engle, P., Fernald, L., Alderman, H., Behrman, J., O'Gara, C., Yousafzai, A., et al. (2011). Strategies for reducing inequalities and improving developmental outcomes for young children in low-income and middle-income countries. Lancet, 378(9799), 1339–1353.
Engle, P., Black, M., Behrman, J., Cabral de Mello, M., Gertler, P., Kapiriri, L., et al. (2007). Strategies to avoid the loss of developmental potential in more than 200 million children in the developing world. Lancet, 369(9557), 229–242.
Feinstein, L. (2003). Inequality in the early cognitive development of British children in the 1970 cohort. Economica, 70(277), 73–97.
Felfe, C., & Lalive, R. (2014). Does early child care help or hurt children's development? IZA Discussion Paper No. 8484. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2505346.
Fernald, L., Gertler, P., & Neufeld, L. (2008). Role of cash in conditional cash transfer programmes for child health, growth, and development: An analysis of Mexico's Oportunidades. Lancet, 371(9615), 828–837.
Fernald, L. C., Prado, E., Kariger, P., & Raikes, A. (2017). A toolkit for measuring early childhood development in low and middle-income countries. Washington, DC: Strategic Impact Evaluation Fund, the World Bank.
García, J. L., Heckman, J. J., & Ziff, A. L. (2018). Gender differences in the benefits of an influential early childhood program. European Economic Review, 109, 9–22.
Herbst, C. M. (2013). The impact of non-parental child care on child development: Evidence from the summer participation “dip”. Journal of Public Economics, 105, 86–105.
Hoddinott, J., Alderman, H., Behrman, J., Haddad, L., & Horton, S. (2013). The economic rationale for investing in stunting reduction. Maternal and Child Nutrition, 9(S2), 69–82.
Jimenez, M. E., Wade, R., Lin, Y., Morrow, L. M., & Reichman, N. E. (2016). Adverse experiences in early childhood and kindergarten outcomes. Pediatrics, 137(2), e20151839.
Loeb, S., Bridges, M., Bassok, D., Fuller, B., & Rumberger, R. W. (2007). How much is too much? The influence of preschool centers on children's social and cognitive development. Economics of Education Review, 26(1), 52–66.
Magnuson, K., Kelchen, R., Duncan, G., Schindler, H., Shager, H., & Yoshikawa, H. (2016). Do the effects of early childhood education programs differ by gender? A meta-analysis. Early Childhood Research Quarterly, 36, 521–536.
McCormick, M. C., Brooks-Gunn, J., Buka, S. L., Goldman, J., Yu, J., Salganik, M., et al. (2006). Early intervention in low birth weight premature infants: Results at 18 years of age for the Infant Health and Development Program. Pediatrics, 117(3), 771–780.
Muennig, P., Robertson, D., Johnson, G., Campbell, F., Pungello, E., & Neidell, M. (2011). The effect of an early education program on adult health: The Carolina Abecedarian Project Randomized Controlled Trial. American Journal of Public Health, 101(3), 512–516.
Muschkin, C. G., et al. (2018). Gender differences in the impact of North Carolina's early care and education initiatives on student outcomes in elementary school. Educational Policy. https://doi.org/10.1177/0895904818773901.
Noboa-Hidalgo, G. E., & Urzua, S. S. (2012). The effects of participation in public child care centers: Evidence from Chile. Journal of Human Capital, 6(1), 1–34.
Nores, M., Figueras-Daniel, A., López, M. A., & Bernal, R. (2018). Implementing aeioTU: Quality improvement alongside an efficacy study. Learning while growing. Annals of the New York Academy of Sciences, 1419(1), 201–217.
Nores, M., & Barnett, W. S. (2010). Benefits of early childhood interventions across the world: (Under) Investing in the very young. Economics of Education Review, 29(2), 271–282.
Padilla, E., Lugo, D., & Dunn, L. (1986). Test de vocabulario en imágenes Peabody (TVIP). Circle Pines, MN: American Guidance Service (AGS), Inc.
Renk, K., & Phares, V. (2004). Cross-informant ratings of social competence in children and adolescents. Clinical Psychology Review, 24(2), 239–254.
Romano, J., & Wolf, M. (2005). Stepwise multiple testing as formalized data snooping. Econometrica, 73(4), 1237–1282.
Rosero, J., & Oosterbeek, H. (2011). Trade-offs between different early childhood interventions: Evidence from Ecuador. Tinbergen Institute Discussion Paper No. 102/3.
Rubio-Codina, M., Attanasio, O., Meghir, C., Varela, N., & Grantham-McGregor, S. (2015). The socio-economic gradient of child development: Cross-sectional evidence from children 6–42 months in Bogota. Journal of Human Resources, 50(2), 464–483.
Squires, J., Bricker, D., & Twombly, E. (2009a). Technical report on ASQ:SE. Baltimore, MD: Paul H. Brookes Publishing.
Squires, J., Bricker, D., & Twombly, E. (2009b). Ages & stages questionnaires: A parent-completed child monitoring system. Baltimore, MD: Paul H. Brookes Publishing.
Vogel, C. A., Xue, Y., Moiduddin, E. M., Carlson, B. L., & Kisker, E. E. (2010). Early Head Start children in grade 5: Long-term followup of the Early Head Start research and evaluation project study sample. Princeton, NJ: Mathematica Policy Research.
Weisleder, A., & Fernald, A. (2013). Talking to children matters: Early language experience strengthens processing and builds vocabulary. Psychological Science, 24(11), 2143–2152.
World Economic Forum. (2017). Realizing the human potential in the fourth industrial revolution. An agenda for leaders to shape the future of education, gender and work. White paper. http://www3.weforum.org/docs/WEF_EGW_Whitepaper.pdf.
World Health Organization. (2006). WHO child growth standards: Length/height-for-age, weight-for-age, weight-for-length, weight-for-height and body mass index-for-age: Methods and development. WHO Multicentre Growth Reference Study Group. World Health Organization.
World Health Organization. (2007). WHO child growth standards: Head circumference-for-age, arm circumference-for-age, triceps skinfold-for-age and subscapular skinfold-for-age: Methods and development. WHO Multicentre Growth Reference Study
Fort, et al. (2019). The cognitive cost of daycare 0-2 for children in advantaged families. GroupWorld Health Organization.
University of Bologna Manuscript. Yates, T., Ostrosky, M. M., Cheatham, A. F., LaShorage, S., & Santos, R. M. (2008).
Frede, E. (2005). Assessment in a continuous improvement cycle: New Jersey's Abbott Research synthesis on screening and assessing social-emotional competence. Nashville, TN:
preschool program. Invited paper for the National Early Childhood Accountability Center on the Social and Emotional Foundations for Early Learning, Vanderbilt
Task Force with support from The Pew Charitable Trusts, the Foundation for Child University.
Development, and the Joyce Foundation. Available online at:http://nieer.org. Yoshikawa, H., Weiland, C., & Brooks-Gunn, J. (2016). When does preschool matter. The
Frongillo, E. A., Tofail, F., Hamadani, J. D., Warren, A. M., & Mehrin, S. F. (2014). Future of Children, 26(2), 21–36.
Measures and indicators for assessing impact of interventions integrating nutrition, Yoshikawa, H., Leyva, D., Snow, C., Treviño, E., Barata, M., Weiland, Gomez, C. J., et al.
health, and early childhood development. Annals of the New York Academy of Sciences, (2015). Experimental impacts of a teacher professional development program in
1308(1), 68–88. Chile on preschool classroom quality and child outcomes. Developmental Psychology,
García, J., Heckman, J., Leaf, D., & Prados, M. J. (2016). The life-cycle benefits of an 51(3), 309–322.
influential early childhood program. NBER Working Paper 22993.
