
Summary of the Proceedings of the Conference on Impact Evaluation: Methods, Practices, and Lessons
11 July 2012 (ADB Headquarters)

I. Background and Objectives of the Conference

1. Bilateral and multilateral development agencies spend more than $30 billion each year on social development programs, with developing countries spending another $300 billion annually. However, the development effectiveness of this spending is not always clear, and evidence on its impact, where available, is scant. This lack of information limits the ability of policy makers and development partners to improve the effectiveness of development spending through proper design of programs and projects.

2. Impact evaluation plays a key role in bridging this information gap by providing rigorous, scientific evidence on how projects affect people's lives and which project design features contributed to this impact (or lack thereof). Impact evaluation is relevant in all sectors, countries, and regions, and conducting these studies for strategically selected interventions will improve the quality of development policy and project design. Properly implemented impact evaluations will help in selecting a strategic mix of effective interventions, eventually leading to improved development effectiveness for developing member countries (DMCs).

3. The conference on Impact Evaluation: Methods, Practices, and Lessons brought together impact evaluation experts from the region and around the globe to discuss best practices in conducting impact evaluation, how methodologies are applied in sector studies, and the lessons learned from impact evaluation studies. The conference facilitated knowledge sharing and dissemination among impact evaluation researchers, policymakers, and project teams, with the aim of mainstreaming impact evaluation in project design and implementation in ADB DMCs.
The conference provided a venue to present lessons learned from the impact evaluation studies conducted by ADB's regional departments, and to discuss the conduct of ADB's impact evaluation studies on the ground, the results and lessons from those studies, and their implications for policymaking and project design.

4. A total of 57 participants from the Regional Operations Departments, Economics and Research Department (ERD), Independent Evaluation Department (IED), and Central Operations Services Office (COSO) of ADB, and experts from the academe attended the conference.

II. Proceedings of the Conference

A. Opening Session

5. Cyn-Young Park, Assistant Chief Economist of the ERD, opened the event with a brief explanation of the Technical Assistance (TA) 7680 on Impact Evaluation (IE). Changyong Rhee, the Chief Economist of the ERD, gave the Welcome Remarks. Rhee enumerated three objectives of the conference: (i) update and upgrade the Review of Recent Developments in Impact Evaluation with more recent methodological developments; (ii) determine how to apply recent methodologies (e.g., randomization) in evaluating ADB projects (especially large-scale infrastructure projects), and incorporate project-level evaluation findings in the Development Effectiveness Review; and (iii) enhance internal cooperation between the Regional Departments and the ERD. On the second objective, he noted that the Development Effectiveness Review currently focuses mainly on country-level achievements, and that identifying and separating the impact of ADB projects from the impact of other external factors is a key concern.

6. Professor Dean Karlan from Yale University gave the Keynote Speech. He explained that in the framework of maximizing aid effectiveness, the basic formula is to maximize A × B. A refers to choosing ideas that are proven to work, backed by theory that supports the prediction that the idea should work in that particular context. B refers to how the institution implements the aid. He elaborated that evaluation is about A, while monitoring is about B. Karlan further stressed the importance of understanding the difference between the two concepts, and reminded researchers and practitioners not to mix them up.

7. Impact evaluation focuses on two basic goals: to know how to allocate scarce resources, and to guide management in the decision-making process. IE studies allow management to identify which programs work and do not work in a certain context, and which program designs work better. The desired outcome is to direct resources to projects that deliver results.

8. There are two dimensions to consider in assessing the quality of an evaluation: internal and external validity. Internal validity asks whether the results are valid within the study setting. External validity asks whether the results obtained from one area can be extended to other areas.

9. Karlan suggested three questions to consider when doing impact evaluation studies: (i) Is there a market failure, and if so how/what/why? (ii) Does a specific policy solve the market failure? (iii) Is it worth solving, relative to cost?

10. The conference proceeded with presentations of IE studies in three sessions. The first session showcased IE studies on Health and Education, presented by invited experts from the academe. The second and third sessions featured presentations of IE studies by the Departments of the ADB, including the five ongoing studies supported by TA 7680.

B. Session 1: Evaluating Impacts on Health and Education (Chaired by Edgar Cua, Deputy Director General, EARD)

B.1. Summary of Presentations

Does Access to Improved Sanitation Reduce Childhood Diarrhea in Rural India by Santosh Kumar, Lecturer, University of Washington

11. The paper looked at the Total Sanitation Campaign (TSC), a program launched in 1999. The main objective of the program is to improve sanitation coverage in rural areas by providing access to toilets. It is a demand-driven program, spread across 590 districts in 30 states. The program has been effective in increasing rural sanitation coverage in India, which rose from only 18% in 1999 to 60% in 2008.

12. The study evaluated the impact of access to improved sanitation on child morbidity (measured by diarrhea), and quantified child-health gains from access to improved sanitation using the Propensity Score Matching (PSM) method. Kumar explained the PSM process, and the assumptions and limitations of PSM. He elaborated on the PSM steps, as follows:
(i) Estimate the propensity score based on covariates that are deemed important in the study.
(ii) Check the common support and the balance of covariates between the treated and control groups. (The idea is that covariates, which may differ between the treatment and control groups before matching, should be similar after matching.)
(iii) Take the difference in mean outcomes between treatment and control in the matched sample.
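The three PSM steps can be illustrated with a minimal sketch. It is not the study's implementation: the data are simulated, the single covariate, the true effect of -1.0, and the function names (`fit_logit`, `propensity`, `psm_att`, `simulate`) are all assumptions made for illustration, and the covariate-balance check of step (ii) is omitted for brevity. The propensity score is fitted with a simple hand-written logistic regression so the example needs only the Python standard library.

```python
import math
import random


def propensity(beta, xi):
    """Predicted probability of treatment for one covariate vector."""
    z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(X, t, lr=0.5, epochs=300):
    """Logistic regression of treatment on covariates by gradient descent.

    Returns [intercept, coef_1, ..., coef_k].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, ti in zip(X, t):
            err = propensity(beta, xi) - ti
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta


def psm_att(X, t, y):
    """Effect on the treated via 1-nearest-neighbor matching (with replacement)."""
    # Step (i): estimate the propensity score.
    beta = fit_logit(X, t)
    ps = [propensity(beta, xi) for xi in X]
    treated = [i for i, ti in enumerate(t) if ti == 1]
    control = [i for i, ti in enumerate(t) if ti == 0]
    # Step (ii): keep treated units inside the region of common support.
    lo = max(min(ps[i] for i in treated), min(ps[i] for i in control))
    hi = min(max(ps[i] for i in treated), max(ps[i] for i in control))
    on_support = [i for i in treated if lo <= ps[i] <= hi]
    # Step (iii): average the outcome gap between each treated unit and
    # its nearest control on the propensity score.
    diffs = []
    for i in on_support:
        j = min(control, key=lambda c: abs(ps[c] - ps[i]))
        diffs.append(y[i] - y[j])
    return sum(diffs) / len(diffs)


def simulate(n=400, seed=0):
    """Hypothetical data: one covariate drives both treatment and outcome."""
    rng = random.Random(seed)
    X, t, y = [], [], []
    for _ in range(n):
        x = rng.random()
        ti = 1 if rng.random() < 0.2 + 0.6 * x else 0
        X.append([x])
        t.append(ti)
        # Assumed true treatment effect: -1.0
        y.append(2.0 * x - 1.0 * ti + rng.gauss(0.0, 0.1))
    return X, t, y


X, t, y = simulate()
n_treated = sum(t)
naive = (sum(yi for yi, ti in zip(y, t) if ti) / n_treated
         - sum(yi for yi, ti in zip(y, t) if not ti) / (len(t) - n_treated))
print(f"naive difference: {naive:.2f}, matched estimate: {psm_att(X, t, y):.2f}")
```

In this simulated setting the naive difference in means is biased because treated units have systematically higher covariate values, while the matched estimate compares units with similar propensity scores. Matching corrects only for the observed covariate, which is the limitation the presenter raises next.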

13. Kumar further explained the assumptions and limitations of PSM. The first assumption is the Conditional Independence Assumption (CIA), which states that once observed covariates are controlled for, the potential outcomes are independent of treatment status. The second assumption is that there should be enough common support, or overlap, between the control and treated groups to allow matching. As for limitations, PSM does not address biases that result from unobserved variables or covariates.

14. The study used data from the reproductive and child health survey and district-level household survey conducted during 2007-2008. The survey covered 1,000 to 1,500 households in each of 611 districts. In particular, the survey asked whether children born since 2004 had had any diarrhea episode in the last two weeks.

15. Access to improved sanitation, as the treatment variable, was regressed on the following covariates: piped water, maternal age, maternal education, paternal age, paternal education, type of house (pucca/kutcha), fraction of young boys, average age of young children, number of males and females in the household, land ownership, below poverty line (BPL) status, religion, caste, electrification status, health and sanitation committee in the village, distance to district HQ, ICDS worker in the village, and whether the panchayat head lives in the village. The results showed significant differences between the treatment and control groups, hence the need for matching.

16. The least squares estimate showed that with improved sanitation, children were 0.8% less likely to contract diarrhea. This was significant at the 90% confidence level. Using the PSM estimate, the study found that with improved sanitation, children were 2.2% less likely to suffer from diarrhea. This was significant at the 99% confidence level. The paper used nearest neighbor matching (NN1), which means that each treated unit was matched with the single nearest unit in the control group.
Finally, using weighted least squares regression, the results showed that children are 1% less likely to get diarrhea.

17. Testing the heterogeneous impacts of improved sanitation showed that the program was more effective for boys than for girls. On socio-economic status (SES), those in high SES seemed to benefit more from improved sanitation than those in the mid and low SES. The same results came out in a study by Ravallion. As for whether households treat water before drinking, the estimates suggested that households that treat water benefit more from improved sanitation than those that do not.

18. To check for robustness, the study used three alternate estimations of the propensity score (p-score): the first using probit, the second using logit with additional covariates, and the third using probit with additional covariates. Robustness checks indicated that the 2.2% result in the base model is consistent across the three alternate estimations of the p-score. Finally, the study employed Rosenbaum's bounding approach to test the sensitivity of estimated treatment effects to unobserved characteristics. Results showed that the impact estimates are robust.

Are premium subsidies effective in expanding enrolment of the informal sector in social health insurance: Preliminary results from the 3P Study by Aleli Kraft, Associate Professor, University of the Philippines

19. The study evaluated the impact of reducing financial and informational barriers to voluntary enrollment of informal-sector workers in the social health insurance system, PhilHealth. It assessed the impact on enrollment of the PhilHealth Pre-Paid Premium (3P), a program that offered premium subsidies (a 3P certificate worth PhP 600) and an information package (an information kit and SMS reminders) to randomly selected households that were eligible for the Individually-Paying Program (IPP).¹

20. The 3P covered 243 randomly chosen municipalities nationwide (except in two provinces), and ran from February 2011 to March 2012. Using multi-stage cluster sampling, the baseline survey covered 2,950 households: 2,220 were treatment units and 730 were control units. An analysis of the baseline data confirmed the comparability of the control and treatment groups. There was no significant difference between the two groups in terms of willingness to pay for health insurance, household income, and experience of health shock in the last three years.

21. During the first round (February-December 2011), the team distributed the 600-peso subsidy and the information kit. This, however, drew a low response from target IPP enrollees. Through SMS replies from treatment households, the team learned of various transactional barriers preventing households from enrolling. During the second round (January-March 2012), the team offered pre-accomplished PhilHealth membership forms, and assigned enumerators to collect them and submit them to the PhilHealth office. This additional package was offered to randomly selected subsidy recipients who had not enrolled as of December 2011. With the additional intervention, overall take-up rose from 7% to 23%.

22. Introduction of the vouchers increased the likelihood of enrolling in the IPP by 29%. The study also found that those who experienced a health shock in the last three years were more likely to enrol, pointing to possible adverse selection. This was substantiated by the positive effect of having a formal health facility nearby. This may also indicate that those who face lower transaction costs are more likely to enrol.

23. Those who did not enrol within the first year and were then provided the additional intervention were 26% more likely to enrol.
All the other variables were not significant, except for those representing residence in Visayas and Mindanao. These may be capturing other barriers to enrolment. The results indicated that additional interventions, aside from reducing financial barriers, may be necessary to improve the risk pool.

¹ The Individually-Paying Program is PhilHealth's program that covers workers in the informal sector.

Presentation on School meal programs: an impact assessment by Farzana Afridi, Assistant Professor, Indian Statistical Institute

24. Dr. Afridi's presentation covered three IE studies of school meal programs under India's nationwide transition to provision of cooked meals (from provision of ready-to-eat food or food grains) to children in public primary schools. The program covered 120 million children, and was completed in 2003.

25. The first study looked at the impact of providing cooked school meals on the short-term nutrition of treated children, vis-à-vis provision of food grains to households. The study used primary data from 41 villages in central rural areas of India, collected through interviews of children on a randomly chosen date. Because interview dates were random, children in some villages recalled the previous day's food intake for a school day, while in other villages the previous day was a non-school day. A subsample of children was revisited to obtain their consumption data on both school and non-school days.

26. Using Instrumental Variable Analysis, results revealed that 49% to 100% of a child's nutrient intake for the entire day came from the school meal, indicating that a substantial proportion of the transfers benefitted the recipient. A disaggregation of the total daily consumption data into intake of nutrients during school and non-school hours strengthened this conclusion. Robustness tests, which included community fixed effects and individual fixed effects models, revealed that results from the cross-sectional analysis were unaffected by unobserved characteristics.

27. The second study analyzed the effect of cooked school meals on attendance rates of students by grade and gender. The study took advantage of the staggered transition of schools from food grain distribution to cooked meals. Schools that transitioned in July 2003 composed the treatment group, while the remaining schools that transitioned between July and December 2003 composed the control group. At the baseline, the trends in school participation of the two groups were comparable (a requirement of the difference-in-differences model).

28. Using the difference-in-differences strategy, results showed that the attendance rate for grade 1 girls was 12 percentage points higher due to the program transition (significant at the 5% level). There was no significant impact of the program on the attendance rates of grade 1 boys, or of boys and girls in the higher grade levels.

29. Building on the two earlier studies, the third study looked at how two changes in the design (the transition from provision of ready-to-eat snacks to cooked meals, and the change to a better-quality provider) affected the attendance rates of students in public primary schools in Delhi. This study took advantage of the staggered transition to cooked meals. Almost 50% of the sampled schools implemented the cooked meals program before September 2003 (the treatment group) and the other half after September 2003 (the control group). The transition to a better-quality provider took place between 2004 and 2005.

30. Estimates showed that the introduction of cooked meals resulted in a 2 percentage point increase in the attendance rate (measured by daily class attendance).
Changing to a good-quality meal provider resulted in an increase of at least 2.6 percentage points in the attendance rate. When looking at the effect of cooked meals and changing providers vis-à-vis ready-to-eat meals, results showed that school attendance increased by at least 2.7 percentage points due to cooked meals, and by an additional 2.2 percentage points due to the change in meal provider.

B.2. Summary of Comments by the Discussants

On Dr. Kumar's Presentation

31. Richard Bolt, Advisor of SERD, and Hyun Hwa Son, Senior Evaluation Specialist of IED, were the discussants for the first session. Bolt focused on the implications of Dr. Kumar's findings from a program point of view. He emphasized the importance of looking at the effectiveness of other programs to identify where one can get the biggest bang for the buck. This followed from the finding that improved sanitation does not have a significant effect on diarrhea incidence unless prior measures such as boiling water and handwashing are undertaken. Noting the treatment's weak effect on low-income families and on girls, Bolt called for ensuring that the service provider reaches the hard-to-reach groups.

32. Son stressed that sanitation is a multi-dimensional subject. It is shaped by many factors such as clean water, clean toilets, handwashing behavior, food preparation, and the drainage system in the house. In the paper, sanitation is defined in a very limited way, that is, by whether households have access to a flush toilet. The treatment group is defined as households that have flush toilets, while the control group is defined as those with no flush toilet. This raises a question about the choice of treatment variable. Diarrhea among children is mainly caused by contaminated water and food. Son suggested that access to piped water may be a better indicator to define a treatment group. Furthermore, poor households in slum areas, which have poor drainage systems, generally suffer from low levels of sanitation. In this case, the poverty status of households may be a better indicator. The weak impact of the treatment on childhood diarrhea may be due to the weak link between having a flush toilet and diarrhea.

33. Son further commented on the data set. While it was comprehensive and large, it comprised only a single cross-section. Therefore, it does not show how diarrhea prevalence changed over time. Diarrhea incidence may have decreased during the evaluation period due to other efforts to improve child health and achieve the MDGs in India. The timing of the survey should also have been carefully considered, given the seasonality of diarrhea.

On Dr. Kraft's Presentation

34. Noting the insignificant difference in willingness to pay between the control and treated groups, Bolt urged the researchers to investigate the following questions: What is the general perception of health services in the Philippines? Why are people not enrolling? Do they see health services as not valuable? He added that it is important to understand why eligible IPP members are not signing up. Is it because they feel that they can access private services, and that they have the means to pay for their own welfare?

On Dr. Afridi's Presentation

35. According to Son, it seemed rather obvious that children's nutrition will improve with the provision of school meals. It is more appropriate to look at the net impact. It is possible that when children receive food at school, parents will not provide them with the food that they normally eat.

36. In terms of nutrition impact, it was not clear how the treatment and control groups were defined. On school participation, the study did not discuss the impact of the school lunch program on school participation.
It measured the impact on school attendance of how the school meals were delivered to children, hence the meager observed effect.

B.3. Comments/Clarifications/Questions from Participants and Response

On Dr. Kumar's Presentation

37. The researcher may consider other observables that have an impact on diarrhea, such as the number of children in the household, immunization, and breastfeeding. It is also important to look at the interaction between water and sanitation, noting that providing a toilet without water will not be meaningful.

38. Guntur Sugiyarto, Senior Economist of the ERD, stressed the importance of asking the right evaluation question when doing IE studies.

C. Session 2: Presentations by Regional Departments (Chaired by Ayumi Konishi, Deputy Director General, PARD)

C.1. Summary of Presentations

Presentation on Impact Evaluation of Labor-Based Road Work in Pacific by Aaron Batten, Christopher Edmonds, Daisuke Misuzawa, and Craig Sugden, Pacific Regional Department

39. The ADB supports the labor-based (LB) method for road maintenance and rehabilitation of unpaved rural roads in Papua New Guinea (PNG), Solomon Islands, and Timor-Leste. This method is intended to promote poverty alleviation through income generation and job creation among rural communities.

40. The study investigates five evaluation questions: (i) What are the comparative changes in living standards of households in communities located near labor-based and traditional capital-intensive road rehabilitation and maintenance projects? (ii) Are communities near LB projects more likely to express positive attitudes toward the road projects? (iii) Do workers employed in LB road projects show evidence of improved post-project employment experiences? (iv) Do communities participating in LB projects demonstrate greater commitment to road maintenance once the project is completed? (v) What types of social and environmental externalities are associated with LB and traditional road maintenance and rehabilitation projects, and what is their practical importance in the communities and households neighboring road project areas?

41. In PNG and Solomon Islands, the team identified and surveyed households and businesses proximate to areas (i) where LB road maintenance and rehabilitation are being undertaken, (ii) where there is road maintenance using capital-intensive methods, and (iii) where there is no road maintenance. The team also interviewed transport service providers and engaged community groups in focus group discussions. Direct observation and measurement along roads in the study area by road engineers complemented the surveys. The use of GIS technology in Solomon Islands to identify survey households is an innovation worth noting. In Timor-Leste, due to a budget shortage and the government's request to stop the surveys, the team used existing data sources, such as census data and household income and expenditure survey data, for the baseline.

42. There are two treatment sites in PNG: Eagel Kero and Mulitaka.
There are two control sites as well. One is an area that has undergone capital-intensive rehabilitation under government funding, and the other is an area that has a rural access road that has not received maintenance for a couple of years and is not expected to receive rehabilitation funding in the coming years.

43. In Solomon Islands, the team chose two adjacent roads. One is being upgraded using the LB method, supported by the Second Road Improvement Project of the ADB, while the other is to be rehabilitated entirely with government funding through a capital-intensive approach.

44. In Timor-Leste, it was difficult to identify a control group ex ante, as there were many rural infrastructure programs underway and their future locations are not yet known. The 2010 population and housing census provides the baseline data. The intention is to update the data in the next census or, if necessary, to undertake targeted surveys to update the needed data. In trying to understand changes in living standards, the team measured asset holdings rather than the usual measures of income or consumption. The team ranked each village, and will try to see changes in ranking over time between villages that have LB road works and those that do not.

45. The study team enumerated the following issues they encountered: logistical difficulties in conducting surveys, limited time to observe the impact of the projects, comparability of project sites across the three countries, and ethical concerns (managing expectations of non-treatment villages).

Comments by the Discussant

46. The discussant, Santosh Kumar, commented on the methodology and timing of the study. Kumar suggested that for this study, the best strategy is to combine Propensity Score Matching, or some other form of matching, with the difference-in-differences methodology. He expressed concern about the short period between the baseline and the end-line surveys (only 6 to 8 months), noting that changes in the socio-economic status of households manifest only after a longer period following project completion.
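The difference-in-differences comparison that the discussant recommends, and that also underlies the school-attendance estimates in Session 1, can be sketched in its simplest two-group, two-period form. The numbers below are hypothetical attendance rates invented for illustration, not data from any study summarized here, and `did_estimate` is an illustrative name.

```python
from statistics import mean


def did_estimate(records):
    """Two-group, two-period difference-in-differences.

    records: iterable of (treated, post, outcome), with treated and post
    coded as 0 or 1.
    """
    cells = {(g, p): [] for g in (0, 1) for p in (0, 1)}
    for treated, post, outcome in records:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    change_treated = m[(1, 1)] - m[(1, 0)]   # before-after change, treatment
    change_control = m[(0, 1)] - m[(0, 0)]   # before-after change, control
    return change_treated - change_control


# Hypothetical attendance rates (%), for illustration only.
records = [
    (1, 0, 58.0), (1, 0, 62.0),   # treatment group, before
    (1, 1, 70.0), (1, 1, 74.0),   # treatment group, after
    (0, 0, 57.0), (0, 0, 59.0),   # control group, before
    (0, 1, 63.0), (0, 1, 65.0),   # control group, after
]
print(did_estimate(records))
```

Here the treatment group improves by 12 points and the control group by 6, so the estimate is 6 percentage points. Subtracting the control group's change removes any trend common to both groups, which is why comparable pre-program trends are required for the estimate to be credible.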

Presentation on Impact Evaluation of Raising Incomes of Small and Medium Farmers in Nepal by Natalie Chun

47. This study evaluates the impact of a 20.1-million ADB project on raising incomes of small and medium farmers in Mid-West and Far-West Nepal. The project aims to establish or strengthen market linkages between farmers and buyers for high-value crops (HVC) by providing grants to farmer cooperatives and/or groups. Prospective grantees submit a business plan proposal related to HVC for review and approval by a committee. The project is still at an early implementation stage. The implementation period is from 2012 to 2017.

48. The presenter pointed out a number of challenges that the team has encountered. First, the implementation plan and selection criteria are not yet clear. These are crucial in identifying the control group. Second, there will be wide variation in the amount of grants. This poses a problem in measuring the impact, especially if the grants range from 1,500 dollars to 250,000 dollars. Budget is another constraint. The IE study is allotted only USD 150,000.

49. Another challenge is deciding what exactly to measure. The team is looking at measuring (i) Intention to Treat Effects (comparing areas that have the opportunity to apply for grants with areas that do not), (ii) Average Treatment Effects (comparing people in groups who actually receive the grant with similar groups who did not have the opportunity to receive the grant), and (iii) Local Average Treatment Effects (comparing grant applicants just above the approval threshold with grant applicants just below it) using Regression Discontinuity Design. It is important to have a sufficient number of grant applicants to measure these effects, and to finalize the selection criteria in order to decide on the cut-off.

50. The proposed approach is to survey four districts extensively: two from the Far-West and two from the Mid-West.
In each of these regions, one district will be in the low-lying area and the other in the hill region. This is to capture variation in crop production and in access to markets. The proposed sample size is 2,400 farmers: 20 farmers per cooperative group and 30 groups per district, across the four districts.

51. The proposal is to randomly select Village Development Committees (VDCs) and obtain a census of farmers within these VDCs, while randomly selecting farmer groups from the census. There will be three types of questionnaires: for households, for cooperatives or farmer groups, and for villages. The questionnaires will try to measure the differences between treatment and control groups in terms of income and vulnerability to poverty.

52. In terms of timing, the team will conduct a baseline survey before implementation starts, and a midterm evaluation. If funding is available, the team will do an end-line survey.

53. Under the best-case scenario, the team will have a phased-in approach to districts and sub-districts. All applications will be submitted at once, and the team will then randomly select who gets treated first and who gets treated second. In this manner, the team will be able to measure all the effects. Under the worst-case scenario, if the government counterparts do not agree to implement the IE with a phased-in approach, the team will use a very detailed questionnaire covering many different measures, and identify instruments to measure the local average treatment effect.
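The regression discontinuity comparison described in paragraph 49 (applicants just above versus just below the approval threshold) can be sketched as follows. Since the project's selection criteria were not yet finalized, the proposal scores, the cutoff of 60, the bandwidth, and the outcome values are all hypothetical, and the function names are illustrative. The sketch assumes a sharp design, in which every applicant at or above the cutoff receives a grant, and fits a separate straight line on each side of the cutoff.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rdd_estimate(scores, outcomes, cutoff, bandwidth):
    """Jump in the outcome at the cutoff, using local linear fits.

    Scores are centered on the cutoff so that each fitted intercept is the
    predicted outcome exactly at the threshold.
    """
    below = [(s - cutoff, y) for s, y in zip(scores, outcomes)
             if cutoff - bandwidth <= s < cutoff]
    above = [(s - cutoff, y) for s, y in zip(scores, outcomes)
             if cutoff <= s <= cutoff + bandwidth]
    intercept_below, _ = linear_fit(*zip(*below))
    intercept_above, _ = linear_fit(*zip(*above))
    return intercept_above - intercept_below


# Hypothetical proposal scores (0-100) and an outcome that rises smoothly
# with the score and jumps by 5 units for applicants at or above the cutoff.
scores = list(range(0, 101))
outcomes = [10 + 0.5 * s + (5 if s >= 60 else 0) for s in scores]
print(round(rdd_estimate(scores, outcomes, cutoff=60, bandwidth=20), 6))
```

The estimate is local in the sense that it describes applicants near the threshold only, which is why a sufficient number of applicants close to the cut-off matters for this design.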

Comments by the Discussant

54. Farzana Afridi was the discussant for the presentation. She shared that in a scenario where randomization is not possible, phasing-in is something that most governments can accept. She pointed out the possibility of randomizing the amount of loans given, at the regional level. She also suggested conducting information campaigns. The team might be interested in looking at the take-up of this program and seeing whether an information campaign plays a role in improving it. This could potentially be randomized.

Presentation on Evaluation of Public Resources Management Program in Assam through Synthetic Control Method by Hiranya Mukhopadhyay, IED

55. The study evaluated the impact of the Public Resource Management Program in Assam (PRMPA) using the Synthetic Control Method (SCM). The PRMPA provided budgetary support to help the state improve tax administration, public sector enterprise restructuring, and data management. Two loans were granted to Assam, one in 2004 and the other in 2008.

56. The basic approach of the IE study was to clone Assam. The methodology entailed creating a synthetic Assam before the intervention, and later comparing the real and synthetic Assam in terms of a particular outcome variable. Suppose there were three states: Assam, which is the program state, and Maharashtra and Tamil Nadu. The SCM would require cloning Assam using Maharashtra and Tamil Nadu. Alberto Abadie and Javier Gardeazabal first used the SCM in the study "The Economic Costs of Conflict: A Case Study of the Basque Country," published in the American Economic Review.

57. The key evaluation question of the study is: What would have happened in the absence of the PRMPA? The study looked at the effect of the PRMPA on the ratios of own revenue to GDP and interest payment to GDP. With respect to the own revenue to GDP ratio, the SCM estimates showed significant divergence between the real and synthetic Assam from 2005 onwards.
Real Assam had a better own revenue to GDP ratio after the introduction of the project.

58. Looking at the interest payment to GDP ratio, estimates revealed no significant impact of the project. After the program was introduced, the interest payment to GDP ratio did not go down in real Assam vis-à-vis synthetic Assam.

59. The approach has a number of issues and limitations, one of which is the presence of similar ongoing projects in other states. This may have led to an underestimate of the effect of the PRMPA in Assam. The paper discusses in detail a number of other important caveats.

Comments by the Discussant

60. The discussant (Farzana Afridi) commented on the methodology used in the study. She pointed out that the approach is somewhat similar to PSM at the micro level. Like PSM, the SCM cannot match units on unobservable characteristics. Unobservables include legal institutions, bureaucracy, and the responses of the state to the project. Thus, it was difficult to establish that whatever difference the study observed between the synthetic and the actual treatment group can be attributed to the program.
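The "cloning" idea in paragraph 56 can be made concrete with a deliberately simplified sketch for the three-state illustration: one program state and two donor states. The synthetic unit is a weighted average of the two donors, with the weight chosen to track the program state as closely as possible before the intervention; the post-intervention gap between the real and synthetic series is then read as the estimated effect. This is not the study's implementation: a full SCM application uses many donors and predictors and a constrained optimizer, whereas two donors admit a closed-form weight. All figures and names below are hypothetical.

```python
def two_donor_weight(target, donor_a, donor_b):
    """Weight w on donor_a (and 1 - w on donor_b), restricted to [0, 1],
    that minimizes the squared pre-intervention gap between the target and
    the weighted donor average."""
    d = [a - b for a, b in zip(donor_a, donor_b)]
    r = [t - b for t, b in zip(target, donor_b)]
    # Unconstrained least-squares solution of r = w * d, then clipped; the
    # objective is a convex quadratic in w, so clipping gives the
    # constrained minimum.
    w = sum(x * y for x, y in zip(d, r)) / sum(x * x for x in d)
    return min(1.0, max(0.0, w))


def synthetic_series(w, donor_a, donor_b):
    """Outcome path of the synthetic unit for a given weight."""
    return [w * a + (1.0 - w) * b for a, b in zip(donor_a, donor_b)]


# Hypothetical revenue-to-GDP ratios (%), for illustration only.
program_pre, program_post = [5.4, 5.5, 5.8], [6.8, 7.0]
donor_a_pre, donor_a_post = [4.0, 4.2, 4.4], [4.5, 4.6]
donor_b_pre, donor_b_post = [6.0, 6.1, 6.3], [6.4, 6.5]

w = two_donor_weight(program_pre, donor_a_pre, donor_b_pre)
synthetic_post = synthetic_series(w, donor_a_post, donor_b_post)
gaps = [real - synth for real, synth in zip(program_post, synthetic_post)]
print(f"weight on donor A: {w:.2f}; post-intervention gaps: "
      + ", ".join(f"{g:.2f}" for g in gaps))
```

Because the weights are fitted only on observed pre-intervention outcomes, the sketch shares the limitation the discussant raises: it cannot match the program state to the donors on unobservable characteristics.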

61. The other concern was spillovers brought about by macroeconomic changes in the economy. Although trends were similar up to a point, there was no guarantee that the trends would remain the same between the treatment and synthetic groups.

62. It was important to understand and identify in the paper the mechanisms that initiated the observed changes. The program was doing many things, and it was important to know which component led to these changes. It could be helpful for ADB to know which component to prioritize.

Presentation on Impact Evaluation of Tbilisi, Georgia Metro Extension Project by Tomomi Tamaki and Nina Fenton, CWRD

63. The study looks at the impact of the 1.2-kilometer Tbilisi Metro Extension Project on the local population and local activities. The completion of the metro extension is scheduled for December 2014.

64. The project's expected impacts are an improved urban environment, local economic development, better health, and poverty reduction through improved access to employment, lower transport cost and travel time, and reduced pollution. The project design predicts that the poor and socially excluded will benefit from this project by having a new low-cost option to travel to the city center. Based on this, the team developed a causal chain, which led to an evaluation plan.

65. The original plan was to adopt a difference-in-differences methodology covering households, students, and businesses. The team was to visit several areas in Tbilisi to identify candidate control groups. The plan changed to using only available data on socio-economic characteristics, from which the control groups would be identified. Once they were identified, a large-scale household survey would be conducted to collect the information required for the difference-in-differences approach. However, inspection of comparison areas in Tbilisi revealed that there were no areas identical to the project area in terms of geography, socio-economic development, and infrastructure and business development.
Moreover, needed data (such as income, employment, etc. at the city or district level, disaggregated business data) were not available. 66. Considering the limitations, the team resorted to conducting four surveys: students, households, enterprises and air quality. The first survey will include university students, to examine impacts on timeliness, travel cost, changes in consumption patterns, and attendance rates. The study will apply difference-in-difference methodology using pipeline approach. The evaluation team will survey a sample of three cohorts of students enrolling in 2013, 2014 and 2015 at the national university. The first two cohorts will serve as the comparison group, and the last cohort as the treatment group. The control group will be surveyed in September 2013 and 2014, and the treatment group will be surveyed in 2015. In addition, a sample of students in other cities will be surveyed as a second control group. If possible, two small surveys will be conducted to assess the quantitative impact of the project. 67. The household survey intends to analyze employment generation, income generation, time and travel cost savings, access to services, rental and real estate prices. Using 300 households, the survey will be conducted among households located within one-kilometer radius from the new metro extension. This will most likely consist of in-depth interviews and focus group discussions. The approach is to conduct beforeafter analysis. 68. The business survey intends to see the changes in business registration and revenues. The survey will include 300 samples of small service trade retail businesses located within 1 km radius. Baseline household surveys are now on-going, conducted by local consultants in Tbilisi. The team will use before and after

analysis for this component. Air quality survey is a technical survey of air pollution by the new station in Tblisi. 69. The team shared their learning in this IE project. It was important to lay out theory and establish clear causal chain. Customizing the IE study to seek the second best option was especially important to establish a clear causal chain. Brainstorming opportunities were important. It was to be expected that lack of data, project delays and difficulty in identifying control groups, especially in rapidly developing countries would be encountered. It was helpful to be able toget guidance from ERD colleagues and other experts. Lastly, it was helpful to get the commitment of counterpart DMC officials in the evaluation project. Comments by the Discussant 70. Muhammad Ehsan Khan, Principal Economist of the ERD, commented on the presentation. He explained that the challenges that the team faced had implications on the quality of the evaluation. Moving from a large to a small sample raises a concern on how representative the 300 households were. The evaluation team explained that a team of local experts built up a sampling procedure and determined that the 300 is representative. 71. Another concern was the timing of the baseline and endline surveys. Changes might not be observable in the short period of time between the conduct of the baseline and endline surveys. The evaluation team agreed that this could be true for households and businesses. But for students, changes were expected to transpire more quickly. When the metro opens, students will immediately use it. 72. Attribution of effects to the project was important to consider. For example, when interpreting increases in rents of property prices, it was imperative to determine how much of this benefit was brought about by an on-going trend and how much was triggered by the project. C.3. Comments/Clarifications/Questions from Participants and Response On CWRDs Presentation 73. Dr. 
Kumar, one of the invited experts, is positive that if the baseline survey has not been done yet, matching approach can still be adopted. He further explained that in the treated communities, it is possible to sample 500 households on the Tbilisi metro line, sample 700 from somewhere else, then match some of the observables. The team may lose some observations but they may find 300 treated that match with the control group. Then the team may do endline survey of the 300 in the treated and the 300 in the control group. 74. The Chairman of the session, Ayumi Konishi, commented that the introduction of Multi-Tranche Financing widened the scope of projects and made evaluation more difficult. ADB does not design a project for evaluation, and certainly, the institution cannot go back to the old days of smaller and easier projects. At the same time, the ADB has the responsibility to report the effectiveness of projects to donors. Furthermore, Konishi asked if the ADB is spending enough money, time, resources for designing the evaluation framework. He added that IE should be integrated in the project cycle. On ERDs Presentation

75. The team has considered the information campaign component and has decided to run a newspaper information campaign. The team might eventually make randomized phone calls to cooperatives to invite them to participate.
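The matching approach suggested in the comments on the CWRD presentation (paragraph 73) can be sketched as follows. This is only an illustration under stated assumptions: the household identifiers, the two covariates, the caliper value, and the function name are hypothetical, and greedy one-to-one nearest-neighbour matching is just one of several possible matching rules. In practice the observables would be standardized, or summarized in a propensity score, before distances are computed.

```python
import math

def match_nearest(treated, pool, caliper):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, pool: dicts mapping a household id to a tuple of observables.
    caliper: largest Euclidean distance accepted for a match.
    Returns (treated_id, comparison_id) pairs; treated households with no
    comparison household within the caliper are dropped.
    """
    available = dict(pool)  # copy, so matched comparison units can be removed
    pairs = []
    for t_id, t_x in treated.items():
        if not available:
            break
        # Closest comparison household that has not been matched yet.
        c_id = min(available, key=lambda cid: math.dist(t_x, available[cid]))
        if math.dist(t_x, available[c_id]) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs

# Hypothetical observables: (household size, income index).
treated = {"T1": (4, 1.0), "T2": (6, 2.0), "T3": (2, 9.0)}
pool = {"C1": (4, 1.2), "C2": (6, 1.9), "C3": (5, 5.0), "C4": (3, 1.1)}

pairs = match_nearest(treated, pool, caliper=1.0)
print(pairs)  # T3 has no close comparison household and is dropped
```

Dropping unmatched households mirrors the remark that the team "may lose some observations": only the matched treated and comparison households would be followed up in the endline survey.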

D. Session 3: Presentations by Regional Departments (Chaired by Juan Miranda, Director General, SARD)

D.1. Summary of Presentations

Presentation on Impact Evaluation of Food Stamp and Medicard Programs in Mongolia by Wendy Walker and Claude Bodart, EARD

76. This study looks at the nationally implemented Food Stamp and Medicard programs in Mongolia. The Food Stamp program is a response to the food and oil crisis of 2008-2009. It provides food stamps to 5% of the population. The food stamps entitle beneficiaries to purchase ten basic food items (including flour, milk, and eggs). The Medicard program, on the other hand, is a response to the financial crisis. It provides access to health services and pharmaceuticals, and support for out-of-pocket expenses, to 5% of the population.

77. The team identified target beneficiaries using the Proxy Means Test (PMT). When fully rolled out, the programs intend to reach about 18,000 households or over 100,000 individuals. The Food Stamp transfer is about USD 33 per household per month, and program monitoring results indicate that it has made a significant impact at the household level.

78. The purpose of the evaluation is to provide evidence to the government on the impact of the programs. The results will help the government decide on whether to expand the coverage of the programs.

79. The study uses Regression Discontinuity Design (RDD). Control and treatment groups were chosen based on the proxy means test score. The treatment group was drawn from the eligible households in the bottom 5% of PMT scores. The comparison group consists of households just above the 5% line, who are not eligible for the Food Stamp or Medicard programs. The baseline survey was conducted in December 2011, and implementation of the endline survey will start in October 2012.

80. The project has some special features. In the design of the questionnaire, the team is looking at food security indices using the specially designed FANTA modules to measure food security. This might help the team make cross-country comparisons of food security. In addition, the team was able to include a Food Stamps module in the Household Socio-Economic Survey (a nationally representative survey, conducted independently by the National Statistical Office). This will allow the team to validate the targeting results generated from the PMT. Lastly, there is a bias toward urban areas because much of the PMT survey was conducted in urban areas.

81. Synchronizing the implementation of the program and the evaluation study was a challenge for the team. The implementation of Food Stamps and Medicard in the sample areas was delayed largely because of procurement issues that happened earlier in the PMT surveys. The team has adjusted the distribution period from one full year to just eight months.

82. One major challenge was the government's initiative to raise the eligibility threshold in December 2011, after the team had done the survey. This brought many of the households in the control group into the treatment group. In addition, the government announced new benefits for the same target population. In June 2012, the government announced that part of a large cash distribution of mining revenue would be targeted to the same 5% of households in Ulaanbaatar and Selenge.

Comments by the Discussant

83. Aleli Kraft's comments on the presentation focused on how the study intends to measure the effect of the two programs. It was not clear whether the team will do a separate evaluation for each of the two programs or a combined evaluation. Kraft suggested looking at the links between the two programs. She noted that healthier people do not need to be hospitalized and would, therefore, have less utilization of insurance.

84. Kraft emphasized the importance of clarifying the outcomes that the study wants to measure. For the Food Stamp program, is it nutrition, food security or other outcomes? For the Medicard program, is it health outcomes or out-of-pocket payments? In addition, Kraft commented that the assumption on the similarity of living standards between the control and treatment groups should be empirically tested.

Presentation on Lessons on Impact Evaluation on Select Infrastructure Project by Ganesh Rauniyar, IED

85. The presentation is a personal reflection on experiences in the IED. So far, the IED has completed four IE projects, and a fifth is ongoing (on a groundwater irrigation project in Nepal). IED's impact evaluation focuses on human welfare, and on one or more of the following dimensions: economic, social, institutional, and environmental.

86. Rauniyar enumerated key challenges in IED's evaluation studies. The IED is involved in ex-post evaluation, and relies mostly on project documents. In many cases, project documents are not clear in terms of the outcomes and impacts expected from a project. Furthermore, identification of counterfactuals is often difficult. Often, baseline data do not exist or are not of usable quality for impact evaluation. Timing also poses a constraint, as does the complexity of ADB assistance. It is worth noting that the IED does not have any control over project design, as the IED comes in 2 to 5 years after the completion of the project.

87. In terms of approach, the IED partners with field staff (resident missions, executing and implementing agencies and nongovernment organizations). In a number of cases, NGOs play an important role in the conduct of IE studies.

88. The IED uses a mixed-method approach that involves quantitative and qualitative tools, depending on the nature of the project to be evaluated. The IED also uses secondary information/data, when available, for establishing counterfactuals. Typically, surveys cover approximately 2,000-2,500 households, with some variation depending on time and resources. Reports are peer reviewed, both internally and externally.

89. Rauniyar shared evaluation lessons based on IED experiences, as follows: local knowledge is important; clarity in the conceptual framework for evaluation requires focus on what the project intended to deliver; the timing of the survey needs to be flexible; the composition and quality of the evaluation and data collection team should be considered; and, foremost, the quality of the data is important and should determine the choice of quantitative technique to be implemented.

90. It is important to remember that IE is not for every project or for every circumstance. It is important to ask: Who is it for? When should it be done? How should it be done? How will its findings be used?

91. ADB projects tend to be multi-dimensional and multi-component. The scope of IE needs to be clearly defined. Government buy-in is very important and it is useful to engage the DMC institutions from the beginning. However, the limited capacity of national institutions calls for timely technical support, at least in the short to medium term.

Comments by the Discussant

92. Hans Carlsson, Discussant and Lead Development Effectiveness Specialist of SARD, disagreed that the ADB process is very tight. Rather, he believes that resources are very tight. He agreed that ADB has to work more with DMCs and encourage impact evaluation. He added that a repository of IE work could be useful. The repository may host available baseline data, so that after project completion the IED can use these data for impact evaluation.

Presentation on Challenges in Integrating Impact Evaluation into Operations by Dinuk Jayasuriya, ANU College of Asia and the Pacific

93. The presentation focuses on the use of RCT and is largely based on the presenter's experiences at IFC, World Bank and the Red Cross. The World Bank has over 300 rigorous evaluations (completed and ongoing) as of FY 2011. Around 70% of the funding comes from the Spanish Trust Fund and 30% from the project. On the other hand, there are 23 IE studies at the IFC, which are mainly quasi-experimental. Funds come from the project itself.

94. Jayasuriya enumerated challenges for RCT and suggested ways to address them, as summarized in the table below.

Challenges
Buy-in for the evaluation from all stakeholders (donors/partner government/consultants/project team)
Understanding why and how we are doing the evaluation
Sufficient financial resources
Sufficient human resources. The ideal team should be composed of a principal investigator, evaluation technical partners, support staff, data collection staff and a partner coordinator
Ensuring statistical power and controlling for spillover effects
Delays in intervention
Timing of the project (evaluation is requested just prior to program implementation)
Ethical concerns

Recommended Responses
Ensure adequate incentives for all parties involved
Have sufficient human resources, including an independent in-country Field Coordinator
Ensure the budget is sufficiently large, approved and controlled by the evaluation team
Decide upfront the cost-benefit of having a smaller or larger team and the associated risks
Solicit expert opinion and conduct baseline survey data analysis
Secure a certain level of access to senior management to suggest alterations to the evaluation/program plans given any delay
Brief the project management team that delays may influence the evaluation timeline
Make people aware of the time taken to undertake evaluations
Allot sufficient time to design the evaluation
Consider the impact of no intervention given the local context
Consider if the evaluation is justified given possible concerns
Explain the benefits of evaluation

D.2 Comments/Clarifications/Questions from Participants and Response

On EARD's Presentation

95. One of the conditions for the RDD to be valid is that the units under study should not be able to influence which side of the discontinuity they fall on. Self-reporting allows households to under-report their income in order to qualify for the programs.

96. The presenter clarified that although income was self-reported, validation was done by surveyors. The presenter further emphasized that the team will continue to be responsive to changes in the program and to other factors that may come into play.

On Dr. Jayasuriya's Presentation

97. It is encouraging to know that the IFC is at about the same stage that the ADB is at now. An inventory of ADB's IE studies would show 20-30 completed and on-going evaluations. However, I may not agree with my colleague in the IED that IE studies are cheap. Resources still have to be mobilized.

98. Jayasuriya added that in the IFC, the money for IE studies comes from project funds. In the World Bank, 70% of the cost of IE studies is supported by the Spanish Trust Fund, while the remaining 30% comes from the project fund.

E. Closing Session

Closing Remarks by Juzhong Zhuang, Deputy Chief Economist, ERD

99. Juzhong Zhuang closed the conference by congratulating the presenters, discussants, participants and organizers on a very productive event. He noted that the objectives set for the conference had been achieved. He reiterated the need to build an IE culture in ADB and to mainstream IE in ADB operations. Zhuang expressed appreciation for this ADB-wide IE initiative, noting that mainstreaming is already happening. Furthermore, he acknowledged the contribution of Richard Bolt and Jesus Felipe in setting up the working group that made this initiative possible.

Annex 1 Program Agenda

TA7680 Implementing Impact Evaluation at ADB
Conference on Impact Evaluation: Methods, Practices, and Lessons
11 July 2012, Auditorium A, ADB Headquarters, Manila

Moderator: Cyn-Young Park, Assistant Chief Economist, ERD

8:30-9:00 Registration
9:00-9:10 Welcome Remarks: Changyong Rhee, Chief Economist, ERD
9:10-9:30 Keynote speech: Dean Karlan, Professor, Yale University
9:30-9:45 Coffee break

Session I: Evaluating Impacts on Health and Education
Chair: Edgar A. Cua, Deputy Director General, EARD

9:45-10:10 Santosh Kumar, Lecturer, University of Washington: "Does Improved Sanitation Reduce Diarrhea in Children in Rural India?"
10:10-10:35 Aleli Kraft, Associate Professor, University of the Philippines: "Are Premium Subsidies Effective in Expanding Enrolment in Social Health Insurance? Preliminary Results from the 3P Study"
10:35-11:00 Farzana Afridi, Assistant Professor, Indian Statistical Institute: "The Impact of School Meals on Student Participation and Nutrition in India"
11:00-11:30 Discussants: Richard Bolt, Advisor, SERD; Hyun Hwa Son, Senior Evaluation Specialist, IED
11:30-11:50 Open forum
11:50-1:00 Lunch

Session II: Presentations by Regional Departments
Chair: Ayumi Konishi, Deputy Director General, PARD

1:00-1:20 Presentation by Pacific Regional Department: Christopher Edmonds, Senior Economist, PARD / Craig Sugden, Resident Representative, SOTL: "Impact Evaluation of Labor-Based Road Work in Pacific"
1:20-1:30 Discussant: Santosh Kumar, Lecturer, University of Washington
1:30-1:50 Presentation by South Asia Regional Department: Natalie Chun, Economist, ERD: "Impact Evaluation of Raising Incomes of Small and Medium Farmers in Nepal"
1:50-2:10 Presentation by South Asia Regional Department: Hiranya Mukhopadhyay, Public Management Economist, SARD: "Evaluation of Public Resource Management Program in Assam through Synthetic Control Method"
2:10-2:30 Discussant: Farzana Afridi, Assistant Professor, Indian Statistical Institute
2:30-2:50 Presentation by Central and West Asia Regional Department: Tomomi Tamaki, Principal Economist / Nina Fenton, Young Professional, CWRD: "Impact Evaluation of Tbilisi, Georgia Metro Extension Project"
2:50-3:00 Discussant: Muhammad Ehsan Khan, Principal Economist, ERD
3:00-3:15 Coffee break

Session III: Presentations by Regional Departments
Chair: Juan Miranda, Director General, SARD

3:15-3:35 Presentation by East Asia Regional Department: Wendy Walker, Senior Social Development Specialist, EARD: "Impact Evaluation of Medicard and Food Stamp Programs"
3:35-3:45 Discussant: Aleli Kraft, Associate Professor, University of the Philippines
3:45-4:05 Presentation by Independent Evaluation Department: Ganesh Rauniyar, Principal Evaluation Specialist, IED: "Impact Evaluation of Selected Infrastructure - Lessons from Independent Evaluation"
4:05-4:15 Discussant: Hans G. Carlsson, Lead Development Effectiveness Specialist, SARD
4:15-4:35 Dinuk Jayasuriya, ANU College of Asia and the Pacific: "Challenges in Integrating Impact Evaluation into Development Operations"
4:35-5:05 Open forum
5:05-5:15 Closing Remarks: Juzhong Zhuang, Deputy Chief Economist, ERD

Annex 2 List of Participants

External Presenters
No. Name; Institution
1. Afridi, Farzana; Indian Statistical Institute
2. Jayasuriya, Dinuk; Development Policy Center, ANU
3. Karlan, Dean; Yale University
4. Kraft, Aleli; University of the Philippines
5. Kumar, Santosh; University of Washington

ADB Staff
Name
6. Amoranto, Glenita
7. Barcenas, Marissa
8. Bautista, Maria Socorro
9. Benizano, Noemi
10. Bernardo, Alely
11. Bolt, Richard
12. Butiong, Ronald Antonio
13. Carlsson, Hans
14. Cham, Rowena
15. Chun, Natalie
16. Cua, Edgar
17. Delos Santos, Maria Cristina
18. Edes, Bart
19. Fenton, Nina
20. Fernandez, Win
21. Graham, Benjamin
22. Hashizume, Shuji
23. Jain, Anupma
24. Jayewardene, Ruwani
25. Khan, Muhammad Ehsan
26. Konishi, Ayumi
27. Markus, Vorpahl
28. Miranda, Juan
29. Mukhopadhyay, Hiranya
30. Nam, Kee-Yung
31. Park, Cyn-Young
32. Parvez, Safdar
33. Patacsil-Cruz, Carol
34. Ramos, Mary Grace
35. Rauniyar, Ganesh
Department
Economic Research Department; Central Operations Services Office; Office of the Chief Economist; Central and West Asia Department; Southeast Asia Regional Department; Southeast Asia Regional Department; Central and West Asia Department; South Asia Regional Department; Economic Research Department; Economic Research Department; East Asia Regional Department; Central and West Asia Department; Regional and Sustainable Development Department; Central and West Asia Regional Department; Independent Evaluation Department; Private Sector Operations Department; Southeast Asia Regional Department; Central and West Asia Department; Economic Research Department; Pacific Regional Department; Environment, Natural Resources & Agriculture Division; South Asia Regional Department; South Asia Regional Department; Economic Research Department; Economic Research Department; Central and West Asia Regional Department; South Asia Regional Department; South Asia Regional Department; Independent Evaluation Department

36. Rhee, Changyong; Economic Research Department
37. Samson, Jindra; Economic Research Department
38. Santos, May; Southeast Asia Regional Department
39. Son, Hyun; Independent Evaluation Department
40. Sugiyarto, Guntur; Economic Research Department
41. Sumulong, Leah; Office of the Chief Economist
42. Tamaki, Tomomi; Central and West Asia Regional Department
43. Vandenberg, Paul; Economic Research Department
44. Walker, Wendy; East Asia Regional Department
45. Wu, Guoqi; Board of Directors
46. Yang, Yingming; Board of Directors
47. Ye, Jingjing; Economic Research Division
48. Yi, Jiang; Environment, Natural Resources & Agriculture Division
49. Yu, Fei; Environment, Natural Resources & Agriculture Division
50. Zhuang, Juzhong; Economic Research Department

External Participants/Consultants
No. Name; Institution
51. Britan, Gerald; USAID
52. Davies, Rosemarie; Intern, ADB
53. Gregorio, Cristina; Consultant, Central West Regional Department
54. Malang, Lyndree; Consultant, Economic Research Department
55. Nguyen, Tu Chi; Consultant, Independent Evaluation Department
56. Piza, Sharon Faye; Consultant, Economic Research Department
57. Quiao, Lotis; Consultant, Economic Research Department
