
Introduction to Monitoring and Evaluation Glossary

A
Activities:​ Activities are actions taken or work performed through which inputs such as funds,
technical assistance, and other types of resources are mobilized to produce specific outputs.
Anonymity: ​Anonymity in evaluation means that it is impossible for anyone – including the
evaluator - to determine the identity of participants.
Averages: Averages, known in statistics as measures of central tendency, provide
information about the typical observation. The most commonly used averages in program M&E
are the “mean” and the “median”.

B
Baseline Study: ​A baseline study is sometimes just called a “baseline,” and is a study which
captures the conditions before the start of a project. Findings from the study provide a starting
reference to track any changes in the future and are usually followed by another similar study
much later in the project called an endline study.
Belmont Report: ​The Belmont Report is a report published by the U.S. National Commission for
the Protection of Human Subjects of Biomedical and Behavioral Research in 1979 which defined
three fundamental principles of ethics in research: respect for persons, beneficence, and
justice.
Beneficence:​ Beneficence is an ethical principle which specifies the need to “do no harm” and
to maximize the possible benefits for participants, while minimizing potential risks.
Breadth: ​Breadth refers to variation across diverse cases, which is important when determining
sample size. If the aim of the evaluation is to capture variation, a larger sample size may be
required.

C
Causal inference:​ Causal inference refers to being able to attribute any observed changes to
the activity or program. The term causation is often used to refer to causal inference.
Closed-Ended Questions:​ Closed-ended questions are questions with limited response choices,
such as “Yes/No,” “5 years,” and multiple-choice options such as, “Strongly Agree,” “Agree,”
“Disagree,” and “Strongly Disagree,” which allow for numeric summaries and comparison of
responses across participants.
Codebook: ​A codebook is a compiled list of all the codes you’ve used when coding, which
includes a description of each code, rules about when a code should and should not be applied,
and an example of coded text for reference, in order to ensure codes are applied in a consistent
way.
Coding: ​Coding is the systematic process of organizing, compiling, and summarizing the large
amount of data that qualitative methods typically generate by applying labels, or codes, to
segments of data that are relevant to answering your evaluation question.
Confidence Interval: ​Confidence intervals are used to determine statistical significance.
Confidence intervals describe a range of plausible values for an unknown quantity being estimated.
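To make the idea concrete, here is a minimal sketch in Python of a 95% confidence interval for a proportion using the normal-approximation (Wald) formula; the figures (32 of 80 clients) are hypothetical and not from the glossary.

```python
import math

# Hypothetical monitoring data: 32 of 80 sampled clients completed all visits.
successes, n = 32, 80
p_hat = successes / n                      # observed proportion (0.40)

# Normal-approximation (Wald) 95% confidence interval: p_hat +/- 1.96 * SE
se = math.sqrt(p_hat * (1 - p_hat) / n)    # standard error of the proportion
lower, upper = p_hat - 1.96 * se, p_hat + 1.96 * se

print(f"Estimate: {p_hat:.2f}, 95% CI: ({lower:.2f}, {upper:.2f})")
```

Here the interval (roughly 0.29 to 0.51) is the range of plausible values for the true proportion.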
Confidentiality: ​Confidentiality refers to a condition in which the evaluator knows the identity
of the participant but takes care to protect others from discovering it.
Convenience sampling:​ Convenience sampling is when a group is sampled out of convenience
(i.e. because they are easy to reach), rather than taking a systematic approach to sampling like
probability or purposive sampling.
Convergent Design:​ In the convergent design, also known as the concurrent or parallel design,
the qualitative and quantitative data are collected and analyzed in parallel, or during a similar
time frame.

D
Data:​ Data is information that can be quantitative (expressed as numbers and can be measured
or counted) or qualitative (descriptive and usually represented by text, such as transcripts of
audio or video recordings).
Data Abstraction: ​Data abstraction is the process of collecting quantitative data from
pre-existing data sources, which can be in electronic form, such as data from a national
electronic health information system, or paper-based, such as patient charts, patient health
cards, facility registries, and facility log-books.

Data Collection: ​ Data collection is the process of gathering data in a systematic way. Data
collection is important in program M&E because it allows us to systematically gather data that
informs data-driven decisions about the program.

Data Collection Tool: A data collection tool is an instrument used to gather data in a structured way. Examples of
data collection tools include surveys, observation guides, interview guides, and focus group
discussion guides.

Data Processing: ​Data processing refers to the process of taking data that was collected and
preparing it for data analysis using practices such as data entry, transcription, data quality
checks, and data cleaning.
Document Review: ​Document review is a way of systematically collecting data by reviewing
existing documents that are useful to answer the evaluation question, such as funding
proposals, program reports, meeting minutes, emails, newspaper articles, and government
policy documents.
Data Saturation: ​Data saturation is the point where new data no longer brings additional
insights to your evaluation questions. For quantitative data, when you have achieved a
sufficient sample size, you have reached data saturation, whereas qualitative methods
require continual review and analysis of incoming data so you know when you have reached
saturation and can cease data collection.
Data Visualization: ​Data visualization is the graphical representation of data, such as bar charts,
line graphs, and heat charts, which are tables with color coding.
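As an illustrative sketch (assuming the matplotlib library is available; the quarterly counts are hypothetical), a simple bar chart might be produced like this:

```python
import matplotlib.pyplot as plt

# Hypothetical monitoring data: number of clients served in each quarter.
quarters = ["Q1", "Q2", "Q3", "Q4"]
clients = [120, 150, 170, 210]

plt.bar(quarters, clients)          # bar chart of clients served per quarter
plt.xlabel("Quarter")
plt.ylabel("Clients served")
plt.title("Clients served per quarter")
plt.show()
```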
Declaration of Helsinki​: The Declaration of Helsinki​ ​was developed by the World Medical
Association in 1964. Its principles include that an individual has a right to make informed
decisions about participating in research, and that an individual’s welfare and interests should
be prioritized over science and any potential contribution to society.
Depth: Depth refers to a deeper exploration of a phenomenon in data collection, which is often
better achieved using in-depth interviews with a smaller number of participants.
Descriptive Analysis: ​Descriptive analysis is a way to summarize quantitative data using
measures such as frequencies, percentages, and averages.
Descriptive Statistics: Descriptive statistics, often referred to as “summary statistics,” refer to
group level measures such as frequencies, percentages, and averages, which can help
summarize all the available data to give program teams and stakeholders a quick snapshot of
the program, as well as help to identify patterns in the data.
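A minimal sketch of computing frequencies, percentages, and an average with Python's standard library, using a hypothetical list of survey responses:

```python
from collections import Counter
from statistics import mean

# Hypothetical survey responses: number of clinic visits in the past year.
visits = [0, 1, 1, 2, 2, 2, 3, 4, 4, 6]

frequencies = Counter(visits)                            # count of each value
percentages = {value: 100 * count / len(visits)
               for value, count in frequencies.items()}  # share of the whole
average = mean(visits)                                   # mean number of visits

print("Frequencies:", dict(frequencies))
print("Percentages:", percentages)
print("Mean:", average)
```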
Difference-in-differences Design: A difference-in-differences design, also known as “DID,” is
a design which compares the average change from pre to post among those who participate in
the activity or program to the average change from pre to post among those who did not
participate in the activity. The non-participant group is known as the “control” group, because
they show us what may have happened to the participant group in the absence of the activity
or program.
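A minimal sketch of the DID calculation, using hypothetical pre and post averages for the participant and control groups:

```python
# Hypothetical average scores before and after the program.
participant_pre, participant_post = 40, 60   # program participants
control_pre, control_post = 42, 50           # non-participants ("control" group)

participant_change = participant_post - participant_pre   # 20-point change
control_change = control_post - control_pre               # 8-point change

# The DID estimate is the participant change minus the control change.
did_estimate = participant_change - control_change
print("Difference-in-differences estimate:", did_estimate)  # 12
```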
Dissemination Plan: A dissemination plan is used to guide the process of determining the
evaluation purpose and use. It includes the objective of conducting the evaluation, the effect
that dissemination is intended to have, the primary audience, how the findings will be used, and
how and when the findings will be disseminated.

E
Economic Evaluation: ​Economic evaluations (such as cost analysis, cost-effectiveness, and
cost-benefit) are conducted to measure how much it costs for a program to achieve its health
benefits, which is useful to policy makers and funders to determine whether the resources to
invest in the program are available as well as whether the program is a worthwhile investment
in comparison to other health programs.
Endline Study: ​An endline study assesses the degree of change from baseline to endline during
the project period.
Ethics: ​Ethics are a set of norms that guide how we expect individuals and groups to behave.
Ethics are linked to cultural and societal values at a specific time in history and may change as
attitudes evolve.
Evaluation: ​An evaluation is the systematic process of collecting, analyzing, and using data
about a program’s design, implementation, and results to determine its effectiveness. In other
words, a program evaluation can provide insight on how well a program was implemented,
whether the program made a difference, and what aspects of the program made it successful.
Exclusion Criteria: ​Exclusion criteria​ ​are the characteristics that exclude possible participants
from being included in the evaluation, and are often put in place to protect potential
participants and to maintain proper ethical standards.
Explanatory Sequential Design: ​ ​In the explanatory sequential design, quantitative data are first
collected and analyzed, and then qualitative data are collected and analyzed to help explain the
quantitative results.
Exploratory Sequential Design: ​ ​In the exploratory sequential design, qualitative data are first
collected and analyzed to inform quantitative data that is then collected and analyzed. This
design is often used to inform survey development.

F
Focus Group Discussion: A focus group discussion, often referred to by the abbreviation “FGD,”
aims to explore a specific set of topics in-depth with a group of people to gather qualitative data
on their perspectives and experiences. Focus group discussions differ from interviews in that
they explicitly rely on group interaction to generate insights that would be less possible without
the interaction found in a group.

H
Human Subjects: A human subject is a living person about whom data are obtained, either
through interaction with the individual or via identifiable private information.
I

Impact:​ Impact refers to the long-term, cumulative effects of programs or interventions over
time on what they ultimately aim to change, such as a reduction in HIV infections and
AIDS-related morbidity and mortality.
Impact Evaluations: ​An impact evaluation is conducted to assess if any observed changes can
be attributed to, or caused by, your project. These are less commonly done because they can
be resource intensive, demand a lot of time, require high quality data, and need highly skilled
evaluators.
Impact Statement:​ An impact statement is often the same as the overall, long-term goal of the
program. It answers the questions: What long-term change do we hope to see as a result of this
program? What problem or need does it address? The impact statement helps establish the
overall direction of the work to ensure shared understanding among program team members
and other collaborators.
Implausible data: Implausible data refers to unlikely data, or data that does not fall within a
reasonable range of expected values. For example: If you are collecting age data, you could set
the plausible data range to be 0 to 122 years old, and an age value of 170 would be considered
implausible.
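A minimal sketch of flagging implausible values using the 0-to-122 age range from the definition above (the age values themselves are hypothetical):

```python
# Hypothetical age values recorded during data collection.
ages = [23, 45, 170, 61, -4, 38]

# Plausible range from the definition above: 0 to 122 years.
MIN_AGE, MAX_AGE = 0, 122

implausible = [age for age in ages if not MIN_AGE <= age <= MAX_AGE]
print("Implausible values to review:", implausible)   # [170, -4]
```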
Implementation, Monitoring and Evaluation Phase: ​The implementation, monitoring, and
evaluation phase includes monitoring, and process, outcome, and impact evaluations as well as
dissemination activities.
Inclusion Criteria: ​Inclusion criteria are the key characteristics of the setting and participants
that determine if they will be included in the evaluation.
Indicator: ​An indicator is a unit of measurement that helps programs assess progress toward a
desired result. Indicators are important because they help define how interventions will be
measured and provide evidence about whether a program’s activities are making a difference.
Inferential Analysis: ​Inferential analysis allows the analyst to make inferences, or conclusions,
about a program or the population based on statistical significance.
Informed Consent: Informed consent is a process that involves telling potential participants
about an evaluation in a way they understand so they can make a voluntary decision about
participating.
Inputs:​ Inputs are the resources that are needed to carry out the activities. These resources are
typically categorized into human, financial, physical, organizational, and community or
system-level resources.
Institutional Review Board: ​Institutional Review Boards (sometimes called IRBs) are
committees who formally review and approve research and evaluation activities to ensure that
they properly address human subjects protections.
Interrupted Time Series:​ Interrupted time series (sometimes called regression discontinuity)
design is like a pre and post design, with multiple pre and post points. The more pre and post
points, the stronger the design is in being able to make causal inference, that is, to be able to
attribute any observed changes to the activity or program, so this is a useful design for impact
evaluations.

Interview: An interview involves a trained interviewer who uses an interview guide to ask
neutral questions of an interviewee, who responds to those questions. Interviews are a way of
gathering in-depth qualitative data.

J
Justice:​ Justice is an ethical principle in research which specifies that the costs and benefits of
research should be distributed equitably. That is, no one group should bear all the burdens or
have access to all the benefits of the research outcomes.

K
Key Informant Sampling: Key informant sampling is one type of purposive sampling in which
key informants, or “insiders”, are selected because of their special knowledge, connections,
status, communication skills, and willingness to share what they know with the evaluator about
the topic of interest to help the evaluator obtain a more nuanced understanding of the target
population and its context.

L
Likert Scale: A Likert scale is a question that uses a scale to quantitatively measure perceptions
and experiences. Likert scales typically use 3, 5, or 7 points. Examples include: level of
agreement, such as agree, neither agree nor disagree, and disagree; frequency, such as always,
often, sometimes, rarely, and never; and level of satisfaction, such as very satisfied, satisfied,
neutral, dissatisfied, and very dissatisfied.

Logic Model:​ A logic model is a visual representation to systematically map the relationship
between a project’s resources, activities, and the results of those activities. It illustrates a
project’s assumptions of how it intends to contribute to improvements in the target population.
Logic models will be covered in much greater detail elsewhere.
M
Maximum Variation Sampling: ​Maximum Variation Sampling, often referred to as
Heterogeneity Sampling, is a type of purposive sampling in which you deliberately select a wide
range of cases in order to capture diversity on a specific phenomenon of interest that is
relevant to the evaluation question. The purpose behind this sampling technique is, first, to
document and describe the variation and, second, to identify key patterns that are shared
across the diverse cases.
Mean: The mean is the average value in a data set. The “mean” is calculated by adding up all
the values and then dividing that by the number of values.
Median: ​The median is the middle value in a distribution where the results are placed in order
by value.
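A minimal sketch of both calculations (the mean defined above and the median just defined), using a hypothetical set of seven values:

```python
from statistics import mean, median

# Hypothetical data: counseling sessions attended by seven clients.
sessions = [2, 3, 3, 5, 7, 8, 14]

# Mean: add up all the values and divide by the number of values.
mean_sessions = sum(sessions) / len(sessions)          # 42 / 7 = 6.0

# Median: place the results in order by value and take the middle one.
ordered = sorted(sessions)
median_sessions = ordered[len(ordered) // 2]           # 5 (odd number of values)

# The statistics module gives the same results (and handles even-length lists).
assert mean_sessions == mean(sessions)
assert median_sessions == median(sessions)
print("Mean:", mean_sessions, "Median:", median_sessions)
```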
Mixed Methods: A mixed methods design is an evaluation design that combines quantitative
and qualitative methods. There are three commonly used mixed methods designs in program
evaluation: the explanatory sequential, exploratory sequential, and convergent designs. Each of
these describes the order in which quantitative and qualitative data are collected and how they
inform each other.
Mixed Purposive Sampling: ​Mixed purposive sampling is when you combine two or more
sampling strategies in order to address your evaluation question. Mixed purposive sampling can
allow you to compare the information gathered through different sources and allows for more
flexibility in terms of serving the needs and interests of different stakeholders.
Mode: The mode is the value that appears most frequently in a group of data.
Monitoring:​ Monitoring refers to the ongoing, routine collection and analysis of data about a
program’s activities in order to measure program progress. Monitoring is concerned with
answering the fundamental question, “what are we doing?” Monitoring usually focuses on the
processes that occur during implementation.
Monitoring and Evaluation Plan: A monitoring and evaluation plan is a table or document
that describes all the monitoring and evaluation activities for a particular project. The purpose
of this tool is to track progress toward project objectives and determine whether you have
achieved your desired results. M&E plans will also be discussed in-depth elsewhere.
Monitoring Deliverables: ​Monitoring deliverables include tables that contain information on
activities conducted and their associated indicators, as well as charts and/or graphs to depict
changes over time.

N
Needs Assessment: ​A needs assessment is a systematic process used to identify needs (or
“gaps”) in the target population, examine the causes of those gaps, and prioritize strategies to
address them. Needs are defined by focusing on the difference between “what is” and “what
should be.”
Null Hypothesis: ​A null hypothesis is a statement that there is no association between a
variable of interest and an outcome.

O
Observations:​ Observations involve collecting new data by observing and systematically
measuring and describing behaviors, events, and objects of interest, and documenting what is
observed. Observations can gather quantitative-only, qualitative-only, or a mix of both
quantitative and qualitative data.
Open-Ended Questions: Open-ended questions are questions that allow people to respond
freely in their own words, such as “Please describe your experience at the clinic today.”
Open-ended questions are useful when collecting qualitative data from observations, surveys,
interviews, and focus groups. Open-ended questions have the ability to draw out in-depth
answers that are exploratory, meaningful, and culturally relevant.
Outcome:​ Outcomes are the short- and medium-term changes you hope to see that contribute
to the impact. They are not your program activities, but instead, are the changes you hope to
see as a result of the program activities. A program typically needs to achieve several outcomes
in order to achieve its long-term impact.
Outcome Evaluations: ​Outcome evaluations, sometimes called final evaluations, are commonly
conducted at the end of a project to assess how well it achieved its planned objectives and to
understand the difference it made. Data collected from baseline and endlines may be used for
the outcome evaluation.
Outputs:​ Outputs are the immediate effects of program or intervention activities; the direct
products or deliverables of program or intervention activities, such as the number of HIV
counseling sessions completed, the number of people served, the number of condoms
distributed.

P
Percentage: ​Percentages are a common descriptive statistic used in program monitoring.
Percentage is a number that represents the proportion of the whole, standardized to 100.

Planning Phase: The planning phase is the second phase, occurring prior to implementation,
and is directly informed by the findings from the needs assessment. This phase often includes
the development of a logic model and M&E plan, and
conducting a baseline study.
Population: A population is the complete set of people, organizations, events, or other
observations that serves as the source of data. For example, “women with breast
cancer,” “HIV treatment centers,” or “emergency room visits in March,” are all examples of a
“population” in statistics.
Post-only design: A post-only design is one in which data are collected and analyzed only after
the activity or program has been completed, which is an appropriate design if you are looking
to assess whether a certain standard has been met once the activity or program is complete.
Pre and Post Design:​ The pre and post design, also known as the before and after design, uses
the same data collection tool before the activity or program takes place and then again after
the activity or program has completed.
Probability Sampling:​ Probability sampling uses a random approach to select the sample from
the population. Random means that each unit in the population has an equal chance, or
probability, of being selected. For example, if you are selecting from a population of 100 health
care workers, each health care worker should have a 1 in 100 chance for being chosen.
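A minimal sketch of drawing a simple random sample from the 100-health-care-worker example above, using Python's standard library (the ID format is made up for illustration):

```python
import random

# Hypothetical sampling frame: 100 health care workers identified by made-up IDs.
population = [f"HCW-{i:03d}" for i in range(1, 101)]

# Simple random sample of 10: each worker has the same chance of being selected.
sample = random.sample(population, k=10)
print(sample)
```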

P-value: ​P-value (or probability value) refers to the probability of​ obtaining test results at least
as extreme as the results actually observed, assuming that the null hypothesis is correct.
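A minimal sketch of this idea using a simple permutation test in Python; the two groups of scores are hypothetical, and the approach (shuffling group labels to see how often a difference at least as extreme as the observed one arises by chance) is just one way to approximate a p-value:

```python
import random

# Hypothetical scores for a participant group and a comparison group.
group_a = [68, 74, 71, 80, 77, 69]
group_b = [62, 65, 70, 66, 64, 71]

observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))

# Shuffle the group labels many times and count how often a difference at least
# as extreme as the observed one occurs when the null hypothesis is true.
pooled = group_a + group_b
extreme = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    diff = abs(sum(pooled[:6]) / 6 - sum(pooled[6:]) / 6)
    if diff >= observed:
        extreme += 1

p_value = extreme / trials
print("Approximate p-value:", p_value)
```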
Problem Statement:​ A problem statement is a statement that provides a clear and
comprehensive understanding of the specific problem that the program aims to address. The
problem statement quickly tells an audience what the problem is. It should include the who,
where, what, and why – with the WHY being an explanation of why the issue matters.
Process evaluation: ​A process evaluation may be conducted to determine whether project
activities have been implemented as intended. Process evaluations often include a detailed
description of a project’s strengths and weaknesses and provide evidence-based
recommendations for project improvement.
Program Frameworks: Program frameworks are a visual tool to map out what you are trying
to achieve and how you are going to achieve it.
Program Goal: ​A program goal is a broad, big picture statement that describes the ultimate aim
of the program, which establishes the overall direction, focus, and scope of a program. It
answers the questions, “What long-term change do we hope to see as a result of this
program?” and “What problem does it address?”
Program Theory: ​A program theory is a theory of the cause and effect of the program. A
program theory explains why a program is supposed to work by describing the chain of cause
and effects that lead to the achievement of a specific goal.
Purposive Sampling: ​Purposive sampling is an approach to sampling that strategically selects
participants based on predefined criteria that are relevant to answering your evaluation
question.
Q
Qualitative Indicators​: Qualitative indicators are indicators that are reported in words and
complement quantitative indicators by providing a fuller story about the context. Qualitative
indicators are useful for capturing participant perceptions, attitudes, or opinions about a given
topic, which can be difficult to measure numerically.
Qualitative Methods: ​Qualitative methods refer to the use of qualitative data, typically
expressed as text, to describe the variation, relationships, and individual experiences with
program outputs, outcomes and impacts.
Quantitative Indicators​: Quantitative indicators measure quantities or amounts: “how much,”
“how many,” or “how often”. They are expressed numerically, such as whole numbers,
percentages, ratios, and averages.
Quantitative Methods: ​Quantitative methods refer to the use of quantitative, or numeric, data
to measure the extent to which program outputs, outcomes, and impacts have been achieved.
Quasi-Experimental Designs​: Quasi experimental designs are ones in which the observed group
is not randomized to participate in the program. That is, the evaluators are not randomly
assigning who does and does not participate in program activities.

Quota Sampling:​ In quota sampling, you decide while designing the study how many people
with certain characteristics, such as age, gender, occupation, and health status to include as
participants. These chosen characteristics allow you to focus on participants you think will be
the most likely to offer useful information on the evaluation topics.

R
Randomized Experiments: ​Randomized experiments, also commonly known as randomized
controlled trials, are a type of design that randomly assigns individual people or groups to
either be exposed to the program or not, and the random assignment reduces the chances that
the participant and control groups differ in some way.
Respect for persons: ​Respect for persons is an ethical principle in research that recognizes that
people should have autonomy. In other words, people should be able to voluntarily choose if
they want to take part in research.
Routine Data Sources: Routine data sources include health information systems and service
records such as logbooks and patient registers. Programs may also obtain monitoring data from
routine program records and reports, like training reports, inventories, employment records, or
meeting minutes.

S
Sample: ​A sample is a smaller portion of the population.
Sampling: ​Sampling refers to identifying who and how many people to study. In some sampling
approaches, like probability sampling, even if you are not able to speak to all women with
breast cancer about the symptoms they experience, you can speak with your sample, a smaller
group of women with breast cancer, to draw conclusions about the symptoms that the greater
population of women with breast cancer experience. In other sampling approaches, like
purposive sampling, your goal is not necessarily to have a sample that is representative of the
population, but rather to select a sample that can provide useful, in-depth information on the
topic of interest.
Semi-Structured Interviews: Semi-structured interviews are a type of interview which relies on
a prepared list of questions but also allows interviewers to ask follow-up questions or probe
for further clarification when something is unclear or unexpected.
SMART Objective: A SMART objective is one that is Specific (i.e. the objective is specific enough
that the expected change can be measured using a number or percentage); Measurable (i.e.
able to be quantified to understand the amount of change expected); Appropriate (i.e. should
fit within the scope of the work and mission of the organization); Realistic (i.e. should be
achievable and within the control of the organization); Timebound (i.e. clearly specify a time
within which the objective will be achieved).
Snowball Sampling: Snowball sampling, also known as chain referral sampling, is when
already-sampled participants (often key informants) use their networks to refer the evaluator
to other people who could potentially participate in and contribute to the evaluation.

Spot checks: Spot checks are when data quality checks are conducted on a selected portion of the data.
Stakeholders: ​Program evaluation stakeholders are individuals or groups who would potentially
use the evaluation findings to make programmatic or policy decisions or be affected by these
resulting decisions. This could include the program’s priority population, government
stakeholders such as the Ministry of Health who may make policy or guidelines decisions based
on your evaluation findings, or the donor funding your program.
Statistical significance: ​Statistical significance refers to the output of a statistical test which
allows you to determine if the observed results are due to chance/random error or can be
attributed to the intervention. P-values and confidence intervals are used to determine if you
can infer statistical significance.
Structured Interviews: ​Structured interviews​ are a type of interview which require interviewers
to present the same questions in the same order to minimize variation. The tool often
resembles a qualitative survey more than an interview guide and may include both open- and
closed-ended questions.

Surveys: ​A survey is a way of systematically collecting data from people using a structured tool,
typically to make numeric summaries and generalizations about a population. Surveys can ask
closed-ended questions to collect quantitative data, open-ended questions to collect qualitative
data, or a mix to collect both quantitative and qualitative data.

T
Target: ​A target is the desired end point for an indicator; a specific, planned level of a result to
be achieved within a specific timeframe, with a given amount of resources. Targets determine
the size or amount of the intended change. Setting targets is often a requirement for
performance-based funding.
Thematic Analysis:​ Thematic analysis is a method of systematically identifying, organizing, and
offering insight into patterns of meaning or themes across a dataset. It involves careful,
detailed reading of data to identify themes that relate to a specific evaluation question.
Theory of Change: A theory of change is a visual depiction of how and why a desired change
will happen. The one thing that the theory of change always has is a long-term goal at the end
with items pointing toward that end goal. From there, team members discuss what conditions
need to be in place to address the current problem or gap.
Transcription: Transcription is the process of creating a transcript, or a text-based version, of an
audio or video recording.
Triangulation: Triangulation refers to using multiple research methods, such as quantitative
and qualitative, to compare and support findings.

U
Unstructured interviews:​ ​Unstructured interviews encourage ​and prompt participants to talk
in-depth about the topic being investigated without using a pre-written question list.

V
Verbal consent process:​ A ​ verbal consent process involves the evaluator reading and
explaining an oral version of a consent form to participants.
Vulnerability: Vulnerability refers to a person's inability to protect themselves when faced with
physical, social, economic, environmental, or other threats.

W
Written consent process:​ ​A written consent process involves using the consent form as a guide
to verbally explain the study. The consent form is a document that includes all the information
previously described and this should serve as the basis for a meaningful exchange between the
evaluator and the participant.
