
Chapter Five

Data Collection Methods

Debrework Tesfaye
B.Ed., M.Sc., Assistant Professor
Data Collection Methods
• Data collection aims to address the research
question, and is critically important to a study’s
success.
• Without quality data collection techniques, the
accuracy of the research is easily challenged.
• It is therefore essential that the researcher be
familiar with the various techniques, including their
advantages and disadvantages, so that they can
select the one most suitable for the study purpose,
setting and population.
The data-collection process
• Data collection describes the way in which the researcher approaches
answering the research question.
• It provides an audit trail which includes a clear and specific
explanation of how data were collected, how the results or findings
were derived as well as the rationale for the method selected.
• When planning the data collection process, the researcher is guided by
five important questions:
1. What?
2. How?
3. Who?
4. Where?
5. When?
What data will be collected?
• The researcher must carefully consider what
information is needed to answer the research question.
• For example, does the question call for knowledge, or
attitudes or behaviors?
• If the researcher is concerned with the way crisis
situations affect students, the what of data collection
becomes students’ behaviors or responses in crises.
• The researcher must also consider whether to quantify
the data or analyze it qualitatively.
• In the former case, a decision must be made regarding
the level of measurement, or the measurement scale to
be used.
How will data be collected?
• The researcher must use an instrument to gather the data.
• The type can vary from a checklist to a questionnaire to a
sophisticated physiological measure.
• The choice of instrument is a decision that should be made
only after careful consideration. It is also important that the
manner in which data is captured is reliable.
• For example, if voice recorders are used, the researcher should
ensure these are in working order, and that fieldworkers know
how to use them.
• Raw data needs to be stored in a safe place, especially if it has
not yet been anonymised.
Who will collect the data?
• Teams of researchers usually collect data, and
people outside the research team may also be used.
• Data collectors can be paid for their services.
• However, it is necessary to ensure that data is
gathered in the same manner whenever more than
one person is involved.
• In addition, data collectors need training, and the
reliability of the collected data needs to be checked.
Where will the data be collected?
• The setting for data collection must be carefully
determined.
• It could take place in a controlled laboratory, a
sport field, a classroom, a ward, a clinic, a home,
a community center, within a specific region, and
so on.
Types of data
• Primary data: is data that you collect through
measurements, questionnaires, interviews and
observations which you use to investigate your
research problem.
• Secondary data: is previously published data found
in books, journals, government publications,
websites and other forms of media. Secondary data
is used to form rationales for your research and to
support or counter-argue your research findings.
Scales of Measurement and Variables
• Measurement is the foundation of any scientific
investigation.
• Measurement is the assignment of numbers to objects.
• Everything we do begins with the measurement of
whatever it is we want to study.
• The level of measurement is the relationship of the values
that are assigned to the attributes for a variable.
Variables in Research
INTRODUCTION
• Each person/thing we collect data on is called an observation (in our
research work these are usually people/subjects).
• Observations (participants) possess a variety of
characteristics.
• If a characteristic of an observation (participant) is the same for every
member of the group, i.e. it does not vary, it is called a constant.
• If a characteristic of an observation (participant)
differs for group members, it is called a variable.
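The distinction between a constant and a variable can be checked mechanically once data are recorded. The following is a minimal sketch; the function name `classify_characteristic` and the player records are hypothetical and exist only for illustration.

```python
def classify_characteristic(values):
    """Return 'constant' if every observation shares one value, else 'variable'."""
    distinct = set(values)
    if not distinct:
        raise ValueError("need at least one observation")
    return "constant" if len(distinct) == 1 else "variable"


# Hypothetical observations: four players from the same club.
players = [
    {"club": "City FC", "age": 19, "position": "defender"},
    {"club": "City FC", "age": 23, "position": "striker"},
    {"club": "City FC", "age": 21, "position": "defender"},
    {"club": "City FC", "age": 19, "position": "goalkeeper"},
]

for name in ("club", "age", "position"):
    column = [player[name] for player in players]
    print(name, "->", classify_characteristic(column))
```

In this group, club does not vary and is therefore a constant, while age and position differ between members and are variables.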
MEANING OF VARIABLES
• A variable is a concept, construct or abstract idea that can be
described in measurable terms. In research, this term refers to the
measurable characteristics, qualities, traits, or attributes of a particular
individual, object, or situation being studied.
• Anything that can vary can be considered a variable. For instance,
age can be considered a variable because age can take different
values for different people or for the same person at different
times. Similarly, income can be considered a variable
because a person's income can be assigned a value.
• Variables are properties or characteristics of
some event, object, or person that can take on
different values or amounts.
• A variable is not only something that
we measure, but also something that we can
manipulate and something we can control for.
TYPES OF VARIABLES
Dependent vs Independent Variables
• Independent variables are variables which are manipulated
or controlled or changed. An independent variable is what the
researcher studies to see its relationship or effects.
– Presumed or possible cause
• Dependent variables are the outcome variables and are the
variables for which we calculate statistics. The variable which
changes on account of the independent variable is known as
the dependent variable, i.e. it is influenced or affected by the
independent variable.
– Presumed result (effect)
The Relationship between
Independent and Dependent Variables
Example
• Imagine that a tutor asks 100 students to complete a
maths test. The tutor wants to know why some
students perform better than others. Whilst the tutor
does not know the answer to this, she thinks that it
might be because of two reasons: (1) some students
spend more time revising for their test; and (2) some
students are naturally more intelligent than others. As
such, the tutor decides to investigate the effect of
revision time and intelligence on test performance.
Solution
• Dependent variable: test mark (measured
from 0 to 100)
• Independent variables: revision time
(measured in hours) and intelligence (measured
using IQ score)
Activity
• Identify the dependent and independent
variables for the following examples:
1. A study of coach-player interaction at different
levels of coaching.
2. A comparative study of the professional
attitudes of secondary school teachers by
gender.
Solution
1. Independent variable: Level of schooling, four categories –
primary, upper primary, secondary and junior college.
Dependent variable: coach – player interaction
2. Independent variable: Gender of the teacher –
male, female.
Dependent variable: Score on a professional attitude
inventory.
Quantitative and Qualitative Variables
• Quantitative variables are ones that exist
along a continuum that runs from low to high.
Interval and ratio variables are quantitative.
• Quantitative variables are sometimes called
continuous variables because they have a
variety (continuum) of characteristics.
• Height in inches and scores on a test would
be examples of quantitative variables.
• Qualitative variables do not express
differences in amount, only differences in kind.
• They are sometimes referred to as categorical
variables because they classify by categories.
Ordinal and nominal variables are qualitative.
• Nominal variables such as gender, religion, or
eye color are categorical variables.
Variable
– Qualitative: Nominal, Ordinal
– Quantitative: Interval, Ratio
Continuous vs Discontinuous Variables
• If the values of a variable can be divided into
fractions then we call it a continuous variable.
• Such a variable can take an infinite number of
values. Income, temperature, age, or a test
score are examples of continuous variables.
• These variables may take on values within a
given range or, in some cases, an infinite set.
Variables and examples
• Dichotomous variables
– Gender: male and female
– Type of property: commercial and residential
– Pregnant and non-pregnant
– Alive and dead
– HIV positive and HIV negative
– Education: literate and illiterate
• Trichotomous variables
– Residence: urban, semi-urban and rural
– Religion: Orthodox, Catholic, Muslim
• Multiple variables
– Blood groups: A, B, AB and O
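The naming above depends only on how many categories a variable has. A minimal sketch of that rule follows; the function name `name_by_category_count` is hypothetical, and the labels are the ones used in this chapter.

```python
def name_by_category_count(categories):
    """Name a categorical variable by its number of distinct categories."""
    count = len(set(categories))
    if count < 2:
        raise ValueError("a variable needs at least two categories")
    if count == 2:
        return "dichotomous"
    if count == 3:
        return "trichotomous"
    return "multiple"


print(name_by_category_count(["male", "female"]))
print(name_by_category_count(["urban", "semi-urban", "rural"]))
print(name_by_category_count(["A", "B", "AB", "O"]))
```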


Extraneous variable
• It sometimes happens that, after completing a
study, we find that the actual result is not
what we expected. In spite of taking all the
possible measures, the outcome is unexpected. This
is because of extraneous variables.
• Variables that may affect research outcomes but
have not been adequately considered in the
study are termed extraneous variables.
• Extraneous variables exist in all studies and can
affect the measurement of study variables and
the relationship among these variables.
• Extraneous variables that are not recognized until
the study is in process, or are recognized before the
study is initiated but cannot be controlled, are
referred to as confounding variables. These
variables interfere with the results of the existing
activity.
• Certain external variables may influence the
relationship between the research variables, even
though the researcher cannot see them. These variables are
called intervening variables.
Four Types of Measurement Scales
Nominal
Ordinal
Interval
Ratio
• The scales are distinguished on the relationships
assumed to exist between objects having different scale
values
• The four scale types are ordered in that all later scales
have all the properties of earlier scales—plus
additional properties
Nominal Scale
• Nominal scales are used when persons, events or other
phenomena are separated into mutually exclusive
categories: for example, married or single, divorced or
widowed, dead or alive, win/lose, yes/no .
• Not really a ‘scale’ because it does not scale objects along
any dimension
• It simply labels objects
• The values “name” the attribute uniquely.
• The value does not imply any ordering of the cases, for
example, jersey numbers in football.
• Gender: Male = 1, Female = 2
• Religious affiliation: Catholic = 1, Protestant = 2,
Orthodox = 3, Muslim = 4, Other = 5
Categorical data are measured on nominal scales,
which merely assign labels to distinguish categories.
Ordinal Scale
• Used for variables that can be categorized and
rank ordered or assessed incrementally.
• Numbers are used to place objects in order
when attributes can be rank-ordered.
• It allows you to say who is best and second best,
but does not tell you the difference between the
two.
• This type of data provides the researcher with a
rank order, but does not give an exact value.
• Distances between attributes do not have any
meaning. For example, code educational attainment as
0 = less than H.S.; 1 = some H.S.; 2 = H.S. degree;
3 = college diploma; 4 = college degree; 5 = post college.
Is the distance from 0 to 1 the same as from 3 to 4?
• There is no information regarding the differences
(intervals) between points on the scale.
• For example, the feelings of a person are classified not
only as happy or sad, but also more specifically as
extremely happy, happy, indifferent, unhappy or
extremely unhappy, thus enabling the comparison
between degrees of a person’s happiness.
Interval Scale
• An interval scale is a scale on which equal intervals between
objects represent equal differences.
• It is used when the distance between attributes has meaning, for example,
temperature (in Fahrenheit): the distance from 30 to 40 is the same as
the distance from 70 to 80.
• Note that ratios don’t make any sense: 80 degrees is not twice
as hot as 40 degrees (although the attribute values are).
• A 10-degree difference has the same meaning anywhere along
the scale
• The interval differences are meaningful
• But, we can’t defend ratio relationships
• If body temperature is being measured, a reading of 36.2 °C
could be one category, 37.0 °C another and 37.8 °C a third.
• The researcher would conclude that there is a difference of
0.8 °C between the first and second categories, as well as between
the second and third, indicating equal intervals.
• Similarly, if the researcher undertakes a study in which a
psychological test is used, the scores would represent interval
data. Two hundred people completed the test and 90 obtained
scores between 40 and 49, 30 obtained scores between 50 and
59, 60 obtained scores between 60 and 69, and 20 obtained
scores between 70 and 79.
• The scores are categorized into interval classes, which means
that they are ranked and the measurements between each class
are equal.
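The temperature example can be checked numerically. The sketch below converts the Fahrenheit readings to Celsius with the standard conversion formula and compares gaps and ratios on both scales. The variable names are illustrative only.

```python
def fahrenheit_to_celsius(degrees_f):
    """Convert a Fahrenheit reading to Celsius."""
    return (degrees_f - 32) * 5 / 9


# Equal intervals: 30 to 40 and 70 to 80 are the same distance on either scale.
gap_low_f = 40 - 30
gap_high_f = 80 - 70
gap_low_c = fahrenheit_to_celsius(40) - fahrenheit_to_celsius(30)
gap_high_c = fahrenheit_to_celsius(80) - fahrenheit_to_celsius(70)

# Ratios: "twice as hot" depends on where the arbitrary zero sits.
ratio_f = 80 / 40
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)

print("gaps in Fahrenheit:", gap_low_f, gap_high_f)
print("gaps in Celsius:", round(gap_low_c, 2), round(gap_high_c, 2))
print("ratio in Fahrenheit:", ratio_f)
print("ratio in Celsius:", round(ratio_c, 2))
```

The two gaps stay equal to each other after conversion, but the ratio of the same two readings changes from 2 to about 6. This is why ratio statements cannot be defended on an interval scale.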
Ratio Scale
• A ratio level of measurement includes data which can be
categorized and ranked.
• It has an absolute zero that is meaningful.
• We can construct a meaningful ratio (fraction), for example,
number of clients in past six months.
• It is meaningful to say that “...we had twice as many clients
in this period as we did in the previous six months.”
• Physical scales of time, weight, length and volume are
ratio scales
• We can say that 20 seconds is twice as long as 10 seconds
• Ratio scales range from zero upwards and cannot have
negative scores. For example, if a basketball team
scores 40 points, it is worth twice as much as their
opponents who have scored 20 points.
• However, if a researcher designs a qualitative study, they are not
concerned with measurement scales and collect data in
narrative form instead.
• The type of data needed also governs the how, who,
where and when of the data-collection process. The
answers to these questions are interrelated.
The Hierarchy of Levels
• Ratio: Absolute zero
• Interval: Distance is meaningful
• Ordinal: Attributes can be ordered
• Nominal: Attributes are only named; weakest
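The hierarchy is cumulative: each level keeps the properties of the levels below it and adds one more. A minimal sketch of that idea follows. The pairing of each level with a summary statistic (mode, median, mean) is a common textbook convention assumed here, not something stated in these slides, and all names are hypothetical.

```python
import statistics

# Levels ordered from weakest to strongest, as in the hierarchy above.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# The property each level adds on top of the levels below it.
ADDED_PROPERTY = {
    "nominal": "attributes are only named",
    "ordinal": "attributes can be ordered",
    "interval": "distance is meaningful",
    "ratio": "absolute zero",
}


def properties_of(level):
    """Return every property a level has, including inherited ones."""
    position = LEVELS.index(level)
    return [ADDED_PROPERTY[name] for name in LEVELS[: position + 1]]


def typical_summary(values, level):
    """Pick a summary statistic by level (assumed convention)."""
    if level == "nominal":
        return statistics.mode(values)        # most frequent label
    if level == "ordinal":
        return statistics.median_low(values)  # middle rank, always an observed code
    if level in ("interval", "ratio"):
        return statistics.mean(values)        # distances are meaningful
    raise ValueError(f"unknown level: {level}")


# Hypothetical educational attainment codes (0 = less than H.S. ... 5 = post college).
education_codes = [0, 2, 2, 3, 5]
print(properties_of("interval"))
print(typical_summary(education_codes, "ordinal"))
```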
Discrete and Continuous Variables
• Discrete variables have a relatively small set of
possible values.
– Examples: gender, marital status, religious affiliation
• Continuous variables can (theoretically) assume
any value between the lowest and highest points on
the scale.
– Examples: time, distance, weight


Quantitative data collection techniques
• Several data collection techniques can be used for
quantitative data collection.
• You have covered how non-participant observation
can be used in both qualitative and quantitative
research – don’t forget about this technique when
considering quantitative research. Other techniques
used in quantitative research include questionnaires.
• The settings in which data will be collected are
either field-based data collection settings or
laboratory-based data collection settings.
Types of Quantitative data collection techniques
– Questionnaire
– Laboratory-based data collection
– Field-based data collection
– Non-participant observation
Questionnaire
• In the questionnaire process, the respondent, who is the
unit of analysis, writes down their answers in
response to questions in a printed document.
• A well-designed questionnaire is easy for the
respondent to complete if they are literate, and is also
easy for the researcher to administer and score.
• Questionnaires are, however, difficult to develop. Each
aspect – from the questions themselves to the color of
the paper – can influence respondents’ replies.
• The researcher must therefore pay careful attention to
the development and construction of the
questionnaire.
A well-designed questionnaire should:
• meet the objectives of the inquiry
• demonstrate a fit between its contents and the
research problem and objectives
• obtain the most complete, accurate information
possible, and do so within reasonable limits of time
and resources.
Strength and weakness of questionnaire
(Table with columns Strength and Weakness; contents not recovered from the source.)
• The researcher has to choose between using unstructured, open-ended
questions or structured, closed-ended ones.
• The former allow the respondent to answer in any way they see fit, while
the latter require the respondent to choose from a set of options.
• Examples of open-ended questions include:
– What do you think are the major problems facing sport sciences
education today?
– Are there circumstances which make the use of marijuana outside the
home acceptable?
• Closed-ended questions can be ‘yes’ or ‘no’, multiple-choice, checklist-type,
‘true’ or ‘false’, and matching questions.
• Examples of closed-ended questions are:
– Are you well or ill?
– Please indicate your annual income level for the previous year with a tick
against the appropriate number.
• Open-ended questions are not based on preconceived
answers and are appropriate only for explanatory studies,
case studies or studies based on qualitative analyses of data.
• They generally provide richer, more diverse data than can
be obtained with the use of closed-ended questions.
• Closed-ended questions limit the answers to options
provided by the researcher.
• This has several advantages for the researcher.
• It facilitates the coding and analysis of data.
• Respondents are able to complete more closed-ended
questions in a given amount of time, and are often more
willing to complete closed-ended questions.
Guidelines when formulating questions:
• They should be simple and short. Complex questions should
be broken up into several simpler ones.
• Questions should not be ‘double barreled’, that is, contain
two questions. For example: ‘Do you plan to pursue a master’s
degree in sport management and seek an administrative
position upon graduation?’ This question should be divided
into two separate questions.
• Questions should be unambiguous. Words which are too
general or vague, or that could be misinterpreted, should be
replaced with more specific terms. For instance, words like
‘often’, ‘many’ and ‘enough’ should be replaced by ‘three
times a week’, ‘10’, ‘two meals a day’, and so on.
• Questions should be understandable. Vocabulary
adapted to the participants’ level of education should
be used. Jargon and sophisticated language should
be avoided.
• Leading questions – questions that favour one type
of answer over another – should be avoided. For
example, ‘Don’t you agree that ...?’ and ‘... is it not
so?’
• Questions should be stated in an affirmative
manner.
• There are many methods of distributing
questionnaires. They can be:
– emailed,
– hand delivered,
– given in groups, or
– administered one-on-one.
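Some of the question-formulation guidelines above can be turned into a rough first-pass check of draft questions. The sketch below is a crude keyword heuristic, not a validated tool. The word lists contain only the examples named in the guidelines, and the function name `review_question` is hypothetical.

```python
import re

# Only the examples named in the guidelines; a real list would be longer.
VAGUE_WORDS = {"often", "many", "enough"}
LEADING_PHRASES = ("don't you agree", "is it not so")


def review_question(question):
    """Return a list of possible wording problems in a draft question."""
    text = question.lower().replace("\u2019", "'")  # normalise curly apostrophes
    problems = []

    words = set(re.findall(r"[a-z']+", text))
    vague = sorted(VAGUE_WORDS & words)
    if vague:
        problems.append("vague wording: " + ", ".join(vague))

    if any(phrase in text for phrase in LEADING_PHRASES):
        problems.append("leading question")

    # Heuristic only: 'and' often, but not always, joins two questions.
    if " and " in text:
        problems.append("possibly double-barreled (contains 'and')")

    return problems


draft = ("Do you plan to pursue a master's degree in sport management "
         "and seek an administrative position upon graduation?")
print(review_question(draft))
```

A check like this will raise false alarms (for example, "How many times a week do you train?" contains "many" but is specific), so every flagged question still needs the researcher's judgment.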
Laboratory-based data collection
• Involves collecting data in an environment where all the conditions and variables
are controlled, so that you are only measuring the variables in question.
• One advantage of laboratory-based data collection is that it has high levels of
internal validity.
• You are controlling all your variables so you know that you are only measuring the
aspect you mean to measure.
• One disadvantage of laboratory-based data collection is that it has low levels of
ecological validity because the data is not collected in an environment that
reflects the situation in which the activity is performed.
• Another disadvantage of laboratory-based data collection is that it normally
requires the use of expensive or technical equipment to collect data, making it
difficult to use this if you don’t have a lot of resources.
Field-based data collection
• Field-based data is collected in the environment that
simulates the one in which the sport is played.
• One of the key strengths of field-based data collection is
that it mimics the performance environment so you can
claim ecological validity when you are collecting
data in this setting.
• Field-based data collection can be cheaper than
laboratory-based collection, making it more
accessible to people without lots of resources.
• However, one limitation is that you don’t control all
the variables in this data collection setting, so it can
be difficult to claim internal validity.
Non-participant observation
• Non-participant observation involves the
researcher observing ‘from the outside’.
• There is no interaction with the individuals or the
activity being observed.
• For example, if you wanted to look at injuries during
a football match, you could watch how many
injuries happened and what types of injuries they
were, and record the numbers on a data recording
sheet.
Advantages and Disadvantages of Observation
Methods of data collection in qualitative research
• The difference between quantitative and qualitative research mainly lies in the
manner in which a method is applied in an actual data collection situation.
• Use of these methods in quantitative research demands standardization of
questions to be asked of the respondents, a rigid adherence to their structure
and order, an adoption of a process that is tested and predetermined, and
making sure of the validity and reliability of the process as well as the
questions.
• However, the methods of data collection in qualitative research follow a
convention which is almost opposite to quantitative research. The
wording, order and format of these questions are neither predetermined nor
standardized.
• Qualitative methods are characterized by flexibility and freedom in terms of
structure and order given to the researcher.
• There are three main methods of data collection in
qualitative research:
1. unstructured interviews;
2. participant observation;
3. secondary sources.
1. Unstructured interviews
• Flexibility, freedom and spontaneity in contents and
structure underpin an interaction in all types of unstructured
interview.
• This interaction can be at a one-to-one (researcher and a
respondent) or a group (researcher and a group of
respondents) level.
• There are several types of unstructured interview that are
prevalent in qualitative research, for example
– in-depth interviewing,
– focus group interviewing (FGD),
– narratives and
– oral histories.
In-depth interviews
• The theoretical roots of in-depth interviewing lie in what is
known as the interpretive tradition.
• In-depth interviewing is ‘repeated face-to-face
encounters between the researcher and informants
directed towards understanding informants’
perspectives on their lives, experiences, or situations
as expressed in their own words’.
• There are two essential characteristics of in-depth interviewing:
– (1) it involves face-to-face, repeated interaction
between the researcher and his/her informant(s); and
– (2) it seeks to understand the latter’s perspectives.
Focus group interviews/discussion
• The only difference between a focus group interview and an in-
depth interview is that the former is undertaken with a group
and the latter with an individual.
• In a focus group interview, you explore the perceptions,
experiences and understandings of a group of people who
have some experience in common with regard to a situation or
event.
• For example, you may explore with relevant groups such issues
as domestic violence, physical disability or refugees.
• In focus group interviews, broad discussion topics are developed
beforehand, either by the researcher or by the group.
Narratives
• The narrative technique of gathering information has even less structure than the focus
group.
• Narratives have almost no predetermined contents except that the researcher seeks to
hear a person’s retelling of an incident or happening in his/her life. Essentially, the person
tells his/her story about an incident or situation and you, as the researcher, listen
passively.
• Occasionally, you encourage the individual by using active listening techniques; that is, you
say words such as ‘uh huh’, ‘mmmm’, ‘yeah’, ‘right’ and nod as appropriate.
• Basically, you let the person talk freely and without interrupting.
• Narratives are a very powerful method of data collection for situations which are sensitive in
nature.
• For example, you may want to find out about the impact of child sexual abuse in sport on
people who have gone through such an experience.
• You, as a researcher, ask these people to narrate their experiences and how they have been
affected.
Oral histories
• Oral histories, like narratives, involve the use of both
passive and active listening.
• Oral histories, however, are more commonly used for
learning about a historical event or episode that took
place in the past or for gaining information about a
culture, custom or story that has been passed from
generation to generation.
• Narratives are more about a person’s personal
experiences whereas historical, social or cultural
events are the subjects of oral histories.
Strength and Weakness of Interview
(Table with columns Strength and Weakness; contents not recovered from the source.)
Participant observation
• Participant observation means that the researcher is
actively involved in the topic they are researching.
• For example, if you were studying team cohesion in
football, you could join a football team, to observe
‘from the inside’ and gain your own experiences of
cohesion as a player.
• Data would then be recorded in the form of field notes,
with you recording your own thoughts, feelings,
opinions, emotions and experiences.
• This method is useful when trying to discover the more
delicate aspects of group behavior that are not easy to
see from the outside.
Collecting data using secondary sources
• So far we have discussed the primary sources of data collection where
the required data was collected either by you or by someone else for
the specific purpose you have in mind.
• There are occasions when your data have already been collected by
someone else and you need only to extract the required
information for the purpose of your study.
• Both qualitative and quantitative research studies use secondary
sources as a method of data collection.
• In qualitative research you usually extract descriptive (historical and
current) and narrative information and in quantitative research the
information extracted is categorical or numerical.
• The following section provides some of the many secondary sources grouped into
categories:
• Government or semi-government publications – There are many government and
semi-government organizations that collect data on a regular basis in a variety of areas
and publish it for use by members of the public and interest groups.
• Some common examples are the census, vital statistics registration, labor force
surveys, sport participants, health reports, economic forecasts and demographic
information.
• Earlier research – For some topics, an enormous number of research studies that have
already been done by others can provide you with the required information.
• Personal records – Some people write historical and personal records (e.g. diaries) that
may provide the information you need.
• Mass media – Reports published in newspapers, in magazines, on the Internet, and so
on, may be another good source of data.
Problems with using data from secondary sources
• When using data from secondary sources you need to be careful as there
may be certain problems with the availability, format and quality of data.
• The extent of these problems varies from source to source.
• While using such data some issues you should keep in mind are:
• Validity and reliability – The validity of information may vary markedly
from source to source. For example, information obtained from a census is
likely to be more valid and reliable than that obtained from most personal
diaries.
• Personal bias – The use of information from personal diaries, newspapers
and magazines may have the problem of personal bias as these writers are
likely to exhibit less rigorousness and objectivity than one would expect in
research reports.
• Availability of data – It is common for beginning researchers to
assume that the required data will be available, but you cannot
and should not make this assumption.
• Therefore, it is important to make sure that the required data is
available before you proceed further with your study.
• Format – Before deciding to use data from secondary sources it is
equally important to ascertain that the data is available in the required
format.
• For example, you might need to analyze age in the categories 23–33,
34–48, and so on, but, in your source, age may be categorized as
21–24, 25–29, and so on.
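The format problem in the age example can be tested before committing to a secondary source: the source categories can be regrouped into the needed ones only if each source category falls entirely inside one needed category. The following is a minimal sketch that uses only the categories listed above (the "and so on" categories are left out), with a hypothetical function name.

```python
def can_regroup(source_bins, needed_bins):
    """True if every source bin fits entirely inside one needed bin."""
    return all(
        any(lo <= s_lo and s_hi <= hi for lo, hi in needed_bins)
        for s_lo, s_hi in source_bins
    )


source_bins = [(21, 24), (25, 29)]   # how the secondary source reports age
needed_bins = [(23, 33), (34, 48)]   # how the study needs to analyze age

print(can_regroup(source_bins, needed_bins))
```

Here the check fails because the 21–24 category straddles the lower edge of 23–33: people aged 21 or 22 cannot be separated from those aged 23 or 24 in the published data.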
Data quality and its determinants
• All researchers want to produce quality research.
• They want results to be meaningful, to reflect reality
as accurately as possible, and to be replicable.
• Unfortunately, all measurement is accompanied by the
possibility of error.
• No data-collection technique is perfect.
• It is therefore essential that researchers control for
error, and reduce error as much as possible.
• Thus we will discuss factors that can affect reliability
and validity in data collection.
Validity and Reliability of a Research Instrument
• Reliability refers to the extent that the instrument
yields the same results over multiple trials.
• Validity refers to the extent that the instrument
measures what it was designed to measure.
• In research, there are three ways to approach validity:
– content validity,
– construct validity, and
– criterion-related validity.
• Content validity measures the extent to which the items that
comprise the scale accurately represent or measure the
information that is being assessed. Are the questions that are
asked representative of the possible questions that could be asked?
• Construct validity concerns what the calculated scores mean
and whether they can be generalized.
• Construct validity uses statistical analyses, such as correlations, to
verify the relevance of the questions.
• Questions from an existing, similar instrument, that has been found
reliable, can be correlated with questions from the instrument under
examination to determine if construct validity is present.
• If the scores are highly correlated it is called convergent
validity. If convergent validity exists, construct validity is
supported.
• Criterion-related validity has to do with how well the
scores from the instrument predict a known outcome
they are expected to predict.
• Statistical analyses, such as correlations, are used to
determine if criterion-related validity exists. Scores
from the instrument in question should be correlated
with an outcome they are known to predict. If a correlation
of > .60 exists, criterion-related validity is supported.
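The correlation check described above can be sketched with a Pearson correlation coefficient. The choice of Pearson's r is an assumption (the slides say only "correlations"), the scores are hypothetical, and the .60 cut-off is the one quoted in this chapter.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical data: instrument scores and the outcome they should predict.
instrument_scores = [12, 15, 9, 20, 17, 11]
known_outcome = [13, 14, 10, 19, 18, 10]

r = pearson_r(instrument_scores, known_outcome)
print(round(r, 2), "supported" if r > 0.60 else "not supported")
```

The same function applies wherever two sets of scores from the same sample are correlated, such as two administrations of one instrument.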
• Reliability can be assessed with the test-retest method,
alternative form method, internal consistency method, the
split-halves method, and inter-rater reliability.
• Test-retest is a method that administers the same instrument
to the same sample at two different points in time, perhaps
one year apart.
• If the scores at both time periods are highly correlated (> .60),
they can be considered reliable.
• The alternative form method requires two different
instruments consisting of similar content.
• The same sample must take both instruments and the
scores from both instruments must be correlated.
• If the correlations are high, the instrument is considered
reliable.
• Internal consistency uses one instrument administered
only once.
• The coefficient alpha (or Cronbach’s alpha) is used to
assess the internal consistency of the items. If the alpha
value is .70 or higher, the instrument is considered
reliable.
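Cronbach's alpha can be computed from a single administration using the standard formula alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The sketch below uses hypothetical responses (five respondents, four items), and the function name is illustrative.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all of equal length."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("every respondent needs the same 2 or more items")

    item_columns = list(zip(*rows))                      # scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical answers on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 3],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 3, 4],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2), "reliable" if alpha >= 0.70 else "not reliable")
```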
End of Chapter
