
Data Collection Methods and Importance

Data collection is the process of gathering and measuring information to answer research questions and test hypotheses. It involves determining what type of data is needed, the source, and collection method. There are two primary methods: quantitative, which uses structured tools to fit experiences into categories for easy analysis, and qualitative, which provides contextual understanding through descriptive data. Ensuring accurate, appropriate collection is essential for maintaining research integrity regardless of field or data type. Improper collection can lead to distorted findings and wasted resources.

Uploaded by

ROSE ANN SAGUROT
Copyright
© All Rights Reserved

CHAPTER 1 - OBTAINING DATA

LESSON 1 - DATA COLLECTION

Lesson Objectives:

At the end of the lesson, students should:

1. Define data collection;

2. Explain the critical objective of data collection and importance of quality data in research work;

3. Discuss the importance of ensuring accurate and appropriate data collection;

4. Define and differentiate quality control and quality assurance;

5. Enumerate the results of improper data collection; and

6. Define and differentiate the two methods of collecting data and types of data.

Introduction

In research, statisticians use data in many ways. Data can be used to describe situations or events. For example, a manufacturer might want to know something about the consumers who will be purchasing his product so he can plan an effective marketing strategy. In another situation, the management of a company might survey its employees to assess their needs to negotiate a new contract with the employees’ union. Data can be used to determine whether the educational goals of a school district are being met. Finally, trends in various areas, such as the stock market, can be analyzed, enabling prospective buyers to make more intelligent decisions concerning what stocks to purchase. These examples illustrate a few situations where collecting data will help people make better decisions on courses of action.

Lesson Proper

Data Collection

- The process of gathering and measuring information on variables of interest, in an established

systematic fashion that enables one to answer stated research questions, test hypotheses, and

evaluate outcomes.

- The process of gathering and measuring data, information or any variables of interest in a standardized and established manner that enables the collector to answer questions or test hypotheses and evaluate the outcomes of the particular collection.

- This is an integral, usually initial, component of any research done in any field of study such as the

physical and social sciences, business, humanities and others.

- A process by which the researcher collects the information from all the relevant sources to find

answers to the research problem, test the hypothesis and evaluate the outcomes.

- Defined as the procedure of collecting, measuring and analyzing accurate insights for research using

standard validated techniques. A researcher can evaluate their hypothesis on the basis of collected

data.

- In most cases, data collection is the primary and most important step for research, irrespective of

the field of research. The approach of data collection is different for different fields of study,

depending on the required information.

- Enables a person or organization to answer relevant questions, evaluate outcomes and make

predictions about future probabilities and trends.

- The systematic approach to gathering and measuring information from a variety of sources to get a

complete and accurate picture of an area of interest.

Most Critical Objective of Data Collection

- Ensuring that information-rich and reliable data is collected for statistical analysis so that data-driven

decisions can be made for research.

Primary Goal of any Data Collection Endeavor

- To capture quality data or evidence that easily translates to rich data analysis that may lead to

credible and conclusive answers to questions that have been posed.

The Importance of Ensuring Accurate and Appropriate Data Collection

Regardless of the field of study or preference for defining data (quantitative, qualitative), accurate data

collection is essential to maintaining the integrity of research. Both the selection of appropriate data

collection instruments (existing, modified, or newly developed) and clearly delineated instructions for

their correct use reduce the likelihood of errors occurring.

Consequences from improperly collected data include:


• inability to answer research questions accurately

• inability to repeat and validate the study

• distorted findings resulting in wasted resources

• misleading other researchers to pursue fruitless avenues of investigation

• compromising decisions for public policy

• causing harm to human participants and animal subjects

Issues Related to Maintaining Integrity of Data Collection:

The primary rationale for preserving data integrity is to support the detection of errors in the data

collection process, whether they are made intentionally (deliberate falsifications) or not (systematic or

random errors).

Most, Craddick, Crawford, Redican, Rhodes, Rukenbrod, and Laws (2003) describe ‘quality assurance’ and ‘quality control’ as two approaches that can preserve data integrity and ensure the scientific validity of study results. Each approach is implemented at different points in the research timeline (Whitney, Lind, Wahl, 1998):

1. Quality assurance - activities that take place before data collection begins

2. Quality control - activities that take place during and after data collection

Quality Assurance

• Since quality assurance precedes data collection, its main focus is 'prevention' (i.e., forestalling

problems with data collection).

• Prevention is the most cost-effective activity to ensure the integrity of data collection.

• This proactive measure is best demonstrated by the standardization of protocol developed in a

comprehensive and detailed procedures manual for data collection.

*validity - the degree to which an instrument actually measures what it purports to measure

*standardization of protocol - ensuring that all elements of a protocol are implemented in exactly the

same manner

*content analysis - a technique used in qualitative analysis to study written material by breaking it into meaningful units, using carefully applied rules.


Quality Control

While quality control activities (detection/monitoring and action) occur during and after data collection, the details should be carefully documented in the procedures manual. Quality control also identifies the required responses, or ‘actions’, necessary to correct faulty data collection practices and to minimize future occurrences. These actions are less likely to occur if data collection procedures are vaguely written and the necessary steps to minimize recurrence are not implemented through feedback and education (Knatterud et al., 1998).

Examples of data collection problems that require prompt action include:

• errors in individual data items

• systematic errors

• violation of protocol

• problems with individual staff or site performance

• fraud or scientific misconduct

*documented – furnished with or supported by written/recorded citations

*scientific misconduct – fabrication, falsification or plagiarism in proposing, performing, or reviewing

research results (Steneck, Zinn, 2003)
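
The detection side of quality control described above can be illustrated with a small screening routine. This is only a minimal sketch: the field names (`age`, `response`, `collector`), the allowed ranges and the sample records are all hypothetical, and a real procedures manual would define its own rules.

```python
from collections import Counter

# Hypothetical validation rules for one survey form (assumed, not from the lesson).
RULES = {
    "age": lambda v: isinstance(v, int) and 0 <= v <= 120,
    "response": lambda v: v in {"agree", "disagree", "don't know"},
}

def check_record(record):
    """Return the fields that are missing or fail their rule (errors in individual data items)."""
    problems = []
    for field, is_valid in RULES.items():
        if field not in record or not is_valid(record[field]):
            problems.append(field)
    return problems

def error_counts_by_collector(records):
    """Count flagged records per collector; a high count for one collector may hint at a systematic error."""
    counts = Counter()
    for record in records:
        if check_record(record):
            counts[record.get("collector", "unknown")] += 1
    return counts

# Made-up records for illustration only.
records = [
    {"collector": "A", "age": 34, "response": "agree"},
    {"collector": "B", "age": 340, "response": "agree"},   # out-of-range age
    {"collector": "B", "age": 51, "response": "maybe"},    # invalid category
    {"collector": "A", "response": "disagree"},            # missing age
]
flagged = [check_record(r) for r in records]
by_collector = error_counts_by_collector(records)
```

A check like this only flags candidate problems; deciding on the corrective action is still a matter for the documented procedures.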

Accurate Data Collection

- Essential to maintaining the integrity of research, making informed business decisions and ensuring

quality assurance.

For example:

✓ In retail sales, data might be collected from mobile applications, website visits, loyalty programs

and online surveys to learn more about customers.

✓ In a server consolidation project, data collection would include not just a physical inventory of

all servers, but also an exact description of what is installed on each server -- the operating

system, middleware and the application or database that the server supports.

- Essential to ensure the integrity of the research, regardless of the field of study or data preference (quantitative or qualitative). The selection of appropriate data collection tools and instruments, which may be existing, modified or totally new, together with clearly defined instructions for their proper use, reduces the chances of errors occurring during collection.

Distorted findings are often the result of improper data collection such as misleading questions on

questionnaires, unknowingly omitting the collection of some supporting data, and other unintentional

errors. This would lead to a skewed conclusion that may be useless.

While collecting the data, the researcher must identify the type of data to be collected, the source of the data, and the method to be used to collect it. The researcher should also settle the who, when and where of the collection.

There are two methods of collecting data:

1. Quantitative Data Collection

2. Qualitative Data Collection

Quantitative Data Collection Methods

- These methods rely on random sampling and structured data collection instruments that fit diverse experiences into predetermined response categories. They produce results that are easy to summarize, compare, and generalize.

Qualitative Data Collection Methods

- These methods play an important role in impact evaluation by providing information useful for understanding the processes behind observed results and for assessing changes in people’s perceptions of their well-being.

Regardless of the kinds of data involved, data collection in a qualitative study takes a great deal of

time. The researcher needs to record any potentially useful data thoroughly, accurately, and

systematically, using field notes, sketches, audiotapes, photographs and other suitable means. The

data collection methods must observe the ethical principles of research.

Generally, there are two types of data: quantitative data and qualitative data.

Quantitative data is any data that is in numerical form -- e.g., statistics and percentages.

Qualitative data is descriptive data -- e.g., color, smell, appearance and quality.
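
The distinction matters in practice because the two types of data are summarized differently. The following is a minimal sketch with made-up values: numerical data can be averaged, while descriptive data is usually summarized by counting categories.

```python
from collections import Counter
from statistics import mean

# Made-up values for illustration only.
heights_cm = [150.5, 162.0, 171.0, 158.5]    # quantitative: numerical measurements
colors = ["red", "blue", "red", "green"]     # qualitative: descriptive categories

average_height = mean(heights_cm)            # an average is meaningful for numbers
color_counts = Counter(colors)               # categories are counted, not averaged
```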

In addition to quantitative and qualitative data, some organizations might also make use of secondary data to help drive business decisions. Secondary data is typically quantitative in nature and has already been collected by another party for a different purpose. For example, a company might use U.S. Census data to make decisions about marketing campaigns. In media, a news team might use government health statistics or health studies to drive content strategy.

As technology evolves, so does data collection. Recent advancements in mobile technology and the

Internet of Things are forcing organizations to think about how to collect, analyze and monetize new

data. At the same time, privacy and security issues surrounding data collection are heating up.

LESSON 2 – METHODS OF DATA COLLECTION

Lesson Objectives:

At the end of the lesson, students should:

1. Define and differentiate the different methods of data collection.

2. Explain the advantages and disadvantages of different methods of data collection.

3. Explain the importance/purpose of different methods of data collection in research work.

4. Identify the type of the study described in the situation.

5. Discuss the type of inference that can and cannot be drawn from the study.

6. Identify the experimental units and the treatments from the study or situation.

Introduction

“So far, we have learned how to explore, summarize, display, and describe patterns in data. But how did we get that data in the first place? This chapter provides an introduction to five ways of obtaining data; questionnaire, interview, experiment, sample survey, observational study/direct observation. You will learn about how to obtain data using these methods, about different types of biases that can get introduced due to inaccurately applying these methods, and about different types of conclusions that can be drawn from data obtained using these different methods.” (Richard L. Scheaffer, [Link], 2012, p.130).

“The key to decision making is objective data; the key to good decision making is good objective data. It is not enough just to have data; the data must be valid and reliable in that they actually measure what they are supposed to measure and do so to a reasonable degree of accuracy.” (Richard L. Scheaffer, [Link], 2012, p.131).

Lesson Proper

There are many methods of gathering information, and a wide variety of information sources. The following are a few methods of collecting information for research projects.

1. Questionnaire

2. Interview

3. Experimental Study

4. Survey

5. Observational study

Surveys, interviews and focus groups are primary instruments for collecting information. Today, with

help from Web and analytics tools, organizations are also able to collect data from mobile devices,

website traffic, server activity and other relevant sources, depending on the project.

The choice of data collection methods depends on the research problem under study, the research design and the information gathered about the variable. Broadly, the data collection methods can be classified into two categories:

• Primary Data Collection Methods: Primary data are first-hand data, collected by the researcher for the first time and original in nature. The researcher collects fresh data when the research problem is unique and no related research work has been done by anyone else. The results of the research are more accurate when the data is collected directly by the researcher; however, doing so is costly and time-consuming.

• Secondary Data Collection Methods: Data that was collected by someone else for his or her own research work, and that has already passed through statistical analysis, is called secondary data. Thus, secondary data is second-hand data that is readily available from other sources. One advantage of using secondary data is that it is less expensive and at the same time easily available; however, the authenticity of the findings can be questioned.

Thus, the researcher can obtain data from either of the sources depending on the nature of his study and the pursued research objective.

2.1 Questionnaire

Questionnaires are a popular means of collecting data, but designing one is difficult because it often requires many rewrites before finalization. The most important issue related to data collection is choosing the most appropriate information or evidence to answer the author’s questions. To plan data collection, the author had to think about the questions to be answered and the information sources available. The author also had to think about how these data could be organized, interpreted and then reported to various audiences before finalizing the questionnaires.

Questionnaires have several advantages. Some of them are:

▪ Can be used as a method in its own right or as a basis for interviewing or a telephone survey

▪ Can be posted, e-mailed or faxed

▪ Can cover a large number of people and organizations

▪ Wide geographical coverage

▪ Relatively cheap

▪ No prior arrangements are needed

▪ Avoid embarrassment on the part of the respondent

▪ No interviewer bias

▪ Possible anonymity of respondent

Questionnaires also have disadvantages. They are:

▪ Design problems

▪ Questions have to be relatively simple

▪ Time delay whilst waiting for responses to be returned

▪ Assumes no literacy problems

▪ No control over who completes it

▪ Problems with incomplete questionnaires.

The targeted group of people had to be selected carefully to avoid such disadvantages.

2.2 Interview

• An interview is generally a qualitative research technique which involves asking open-ended questions to converse with respondents and elicit data about a subject.

• Interviews are similar to focus groups and surveys when it comes to garnering information from the target market but are entirely different in their operation: focus groups are restricted to a small group of 6-10 individuals, whereas surveys are quantitative in nature.

• Interviews are conducted with a sample from a population and the key characteristic they exhibit

is their conversational tone.

• The interviewer in most cases is the subject matter expert who intends to understand respondent

opinions in a well-planned and executed series of questions and answers.

• Interviewing is a great way to learn detailed information from a single individual or small number

of individuals. This is a main data collection method used in the research. It is very useful when

someone wants to gain expert opinions on the subject or talk to someone knowledgeable about

a topic.

Methods of Research Interviews:

There are several methods of conducting research interviews, each of which is distinct in its application and can be used according to the requirements of the research study. The author has to select one interviewing method considering the type of technology that is available, the availability of the individual being interviewed, and how comfortable the author feels talking to people.

These are the methods of interviews which are very popular among the researchers:

1. Face to face Interviews (in-person interviews)

2. Phone Interviews

3. Email Interviews

4. Chat/Messaging Interviews (Online)

Face to face Interviews/Personal Interviews: (in-person interviews)

✓ Personal interviews are one of the most used types of interviews, where the questions are asked in person, directly to the respondent. For this, a researcher can use an online survey as a guide for taking note of the answers. A researcher can design his/her survey in such a way that they can take notes on the comments or points of view of the interviewee that stand out.

✓ When the author sits down and talks with someone, it is a face-to-face interview. It is very important that the author can adapt questions to the answers of the person being interviewed, and a recording device should be brought to the interview.

Advantages:

• Higher response rate.

• When the interviewer and respondent are face-to-face, there is a way to adapt the questions if they are not understood.

• More complete answers can be obtained if there is doubt on either side or if a particularly remarkable piece of information is detected.

• The researcher has an opportunity to detect and analyze the interviewee’s body language at the time of asking the questions and to take notes about it.

• In-depth data and a high degree of confidence in the data

Disadvantages:

✓ They are time-consuming and extremely expensive.

✓ They can generate distrust on the part of the interviewee, since they may be self-conscious and

not answer truthfully.

✓ Contacting the interviewees can be a real headache, either scheduling an appointment in

workplaces or going from house to house and not finding anyone.

✓ Therefore, many interviews are conducted in public places, such as shopping centers or parks. There are even consumer studies that take advantage of these sites to conduct interviews or surveys and give incentives, gifts or coupons. In short, there are great opportunities for online research in shopping centers.

✓ Among the advantages of conducting these types of interviews is that the respondents will have

more fresh information if the interview is conducted in the context and with the appropriate

stimuli, so that researchers can have data from their experience at the scene of the events,

immediately and first hand. The interviewer can use an online survey through a mobile device

that will undoubtedly facilitate the entire process.

Telephonic Interviews/ Phone Interviews:

✓ Telephonic interviews are widely used and easy to combine with online surveys to carry out research effectively.

✓ If the author needs to interview someone who is geographically far away, too busy to meet in person, or without internet connectivity, the phone interview method is very convenient.

Advantages:

• To find the interviewees it is enough to have their telephone numbers on hand.

• They are usually lower cost.

• The information is collected quickly.

• Having a personal contact can also clarify doubts, or give more details of the questions.

• High degree of confidence in the data collected; can reach almost anyone

Disadvantages:

• Researchers often find that people do not answer phone calls because the number is unknown to them, or that respondents have changed their place of residence and cannot be located, which causes a bias in the interview.

• Researchers also find that some people simply do not want to answer and resort to pretexts: they are too busy to answer, they are sick, they do not have the authority to answer the questions asked, they have no interest in answering, or they are afraid of putting their security at risk.

• One of the aspects that should be taken care of in these types of interviews is the kindness with

which the interviewers address the respondents, in order to get them to cooperate more easily

with their answers. Good communication is vital for the generation of better answers.

• Expensive, cannot self-administer, need to hire an agency

Email or Web Page Interviews:

✓ Online research is growing more and more because consumers are migrating to a more virtual

world and it is best for each researcher to adapt to this change.

✓ The increase in people with Internet access has made interviews via email or web page some of the most used types of interviews today. For this, nothing is better than an online survey.

✓ More and more consumers are turning to online shopping, which makes them a great niche for carrying out interviews that will generate information for correct decision making.

✓ The author used this method to get some clarification of the information received from the

questionnaire.

✓ This method is highly convenient for most individuals who are used to emailing frequently.

✓ It is also less personal than face-to-face or phone interviews. An email interview may not yield as much information from an individual, because the author is not able to ask follow-up questions or play off the interviewee's responses. However, email interviews are useful because they are already in a digital format.

Advantages of email interviews:

• Speed in obtaining data

• The respondents respond according to their time, at the time they want and in the place they

decide.

• Online surveys can be mixed with other research methods or using some of the previous

interview models. They are tools that can perfectly complement and pay for the project.

• A researcher can use a variety of questions, logics, create graphs and reports immediately.

• Can reach anyone and everyone – no barrier

Disadvantages of email interviews:

• Expensive, data collection errors, lag time

Chat/Messaging Interviews (Online)

✓ Using instant messaging services like MSN Messenger, Google Talk, Skype, or SMS messages on mobile phones, the author is able to collect necessary information relating to the research project.

✓ These interview methods make it possible to get information from people who live or work far away and have internet connectivity, for whom chat/messaging methods are also convenient.

Advantages of Chat/Messaging Interviews:

• Cheap

• Can self-administer
• Very low probability of data errors

Disadvantages of Chat/Messaging Interviews:

• Not all your customers might have an email address/be on the internet

• Customers may be wary of divulging information online

When setting up an interview, the author made sure to be courteous and professional. Before starting the interview, the author explained the reason for the interview, what the author wanted to talk to them about, and what research project the author was going to do. After getting permission from the officers who were engaged in the interviews, the author was able to use video recorders to record the conversations held.

When conducting interviews the author adhered to the following rules.

• Carefully selected the questions asked.

• Started the interview with some small talk

• Brought an extra recording device (another video recorder)

• Paid close attention while the interviews were going on

• Came to the interview prepared

• Did not pester or push the officer being interviewed; if he/she did not talk about an issue, the author respected that and did not push

• At the interview time author was rigid with his questions

• Did not allow the officer to get off the topic and asked follow up questions to redirect the

conversation to the subject

Conclusion

✓ Undoubtedly, the objective of the research will set the pattern of what types of interviews are best for data collection. Based on the research design, a researcher can plan and test the questions, for instance, to check whether the questions are the correct ones and whether the survey flows in the best way.

✓ In addition, there are other types of research that can be used under specific circumstances, for example when there is no connection or when adverse situations make it hard to carry out surveys. On these occasions it is necessary to conduct field research, which cannot be considered an interview but rather a completely different methodology.


✓ To summarize the discussion, an effective interview will be one that provides researchers with

the necessary data to know the object of study and that this information is applicable to the

decisions researchers make.

2.3 Survey

Sample Surveys and Inference about Populations

Some studies are designed for the purpose of estimating population characteristics, such as means or

proportions. Before planning such a data collection activity, we need to identify the population

involved.

Population - the entire group of individuals or items in a study.

Sample - a part of a population that is actually studied.

Frame - a list (or comparable form of identification) of all members of a population

Example:

• a list of all students in the college of engineering

• a list of all equipment owned by a company

• a list of possible errors that can occur when a program is run

• a list of all addresses served by a power supplier

• a list of all bidders for a construction project

• a list of all the trees in a particular plot

Any of these could serve as a frame for various studies. For many populations, like the residents of a state, a frame is not readily available.

A sample is a part of a population that is actually studied. For example,

• All the fish in Mobile Bay constitute a population, but the fish caught to measure mercury levels make up a sample.

• All the items produced in one run of an assembly line make up a population, but the items inspected for defects make up a sample.
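
The relationship between a frame and a sample can be sketched in a few lines. The frame below (200 numbered students) is hypothetical, and simple random sampling without replacement is only one of several possible sampling schemes.

```python
import random

# Hypothetical frame: a list identifying every member of the population.
frame = [f"student_{i:03d}" for i in range(1, 201)]   # 200 students

rng = random.Random(42)            # fixed seed so the draw can be repeated
sample = rng.sample(frame, k=20)   # simple random sample, drawn without replacement
```

Every member listed in the frame has the same chance of being drawn, and no member can appear twice in the sample.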

Sample Survey - A sample is collected and studied to gain information about a population.

For example:

• During the process of negotiating an annual contract with a parking facility, a manager of a large company wants to know how many employees will need parking space next year. How can we get reliable information? One way is to question all employees, but this procedure would be somewhat inaccurate and very time-consuming. We could take the number of spaces in use this year and assume that the need for next year will be about the same, but this method would have inaccuracies as well. A simple technique that works well in many cases is to select a sample of employees not planning to retire at the end of this year and ask each selected employee if he or she will be requesting a parking space. From the proportion of “yes” answers, an estimate of the number of parking spaces required by the entire population of employees can be obtained.

• When Alabama was planning to offer tax incentives to Mercedes for building a plant in Alabama, the Mobile Register conducted a telephone survey of about 400 adult Alabama residents and asked them, “Should Alabama offer tax incentives to industries to relocate in the state?” People responded by saying “agree,” “disagree,” or “don’t know.” From the proportion of those who agreed, the percent of adult Alabama residents in favor of offering tax incentives to industries for relocating to the state was estimated.
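
Both examples above scale a sample proportion up to the population. A minimal sketch of that arithmetic follows; the function name and the numbers (1,200 employees, a sample of 100, 65 "yes" answers) are hypothetical and are not taken from either survey.

```python
def estimate_total(population_size, sample_size, yes_count):
    """Return the sample proportion of "yes" answers and the implied population total."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    if not 0 <= yes_count <= sample_size:
        raise ValueError("yes_count must be between 0 and sample_size")
    p_hat = yes_count / sample_size            # proportion of "yes" in the sample
    return p_hat, p_hat * population_size      # scaled up to the whole population

# Hypothetical parking example: 65 of 100 sampled employees say "yes".
p_hat, spaces_needed = estimate_total(population_size=1200, sample_size=100, yes_count=65)
```

The estimate is only as good as the sample: it is a reasonable approximation when the sampled employees represent the whole workforce.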

The scenarios outlined above have all the elements of a typical sample survey. There is a question of “How many?” or “How much?” to be determined for a specific target population, the population to which we intend to apply the results of the study. The population from which the data is collected is known as the sampled population. It is desirable to have the target population the same as the sampled population, but in some circumstances they might differ. For example, random-digit-dialing telephone polls systematically leave out those without telephones and may miss those with cell phones.
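
The effect of a sampled population that differs from the target population can be shown with a toy calculation. The counts below are invented purely for illustration: each person is recorded as (reachable by the poll, in favor of the proposal).

```python
# Hypothetical target population of 1,000 people: (reachable_by_poll, in_favor).
target = ([(True, True)] * 300 + [(True, False)] * 500
          + [(False, True)] * 150 + [(False, False)] * 50)

def share_in_favor(people):
    """Proportion of the given people who are in favor."""
    return sum(in_favor for _, in_favor in people) / len(people)

# The sampled population contains only the people the poll can reach.
sampled = [person for person in target if person[0]]

target_share = share_in_favor(target)      # 450 of 1,000 people
sampled_share = share_in_favor(sampled)    # 300 of 800 people
```

In this made-up case, even a perfect poll of everyone reachable would understate support, because the unreachable group holds different opinions.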

An approximate answer for a population is derived from a sample of data extracted from the population of interest. Of key importance is the fact that the approximate answer will be a good approximation only if the sample truly represents the population under study. Randomization plays a vital role in the selection of samples that represent the population and hence produce good approximations. Virtually any sampling scheme that depends upon subjective judgments rather than randomization as to who (or which item) should be included in the sample will suffer from judgmental bias. As you will see in later chapters, randomization also forms the probabilistic basis for statistical inference.
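
Judgmental bias can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the population is simply the values 1 to 1000, and the "judgment" sample is an extreme case in which the collector picks the 50 largest items.

```python
import random
from statistics import mean

# Hypothetical population: 1,000 items with measured values 1..1000.
population = list(range(1, 1001))
population_mean = mean(population)             # 500.5

# Judgment sample: the collector hand-picks the 50 items with the largest values.
judgment_sample = sorted(population)[-50:]

# Random sample: every item has the same chance of selection.
rng = random.Random(0)                         # fixed seed for repeatability
random_sample = rng.sample(population, k=50)

judgment_error = abs(mean(judgment_sample) - population_mean)
random_error = abs(mean(random_sample) - population_mean)
```

The hand-picked sample misses the population mean by a wide margin, while the random sample's error is much smaller and, unlike the judgment sample's, can be quantified with probability theory.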

Example

The Tennessee State Board of Architectural and Engineering Examiners asked the Tennessee Society of Professional Engineers (TSPE) and the Consulting Engineers of Tennessee (CET) to look into various issues related to professional registration. One of the issues was the professional registration of engineering faculty in Tennessee. Between 1999 and 2003, they sent a survey to engineering deans to determine the registration rate for administrators (deans and department chairs) and full-time faculty. Also of interest were the opinions about the need for maintaining professional registration and whether they provide incentives to the faculty to obtain their PE certification. The survey questionnaire and results were reported by Madhaban and Malasri (Journal of Professional Issues in Engineering Education and Practice, 2003). The deans that received the survey questionnaire were not selected randomly from all available deans of engineering colleges. In fact, no specific scheme was used to select deans. Repeated rounds of mailing were used. What effect (if any) do you think this nonrandom selection will have on the outcome?

Census

The United States conducts a census every 10 years; in other words, the government attempts to count
everyone living in the United States and to measure various other features of the population. The
information collected is used for future planning in such areas as taxation, building of schools,
planning retirement centers, and forecasting energy needs. A census means a complete enumeration. It
is a process of collecting information from every unit in a target population. In other words, a census
can be viewed as a sample survey in which the sample is the entire population. Making a list of all music
CDs you own is taking a census of music CDs. If a firm takes inventory, it is taking a census of
everything in stock. The computerized record of all the employees of a firm is in fact a census of
employees. So, the target population might be your CDs, the stock of the firm, or the employees of the
company, but the key feature that identifies a census is that information is available on each element of
that population. No randomization is used in the census data collected from all the residents of the
United States, but random sampling is used to augment these data on selected issues.

Example

U.S. News and World Report (September 2003) reported the 50 top-rated doctoral universities in the
country. They collected information on several important factors such as ACT or SAT scores, percent of
freshmen in the top 10% of their high school class, student/faculty ratio, graduation rate, freshman
retention rate, alumni giving rate, and so on, for all the doctoral universities in the country. Then, using
statistical techniques, they ranked the universities. In this study, U.S. News and World Report collected
information from all doctoral institutions in the country; no randomization was used in collecting these
data. In other words, they conducted a census.

It is feasible to conduct a census if the population is small and the process of getting information does
not destroy or modify units of the population. For example, the owner of a manufacturing firm might be
interested in getting information about the stores to which his business supplies the items it produces.
It is possible to gather this information even if there are 2,500 area stores to which he supplies items.
However, in many situations a census is not the method of choice for gathering information. For a large
population, a census can be a costly and time-consuming process of data gathering. Sometimes the
process of measurement is destructive, as in testing an appliance for life length.

• One political advisor to a candidate for a governor’s position is interested in determining how much
support his candidate has in the state. Suppose the state has 4,000,000 eligible voters. Then it will be
too time-consuming to contact each voter to determine the amount of support. By the time the
census is finished, the support level might have changed, and the information collected may be useless.

• Suppose a Department of Fisheries is interested in determining the mercury level in the fish in
Mobile Bay. Using a census would mean capturing all the fish in Mobile Bay and testing them for
mercury level, which is not an advisable (or even possible) method of gathering information.

• A manufacturer of suspension cable is interested in determining the strength of the cable produced
by his factory. The strength test involves applying force until the cable breaks. Obviously, a census
would leave no cable to use. So, a census would not be a practical method of gathering information
in this situation.
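
For the voter example, the alternative to a census can be sketched as follows. This is a toy simulation, not real polling data: the 4,000,000 eligible voters come from the example above, while the true support level of 52%, the sample size of 1,000, and all variable names are assumptions made purely for illustration.

```python
import random

N_VOTERS = 4_000_000   # eligible voters, as in the example above
TRUE_SUPPORT = 0.52    # hypothetical; in practice this is the unknown being estimated
SAMPLE_SIZE = 1_000    # hypothetical number of voters the advisor can contact

rng = random.Random(2024)

# Label voters 0 .. N_VOTERS-1; in this toy population the first 52% are supporters.
n_supporters = int(N_VOTERS * TRUE_SUPPORT)

# Select voters at random instead of contacting all four million.
sampled_ids = rng.sample(range(N_VOTERS), SAMPLE_SIZE)

# "Contact" only the sampled voters and compute the sample proportion of supporters.
p_hat = sum(voter_id < n_supporters for voter_id in sampled_ids) / SAMPLE_SIZE

print(f"contacted {SAMPLE_SIZE} of {N_VOTERS} voters")
print(f"estimated support: {p_hat:.3f}")
```

Only a tiny fraction of the population is contacted, so the estimate is available while the support level is still current. How close such an estimate is likely to be to the true value is a question of statistical inference.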

2.4 Experiment and Inference about Cause and Effect

An experiment is a planned activity designed to compare “treatments.” In an experiment, the
experimenter creates differences in the experimental units involved by subjecting them to different
treatments and then observing the effects of such treatments on the measured outcome.
For example,

• In laboratory testing, engineers at one car manufacturing facility run crash tests that involve
running cars at different speeds (predetermined and controlled) and crashing them at a specific
site. Then they measure the damage to the bumpers. In this example, the team of engineers
creates the differences in environment by running cars at different speeds. (The cars are the
experimental units, and the speeds are the treatments.)

• Engineers interested in studying heat transfer use pipes of different sizes and control the
direction in which water is flowing. In one study, the engineers create different environments
by controlling the size of the pipes and the direction of the water flow to determine the percent
of heat transfer in those different environments. (The pipes are the experimental units, and the
size-direction combinations are the treatments.)

As in sample surveys, randomization plays a vital role in designed experiments. By randomizing
the assignment of different environments (treatments) to experimental units, biases that might
result due to learning effects or specific orders can be avoided. Designed experiments are
conducted not only to establish differences in outcomes but also to infer cause-and-effect
relationships between environments and outcomes. In sample surveys, a sample is selected
randomly from a population of interest to estimate some population characteristic; in designed
experiments, different experimental units are assigned randomly to different treatments to
study the treatment effects.
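
Random assignment of experimental units to treatments can be sketched as below. The setup loosely follows the crash-test example, but the 12 cars, the three speeds, and the function name randomize_assignment are hypothetical choices made for illustration only.

```python
import random


def randomize_assignment(units, treatments, seed=None):
    """Randomly split the units into equal-sized groups, one group per treatment."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # chance, not the experimenter, decides the order
    group_size = len(shuffled) // len(treatments)
    return {
        treatment: shuffled[i * group_size:(i + 1) * group_size]
        for i, treatment in enumerate(treatments)
    }


# Hypothetical experiment: 12 cars (experimental units), three speeds (treatments).
cars = [f"car_{i:02d}" for i in range(1, 13)]
speeds_mph = [5, 10, 15]

assignment = randomize_assignment(cars, speeds_mph, seed=7)
for speed, group in assignment.items():
    print(f"{speed} mph: {sorted(group)}")
```

Because the shuffle decides which car receives which speed, the experimenter cannot steer, even unintentionally, particular cars toward particular treatments.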

Example

Guo and Uea (Trans IChemE, 2003) conducted experiments to study the effects of impregnation conditions
on the textural and chemical characteristics of the prepared adsorbents. They used three different
concentrations (20%, 30%, and 40%) of three different solutions, zinc chloride (ZnCl2), phosphoric
acid (H3PO4), and potassium hydroxide (KOH), and recorded the amount of nitrogen dioxide (NO2)
and ammonia (NH3) adsorbed onto the oil-palm-shell adsorbents. In this experiment, different
treatments were created by using different concentrations of the solutions, and the effects of these
different treatments were measured by the amount of nitrogen dioxide and ammonia adsorbed.
The nine different treatments created in the experiment can be listed as follows:

(1) 20% of ZnCl2 (2) 30% of ZnCl2 (3) 40% of ZnCl2

(4) 20% of H3PO4 (5) 30% of H3PO4 (6) 40% of H3PO4

(7) 20% of KOH (8) 30% of KOH (9) 40% of KOH
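
The nine treatments are simply every combination of the three solutions with the three concentrations. A minimal sketch of that enumeration follows; the variable names are illustrative, not part of the original study.

```python
from itertools import product

concentrations = ["20%", "30%", "40%"]
solutions = ["ZnCl2", "H3PO4", "KOH"]

# Solution varies slowest, so the numbering matches the list above.
treatments = [f"{conc} of {sol}" for sol, conc in product(solutions, concentrations)]

for number, treatment in enumerate(treatments, start=1):
    print(f"({number}) {treatment}")
```

With 3 solutions and 3 concentrations there are 3 × 3 = 9 treatments. Adding a fourth concentration would raise the count to 12.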

2.5 Observational Study

An observational study is a data collection activity in which the experimenter merely plays the role

of an observer. The experimenter observes the differences in the conditions of units and observes the

effects of these conditions on measurements taken on these units. The experimenter does not interject

any treatment and does not contribute to the creation of observed differences.

For example:

• One researcher collected information about the speed at which a car was travelling when a
crash occurred and the amount of damage to the bumper from the accident reports filed by
the local Police Department. In this example, the researcher has no control over the speed of
the car. He did not contribute to the creation of the differences among the speeds. The
researcher merely observed the differences in speeds and their results, measured by the
amount of damage to the bumper.

• The weather station at the Mobile/Pascagoula Regional Airport recorded the wind speed, wind
direction, and eye radius of the storm when Hurricane Danny stayed over Mobile Bay for three
days. The meteorologists studied the relation among different factors to investigate reasons
behind the fluctuations in the eye of the storm. Changing values of the wind speed and wind
direction had created different environments in the storm, and such environments could be
evaluated using differences in wind speeds and directions. However, the meteorologists did not
control those scenarios; they merely observed the conditions created by nature.

Example

Wolmuth and Surters (Proceedings of the ICE, 2003) studied crowd-related failures of bridges around
the world. They collected information on bridge failures from the years 1825-2000. For each failed
bridge, they collected information on the age, use (road, footbridge, other), form (aluminum, chain,
cable supports, concrete, deck structure, iron, steel, timber), span, width, occasion (cavalry or soldiers,
sports gathering, religious gathering, river spectacle, toll dispute, other), crowd size, crowd action
(walking from one end of the bridge to the other, procession, crowd concentrated at one parapet, crowd
going from one parapet to the other, queue, cavalry, soldiers or other military), number of deaths,
number injured, and so forth. This was an observational study because the authors collected data from
existing scenarios (they did not create differences in them) and analyzed the collected data to answer
specific questions about the bridge failures. Even an observational study such as this one provides very
valuable information to engineers about planning bridge construction and the proper use of bridges,
but such studies do not allow cause-and-effect conclusions.

Although we might like to, it is not possible to conduct an experiment in all investigations that involve
a comparison of treatments. Sometimes we must use an observational study instead of an experiment.

• To study the effect of asbestos on the health of the workers in a certain industry that makes use of
that product, an experiment would require a group of workers to be exposed to a product containing
asbestos while another group is not. It is unethical to expose somebody intentionally to possibly
harmful chemicals so that damage to health can be measured.

• Certain inherited traits affect a worker’s ability to perform certain tasks. It is impossible to randomly
assign genetic traits to different workers; they are born with those traits.

In observational studies, results cannot be generalized to a population because observational studies
use volunteers or samples of convenience, such as workers in the first shift, instead of a random sample
selected from all workers. However, we can sometimes check to see whether the results can reasonably
be explained by chance alone.

Review Exercises

1. Engineers are interested in comparing the mean hydrogen production rates per day for three

different heliostat sizes. From the past week’s records, the engineers obtained the amount of

hydrogen produced per day for each of the three heliostat sizes. Then they computed and

compared the sample means, which showed that the mean production rate per day increased

with the heliostat sizes.


a. Identify the type of the study described here.

b. Discuss the type of inference that can and cannot be drawn from this study.

2. To investigate reasons why people do not work, the Census Bureau interviewed a group of
randomly selected individuals from April to July 1996. In four separate rotation groups,
respondents were asked to select 1 of 11 categories consisting of economic and noneconomic
reasons for not working, in response to the question, “What is the main reason you did not work
at a job or business [in the last four months]?”

a. Identify the type of the study described here.

b. Identify the population of interest.

3. Ariatnam, Najafi, and Morones (Journal of Professional Issues in Engineering Education and
Practice, 2002) describe an overview of academies in horizontal directional drilling conducted to
train engineers and inspectors for the California Department of Transportation. A pretest was
administered on the first day prior to any instruction. Instruction and field experience were
provided over a 3-day period, followed by a final test administered at the end of the last day.

Although the average final test score of 75.27% was higher than the average pretest score of

55.61%, the difference was not significant.

a. Identify the type of the study described here.

b. What is the purpose of administering the pretest?

4. A materials engineer wants to study the effects of two different processes for sintering copper (a

process by which copper powder coalesces into a solid but porous copper) on two different types

of copper powders. From each type of copper powder, she randomly selects two samples and

then randomly assigns one of the two sintering processes to each sample by the flip of a coin.

The response of interest measured is the porosity of the resulting copper. Explain what type of

study this is and why.

5. A textile engineer is interested in measuring the heat resistance of four different types of threads
used in making fire-resistant clothing for firefighters. A random sample of 20 threads from each type
was taken and subjected to a heat test to determine resistance (the length of time the fibers survive
before starting to burn). Explain what type of study this is and why.

6. A manufacturer of “Keep it Warm” bags is interested in comparing the heat retention of bags
when used at five different temperatures (100 °F, 125 °F, 150 °F, 175 °F, and 200 °F). Thirty bags
are selected randomly from last week’s production and randomly assigned, six each, to five
different groups. Items from group 1 at beginning temperature 100 °F were kept in bags for an
hour, and the temperatures of those items were recorded after an hour. Similarly, groups 2 to 5
were assigned items at 125 °F, 150 °F, 175 °F, and 200 °F, respectively.

a. Identify the type of study used here.

b. What type of inference is possible from this study?

LESSON 3 - CHOOSING DATA COLLECTION TECHNIQUES

Lesson Objectives:

At the end of the lesson, students should:

1. Discuss the important features of a data collection plan;

2. Explain the importance of every data collection method or technique for gathering data; and

3. Differentiate the survey, experiment, and observational study.

Introduction

Data can be collected in a variety of ways. One of the most common methods is through the use
of surveys. We have briefly discussed three common techniques for gathering data: observational
studies, experiments, and surveys. Which technique is best? The answer depends on the number of
variables of interest and the level of confidence needed regarding statements of relationships among
the variables.

Lesson Proper

✓ Surveys may be the best choice for gathering information across a wide range of variables. Many
questions can be included in a survey. However, great care must be taken in the construction of the
survey instrument and in the administration of the survey. Nonresponse and other issues discussed
earlier can introduce bias.

✓ Observational studies are the next most convenient technique for gathering information on many
variables. Protocols for taking measurements or recording observations need to be specified carefully.

✓ Experiments are the most stringent and restrictive data gathering technique. They can be
time-consuming, expensive, and difficult to administer. In experiments, the goal is often to study the
effects of changing only one variable at a time. Because of these requirements, the number of variables
may be more limited. Experiments must be designed carefully to ensure that the resulting data are
relevant to the research questions.

Comment

An experiment is the best technique for reaching valid conclusions. By carefully controlling for other
variables, changing one variable in a treatment group and comparing the outcome with that of a control
group yields results carrying high confidence.

The next most effective technique for obtaining results that have high confidence is the use of

observational studies. Care must be taken that the act of observation does not change the behavior
being

measured or observed.

The least effective technique for drawing conclusions is the survey. Surveys have many pitfalls and by

their nature cannot give exceedingly precise results. A medical study utilizing a survey asking patients if

they feel better after taking a specific drug gives some information, but not precise information about

the drug’s effects. However, surveys are widely used to gauge attitudes, gather demographic
information,

study social and political trends, and so on.

Important Features of a Data Collection Plan

A data collection plan identifies

• The population

• The variable or variables

• Whether the data are observational or experimental

• Whether there is a control group, use of placebos, double-blind treatment, etc.

• The sampling technique to be used, including whether a block design is to be used

• The method used to collect the data for the variables: survey, method of measurement, count,
etc.
