Chapter 1-Obtaining Data

Engineering Data Analysis
Mr. MARK JAVE C. GUALBERTO, RME

Lecturer I
Chapter 1: OBTAINING DATA
▪ Methods of Data Collection

▪ Planning and Conducting Surveys
▪ Planning and Conducting Experiments: Introduction to Design of
Experiments
Obtaining Data
▪ Introduction
▪ Statistics may be defined as the science that deals with the collection,
organization, presentation, analysis, and interpretation of data in order
to be able to draw judgments or conclusions that help in the
decision-making process. The two parts of this definition correspond
to the two main divisions of Statistics: Descriptive Statistics
and Inferential Statistics. Descriptive Statistics, which is referred to in
the first part of the definition, deals with the procedures that organize,
summarize, and describe quantitative data. It seeks merely to describe
data. Inferential Statistics, implied in the second part of the definition,
deals with making a judgment or a conclusion about a population
based on the findings from a sample that is taken from the population.
▪ Statistical Terms
Before proceeding to the discussion of the different methods of obtaining data, let us
first define some statistical terms:
Population or Universe refers to the totality of objects, persons, places, or things considered
in a particular study: all members of a particular group of objects (items), people
(individuals), etc. that are the subjects or respondents of the study.
Sample is any subset of a population, that is, a few members of the population.
Data are facts, figures and information collected on some characteristics of a population
or sample. These can be classified as qualitative or quantitative data.
Ungrouped (or raw) data are data which are not organized in any specific way. They are
simply the collection of data as they are gathered.
Grouped data are raw data organized into groups or categories with corresponding
frequencies. Organized in this manner, the data are referred to as a frequency distribution.
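The step from ungrouped to grouped data can be sketched in a few lines of Python. This is only an illustration: the scores in `raw_scores` are hypothetical values invented for the example, and each distinct value is treated as its own category.

```python
from collections import Counter

# Hypothetical raw (ungrouped) data: quiz scores in the order they were gathered.
raw_scores = [7, 9, 5, 7, 8, 9, 7, 6, 8, 7]

# Grouped data: each distinct value paired with its frequency, sorted by value.
frequency_distribution = dict(sorted(Counter(raw_scores).items()))

for score, frequency in frequency_distribution.items():
    print(f"{score}: {frequency}")
```

The frequencies always add up to the number of raw observations, which is a quick check that no data point was lost in the grouping.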
▪ Parameter is a descriptive measure of a characteristic of a population.
▪ Statistic is a measure of a characteristic of a sample.
▪ Constant is a characteristic or property of a population or sample which is common to
all members of the group.
▪ Variable is a measure or characteristic or property of a population or sample that may
have a number of different values. It differentiates a particular member from the rest of
the group. It is the characteristic or property that is measured, controlled, or
manipulated in research. Variables differ in many respects, most notably in the role they
are given in the research and in the type of measures that can be applied to them.
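The difference between a parameter and a statistic can be illustrated with a short Python sketch. The diameters below are hypothetical values chosen for the example, and the seed is fixed only so that the sketch is repeatable.

```python
import random
from statistics import mean

# Hypothetical population: measured diameters (mm) of all 8 parts in a batch.
population = [10.2, 9.8, 10.0, 10.1, 9.9, 10.3, 9.7, 10.0]

# Parameter: computed from every member of the population.
population_mean = mean(population)

# Statistic: computed from a sample (a subset of the population).
rng = random.Random(1)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, k=4)
sample_mean = mean(sample)

print(f"parameter (population mean): {population_mean:.3f}")
print(f"statistic (sample mean):     {sample_mean:.3f}")
```

The parameter is a single fixed number for the population, while the statistic changes whenever a different sample is drawn.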
MATH 403- ENGINEERING
Methods of Data Collection
▪ Collection of data is the first step in conducting a statistical inquiry. It refers
to data gathering: a systematic method of collecting and measuring data
from different sources of information in order to provide answers to relevant
questions. This involves acquiring information from published literature, surveys through
questionnaires or interviews, experimentation, documents and records, tests or
examinations, and other forms of data-gathering instruments. The person who
conducts the inquiry is an investigator, the one who helps in collecting information
is an enumerator, and the one from whom information is collected is a respondent. Data can be
primary or secondary. According to Wessel, “Data collected in the process of
investigation are known as primary data.” These are collected for the investigator’s
use from the primary source. Secondary data, on the other hand, are collected by
some other organization for its own use, but the investigator also obtains them for his or her own use.
According to M.M. Blair, “Secondary data are those already in existence for some
other purpose than answering the question in hand.”
Planning and Conducting Surveys
▪ A survey is a method of asking respondents some well-constructed questions. It is
an efficient way of collecting information and is easy to administer, and a wide
variety of information can be collected. The researcher can be focused and can
stick to the questions that are of interest and are necessary in the statistical inquiry or
study. However, surveys depend on the respondent's honesty, motivation, memory,
and ability to respond. Sometimes answers may lead to vague data. Surveys
can be done through face-to-face interviews or self-administered through the use
of questionnaires. The advantages of face-to-face interviews include fewer
misunderstood questions, fewer incomplete responses, higher response rates, and
greater control over the environment in which the survey is administered; also, the
researcher can collect additional information if any of the respondents’ answers
need clarifying. The disadvantages of face-to-face interviews are that they can be
expensive and time-consuming and may require a large staff of trained
interviewers. In addition, the response can be biased by the appearance or attitude
of the interviewer.
▪ Self-administered surveys are less expensive than interviews. They can be administered in large
numbers, do not require many interviewers, and put less pressure on respondents.
However, in self-administered surveys, the respondents are more likely to stop participating
mid-way through the survey, and respondents cannot ask for clarification. Response
rates are also lower than in personal interviews.
▪ When designing a survey, the following steps are useful:
1. Determine the objectives of your survey: What questions do you want to answer?
2. Identify the target population and sample: Whom will you interview? Who will be the
respondents? What sampling method will you use?
3. Choose an interviewing method: face-to-face interview, phone interview, self-administered
paper survey, or internet survey.
4. Decide what questions you will ask, in what order, and how to phrase them.
5. Conduct the interview and collect the information.
6. Analyze the results by making graphs and drawing conclusions.
▪ In choosing the respondents, sampling techniques are necessary. Sampling is the
process of selecting units (e.g., people, organizations) from a population of
interest. The sample must be representative of the target population. The target
population is the entire group a researcher is interested in; the group about which
the researcher wishes to draw conclusions. There are two ways of selecting a
sample: non-probability sampling and probability sampling.
Non-Probability Sampling
▪ Non-probability sampling is also called judgment or subjective sampling. This method is convenient and
economical, but the inferences made based on the findings are less reliable. The most common types of
non-probability sampling are convenience sampling, purposive sampling, and quota sampling.
▪ In convenience sampling, the researcher obtains information from respondents in whatever way
is most convenient for the researcher, which can bias the selection of respondents.
▪ In purposive sampling, the selection of respondents is predetermined according to the characteristic of
interest chosen by the researcher. Randomization is absent in this type of sampling.
▪ There are two types of quota sampling: proportional and non-proportional. In proportional quota
sampling, the major characteristics of the population are represented by sampling a proportional
amount of each.
▪ For instance, if you know the population has 40% women and 60% men, and that you want a total
sample size of 100, you will continue sampling until you get those percentages and then you will stop.
▪ Non-proportional quota sampling is a bit less restrictive. In this method, the researcher specifies a
minimum number of sampled units in each category and is not concerned with having numbers that match
the proportions in the population.
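The proportional quota example above (40% women, 60% men, total sample of 100) can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name `quota_sample` and the respondent stream are hypothetical, and respondents are accepted simply in the order they are encountered, which is why no randomization is involved.

```python
def quota_sample(respondent_stream, quotas):
    """Accept respondents in arrival order until every category quota is filled."""
    counts = {category: 0 for category in quotas}
    selected = []
    for respondent_id, category in respondent_stream:
        # Accept the respondent only if that category still has room.
        if counts.get(category, 0) < quotas.get(category, 0):
            selected.append((respondent_id, category))
            counts[category] += 1
        if counts == quotas:
            break  # all quotas met, so sampling stops
    return selected

# Quotas from the example: 40% women and 60% men in a sample of 100.
sample_size = 100
quotas = {"woman": round(0.40 * sample_size), "man": round(0.60 * sample_size)}

# Hypothetical stream of respondents in the order they happen to be encountered.
stream = [(i, "woman" if i % 2 == 0 else "man") for i in range(500)]

selected = quota_sample(stream, quotas)
print(len(selected))
```

For a non-proportional quota, the same function could be used with `quotas` set to the chosen minimum counts per category instead of values derived from population percentages.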
Probability Sampling
▪ In probability sampling, every member of the population has a known, nonzero
chance of being selected as part of the sample (an equal chance in the simplest
case). There are several probability sampling techniques. Among these are simple
random sampling, stratified sampling, and cluster sampling.
Simple Random Sampling
▪ Simple random sampling is the basic sampling technique where a group of
subjects (a sample) is selected for study from a larger group (a population). Each
individual is chosen entirely by chance and each member of the population has
an equal chance of being included in the sample. Every possible sample of a
given size has the same chance of selection; i.e., each member of the population
is equally likely to be chosen at any stage in the sampling process.
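A simple random sample can be drawn with Python's standard `random` module, as sketched below. The population of 50 ID numbers is hypothetical, and the seed is fixed only to make the sketch repeatable; in practice the seed would normally be left unset.

```python
import random

# Hypothetical sampling frame: ID numbers of a population of 50 students.
population = list(range(1, 51))

rng = random.Random(42)  # seed is fixed only to make the sketch repeatable

# Draw 10 distinct members without replacement; every subset of size 10
# is equally likely to be the one selected.
sample = rng.sample(population, k=10)

print(sorted(sample))
```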
Stratified Sampling
▪ There may often be factors which divide up the population into sub-populations
(groups / strata) and the measurement of interest may vary among the different
subpopulations. This has to be accounted for when a sample from the population
is selected in order to obtain a sample that is representative of the population.
This is achieved by stratified sampling.
▪ A stratified sample is obtained by taking samples from each stratum or
sub-group of a population. When a sample is to be taken from a population with
several strata, the proportion of each stratum in the sample should be the same as
in the population.
▪ Stratified sampling techniques are generally used when the population is
heterogeneous, or dissimilar, but certain homogeneous, or similar,
sub-populations (strata) can be isolated. Simple random sampling is most
appropriate when the entire population from which the sample is taken is
homogeneous. Some reasons for using stratified sampling over simple random
sampling are:
1. the cost per observation in the survey may be reduced;
2. estimates of the population parameters may be wanted for each
sub-population;
3. accuracy may be increased at a given cost.
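Proportional stratified sampling, where each stratum appears in the sample in the same proportion as in the population, can be sketched as follows. The function name `stratified_sample`, the strata, and their sizes are all hypothetical and serve only as an illustration.

```python
import random

def stratified_sample(strata, sample_size, rng):
    """Draw a simple random sample from each stratum, sized in proportion to the stratum."""
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding may shift the total by a unit or two.
        k = round(sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population of 200 students split into three strata by year level.
strata = {
    "first_year": [f"F{i}" for i in range(100)],
    "second_year": [f"S{i}" for i in range(60)],
    "third_year": [f"T{i}" for i in range(40)],
}

sample = stratified_sample(strata, sample_size=20, rng=random.Random(7))
for name, members in sample.items():
    print(name, len(members))
```

With these stratum sizes (50%, 30%, and 20% of the population), a sample of 20 takes 10, 6, and 4 members from the three strata, respectively.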
Cluster Sampling
▪ Cluster sampling is a sampling technique where the entire population is divided
into groups, or clusters, and a random sample of these clusters is selected. All
observations in the selected clusters are included in the sample.
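Cluster sampling can be sketched in Python as follows. The clusters and member labels are hypothetical, and the seed is fixed only for repeatability.

```python
import random

# Hypothetical population grouped into clusters (e.g., class sections).
clusters = {
    "section_A": ["A1", "A2", "A3"],
    "section_B": ["B1", "B2", "B3", "B4"],
    "section_C": ["C1", "C2"],
    "section_D": ["D1", "D2", "D3"],
}

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Randomly pick whole clusters, not individual members.
chosen_clusters = rng.sample(sorted(clusters), k=2)

# Every observation in each chosen cluster enters the sample.
sample = [member for name in chosen_clusters for member in clusters[name]]

print(chosen_clusters, sample)
```

Note the contrast with stratified sampling: there, a few members are drawn from every group, while here all members are taken from a few randomly chosen groups.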
Planning and Conducting Experiments:
Introduction to Design of Experiments
▪ The products and processes in the engineering and scientific disciplines are mostly
derived from experimentation. An experiment is a series of tests conducted in a
systematic manner to increase the understanding of an existing process or to explore a
new product or process. Design of Experiments, or DOE, is a tool to develop an
experimentation strategy that maximizes learning using minimum resources. Design of
Experiments is widely and extensively used by engineers and scientists in improving
existing processes through maximizing the yield and decreasing the variability, or in
developing new products and processes. It is a technique needed to identify the "vital
few" factors in the most efficient manner and then direct the process to its best setting
to meet the ever-increasing demand for improved quality and increased productivity.
▪ The methodology of DOE ensures that all factors and their interactions are
systematically investigated, resulting in reliable and complete information. There are
five stages to be carried out for the design of experiments. These are planning,
screening, optimization, robustness testing, and verification.
Planning
▪ It is important to carefully plan the course of experimentation before
embarking upon the process of testing and data collection. At this stage,
the objectives of conducting the experiment or investigation are identified, and
the time and resources available to achieve those objectives are assessed. Individuals
from different disciplines related to the product or process should compose the
team that will conduct the investigation. They are to identify possible factors to
investigate and the most appropriate responses to measure. A team approach
promotes synergy that gives a richer set of factors to study and thus a more
complete experiment. Experiments which are carefully planned always lead to
increased understanding of the product or process. Well-planned experiments are
easy to execute and analyze using the available statistical software.
Screening
▪ Screening experiments are used to identify the important factors that affect the
process under investigation out of the large pool of potential factors. The screening
process eliminates unimportant factors so that attention is focused on the key
factors. Screening experiments usually use efficient designs which require few
runs and focus on the vital factors and not on interactions.
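One common way to rank factors in a screening experiment is to compare their main effects in a two-level design. The sketch below is an illustration under stated assumptions: the factor names and response values are hypothetical, and a small full factorial (every combination of low and high settings) is used for simplicity, whereas real screening designs often run only a fraction of these combinations.

```python
from itertools import product
from statistics import mean

factors = ["temperature", "pressure", "catalyst"]  # hypothetical factor names

# Two-level design: every combination of low (-1) and high (+1) settings.
design = list(product([-1, 1], repeat=len(factors)))

# Hypothetical measured responses (e.g., yield), one per run in design order.
responses = [60, 61, 64, 65, 72, 73, 76, 77]

def main_effect(factor_index):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for run, y in zip(design, responses) if run[factor_index] == 1]
    low = [y for run, y in zip(design, responses) if run[factor_index] == -1]
    return mean(high) - mean(low)

effects = {name: main_effect(i) for i, name in enumerate(factors)}

# Rank factors by the size of their effect to see which ones matter most.
for name, effect in sorted(effects.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"{name}: {effect:+.1f}")
```

With these made-up numbers, temperature has by far the largest main effect, so it would be kept as a key factor, while catalyst would be a candidate for elimination.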
Optimization
▪ After narrowing down the important factors affecting the process, the next step
is to determine the best setting of these factors to achieve the objectives of the
investigation. The objectives may be to increase yield, to decrease
variability, or to find settings that achieve both at the same time, depending on the
product or process under investigation.
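As a rough illustration of comparing settings on both yield and variability, the sketch below summarizes replicated runs at a few candidate settings. The setting names and yields are hypothetical, and the selection rule (highest mean, ties broken by lowest standard deviation) is only one simple assumption; actual optimization studies typically fit a model to the responses.

```python
from statistics import mean, stdev

# Hypothetical replicated yields (%) at three candidate settings of the key factors.
runs = {
    "setting_A": [78, 80, 79],
    "setting_B": [85, 84, 86],
    "setting_C": [85, 90, 80],
}

# Summarize each setting by its mean yield and its variability.
summary = {name: (mean(ys), stdev(ys)) for name, ys in runs.items()}

# Prefer the highest mean yield; break ties with the lowest variability.
best = max(summary, key=lambda name: (summary[name][0], -summary[name][1]))

for name, (avg, sd) in summary.items():
    print(f"{name}: mean={avg:.1f}, stdev={sd:.1f}")
print("best:", best)
```

Here settings B and C have the same mean yield, but B is far less variable, so B is chosen.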
Robustness Testing
▪ Once the optimal settings of the factors have been determined, it is important to
make the product or process insensitive to variations resulting from changes in
factors that affect the process but are beyond the control of the analyst. Such
factors are referred to as noise or uncontrollable factors that are likely to be
experienced in the application environment. It is important to identify such
sources of variation and take measures to ensure that the product or process is
made robust or insensitive to these factors.
Verification
▪ This final stage involves validation of the optimum settings by conducting a few
follow-up experimental runs. This is to confirm that the process functions as
expected and all objectives are achieved.
End