
MATH403 - ENGINEERING DATA ANALYSIS
Engr. DONITA B. REYES
Lecturer
TOPIC FOR TODAY
Obtaining Data
Definition of Terminologies
Types of data, variables and levels of measurement
Data collection techniques
Sampling techniques
STATISTICS
DEFINED AS THE SCIENCE THAT DEALS WITH THE COLLECTION,
ORGANIZATION, PRESENTATION, ANALYSIS, AND INTERPRETATION
OF DATA IN ORDER TO BE ABLE TO DRAW JUDGMENTS OR
CONCLUSIONS THAT HELP IN THE DECISION-MAKING PROCESS.
DESCRIPTIVE STATISTICS
DEALS WITH THE PROCEDURES THAT ORGANIZE, SUMMARIZE AND
DESCRIBE QUANTITATIVE DATA. IT SEEKS MERELY TO DESCRIBE DATA.

INFERENTIAL STATISTICS
DEALS WITH MAKING A JUDGMENT OR A CONCLUSION ABOUT A
POPULATION BASED ON THE FINDINGS FROM A SAMPLE THAT IS
TAKEN FROM THE POPULATION.
TERMINOLOGIES
Population or Universe
REFERS TO THE TOTALITY OF OBJECTS, PERSONS,
PLACES, THINGS USED IN A PARTICULAR STUDY. ALL
MEMBERS OF A PARTICULAR GROUP OF OBJECTS
(ITEMS) OR PEOPLE (INDIVIDUAL), ETC. WHICH ARE
SUBJECTS OR RESPONDENTS OF A STUDY.

Sample
IS ANY SUBSET OF POPULATION OR FEW MEMBERS OF A POPULATION.
Data
ARE FACTS, FIGURES AND
INFORMATION COLLECTED ON
SOME CHARACTERISTICS OF A
POPULATION OR SAMPLE.
THESE CAN BE CLASSIFIED AS
QUALITATIVE OR QUANTITATIVE
DATA.
Grouped Data
ARE RAW DATA ORGANIZED INTO GROUPS OR CATEGORIES WITH
CORRESPONDING FREQUENCIES. ORGANIZED IN THIS MANNER, THE
DATA IS REFERRED TO AS FREQUENCY DISTRIBUTION.

Ungrouped Data
ARE DATA WHICH ARE NOT ORGANIZED IN ANY SPECIFIC WAY.
THEY ARE SIMPLY THE COLLECTION OF DATA AS THEY ARE GATHERED.
METHODS OF OBTAINING DATA
I. METHODS OF DATA COLLECTION
II. PLANNING AND CONDUCTING SURVEYS
III. PLANNING AND CONDUCTING EXPERIMENTS
I. METHODS OF DATA COLLECTION
Collection of the data is the first step in conducting
statistical inquiry. It simply refers to data gathering, a
systematic method of collecting and measuring data from
different sources of information in order to provide
answers to relevant questions.
This involves acquiring information from published
literature, surveys through questionnaires or interviews,
experimentations, documents and records, tests or
examinations and other forms of data gathering instruments.
RESPONDENT
information is collected from them

ENUMERATOR
the one who helps in collecting information

INVESTIGATOR
person who conducts the inquiry
PRIMARY DATA
According to Wessel, “Data collected in the process of
investigation are known as primary data.” These are
collected for the investigator’s use from the primary source.

SECONDARY DATA
Secondary data, on the other hand, is collected by some
other organization for their own use but the investigator
also gets it for his use. According to M.M. Blair,
“Secondary data are those already in existence for some
other purpose than answering the question in hand.”
Tell whether each is PRIMARY or SECONDARY

BOOKS
INTERNET COMMUNICATION
SURVEYS (COLLECTED BY YOU)
CENSUS
GOOGLE ANALYTICS
NEWSPAPER
THREE BASIC METHODS OF COLLECTING DATA

RETROSPECTIVE STUDY
would use the population or sample of the historical data
which had been archived over some period of time

OBSERVATIONAL STUDY
the researchers only observe the subjects and do not
interfere or try to influence the outcomes

DESIGNED EXPERIMENTS
In engineering, there are problem areas with no scientific
or engineering theory that are directly or completely
applicable, so experimentation and observation of the
resulting data is the only way to solve them.
An example of an experiment is when
scientists give rats a new medicine and
see how they react to learn about the
medicine. An example of an experiment
is when you try a new coffee shop but
you aren't sure how the coffee will taste.
The result of experimentation.
II. PLANNING AND CONDUCTING SURVEYS
SURVEY
is a method of asking respondents
some well-constructed questions.
It is an efficient way of collecting
information and easy to
administer wherein a wide variety
of information can be collected.
FACE TO FACE
•Advantages of face-to-face interviews include fewer
misunderstood questions, fewer incomplete responses,
higher response rates, and greater control over the
environment in which the survey is administered; also, the
researcher can collect additional information if any of the
respondents’ answers need clarifying.

SELF-ADMINISTERED
•Less expensive than interviews.
•It can be administered in large numbers and does not
require many interviewers and there is less pressure on
respondents.
FACE TO FACE
•The disadvantages of face-to-face interviews are that
they can be expensive and time-consuming and may require a
large staff of trained interviewers. In addition, the
response can be biased by the appearance or attitude of the
interviewer.

SELF-ADMINISTERED
•The respondents are more likely to stop participating
mid-way through the survey, and respondents cannot ask for
clarification. There are lower response rates than in
personal interviews.
STEPS ON DESIGNING A SURVEY
FIRST
1. Determine the objectives of your survey: What questions
do you want to answer?
SECOND
2. Identify the target population sample: Whom will you
interview? Who will be the respondents? What sampling
method will you use?
THIRD
3. Choose an interviewing method: face-to-face interview,
phone interview, self-administered paper survey/internet
survey.
FOURTH
4. Decide what questions you will ask in what order, and
how to phrase them.
FIFTH
5. Conduct the interview and collect the information.
SIXTH
6. Analyze the results by making graphs and drawing
conclusions.
SAMPLING
Sampling is the process of selecting
units (e.g., people, organizations) from a
population of interest
SAMPLE
must be representative of the target
population. The target population is the
entire group a researcher is interested in;
the group about which the researcher
wishes to draw conclusions.
TWO WAYS OF SELECTING SAMPLE

PROBABILITY SAMPLING
Probability sampling is defined as a sampling technique in
which the researcher chooses samples from a larger
population using a method based on the theory of
probability. For a participant to be considered as a
probability sample, he/she must be selected using a random
selection.

NON-PROBABILITY SAMPLING
It is also called judgment or subjective sampling. This
method is convenient and economical but the inferences made
based on the findings are not so reliable.
PROBABILITY SAMPLING
The most critical requirement of probability sampling is
that everyone in your population has a known and equal
chance of getting selected.

NON-PROBABILITY SAMPLING
It is a sampling method in which not all members of the
population have an equal chance of participating in the
study, unlike probability sampling, where each member of
the population has a known chance of being selected.
PROBABILITY SAMPLING
For example, if you have a population of 100 people, every
person would have odds of 1 in 100 for getting selected.
Probability sampling gives you the best chance to create a
sample that is truly representative of the population.

NON-PROBABILITY SAMPLING
Non-probability sampling is most useful for exploratory
studies like a pilot survey (deploying a survey to a smaller
sample compared to pre-determined sample size). Researchers
use this method in studies where it is impossible to draw a
random probability sample due to time or cost considerations.
NON-PROBABILITY SAMPLING
CONVENIENCE SAMPLING
The researcher uses a device to obtain information from the
respondents in a way that favors the researcher but can
bias the responses.

It means collecting a sample of whichever participants are
easiest to reach.
NON-PROBABILITY SAMPLING
PURPOSIVE SAMPLING
The selection of respondents is predetermined according to the
characteristic of interest made by the researcher.
Randomization is absent in this type of sampling.
The participants are selected based on the purpose of the
sample, hence the name. Participants are selected according to
the needs of the study (hence the alternate name, deliberate
sampling); applicants who do not meet the profile are rejected
NON-PROBABILITY SAMPLING
QUOTA SAMPLING

PROPORTIONAL
In proportional quota sampling, the major characteristics
of the population are represented by sampling a
proportional amount of each.

NON-PROPORTIONAL
Non-proportional quota sampling is a bit less restrictive.
In this method, a minimum number of sampled units in each
category is specified, and the researcher is not concerned
with having numbers that match the proportions in the
population.
NON-PROBABILITY SAMPLING
QUOTA SAMPLING
PROPORTIONAL
•For example, imagine you want to create a council of 20
employees that will meet and recommend possible changes to the
employee handbook. Let's say 40% of your employees are in Sales
and Marketing, 30% in Customer Service, 20% of your employees
are in IT, and 10% in Finance. You will randomly select 8 people
from Sales and Marketing, 6 from Customer Service, 4 from IT, and
2 from Finance. As you can see, each number you pick is
proportionate to the overall percentage of people in each
category (e.g., 40% = 8 people).
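The arithmetic in this example can be checked with a short Python sketch. The function name `proportional_quotas` is hypothetical and only illustrates the allocation; it assumes the shares are given as whole percentages.

```python
def proportional_quotas(percent_by_category, total):
    """Allocate `total` sampled units in proportion to each category's share.

    Shares are whole percentages. Rounding can make the quotas miss `total`
    by a unit or two for other inputs, so the sum should always be checked.
    """
    return {name: round(pct * total / 100)
            for name, pct in percent_by_category.items()}

# Department shares taken from the council example above.
shares = {
    "Sales and Marketing": 40,
    "Customer Service": 30,
    "IT": 20,
    "Finance": 10,
}

quotas = proportional_quotas(shares, 20)
print(quotas)
# {'Sales and Marketing': 8, 'Customer Service': 6, 'IT': 4, 'Finance': 2}
```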
PROBABILITY SAMPLING
SIMPLE RANDOM SAMPLING
Simple random sampling is the basic sampling
technique where a group of subjects (a sample) is
selected for study from a larger group (a
population). Each individual is chosen entirely by
chance and each member of the population has an
equal chance of being included in the sample. Every
possible sample of a given size has the same chance
of selection; i.e. each member of the population is
equally likely to be chosen at any stage in the
sampling process.
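As an illustration only, a simple random sample can be drawn with Python's standard `random` module. The population of 100 numbered subjects and the sample size of 10 are assumptions made for this sketch, and `simple_random_sample` is a hypothetical helper name.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(population, n)

# Assumed population: subjects numbered 1 to 100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```

`random.sample` selects without replacement, so no subject appears twice in the sample.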
PROBABILITY SAMPLING
STRATIFIED SAMPLING
A stratified sample is obtained by taking
samples from each stratum or sub-group
of a population. When a sample is to be
taken from a population with several
strata, the proportion of each stratum in
the sample should be the same as in the
population
PROBABILITY SAMPLING
STRATIFIED SAMPLING
For example, you have three sub-groups with population
sizes of 150, 200 and 250 subjects respectively.
Now, to make it proportionate, the researcher applies one
specific fraction or percentage to each subgroup of the
population. With a fraction of 0.5, the samples would be
150*0.5 = 75, 200*0.5 = 100 and 250*0.5 = 125. Here the
constant factor is the proportion ratio for each population
subset.
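The same proportionate allocation can be sketched in Python. The stratum labels A, B and C, the numbered members, and the helper name `stratified_sample` are assumptions for illustration; only the sizes 150, 200, 250 and the fraction 0.5 come from the example.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction from every stratum by simple random sampling."""
    rng = random.Random(seed)
    result = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)  # sample size for this stratum
        result[name] = rng.sample(members, k)
    return result

# Hypothetical strata with the sizes used in the example.
strata = {
    "A": list(range(150)),
    "B": list(range(200)),
    "C": list(range(250)),
}

sample = stratified_sample(strata, 0.5, seed=1)
sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes)  # {'A': 75, 'B': 100, 'C': 125}
```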
PROBABILITY SAMPLING
CLUSTER SAMPLING
Cluster sampling is a sampling technique where the entire
population is divided into groups, or clusters, and a random
sample of these clusters is selected. All observations in
the selected clusters are included in the sample.
PROBABILITY SAMPLING
CLUSTER SAMPLING
In cluster sampling, researchers divide a
population into smaller groups known as
clusters. They then randomly select among
these clusters to form a sample.
Cluster sampling is often used to study large
populations, particularly those that are widely
geographically dispersed. Researchers usually
use pre-existing units such as schools or cities
as their clusters.
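A minimal sketch of one-stage cluster sampling follows, assuming schools as the pre-existing clusters. The school names, student labels, and the function `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # choose clusters, not units
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical population grouped by school.
clusters = {
    "School A": ["A1", "A2", "A3"],
    "School B": ["B1", "B2"],
    "School C": ["C1", "C2", "C3", "C4"],
    "School D": ["D1", "D2", "D3"],
}

chosen, units = cluster_sample(clusters, 2, seed=0)
print(chosen)
print(units)
```

The randomness applies to which clusters are picked; once a cluster is chosen, all of its members enter the sample.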
III. PLANNING AND CONDUCTING EXPERIMENTS
EXPERIMENT
is a series of tests conducted in a
systematic manner to increase the
understanding of an existing
process or to explore a new
product or process
Design of Experiments, or DOE
is a tool to develop an experimentation
strategy that maximizes learning using
minimum resources. Design of
Experiments is widely and extensively
used by engineers and scientists in
improving existing process through
maximizing the yield and decreasing
the variability or in developing new
products and processes.
It is a technique needed to identify the "vital few"
factors in the most efficient manner and then direct the
process to its best setting to meet the ever-increasing
demand for improved quality and increased productivity.
Methodology of DOE
ensures that all factors and their interactions are
systematically investigated, resulting in reliable and
complete information.
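One common way to investigate every combination of factor settings is a two-level full factorial design. The slides do not name a specific design, so the sketch below is only an assumed illustration; the factor names and the coded levels -1 and +1 are hypothetical.

```python
from itertools import product

def full_factorial(factors):
    """List every combination of factor levels, one dictionary per run."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, levels)) for levels in product(*level_lists)]

# Hypothetical factors, each at a low (-1) and a high (+1) coded level.
factors = {
    "temperature": [-1, +1],
    "pressure": [-1, +1],
    "time": [-1, +1],
}

runs = full_factorial(factors)
print(len(runs))  # 8
for run in runs:
    print(run)
```

With three factors at two levels each the design has 2 x 2 x 2 = 8 runs, and the run count doubles with every added two-level factor.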
Five Stages of Methodology of DOE

1. Planning
2. Screening
3. Optimization
4. Robustness Testing
5. Verification
1. Planning
It is important to carefully plan for the course of
experimentation before embarking upon the process of
testing and data collection. This stage involves
identifying the objectives of the experiment or
investigation and assessing the time and resources
available to achieve them. Individuals from
different disciplines related to the product or process
should compose a team who will conduct the
investigation. Well planned experiments are easy to
execute and analyze using the available statistical
software.
2. Screening
Screening experiments are used to identify the
important factors that affect the process
under investigation out of the large pool of
potential factors. Screening process eliminates
unimportant factors and attention is focused
on the key factors. Screening experiments are
usually efficient designs which require few
executions and focus on the vital factors and
not on interactions.
3. Optimization
After narrowing down the important factors
affecting the process, then determine the best
setting of these factors to achieve the
objectives of the investigation. The objectives
may be to either increase yield or decrease
variability or to find settings that achieve both
at the same time depending on the product or
process under investigation
Optimization is an act, process, or methodology of
making something (such as a design, system, or decision)
as fully perfect, functional, or effective as possible;
specifically, it is the mathematical procedures (such as
finding the maximum of a function) involved in this.
4. Robustness Testing
A robust statistic is resistant to errors in the results.
Once the optimal settings of the factors have been
determined, it is important to make the product or
process insensitive to variations resulting from
changes in factors that affect the process but are
beyond the control of the analyst. Such factors are
referred to as noise or uncontrollable factors that are
likely to be experienced in the application
environment. It is important to identify such sources
of variation and take measures to ensure that the
product or process is made robust or insensitive to
these factors.
5. Verification
A process in which different types of data
are checked for accuracy and
inconsistencies after data migration is
done.
This final stage involves validation of the
optimum settings by conducting a few
follow-up experimental runs. This is to
confirm that the process functions as
expected and all objectives are achieved.
THANK YOU FOR
LISTENING!
Part I
Probability

For example, the probability of flipping a coin and it being heads is ½,
because there is 1 way of getting a head and the total number of possible
outcomes is 2 (a head or tail). We write P(heads) = ½.

The probability of something which is certain to happen is 1.

The probability of something which is impossible to happen is 0.

The probability of something not happening is 1 minus the probability
that it will happen.
Introduction
- Experiment
- Outcome
- Sample Space
- Event
Experiment
- is a process of investigation from which results are observed or recorded.
Outcome
- is a possible result of an experiment.
Sample Space
- is the set of all possible outcomes for a probability experiment or activity.
It is usually denoted by the letter S.
Event
- is a subset of the sample space of an experiment, that is, a set of outcomes.
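These four terms can be illustrated with a small Python sketch. The experiment chosen here (tossing a fair coin twice) and the event "at least one head" are assumptions for illustration, and the outcomes are assumed to be equally likely.

```python
from fractions import Fraction
from itertools import product

# Experiment: toss a coin twice. Each outcome is a pair such as ('H', 'T').
sample_space = set(product("HT", repeat=2))  # S has 4 outcomes

# Event: the subset of S in which at least one head appears.
event = {outcome for outcome in sample_space if "H" in outcome}

# With equally likely outcomes, P(event) = |event| / |S|.
p_event = Fraction(len(event), len(sample_space))
p_not_event = 1 - p_event  # probability that the event does not happen

print(sorted(sample_space))
print(p_event, p_not_event)  # 3/4 1/4
```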
Activity
List the sample space for the following experiment
Any Questions?
MATH 403 - Lecture 3
Discrete Probability
Engr. Donita B. Reyes

Intended Learning Outcomes
1. Determine probabilities from probability mass functions.
2. Determine probabilities from cumulative functions and
cumulative distribution functions from probability mass
functions.
3. Calculate means and variances for discrete random
variables.
4. Understand the assumptions for each of the discrete
probability distributions presented.
5. Select an appropriate discrete probability distribution to
calculate probabilities in specific applications.
6. Calculate probabilities, determine means and variances for
each of the discrete probability distributions presented
Discrete Probability
Distribution
A discrete distribution describes the
probability of occurrence of each value
of a discrete random variable. A
discrete random variable is a random
variable that has countable values,
such as a list of non-negative integers.
With a discrete probability distribution,
each possible value of the discrete
random variable can be associated
with a non-zero probability. Thus, a
discrete probability distribution is
often presented in tabular form.
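As a sketch of such a tabular form, the probability mass function below is stored as a Python dictionary. The random variable (number of heads in two tosses of a fair coin) is an assumed example, not one taken from the slides.

```python
from fractions import Fraction

# Assumed example: X = number of heads in two tosses of a fair coin.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# A valid probability mass function sums to 1.
total = sum(pmf.values())

# Mean and variance of a discrete random variable.
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

def cdf(x):
    """Cumulative distribution function: P(X <= x)."""
    return sum(p for value, p in pmf.items() if value <= x)

print(total, mean, variance, cdf(1))  # 1 1 1/2 3/4
```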
Topics to be discussed;
Example:
When an unbiased coin
is tossed seven times ,
what is the probability
of obtaining exactly 4
heads?
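One way to work this example is with the binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k), taking n = 7 tosses, k = 4 heads and p = 1/2 for an unbiased coin. The slide does not show a solution, so the sketch below is offered only as a check; `binomial_pmf` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# 7 tosses of an unbiased coin, exactly 4 heads.
p_four_heads = binomial_pmf(4, 7, Fraction(1, 2))
print(p_four_heads, float(p_four_heads))  # 35/128 0.2734375
```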
Example:
A teacher developed a 5-item multiple-choice test
with four options in each item. What is the
probability that a certain student who randomly
selects his answers will get exactly 4
correct answers?
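Assuming each item has exactly one correct option and the guesses are independent, this example fits a binomial model with n = 5 and p = 1/4. The sketch below tabulates the whole distribution so the requested value can be read off; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n = 5               # number of items
p = Fraction(1, 4)  # chance of guessing one item correctly (1 of 4 options)

# Probability of exactly k correct answers, for every k from 0 to n.
distribution = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

p_exactly_four = distribution[4]
print(p_exactly_four, float(p_exactly_four))  # 15/1024 0.0146484375
```

The probabilities over all k from 0 to 5 sum to 1, which is a quick sanity check on the model.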
questions?
Thank you for listening!
