Lecture 1
403 ENGINEERING DATA ANALYSIS
Introduction
Statistics
defined as the science that deals with the collection, organization,
presentation, analysis, and interpretation of data in order to be
able to draw judgments or conclusions that help in the
decision-making process.
Descriptive Statistics
deals with the procedures that organize, summarize, and describe
quantitative data. It seeks merely to describe data.
Inferential Statistics
deals with making a judgment or a conclusion about a population
based on the findings from a sample that is taken from the population.
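The difference between the two branches can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the measurements are made-up values, and the standard error of the mean is used only as one simple example of an inferential quantity.

```python
import math
import statistics

# Hypothetical sample of 8 measurements (made-up values for illustration).
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]

# Descriptive statistics: organize and summarize the data at hand.
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

# Inferential statistics: use the sample to say something about the population.
# The standard error estimates how far the sample mean may be from the
# population mean; it is a basic ingredient of interval estimates.
std_error = stdev / math.sqrt(len(sample))

print(f"mean={mean:.3f} median={median:.3f} stdev={stdev:.3f} se={std_error:.3f}")
```

The first three values only describe the eight numbers collected; the standard error is a statement about the population the sample came from.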
https://www.youtube.com/watch?v=0VDafmUys04
Intended Learning Outcomes
At the end of this module, it is expected that the students will be able to:
Population or Universe
refers to the totality of objects, persons, places, or
things used in a particular study: all members of a
particular group of objects (items) or people
(individuals), etc., which are the subjects or respondents
of a study.
Sample
Constant
is a characteristic or property of a population or sample which is common to all
members of the group.
Variable
A variable is any characteristic, number, or quantity that
can be measured or counted. A variable may also be
called a data item. Age, sex, business income and
expenses, country of birth, capital expenditure, class
grades, eye colour and vehicle type are examples
of variables.
Methods of Obtaining Data
• Collection of the data is the first step in conducting a statistical inquiry. It simply refers to
data gathering, a systematic method of collecting and measuring data from different sources
of information in order to provide answers to relevant questions.
• This involves acquiring information from published literature, surveys through questionnaires or
interviews, experimentation, documents and records, tests or examinations, and other forms
of data gathering instruments.
Persons involved in a statistical inquiry:
• the person who conducts the inquiry
• the one who helps in collecting information
• the ones from whom information is collected
DATA
Primary (raw data)
According to Wessel, “Data collected in the process of investigation are known as
primary data.” These are collected for the investigator’s use from the primary source.
Secondary
Secondary data, on the other hand, is collected by some other organization for their own
use, but the investigator also gets it for his use. According to M.M. Blair, “Secondary data
are those already in existence for some other purpose than answering the question in hand.”
Tell whether each item is Primary or Secondary Data:
• biographies
• dictionary
• diary
• surveys
• photographs
• tax records
• books
• dissertations
• reports
• experiments
• letters
• questionnaire
• interview
• Internet articles
• political commentary
• journals
In Engineering, there are three basic methods of collecting data
4. Decide what questions you will ask in what order, and how to phrase them.
Sampling
Sampling is the process of selecting units (e.g., people, organizations) from a
population of interest.
A sample must be representative of the target population. The target population is the entire
group a researcher is interested in; the group about which the researcher wishes to
draw conclusions.
Two ways of selecting a sample.
Probability sampling
Probability sampling is defined as a sampling technique in which the researcher chooses samples
from a larger population using a method based on the theory of probability. For a participant to be
considered part of a probability sample, he/she must be selected using random selection.
The most critical requirement of probability sampling is that everyone in your population has a
known, nonzero chance of getting selected; in the simplest case that chance is equal for everyone.
For example, if you have a population of 100 people and select one at random, every person would
have odds of 1 in 100 of getting selected. Probability sampling gives you the best chance to create a
sample that is truly representative of the population.
Non-probability sampling
It is also called judgment or subjective sampling. This method is convenient and economical, but the
inferences made based on the findings are not so reliable.
It is a sampling method in which not all members of the population have an equal chance of
participating in the study, unlike probability sampling, in which each member of the population has a
known chance of being selected. Non-probability sampling is most useful for exploratory studies
like a pilot survey (deploying a survey to a smaller sample compared to the pre-determined sample
size). Researchers use this method in studies where it is impossible to draw a random probability
sample due to time or cost considerations.
Non-probability sampling: Convenience Sampling, Purposive Sampling, Quota Sampling
Convenience Sampling
The researcher uses a device in obtaining the information from the respondents which
favors the researcher but can cause bias among the respondents.
It means collecting a sample of whichever participants are easiest to reach.
Purposive Sampling
The selection of respondents is predetermined according to the characteristic of interest made
by the researcher. Randomization is absent in this type of sampling.
The participants are selected based on the purpose of the sample, hence the name.
Participants are selected according to the needs of the study (hence the alternate name,
deliberate sampling); applicants who do not meet the profile are rejected.
Quota Sampling
Proportional
In proportional quota sampling, the major characteristics of the population are represented
by sampling a proportional amount of each.
• For example, imagine you want to create a council of 20 employees that will meet and
recommend possible changes to the employee handbook. Let's say 40% of your
employees are in Sales and Marketing, 30% in Customer Service, 20% in IT, and 10% in
Finance. You will select 8 people from Sales and Marketing, 6 from Customer Service,
4 from IT, and 2 from Finance. As you can see, each number you pick is proportionate to
the overall percentage of people in each category (e.g., 40% = 8 people).
20 employees:
40% Sales and Marketing = 8
30% Customer Service = 6
20% IT = 4
10% Finance = 2
7000 male = 70%, 3000 female = 30%
Non-Proportional
Non-proportional quota sampling is a bit less restrictive. In this method, a minimum number of
sampled units in each category is specified, and the researcher is not concerned with having
numbers that match the proportions in the population.
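The arithmetic of the proportional quota example (a council of 20 employees) can be sketched as follows. The function name `proportional_quotas` is hypothetical and chosen only for illustration; the shares and council size come from the example itself.

```python
def proportional_quotas(shares, sample_size):
    """Return how many units to take from each category.

    shares: mapping of category name -> fraction of the population.
    Assumes each share * sample_size is (close to) a whole number,
    as in the council example; otherwise rounding may not add up exactly.
    """
    return {name: round(share * sample_size) for name, share in shares.items()}


# Department shares from the council example.
shares = {
    "Sales and Marketing": 0.40,
    "Customer Service": 0.30,
    "IT": 0.20,
    "Finance": 0.10,
}

quotas = proportional_quotas(shares, sample_size=20)
print(quotas)
```

The sketch only computes the size of each quota. How the people inside each quota are picked is a separate decision, and in quota sampling that choice is typically not random.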
Probability Sampling: Simple Random Sampling, Stratified Sampling, Cluster Sampling
Simple Random Sampling
Simple random sampling is the basic sampling technique where a group of subjects (a sample)
is selected for study from a larger group (a population). Each individual is chosen entirely
by chance and each member of the population has an equal chance of being included in the
sample. Every possible sample of a given size has the same chance of selection; i.e. each
member of the population is equally likely to be chosen at any stage in the sampling process.
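A simple random sample can be drawn with Python's standard `random` module. This is a minimal sketch: the population of 100 numbered members and the sample size of 10 are assumptions made for illustration, and the fixed seed is there only so that the draw can be repeated.

```python
import random

# Hypothetical population: members numbered 1 to 100.
population = list(range(1, 101))

# A seeded generator makes the draw reproducible (an assumption for the demo;
# drop the seed for a fresh draw each run).
rng = random.Random(42)

# random.sample draws without replacement, so no member appears twice and
# every possible group of 10 members is equally likely.
sample = rng.sample(population, k=10)

print(sorted(sample))
```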
Stratified Sampling
A stratified sample is obtained by taking samples from each stratum or sub-group of a population.
When a sample is to be taken from a population with several strata, the proportion of each
stratum in the sample should be the same as in the population.
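Proportional stratified sampling can be sketched as below. The function `stratified_sample`, the two strata "A" and "B", and their sizes are all hypothetical, chosen so that the proportions work out to whole numbers.

```python
import random


def stratified_sample(strata, sample_size, seed=0):
    """Draw a proportional stratified sample.

    strata: mapping of stratum name -> list of members.
    Each stratum contributes in proportion to its share of the population.
    Rounding can make the total differ slightly from sample_size when the
    shares do not divide evenly.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    picked = {}
    for name, members in strata.items():
        k = round(sample_size * len(members) / total)
        picked[name] = rng.sample(members, k)  # random draw within the stratum
    return picked


# Hypothetical population: 70 members in stratum A, 30 in stratum B.
strata = {
    "A": [f"A{i}" for i in range(70)],
    "B": [f"B{i}" for i in range(30)],
}

picked = stratified_sample(strata, sample_size=10)
print({name: len(members) for name, members in picked.items()})
```

With a 70/30 population and a sample of 10, stratum A contributes 7 members and stratum B contributes 3, so the sample keeps the population's proportions.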
III. Planning and Conducting Experiments
Experiment
is a series of tests conducted in a systematic manner
to increase the understanding of an existing process
or to explore a new product or process
Five stages of Methodology of DOE
• Planning
• Screening
• Optimization
• Robustness testing
• Verification
1. Planning
2. Screening
Screening experiments are used to identify the important factors that affect the process
under investigation out of the large pool of potential factors. The screening process eliminates
unimportant factors so that attention is focused on the key factors. Screening experiments are
usually efficient designs which require few executions and focus on the vital factors and not
on interactions.
3. Optimization
After narrowing down the important factors affecting the process, then determine the best
setting of these factors to achieve the objectives of the investigation. The objectives may be
to either increase yield or decrease variability or to find settings that achieve both at the
same time depending on the product or process under investigation
Optimization is an act, process, or methodology of making something (such as a design, system, or
decision) as fully perfect, functional, or effective as possible; specifically, it is the
mathematical procedures (such as finding the maximum of a function) involved in this.
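As a rough illustration of "finding the maximum of a function", the sketch below searches a small grid of factor settings for the one with the highest response. Everything in it is an assumption made for illustration: the two factors (temperature and pressure), the candidate levels, and the formula in `predicted_yield` are invented, and a real study would fit the response from experimental data.

```python
def predicted_yield(temperature, pressure):
    """Hypothetical response function (made up for illustration).

    It peaks at temperature = 150 and pressure = 3.
    """
    return 90 - 0.01 * (temperature - 150) ** 2 - 2 * (pressure - 3) ** 2


temperatures = range(100, 201, 10)  # candidate settings for factor 1
pressures = [1, 2, 3, 4, 5]         # candidate settings for factor 2

# Evaluate every combination of settings and keep the one with the highest yield.
best_setting = max(
    ((t, p) for t in temperatures for p in pressures),
    key=lambda setting: predicted_yield(*setting),
)

print(best_setting, predicted_yield(*best_setting))
```

A grid search is the simplest possible approach and only works when the number of key factors is small, which is one reason screening comes before optimization.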
4. Robustness Testing
ACTIVITY
As one of the students of the EDA class, you are tasked to conduct a survey to show which
extracurricular activities the students from the College of Engineering, Architecture and Fine
Arts would like to engage in during the first semester. Follow the presented steps in conducting a
survey. (Steps are in slide #12)