
ENGINEERING

DATA ANALYSIS
LEARNING
MODULE
SAMPLING AND DATA ANALYSIS
Learning objectives
·Sampling and Data-gathering techniques
·Determine the sample size from a given population.
·Explain the meaning of the margin of error used in research.
·Slovin’s Formula
·Differentiate probability from non-probability sampling.
·Use different data-gathering techniques.
·Use different sampling techniques.
SAMPLING AND DATA ANALYSIS
SAMPLING – SAMPLING INVOLVES
SELECTING A SUBSET OF INDIVIDUALS
FROM A LARGER POPULATION TO
PARTICIPATE IN A STUDY OR ANSWER
A QUESTIONNAIRE. THE GOAL IS TO
CHOOSE A REPRESENTATIVE SAMPLE
THAT ACCURATELY REFLECTS THE
CHARACTERISTICS AND OPINIONS OF
THE ENTIRE POPULATION BEING
STUDIED. THIS ENSURES THAT THE
DATA COLLECTED IS RELIABLE AND
CAN BE USED TO DRAW VALID
CONCLUSIONS.
KEY WORDS IN UNDERSTANDING SAMPLING AND DATA ANALYSIS

POPULATION – REFERS TO THE LARGE GROUP OF PEOPLE FROM WHOM YOU WILL
CHOOSE THE SAMPLE THAT WILL REPRESENT THAT POPULATION.

SAMPLING FRAME – THE LIST OF PEOPLE OR MEMBERS OF THE POPULATION TO
WHOM YOU WILL GENERALIZE OR APPLY YOUR FINDINGS ABOUT THE SAMPLE.

SAMPLING UNIT – REFERS TO EVERY INDIVIDUAL IN THE POPULATION.

THE RESULT COMING FROM THE SAMPLE IS EXPECTED TO GENERALIZE TO THE
POPULATION ITSELF. HOWEVER, THE MATCH IS NOT EXPECTED TO BE 100 PERCENT,
WHICH IS WHY THE MARGIN OF ERROR (MOE) EXISTS.
FACTORS THAT AFFECT SAMPLING SELECTION
IN CHOOSING YOUR
RESPONDENT, YOU DO NOT
JUST LISTEN TO THE DICTATES
OF YOUR MIND BUT ALSO TO
THE OTHER FACTORS SUCH AS
THE FOLLOWING “BABBLE 2013,
EDWARD 2013 TUCKMAN AND
ENGEL 2012”
SAMPLE SIZE – THE NUMBER OF PARTICIPANTS IN THE STUDY CAN
GREATLY AFFECT THE FAIRNESS AND CREDIBILITY OF THE RESULT.

SAMPLING TECHNIQUE – THIS HAS TWO CATEGORIES, PROBABILITY SAMPLING
AND NON-PROBABILITY SAMPLING.
-PROBABILITY SAMPLING – PURE CHANCE (RANDOM SAMPLING).
-NON-PROBABILITY SAMPLING – PURPOSIVE OR CONTROLLED SELECTION
OF PARTICIPANTS.

HETEROGENEITY OF THE POPULATION – REFERS TO THE
INDIVIDUAL'S VARIED ABILITIES.

STATISTICAL TECHNIQUES – THE ACCURACY OF THE
RESULT OF THE STUDY IS DEPENDENT ON HOW CLEAR
AND PRECISE THE CALCULATION IS ON THE COLLECTED
DATA.

TIME AND COST – CHOOSING THE RIGHT SAMPLE FROM
THE GIVEN POPULATION TO STUDY TAKES TIME,
EFFORT, AND MONEY TO MANAGE THE SAID SAMPLES
EFFECTIVELY AND EFFICIENTLY.
DETERMINING THE SAMPLE SIZE FROM A GIVEN POPULATION

KNOWING THE CORRECT NUMBER OF SAMPLES OR
REPRESENTATIVES FROM THE GIVEN POPULATION IS ONE OF
THE MAJOR KEYS TO HAVING A PRECISE RESULT IN OUR STUDY.
TO DO THIS, WE MUST FIRST KNOW THE MEANING OF THE
FOLLOWING KEYWORDS.
CONFIDENCE LEVEL – REFERS TO THE MEASURE OF
RELIABILITY OF THE ESTIMATE THAT IS PRODUCED BY
A STATISTICAL ANALYSIS. EXPRESSED AS A
PERCENTAGE, IT REPRESENTS HOW OFTEN THE ESTIMATE
IS EXPECTED TO CAPTURE THE TRUE POPULATION
PARAMETER IF THE STUDY WERE REPEATED.
THE CONFIDENCE LEVEL YOU CHOOSE DEPENDS ON THE
LEVEL OF CERTAINTY YOU WANT IN YOUR ESTIMATE. THE
MORE COMMON CHOICES ARE 90%, 95%, AND 99%. THE
HIGHER THE CONFIDENCE LEVEL, THE LARGER THE
SAMPLE SHOULD BE. EACH PERCENTAGE IS
REPRESENTED BY A SPECIFIC VALUE KNOWN AS THE
Z-SCORE.
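
As a rough illustration of how each confidence level maps to a z-score, the sketch below uses Python's standard library. The function name `z_score` is a hypothetical helper, and it assumes a two-sided interval based on the standard normal distribution.

```python
from statistics import NormalDist


def z_score(confidence_level: float) -> float:
    """Two-sided z-score for a confidence level given as a fraction (e.g. 0.95)."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # Split the leftover probability equally between the two tails.
    upper_tail = (1.0 - confidence_level) / 2.0
    # inv_cdf returns the point below which the given probability lies.
    return NormalDist().inv_cdf(1.0 - upper_tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {z_score(level):.3f}")
```

Rounded to two or three decimals, these are the familiar values 1.645, 1.96, and 2.576.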
MARGIN OF ERROR – REFERS TO THE MEASURE OF UNCERTAINTY WITHIN THE
RESULT OF THE STUDY/RESEARCH. EXPRESSED AS A PERCENTAGE, IT REPRESENTS THE
RANGE AROUND THE SAMPLE RESULT WITHIN WHICH THE POPULATION PARAMETER IS
EXPECTED TO FALL. THE LOWER THE MARGIN OF ERROR, THE MORE PRECISE THE
RESULT IS.

STANDARD DEVIATION – IS THE MEASURE OF VARIATION AND DISPERSION OF THE
VALUES FROM A GIVEN SET OF DATA. IT SHOWS HOW SPREAD OUT THE VALUES ARE
FROM THE MEAN. IF THE STANDARD DEVIATION IS UNKNOWN, A VALUE OF 0.5 IS
COMMONLY USED AS A SUBSTITUTE.
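
To make the idea of spread around the mean concrete, here is a small sketch using Python's `statistics` module. The data values are made up purely for illustration.

```python
from statistics import mean, pstdev, stdev

# Hypothetical measurements, invented for this example.
heights_cm = [150, 160, 170, 180, 190]

average = mean(heights_cm)
# pstdev treats the data as the whole population (divides by N).
population_sd = pstdev(heights_cm)
# stdev treats the data as a sample from a larger population (divides by N - 1).
sample_sd = stdev(heights_cm)

print(f"mean = {average}")
print(f"population standard deviation = {population_sd:.2f}")
print(f"sample standard deviation = {sample_sd:.2f}")
```

The sample version is slightly larger because it divides by one less than the number of values.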

POPULATION SIZE – IT IS IMPORTANT TO KNOW THE POPULATION SIZE IN ORDER TO
ACCURATELY ESTIMATE THE NUMBER OF RESPONDENTS NEEDED, WHICH WILL YIELD A
MORE PRECISE AND ACCURATE RESULT OF THE STUDY.
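TO FIND THE NUMBER OF PARTICIPANTS NEEDED FROM A GIVEN POPULATION,
THIS FORMULA IS USED.

SAMPLE SIZE = (Z-SCORE)^2 × STDDEV × (1 − STDDEV) / (MARGIN OF ERROR)^2

HERE STDDEV IS ENTERED AS A PROPORTION (0.5 WHEN UNKNOWN) AND THE MARGIN
OF ERROR IS ENTERED AS A DECIMAL (FOR EXAMPLE, 0.05 FOR 5%).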
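
The formula above can be sketched in Python as follows. The function name `sample_size` and its parameter names are hypothetical. The parameter `p` stands for the value the module labels STDDEV, entered as a proportion.

```python
import math


def sample_size(z: float, p: float, margin_of_error: float) -> float:
    """Unrounded sample size from z-score, proportion p, and margin of error."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if margin_of_error <= 0:
        raise ValueError("margin_of_error must be positive")
    return (z ** 2) * p * (1.0 - p) / (margin_of_error ** 2)


# 95% confidence (z = 1.96), p = 0.5, margin of error of 5%.
n = sample_size(z=1.96, p=0.5, margin_of_error=0.05)
print(round(n, 2))   # 384.16
print(math.ceil(n))  # 385 if you round up to be conservative
```

The function returns the unrounded value so that the rounding choice stays visible. Rounding up guarantees that the target margin of error is met, while rounding to the nearest whole number gives 384.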


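MARGIN OF ERROR IN RESEARCH

AS DISCUSSED ABOVE, THE MARGIN OF ERROR, ALSO
KNOWN AS MARGINAL ERROR, IS THE STATISTICAL
AMOUNT OF ERROR, USUALLY EXPRESSED AS A
PERCENTAGE, WITHIN THE RESULT OF THE SURVEY. IT
TELLS HOW CLOSE YOUR SURVEY RESULT IS EXPECTED
TO BE TO THE TRUE VALUE FOR THE POPULATION.
IT IS CALCULATED USING THE FORMULA:

MOE = z × (σ / √n)

WHERE z IS THE Z-SCORE OF THE CHOSEN CONFIDENCE LEVEL,
σ IS THE STANDARD DEVIATION, AND n IS THE SAMPLE SIZE.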
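
A minimal sketch of the margin of error formula is shown below. The function name `margin_of_error` and the example inputs are assumptions chosen for illustration, not values taken from the module.

```python
import math


def margin_of_error(z: float, sigma: float, n: int) -> float:
    """Margin of error: z-score times the standard deviation over sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return z * sigma / math.sqrt(n)


# Hypothetical inputs: 95% confidence, standard deviation 0.5.
for n in (100, 400, 1600):
    print(f"n = {n:5d} -> MOE = {margin_of_error(1.96, 0.5, n):.4f}")
```

Because the sample size sits under a square root, quadrupling the sample only halves the margin of error.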
SLOVIN'S FORMULA

SLOVIN'S FORMULA IS USED TO DETERMINE THE MINIMUM SAMPLE SIZE FROM A GIVEN
POPULATION WHEN THE POPULATION IS FINITE AND ITS SIZE IS KNOWN.

IT IS CALCULATED USING THE FORMULA:

n = N / (1 + N e^2)

WHERE:

n IS THE SAMPLE SIZE NEEDED.
N IS THE POPULATION SIZE.
e IS THE MARGIN OF ERROR.
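
Slovin's formula translates directly into a short function. In this sketch the name `slovin_sample_size` is hypothetical, the margin of error is assumed to be given as a decimal, and the result is rounded up to the next whole respondent.

```python
import math


def slovin_sample_size(population_size: int, margin_of_error: float) -> int:
    """Minimum sample size by Slovin's formula, rounded up to a whole unit."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if margin_of_error <= 0:
        raise ValueError("margin_of_error must be positive")
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)


# Hypothetical example: 1,000 people, 5% margin of error.
print(slovin_sample_size(1000, 0.05))  # 286
```

Rounding up is a design choice here, since a fractional respondent cannot be surveyed and rounding down would fall short of the minimum.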
SAMPLE PROBLEMS

1 DETERMINE THE SAMPLE SIZE
NECESSARY TO ESTIMATE THE
PROPORTION OF PEOPLE SHOPPING AT A
SUPERMARKET IN THE U.S. WHO IDENTIFY
AS VEGAN WITH 95% CONFIDENCE AND A
MARGIN OF ERROR OF 5%. ASSUME A
POPULATION PROPORTION OF 0.5 AND AN
UNLIMITED POPULATION SIZE. REMEMBER
THAT Z FOR A 95% CONFIDENCE LEVEL IS
1.96. REFER TO THE TABLE PROVIDED IN
THE CONFIDENCE LEVEL SECTION FOR Z
SCORES OF A RANGE OF CONFIDENCE
LEVELS.

n = (1.96)^2 × 0.5 × (1 − 0.5) / (0.05)^2
n = 384.16 OR n APPROXIMATELY EQUAL TO 384.


2 SUPPOSE A BOTANIST WANTS TO ESTIMATE THE MEAN
HEIGHT OF A CERTAIN SPECIES OF PLANT IN SOME REGION.
SUPPOSE SHE KNOWS THERE ARE 500 OF THESE PLANTS IN THE
REGION AND IT WOULD TAKE FAR TOO LONG TO MEASURE
EACH PLANT, SO SHE WOULD INSTEAD LIKE TO TAKE A
RANDOM SAMPLE OF PLANTS. ASSUME THAT SHE WOULD LIKE
TO ESTIMATE THIS MEAN WITH A MARGIN OF ERROR OF 0.02 OR
LESS. FIND THE SAMPLE SIZE NEEDED USING SLOVIN'S FORMULA.

n = N / (1 + N e^2)
n = 500 / (1 + 500(0.02)^2)
n = 416.667 OR n APPROXIMATELY EQUAL TO 417.
THE DIFFERENCE BETWEEN PROBABILITY AND NON-
PROBABILITY SAMPLING, AND SAMPLING TECHNIQUE.

SAMPLING HAS TWO CATEGORIES. THE FIRST IS
PROBABILITY SAMPLING, AND THE SECOND IS NON-
PROBABILITY SAMPLING. EACH OF THESE TWO IS
DIFFERENT FROM THE OTHER, AND EITHER MAY BE USED IN
RESEARCH TO GET BETTER AND MORE ACCURATE
RESULTS THAT ALIGN WITH YOUR RESEARCH PURPOSE.
NOW LET'S DIFFERENTIATE THE TWO.
1 PROBABILITY SAMPLING – THIS IS A SAMPLING
METHOD IN WHICH THE SELECTION OF
RESPONDENTS FOR THE SAMPLE IS PURELY
BASED ON CHANCE. IT IS NOT CONTROLLED BY
THE RESEARCHER.

• EVERYBODY IN THE POPULATION HAS A
CHANCE TO PARTICIPATE.

• IT EXCLUDES THE RESEARCHER'S JUDGMENT.


THE FOLLOWING ARE THE SAMPLING TECHNIQUES UNDER PROBABILITY SAMPLING.

A SIMPLE RANDOM SAMPLING – IS A STRAIGHTFORWARD, UNBIASED WAY OF SAMPLING FROM A
LARGER GROUP OR POPULATION, IN WHICH EVERY MEMBER HAS AN EQUAL CHANCE OF BEING CHOSEN.

B SYSTEMATIC SAMPLING – IS A MORE STRUCTURED WAY OF SAMPLING. USUALLY, IT
INVOLVES ASSIGNING EACH MEMBER OF THE POPULATION A NUMBER, AND THEN THE SAMPLE
IS CHOSEN AT A REGULAR INTERVAL ACCORDING TO THE ASSIGNED NUMBERS (FOR EXAMPLE,
EVERY 10TH MEMBER).

C STRATIFIED SAMPLING – IT INVOLVES DIVIDING THE POPULATION INTO SUBGROUPS, ALSO CALLED
STRATA, BASED ON CHARACTERISTICS THAT ARE RELEVANT TO THE RESEARCH, AND THEN
RANDOMLY CHOOSING FROM EACH SUBGROUP THE MEMBERS WHO WILL REPRESENT IT.
IT IS USUALLY USEFUL WHEN THERE ARE KNOWN FACTORS THAT CAN AFFECT THE RESULT.

D CLUSTER SAMPLING – RATHER THAN DIVIDING THE POPULATION INTO SUBGROUPS WHOSE
MEMBERS ARE SIMILAR TO EACH OTHER, IN THIS METHOD EACH SUBGROUP (CLUSTER)
RESEMBLES THE POPULATION AS A WHOLE, AND YOU CHOOSE THE SAMPLE BY RANDOMLY
CHOOSING ENTIRE SUBGROUPS.
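
The four probability sampling techniques above can be sketched side by side in Python. All function names, the numbered population, and the odd/even strata are hypothetical and chosen only for illustration. A seeded `random.Random` is assumed so that a run can be reproduced.

```python
import random
from collections import defaultdict


def simple_random_sample(population, n, rng):
    """Every member has an equal chance; drawn without replacement."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Pick a random start, then take every k-th member of the numbered list."""
    if not 0 < n <= len(population):
        raise ValueError("n must be between 1 and the population size")
    k = len(population) // n   # sampling interval
    start = rng.randrange(k)   # random start within the first interval
    return [population[start + i * k] for i in range(n)]


def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Group members into strata, then draw randomly from each stratum."""
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


rng = random.Random(42)            # fixed seed for reproducibility
population = list(range(1, 101))   # hypothetical numbered sampling frame
clusters = {f"block_{b}": population[b * 20:(b + 1) * 20] for b in range(5)}

print(simple_random_sample(population, 10, rng))
print(systematic_sample(population, 10, rng))
print(stratified_sample(population, lambda u: "odd" if u % 2 else "even", 5, rng))
print(cluster_sample(clusters, 2, rng))
```

Stratified sampling draws a few members from every subgroup, while cluster sampling keeps all members of a few randomly chosen subgroups.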
2 NON-PROBABILITY SAMPLING – IN THIS
METHOD, CHOOSING THE SAMPLE IS NOT
RANDOM OR BY CHANCE. INSTEAD, THE SAMPLE IS
MOSTLY CONTROLLED OR PURPOSELY CHOSEN.

• IT IS NOT RANDOM, AND IT IS SUSCEPTIBLE TO BIAS.

• THE LIKES, WISHES, AND INTENTIONS OF THE
RESEARCHER ARE MET IN THIS METHOD.
THE FOLLOWING ARE THE SAMPLING TECHNIQUES UNDER NON-PROBABILITY SAMPLING.

A QUOTA SAMPLING – IS A METHOD WHERE THE RESEARCHER SETS
QUOTAS BASED ON PREDETERMINED CHARACTERISTICS THAT ARE RELEVANT TO
THE RESEARCH, AND THEN PURPOSELY FINDS INDIVIDUALS WHO POSSESS
THESE CHARACTERISTICS UNTIL EACH QUOTA IS FILLED.

B VOLUNTARY SAMPLING – IS A SIMPLE METHOD OF FINDING
INDIVIDUALS WHO ARE WILLING TO ANSWER YOUR QUESTIONNAIRE.

C PURPOSIVE SAMPLING – IS A METHOD OF PURPOSELY CHOOSING
INDIVIDUALS WHOM YOU PERCEIVE AS RELEVANT FOR YOUR STUDY.

D SNOWBALL SAMPLING – THIS INVOLVES FINDING AN INITIAL SAMPLE, WHOSE
MEMBERS MAY THEN REFER YOU TO OTHER INDIVIDUALS WHO MAY BE
RELEVANT TO YOUR RESEARCH.
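
Two of the non-probability techniques above, quota and snowball sampling, lend themselves to a short sketch. The function names, the respondent names, and the referral map are all hypothetical. Neither function uses randomness, which is what makes these methods susceptible to bias.

```python
from collections import Counter, deque


def quota_sample(candidates, group_of, quotas):
    """Accept candidates in the order met until each group's quota is filled."""
    filled = Counter()
    sample = []
    for person in candidates:
        group = group_of(person)
        if filled[group] < quotas.get(group, 0):
            sample.append(person)
            filled[group] += 1
    return sample


def snowball_sample(seeds, referrals, max_size):
    """Start from initial respondents and follow their referrals outward."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for referred in referrals.get(person, []):
            if referred not in seen:   # never recruit the same person twice
                seen.add(referred)
                queue.append(referred)
    return sample


# Hypothetical respondents as (name, group) pairs.
candidates = [("Ana", "student"), ("Ben", "staff"), ("Cy", "student"),
              ("Dee", "student"), ("Eli", "staff")]
print(quota_sample(candidates, lambda c: c[1], {"student": 2, "staff": 1}))

# Hypothetical map of who refers whom.
referrals = {"Ana": ["Ben", "Cy"], "Ben": ["Dee"], "Cy": ["Ana", "Eli"]}
print(snowball_sample(["Ana"], referrals, max_size=4))
```

In the quota example, the third student and the second staff member are skipped because their quotas are already full.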
DIFFERENT DATA-GATHERING TECHNIQUES

NOW THAT YOU HAVE CHOSEN YOUR RESPONDENTS OR YOUR SAMPLE, IT IS TIME TO
GATHER SOME DATA FROM THEM BY USING THE MOST COMMON DATA-GATHERING TECHNIQUES.

• OBSERVATION – IS A METHOD OF DIRECTLY OBSERVING AND RECORDING THE EVENTS.

• SURVEY – IS A METHOD OF GIVING YOUR RESPONDENTS A QUESTIONNAIRE, A PAPER THAT CONTAINS
A SERIES OF QUESTIONS THAT ARE RELEVANT TO YOUR STUDY.
SIMILARLY, THE INTERVIEW IS JUST LIKE A SURVEY, BUT RATHER THAN USING A PIECE OF PAPER YOU
COMMUNICATE ORALLY WITH THE RESPONDENTS AND RECORD THEIR ANSWERS BY TAKING
NOTES OR BY USING TECHNOLOGY SUCH AS A CAMCORDER.

• EXPERIMENTS – IS A MORE SCIENTIFIC AND COMPLEX APPROACH THAT INVOLVES MANIPULATING SOME
VARIABLES AND THEN RECORDING THE RESULT.
THANK YOU
