Sampling Methods Presentation

Sampling Methods and Inferential Statistics
Suparat Walakanon D5220038
Presentation Topics
1. Sampling Methods
- Population
- Sample
- Sampling Methods
2. Inferential Statistics
- Parametric Tests
- Nonparametric Tests
What is a population?
A population is the complete collection of elements of a specific type (such as scores or people) that share the variables to be studied.
A population must be clearly defined in terms of the following 3 aspects:

- Content: research subjects
- Extent: geographical boundaries
- Time: the time period under consideration
Frankfort-Nachmias and Nachmias (1996)
Example: the first-year SUT undergraduate students enrolled in the English I course in Trimester 1/2010.
What is sampling?
Sampling is the process of selecting a small number of elements from a larger target group of such elements so that the data gathered from the small group allow judgments or claims to be made about the population.
Sampling Frame
A sampling frame is the actual set of units from which a sample is drawn. It should cover all the sampling units in the population of interest.
Potential Problems of a Sampling Frame

1. Incomplete frames: e.g. missing names of late-enrolled students
2. Clusters of elements: samples are located in clusters (separate groups)
3. Blank foreign elements: inclusion of non-members of the population in the sampling frame
Sampling Methods
Probability sampling
Nonprobability sampling
Probability Sampling
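A sampling method in which every member of the population has a known, nonzero chance (probability) of being selected. In simple random sampling these chances are equal.
Nonprobability Sampling
A sampling method in which the chance (probability) of selecting each member of the population is unknown, so members do not have equal chances of being selected.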
Probability Sampling
- Simple random sampling
- Systematic random sampling
- Stratified random sampling
- Cluster sampling
Nonprobability Sampling
- Convenience sampling
- Judgment (purposive) sampling
- Quota sampling
Simple Random Sampling (SRS)

The probability of being selected is equal for all members of the population.
- Blind Draw Method: e.g. names are placed in a box and then drawn randomly
- Random Numbers Method: all items in the sampling frame are given numbers, and numbers are then drawn using a table or a computer program
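The Random Numbers Method can be sketched in a few lines of Python. This is only an illustration: the sampling frame of 200 numbered students, the sample size of 20, and the `simple_random_sample` helper are assumptions made for the example, not part of the presentation.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit has the same chance."""
    rng = random.Random(seed)  # seeded only so the draw can be repeated
    return rng.sample(frame, n)

# Hypothetical sampling frame: students numbered 1..200
frame = list(range(1, 201))
sample = simple_random_sample(frame, 20, seed=1)
print(sorted(sample))
```

Drawing without replacement mirrors the blind draw: a name taken out of the box cannot be drawn again.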
Advantages of SRS
- Fair
- Unbiased
Disadvantages of SRS
- Possible over- or under-sampling
- No guarantee of getting good representatives
Systematic random sampling

A sample is obtained by selecting every k-th element (e.g. every 15th participant) from a list containing the total population, after a random start.
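A minimal sketch of this procedure, assuming a hypothetical list of 300 participants and the interval of 15 from the example above (the `systematic_sample` helper is made up for illustration):

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick a random start within the first k positions, then take every k-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start: an index from 0 to k - 1
    return frame[start::k]

# Hypothetical list of 300 participants, numbered 1..300
frame = list(range(1, 301))
sample = systematic_sample(frame, k=15, seed=7)
print(len(sample), sample[:3])
```

Only the starting position is random. Every later pick is fixed by the interval, which is why an ordering of the list that repeats every k units (periodicity) can bias the sample.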
Advantages of Systematic Random Sampling

- Efficient: there is no need to designate (assign a number to) every population member, only those early on in the list (unless the sampling frame is very large)
- Less expensive and faster than SRS
Disadvantages of Systematic Random Sampling

- Small loss in sampling precision
- Potential periodicity problems
Stratified Sampling
The population is separated into homogeneous groups/segments/strata and a sample is taken from each. The results are then combined to get the picture of the total population.
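One common variant, proportional stratified sampling, draws the same fraction from every stratum. The sketch below assumes that variant. The strata names, their sizes, and the 10% fraction are hypothetical.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        n = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(members, n)
    return sample

# Hypothetical strata: students grouped by faculty
strata = {
    "Engineering": [f"E{i}" for i in range(120)],
    "Science": [f"S{i}" for i in range(60)],
    "Social Technology": [f"T{i}" for i in range(20)],
}
sample = stratified_sample(strata, fraction=0.10, seed=3)
sizes = {name: len(units) for name, units in sample.items()}
print(sizes)  # {'Engineering': 12, 'Science': 6, 'Social Technology': 2}
```

Each stratum contributes in proportion to its size, so the combined sample has the same composition as the population on the stratifying variable.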
Advantages of Stratified Sampling

Representativeness of the composition of the population is guaranteed on the variable used to form the strata.
Disadvantages of Stratified Sampling

A more complex sampling plan, which may require different sample sizes for each stratum
Cluster sampling
A method by which the population is divided into groups (clusters), any of which can be considered a representative sample. One or more clusters are then selected at random.
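As a rough sketch, one-stage cluster sampling selects whole clusters at random and keeps every member of the chosen clusters. The ten class sections of 30 students and the `cluster_sample` helper below are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each chosen one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: ten class sections of 30 students each
clusters = {f"Section {i}": [f"{i}-{j}" for j in range(30)] for i in range(1, 11)}
sample = cluster_sample(clusters, n_clusters=3, seed=5)
print(sorted(sample))
```

Only the list of clusters is needed to draw the sample, not a list of every student.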
Advantages of Cluster Sampling

- Economic efficiency: faster and less expensive than SRS
- Does not require a list of all members of the population
Disadvantages of Cluster Sampling

- Cluster specification error: the more homogeneous the cluster chosen, the more imprecise the sample results
Convenience Sampling
A sample is obtained by selecting individual participants who are easy to approach.
Advantages of Convenience Sampling

- convenient
- inexpensive
Disadvantages of Convenience Sampling

- biased
Purposive Sampling
This method starts with a purpose in the researcher's mind, and the sample is thus selected to include participants of interest and exclude those who do not suit the purpose.
Advantages of Purposive Sampling

- serves the purpose of the research
- is convenient
Disadvantages of Purposive Sampling

- subjective
- low generalizability
Quota Sampling
A sample is obtained by identifying subgroups to be included, then establishing quotas for individuals to be selected through convenience for each subgroup.
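A minimal sketch of quota filling, assuming participants are taken in the order they happen to become available. The names, subgroups, and quotas are hypothetical.

```python
def quota_sample(arrivals, quotas):
    """Accept participants in arrival order until each subgroup's quota is full."""
    remaining = dict(quotas)
    sample = []
    for person, group in arrivals:
        if remaining.get(group, 0) > 0:
            sample.append((person, group))
            remaining[group] -= 1
    return sample

# Hypothetical stream of volunteers as (name, subgroup)
arrivals = [("A", "male"), ("B", "male"), ("C", "female"), ("D", "male"),
            ("E", "female"), ("F", "female"), ("G", "male")]
sample = quota_sample(arrivals, {"male": 2, "female": 2})
print(sample)  # [('A', 'male'), ('B', 'male'), ('C', 'female'), ('E', 'female')]
```

Nothing here is random: whoever arrives first fills the quota, which is where the bias of this method comes from.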
Advantage of Quota Sampling

can ensure that convenience samples will have the desired proportion of each subgroup
Disadvantage of Quota Sampling

- biased
INFERENTIAL STATISTICS
- Hypothesis and Hypothesis Testing
- Level of Significance
- Directional and Non-directional Hypothesis Testing
- Type I and Type II Error
- Parametric and Nonparametric Tests
Research Hypothesis
A hypothesis is an assumption about the population parameter.
A parameter is a characteristic of the population, like its mean or variance. The parameter must be identified before analysis.
Hypothesis Testing
Goal: Make statement(s) regarding unknown population parameter values based on sample data
Elements of a hypothesis test:
- Null hypothesis (H0)
- Alternative hypothesis (HA)
- Test statistic
- Rejection region (the alpha level)
H0: μ1 = μ2
Null and Alternative Hypotheses

Null Hypothesis (H0)
- Statement regarding the value(s) of unknown parameter(s). Typically will imply no association between explanatory and response variables in the study.
H0: μ1 = μ2
Alternative Hypothesis (HA)
- Statement contradictory to the null hypothesis (will always contain an inequality)
HA: μ1 ≠ μ2
The Alpha Level (α)

A probability value that is used to define the very unlikely sample outcomes if the null hypothesis is true.
α = .05 or α = .01
The most unlikely 5% (or 1%) of the sample means (the extreme values) are separated from the most likely 95% (or 99%) of the sample means (the central values).
Critical Region
Critical Value
Value or values that separate the critical region (where we reject the null hypothesis) from the values of the test statistic that do not lead to a rejection of the null hypothesis.
[Figure: distribution curve with the critical value (z score) separating the "Reject H0" region from the "Fail to reject H0" region]
Two-tailed, Right-tailed, Left-tailed Tests

The tails in a distribution are the extreme regions bounded by critical values.
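When the test statistic is a z score, the critical values that bound the tails can be computed from the standard normal distribution. The sketch below uses Python's standard `statistics.NormalDist`. The `critical_z` helper is an illustrative name, and the alpha levels are the .05 and .01 mentioned earlier.

```python
from statistics import NormalDist

def critical_z(alpha, tails=2):
    """Critical z value(s) for a test statistic that is standard normal under H0."""
    z = NormalDist()  # mean 0, standard deviation 1
    if tails == 2:
        upper = z.inv_cdf(1 - alpha / 2)  # alpha is split equally between the tails
        return (-upper, upper)
    return z.inv_cdf(1 - alpha)  # right-tailed; negate the result for a left-tailed test

for alpha in (0.05, 0.01):
    low, high = critical_z(alpha)
    print(f"alpha={alpha}: two-tailed {low:.3f}, {high:.3f}; "
          f"right-tailed {critical_z(alpha, tails=1):.3f}")
```

For α = .05 this gives the familiar ±1.960 for two tails and 1.645 for one tail. For α = .01 it gives ±2.576 and 2.326.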
Two-tailed Test
H0: μ = 100
H1: μ ≠ 100
α is divided equally between the two tails of the critical region.
≠ means less than or greater than.
[Figure: distribution centered at 100 with regions "Reject H0", "Fail to reject H0", "Reject H0"; the values in the two tails differ significantly from 100]
Right-tailed Test
H0: μ ≤ 100
H1: μ > 100
The > sign in H1 points right, toward the critical region.
[Figure: distribution centered at 100 with "Fail to reject H0" on the left and "Reject H0" in the right tail]
Left-tailed Test
H0: μ ≥ 100
H1: μ < 100
The < sign in H1 points left, toward the critical region.
[Figure: distribution centered at 100 with "Reject H0" in the left tail and "Fail to reject H0" on the right]
Conclusions in Hypothesis Testing

We always test the null hypothesis, so there are two possible conclusions:
1. Reject the H0
2. Fail to reject the H0
The final conclusion then needs to be worded correctly.
Type I Error
The mistake of rejecting the null hypothesis when it is true.
α (alpha) is used to represent the probability of a Type I error.
Example: Rejecting a claim that the group mean score equals 96 when the mean really does equal 96
Type II Error
The mistake of failing to reject the null hypothesis when it is false.
β (beta) is used to represent the probability of a Type II error.
Example: Failing to reject the claim that the group mean score is 96 when the mean is really different from 96
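A small simulation can make the meaning of α concrete: when H0 is true, a test at α = .05 should wrongly reject it in roughly 5% of samples. The sketch below reuses the mean of 96 from the examples. The standard deviation of 10, the sample size of 25, the number of trials, and the use of a z test with known standard deviation are all simplifying assumptions.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(mu0=96, sigma=10, n=25, alpha=0.05, trials=4000, seed=0):
    """Simulate a two-tailed z test when H0 is true and count false rejections."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]  # H0 is true here
        z = (mean(sample) - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # a Type I error: H0 rejected although it is true
    return rejections / trials

rate = type_i_error_rate()
print(f"Share of true null hypotheses rejected: {rate:.3f}")
```

The printed share should land near .05. Estimating β would need the same loop with samples drawn from a population whose mean really differs from 96.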
Inferential Statistics
Parametric Tests
- normal distribution
- ratio or interval scale
- random sampling
- examples: t-test, ANOVA
Nonparametric Tests
- do not require normality
- ordinal or nominal scale
- example: Pearson's Chi-square
t-tests
Compare two mean values
1. one-sample t-test
2. two independent samples t-test
3. two paired (dependent) samples t-test
One group t-test

To examine whether a sample mean value is different from a pre-set value
Example: Is the students' TOEFL mean score higher or lower than 500?

Formulating a null and research hypothesis
H0: The students' TOEFL mean score is equal to 500.
HA: The students' TOEFL mean score is different from 500.
Students' Individual Scores

500 530 440 450 460 485 465 510 490 495 500 505 430 470 500 510 490 485 520 475 460 490 465 520
Output Data
Significant at p = .011, p < .05
Reject H0: the students' TOEFL mean score is different from 500.
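The reported result can be checked by hand with the 24 scores listed above. The sketch below uses only the Python standard library. The p-value comes from numerically integrating the t density with Simpson's rule, which is a substitute for whatever statistics package produced the original output.

```python
import math
from statistics import mean, stdev

scores = [500, 530, 440, 450, 460, 485, 465, 510, 490, 495, 500, 505,
          430, 470, 500, 510, 490, 485, 520, 475, 460, 490, 465, 520]

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: 1 minus the area between -|t| and |t| (Simpson's rule)."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    area = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (area * h / 3)  # the density is symmetric around 0

n = len(scores)
t = (mean(scores) - 500) / (stdev(scores) / math.sqrt(n))  # one-sample t statistic
p = two_tailed_p(t, df=n - 1)
print(f"mean = {mean(scores):.2f}, t({n - 1}) = {t:.2f}, p = {p:.3f}")
```

The sample mean is about 485.2, below the pre-set value of 500. The resulting t of about -2.78 with 23 degrees of freedom gives a two-tailed p-value close to the .011 reported in the output.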
Dependent-sample t-test
Compares the mean scores of the same participants in one group measured twice, e.g. in a pre-test/posttest design.
Example:
Are the students' individual scores in the pre-test and posttest different?
Formulating a null and research hypothesis
H0: There is no difference between the mean scores of the pre-test and posttest.
HA: The students' mean score in the posttest is higher than that in the pre-test.
Data Output for dependent t-test
Significant at p = .025, p < .05

Reject H0: the students' mean score in the posttest is higher than that in the pre-test.
Independent-sample t-test
Examines whether the mean values of two independent groups are significantly different.
Example: A researcher wants to know whether the students of his class perform better or worse than students in another class in an English final examination.
Research Hypothesis
H0: There is no difference between the mean scores of the two classes.
HA: The mean scores of the two classes are different.
Not significant: retain H0
One-Way ANOVA
- The response variable is the variable you're comparing
- The factor variable is the categorical variable being used to define the groups
- We will assume k samples (groups)
- The "one-way" is because each value is classified in exactly one way
- Examples include comparisons by gender, race, political party, color, etc.
One-Way ANOVA
Determines whether there is any significant difference in the mean values among sample groups
Why not repeated t-tests?
1. One-way ANOVA can handle the comparison of more than two groups at one time.
2. The more tests are done, the higher the risk of a Type I error.
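The second point can be put into numbers. If every pair of groups were compared with its own t-test at α = .05, and the tests were treated as independent (a simplification), the chance of at least one Type I error would be 1 - (1 - α)^m for m tests. The `familywise_error` helper is an illustrative name.

```python
from math import comb

def familywise_error(k_groups, alpha=0.05):
    """Chance of at least one Type I error across all pairwise t-tests,
    assuming (as a simplification) that the tests are independent."""
    m = comb(k_groups, 2)  # number of pairwise comparisons among k groups
    return m, 1 - (1 - alpha) ** m

for k in (3, 4, 5):
    m, fwer = familywise_error(k)
    print(f"{k} groups: {m} t-tests, familywise error about {fwer:.3f}")
```

With 3 groups the risk is already about .143, and with 5 groups (10 tests) it is about .401, far above the nominal .05.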
Research Hypothesis
H0: All the means are equal.
HA: At least two groups have different mean values.
ANOVA + Post Hoc tests

ANOVA only tells whether at least one pair of mean scores is different. It does not tell which pair is different. Post hoc tests, e.g. Scheffé's or Tukey's test, do this job.
Non-parametric Test
Pearson's Chi-square
- Goodness-of-fit test
- Test for Independence
Goodness-of-Fit Test
Compares observed frequencies within groups to their expected frequencies.
H0: The observed frequencies are not different from the expected frequencies.
Research hypothesis: They are different.
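The statistic behind this comparison is the sum of (observed - expected)² / expected over all groups. The sketch below uses made-up counts for 60 students choosing among three elective courses, with equal preference expected under H0. The critical value 5.991 is the chi-square table value for 2 degrees of freedom at α = .05.

```python
def chi_square_gof(observed, expected):
    """Pearson's chi-square statistic: sum of (O - E)^2 / E over all groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: 60 students, 3 courses, 20 per course expected under H0
observed = [28, 18, 14]
expected = [20, 20, 20]
chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1
critical = 5.991  # chi-square table value for df = 2 at alpha = .05
print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {chi2 > critical}")
# chi-square = 5.20, df = 2, reject H0: False
```

With these invented counts the statistic stays below the critical value, so H0 would be retained.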
Test of Independence
- Review cross-tabulations (= contingency tables)
- Are the responses of two groups significantly different?
- One-way = observed vs expected
- Two-way = one set of observed frequencies vs another set
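For a contingency table, the expected count of each cell under independence is (row total × column total) / grand total, and the same chi-square sum is taken over all cells. The 2 x 2 table below (two classes by pass/fail) is hypothetical.

```python
def chi_square_independence(table):
    """Pearson's chi-square for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected if independent
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical 2 x 2 table: rows = two classes, columns = pass / fail
table = [[30, 10],
         [20, 20]]
chi2, df = chi_square_independence(table)
print(f"chi-square = {chi2:.2f}, df = {df}")  # chi-square = 5.33, df = 1
```

Since 5.33 exceeds 3.841, the table value for 1 degree of freedom at α = .05, these invented data would lead to rejecting H0 of independence.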
Thank you very much
