
POPULATIONS AND SAMPLING

Populations
Definition - a complete set of elements (persons or objects) that possess some
common characteristic defined by the sampling criteria established by the
researcher

Composed of two groups - target population & accessible population


Target population (universe)

The entire group of people or objects to which the researcher
wishes to generalize the study findings

Meets a set of criteria of interest to the researcher

Examples

All institutionalized elderly with Alzheimer's

All people with AIDS

All low birth weight infants

All school-age children with asthma

All pregnant teens

Accessible population

The portion of the population to which the researcher has
reasonable access; may be a subset of the target population

May be limited to region, state, city, county, or institution

Examples
All institutionalized elderly with Alzheimer's in St.
Louis county nursing homes

All people with AIDS in the metropolitan St. Louis area

All low birth weight infants admitted to the neonatal
ICUs in St. Louis city & county

All school-age children with asthma treated in pediatric
asthma clinics in university-affiliated medical centers in
the Midwest

All pregnant teens in the state of Missouri

Samples
Terminology used to describe samples and sampling methods

Sample = the selected elements (people or objects) chosen for
participation in a study; people are referred to as subjects or
participants

Sampling = the process of selecting a group of people, events,
behaviors, or other elements with which to conduct a study

Sampling frame = a list of all the elements in the population from
which the sample is drawn

Could be extremely large if population is national or
international in nature

Frame is needed so that everyone in the population is identified
so they will have an equal opportunity for selection as a subject
(element)

Examples

A list of all institutionalized elderly with Alzheimer's in
St. Louis county nursing homes affiliated with BJC

A list of all people with AIDS in the metropolitan St.
Louis area who are members of the St. Louis Effort for
AIDS

A list of all low birth weight infants admitted to the
neonatal ICUs in St. Louis city & county in 1998

A list of all school-age children with asthma treated in
pediatric asthma clinics in university-affiliated medical
centers in the Midwest

A list of all pregnant teens in the Henderson school
district

Randomization = each individual in the population has an equal
opportunity to be selected for the sample

Representativeness = sample must be as much like the population in as
many ways as possible

Sample reflects the characteristics of the population, so those
sample findings can be generalized to the population

Most effective way to achieve representativeness is through
randomization; random selection or random assignment

Parameter = a numerical value or measure of a characteristic of the
population; remember P for parameter & population

Statistic = numerical value or measure of a characteristic of the
sample; remember S for sample & statistic

Precision = the accuracy with which the population parameters have
been estimated; remember that population parameters often are based on
the sample statistics

Types of Sampling Methods - probability & non-probability

Probability Sampling Methods


Also called random sampling

Every element (member) of the population has a probability
greater than zero of being selected for the sample

Everyone in the population has equal opportunity for selection
as a subject

Increases sample's representativeness of the population

Decreases sampling error and sampling bias

Types of probability sampling - see table in course materials for details

Simple random

Elements selected at random

Assign each element a number

Select elements for study by:

1. Using a table of random numbers in a book

A table displaying hundreds of digits from 0 to 9
set up in such a way that each number is equally
likely to follow any other

See text for random sampling details & table of
random numbers

2. Using a computer-generated table of random numbers

3. Drawing numbers from a box (hat)

4. Using bingo numbers
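
As an illustration of the computer-generated option, here is a minimal Python sketch. The sampling frame of 200 numbered elements, the sample size of 20, and the function name simple_random_sample are assumptions made for the example, not part of the course materials.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements from the sampling frame without replacement."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    # random.sample gives every element the same chance of selection
    return rng.sample(frame, n)

# Hypothetical sampling frame: each element has been assigned a number 1..200
frame = list(range(1, 201))
sample = simple_random_sample(frame, n=20, seed=42)
print(sorted(sample))
```

The seed is only there so the same sample can be reproduced; omit it for a fresh draw.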

Stratified random
Population is divided into subgroups, called strata, according
to some variable or variables of importance to the study

Variables often used include: age, gender, ethnic origin, SES,
diagnosis, geographic region, institution, or type of care

Two approaches to stratification - proportional &
disproportional

Proportional

Subgroup sample sizes equal the proportions of
the subgroup in the population

Example: A high school population has

15% seniors

25% juniors

25% sophomores

35% freshmen

With proportional sample the sample has
the same proportions as the population

Disproportional

Subgroup sample sizes are not equal to the
proportion of the subgroup in the population

Example

Class        Population   Sample

Seniors      15%          25%

Juniors      25%          25%

Sophomores   25%          25%

Freshmen     35%          25%

With disproportional sample the
sample does not have the same
proportions as the population
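
The two allocation approaches can be sketched in Python as follows. The school of 1,000 students, the total sample of 100, and the function names are assumptions chosen to match the percentages in the examples above.

```python
import random

def proportional_allocation(strata, n):
    """Sample size per stratum, matching each stratum's share of the population."""
    total = sum(len(members) for members in strata.values())
    # Rounding can leave the total slightly off n for less tidy numbers
    return {name: round(n * len(members) / total) for name, members in strata.items()}

def stratified_sample(strata, allocation, seed=None):
    """Randomly select the allocated number of elements within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, allocation[name]) for name, members in strata.items()}

# Hypothetical high school of 1,000 students: 15% / 25% / 25% / 35%
sizes = {"Seniors": 150, "Juniors": 250, "Sophomores": 250, "Freshmen": 350}
strata = {name: [f"{name}-{i}" for i in range(size)] for name, size in sizes.items()}

proportional = proportional_allocation(strata, n=100)
disproportional = {name: 25 for name in strata}  # equal numbers from every class

prop_sample = stratified_sample(strata, proportional, seed=1)
disp_sample = stratified_sample(strata, disproportional, seed=1)
print(proportional)
print({name: len(chosen) for name, chosen in disp_sample.items()})
```

In both cases selection inside each stratum is random; only the number taken from each stratum differs.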

Cluster random sampling

A random sampling process that involves stages of sampling

The population is first listed by clusters or categories

Procedure

Randomly select 1 or more clusters and take all of their
elements (single stage cluster sampling); e.g. Midwest
region of the US

Or, in a second stage randomly select clusters from the
first stage of clusters; e.g. 3 states within the Midwest
region

In a third stage, randomly select elements from the
second stage of clusters; e.g. 30 county health dept.
nursing administrators from each state
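
The later stages of this procedure might look like the Python sketch below. The list of six states, the 90 administrators per state, and the function name are invented for illustration only.

```python
import random

def multistage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Randomly pick clusters, then randomly pick elements within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # e.g. 3 states
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical clusters: states in one region, each with 90 administrators
states = ["Illinois", "Iowa", "Kansas", "Missouri", "Nebraska", "Ohio"]
clusters = {s: [f"{s}-admin-{i}" for i in range(1, 91)] for s in states}

sample = multistage_cluster_sample(clusters, n_clusters=3, n_per_cluster=30, seed=1)
for state, admins in sample.items():
    print(state, len(admins))
```

Taking every element of the chosen clusters instead of a random 30 would correspond to single stage cluster sampling.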

Systematic

A random sampling process in which every kth element or member
of the population (e.g. every 5th) is selected for the
sample after a random start is determined

Example

Population (N) = 2000, sample size (n) = 50, k = N/n, so
k = 2000 ÷ 50 = 40

Use a table of random numbers to determine the
starting point for selecting every 40th subject

With list of the 2000 subjects in the sampling frame, go
to the starting point, and select every 40th name on the
list until the sample size is reached. Probably will have
to return to the beginning of the list to complete the
selection of the sample.
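
A minimal Python sketch of this example follows, assuming the sampling frame is simply the numbers 1 to 2000 and that the random start comes from a random number generator rather than a printed table. The function name is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every kth element after a random start, wrapping around the list."""
    rng = random.Random(seed)
    N = len(frame)
    k = N // n                   # sampling interval, here 2000 // 50 = 40
    start = rng.randrange(N)     # random starting position in the frame
    # The modulo returns to the beginning of the list when the end is reached
    return [frame[(start + i * k) % N] for i in range(n)]

frame = list(range(1, 2001))     # hypothetical frame of 2000 subjects
sample = systematic_sample(frame, n=50, seed=7)
print(sorted(sample)[:5])
```

Because 50 steps of 40 cover the list exactly once, no subject is selected twice.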
Non-probability sampling methods

Characteristics

Not every element of the population has the opportunity for selection in
the sample

No sampling frame

Population parameters may be unknown

Non-random selection

More likely to produce a biased sample

Restricts generalization

Historically, used in most nursing studies

Types of non-probability sampling methods

Convenience - aka chunk, accidental & incidental sampling

Selection of the most readily available people or objects for a
study

No way to determine representativeness

Saves time and money

Quota

Selection of sample to reflect certain characteristics of the
population

Similar to stratified but does not involve random selection

Quotas for subgroups (proportions) are established

E.g. 50 males & 50 females; recruit the first 50 men and first 50
women that meet inclusion criteria
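
The quota example can be sketched in Python as below. The stream of 400 eligible volunteers and the function name quota_sample are assumptions for illustration; note that nothing in the procedure is random, which is what separates it from stratified sampling.

```python
def quota_sample(arrivals, quotas, group_of):
    """Take people in arrival order until each subgroup quota is filled."""
    counts = {group: 0 for group in quotas}
    chosen = []
    for person in arrivals:
        group = group_of(person)
        if group in counts and counts[group] < quotas[group]:
            chosen.append(person)
            counts[group] += 1
        if counts == quotas:
            break  # every quota is full, stop recruiting
    return chosen

# Hypothetical eligible volunteers, in the order they present
arrivals = [{"id": i, "sex": "female" if i % 3 == 0 else "male"} for i in range(1, 401)]
sample = quota_sample(arrivals, {"male": 50, "female": 50}, lambda p: p["sex"])
print(len(sample))
```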

Purposive - aka judgmental or expert's choice sampling

Researcher uses personal judgement to select subjects that are
considered to be representative of the population

Handpicked subjects

Typical subjects experiencing problem being studied

Snowball

Also known as network sampling

Subjects refer the researcher to others who might be recruited
as subjects

Time Frame for Studying the Sample


See design notes on longitudinal & cross-sectional studies

Longitudinal

Cross-sectional

Sample Size
General rule - as large as possible to increase the representativeness of the
sample

Increased size decreases sampling error

Relatively small samples in qualitative, exploratory, case studies, experimental
and quasi-experimental studies

Descriptive studies need large samples; e.g. 10 subjects for each item on the
questionnaire or interview guide

As the number of variables studied increases, the sample size also needs to
increase in order to detect significant relationships or differences

A minimum of 30 subjects is needed for use of the central limit theorem
(statistics based on the mean)

Large samples are needed if:

There are many uncontrolled variables

Small differences are expected in the sample/population on variables of
interest

The sample is divided into subgroups

Dropout rate (mortality) is expected to be high

Statistical tests used require minimum sample or subgroup size

Power Analysis
Power analysis = a procedure for estimating either the likelihood of committing a Type II
error or the sample size requirements

Background Information for Understanding Power Analysis:

Type I and Type II errors


Type I error

Based on the statistical analysis of data, the researcher wrongly rejects a
true null hypothesis and, therefore, accepts a false alternative hypothesis

Probability of committing a Type I error is controlled by the researcher
with the level of significance, alpha (α)

Alpha (α) is the probability that a Type I error will occur

Alpha (α) is established by the researcher; usually α = .05 or .01

Alpha (α) = .05 means there is a 5% chance of rejecting a true
null hypothesis; OR out of 100 samples, a true null hypothesis
would, on average, be rejected 5 times out of 100 and accepted 95 times out of
100

Alpha (α) = .01 means there is a 1% chance of rejecting a true null
hypothesis; OR out of 100 samples, a true null hypothesis would, on average,
be rejected 1 time out of 100 and accepted 99 times out of 100
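
The "out of 100 samples" interpretation of alpha can be checked with a small simulation. This Python sketch assumes a two-tailed z-test on samples of 30 drawn from a population in which the null hypothesis (mean = 0, standard deviation = 1) really is true; the test choice, the sample size, and the number of trials are assumptions made for illustration.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Share of samples in which a true null hypothesis is wrongly rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # null is true here
        z = fmean(sample) * n ** 0.5  # z statistic with known sigma = 1
        if abs(z) > z_crit:
            rejections += 1  # a Type I error
    return rejections / trials

rate_05 = type_i_error_rate(alpha=0.05)
rate_01 = type_i_error_rate(alpha=0.01)
print(rate_05, rate_01)
```

The observed rejection rates should land close to the chosen alpha levels, though not exactly on them, because the simulation itself is subject to random fluctuation.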

Type II error

Based on the statistical analysis of data, the researcher wrongly accepts a
false null hypothesis and, therefore, rejects a true alternate hypothesis

Probability of committing a Type II error can be estimated, and kept low by
choosing an adequate sample size, through a power analysis

Probability of a Type II error is called beta (β)

Power, or 1 - β, is the probability of rejecting a false null
hypothesis and obtaining a statistically significant result



Type I & Type II Errors

In the real world, the actual situation is that the null hypothesis is:

Based on statistical analysis,
the researcher concludes that:        True                 False

Null true: null hypothesis            Correct decision     Type II error
is accepted

Null false: null hypothesis is        Type I error         Correct decision
rejected & alternate is accepted

Background Information for Understanding Power Analysis:

Population Effect Size - Gamma (γ)

Gamma (γ) measures how wrong the null hypothesis is; it measures how strong
the effect of the IV is on the DV; and it is used in performing a power analysis

Gamma (γ) is calculated based on population data from prior research studies, or
determined several different ways depending on the nature of the data and the
statistical tests to be performed

The textbook discusses 4 ways to estimate gamma (population effect size) based
upon:

Testing the difference between 2 means (t-test)

Testing the difference between 3 or more means (ANOVA)

Testing bivariate correlation (relationship) between 2 variables
(Pearson's r)

Testing the difference in proportions between 2 groups (chi-square)

If there is no relevant research on the topic to estimate the population effect size
(gamma), then use guidelines for gamma (γ) or its equivalent

Testing the difference between 2 means (t-test) - gamma (γ) for small
effects γ = .20; medium effects γ = .50; large effects γ = .80

Testing the difference between 3 or more means (ANOVA) - eta squared (η²) for
small effects η² = .01; medium effects η² = .06; large effects η² = .14

Testing bivariate correlation (relationship) between 2 variables
(Pearson's r) - gamma (γ) for small effects γ = .10; medium effects γ = .30;
large effects γ = .50

Testing the difference in proportions between 2 groups (chi-square) - no
conventions for unknown populations

Determining Sample Size through Power Analysis


Need to have the following data:

Level of significance criterion = alpha (α); use .05 for most nursing studies and
your calculations

Power = 1 - β (beta); if beta is not known, standard power is .80, so use this when
you are determining sample size

Population effect size = gamma (γ) or its equivalent, e.g. eta squared (η²); use
recommended values for small, medium, or large effect for the statistical test you
plan to use to answer research questions or test hypotheses

Use tables on pages 455-459 of Polit & Hungler or other reference

Mathematical formulas and computer programs can also be used for calculation of sample
size
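
As one example of such a formula, the Python sketch below estimates the sample size per group for a two-tailed comparison of 2 means using the normal approximation n = 2 × ((z for alpha + z for power) ÷ γ)². This is an assumed textbook-style approximation, not the method behind the Polit & Hungler tables, and published tables based on the t distribution may give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects needed in each of 2 groups (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-tailed significance criterion
    z_power = z.inv_cdf(power)          # power = 1 - beta
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Conventional small, medium, and large effects for a difference between 2 means
for gamma in (0.20, 0.50, 0.80):
    print(gamma, n_per_group(gamma))
```

The output makes the trade-off visible: the smaller the expected effect, the larger the sample needed to reach the same power.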

Sampling Error and Sampling Bias


Sampling error = The difference between the sample statistic (e.g. sample mean)
and the population parameter (e.g. population mean) that is due to the random
fluctuations in data that occur when the sample is selected
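
Sampling error can be illustrated with a short simulation. The Python sketch below assumes a made-up population of 10,000 scores and compares the average gap between sample mean and population mean for random samples of 25 and of 400; all names and numbers are hypothetical.

```python
import random
from statistics import fmean

def mean_abs_sampling_error(population, n, replicates=200, seed=0):
    """Average absolute gap between the sample mean and the population mean."""
    rng = random.Random(seed)
    mu = fmean(population)  # population parameter
    errors = [abs(fmean(rng.sample(population, n)) - mu) for _ in range(replicates)]
    return fmean(errors)

# Hypothetical population of 10,000 scores
pop_rng = random.Random(1)
population = [pop_rng.gauss(100, 15) for _ in range(10_000)]

small = mean_abs_sampling_error(population, n=25)
large = mean_abs_sampling_error(population, n=400)
print(round(small, 2), round(large, 2))
```

Every sample here is drawn at random, so the gaps reflect random fluctuation only, and the gap shrinks as the sample size grows.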

Sampling bias

Also called systematic bias or systematic variance

The difference between sample data and population data that can be
attributed to faulty sampling of the population

Consequence of selecting subjects whose characteristics (scores) are
different in some way from the population they are supposed to
represent

This usually occurs when randomization is not used

Randomization Procedures in Research


Randomization = each individual in the population has an equal opportunity to
be selected for the sample

Random selection = from all people who meet the inclusion criteria, a sample is
randomly chosen

Random assignment
The assignment of subjects to treatment conditions in a random
manner.

It has no bearing on how the subjects participating in an experiment
are initially selected.

See Polit & Hungler, pg. 160-162 for random assignment to groups and
group random assignment to treatment using a random numbers table
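
For comparison with the random numbers table approach, here is a minimal Python sketch of random assignment by shuffling. The 20 subject IDs, the two group labels, and the function name are assumptions for illustration; the subjects are taken as already selected, by whatever method.

```python
import random

def random_assignment(subjects, groups=("treatment", "control"), seed=None):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the original list is left untouched
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for i, subject in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(subject)
    return assignment

subjects = [f"S{i:02d}" for i in range(1, 21)]  # hypothetical subject IDs
assignment = random_assignment(subjects, seed=3)
print({group: len(members) for group, members in assignment.items()})
```

Dealing the shuffled list out in turn keeps the group sizes equal, or within one of each other when the subjects do not divide evenly.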
