What is Data?
1. Characteristics of a well-designed and well-conducted survey
a. A good survey must be representative of the population.
b. To support probabilistic conclusions, the selection of subjects must incorporate chance, for
example through a random number generator. Often we don’t have a complete listing of the
population, so we have to be careful about exactly how we apply “chance”. Even when the frame
is correctly specified, subjects may choose not to respond or may be unable to respond.
c. The wording of the question must be neutral; subjects give different answers depending on the
phrasing.
d. Possible sources of error and bias should be controlled. The population of concern as a
whole may not be available for a survey. The subset of items that can actually be measured is
called the sampling frame, and the sample is selected from it. The survey plan should
specify a sampling method, determine the sample size, and lay out the steps for implementing
the sampling plan and for collecting the data.
2. Sampling Methods
a. Nonprobability sampling – is any sampling method where some elements of the
population have no chance of selection or where the probability of selection can’t
be accurately determined.
Example: We visit every household in a given street, and interview the first person to
answer the door. In any household with more than one occupant, this is a nonprobability
sample, because some people are more likely to answer the door (e.g. an unemployed
person who spends most of their time at home is more likely to answer than an
employed housemate who might be at work when the interviewer calls) and it’s not
practical to calculate these probabilities.
In addition, nonresponse effects may turn any probability design into a nonprobability
design if the characteristics of nonresponse are not well understood, since nonresponse
effectively modifies each element’s probability of being sampled.
b. Probability Sampling – is any sampling method where it is possible to determine both
which sampling units belong to which sample and the probability that each sample
will be selected.
The following sampling methods are examples of probability sampling:
i. Simple Random Sampling (SRS) – all samples of a given size have an
equal probability of being selected and selections are independent. The
frame is not subdivided or partitioned. The sample variance is a good
indicator of the population variance, which makes it relatively easy to
estimate the accuracy of results.
ii. Systematic Sampling – relies on arranging the frame as an ordered list, dividing it into
consecutive groups of equal size, and then selecting one element at random from the
first group and the element in the same position from every other group. A simple example
would be to select every 10th name from the telephone directory, with the first
selection being random. An SRS may, by chance, draw most of its sample from the beginning
of the list; systematic sampling helps to spread the sample over the whole list.
iii. Stratified Sampling – when the population embraces a number
of distinct categories, the frame can be organized by these
categories into separate “strata”. Each stratum is then sampled as
an independent sub-population. Dividing the population into strata
can enable researchers to draw inferences about specific
subgroups that may be lost in a more generalized random sample.
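The three probability sampling methods above can be compared side by side in a short Python sketch. This is only an illustration under stated assumptions: the frame of 250 numbered units, the sample sizes, and the odd/even strata are all hypothetical and do not come from the original material.

```python
import random
from collections import defaultdict


def simple_random_sample(frame, n, rng):
    """Draw n units without replacement; every subset of size n is equally likely."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Pick a random start in the first interval, then take every k-th unit."""
    k = len(frame) // n       # sampling interval (any leftover tail is never reached)
    start = rng.randrange(k)  # random position inside the first interval
    return [frame[start + i * k] for i in range(n)]


def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Group the frame by stratum, then draw an SRS inside each stratum."""
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}


rng = random.Random(0)       # fixed seed so the sketch is repeatable
frame = list(range(1, 251))  # hypothetical frame: unit IDs 1..250

srs = simple_random_sample(frame, 25, rng)
sys_sample = systematic_sample(frame, 25, rng)
# Hypothetical strata: odd versus even unit IDs, 5 units from each.
strat = stratified_sample(frame, lambda u: "odd" if u % 2 else "even", 5, rng)

print(sorted(srs))
print(sys_sample)
print(strat)
```

Note how the systematic sample is always spread evenly over the list, while the simple random sample may cluster in one region by chance, and the stratified sample guarantees a fixed number of units from every stratum.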
Example: Consider the statistical relationship between ice cream sales and drowning
deaths. These two (2) variables have a positive correlation because both occur more
often during summer. However, it would be wrong to conclude that there is a cause-and-
effect relation between them.
b. Placebo and blinding – a placebo is an imitation pill that looks identical to the actual
treatment pill but contains none of the treatment ingredients. A placebo effect occurs when a
sham (or simulated) medical intervention has no direct health impact but still results in
actual improvement of a medical condition because the patients believe they are being treated.
Blinding means that subjects are not told whether they receive the treatment or the placebo.
c. Blocking – is the arranging of experimental units in groups (blocks) that are similar to
one another. Typically, a blocking factor is a source of variability that is not of primary
interest to the experimenter.
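The ice cream and drowning example earlier in this section can be illustrated with a small simulation. The sketch below is a hypothetical model, not data: it assumes that temperature drives both variables, and all coefficients and noise levels are invented for illustration. The two variables should come out clearly correlated overall, while the correlation should nearly vanish once temperature is held roughly fixed.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Assumed model: temperature is the common cause of both variables.
temperature = [rng.uniform(0, 35) for _ in range(5000)]
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
drownings = [0.1 * t + rng.gauss(0, 1) for t in temperature]

# Correlation over all days: driven by the shared dependence on temperature.
overall_r = pearson(ice_cream, drownings)

# Correlation among days with nearly the same temperature (20 to 22 degrees).
band = [(i, d) for t, i, d in zip(temperature, ice_cream, drownings) if 20 <= t < 22]
within_r = pearson([i for i, _ in band], [d for _, d in band])

print(f"overall correlation: {overall_r:.2f}")
print(f"correlation at fixed temperature: {within_r:.2f}")
```

Neither variable influences the other anywhere in this model, so the overall correlation comes entirely from the common cause.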
4. Completely randomized design, randomized block design and matched pairs
a. Completely randomized designs – are for studying the effects of
one primary factor without the need to take other nuisance variables
into account. The experiment compares the values of a response
variable (like health improvement) based on the different levels of that
primary factor (e.g., different amounts of medication).
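As a rough sketch of how treatment assignment differs between a completely randomized design and a randomized block design, consider the Python below. The patient IDs, the age-group blocking factor, and the three dose levels are hypothetical names introduced only for illustration.

```python
import random
from collections import defaultdict


def completely_randomized(units, treatments, rng):
    """Shuffle all units, then deal the treatments out in rotation."""
    shuffled = units[:]
    rng.shuffle(shuffled)
    return {u: treatments[i % len(treatments)] for i, u in enumerate(shuffled)}


def randomized_block(units, block_of, treatments, rng):
    """Group units into blocks, then randomize separately inside each block."""
    blocks = defaultdict(list)
    for u in units:
        blocks[block_of(u)].append(u)
    assignment = {}
    for members in blocks.values():
        assignment.update(completely_randomized(members, treatments, rng))
    return assignment


rng = random.Random(2)  # fixed seed so the sketch is repeatable
patients = [f"P{i:02d}" for i in range(12)]  # hypothetical experimental units
# Hypothetical nuisance factor used for blocking: age group.
age_group = {p: ("older" if i % 2 else "younger") for i, p in enumerate(patients)}
doses = ["low", "medium", "high"]  # levels of the primary factor

crd = completely_randomized(patients, doses, rng)
rbd = randomized_block(patients, age_group.get, doses, rng)

print(crd)
print(rbd)
```

In the completely randomized version only the overall group sizes are balanced. In the blocked version each dose appears equally often inside every age group, so the nuisance factor cannot line up with the treatment by chance.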
Cluster Sampling
2. To determine the quality of education at
the University of Utah, a UNID number is
chosen at random, then every 1000th
student is evaluated until 30 students are
selected.
Systematic Sampling
3. The names of 25 employees are being
chosen out of a hat from a company of 250
employees.
Simple Random Sampling