
Determining Sample Size for Surveys

Christopher McCarty
Bureau of Business and Economic Research
University of Florida

Introduction

Determining proper sample size is perhaps the most confusing part of planning a survey. It seems intuitive that the larger the population from which the sample is drawn, the larger the sample size should be. In fact, only in very few cases is the sample size adjusted for the size of the population, and then only when the size of the sample approaches the size of the population. (This is called the finite population adjustment; it is employed if the sample is expected to exceed 10% of the population size.)

In any survey (whether done by telephone, by mail, or with face-to-face interviews), if the sampling is random, then sample size should be determined by three things: the confidence level you can tolerate, the precision you want, and the variance (or estimated variance) of the variable being measured. Of course, in most cases there is more than one variable, so a balance has to be struck between sampling for variables with the highest variance and sampling for variables that are most important.

Suppose a survey about health insurance needs contains a question on the respondent’s annual income and a question asking if the respondent has health insurance. The variance on the income question could be very high, requiring a sample of 2000 respondents to get a tight confidence interval.

But if the variable of most interest is the proportion of respondents who have health insurance, and the estimated variance on that variable is low, then you might need only a sample of 1000 to get more or less the same tight confidence interval you’d get with 2000 respondents. It pays in this case to ask how tight an interval is needed on estimates of income.

The formula to determine sample size (see note 2) is:

$$n = \frac{z^2 s^2}{H^2}$$

where
n = sample size
z = confidence level (that is, 1.96 for 95% confidence and 2.575 for 99% confidence)
s = standard deviation
H = precision level, which is the compactness of the confidence interval

As an example, suppose you want to estimate the average age of people over 18 in all households in the country. Suppose further that: (1) the estimate must be within, say, plus or minus 5 years (H in the formula); (2) you want to be confident that another survey would yield the same results 95 times out of 100 repetitions (z in the formula); and (3) prior surveys suggest the standard deviation of age is 17 years (s in the formula).

The number of respondents needed, then, is:

$$n = \frac{(1.96)^2 (17)^2}{5^2} = \frac{3.8416 \times 289}{25} \approx 44$$

You would need to sample 44 respondents to be assured at the 95% level of confidence that the estimate of age was the actual population average, give or take 5 years. A precision level of ± 1 year would require a sample of about 1110 respondents.

For variables that are proportions rather than means, such as responses to yes/no questions or categorical variables such as religious affiliation, the formula is the same but the calculation of the standard deviation is different. Suppose you want to estimate the proportion of Mexicans who are literate in Spanish and that previous studies suggest that 80% are literate. The formula to estimate the standard deviation is:

$$s = \sqrt{p(1 - p)}$$

In this case p = .80, so:

$$s = \sqrt{0.80 \times 0.20} = \sqrt{0.16} = 0.40$$

Assuming you want a confidence level of 95% and a confidence interval of ± 2%, the sample size is:

$$n = \frac{(1.96)^2 (0.40)^2}{(0.02)^2} = \frac{3.8416 \times 0.16}{0.0004} \approx 1537$$

You need to interview 1537 respondents to be confident at the 95% level that your estimate of the proportion of Mexicans who are literate falls within ± 2%.

There is a rule of thumb in survey research that 1600 responses is an adequate sample for most research. This is true when the variable of interest is a proportion that approaches 50%, where its variance is greatest. In the case of a binary proportion, for example, such as the answer to a yes/no question, the maximum variance occurs when half the respondents (0.5) answer “yes” and half (0.5) answer “no.” With a confidence level of 95% and a precision level of 2.5%, the suggested sample is 1537, or roughly 1600 respondents.

If you are willing to settle for a precision level of 5%, then a sample size of 384 is called for. If the variance is more complicated, then 1600 responses may be too few, and if you are willing to settle for a precision level of 5%, then 1600 responses is too many. Notice the ratio and the cost of that ratio: you have to quadruple the sample size to halve the precision interval.

What this means, of course, is that surveys do not have margins of error; only responses to particular questions and statistics or indexes calculated from responses have a margin of error. When press reports say that a survey of 1600 people has a margin of error of 2.5%, that’s usually bad reporting because surveys contain so many questions with so many different variances. This is more than semantic quibbling; an erroneously applied margin of error can lead to a serious misinterpretation of results.

When all is said and done, sample size is often determined by budget constraints rather than by formulas. If the formula calls for 3000 responses at $10 each, and there is only $20,000 in the budget, researchers live with 2000 surveys. There is nothing wrong with this, so long as one is aware of the effect this may have on confidence and precision.

Notes

1. Estimating the variance of a question in a population can be difficult. Fortunately, variance for most responses tends to change less quickly over time than means or proportions. If previous studies suggest a variance, then it can be used. Otherwise, estimating the maximum variance expected is recommended.

2. This and other formulas in this article were taken from Parasuraman (1991).

Reference

Parasuraman, A. 1991. Marketing Research, 2d edition. Reading, MA: Addison-Wesley.
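As a supplementary illustration, the article's worked examples can be checked with a short Python sketch. It assumes the sample-size formula is n = (z s / H)^2, which reproduces the article's figures of 44, 1537, and 384, and that s for a proportion is the square root of p(1 - p). The function names are hypothetical, and rounding to the nearest whole respondent is an assumption made to match the article's numbers; rounding up with math.ceil would be the more conservative choice.

```python
import math

Z_95 = 1.96  # z value the article gives for 95% confidence


def exact_sample_size(z, s, h):
    """Unrounded n = (z * s / h)^2 (assumed form of the article's formula)."""
    return (z * s / h) ** 2


def sample_size(z, s, h):
    """Sample size rounded to the nearest whole respondent (an assumption)."""
    return round(exact_sample_size(z, s, h))


def proportion_sd(p):
    """Standard deviation of a yes/no variable with proportion p."""
    return math.sqrt(p * (1 - p))


# Mean age: s = 17 years, precision of plus or minus 5 years
print(sample_size(Z_95, 17, 5))

# Literacy: p = 0.80, precision of plus or minus 2 percentage points
print(sample_size(Z_95, proportion_sd(0.80), 0.02))

# Maximum-variance proportion (p = 0.5) at 2.5% and at 5% precision
print(sample_size(Z_95, proportion_sd(0.5), 0.025))
print(sample_size(Z_95, proportion_sd(0.5), 0.05))

# Halving the precision interval multiplies the unrounded sample size by four
ratio = (exact_sample_size(Z_95, 0.5, 0.025)
         / exact_sample_size(Z_95, 0.5, 0.05))
print(ratio)
```

Because H enters the formula squared, any halving of the precision interval quadruples the required sample, whatever the values of z and s.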

Downloaded from fmx.sagepub.com at CORNELL UNIV on June 24, 2015
