CIM Market Information and Research 2009-2010

Published by: Bola Balogun on Oct 24, 2011
Copyright: Attribution Non-commercial



There is no necessary relationship between the size of the population and the size of the sample. Although a larger sample generally gives more accurate results, that accuracy has to be traded off against the cost of achieving it and against the complexity, and therefore the cost, of managing the collection and processing of large amounts of data.

The cost of obtaining more responses is normally proportional: a given percentage increase in sample size produces the same percentage increase in cost. The increase in accuracy, however, is not proportional. As Wilson (2006) points out, sampling error tends to decrease at a rate equal to the square root of the relative increase in sample size. Doubling the sample (an increase of 100%) therefore reduces the sampling error by only about 29%, and halving the error requires a sample four times as large.
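The square-root relationship can be checked with a few lines of Python. This is a minimal sketch that assumes sampling error is proportional to 1/√n, as it is for the standard error of a mean; the function name `relative_sampling_error` is illustrative and does not come from the text.

```python
import math


def relative_sampling_error(size_multiplier: float) -> float:
    """Sampling error after scaling the sample size, relative to the original error.

    Assumes error is proportional to 1 / sqrt(n).
    """
    if size_multiplier <= 0:
        raise ValueError("size_multiplier must be positive")
    return 1 / math.sqrt(size_multiplier)


# Compare the cost (sample multiplier) with the gain (error reduction).
for multiplier in (2, 4, 100):
    reduction = 1 - relative_sampling_error(multiplier)
    print(f"sample x{multiplier}: sampling error falls by {reduction:.0%}")
```

Under this assumption, a sample twice as large cuts the error by about 29%, four times as large halves it, and a hundred times as large cuts it by 90%, while the cost rises in step with the sample size.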

Sample size is often determined by past experience. Previous studies will


• The degree of variability in the population – the more the variability, the larger the sample size will need to be.

• The likely response rates – if these are believed to be low, the sample will need to be larger.

Determining the Sample Size 195

• The incidence rate of the characteristic being researched – if this is common, the sample may be smaller.

• The number of sub-groups within the data – smaller groups have larger sampling errors, so a larger sample might be needed to ensure that sub-groups can be effectively analysed.

Other factors play a key role in determining sample size. These include:

• Budget – always a factor in marketing decisions; the larger the sample size, the greater the cost.

• Timings – the larger the sample size, the longer it takes to gather data and complete the analysis.

• The risk attached to any decision – the greater the risk, the higher the level of accuracy required.

The nature of the research may indicate complex analysis of sub-samples,

for example women as opposed to men buying a certain product; if this is the

case the sub-samples need to be large enough to ensure statistical reliability.

Statistical techniques for determining sample size

For probability samples, statistical methods are used to establish sample size.


We need three pieces of information to work this out.

1. The variance, or degree of variability, of the population, measured by its standard deviation.

2. The required limit of accuracy or sampling error.

3. The required level of confidence that the results will fall within a certain range.

Variance is a measure of how spread out a data set is. We work it out by looking at the average squared deviation of each number from the mean.

There are different formulae for working out variance but the one most commonly used in market research takes into account the potential bias in a sample.


The formula is

$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$

196 CHAPTER 8: Sampling

where $X_i$ is the individual value in an array of data,

$\bar{X}$ is the mean of the array and

$n$ is the number of values in the array.

For example, for the numbers 1, 3, 6, 4 and 1, the number of values is 5 and the mean is 15/5 = 3. The squared differences from the mean are 4, 0, 9, 1 and 4, which sum to 18.

Sum of squared differences divided by number of observations less 1 is 18/(5 – 1) = 4.5.

This is the variance.

Standard deviation is used to compare the spread of data sets. The more

spread a set of values, the higher the standard deviation.

Standard deviation is the square root of the variance which we calculated

above. You can see that the formula within the square root symbol is the

formula we used to calculate variance.

$$SD = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$

where

$X_i$ = the value of each data point

$\bar{X}$ = the average of all the data points

$\Sigma$ = the Greek letter sigma, meaning ‘sum of’ and

$n$ = the total number of data points.




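Worked table for the values 1, 3, 6, 4 and 1 (mean = 3):

Value      Deviation from mean      Deviation squared
1          -2                       4
3           0                       0
6           3                       9
4           1                       1
1          -2                       4
Sum: 15                             18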

Sum of squared differences divided by number of observations less 1 is

18/(5 – 1)=4.5.

This is the variance.

The standard deviation is the square root of the variance or 2.12.
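The worked example can be reproduced in Python. This is a small sketch using the five values from the text; the variable names are illustrative, and the standard library's `statistics.variance` is used only as a cross-check because it also divides by n - 1.

```python
import statistics

values = [1, 3, 6, 4, 1]

# Step 1: the mean of the data.
mean = sum(values) / len(values)

# Step 2: squared deviation of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in values]

# Step 3: sample variance divides by n - 1, not n.
variance = sum(squared_deviations) / (len(values) - 1)

# Step 4: standard deviation is the square root of the variance.
std_dev = variance ** 0.5

print(f"mean = {mean}")
print(f"sum of squared deviations = {sum(squared_deviations)}")
print(f"variance = {variance}")
print(f"standard deviation = {std_dev:.2f}")

# Cross-check against the standard library's sample variance.
assert abs(variance - statistics.variance(values)) < 1e-12
```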

Normal distribution

Standard deviation is a measure of how widely values are dispersed from the

average value (the mean). The higher the standard deviation, the more

widely the values are spread. This allows us to use standard deviation to

compare data sets.

In order to apply this to the determination of sample size, we need to understand another concept: normal distribution. What it implies is that the values in many data sets, for example shoe size, height or income, are distributed in a similar way and follow the pattern shown below, known as a bell-shaped curve (Figure 8.2).

So what does this mean? The area under the curve represents all occurrences. The line through the centre of the curve is the mean value.

Normal distribution has another key characteristic. Sixty-eight per cent of all occurrences fall within one standard deviation of the mean.

Normal distribution also tells us that 95% of occurrences fall within 1.96 standard deviations of the mean. This is very important as for the most part market researchers work at this level of certainty. What it means effectively is that there is a 1 in 20 chance of an occurrence falling outside this predicted range. Normal distribution also tells us that 99% of occurrences fall within 2.58 standard deviations of the mean.

The key point is that for any data set that follows a normal distribution, these proportions are the same.

FIGURE 8.2 Normal distribution, the bell-shaped curve (50% of occurrences lie on each side of the mean).


To repeat:

• 68% of values fall within 1 standard deviation of the mean.

• 95% fall within 1.96 standard deviations of the mean.

• 99% fall within 2.58 standard deviations of the mean.
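These percentages can be checked against the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper names `share_within` and `z_value` are illustrative and not from the text.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(z: float) -> float:
    """Proportion of occurrences within z standard deviations of the mean."""
    return standard_normal.cdf(z) - standard_normal.cdf(-z)


def z_value(confidence: float) -> float:
    """Number of standard deviations that encloses the given share of occurrences."""
    return standard_normal.inv_cdf(0.5 + confidence / 2)


for z in (1.0, 1.96, 2.58):
    print(f"within {z} standard deviations: {share_within(z):.1%}")

for confidence in (0.95, 0.99):
    print(f"{confidence:.0%} confidence -> Z = {z_value(confidence):.2f}")
```

Going in the other direction, `z_value(0.95)` and `z_value(0.99)` round to the 1.96 and 2.58 quoted in the list.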

These percentages (68, 95 and 99%) are known as confidence levels and are the same for all data sets that conform to a normal distribution. There are other types of distribution but you need not go further into this for the course.

For our purpose, marketers generally use the 95 or 99% confidence levels. These correspond to 1.96 and 2.58 standard deviations, and these multipliers, known as Z values, are the figures used in the calculations. The upper and lower limits of the range that they indicate (e.g. ±1.96 standard deviations) are called the confidence limits. The range itself is the confidence interval. Together these represent the most valuable tools for estimating occurrences in the total market from a smaller sample.

There are two different ways of working out sample sizes for random samples, and these depend on whether we are measuring averages or proportions.


For studies involving averages or means

The formula to work out sample size is

$$N = \frac{Z^2 \sigma^2}{E^2}$$

• where $Z$ is the Z value for the chosen confidence level

• $\sigma$ is the population standard deviation

• $E$ is the acceptable level of precision.

Specify the level of precision

The level of precision is worked out by clients and researchers and reflects

the budget available and the acceptable margin of error or degree of risk

attached to the outcome of the research. If there is a need for accurate data,

the sample size may be larger and the level of precision would be tighter.

Determine the acceptable confidence level

As we have seen above, the standard level of confidence is 95%. Remember, this means that at the 95% confidence level there is a 1 in 20 chance that the true value lies outside the confidence interval. If the level of risk were high, we could work at the 99% confidence level, where there is a 1 in 100 chance of this.

Estimate the standard deviation

It is impossible to know this before carrying out the survey, so an estimate is

required. This can be based upon:

• Previous studies

• Secondary research

• The result of pilot surveys

• Judgement.

Once the study is completed, the sample mean and standard deviation can

be calculated, and the exact confidence level and limits of error can be

worked out.

Remember the formula, and work through the example

$$N = \frac{Z^2 \sigma^2}{E^2}$$

The sample required is 443.

Play around with the formula. Change the required level of precision and

look at the impact on the sample size required.
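One way to play around with the formula is a short Python function. This is a sketch only: the standard deviation of 50 and the precision levels tried below are hypothetical values chosen for illustration, not the figures behind the example in the text, and the function name `sample_size_for_mean` is an assumption. The result is rounded up, a common convention so that the precision target is met.

```python
import math


def sample_size_for_mean(z: float, sigma: float, e: float) -> int:
    """Sample size N = Z^2 * sigma^2 / E^2, rounded up to a whole respondent."""
    if e <= 0:
        raise ValueError("precision e must be positive")
    return math.ceil((z * sigma / e) ** 2)


Z_95 = 1.96          # Z value for the 95% confidence level
ASSUMED_SIGMA = 50   # hypothetical estimate of the standard deviation

# Tightening the precision (smaller E) increases the sample sharply.
for precision in (10, 5, 2.5):
    n = sample_size_for_mean(Z_95, ASSUMED_SIGMA, precision)
    print(f"E = {precision}: sample size {n}")
```

Because E is squared in the denominator, halving the acceptable error roughly quadruples the sample required.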

Studies involving proportions

Studies measuring the proportion of a population having a certain character-

istic are often required in marketing and in surveys, for example the propor-

tion responding to a promotion or the number of voters against university

top-up fees. To determine sample size here a different formula is needed.

Remember Z is the Z value for our confidence level; let us use the standard marketing confidence level of 95%, so Z is 1.96.

E is the limit of error. In this case we need the results to be correct to within, let us say, ±3%, written as a decimal: ±0.03.

P is the estimated proportion of the population who have the characteristic. In this case we will look at the number of people who may respond to a test mailing and we estimate that 15% may respond. This again is written as a decimal: 0.15.

So, let us work this through:

$$N = \frac{Z^2 \, P(1 - P)}{E^2} = \frac{1.96 \times 1.96 \times [0.15(1 - 0.15)]}{0.03 \times 0.03} \approx 544$$


We would therefore need a sample of 544 to be 95% confident that the response rate on roll out of the campaign will fall within ±3 percentage points of our 15% estimate.

If we reduced the limit of error to ±1%, the sample size would increase to 4896.

At that ±1% limit, if the estimated response were 2%, the sample size would decrease to about 750. The figure reduces because the variance in the population, P(1 – P), is lower.

If the estimated response rate went to 20%, then the sample required would be 6144.

(These figures round Z² to 3.84.)
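The proportion calculations can be sketched in Python as follows. The function name `sample_size_for_proportion` is illustrative. The function uses Z = 1.96 without rounding Z², so at the 1% limit of error it gives results a few respondents higher than the figures quoted in the text, which are consistent with Z² rounded to 3.84.

```python
def sample_size_for_proportion(z: float, p: float, e: float) -> float:
    """Sample size N = Z^2 * P(1 - P) / E^2 (unrounded)."""
    if not 0 < p < 1:
        raise ValueError("p must be a proportion between 0 and 1")
    if e <= 0:
        raise ValueError("limit of error e must be positive")
    return z ** 2 * p * (1 - p) / e ** 2


Z_95 = 1.96  # Z value for the 95% confidence level

# (estimated response rate, limit of error) pairs discussed in the text.
scenarios = [(0.15, 0.03), (0.15, 0.01), (0.02, 0.01), (0.20, 0.01)]

for p, e in scenarios:
    n = sample_size_for_proportion(Z_95, p, e)
    print(f"P = {p:.0%}, E = {e:.0%}: sample size about {round(n)}")
```

The term P(1 - P) is largest at P = 0.5, which is why an estimate of 20% needs a larger sample than 15%, and 2% needs a much smaller one.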

Adjustment for larger samples

We have said that there is no direct relationship between population size and the sample size needed to estimate a characteristic with a given level of error and confidence. The assumption is that sample elements are drawn independently of one another. This cannot be assumed when the sample is more than 10% of the population. If this is the case, an adjustment is made, called the finite population correction factor.

The calculation reduces the required sample:

$$N_1 = \frac{nN}{N + n - 1}$$

• $N_1$ is the revised sample size

• $n$ is the original sample size

• $N$ is the population size.

For example, if the population has 2000 elements and the original sample size is 400, then

$$N_1 = \frac{400 \times 2000}{2000 + 400 - 1} = \frac{800\,000}{2399} \approx 333$$
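The adjustment can be sketched in Python as below. The function name `finite_population_correction` is illustrative, and the check against the 10% threshold is included only to mirror the rule stated in the text.

```python
def finite_population_correction(n: float, population: int) -> float:
    """Revised sample size N1 = n * N / (N + n - 1)."""
    if n <= 0 or population <= 0:
        raise ValueError("n and population must be positive")
    return n * population / (population + n - 1)


original_sample = 400
population_size = 2000

# The correction applies when the sample exceeds 10% of the population.
needs_correction = original_sample / population_size > 0.10
revised = finite_population_correction(original_sample, population_size)

print(f"sample is {original_sample / population_size:.0%} of the population")
print(f"correction needed: {needs_correction}")
print(f"revised sample size: {round(revised)}")
```

For a very large population the correction makes almost no difference, which is consistent with the earlier point that population size has no direct bearing on sample size.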

Other rule-of-thumb factors to consider in setting sample sizes:

• Trade off cost against reliability and accuracy.

• Minimum subgroup sizes should be more than 100 respondents. It is difficult to be confident in figures lower than this.

• The average sample size in national surveys in the United Kingdom is around 1000–2000 respondents. Minimum sample sizes in the FMCG markets are 300–500 respondents.
