The third Community Innovation Survey

METHODOLOGICAL RECOMMENDATIONS

Methodological recommendations for CIS3.


1. Target population
The target population of the CIS3 shall be the total population of enterprises of a
certain size and industry.

NACE
The following industries shall be included in the target population of the CIS3:
- mining and quarrying (NACE 10-14)
- manufacturing (NACE 15-37)
- electricity, gas and water supply (NACE 40-41)
- wholesale trade (NACE 51)
- transport, storage and communication (NACE 60-64)
- financial intermediation (NACE 65-67)
- computer and related activities (NACE 72)
- research and development (NACE 73)
- architectural and engineering activities (NACE 74.2)
- technical testing and analysis (NACE 74.3)

Size-classes
The cut-off point for inclusion in the target population should not be more than 10
employees in any of the specified sectors. Countries may also include enterprises with
fewer than 10 employees, if they are treated separately.
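
As an illustration, the NACE and size criteria above can be expressed as a simple
membership test. This is only a sketch: the function name, the string format of the
NACE code (e.g. "29.5" or "74.20") and the default cut-off of 10 employees are
assumptions made for the example, not part of the recommendations.

```python
# Two-digit NACE divisions listed for the CIS3 target population.
CORE_DIVISIONS = (
    set(range(10, 15))      # mining and quarrying (10-14)
    | set(range(15, 38))    # manufacturing (15-37)
    | {40, 41}              # electricity, gas and water supply
    | {51}                  # wholesale trade
    | set(range(60, 65))    # transport, storage and communication (60-64)
    | set(range(65, 68))    # financial intermediation (65-67)
    | {72, 73}              # computer activities; research and development
)
# Within division 74 only these two groups are covered.
CORE_GROUPS = {"74.2", "74.3"}


def in_target_population(nace_code: str, employees: int, cutoff: int = 10) -> bool:
    """Return True if an enterprise falls inside the target population.

    nace_code is assumed to be a string such as "29", "29.5" or "74.20";
    cutoff is the national size threshold (assumed here to be 10 employees).
    """
    if employees < cutoff:
        return False
    division = int(nace_code.split(".")[0])
    if division in CORE_DIVISIONS:
        return True
    # "74.20" -> "74.2": compare at the three-digit group level.
    return nace_code[:4] in CORE_GROUPS
```

A country that also surveys enterprises below the threshold would pass a lower
cutoff and keep those units in a separate domain.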

Statistical unit
The statistical unit for CIS3 shall be the enterprise, as defined in the Council
Regulation¹ on statistical units or as defined in the statistical business register. If,
for specific reasons, the enterprise is not feasible as the statistical unit, other units
such as divisions of enterprise groups, kind-of-activity units or even the enterprise
group can be used. It is important that the data collectors know which unit each report
relates to and make the necessary adjustments to avoid double-counting or missing
reporting. Any unit other than the enterprise should be indicated in the database.

2. Survey methodology

Sampling frame
The sampling frame should be a business register of the highest possible quality,
containing basic information such as names, addresses, NACE sector, size and region
of all enterprises in the target population. The ideal frame would be an up-to-date
official business register established for statistical purposes. If possible, the official
statistical business register of the country should be used. Otherwise other registers
have to be used. If the quality is too poor, e.g. the register does not cover all sampling
units of interest, or important variables are missing or not up to date, it is necessary
to improve the register; otherwise the survey may fail.

¹ Council Regulation (EEC) No 696/93 of 15 March 1993, OJ No L 76 of 30 March 1993.

Census or sample survey
Innovation data may be collected through either a census or a sample survey. Resource
limitations and response burden will in most cases rule out a survey of the entire
population (census). It is assumed that the third Community Innovation Survey, like
the second, should be based on mail surveys.

A census may, however, be unavoidable in some cases. It may be a legal requirement
that all business surveys have to be censuses. In addition, when the frame population
is fairly small (e.g. in small countries), proper sampling technique may produce a
sample size which is relatively close in size to the frame population. In such cases
censuses may be worth considering.

Stratification
When one has to deal with a heterogeneous population, a sound technique is to break
the target population into similarly structured subgroups or strata. Appropriate
stratification gives results with smaller sampling errors than a non-stratified sample of
the same size and also makes sure that there are enough units in the respective
domains of the study.

The best stratification variables for CIS3, i.e. the characteristics used to break down
the sample into similarly structured groups, are:
- the industry classification (NACE)
- size according to number of employees.
These two variables are highly correlated with innovation activity.
The size classes should at least be the following 3 classes: 10-49 employees (small),
50-249 employees (medium-sized) and 250+ employees (large). More detailed size
bands within these 3 classes may also be used, but whenever these are grouped, they
should fit in the size bands mentioned above.
Stratification by NACE should be at the 2-digit level (division) or by groups of
divisions, with 74.2 and 74.3 as exceptions.
The regional dimension should be taken into account by checking that the regional
allocation of sample units seems reasonable compared to the regional distribution of
the population before the sample is finally decided.
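
The stratification rules above can be sketched as a small helper that maps an
enterprise to its stratum. The function names and the string labels for the size
classes are assumptions made for this example; the size bands and the treatment of
74.2 and 74.3 follow the text.

```python
def size_class(employees: int) -> str:
    """Map a number of employees to one of the three minimum size classes."""
    if employees < 10:
        raise ValueError("below the 10-employee cut-off of the target population")
    if employees <= 49:
        return "small"
    if employees <= 249:
        return "medium"
    return "large"


def stratum(nace_code: str, employees: int) -> tuple[str, str]:
    """Return the (NACE stratum, size class) pair for one enterprise.

    NACE is taken at the 2-digit division level, except 74.2 and 74.3,
    which are kept at the 3-digit group level.
    """
    group = nace_code[:4]
    nace_stratum = group if group in ("74.2", "74.3") else nace_code.split(".")[0]
    return (nace_stratum, size_class(employees))
```

For instance, an enterprise coded "29.52" with 49 employees falls in stratum
("29", "small"), while one coded "74.30" with 250 employees falls in
("74.3", "large").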

Sampling
It is recommended to select samples for innovation surveys according to a randomising
procedure. Only random samples offer the major advantage that the order of
magnitude of sampling errors can be controlled at the design stage and be
determined after the survey solely from the data obtained with the sample.

Allocation principle
In general, there are two main ways of calculating the sample size in each stratum,
either proportional or optimum allocation. Both methods are acceptable.
In the pure proportional allocation method, the same sampling fraction is applied to
all strata, thus yielding a self-weighting sample. If numerous estimates have to be
made, a self-weighting sample is time-saving. However, it is recommended to fix
different sampling fractions, at least according to size. A full census, or a high
sampling fraction, is recommended for large enterprises, a lower sampling fraction for
medium-sized enterprises and an even lower one for small enterprises. The allocation
principle can be further fine-tuned by NACE; normally, the larger the number of
enterprises in a stratum, the smaller the sampling fraction should be.
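
A minimal sketch of allocation with size-dependent sampling fractions is given
below. The fractions (25%, 50% and 100%) are purely illustrative assumptions, as are
the function name and the dictionary layout; the recommendations only state that the
fraction should rise with enterprise size.

```python
import math

# Illustrative sampling fractions by size class (assumed values, not prescribed).
ILLUSTRATIVE_FRACTIONS = {"small": 0.25, "medium": 0.5, "large": 1.0}


def stratum_sample_sizes(frame_counts, fractions=ILLUSTRATIVE_FRACTIONS):
    """Apply a size-dependent sampling fraction to every (NACE, size) stratum.

    frame_counts maps (nace, size_class) to the number of enterprises in the
    frame. Sizes are rounded up and can never exceed the stratum population.
    """
    sizes = {}
    for (nace, size), n_frame in frame_counts.items():
        sizes[(nace, size)] = min(n_frame, math.ceil(fractions[size] * n_frame))
    return sizes


frame = {("29", "small"): 400, ("29", "medium"): 120, ("29", "large"): 30}
sample_sizes = stratum_sample_sizes(frame)
```

With a fraction of 1.0 the large enterprises are surveyed as a census. Setting the
same fraction for all size classes gives the pure proportional, self-weighting case.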

The aim of optimum allocation is either to minimise the variance of an estimator for a
specified cost or to minimise the cost for a specified variance. The allocation is based
on the following rules: a larger sample is selected for a stratum if it has a
higher weight in the population, if it has a higher variance and if the survey cost per
unit is lower in this stratum. The variance in each stratum can be estimated from
previous CIS data.

If new sectors of the economy are added in CIS3, data from CIS2 will obviously not
be available for them. However, the optimal allocation can still be carried out with the
help of assumptions. Either the new sectors are treated like the average national firm
(based on the national mean of the share of innovators by size class) or one can assume
that the new sector will be close to a sector that has already been sampled previously.

Precision
The sample should be designed to achieve a certain level of precision with
regard to the following indicators:
1. the percentage of innovators
2. the share of new or improved products in total turnover
3. total turnover per employee.

It is recommended that the 95% confidence interval for the first two indicators should
be within X ± 5%. For the last indicator the confidence interval should be within ±
10% of the estimated indicator.
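
To give a feel for the sample sizes implied by such a precision target, the sketch
below computes the number of responses needed for a 95% confidence interval of a
proportion. It rests on several assumptions that are not in the recommendations:
simple random sampling within a single domain, the ± 5% read as an absolute
half-width of 5 percentage points, the normal approximation with z = 1.96, and the
worst case p = 0.5 when the share of innovators is unknown.

```python
import math


def required_sample_size(half_width=0.05, p=0.5, population=None, z=1.96):
    """Responses needed so that a proportion p has a CI of +/- half_width.

    Uses n0 = z^2 * p * (1 - p) / half_width^2 and, if a population size is
    given, the usual finite population correction n0 / (1 + (n0 - 1) / N).
    """
    n0 = z ** 2 * p * (1 - p) / half_width ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)
```

Under these assumptions about 385 responses are needed for a very large population,
and 278 for a domain of 1000 enterprises. A stratified design would normally need
fewer units for the same precision.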

Response rate.
The response rates in CIS2 varied between countries from 24% to 90%, with 57% as
the average. The expected response rate has to be taken into account when
determining the sample size. The optimal number of enterprises calculated for each
stratum should be the number expected in the realised sample.
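
In practice this means inflating the calculated stratum size by the expected response
rate. The helper below is a sketch; its name and the optional cap at the frame size
are assumptions made for the example.

```python
import math


def gross_sample_size(target_responses, expected_response_rate, frame_size=None):
    """Number of enterprises to contact so that the realised sample is expected
    to reach target_responses, optionally capped at the stratum frame size."""
    if not 0 < expected_response_rate <= 1:
        raise ValueError("expected_response_rate must be in (0, 1]")
    gross = math.ceil(target_responses / expected_response_rate)
    if frame_size is not None:
        gross = min(gross, frame_size)
    return gross
```

With the CIS2 average response rate of 57%, a stratum that needs 100 responses
requires 176 enterprises to be contacted. If the stratum only contains 150
enterprises, the result is a census of that stratum.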

3. Collecting and processing of data

Unit non-response
All efforts should be made to minimise unit non-response. To secure an acceptable
response rate, at least two reminders have to be sent. If non-response exceeds
30%, measured as the ratio between non-responding and operating enterprises in the
sample population as a whole (excluding enterprises that are no longer operating or
could not be found), a non-response analysis is needed. The experience from CIS 1 and
CIS 2 showed that non-respondents might be biased towards a certain type of
enterprise.

Therefore, a non-response analysis has to be undertaken, covering the main questions of
the ordinary questionnaire in order to make it possible to distinguish innovators from
non-innovators. A sample of at least 10% of the non-respondents, excluding enterprises
that are no longer operating, should be drawn. Given that the sample for the
non-response analysis has to be relatively small and given the large number of
strata, stratification by NACE and size class need not be used. However, information
on NACE and size class must be available and may also be used as stratification
variables, but this requires a larger sample. The non-response analysis should be
made on the basis of a simple random sample of the non-respondent population.
In order for the non-response analysis to be useful, it should have a very high
response rate (preferably 100%).
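
The 30% trigger and the 10% follow-up sample can be sketched as follows. Function
names, argument names and the use of a seeded random generator are assumptions for
the example; enterprises that are no longer operating or were not found are assumed
to have been removed from the list of non-respondents beforehand.

```python
import math
import random


def non_response_rate(n_sampled, n_not_operating, n_responded):
    """Share of operating enterprises in the sample that did not respond."""
    operating = n_sampled - n_not_operating
    return (operating - n_responded) / operating


def draw_follow_up_sample(non_respondent_ids, fraction=0.10, seed=None):
    """Simple random sample of at least `fraction` of the non-respondents."""
    ids = list(non_respondent_ids)
    k = math.ceil(fraction * len(ids))
    return random.Random(seed).sample(ids, k)


rate = non_response_rate(n_sampled=1000, n_not_operating=50, n_responded=600)
needs_analysis = rate > 0.30
follow_up = draw_follow_up_sample(range(350), seed=1)
```

In this example 350 of the 950 operating enterprises did not respond, so the
threshold is exceeded and at least 35 non-respondents are drawn for the follow-up.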

Item non-response
Item non-response and partial item non-response should be kept to a minimum by
contractors by asking the enterprises for additional information. Item non-response for
variables on general and basic information about the enterprises should not exist.
This information should be available in the business registers or from other sources.
Contractors should as far as possible impute item non-response and use auxiliary
information whenever possible. The data should also be checked and corrected for
logical inconsistencies.

Weighting factors
To extrapolate results to the whole target population, weighting factors have to be
calculated. The weighting factors should be based on the ratio between the number of
enterprises (or employees) in the realised sample and the total number of
enterprises (or employees) in each stratum of the frame population, corrected for
enterprises that no longer exist and for changes in size or NACE class, and adjusted
for non-response. If a non-response analysis is carried out, its results will be used
in the calculation of the weighting factors.
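
A minimal sketch of enterprise-based weighting factors is shown below. It assumes
that the frame counts have already been corrected for enterprises that no longer exist
and for changes of stratum, and it uses the simplest form of weight, the corrected
frame count divided by the number of respondents in the stratum. Any adjustment based
on a non-response analysis is left out, and the function name is hypothetical.

```python
def stratum_weights(frame_counts, realised_counts):
    """Weight per stratum = corrected frame count / realised sample count.

    Both arguments map a stratum key to a number of enterprises. A stratum
    without respondents cannot be weighted and has to be merged with a
    neighbouring stratum first, so it raises an error here.
    """
    weights = {}
    for stratum, n_frame in frame_counts.items():
        n_realised = realised_counts.get(stratum, 0)
        if n_realised == 0:
            raise ValueError(f"no respondents in stratum {stratum!r}")
        weights[stratum] = n_frame / n_realised
    return weights


frame = {("29", "small"): 200, ("29", "large"): 30}
realised = {("29", "small"): 40, ("29", "large"): 30}
weights = stratum_weights(frame, realised)
```

Each responding small enterprise then represents 5 enterprises in the population,
while each large enterprise, surveyed as a census with full response, represents only
itself.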

Quality control of the data
It is important that quality controls are carried out on the data at micro- and macro-
level by the Member States before they are sent to Eurostat. There should be a set of
controls that are agreed upon as common rules. Eurostat will work out a proposed set
of rules based on those used in CIS2.

4. Transmission of data
The Member States should send individual microdata to Eurostat. The data should be
sent in a fixed file format, preferably using STADIUM. The data will be handled in
line with the rules for treatment of confidential Community data.
In addition to microdata, a set of specified tables with background information
(population, sample) and with the main variables has to be sent to Eurostat, first
preliminary results and then final results. The tables with variables should be
controlled national data that can be put into the database New Cronos with minimum
modifications. Confidential data in the tables have to be identified and marked by
each Member State.

Annex

Optimum allocation:

In optimum allocation, the theory of stratified sampling² shows that an optimum for
the number of units to be sampled in a stratum is reached at:

\[
\frac{n_h}{n} = \frac{N_h S_h / \sqrt{C_h}}{\sum_h \left( N_h S_h / \sqrt{C_h} \right)} \tag{1}
\]

where n_h is the number of units in the sample in stratum h
      n   is the total sample size
      N_h is the total number of units in the frame population for stratum h
      S_h is the true standard deviation in stratum h for the relevant variable
      C_h is the cost for sampling one unit in stratum h

The first step is to determine the total sample size. Under a cost constraint, the aim
is to minimise the variance of the estimator, and the total sample size should be
determined according to:

\[
n = \frac{(C - C_0) \sum_h \left( N_h S_h / \sqrt{C_h} \right)}{\sum_h \left( N_h S_h \sqrt{C_h} \right)} \tag{2}
\]

where C   is the total cost
      C_0 represents an overhead cost
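
Equations 1 and 2 can be sketched in a few lines of code. The sketch assumes the
standard form of optimum allocation given in Cochran, in which the unit cost enters
through its square root: n_h is proportional to N_h S_h / sqrt(C_h). Function names
and the tuple layout (N_h, S_h, C_h) are assumptions made for the example, and the
figures are invented.

```python
import math


def optimum_total_sample_size(total_cost, overhead_cost, strata):
    """Equation 2: total sample size n for a fixed budget.

    strata is a list of (N_h, S_h, C_h) tuples.
    """
    numerator = sum(N * S / math.sqrt(C) for N, S, C in strata)
    denominator = sum(N * S * math.sqrt(C) for N, S, C in strata)
    return (total_cost - overhead_cost) * numerator / denominator


def optimum_allocation(n, strata):
    """Equation 1: split n over the strata in proportion to N_h S_h / sqrt(C_h)."""
    terms = [N * S / math.sqrt(C) for N, S, C in strata]
    total = sum(terms)
    return [n * term / total for term in terms]


# Invented example: three strata given as (N_h, S_h, C_h).
strata = [(1000, 0.5, 1.0), (200, 0.4, 4.0), (50, 0.3, 1.0)]
n = optimum_total_sample_size(total_cost=775, overhead_cost=100, strata=strata)
allocation = optimum_allocation(n, strata)
```

Here n is 555 and the allocation is 500, 40 and 15 units, which spends exactly the
available 675 cost units. The results are not rounded, and a stratum whose allocation
exceeds N_h would have to be surveyed as a census with the remainder reallocated.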

An important special case arises if the cost per unit is the same in all strata. Then the
Neyman allocation can be used. The size of the sample in each stratum is:

\[
n_h = n \, \frac{W_h S_h}{\sum_h W_h S_h} = n \, \frac{N_h S_h}{\sum_h N_h S_h} \tag{3}
\]

where W_h is the weight of each stratum, i.e. N_h/N, where N is the total number of
units in the frame.
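
The Neyman allocation is straightforward to compute once N_h and S_h are known. The
sketch below uses invented figures; the function name and the tuple layout
(N_h, S_h) are assumptions for the example.

```python
def neyman_allocation(n, strata):
    """Equation 3: split n over the strata in proportion to N_h * S_h.

    strata is a list of (N_h, S_h) tuples; results are not rounded.
    """
    terms = [N * S for N, S in strata]
    total = sum(terms)
    return [n * term / total for term in terms]


# Invented example: N_h * S_h equals 500, 80 and 20, so the shares are
# 500/600, 80/600 and 20/600 of the total sample.
allocation = neyman_allocation(300, [(1000, 0.5), (200, 0.4), (50, 0.4)])
```

With a total sample of 300 this gives 250, 40 and 10 units. The smallest stratum gets
a sampling fraction of 20% compared with 25% in the largest one, because its higher
frame share is outweighed by its lower standard deviation.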

If instead the cost has to be minimised for a fixed variance, then n should be:

\[
n = \frac{\left( \sum_h W_h S_h \sqrt{C_h} \right) \sum_h \left( W_h S_h / \sqrt{C_h} \right)}{V + \frac{1}{N} \sum_h W_h S_h^2} \tag{4}
\]

where V is the fixed variance selected according to the degree of precision requested
for the relevant variable.
The number of enterprises to be sampled in each stratum is then determined by
equation 1.

² For further information see Cochran, William G., Sampling Techniques, 3rd edition, John Wiley, 1977.
