Methodological Recommendations
NACE
The following industries shall be included in the target population of the CIS3:
- mining and quarrying (NACE 10-14)
- manufacturing (NACE 15-37)
- electricity, gas and water supply (NACE 40-41)
- wholesale trade (NACE 51)
- transport, storage and communication (NACE 60-64)
- financial intermediation (NACE 65-67)
- computer and related activities (NACE 72)
- research and development (NACE 73)
- architectural and engineering activities (NACE 74.2)
- technical testing and analysis (NACE 74.3)
Size-classes
The cut-off point for inclusion in the target population should not be more than 10
employees in any of the specified sectors. Countries may also include enterprises with
fewer than 10 employees, provided they are treated separately.
Statistical unit
The statistical unit for CIS3 shall be the enterprise, as defined in the Council
Regulation [1] on statistical units or as defined in the statistical business register. If,
for specific reasons, the enterprise is not feasible as the statistical unit, other units such
as divisions of enterprise groups, kind-of-activity units or even enterprise groups can be
used. It is important that the data collectors know which unit each report relates to and
make the necessary adjustments to avoid double-counting or gaps in reporting. Any
unit used other than the enterprise should be indicated in the database.
2. Survey methodology
Sampling frame
The sampling frame should be a business register of the best possible quality,
containing basic information such as name, address, NACE sector, size and region
for all enterprises in the target population. The ideal frame would be an up-to-date
official business register established for statistical purposes. If possible, the official
statistical business register of the country should be used. Otherwise other registers
Stratification
When one has to deal with a heterogeneous population, a sound technique is to break
the target population into similarly structured subgroups, or strata. Appropriate
stratification gives results with smaller sampling errors than a non-stratified sample of
the same size and also ensures that there are enough units in the respective
domains of the study.
The best stratification variables for CIS3, i.e. the characteristics used to break down
the sample into similarly structured groups, are:
- the industry classification (NACE)
- size according to number of employees.
These two variables are highly correlated with innovation activity.
The size-classes should at least be the following three: 10-49 employees (small),
50-249 employees (medium-sized) and 250+ employees (large). More detailed
size bands within these three classes may also be used, but whenever these are grouped, they
should fit in the size bands mentioned above.
Stratification by NACE should be at the 2-digit level (division) or by groups of divisions,
with 74.2 and 74.3 as exceptions.
The regional dimension should be taken into account by checking that the regional
allocation of sample units seems reasonable compared to the regional distribution of
the population before the sample is finally decided.
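
The stratification rules above can be illustrated with a minimal sketch. It assumes that NACE codes are held as strings such as "15.81" or "74.20" and that employment is a head count; the function names `size_class` and `stratum` are hypothetical and not part of the recommendations.

```python
def size_class(employees):
    """Return the size class of an enterprise, or None below the 10-employee cut-off."""
    if employees < 10:
        return None  # outside the core target population
    if employees < 50:
        return "small"
    if employees < 250:
        return "medium"
    return "large"


def stratum(nace_code, employees):
    """Build a (NACE group, size class) stratum key.

    Assumption: nace_code is a string like "15.81". NACE 74.2 and 74.3 keep
    their 3-digit group; all other codes are reduced to the 2-digit division.
    """
    group = nace_code[:4] if nace_code[:4] in ("74.2", "74.3") else nace_code[:2]
    return (group, size_class(employees))
```

For example, `stratum("74.20", 300)` yields the key `("74.2", "large")`, while a manufacturer coded "15.81" with 35 employees falls into `("15", "small")`.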
Sampling
It is recommended to select samples for the innovation survey according to a randomising
procedure. Only random samples offer the major advantage that the order of
magnitude of sampling errors can be controlled at the design stage and be
determined after the survey solely from the data obtained with the sample.
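
As an illustration only, a randomising procedure could be a simple random sample without replacement drawn independently in each stratum. The sketch below assumes the frame is available as a mapping from stratum to a list of unit identifiers; the function name `draw_sample` and the fixed seed are hypothetical choices made so that the draw can be reproduced.

```python
import random


def draw_sample(frame, sizes, seed=2001):
    """Draw a simple random sample without replacement in each stratum.

    frame: dict mapping stratum -> list of unit identifiers (assumed layout)
    sizes: dict mapping stratum -> number of units to draw
    """
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    sample = {}
    for h, units in frame.items():
        n_h = min(sizes.get(h, 0), len(units))  # never draw more than exist
        sample[h] = rng.sample(units, n_h)
    return sample
```

A stratum whose requested size equals or exceeds its population is taken in full, which corresponds to a census of that stratum.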
Allocation principle
In general, there are two main ways of calculating the sample size in each stratum,
either proportional or optimum allocation. Both methods are acceptable.
In the pure proportional allocation method, the same sampling fraction is applied to
all strata, thus yielding a self-weighting sample. If numerous estimates have to be
made, a self-weighting sample is time-saving. However, it is recommended to fix
different sampling fractions, at least according to size. A full census, or a high sampling
fraction, is recommended for large enterprises, a lower fraction for medium-sized
enterprises and a lower one still for small enterprises. The allocation principle can be
further fine-tuned by NACE; normally, the larger the number of enterprises in a
stratum, the smaller the sampling fraction should be.
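
A minimal sketch of size-dependent sampling fractions follows. The fractions shown (census for large, one half for medium-sized, one quarter for small enterprises) are purely hypothetical values chosen for illustration; the recommendations do not prescribe them.

```python
import math

# Hypothetical sampling fractions by size class (not prescribed by the text).
FRACTIONS = {"large": 1.0, "medium": 0.5, "small": 0.25}


def stratum_sample_sizes(population, fractions=FRACTIONS):
    """Return n_h for each (NACE group, size class) stratum.

    population: dict mapping (nace_group, size_class) -> N_h
    The result is rounded up and capped at the stratum population.
    """
    return {
        h: min(N_h, math.ceil(N_h * fractions[h[1]]))
        for h, N_h in population.items()
    }
```

With a single fraction applied to every size class, the same function reproduces pure proportional allocation.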
The aim of optimum allocation is either to minimise the variance of an estimator for a
specified cost or to minimise the cost for a specified variance. The allocation follows
these rules: a larger sample is selected for a stratum if it has a
higher weight in the population, if it has a higher variance and if the survey cost per
unit is lower in this stratum. The variance in each stratum can be estimated from a previous
CIS.
If new sectors of the economy are added in CIS3, data from CIS2 will obviously not
be available for them. However, the optimum allocation can still be carried out using
assumptions: either the new sectors are treated like the average national firm (based
on the national mean of the share of innovators by size class), or one can assume that the
new sector will be close to a sector that has already been sampled.
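
The first of these assumptions can be sketched as follows. The sketch assumes that the design variable is the 0/1 innovator indicator, so that the stratum standard deviation is approximately the square root of p(1 - p), where p is the share of innovators; the function names and data layout are hypothetical.

```python
import math


def stratum_sd(share):
    """Approximate standard deviation of a 0/1 innovator indicator with the given share."""
    return math.sqrt(share * (1.0 - share))


def planning_sd(stratum, previous_shares, national_share_by_size):
    """Return a planning value of S_h for a (nace_group, size_class) stratum.

    Uses the share observed in a previous survey when the stratum was covered;
    otherwise falls back to the national mean share for the same size class.
    """
    _, size = stratum
    share = previous_shares.get(stratum, national_share_by_size[size])
    return stratum_sd(share)
```

The second assumption, borrowing from a similar sector, amounts to copying the entry of that sector into `previous_shares` under the key of the new sector.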
Precision
The sample should be designed to achieve a certain level of precision with
regard to the following indicators:
1. the percentage of innovators
2. the share of new or improved products in total turnover
3. total turnover per employee.
It is recommended that the 95% confidence interval for the first two indicators should
be within X ± 5%. For the last indicator the confidence interval should be within ±
10% of the estimated indicator.
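
As a rough planning aid, the precision target for the percentage of innovators can be checked with the usual normal approximation. The sketch below assumes simple random sampling without replacement within a single domain and reads "± 5%" as five percentage points; both are assumptions, and the function names are hypothetical.

```python
import math


def ci_half_width(p, n, N, z=1.96):
    """Half-width of the approximate 95% confidence interval for a proportion.

    p: planning value of the proportion, n: sample size, N: population size.
    Includes the finite population correction (N - n) / (N - 1).
    """
    fpc = (N - n) / (N - 1)
    return z * math.sqrt(fpc * p * (1.0 - p) / n)


def required_n(p, half_width, N, z=1.96):
    """Smallest n for which ci_half_width(p, n, N) does not exceed half_width."""
    n0 = z * z * p * (1.0 - p) / half_width ** 2  # size for an infinite population
    return math.ceil(n0 / (1.0 + (n0 - 1.0) / N))
```

Using p = 0.5 gives the most cautious sample size, because the variance of a proportion is largest there.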
Response rate
The response rates in CIS2 varied between countries from 24% to 90%, with an average
of 57%. The expected response rate has to be taken into account when
determining the sample size. The optimal number of enterprises calculated for each stratum
should be the number expected in the realised sample, so the number of enterprises
drawn must be correspondingly larger.
Weighting factors
To extrapolate results to the whole target population, weighting factors have to be
calculated. The weighting factors should be based on the ratio between the number of
enterprises (or number of employees) in the realised sample and the total number of
enterprises (or employees) in each stratum of the frame population, corrected for
enterprises that no longer exist and for changes in size or NACE class, and adjusted for
non-response. If a non-response analysis is carried out, its results should be used
in the calculation of the weighting factors.
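
A minimal sketch of enterprise-based weighting factors follows. It assumes that the weight of a stratum is the corrected frame count divided by the number of respondents in the realised sample, and that any correction for closed or reclassified enterprises has already been applied to the frame counts; the function names are hypothetical.

```python
def weighting_factors(frame_counts, realised_counts):
    """Return the weighting factor N_h / n_h for each stratum.

    frame_counts: dict stratum -> corrected number of enterprises in the frame
    realised_counts: dict stratum -> number of responding enterprises
    """
    factors = {}
    for h, n_h in realised_counts.items():
        if n_h == 0:
            raise ValueError(f"no respondents in stratum {h!r}")
        factors[h] = frame_counts[h] / n_h
    return factors


def estimate_total(values_by_stratum, factors):
    """Extrapolate a total: sum of weight times reported value over all respondents."""
    return sum(factors[h] * sum(values) for h, values in values_by_stratum.items())
```

Employee-based weights would follow the same pattern, with employment totals in place of enterprise counts.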
4. Transmission of data
The Member States should send individual microdata to Eurostat. The data should be
sent in a fixed file format, preferably using STADIUM. The data will be handled in
line with the rules for treatment of confidential Community data.
In addition to microdata, a set of specified tables with background information
(population, sample) and with the main variables has to be sent to Eurostat, first
as preliminary results and then as final results. The tables with variables should contain
checked national data that can be put into the New Cronos database with minimal
modification. Confidential data in the tables have to be identified and marked by
each Member State.
Annex
Optimum allocation:
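In optimum allocation, the theory of stratified sampling [2] shows that the optimum
number of units to be sampled in stratum h is reached at:

$$\frac{n_h}{n} = \frac{N_h S_h / \sqrt{C_h}}{\sum_h \left( N_h S_h / \sqrt{C_h} \right)} \qquad (1)$$

where n is the total sample size, n_h the sample size in stratum h, N_h the number of
units in stratum h, S_h the standard deviation in stratum h and C_h the survey cost per
unit in stratum h.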
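
Equation 1 can be sketched in code as follows. The data layout (a mapping from stratum to the triple N_h, S_h, C_h) and the function name are assumptions made for illustration; the results are left unrounded.

```python
import math


def optimum_allocation(n, strata):
    """Split a total sample size n across strata according to equation 1.

    strata: dict mapping stratum -> (N_h, S_h, C_h)
    Returns a dict mapping stratum -> n_h (unrounded).
    """
    terms = {h: N_h * S_h / math.sqrt(C_h) for h, (N_h, S_h, C_h) in strata.items()}
    total = sum(terms.values())
    return {h: n * term / total for h, term in terms.items()}
```

In practice the unrounded values would be rounded and capped at N_h, with any surplus redistributed over the remaining strata.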
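The first step is to determine the total sample size. Under a cost constraint, the aim is
to minimise the variance of the estimator, and the total sample size is determined
according to:

$$n = \frac{(C - C_0) \sum_h \left( N_h S_h / \sqrt{C_h} \right)}{\sum_h \left( N_h S_h \sqrt{C_h} \right)} \qquad (2)$$

where C is the total budget and C_0 the fixed overhead cost.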
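
A minimal sketch of equation 2, assuming the same hypothetical data layout of (N_h, S_h, C_h) triples per stratum:

```python
import math


def total_sample_size_fixed_cost(budget, overhead, strata):
    """Total sample size n from equation 2 for a fixed budget.

    budget: total cost C, overhead: fixed cost C_0
    strata: dict mapping stratum -> (N_h, S_h, C_h)
    """
    spendable = budget - overhead
    numerator = spendable * sum(N * S / math.sqrt(c) for N, S, c in strata.values())
    denominator = sum(N * S * math.sqrt(c) for N, S, c in strata.values())
    return numerator / denominator
```

As a consistency check, allocating the resulting n by equation 1 and multiplying each n_h by its unit cost should exhaust the spendable budget.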
An important special case arises if the cost per unit is the same in all strata. Then the
Neyman allocation can be used. The size of the sample in each stratum is:

$$n_h = n \frac{W_h S_h}{\sum_h W_h S_h} = n \frac{N_h S_h}{\sum_h N_h S_h} \qquad (3)$$

where W_h is the weight of each stratum, i.e. N_h/N, where N is the total number of units
in the frame.
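
The Neyman allocation of equation 3 can be sketched as follows; the function name and the (N_h, S_h) data layout are assumptions made for illustration.

```python
def neyman_allocation(n, strata):
    """Split a total sample size n across strata according to equation 3.

    strata: dict mapping stratum -> (N_h, S_h)
    Returns a dict mapping stratum -> n_h (unrounded).
    """
    total = sum(N_h * S_h for N_h, S_h in strata.values())
    return {h: n * N_h * S_h / total for h, (N_h, S_h) in strata.items()}
```

When all strata have the same standard deviation, the result coincides with proportional allocation.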
If the cost has to be minimised for a fixed variance, then n should be:

$$n = \frac{\left( \sum_h W_h S_h \sqrt{C_h} \right) \left( \sum_h W_h S_h / \sqrt{C_h} \right)}{V + \frac{1}{N} \sum_h W_h S_h^2} \qquad (4)$$

where V is the fixed variance selected according to the degree of precision requested
for the relevant variable.
The number of enterprises to be sampled in each stratum is then determined by
equation 1.

[2] For further information see Cochran, William G., Sampling Techniques, 3rd edition, John Wiley, 1977.
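
Equation 4 can be sketched in the same way, again assuming a hypothetical mapping from stratum to (N_h, S_h, C_h); the stratum weights W_h are derived from the N_h.

```python
import math


def total_sample_size_fixed_variance(V, strata):
    """Total sample size n from equation 4 for a fixed variance V.

    strata: dict mapping stratum -> (N_h, S_h, C_h)
    """
    N = sum(N_h for N_h, _, _ in strata.values())
    rows = [(N_h / N, S_h, C_h) for N_h, S_h, C_h in strata.values()]  # (W_h, S_h, C_h)
    numerator = (
        sum(W * S * math.sqrt(c) for W, S, c in rows)
        * sum(W * S / math.sqrt(c) for W, S, c in rows)
    )
    denominator = V + sum(W * S * S for W, S, _ in rows) / N
    return numerator / denominator
```

The resulting n would then be distributed over the strata with equation 1.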