
4.

The epidemiological approach to investigating disease problems


4.1 Introduction
4.2 Types of epidemiological study
4.3 Sampling techniques in epidemiological studies
4.4 Sample sizes
4.5 Methods for obtaining data in epidemiological studies
4.6 Basic considerations in the design of epidemiological investigations
4.7 The use of existing data
4.8 Monitoring and surveillance

4.1 Introduction
In Chapters 1 and 2 we described the need for an epidemiological approach to the investigation of disease problems. We also implied that such investigations usually have the basic objective of describing and quantifying disease problems and of examining associations between determinants and disease. With these objectives in mind, epidemiological investigations are normally conducted in a series of stages, which can be broadly classified as follows:

1. A diagnostic phase, in which the presence of the disease is confirmed.
2. A descriptive phase, which describes the populations at risk and the distribution of the disease, both in time and space, within these populations. This may then allow a series of hypotheses to be formed about the likely determinants of the disease and the effects of these on the frequency with which the disease occurs in the populations at risk.
3. An investigative phase, which normally involves the implementation of a series of field studies designed to test these hypotheses.
4. An experimental phase, in which experiments are performed under controlled conditions to test these hypotheses in more detail, should the results of phase 3 prove promising.
5. An analytical phase, in which the results produced by the above investigations are analysed. This is often combined with attempts to model the epidemiology of the disease using the information generated. Such a process often enables the epidemiologist to determine whether any vital bits of information about the disease process are missing.
6. An intervention phase, in which appropriate methods for the control of the disease are examined either under experimental conditions or in the field. Interventions in the disease process are effected by manipulating existing determinants or introducing new ones.
7. A decision-making phase, in which a knowledge of the epidemiology of the disease is used to explore the various options available for its control. This often involves the modelling of the effects that these different options are likely to have on the incidence of the disease. These models can be combined with other models that examine the costs of the various control measures and compare them with the benefits, in terms of increased productivity, that these measures are likely to produce. The optimum control strategy can then be selected as a result of the expected decrease in disease incidence in the populations of livestock at risk.
8. A monitoring phase, which takes place during the implementation of the control measures to ensure that these measures are being properly applied, are having the desired effect on reducing disease incidence, and that developments that are likely to jeopardise the success of the control programme are quickly detected.

The following two sections are concerned with describing ways in which epidemiological investigations can be designed and implemented, and the data produced analysed.

4.2 Types of epidemiological study


4.2.1 Prospective studies
4.2.2 Retrospective studies
4.2.3 Cross-sectional studies

There are three main types of epidemiological study:

Prospective studies, which look forward over a period of time and normally attempt to examine associations between determinants and the frequency of occurrence of a disease by comparing attack rates or incidences of disease in groups of individuals in which the determinant is either present or absent, or its frequency of occurrence varies.

Retrospective studies, which look backward over a period of time and normally attempt to compare the frequency of occurrence of a determinant in groups of diseased and non-diseased individuals.

Cross-sectional studies, which attempt to examine and compare estimates of disease prevalence between various populations and subsets of populations at a particular point in time.

Frequently, however, these approaches may be combined in a general study of a disease problem. In such studies, other morbidity and mortality rates may be compared as well as other variables such as weight gain, milk yield etc., depending on the objectives of the particular study.

4.2.1 Prospective studies


There are, essentially, two approaches to a prospective study. The first, which is similar to that used in controlled experiments, can be used when the investigator has control over the distribution of the determinant that is to be studied. The individual animals selected for the study are assigned to groups or cohorts. (For this reason, prospective studies are often called cohort studies.) The determinant to be studied is then introduced into one cohort and the other cohort is kept free of the determinant as a control. The two cohorts are observed over a period of time and the frequencies with which disease occurs in them are noted and compared.

Often, however, the investigator has no control over the distribution of the determinant being studied. In such a case the investigator will select the individuals that have been or are exposed to the determinant concerned, while another group of individuals that do not have, or have not been exposed to, that determinant is used as a control. The frequency of occurrence of the disease in the different groups is then observed over a period of time and compared.

In prospective studies, the cohorts being compared should consist, ideally, of animals of the same age, breed and sex and should be drawn from within the same herds or flocks, since there may be many differences in the way that different herds or flocks are kept and managed, which may be expected to have an effect on the frequency of occurrence of the disease being investigated. If such cohorts can be selected, prospective studies can demonstrate accurately the association between determinants and disease, since the cohorts will differ from each other merely in the presence or absence of the particular determinant being studied. This will only be possible if the investigator has control over the distribution of the determinant being studied. Even then, such conditions are often very difficult to fulfil in the field, where the investigator is dependent on the cooperation of livestock owners who may be unwilling to alter their management systems to fit in with the study design.

If the investigator has no control over the distribution of the determinant being studied, the study design becomes more complicated and the investigation may have to be repeated to take into account the variations in the many different factors involved.

Prospective studies have the disadvantage that if the incidence of the disease is low, or the difference one wishes to demonstrate between groups is small, the size of the study groups has to be large. (Methods for analysing the results of prospective studies and for estimating the size of cohorts needed are described in Chapter 5.) The problem of low disease incidence can sometimes be overcome by artificially challenging the different cohort groups with the disease in question. However, this may not be acceptable under field conditions, since livestock owners take grave exception to having their animals artificially infected! For these reasons, prospective studies are normally performed on diseases of high incidence and where the expected difference in disease frequencies between the groups studied is likely to be large.
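The comparison at the heart of a prospective study can be illustrated with a few lines of Python. This is only a minimal sketch: the cohort sizes and case counts are invented for illustration, and the function name `attack_rate` is a hypothetical helper, not something defined in this manual. Formal methods of analysis are left to Chapter 5.

```python
def attack_rate(cases, animals_at_risk):
    """Percentage of the animals at risk that developed the disease."""
    return 100 * cases / animals_at_risk

# Hypothetical cohorts observed over the same period (invented numbers).
exposed = {"animals": 120, "cases": 30}   # determinant present
control = {"animals": 120, "cases": 12}   # determinant absent

rate_exposed = attack_rate(exposed["cases"], exposed["animals"])
rate_control = attack_rate(control["cases"], control["animals"])

print(rate_exposed)                  # 25.0
print(rate_control)                  # 10.0
print(rate_exposed - rate_control)   # 15.0
```

The smaller the difference between the two rates, or the lower the rates themselves, the larger the cohorts must be before such a difference can be demonstrated convincingly.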

4.2.2 Retrospective studies


Retrospective studies are often referred to as case-control studies. In such studies, the normal procedure is to look back through records of cases of a particular disease in a population and note the presence or the absence of the determinant being studied. The case group can then be compared with a group of disease-free individuals in which the frequency of occurrence of the determinant has been determined. Note that in a case-control study one is, in effect, comparing the frequency of occurrence of the determinant in two groups, one diseased (cases) and one not (controls).

Retrospective studies have various advantages and disadvantages when compared with prospective studies. The principal advantage of retrospective studies is that they make use of data that have already been collected and can, therefore, be performed quickly and cheaply. In addition, because diseased individuals have already been identified, retrospective studies are particularly useful in investigating diseases of low incidence.

The main disadvantage is that the investigator has no control over how the original data were collected, unless he or she collected them. If the data are old, it may not be possible to contact the individuals who collected them, and thus there is often no way of knowing whether the data are biased or incomplete (see also Section 4.7 on some other disadvantages in using already generated data in epidemiological work).

The second major disadvantage is that although one knows the frequency of occurrence of the determinant in the case group, one does not know its frequency of occurrence in non-diseased individuals from the same population. The latter is normally determined by sampling from a population of non-diseased individuals at the time that the study is being carried out. There is no way of knowing the extent of the similarity between the two different populations from which the case and control groups are taken. Consequently, there is no way of ascertaining the distribution within these populations of undetermined factors which could affect the frequency of the disease. Great caution has to be exercised, therefore, in making inferences about associations between determinants and disease frequencies from retrospective studies.

A third disadvantage is that historical data on cases of disease that are sufficiently accurate to merit further study are hard to come by in veterinary medicine. The opportunities for doing case-control studies are thus rather limited. They are much more common in human medical studies.

In spite of the fact that classic case-control studies are rarely performed in veterinary epidemiology, retrospective data are often used in livestock disease studies. The advantages and disadvantages of using such data are discussed later on in this chapter. Methods for analysing case-control study data and for calculating the sizes of case and control study groups are described in the following chapter.

4.2.3 Cross-sectional studies


Cross-sectional studies are, in fact, surveys. They take place over a limited time period and, in epidemiological studies, are normally concerned with detecting disease, estimating its prevalence in different populations or in different groups within populations, and with investigating the effect of the presence of different determinants on disease prevalence. They can, of course, be used to provide data on a large number of other variables present in livestock populations. Two types of cross-sectional study are commonly performed.

Censuses

A census in effect means sampling every unit in the population in which one has an interest. If the population is small, this is the most accurate and effective way of conducting a survey.

Unfortunately, in most instances the populations studied are large and censuses become difficult and expensive to undertake. A further drawback with censuses in large populations is that, because of the practical constraints of staff and facilities, each individual unit within a population can be allocated only a limited amount of time and effort. Consequently, the amount of data that can be obtained from each unit sampled is limited.

Sample surveys

Sample surveys have the advantage of being cheaper and easier to perform than censuses. Because the population is being sampled, the actual number of units being measured is relatively small, and as a result more time and effort can be devoted to each unit. This enables a considerable amount of data to be collected on each sample unit. The question is, how closely do the results of the survey correspond to the real situation in the population being sampled? If undertaken properly, sample surveys can generate reliable information at a reasonable cost; if they are performed improperly, the results may be very misleading. This is also true of censuses.

4.3 Sampling techniques in epidemiological studies


4.3.1 Random sampling
4.3.2 Multi-stage sampling
4.3.3 Systematic sampling
4.3.4 Purposive selection
4.3.5 Stratification
4.3.6 Paired samples
4.3.7 Sampling with and without replacement

Epidemiological studies usually involve sampling from livestock populations in some way in order to make inferences about a disease or diseases present in these populations. The units sampled are referred to as sample units. Sample units may be individual animals or they may be the units that contain the animals to be investigated, such as a herd, ranch, farm or village. The sample fraction is the number of units actually sampled, divided by the total number of units in the population being sampled.

Various methods can be used to sample a population. The more common techniques used in epidemiological studies are described in the following sections.

4.3.1 Random sampling


The rationale behind random sampling is that units are selected independently of each other and, theoretically, every unit in the population being sampled has exactly the same probability of being selected for the sample. It is, in fact, akin to the process of drawing lots.

Random sampling removes bias in the selection of the sample and thereby removes one of the main sources of error in epidemiological studies. The first step in random sampling is to construct a list of all the individual sample units in the population being sampled. This is known as the sample frame. Each unit in the sample frame can then be assigned an identification number, which is normally its position in the sample frame. Random numbers can be generated by a computer program or read from a printed table of the output of such a program. (A random number table is given in Appendix 1.) As each number is produced, the unit to be sampled can be identified from the sample frame. Random numbers are selected from a random number table by starting anywhere in the table and then reading either horizontally across the rows or vertically down the columns.

Example: Suppose we are interested in detecting the presence of brucellosis in a dairy herd of 349 cows. We decide that, for our purposes, we wish to be 90% sure of detecting the disease and we estimate, although we do not know, that the prevalence of brucellosis in the herd is not likely to be less than 8% (see Section 4.4 on estimating sample sizes). From Table 10 we see that in order to be 90% sure of detecting the disease at this level of prevalence in a herd of 349 cows, we need a random sample of 27 animals.

The animals in the herd are not tagged, but the herdsman is able to identify each animal by name. We can, therefore, construct a sample frame of the animals in the herd by listing their names. If, for any reason, two or more animals had the same name, we could further identify them by a number (e.g. Daisy 1, Daisy 2 etc.). A similar procedure can sometimes be used to establish the identity of certain unnamed animals in a herd by identifying them as the first calf of Emma, the second calf of Flora etc.
To select the animals to be sampled we could simply write the name of each animal in the herd on a piece of paper, place the name cards in a hat and then draw out 27 cards. Alternatively, we could use a random number generator or table to produce a set of three-digit numbers. Rejecting all numbers greater than 349, we continue until we have 27 three-digit numbers. A series of such numbers might for instance read 001, 088, 045, 008, 016, 344 etc. We would then select the first, the eighty-eighth, the forty-fifth, the eighth, the sixteenth, the three-hundred-and-forty-fourth etc. animal from the sample frame. Since we now know the names of the animals to be sampled, we can identify them in the herd and include them in the sample. As a simple alternative, we could run the herd through a chute and select the animals as they come through, taking the first, eighth, sixteenth, forty-fifth etc. animal for the sample.

Note that if the population to be sampled was between 10 and 99, we would use two-digit numbers to select the sample; if it was between 100 and 999, three-digit numbers would be used; for populations between 1000 and 9999, and between 10 000 and 99 999, four-digit and five-digit numbers, respectively, would be selected. Any number in these categories greater than the size of the population being sampled is rejected. If during the sampling procedure the same unit is selected a second time, the number that led to that selection is also rejected.

If we were selecting animals from the same herd for the purposes of a prospective study, we could use random numbers to identify them in the sample frame and then assign each animal in turn to the appropriate group. Thus, in the above example, if we wanted to select three groups from the herd, the first cow on the list would be assigned to group 1, the eighty-eighth cow on the list to group 2, the forty-fifth cow on the list to group 3, the eighth cow to group 1, the sixteenth cow to group 2, the three-hundred-and-forty-fourth cow to group 3 and so on.

There are many ways of selecting random samples, but the principles are substantially the same as those outlined above. Apart from removing bias in the selection of the sample, random sampling has other advantages, the main one being that we can easily calculate an estimate of the error for the values of a population parameter estimated by a random sample. This is done by the use of a statistic known as the standard error (see Section 4.4). Having calculated the error, we can adjust the size of the sample according to how precise we require our sample estimate to be. It is possible to calculate estimates of errors in other forms of sampling, but the calculations involved are more complex. For this reason, random sampling is normally the method of choice when circumstances permit.

The main disadvantage of random sampling is that it cannot be attempted if the size of the population is not known. In most instances, a sample frame must be constructed before sampling can begin. This sample frame must contain all the sample units in the population, and the sample units must be identifiable by some means or other in the population which is being sampled. Sample frames are notoriously difficult to construct: certain sample units may occur in the frame more than once, thus increasing their chance of selection, or certain sectors of the population to be sampled may be omitted. Moreover in Africa, where records of individually identifiable animals are seldom available, sample frames of individual animal units can rarely be constructed. For this reason, simple random sampling based on individual animals as sample units is rarely attempted in Africa.

Furthermore, random sampling is impossible where the type of unit being sampled does not permit the population size to be determined beforehand. If, for instance, events such as births or deaths are being sampled, there is simply no way of knowing with absolute precision how many births or deaths there will be in a population over the study period.
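The selection procedure used in the brucellosis example (draw three-digit numbers, reject those beyond the frame and those already drawn, then assign animals to groups in turn) might be sketched in Python as follows. This is an illustration under stated assumptions rather than a prescribed method: the frame of 349 cow names is invented, the function names are hypothetical, and the fixed seed only makes the draw repeatable.

```python
import random

def draw_random_sample(frame, sample_size, rng):
    """Pick sample_size distinct units from the frame using random numbers."""
    population_size = len(frame)
    if sample_size > population_size:
        raise ValueError("sample cannot be larger than the sample frame")
    digits = len(str(population_size))        # 349 units -> three-digit numbers
    chosen = []                               # identification numbers, in draw order
    while len(chosen) < sample_size:
        number = rng.randrange(10 ** digits)  # 000 to 999 for three digits
        # Reject 000, numbers beyond the frame, and numbers already drawn.
        if number < 1 or number > population_size or number in chosen:
            continue
        chosen.append(number)
    return [(number, frame[number - 1]) for number in chosen]

def assign_to_groups(sample, n_groups):
    """Assign sampled units to groups 1..n_groups in turn, in draw order."""
    groups = {g: [] for g in range(1, n_groups + 1)}
    for position, unit in enumerate(sample):
        groups[position % n_groups + 1].append(unit)
    return groups

rng = random.Random(1)                        # fixed seed: repeatable draw
frame = [f"cow {i}" for i in range(1, 350)]   # hypothetical frame of 349 cows
sample = draw_random_sample(frame, 27, rng)
groups = assign_to_groups(sample, 3)

print(sample[:6])                             # first six (number, name) pairs
print({g: len(units) for g, units in groups.items()})   # {1: 9, 2: 9, 3: 9}
```

In practice `random.sample(frame, 27)` gives the same kind of sample in one call; the explicit loop is shown only because it mirrors the rejection rules described in the text.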

4.3.2 Multi-stage sampling


A way round the problem of constructing sample frames of individual animal units is to use a technique known as multi-stage sampling. As the name implies, this involves sampling a population in different stages, with the sample unit being different at each stage. If it is not possible to construct a sample frame of individual animals, then herds, farms or villages in which livestock are kept can be used as units. Lists, particularly of farms or villages, are frequently compiled for administrative purposes by governments, and it is relatively easy to construct a sample frame from such lists. This would be the first stage of the process. The sample units are then selected at random from the sample frame. Once the farm or village units have been selected, it may prove possible to construct a sample frame of the animals within the units and sample these in turn.

Alternatively, all the animals within a village, farm or herd can be sampled. This technique is known as cluster sampling. The herd, farm or village is the sample unit and the animals contained within the sample unit are the cluster. Since one of the main expenses of sampling is often for travel, the advantages of sampling all the animals in the herd, village or farm during one visit are obvious. For this reason, cluster sampling is often the method of choice in epidemiological studies in Africa.

An alternative method of cluster sampling is to define the target population as all the livestock of a particular type within a region demarcated by well defined geographical boundaries. An areal sampling method is then used whereby the region is divided into small units, with all the animals in each unit being defined as a single cluster. The advantage of this procedure is that the investigator knows how many areal units there are in total, since he or she has defined them, and can therefore easily construct a sample frame. The disadvantage is that it may be difficult to find all the animals in a given small area, or even to be sure to which areal unit a particular animal belongs.

Cluster sampling has some advantages and disadvantages when compared with simple random sampling. These are discussed in detail in the next chapter but it may be useful to include a brief summary here.

The first advantage of cluster sampling is the saving in travel costs. Much less travelling is involved in sampling animals on a cluster basis than if animals are selected at random from a target population. Provided that the complete collection of animals in each cluster is included in the sample, it is not too difficult to calculate an estimate of the variable being investigated and the corresponding standard error. (It is not very difficult even if only a subset is used.)

However, since the variation in disease prevalence is likely to be greater between clusters than within clusters, examining animals within clusters will give less information than examining animals from different clusters. This is particularly so in the case of infectious diseases. The more infectious the disease, the more likely it is that in any particular cluster of animals either none or most of the animals will be infected. Because of this, cluster sampling will almost always increase the standard error, sometimes very considerably, and hence the uncertainty involved in the estimation of the particular variable being considered. One implication of this is that the minimum number of cases required for a reliable estimate of disease prevalence or incidence in the target population as a whole will be several times larger than that required in simple random sampling. The sample size in a cluster sample has to be correspondingly larger, therefore, to produce an estimate of the same reliability. If, as a result, the procedures for measuring a particular variable become time consuming and/or costly, the time and money spent may outweigh the benefits of reduced travel costs and increased administrative convenience gained by cluster sampling.
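The two stages described above can be sketched in Python. The sketch assumes a hypothetical list of 40 village herds with invented animal identifiers; `select_clusters` and its parameters are illustrative names only. With `animals_per_herd` left unset it takes every animal in each selected herd (cluster sampling); otherwise it draws a second-stage random sample within each herd.

```python
import random

def select_clusters(herds, n_herds, rng, animals_per_herd=None):
    """Stage 1: draw herds at random. Stage 2: keep whole herds or a subset."""
    frame = sorted(herds)                     # first-stage sample frame (herd names)
    chosen = rng.sample(frame, n_herds)
    sample = {}
    for herd in chosen:
        animals = herds[herd]
        if animals_per_herd is None or animals_per_herd >= len(animals):
            sample[herd] = list(animals)      # the whole cluster
        else:
            sample[herd] = rng.sample(animals, animals_per_herd)
    return sample

rng = random.Random(7)                        # fixed seed: repeatable draw
# Hypothetical population: 40 village herds of 10 to 60 animals each.
herds = {}
for v in range(1, 41):
    herd_size = rng.randint(10, 60)
    herds[f"village {v}"] = [f"v{v}-animal {a}" for a in range(1, herd_size + 1)]

whole_clusters = select_clusters(herds, 5, rng)
subsampled = select_clusters(herds, 5, rng, animals_per_herd=8)

print({herd: len(animals) for herd, animals in whole_clusters.items()})
print({herd: len(animals) for herd, animals in subsampled.items()})
```

Only the herd list has to exist before sampling begins; a frame of individual animals is needed, if at all, only for the few herds actually selected.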

4.3.3 Systematic sampling


Systematic sampling involves sampling a population systematically, i.e. if a 1/n sample is required, every nth unit in that population is sampled. For example, if a 10% (1/10) sample is required, every 10th unit in the population is sampled. If a 5% (1/20) sample is required, every 20th unit in the population is sampled.

The main advantage of systematic sampling is that it is easier to do than random sampling, particularly if the sample frame is large. It also enables sampling a population whose exact size is not known. This is impossible in random sampling. Thus systematic sampling is used to sample such events as births or deaths, whose total number cannot be known before the study begins, or livestock populations at abattoirs or dips where, again, the population size may not be determinable at the outset.

The main disadvantage of systematic sampling is that if the sample units are distributed in the sample frame or in the population periodically, and this periodicity coincides with the sampling interval, the sample estimate may be very misleading. Estimating the standard error is thus more difficult and depends on making the assumption that there is no periodicity in the data.
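A short Python sketch may help to show both the method and its main pitfall. The function `systematic_sample` is a hypothetical helper, and the herd in which every tenth animal is diseased is a deliberately artificial example constructed to make the periodicity problem visible.

```python
from itertools import islice

def systematic_sample(units, interval, start=None):
    """Return every interval-th unit; units may be a stream of unknown length."""
    if start is None:
        start = interval - 1                  # take the nth, 2nth, 3nth ... unit
    return list(islice(units, start, None, interval))

# A 10% (1/10) sample of 100 numbered units: every 10th unit is taken.
print(systematic_sample(range(1, 101), 10))   # [10, 20, 30, ..., 100]

# Artificial herd in which exactly every 10th animal is diseased (1 = diseased).
herd = [1 if position % 10 == 0 else 0 for position in range(1, 201)]
true_prevalence = 100 * sum(herd) / len(herd)
print(true_prevalence)                        # 10.0

# The sampling interval coincides with the periodicity, so the estimate
# depends entirely on where the count starts.
for start in (9, 0):
    sample = systematic_sample(herd, 10, start=start)
    print(start, 100 * sum(sample) / len(sample))   # 9 -> 100.0, 0 -> 0.0
```

Because the function only counts units as they arrive, it works equally well on animals passing through a dip or abattoir, where the total is not known in advance.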

4.3.4 Purposive selection


Purposive selection involves the deliberate selection of certain sample units for some reason or other. The reason may often be that they are regarded as being "typical" of the population being sampled. For example, a herd or series of herds may be selected because they are representative of a certain production system. Purposive selection is also used to select particular sample units for a particular purpose, e.g. high-risk sentinel herds along a national or geographic boundary or along a stock route.

The main advantage of purposive selection is the relative ease with which sample units can be selected. Its main disadvantage is that sample units are frequently selected not because they are representative of a particular situation but because they are the most convenient to sample. Even if the sample units are selected as being representative of a general population or situation, they often tend to reflect the opinions of the individual selecting them as to what he or she considers to be representative, rather than the actual case. In addition, if the samples are selected on the basis of being typical of the average situation, they only represent those units close to the population mean and tell one little about the variation in the population as a whole.

In spite of these drawbacks, purposive selection may in certain instances be the only method available. If there are difficulties of communication, sample units may have to be selected purposively on the basis of their accessibility. Alternatively, if the measurement procedures are long or complicated, involve some form of damage to an animal or upset local beliefs or prejudices, e.g. when taking blood or biopsies, a sample may have to be purposively selected on the basis of the livestock owner's willingness to cooperate.

4.3.5 Stratification

This involves treating the population to be sampled as a series of defined sub-populations or strata. Suppose, for example, that we wished to sample a population of 4000 goat flocks in order to estimate the prevalence of a particular disease in an area, and that this population consisted of 200 large-sized flocks containing 51 animals or more; 800 medium-sized flocks containing between 20 and 50 animals; and 3000 small-sized flocks containing 19 animals or less. If we took a 1% random sample of all flocks, we might find that this would give us a sample consisting of, say, 1 large flock, 9 medium-sized flocks and 30 small flocks.

Suppose, however, that one of the determinants we were interested in was the influence of flock size on the prevalence of the disease. We would obviously want to know more about the larger flocks than our present system of sampling would tell us. We could, therefore, divide the population to be sampled into strata according to flock size, and sample each stratum in turn. We could also take larger samples from those strata that we are particularly interested in and smaller samples from those that we are not. For example, we might decide to take a 5% random sample from the large-flock stratum, a 2% sample from the medium-flock stratum and a 0.5% sample from the small-flock stratum. This would give us 10 large flocks, 16 medium flocks and 15 small flocks. Note that the actual sample size has increased from 40 to 41 only, although if we were cluster sampling more animals would be involved.

This technique is known as stratification with a variable sampling fraction, and its usefulness lies in that it allows us to concentrate the facilities at our disposal on those sections of the population that are of particular interest to us.

Many different systems of stratification are possible, depending on the purpose of the study being undertaken. Common variables for stratification include area, production system, herd size, age, breed and sex.
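The goat-flock example can be worked through in Python. The stratum sizes and sampling fractions are those given in the text; the flock identifiers, the helper `stratum_sample_sizes` and the fixed seed are assumptions made purely for illustration.

```python
import random
from collections import Counter

# Strata and sampling fractions from the goat-flock example.
strata = {"large": 200, "medium": 800, "small": 3000}
fractions = {"large": 0.05, "medium": 0.02, "small": 0.005}

def stratum_sample_sizes(strata, fractions):
    """Number of flocks to draw from each stratum."""
    return {name: round(size * fractions[name]) for name, size in strata.items()}

rng = random.Random(11)                       # fixed seed: repeatable draw

# Hypothetical sample frame: one list of flock identifiers per stratum.
flocks = {}
for name, size in strata.items():
    flocks[name] = [f"{name} flock {i}" for i in range(1, size + 1)]

# Stratified sample with a variable sampling fraction.
sizes = stratum_sample_sizes(strata, fractions)
stratified = {}
for name in strata:
    stratified[name] = rng.sample(flocks[name], sizes[name])
print(sizes, sum(sizes.values()))   # {'large': 10, 'medium': 16, 'small': 15} 41

# For comparison: one simple 1% random sample of all 4000 flocks.
all_flocks = []
for name, members in flocks.items():
    all_flocks.extend((name, flock) for flock in members)
simple = rng.sample(all_flocks, 40)
print(Counter(name for name, _ in simple))    # counts vary from draw to draw
```

The simple random sample contains only a handful of large flocks, whatever the seed, whereas the stratified design guarantees ten of them for almost the same total effort.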

4.3.6 Paired samples


Variations in the sample groups due to host and management characteristics can sometimes be overcome by pairing individuals in the different sample groups according to common characteristics (age, breed, sex, system of management, numbers of parturitions, stage of lactation etc.) and then analysing the paired samples (see Chapter 5). This technique is useful in that it often greatly increases the precision of the study.

4.3.7 Sampling with and without replacement


There are essentially two different options for selecting clusters. We may select them in such a way that each cluster has an equal probability of being selected, or that some clusters have a higher probability of being selected than others. If the first option is chosen, the natural method of selection is simple random sampling. If, however, the clusters have different probabilities of being selected, it then becomes rather difficult to devise a sampling method which allows the clusters to be chosen with the intended probability. In addition, the correct method to calculate unbiased estimates of the standard errors of any estimates which include "between-cluster" variability is rather complicated and requires a powerful computer with a special program. If such resources are not available, it will be advisable to select clusters with replacement, i.e. choose from the complete set of clusters without discarding any previously selected. This will mean that sometimes the same cluster will appear more than once in the sample, though this will happen rarely if the total number of clusters is large compared to the sample being selected. (The interested reader should consult Chapters 9 and 10 in Cochran (1977) for further details.)

There are many variations and combinations of sampling possible even within one particular study. Detailed descriptions of all the possible permutations involved are beyond the scope of this manual, and the ensuing discussions in this and the next chapter will focus on simple random and cluster sampling.
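The difference between the two modes of selection is easy to see with Python's standard `random` module. The twenty villages and their herd sizes below are invented, and weighting the selection by herd size is an assumption chosen only to illustrate unequal selection probabilities.

```python
import random

rng = random.Random(3)                        # fixed seed: repeatable draw

# Hypothetical clusters: village name -> number of animals in the village.
clusters = {f"village {i}": 20 + 5 * i for i in range(1, 21)}
names = list(clusters)
herd_sizes = list(clusters.values())

# Without replacement: a cluster can be selected at most once.
without_replacement = rng.sample(names, 5)

# With replacement, equal probabilities: the same cluster may recur.
with_replacement = rng.choices(names, k=5)

# With replacement, unequal probabilities (here proportional to herd size).
weighted = rng.choices(names, weights=herd_sizes, k=5)

print(without_replacement)
print(with_replacement)
print(weighted)
```

Drawing with replacement keeps each draw independent of the others, which is what makes the intended selection probabilities straightforward to achieve when they differ between clusters.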

4.4 Sample sizes


4.4.1 Sample sizes for estimating disease prevalence in large populations
4.4.2 Sample sizes needed to detect the presence of a disease in a population

This section is concerned with estimating sample sizes for cross-sectional studies. The approach used will depend on whether we are measuring a categorical or a numerical variable. Categorical (discrete) variables are probably more frequent in epidemiology, particularly dichotomies, and we shall illustrate the problem of estimating sample size for such variables in the following subsections. Techniques available for estimating sample sizes in cross-sectional studies involving numerical (continuous) variables, and in cohort and case-control studies, are described in Chapter 5.

4.4.1 Sample sizes for estimating disease prevalence in large populations


Suppose that we wish to carry out a survey to investigate the distribution of disease in a large animal population. How big a sample should we aim for? Since the cost of finding and examining each animal (i.e. the unit sampling cost) is likely to be quite high, the total sampling cost, and hence the sample size, will be an important determinant of the total cost of the survey. So how do we decide how many animals we need to examine? The answer to this question largely depends on four subsidiary questions:

- To what degree of accuracy do we require the results?
- What sampling method have we used?
- What is the size of the smallest subgroup in the population for which we require accurate answers?
- What is the actual variability in the population surveyed of the variable we wish to measure?

Clearly the last of these questions will cause the greatest problem, since if we knew the exact answer to this we would have no need to carry out the survey in the first place! Let us now consider these questions one by one.

Suppose that a disease is distributed in a population with a prevalence of P, and that we have decided to estimate P by means of a survey using a particular sampling method. We carry out the survey and obtain an estimated prevalence p. If we repeated the whole survey a second time using the same sampling method and the same sample size, we would get a different estimate p of the prevalence P. If it were possible to go on repeating the survey many times with the same sample size, we would get a whole series of estimates from which we could draw a histogram. This would resemble Figure 7 if n, the sample size, was large.

Figure 7. Distribution of different estimates of disease prevalence in a large-sized sample.

It can be shown that the average of all the estimates p1, p2 etc will be almost exactly the true prevalence P, and that 68% of the estimates will differ from the true value by less than the quantity called the standard error of the estimated prevalence (SE), where:

SE = √(PQ/n)

P = true prevalence (%), Q = 100 - P, and n = size of the sample.

Similarly, 95% of the estimates would differ from the true value by less than twice the standard error, and 99% of the estimates would be within three standard errors of the true value. This suggests a method for stating how precise we would like the results to be. We might, for example, say that we would like to be 95% sure of being within 1% of the correct, true prevalence P (%). This implies that we want twice the standard error to be no greater than 1%, or that the standard error should not be greater than 0.5%. This means that it is always possible to fix a given accuracy level by choosing the sample size so that the standard error of the estimate is controlled.

Requirements for precision can be stated in terms of absolute or relative accuracy. If we talk in terms of absolute accuracy we might say that "we want the error in the prevalence estimate to be no more than 1%" i.e. p = P ± 1%. For example, if the true prevalence is 3%, we will be requiring an estimate that lies in the range of 2 to 4%. If the true prevalence is 20%, we require the estimated value to fall between 19 and 21%.

If we state our requirements in terms of relative accuracy, we might say that the estimated value must lie within 10% of the true value. For example, if the true prevalence is 20%, this would mean obtaining an estimate in the range of 18 to 22%, since 2 is 10% of 20. If the true value was 5%, we would be demanding an estimate between 4.5 and 5.5%, since 0.5 is 10% of 5. In principle, there is nothing wrong in stating accuracy requirements in this way, but high relative accuracy will not be possible when true prevalence is low (see Table 9).

Table 8 shows the sample sizes required for estimating prevalences at different levels of absolute accuracy from large populations. Note that no sample size is given unless the standard error is smaller than the true prevalence. The entries have been calculated using the formula:

n = P(100 - P)/SE²

If the sample size is a large proportion of the population, say greater than 10%, then it is better to use the more exact formula:

n = P(100 - P) / (SE² + P(100 - P)/N)

where N is the total size of the population.

Table 8. Sample size (n) for controlling the standard error (SE) of estimated prevalence for different values of the true prevalence (P) in large populations.
P (%)                            SE (%)
             0.1       0.5       1.0       1.5       2.0       2.5
  0.5       4975         -         -         -         -         -
  1.0       9900       396         -         -         -         -
  1.5      14775       591       148         -         -         -
  2.0      19600       784       196        87         -         -
  2.5      24375       975       244       108        61         -
  3.0      29100      1164       291       129        73        47
  3.5      33775      1351       338       150        84        54
  4.0      38400      1536       384       171        96        61
  4.5      42975      1719       430       191       107        69
  5.0      47500      1900       475       211       119        76
  6.0      56400      2256       564       251       141        90
  7.0      65100      2604       651       289       162       104
  8.0      73600      2944       736       327       184       118
  9.0      81900      3276       819       364       205       131
 10.0      90000      3600       900       400       225       144
 20.0     160000      6400      1600       711       400       256
 30.0     210000      8400      2100       933       525       336
 40.0     240000      9600      2400      1067       600       384
 50.0     250000     10000      2500      1111       625       400
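The calculation behind Table 8 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original manual: the function name and arguments are hypothetical, the result is rounded to the nearest whole animal (which is what the table entries appear to do), and the finite-population adjustment n0/(1 + n0/N) is the standard correction from general sampling theory, assumed here rather than taken from the text.

```python
def sample_size_absolute(p_percent, se_percent, population=None):
    """Sample size needed so that the standard error of the estimated
    prevalence does not exceed se_percent (both in percentage points).

    population is optional; when given, an assumed finite-population
    correction n0 / (1 + n0 / N) is applied.
    """
    # Large-population formula from the text: n = P(100 - P) / SE^2
    n0 = p_percent * (100.0 - p_percent) / se_percent ** 2
    if population is not None:
        n0 = n0 / (1.0 + n0 / population)
    return round(n0)


# Example 1: SE of 0.5% at an expected prevalence of 2%, 8% and 50%
print(sample_size_absolute(2, 0.5))    # 784
print(sample_size_absolute(8, 0.5))    # 2944
print(sample_size_absolute(50, 0.5))   # 10000

# Same precision at 2% prevalence, but in a herd of only 1000 animals
print(sample_size_absolute(2, 0.5, population=1000))  # 439
```

The last call shows why the correction matters: in a population of 1000 animals the uncorrected figure of 784 would mean sampling most of the population.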

Example 1: Suppose we wish to be 95% sure that a survey will give an estimated prevalence within 1% of the true value in absolute terms. Two standard errors will then be no more than 1%, i.e. 2 SE ≤ 1% or SE ≤ 0.5%. Table 8 gives the sample sizes required for different prevalence rates and standard errors. However, since the sample size we are looking for will depend on true prevalence, whose value we do not know, that being the reason for the survey, this does not seem to help much. It will be rare, however, to have absolutely no idea what value of the true prevalence to expect. We will usually be able to make an estimate and say, for example, that "we believe the prevalence is not greater than 8%". If we then choose the sample size, it might turn out to be much too big, since the correct sample size to measure a prevalence of, say, around 2% to the desired accuracy is 784, while the sample size corresponding to a prevalence of around 8% is 2944. However, there is nothing much we can do about this. Lack of prior knowledge will always result in a need for liberal (i.e. overlarge) sample sizes and hence higher costs. If we do not have the slightest idea what prevalence to expect, we can use the sample size corresponding to the least favourable case (P = 50%) given in Table 8, though if we are demanding a high degree of accuracy the indicated sample size (10 000) may be unrealistically large.

Example 2: We might suspect that the true prevalence is of the order of 20% and would like to be 99% sure that the estimated prevalence is within 2% of the true value. We can be 99% certain that the true value lies within three standard errors of the estimate. Hence, to fulfil the required conditions we must choose the sample size in such a way that 3 SE ≤ 2% or SE ≤ 2/3 = 0.7% approximately. From Table 8 we see that for SE = 0.5% and P = 20%, we need a sample of 6400. For SE = 0.7%, it seems, we will need around 4000. (In fact the exact sample size as calculated from the formula n = P(100 - P)/SE² is only 3265.)

Table 9 gives sample sizes required to estimate prevalence in a large population when the desired precision is stated in terms of relative accuracy. In this case the sample sizes are such as to ensure that the standard error will not be greater than the stated percentage of the true prevalence. The entries in the table have been calculated using the formula:

n = ((100 - P)/P) × (100/SE)²

where SE is expressed as a percentage of P.

If the sample size required represents a very high proportion of, or is greater than, the sampled population itself, the more accurate formula

n' = n/(1 + n/N)

should be used to calculate the sample size, where n is the sample size given in Table 9 for a large population and N is the size of the population being sampled.

Table 9. Sample size (n) to control the standard error (SE) of estimated prevalence relative to the true value of the prevalence.

P (%)            SE as a percentage of P
              1.0          5.0         10.0
  0.5     1 990 000      79 600      19 900
  1.0       990 000      39 600       9 900
  1.5       656 667      26 267       6 567
  2.0       490 000      19 600       4 900
  2.5       390 000      15 600       3 900
  3.0       323 333      12 933       3 233
  3.5       275 714      11 029       2 757
  4.0       240 000       9 600       2 400
  4.5       212 222       8 489       2 122
  5.0       190 000       7 600       1 900
  6.0       156 667       6 267       1 567
  7.0       132 857       5 314       1 329
  8.0       115 000       4 600       1 150
  9.0       101 111       4 044       1 011
 10.0        90 000       3 600         900
 20.0        40 000       1 600         400
 30.0        23 333         933         233
 40.0        15 000         600         150
 50.0        10 000         400         100
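The relative-accuracy entries of Table 9 follow from the same standard-error relationship, with the permitted standard error expressed as a percentage of P. The sketch below is illustrative only; the function name is hypothetical and rounding to the nearest whole animal is an assumption based on the table entries.

```python
def sample_size_relative(p_percent, relative_se_percent):
    """Sample size needed so that the standard error is no more than
    relative_se_percent of the true prevalence p_percent
    (large population, simple random sampling)."""
    n = (100.0 - p_percent) / p_percent * (100.0 / relative_se_percent) ** 2
    return round(n)


# High relative accuracy becomes very expensive at low prevalence
for p in (0.5, 5, 20, 50):
    print(p, sample_size_relative(p, 5))
```

Running the loop makes the point of the text concrete: holding the standard error to 5% of P needs 79 600 animals at a prevalence of 0.5% but only 400 at 50%.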

The sample sizes calculated in the two different exercises were obtained assuming that the sample was to be chosen by simple random sampling i.e. that animals were sampled individually. If we use a different sampling method, these sample sizes will no longer be appropriate. For example in cluster sampling, which increases the variability of any estimates made, we should assume that, to be on the safe side, we will need to examine four times as many animals as for a simple random sample.

If we require an accurate estimate of prevalence not only for the complete population but also within well defined subgroups, as in a stratified survey, we need to choose the sample size sufficiently large within each subgroup. Suppose, for instance, that the population is distributed in six regions. Then, in our first example, if we require to estimate a true prevalence of 2% with an SE of 0.5% for each region, we would need a sample size of 784 in each region (6 × 784 = 4704 animals in all), assuming that we take simple random samples within the regions.

4.4.2 Sample sizes needed to detect the presence of a disease in a population


It may sometimes be important to discover whether a disease is at all present in a population. This population may be a single herd or a much larger group in, say, a well defined geographical region. Here the problem is no longer one of having a sample large enough to give a good estimate of true prevalence, but rather of knowing the minimum sample size required to find at least one animal with the disease. This will clearly need a much smaller sample than would be required for an accurate estimation of prevalence.

Again the answer will depend on the true, but unknown, value of the prevalence of the disease in the target population. For small populations, e.g. individual herds, the answer will depend on the size of the population (Table 10). For populations of over 10 000, the sample sizes in the last column of the table will be approximately correct. The values in Table 10 were calculated from the formula:

Probability of detection = 1 - [(N - M)/N] × [(N - M - 1)/(N - 1)] × ... × [(N - M - n + 1)/(N - n + 1)]

where: N = size of population, M = total number of infected animals, and n = sample size.

Where the indicated prevalence did not correspond to a whole number of animals, the value was rounded up to the next whole number (e.g. 3% of 75 = 2.25 animals; this was rounded up to 3).

The sample sizes indicated in Table 10 are appropriate only for simple random sampling and would be much larger if cluster sampling was used. The determination of sample sizes required to estimate continuous variables is discussed in Section 5.3.2.
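The detection formula above lends itself to a direct computation. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the number of infected animals is rounded up as the text describes, and the minimum sample size is found by simply trying n = 1, 2, 3 ... until the required probability is reached. Because the table's other rounding conventions are not spelled out, agreement with every entry of Table 10 is not guaranteed.

```python
import math


def detection_probability(population, infected, sample):
    """Probability that a simple random sample (without replacement)
    contains at least one infected animal."""
    if sample > population - infected:
        return 1.0  # not enough uninfected animals to fill the sample
    p_miss = 1.0
    for i in range(sample):
        # chance that the (i+1)-th animal drawn is also uninfected
        p_miss *= (population - infected - i) / (population - i)
    return 1.0 - p_miss


def min_sample_to_detect(population, prevalence_percent, confidence):
    """Smallest n giving at least `confidence` probability of detection."""
    # Round the number of infected animals up to the next whole animal
    infected = math.ceil(population * prevalence_percent / 100.0 - 1e-9)
    for n in range(1, population + 1):
        if detection_probability(population, infected, n) >= confidence:
            return n
    return population


print(min_sample_to_detect(50, 10, 0.90))  # 18
print(min_sample_to_detect(50, 5, 0.95))   # 31
print(min_sample_to_detect(75, 2, 0.99))   # 68
```

The three printed values agree with the corresponding entries of Table 10 (herd of 50 at 10% and 90% probability; herd of 50 at 5% and 95%; herd of 75 at 2% and 99%).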

4.5 Methods for obtaining data in epidemiological studies


4.5.1 Interviews and questionnaires
4.5.2 Procedures involving measurements
4.5.3 Errors due to observations and measurements

In epidemiological studies we can obtain data on a particular variable in two main ways. We can actually measure the variable or we can ask individuals concerned with livestock to give an estimate of the variable in the livestock populations with which they are concerned. As in estimating sample size, the approach adopted will largely depend on the purposes of the study. If the objective of the study is to obtain broad estimates of the relative importance of various diseases within a livestock population, the degree of precision need not be great. Consequently, the sample size may be small and the quality of the data generated does not need to be high. If, on the other hand, we are interested in studying the epidemiology of a particular disease in detail, accurate estimates of prevalence or incidence may be needed, the sample size will have to be large, and the data generated must be of high quality.

4.5.1 Interviews and questionnaires


Interviews and questionnaires are frequently used in epidemiological studies and can be a valuable means of generating data. In countries with good postal services, data can be collected cheaply and quickly by circulating questionnaires. Because of literacy and communications difficulties, this approach is of little use when one is soliciting information from traditional livestock owners, but it can be helpful in obtaining information from extension officers, veterinarians and other individuals concerned with traditional livestock production. It should be noted, however, that questionnaires involving a considerable effort in filling in are likely to have a high non-return rate, and the sample size may have to be adjusted accordingly. Furthermore, high non-return rates can introduce substantial bias in the estimates calculated from the returns.

Epidemiological studies often involve visiting the sample units and collecting the relevant data by questioning the owners and/or carrying out the appropriate measurement procedure on the animals concerned. Designing questionnaire formats and interview protocols can be a long and difficult process, particularly where traditional livestock producers are concerned. Remember that questioning a traditional livestock producer about the numbers or performance of his animals is akin to questioning other individuals about their bank accounts! Considerable time and patience are needed to obtain the trust and cooperation of such individuals. Wherever possible, a trusted intermediary should be employed. Nevertheless, as most traditional livestock producers live in close proximity to their animals and normally come from sections of the population with a vast experience of keeping livestock under African conditions, they are obviously an extremely useful and valuable source of information.

Table 10. Sample size as a function of population size, prevalence and minimum probability of detection.
P (%)                        Population size
           50     75    100    300    500   1000   5000  10 000

a) 90% probability of detection
 0.5       50     75    100    271    342    369    439    449
 1         45     68     91    161    184    205    224    227
 2         45     51     69     95    102    108    113    114
 3         34     40     54     67     71     73     76     76
 4         34     40     44     52     54     55     57     57
 5         27     33     37     42     43     44     45     45
 6         27     27     32     35     36     37     38     38
 7         22     24     28     31     31     32     32     32
 8         22     24     25     27     27     28     28     28
 9         18     21     20     22     22     22     22     22
10         18     18     20     22     22     22     22     22

b) 95% probability of detection
 0.5       50     72    100    286    388    450    564    581
 1         48     72     96    189    225    258    290    294
 2         48     58     78    117    129    138    147    148
 3         39     47     63     84     90     94     98     98
 4         39     47     52     66     69     71     73     74
 5         31     39     45     54     56     57     69     59
 6         31     33     39     45     47     48     49     49
 7         26     29     34     39     40     41     42     42
 8         26     29     31     34     35     36     36     36
 9         22     26     28     31     31     32     32     32
10         22     23     25     28     28     29     29     29

c) 99% probability of detection
 0.5       50     75    100    297    450    601    840    878
 1         50     75     99    235    300    368    438    448
 2         49     68     90    160    183    204    223    226
 3         48     59     78    119    131    141    149    151
 4         45     59     68     94    101    107    112    113
 5         39     51     59     78     83     86     89     90
 6         39     44     53     66     70     72     74     75
 7         34     39     47     58     60     62     64     64
 8         34     39     43     51     53     54     55     56
 9         29     35     39     45     47     48     49     49
10         29     32     36     41     42     43     44     44

The success or failure of this type of epidemiological study depends as much on the design of recording forms as it does on the overall survey, the actual field work and the analysis. The latter will be impossible unless the material recorded is intelligible. Much thought should therefore be given to the design of forms and their efficiency should be tested in pilot trials. The forms should be orderly, with related items grouped together (calf number, date of birth, place of birth), convenient to use (the form should fit on a clipboard), and technical words not likely to be understood by field staff avoided, as should any ambiguities in the terms used. The form should have a title and provisions for the identification of both the officer completing the form and the data source. It should also have a reference number which relates to the survey design (e.g. 06/04/93 might indicate the sixth visit to farm 93 in stratum 4). Completed forms should be checked for errors as soon as possible, so that appropriate corrections can be made while the memory of the interviewer is still fresh and the sample unit accessible.

Some additional points to bear in mind in the design of interviews and questionnaires include:

i) Explain the purposes of the interview to the interviewee. People are generally much more cooperative when they know why they are being questioned.

ii) Being normally very polite, livestock owners tend to answer questions with the answer that they think the interviewer wishes to hear, rather than giving the correct answer. The use of leading questions, which give the interviewee a clue as to the answer expected or desired, should therefore be avoided.

iii) Human memories are short, and there is a tendency to concentrate events into a more limited time period than was actually the case. So if livestock owners are asked about events that occurred in their animals over the last year, they tend to report events that happened over the last 2 or 3 years. This obviously exaggerates data on disease frequencies.

iv) Do not make interviews or questionnaires too long, or else the interviewee will get bored and the quality of his answers will suffer. To guard against this, the most important questions should be asked at the beginning.

v) Questions requiring subjective answers generate data that are extremely difficult to analyse. They should be avoided whenever possible, even though they may give valuable insights.

vi) Long, complicated questions tend to lead to misunderstanding and wrong answers.

4.5.2 Procedures involving measurements


If a high degree of precision is required in the study, the variable being investigated will normally have to be measured in some way. This may involve taking a biological specimen from an animal for a diagnostic test, weighing the animal, measuring milk yield, or measuring climatic variables such as rainfall, temperature etc. Before measuring begins, it is important to understand exactly what is being measured and what are the advantages and disadvantages of the method used. This applies particularly to diagnostic tests. If the procedure is complicated or involves complex equipment, the person using it must master all its aspects before the survey begins, to ensure that an acceptable level of consistency in the measurements is being obtained. The equipment used during a field investigation should be calibrated and checked for accuracy before the start of each series of measurements and should be regularly maintained.

4.5.3 Errors due to observations and measurements


Earlier in this chapter we discussed statistical techniques available to calculate the size of a sample that would give a population estimate with the precision required if:

- The study is performed exactly as it was originally designed; and
- All the statistical assumptions are fulfilled.

However, this does not take into account errors due to variations between observers and those inherent in the measurement procedures used. These errors may, in fact, be more important than the errors generated by faulty sampling procedures.

Errors due to variations between observers

Many epidemiological studies are conducted with the help of enumerators, usually field services staff, who visit the sample units and carry out the procedures required. If interviews are being conducted by such staff, answers may be received which could be subject to different interpretations by different individuals. To keep errors to a minimum, strict control should be maintained over the interview protocols and the interviewers monitored from time to time.

Variations between different observers may occur when some degree of subjective judgement is involved, as may be the case in the diagnosis of a disease. Criteria need to be established by which a diagnosis is arrived at and adhered to by all those engaged in the study. Such considerations are of particular importance in retrospective studies.

An additional problem frequently encountered is that of bias on the part of the observer. If an individual wishes to prove a particular point he may, quite unintentionally, be biased in recording his observations. This problem can be avoided by the use of a "blind" technique whereby the observer is kept ignorant of the distribution of the determinant in the groups being studied, merely being required to record a set of observations about those groups.

Errors due to measurements

Errors inherent in the procedures by which a variable is being measured are common in epidemiological studies. For example, if two weighing scales are being used in a study, one scale may consistently give a higher reading than the other. Obviously, careful checking and monitoring of such apparatus before and during the study will reduce errors of this kind. Further errors may occur when diagnostic tests are being used to determine the presence or absence of an infectious agent. The terms used to describe the reliability of diagnostic procedures are:

Repeatability, which is the ability of a diagnostic test to give consistent results.

Accuracy, which is the ability of a test to give a true measure of the variable being tested. Accuracy is normally measured by two criteria:

- Sensitivity, which is the capability of that test to identify an individual as being infected with a disease agent when that individual is truly infected with the disease agent in question. In other words, it gives the proportion of infected individuals in the sample that produce a positive test result.

- Specificity, which is the capability of that test to identify an individual as being uninfected with a disease agent when that individual is truly not infected with the disease agent in question. In other words, it gives the proportion of uninfected individuals in the sample that produce a negative test result.

These two terms are illustrated in Table 11.

Table 11. Estimated and true prevalences of a disease agent illustrating the terms specificity and sensitivity.
                        Number of individuals   Number of individuals   Total
                        infected                not infected
Positive test result    a                       b                       a + b
Negative test result    c                       d                       c + d
Total                   a + c                   b + d                   N

Notes: The estimated prevalence is (a + b)/N; the true prevalence is (a + c)/N. The sensitivity of the test is a/(a + c) and its specificity is d/(b + d).
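The four quantities defined in the notes to Table 11 can be computed directly from the cell counts. The sketch below is illustrative: the function name is hypothetical and the counts in the usage lines are assumed figures for 1000 animals, 100 of them infected, tested with a test that is 90% sensitive and 90% specific.

```python
def diagnostic_measures(a, b, c, d):
    """Measures from a 2x2 table laid out as in Table 11:
    a = infected and test-positive,   b = not infected and test-positive,
    c = infected and test-negative,   d = not infected and test-negative."""
    n = a + b + c + d
    return {
        "estimated_prevalence": (a + b) / n,
        "true_prevalence": (a + c) / n,
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
    }


# Assumed illustrative counts (not field data)
for name, value in diagnostic_measures(a=90, b=90, c=10, d=810).items():
    print(f"{name}: {value:.2f}")
```

With these counts the estimated prevalence (0.18) is almost double the true prevalence (0.10), even though both sensitivity and specificity are 0.90.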

Example 1: Suppose that we tested a sample of 1000 animals for the presence of a disease agent using a test of 90% sensitivity and 90% specificity. The results of the testing procedure are shown in Table 12.

Table 12 is somewhat artificial in that it gives the column totals, which we are trying to estimate. However, if the disease was distributed through the population in this way and we used a test that was 90% sensitive and 90% specific to estimate the extent of this distribution, we would arrive at an estimated prevalence of 180/1000, which would be an overestimate of the true prevalence of 100/1000. Of the 180 animals that the test identified as positive, 90 were, in fact, not infected with the disease, while of the 820 animals that the test identified as negative, 10 were, in fact, infected with the disease.

Table 12. Results of using a diagnostic test of 90% sensitivity and 90% specificity in a sample of 1000 animals in which the true prevalence of infection is 10%.

                        Number of individuals   Number of individuals   Total
                        infected                not infected
Positive test result     90                      90                      180
Negative test result     10                     810                      820
Total                   100                     900                     1000

Example 2: Suppose we used the same diagnostic test on a similar sample of animals but the true prevalence of the infection in the sample was 1%. The results of this test are given in Table 13.

Table 13. Results of using a diagnostic test of 90% sensitivity and 90% specificity in a sample of 1000 animals in which the true prevalence of infection is 1%.

                        Number of individuals   Number of individuals   Total
                        infected                not infected
Positive test result      9                      99                      108
Negative test result      1                     891                      892
Total                    10                     990                     1000

The true prevalence of the infection in this case is 10/1000 = 1%, while the estimated prevalence of infection is 108/1000 = 10.8%. Of the 108 animals that the test diagnosed as positive, 92% (i.e. 99/108) were, in fact, not infected with the disease agent in question. This leads us to another useful statistic, the diagnosability of a test, which is the proportion of test-positive individuals that are truly infected with the disease agent. In our first example the diagnosability was 90/180 = 50% while in the second it was 9/108 = 8.3%. Note that the diagnosability of a diagnostic test declines as the prevalence of a disease decreases. This means that sensitivity and specificity errors in diagnostic tests produce relatively much greater errors in prevalence estimates of diseases with low true prevalence than would be the case in diseases of high prevalence.

It is obviously desirable to use a test that is as sensitive and specific as possible, so that the numbers of false positives and false negatives in the sample are reduced. The sensitivity and specificity of a test can be determined by administering the test to a number of animals and then comparing its results with the results obtained from a series of detailed diagnostic investigations on the animals concerned. In order for the results to be valid, however, the animals selected for the evaluation must be representative of the population to which the test is to be applied.

Once the sensitivity and specificity of a test are known, a correction factor can be applied to the prevalence estimate to take into account the sensitivity and specificity of the test:

True prevalence = (Estimated prevalence + Specificity - 1)/(Sensitivity + Specificity - 1)

where all values are expressed as decimals. For our example 2 (Table 13):

True prevalence = (0.108 + 0.90 - 1)/(0.90 + 0.90 - 1) = 0.008/0.80 = 0.01 or 1%.

Note that although we can now correct the prevalence estimate, we still have no idea which of the individual animals are truly negative, falsely negative, truly positive and falsely positive. This problem can occur when diagnostic tests are being used in a test-and-slaughter policy for controlling a particular disease. Such policies are normally only implemented after a vaccination campaign has reduced the disease to a low prevalence, when the diagnosability of a test is likely to be low. In addition, vaccination itself often has an adverse effect on test sensitivity and specificity. We can see from our second example that if we slaughtered all the test positives, 92% of the animals being slaughtered would not be actually infected with the disease agent.

While it is relatively easy to make a test more sensitive, often by lowering the criteria by which a test result is deemed positive, this normally results in the test becoming less specific. Tests which are highly specific are often complicated, time consuming and, consequently, expensive. As such they can rarely be employed on a large scale. A way round this problem is to apply two separate and independent testing procedures. Initially, a screening test of high sensitivity is needed to ensure that as many infected animals as possible are detected. Once the initial screening test has been performed, all positive reactors can be reexamined by a second test of high specificity. Since only the positive reactors have to be examined and not the entire sample, this cuts down the cost of using a highly specific test.

Example: Suppose we were attempting to eradicate a disease of 1% prevalence from a population of 10 000 animals by a process of test and slaughter. If we first use a test of high sensitivity (95%) but low specificity (85%), our initial results would be as illustrated in Table 14.

Table 14. Results of a diagnostic test of 95% sensitivity and 85% specificity used to examine a population of 10 000 animals for the presence of a disease with true prevalence of 1%.

                        Number of individuals   Number of individuals   Total
                        infected                not infected
Positive test result     95                     1 485                    1 580
Negative test result      5                     8 415                    8 420
Total                   100                     9 900                   10 000

We then subject the 1580 test-positive animals to a further test of the same sensitivity but a higher specificity (Table 15).

Table 15. Results of a diagnostic test of 95% sensitivity and 98% specificity applied to the 1580 test-positive animals identified in Table 14.

                        Number of individuals   Number of individuals   Total
                        infected                not infected
Positive test result     90                        30                     120
Negative test result      5                     1 455                   1 460
Total                    95                     1 485                   1 580

This test indicates that we would need to slaughter 120 as opposed to 1580 animals. Admittedly, a few false negatives might have slipped through the testing procedure, but it is hoped that these would be picked up on subsequent testing.
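The arithmetic of the prevalence correction and of the two-stage testing example can be retraced with a short Python sketch. It is an illustration only: the function names are hypothetical, expected counts are rounded to whole animals, and the second test is assumed to behave independently of the first, as the text requires.

```python
def expected_counts(n_infected, n_uninfected, sensitivity, specificity):
    """Expected 2x2 counts (a, b, c, d), rounded to whole animals."""
    a = round(n_infected * sensitivity)            # true positives
    b = round(n_uninfected * (1.0 - specificity))  # false positives
    c = n_infected - a                             # false negatives
    d = n_uninfected - b                           # true negatives
    return a, b, c, d


def corrected_prevalence(estimated, sensitivity, specificity):
    """Correct an estimated prevalence for test error (values as decimals)."""
    return (estimated + specificity - 1.0) / (sensitivity + specificity - 1.0)


def diagnosability(a, b):
    """Proportion of test-positive animals that are truly infected."""
    return a / (a + b)


# Screening test on 10 000 animals at 1% true prevalence
a1, b1, c1, d1 = expected_counts(100, 9900, 0.95, 0.85)
# Second, more specific test applied only to the positive reactors
a2, b2, c2, d2 = expected_counts(a1, b1, 0.95, 0.98)

print("slaughtered after one test:", a1 + b1)    # 1580
print("slaughtered after two tests:", a2 + b2)   # 120
print("diagnosability, stage 1:", round(diagnosability(a1, b1), 3))
print("diagnosability, stage 2:", round(diagnosability(a2, b2), 3))
print("corrected prevalence:", round(corrected_prevalence(0.108, 0.90, 0.90), 4))
```

Retesting raises the share of truly infected animals among those slaughtered from about 6% to 75%, at the cost of the five infected animals missed at each stage.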

4.6 Basic considerations in the design of epidemiological investigations


4.6.1 Objectives and hypotheses

In this chapter we have illustrated some of the many problems that can be encountered in the design and implementation of epidemiological studies, and it may be useful at this point to summarise the basic considerations.

4.6.1 Objectives and hypotheses


A good way to approach the planning of a field study is to take the view that we are, in effect, buying information. We must make sure, therefore, that the study produces the information required at the lowest possible cost. We should also ask ourselves if that information can be obtained from other, cheaper sources. The processes involved in such considerations could be schematised as follows:

[Figure]

The first step is to (rite out clearly the o *ecti#es of the study and the data that (ill need to e generated in order to attain them. Throughout the entire planning process+ constant reference should e made to these o *ecti#es in order to ensure that the procedures eing planned are of rele#ance. If it is found that the resources a#aila le may not permit the achie#ement of the original o *ecti#es+ the o *ecti#es may ha#e to e redefined or additional resources found. 5 *ecti#es can often e defined y constructing a hypothesis. -n epidemiological hypothesis should, Specify the population to which it refers i. e. the population a out (hich one (ishes to ma.e inferences and therefore sample from. This is referred to as the target population. Sometimes+ for practical reasons+ the population actually sampled may e smaller than the target population. In such cases the findings of the study (ill relate to the sampled population+ and care must e e%ercised in e%trapolating inferences from the sampled population to the target population. 2requently+ inferences may e required a out different groups (ithin the target population. 2or e%ample+ one may (ant to estimate not only the o#erall pre#alence of a specific disease+ ut also the pre#alences or incidences of the disease in #arious groups or su sets of the population. To o tain estimates (ith the precision required+ the samples ta.en from these groups must e large enough+ and this (ill o #iously affect the design of the study. - further pro lem may occur (hen defining the actual units to e sampled (ithin a population. If+ for e%ample+ the sample unit (as a calf+ at (hat age e%actly does a calf cease eing a calf< -lternati#ely+ suppose the sample unit is a herd. )hat e%actly is meant y the term EherdE< If a li#estoc. o(ner has only one animal+ does that constitute a herd< 5 #iously+ the sample unit must e precisely defined and appropriate procedures designed to ta.e care of orderline cases.

Specify the determinant or determinants being considered 'an such disease determinants as EstressE+ EclimateE and managementE e defined accurately< 7o( are these determinants to e quantified and (hat measurements (ould e used in their quantification< )hat are the ad#antages and disad#antages of these methods of measurement< 7o( accurate are they< Specify the disease or diseases being considered. The criteria y (hich an animal is regarded as suffering from a particular disease must e carefully defined. )ill the disease e diagnosed on clinical symptoms alone< If so+ (hat clinical symptoms< -re there li.ely to e pro lems (ith differential diagnoses< )ill la oratory confirmation e needed< If so+ are there adequate la oratory facilities a#aila le< )ill they e a le to process all the samples su mitted< )ill diagnostic tests e used< 7o( accurate are these tests< 1emem er that studies ased solely on diagnostic tests may pro#ide data a out the rates of infection present in the population eing sampled+ ut they may not indicate (hether the infected animals are sho(ing signs of disease or not. -dditional data on mortalities and mor idities may ha#e to e generated. )hat rates are to e calculated< 1emem er that incidence and attac. rates cannot normally e o tained y a cross/sectional study. If estimates on economic losses due to particular diseases are required+ #arious production parameters may ha#e to e recorded. 7o( are these to e measured< 7o( good and ho( accurate (ill these measurements e< Specify the e)pected response induced by a determinant on the fre,uency of occurrence of a disease. In other (ords+ (hat effect (ould an increase or decrease in the frequency of occurrence of the determinant ha#e on the frequency of occurrence of the disease< 1emem er that the determinant must occur prior to the disease. This may e difficult to demonstrate in a retrospecti#e study. Make biological sense. 
In epidemiological studies we are interested in exploring relationships between the frequency of occurrence of determinants and the frequency of occurrence of disease. We are particularly interested in determining whether the relationship is a causal one, i.e. whether the frequency of occurrence of the particular variable being studied determines the frequency of occurrence of the disease. We analyse such relationships by the use of statistical tests, which tell us the probability that the relative distributions of the determinant and the disease in the studied populations occurred by chance. If there is a good probability that the distributions occurred by chance, the result is not significant and the distributions of the variable and the disease are taken to be independent. If there is a strong probability that the distributions did not occur by chance, the result is significant and the distributions of the variable and the disease are related in some way. Note that a statistically significant result does not necessarily imply a causal relationship.

Example: Suppose that the frequency of occurrence of variable A is determined by the frequency of occurrence of variable B, which also determines the frequency of occurrence of disease D. What is the relationship between variable A and disease D?

[Figure]

Note that although this arrangement would produce a statistically significant relationship between variable A and the disease D, the relationship is not a causal one, since altering the frequency of occurrence of variable A would have no effect on the frequency of occurrence of the disease, which is determined by variable B. Variables that behave in this way are known as confounding variables and can cause serious problems in the analysis of epidemiological data. For this reason, any hypothesis that is made about the possible association of a determinant and a disease should offer a rational biological explanation as to why this association should be.

Finally, remember that common events occur commonly and that often the simplest explanation for a disease phenomenon is the right one. Complicated hypotheses should not be tested until the simplest ones have been ruled out. For example, the presence of ticks on supposedly dipped animals is more likely to be due to a failure to dip the animals or to improper dipping procedure, rather than to the appearance of a new strain of acaricide-resistant ticks.

These considerations emphasise the need for careful and detailed planning of an epidemiological study. They also illustrate the need to obtain as comprehensive and detailed knowledge as possible about the subject being investigated and the techniques used in the investigation. The time spent reading relevant literature is therefore usually well spent. Extensive literature searches can often be performed quickly and easily by using modern information-processing techniques.

Do not be afraid to ask advice from experts. Such advice is essential when one is conducting investigations or employing techniques outside one's particular area of expertise. Remember that the time to ask for advice is before the study has begun.
Whenever possible, consult a statistician on the statistical design of the study in order to ensure that the data generated will be sufficient and can be analysed in the appropriate way to fulfil the objectives of the study.
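
The confounding example given earlier in this section can be made concrete with a small numerical sketch. The counts below are entirely hypothetical and were chosen so that variable A has no effect on disease D once variable B is taken into account. The function names `risk_ratio` and `chi_square` are illustrative, and the chi-square statistic is the ordinary Pearson statistic for a 2 x 2 table without continuity correction.

```python
# Illustrative counts only (hypothetical, not taken from any real survey).
# Each table is (a, b, c, d):
#   a = A present, diseased      b = A present, healthy
#   c = A absent,  diseased      d = A absent,  healthy


def risk_ratio(table):
    """Risk of disease where A is present divided by risk where A is absent."""
    a, b, c, d = table
    return (a / (a + b)) / (c / (c + d))


def chi_square(table):
    """Pearson chi-square statistic for a 2 x 2 table (no continuity correction)."""
    a, b, c, d = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Animals in which variable B is present: A is common and disease is common.
b_present = (400, 400, 100, 100)
# Animals in which variable B is absent: A is rare and disease is rare.
b_absent = (20, 180, 80, 720)
# Crude table: the two strata added together, ignoring variable B.
crude = tuple(x + y for x, y in zip(b_present, b_absent))

for name, table in [("crude", crude), ("B present", b_present), ("B absent", b_absent)]:
    print(f"{name:10s} risk ratio = {risk_ratio(table):.2f}, "
          f"chi-square = {chi_square(table):.1f}")
```

With these counts the crude table gives a risk ratio of about 2.3 and a chi-square statistic far above 3.84 (the 5% critical value for one degree of freedom), yet within each level of variable B the risk ratio is 1 and the statistic is 0. Stratifying by the suspected confounder in this way is only one possible approach, and a statistician should be consulted on the analysis appropriate to a real study.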

4.7 The use of existing data


4.7.1 Advantages and disadvantages
4.7.2 Sources of data

Collecting specific epidemiological data involves a considerable amount of time and effort in both the planning and implementation stages. Because of this, the possibility of using existing data should be explored before generating new data.

4.7.1 Advantages and disadvantages

The main advantages of using existing data are:

• Data collection is expensive; using existing data is cheaper, although not cost free.
• Time is often essential; analysis of existing data sources gives answers more quickly.
• By using data from various sources, it may become possible to monitor the progress of a disease through different populations and to establish linkages between disease events, so that the sources of disease outbreaks can be traced and populations likely to be at risk of the disease identified.
• The use of existing data sources will help strengthen them or induce the need for change.
• Since the original data collection was performed in ignorance of the ongoing study, there may be a reduced chance of bias in favour of or against any hypothesis being tested.

The main disadvantages encountered in the use of existing data include:

• Data sets are often incomplete. For example, national reports based on compilations of regional reports are almost invariably incomplete and frequently very late in appearing, as some regions are late in reporting. Parts of data sets may have been lent out and not returned.
• The data may have been collected for other purposes than those of the present study. For example, data collected initially for administrative or accounting purposes are unlikely to help identify the associations between a disease and its determinants.
• Existing data may be inconsistent or of unknown consistency. Observers change and so do recording systems. Changes in administrative procedures or policy may alter the type and method of data collection and complicate analysis.
• Random errors of counting or in reading instruments may cancel each other out in the long term, but errors are often not random. Scales may be consistently misread due to confusion over units and graduations. Different observers may consistently under- or overestimate livestock numbers, weights and ages and differ in their diagnosis of the same disease condition.
• Calculations of epidemiological rates are often prejudiced by ignorance of the size of the population at risk and of the time over which events were observed.
• The data may not be relevant. Records for Friesians will not be useful in estimating production losses in zebus. Although data may be readily available from commercial producers, they will not relate to the majority of rural enterprises. Since livestock production is dependent on weather, among other factors, data from a series of years need to be examined to obtain representative estimates of means and scatter. Even if such data are available from apparently similar farming systems, checking is necessary to identify any changes that might have occurred in the provision of services, health control, markets and prices before taking historical data as being a good estimate of animal health and production at present.

• The method used to collate and analyse the data may not be adequate for epidemiological purposes. If this is the case, the data may have to be obtained in the original form, if still available, and reanalysed. This may be a time-consuming process. Moreover, it may not be possible to subject the original data to the appropriate analysis.

There are nearly always some serious limitations in the value of existing data for epidemiological purposes. This does not mean that the data may not be useful; if the limitations are understood, the probability of their misinterpretation will be reduced.
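
The point made above about rates and the population at risk can be illustrated with a short sketch. It assumes, for simplicity, that the number of animals at risk stays constant over the observation period, which is rarely true in practice. The function name `incidence_rate` and the figures used are hypothetical.

```python
from typing import Optional


def incidence_rate(new_cases: int,
                   animals_at_risk: Optional[float],
                   years_observed: Optional[float]) -> Optional[float]:
    """New cases per animal-year at risk, or None if a denominator is unknown.

    Simplifying assumption: the population at risk is constant throughout
    the observation period.
    """
    if not animals_at_risk or not years_observed:
        # A count of cases alone cannot be turned into a rate.
        return None
    return new_cases / (animals_at_risk * years_observed)


# Hypothetical ranch record: 30 new cases among 200 animals over six months.
print(incidence_rate(30, 200, 0.5))

# Hypothetical clinic casebook: 30 cases, but the source population is unknown.
print(incidence_rate(30, None, 0.5))
```

The first call gives 0.3 cases per animal-year at risk; the second returns `None`, which reflects the situation described above in which existing records report cases without the population or the time period to which they relate.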

4.7.2 Sources of data


In Africa, epidemiological data can be obtained from the following potential sources:

Livestock producers. Little or no recorded data are generated directly by traditional livestock producers. Where livestock development projects, government, parastatal, or commercial farming are operating, records may be kept. Such records can often furnish data on production parameters, births, deaths, purchases and sales, husbandry practices, the frequency of occurrence of specific diseases, particularly those that produce distinct and easily recognizable symptoms, and disease control inputs such as vaccinations, dipping, treatments, diagnostic tests etc.

The quality of such data fluctuates widely. Staff may change, and individual animal records may be lost or destroyed on removal of the animals. Historic records may give no indication of the population at risk. If record cards of different groups of animals (e.g. infertile and milking cows) are kept separately, care should be taken that all available records are, in fact, examined. If data on disease are being collected, it is necessary to know the diagnostic criteria used and who made the diagnosis, so that the likely problem of differential diagnoses can be assessed. When disease recording is attempted by farm staff, there is often a tendency not to record common conditions, such as mastitis, neonatal mortalities and lameness, whereas the incidence of dramatic diseases or sudden death is given undue prominence. Cross-checking with records on veterinary inputs may help to reveal serious discrepancies.

The main disadvantage of the data generated by livestock producers is that they often relate to specific populations of livestock which may be atypical, in terms of breed, husbandry practices and disease control inputs, of the general livestock populations of the country.

Veterinary offices, treatment and extension centres.
The data produced from such sources are likely to be in the form of casebooks, treatment records, vaccination and drug returns, outbreak reports etc. The main problem with such data lies in relating them to a source population. They are frequently incomplete and may contain significant omissions, particularly with regard to those diseases that are either treated by livestock owners themselves or for which treatment is unavailable. Veterinarians may vary considerably in their diagnostic ability and preferences. As a result, increases or decreases in the occurrence of specific diseases which may be reflected in the records may not, in fact, be due to actual increases or decreases in disease incidence but rather to the replacement of one veterinarian by another, or to a greater efficiency in overcoming operational constraints, or to the provision of additional drugs, equipment and facilities. An increased awareness on the part of livestock owners of a particular disease problem or more selective diagnosis and treatment may also lead to an apparent increase in recorded incidence.

Probably the most useful data from such sources are those related to notifiable disease outbreaks, on which detailed reports have to be compiled. If the report forms have been properly designed and the investigative procedures specified, such data may allow the appropriate rates to be calculated. However, owners may be reluctant to report such diseases in their livestock, especially if they know that restrictions are likely to be imposed.

Diagnostic laboratories. The data generated by diagnostic laboratories often provide precise diagnoses of disease conditions but can be highly selective. The relative frequencies with which specific diagnoses are reported often reflect the standard and range of laboratory facilities, and the interests or expertise of the field staff and laboratory workers, rather than the actual situation in the field. Unless the laboratory has a field survey capacity, incidence and prevalence rates cannot be established, since the data on diagnoses obtained cannot be related to a source population. Nevertheless, such data are often useful in highlighting disease problems which are of particular concern to the individuals submitting the specimens. The minimum knowledge that disease x was confirmed in location y at time z provides some basis on which to build.

Research laboratories, institutions and universities. Most of the data generated by these institutions are likely to come from experiments and may be difficult to relate to the situation in the field. Nevertheless, if research is being conducted into a particular disease, the data generated are likely to provide valuable insights into the epidemiology of the disease in question. Such institutions are also good sources of reference and advice.
Slaughter houses and slaughter slabs. The data generated from these sources are normally in the form of findings at meat inspection, and may be recorded in a limited and highly administrative format. Major variations in the sensitivity and specificity of diagnoses may occur between different inspectors. The data only pertain to certain sections of livestock populations, being highly biased since mostly healthy young adults are examined. Significant omissions are common, and relatively rare pathological conditions are not usually differentiated, but the data may provide information on congenital abnormalities and chronic disease conditions which produce distinctive lesions. Slaughter houses and slaughter slabs are frequently used as a starting point for epidemiological investigations since they have facilities for conducting examinations and taking specimens that are not available elsewhere.

Marketing organizations. Data from marketing organizations provide information on sales and offtake and sometimes also on livestock movements. Information on the latter might be used to trace back disease outbreaks to their sources. Unfortunately, this is rarely the case in Africa, since animals are seldom individually identified and therefore their movements cannot be accurately recorded.

Control posts and quarantine stations. Records from these facilities can provide information about livestock movements and outbreaks of notifiable diseases.

Artificial insemination services. Records from AI services may be of assistance in providing some information about fertility. The data are normally collected in the form of non-return rates, i.e. the proportions of first, second, third inseminations etc for which no further insemination is requested. Such rates often give an overestimate of the true reproductive performances in the populations concerned. Many AI services include a facility for the investigation of infertility problems. Data from such a facility can be of interest but are difficult to relate to a source population.

Insurance companies. Since these companies now offer insurance cover for high-value animals, and may offer limited cover for animals of lower value, they need to calculate and monitor risks, an interest they share with the epidemiologist. Their records may therefore be useful, but only limited data may be available.

The time required to identify and analyse existing records should not be underestimated, while their value needs to be carefully weighed against the cost. A quick but comprehensive survey of such material should indicate whether it will provide the required answers.
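
The non-return rate described above under artificial insemination services can be sketched as follows. The function name `non_return_rate` and the figures are hypothetical; actual AI services may define the follow-up period and the counting rules differently.

```python
def non_return_rate(inseminations: int, repeat_requests: int) -> float:
    """Proportion of inseminations for which no further insemination was requested."""
    if inseminations <= 0:
        raise ValueError("inseminations must be a positive number")
    if not 0 <= repeat_requests <= inseminations:
        raise ValueError("repeat_requests must lie between 0 and inseminations")
    return (inseminations - repeat_requests) / inseminations


# Hypothetical service record: 500 first inseminations, 175 repeat requests.
print(f"First-insemination non-return rate: {non_return_rate(500, 175):.2f}")
```

This example gives a non-return rate of 0.65. The calculation also shows why such rates tend to overestimate reproductive performance: any cow that is not presented again, for whatever reason (for example because it was sold, died or was served naturally), is counted as a non-return whether or not it conceived.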

4.8 Monitoring and surveillance


4.8.1 Epidemiological surveillance
4.8.2 Epidemiological monitoring

One of the most important activities in veterinary epidemiology is the continuous observation of the behaviour of disease in livestock populations. This is commonly known as monitoring or surveillance. The term surveillance refers to the continuous observation of disease in general in a number of different livestock populations, while monitoring normally refers to the continuous observation of a specific disease in a particular livestock population.

4.8.1 Epidemiological surveillance


Surveillance activities involve the systematic collection of data from a number of different sources. These may include already existing data sources as well as new ones that have been created for specific surveillance purposes. The data are then analysed in order to:

• Provide a means of detecting significant developments in existing disease situations, with particular reference to the introduction of new diseases, changes in the prevalence or incidence of existing diseases, and the detection of causes likely to jeopardise existing disease control activities, such as the introduction of new strains of disease agents, changes in systems of livestock management, changes in the extent and pattern of livestock movements, the importation of livestock and their products, and the introduction of new drugs, treatment regimes etc.

• Trace the course of disease outbreaks with the objective of identifying their sources and the populations of livestock likely to be at risk.
• Provide a comprehensive and readily accessible database on disease in livestock populations for research and planning purposes.

The prime objective of such activities is, however, to provide up-to-date information to disease control authorities to assist them in formulating policy decisions and in the planning and implementation of disease control programmes.

Although a detailed discussion on the design and implementation of surveillance systems is beyond the scope of this manual, it may be useful to review briefly some of the considerations involved.

The success of any surveillance or monitoring system depends largely on the speed and efficiency with which the data gathered can be collated and analysed, so that up-to-date information can be rapidly disseminated to interested parties. As a result of recent advances in data processing techniques, particularly in the field of computing, the development of comprehensive and efficient surveillance and monitoring systems at a reasonable cost is now within the reach of most veterinary services.

The capacity of epidemiological units to employ these modern techniques means that such units may be able to offer data-processing services to institutions and organisations in return for the use of their data. This has removed one of the main constraints on the development of such systems in the past, which was the reluctance of various data-generating sources to make their data available to those responsible for surveillance. Such cooperation depends on a clear identification of the information needs of reporting organisations and fulfilling these rapidly and efficiently.

Modern computerised data processing allows complicated analytical procedures to be carried out on large volumes of data quickly and easily.
However, such procedures must be used with a great deal of caution and only on data which justify them. If used on incomplete or inaccurate data whose limitations are not understood, they may produce results which are at best confusing and at worst misleading. For this reason, the analysis of surveillance or monitoring data should be kept simple and the limitations of the information produced should be clearly stated.

A further consideration is that of confidentiality. Any surveillance or monitoring system will contain a certain amount of confidential data. If such data get into the wrong hands and are used indiscriminately without due regard to their probable limitations, serious problems may result. Appropriate safeguards need to be designed, therefore, to ensure that information is distributed to interested parties on a confidential and need-to-know basis.

4.8.2 Epidemiological monitoring


Epidemiological monitoring may include the use of existing routine data sources as well as of specific epidemiological field studies. Monitoring of a specific disease in a population is, in effect, a specialised form of a longitudinal study. The design of any individual monitoring programme will depend largely on the disease or control programme being monitored, e.g. monitoring a vaccination programme would require different types of data than monitoring a tick control programme by dipping. The following objectives should be borne in mind in the design of monitoring systems:

• If control measures are being employed, the monitoring programme should provide a means to ascertain whether these measures are being carried out promptly and efficiently as specified in the programme design, and if not, why not.
• The monitoring programme should provide a means to ascertain whether the control measures being applied are having the desired and predicted effect on disease incidence. This normally implies a prompt and comprehensive disease-reporting system. The system should not be passive, but should include a component that is actively concerned with searching out disease outbreaks.
• The monitoring programme should provide a means for a rapid detection of developments which might jeopardise the control programme, or, in instances where no control measures are being implemented, which might warrant the introduction of control activities.
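
The second objective above, checking whether control measures are having the predicted effect on incidence, might be approached in its simplest form as sketched below. This is an illustration under stated assumptions rather than a recommended design: the function name `flag_periods`, the `tolerance` parameter and all figures are hypothetical, and the sketch takes no account of sampling variation or under-reporting.

```python
def flag_periods(observed_cases, animals_at_risk, predicted_incidence, tolerance=0.0):
    """Return the indices of reporting periods whose observed incidence
    exceeds the incidence predicted under the control programme."""
    flagged = []
    periods = zip(observed_cases, animals_at_risk, predicted_incidence)
    for index, (cases, at_risk, predicted) in enumerate(periods):
        if at_risk <= 0:
            raise ValueError("population at risk must be positive")
        if cases / at_risk > predicted + tolerance:
            flagged.append(index)
    return flagged


# Hypothetical figures for three consecutive reporting periods.
observed = [12, 9, 15]            # new cases reported in each period
at_risk = [400, 400, 400]         # animals at risk in each period
predicted = [0.04, 0.03, 0.02]    # incidence expected if control is working

print(flag_periods(observed, at_risk, predicted))
```

With these figures only the third period (index 2) is flagged, since its observed incidence of 0.0375 exceeds the predicted 0.02. In keeping with the advice given earlier on surveillance data, the comparison is deliberately kept simple; a flagged period is a prompt for field investigation and not evidence in itself that the programme has failed.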