
Biostatistics - I

Presented by :
Kush Pathak
Contents
Introduction
History
Applications and uses of biostatistics in science
Common statistical terms
Common symbols used
Data -

a) Collection and types

b) Presentation

c) Analysis

d) Interpretation
Limitations
Conclusion
References

Introduction
"There are three kinds of lies: lies, damn lies, and statistics."
Attributed to Benjamin Disraeli (popularized by Mark Twain).

The word statistics conveys a variety of meanings to people. It is
known for handling data in general and in the field of research.

The word "statistics" comes from the Italian word 'statista',
meaning 'statesman', or the German word "statistik", each of
which refers to the political state.

It comes from two main sources:
1) Government records
2) Mathematics

John Graunt (1620 - 1674) was the father of health statistics.


Definitions
Statistics : The science of collecting, summarizing, presenting, analysing
and interpreting data is called statistics.

Biostatistics : The method of collecting, organizing, analysing,
tabulating and interpreting data related to living organisms and human
beings is called biostatistics.

[Soben Peter. Essentials of preventive and community dentistry, 2nd edition. New Delhi : Arya; 2006. 824]

HISTORY
John Graunt (1620 - 1674)
Father of Health Statistics

THE HISTORY OF STATISTICS HAS ITS ROOTS IN BIOLOGY

Sir Francis Galton

- Pioneered the use of fingerprints for identification
- Study of heredity of quantitative traits
- Regression & correlation

Karl Pearson

- Polymath
- Studied genetics
- Correlation coefficient
- χ² test
- Standard deviation

Sir Ronald Fisher

- The Genetical Theory of Natural Selection
- Founder of population genetics
- Analysis of variance
- Likelihood
- P-value

APPLICATIONS AND USES OF BIOSTATISTICS IN SCIENCE

IN PHYSIOLOGY AND ANATOMY :

- To define the limits of normality for variables such as height,
  weight, blood pressure etc. in a population.
- Variation beyond natural limits may be pathological, i.e.
  abnormal due to the play of certain external factors.
- To find the correlation between two variables like height and
  weight.

IN PHARMACOLOGY

- To find the action of drugs
- To compare the action of two drugs or two successive dosages
  of the same drug
- To find the relative potency of a new drug with respect to a
  standard drug

IN MEDICINE

- To compare the efficacy of a particular drug, operation or
  line of treatment
- To find association between two attributes such as cancer and
  smoking
- To identify signs and symptoms of disease

IN COMMUNITY MEDICINE AND PUBLIC HEALTH

- To test the usefulness of a vaccine in the field
- In epidemiologic studies the role of causative factors is
  statistically tested

FOR STUDENTS :

- By learning the methods of biostatistics a student learns to
  evaluate articles published in medical and dental journals
  or papers read at medical and dental conferences.
- The student also understands the basic methods of observation in
  clinical practice and research.

Common Statistical terms

VARIABLES :

A characteristic that takes different values for different persons, places or
things.

A quantity that varies between limits, e.g. height, weight, blood
pressure, age etc.

Denoted as X and, for an orderly series, as X1, X2, X3 … Xn

Sigma (Σ) stands for summation of results or observations.

CONSTANT :

Quantities that do not vary, such as π = 3.141, e = 2.718

These do not require statistical study.

e.g. in biostatistics, mean and standard deviation are considered
constant for a population.

OBSERVATION :

An event and its measurement, such as blood pressure (the event) and 120 mm of Hg (the measurement)

OBSERVATIONAL UNIT :

The source that gives observations, such as an object or a person.

In medical statistics, the term individual or subject is used more often.

DATA :

A set of values recorded on one or more observational units.

POPULATION :

Population includes all persons, events and objects under study.

It may be finite or infinite.

SAMPLE :

Defined as a part of a population, generally selected so as to be
representative of the population whose variables are under study.

PARAMETER

It is a constant that describes a population, e.g. in a college 40% of the
students are girls. This describes the population, hence it is a parameter.

STATISTIC

A statistic is a value that describes a sample, e.g. out of 200 students
of the same college, 45% are girls. This 45% is a statistic as it
describes the sample.

ATTRIBUTE

A characteristic on the basis of which the population can be divided into
categories or classes, e.g. gender, caste, religion.

Commonly used symbols

=            Equal to
>            Greater than
<            Lesser than
Z            No. of standard deviations
%            Percentage
r            Pearson's correlation coefficient
ρ            Spearman's rank correlation coefficient
d.f. or f    Degree of freedom
K            Number of groups or classes
P            Probability
O            Observed number
E            Expected number
DATA
A set of values recorded on one or more observational units is called
data.

It is of two types :

QUALITATIVE (discrete) data

QUANTITATIVE (continuous) data

Collection of health information


A. Census :

The United Nations defines a census as "the total process of collecting, compiling
and publishing demographic, economic and social data pertaining at
a specified time or times, to all persons in a country or delimited
territory."

It is an important source of health information.

The first regular census in India was taken in 1881, and others took place at 10
year intervals.

The primary function of the census is to provide demographic information such as
the total count of the population and its breakdown into groups and subgroups
such as age and sex distribution.

The population census provides the basic data (by age and sex) needed to compute
vital statistical rates and other health, demographic and socio-economic
indicators.

B. Registration of vital events :

The United Nations defines a vital events registration system as including "legal
registration, statistical recording and reporting of the occurrence of, and
the collection, compilation, presentation, analysis and distribution of
statistics pertaining to vital events, i.e. live births, deaths, fetal deaths,
marriages, divorces, adoptions, legitimations, recognitions, annulments and
legal separations."

It keeps a continuous check on demographic changes.

In 1873, the Govt. of India passed the Births, Deaths and Marriages
Registration Act. But the registration system in India still tended to be
very unreliable, the data being grossly deficient in regard to accuracy,
timeliness, completeness and coverage.

Due to this, other actions were taken :

- The Central Births and Deaths Registration Act, 1969 :

The Central Births and Deaths Registration Act was promulgated in 1969
and came into force on 1st April 1970.

The time limit for registering births is 14 days and that for
deaths is 7 days. In case of any default, a fine of Rs. 50 was imposed.

- Lay Reporting :

It is defined as "collection of information, its use and transmission to
other levels of the health system by non-professional health workers."

Some countries have attempted to employ first-line health workers (e.g.
village health guides) to record births and deaths in a community.

C. Sample Registration System (SRS) :

It is a dual record system consisting of continuous enumeration of
births and deaths by an enumerator and an independent survey
every 6 months by an investigator-supervisor.

It was initiated in the mid 1960s to provide reliable estimates of birth
and death rates at the national and state levels.

It is a major source of health information.

D. Notification of Diseases :

Its primary purpose is to effect prevention and/or control of the diseases.

Also a valuable source of morbidity data.

Diseases which are considered to be serious menaces to public health are
included in the list of notifiable diseases.

Limitations :
a) Covers only a small part of total sickness in the community.
b) System suffers from a good deal of under-reporting.
c) Many cases, especially atypical and subclinical cases, escape notification
   due to non-recognition.

E. Hospital records :

They constitute a basic and primary source of information about diseases
prevalent in the community.

Drawbacks :
a) Provide information on only those patients who seek medical care.
b) Admission policy may vary from hospital to hospital.
c) Population served by a hospital cannot be defined.

F. Disease Registers :

Provide a permanent record of diseases and the morbidity caused by
them.

If the reporting system is effective and the coverage is on a national basis,
a register can provide useful data on morbidity and disease-specific
mortality.

G. Record Linkage :

Used to describe the process of bringing together records relating to
one individual, the records originating at different times or
places.

Medical record linkage implies the assembly and maintenance, for each
individual in a population, of a file of the more important records
relating to his or her health.

Problem : Volume of data accumulated. Therefore, in practice, record
linkage has been applied only on a limited scale, e.g. twin studies,
measurement of morbidity, chronic diseases etc.

H. Environmental health data :

These statistics now provide data on various aspects of air, water and
noise pollution; harmful food additives; industrial intoxicants etc.

I. Health manpower statistics :

Relates to physicians, dentists, pharmacists, veterinarians, nurses,
technicians etc.

Their records are maintained by state medical / dental / nursing councils
and directorates of medical education.

J. Population surveys :

Carried out for epidemiological studies by trained teams to find the
incidence or prevalence of health or disease in a community.

Provide useful information on :
- Changing trends in health status.
- Timely warning of public health hazards.
- Feedback expected to modify policy and system.

Health surveys can be classified as :
a) Health interview (face to face) survey
b) Health examination survey
c) Health records survey
d) Mailed questionnaire survey

K. Non-quantifiable information :

Health planners also need non-quantifiable information, e.g. health policies,
health legislation, public attitudes, programme costs, procedures
and technologies.

Types of Data
Qualitative or discrete data :

When the data is collected on the basis of attributes or qualities like
sex, malocclusion, cavities etc., it is called qualitative data.

The number of persons having the same attribute is variable and is
what is measured.

e.g. Out of 100 people, 75 have diabetes, 15 have T.B and 10
have anemia.

Here diabetes, T.B and anemia are attributes which cannot be
measured in figures. Only the number of people having them can be
determined.

Quantitative or continuous data :

When the data is collected through measurement using calipers
etc., it is called quantitative data.

In such classification there are two variables :

Characteristic - such as height

Frequency - i.e. number of persons with the same
characteristic and in the same range

e.g.
Height of one person is 150 cm and of another is 160 cm, and both are of
the same age and sex.

Persons with 150 cm or in the range of 150 - 152 cm may be 10 and those
with 160 cm or in the range of 160 - 162 cm may be 20.

Thus we find out the characteristic and the frequency. Both vary from person
to person as well as group to group.
Presentation
- Tabulation
- Drawings

Tabulation :
Is the most common method

Data presentation is in the form of columns and rows

It can be of the following types
- Simple tables
- Frequency distribution tables

Simple tables :

Month and Year     Number of biopsies performed in Oral Pathology department
January 2010       15
June 2010          21
December 2010      26

Frequency Distribution tables :

In a frequency distribution table, the data is first split into convenient
groups (class intervals) and the number of items (frequency)
which occur in each group is shown in the adjacent column.

No. of biopsies sent from different departments to Oral Pathology department.

Year and month   Oral surgery   Oral Medicine   Cons and Endo   Pediatric Dept.   Perio.   Private Clinics
January 2010     6              2               3               1                 1        2
June 2010        11             NIL             2               2                 2        4
Dec 2010         19             NIL             1               2                 1        3
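
The grouping step behind a frequency distribution table can be sketched in Python. The heights, the 5 cm class width and the starting limit below are hypothetical values chosen only for illustration; they are not data from the slides.

```python
from collections import Counter

# Hypothetical heights in cm (illustrative only, not from the slides).
heights = [150, 151, 152, 157, 160, 161, 150, 162, 160, 151, 161]

CLASS_WIDTH = 5  # assumed class width: 150-154, 155-159, 160-164, ...
START = 150      # assumed lower limit of the first class interval


def class_interval(value, start=START, width=CLASS_WIDTH):
    """Return the (lower, upper) limits of the class interval containing value."""
    lower = start + ((value - start) // width) * width
    return (lower, lower + width - 1)


# Frequency = number of observations falling in each class interval.
frequency = Counter(class_interval(h) for h in heights)

for (lower, upper), count in sorted(frequency.items()):
    print(f"{lower} - {upper} cm : {count}")
```

The class width is a judgment call: too wide hides the shape of the data, too narrow leaves many nearly empty classes.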
Charts and Drawings :

Useful method of presenting statistical data

Powerful impact on the imagination of people

Presentation of quantitative data is done through graphs. They are :
- Histograms
- Frequency polygons
- Frequency curve
- Line chart or graph
- Cumulative frequency diagram
- Scatter or dot diagram

Presentation of qualitative data is done through diagrams. They are :
- Bar
- Pie or sector
- Pictogram or picture diagram
- Map diagram or spot map
Histograms

Pictorial presentation of a frequency distribution.

Consists of a series of rectangles.

Class intervals are given on the horizontal axis and frequencies on the vertical axis.

Area of each rectangle is proportional to the frequency.

Frequency Polygon

Obtained by joining the midpoints of the histogram blocks, at the height of
the frequency, by straight lines, usually forming a polygon.

Frequency curve :
When the number of observations is very large and the class interval is
reduced, the frequency polygon loses its angulations, becoming a
smooth curve known as the frequency curve.

Line Chart

Line diagrams are used to show the trend of events with the passage of
time.

Cumulative frequency diagram

Graphical representation of cumulative frequency.

The cumulative frequency of a class is obtained by adding its frequency
to the frequencies of all the previous classes.
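
As a minimal sketch of the running total behind a cumulative frequency diagram, the snippet below reuses the biopsy counts from the simple table in the Presentation section (15, 21 and 26) purely as illustrative frequencies.

```python
from itertools import accumulate

# Frequencies taken from the simple table of biopsies (used here only as an example).
labels = ["January 2010", "June 2010", "December 2010"]
frequencies = [15, 21, 26]

# Each cumulative value = frequency of the class + frequencies of all previous classes.
cumulative = list(accumulate(frequencies))

for label, f, cf in zip(labels, frequencies, cumulative):
    print(f"{label}: frequency = {f}, cumulative frequency = {cf}")
```

The last cumulative value always equals the total number of observations, which is a quick check on the arithmetic.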

Scatter or Dot diagram

Shows the relationship between two variables.

If the dots cluster along a straight line, the relationship is
linear in nature.

Bar Chart
Length of the bars, drawn vertically or horizontally, is proportional to the
frequency of the variable.

A suitable scale is chosen.

Bars are usually equally spaced.

They are of three types :
- Simple bar chart
- Multiple bar chart
- Component bar chart

Simple bar chart

Multiple bar chart :

Two or more variables are grouped together.

Component bar chart :

Bars are divided into two or more parts.

Each part represents a certain item and is proportional to the magnitude of
that item.
Pie chart
In this, the frequencies of the groups are shown as segments of a circle.

The degree of the angle denotes the frequency.

The angle is calculated by

(class frequency × 360) / total observations
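
The angle formula can be checked with a short Python sketch. It uses the counts from the qualitative-data example earlier in the deck (75 with diabetes, 15 with T.B and 10 with anemia out of 100 people); the variable names are illustrative.

```python
# Counts from the qualitative-data example: 100 people in three groups.
groups = {"Diabetes": 75, "T.B": 15, "Anemia": 10}

total = sum(groups.values())

# Angle of each sector = class frequency x 360 / total observations.
angles = {name: count * 360 / total for name, count in groups.items()}

for name, angle in angles.items():
    print(f"{name}: {angle:.0f} degrees")
```

The sector angles add up to 360 degrees, so the whole circle is accounted for.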

Pictogram
Popular method of presenting data to the common man.

Spot map or Map diagram

These maps are prepared to show the geographic distribution of
frequencies of characteristics.

Analysis
The average value in a distribution is the one central value around which
all the other observations are concentrated.

Average value helps :
- To find the most characteristic value of a set of measurements.
- To find which group is better off by comparing the average of
  one group with that of another.

[K. Park. Preventive and social medicine, 20th edition : McGraw-Hill Medical; 2009. 749]

Most commonly used averages are
- Mean
- Median
- Mode

Mean
Refers to the arithmetic mean.

Individual observations are first added together, and then divided by
the number of observations.

Addition of the observations is called 'summation' and is denoted by
Σ or S.

Individual observations are denoted by X and the mean is denoted by X̄
('X' bar).

X̄ = (X1 + X2 + X3 + … + Xn) / n

e.g. The diastolic blood pressure of 10 individuals was 83, 75, 81, 79,
71, 95, 75, 77, 84, 90. The total was 810, which was then divided
by 10, resulting in 81.0

Advantages - It is easy to calculate.

Disadvantages - Influenced by extreme values.
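
The worked example can be reproduced in a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
from statistics import mean

# Diastolic blood pressure of the 10 individuals in the example.
diastolic_bp = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]

total = sum(diastolic_bp)          # summation of the observations
n = len(diastolic_bp)              # number of observations
x_bar = total / n                  # mean = total / n

print(f"total = {total}, n = {n}, mean = {x_bar}")

# The standard library gives the same result.
print(mean(diastolic_bp))
```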

Median
When all the observations are arranged in either ascending or
descending order, the middle observation is known as the median.

In case of an even number of observations, the average of the two middle values is taken.

The median is often a better indicator of the central value as it is not affected by
extreme values.

With 9 values :

Diastolic Blood Pressure (unarranged) : 83, 75, 81, 79, 71, 95, 75, 77, 84
Diastolic Blood Pressure (arranged)   : 71, 75, 75, 77, 79 (median), 81, 83, 84, 95

In case there are 10 values instead of 9 :

Diastolic Blood Pressure (unarranged) : 83, 75, 81, 79, 71, 95, 75, 77, 84, 90
Diastolic Blood Pressure (arranged)   : 71, 75, 75, 77, 79, 81, 83, 84, 90, 95

Median = (79 + 81) / 2 = 80
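
Both cases, an odd and an even number of observations, can be checked with a short Python sketch using the same blood pressure readings.

```python
from statistics import median

# The 9 diastolic blood pressure readings from the example.
nine_values = [83, 75, 81, 79, 71, 95, 75, 77, 84]

# The same readings with the tenth value (90) added.
ten_values = nine_values + [90]

print(sorted(nine_values))   # arranged in ascending order
print(median(nine_values))   # odd count: the middle observation

print(sorted(ten_values))
print(median(ten_values))    # even count: average of the two middle values
```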
Mode
The most frequently occurring observation, or most 'fashionable' value, in a
series of observations is called the mode.

E.g. diastolic blood pressure of 20 individuals is 85, 75, 81, 79, 71, 95,
75, 77, 75, 90, 71, 75, 79, 95, 75, 77, 84, 75, 81, 75.

Here the most frequently occurring value is 75.

Advantages :
- It is easy to understand.
- Not affected by extreme items.

Disadvantages :
- Exact location is often uncertain and not clearly defined.

[Therefore, mode is not often used in
biological or medical statistics.]
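
A short Python sketch of finding the mode by counting how often each value occurs, using the 20 readings from the example.

```python
from collections import Counter
from statistics import mode

# Diastolic blood pressure of the 20 individuals in the example.
diastolic_bp = [85, 75, 81, 79, 71, 95, 75, 77, 75, 90,
                71, 75, 79, 95, 75, 77, 84, 75, 81, 75]

# Count the occurrences of each value and pick the most frequent one.
counts = Counter(diastolic_bp)
most_frequent_value, occurrences = counts.most_common(1)[0]

print(f"mode = {most_frequent_value} (occurs {occurrences} times)")

# The standard library agrees.
print(mode(diastolic_bp))
```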

Interpretation
Test of Significance :

Whatever the sampling procedure or the care taken while selecting the
sample, the sample statistics will differ from the population
parameters.

Variations between 2 samples drawn from the same population may
also occur.

Differences may also be observed between the results of two research
workers carrying out the same investigation.

So, it becomes important to find out the significance of this observed variation,

i.e. whether it is due to
- chance or biological variation (statistically not significant) OR
- the influence of some external factors (statistically significant)

To test whether the variation observed is of significance, various tests of
significance are done.

Tests of significance can be broadly classified as
- Parametric tests
- Non parametric tests

Parametric Tests
Parametric tests are those tests in which certain assumptions are made about
the population :

- The population from which the sample is drawn has a normal distribution.
- The variances of the samples do not differ significantly.
- The observations are truly numerical, thus arithmetic procedures such as
  addition, division and multiplication can be used.

Since these tests make assumptions about the population parameters, they are
called parametric tests.

These are usually used to test the difference.

They are:
- Student t test (paired or unpaired)
- ANOVA

ANOVA

Analysis of variance
Investigations may not always be confined to a comparison of 2 samples
only,
e.g. we might like to compare the difference in vertical dimension
obtained using 2 or more methods like phonetics and swallowing.
In such cases, where more than 2 samples are used, ANOVA can be
used.
Also, when measurements are influenced by several factors playing
their role, e.g. factors affecting retention of a denture, ANOVA can
be used.
ANOVA helps to decide which factors are more important.

Requirements

- Data for each group are assumed to be independent and
  normally distributed
- Sampling should be at random

One way ANOVA :

- Where only one factor will affect the result between 2 groups

Two way ANOVA

- Where we have 2 factors that affect the result or outcome.

Multi way ANOVA

- Three or more factors affect the result or outcome between
  groups

Student t test
It was given by W. S. Gosset, whose pen name was 'Student'.

There are two types of Student t test :

1. Unpaired t test
2. Paired t test

Unpaired t test
Applied to unpaired data of observations made on individuals of 2
separate groups, to find the significance of the difference between 2
means.

Sample size is less than 30.

e.g. difference in accuracy of an impression using two different
impression materials

Steps in unpaired t test are :

- Calculate the means of the two samples.
- Calculate the combined standard deviation (SD).
- Calculate the standard error of mean, which is given by
  SEM = SD × √(1/n1 + 1/n2)
- Calculate the observed difference between the means, X̄1 - X̄2
- Calculate t value = observed difference / standard error of mean
- Determine the degrees of freedom, which for one sample is one less than the
  number of observations in the sample (n - 1).
  Here the combined degrees of freedom will be (n1 - 1) + (n2 - 1)
- Refer to the table and find the probability of the t value corresponding to
  the degrees of freedom

P < 0.05 states that the difference is significant

P > 0.05 states that the difference is not significant
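
The steps of the unpaired t test can be sketched in Python as follows. The two groups of measurements are hypothetical. The slides do not give the formula for the combined standard deviation, so the usual pooled-variance formula is assumed here, and the critical value 2.306 is the tabulated two-tailed t value for 8 degrees of freedom at P = 0.05.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical measurements from two separate groups (illustrative only).
group_1 = [10, 12, 14, 16, 18]
group_2 = [11, 15, 17, 19, 23]

n1, n2 = len(group_1), len(group_2)

# Step 1: means of the two samples.
mean_1, mean_2 = mean(group_1), mean(group_2)

# Step 2: combined SD (assumption: pooled-variance formula, not stated in the slides).
pooled_variance = ((n1 - 1) * variance(group_1)
                   + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
sd = sqrt(pooled_variance)

# Step 3: standard error of mean, SEM = SD * sqrt(1/n1 + 1/n2).
sem = sd * sqrt(1 / n1 + 1 / n2)

# Steps 4 and 5: observed difference and t value.
difference = mean_1 - mean_2
t_value = difference / sem

# Step 6: combined degrees of freedom.
df = (n1 - 1) + (n2 - 1)

# Step 7: compare with the tabulated t value for df = 8 at P = 0.05.
CRITICAL_T = 2.306
significant = abs(t_value) > CRITICAL_T

print(f"t = {t_value:.3f}, df = {df}, significant at P < 0.05: {significant}")
```

In this made-up example the t value is smaller in magnitude than the tabulated value, so P > 0.05 and the difference would be judged not significant.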

Paired t test
It is applied to paired data of observations from one sample only.

Used in samples of less than 30.

Each individual gives a pair of observations, e.g. observations before and
after taking a drug.

The steps involved are :

- Calculate the difference in each pair of observations, i.e. before and after :
  X1 - X2 = y
- Calculate the mean of these differences = ȳ
- Calculate the SD of the differences
- Calculate SE = SD / √n
- Determine t = ȳ / SE
- Determine the degrees of freedom.
  Since there is one sample, df = n - 1
- Refer to the table and find the probability of the t value corresponding to
  the degrees of freedom

P < 0.05 states that the difference is significant

P > 0.05 states that the difference is not significant
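
A minimal Python sketch of the paired t test steps follows. The before and after readings are hypothetical, and the critical value 2.776 is the tabulated two-tailed t value for 4 degrees of freedom at P = 0.05.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired readings from the same 5 individuals (illustrative only).
before = [90, 88, 92, 86, 94]
after = [85, 86, 87, 84, 88]

# Step 1: difference in each pair, y = X1 - X2.
y = [b - a for b, a in zip(before, after)]

# Step 2: mean of the differences.
y_bar = mean(y)

# Step 3: SD of the differences (sample SD, divisor n - 1).
sd = stdev(y)

# Step 4: SE = SD / sqrt(n).
n = len(y)
se = sd / sqrt(n)

# Step 5: t = mean difference / SE.
t_value = y_bar / se

# Step 6: one sample, so df = n - 1.
df = n - 1

# Step 7: compare with the tabulated t value for df = 4 at P = 0.05.
CRITICAL_T = 2.776
significant = abs(t_value) > CRITICAL_T

print(f"t = {t_value:.3f}, df = {df}, significant at P < 0.05: {significant}")
```

Here the t value exceeds the tabulated value, so P < 0.05 and the made-up difference would be judged significant.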

Non Parametric tests

In many biological investigations the research worker may not know the nature
of the distribution or other required values of the population.

Also, some biological measurements may not be true numerical values, hence
arithmetic procedures are not possible in such cases.

In such cases distribution-free or non parametric tests are used, in which no
assumptions are made about the population parameters, e.g.
- Mann Whitney test
- Chi square test
- Phi coefficient test
- Fisher's Exact test
- Sign Test
- Friedman's Test

Chi square test

Chi square test, unlike the z and t tests, is a non parametric test.

The test involves calculation of a quantity called chi square.

Chi square is denoted by χ²

It was developed by Karl Pearson.

The most important applications of the chi square test in medical statistics
are
- Test of proportion
- Test of association
- Test of goodness of fit

Test of proportion
- Used as an alternate test to find the significance of the difference
  in 2 or more than 2 proportions

Test of association
- To measure the probability of association between 2 discrete
  attributes, e.g. smoking and cancer

Test of goodness of fit
- Tests whether the observed values of a character differ from
  the expected values by chance or due to the play of some
  external factor
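
As a hedged illustration of a test of association, the sketch below computes chi square for a 2 x 2 table. The counts are hypothetical, and the slides do not spell out the formula; the standard Pearson formula, chi square = sum of (O - E)² / E with E = row total × column total / grand total, is assumed. The value 3.84 is the tabulated chi square for 1 degree of freedom at P = 0.05.

```python
# Hypothetical 2 x 2 table (illustrative counts only):
# rows    = smokers, non-smokers
# columns = cancer, no cancer
observed = [
    [30, 20],
    [10, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (O - E)^2 / E over every cell of the table.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected number
        chi_square += (o - e) ** 2 / e

# Degrees of freedom = (rows - 1) x (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Tabulated chi square for df = 1 at P = 0.05.
CRITICAL_CHI_SQUARE = 3.84
significant = chi_square > CRITICAL_CHI_SQUARE

print(f"chi square = {chi_square:.2f}, df = {df}, significant: {significant}")
```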

Stages in performing Tests of Significance

- State the null hypothesis
- State the alternative hypothesis
- Accept or reject the null hypothesis
- Finally determine the p value

State the null hypothesis :

The null hypothesis is a hypothesis of no difference between the statistic of a
sample and the parameter of the population, or between the statistics of two
samples.

It nullifies the claim that the experimental result is different from or better
than the one observed already.

State the alternative hypothesis :

It states that the sample result is different, i.e. larger or smaller than the value
of the population, or that the statistic of one sample is different from the other.

Accept or reject the null hypothesis :

The null hypothesis is accepted or rejected depending on whether the result falls
in the zone of acceptance or the zone of rejection.

If the result of a sample falls within the area of mean ± 2 SE, the null hypothesis is
accepted.

This area of the normal curve is called the zone of acceptance for the null hypothesis.

If the result of a sample falls beyond the area of mean ± 2 SE, the
null hypothesis of no difference is rejected and the alternative hypothesis accepted.

This area of the normal curve is called the zone of rejection for the null hypothesis.

Finally determining the P value :

The P value is determined using any of the previously mentioned methods.

If p > 0.05, the difference is due to chance and is not statistically significant, but
if p < 0.05 the difference is due to some external factor and is statistically
significant.
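
The mean ± 2 SE rule for accepting or rejecting the null hypothesis can be written as a small Python check. The function name and the numbers below are hypothetical and serve only to illustrate the rule.

```python
def in_zone_of_acceptance(sample_result, mean, se):
    """Return True if sample_result lies within mean +/- 2 SE."""
    lower = mean - 2 * se
    upper = mean + 2 * se
    return lower <= sample_result <= upper


# Hypothetical values: mean 80, standard error 1.5,
# so the zone of acceptance runs from 77 to 83.
population_mean = 80.0
standard_error = 1.5

for result in (82.0, 84.5):
    if in_zone_of_acceptance(result, population_mean, standard_error):
        print(f"{result}: null hypothesis accepted")
    else:
        print(f"{result}: null hypothesis rejected")
```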

Probability or p value
The concept of probability is very important in statistics.

Probability is the chance of occurrence of any event or permutation
combination.

It is denoted by p for a sample and P for a population.

In various tests of significance we are often interested to know
whether the observed difference between 2 samples is due to chance
(sampling variation) or due to some external factor.

Here, the probability or p value is used to judge the difference.

P ranges from 0 to 1

0 = there is no chance that the observed difference could be due to
sampling variation

1 = it is absolutely certain that the observed difference between 2 samples
is due to sampling variation

However, such extreme values are rare.

P = 0.4, i.e. the chances that the difference is due to sampling variation are 4
in 10

Obviously, the chances that it is not due to sampling variation will be 6
in 10.

The essence of any test of significance is to find out the p value and draw
an inference.

If p value is 0.05 or more

- It is customary to accept that the difference is due to chance
  (sampling variation).
- The observed difference is said to be statistically not
  significant.

If p value is less than 0.05

- The observed difference is not due to chance but due to the role of some
  external factors.
- The observed difference here is said to be statistically
  significant.

Sampling
When a large number of individuals is to be studied, it is
impossible to include each and every member, as it would be time
consuming, costly and laborious. So, sampling is done.

Sampling is a process by which some units of a population are selected
for the study and, by subjecting them to statistical computation,
conclusions are drawn about the population from which these units
are drawn.

The sample taken should be representative of the entire population.

It should be sufficiently large.

It should be unbiased.

Such a sample will have its statistics almost equal to the parameters of the
entire population.

Two main characteristics of a representative sample are :

Precision

Unbiased character

Precision
Precision depends on the sample size.

Ordinarily, sample size should not be less than 30.

Precision = √n / s

n = sample size, s = standard deviation

Precision is directly proportional to the square root of the sample size. The greater
the sample size, the greater the precision.

Thus, to obtain precision, the sample size needs to be increased.
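
The square-root relationship can be made concrete with a short Python sketch. The standard deviation of 10 and the sample sizes are hypothetical values chosen for illustration.

```python
from math import sqrt


def precision(n, s):
    """Precision = sqrt(n) / s, with n = sample size and s = standard deviation."""
    return sqrt(n) / s


s = 10.0  # hypothetical standard deviation

for n in (25, 100, 400):
    print(f"n = {n:3d}, precision = {precision(n, s):.2f}")
```

Because precision grows with the square root of n, the sample size has to be quadrupled to double the precision.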

Unbiased character
The sample should be unbiased, i.e. every individual should have an
equal chance of being selected in the sample.

Thus a standard random sampling method should be used.
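
One simple way to give every individual an equal chance of selection is simple random sampling without replacement. The sketch below is an illustration only: the population of 200 numbered students, the sample size of 30 and the fixed seed are all assumptions.

```python
import random

# Hypothetical population: 200 students identified by roll numbers 1 to 200.
population = list(range(1, 201))

SAMPLE_SIZE = 30  # assumed sample size

# A fixed seed is used only so that the example is reproducible.
rng = random.Random(42)

# Simple random sampling without replacement:
# every individual has an equal chance of being selected.
sample = rng.sample(population, SAMPLE_SIZE)

print(sorted(sample))
```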

Non sampling errors can be taken care of by
- Using standardized instruments and criteria.
- Single, double or triple blind trials
- Use of a control group

Limitations
Statistics has several limitations :

- It gives statistical and not substantive answers.
- The statistical conclusion refers to groups and not individuals.
- It only summarizes but does not interpret data.
- Statistics can be misused by selective presentation of desired results.
- Computation is not an end in itself. It is a tool that can be used well or
  can be misused.
- A human must have a clear idea of what is required of the computer
  and must instruct it accordingly.
- The human must also be able to intelligently interpret the output from
  the computer.
- All who tinker with computers must remember the adage 'rubbish
  in / rubbish out'.

Conclusion
Health information systems are the best means of getting reliable,
relevant, up to date, adequate and reasonably complete information
for health managers at all levels.
Although they are a very helpful source for the collection of data, it has
been very difficult to get information where it matters most, i.e. at the
community level.
So, action should be taken in this direction and these systems should be
used more frequently for better and clearer results, mainly in
research involving large populations.

References
K. Park. Preventive and social medicine, 20th edition : McGraw-Hill Medical; 2009. 743 - 756
Soben Peter. Essentials of preventive and community dentistry, 2nd edition. New Delhi : Arya; 2006. 21 - 50
B. K. Mahajan. Methods in Biostatistics for medical students and research workers, 6th edition. New Delhi : Jaypee brothers; 2006. 1 - 39
