Bio Statistics Basics
Presented by :
Kush Pathak
Contents
Introduction
History
Applications and uses of biostatistics in science
Common statistical terms
Common symbols used
Data -
b) Presentation
c) Analysis
d) Interpretation
Limitations
Conclusion
References
Introduction
"There are three kinds of lies: lies, damn lies, and statistics."
Benjamin Disraeli / Mark Twain
HISTORY
Father of Health Statistics
1620 - 1674
THE HISTORY OF STATISTICS HAS ITS ROOTS IN BIOLOGY
Sir Francis Galton
Pioneer of fingerprint classification
Study of heredity of quantitative traits
Regression & correlation
Karl Pearson
Polymath
- Studied genetics
- Correlation coefficient
- χ² test
- Standard deviation
Natural Selection
Founder of population genetics.
Analysis of variance
Likelihood.
P-value
IN PHARMACOLOGY
IN MEDICINE
FOR STUDENTS :
VARIABLES :
CONSTANT :
OBSERVATION :
OBSERVATIONAL UNIT :
DATA :
POPULATION :
SAMPLE :
PARAMETER
STATISTIC
A statistic is a constant that describes the sample, e.g. out of 200 students of the same college, 45% are girls. This 45% is a statistic, as it describes the sample.
ATTRIBUTE
=           Equal to
>           Greater than
<           Lesser than
Z           No. of standard deviations
%           Percentage
r           Pearson's correlation coefficient
ρ           Spearman's rank correlation coefficient
d.f. or f   Degree of freedom
k           Number of groups or classes
p           Probability
O           Observed number
E           Expected number
DATA
A set of values recorded on one or more observational units is called data.
It is of two types :
The first regular census in India was taken in 1881, and others took place at 10 year intervals.
The population census provides basic data (by age and sex) needed to compute vital statistical rates, and other health, demographic and socio-economic indicators.
In 1873, the Govt. of India passed the Births, Deaths and Marriages Registration Act. But the registration system in India still tended to be very unreliable, the data being grossly deficient in regard to accuracy, timeliness, completeness and coverage.
The Central Births and Deaths Registration Act, 1969 :
The time limit for registering births is 14 days and that for deaths is 7 days. In case of any default, a fine of Rs. 50 was imposed.
Lay Reporting :
1. Record Linkage :
Medical record linkage implies the assembly and maintenance, for each individual in a population, of a file of the more important records relating to his health.
These statistics now provide data on various aspects of air, water and noise pollution; harmful food additives; industrial intoxicants etc.
4. Population surveys :
Health planners also need non-quantifiable information, e.g. health policies, health legislation, public attitudes, programme costs, procedures and technologies.
Types of Data
Qualitative or discrete data :
The number of persons having the same attribute is variable and is measured.
E.g. out of 100 people, 75 have diabetes, 15 have T.B. and 10 have anemia.
Here diabetes, T.B. and anemia are attributes which cannot be measured in figures. Only the number of people having them can be determined.
Quantitative or continuous data :
E.g. the height of one person is 150 cm and that of another is 160 cm, and both are of the same age and sex.
Persons with a height of 150 cm or in the range of 150 - 152 cm may be 10, and those with 160 cm or in the range of 160 - 162 cm may be 20.
Thus we find out both the characteristic and the frequency. Both vary from person to person as well as from group to group.
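The counting of persons per height range described above can be sketched in Python. This is only an illustration: the list of heights and the helper name `frequency_by_range` are made up here and are not data from the slides.

```python
def frequency_by_range(values, ranges):
    """Count how many values fall inside each inclusive (low, high) range."""
    return {
        (low, high): sum(low <= value <= high for value in values)
        for low, high in ranges
    }

# Hypothetical heights in cm (illustrative values only).
heights_cm = [150, 151, 152, 150, 160, 161, 162, 161]
ranges = [(150, 152), (160, 162)]

frequencies = frequency_by_range(heights_cm, ranges)
for (low, high), count in frequencies.items():
    print(f"{low} - {high} cm: {count} persons")
```

The characteristic (height) is measured for each person, while the frequency is the count of persons in each range.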
Presentation
- Tabulation
- Drawings
Tabulation :
It is the most common method.
Simple tables :
Month and Year     Number of biopsies performed in Oral Pathology department
January 2010       15
June 2010          21
December 2010      26
Frequency Distribution tables :
No. of biopsies sent from different departments to Oral Pathology department.
Year and month   Oral surgery   Oral Medicine   Cons and Endo   Pediatric Dept.   Perio.   Private Clinics
January 2010     6              2               3               1                 1        2
June 2010        11             NIL             2               2                 2        4
Dec 2010         19             NIL             1               2                 1        3
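As a rough check on the two tables, the department-wise counts for each month should add up to the monthly totals in the simple table (15, 21 and 26). A minimal Python sketch, assuming NIL is read as 0; the variable names are illustrative only:

```python
departments = [
    "Oral surgery", "Oral Medicine", "Cons and Endo",
    "Pediatric Dept.", "Perio.", "Private Clinics",
]

# Biopsies per department for each month (NIL entered as 0).
biopsies = {
    "January 2010": [6, 2, 3, 1, 1, 2],
    "June 2010": [11, 0, 2, 2, 2, 4],
    "December 2010": [19, 0, 1, 2, 1, 3],
}

# Row totals: all biopsies received in a month.
row_totals = {month: sum(counts) for month, counts in biopsies.items()}

# Column totals: all biopsies sent by one department over the three months.
column_totals = {
    dept: sum(counts[i] for counts in biopsies.values())
    for i, dept in enumerate(departments)
}

print(row_totals)
print(column_totals)
```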
Charts and Drawings :
Frequency Polygon
Frequency curve :
When the number of observations is very large and the class interval is reduced, the frequency polygon loses its angulations, becoming a smooth curve known as a frequency curve.
Line Chart
Line diagrams are used to show the trends of events with the passage of time.
Bar Chart
The length of the bars, drawn vertical or horizontal, is proportional to the frequency of the variable.
For a pie (sector) chart, the angle of each sector is calculated by:
Angle = (class frequency × 360) / total observations
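The angle formula can be tried on the qualitative data example given earlier (75 diabetes, 15 T.B. and 10 anemia out of 100 people). A small Python sketch; the function name `sector_angles` is an assumption made for illustration:

```python
def sector_angles(frequencies):
    """Angle in degrees = class frequency x 360 / total observations."""
    total = sum(frequencies.values())
    return {label: freq * 360 / total for label, freq in frequencies.items()}

cases = {"Diabetes": 75, "T.B.": 15, "Anemia": 10}
angles = sector_angles(cases)

for label, angle in angles.items():
    print(f"{label}: {angle} degrees")
```

The angles of all sectors always add up to 360 degrees, which is a quick way to check the calculation.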
Pictogram
Popular method of presenting data to the common man.
Analysis
The average value in a distribution is the one central value around which all the other observations are concentrated.
Mean
Refers to the arithmetic mean.
x̄ = (x1 + x2 + x3 + ... + xn) / n
E.g. the diastolic blood pressure of 10 individuals was 83, 75, 81, 79, 71, 95, 75, 77, 84, 90. The total was 810, which was then divided by 10, resulting in 81.0.
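The same calculation can be written as a short Python sketch, using the ten diastolic blood pressure readings from the example (the variable names are illustrative):

```python
# Diastolic blood pressure of the 10 individuals in the example.
readings = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]

total = sum(readings)          # sum of all observations
mean = total / len(readings)   # divided by the number of observations

print(total, mean)
```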
Median
When all the observations are arranged in either ascending or descending order, the middle observation is known as the median.
In case of an even number of observations, the average of the two middle values is taken.
Diastolic Blood Pressure (unarranged): 83, 75, 81, 79, 71, 95, 75, 77, 84
Diastolic Blood Pressure (arranged): 71, 75, 75, 77, 79 (median), 81, 83, 84, 95
In case there are 10 values instead of 9:
Diastolic Blood Pressure (unarranged): 83, 75, 81, 79, 71, 95, 75, 77, 84, 90
Diastolic Blood Pressure (arranged): 71, 75, 75, 77, 79, 81, 83, 84, 90, 95
Median = (79 + 81) / 2 = 80
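Both cases (odd and even number of observations) can be sketched with Python's standard `statistics` module, using the readings from the example above:

```python
import statistics

# 9 readings: odd count, so the middle value of the sorted list is the median.
nine_readings = [83, 75, 81, 79, 71, 95, 75, 77, 84]
# 10 readings: even count, so the two middle values are averaged.
ten_readings = nine_readings + [90]

print(sorted(nine_readings))
median_odd = statistics.median(nine_readings)

print(sorted(ten_readings))
median_even = statistics.median(ten_readings)

print(median_odd, median_even)
```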
Mode
The most frequently occurring observation, or most 'fashionable' value in a series of observations, is called the mode.
E.g. the diastolic blood pressure of 20 individuals is 85, 75, 81, 79, 71, 95, 75, 77, 75, 90, 71, 75, 79, 95, 75, 77, 84, 75, 81, 75.
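For the 20 readings above, the mode can be found by counting how often each value occurs. A minimal Python sketch using `collections.Counter` from the standard library:

```python
from collections import Counter

readings = [85, 75, 81, 79, 71, 95, 75, 77, 75, 90,
            71, 75, 79, 95, 75, 77, 84, 75, 81, 75]

counts = Counter(readings)
# most_common(1) returns the value with the highest frequency and its count.
mode, frequency = counts.most_common(1)[0]

print(f"Mode = {mode} (occurs {frequency} times)")
```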
Advantages :
It is easy to understand.
Not affected by extreme items.
Disadvantages :
Exact location is often uncertain and not clearly defined.
Interpretation
Test of Significance :
But differences in the results between two research workers for the same investigation may be observed.
So, it becomes important to find out the significance of this observed variation.
- Parametric tests
- Non parametric tests
Parametric Tests
Parametric tests are those tests in which certain assumptions are made about the population.
Since these tests make assumptions about the population parameters, they are called parametric tests.
They are:
- Student t test (paired or unpaired)
- ANOVA
ANOVA
Analysis of variance
Investigations may not always be confined to the comparison of 2 samples only,
e.g. we might like to compare the difference in vertical dimension obtained using 2 or more methods like phonetics, swallowing.
In such cases, where more than 2 samples are used, ANOVA can be used.
Also, when measurements are influenced by several factors playing their role, e.g. factors affecting retention of a denture, ANOVA can be used.
ANOVA helps to decide which factors are more important.
Requirements
- Where only one factor will affect the result between 2 groups
Student t test
It was given by W.S. Gosset, whose pen name was "Student".
Paired t test
It is applied to paired data of observations from one sample only.
Calculate SD
Calculate SE = SD / √n
Determine t = ȳ / SE (ȳ = mean of the paired differences)
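The steps of the paired t test can be sketched in Python. The before/after readings below are hypothetical values chosen only for illustration, and the SD is taken as the sample standard deviation (n - 1 in the denominator), which is an assumption about the intended formula:

```python
import math
import statistics

# Hypothetical paired observations on the same 5 individuals.
before = [90, 85, 88, 92, 95]
after = [84, 81, 85, 86, 90]

# Work on the difference within each pair.
differences = [b - a for b, a in zip(before, after)]
n = len(differences)

mean_difference = statistics.mean(differences)
sd = statistics.stdev(differences)   # SD of the differences (n - 1 denominator)
se = sd / math.sqrt(n)               # SE = SD / sqrt(n)
t = mean_difference / se             # t = mean difference / SE
degrees_of_freedom = n - 1

print(f"t = {t:.3f} with {degrees_of_freedom} degrees of freedom")
```

The calculated t is then compared with the tabulated t value for n - 1 degrees of freedom to judge significance.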
Also, some biological measurements may not be true numerical values, hence arithmetic procedures are not possible in such cases.
In such cases distribution free or non parametric tests are used, in which no assumptions are made about the population parameters, e.g.
- Mann Whitney test
- Chi square test
- Phi coefficient test
- Fisher's Exact test
- Sign Test
- Friedman's Test
Test of proportion
- Used as an alternate test to find the significance of difference in 2 or more than 2 proportions
Test of association
- To measure the probability of association between 2 discrete attributes, e.g. smoking and cancer
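A test of association between two attributes is commonly carried out with the chi square statistic, the sum of (O - E)² / E over all cells, where O is the observed number and E the expected number. The sketch below uses a hypothetical 2 x 2 table of smoking and cancer; the counts and the function name `chi_square` are made up for illustration:

```python
def chi_square(observed):
    """Chi square statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected number if the two attributes were not associated.
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    return statistic

# Hypothetical counts: rows = smokers / non-smokers,
# columns = cancer / no cancer.
table = [[30, 70],
         [10, 90]]

chi2 = chi_square(table)
print(chi2)
```

For a 2 x 2 table there is 1 degree of freedom, and a value above 3.84 corresponds to p < 0.05.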
Null hypothesis: It nullifies the claim that the experimental result is different from or better than the one observed already.
Alternative hypothesis: It states that the sample result is different, i.e. larger or smaller than the value of the population, or that the statistic of one sample is different from the other.
If the result of a sample falls in the area of mean ± 2SE, the null hypothesis is accepted.
This area of the normal curve is called the zone of acceptance for the null hypothesis.
If the result falls beyond mean ± 2SE, the null hypothesis is rejected.
This area of the normal curve is called the zone of rejection for the null hypothesis.
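The mean ± 2SE rule can be sketched as a simple check in Python. The numbers and the function name `in_zone_of_acceptance` are hypothetical and only illustrate the rule:

```python
def in_zone_of_acceptance(sample_result, mean, se):
    """True if the sample result lies within mean - 2SE to mean + 2SE."""
    return abs(sample_result - mean) <= 2 * se

# Hypothetical values: mean 80 and SE 2 give a zone of acceptance of 76 to 84.
mean, se = 80, 2

accepted = in_zone_of_acceptance(83, mean, se)   # inside the zone
rejected = in_zone_of_acceptance(85, mean, se)   # outside the zone

print(accepted, rejected)
```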
If p > 0.05, the difference may be due to chance and is not statistically significant, but if p < 0.05 the difference is unlikely to be due to chance alone and is statistically significant.
Probability or p value
The concept of probability is very important in statistics.
P ranges from 0 to 1.
The essence of any test of significance is to find out the p value and draw an inference.
Sampling
When a large number of individuals is to be studied, it is impossible to include each and every member, as it will be time consuming, costly and laborious. So, sampling is done.
It is sufficiently large.
It is unbiased.
Precision
Unbiased character
Precision
Precision depends on the sample size.
Precision = √n / s
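The formula shows that precision rises with the square root of the sample size. A small Python sketch, assuming s stands for the standard deviation and using made-up values:

```python
import math

def precision(n, s):
    """Precision = sqrt(n) / s, with n the sample size and s the SD (assumed)."""
    return math.sqrt(n) / s

# Hypothetical standard deviation of 5 with two sample sizes.
small_sample = precision(25, 5)
large_sample = precision(100, 5)

print(small_sample, large_sample)
```

With the same standard deviation, a sample four times as large is only twice as precise.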
Unbiased character
The sample should be unbiased, i.e. every individual should have an equal chance to be selected in the sample.
Limitations
Statistics has several limitations :
Statistics can be misused by selective presentation of desired results.
The human must also be able to intelligently interpret the output from the computer.
All who tinker with computers must remember the adage 'rubbish in / rubbish out'.
Conclusion
Health information systems are the best means of getting reliable, relevant, up to date, adequate and reasonably complete information for health managers at all levels.
Although they are a very helpful source for the collection of data, it has been very difficult to get information where it matters most, i.e. at the community level.
So, action should be taken in this direction and this system should be used more frequently for better and clearer results, mainly in research involving large populations.
References
K. Park. Preventive and social medicine, 20th edition : Mc Graw - Hill Medical ; 2009. 743 - 756
Soben Peter. Essentials of preventive and community dentistry, 2nd edition. New Delhi : Arya ; 2006. 21 - 50
B.K. Mahajan. Methods in biostatistics for medical students and research workers, 6th edition. New Delhi : Jaypee brothers ; 2006. 1 - 39