Intro To Bio Statistics PDF
Introduction to Bio-Statistics
Dr. Biju George
Associate Professor
Govt Medical College, Kozhikode
bijugeorge1@gmail.com

Bio-Statistics – What is it
• Biostatistics is the branch of science that deals with the application of statistics to biological science, for interpretation of the data obtained from biological systems.
• Statistics is the branch of science that deals with methods and tools for collection, compilation, summarization, presentation, comparison and interpretation of data, and for making inferences about the data and the population.
• Thus statistics tries to make meaning out of the numbers that we collect as data.
Data, Information & Intelligence
• Data are the values of the observations collected in any study. Data is unprocessed information.
11/7/18
Data, Information & Intelligence – FBS values
Data: 121, 154, 108, 212, 163, 198, 200, 115, 167, 178
Information: No. (%) of observations above 150 = 7 / 10 = 70%; Mean = 161.6, Median = 165
Intelligence: 70% of the diabetics have an FBS above 150, which indicates poor control in the study group. This could be due either to poor management of the cases or to some specific characteristics of the study group. Since the Mean and Median are above 160, these subjects are poorly controlled and the reason could be……
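The step from Data to Information on the FBS slide can be reproduced with a few lines of Python. This is only a minimal sketch using the standard `statistics` module; the variable names are illustrative and not part of the slides.

```python
from statistics import mean, median

# FBS values from the slide (the raw data)
fbs = [121, 154, 108, 212, 163, 198, 200, 115, 167, 178]

# Information: summarise the raw data
above_150 = sum(1 for v in fbs if v > 150)    # observations above 150
pct_above_150 = 100 * above_150 / len(fbs)    # as a percentage
fbs_mean = mean(fbs)
fbs_median = median(fbs)

print(f"Above 150: {above_150}/{len(fbs)} = {pct_above_150:.0f}%")
print(f"Mean = {fbs_mean:.1f}, Median = {fbs_median:.0f}")
```

The Intelligence step (judging that control is poor, and why) is an interpretation by the investigator and is not something the code derives.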
• If there is no possibility of variability, then it is a constant
• Qualitative – data obtained can be arranged in categories only
• Eg- Gender, Disease Present / Absent, Cure Yes / No
• Nominal scale
• Weight, Height
• Temp in K
• Temp in °C, Temp in °F
• No Upgradation possible
• Except when we combine variables together
• Baseline variables
• Eg- Variables collected at the baseline measurement
Pie diagram – BMI categories
Slice percentages shown: 13.2%, 16.5%, 28.6%, 41.7%
Legend categories: <18.5, 18.5 to 24.99, 25 to 29.99, >=30

Pie diagram – 100 adult people studied for morbidity
Disease            Frequency   Percentage
Hypertension       30          30%
Diabetes           20          20%
Obesity            30          30%
Lipid disorders    20          20%
Diagram – Morbidities: Hypertension, Diabetes, Obesity, Lipid disorder, Renal disease, CAD
Diagram – Male / Female by BMI category: Obese, Overweight, Normal
HbA1c values-100
12.8  9.0  8.1  7.5  7.0  6.8  6.3  6.0  5.9  5.5
12.3  8.9  8.0  7.5  7.0  6.7  6.3  6.0  5.9  5.5
11.1  8.9  8.0  7.4  7.0  6.7  6.2  6.0  5.9  5.4
11.0  8.8  8.0  7.4  7.0  6.7  6.1  6.0  5.9  5.3
10.9  8.8  8.0  7.3  7.0  6.7  6.1  6.0  5.9  5.1
10.1  8.5  7.9  7.2  7.0  6.6  6.1  6.0  5.8  5.0
10.0  8.4  7.8  7.2  6.9  6.6  6.1  6.0  5.8  5.0
 9.9  8.3  7.8  7.1  6.9  6.5  6.1  6.0  5.8  5.0
 9.8  8.3  7.8  7.1  6.9  6.5  6.1  6.0  5.7  5.0
 9.3  8.2  7.7  7.1  6.8  6.5  6.1  6.0  5.7  5.0

Stem and Leaf plot. HbA1c values-100
Stem | Leaves | Count
 5 | 0,0,0,0,0,1,3,4,5,5,7,7,8,8,8,9,9,9,9,9 | 20
 6 | 0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,2,3,3,5,5,5,6,6,7,7,7,7,8,8,9,9,9 | 34
 7 | 0,0,0,0,0,0,1,1,1,2,2,3,4,4,5,5,7,8,8,8,9 | 21
 8 | 0,0,0,0,1,2,3,3,4,5,8,8,9,9 | 14
 9 | 0,3,8,9 | 4
10 | 0,1,9 | 3
11 | 0,1 | 2
12 | 3,8 | 2
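A stem and leaf plot of this kind can be built by splitting each value into its integer part (stem) and first decimal (leaf). The sketch below is only illustrative: the function name `stem_and_leaf` is hypothetical, and it is run on the eleven HbA1c values of 9.0 and above rather than on all 100 values.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group one-decimal values by integer part (stem) and first decimal (leaf)."""
    plot = defaultdict(list)
    for v in sorted(values):
        tenths = round(v * 10)                  # integer tenths avoid float noise
        plot[tenths // 10].append(tenths % 10)  # stem -> list of leaves
    return dict(plot)

# HbA1c values of 9.0 and above, taken from the data table
hba1c_high = [12.8, 12.3, 11.1, 11.0, 10.9, 10.1, 10.0, 9.9, 9.8, 9.3, 9.0]

plot = stem_and_leaf(hba1c_high)
for stem in sorted(plot):
    leaves = ",".join(str(leaf) for leaf in plot[stem])
    print(f"{stem:>2} | {leaves} | {len(plot[stem])}")
```

The four printed rows correspond to stems 9 to 12 of the plot.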
Diagram – Mean FBS as mg % by Years (1960, 1970, 1980, 1990, 2000, 2010)
Diagram – repeated measurements at Baseline, 0.5 hrs, 1 hrs, 1.5 hrs, 2.0 hrs, 2.5 hrs and 1.0 hrs post Sx
Summary measures
• Every variable has to be summarized for the given data set to be interpretable
• Data -> Information -> Intelligence
Mode
• Most repeated value
• Not commonly used in Health field
• Used mainly for Nominal variables and low ordinal variables
• Unimodal / Bimodal / Multimodal data

Median
• Central data when it is arranged in ascending order
• Middle data
• Positional Average
• Divides data into two equal halves
• Robust measure
Median
• Position of Median = $\left(\frac{n+1}{2}\right)^{\text{th}}$ position; n = no of observations
• 92 96 107 112 129 140 187 223 241 248 272
• For odd numbered data – central observation
• 11 observations -> so Median = 6th observation = 140
• 92 96 107 112 124 129 140 187 223 241 248 272
• For even numbered data – mean of 2 central observations
• Position of Median = 6.5th observation
• Average of 6th and 7th Observation = 134.5
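The positional rule for the median can be sketched in Python as follows. The function name `median_by_position` is hypothetical; it returns both the (n + 1) / 2 position and the median so that the odd and even cases can be compared.

```python
def median_by_position(values):
    """Return (position, median) using the (n + 1) / 2 positional rule."""
    data = sorted(values)              # the median needs ordered data
    n = len(data)
    position = (n + 1) / 2             # 1-based position of the median
    if n % 2 == 1:
        # odd n: the position is a whole number, take that observation
        return position, data[int(position) - 1]
    # even n: position ends in .5, average the two neighbouring observations
    lower = data[int(position) - 1]
    upper = data[int(position)]
    return position, (lower + upper) / 2

odd_set = [92, 96, 107, 112, 129, 140, 187, 223, 241, 248, 272]
even_set = [92, 96, 107, 112, 124, 129, 140, 187, 223, 241, 248, 272]

print(median_by_position(odd_set))    # 6th observation
print(median_by_position(even_set))   # average of 6th and 7th observations
```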
• Arithmetic Mean $\bar{x} = \frac{\sum x}{n}$
• Most commonly used measure
• Influenced by extreme values
• 0.5, 0.6, 0.7, 1.0, 1.6, 1.7, 1.7, 1.7, 1.9, 1.9, 2.0, 2.0, 2.0, 2.6, 2.6, 2.8, 2.8, 2.8, 3.1, 3.3, 3.9, 4.3, 5.2, 6.0, 7.1, 8.9, 9.3, 9.9, 10.4, 11.8, 13.2, 15.6, 17.7, 21.3, 23.8, 28.4, 31.0, 35.8
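How extreme values pull the mean can be checked on the listed values. This sketch assumes the 38 values on the slide are meant as an example of data with a long upper tail; it uses the standard `statistics` module.

```python
from statistics import mean, median

values = [
    0.5, 0.6, 0.7, 1.0, 1.6, 1.7, 1.7, 1.7, 1.9, 1.9, 2.0, 2.0, 2.0, 2.6,
    2.6, 2.8, 2.8, 2.8, 3.1, 3.3, 3.9, 4.3, 5.2, 6.0, 7.1, 8.9, 9.3, 9.9,
    10.4, 11.8, 13.2, 15.6, 17.7, 21.3, 23.8, 28.4, 31.0, 35.8,
]

arithmetic_mean = mean(values)    # sum of x divided by n
middle_value = median(values)     # average of the 19th and 20th values

print(f"n = {len(values)}")
print(f"Mean   = {arithmetic_mean:.2f}")
print(f"Median = {middle_value:.2f}")
```

The few large values raise the mean well above the median, which is why the median is described as the robust measure.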
2 sets of data
Set A                        Set B
Weight of 5 persons in kg    Weight of 5 persons in kg
53                           51
54                           53
55                           55
56                           57
57                           59
Mean = 55                    Mean = 55
Measures of Variability (Spread / Dispersion)
• Range
• Mean Deviation
• Variance
• Standard deviation
• Inter quartile range
• Coefficient of variation

Range
• Simplest to calculate
• $\text{Range} = \text{Highest value} - \text{Lowest value}$
• Influenced by extreme values
Mean Deviation
• Average Absolute Deviation (AAD)
• $\text{Mean Deviation} = \frac{\sum |x - \bar{x}|}{n}$

Set A                          Set B
Weight in kg   |x − x̄|        Weight in kg   |x − x̄|
53             2               51             4
54             1               53             2
55             0               55             0
56             1               57             2
57             2               59             4
Mean = 55                      Mean = 55

Set A: ∑|x − x̄| / n = 6 / 5 = 1.2 kg
Set B: ∑|x − x̄| / n = 12 / 5 = 2.4 kg
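The two sets have the same mean but different spread, which the range and the mean deviation both show. A minimal sketch, with hypothetical function names `value_range` and `mean_deviation`:

```python
def value_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)

def mean_deviation(values):
    """Average absolute deviation of the values from their mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)

set_a = [53, 54, 55, 56, 57]   # weights in kg
set_b = [51, 53, 55, 57, 59]   # weights in kg

for name, data in (("Set A", set_a), ("Set B", set_b)):
    print(f"{name}: range = {value_range(data)} kg, "
          f"mean deviation = {mean_deviation(data):.1f} kg")
```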
• Proportion = a/(a+b)
• $\text{Probability} = \frac{\text{No of desired outcomes}}{\text{No of total possible outcomes}}$
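Both formulas are simple ratios and can be sketched with exact fractions. The function names are hypothetical. The proportion example reuses the FBS counts (7 values above 150, 3 not above), while the die example is an assumed illustration that does not appear in the slides.

```python
from fractions import Fraction

def proportion(a, b):
    """Proportion = a / (a + b): the numerator is part of the denominator."""
    return Fraction(a, a + b)

def probability(desired, total_possible):
    """Probability = desired outcomes / total possible outcomes."""
    return Fraction(desired, total_possible)

# FBS example: 7 observations above 150 and 3 not above 150
p_above_150 = proportion(7, 3)

# Assumed example: an even number on one throw of a fair six-sided die
p_even = probability(3, 6)

print(p_above_150, float(p_above_150))
print(p_even, float(p_even))
```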
Normal distribution
• Gaussian Distribution
• Distribution of a continuous variable
• Many biological variables follow this
• Mean & SD are appropriate if the data is normally distributed
• How to check Normality
  • Mean ≃ Median
  • Graphs
  • Skewness coefficient
  • Statistical test

Normal distribution
• Bilaterally symmetrical and bell shaped distribution
• The highest point of the curve corresponds to the mode, as in any other distribution. In the normal distribution the mean and median coincide with the mode
• Area under the curve represents the probability of the data and the total area is 1
• The curve does not touch the baseline, as theoretically values up to infinity are possible
• Mean ± 1 SD, Mean ± 2 SD and Mean ± 3 SD will include 68.3%, 95.4% and 99.7% of the central observations respectively. This is called the 3σ rule or the empirical rule.
• Skewness and excess Kurtosis coefficients are zero.
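The central areas in the empirical rule can be checked numerically. For a normal distribution the area within k SD of the mean equals erf(k / √2); the sketch below uses only the standard `math` module, and the function name `central_area` is hypothetical.

```python
import math

def central_area(k):
    """Area under the normal curve between Mean - k SD and Mean + k SD."""
    return math.erf(k / math.sqrt(2))

# Percentage of observations within 1, 2 and 3 SD of the mean
coverage = {k: round(100 * central_area(k), 1) for k in (1, 2, 3)}

for k, pct in coverage.items():
    print(f"Mean ± {k} SD covers about {pct}% of observations")
```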
Skewness
• Asymmetry of the distribution
• No Skew
• Positive skew (right skew)
• Negative skew (left skew)
Diagram – income distribution (Low Income, Middle Income, High Income) with Mode, Median and Mean marked
Skewness Coefficients
Skewness Coefficient    Interpretation
<-1                     Severe negative skew
-1 to -0.5              Moderate negative skew
-0.5 to 0               Negligible negative skew
0                       Symmetrical
0 to +0.5               Negligible positive skew
+0.5 to +1              Moderate positive skew
>+1                     Severe positive skew
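The interpretation table can be applied to a computed coefficient as sketched below. The slides do not say which skewness coefficient is meant, so this sketch assumes the moment coefficient (third central moment divided by the 1.5th power of the second). The function names, the handling of values falling exactly on a boundary, and the right-skewed sample are also assumptions made for illustration.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5 (an assumed choice)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n   # third central moment
    return m3 / m2 ** 1.5

def interpret_skewness(g):
    """Map a coefficient to the wording of the interpretation table."""
    if g < -1:
        return "Severe negative skew"
    if g < -0.5:
        return "Moderate negative skew"
    if g < 0:
        return "Negligible negative skew"
    if g == 0:
        return "Symmetrical"
    if g <= 0.5:
        return "Negligible positive skew"
    if g <= 1:
        return "Moderate positive skew"
    return "Severe positive skew"

symmetric = [53, 54, 55, 56, 57]      # Set A weights from the earlier slide
right_skewed = [1, 1, 2, 2, 3, 10]    # hypothetical sample with one high value

for data in (symmetric, right_skewed):
    g = skewness(data)
    print(f"{data}: coefficient = {g:.2f} -> {interpret_skewness(g)}")
```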
Kurtosis
Thank You
bijugeorge1@gmail.com
9846100093
Wednesday, November 7, 2018, Dept of Community Medicine