
Biostatistics

Riffat Shaheen
In-charge Infection control Department
Dr. Ziauddin Hospital Karachi
Member of JICP
RN,RM & Post RN BScN
Objectives

 Overview of Biostatistical
Terms and Concepts
 Application of Statistical Tests
Research

• To search again (re-search)

• Diligent and systematic inquiry

• Discovery

• Seeking answers to questions in an orderly and systematic way


Types of Research

 Qualitative Research

 Quantitative Research
Population
• The population is the set of all elements (individuals, subjects or substances) of interest.
Example: all licensed nursing staff working in ZH (300).

Sample
 A sample is a subset of the population.
Example: 100 nursing staff selected out of the 300.
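
As a rough illustration of drawing a sample from a population, the sketch below picks 100 of 300 staff at random. The ID numbers 1 to 300 and the fixed seed are assumptions made only for this example; the slides do not say how the 100 staff were selected.

```python
import random

# Hypothetical staff ID numbers 1..300 standing in for the population
population = list(range(1, 301))

random.seed(1)  # fixed seed (an assumption) so the example is repeatable
sample = random.sample(population, 100)  # simple random sample, no repeats

print(len(population), len(sample))
```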

Symbols
Assign a numeric or alphabetical code to each category of a variable for the purpose of identification.
Gender
(1) (a) - Male
(2) (b) - Female

Education
(1) (a) -none
(2) (b) - Primary
(3) (c) - Intermediate
(4) (d) - Senior High
(5) (e) -Technical School
(6) (f) - University

Statement: Males are more likely to be smokers


(1) - Agree
(2) - Strongly agree
(3) - Disagree
(4) - Strongly Disagree

Physical Activity
(1) - Sitting
(2) - Moderate
(3) -Heavy
What is Statistics?
• Census
• Calculation of data
• Recording of data

• The Oxford American Dictionary defines statistics as "the science of collecting, classifying and interpreting information based on the numbers of things".
SPSS Software
 S= statistics
 P= package
 S= social
 S=sciences
Statistical Package for the Social Sciences

SPSS is a software package used for conducting statistical analysis, manipulating data, and generating tables and graphs that summarize data.
Biometry/Biostatistics
 Bio= Biology (sciences of Health)
 Metry/Statistics= Measurement
Kinds of statistics
Two kinds of statistics

1- Descriptive statistics
Concerned with summarising or describing a sample, e.g.

a) Central Tendency and Dispersion
• Central tendency: mean, median, mode
• Dispersion: standard deviation, variance, range, minimum, maximum

b) Frequencies
• Presented in the form of graphs, charts and tables.

c) Cross Tabulation
2- Inferential statistics
The branch of statistics concerned with generalising from a sample to the population, in order to make estimates and inferences.

a) T-Test,
 One Sample t Test

 Independent-sample t Test

 Paired-sample t Test

b) Chi Square test


c) One Way ANOVA Test
Starting SPSS
• Start → Programs → SPSS for Windows → SPSS 10.0 for Windows → OK

1-Variable View
2- Data View
Variable
• A characteristic that varies from one experimental unit to another is known as a variable.

1- Categorical Variables
Nominal Scale
Gender
1- Male
2- Female
Ordinal Scale
Statement: Males are more likely to be smokers
1- Agree
2- Strongly agree
3- Disagree
4- Strongly Disagree
2- Measurement Variable
Height
Weight
Blood Pressure

3- Continuous Variable
Person
Patient
Doctor
Age
Basic Statistics
Terms
$n$ = total number of samples (sample size)
$x$ = sample value
$\mu$ = population mean
$\bar{x}$ = sample average
$s^2$ = variance
$s$ (or $\sigma$) = standard deviation
CV = coefficient of variation
$s_{\bar{x}}$ or SE = standard error of the mean

FOR 220 Aerial Photo Interpretation and Forest Measurements


Basic Statistics
Sample mean: $\bar{x} = \dfrac{\sum x}{n}$
Statistical computations: Mean

Mean (the average)

$\bar{x} = \dfrac{102+138+190+122+128+112+128+116+134+104+128}{11} = \dfrac{1402}{11} \approx 127.45$



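• Median (the middle value of the sorted data)
Odd n: the median is the value at position $\dfrac{n+1}{2} = \dfrac{11+1}{2} = \dfrac{12}{2} = 6$
Even n: the median is the average of the two middle values, $\dfrac{x_{n/2} + x_{n/2+1}}{2}$
• Mode (the most frequently appearing value)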
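
The mean, median and mode of the 11 readings from the mean calculation can be checked with Python's standard `statistics` module. This is only a sketch for verifying the hand calculation; the variable names are chosen for the example.

```python
import statistics

# The 11 readings used in the mean calculation
values = [102, 138, 190, 122, 128, 112, 128, 116, 134, 104, 128]

mean = statistics.mean(values)      # 1402 / 11
median = statistics.median(values)  # 6th value of the sorted list (n is odd)
mode = statistics.mode(values)      # most frequent value

print(round(mean, 2), median, mode)
```

With n = 11 the median is the 6th value of the sorted list, which here is the same value as the mode.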
Measures of Dispersion
• RANGE
 highest to lowest values

• STANDARD DEVIATION
 how closely do values cluster around the

mean value
• SKEWNESS
 refers to symmetry of curve
Basic Statistics
Statistical computations (measures of dispersion)
Range
• easy to compute
• fails to take into account how the data are distributed

Range $= x_{\max} - x_{\min}$
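
A minimal sketch of the dispersion measures named in the terms list (range, variance, standard deviation, CV and SE). It assumes the 11 readings from the mean example are reused as the data and uses the sample formulas (dividing by n - 1).

```python
import math
import statistics

# The 11 readings from the mean example (reused here as an assumption)
values = [102, 138, 190, 122, 128, 112, 128, 116, 134, 104, 128]

value_range = max(values) - min(values)        # highest minus lowest value
variance = statistics.variance(values)         # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)             # sample standard deviation
cv = std_dev / statistics.mean(values) * 100   # coefficient of variation, in percent
se = std_dev / math.sqrt(len(values))          # standard error of the mean

print(value_range, round(variance, 2), round(std_dev, 2), round(cv, 2), round(se, 2))
```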

Skewness

[Figure: Curve A and Curve B, with the positions of the mode, median and mean marked; negative skew labelled]
Descriptive Statistics
A common first step in data analysis is to summarize information about the variables.

Central Tendency
Formula
• Analyze → Descriptive Statistics → Frequencies: click to open the 1st dialog box.
• Select one or more variables from the left side of the dialog box.
• Click the arrow button between the left and right boxes.
• Click the Statistics button at the bottom to open the 2nd dialog box.
• Select one or more descriptive options, such as mean, std. deviation, minimum, etc.
• Click Continue in the 2nd dialog box.
• Click OK in the 1st dialog box.
Descriptive Statistics
Frequencies
The Descriptives procedure is not helpful for interpreting categorical data.
The Frequencies option gives the number of cases in each category, for example the number of people within each educational level.

Formula
• Analyze → Descriptive Statistics → Frequencies: click to open the 1st dialog box.
• Select the variable from the left side.
• Click the arrow button.
• Click OK.
Descriptive Statistics
Charts or Graphs
Charts allow users to examine their data graphically in several different forms.
Formula
• Analyze → Descriptive Statistics → Frequencies: click to open the 1st dialog box.
• Select the variable from the left side.
• Click the arrow button.
• Click the Charts button at the bottom to open the 2nd dialog box.
• Select a chart type.
• Click Continue in the 2nd dialog box.
• Click OK.
Descriptive Statistics
Cross Tabulation
The Crosstabs procedure is useful for investigating the relationship between two variables, because it provides information about their intersection.
Formula
• Analyze → Descriptive Statistics → Crosstabs: click to open the 1st dialog box.
• Select the independent variable (Gender) for the row and the dependent variable (Smoking Status) for the column.
• Click the Cells button at the bottom to open the 2nd dialog box.
• Select the options to apply.
• Click Continue in the 2nd dialog box.
• Click OK in the 1st dialog box.
Experiment:
• An experiment is the process of collecting an observation or taking a measurement.

Events:
• The outcome of an experiment is called an event.

Sample Space:
• The collection of all events of an experiment is called the sample space.

Equally Likely:
• Events are equally likely if each event of an experiment has an equal chance of occurring.
e.g. Every individual has a blood group.
Blood Group   O     A     B     AB    Total
Frequency     226   206   50    20    502
Mutually Exclusive:
• Two events are said to be mutually exclusive if they cannot occur together.

e.g. An individual either has cancer or does not: Disease Positive OR Disease Negative.
Among Disease Positive cases:
Result      True +ve   False -ve   Total
Frequency   80         10          90
Inferential statistics
Probability: The science of uncertainty is called probability. Probability gives us the degree of confidence, i.e. how likely it is that our conclusion and inference are correct.

Basic Properties of Probability:

• The probability of an event is always between 0 and 1.
• The probabilities of all events in the sample space add up to 1: P(1) + P(2) + … = 1
Probability
Equally Likely:
Blood Group   O     A     B     AB    Total
Frequency     226   206   50    20    502
Each Event
P(O) = 226/502 = 0.45
P(A) = 206/502 = 0.41
P(B) = 50/502 = 0.0996
P(AB) = 20/502 = 0.0398
Sample Space
P(O) + P(A) + P(B) + P(AB) = 0.45 + 0.41 + 0.0996 + 0.0398 = 0.9994 ≈ 1 (the small difference is due to rounding)
• Mutually Exclusive
Among Disease Positive cases:
Result      True +ve   False -ve   Total
Frequency   80         10          90

Each Event
P(T+ve) = 80/90 = 0.889
P(F-ve) = 10/90 = 0.111

Sample Space
P(T+ve) + P(F-ve) = 0.889 + 0.111 = 1
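
The probability calculations for the blood group table can be reproduced with a few lines of Python. This is a sketch using the frequencies from the slide; the dictionary name is chosen for the example.

```python
# Frequencies from the blood group table
blood_groups = {"O": 226, "A": 206, "B": 50, "AB": 20}

total = sum(blood_groups.values())  # 502
probabilities = {group: count / total for group, count in blood_groups.items()}

for group, p in probabilities.items():
    print(group, round(p, 4))

# The probabilities of all events in the sample space add up to 1
print(round(sum(probabilities.values()), 10))
```

Working with unrounded values avoids the small rounding gap that appears when the rounded probabilities are added by hand.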
Meaning of the P Value
• P value: the probability of observing a result as extreme as, or more extreme than, the one actually observed by chance alone, assuming the null hypothesis is true.
P value ≤ 0.05: reject Ho
P value > 0.05: do not reject Ho

• P > 0.05: not significant
• P = 0.01 to 0.05: significant
• P = 0.001 to 0.01: very significant
• P < 0.001: extremely significant
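
The decision rule and the significance labels can be written as a small helper. The function name `interpret_p_value` is hypothetical, and the handling of values that fall exactly on a boundary (0.05, 0.01, 0.001) is an assumption, since the slide does not specify it.

```python
def interpret_p_value(p, alpha=0.05):
    """Return the decision and the significance label used on the slide."""
    decision = "reject Ho" if p <= alpha else "do not reject Ho"
    # Boundary handling below is an assumption; the slide gives ranges only
    if p > 0.05:
        label = "not significant"
    elif p >= 0.01:
        label = "significant"
    elif p >= 0.001:
        label = "very significant"
    else:
        label = "extremely significant"
    return decision, label

print(interpret_p_value(0.111))
print(interpret_p_value(0.0004))
```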
Hypothesis:
Inferential statistics is also known as hypothesis testing.
• Null hypothesis
Ho: there is no difference between the groups
• Alternative hypothesis
H1: there is a difference between the groups
P value ≤ 0.05: reject Ho
P value > 0.05: do not reject Ho

= (equal): null hypothesis
≠ (not equal): alternative hypothesis
• Null hypothesis
Ho: there is no difference between the groups.
(If the null hypothesis is not rejected, we usually write the conclusion that the data do not provide sufficient evidence to reject the null hypothesis.)

• Alternative hypothesis
H1 / Ha: there is a difference between the groups.
(If the null hypothesis is rejected, we usually write the conclusion that the data provide sufficient evidence to reject the null hypothesis.)
Steps in Statistical Testing
• Step 1: State the null and alternative hypotheses.
• Step 2: Decide on the significance level.
• Step 3: Determine the decision rule (test statistic).
• Step 4: Apply the decision rule to the sample data (calculation).
• Step 5: State the conclusion in words.
One Sample t Test
• The One Sample t Test is used to compare the mean of a single sample with a known population mean.

Q = Test the hypothesis that the mean systolic BP is equal to 140 (population mean).

Formula
Analyze → Compare Means → One-Sample T Test: open the dialog box → select the variable from the left side → click the arrow → fill in the Test Value → click OK
Result of One Sample t Test
1- Ho: µ = 140 vs H1: µ ≠ 140,
where µ = mean of systolic BP.
Level of significance 5% (α = 0.05)
2- One Sample t Test.

3- Calculation
• The mean of the 11 cases was 127.45 with standard deviation 23.80. The t value with df = 10 was -1.748 and the p-value was .111.

4- RESULT: The p-value was .111, which is more than 0.05. Therefore we cannot reject the null hypothesis.

5- CONCLUSION: Since we have not rejected the null hypothesis, we conclude that the data do not provide enough evidence that the mean systolic BP is different from 140.
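
The t value in this result can be reproduced by hand. The sketch below assumes that the 11 readings from the earlier mean calculation are the systolic BP data behind the output (their mean and standard deviation match the reported 127.45 and 23.80). The critical value 2.228 is taken from a standard t table for df = 10, two-tailed, α = 0.05.

```python
import math
import statistics

# Assumed to be the 11 systolic BP readings behind the reported result
values = [102, 138, 190, 122, 128, 112, 128, 116, 134, 104, 128]
test_value = 140  # hypothesised population mean

n = len(values)
mean = statistics.mean(values)
std_dev = statistics.stdev(values)       # sample standard deviation
standard_error = std_dev / math.sqrt(n)

t = (mean - test_value) / standard_error
df = n - 1

# Two-tailed critical value of t for df = 10 at alpha = 0.05 (from a t table)
critical_t = 2.228
reject_null = abs(t) > critical_t

print(round(t, 3), df, reject_null)
```

Comparing |t| with the critical value leads to the same decision as comparing the p-value with 0.05.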
Independent-sample t Test
• The Independent-sample t Test is used to compare the scores of two groups on the same variable.

Q = Test the hypothesis that the mean systolic BP of smokers is significantly different from that of non-smokers.
Formula
Analyze → Compare Means → Independent-Samples T Test: open the dialog box → select the variable from the left side and click the arrow to move it to Test Variable → fill in the Grouping Variable and click Define Groups → click Continue, then click OK
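Result of Independent-sample t Test
1- Ho: µ1 = µ2 vs H1: µ1 ≠ µ2,
where µ1 = mean of smokers
and µ2 = mean of non-smokers.
Level of significance 5% (α = 0.05)
2- Independent-Sample t Test.

3- Calculation
N1 = 8, mean = 131.00, SD = 25.81
N2 = 3, mean = 118.00, SD = 17.78
Levene's test for equality of variances gives p = .861, so equal variances can be assumed. The decision about the means is then read from the "Sig. (2-tailed)" value on the equal-variances row of the output.
4- RESULT: From the summary statistics above, the pooled t value is about 0.79 with df = 9, which is below the critical value of 2.262 at α = 0.05, so the p-value is more than 0.05. Therefore we cannot reject the null hypothesis.

5- CONCLUSION: Since we have not rejected the null hypothesis, we conclude that the data do not provide enough evidence that the mean systolic BP of smokers is different from that of non-smokers.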
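
A pooled (equal-variance) t statistic can be worked out from the group sizes, means and standard deviations reported in the calculation step. This is a sketch under the assumption of equal variances; the critical value 2.262 is taken from a standard t table for df = 9, two-tailed, α = 0.05.

```python
import math

# Summary statistics reported on the slide
n1, mean1, sd1 = 8, 131.00, 25.81   # smokers
n2, mean2, sd2 = 3, 118.00, 17.78   # non-smokers

df = n1 + n2 - 2
# Pooled variance, assuming the two groups have equal variances
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t = (mean1 - mean2) / standard_error

# Two-tailed critical value of t for df = 9 at alpha = 0.05 (from a t table)
critical_t = 2.262
reject_null = abs(t) > critical_t

print(round(t, 2), df, reject_null)
```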
Paired-sample t Test
• The Paired-sample t Test is used to compare the means of two variables within a single group.

Q = Test the hypothesis that the mean systolic BP before physical activity differs from the mean after physical activity.

Formula
Analyze → Compare Means → Paired-Samples T Test: open the dialog box → select the two variables from the left side → click the arrow to move them to the paired variables box → click OK
Result of Paired-sample t Test
1- Ho: µ1 - µ2 = 0 vs H1: µ1 - µ2 ≠ 0,
where µ1 = mean before physical activity
and µ2 = mean after physical activity.
Level of significance 5% (α = 0.05)
2- Paired-Sample t Test.

3- Calculation
N1 = 11, mean = 1.91, SD = .83
N2 = 11, mean = 127.45, SD = 23.80
The t value with df = 10 is -17.760 and the p-value is reported as .000, i.e. less than 0.001.
4- RESULT: The p-value is less than 0.001, which is less than 0.05. Therefore we can reject the null hypothesis.

5- CONCLUSION: Since we have rejected the null hypothesis, we conclude that the data provide enough evidence that the mean systolic BP before physical activity is different from the mean after physical activity.
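
The paired t statistic is the mean of the within-person differences divided by its standard error. The sketch below shows the calculation; the function name `paired_t` is hypothetical, and the before/after readings are invented for illustration only because the slides do not list the raw paired data.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for a paired-sample t test on two equal-length lists."""
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)   # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical readings for four people, invented for illustration only
before = [120, 130, 125, 140]
after = [128, 135, 131, 150]

t, df = paired_t(before, after)
print(round(t, 2), df)
```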
Chi-Square Test
• The Chi-Square Test is used to identify a relationship between two categorical variables.

Q = Test the hypothesis that there is a relation between smoking status and education.
Chi-Square Test
Formula
• Analyze → Descriptive Statistics → Crosstabs: click to open the 1st dialog box.
• Select the independent variable (Educational Level) for the row and the dependent variable (Smoking Status) for the column.
• Click the Statistics button at the bottom to open the 2nd dialog box.
• Select the test to apply (Chi-square).
• Click Continue in the 2nd dialog box.
• Click the Cells button at the bottom to open the 3rd dialog box.
• Click Continue in the 3rd dialog box.
• Click OK in the 1st dialog box.
Result of Chi-Square Test
1- Ho: There is no relationship between smoking and education level.
H1: There is a relationship between smoking and education level.
Level of significance 5% (α = 0.05)
2- Chi-square Test.

3- Calculation
N = 10: smokers = 7, non-smokers = 3
Counts by education level:
Education level     Smoker   Non-smoker
None                1        0
Primary             0        1
Intermediate        1        0
Senior high         2        0
Technical school    1        1
University          2        1
Chi-square value is 4.444 with df = 5 and p value = .487

4- RESULT: The p-value was .487, which is more than 0.05. Therefore we cannot reject the null hypothesis.

5- CONCLUSION: Since we have not rejected the null hypothesis, we conclude that the data do not provide enough evidence that there is any relationship between smoking and education level.
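
The chi-square value and degrees of freedom reported here can be reproduced from the counts on the slide. The sketch below computes the expected count for each cell as (row total × column total) / N and sums (observed - expected)² / expected; the variable names are chosen for the example.

```python
# Observed counts from the slide. Rows: smoker, non-smoker.
# Columns: none, primary, intermediate, senior high, technical school, university.
# (Swapping rows and columns does not change the statistic.)
observed = [
    [1, 0, 1, 2, 1, 2],  # smokers (7)
    [0, 1, 0, 0, 1, 1],  # non-smokers (3)
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(col_totals) - 1)

print(round(chi_square, 3), df)
```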
One Way ANOVA Test
• The One Way ANOVA Test is used to compare the scores of more than two groups on the same variable.

Q = Test the hypothesis that mean systolic BP differs between educational levels.
Formula
Analyze → Compare Means → One-Way ANOVA: click to open the dialog box
Select the dependent variable (Sys BP) and move educational level to the Factor box → click Post Hoc to open its dialog box.
Select Tukey → click Continue.
Click Options to open its dialog box.
Select Descriptive and Homogeneity of variance → click Continue, then click OK.
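
The F statistic behind a one-way ANOVA is the ratio of the between-group mean square to the within-group mean square. The sketch below shows that calculation only; it does not cover the Tukey post hoc test or the homogeneity of variance check. The function name `one_way_anova` is hypothetical and the readings are invented for illustration, since the slides do not list systolic BP by education level.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the values around their own group mean
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical systolic BP readings for three education levels (illustration only)
groups = [[110, 120, 130], [120, 130, 140], [150, 160, 170]]

f, df_between, df_within = one_way_anova(groups)
print(round(f, 2), df_between, df_within)
```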
Regression and correlation
• Used to identify the relationship between two measurement variables.
Q = Test the hypothesis that there is a relation between weight and systolic BP.
Formula
Analyze → Regression → Linear → Dependent (weight) + Independent (sys BP) → OK
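
A simple linear regression fits a straight line through the data by least squares, and the correlation coefficient r measures how closely the points follow that line. The sketch below shows both calculations; the function name `linear_fit` is hypothetical and the data are invented for illustration, since the slides do not list weight values.

```python
import math
import statistics

def linear_fit(x, y):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r

# Hypothetical data, invented for illustration only
sys_bp = [110, 120, 130, 140, 150]   # independent variable
weight = [60, 64, 70, 73, 78]        # dependent variable

slope, intercept, r = linear_fit(sys_bp, weight)
print(round(slope, 2), round(intercept, 2), round(r, 3))
```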
