Bio Statistics

GOOD
MORNING
02/23/2023 1
“When you can measure what you are
speaking about and express it in numbers, you
know something about it; but when you
cannot measure it, when you cannot
express it in numbers, your knowledge is
of a meager and unsatisfactory kind.”
– Lord Kelvin
Tests in Tests of Significance
(Parametric and Non Parametric
Tests)
Contents :
• Introduction
• Applications of biostatistics
• Terminologies
• Test of significance
–P value
–Normal distribution
–Confidence interval
–Level of significance
–Standard error
–Parametric tests
–Post hoc tests
–Non parametric test
• Conclusion
• References
Introduction
What is statistics?
– “Science of statecraft”
– Latin – status, Italian – statista, German – Statistik, French – statistique.
– The history of statistics can be said to start around 1749, although the interpretation of what the word "statistics" means has changed over time.
What is biostatistics?
• The application of statistics to a wide range of topics in biology.
• The science of biostatistics encompasses the design of biological experiments, especially in medicine and agriculture.
• The collection, organization, analysis, tabulation and interpretation of data, and the drawing of inferences from results, relating to living organisms and human beings.
Father of Statistics
Sir Ronald A Fisher

Applications of biostatistics
• In Physiology and Anatomy
o Limits of normality
o Difference between means and proportions
o Correlation between the variables
• In Pharmacology
o To find the action of the drug
o To compare the action of two different drugs or two successive dosages of the same drug
o To find the relative potency of a new drug with respect to a standard drug
• In Medicine
o To compare the efficacy of a particular drug, operation or line of treatment
o To find an association between two attributes, e.g. cancer and smoking
o To identify signs and symptoms of a disease or syndrome
• In Community Medicine and Public Health
o To test usefulness of sera and vaccines in the field
o In epidemiological studies
o Awareness in public
o Evaluate articles, journals
o Clinical practice or research
DATA
• QUANTITATIVE: DISCRETE or CONTINUOUS
• QUALITATIVE: ORDINAL or NOMINAL
• Variable: Any character, characteristic, or quality that varies.
• Dependent variable: A variable that depends on or is influenced by another variable.
• Independent variable: A variable that influences the behavior of another variable, and which can be manipulated in a study.
E.g. Age [independent variable] influences blood pressure [dependent variable].
Measures of Central Tendency
• Mean: calculated by adding all the values in a
data set and dividing them by the total
number of values
• Median: It is the central value when all
observations are sorted in order.
• Mode: most commonly occurring value in the
data
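The three measures above can be sketched in a few lines of Python. The readings below are hypothetical values chosen for illustration (they are not from the slides); one extreme value is included to show how the mean is pulled upward while the median and mode are not.

```python
from statistics import mean, median, mode

# Hypothetical systolic blood pressure readings (mmHg), illustration only.
readings = [118, 120, 120, 124, 130, 138, 160]

avg = mean(readings)     # sum of all values divided by the number of values
mid = median(readings)   # central value once the observations are sorted
common = mode(readings)  # most commonly occurring value

print("Mean:", avg)
print("Median:", mid)
print("Mode:", common)
```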
Measures of dispersion
• Range – difference between the largest and smallest values.
• Quartiles – values that divide the data into four equal parts when the data are arranged in ascending order.
• Standard deviation – a measure of the magnitude of the variation present in a set of data.
$SD = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
• Variance – a measure of the extent of the variation present in a set of data. It is the square of the SD:
$SD^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
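As a minimal sketch, the same measures can be computed with Python's standard library. The data are hypothetical, and the quartile values depend on the interpolation convention; `statistics.quantiles` uses its default "exclusive" method here, which is an assumption rather than the slides' definition.

```python
from statistics import quantiles, stdev, variance

# Hypothetical readings, illustration only.
values = [118, 120, 120, 124, 130, 138, 160]

value_range = max(values) - min(values)   # largest minus smallest
q1, q2, q3 = quantiles(values, n=4)       # three cut points, four equal parts
var = variance(values)                    # sum of squared deviations / (n - 1)
sd = stdev(values)                        # square root of the variance

print("Range:", value_range)
print("Quartiles:", q1, q2, q3)
print("Inter-quartile range:", q3 - q1)
print("Variance:", var, "SD:", round(sd, 2))
```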
• WHICH IS APPROPRIATE?
• MEAN >>> STANDARD DEVIATION
• MEDIAN >>> INTER-QUARTILE RANGE
Normal distribution
Mean = 161.2 cms, SD = 5 cms
Height (cms)    Frequency
142.5-145.0     3
145.0-147.5     8
147.5-150.0     15
150.0-152.5     45
152.5-155.0     90
155.0-157.5     155
157.5-160.0     194
160.0-162.5     136
162.5-165.0     136
165.0-167.5     93
167.5-170.0     42
170.0-172.5     16
172.5-175.0     6
175.0-177.5     2
Total           1000
[Figure: the same height frequency table, with brackets marking Mean ± 1 SD, Mean ± 2 SD and Mean ± 3 SD]
[Figure: histogram of the height frequency distribution; x-axis: height class (cms), y-axis: frequency]
• On preparing frequency distribution with small
class intervals of the data collected, we can
observe
1. Some observations are above the mean & others
are below the mean
2. If arranged in order, maximum number of frequencies
are seen in the middle around the mean & fewer at the
extremes decreasing smoothly
3. Normally half the observations lie above & half below
the mean & all are symmetrically distributed on each
side of mean
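For a theoretical normal curve, the share of observations within 1, 2 and 3 SD of the mean can be checked numerically. This is a small sketch using the standard normal distribution from Python's standard library; the helper name `coverage` is introduced here for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def coverage(k):
    """Proportion of a normal distribution lying within mean ± k SD."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"Mean ± {k} SD covers {coverage(k):.2%}")
```

Real data such as a height frequency table will only approximate these proportions.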
Skewness
• Skewness is a statistic that measures the asymmetry of a distribution.
• For a symmetric distribution such as the normal, the coefficient of skewness is 0.
– Positively (right) skewed
– Negatively (left) skewed
– Bimodal
Kurtosis
• Kurtosis is a measure of the peakedness (height) of the distribution curve.
• For a normal distribution, the coefficient of kurtosis is 3.
– Leptokurtic (high)
– Platykurtic (flat)
– Mesokurtic (normal)
Shape of a Distribution
• Describes how data is distributed
• Measures of shape
– Symmetric or skewed
– Left-skewed: Mean < Median < Mode
– Symmetric: Mean = Median = Mode
– Right-skewed: Mode < Median < Mean
Skewed Distribution
• Non-symmetrical distribution
– Mean, median, mode not the same
• Negatively skewed
– Extreme scores at the lower end
– Mean < median < mode
– Most did well, a few poorly
• Positively skewed
– Extreme scores at the higher end
– Mean > median > mode
– Most did poorly, a few well
• The further apart the mean and median, the more the distribution is skewed.
Examples of Normal and Skewed
[Figure: two frequency histograms. Left: "35-SYSTOLIC BLOOD PRESSURE FIRST ER" (Mean = 146.9, Std. Dev = 27.74, N = 925). Right: "44-DAYS IN ICU" (Mean = .9, Std. Dev = 3.99, N = 933).]
Test of Significance
(the role of chance)
• Tests of significance are one of the central concepts in statistics.
• They are mathematical methods for finding the probability that an observed difference occurred by chance.
• A test of significance is a statistical procedure by which one can conclude whether the observed results from the sample are due to chance or not.
Test of significance (continued)
Group        Mean SBP    SD
A (n1=50)    145         8.9
B (n2=50)    135         10
Statistical hypothesis
• NULL HYPOTHESIS
“ No difference in average SBP between the
two groups.”
• ALTERNATE HYPOTHESIS
“Average SBP of group A is different from
group B.”
P value (probability)
• The probability that a difference at least as extreme as the one found in the observed data would have occurred by chance when the null hypothesis is true.
• A high p value means the data are compatible with the null hypothesis (it does not prove it).
• A low p value is evidence against the null hypothesis, in favour of the alternate hypothesis.
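As a rough sketch of how a p value arises, the SBP summary for groups A and B (means 145 and 135, SDs 8.9 and 10, 50 subjects each) can be run through a large-sample z test. Treating the difference in means as normally distributed is an assumption made here for illustration.

```python
from math import sqrt
from statistics import NormalDist

mean_a, sd_a, n_a = 145, 8.9, 50   # group A
mean_b, sd_b, n_b = 135, 10, 50    # group B

# Standard error of the difference between the two sample means.
se_diff = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)

# Observed difference expressed in standard-error units.
z = (mean_a - mean_b) / se_diff

# Two-sided p value: chance of a difference at least this extreme under H0.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"SE = {se_diff:.3f}, z = {z:.2f}, p = {p_two_sided:.2e}")
```

A p value this small would lead to rejecting the null hypothesis of no difference in average SBP.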
Types of errors in test of significance
Decision                  Null hypothesis true     Null hypothesis false
Accept null hypothesis    Correct action           Type II error (beta)
Reject null hypothesis    Type I error (alpha)     Correct action
Confidence interval
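• A confidence interval is a range of values, computed from a sample, that is likely to contain the population parameter.
• It has a lower limit (L) and an upper limit (U).
• P(L ≤ µ ≤ U) = 0.95.
• The probability 0.95 is the confidence level (or confidence coefficient).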
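A minimal sketch of a 95% confidence interval for a mean, using the group A summary from the SBP example (mean 145, SD 8.9, n = 50). It assumes the large-sample normal approximation; with small samples a t multiplier would normally replace the z multiplier.

```python
from math import sqrt
from statistics import NormalDist

mean, sd, n = 145, 8.9, 50
confidence = 0.95

# z multiplier that leaves (1 - confidence) / 2 in each tail, about 1.96.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

se = sd / sqrt(n)            # standard error of the sample mean
lower = mean - z_crit * se   # lower limit L
upper = mean + z_crit * se   # upper limit U

print(f"95% CI: {lower:.2f} to {upper:.2f}")
```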
Level of significance
• Significance at the 0.05 level means that:
1. Our data are sufficiently far from the cut-off for no difference.
2. In making this decision we are aware that we incur a probability (P), or chance, of 5% of being wrong.
• The smaller the p level (0.01, 0.001, 0.0001) and the farther our data are from the cut-off, the smaller the chance of a wrong conclusion.
Potter RH. Significance level and confidence interval. J Dent Res 1994;73:494-495
DEGREE OF FREEDOM
• The number of degrees of freedom is the number of values in the results that are free to vary.
• It is important to calculate the degree(s) of freedom when determining the significance of a chi square statistic and the validity of the null hypothesis.
STANDARD ERROR
• SAMPLING VARIATION / DISTRIBUTION
• STANDARD ERROR
The measure of such variation is the standard error:
$S.E. = \dfrac{\text{SD of observations in the sample}}{\sqrt{\text{No. of observations in the sample}}}$
• Standard error of sample mean = $\dfrac{SD}{\sqrt{n}}$
• Standard error for sample proportion = $\sqrt{\dfrac{PQ}{n}}$
• Standard error for difference in means = $\sqrt{\dfrac{SD_1^2}{n_1} + \dfrac{SD_2^2}{n_2}}$
• S.E. for difference in proportions = $\sqrt{\dfrac{P_1 Q_1}{n_1} + \dfrac{P_2 Q_2}{n_2}}$
STAGES OF TESTS OF SIGNIFICANCE
• Formulation of a null hypothesis
• Selection of statistics (test function), according to the
content of the null hypothesis and to the conditions
fulfilled by the random sample,
• Determination of the significance level of the test α (it is
equal to the probability of the type 1 error),
• Determination of the alternative hypothesis on the basis of
the random test results (form of negating null hypothesis),
• Determination of the limits of a so-called critical area,
according to the content of the alternative hypothesis in
such a way, that its area is equal to the significance level α,
• Drawing conclusions based on the position of the statistic
value in relation to the critical area.
Types of tests
• PARAMETRIC
Statistical procedures based on assumptions about a known underlying distribution are referred to as parametric statistics.
The validity of the RESULTS depends on these ASSUMPTIONS.
• NON PARAMETRIC
A nonparametric method is one that does not concern any specific parameter.
ASSUMPTIONS: LESS STRINGENT
Parametric v/s Non-parametric
                         Parametric                   Non-parametric
Assumed distribution     Normal                       Any
Assumed variance         Homogeneous                  Any
Typical data             Ratio or Interval            Ordinal or Nominal
Data set relationships   Independent                  Any
Usual central measure    Mean                         Median
Benefits                 Can draw more conclusions    Simplicity; less affected by outliers
TESTS IN TEST OF SIGNIFICANCE
Parametric (normal distribution & normal curve): quantitative data
1) Student 't' test
   1) Paired
   2) Unpaired
2) Z test (for large samples)
3) One way ANOVA
4) Two way ANOVA
Non-parametric (does not follow normal distribution): qualitative data
• Qualitative
1) Z – prop test
2) χ² test
• Quantitative converted to qualitative
1. Mann Whitney U test
2. Wilcoxon rank test
3. Kruskal Wallis test
4. Friedman test
How to know what to use
• There are many theoretical distributions,
both continuous and discrete.
• We use 4 of these a lot: z (unit normal), t,
chi-square, and F.
• Z and t are closely related to the sampling
distribution of means; chi-square and F /
ANOVA are closely related to the sampling
distribution of variances.
Objective of using tests of significance
• To compare a sample mean with the population mean
• To compare the means of two samples
• To compare a sample proportion with the population proportion
• To compare the proportions of two samples
• To test the association between two attributes
Parametric                 Uses                                              Non-parametric
Paired t test              Test of difference between paired observations    Wilcoxon signed rank test
Two sample t test          Comparison of two groups                          Wilcoxon rank sum test; Mann Whitney U test; Kendall's s test
One way ANOVA              Comparison of several groups                      Kruskal Wallis test
Two way ANOVA              Comparison of groups' values on two variables     Friedman test
Correlation coefficient    Measure of association between two variables      Spearman's rank correlation; Kendall's rank correlation
Normal test (Z test)                                                         Chi square test

PARAMETRIC TESTS
1. Z-test
2. Chi Square test
3. t-test
-Independent sample t-test
-Paired t-test
4. F-test (ANOVA)
Z Test
• Standard error test when sample size is large.
• z-test is preferable when n is greater than 30.
• Compare differences between proportions
• Data points should be independent from each other
• Degrees of freedom:
1. For the z-test, degrees of freedom are not required, since z-scores of 1.96 and 2.58 are used for the 5% and 1% levels respectively.
2. For the equal-variance two-sample t-test = (n1 + n2) - 2; the unequal-variance t-test uses an approximate df that is at most this value.
3. For the paired sample t-test = number of pairs - 1
• It is calculated by taking the observed
difference b/w the two proportions
(numerator) & dividing it by the standard error
of the difference b/w the two proportions
(denominator)
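The calculation described above can be sketched as follows. The counts are hypothetical (30 of 100 versus 20 of 100), and the standard error uses the unpooled form $\sqrt{p_1 q_1 / n_1 + p_2 q_2 / n_2}$; some texts pool the two proportions under the null hypothesis instead, so treat the choice as an assumption.

```python
from math import sqrt

def z_two_proportions(x1, n1, x2, n2):
    """z = observed difference in proportions / standard error of the difference."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

# Hypothetical example: 30/100 responders on drug A, 20/100 on drug B.
z = z_two_proportions(30, 100, 20, 100)
print(f"z = {z:.3f}")
print("Significant at 5%" if abs(z) > 1.96 else "Not significant at 5%")
```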
t Test
• Standard error test for small samples
• The t-test assesses whether the means of two
groups are statistically different from each
other
• Two types
1. independent sample t test
2. paired t test
Independent sample t test
• Compare the means of two independent
random samples from two populations
• Variable of interest is quantitative
• Assumptions :
1. Population from which the samples were
obtained must be normally or approximately
normally distributed
2. Homogeneity of variances
Example
• OBJECTIVE: To test whether the birth weight of the newborn is related to the presence of dental caries in the first trimester.
• A prospective study was conducted in the OBG Department of Yenepoya Hospital. A total of 204 pregnant women were enrolled in the study and were followed until delivery.
Dental caries    n      Mean birth weight    SD
NO               115    2.68                 0.33
YES              89     2.63                 0.37
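A sketch of the independent sample t test on the summary figures in this table, assuming equal variances (pooled SD). The critical value of roughly 1.97 for 202 degrees of freedom at the two-sided 5% level is quoted as an approximation rather than computed.

```python
from math import sqrt

n1, mean1, sd1 = 115, 2.68, 0.33   # no dental caries
n2, mean2, sd2 = 89, 2.63, 0.37    # dental caries present

df = n1 + n2 - 2

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df

# Standard error of the difference between the two means.
se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))

t = (mean1 - mean2) / se_diff

print(f"t = {t:.3f} on {df} degrees of freedom")
print("Significant at 5%" if abs(t) > 1.97 else "Not significant at 5%")
```

With these figures t is well below the critical value, so the difference in mean birth weight would not be judged significant.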
Paired t TEST
• Two measures taken on the same subject, naturally occurring pairs of observations, or two individually matched samples.
• Variable of interest is quantitative
• Assumptions:
“The differences between pairs in the population are independent and normally or approximately normally distributed.”
ONE AND TWO-TAILED TESTS
• The one-tailed test is performed if the results are interesting only if they turn out in a particular direction.
• The two-tailed test is performed if the results would be interesting in either direction.
ANOVA
• Extension of the independent t test to compare the means of more than two groups
• Why not compare all possible pairs by t test?
– Tedious
– Time consuming
– Confusing
– Potentially misleading (the chance of a type 1 error increases with each additional test)
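The last point can be made concrete with a short calculation. The sketch below assumes the pairwise tests are independent, which is a simplification, and the function name `familywise_error` is introduced here for illustration.

```python
from math import comb

def familywise_error(groups, alpha=0.05):
    """Number of pairwise t tests and the chance of at least one type 1 error."""
    comparisons = comb(groups, 2)              # k(k - 1) / 2 possible pairs
    return comparisons, 1 - (1 - alpha) ** comparisons

for k in (3, 4, 5):
    pairs, error = familywise_error(k)
    print(f"{k} groups: {pairs} t tests, overall type 1 error about {error:.1%}")
```

With five groups the overall chance of a false positive is already around 40%, which is why a single ANOVA is preferred over repeated t tests.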
One way ANOVA
V/S
Two way ANOVA
• One-way ANOVA tests the effect of one independent variable (IV); two-way ANOVA is used when there is more than one IV and there are multiple observations for each IV.
• The two-way ANOVA is an extension of the one-way ANOVA that examines the influence of two different categorical independent variables on one dependent variable.
ANCOVA
• Analysis of covariance (ANCOVA) is a general linear model which blends ANOVA and regression.
• ANCOVA evaluates whether population means of a dependent variable (DV) are equal across levels of a categorical independent variable (IV), while statistically controlling for the effects of other continuous variables that are not of primary interest, known as covariates (CV).
POST HOC TESTS
• We can use post hoc tests to tell us which groups differ from the rest.
• Post hoc tests are designed for situations in which the researcher has already obtained a significant test and additional exploration of the differences among means is needed to provide specific information on which means are significantly different from each other.
• Types of Post Hoc Tests
1. Fisher's least significant difference

2. Bonferroni correction
3. Duncan's new multiple range test
4. Friedman test
5. Newman–Keuls method
6. Rodger's method
7. Scheffé's method
8. Tukey's range test
9. Dunnett's test
Chi –Square test
• The chi-square (χ²) test is used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories.
2 x 2 Contingency table
Tobacco chewing    Dental caries: No    Dental caries: Yes    Total
Yes                8 (14.3%)            48 (85.7%)            56
No                 107 (72.3%)          41 (27.7%)            148
TOTAL              115                  89                    204
Situation:
– Variables of interest are categorical
– To determine whether an observed difference in proportions between the study groups is statistically significant
– To test the association or independence of 2 variables
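• $\chi^2 = \sum \dfrac{(O - E)^2}{E}$, where O is an observed frequency and E the corresponding expected frequency
• Degrees of freedom = (rows - 1) × (columns - 1)
• Chi-square for 1 degree of freedom at the 5% level = 3.84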
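A sketch of the chi-square calculation for the tobacco chewing and dental caries table. It assumes cell counts of 8, 48, 107 and 41, which are the values that reproduce the stated row totals (56, 148), column totals (115, 89) and percentages; expected frequencies are computed as row total × column total / grand total.

```python
# Rows: tobacco chewing yes / no. Columns: dental caries no / yes.
observed = [[8, 48],
            [107, 41]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n   # expected frequency under H0
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"chi-square = {chi2:.2f} with {df} degree of freedom")
print("Significant at 5%" if chi2 > 3.84 else "Not significant at 5%")
```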
• Pearson's chi-squared test is used to assess two
types of comparison: tests of goodness of fit and
tests of independence
• A test of goodness of fit establishes whether or
not an observed frequency distribution differs
from a theoretical distribution.
• A test of independence assesses whether paired
observations on two variables, expressed in a
contingency table, are independent of each other
McNemar's test
• Chi-square assumes that the samples are taken independently, i.e. are unpaired. If they are paired, you need to use McNemar's test.
• If you have only a small sample size, you should use Fisher's exact test.
EXAMPLE:
• McNemar's chi-squared = 2.5, df = 1, p-value = 0.1138
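The underlying paired table is not shown, so the sketch below uses hypothetical discordant pair counts (8 and 2), chosen only because they reproduce the statistic quoted above when the continuity correction is applied. The p value uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical counts of discordant pairs (changed in one direction vs the other).
b, c = 8, 2

# McNemar statistic with continuity correction.
chi2 = (abs(b - c) - 1) ** 2 / (b + c)

# Upper-tail probability of chi-square with 1 df.
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi2)))

print(f"McNemar's chi-squared = {chi2}, df = 1, p-value = {p_value:.4f}")
```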
Pearson correlation coefficient
• Used to quantify the strength and direction of the relationship between two variables.
• It was developed by Karl Pearson from a related idea introduced by Francis Galton in the 1880s.
• It is a measure of the correlation (linear dependence) between two variables X and Y, giving a value between +1 and −1 inclusive.
• Pearson's correlation coefficient between two
variables is defined as the covariance of the
two variables divided by the product of their
standard deviations.
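This definition translates directly into code. The sketch below uses hypothetical paired measurements; because the (n − 1) divisors of the covariance and the two standard deviations cancel, it works with sums of deviations directly.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired measurements, illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(f"r = {r:.4f}")
```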
NON PARAMETRIC
TESTS
• Nonparametric methods can be defined as “methods which do not involve hypotheses concerning specific values of parameters or which are based on some function of the sample observations whose sampling distribution can be determined without knowledge of the specific distribution function of the underlying population.”
• Also known as distribution-free statistics
(Jay S. Kim, Ronald J. Dailey, Biostatistics for oral healthcare, 1st Ed, Munksgaard: Blackwell, Chapter 14, Pg: 257)
Why Should We?
• They can be used without the normality assumption.
• They can be used with nominal or ordinal data.
• Hypothesis testing can be performed when population parameters, such as mean μ and standard deviation σ, are not involved.
• The computations are lighter in most cases and the procedures are easier to understand.
• Because they deal with ranks rather than the actual observed values, non-parametric techniques are less sensitive to measurement errors than parametric techniques and can use ordinal data rather than continuous data.
Why Shouldn't We?
• They tend to use less information than the parametric methods.
• They are less sensitive than the parametric tests. Thus larger differences are needed to reject the null hypothesis.
• They are less efficient than their parametric counterparts. Roughly, this means that larger sample sizes are required for non-parametric tests to overcome the loss of information.
The Sign Test
• Oldest non-parametric test procedure
• Applied to compare paired samples
• Assumptions required:
– The random variables under consideration have a continuous distribution
– The paired samples are independent
– The measurement scale is at least ordinal within each pair
• The two populations need not be statistically independent
• The null hypothesis of the sign test:
The median of Xi is equal to the median of Yi for all i
• Null hypothesis: in the case of empirical decisions, the initial assumption is that there is no difference between the groups (H0)
H0 : µ1 = µ2 (the 2 means are equal)
• E.g.: A dentist with 2 clinics in the same town advertises one in the newspaper and not the other, and asks an assistant to record patient flow for 12 weeks.
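A sketch of the sign test for a two-clinic comparison of this kind. The weekly patient counts are hypothetical, invented for illustration; under the null hypothesis each non-tied week is equally likely to favour either clinic, so the number of plus signs follows a binomial distribution with probability 1/2.

```python
from math import comb

def sign_test(x, y):
    """Return (plus signs, minus signs, two-sided exact p value)."""
    plus = sum(a > b for a, b in zip(x, y))
    minus = sum(a < b for a, b in zip(x, y))
    n = plus + minus                 # tied pairs are dropped
    k = min(plus, minus)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return plus, minus, min(1.0, 2 * tail)

# Hypothetical weekly patient counts over 12 weeks.
advertised = [42, 38, 45, 40, 37, 44, 41, 39, 46, 43, 36, 40]
not_advertised = [39, 40, 41, 36, 38, 40, 37, 35, 42, 39, 37, 38]

plus, minus, p = sign_test(advertised, not_advertised)
print(f"{plus} plus signs, {minus} minus signs, p = {p:.3f}")
```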
The Wilcoxon Rank Sum Test
• Considers the magnitude of differences via ranks
• Under the null hypothesis, the samples come from identical populations
• Does not require the two independent populations to follow a normal distribution; requires that the samples are from continuous distributions, to avoid ties
• The test is based on the ranks of the observations rather than their actual numerical values
• Also known as the Mann-Whitney U test
• Non-parametric alternative to the two sample t test for independent samples.
• The null and alternate hypotheses can be stated as:
H0 : P(X > Y) = 1/2 (X and Y have the same distribution)
vs.
H1 : P(X > Y) ≠ 1/2 (X and Y have different distributions)
• or
H1 : P(X > Y) > 1/2 (observations from population X are likely to be larger than those from population Y),
• or
H1 : P(X > Y) < 1/2 (observations from population Y are likely to be larger than those from population X)
• Steps in this test
– Compute $Z = \dfrac{W_x - \mu_w}{\sigma_w}$, where
$W_x$ = sum of the ranks of the first sample
$\mu_w = \dfrac{n_1 (n_1 + n_2 + 1)}{2}$
$\sigma_w = \sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$
– Find the critical values. For the significance level α = 0.05, use the z values of -1.96 and 1.96.
– Reject the null hypothesis if Z < -1.96 or Z > 1.96. Accept the null hypothesis if -1.96 ≤ Z ≤ 1.96.
• E.g.: Whether home bleaching products for teeth cause mercury to be released from amalgam fillings under 10% and 15% carbamide peroxide.
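The steps of the rank sum test can be sketched as below. The two samples are hypothetical measurements, and the normal approximation is shown only to illustrate the formulas (with samples this small, exact tables would normally be used).

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

# Hypothetical measurements under two conditions.
x = [1.1, 2.3, 3.8, 4.0, 5.2]
y = [2.9, 4.7, 6.1, 6.6, 7.3]
n1, n2 = len(x), len(y)

ranks = average_ranks(x + y)
w_x = sum(ranks[:n1])                          # sum of ranks of the first sample
mu_w = n1 * (n1 + n2 + 1) / 2
sigma_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (w_x - mu_w) / sigma_w

print(f"Wx = {w_x}, Z = {z:.3f}")
print("Reject H0" if abs(z) > 1.96 else "Do not reject H0")
```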
The Wilcoxon Signed Rank test
• Also used to test whether the outcomes of two treatments are the same, or the hypothesis that two population distributions are identical.
• Used in place of the t test for dependent samples, without the assumption of a normal distribution
• Also referred to as the Wilcoxon matched pair test for dependent samples
• Based on the ranks of the absolute differences between the paired observations rather than the numerical values of the differences
• Appropriate for observations that represent ordinal data
• The test reduces each matched pair (Xi, Yi) to a single observation by taking the difference
Di = Xi − Yi (or Di = Yi − Xi) for i = 1, 2, · · · , n
• Rank all of the Di without regard to sign. That is, rank the absolute values |Di|.
• Affix the sign of the difference to each rank. This indicates which ranks are associated with positive Di or negative Di.
• Compute T+ = the sum of the ranks Ri+ of the positive Di, and T− = the sum of the ranks Ri− of the negative Di.
• If Di = Xi − Yi = 0, such pairs are deleted from the analysis and the sample size is reduced accordingly. When two or more Di are tied, the average rank is assigned to each of the differences.
• If the sum of the positive ranks T+ is different (much smaller or much larger) from the sum of the negative ranks T−, we would conclude that treatment A is different from treatment B, and therefore the null hypothesis H0 will be rejected.
• Without loss of generality, the test statistic is defined by $T^+ = \sum R_i^+$, which is approximately normally distributed with
Mean = $\mu = \dfrac{n(n + 1)}{4}$
• and
Variance = $\sigma^2 = \dfrac{n(n + 1)(2n + 1)}{24}$
• Thus, $Z = \dfrac{T^+ - \mu}{\sigma}$
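The signed rank procedure can be sketched as follows, using hypothetical paired scores. Zero differences are dropped and tied absolute differences receive the average rank, as described in the steps; the normal approximation is applied even though n is small, purely to illustrate the formulas.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

# Hypothetical paired scores (e.g. treatment A vs treatment B on the same subjects).
x = [10, 12, 9, 15, 11, 14, 13, 16]
y = [8, 11, 10, 11, 9, 14, 10, 12]

diffs = [a - b for a, b in zip(x, y) if a != b]    # drop zero differences
ranks = average_ranks([abs(d) for d in diffs])     # rank the absolute values
n = len(diffs)

t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

mu = n * (n + 1) / 4
sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (t_plus - mu) / sigma

print(f"n = {n}, T+ = {t_plus}, T- = {t_minus}, Z = {z:.3f}")
```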
THE MEDIAN TEST
• The median test is a statistical procedure for testing whether two independent populations (treatments) differ in central tendency when the populations are far from normally distributed.
• The hypotheses can be stated as
H0 : Two treatments are from populations with the same median.
H1 : The populations of two treatments have different medians.
• If the null hypothesis is true, we expect about half of the observations in each sample to be below the combined median and about half to be above.
• When applying the χ2 test to a 2 × 2 contingency table, it is computationally more convenient to use the following formula.
$\chi^2 = \dfrac{N \left( |ad - bc| - N/2 \right)^2}{(a + b)(c + d)(a + c)(b + d)}$
• The median test can be extended to examine whether three or more samples came from populations having the same median.
THE KRUSKAL-WALLIS RANK TEST
• The test may be employed to test whether the treatment means are equal.
• Based on the ranks of the observations.
• The only assumption required about the population distributions is that they are independent, continuous and of the same shape.
• It is recommended that at least five samples should be drawn from each population
• Kruskal-Wallis one-way ANOVA by ranks.
• k (k ≥ 3) population means are being compared and we wish to test
H0 : μ1 = μ2 = · · · = μk vs.
H1 : not all μi are equal
• Let n1, n2, · · ·, nk be the numbers of samples taken from the k populations, and N = n1 + n2 + · · · + nk their sum.
• All N observations are ranked from 1 to N. Let Ri be the mean of the ranks for the i-th group.
• The test statistic is known to be approximately a χ2 random variable with (k − 1) degrees of freedom.
• Thus,
$\chi^2 = \dfrac{12}{N(N + 1)} \sum_{i=1}^{k} n_i \left( R_i - \dfrac{N + 1}{2} \right)^2$
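In its standard mean-rank form the Kruskal-Wallis statistic is $\frac{12}{N(N+1)} \sum n_i (R_i - \frac{N+1}{2})^2$, with $R_i$ the mean rank of group i. The sketch below applies it to three hypothetical groups with no tied values (the simple ranking used here is only valid without ties); 5.99 is the chi-square critical value for 2 degrees of freedom at the 5% level.

```python
# Hypothetical measurements for three treatment groups (no tied values).
groups = {
    "A": [6.1, 5.8, 7.0],
    "B": [7.4, 8.1, 7.9],
    "C": [9.3, 8.8, 9.9],
}

pooled = sorted(v for g in groups.values() for v in g)
rank_of = {v: r for r, v in enumerate(pooled, start=1)}  # valid only without ties
n_total = len(pooled)

h = 0.0
for values in groups.values():
    mean_rank = sum(rank_of[v] for v in values) / len(values)
    h += len(values) * (mean_rank - (n_total + 1) / 2) ** 2
h *= 12 / (n_total * (n_total + 1))

df = len(groups) - 1
print(f"Kruskal-Wallis statistic = {h:.2f} with {df} degrees of freedom")
print("Reject H0" if h > 5.99 else "Do not reject H0")
```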
THE FRIEDMAN TEST
• Also known as the Friedman two-way analysis of variance by ranks
• If there are no ties in the data, the test statistic is given by
$\chi_r^2 = \dfrac{12}{N k (k + 1)} \sum_{i=1}^{k} R_i^2 - 3N(k + 1)$
where
N = number of groups
k = number of treatments
Ri = sum of the ranks for the ith treatment
• When ties occur, we need to make an adjustment. The expression for the test statistic is then slightly more complicated, and involves
U = number of untied observations in the data
V = sum of τ³, where τ denotes the size of the ties.
SPEARMAN’S RANK
CORRELATION COEFFICIENT
• This is an alternative to the Pearson correlation coefficient when the normality assumption is not appropriate.
• It can be used when the data can be ranked.
• Computations for the Spearman rank correlation coefficient are simpler, because they involve only the ranks of the samples.
• If both samples have the same ranks, then ρ will be +1. If the ranks of the two samples are completely opposite, then ρ will be −1.
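• If there are no ties, the Spearman rank correlation coefficient is defined by
$\rho = 1 - \dfrac{6 \sum d_i^2}{n(n^2 - 1)}$, where $d_i$ is the difference between the two ranks of the i-th pair and n is the number of pairs
• If there are many ties, the Spearman rank correlation coefficient is computed as the Pearson correlation coefficient of the (average) ranks.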
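For data without ties, the usual formula is $\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$, where $d_i$ is the difference between the two ranks of the i-th pair. A minimal sketch with hypothetical data follows; the simple ranking helper assumes there are no tied values.

```python
def simple_ranks(values):
    """Ranks from 1 to n; assumes no tied values."""
    position = {v: r for r, v in enumerate(sorted(values), start=1)}
    return [position[v] for v in values]

def spearman_rho(x, y):
    rx, ry = simple_ranks(x), simple_ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical paired observations, illustration only.
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]

rho = spearman_rho(x, y)
print(f"rho = {rho:.2f}")
```

Identical rankings give ρ = +1 and completely reversed rankings give ρ = −1.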
Others
• Cochran test
• Anderson–Darling test
• Statistical bootstrap methods
• Cohen's kappa
• Kaplan–Meier
• Kendall's tau
• Kendall's W
• Kolmogorov–Smirnov test
• Kuiper's test
• Logrank test
• Permutation test
• Rank products
• Siegel–Tukey test
• Wald–Wolfowitz runs test
Conclusion
• Parametric and nonparametric are two broad classifications of statistical procedures.
• Parametric tests are based on assumptions about the distribution of the underlying population from which the sample was taken. The most common parametric assumption is that data are approximately normally distributed.
• Nonparametric tests do not rely on assumptions about the shape or parameters of the underlying population distribution.
• If you determine that the assumptions of the parametric procedure are not valid, use an analogous nonparametric procedure instead.
• As non-parametric methods make fewer assumptions, their applicability is much wider than the corresponding parametric methods. In particular, they may be applied in situations where less is known about the application in question.
Choosing Tests
Design                              Parametric test                        Non-parametric test
Correlation test                    Pearson                                Spearman
Independent measures, 2 groups      Independent-measures t-test            Mann-Whitney test
Independent measures, >2 groups     One-way, independent-measures ANOVA    Kruskal-Wallis test
Repeated measures, 2 conditions     Matched-pair t-test                    Wilcoxon test
Repeated measures, >2 conditions    One-way, repeated measures ANOVA       Friedman's test
FREQUENTLY ASKED QUESTIONS
• Define data. Types of data collection and presentation of data?
• Measures of central tendency?
• Measures of dispersion?
• The normal curve?
• Tests of significance?
References
• Kim JS, Dailey RJ. Biostatistics for oral healthcare. 1st ed.
• Mahajan BK. Methods in biostatistics. 6th ed.
• Visweswara Rao K. A manual for statistical methods for use in health, nutrition and anthropology. 2nd ed.
• Jekel JF, Katz DL, Elmore JG, Wild DMG. Epidemiology, Biostatistics and Preventive Medicine. 3rd ed. 2007.
• Greenstein G. Clinical versus statistical significance as they relate to the efficacy of periodontal therapy. J Am Dent Assoc 2003;134:583-91.
• Potter RH. Significance level and confidence interval. J Dent Res 1994;73:494-495
