MORNING
02/23/2023 1
“When you can measure what you are
speaking about, and express it in numbers, you
know something about it; but when you
cannot measure it, when you cannot
express it in numbers, your knowledge is
of a meagre and unsatisfactory kind.”
– Lord Kelvin
Tests in Tests of Significance
(Parametric and Non Parametric
Tests)
Contents :
• Introduction
• Applications of biostatistics
• Terminologies
• Test of significance
–P value
–Normal distribution
–Confidence interval
–Level of significance
–Standard error
–Parametric tests
–Post hoc tests
–Non parametric test
• Conclusion
• References
Introduction
What is statistics?
– “Science of Statecraft”
What is bio-statistics?
Father of Statistics
QUANTITATIVE QUALITATIVE
• Variable : Any character, characteristic, or quality
that varies.
• Dependent Variable : A variable that depends on or
is influenced by another variable.
Measures of dispersion
$$SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
• Variance: A measure of the extent of the variation
present in a set of data
It is the square of the SD:
$$\text{Variance} = \frac{\sum (x - \bar{x})^2}{n - 1}$$
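The SD and variance formulas can be checked numerically. The following is a minimal sketch in Python; the five height values and the function names `sample_variance` and `sample_sd` are made up for illustration and do not come from the slides.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_sd(values):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical heights in cms (illustrative only)
heights = [158.0, 160.5, 162.0, 159.5, 165.0]
print("Variance:", sample_variance(heights))
print("SD:", round(sample_sd(heights), 3))
```

Dividing by (n - 1) rather than n matches the denominator used on the slide.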
• WHICH IS APPROPRIATE?
Normal distribution
Height (cms)    Frequency
142.5-145.0     3
145.0-147.5     8
147.5-150.0     15
150.0-152.5     45
152.5-155.0     90
155.0-157.5     155
157.5-160.0     194
160.0-162.5     136
162.5-165.0     136
165.0-167.5     93
167.5-170.0     42
170.0-172.5     16
172.5-175.0     6
175.0-177.5     2
Total           1000
Mean = 161.2 cms, SD = 5 cms
The same height frequency table, with the ranges Mean ± 1 SD, Mean ± 2 SD and Mean ± 3 SD marked on it.
[Figure: histogram of the height frequency distribution; x-axis: height class (cms) from 142.5-145.0 to 175.0-177.5; y-axis: frequency, 0 to 250]
• On preparing frequency distribution with small
class intervals of the data collected, we can
observe
1. Some observations are above the mean & others
are below the mean
2. If arranged in order, maximum number of frequencies
are seen in the middle around the mean & fewer at the
extremes decreasing smoothly
3. Normally half the observations lie above & half below
the mean & all are symmetrically distributed on each
side of mean
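For a normal distribution, the share of observations lying within a given number of SDs of the mean can be computed from the error function. The sketch below assumes an exactly normal distribution; the function name `proportion_within` is illustrative.

```python
import math

def proportion_within(k):
    """Share of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    share = proportion_within(k) * 100
    print(f"Mean ± {k} SD covers about {share:.2f}% of observations")
```

Real data such as a height table will only approximate these shares.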
Skewness
• Skewness: a statistic that measures the asymmetry of a distribution
• For a symmetric distribution, the coefficient of skewness is 0
Bimodal
Kurtosis
Platykurtic (flat)
Mesokurtic (normal)
Shape of a Distribution
• Describes how data is distributed
• Measures of shape
– Symmetric or skewed
Skewed Distribution
• Non-symmetrical distribution
– Mean, median, mode not the same
• Positively skewed
– Long tail at the higher end
– Mean >median >mode
– Most did poorly, a few well
• The further apart the mean and median, the more the
distribution is skewed.
Examples of Normal and Skewed
[Figure: two frequency histograms, "35-SYSTOLIC BLOOD PRESSURE FIRST ER" and "44-DAYS IN ICU"]
Test of Significance
(the role of chance)
• Tests of significance are one of the central
concepts in statistics.
Group          SBP mean   SD
A (n1 = 50)    145        8.9
B (n2 = 50)    135        10
Statistical hypothesis
• NULL HYPOTHESIS
“ No difference in average SBP between the
two groups.”
• ALTERNATE HYPOTHESIS
“Average SBP of group A is different from
group B.”
P value (probability)
• Probability that a difference at least as
extreme as the one found in the observed data
would have occurred by chance when the null
hypothesis is true.
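As an illustration, the P value for the SBP example (group A: mean 145, SD 8.9, n = 50; group B: mean 135, SD 10, n = 50) can be sketched with a large-sample z test. The choice of a two-sided z test and the function name `two_sample_z` are assumptions made for this sketch, not something the slides specify.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z statistic and two-sided P value for a difference in means."""
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / se_diff
    # Two-sided tail area of the standard normal beyond |z|
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided

z, p = two_sample_z(145, 8.9, 50, 135, 10, 50)
print(f"z = {z:.2f}, two-sided P = {p:.2e}")
```

A P value below the chosen level of significance (commonly 0.05) leads to rejecting the null hypothesis of no difference in average SBP.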
                          Null hypothesis
                          True             False
Accepts null hypothesis   Correct action   Type II error (beta)
Confidence interval
• P (L ≤ µ ≤ U) = 0.95.
Level of significance
STANDARD ERROR
• SAMPLING VARIATION/ DISTRIBUTION
• STANDARD ERROR
The measure of such sampling variation is the standard error.
• Standard error of sample mean = SD / √n
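A short sketch of how the standard error feeds into a 95% confidence interval, assuming the usual definition SE = SD / √n and the normal multiplier 1.96. It uses the group A figures from the SBP example (mean 145, SD 8.9, n = 50); the function names are illustrative.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

def confidence_interval_95(mean, sd, n):
    """Approximate 95% limits (L, U) for the population mean, large sample assumed."""
    margin = 1.96 * standard_error(sd, n)
    return mean - margin, mean + margin

se = standard_error(8.9, 50)
lower, upper = confidence_interval_95(145, 8.9, 50)
print(f"SE = {se:.3f}; 95% CI = ({lower:.2f}, {upper:.2f})")
```

The limits correspond to L and U in P (L ≤ µ ≤ U) = 0.95.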
STAGES OF TESTS OF SIGNIFICANCE
• Formulation of a null hypothesis
• Selection of statistics (test function), according to the
content of the null hypothesis and to the conditions
fulfilled by the random sample,
• Determination of the significance level of the test α (it is
equal to the probability of the type 1 error),
• Determination of the alternative hypothesis on the basis of
the random test results (form of negating null hypothesis),
• Determination of the limits of a so-called critical area,
according to the content of the alternative hypothesis in
such a way, that its area is equal to the significance level α,
• Drawing conclusions based on the position of the statistic
value in relation to the critical area.
Types of tests
• PARAMETRIC
Statistical procedures based on the assumptions of the known
underlying distributions are referred to as parametric statistics.
• NON PARAMETRIC
Parametric v/s Non-parametric
                         Parametric                  Non-parametric
Assumed distribution     Normal                      Any
Assumed variance         Homogeneous                 Any
Typical data             Ratio or interval           Ordinal or nominal
Data set relationships   Independent                 Any
Usual central measure    Mean                        Median
Benefits                 Can draw more conclusions   Simplicity; less affected by outliers
TESTS IN TEST OF SIGNIFICANCE
• Parametric (normal distribution and normal curve)
• Non-parametric (does not follow normal distribution)
Objective of using tests of significance
Parametric                Uses                                             Non-parametric
Paired t test             Test of difference between paired observations   Wilcoxon signed rank test
Correlation coefficient   Measure of association between two variables     Spearman's rank correlation; Kendall's rank correlation
1. Z-test
2. Chi Square test
3. t-test
-Independent sample t-test
-Paired t-test
4. F-test (ANOVA)
Z Test
• Standard error test when sample size is large.
• z-test is preferable when n is greater than 30.
• Compare differences between proportions
• Degrees of freedom:
• For two proportions, z is calculated by taking the observed
difference between the two proportions
(numerator) and dividing it by the standard error
of the difference between the two proportions
(denominator)
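The calculation described above can be sketched as follows. The pooled standard error under the null hypothesis is one common choice and is an assumption here; the counts and the function name `two_proportion_z` are hypothetical.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z = (p1 - p2) / SE of the difference, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se_diff = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se_diff

# Hypothetical data: 45 of 100 affected in one group, 30 of 100 in the other
z = two_proportion_z(45, 100, 30, 100)
print(f"z = {z:.2f}")  # compare |z| with 1.96 at the 5% level (two-sided)
```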
t Test
• Standard error test for small samples
• The t-test assesses whether the means of two
groups are statistically different from each
other
• Two types
1. independent sample t test
2. paired t test
Independent sample t test
• Compare the means of two independent
random samples from two populations
• Variable of interest is quantitative
• Assumptions :
1. Population from which the samples were
obtained must be normally or approximately
normally distributed
2. Homogeneity of variances
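Under the two assumptions listed, the independent sample t statistic can be computed from summary statistics using a pooled variance. This is a minimal sketch; the means, SDs and sample sizes are hypothetical, and `pooled_t` is an illustrative name.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent sample t statistic assuming homogeneous variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df

# Hypothetical summary statistics for two independent groups
t, df = pooled_t(10.0, 2.0, 12, 8.0, 2.0, 12)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The statistic is then compared with the tabulated t value for the given degrees of freedom.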
Example
• OBJECTIVE: To assess whether the birth weight of the
newborn is related to the presence of dental caries in the first
trimester.
Dental caries   n     Mean birth weight   SD
No              115   2.68                0.33
Paired t TEST
• Two measures taken on the same subject or
naturally occurring pairs of observation or two
individually matched samples.
• Variable of interest is quantitative
• Assumptions:
“the differences between pairs in the population are
independent and normally or approx. normally
distributed.”
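A paired t test works on the differences within each pair. The sketch below assumes made-up before/after measurements on five subjects; `paired_t` is an illustrative name.

```python
import math
import statistics

def paired_t(first, second):
    """t = mean difference / (SD of differences / sqrt(n)), with n - 1 df."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample SD, (n - 1) denominator
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Hypothetical measurements on the same five subjects
before = [12, 15, 11, 14, 13]
after = [10, 12, 10, 11, 12]
t, df = paired_t(before, after)
print(f"t = {t:.3f} with {df} degrees of freedom")
```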
ONE AND TWO-TAILED TESTS
ANOVA
• Extension of independent t test to compare
the means of more than two groups
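The F statistic of a one-way ANOVA is the ratio of the between-group mean square to the within-group mean square. A minimal sketch with three hypothetical groups follows; `one_way_anova_f` is an illustrative name.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for three groups
f_stat, df_b, df_w = one_way_anova_f([[4, 5, 6], [6, 7, 8], [8, 9, 10]])
print(f"F = {f_stat:.2f} on ({df_b}, {df_w}) degrees of freedom")
```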
2 x 2 Contingency table
Tobacco chewing   Dental caries: No   Dental caries: Yes   Total
No                83 (72.3%)          17 (27.7%)           148
Situation:
• χ²
• Degree of freedom
• Chi-square for 1 degree of freedom at 5% level = 3.84
• Pearson's chi-squared test is used to assess two
types of comparison: tests of goodness of fit and
tests of independence
• A test of goodness of fit establishes whether or
not an observed frequency distribution differs
from a theoretical distribution.
• A test of independence assesses whether paired
observations on two variables, expressed in a
contingency table, are independent of each other
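A test of independence compares each observed cell count with the count expected if the two variables were unrelated. The following sketch uses a hypothetical 2 x 2 table; `chi_square_independence` is an illustrative name, and no continuity correction is applied.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = exposure (no/yes), columns = outcome (no/yes)
chi2, df = chi_square_independence([[30, 20], [20, 30]])
print(f"chi-square = {chi2:.2f}, df = {df}")  # compare with 3.84 for 1 df at 5%
```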
McNemar's test
• The assumption of Chi-square is that the
samples are taken independently or are
unpaired. If not, you need to use McNemar's
test.
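McNemar's test uses only the discordant pairs, that is, pairs whose two members fall in different categories. A minimal sketch, assuming the uncorrected form of the statistic and hypothetical counts; `mcnemar_statistic` is an illustrative name.

```python
def mcnemar_statistic(b, c):
    """Chi-square statistic (1 df) from the two discordant cell counts b and c."""
    return (b - c) ** 2 / (b + c)

# Hypothetical paired data: 15 pairs changed one way, 5 pairs the other way
stat = mcnemar_statistic(15, 5)
print(f"McNemar chi-square = {stat:.2f}")  # compare with 3.84 for 1 df at 5%
```

Some texts subtract 1 from |b - c| before squaring as a continuity correction; that variant is not shown here.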
EXAMPLE:
Pearson correlation coefficient
• Used to quantify the strength and direction of the
relationship between two variables.
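The coefficient r ranges from -1 to +1 and can be computed directly from the deviations of each variable from its mean. The data and the function name `pearson_r` in this sketch are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their own variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"r = {r:.3f}")
```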
NON PARAMETRIC
TESTS
• Nonparametric methods can be defined as “methods
which do not involve hypothesis concerning specific
values of parameters or which are based on some
function of the sample observation whose sampling
distribution can be determined without knowledge
of the specific distribution function of the underlying
population.”
Why Should We?
• They can be used without the normality assumption.
• The computations are lighter in most cases and the procedures are easier to
understand.
• Because they deal with ranks rather than the actual observed values, non-
parametric techniques are less sensitive to the measurement errors than
parametric techniques and can use ordinal data rather than continuous data.
Why Shouldn't We?
• They tend to use less information than the parametric methods.
• They are less sensitive than the parametric tests. Thus larger
differences are needed to reject the null hypothesis.
The Sign Test
• Oldest non-parametric test procedure
• Assumptions required :
– The random variable under consideration has a continuous
distribution
– The paired samples are independent
– The measurement scale is at least ordinal within each pair
• The null hypothesis of the sign test
• E.g.: A dentist with 2 clinics in the same town advertises one
in the paper and not the other, and asks an assistant to record patient flow
for 12 weeks.
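The clinic example can be sketched as a sign test: count the weeks in which the advertised clinic saw more patients (plus signs) and fewer patients (minus signs), then compute an exact two-sided binomial probability under the null hypothesis that plus and minus are equally likely. The weekly counts below are invented for illustration, and `sign_test` is an illustrative name.

```python
from math import comb

def sign_test(x, y):
    """Return (plus, minus, two-sided exact P); tied pairs are dropped."""
    plus = sum(1 for a, b in zip(x, y) if a > b)
    minus = sum(1 for a, b in zip(x, y) if a < b)
    n = plus + minus
    k = min(plus, minus)
    # Probability of k or fewer of the rarer sign when p = 0.5
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return plus, minus, min(1.0, 2 * tail)

# Hypothetical weekly patient counts over 12 weeks
advertised = [40, 35, 42, 38, 45, 39, 41, 37, 44, 36, 43, 40]
other = [36, 37, 38, 35, 40, 36, 39, 38, 40, 33, 41, 37]
plus, minus, p = sign_test(advertised, other)
print(f"plus = {plus}, minus = {minus}, P = {p:.4f}")
```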
The Wilcoxon Rank Sum Test
• Considers the magnitude of differences via ranks
– Find the critical values. For the significance level α = 0.05, use
the z values of -1.96 and 1.96
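A sketch of the rank sum test using the normal approximation, so that the result can be compared with the critical values -1.96 and 1.96. It assumes the standard mean n1(n1 + n2 + 1)/2 and standard deviation sqrt(n1 n2 (n1 + n2 + 1)/12) for the rank sum of the first sample; the data and function names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def rank_sum_z(sample1, sample2):
    """Rank sum of sample1 and its z value under the normal approximation."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    r1 = sum(ranks[:n1])
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return r1, (r1 - mu) / sigma

# Hypothetical measurements from two independent groups
group_a = [12, 15, 11, 18, 14, 16, 13, 17, 19, 20]
group_b = [22, 25, 21, 28, 24, 26, 23, 27, 29, 10]
r1, z = rank_sum_z(group_a, group_b)
print(f"R1 = {r1}, z = {z:.3f}")  # reject H0 at the 5% level if |z| > 1.96
```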
The Wilcoxon Signed Rank test
• Also used to test whether the outcomes of two treatments are the same,
or the hypothesis that two population distributions are identical.
• Rank all of the Di without regard to sign. That is, rank the absolute
values |Di|.
• Affix the sign of the difference to each rank. This indicates which ranks
are associated with positive Di or negative Di.
• Compute T+ = the sum of the ranks Ri+ of the positive Di, and T− = the
sum of the ranks Ri− of the negative Di.
• If Di = Xi − Yi = 0, such pairs are deleted from the analysis and the
sample size is reduced accordingly. When two or more Di are tied,
the average rank is assigned to each of the differences.
• If the sum of the positive ranks T+ is different (much smaller or much
larger) from the sum of the negative ranks T −, we would conclude
that treatment A is different from treatment B, and therefore, the
null hypothesis H0 will be rejected.
• Thus, $Z = \dfrac{T^{+} - \mu}{\sigma}$
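The signed rank procedure can be sketched end to end. The sketch assumes the standard large-sample values μ = n(n + 1)/4 and σ = sqrt(n(n + 1)(2n + 1)/24), which are not stated on the slide, and applies no tie correction to σ; the paired data and function names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def signed_rank(x, y):
    """Return (T_plus, T_minus, Z) after dropping zero differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    n = len(diffs)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return t_plus, t_minus, (t_plus - mu) / sigma

# Hypothetical paired outcomes under treatments A and B
treatment_a = [10, 12, 9, 15, 11, 14, 13, 16, 8, 12]
treatment_b = [8, 9, 10, 11, 11, 10, 12, 11, 9, 7]
t_plus, t_minus, z = signed_rank(treatment_a, treatment_b)
print(f"T+ = {t_plus}, T- = {t_minus}, Z = {z:.3f}")
```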
THE MEDIAN TEST
• The median test is a statistical procedure for testing whether two
independent populations (treatments) differ in central tendencies
when the populations are far from normally distributed.
• When applying the χ2 test to a 2 × 2 contingency table, it is
computationally more convenient to use the following formula.
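A widely used shortcut for a 2 × 2 table with cells a, b (first row) and c, d (second row) is χ² = N(ad − bc)² / ((a + b)(c + d)(a + c)(b + d)), where N = a + b + c + d. The sketch below assumes this is the formula intended here; the counts of observations above and below the combined median are hypothetical, and `chi_square_2x2` is an illustrative name.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut chi-square (1 df) for a 2 x 2 table, no continuity correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Hypothetical median-test counts:
#                 above median   at or below median
# treatment 1          30               20
# treatment 2          20               30
chi2 = chi_square_2x2(30, 20, 20, 30)
print(f"chi-square = {chi2:.2f}")  # compare with 3.84 for 1 df at 5%
```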
THE KRUSKAL-WALLIS RANK TEST
• Test may be employed to test whether the treatment means
are equal.
• Kruskal-Wallis one-way ANOVA by ranks.
• The test statistic is approximately a χ² random
variable with (k − 1) degrees of freedom.
• Thus,
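The statistic can be sketched with the standard form H = 12 / (N(N + 1)) · Σ(Rj² / nj) − 3(N + 1), where Rj is the rank sum and nj the size of group j, and N is the total number of observations. This form is an assumption of the sketch, which also assumes no tied values; the data and `kruskal_wallis_h` are hypothetical.

```python
def kruskal_wallis_h(groups):
    """Return (H, degrees of freedom); assumes no tied observations."""
    pooled = sorted(x for g in groups for x in g)
    n_total = len(pooled)
    rank_of = {value: i + 1 for i, value in enumerate(pooled)}
    # Sum over groups of (rank sum squared / group size)
    total = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    return h, len(groups) - 1

# Hypothetical responses under k = 3 treatments
h, df = kruskal_wallis_h([[27, 31, 29], [35, 38, 33], [41, 44, 40]])
print(f"H = {h:.2f} with {df} degrees of freedom")
```

H is compared with the tabulated χ² value for (k − 1) degrees of freedom.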
THE FRIEDMAN TEST
• Also known as the Friedman two-way analysis of variance by ranks
where
N = number of groups
k = number of treatments
Ri = sum of the ranks for ith treatment
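Using the symbols defined above, the Friedman statistic is commonly written as χ²r = 12 / (N k (k + 1)) · ΣRi² − 3N(k + 1). The sketch below assumes that form and no ties within a group; the scores and `friedman_statistic` are hypothetical.

```python
def friedman_statistic(rows):
    """rows: one list of k treatment scores per group. Returns (statistic, rank sums)."""
    n_groups = len(rows)   # N
    k = len(rows[0])       # k
    rank_sums = [0] * k    # Ri for each treatment
    for row in rows:
        ordered = sorted(row)
        for j, value in enumerate(row):
            rank_sums[j] += ordered.index(value) + 1  # rank within the group
    stat = 12 / (n_groups * k * (k + 1)) * sum(r ** 2 for r in rank_sums)
    stat -= 3 * n_groups * (k + 1)
    return stat, rank_sums

# Hypothetical scores: N = 4 groups, k = 3 treatments
scores = [[7, 9, 10], [6, 8, 9], [5, 9, 8], [8, 7, 10]]
stat, rank_sums = friedman_statistic(scores)
print(f"Rank sums = {rank_sums}, statistic = {stat:.2f}")
```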
SPEARMAN’S RANK
CORRELATION COEFFICIENT
• If both samples have the same ranks, then ρ will be +1. If the
ranks of the two samples are completely opposite, then ρ will
be −1.
• If there are no ties, the Spearman rank correlation coefficient is
defined by
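The usual no-ties definition is ρ = 1 − 6Σd² / (n(n² − 1)), where d is the difference between the two ranks of each observation. The sketch assumes that definition; the data and `spearman_rho` are hypothetical.

```python
def spearman_rho(x, y):
    """Spearman rank correlation for paired data without ties."""
    def ranks(values):
        ordered = sorted(values)
        return [ordered.index(v) + 1 for v in values]

    rank_x, rank_y = ranks(x), ranks(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical paired scores
rho = spearman_rho([86, 97, 99, 100, 101], [2, 20, 28, 27, 50])
print(f"rho = {rho:.2f}")
```

Identical rankings give ρ = +1 and completely reversed rankings give ρ = −1, in line with the statement on the slide.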
Others
• Cochran test
• Logrank test
• Anderson–Darling test
• Permutation test
• Statistical bootstrap methods
• Rank products
• Cohen's kappa
• Siegel–Tukey test
• Kaplan–Meier
• Wald–Wolfowitz runs test
• Kendall's tau
• Kendall's W
• Kolmogorov–Smirnov test
• Kuiper's test
Conclusion
STYLE- Choosing Tests
Tests                              Parametric                            Non-parametric
Choosing                           Choosing a parametric test            Choosing a non-parametric test
Correlation test                   Pearson                               Spearman
Independent measures, 2 groups     Independent-measures t-test           Mann-Whitney test
Independent measures, >2 groups    One-way, independent-measures ANOVA   Kruskal-Wallis test
Repeated measures, 2 conditions    Matched-pair t-test                   Wilcoxon test
Repeated measures, >2 conditions   One-way, repeated-measures ANOVA      Friedman's test
FREQUENTLY ASKED QUESTIONS
• Define data. Types of data collection and
presentation of data.
• Measures of central tendency?
• Measures of dispersion?
• The normal curve?
• Tests of significance?
References
• Kim and Dailey-1st edition- Biostatistics in oral health care.
• BK Mahajan- 6th edition-methods in biostatistics.
• K Visweshwara Rao -2nd edition-A manual for statistical methods for use in
health, nutrition and anthropology.
• Jekel JF, Katz DL, Elmore JG, Wild DMG, Epidemiology, Biostatistics and
Preventive Medicine, 3rd ed.2007.
• Greenstein G. Clinical versus statistical significance as they relate to the efficacy
of periodontal therapy. J Am Dent Assoc 2003;134:583-91.
• Potter RH. Significance level and confidence interval. J Dent Res 1994;73:494-495