BIOSTATISTICS AND STATISTICAL METHODS

BIOSTATISTICS
Dr Lohith D
1st year MDS
Department of Orthodontics
V S Dental College, Bengaluru
1
CONTENTS
• Introduction
• History
• Applications
• Measures of central tendency
• Measures of dispersion
• Steps in statistical methods
• Methods of presentation of data
• Types of studies
• Sampling
2
• Null hypothesis
• Parametric and non-parametric tests
• Software: statistical packages
• Conclusion
• References
3
INTRODUCTION
Statistics, as a singular noun, is “a science of figures”,
whereas as a plural noun it means “figures”, that is, numerical
data or information.
4
BIOSTATISTICS
BIOSTATISTICS can be defined as the art and
science of collection, compilation, presentation,
analysis and logical interpretation of biological
data affected by a multiplicity of factors.
“An ounce of truth produces tons of statistics”

5
STATISTICS
The word ‘statistics’ is derived from the Italian
word ‘statista’, meaning statesman,
or from the German word ‘Statistik’, which means
political state.
Zimmerman introduced the word statistics in
England.
6
HISTORY OF STATISTICS
During the outbreak of plague in England in 1532, weekly death
statistics began to be published. This practice continued, and
by 1632 the Bills of Mortality were being published, listing
births and deaths.
7
8
HISTORY OF STATISTICS..
In 1662, John Graunt used 30 years
of these bills to make predictions
about the number of people who
would die from various diseases and
the proportions of male and female births
that could be expected.
John Graunt (1620-1674),
father of health statistics

9
KNOWLEDGE OF STATISTICAL METHODS
1. Enables us to make intelligent use of the current literature.
2. Opens up new paths of experimental procedures.
3. Enables a research worker to collect, analyze and present
data in the most meaningful manner.
4. Allows a bioinformatics professional to use statistical
software in a meaningful manner.
10
LIMITATIONS
Statistical laws are not exact laws like mathematical or
chemical laws, but are only true in the majority of cases.
Ex: when we say that the average height of an adult
Indian is 5’ 6’’, it does not indicate the height of an
individual but of a group of individuals.
11
SUBDIVISIONS OF STATISTICS
They can be separated into two broad
categories:
1. Descriptive statistics
2. Inferential statistics
12
MEASURES OF CENTRAL
TENDENCY
Three common types

• MEAN
• MEDIAN
• MODE
13
MEAN
• The arithmetic mean is widely used in statistical
calculation. It is sometimes simply called the mean.
• To obtain the mean, the individual observations are first
added together, and then divided by the number of
observations.
• The operation of adding together is called 'summation'
and is denoted by the sign $\Sigma$ or S. The individual
observation is denoted by the sign $x$ and the mean is
denoted by the sign $\bar{x}$ (called "x bar").
• $\bar{x} = \dfrac{\text{sum of observations}}{\text{number of observations}} = \dfrac{\sum x}{n}$
• $\bar{x} = \dfrac{x_1 + x_2 + x_3 + \dots + x_n}{n}$
14
• The mean ($\bar{x}$) is calculated thus: the ages of 10
orthodontic patients were 15, 14, 16, 12, 18, 16, 17, 19,
21, 23. The total was 171. The mean is 171 divided by 10,
which is 17.1.
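A minimal Python sketch of the calculation above, using the ten ages from the example. The variable names are illustrative and not part of the original slides.

```python
# Ages of the 10 orthodontic patients from the example above.
ages = [15, 14, 16, 12, 18, 16, 17, 19, 21, 23]

total = sum(ages)              # sum of observations
mean_age = total / len(ages)   # divided by the number of observations

print(total, mean_age)
```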
15
MEDIAN
• The median is an average of a different kind, which does not
depend upon the total and number of items.
• To obtain the median, the data is first arranged in
ascending or descending order of magnitude, and then the
value of the middle observation is located, which is called the
median.
E.g. 1) Seven observations are arranged in ascending
order: 3, 4, 4, (5), 5, 6, 7.
The fourth observation (5) is the median of this series.
E.g. 2) 3, 4, 5, 6, 7, 8, 9, 10.
With an even number of observations, the median is the average
of the two middle values: (6 + 7)/2 = 13/2 = 6.5.
16
MODE (z)
• The mode is the most commonly occurring value in a
distribution of data. It is the most frequent item or the
most "fashionable" value in a series of observations.
• Selection of mode = the observation with the
highest repetition.
• Mode of the following data:
10, 11, 12, 26, 20, 40, 20, 10, 12, 10.
As 10 occurs 3 times, more often than any other value, 10 is the mode.
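The median and mode examples can be checked with Python's standard `statistics` module. This is only a sketch; the list names are illustrative.

```python
from statistics import median, multimode

odd_series = [3, 4, 4, 5, 5, 6, 7]            # E.g. 1: seven observations
even_series = [3, 4, 5, 6, 7, 8, 9, 10]       # E.g. 2: eight observations
mode_series = [10, 11, 12, 26, 20, 40, 20, 10, 12, 10]

# median() sorts the data; with an even count it averages the two middle values.
median_odd = median(odd_series)
median_even = median(even_series)

# multimode() returns every value tied for the highest frequency.
modes = multimode(mode_series)

print(median_odd, median_even, modes)
```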
17
DISPERSION
It is necessary to study the variation. This variation is
also known as dispersion. It tells us how
individual observations are scattered or dispersed around
the mean of a large series.
18
MEASURES OF DISPERSION
• There must be individual variations. If we examine the
data of blood pressure or heights or weights of a large
group of individuals, we will find that the values vary
from person to person. Even within the same subject,
there may be variation from time to time. The questions that
arise are: What is normal variation? And how is the
variation measured?
• There are several measures of variation (or "dispersion"
as it is technically called) of which the following are
widely known:
(a) The Range
(b) The Mean or Average Deviation
(c) The Standard Deviation
19

THE RANGE
• The range is by far the simplest measure of dispersion. It is
defined as the difference between the highest and lowest
figures in a given sample. For example, take the following
record of diastolic blood pressure of 10 individuals:
• 83, 75, 81, 79, 71, 90, 75, 95, 77, 94.
• It can be seen that the highest value was 95 and the
lowest 71. The range is expressed as 71 to 95 or by the
actual difference (24).
• If we have grouped data, the range is taken as the
difference between the mid-points of the extreme
categories. The range is not of much practical importance,
because it indicates only the two extreme values and
nothing about the dispersion of the values
between them.
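As a quick check of the range example, a short Python sketch using the ten diastolic pressures listed above (names are illustrative):

```python
# Diastolic blood pressure of the 10 individuals in the range example.
diastolic_bp = [83, 75, 81, 79, 71, 90, 75, 95, 77, 94]

lowest, highest = min(diastolic_bp), max(diastolic_bp)
value_range = highest - lowest   # difference between the extremes

print(f"{lowest} to {highest}, difference {value_range}")
```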
20
THE MEAN DEVIATION
• It is the average of the deviations from the arithmetic
mean, ignoring the sign of each deviation. It is given by the formula:
• M.D. = $\dfrac{\sum |x - \bar{x}|}{n}$
• Example: The diastolic blood pressure of 10 individuals was
as follows: 83, 75, 81, 79, 71, 95, 75, 77, 84 and 90.
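The mean deviation for this example can be worked through in Python. The sketch below assumes the usual definition with absolute deviations; the function name `mean_deviation` is illustrative. For these ten values the mean is 81 and the deviations sum to 56.

```python
def mean_deviation(values):
    """Average absolute deviation from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

# Diastolic blood pressure of the 10 individuals from the example.
diastolic_bp = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]
print(mean_deviation(diastolic_bp))
```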
21
22
The Standard Deviation
• The standard deviation is the most frequently used
measure of dispersion. In simple terms, it is defined as the
"root-mean-square deviation". It is denoted by the
Greek letter sigma ($\sigma$) or by the initials S.D. The standard
deviation is calculated from the basic formula:
$\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$
23
• When the sample size is more than 30, the above basic
formula may be used without modification. For smaller
samples, the above formula tends to underestimate the
standard deviation, and therefore needs correction, which
is done by substituting the denominator ($n - 1$) for $n$. The
modified formula is as follows:
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
24
• The steps involved in calculating the standard deviation
are as follows:
(a) First of all, take the deviation of each value from the
arithmetic mean, $(x - \bar{x})$
(b) Then, square each deviation: $(x - \bar{x})^2$
(c) Add up the squared deviations: $\sum (x - \bar{x})^2$
(d) Divide the result by the number of observations $n$
[or by ($n - 1$) in case the sample size is less than 30]
(e) Then take the square root, which gives the standard
deviation.
25
Example: The diastolic blood pressure of 10 individuals was as
follows: 83, 75, 81, 79, 71, 95, 75, 77, 84, 90.
Calculate the standard deviation.
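One way to work this example is to follow steps (a) to (e) literally. The Python sketch below does that; the function name and the `sample` flag are illustrative assumptions, with `sample=True` using the (n - 1) denominator because there are fewer than 30 observations.

```python
import math

def standard_deviation(values, sample=True):
    """Root-mean-square deviation; divides by n - 1 when sample is True."""
    n = len(values)
    mean = sum(values) / n
    squared = [(x - mean) ** 2 for x in values]   # steps (a) and (b)
    divisor = n - 1 if sample else n              # step (d)
    return math.sqrt(sum(squared) / divisor)      # steps (c) and (e)

diastolic_bp = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]
print(round(standard_deviation(diastolic_bp), 2))
print(round(standard_deviation(diastolic_bp, sample=False), 2))
```

The standard library functions `statistics.stdev` and `statistics.pstdev` compute the same two quantities.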
26
STEPS IN STATISTICAL METHODS
1. Collection of data
2. Classification
3. Tabulation
4. Presentation by graphs
5. Descriptive statistics
6. Establishment of relationship
7. Interpretation
27
DATA
Whenever an observation is made, it is recorded, and
a collective recording of these observations, either
numerical or otherwise, is called data.
Ex: recording the sex of each person in a group of persons
28
VARIABLE
In each case, a certain observation is made for a
characteristic. This characteristic varies from one
observation to another and is called a variable.
29
TYPES OF DATA
1. Qualitative
2. Quantitative
a) Discrete
b) Continuous
3. Grouped / ungrouped
4. Primary / secondary
5. Nominal / ordinal
30
TYPES OF CLINICAL DATA THAT CAN BE
SUPPORTED BY STATISTICS
• Statistics can be used to help the reader make a
critical evaluation of virtually any quantitative data.
• It is important that the statistical techniques used are
appropriate for the given experimental design.
31
NEED FOR ORGANISING THE DATA
• Data collected and compiled from experimental work,
surveys, registers or records are raw data.
• These are unsorted and not very helpful in
understanding the underlying trends or their meaning.
• The objective of classification of data is to make data
simple, concise, meaningful, interesting and helpful in
further analysis.
32
METHODS OF PRESENTATION OF DATA
•Tabulation
•Charts and diagrams
33
TABLES
• Tables are devices for presenting data simply from

masses of statistical data.
• Tabulation is the first step before the data is used for
analysis or interpretation.
• A table can be simple or complex, depending upon the
number or measurement of a single set or multiple sets
of items.
34
GUIDELINES FOR PRESENTATION OF
TABLES
1. The tables should be numbered e.g., Table 1, Table 2,
etc.
2. A title must be given to each table. The title must be
brief and self explanatory.
3. The headings of columns or rows should be clear and
concise.
4. The data must be presented according to size or
importance; chronologically, alphabetically or
geographically.
5. If percentages or averages are to be compared, they
should be placed as close as possible.
35
6. No table should be too large.
7. Most people find a vertical arrangement better than
a horizontal one because, it is easier to scan the data
from top to bottom than from left to right.
8. Footnotes may be given, where necessary, providing
explanatory notes or additional information.
36
TYPES OF TABLES
• Simple table:
These are one-way tables which supply answers to
questions about one characteristic of the data only.
37
FREQUENCY DISTRIBUTION TABLE
• In a frequency distribution table, the data is first split up
into convenient groups (class intervals) and the number
of items (frequency) which occur in each group is
shown in the adjacent column.
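To illustrate, the sketch below builds a frequency distribution table in Python. It reuses the ten diastolic pressures from the earlier examples and assumes class intervals of width 10 starting at 70; the function name `frequency_table` and these interval choices are illustrative, not from the slides.

```python
from collections import Counter

def frequency_table(values, start, width):
    """Group integer values into class intervals of the given width.

    Assumes every value is >= start. Returns (interval label, frequency) rows.
    """
    counts = Counter((v - start) // width for v in values)
    rows = []
    for k in range(max(counts) + 1):
        low = start + k * width
        rows.append((f"{low}-{low + width - 1}", counts.get(k, 0)))
    return rows

diastolic_bp = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]
for interval, frequency in frequency_table(diastolic_bp, start=70, width=10):
    print(interval, frequency)
```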
38
CHARTS AND DIAGRAMS
• Charts and diagrams are one of the most convincing

and appealing ways of depicting statistical results.
Diagrams and graphs are extremely useful because:
1. They are attractive to the eyes.
2. They give a bird’s eye view of entire data
3. They leave a lasting impression on the mind of the
layman.
4. They facilitate comparison of data relating to different
time periods and regions.
39
BAR CHARTS
• Bar charts are merely a
way of presenting a set of
numbers by the length of a
bar. The length of the bar
is proportional to the
magnitude to be
represented.
• Bar charts are a popular
medium for presenting
statistical data because
they are easy to prepare,
and enable values to be
compared visually.
40
41
HISTOGRAM
• It is a pictorial diagram of a
frequency distribution. It
consists of a series of
blocks.
• The class intervals are
given along the horizontal
axis and the frequencies
along the vertical axis. The
area of each block or
rectangle is proportional to
the frequency.
42
FREQUENCY POLYGON
• A frequency distribution
may also be represented
diagrammatically by the
frequency polygon. It is
obtained by joining the
mid-points of the
histogram blocks.
43
LINE DIAGRAM
• This diagram is useful for
studying changes in the
values of a variable
over time and is the
simplest of the diagrams.
• On the x axis the time such
as hours, days, weeks,
months or years are
represented and the value
of any quantity pertaining
to this is represented along
the y axis.
44
Pie charts
• These are so called because
the entire graph looks like a
pie and its components
represent slices cut from a
pie.
• The total angle at the
centre of the circle is equal to
360 degrees and it represents
the total frequency.
• It is divided into different
sectors corresponding to the
frequencies of variables in
the distribution.
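Since the full 360 degrees represents the total frequency, each sector's angle is its frequency divided by the total, multiplied by 360. A small Python sketch with hypothetical class frequencies (the labels and counts are illustrative only):

```python
def sector_angles(frequencies):
    """Map each category to its pie-chart angle in degrees."""
    total = sum(frequencies.values())
    return {label: 360 * f / total for label, f in frequencies.items()}

# Hypothetical frequencies for three class intervals.
angles = sector_angles({"70-79": 5, "80-89": 3, "90-99": 2})
print(angles)
```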
45
PICTOGRAM
• Pictograms are a popular

method of presenting
data to the "man in the
street" and to those who
cannot understand
orthodox charts. Small
pictures or symbols are
used to present the data.
46
STATISTICAL MAPS
• When statistical data refer to

geographic or administrative
areas, it is presented either
as "Shaded Maps“ or "Dot
maps" according to suitability.
• The shaded maps are used
to present data of varying
size. The areas are shaded
with different colours, or
different intensities of the
same colour, which is
indicated in the key.
47
TYPES OF STUDIES
48
COHORT STUDY
Cohort study is another type of analytical (observational)
study which is usually undertaken to obtain additional
evidence to refute or support the existence of an association
between suspected cause and disease. Cohort study is
known by a variety of names: prospective study,
longitudinal study, incidence study, and forward-looking
study. The most widely used term, however, is "cohort
study".
The distinguishing features of cohort studies are:
• The cohorts are identified prior to the appearance of the
disease under investigation.
• The study groups, so defined, are observed over a period
of time to determine the frequency of disease among
them.
• The study proceeds forward from cause to effect.
49

CONCEPT OF COHORT
• In epidemiology, the term "cohort" is defined as a
group of people who share a common characteristic
or experience within a defined time period (e.g., age,
occupation, exposure to a drug or vaccine, pregnancy,
insured persons, etc). Thus a group of people born on
the same day or in the same period of time (usually a
year) form a "birth cohort". All those born in 2010
form the birth cohort of 2010.
• Persons exposed to a common drug, vaccine or
infection within a defined period constitute an
"exposure cohort".
• The comparison group may be the general population
from which the cohort is drawn, or it may be another
cohort of persons thought to have had little or no
exposure to the substance in question, but otherwise
similar.
50
Indications for cohort studies
• when there is good evidence of an association between
exposure and disease, as derived from clinical
observations and supported by descriptive and case
control studies.
• when exposure is rare, but the incidence of disease is high
among the exposed, e.g., special exposure groups like those
in industries, exposure to X-rays, etc.
• when attrition of the study population can be minimized,
e.g., follow-up is easy, cohort is stable, cooperative and
easily accessible.
• when ample funds are available.
51
INTERVENTIONAL STUDIES
These are also known as experimental studies or clinical
trials. In these studies the investigator decides which
subject gets exposed to a particular treatment (or
placebo). This distinguishes them from cohort and
case-control studies, which are observational.
Ex: animal experiments, isolated tissue experiments, in vitro
experiments.
52
INTERVENTIONAL STUDIES
• Randomized controlled trials/clinical trials: with
patients as unit of study
• Field trials/community intervention studies: with
healthy people as unit of study
• Community trials: with communities as unit of study
53
SAMPLING
• When a large number of individuals or items or units
have to be studied, we take a sample.
• It is easier and more economical to study the sample than
the whole population or universe.
• Great care therefore is taken in obtaining a sample. It is
important to ensure that the group of people or items
included in the sample is representative of the whole
population to be studied.
54
SAMPLE SELECTION-GUIDELINES
I. EFFICIENCY
II. REPRESENTATIVENESS
III. MEASURABILITY
IV. SIZE
V. COVERAGE
VI. GOAL ORIENTATION
VII. FEASIBILITY
VIII.ECONOMY AND COST EFFICIENCY
55
DIFFERENT SAMPLING DESIGNS
1. Simple random sampling

2. Systematic random sampling
3. Stratified random sampling
4. Cluster sampling
5. Sub sampling/ multistage sampling
6. Multiphase sampling
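The first two designs can be sketched in a few lines of Python. This is an illustration under stated assumptions: the population is a numbered list of 100 patients, the function names are hypothetical, and the systematic sample takes every k-th unit after a random start, with k = population size // sample size.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Every unit has an equal chance of selection (without replacement)."""
    return random.Random(seed).sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick a random start within the first interval, then every k-th unit."""
    k = len(population) // n
    start = random.Random(seed).randrange(k)
    return [population[start + i * k] for i in range(n)]

patients = list(range(1, 101))   # hypothetical register of 100 patient numbers
print(simple_random_sample(patients, 10, seed=1))
print(systematic_sample(patients, 10, seed=1))
```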
56
DETERMINATION OF SAMPLE SIZE
Quantitative data
$N = \dfrac{4\,SD^2}{L^2}$
where SD = standard deviation and L = allowable error
Journal of Orthodontics Vol 31: 2004, 107-114
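A small Python sketch of this formula, rounding up to a whole subject. The function name and the example values (SD = 8, allowable error = 2) are hypothetical. The factor 4 is commonly read as roughly 1.96 squared, corresponding to 95 per cent confidence; that reading is an assumption here, not stated on the slide.

```python
import math

def sample_size(sd, allowable_error):
    """N = 4 * SD^2 / L^2, rounded up to the next whole subject."""
    return math.ceil(4 * sd ** 2 / allowable_error ** 2)

# Hypothetical values: SD of 8 units, allowable error of 2 units.
print(sample_size(sd=8, allowable_error=2))
```

Halving the allowable error quadruples the required sample size, since L enters the formula squared.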

57
PRECISION
Individual biological variation, sampling errors and
measurement errors lead to random errors, which lead to
lack of precision in the measurement. This error can never
be eliminated but can be reduced by increasing the size of
the sample.
58
PRECISION
$\text{Precision} = \dfrac{\sqrt{\text{sample size}}}{\text{standard deviation}}$
With the standard deviation remaining the same, increasing the
sample size increases the precision of the study.
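The relationship can be illustrated in Python; the function name and the numbers below are hypothetical. Because precision grows with the square root of the sample size, quadrupling the sample only doubles the precision.

```python
import math

def precision(sample_size, sd):
    """Precision as defined above: sqrt(sample size) / standard deviation."""
    return math.sqrt(sample_size) / sd

# Hypothetical study: SD fixed at 5, sample size raised from 25 to 100.
print(precision(25, 5), precision(100, 5))
```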
59
EXPERIMENTAL VARIABILITY
ERROR/ DIFFERENCE / VARIATION
There are three types
1. Observer- subjective / objective
2. Instrumental
3. Sampling defects or error of bias
60
BIAS IN THE SAMPLE
This is also called systematic error. This occurs when
there is a tendency to produce results that differ in a
systematic manner from the true values. A study with
small systematic error is said to have high
accuracy. Accuracy is not affected by the sample size.
61
BIAS IN THE SAMPLE..
There are as many as 45 types of biases,
however the important ones are:
1. Selection bias
2. Information bias
3. Confounding bias
62
ERRORS IN SAMPLING
SAMPLING ERRORS
• Faulty sampling design
• Small size of the sample
NON-SAMPLING ERRORS
• Coverage error: due to non-response or non-cooperation
of the informant
• Observational error: due to interviewer's bias or imperfect
design
• Processing error: due to errors in statistical analysis
63
DISTRIBUTIONS
When you have a collection of points you begin the
initial analysis by plotting them on a graph to see how
they are distributed
64
DISTRIBUTION-TYPES
1. Normal or gaussian
2. Binomial
3. Poisson
4. Rectangular or uniform
5. Skewed
6. Log normal
7. Geometric
65
NORMAL OR GAUSSIAN
DISTRIBUTION
• When data is collected from a very large number of
people and a frequency distribution is made with
narrow class intervals, the resulting curve is smooth and
symmetrical, and it is called the normal curve.
66
In a normal curve
• (a) the area between one
standard deviation on either side
of the mean ($\bar{x} \pm 1\sigma$) will
include approximately 68 per
cent of the values in the
distribution
• (b) the area between two
standard deviations on either
side of the mean ($\bar{x} \pm 2\sigma$) will
cover most of the values, i.e.,
approximately 95 per cent of the
values.
• (c) the area between ($\bar{x} \pm 3\sigma$)
will include 99.7 per cent of
the values. These limits on either
side of the mean are called
"confidence limits".
67
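These three areas can be verified numerically. For a normal distribution, the proportion of values within k standard deviations of the mean equals erf(k / sqrt(2)); the Python sketch below uses that identity (the function name is illustrative).

```python
import math

def area_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"mean +/- {k} SD: {100 * area_within(k):.2f} per cent")
```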
STANDARD NORMAL CURVE
1. The standard normal curve is bell shaped.
2. The curve is perfectly symmetrical, based on an
infinitely large number of observations. The maximum
number of observations is at the mean, and the
number of observations gradually decreases on either
side, with few observations at the extreme points.
3. The total area of the curve is one, its mean is zero
and its standard deviation is one.
4. All the three measures of central tendency, the mean,
median, and mode coincide.
68
BINOMIAL DISTRIBUTION
The binomial distribution is used for describing discrete, not
continuous, data. The values result from an
experiment known as a Bernoulli process. It is used when
each individual falls into one of two classes:
1. Those with a certain characteristic
2. The rest, without this characteristic
The distribution of the occurrence of the characteristic in
the population is defined by the binomial distribution.
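As an illustration, the probability of observing exactly k individuals with the characteristic among n, when each has probability p of having it, is C(n, k) p^k (1 - p)^(n - k). The Python sketch below uses hypothetical values (n = 10 patients, p = 0.2).

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k 'successes' in n independent Bernoulli trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical: chance that exactly 3 of 10 patients show the characteristic
# when each has a 20 per cent chance of having it.
print(round(binomial_pmf(3, 10, 0.2), 4))
```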
69
THE POISSON DISTRIBUTION
If, in a binomial distribution, the probability of
success of an event becomes very small
and the number of observations becomes very large, then
the binomial distribution tends to the Poisson distribution.
This is used to describe the occurrence of rare events in a
large population.
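The limiting behaviour can be seen numerically. The Python sketch below compares binomial probabilities for a rare event (hypothetical n = 1000, p = 0.002) with Poisson probabilities for the same mean, lambda = n * p = 2. The function names are illustrative.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean count is lam."""
    return exp(-lam) * lam ** k / factorial(k)

n, p = 1000, 0.002          # hypothetical rare event in a large population
lam = n * p                 # mean number of events
for k in range(5):
    print(k, round(binomial_pmf(k, n, p), 4), round(poisson_pmf(k, lam), 4))
```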
70
CRITICAL RATIO, Z SCORE
It indicates how much an observation is bigger or smaller
than the mean, in units of SD.
$Z = \dfrac{\text{observation} - \text{mean}}{\text{standard deviation}}$
The Z score is the number of SDs by which the sample mean
departs from the population mean.
As the critical ratio increases, the probability of accepting the null
hypothesis decreases.
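A one-line Python sketch of the Z ratio, with hypothetical values (an observation of 95 against a mean of 81 and an SD of 7):

```python
def z_score(observation, mean, sd):
    """How many standard deviations the observation lies from the mean."""
    return (observation - mean) / sd

# Hypothetical values for illustration only.
print(z_score(observation=95, mean=81, sd=7))
```

A positive Z means the observation lies above the mean, a negative Z below it.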
71
NULL HYPOTHESIS
It is a hypothesis which assumes that there is no
difference between two values such as population
means or population proportions.
When testing a null hypothesis, certain
terms should be clear.
72
NULL HYPOTHESIS…..
CONCLUSION BASED ON SAMPLE

POPULATION               NULL HYPOTHESIS REJECTED   NULL HYPOTHESIS ACCEPTED
NULL HYPOTHESIS TRUE     TYPE I ERROR               CORRECT DECISION
NULL HYPOTHESIS FALSE    CORRECT DECISION           TYPE II ERROR
73
Parametric or Non-parametric?
• If the information about the population is completely
known by means of its parameters, the statistical test is
called a parametric test.
• E.g.: t-test, F-test, z-test, ANOVA
• If there is no knowledge about the population or its
parameters, but it is still required to test a hypothesis
about the population, the test is called a non-parametric test.
• E.g.: Mann-Whitney test, rank sum test, Kruskal-Wallis test
74
    Parametric                    Non-parametric
1   Student's paired t test       Wilcoxon signed rank test
2   Student's unpaired t test     Wilcoxon rank sum test
3   One-way ANOVA                 Kruskal-Wallis one-way ANOVA
4   Two-way ANOVA                 Friedman two-way ANOVA
5   Correlation coefficient       Spearman's rank correlation
6   Regression analysis           Chi-square test
75
STUDENT’S ‘t’ TEST
This test is a parametric test described by
W. S. Gosset, whose pen name was
"Student". Hence it is called Student's t test. It is
used for small samples, i.e. less than 30.
The t test can be:
Paired t test
Unpaired t test
76
Unpaired ‘t’ TEST
This test is applied to unpaired data of independent
observations made on individuals of two different or
separate groups or samples drawn from two populations, to
test if the difference between the means is real or can be
attributed to sampling variability.
Ex: comparing intermolar width in boys with intermolar
width in girls.
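A minimal Python sketch of the unpaired t statistic, assuming the equal-variance (pooled) form of the test. The function name and the intermolar widths are hypothetical, made up for illustration. The resulting t is compared with the tabulated value for the returned degrees of freedom.

```python
import math
from statistics import mean, variance

def unpaired_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for two independent samples,
    assuming equal variances (pooled-variance form)."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_var = ((n_a - 1) * variance(sample_a) +
                  (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_error, n_a + n_b - 2

# Hypothetical intermolar widths in mm (illustrative values only).
boys = [44.1, 45.3, 43.8, 46.0, 44.7]
girls = [42.9, 43.5, 42.2, 44.0, 43.1]
t_value, df = unpaired_t(boys, girls)
print(round(t_value, 2), df)
```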
77
Paired “t” test
• It is applied to paired data of independent
observations from one sample only, when each
individual gives a pair of observations.
• Ex: comparison of intermolar width in the mixed dentition
period and intermolar width in the permanent dentition
period of the same sample.
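The paired t statistic works on the within-individual differences: t is the mean difference divided by its standard error, with n - 1 degrees of freedom. A Python sketch with hypothetical intermolar widths (function name and data are illustrative):

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, degrees of freedom) for paired observations."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    std_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / std_error, n - 1

# Hypothetical intermolar widths in mm for the same five children.
mixed_dentition = [42.0, 43.1, 41.5, 44.2, 42.8]
permanent_dentition = [43.2, 44.0, 42.9, 45.1, 43.5]
t_value, df = paired_t(mixed_dentition, permanent_dentition)
print(round(t_value, 2), df)
```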
78
ANALYSIS OF VARIANCE
(ANOVA)
ANOVA is used when 3 or more groups of individuals are
compared, with the objective of determining whether any true
differences in mean performance exist among the conditions
under study.
Ex: comparing the curve of Spee, curve of Wilson and curve
of Monson in 3 different groups.
1st group: control.
2nd group: serial extraction.
3rd group: late premolar extraction.
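A sketch of the one-way ANOVA F ratio in Python: the between-group mean square divided by the within-group mean square. The function name and the depth measurements for the three groups are hypothetical.

```python
from statistics import mean

def one_way_anova(*groups):
    """Return (F, df between groups, df within groups)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical curve of Spee depths in mm (illustrative values only).
control = [2.1, 2.4, 2.0, 2.6]
serial_extraction = [1.8, 1.9, 2.2, 1.7]
late_premolar_extraction = [2.5, 2.9, 2.7, 3.0]
print(one_way_anova(control, serial_extraction, late_premolar_extraction))
```

A large F suggests that at least one group mean differs; it does not say which one.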
79
CHI-SQUARE ($\chi^2$) TEST
• The Greek letter $\chi$ is called "chi". As the statistic is written
$\chi^2$, the square of $\chi$, the test is called the "chi-square test".
• It was first introduced by the famous statistician Karl
Pearson in 1900.
80
• When the data is measured in terms of attributes or
qualities, and it is intended to test whether the
difference in the distribution of attributes in different
groups is due to sampling variation or not, the chi
square test is applied.
• It is used to test the significance of difference
between two proportions and can be used when
there are more than two groups to be compared.
81
• For example, suppose there are two groups, one which has received
oral hygiene instructions and another which has not received any
instructions, and it is desired to test whether the occurrence of new
cavities is associated with the instructions.

Occurrence of new cavities
Group                                  Present   Absent   Total
No. who received instructions            10        40       50
No. who did not receive instructions     32         8       40
Total                                    42        48       90
82
STEPS
1. State the null hypothesis: to test whether there is an
association between oral hygiene instructions received
and the occurrence of new cavities, state the null
hypothesis as ‘there is no association between oral
hygiene instructions received and the
occurrence of new cavities’.
2. The $\chi^2$ statistic is calculated as:
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$
where O = observed frequency and E = expected
frequency
Proportion of people with caries = 42/90 = 0.47
Proportion of people without caries = 48/90 = 0.53
83
• Among those who received instructions:
expected number attacked = 50 x 0.47 = 23.5
expected number not attacked = 50 x 0.53 = 26.5
• Among those who did not receive instructions:
expected number attacked = 40 x 0.47 = 18.8
expected number not attacked = 40 x 0.53 = 21.2

GROUP                                  ATTACKED        NOT ATTACKED
No. who received instructions          O = 10          O = 40
                                       E = 23.5        E = 26.5
                                       O - E = -13.5   O - E = 13.5
No. who did not receive instructions   O = 32          O = 8
                                       E = 18.8        E = 21.2
                                       O - E = 13.2    O - E = -13.2
84
APPLYING THE X2 TEST
• $\chi^2 = \sum \dfrac{(O - E)^2}{E}$
$= \dfrac{(13.5)^2}{23.5} + \dfrac{(13.5)^2}{26.5} + \dfrac{(13.2)^2}{18.8} + \dfrac{(13.2)^2}{21.2}$
= 7.76 + 6.88 + 9.27 + 8.22
= 32.13
• FINDING THE DEGREES OF FREEDOM
It depends upon the number of columns and rows in
the original table.
d.f. = (columns - 1)(rows - 1)
= (2 - 1)(2 - 1)
= 1
85
• Probability tables: in the probability table, with
1 degree of freedom, the $\chi^2$ value for a probability
of 0.05 is 3.84. Since the observed value of about 32 is much
higher, the null hypothesis is rejected: it is concluded that
there is a difference in caries occurrence between the
two groups, with caries being lower in those who
received instructions.
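The whole calculation can be reproduced in Python. The sketch below derives the expected frequencies from the row and column totals without rounding the proportions to 0.47 and 0.53, so its result (about 32.14) differs slightly in the second decimal place from the hand calculation. The function name is illustrative.

```python
def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Rows: received / did not receive instructions. Columns: cavities present / absent.
observed = [[10, 40], [32, 8]]
statistic, df = chi_square(observed)
CRITICAL_VALUE = 3.84   # chi-square at p = 0.05 with 1 degree of freedom
print(round(statistic, 2), df, statistic > CRITICAL_VALUE)
```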
86
COMPARABLE PARAMETRIC and
NON PARAMETRIC TESTS
Use                                        Parametric          Non-parametric
To compare two paired samples              Paired 't' test     Wilcoxon signed rank test
for equality of means
To compare two independent samples         Unpaired 't' test   Mann-Whitney test
for equality of means
To compare more than two samples           ANOVA               Kruskal-Wallis test,
for equality of means                                          chi-square test
87
Miscellaneous:
• Fisher's exact test:
A test for the presence of an association between
categorical variables.
Used when the numbers involved are too small to
permit the use of a chi-square test.
• Friedman's test:
A non-parametric equivalent of the analysis of
variance.
Permits the analysis of an unreplicated randomized
design.
88
• Kruskal-Wallis test:
A non-parametric test.
Used to compare the medians of several independent
samples.
It is the non-parametric equivalent of the one-way ANOVA.
• McNemar's test:
A variant of the chi-square test, used when the data is paired.
89
MANN –WHITNEY TEST
• It is a non-parametric test equivalent to the unpaired 't' test.
• Used to compare the medians of 2 independent
samples.
Ex: comparison of cervical vertebral maturation index at
the pre- and post-treatment stages of 2 different groups.

Pre-treatment
Stage   Group 1                        Group 2
        (Fixed functional appliance)   (Premolar extraction)
1        2                              2
2        1                              1
3       14                              8
4        5                             13
5        1                              1
6        0                              0
90
Post-treatment
Stage   Group 1                        Group 2
        (Fixed functional appliance)   (Premolar extraction)
1        1                              1
2        0                              0
3        2                              0
4       11                             10
5        4                              9
6        5                              5
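As a rough illustration, the Mann-Whitney U statistics can be computed directly from frequency counts like those tabulated above. The sketch assumes the first column is the maturation stage (1 to 6) and the other two columns are numbers of patients per group; it counts, over all pairs of one patient from each group, how often the Group 1 patient has the higher stage, with ties counted as one half. It stops at U and does not compute a p-value.

```python
def mann_whitney_u(counts_1, counts_2):
    """U statistics for two groups given as frequency counts per ordered
    category (index 0 is the lowest stage). Ties count as one half."""
    u_1 = 0.0
    for i, c1 in enumerate(counts_1):
        for j, c2 in enumerate(counts_2):
            if i > j:
                u_1 += c1 * c2
            elif i == j:
                u_1 += 0.5 * c1 * c2
    u_2 = sum(counts_1) * sum(counts_2) - u_1
    return u_1, u_2

# Post-treatment counts per stage (1 to 6), as read from the table above.
group_1 = [1, 0, 2, 11, 4, 5]
group_2 = [1, 0, 0, 10, 9, 5]
u_1, u_2 = mann_whitney_u(group_1, group_2)
print(u_1, u_2)
```

The smaller of the two U values is the one usually compared with tabulated critical values.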
91
DISCRIMINANT FUNCTION ANALYSIS
It is used to classify cases into the values of a
categorical dependent variable, usually a dichotomy. If
discriminant function analysis is effective for a
set of data, the classification table of correct
and incorrect estimates will yield a high
percentage correct.
92
META ANALYSIS
• Gene Glass (1976) coined the term ‘meta-analysis’.

• Meta-analysis is a statistical technique for combining the
findings from independent studies.
• Meta-analysis is most often used to assess the clinical
effectiveness of healthcare interventions; it does this by
combining data from two or more randomised control trials.
• Meta-analysis of trials provides a precise estimate of
treatment effect, giving due weight to the size of the
different studies included.
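One common way of "giving due weight" to each study is inverse-variance weighting, where larger studies with smaller standard errors count for more. The slides do not say which weighting scheme is meant, so the fixed-effect sketch below is an assumption, and the effect sizes and standard errors are hypothetical.

```python
import math

def fixed_effect_pool(effects, std_errors):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# Hypothetical treatment effects (mean differences) from three trials.
effects = [1.2, 0.8, 1.0]
std_errors = [0.5, 0.2, 0.4]   # smaller SE = larger study = more weight
pooled, pooled_se = fixed_effect_pool(effects, std_errors)
print(round(pooled, 3), round(pooled_se, 3))
```

The pooled standard error is smaller than that of any single study, which is the sense in which combining trials gives a more precise estimate.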
93
AIMS OF META ANALYSIS
• Good meta-analyses aim for complete coverage of all

relevant studies.
• Look for the presence of heterogeneity.
• Explore the robustness of the main findings using
sensitivity analysis.
94
Systematic reviews
• Systematic review methodology is at the heart of
meta-analysis. This stresses the need to take great care to find
all the relevant studies (published and unpublished), and
to assess the methodological quality of the design and
execution of each study.
• The objective of systematic reviews is to present a
balanced and impartial summary of the existing research,
enabling decisions on effectiveness to be based on all
relevant studies of adequate quality.
• Frequently, such systematic reviews provide a quantitative
(statistical) estimate of net benefit aggregated over all the
included studies.
• Such an approach is termed a meta-analysis.
95
BENEFITS OF META-ANALYSES
Overcoming bias: The danger of unsystematic (or narrative)
reviews, with only a portion of relevant studies included, is that
they could introduce bias.
Meta-analysis carried out on a rigorous systematic review can
overcome these dangers – offering an unbiased synthesis of the
empirical data.
Precision: The precision with which the size of any effect can be
estimated depends to a large extent on the number of patients
studied.
• Meta-analyses, which combine the results from many trials, have
more power to detect small but clinically significant effects.
• Furthermore, they give more precise estimates of the size of any
effects uncovered.
• This may be especially important when an investigator is looking
for beneficial (or deleterious) effects in specific subgroups of
patients.
96
TRANSPARENCY: Good meta-analyses should allow readers to
determine for themselves the reasonableness of the decisions taken
and their likely impact on the final estimate of effect size.
REQUIREMENTS FOR META-ANALYSIS: The main requirement
for a worthwhile meta-analysis is a well-executed systematic
review.
• However competent the meta-analysis, if the original review was
partial, flawed or otherwise unsystematic, then the meta-analysis may
provide a precise quantitative estimate that is simply wrong.
• The main requirement of systematic review is easier to state than to
execute: a complete, unbiased collection of all the original studies of
acceptable quality that examine the same therapeutic question.
• There are many checklists for the assessment of the quality of
systematic reviews; however, the QUOROM statement (quality of
reporting of meta-analyses) is particularly recommended.
97
CONDUCTING META-ANALYSES
• Location of studies
• Quality assessment
• Calculating effect sizes
• Checking for publication bias
• Sensitivity analyses
• Presenting the findings
98
HETEROGENEITY
• A major concern about meta-analyses is the extent to
which they mix studies that are different in kind
(heterogeneity).
• One widely quoted definition of meta-analysis is: ‘a
statistical analysis which combines or integrates the
results of several independent clinical trials
considered by the analyst to be “combinable”’.
• The key difficulty lies in deciding which sets of studies are
‘combinable’. Clearly, to get a precise answer to a specific
question, only studies that exactly match the question
should be included.
99
LIMITATIONS
• Assessments of the quality of systematic reviews and
meta-analyses often identify limitations in the ways they
were conducted.
• Flaws in meta-analysis can arise through failure to
conduct any of the steps in data collection, analysis
and presentation described above.
100
YANCEY’S 10 RULES
-Evaluating Scientific literature
1. Be skeptical
2. Look for the data
3. Differentiate between descriptive and inferential

statistics
4. Question the validity of descriptive statistics
5. Question the validity of inferential statistics

101
AJODO-1996 559-563
YANCEY’S 10 RULES
-Evaluating Scientific literature
6. Be wary of correlation and regression analyses

7. Identify the population sampled
8. Identify the type of study
9. Look for the indices of probable magnitude of
treatment effects
10.Draw your own conclusions.
102
AJODO-1996 559-563
SOFTWARE: STATISTICAL PACKAGES
SPSS (Statistical Package for the Social Sciences):
Developed in 1968; the current version is IBM SPSS,
popular in health sciences.
MINITAB: It provides a complete set of statistical
tools, including descriptive statistics, hypothesis tests and
normality tests.
103
EPI INFO: Statistical software for epidemiology developed by the
Centers for Disease Control and Prevention in Atlanta.
It is used worldwide for the rapid assessment of disease and
helps in public health education.
MICROSOFT EXCEL: A spreadsheet program that, like SPSS, provides an
extensive range of statistical functions, which perform
calculations from the basic mean, median and mode to the more
complex statistical distributions and probability tests.
104
Conclusion
Biostatistics is an integral part of research protocols. In any
field of investigation, the data obtained is subsequently
classified, analyzed and tested for accuracy by statistical
methods. A good understanding of biostatistics can improve
clinical decision making, program evaluation and research
with regard to both individuals and groups of people.
105
REFERENCES
• Park K. Park's Textbook of Preventive and Social Medicine.
23rd edition.
• Soben Peter. Essentials of Public Health Dentistry. 4th
edition.
• Bonita R, Beaglehole R, Kjellstrom T. Basic Epidemiology.
2nd edition.
• Mahajan BK. Methods in Biostatistics. 6th edition.
• Rao K Visweswara. Biostatistics: A Manual of Statistical
Methods for Use in Health, Nutrition and Anthropology. 2nd
edition. 2007.
• Determination of sample size. Journal of Orthodontics
2004; Vol 31: 107-114.
• Yancey JM. Ten rules for reading clinical research reports.
AJODO May 1996; Vol 109.
• Ozgur C, Kleckner M, Li Y. Selection of Statistical Software
for Solving Big Data Problems: A Guide for Businesses,
Students, and Universities. SAGE Open. May 12, 2015.
• Crombie IK (Professor of Public Health, University of
Dundee), Davies HTO (Professor of Health Care Policy and
Management, University of St Andrews). What is
meta-analysis?
107
THANK YOU
108
