• Test1 10%
• Total 100%
01/15/2024 Biostatistics 2
References
• 1. Lecture Notes: Biostatistics for Health Science Students, by Getu Degu and Fasil Tessema.
https://www.cartercenter.org/resources/pdfs/health/ephti/library/lecture_notes/env_health_science_students/ln_biostat_hss_final.pdf
• 2. Introductory Biostatistics for the Health Sciences. John Wiley & Sons, Inc., 2003.
https://onlinelibrary.wiley.com/doi/book/10.1002/0471458716
• 3. Knapp RG & Miller MC III. Clinical Epidemiology and Biostatistics. Williams and Wilkins, Baltimore, Maryland, 1992.
Chapter one
Introduction to Biostatistics
OBJECTIVES
meanings:
- counts or measurements
• E.g., the population statistics of Ethiopia include: total population
size, age-sex distribution of the population, fertility rates, birth
rates, death rates, educational status of the population, etc.
- collecting,
- organizing,
- analyzing, and
• Unless otherwise explicitly indicated, keep this last
meaning of the term in mind whenever we talk about
statistics.
• NB: All statistics are numerical data, but not all numerical data are
statistics; numerical data qualify only if they satisfy all the essential
characteristics of statistics.
Statistics can be classified as
(ii) Descriptive Statistics
5. Statistics and statistical jargon pervade the medical and public health literature
Limitations of statistics:
1. It deals with only those subjects of inquiry that are capable of being
quantitatively measured and numerically expressed.
that measured and takes any value for individual person or object .
• There are four levels of measurement scales, and, therefore, four types
of data.
1. Nominal data
2. Ordinal data
3. Interval data
4. Ratio data
Nominal Data
Represent categories or names. No implied order.
Each item must fit into one and only one category
• Sex—Male, Female
Ordinal Data
Represents categories or names
• But ratios cannot be meaningfully interpreted, because the zero point is arbitrary.
For example, “zero” here does not mean “none”; it only marks a lower value on the scale.
Ratio data
Applies to numerical data that have an absolute zero-point origin.
◦ E.g., length, height, weight, pressure, etc.
Exercise-1
quantitative data….
(ii)Numerical continuous data
◦ Observations theoretically lie along a continuum.
◦ Restricting factor is the degree of accuracy of the measuring
instrument.
◦ Most clinical measurements are numerical continuous,
e.g., blood pressure, serum cholesterol level, weight, height, etc.
Chapter two
Data collection, organization and presentation
2. Understand the criteria for selecting a method to organize and present data
3. Identify the different methods of data collection and the criteria that we use to select
4. Define a questionnaire, identify the different parts of a questionnaire and indicate
Data collection
• Before any statistical work can be done data must be collected.
• Depending on the type of variable and the objective of the study different data
• Observation
Language barriers
Expense
Invasion of privacy
Suspicion
sources.
1) Primary data: These are those data, which are collected by the
study.
• Such data are primary for the agency that initially collected them and
secondary for anyone else who uses them for their own purposes.
• May contain errors, because the purpose for which they were collected
differs from the purpose of the user of these secondary data
• May be biased, and the sample size may be inadequate
• 3) The probability that the method will provide good coverage, i.e.
Types of Questions
• Interviews and self-administered questionnaires are probably the most commonly
used research data collection techniques
• Depending on how questions are asked and recorded we can distinguish two major
possibilities
- Open-ended questions, and closed questions
Steps in Designing a Questionnaire
• Step 1: CONTENT: Take your objectives and variables as your starting point.
Step 5: TRANSLATION
Methods of data organization & presentation
• Raw data on their own cannot show useful information
• Importance of statistical tables
1. Tabulated data can be easily understood.
2. Leave a lasting impression.
3. Facilitate comparison.
4. Make the summation of items and the detection of omissions and errors easier.
5. Avoid unnecessary repetition and detail.
What do you feel?
• Example: Consider the following narrative
description!
“Seven (4.8%) of the smokers and 28 (9.5%) of the chewers started the habit
when they were primary school students….. Forty six (31.7 %) of the
lifetime smokers and 134 (45.6%) of the lifetime chewers started smoking
and chewing when they were senior secondary
school students. Thirty seven (25.5 %) of the ever smokers and 52 (17.7 %)
of the lifetime chewers started smoking and chewing during
their first year at college.”
Construction of tables
1. Tables should be as simple as possible.
2. Tables should be self-explanatory. For that purpose
• The title should be clear and to the point (a good title answers: what? when?
where? how classified?) and should be placed above the table.
• Each row and column should be labeled.
• Numerical entries of zero should be written explicitly rather than indicated by
a dash. Dashes are reserved for missing or unobserved data.
• Totals should be shown either in the top row and the first column or in the last
row and last column.
3. If data are not original, their source should be given in a footnote
Parts of a statistical table
Title
Caption
Stub
Body
Head note [optional]
Foot note [optional]
Source [optional]
Types of Tables
illiterate 150
literate 70
2. A two-way table, or cross-tabulation, shows two characteristics
and is formed when either the caption or the stub is divided into two
or more parts.
Example: HIV sero-status vs. sex
          positive   negative
male          3          5
female        4          7
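As a minimal sketch of how such a cross-tabulation can be built from individual records, the Python snippet below counts (sex, sero-status) pairs and adds the row and column totals. The record list is rebuilt from the counts in the example, and the names `records`, `cells`, `row_totals` and `col_totals` are illustrative assumptions.

```python
from collections import Counter

# Individual records rebuilt from the example counts (illustrative only).
records = (
    [("male", "positive")] * 3 + [("male", "negative")] * 5
    + [("female", "positive")] * 4 + [("female", "negative")] * 7
)

# Cell counts: one entry per (sex, sero-status) combination.
cells = Counter(records)

# Marginal totals for the stub (rows) and the caption (columns).
row_totals = Counter(sex for sex, _ in records)
col_totals = Counter(status for _, status in records)

for sex in ("male", "female"):
    print(sex, cells[(sex, "positive")], cells[(sex, "negative")], row_totals[sex])
print("total", col_totals["positive"], col_totals["negative"], len(records))
```

Each record falls into exactly one cell, so the row totals and the column totals both add up to the number of records.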
3. A higher-order table is one in which three or more characteristics
are represented.
Table 4. Distribution of health professionals by sex and residence.
Profession   Sex       Residence        Total
                       Urban   Rural
Doctors      Male         8      35        43
             Female       2      16        18
Nurses       Male        46      36        82
             Female      23      77       100
Total                    79     164       243
2. Graphical presentation of data
Why graphs?
◦ Attraction.
◦ Help in deriving required information in less time and with ease.
◦ Facilitate comparison.
◦ Reveal unsuspected patterns.
◦ Greater memorizing value
Limitations of Diagrammatic Representation
2. The title is usually placed below the graph and should again answer the questions: what?
where? when? how classified?
4. Axis labels should be placed so that they read from the left side and from the bottom.
6. The numerical scale representing frequency must start at zero or a break in the
line should be shown.
Types of Diagrams
◦ line diagram (graph)
◦ bar diagram (graph)
◦ pie diagram (chart)
◦ Histogram
◦ Frequency polygon
◦ Ogive curve
◦ ‘box-and-whisker’ plot
◦ Scatter plots
Line Graph
Used to study how a variable changes with the passage of time.
[Figure: line graph. Response to administration of zidovudine in two groups of AIDS
patients in hospital X, 1999. Y-axis: blood zidovudine concentration; X-axis: time
since administration (min.).]
Bar graph
• Used to present a categorical variable
[Figure: bar graph. Distribution of patients in hospital X by source of referral, 1999.
X-axis: source of referral (other hospital, GP, OPD, casualty, other); bar labels
shown: 769, 623, 256, 161, 97.]
b. Multiple bar graph
c. Component (stacked) bar graph
• If there are different quantities forming the sub-divisions of the totals,
simple bars may be sub-divided in the ratio of the various sub-divisions
to exhibit the relationship of the parts to the whole.
[Figure: component bar graph. Y-axis: percent; X-axis: August, October, December 2003;
components: P. falciparum, P. vivax, mixed.]
Pie chart
It is a circle divided into sectors/sections by calculating the angle at
the center proportional to the quantity of the item being represented.
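The angle calculation can be sketched in a few lines of Python. The percentages are those of the pie chart of causes of death for females in England and Wales, 1989, in this section; pairing 42% with the circulatory system and 13% with the respiratory system is read from the figure layout and should be treated as an assumption. The names `shares` and `angles` are illustrative.

```python
# Percentages read from the cause-of-death pie chart in this section.
# The 42% / 13% pairing is inferred from the figure layout (assumption).
shares = {
    "Circulatory system": 42,
    "Neoplasms": 30,
    "Respiratory system": 13,
    "Others": 8,
    "Digestive system": 4,
    "Injury and poisoning": 3,
}

total = sum(shares.values())

# The angle at the centre is proportional to each item's share of the whole.
angles = {name: value / total * 360 for name, value in shares.items()}

for name, angle in angles.items():
    print(f"{name}: {angle:.1f} degrees")
```

Because the shares cover the whole, the sector angles add up to 360 degrees.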
[Figure: pie chart. Distribution of cause of death for females in England and Wales, 1989:
circulatory system 42%, neoplasms 30%, respiratory system 13%, others 8%,
digestive system 4%, injury and poisoning 3%.]
Histogram
For each class, a bar is erected whose width extends from the lower boundary to the upper
boundary of the class and whose height is determined by the class frequency.
[Figure: histogram. Y-axis: No. of women; X-axis: age group, with class boundaries
14.5-19.5, 19.5-24.5, 24.5-29.5, 29.5-34.5, 34.5-39.5, 39.5-44.5, 44.5-49.5.]
Frequency polygon
Used to present only one numerical continuous variable.
Based on a grouped frequency distribution
Each class represented by its mid-point
The frequency of each class is plotted on the y-axis against the mid-point
of the class
Points representing the mid-points of successive classes are joined by
straight lines
The curve must be extended to the x-axis at each end
The total area under the polygon will be equal to the total area
under the histogram.
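The construction above can be sketched in Python as follows. The classes and frequencies are taken from the table of ages of 25 ICU patients in this section; the variable names are illustrative, and closing the polygon one class width beyond each end is one common way to extend it to the x-axis, assumed here.

```python
# Grouped frequency distribution: (lower limit, upper limit, frequency).
# Values taken from the ICU age table in this section.
classes = [(10, 19, 3), (20, 29, 1), (30, 39, 3), (40, 49, 0),
           (50, 59, 6), (60, 69, 1), (70, 79, 9), (80, 89, 2)]

width = classes[1][0] - classes[0][0]  # class width: 10

# Each class is represented by its mid-point.
points = [((lo + hi) / 2, f) for lo, hi, f in classes]

# Extend the polygon to the x-axis with zero-frequency points
# one class width beyond each end (assumed convention).
points = [(points[0][0] - width, 0)] + points + [(points[-1][0] + width, 0)]

# Area under the polygon (trapezoid rule between successive points).
polygon_area = sum((x2 - x1) * (y1 + y2) / 2
                   for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Area of the histogram: class width times frequency, summed over classes.
histogram_area = sum(width * f for _, _, f in classes)

print(points[0], points[-1])
print(polygon_area, histogram_area)
```

With equal class widths the two areas agree, which is the property stated in the last point above.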
[Figure: frequency polygon; axis label: N1AGEMOTH.]
Ogive curve
cumulative frequency curve
Prepare a graph with the cumulative frequency on the vertical axis and
the true upper-class limits (class boundaries) of the interval scaled
along the X-axis (horizontal axis).
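A minimal Python sketch of the points needed for an ogive is given below, using the age classes and frequencies of the 25 ICU patients tabulated in this section. It assumes ages are recorded in whole years, so that each upper class boundary is the upper limit plus 0.5; the variable names are illustrative.

```python
from itertools import accumulate

# Class limits and frequencies from the ICU age table in this section.
limits = [(10, 19), (20, 29), (30, 39), (40, 49),
          (50, 59), (60, 69), (70, 79), (80, 89)]
freqs = [3, 1, 3, 0, 6, 1, 9, 2]

n = sum(freqs)

# Upper class boundaries: upper limit plus half a unit
# (assumes ages are recorded in whole years).
upper_boundaries = [hi + 0.5 for _, hi in limits]

# Running totals of the frequencies, and the same as percentages of n.
cum_freq = list(accumulate(freqs))
cum_rel_freq = [100 * c / n for c in cum_freq]

# Points of the ogive: (upper class boundary, cumulative frequency).
ogive_points = list(zip(upper_boundaries, cum_freq))

for boundary, c, pct in zip(upper_boundaries, cum_freq, cum_rel_freq):
    print(boundary, c, pct)
```

Plotting `ogive_points` and joining them gives the cumulative frequency curve; the last point always reaches the total number of observations.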
Cumulative Frequency and Cum. Rel. Freq. of Age of 25 ICU Patients
Age group   Frequency   Rel. freq. (%)   Cum. freq.   Cum. rel. freq. (%)
10-19           3            12              3               12
20-29           1             4              4               16
30-39           3            12              7               28
40-49           0             0              7               28
50-59           6            24             13               52
60-69           1             4             14               56
70-79           9            36             23               92
80-89           2             8             25              100
Total          25           100
[Figure: Cumulative frequency of 25 ICU patients.]
Box and whisker plot
• A scatter diagram is constructed by drawing X- and Y-axes.
• Each observation is represented by a point or dot.
[Figure: scatter plot. Y-axis: saturation of bile; X-axis: age.]
3. Frequency distribution
A frequency distribution has two main parts; namely,
gender    frequency
male      40
female    38
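As a small illustration, the gender example can be turned into relative frequencies in Python. The counts come from the table above; reporting percentages rounded to one decimal place is an assumption made for this sketch, and the names `freq` and `rel_freq` are illustrative.

```python
# Frequencies from the gender example above.
freq = {"male": 40, "female": 38}

n = sum(freq.values())

# Relative frequency of each category as a percentage of all observations.
rel_freq = {category: round(100 * count / n, 1) for category, count in freq.items()}

for category in freq:
    print(category, freq[category], rel_freq[category])
```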
Frequency Distributions
Higher-dimensional frequency distribution (FD) tables consist of:
◦ Classes/ categories,
◦ Frequencies of the classes or categories, and
◦ Other pertinent information
Can be:
1. Categorical distributions, or
2. Grouped frequency distributions
1.1. Categorical distribution
Steps to construct a grouped frequency distribution (GFD)
• (1) Choose the number of classes (K):
• K = 1 + 3.322 × log10(n), where n is the number of observations
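The rule for K can be evaluated with a short Python function. This is a sketch only: the function name `sturges_classes` is hypothetical, and rounding the result up to the next whole number is an assumed convention (some texts round to the nearest whole number instead).

```python
import math

def sturges_classes(n: int) -> int:
    """Number of classes K = 1 + 3.322 * log10(n), rounded up (assumed convention)."""
    return math.ceil(1 + 3.322 * math.log10(n))

# 25 observations give 1 + 3.322 * 1.398 = 5.64, rounded up to 6 classes.
print(sturges_classes(25))
print(sturges_classes(100))
```

The result is a guide rather than a fixed requirement; the number of classes can be adjusted to give convenient class limits.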
2. Determine the length or width (W) of the class interval
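A common convention, assumed here because no formula for W is shown in this step, is to divide the range of the data (largest value minus smallest value) by the number of classes K and round up to a convenient unit. The Python sketch below follows that assumption; the function name `class_width` and the list of ages are hypothetical and serve only as an illustration.

```python
import math

def class_width(values, k):
    """W = (largest value - smallest value) / K, rounded up to a whole unit (assumed)."""
    return math.ceil((max(values) - min(values)) / k)

# Hypothetical ages, for illustration only.
ages = [12, 18, 25, 33, 41, 47, 52, 58, 63, 71, 79, 86]

# Range = 86 - 12 = 74; 74 / 8 = 9.25, rounded up to 10.
print(class_width(ages, 8))
```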
3. Determine class limits
• Definition: the smallest and largest values that
go into any class are called Class Limits.
To find the upper limit of the first class, subtract one from the lower limit of the second class.
Then continue to add the class width to this upper limit to find the rest of the upper limits.
Find the boundaries by subtracting 0.5 units from the lower limits and adding 0.5 units to the
upper limits. The boundaries are also halfway between the upper limit of one class and the lower
limit of the next class. Depending on what you're trying to accomplish, it may not be necessary to
find the boundaries.
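The procedure for limits and boundaries can be sketched in Python as below. The sketch assumes whole-number data, so that "one" and "0.5" are one unit and half a unit of measurement; the function names `class_limits` and `class_boundaries` and the starting value of 10 with a width of 10 are illustrative choices, not part of the procedure itself.

```python
def class_limits(lowest, width, k):
    """Return (lower, upper) limits for k classes of whole-number data."""
    lowers = [lowest + i * width for i in range(k)]
    # The upper limit of a class is one unit below the lower limit of the next class.
    return [(lo, lo + width - 1) for lo in lowers]

def class_boundaries(limits, unit=1):
    """Subtract half a unit from lower limits and add half a unit to upper limits."""
    half = unit / 2
    return [(lo - half, hi + half) for lo, hi in limits]

limits = class_limits(lowest=10, width=10, k=8)
boundaries = class_boundaries(limits)

for (lo, hi), (lb, ub) in zip(limits, boundaries):
    print(f"{lo}-{hi}  boundaries {lb}-{ub}")
```

The upper boundary of each class coincides with the lower boundary of the next, halfway between the adjacent limits.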
Determine the true class limits or class boundaries
Limits which are determined mathematically to make an interval of a continuous
variable continuous in both directions.
The true limits are what the tabulated limits would correspond with if one could
measure exactly
Use one more decimal place than that used in the data set
Adding the class width (w) to the lower boundary of a class gives the upper
boundary of that particular class and the lower boundary of the next higher class
Determine the mid-points of classes (class marks or Xc)
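The class mark is conventionally the average of the lower and upper limits of a class (equivalently, of its two boundaries). The Python sketch below applies this to age classes like those of the ICU example in this section; the variable names are illustrative.

```python
# Class limits like those of the ICU age example in this section.
limits = [(10, 19), (20, 29), (30, 39), (40, 49),
          (50, 59), (60, 69), (70, 79), (80, 89)]

# Class mark Xc = (lower limit + upper limit) / 2.
class_marks = [(lo + hi) / 2 for lo, hi in limits]

# The same value results from the class boundaries (limits -/+ 0.5).
marks_from_boundaries = [((lo - 0.5) + (hi + 0.5)) / 2 for lo, hi in limits]

print(class_marks)
```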