
Biostatistics & Research Methodology

2021/22

Block 1 – Lecture 1

Introduction to
Biostatistics & Research
Aktham Osama Abdulazeez, MBChB
Meet Your Instructor
➢ Aktham Osama Abdulazeez
➢ MBChB
College of Medicine,
University of Baghdad
➢ Interests:
➢ Neurosciences
➢ Neurosurgery
➢ Research & Biostatistics
Introduction to Biostatistics

▪ Statistics is a field of study concerned with the collection,
organization, summarization and analysis of data (raw
material).
▪ Statisticians try to interpret and communicate the results
to others.
▪ The tools of statistics are employed in many fields:
business, education, economics, etc.
▪ When the data analyzed are derived from the biological
sciences and medicine, we use the term biostatistics.
Concept of Entities (كينونات)
Entities, Variables and Values

Entity   Variable                       Values
Person   Age                            0-150 years
         Gender (binary, dichotomous)   Male, female
         Ethnicity                      Arab, Kurdish, Turkman, Yezidi, etc.
Car      Model                          Hyundai, KIA, BMW, Saipa
         Manufacture year               2010, 2015, 2021, etc.
         Engine                         V4, V6, V8
         Color                          White, black, yellow, red, etc.
Populations
▪ Population of entities: the largest collection of entities for
which we have an interest at a particular time.
▪ Population of values: the largest collection of values of a
variable.

▪ Populations can be finite or infinite. What is the difference?
Examples of each?

▪ NOTE THAT in research, a population DOES NOT NECESSARILY
refer to people.

▪ A target population is the population under study.


Samples

▪ A sample is a specific subgroup of a population that
you will collect data from.

▪ The size of a sample is always less than the total size
of the population (otherwise it’s not a sample
anymore).
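To make the population/sample distinction concrete, here is a minimal sketch. It assumes simple random sampling without replacement, which is only one possible way of choosing a sample and is not specified in this lecture; the population values are made up for illustration.

```python
import random

# Hypothetical finite population of values: ages of 1,000 people (made-up data).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(0, 100) for _ in range(1000)]

# Draw a simple random sample of 50 values without replacement (an assumption;
# other sampling methods exist).
sample = rng.sample(population, k=50)

# The sample is a subgroup, so it is smaller than the population.
print(len(population), len(sample))
```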
Types of Statistics
Descriptive Statistics
Includes:
• Collection
• Organization
• Summarization of data.
Descriptions of measurements (variables) taken about a group of
people.
Descriptive statistics are used by researchers to report on
populations and samples.

Inferential Statistics
Analysis of data (reaching decisions about a large body of data by
examining only a small number of data).
Variables
• A variable is a characteristic that takes on different
values in different persons, places, or things.

• e.g.:
• Heights of adult males.
• Weights of preschool children.
• Ages of patients seen in a dental clinic.
(Diagram labels: counted; measured)
Variable Scales
Variable Scales: Nominal
▪ It uses names, numbers or other symbols. e.g. males &
females.
▪ The variables are simply “named” or labeled, with no specific
order.
▪ Also called the categorical variable scale, and doesn’t involve
a quantitative value or order.
▪ Dichotomous (binary) variables are nominal variables which
have only two categories or levels. For example, "male or
female", "yes or no", etc.
Variable Scales: Ordinal
▪ Each measurement is assigned to one of a limited number of
categories that are ranked in a graded order (1st, 2nd, 3rd, ...).
▪ Ordinal Scale maintains descriptive qualities along with an
intrinsic order.
▪ Yet, it is void of an origin of scale. Thus, the distance between
categories can’t be calculated.
▪ E.g. Likert scales.
Variable Scales: Ordinal (Likert Scales)
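As an illustration of what an ordinal scale allows, the sketch below ranks responses on a five-point Likert scale. The category labels and the responses are hypothetical, not taken from the lecture. The ranks support sorting and picking a middle response, but the gaps between ranks are not assumed to be equal.

```python
# Hypothetical five-point Likert scale, listed from lowest to highest rank.
LIKERT_ORDER = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
RANK = {label: i for i, label in enumerate(LIKERT_ORDER, start=1)}

# Made-up survey responses.
responses = ["agree", "neutral", "strongly agree", "disagree", "agree"]

# Ordering is meaningful on an ordinal scale, so sorting by rank is valid.
ranked = sorted(responses, key=RANK.get)

# With an odd number of responses, the middle item is the median response.
median_response = ranked[len(ranked) // 2]
print(ranked)
print(median_response)
```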
Variable Scales: Interval
▪ Each measurement is assigned to one of unlimited categories
that are equally spaced with no true zero point.

▪ The main characteristic of this scale is the equidistant
difference between objects.

▪ E.g. Celsius/Fahrenheit temperature scales.


Variable Scales: Ratio

▪ Measurement begins at a true zero point and the scale has
equal intervals.

▪ The best examples of ratio scales are weight, height, and
temperature in Kelvin.
In contrast to interval scales, when working with ratio variables,
the ratio of two measurements has a meaningful interpretation.
E.g.:
• A weight of 4 grams is twice as heavy as a weight of 2 grams.
• However, a temperature of 10 °C should not be
considered twice as hot as 5 °C.
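The two examples above can be checked numerically. This is a minimal sketch: the helper `celsius_to_kelvin` is an illustrative name, and the conversion uses the standard offset of 273.15.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius temperature to Kelvin (a ratio scale with a true zero)."""
    return c + 273.15

# Weight in grams is a ratio scale, so this ratio is meaningful.
weight_ratio = 4 / 2

# Celsius is an interval scale: this arithmetic ratio has no physical meaning.
naive_celsius_ratio = 10 / 5

# On the Kelvin (ratio) scale, the same two temperatures differ by under 2%.
kelvin_ratio = celsius_to_kelvin(10) / celsius_to_kelvin(5)

print(weight_ratio, naive_celsius_ratio, round(kelvin_ratio, 3))
```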
PLEASE NOTE THAT
IN PRACTICE
INTERVAL/RATIO SCALES
ARE MOSTLY TREATED THE SAME
Descriptive Statistical Work Includes

1. Collection of data
2. Organization of data
• Grouping
• Tables
• Frequency distribution (number of occurrences)
• Relative frequency distribution (proportion of occurrences)
• Graphs (histograms, pie charts, bar charts, polygons, etc…)
3. Summarization of data
• Measures of central tendency
• Measures of non-central tendency (centiles)
• Measures of variation
Grouping of Interval/Ratio Data

• Internationally agreed classifications
For example, age is classified into: neonate (first 4 weeks), infant
(<1 year), toddler (second year of life), under 5 children, school age
children (5-9), teenagers (10-15), young adults (16-59), middle age
(60-79), elderly (80+).
• Quantiles (percentiles, terciles and quartiles)
Very useful when no agreed method of classification is available. It
is unbiased and useful in detecting patterns or associations.
• Class intervals & Sturges' rule
• Personal experience
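Grouping by quantiles, one of the methods listed above, can be sketched with Python's standard library. The data values are made up, and `quartile_group` is an illustrative helper. `statistics.quantiles` uses its default "exclusive" method here, so the cut points may differ slightly from those produced by other software.

```python
import bisect
import statistics

# Made-up measurements to be grouped.
values = [12, 15, 22, 23, 25, 31, 38, 42, 45, 47, 51, 68]

# n=4 returns the three cut points that split the data into quartiles.
cuts = statistics.quantiles(values, n=4)

def quartile_group(x):
    """Return 1-4: the quartile group that x falls into."""
    return bisect.bisect_left(cuts, x) + 1

groups = [quartile_group(x) for x in values]
print(cuts)
print(groups)
```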
Grouping of Data by Class Intervals

• To group a set of observations, we select a set of contiguous,
non-overlapping intervals, such that:
1. Each value in the set of observations can be placed in
one, and only one, of the intervals.
2. No single observation should be missed.

• This is used to create meaningful groupings of data that serve
to summarize observations.

• Each such interval is called a CLASS INTERVAL.
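The two requirements above (every value falls in exactly one interval, and no value is missed) can be sketched as follows. The sketch assumes integer-valued data and inclusive bounds; the function names are illustrative, and the example boundaries (start 11, width 10, 7 intervals) are chosen only for demonstration.

```python
def make_class_intervals(start, width, k):
    """Return k contiguous, non-overlapping (lower, upper) integer intervals."""
    return [(start + i * width, start + (i + 1) * width - 1) for i in range(k)]

def find_interval(value, intervals):
    """Return the single interval containing value, or raise if none or several do."""
    matches = [iv for iv in intervals if iv[0] <= value <= iv[1]]
    if len(matches) != 1:
        raise ValueError(f"{value} is not covered by exactly one interval")
    return matches[0]

intervals = make_class_intervals(start=11, width=10, k=7)
print(intervals)
print(find_interval(12, intervals), find_interval(79, intervals))
```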


Calculating the Number of Class Intervals

The number of class intervals should:
▪ Not be too few: because of the loss of important
information.
▪ Not be too many: because of the loss of the needed
summarization.

We can follow Sturges' rule:

k = 1 + 3.322 log n (logarithm to base 10)

k = number of class intervals
n = number of observations in the set
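Sturges' rule, k = 1 + 3.322 log n with a base-10 logarithm, can be sketched in a few lines. The function name `sturges_k` is illustrative; the sample sizes 45 and 57 are the ones used in the exercises of this lecture.

```python
import math

def sturges_k(n):
    """Number of class intervals by Sturges' rule: k = 1 + 3.322 * log10(n)."""
    return 1 + 3.322 * math.log10(n)

# The rule gives a fractional value; it is rounded to a whole number of intervals.
for n in (45, 57):
    k = sturges_k(n)
    print(n, round(k, 2), round(k))
```

The rule is only a guide: the rounded value can be adjusted if a nearby number of intervals gives a more convenient grouping.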
Calculating the Width of Class Intervals

The width of the class intervals should be the same, if possible
(an exception is often practically employed for the first and last
intervals).

W = R / K

W = Width of the class interval
R = Range (largest value – smallest value)
K = Number of class intervals
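The width calculation, W = R / K, can be sketched as below. The function name is illustrative, and the inputs (smallest value 12, largest value 79, 7 intervals) are the figures from Exercise 2 of this lecture.

```python
def class_interval_width(smallest, largest, k):
    """Width of each class interval: range divided by the number of intervals."""
    value_range = largest - smallest  # R = largest value - smallest value
    return value_range / k

w = class_interval_width(smallest=12, largest=79, k=7)

# The raw width is usually rounded to a convenient whole number.
print(round(w, 2), round(w))
```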
Tabular Organization of Data
▪ Frequency distribution: the number of observations falling
into each class interval.
▪ Relative frequency distribution: the proportion of
observations in a particular class interval relative to the total
number of observations.
▪ Cumulative frequency distribution: calculated by adding the
number of observations in each class interval to the number of
observations in all the class intervals above it.
▪ Cumulative relative frequency distribution: calculated by
adding the relative frequency in each class interval to the
relative frequencies of all the class intervals above it.
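The four distributions defined above can be derived from the class frequencies alone. This is a minimal sketch: `frequency_table` is an illustrative name and the three counts are made up.

```python
from itertools import accumulate

def frequency_table(frequencies):
    """Return rows of (frequency, relative, cumulative, cumulative relative)."""
    n = sum(frequencies)                      # total number of observations
    relative = [f / n for f in frequencies]   # proportion in each class interval
    cumulative = list(accumulate(frequencies))  # running total of frequencies
    cumulative_relative = [c / n for c in cumulative]
    return list(zip(frequencies, relative, cumulative, cumulative_relative))

# Hypothetical counts for three class intervals.
table = frequency_table([2, 5, 3])
for row in table:
    print(row)
```

The last row's cumulative frequency always equals the number of observations, and its cumulative relative frequency always equals 1, which makes a quick check on a hand-built table.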
Exercise
▪ This table shows the number of hours
that 45 patients slept following the
administration of a certain anesthetic.
▪ Construct a table showing:
1. Frequency
2. Relative frequency
3. Cumulative frequency
4. Cumulative relative frequency distribution.
Exercise
▪ Number of class intervals:
K = 1 + 3.322 log n
= 1 + 3.322 log 45
= 1 + 3.322 × 1.653
= 6.49 ≈ 6
▪ Width of class intervals:
Exercise 2
▪ This table shows the weight (in
ounces) of malignant tumors removed
from the abdomen of 57 patients.
▪ Construct a table showing:
1. Frequency
2. Relative frequency
3. Cumulative frequency
4. Cumulative relative frequency distribution.
Exercise 2
▪ Number of class intervals:
K = 1 + 3.322 log n
= 1 + 3.322 log 57
= 1 + 3.322 × 1.755
= 6.8 ≈ 7
▪ Width of class intervals:
W = R / K = (79 − 12) / 7 = 9.57 ≈ 10

Interval   Frequency   Rel. Freq. (%)   Cum. Freq.   Cum. Rel. Freq. (%)
11-20      5           8.77             5            8.77
21-30      21          36.84            26           45.61
31-40      8           14.04            34           59.65
41-50      14          24.56            48           84.21
51-60      3           5.26             51           89.47
61-70      4           7.02             55           96.49
71-80      2           3.51             57           100.00
Totals     57          100.00
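The Exercise 2 table can be recomputed from its class frequencies as a check. The sketch below uses the interval labels and frequencies given in the exercise. Percentages are rounded to two decimals, so they may differ in the last digit from hand-truncated figures.

```python
from itertools import accumulate

# Class intervals and frequencies from Exercise 2 (tumor weights in ounces).
intervals = ["11-20", "21-30", "31-40", "41-50", "51-60", "61-70", "71-80"]
frequencies = [5, 21, 8, 14, 3, 4, 2]

n = sum(frequencies)                         # total number of patients
cum = list(accumulate(frequencies))          # cumulative frequency
rel_pct = [round(100 * f / n, 2) for f in frequencies]   # relative frequency, %
cum_rel_pct = [round(100 * c / n, 2) for c in cum]       # cumulative relative, %

for row in zip(intervals, frequencies, rel_pct, cum, cum_rel_pct):
    print(*row, sep="\t")
```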
Review Time
Previous Years

Question 1

A population which we are interested in studying is called:

A. Infinite population
B. Target population
C. Final population
D. Standard population
Previous Years

Question 2

The WHO grades astrocytomas (a group of brain tumors) into 4
grades, labeled as grades I, II, III, and IV. The variable scale of
this grading system is:

A. Nominal
B. Ordinal
C. Numerical
D. Interval/ratio
Previous Years

Question 3

Which of the following data represents an interval/ratio scale?

A. Ability to speak English
B. Smoking status recorded as: nonsmoker, current smoker,
ex-smoker
C. Gender
D. Height
Previous Years

Question 4

Which of the following variables can be described as being
binary?

A. Age
B. Gender
C. Number of children
D. Blood group
Thank You
Questions? Ask in the group or note it down for our next
live session!
Hope to see you next time ☺