
HEMA 204- Hematology

Biostatistics
Lesson 2
Sampling, Numerical Presentation of Data

N. Yassine Biostatistics 1
Objectives

• Introduce the sampling distribution
• Organize qualitative data into a frequency table
• Organize quantitative data into a frequency distribution
• Present qualitative data graphically using bar charts and pie charts
• Represent a frequency distribution for quantitative data
• Apply the 2^K rule

Why Sampling?
• When the population is too large
• When the time required to study the entire population is too
long
• When the cost of studying the entire population is too high
• When it is very difficult or even impossible to survey, measure, or
locate the individuals or objects of the population
• When the objects of the population must be destroyed when
examined
• When the sample results are adequate estimates of the
population parameters

Sampling Methods
• Samples must be representative and randomly selected
• Simple random sampling: Each member, or each group of a
particular size, of the population has the same probability of
being selected
• Systematic sampling: Sample members are selected
according to a starting point and a fixed interval whose
length is the quotient obtained when dividing the population
size by the required sample size.
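As an illustration of simple random sampling, the following is a minimal Python sketch. The population of numbered IDs, the function name, and the seed are assumptions made for the example, not part of the lesson.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # seed is only for reproducibility
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population: members identified by the numbers 1..100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=1)
print(sample)
```

Because the draw is without replacement, no member can appear twice in the sample.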

Systematic sampling
Himaya NGO is seeking to form a systematic sample of 500
volunteers from a population of 5000.
1. Calculate and fix the sampling interval (the number of
elements in the population divided by the number of
elements needed for the sample).
The interval is N/n = 5000/500 = 10.
2. Choose a random starting point between 1 and the
sampling interval.
For example, start at the first observation.
3. Lastly, repeat the sampling interval to choose subsequent
elements.
Hence, they select every 10th person in the population to
build the sample systematically.
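The three steps above can be sketched in Python as follows. This is a minimal sketch that assumes the population size is an exact multiple of the sample size, as in the example (5000/500); the function name and the numbered volunteer list are hypothetical.

```python
import random

def systematic_sample(population, n, seed=None):
    """Select every k-th member after a random start, where k = N/n."""
    N = len(population)
    interval = N // n                    # step 1: sampling interval N/n
    rng = random.Random(seed)
    start = rng.randint(1, interval)     # step 2: random start in 1..interval
    # step 3: repeat the interval (positions are 1-based, hence i - 1)
    return [population[i - 1] for i in range(start, N + 1, interval)][:n]

# Hypothetical list of 5000 volunteers identified by number.
volunteers = list(range(1, 5001))
chosen = systematic_sample(volunteers, 500, seed=0)
print(len(chosen), chosen[:5])
```

Whatever starting point is drawn, consecutive selected members are exactly one interval (10 positions) apart.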
Sampling Methods
• Stratified sampling: Population members are grouped
according to certain characteristics (age group, gender,
location, …). From each group, or stratum, a certain number of
members are selected based on simple random or systematic
sampling.
- Divide the population into various strata
- Select a random sample from each stratum
- The sample size from each stratum should be proportional
to the stratum's share of the population

Stratified Sampling
The distribution of employees in a certain
company:
male, full-time: 90
male, part-time: 18
female, full-time: 9
female, part-time: 63
A stratified sample of 40 staff is required, according to the
above categories.
Stratified Sampling
% male, full-time = 90 ÷ 180 = 50%
% male, part-time = 18 ÷ 180 = 10%
% female, full-time = 9 ÷ 180 = 5%
% female, part-time = 63 ÷ 180 = 35%
Hence, the stratified sample of 40 should be composed as follows:
50% (20 individuals) should be male, full-time.
10% (4 individuals) should be male, part-time.
5% (2 individuals) should be female, full-time.
35% (14 individuals) should be female, part-time.
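The proportional allocation above can be reproduced with a short Python sketch. The function name is an assumption. Rounding works out exactly for these numbers, but in general rounded allocations may not add up to the requested sample size and would need adjusting.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Allocate the sample to each stratum in proportion to its population share."""
    total = sum(strata_sizes.values())
    return {
        name: round(sample_size * size / total)
        for name, size in strata_sizes.items()
    }

# Stratum sizes taken from the company example (180 employees in total).
strata = {
    "male, full-time": 90,
    "male, part-time": 18,
    "female, full-time": 9,
    "female, part-time": 63,
}
allocation = proportional_allocation(strata, 40)
for name, count in allocation.items():
    print(f"{name}: {count}")
```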
Sampling Methods
• Cluster sampling: When a list of population members is not available
or the population members are located within dispersed geographical
locations (clusters), the two-stage process of cluster sampling can be
deployed. First, a sample of clusters is selected. Second, cluster
members are sampled.
Example: An organization aims to survey the performance of Moodle across
Germany. They can divide the entire country's population into cities
(clusters), then select the towns with the highest population and keep
those using Moodle.
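The two-stage process can be sketched in Python as below. This sketch assumes both stages use simple random selection, which follows the definition rather than the population-based filtering in the Moodle example. The cluster names, member labels, and function name are hypothetical.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """clusters maps a cluster name to the list of its members."""
    rng = random.Random(seed)
    # Stage 1: randomly select whole clusters.
    selected = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: randomly sample members inside each selected cluster.
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in selected
    }

# Hypothetical clusters: four cities with 20 labelled residents each.
cities = {f"city_{c}": [f"{c}{i}" for i in range(1, 21)] for c in "ABCD"}
survey = two_stage_cluster_sample(cities, n_clusters=2, n_per_cluster=5, seed=42)
print(survey)
```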

Frequency Tables and Graphical Presentation:
Qualitative Data

Qualitative data can be summarized using

• Frequency and Relative Frequency tables

• Bar Graph

• Pie Chart

Frequency Table
A frequency table is a grouping of qualitative data into mutually
exclusive and collectively exhaustive classes showing the number of
observations in each class.

• Mutually exclusive: classes are disjoint (they do not overlap).
• Collectively exhaustive: classes contain all observations.
• Class frequency: number of observations in the class.
• Class relative frequency: class frequency divided by the total
number of observations.

Bar Graph

• A graph that represents the classes of qualitative data using
rectangles or bars.
• Classes are shown on the horizontal axis and frequencies on the
vertical axis.
• The class frequencies are proportional to the heights of the bars.

Pie Chart

• A graph that represents the proportion or percentage frequency of
each qualitative data class.
• Each class is represented by a piece of the pie whose angle is the
relative frequency multiplied by 360°.

Example 2.1
The following are the blood groups of a sample of 50 patients admitted
to the ER in the last 24 hours:

O-, O+, O-, O+, A+, O-, O+, B-, O-, AB+, O+, B-, A+, O+, O+, A+, O+, O+,
O+, B-, AB+, O+, A+, O-, O+, O-, O+, O-, A+, O-, O+, O-, AB+, O+, B-, A+,
O-, O+, A+, O+, O-, O+, AB+, O+, O+, A+, O-, B-, AB+, O-

Frequency Table

Blood Group   Frequency   Relative Frequency   % Frequency
O+            19          19/50 = 0.38         38.0 %
O-            13          13/50 = 0.26         26.0 %
AB+           5           5/50 = 0.10          10.0 %
A+            8           8/50 = 0.16          16.0 %
B-            5           5/50 = 0.10          10.0 %
Total         50          1.00                 100.0 %
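The frequency table for Example 2.1 can be checked with a short Python sketch that tallies the 50 observations. The variable names are assumptions. Classes are printed from most to least frequent, so the row order may differ from the table.

```python
from collections import Counter

# The 50 blood groups listed in Example 2.1.
raw = (
    "O-, O+, O-, O+, A+, O-, O+, B-, O-, AB+, O+, B-, A+, O+, O+, A+, O+, O+, "
    "O+, B-, AB+, O+, A+, O-, O+, O-, O+, O-, A+, O-, O+, O-, AB+, O+, B-, A+, "
    "O-, O+, A+, O+, O-, O+, AB+, O+, O+, A+, O-, B-, AB+, O-"
)
blood_groups = raw.split(", ")

counts = Counter(blood_groups)   # class frequency for each blood group
total = len(blood_groups)        # total number of observations

print(f"{'Group':<6}{'Freq':>4}{'Rel.':>8}{'%':>8}")
for group, freq in counts.most_common():
    print(f"{group:<6}{freq:>4}{freq / total:>8.2f}{100 * freq / total:>8.1f} %")
```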

Bar Chart
[Figure: bar chart titled "Blood Group". Horizontal axis: blood group (O+, O-, AB+, A+, B-); vertical axis: frequency, with bar labels 19, 13, 5, 8, 5.]

Pie Chart

[Figure: pie chart titled "Blood-Groups". Slices: O+ 38%, O- 26%, AB+ 10%, A+ 16%, B- 10%.]
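Each slice angle in the pie chart is the class relative frequency multiplied by 360°. A minimal Python sketch of that calculation for the blood group frequencies, with assumed variable names:

```python
# Class frequencies from the blood group frequency table.
frequencies = {"O+": 19, "O-": 13, "AB+": 5, "A+": 8, "B-": 5}
total = sum(frequencies.values())

# Angle of each slice = relative frequency x 360 degrees.
angles = {group: 360 * freq / total for group, freq in frequencies.items()}

for group, angle in angles.items():
    print(f"{group:<4}{angle:6.1f} degrees")
```

The angles add up to 360°, since the relative frequencies add up to 1.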

Frequency Distributions and Graphical
Presentation: Quantitative Data
Quantitative data can be summarized using:

• Frequency and Relative Frequency distribution
• Histogram
• Frequency Polygon Graph
• Cumulative Frequency Polygon Graph (Ogive)
• Stem and Leaf Plot
• Box Plot

Frequency Distributions: Quantitative Data
• Frequency distribution: A grouping of quantitative data into mutually
exclusive and collectively exhaustive classes showing the number of
observations in each class.
• Class lower limit: smallest value in the class (included).
• Class upper limit: upper bound of the class (excluded).
• Class frequency: number of observations that are greater than or
equal to the lower limit and less than the upper limit of the class.
• Class relative frequency: class frequency divided by the total.
• Class interval: difference between its upper and lower limits.
• Class midpoint: average of its upper and lower limits.
Example 2.2
• Refer to Example 1.1. Construct a frequency distribution that
represents letter grades of A, B, C, D, and F.
66 80 73 58 63 77
72 52 86 65 75 67
67 69 73 70 68 66
80 75 78 67 73 61
66 72 65 71 69 82
83 72 78 74 60 91
81 84 90 73 64 68
87 82 88 76 85 68
68 79 79 68 75 72
65 66 80 59 75 64

Set the individual class limits, frequency table
A frequency distribution representing letter grades must have a class
interval of 10. Since no grade in the data is less than 50, the first class
must have a lower limit of 50. The frequency distribution is:

Class          Frequency   Relative Frequency   % Frequency
50 up to 60    3           0.050                5.0 %
60 up to 70    22          0.367                36.7 %
70 up to 80    21          0.350                35.0 %
80 up to 90    12          0.200                20.0 %
90 up to 100   2           0.033                3.3 %
Total          60          1.000                100.0 %

Example 2.2
Consider the first class.
• Class interval: 60-50 = 10 for 1st class.
• Midpoint: (50+60)/2 = 55.
• Frequency: 3, indicating that three grades are between 50
(included) and 60 (excluded).
• Relative frequency: 0.05 indicating that the proportion of grades
between 50 and 60 is 0.05.
• Percent frequency: 5% indicating that the percentage of grades
between 50 and 60 is 5%.
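The grade distribution of Example 2.2 can be verified with a short Python sketch. It applies the convention that the lower limit is included and the upper limit is excluded. The variable names are assumptions.

```python
# The 60 grades listed in Example 2.2.
grades = [
    66, 80, 73, 58, 63, 77, 72, 52, 86, 65, 75, 67,
    67, 69, 73, 70, 68, 66, 80, 75, 78, 67, 73, 61,
    66, 72, 65, 71, 69, 82, 83, 72, 78, 74, 60, 91,
    81, 84, 90, 73, 64, 68, 87, 82, 88, 76, 85, 68,
    68, 79, 79, 68, 75, 72, 65, 66, 80, 59, 75, 64,
]
interval = 10
n = len(grades)

# Frequency of each class: lower limit included, upper limit excluded.
frequency = {
    lower: sum(lower <= g < lower + interval for g in grades)
    for lower in range(50, 100, interval)
}

for lower, freq in frequency.items():
    midpoint = (lower + lower + interval) / 2
    print(f"{lower} up to {lower + interval}: f = {freq}, "
          f"rel = {freq / n:.3f}, midpoint = {midpoint}")
```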

2^K Rule

• Step 1: Determine the number of classes K
Use the total number of observations n to find the smallest
positive whole number K satisfying 2^K > n.
• Step 2: Determine the class interval I
Let H = highest value and L = lowest value. Choose I so that
I ≥ (H − L)/K.
Round up to some convenient number.

2^K Rule

• Step 3: Set the individual class limits
- Choose the first lower limit to be less than or equal to L.
- Use K and I to find the upper and lower limits of each class.
- Make sure that the classes are collectively exhaustive.
• Step 4: Construct the frequency distribution.
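Steps 1 to 3 can be sketched in Python as below. The function names are assumptions. Rounding the interval up to a "convenient number" is a judgment call, so the sketch only returns the minimum interval (H − L)/K and leaves that choice to the reader.

```python
def number_of_classes(n):
    """Step 1: smallest positive whole number K with 2**K > n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

def minimum_interval(values, k):
    """Step 2: lower bound (H - L)/K for the class interval I."""
    return (max(values) - min(values)) / k

def class_limits(first_lower, interval, k):
    """Step 3: (lower, upper) limits of K consecutive classes."""
    return [
        (first_lower + i * interval, first_lower + (i + 1) * interval)
        for i in range(k)
    ]

print(number_of_classes(30))    # 2**5 = 32 is the first power of 2 above 30
print(class_limits(10, 5, 5))
```

After choosing the limits, check that the last upper limit exceeds the highest value H so that the classes are collectively exhaustive.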

Example 2.3
• A random sample of 30 students was selected and the number of
hours each student studied last week was recorded.
• 15, 23, 19, 15, 18, 23, 14, 20, 13, 20, 17, 12, 20, 13, 31, 18, 29, 17, 18,
11, 26, 15, 15, 17, 33, 31, 23, 12, 27, 16.
a) Organize the data into a frequency distribution.
b) Find the relative frequency distribution and the percent frequency
distribution.
c) Find the cumulative frequency, relative cumulative frequency, and
percent cumulative frequency distributions.

Ex. 2.3: Number of Classes
• To find the smallest K satisfying 2^K > 30, consider the powers of 2:

K      1   2   3   4    5
2^K    2   4   8   16   32

• The smallest value satisfying 2^K > n is K = 5.
• Number of classes selected is 5.

Ex. 2.3: Class Interval and Class Limits
• I ≥ (H − L)/K
• I ≥ (33 − 11)/5
• I ≥ 4.4
• Choose I = 5 and use a class interval of 5.
• Smallest observation is 11.
• Choose the first lower limit to be 10.

Frequency Distribution
Study Hours   Frequency   Relative Frequency   % Frequency
10 up to 15   6           0.2                  20 %
15 up to 20   12          0.4                  40 %
20 up to 25   6           0.2                  20 %
25 up to 30   3           0.1                  10 %
30 up to 35   3           0.1                  10 %
Total         30          1.0                  100 %
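The frequency distribution for Example 2.3 can be reproduced with a short Python sketch that uses the class limits chosen above (first lower limit 10, interval 5, five classes). The variable names are assumptions.

```python
# Study hours of the 30 students in Example 2.3.
hours = [
    15, 23, 19, 15, 18, 23, 14, 20, 13, 20, 17, 12, 20, 13, 31,
    18, 29, 17, 18, 11, 26, 15, 15, 17, 33, 31, 23, 12, 27, 16,
]
first_lower, interval, k = 10, 5, 5
n = len(hours)

rows = []
for i in range(k):
    lower = first_lower + i * interval
    upper = lower + interval
    # Lower limit included, upper limit excluded.
    freq = sum(lower <= h < upper for h in hours)
    rows.append((lower, upper, freq, freq / n))

for lower, upper, freq, rel in rows:
    print(f"{lower} up to {upper}: {freq:>2}  {rel:.1f}  {100 * rel:.0f} %")
```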

Cumulative Frequency Distribution
Study Hours   Cumulative Frequency   Relative Cumulative Frequency   % Cumulative Frequency
10 up to 15   6                      0.2                             20 %
15 up to 20   18                     0.6                             60 %
20 up to 25   24                     0.8                             80 %
25 up to 30   27                     0.9                             90 %
30 up to 35   30                     1.0                             100 %
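Cumulative frequencies are running totals of the class frequencies. A minimal Python sketch, starting from the class frequencies of the study hours distribution (6, 12, 6, 3, 3) and using assumed variable names:

```python
from itertools import accumulate

# Class frequencies for the classes 10-15, 15-20, 20-25, 25-30, 30-35.
frequencies = [6, 12, 6, 3, 3]

cumulative = list(accumulate(frequencies))   # running totals
n = cumulative[-1]                           # last total equals the sample size
relative_cumulative = [c / n for c in cumulative]
percent_cumulative = [100 * r for r in relative_cumulative]

for c, r, p in zip(cumulative, relative_cumulative, percent_cumulative):
    print(f"{c:>2}  {r:.1f}  {p:.0f} %")
```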
