FREQUENCY DISTRIBUTIONS
PSYCH-BC 212 Psychological Statistics
Lectures in the Chapter
Lecture 03 – Frequency Distributions and Frequency Distribution Tables
Target Course Learning Outcomes
• Apply the underlying principles of statistical techniques
• Interpret statistical data
Key Features of this Chapter
Calculating fractions, decimals, and percentages; distinguishing scales of measurement; identifying discrete and continuous
variables; creating charts in Microsoft Office® Excel
Summarize data sets into frequency tables and frequency graphs; create tables and charts using Microsoft Office® Excel
and other electronic tools; interpret data based on frequency tables and graphs
Manage descriptive data; draw conclusions from frequency tables and graphs; appreciate the role of frequency tables and
graphs in psychological research
Lecture 03: Frequency Distributions and Frequency Distribution Tables
1. Create frequency distribution tables
A frequency distribution is an organized tabulation of the number of individuals located in each category on the scale
of measurement.
The expression ∑𝑓 is the sum of all frequencies in the frequency distribution table. Equivalently, it is equal to the sample
size or population size. That is, ∑𝑓 = 𝑛 or ∑𝑓 = 𝑁, depending on whether the data were taken from a sample or the
population.
𝑿 𝒇
10 1
9 1
8 0
7 4
6 2
5 4
4 3
∑𝑓 = 15
To obtain ∑𝑓, simply add all entries under the 𝑓 column.
∑𝑋𝑓
The expression ∑𝑋𝑓 is the sum of all scores in the raw data. Equivalently, it is the summation of the 𝑋𝑓 column in an
(extended) frequency distribution table. In some books, ∑𝑋 is used instead of ∑𝑋𝑓. Note that these two refer to the
same quantity.
𝑿 𝒇 𝑿𝒇
10 1 10
9 1 9
8 0 0
7 4 28
6 2 12
5 4 20
4 3 12
∑𝑋𝑓 = 91
To obtain ∑𝑋𝑓, add a column 𝑋𝑓 to the frequency distribution table which contains the products of 𝑋 and 𝑓. Then, add
all the entries under this new column.
∑𝑋²𝑓
The expression ∑𝑋²𝑓 is the sum of the squares of all scores in the raw data. Equivalently, it is the summation of the 𝑋²𝑓
column in an (extended) frequency distribution table.
𝑿 𝒇 𝑿² 𝑿²𝒇
10 1 100 100
9 1 81 81
8 0 64 0
7 4 49 196
6 2 36 72
5 4 25 100
4 3 16 48
∑𝑋²𝑓 = 597
To obtain ∑𝑋²𝑓, add columns 𝑋² and 𝑋²𝑓 to the frequency distribution table, where the 𝑋²𝑓 column contains the products
of 𝑋² and 𝑓. Then, add all the entries under the 𝑋²𝑓 column.
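The three sums can also be checked with a few lines of code. The following is a minimal Python sketch, assuming the example frequency table is stored as a dictionary; the name `freq_table` is ours and is not part of the lecture.

```python
# Example frequency table: score X -> frequency f
freq_table = {10: 1, 9: 1, 8: 0, 7: 4, 6: 2, 5: 4, 4: 3}

sum_f = sum(freq_table.values())                          # sum of the f column
sum_xf = sum(x * f for x, f in freq_table.items())        # sum of the Xf column
sum_x2f = sum(x ** 2 * f for x, f in freq_table.items())  # sum of the X^2 f column

print(sum_f, sum_xf, sum_x2f)  # 15 91 597
```

Each generator expression mirrors one column of the extended table: the product is formed row by row, and then the column is added up.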
Using the raw data below,
• Create a frequency distribution table; and,
• Calculate ∑𝑓, ∑𝑋𝑓, and ∑𝑋²𝑓.
3 4 5 4 3
7 6 3 3 6
3 5 5 5 3
6 8 4 2 4
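One way to check your answer to this exercise is to let a short script do the tallying. The sketch below is only an illustration in Python, using `collections.Counter` from the standard library; the variable names are ours.

```python
from collections import Counter

# Raw data from the exercise
raw = [3, 4, 5, 4, 3,
       7, 6, 3, 3, 6,
       3, 5, 5, 5, 3,
       6, 8, 4, 2, 4]

counts = Counter(raw)

# One row per score, from the highest down to the lowest,
# keeping any score in between that has a frequency of 0
table = [(x, counts.get(x, 0)) for x in range(max(raw), min(raw) - 1, -1)]

print("X  f")
for x, f in table:
    print(f"{x}  {f}")

n = sum(f for _, f in table)                # sum of f
sum_xf = sum(x * f for x, f in table)       # sum of Xf
sum_x2f = sum(x * x * f for x, f in table)  # sum of X^2 f
print(n, sum_xf, sum_x2f)
```

As a sanity check, ∑𝑋𝑓 computed from the table should equal the plain sum of the raw scores, and ∑𝑓 should equal the number of raw scores.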
The proportion (also called relative frequency), denoted 𝑃, of a value or category is the ratio of the frequency of the
value or category to the sample or population size.
𝑿 𝒇 𝑷
10 1 0.07
9 1 0.07
8 0 0.00
7 4 0.27
6 2 0.13
5 4 0.27
4 3 0.20
𝑛 = 15
To obtain the proportion or relative frequency, we first need the sample or population size 𝑛 = ∑𝑓. Then, divide the
frequency of the category or value by the sample or population size. In other words,
𝑃 = 𝑓/𝑛.
The 𝑃 column should total to 1 (although due to rounding, only a close value might be obtained).
We will agree to round off proportions to the second decimal place.
Percentages
The percentage of a category or value is its frequency expressed as the number of parts per hundred of the sample or
population size. Equivalently, it is 100 times the proportion or relative frequency.
𝑿 𝒇 %
10 1 6.7
9 1 6.7
8 0 0.0
7 4 26.7
6 2 13.3
5 4 26.7
4 3 20.0
𝑛 = 15
To obtain the percentage, we first need the sample or population size 𝑛 = ∑𝑓. Then, divide the frequency of the category
or value by the sample or population size and multiply the result by 100%. In other words,
percentage = (𝑓/𝑛) × 100%.
The percentage (%) column should total to 100 (although due to rounding, only a close value might be obtained).
As a rule, we round off percentages to at most two decimal places, using whichever precision is more convenient.
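The proportion and percentage columns can be reproduced in code. The following Python sketch assumes the same example frequency table, stored in a dictionary named `freq_table` (our name), and rounds proportions to two decimal places and percentages to one, as in the tables.

```python
# Example frequency table: score X -> frequency f
freq_table = {10: 1, 9: 1, 8: 0, 7: 4, 6: 2, 5: 4, 4: 3}
n = sum(freq_table.values())  # sample size, n = sum of f

# P = f / n, and percentage = (f / n) * 100
proportions = {x: round(f / n, 2) for x, f in freq_table.items()}
percentages = {x: round(100 * f / n, 1) for x, f in freq_table.items()}

print("X   f  P     %")
for x, f in freq_table.items():
    print(f"{x:>2}  {f}  {proportions[x]:.2f}  {percentages[x]:.1f}")

# Totals of the rounded columns: close to 1 and 100, but not exact
print(f"{sum(proportions.values()):.2f}  {sum(percentages.values()):.1f}")
```

With this table the rounded columns add up to 1.01 and 100.1, which illustrates why only a close value might be obtained.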
Using the raw data at the left,
• Percentages
Note that the rows of the frequency distribution table are usually arranged in descending order of the 𝑋 column
(although arranging them in ascending order is not wrong and is sometimes done). The cumulative frequency of a
category or value is the total frequency of all categories or values that are less than or equal to this value.
The cumulative percentage of a category or value is the total percentage of all categories or values that are less than
or equal to this value.
• Cumulative percentages
The percentile rank of a particular score is defined as the percentage of individuals in the distribution with scores at or
below the value. When a score is identified by its percentile rank, then the score is called a percentile.
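Cumulative frequencies, cumulative percentages, and percentile ranks are all running totals, so they are easy to sketch in code. The Python example below reuses the frequency table with scores 4 to 10 from earlier in the lecture; the function `percentile_rank` is our own illustrative helper and follows the "at or below" definition given above.

```python
from itertools import accumulate

# Rows in ascending order of X, so running totals count "at or below"
scores = [4, 5, 6, 7, 8, 9, 10]
freqs = [3, 4, 2, 4, 0, 1, 1]
n = sum(freqs)

cum_freq = list(accumulate(freqs))                     # cumulative frequency
cum_pct = [round(100 * cf / n, 2) for cf in cum_freq]  # cumulative percentage

for x, f, cf, cp in zip(scores, freqs, cum_freq, cum_pct):
    print(f"{x:>2}  {f}  {cf:>2}  {cp:6.2f}%")


def percentile_rank(score):
    """Percentage of individuals with scores at or below `score`."""
    at_or_below = sum(f for x, f in zip(scores, freqs) if x <= score)
    return round(100 * at_or_below / n, 2)


print(percentile_rank(7))  # 86.67
```

Here a score of 7 has a percentile rank of 86.67%, because 13 of the 15 individuals scored 7 or lower.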
What is “interpretation”?
Data interpretation is the process of converting data into useful information. It is the step of giving meaning to the
findings of the study. It is where we “make sense” of the data.
How do we interpret?
According to Dates and Schoen (n.d.), the basic way to interpret data is to ask the research question again and generate
the answer based on the data collected or statistics obtained.
Example
Here’s a grouped frequency distribution table:
𝑋 𝑓
90 – 100 9
80 – 89 2
70 – 79 2
60 – 69 3
50 – 59 2
40 – 49 1
30 – 39 1
Grouped Frequency Distribution Tables (GFDT)
What is a GFDT?
A grouped frequency distribution table is a frequency distribution table wherein we are counting groups of scores instead
of individual values.
Class intervals
Instead of individual values, the rows of a grouped frequency distribution table are composed of class intervals, which are
ranges of values of a predefined length.
The highest score in a class interval is called the upper bound; the lowest score is called the lower bound. The number of
values (equal to upper bound – lower bound + 1) is the width of the interval.
General Guidelines in Producing GFDTs
1 Keep about 10 class intervals.
If there are more class intervals, the grouped frequency distribution table would be cumbersome. If there are fewer, you
would lose information.

22 rows
𝑋 𝑓
99 to 100 1
97 to 98 0
95 to 96 0
93 to 94 2
91 to 92 1
89 to 90 2
87 to 88 2
85 to 86 4
83 to 84 5
81 to 82 2
79 to 80 4
77 to 78 4
75 to 76 6
73 to 74 4
71 to 72 5
69 to 70 1
67 to 68 1
65 to 66 1
63 to 64 2
61 to 62 2
59 to 60 0
57 to 58 1

9 rows
𝑋 𝑓
96 to 100 1
91 to 95 3
86 to 90 5
81 to 85 10
76 to 80 13
71 to 75 10
66 to 70 2
61 to 65 5
56 to 60 1

5 rows
𝑋 𝑓
91 to 100 4
81 to 90 15
71 to 80 23
61 to 70 7
51 to 60 1
CI length: 5
𝑋 𝑓
96 to 100 1
91 to 95 3
86 to 90 5
81 to 85 10
76 to 80 13
71 to 75 10
66 to 70 2
61 to 65 5
56 to 60 1

CI length: 6
𝑋 𝑓
95 to 100 1
89 to 94 5
83 to 88 11
77 to 82 10
71 to 76 15
65 to 70 3
59 to 64 4
53 to 58 1
If the width of each class interval is 5, then either the lower or upper bounds should be a multiple of 5 like 5,
10, 15, 20, etc. It is more usual to find this guideline applied for lower bounds.
All intervals should be of the same width. They should cover the range of scores completely with no gaps and
no overlaps so that any particular score belongs in exactly one interval.
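The equal-width and "no gaps, no overlaps" guidelines can be checked mechanically. The sketch below is a hypothetical Python helper (`check_intervals` is our name, not a standard function) that takes class intervals as (lower bound, upper bound) pairs. The `good` list uses the width-5 intervals from the guidelines above, while the `bad` list is an invented counterexample.

```python
def check_intervals(intervals):
    """Return True if all intervals share one width and tile the range
    with no gaps and no overlaps. `intervals` is a list of
    (lower bound, upper bound) pairs in any order."""
    # Width of an interval = upper bound - lower bound + 1
    widths = {upper - lower + 1 for lower, upper in intervals}
    same_width = len(widths) == 1

    # After sorting, each interval must start right after the previous one ends
    ordered = sorted(intervals)
    contiguous = all(nxt[0] == cur[1] + 1
                     for cur, nxt in zip(ordered, ordered[1:]))
    return same_width and contiguous


good = [(96, 100), (91, 95), (86, 90), (81, 85), (76, 80),
        (71, 75), (66, 70), (61, 65), (56, 60)]
bad = [(90, 100), (80, 90), (70, 80)]  # invented: 80 and 90 belong to two intervals

print(check_intervals(good), check_intervals(bad))  # True False
```

In the `bad` list a score of 80 or 90 would fall into two intervals at once, which is exactly what the guideline rules out.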
Example
33 25 25 33 47 43 32 48 47 59
50 36 48 20 48 20 63 45 51 22
33 29 43 10 64 47 42 45 47 36
28 28 30 50 45 42 29 43 38 44
38 52 38 37 30 41 61 55 53 36
Step 1
Find the number of ungrouped rows, which is given by
𝑅 = 𝐻 − 𝐿 + 1
where 𝐻 is the highest score and 𝐿 is the lowest score. This is in order to decide whether grouping is really needed.
Usually, when this is greater than 15, the data needs to be grouped.
The highest score is 64. The lowest is 10. So, the range is
𝑅 = 𝐻 − 𝐿 + 1
= 64 − 10 + 1
= 55
Step 2
Step 3
Construct the table with 𝐺 rows. The lower bound of the bottom class interval should be the highest multiple of the
width that is less than or equal to the lowest score in the data set.
Once you have the lower bound, the upper bound of this class interval is simply
𝑈 = 𝐿 + 𝑤 − 1
where 𝑈 is the upper bound; 𝑤 is the width; and 𝐿 is the lower bound.
So if 10 is the lower bound, then the upper bound is
10 + 5 − 1 = 14
The next class interval would have a lower bound of 15 and an upper bound of
15 + 5 − 1 = 19
and so on.
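The construction described in Steps 1 and 3 can be sketched in code. The Python example below uses the 50 scores of this example and a width of 5, the width used in the worked arithmetic above. It is only one possible illustration, and the variable names are ours.

```python
# The 50 scores from the example
data = [33, 25, 25, 33, 47, 43, 32, 48, 47, 59,
        50, 36, 48, 20, 48, 20, 63, 45, 51, 22,
        33, 29, 43, 10, 64, 47, 42, 45, 47, 36,
        28, 28, 30, 50, 45, 42, 29, 43, 38, 44,
        38, 52, 38, 37, 30, 41, 61, 55, 53, 36]

w = 5  # width of each class interval, as in the worked example
highest, lowest = max(data), min(data)

# Step 1: number of ungrouped rows, R = H - L + 1
R = highest - lowest + 1
print("R =", R)  # R = 55, which is greater than 15, so we group

# Step 3: bottom lower bound is the highest multiple of w that is <= lowest score
lower = (lowest // w) * w

rows = []
while lower <= highest:
    upper = lower + w - 1                        # U = L + w - 1
    f = sum(lower <= x <= upper for x in data)   # scores inside the interval
    rows.append((lower, upper, f))
    lower += w

# Print from the highest interval down, as is usual for these tables
print("X  f")
for lo, up, f in reversed(rows):
    print(f"{lo} to {up}  {f}")
```

The counting line plays the same role as a two-criterion count in a spreadsheet: a score is tallied when it is greater than or equal to the lower bound and less than or equal to the upper bound.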
Example
COUNTIF: Counts the number of cells that satisfy a particular criterion in a range
COUNTIFS: Counts the number of cells that satisfy multiple criteria in multiple ranges of cells
"<" and ">": “less than” and “greater than”
"<=" and ">=": “less than or equal to” and “greater than or equal to”
&: Concatenates text
33 25 25 33 47 43 32 48 47 59
50 36 48 20 48 20 63 45 51 22
33 29 43 10 64 47 42 45 47 36
28 28 30 50 45 42 29 43 38 44
38 52 38 37 30 41 61 55 53 36
▪ Gravetter, F. J., Wallnau, L. B., Forzano, L. A. B., &
Witnauer, J. E. (2020). Essentials of statistics for
the behavioral sciences. Cengage Learning.
Interval and ratio data are usually continuous although they may also occur
in discrete form.
Histogram
Graphs for Interval and Ratio Data:
There are two choices for graphing interval and ratio data:
• Frequency Polygon – a line-graph-like chart in which the midpoints
of the bins are connected by line segments
Frequency Polygon
To produce a histogram:
Notes
1. The column bars are called bins.
2. About 10 bins are recommended.
[Figure: Bar Graph and Histogram]
Online Histogram Generators
Easy Histogram Maker
https://www.socscistatistics.com/descriptive/histograms/
Histogram maker – Statistics Kingdom
https://www.statskingdom.com/histogram-maker.html
Google Sheets
sheets.google.com
Example
Use Microsoft Office Excel to generate a histogram of this height data.
To produce a frequency polygon:
Step 3. Starting at zero frequency at the nearest point to the left of the lowest 𝑥-value, connect the dots by drawing
lines, and end at zero frequency at the nearest point to the right of the highest 𝑥-value. (Frequency polygons start at
zero and end at zero.)
Notes
1. For class intervals, plot the points at the midpoint.
Frequency polygons are sometimes confused with line graphs. Note that a frequency polygon is a graph of an
independent variable against frequency, while a line graph shows the relationship between an independent variable
and a dependent variable. In other words, a frequency polygon is a type of line graph (if we consider frequency as a
dependent variable), but not all line graphs are frequency polygons.
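The points of a frequency polygon can be worked out before any drawing is done. The Python sketch below borrows the five-row grouped table (51 to 60 up to 91 to 100) from the GFDT guidelines as sample data. It reads "nearest point" as the midpoint of the neighbouring empty class interval, which is our assumption.

```python
# Grouped table as (lower bound, upper bound, frequency), lowest interval first
rows = [(51, 60, 1), (61, 70, 7), (71, 80, 23), (81, 90, 15), (91, 100, 4)]

width = rows[0][1] - rows[0][0] + 1  # upper bound - lower bound + 1

# One point per class interval, plotted at the midpoint
points = [((lo + up) / 2, f) for lo, up, f in rows]

# Close the polygon: zero frequency one interval to the left and to the right
first_mid, last_mid = points[0][0], points[-1][0]
polygon = [(first_mid - width, 0)] + points + [(last_mid + width, 0)]

for x, f in polygon:
    print(x, f)
```

Connecting these points in order with line segments gives a polygon that starts at zero and ends at zero.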
Online Frequency Polygon Generator
Easy Frequency Polygon Maker
https://www.socscistatistics.com/descriptive/polygon/
Example
Use Microsoft Office Excel to generate a frequency polygon of this height data.
There is no one-click feature in Excel to generate frequency polygons. Instead, we follow the following steps:
Skewed Distributions
Skewed distributions are those in which scores tend to pile up at one end of the scale. They imply an “imbalance”
between low and high scores.
Positively skewed distributions
Positively skewed distributions imply that a larger part of the population have lower scores. Mathematically,
MODE < MEDIAN < MEAN. A usual example of this is when a quiz is so difficult that most students got low scores.
Negatively skewed distributions
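The typical ordering of the mode, median, and mean in skewed distributions can be illustrated with a quick calculation. The Python sketch below uses two invented sets of ten quiz scores (they are not from the lecture) and the standard `statistics` module. In real data the ordering is a common pattern rather than a guarantee.

```python
from statistics import mean, median, mode

# Invented quiz scores, for illustration only
positively_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 8, 10]  # most scores are low
negatively_skewed = [1, 3, 6, 7, 8, 8, 9, 9, 9, 10]  # most scores are high

for name, scores in [("positive", positively_skewed),
                     ("negative", negatively_skewed)]:
    print(name, "mode:", mode(scores),
          "median:", median(scores), "mean:", mean(scores))
```

In the first set the few high scores pull the mean above the median and the mode (mode 2, median 3, mean 4). In the second set the few low scores pull the mean below them (mean 7, median 8, mode 9).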