PORT1
Introduction
Now that you have access to multiple forms of data from standardized and
teacher-made tests, you can organize and summarize them so that they provide
useful information. Data can be organized in tables, such as a frequency
distribution, and presented visually through graphs and charts, such as a
histogram, frequency polygon, or scatter plot.
Statistical methods are used to summarize and describe data. One way to
summarize a set of scores is to look at the measures of central tendency (mean,
median, and mode). Another is to describe how much the scores differ from one
another by examining measures of variability (range and standard deviation).
Frequency Distribution
In statistics, the term “frequency” refers to the number of times each score or event
occurs. We are usually interested in frequency as it relates to the number of
students obtaining each score on a test. One way to record frequencies is in a
frequency distribution table, where each score is listed in a column on the left side
and the frequency with which it occurred is listed on the right. While a frequency
table helps to organize data, it does not provide a great deal of descriptive
information about the scores. Frequency distributions provide the initial organization
and information that is the starting point for many other statistical methods.
Mr. Walker wants to examine the general performance trends among his
eighth-grade language arts students in order to evaluate student learning and his
own instruction. He gives a mid-term language arts exam and obtains the following
scores for his students:
95 91 100 96 92 91 87 84 70 65 96 65 56 86 43 65 22 40 93
To evaluate those scores, he creates a frequency distribution table to organize the
test data. The obtained scores are listed in order from highest to lowest.
Frequency Distribution
Score   Tally   Frequency
100     |       1
96      ||      2
95      |       1
93      |       1
92      |       1
91      ||      2
87      |       1
86      |       1
84      |       1
70      |       1
65      |||     3
56      |       1
43      |       1
40      |       1
22      |       1
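The tallying step above can be sketched in a few lines of Python. This is only an illustration of the idea, not part of Mr. Walker's procedure; the variable names are chosen here for readability.

```python
from collections import Counter

# Mid-term scores from the example above.
scores = [95, 91, 100, 96, 92, 91, 87, 84, 70, 65,
          96, 65, 56, 86, 43, 65, 22, 40, 93]

# Count how many times each score occurs.
frequency = Counter(scores)

# List the obtained scores from highest to lowest with tally marks.
for score in sorted(frequency, reverse=True):
    count = frequency[score]
    print(f"{score:>5}  {'|' * count:<5}  {count}")
```

Each printed row has the score, its tally marks, and the frequency, in the same order as the table.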
The data from a frequency table can be displayed graphically. A graph provides a
visual display of the distribution, which gives us another view of the summarized
data. Earlier, we used scatter plots to represent graphically the relationship
between two different test scores, and we learned that we could describe in
general terms the direction and strength of that relationship by visually
examining the scores as they were arranged in a graph. Other types of graphs
include histograms and frequency polygons.
A histogram is a bar graph of scores from a frequency table. The horizontal x-axis
represents the scores on the test, and the vertical y-axis represents the frequencies.
The frequencies are plotted as bars.
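A rough text rendering of such a histogram, using the mid-term scores from the frequency table, might look like the sketch below. The 10-point interval width and the choice to group a score of 100 with the 90s are assumptions made for this illustration. The bars are drawn horizontally for convenience, so the axes are swapped relative to the description above.

```python
from collections import Counter

scores = [95, 91, 100, 96, 92, 91, 87, 84, 70, 65,
          96, 65, 56, 86, 43, 65, 22, 40, 93]

BIN_WIDTH = 10   # assumed interval width; the text does not specify one
TOP_BIN = 90     # assumed: a score of 100 is grouped into the 90-100 interval


def bin_start(score):
    """Return the lower bound of the interval a score falls into."""
    return min(score // BIN_WIDTH * BIN_WIDTH, TOP_BIN)


# Frequency of scores per interval, keyed by the interval's lower bound.
bins = Counter(bin_start(s) for s in scores)

# One bar per interval; the bar length is the frequency.
for start in range(20, TOP_BIN + 1, BIN_WIDTH):
    end = 100 if start == TOP_BIN else start + BIN_WIDTH - 1
    print(f"{start:>3}-{end:<3} {'#' * bins[start]} ({bins[start]})")
```

Grouping scores into intervals keeps the picture compact; plotting one bar per distinct score would work the same way with `BIN_WIDTH = 1`.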
One way to summarize your data is to look at the measures of central tendency:
mean, median, and mode.
Mean
The mean is the arithmetic average of the scores: the sum of all scores divided
by the number of scores.
Median
The median is the point in the distribution that splits the scores into two
equal groups. It is also known as the midpoint of the distribution, or the
50th percentile. To calculate the median, organize the raw scores in rank
order. When the number of scores is odd, the median is the middle score.
When the number of scores is even, the median is the average of the two
middle scores.
Mode
The mode is the most frequently occurring score in a distribution. No
mathematical calculations are needed for the mode: once the data are
organized in a frequency distribution, the mode can be identified. A
distribution may have more than one mode if two or more scores share the
highest frequency. A set of scores with two modes is called bimodal; one
with more than two is called multimodal.
The following is a step-by-step demonstration of calculating the measures of
central tendency. Mr. Walker creates a table to display the range of scores
obtained by his students:
Calculating Mean, Median, and Mode
100 96 96 95 93 92 91 91 87 86 84 70 65 65 65 56 43 40 22
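As a quick check on the three measures of central tendency, the following sketch applies Python's standard `statistics` module to the 19 mid-term scores. It illustrates the definitions only and does not reproduce the original worked table.

```python
import statistics

scores = [95, 91, 100, 96, 92, 91, 87, 84, 70, 65,
          96, 65, 56, 86, 43, 65, 22, 40, 93]

# Mean: sum of the scores divided by the number of scores.
mean = statistics.mean(scores)

# Median: with 19 scores (an odd count), the 10th score in rank order.
median = statistics.median(scores)

# Mode(s): every score that shares the highest frequency.
modes = statistics.multimode(scores)

print(f"Mean:   {mean:.1f}")
print(f"Median: {median}")
print(f"Mode:   {modes}")
```

`multimode` returns a list, so a bimodal or multimodal set of scores would show all of its modes.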
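Representation of the Mean, Median, and Mode on a Curve
In a normal (symmetric, bell-shaped) distribution, the mean is also the
number that divides the scores into two equal groups and is the score that occurs
most frequently, so the mean, median, and mode coincide.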
The shape of the distribution of your test scores can provide useful clues about
your test and your students’ performance. When students’ scores are represented on
a graph, the distribution will often be positively or negatively skewed. When the
tail points to the right, the distribution is positively skewed, which implies
that the most frequent score (the mode) and the median are below the mean. If your
test is very difficult, there may be many low scores and few high ones. The
distribution of scores would then be positively skewed, with a shape similar to the
one depicted below.
When the tail points to the left, the distribution is negatively skewed. In this
distribution there are many high scores and relatively few low scores. Notice that
the mean is influenced by the skewing: it is pulled toward the tail.
The mean can be distorted if some scores (outliers) are extremely different from
the majority of scores for the group. Consequently, when a distribution is skewed
or contains outliers, the median is often the most descriptive measure of central
tendency.
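The sketch below applies these ideas to the mid-term scores. Comparing the mean with the median is used here only as a rough rule of thumb for the direction of skew, not as a formal skewness statistic. The altered data set, in which the lowest score is changed to 0, is hypothetical and serves only to show how an outlier affects each measure.

```python
import statistics

scores = [95, 91, 100, 96, 92, 91, 87, 84, 70, 65,
          96, 65, 56, 86, 43, 65, 22, 40, 93]

mean = statistics.mean(scores)
median = statistics.median(scores)

# Rule of thumb: the mean is pulled toward the tail of the distribution.
if mean < median:
    direction = "negatively skewed (tail toward the low scores)"
elif mean > median:
    direction = "positively skewed (tail toward the high scores)"
else:
    direction = "roughly symmetric"
print(f"mean={mean:.1f}, median={median}: {direction}")

# Hypothetical outlier: suppose the lowest score had been 0 instead of 22.
lowest = min(scores)
altered = [0 if s == lowest else s for s in scores]

print(f"mean changes from {mean:.1f} to {statistics.mean(altered):.1f}")
print(f"median changes from {median} to {statistics.median(altered)}")
```

The more extreme low score moves the mean but leaves the middle of the rank order, and so the median, where it was.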
Indicators of Variability
Variability is the dispersion of the scores within a distribution; in other words,
how varied are the scores? On a given test, a group of students with similar
levels of performance on a specific skill will tend to have scores close to the
mean. Another group with varying levels of performance will have scores that are
widely spread and farther from the mean. Two common measures of variability are
the range and the standard deviation.
Range
The range, R, is the difference between the lowest and the highest scores in a
distribution. The range is easy to compute and interpret, but it only indicates the
difference between the two extreme scores in a set.
If we use the scores from Mr. Walker’s class (above), we would calculate the range
as: Range (R) = the highest score – the lowest score in the distribution.
95 91 100 96 92 91 87 84 70 65 96 65 56 86 43 65 22 40 93
R = 100 – 22 = 78
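A short sketch of the same calculation follows. The second step, dropping the single lowest score, is an illustration added here to show how completely the range depends on the two extreme scores.

```python
scores = [95, 91, 100, 96, 92, 91, 87, 84, 70, 65,
          96, 65, 56, 86, 43, 65, 22, 40, 93]

# Range = highest score - lowest score.
score_range = max(scores) - min(scores)
print(f"R = {max(scores)} - {min(scores)} = {score_range}")

# Illustration: remove only the lowest score and recompute the range.
trimmed = sorted(scores)[1:]
trimmed_range = max(trimmed) - min(trimmed)
print(f"Without the lowest score: R = {trimmed_range}")
```

One student's score changes the range substantially, while the other 18 scores play no part in it.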
Standard Deviation
A more useful statistic than the range is one that shows how widely the scores
are dispersed around the mean. The most common measure of variability is the
standard deviation (SD), a numeric index that describes how far from the mean the
scores in the distribution are located. The formula for the standard deviation is:
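One common form is shown here, written with M for the mean and N for the number of scores. The symbol X for an individual score is an assumption introduced for this illustration. Which variant the text intends is also an assumption: this version divides by N, while some textbooks divide by N - 1 when the scores are treated as a sample from a larger group.

```latex
\[
  SD = \sqrt{\frac{\sum (X - M)^2}{N}}
\]
```

In words: subtract the mean from each score, square each difference, add the squares, divide by the number of scores, and take the square root.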
The higher the standard deviation, the wider the distribution of the scores is around
the mean. This indicates a more heterogeneous or dissimilar spread of raw scores
on a scale. A lower value of the standard deviation indicates a narrower distribution
(more similar or homogeneous) of the raw scores around the mean.
Table 5.1.6
Score   Score - M   (Score - M) squared
100     24.4        595.36
96      20.4        416.16
96      20.4        416.16
95      19.4        376.36
93      17.4        302.76
92      16.4        268.96
91      15.4        237.16
91      15.4        237.16
87      11.4        129.96
86      10.4        108.16
84      8.4         70.56
70      -5.6        31.36
65      -10.6       112.36
65      -10.6       112.36
65      -10.6       112.36
56      -19.6       384.16
43      -32.6       1062.76
40      -35.6       1267.36
22      -53.6       2872.96
M = mean = 75.6
N = number of scores = 19
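The table's columns and the resulting standard deviation can be reproduced with the sketch below. It rounds the mean to one decimal place to match M = 75.6. Since it is an assumption whether the intended formula divides by N or by N - 1, both results are computed.

```python
import math

scores = [95, 91, 100, 96, 92, 91, 87, 84, 70, 65,
          96, 65, 56, 86, 43, 65, 22, 40, 93]

n = len(scores)                  # N = number of scores
m = round(sum(scores) / n, 1)    # M = mean, rounded to match the table

# Table columns: each score, its deviation from M, and the squared deviation.
rows = [(x, x - m, (x - m) ** 2) for x in sorted(scores, reverse=True)]
for x, deviation, squared in rows:
    print(f"{x:>5}  {deviation:>7.1f}  {squared:>9.2f}")

sum_of_squares = sum(squared for _, _, squared in rows)

# Assumption: dividing by N treats the class as the whole group of interest;
# dividing by N - 1 treats it as a sample from a larger group.
sd_population = math.sqrt(sum_of_squares / n)
sd_sample = math.sqrt(sum_of_squares / (n - 1))

print(f"Sum of squared deviations: {sum_of_squares:.2f}")
print(f"SD (divide by N):     {sd_population:.1f}")
print(f"SD (divide by N - 1): {sd_sample:.1f}")
```

With 19 scores the two variants differ only slightly, but the difference grows as the number of scores shrinks.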