O.M. “Although analysis from raw quantitative data is possible (as we saw in the previous
lesson), it is not always practical for large data sets, so it is sometimes
necessary to structure the data before carrying out the calculations for the
analysis. We can structure quantitative data in two ways: ungrouped and
grouped. In this lesson, we will look specifically at how to generate
descriptive statistics from ungrouped data. We will also look at how to interpret
the results generated from analyzing data structured using a stemplot.”
⇒ there are two middle items, at the 20th and 21st ranks.
To locate them, we attach an additional column called the cumulative
frequency.
Number of sessions   Frequency   Cumulative frequency
0                    7           7
1                    1           7 + 1 = 8
2                    1           8 + 1 = 9
3                    3           9 + 3 = 12
4                    5           12 + 5 = 17
5                    8           17 + 8 = 25   (contains the 20th and 21st ranks)
6                    5           30
7                    4           34
8                    3           37
9                    2           39
10                   1           40
Ans: median is 5
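The cumulative frequency lookup above can be sketched in Python. This is only an illustration of the method in the table, not part of the original lesson: the names `sessions`, `freqs`, `cum_freqs` and `value_at_rank` are made up for this example, and the median step assumes an even number of observations, as is the case here.

```python
from itertools import accumulate

# Frequency table from the example: number of sessions and their frequencies
sessions = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
freqs = [7, 1, 1, 3, 5, 8, 5, 4, 3, 2, 1]

# Running total of the frequencies (the cumulative frequency column)
cum_freqs = list(accumulate(freqs))
n = cum_freqs[-1]  # total number of observations


def value_at_rank(rank):
    """Return the data value at a 1-based rank using the cumulative column."""
    for value, cum in zip(sessions, cum_freqs):
        if rank <= cum:
            return value
    raise ValueError("rank exceeds the number of observations")


# n is even here, so the median is the average of the two middle ranks
lower_mid, upper_mid = n // 2, n // 2 + 1
median = (value_at_rank(lower_mid) + value_at_rank(upper_mid)) / 2

print("cumulative frequencies:", cum_freqs)
print("middle ranks:", lower_mid, upper_mid)
print("median:", median)
```

Scanning the cumulative column for the first total that reaches the rank is the same reading-off step done by eye in the table.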
(iii) Mean = $\dfrac{\text{sum of (frequencies} \times \text{data items)}}{\text{sum of frequencies}} = \dfrac{\sum fx}{\sum f}$

Hence, mean = $\dfrac{\sum fx}{\sum f} = \dfrac{182}{40} = 4.55$
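As a quick check of the arithmetic, the mean from a frequency table can be computed as follows. This is a minimal sketch using the frequencies from the swimming-sessions table; the variable names are assumptions made for this example.

```python
# Frequency table from the example: number of sessions and their frequencies
sessions = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
freqs = [7, 1, 1, 3, 5, 8, 5, 4, 3, 2, 1]

# Sum of (frequency x data item) and sum of frequencies
sum_fx = sum(f * x for f, x in zip(freqs, sessions))
sum_f = sum(freqs)

# Mean = sum of fx divided by sum of f
mean = sum_fx / sum_f

print("sum of fx:", sum_fx)
print("sum of f:", sum_f)
print("mean:", mean)
```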
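The upper quartile lies at the $\frac{3}{4}(n + 1)$th rank = $\frac{3}{4}(40 + 1)$th rank = $30\frac{3}{4}$th rank, that is, between the 30th and 31st values.
From the cumulative frequency column, the 30th value is a 6 and the 31st value is a 7. So the upper quartile is
(6 + 7) ÷ 2 = 6.5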
Position of the lower quartile = $\frac{1}{4}(n + 1)$th rank = $\frac{1}{4}(40 + 1)$th rank = $10\frac{1}{4}$th rank
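The quartile positions can be checked with a short Python sketch. It assumes the same convention used for the upper quartile above: when the position falls between two ranks, the two neighbouring values are averaged. Other textbooks interpolate instead, so treat this as one possible convention. The function name `quartile` and the expanded list `data` are made up for this example.

```python
# Frequency table from the example: number of sessions and their frequencies
sessions = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
freqs = [7, 1, 1, 3, 5, 8, 5, 4, 3, 2, 1]

# Expand the table into the full sorted list of 40 observations
data = [x for x, f in zip(sessions, freqs) for _ in range(f)]


def quartile(sorted_values, k):
    """k-th quartile (k = 1, 2, 3) located at the k(n + 1)/4 th rank.

    Assumption: if the position is not a whole number, average the values
    at the ranks just below and just above it.
    """
    n = len(sorted_values)
    position = k * (n + 1) / 4
    lower_rank = int(position)  # whole-number part of the position
    if position == lower_rank:
        return sorted_values[lower_rank - 1]
    # Ranks are 1-based, list indices are 0-based
    return (sorted_values[lower_rank - 1] + sorted_values[lower_rank]) / 2


print("lower quartile position:", (len(data) + 1) / 4)
print("lower quartile:", quartile(data, 1))
print("upper quartile:", quartile(data, 3))
```

For the lower quartile, the 10th and 11th values are read from the same expanded list, so the position of 10¼ translates directly into a value.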
x     f     fx    x − x̄     (x − x̄)²    f(x − x̄)²
0     7     0     −4.55     20.7025     144.9175
1     1     1     −3.55     12.6025     12.6025
2     1     2     −2.55     6.5025      6.5025
3     3     9     −1.55     2.4025      7.2075
4     5     20    −0.55     0.3025      1.5125
5     8     40    0.45      0.2025      1.62
6     5     30    1.45      2.1025      10.5125
7     4     28    2.45      6.0025      24.01
8     3     24    3.45      11.9025     35.7075
9     2     18    4.45      19.8025     39.605
10    1     10    5.45      29.7025     29.7025
∑f = 40                                 ∑f(x − x̄)² = 313.9
Using the results in the table above, we get variance = $\dfrac{\sum f(x - \bar{x})^2}{\sum f} = \dfrac{313.9}{40} \approx 7.85$ (3 s.f.)
Hence, the variance of the swimming sessions is 7.85.
(The value 7.85 indicates a fairly high variability (inconsistency) within the data.)
b) Standard deviation = $\sqrt{\dfrac{\sum f(x - \bar{x})^2}{\sum f}} = \sqrt{\dfrac{313.9}{40}} = \sqrt{7.85} \approx 2.80$ (3 s.f.)
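Hence, the standard deviation of swimming sessions is 2.80.
(The value 2.80 indicates that the numbers of sessions typically lie about 2.8 sessions from the mean of 4.55, which is consistent with the fairly high variability indicated by the variance.)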
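The variance and standard deviation calculations can be reproduced with a few lines of Python. This sketch follows the formula used in the text, dividing by ∑f (the population form) rather than by ∑f − 1; the variable names are assumptions made for this example.

```python
from math import sqrt

# Frequency table from the example: number of sessions and their frequencies
sessions = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
freqs = [7, 1, 1, 3, 5, 8, 5, 4, 3, 2, 1]

sum_f = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, sessions)) / sum_f

# Last column of the table: f times the squared deviation from the mean
weighted_sq_devs = [f * (x - mean) ** 2 for f, x in zip(freqs, sessions)]
sum_sq = sum(weighted_sq_devs)

# Variance divides by the total frequency, as in the worked example
variance = sum_sq / sum_f
std_dev = sqrt(variance)

print("sum of f(x - mean)^2:", round(sum_sq, 4))
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
```

Because the standard deviation is the square root of the variance, it is expressed in the same unit as the data (sessions), which makes it the easier of the two to interpret.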
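[Figure: three distribution curves showing the relative positions of the mean, median and mode for a symmetrical distribution, a positively skewed distribution and a negatively skewed distribution]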
Example 2: The marks of two students as stated in their report books are:
Oliver: 78 82 65 92 83 85 64 75 79
Olivia: 86 75 83 77 78 87 67 95 79
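The back-to-back stemplot for this data is given below:
(Note: A back-to-back stemplot is a variation which enables easy comparison between two sets of data.)

Oliver    Stem    Olivia
   5 4  |  6   |  7
 9 8 5  |  7   |  5 7 8 9
 5 3 2  |  8   |  3 6 7
     2  |  9   |  5

Key: 5 | 6 | 7 represents 65 for Oliver and 67 for Olivia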
(iii) Mean for Oliver’s scores = $\dfrac{64+65+75+78+79+82+83+85+92}{9}$ = 703 ÷ 9 ≈ 78.1

Mean for Olivia’s scores = $\dfrac{67+75+77+78+79+83+86+87+95}{9}$ = 727 ÷ 9 ≈ 80.8
(b) Distribution curve for Oliver’s scores:
Since the mean (78.1) is less than the median (79), the distribution is skewed to the left
(negatively skewed): a few low marks pull the mean down, so the tail of Oliver’s marks
extends towards scores less than 79.
Distribution curve for Olivia’s scores:
Since the mean (80.8) is more than the median (79), the distribution is skewed to the
right (positively skewed): a few high marks pull the mean up, so the tail of Olivia’s marks
extends towards scores more than 79.
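The comparison of mean and median for the two students can be sketched in Python using the standard `statistics` module. The helper `skew_direction` is a hypothetical name made up for this example, and it only applies the mean-versus-median rule used in this lesson; it is not a formal skewness measure.

```python
from statistics import mean, median

# Marks from Example 2
oliver = [78, 82, 65, 92, 83, 85, 64, 75, 79]
olivia = [86, 75, 83, 77, 78, 87, 67, 95, 79]


def skew_direction(marks):
    """Classify skew by comparing the mean with the median (lesson's rule)."""
    m, med = mean(marks), median(marks)
    if m < med:
        return "skewed to the left (negatively skewed)"
    if m > med:
        return "skewed to the right (positively skewed)"
    return "no skew indicated (mean equals median)"


for name, marks in (("Oliver", oliver), ("Olivia", olivia)):
    print(name, "mean:", round(mean(marks), 1), "median:", median(marks))
    print(name, "distribution is", skew_direction(marks))
```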
TAKE-AWAYS
• Ungrouped data refers to data organized into single values of an attribute along with
their respective frequencies.
• The cumulative frequency is the sum of frequencies up to a particular item in the data
set.
• Variance is a measure of the degree of variability (inconsistency) within the data set.
• Standard deviation is a measure of the degree of spread, i.e. how far the data set in
general deviates from a central value (such as the mean).
• If the mean, mode and median are equal, the distribution is described as symmetrical.
• If the mean is greater than the median, the distribution is skewed to the right (positively
skewed).
• If the mean is less than the median, the distribution is skewed to the left (negatively
skewed).