Adv Stat Central Tendency Variations

MEASURES OF CENTRAL TENDENCY
and
VARIABILITY
Prepared by:
DR LEONORA T. DELA CRUZ
Central Tendency
• Central tendency (sometimes called “measures of location,” “central location,” or just “center”) is a way to describe what’s typical for a set of data. It doesn’t tell you specifics about the individual pieces of data, but it does give you an overall picture of what is going on in the entire data set.
• A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set of data.
• Think of a balance scale: the center is the point at which the distribution is in balance.
Measures of Central Tendency
1. Mean
• The sum of the values of all observations in a dataset divided by the number of observations. This is also known as the arithmetic average.
• The mean can be used for both continuous and discrete numeric data.
• The mean cannot be calculated for categorical data, as the values cannot be summed.
• Because the mean includes every value in the distribution, it is influenced by outliers and skewed distributions.
2. Median
• The median is the middle value in a distribution when the values are arranged in ascending or descending order. It divides the distribution in half (50% of observations lie on either side of the median value). In a distribution with an odd number of observations, the median is the middle value; with an even number, it is the mean of the two middle values.
• The median is less affected by outliers and skewed data than the mean, and is usually the preferred measure of central tendency when the distribution is not symmetrical.
• The median cannot be identified for categorical nominal data, as such data cannot be logically ordered.
3. Mode
• The value that occurs most often in the data.
• Can be found for both numerical and categorical (non-numerical) data.
• In some distributions, the mode may not reflect the center of the distribution very well.
• It is also possible for there to be more than one mode for the same distribution of data (bi-modal or multi-modal).
• In some cases, particularly where the data are continuous, the distribution may have no mode at all (i.e. if all values are different).
How does the shape of a distribution influence the
Measures of Central Tendency?
Symmetrical distributions:
When a distribution is symmetrical, the mode, median and
mean are all in the middle of the distribution.
Positively skewed distributions:
The mean tends to be ‘pulled’ toward the right tail of the distribution.
Although there are exceptions to this rule, generally, most of the
values, including the median value, tend to be less than the
mean value.
Negatively skewed distributions:
The mean tends to be ‘pulled’ toward the left tail of the distribution.
Although there are exceptions to this rule, generally, most of the
values, including the median value, tend to be greater than the
mean value.
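The pull of a tail on the mean can be seen with a few numbers. The sketch below is only an illustration: the three data sets are hypothetical (they do not come from the slides) and were chosen so that all three share the same median.

```python
from statistics import mean, median

# Hypothetical data sets, invented only to illustrate the effect of skew.
symmetric = [2, 3, 4, 5, 6]
right_skewed = [2, 3, 4, 5, 26]   # one large value stretches the right tail
left_skewed = [-18, 3, 4, 5, 6]   # one small value stretches the left tail

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(f"{name:>12}: mean = {mean(data):.1f}, median = {median(data)}")
```

In each set the median stays at 4, while the mean moves toward whichever tail holds the extreme value.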
Finding the mean and median of ungrouped data
The position of the median is:
{(n + 1) ÷ 2}th value,
where n is the number of values in a set of data.

Exercises:
Exercise 6 (Raw Data): The ages of 15 randomly selected customers at a local Best Buy are listed below:
23, 21, 29, 24, 31, 21, 27, 23, 24, 32, 33, 19, 24, 21, 31
Determine the mean, median and mode of the data.
Array: 19, 21, 21, 21, 23, 23, 24, 24, 24, 27, 29, 31, 31, 32, 33
Mean: x̄ = 383 ÷ 15 = 25.53
Median: position = (15 + 1) ÷ 2 = 8th score; Mdn = 24
Mode (the value that occurs most): Mo = 21 and 24
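The hand computation above can be checked with Python's standard `statistics` module. This is a minimal sketch, not part of the original exercise; the variable names are illustrative.

```python
from statistics import mean, median, multimode

# Ages of the 15 customers from Exercise 6.
ages = [23, 21, 29, 24, 31, 21, 27, 23, 24, 32, 33, 19, 24, 21, 31]

n = len(ages)
median_position = (n + 1) / 2        # the {(n + 1) / 2}th value in the sorted array

age_mean = mean(ages)                # sum of values divided by n
age_median = median(ages)            # middle value of the sorted data
age_modes = multimode(ages)          # every value tied for the highest count

print(f"n = {n}, median position = {median_position:.0f}")
print(f"mean = {age_mean:.2f}, median = {age_median}, modes = {age_modes}")
```

`multimode` is used rather than `mode` because this data set is bi-modal, and `mode` would report only one of the two values.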
Exercise 7 (Frequency Table): Determine the mean, median and mode of a simple frequency distribution of the retirement age data.
fx: 162, 55, 56, 114, 116, 120
n = 11, Σfx = 623
Mean: x̄ = 623 ÷ 11 = 56.64
Median: position = (11 + 1) ÷ 2 = 6th score; Mdn = 57
Mode (the value that occurs most): Mo = 54
Mean, Median and Mode for Grouped Data
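• Mean: x̄ = Σfx ÷ n, where x is the class midpoint and f is the class frequency.
• Median: Mdn = LB + [(n ÷ 2 – F) ÷ f] × i, where LB is the lower boundary of the median class, F is the cumulative frequency before that class, f is its frequency, and i is the class width.
• Mode: Mo = LB + [(fm – f1) ÷ (2fm – f1 – f2)] × i, where LB is the lower boundary of the modal class, fm is its frequency, f1 and f2 are the frequencies of the classes just before and just after it, and i is the class width.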
Exercise 8 (Grouped Data): A group of university students took part in a sponsored race. The number of laps completed is given in the table below. Use the information to (a) calculate an estimate for the mean number of laps; (b) determine the mode; and (c) solve for the median.

Number of Laps   Frequency (f)   Midpoint (x)   fx     Cumulative Frequency (F)   Positions
1 – 5            2               3              6      2                          1st to 2nd
6 – 10           9               8              72     11                         3rd to 11th
11 – 15          13              13             169    24                         12th to 24th
16 – 20          22              18             396    46                         25th to 46th
21 – 25          17              23             391    63                         47th to 63rd
26 – 30          25              28             700    88                         64th to 88th
31 – 35          2               33             66     90                         89th to 90th
36 – 40          1               38             38     91                         91st
Σf or n = 91     Σfx = 1,838

Mean:
x̄ = 1,838 ÷ 91
x̄ = 20.20

Median: the (91 ÷ 2) = 45.5th score falls in the 16 – 20 class, whose lower boundary is 15.5.
Mdn = 15.5 + [(91 ÷ 2 – 24) ÷ 22] × 5
Mdn = 15.5 + 4.89
Mdn = 20.39

Mode: the modal class is 26 – 30 (f = 25), whose lower boundary is 25.5.
Mo = 25.5 + [(25 – 17) ÷ (2(25) – 17 – 2)] × 5
Mo = 25.5 + 1.29
Mo = 26.79
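The three grouped-data computations for the laps table can be sketched in Python as follows. The function names are illustrative, and the code assumes integer class limits (so boundaries lie 0.5 below and above the limits), equal class widths, and a single modal class, as in this exercise.

```python
# Class limits and frequencies from the laps table in Exercise 8.
classes = [(1, 5), (6, 10), (11, 15), (16, 20), (21, 25), (26, 30), (31, 35), (36, 40)]
freqs = [2, 9, 13, 22, 17, 25, 2, 1]

def grouped_mean(classes, freqs):
    """Estimate the mean as sum(f * midpoint) / n."""
    midpoints = [(lo + hi) / 2 for lo, hi in classes]
    return sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)

def grouped_median(classes, freqs):
    """Mdn = LB + ((n/2 - F) / f) * i, using the class that contains the n/2-th score."""
    target = sum(freqs) / 2
    cum = 0                                  # cumulative frequency before the current class
    for (lo, hi), f in zip(classes, freqs):
        if cum + f >= target:
            return (lo - 0.5) + (target - cum) / f * (hi - lo + 1)
        cum += f

def grouped_mode(classes, freqs):
    """Mo = LB + ((fm - f1) / (2fm - f1 - f2)) * i for the class with the highest frequency."""
    k = max(range(len(freqs)), key=lambda j: freqs[j])
    lo, hi = classes[k]
    fm = freqs[k]
    f1 = freqs[k - 1] if k > 0 else 0                 # frequency of the class before
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0    # frequency of the class after
    return (lo - 0.5) + (fm - f1) / (2 * fm - f1 - f2) * (hi - lo + 1)

laps_mean = grouped_mean(classes, freqs)
laps_median = grouped_median(classes, freqs)
laps_mode = grouped_mode(classes, freqs)
print(f"mean = {laps_mean:.2f}, median = {laps_median:.2f}, mode = {laps_mode:.2f}")
```

All three results are estimates, because grouping discards the individual values inside each class.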
Describing Other Locations in a Distribution
• The median of a distribution splits the data into two equally-sized groups.
• Quartiles are the three values that split a data set into four equal parts. Note that the 'middle' quartile is the median.
Quartiles
Quartiles for Ungrouped Data:
Exercise 9: The following data are marks obtained by 20 students in a test of statistics. Determine Q1, Q2, and Q3.
53 74 82 42 39 20 81 68 58 28
67 54 93 70 30 55 36 38 29 61

Order:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Array: 20 28 29 30 36 38 39 42 53 54 55 58 61 67 68 70 74 81 82 93

Q1 = value of {(20 + 1) ÷ 4}th item = value of 5.25th item
Q1 = 36 + 0.25 (38 – 36) = 36 + 0.25 (2) = 36 + 0.5 = 36.5

Q2 = value of {(20 + 1) ÷ 2}th item = value of 10.5th item
Q2 = 54 + 0.5 (55 – 54) = 54 + 0.5 (1) = 54 + 0.5 = 54.5

Q3 = value of {3(20 + 1) ÷ 4}th item = value of 15.75th item
Q3 = 68 + 0.75 (70 – 68) = 68 + 0.75 (2) = 68 + 1.5 = 69.5
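The position rule used above, k(n + 1) ÷ 4 with linear interpolation between neighbouring items, can be written as a short Python helper. The function name `quartile` is illustrative, and the helper assumes at least three data points so that every position falls inside the array. For comparison, the sketch also calls `statistics.quantiles`, whose default "exclusive" method follows the same (n + 1) rule.

```python
from statistics import quantiles

# Marks of the 20 students from Exercise 9.
marks = [53, 74, 82, 42, 39, 20, 81, 68, 58, 28,
         67, 54, 93, 70, 30, 55, 36, 38, 29, 61]

def quartile(data, k):
    """k-th quartile (k = 1, 2, 3) at position k(n + 1)/4, interpolating linearly."""
    ordered = sorted(data)
    pos = k * (len(ordered) + 1) / 4        # 1-based position, possibly fractional
    whole = int(pos)                        # item just below the position
    frac = pos - whole                      # how far to move toward the next item
    lower = ordered[whole - 1]
    upper = ordered[min(whole, len(ordered) - 1)]
    return lower + frac * (upper - lower)

q1, q2, q3 = (quartile(marks, k) for k in (1, 2, 3))
print(q1, q2, q3)

# Cross-check against the standard library (default method="exclusive").
library_quartiles = quantiles(marks, n=4)
print(library_quartiles)
```

Other software may use a different interpolation convention and give slightly different quartiles for the same data.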
Quartiles for Grouped Data
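• To find the location of the quartiles use:
   q1 = (¼) n for Q1;
   q2 = (½) n for Q2; and
   q3 = (¾) n for Q3
• Formula to determine values for quartiles:
   Qk = LB + [(qk – F) ÷ f] × i,
   where LB is the lower boundary of the class containing the qk-th item, F is the cumulative frequency before that class, f is the frequency of that class, and i is the class width.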

Exercise 10: Shown below are data on the number of births to women by current age. Calculate Q1 and Q3.

Age in Years   Number of Births (f)   Cumulative Frequency (F)
15 – 19        68                     68
20 – 24        191                    259
25 – 29        170                    429
30 – 34        102                    531
35 – 39        29                     560
40 – 44        8                      568
45 – 49        2                      570
n = 570

Q1:
location of q1 = (¼) n = (¼) 570 = 142.5, which falls in the 20 – 24 class (lower boundary 19.5)
Q1 = 19.5 + [(142.5 – 68) ÷ 191] × 5
Q1 = 19.5 + 1.95
Q1 = 21.45

Q3:
location of q3 = (¾) n = (¾) 570 = 427.5, which falls in the 25 – 29 class (lower boundary 24.5)
Q3 = 24.5 + [(427.5 – 259) ÷ 170] × 5
Q3 = 24.5 + 4.96
Q3 = 29.46
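The grouped-quartile steps for the births table can be sketched in Python as below. The function name `grouped_quartile` is illustrative, and the code assumes integer class limits, so each lower boundary is 0.5 below the lower limit and the class width is the upper limit minus the lower limit plus 1.

```python
# Age classes and number of births from Exercise 10.
age_classes = [(15, 19), (20, 24), (25, 29), (30, 34), (35, 39), (40, 44), (45, 49)]
births = [68, 191, 170, 102, 29, 8, 2]

def grouped_quartile(classes, freqs, k):
    """Q_k = LB + ((k*n/4 - F) / f) * i, using the class that contains the k*n/4-th item."""
    location = k * sum(freqs) / 4
    cum = 0                                   # cumulative frequency before the current class
    for (lo, hi), f in zip(classes, freqs):
        if cum + f >= location:
            lower_boundary = lo - 0.5
            width = hi - lo + 1
            return lower_boundary + (location - cum) / f * width
        cum += f

births_q1 = grouped_quartile(age_classes, births, 1)
births_q3 = grouped_quartile(age_classes, births, 3)
print(f"Q1 = {births_q1:.2f}, Q3 = {births_q3:.2f}")
```

Calling the same function with k = 2 gives the grouped median, since Q2 is located at (½) n.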
MEASURES OF VARIABILITY
Variability refers to how spread out a group of data is. It measures how much the
scores differ from each other. Variability is also referred to as dispersion or spread. Data
sets with similar values are said to have little variability, while data sets that have values
that are spread out have high variability.
Groups in semester 2 show more dispersion (or variability in size) than those in semester
1.
Four Frequently used Measures of Variability
1. Range is simply the highest score minus the lowest score. It is easy to
calculate but very much affected by extreme values (the range is not a
resistant measure of variability).
Range of ungrouped data:

R = maximum – minimum
Range of grouped data:
R = upper boundary of the highest interval – lower boundary of the lowest interval
2. The interquartile range (IQR) is the difference between the upper and lower quartiles.
Some texts define it instead as the range of the middle 50% of a set of data, that is, the
difference between the largest and smallest values in that middle half. The IQR is not
affected by extreme values, so it is a resistant measure of variability.
IQR = upper quartile – lower quartile = Q3 – Q1 = 75th percentile – 25th percentile.

3. The standard deviation is a measure that summarizes the amount by which every
value within a dataset varies from the mean. Effectively it indicates how tightly the
values in the dataset are bunched around the mean value. When the values in a
dataset are pretty tightly bunched together the standard deviation is small. When
the values are spread apart the standard deviation will be relatively large. The
standard deviation is usually presented in conjunction with the mean and is
measured in the same units.
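4. The variance is defined as the average of the squared differences from the mean. For sample data, as in the exercises that follow, the sum of squared differences is divided by n – 1 rather than n.
Formula for the standard deviation and variance (sample data):
s² = Σ(x – x̄)² ÷ (n – 1)
s = √[Σ(x – x̄)² ÷ (n – 1)]
For data in a frequency distribution, replace Σ(x – x̄)² with Σ(x – x̄)² f.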
Exercise 11 (Ungrouped Data): Shown below are the examination marks for 20 students following a particular module. Determine the values of (a) Range; (b) IQR; (c) Standard Deviation; and (d) Variance using the table below.
60 74 76 78 66 68 50 56 58 43
48 50 59 62 70 71 80 65 52 52

Mark (X)   x – x̄    (x – x̄)²
43         -18.9    357.21
48         -13.9    193.21
50         -11.9    141.61
50         -11.9    141.61
52         -9.9     98.01
52         -9.9     98.01
56         -5.9     34.81
58         -3.9     15.21
59         -2.9     8.41
60         -1.9     3.61
62         0.1      0.01
65         3.1      9.61
66         4.1      16.81
68         6.1      37.21
70         8.1      65.61
71         9.1      82.81
74         12.1     146.41
76         14.1     198.81
78         16.1     259.21
80         18.1     327.61
n = 20     Σx = 1,238     Σ(x – x̄)² = 2,235.80

Mean:
x̄ = 1,238 ÷ 20
x̄ = 61.9

Range:
R = 80 – 43
R = 37

IQR:
Q1 = value of {(20 + 1) ÷ 4}th item = value of 5.25th item
Q1 = 52 + 0.25 (52 – 52) = 52 + 0.25 (0) = 52 + 0 = 52.0
Q3 = value of {3(20 + 1) ÷ 4}th item = value of 15.75th item
Q3 = 70 + 0.75 (71 – 70) = 70 + 0.75 (1) = 70 + 0.75 = 70.75
IQR = Q3 – Q1 = 70.75 – 52.0 = 18.75

Variance:
s² = 2,235.80 ÷ (20 – 1)
s² = 117.67

Standard Deviation:
s = √117.67
s = 10.85
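The four measures for the 20 examination marks can be checked with Python's `statistics` module. This is a minimal sketch with illustrative variable names; `variance` and `stdev` are the sample versions (dividing by n – 1), and `quantiles` with its default "exclusive" method uses the same (n + 1) position rule as the hand computation.

```python
from statistics import quantiles, stdev, variance

# Examination marks of the 20 students from Exercise 11.
exam_marks = [60, 74, 76, 78, 66, 68, 50, 56, 58, 43,
              48, 50, 59, 62, 70, 71, 80, 65, 52, 52]

mark_range = max(exam_marks) - min(exam_marks)      # highest score minus lowest score

lower_q, _, upper_q = quantiles(exam_marks, n=4)    # Q1, Q2, Q3
mark_iqr = upper_q - lower_q

mark_variance = variance(exam_marks)                # sum of squared deviations / (n - 1)
mark_sd = stdev(exam_marks)                         # square root of the sample variance

print(f"range = {mark_range}, IQR = {mark_iqr}")
print(f"variance = {mark_variance:.2f}, standard deviation = {mark_sd:.2f}")
```

Computing the variance directly from the sum of squares, rather than by squaring an already rounded standard deviation, avoids carrying rounding error into the result.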
Exercise 12 (Ungrouped Data in Frequency Distribution): 15 students were asked how many hours (x) they worked per day. Their responses in hours are listed below. Determine the values of (a) Range; (b) IQR; (c) Standard Deviation; and (d) Variance.

Hours of Work (x)   Frequency (f)   fx    x – x̄    (x – x̄)²   (x – x̄)² f
2                   2               4     -2.93    8.5849     17.1698
4                   4               16    -0.93    0.8649     3.4596
5                   5               25    0.07     0.0049     0.0245
7                   3               21    2.07     4.2849     12.8547
8                   1               8     3.07     9.4249     9.4249
n = 15              Σfx = 74        Σ(x – x̄)² f = 42.93

Mean:
x̄ = 74 ÷ 15 = 4.93

Range:
R = 8 – 2 = 6

IQR:
Q1 = value of {(15 + 1) ÷ 4}th item = value of 4th item
Q1 = 4.00
Q3 = value of {3(15 + 1) ÷ 4}th item = value of 12th item
Q3 = 7.00
IQR = Q3 – Q1 = 7.00 – 4.00 = 3.00

Variance:
s² = 42.93 ÷ (15 – 1)
s² = 3.07

Standard Deviation:
s = √3.07
s = 1.75
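For data given as a frequency distribution, each squared deviation is weighted by its frequency. The sketch below works through the hours-of-work table in Python; the variable names are illustrative, and the quartiles are obtained by expanding the table back into the 15 individual responses.

```python
from math import sqrt
from statistics import quantiles

# Hours worked per day (x) and number of students (f) from Exercise 12.
hours = [2, 4, 5, 7, 8]
counts = [2, 4, 5, 3, 1]

total = sum(counts)                                             # n
hours_mean = sum(f * x for x, f in zip(hours, counts)) / total  # sum(fx) / n

# Sum of squared deviations, each weighted by its frequency.
weighted_ss = sum(f * (x - hours_mean) ** 2 for x, f in zip(hours, counts))
hours_variance = weighted_ss / (total - 1)                      # sample variance
hours_sd = sqrt(hours_variance)

hours_range = max(hours) - min(hours)

# Expand the table into individual responses to locate the quartiles.
responses = [x for x, f in zip(hours, counts) for _ in range(f)]
hours_q1, _, hours_q3 = quantiles(responses, n=4)               # (n + 1) position rule
hours_iqr = hours_q3 - hours_q1

print(f"mean = {hours_mean:.2f}, range = {hours_range}, IQR = {hours_iqr}")
print(f"variance = {hours_variance:.2f}, standard deviation = {hours_sd:.2f}")
```

Because the code keeps the unrounded mean, its sum of squares can differ in the second decimal place from a hand computation that rounds x̄ to 4.93 first.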
Exercise 13 (Grouped Data): 220 students were asked the number of hours per week they spent watching television. With this information, calculate the mean and standard deviation of hours spent watching television by the 220 students. Determine the values of (a) Range; (b) IQR; (c) Standard Deviation; and (d) Variance.

Number of hours per week spent watching television

Hours     Number of students (f)   F     Midpoint (x)   x – x̄     (x – x̄)²   (x – x̄)² f
10 – 14   2                        2     12             -17.82    317.5524   635.1048
15 – 19   12                       14    17             -12.82    164.3524   1972.2288
20 – 24   23                       37    22             -7.82     61.1524    1406.5052
25 – 29   60                       97    27             -2.82     7.9524     477.144
30 – 34   77                       174   32             2.18      4.7524     365.9348
35 – 39   38                       212   37             7.18      51.5524    1958.9912
40 – 44   8                        220   42             12.18     148.3524   1186.8192
n = 220   Σ(x – x̄)² f = 8002.728

Mean:
x̄ = [(12)(2) + (17)(12) + (22)(23) + … + (42)(8)] ÷ 220
x̄ = 6,560 ÷ 220 = 29.82

Range:
R = 44.5 – 9.5
R = 35

Q1:
location of q1 = (¼) n = 55, which falls in the 25 – 29 class (lower boundary 24.5)
Q1 = 24.5 + [(55 – 37) ÷ 60] × 5
Q1 = 24.5 + 1.5
Q1 = 26

Q3:
location of q3 = (¾) n = 165, which falls in the 30 – 34 class (lower boundary 29.5)
Q3 = 29.5 + [(165 – 97) ÷ 77] × 5
Q3 = 29.5 + 4.42
Q3 = 33.92

IQR = 33.92 – 26 = 7.92

Variance:
s² = 8002.728 ÷ (220 – 1)
s² = 36.54

Standard Deviation:
s = √[8002.728 ÷ (220 – 1)]
s = 6.05
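For grouped data the class midpoints stand in for the individual values. The sketch below reproduces the mean, range, variance and standard deviation for the television table in Python; the variable names are illustrative, and class boundaries are assumed to lie 0.5 beyond the integer class limits.

```python
from math import sqrt

# Hours classes and number of students from Exercise 13.
tv_classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39), (40, 44)]
students = [2, 12, 23, 60, 77, 38, 8]

tv_n = sum(students)
tv_midpoints = [(lo + hi) / 2 for lo, hi in tv_classes]

# Mean estimated from midpoints: sum(f * x) / n.
tv_mean = sum(f * x for x, f in zip(tv_midpoints, students)) / tv_n

# Sum of squared deviations of the midpoints, weighted by class frequency.
tv_ss = sum(f * (x - tv_mean) ** 2 for x, f in zip(tv_midpoints, students))
tv_variance = tv_ss / (tv_n - 1)             # sample variance
tv_sd = sqrt(tv_variance)

# Range of grouped data: upper boundary of the highest class
# minus lower boundary of the lowest class.
tv_range = (tv_classes[-1][1] + 0.5) - (tv_classes[0][0] - 0.5)

print(f"mean = {tv_mean:.2f}, range = {tv_range}")
print(f"variance = {tv_variance:.2f}, standard deviation = {tv_sd:.2f}")
```

These values are estimates, since every student in a class is treated as if they watched exactly the midpoint number of hours.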

Quiz
I. Refer to the array of test scores below. Solve for the Mean, Median, Mode,
Q1, Q2, Q3, Range, IQR, Standard Deviation (s), and Variance (s2).
20 28 30 36 39 42 55 58 61 67 68 70 74 82 93
II. Refer to the grouped data below. Calculate the Mean, Median, Mode, Q1,
Q2, Q3, Range, IQR, Standard Deviation (s), and Variance (s2).
Class Interval (c. i.)   Frequency (f)
8 – 11 3
12 – 15 2
16 – 19 4
20 – 23 5
24 – 27 3
28 – 31 1
32 – 35 2
