PART A: DESCRIPTIVE STATISTICS
CHAPTER TWO
Measures of Central Location
Keep in mind we are doing Descriptive Statistics, the summary of data, often using one
number to represent a set of many numbers. The term Central Location means, in
some way, the average. We will discuss 4 different types of averages.
(A) The Arithmetic Mean, or more simply, the mean, is nothing but the average. We all
know what an average is, but we will give it a more formal definition anyway. Given a
set of numbers, X1, X2, X3, ..., the mean of a SAMPLE is indicated by X̄ (X-BAR):
    \[ \bar{X} = \frac{\sum X}{n} \]
i.e., the sum of the numbers divided by the number of numbers. And the mean of a
POPULATION is indicated by μ, another Greek letter, the lower case MU:
    \[ \mu = \frac{\sum X}{N} \]
At this stage, it may be a good idea to explain the difference between a SAMPLE and a
POPULATION. A Population consists of everything we are interested in, and a Sample
is a part (or a sub-set) of that everything. For instance, if you are conducting a survey
to determine the average age of the students in the School of Business, BCIT, then ALL
the students currently registered in the School of Business, BCIT, make up the
population. An example of a sample would be everybody whose last name begins with
the letter “B”. Another example would be everybody in Set A, Financial Management.
Notice, from the 2 equations above, that there are n elements (things, objects, people)
in a sample, but N in a population, with n < N, always.
We usually consider a set of numbers as a sample, and we will refer to the mean as
X-bar.
We use the mean so much that it is a good idea to look at some of its properties.
First, consider the relationship between the marginal and the average. We are all taking
Micro-Economics. We may remember an economic theorem that states: If the marginal
cost is less than the average cost, the average cost will go down, and therefore it pays
to produce that 1 extra unit until the marginal is equal to the average cost.
Let us use some numbers to demonstrate the above. Suppose there are 3 of you. Your
sister is 23, your brother 21, and you are 19. The average age = (23 + 21 + 19)/3 =
21. Let us further suppose your parents, for some reason, decide to adopt a baby. He
is 1. This 1 year is the marginal, which is less than the average of 21. Now the
average age is (23 + 21 + 19 + 1)/4 = 16. The average has now gone down from 21 to
16. We also see that the new average, 16, is nowhere near the age of the 3 grown-ups.
This brings us to the second property of the mean, which is that the mean can be very
easily affected by extremely small or extremely large values.
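The family example above can be checked with a few lines of Python. This is only a sketch: the function name `mean` and the variable names are our own choices for illustration, not part of the course material.

```python
def mean(values):
    """Arithmetic mean: the sum of the numbers divided by how many there are."""
    return sum(values) / len(values)

ages = [23, 21, 19]          # sister, brother, you
baby_age = 1                 # the marginal value, well below the current average

before = mean(ages)                  # (23 + 21 + 19) / 3
after = mean(ages + [baby_age])      # (23 + 21 + 19 + 1) / 4

print(before, after)         # 21.0 16.0
```

Adding one value below the current mean pulls the mean down, and a single very small value is enough to move it away from every one of the original ages.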
Let us use another, more dramatic, example to illustrate this point. Imagine a small
apartment building, with only 6 households. Their annual incomes are
    $55,000    $62,000
    $67,000    $60,000
    $71,000    $66,000
The average income, working in thousands of dollars, is
(55 + 62 + 67 + 60 + 71 + 66) / 6 = 381/6 = 63.5, or $63,500.
Let us now change the scenario. On the top floor, the penthouse, lives the owner of the
building. He has several other apartment buildings in downtown Vancouver. He goes
around collecting rent, and his annual income is $2,000,000.
         $2,000,000
    $67,000    $60,000
    $71,000    $66,000
The average income now is (2,000 + 67 + 60 + 71 + 66) / 5 = 2,264/5 = 452.8 or
$452,800. If I said to you, “You see that apartment building over there? The average
income is close to half a million dollars.” Wouldn't you be tempted to conclude
everybody there is a millionaire? But the truth is far from it.
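As a rough check of the two apartment scenarios, the sketch below recomputes both means with Python's standard `statistics` module. Incomes are entered in thousands of dollars, as in the calculations above; the variable names are ours.

```python
from statistics import mean

# Annual incomes in thousands of dollars.
first_scenario = [55, 62, 67, 60, 71, 66]
second_scenario = [2000, 67, 60, 71, 66]   # five incomes, including the owner's

mean_first = mean(first_scenario)      # 381 / 6
mean_second = mean(second_scenario)    # 2264 / 5

# How many households in the second scenario earn less than the mean?
below_mean = [x for x in second_scenario if x < mean_second]

print(mean_first, mean_second, len(below_mean))   # 63.5 452.8 4
```

Four of the five households earn far less than the mean of the second scenario, which is why the mean alone gives a misleading picture of this building.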
In the presence of extremely large or extremely small values, the median is a better
indication of central location.
(B) The Median is defined as the value of the middle number (or the mean of the
values of the 2 middle numbers) when all the numbers are arranged in order of
magnitude. For this set of numbers: 23, 21, 19, 24, 25, it is easy to be tempted to
think 19 is the median because it is in the middle. But remember we have to arrange
them into order of magnitude first: 19, 21, 23, 24, 25. The median is in fact 23.
Now consider the following set of numbers: 19, 21, 23, 24, 25, 28. The median is
the average of the 2 middle numbers: (23 + 24) / 2 = 23.5.
We should be able to see that there are ALWAYS as many numbers below the median
as there are above it. In the first data set, the median is 23. There are 2 numbers
below it, 19 and 21; and 2 above it, 24 and 25. In the second set, the median is 23.5.
There are 3 numbers below it, and 3 above it.
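The two small data sets can be verified with the sketch below, which uses `statistics.median` from the Python standard library. The helper `below_and_above` is our own name, introduced only to count the values on each side of the median. Note that neither set contains repeated values, which is the situation the counting argument above describes.

```python
from statistics import median

odd_set = [23, 21, 19, 24, 25]        # 5 values, deliberately unsorted
even_set = [19, 21, 23, 24, 25, 28]   # 6 values

def below_and_above(values):
    """Return the median and the counts of values strictly below and above it."""
    m = median(values)                 # median() sorts the data internally
    below = sum(1 for v in values if v < m)
    above = sum(1 for v in values if v > m)
    return m, below, above

print(below_and_above(odd_set))    # (23, 2, 2)
print(below_and_above(even_set))   # (23.5, 3, 3)
```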
Let us now refer back to the apartment. For the first scenario, the median is $64,000. If
we first of all arrange the numbers in order, 55, 60, 62, 66, 67, 71, the median is the
average of 62 and 66, which is 64, or $64,000.
For the second scenario, arranging the numbers into order, we have 60, 66, 67, 71,
2,000. The median is 67, or $67,000, which is a much better indication of the center, or
average, of the set of numbers.
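To see the contrast side by side, the following sketch computes both the mean and the median for each apartment scenario. Incomes are in thousands of dollars and the variable names are ours.

```python
from statistics import mean, median

# Annual incomes in thousands of dollars.
first_scenario = [55, 62, 67, 60, 71, 66]
second_scenario = [2000, 67, 60, 71, 66]

median_first = median(first_scenario)     # average of the middle pair, 62 and 66
median_second = median(second_scenario)   # the single middle value

print(mean(first_scenario), median_first)     # 63.5 64.0
print(mean(second_scenario), median_second)   # 452.8 67
```

The owner's income moves the mean from 63.5 to 452.8, while the median only moves from 64 to 67.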
(C) The Mode is defined as the value that appears the most often. It is easy to find.
However, it has certain disadvantages. First, there may be more than 1 mode. In this
set of numbers: 19, 20, 21, 21, 23, 24, 24, 27, there are 2 modes, 21 and 24,
because they both appear twice, more often than the other numbers. Second, the
mode may not even exist. For instance, in this set of numbers: 19, 20, 22, 25, 27,
none of the numbers appears more often than any other number. We say that the mode
does not exist. For these reasons, the mode is NOT a very reliable measure of central
location.
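Both difficulties with the mode can be illustrated with a short sketch. The function `modes` below is a hypothetical helper of our own; it follows the convention used in this chapter, returning an empty list when every value appears only once. Other tools may use a different convention: for example, Python's `statistics.multimode` returns every value in that case.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values, or [] if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []                      # no data, so no mode
    highest = max(counts.values())
    if highest == 1:
        return []                      # every value appears once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([19, 20, 21, 21, 23, 24, 24, 27]))   # [21, 24]
print(modes([19, 20, 22, 25, 27]))               # []
```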
(D) The Weighted Mean. Suppose you have $3,000, and make 3 different
investments, some invested at 10%, some at 11%, and some at 12%. Is your average
rate of return equal to (10 + 11 + 12)%/3 = 11%? The answer is YES, but ONLY if you
invested an equal amount ($1,000) in each, and NO otherwise. Let us suppose you
invested $200 at 10%, $300 at 11%, and $2,500 at 12%. What is the average rate of
return earned? This kind of average is called the weighted mean, and is given by the
following formula:
    \[ \bar{X}_w = \frac{\sum XW}{\sum W} \]
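A minimal sketch of the weighted mean for the investment example, assuming the rates of return are the values X and the dollar amounts invested are the weights W. The function name `weighted_mean` and the variable names are ours.

```python
def weighted_mean(values, weights):
    """Sum of (value * weight) divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

rates = [10, 11, 12]            # X: rates of return, in percent
amounts = [200, 300, 2500]      # W: dollars invested at each rate

equal_split = weighted_mean(rates, [1000, 1000, 1000])   # same as the plain mean
actual = weighted_mean(rates, amounts)                   # 35,300 / 3,000

print(equal_split, round(actual, 2))   # 11.0 11.77
```

With equal amounts the weighted mean reduces to the ordinary mean of 11%. With most of the money invested at 12%, the weighted mean is pulled up to about 11.77%.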
First, remember the difference between ΣXW and (ΣX)(ΣW), that we discussed in a
previous lecture. Make sure you don't make the mistake. Second, a lot of beginning
students have difficulty telling the difference between X, the variable, and W, the
weights. It is illustrative if we list the numbers in a table like this: