
DERIVATIONS: SECTIONS 3 AND 4 OF STAT 2.1X
Ani Adhikari
University of California, Berkeley
Spring 2013

This handout contains algebraic derivations of some results about averages and SDs that are covered in Sections 3 and 4 of Stat 2.1X.

Quick review of sigma notation

Let $x_1, x_2, \ldots, x_n$ be a list of numbers. Here $n$ is a fixed positive integer that denotes the number of entries in the list. The notation

$$\sum_{i=1}^{n} x_i$$

is used to denote the sum of the entries in the list.

Basic properties of sums

1. Let $y_1, y_2, \ldots, y_n$ be another list, also consisting of $n$ entries. Then

$$\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i$$

2. Let $a$ and $b$ be constants. Then

$$\sum_{i=1}^{n} a x_i = a \sum_{i=1}^{n} x_i \qquad \text{and} \qquad \sum_{i=1}^{n} (x_i + b) = \sum_{i=1}^{n} x_i + nb$$
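The two properties of sums can be checked numerically on a small example. The following is a minimal Python sketch, not part of the original handout; the lists `xs` and `ys` and the constants `a` and `b` are arbitrary illustrative values.

```python
# Numerical check of the two sum properties on small example lists.
# xs, ys, a and b are arbitrary illustrative values, not from the handout.
xs = [3, 1, 4, 1, 5]
ys = [2, 7, 1, 8, 2]
a, b = 6, -2
n = len(xs)

# Property 1: the sum of (x_i + y_i) equals the sum of x_i plus the sum of y_i.
lhs_property_1 = sum(x + y for x, y in zip(xs, ys))
rhs_property_1 = sum(xs) + sum(ys)

# Property 2, first part: a constant factor can be pulled out of the sum.
lhs_scale = sum(a * x for x in xs)
rhs_scale = a * sum(xs)

# Property 2, second part: adding b to each of the n entries adds n*b to the sum.
lhs_shift = sum(x + b for x in xs)
rhs_shift = sum(xs) + n * b

print(lhs_property_1 == rhs_property_1)
print(lhs_scale == rhs_scale)
print(lhs_shift == rhs_shift)
```

Integer data is used here so that the comparisons are exact; with floating-point entries a tolerance-based comparison such as `math.isclose` would be more appropriate.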

THE AVERAGE

The average or mean of the list will be denoted by $\mu$, which is the Greek letter mu. The average is defined to be

$$\mu = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Two ways of smoothing: By Property 2 of sums,

$$\mu = \frac{1}{n} \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} \frac{1}{n} x_i = \sum_{i=1}^{n} \frac{x_i}{n}$$

This formalizes the idea that we can think of the average as a smoothing operation conducted in one of two ways. The definition of the average says, "Put everyone's contribution into one big pot and then divide the pot equally among the people." That is equivalent to saying, "Divide each person's contribution equally among all the people."
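The two ways of smoothing can be illustrated with a short computation. This is a sketch under assumed data: the list `contributions` is hypothetical, and exact fractions are used so that the two results can be compared without rounding error.

```python
from fractions import Fraction

# Hypothetical contributions from four people (illustrative values only).
contributions = [10, 4, 7, 3]
n = len(contributions)

# Way 1: pool everything into one pot, then divide the pot by n.
pot_then_divide = Fraction(sum(contributions), n)

# Way 2: divide each contribution by n first, then add up the shares.
divide_then_pool = sum(Fraction(x, n) for x in contributions)

print(pot_then_divide, divide_then_pool)
```

Both routes give the same average, as Property 2 of sums guarantees.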

Markov's inequality: Statement 1. Suppose $x_i \ge 0$ for $1 \le i \le n$. Let $k > 0$ be a constant. The proportion of entries that are greater than or equal to $k\mu$ is at most $1/k$.

Proof. First note that an equivalent statement is:

Markov's inequality: Statement 2. Suppose $x_i \ge 0$ for $1 \le i \le n$. Let $c > 0$ be a constant. The proportion of entries that are greater than or equal to $c$ is at most $\mu/c$.

That the two statements are equivalent can be seen by setting $c = k\mu$. We will now prove the second form.

$$\begin{aligned}
n\mu &= \sum_{\text{all entries}} x_i \\
&= \sum_{\text{entries} < c} x_i + \sum_{\text{entries} \ge c} x_i \\
&\ge 0 + \sum_{\text{entries} \ge c} x_i \\
&\ge c \cdot (\text{number of entries} \ge c)
\end{aligned}$$

The first term in the third line uses the fact that the entries in the list are non-negative. Divide by $n$ on both sides to get

$$\mu \ge c \cdot (\text{proportion of entries} \ge c)$$

which is equivalent to

$$(\text{proportion of entries} \ge c) \le \mu/c$$

as was to be proved.

Deviations from average. Let $d_i = x_i - \mu$ be the $i$th deviation from average. Then

$$\sum_{i=1}^{n} d_i = 0,$$

and hence the average of the list of deviations is 0.

Proof.

$$\sum_{i=1}^{n} d_i = \sum_{i=1}^{n} (x_i - \mu) = \sum_{i=1}^{n} x_i - n\mu = n\mu - n\mu = 0$$
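Both Statement 2 of Markov's inequality and the zero-sum property of the deviations can be checked numerically. The following is a minimal Python sketch, not part of the original handout; the list `xs`, the thresholds, and the helper names `mean` and `proportion_at_least` are illustrative assumptions.

```python
from fractions import Fraction

def mean(values):
    """Average of a list, as an exact fraction."""
    return Fraction(sum(values)) / len(values)

def proportion_at_least(values, c):
    """Proportion of entries that are greater than or equal to c."""
    return Fraction(sum(1 for v in values if v >= c), len(values))

# A hypothetical list of non-negative numbers (Markov requires x_i >= 0).
xs = [0, 1, 1, 2, 3, 5, 8, 20]
mu = mean(xs)

# For each threshold c > 0, pair the observed proportion with the bound mu / c.
markov_checks = {c: (proportion_at_least(xs, c), mu / c) for c in (1, 5, 10, 20)}
for c, (proportion, bound) in markov_checks.items():
    print(c, proportion, bound, proportion <= bound)

# The deviations from the average add up to zero.
deviations = [x - mu for x in xs]
total_deviation = sum(deviations)
print(total_deviation)
```

For small thresholds the bound $\mu/c$ exceeds 1 and says nothing useful; it only becomes informative once $c$ is larger than the average.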

The second equality in the proof above follows from Property 2 of sums.

THE STANDARD DEVIATION

Definition. The standard deviation of a list of numbers is defined by

SD = root mean square of the deviations from average

The SD will be denoted by $\sigma$, which is the Greek letter sigma. By definition,

$$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2}$$

The quantity inside the square root is called the variance of the list:

variance = mean square of the deviations from average

That is, the variance is

$$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2$$
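The definitions of the variance and the SD translate directly into code. The sketch below is illustrative only: the helper names and the list `xs` are assumptions, and the divisor is $n$, as in the definition above (this matches the "population" convention of `statistics.pstdev` in Python's standard library).

```python
import math
import statistics

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Mean square of the deviations from average (divisor n)."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sd(values):
    """Root mean square of the deviations from average."""
    return math.sqrt(variance(values))

# Hypothetical list chosen so that the arithmetic comes out in whole numbers.
xs = [2, 4, 4, 4, 5, 5, 7, 9]

print(mean(xs), variance(xs), sd(xs))
# The standard library's population SD uses the same divide-by-n definition.
print(statistics.pstdev(xs))
```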

The unit of measurement of variance is the square of that of the list, so it is often hard to interpret physically. However, variance has mathematical properties that make it easy to compute. Therefore many results about SDs are derived by first finding the variance using its computational properties, and as a final step taking the square root to get the SD, which has units that make sense.

Computational formula. Here is a useful property of the variance, which speeds up computation.

variance = (mean of squares) − (square of mean)

That is,

$$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \mu^2$$

Proof.

$$\begin{aligned}
\sigma^2 &= \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2 \\
&= \frac{1}{n} \sum_{i=1}^{n} (x_i^2 - 2\mu x_i + \mu^2) \\
&= \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \sum_{i=1}^{n} 2\mu x_i + \frac{1}{n} \sum_{i=1}^{n} \mu^2 \\
&= \frac{1}{n} \sum_{i=1}^{n} x_i^2 - 2\mu \cdot \frac{1}{n} \sum_{i=1}^{n} x_i + \frac{1}{n} \cdot n\mu^2 \\
&= \frac{1}{n} \sum_{i=1}^{n} x_i^2 - 2\mu^2 + \mu^2 \\
&= \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \mu^2
\end{aligned}$$

Note. Since $\sigma^2 \ge 0$, the computational formula implies that mean of squares $\ge$ square of mean for all lists; and that the mean of the squares is equal to the square of the mean if and only if the variance is 0, that is, if and only if the standard deviation is 0, that is, if and only if all the numbers in the list are the same.
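The computational formula and the accompanying note can be checked on example data. This is a sketch with assumed inputs: the lists `xs` and `constant_list` and the function names are illustrative, and exact fractions are used so that the two routes to the variance can be compared for equality.

```python
from fractions import Fraction

def mean(values):
    return Fraction(sum(values)) / len(values)

def variance_by_definition(values):
    """Mean square of the deviations from average."""
    mu = mean(values)
    return mean([(x - mu) ** 2 for x in values])

def variance_by_shortcut(values):
    """(Mean of squares) minus (square of mean)."""
    return mean([x * x for x in values]) - mean(values) ** 2

xs = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical list with some spread
constant_list = [3, 3, 3, 3]      # all entries the same

print(variance_by_definition(xs), variance_by_shortcut(xs))
print(variance_by_shortcut(constant_list))
```

For the constant list the mean of squares equals the square of the mean, so the variance is 0, in line with the note.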

Chebychev's inequality: In any list of numbers, the proportion of entries that are $k$ or more SDs away from the average is at most $1/k^2$.

Proof. The idea of the proof is to notice that the variance is the mean of the list of squared deviations, and to apply Markov's inequality to that list.

Step 1. Let $d_i = x_i - \mu$ be the $i$th deviation from average, as before. Let $w_i = d_i^2 = (x_i - \mu)^2$ be the $i$th squared deviation. The variance is the mean squared deviation, and so

$$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} w_i$$

In other words, $w_1, w_2, \ldots, w_n$ is a list of non-negative numbers whose average is $\sigma^2$. Markov's inequality can be applied to this list. Statement 2 of Markov's inequality says that for any $c > 0$,

$$(\text{proportion of entries such that } w_i \ge c) \le \sigma^2/c$$

Step 2. For $x_i$ to be $k$ or more SDs away from the average, its distance from $\mu$ must be at least $k\sigma$. So it must satisfy $|x_i - \mu| \ge k\sigma$. So our job is to show that

$$(\text{proportion of entries such that } |x_i - \mu| \ge k\sigma) \le 1/k^2$$

Equivalently, we have to show that

$$(\text{proportion of entries such that } (x_i - \mu)^2 \ge k^2\sigma^2) \le 1/k^2$$

Equivalently, we have to show that, in the notation of Step 1,

$$(\text{proportion of entries such that } w_i \ge k^2\sigma^2) \le 1/k^2$$

Step 3. So take $c = k^2\sigma^2$ in the result of Step 1. This leads to

$$(\text{proportion of entries such that } w_i \ge k^2\sigma^2) \le \frac{\sigma^2}{k^2\sigma^2} = 1/k^2$$

Linear transformations. Let $a$ and $b$ be constants. Construct a new list whose $i$th element is $y_i = a x_i + b$. Because we will now have a couple of means and SDs in our calculations, let us give them names that distinguish them from each other:

$\mu_x$ = average of the list of x's
$\sigma_x$ = SD of the list of x's
$\mu_y$ = average of the list of y's
$\sigma_y$ = SD of the list of y's

Then

$$\mu_y = a\mu_x + b \qquad \text{and} \qquad \sigma_y = |a|\sigma_x$$

Proof.

$$\mu_y = \frac{1}{n} \sum_{i=1}^{n} (a x_i + b) = \frac{1}{n} \sum_{i=1}^{n} a x_i + \frac{1}{n} \cdot nb = a \cdot \frac{1}{n} \sum_{i=1}^{n} x_i + b = a \cdot \frac{1}{n} \cdot n\mu_x + b = a\mu_x + b$$
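Both formulas for a linear transformation, the one for the average and the one for the SD, can be checked on example data. The sketch below is illustrative: the list `xs`, the constants `a` and `b`, and the helper names are assumptions. A negative `a` is chosen on purpose to show why the absolute value appears in the SD formula.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sd(values):
    """Root mean square of the deviations from average (divisor n)."""
    mu = mean(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

xs = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical list
a, b = -3, 10                    # hypothetical constants, with a < 0
ys = [a * x + b for x in xs]

# Average: mu_y should equal a * mu_x + b.
print(mean(ys), a * mean(xs) + b)

# SD: sigma_y should equal |a| * sigma_x, not a * sigma_x.
print(sd(ys), abs(a) * sd(xs))
```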

To prove the formula for the SDs, notice that the $i$th deviation of the y's is

$$y_i - \mu_y = (a x_i + b) - \mu_y = (a x_i + b) - (a\mu_x + b) = a(x_i - \mu_x)$$

In other words, each deviation of the y's is equal to $a$ times the corresponding deviation of the x's. This implies that each squared deviation of the y's is equal to $a^2$ times the corresponding squared deviation of the x's. Thus

$$\sigma_y^2 = a^2 \sigma_x^2$$

and so

$$\sigma_y = |a|\sigma_x$$

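because the SD is the positive square root of the variance.

Standard units. For each $i$, let $z_i$ = $x_i$ in standard units be defined by

$$z_i = \frac{x_i - \mu_x}{\sigma_x}$$

Then $\mu_z = 0$ and $\sigma_z = 1$.

Proof. For each $i$,

$$z_i = \frac{1}{\sigma_x} x_i - \frac{\mu_x}{\sigma_x} = a x_i + b$$

where $a = 1/\sigma_x$ and $b = -\mu_x/\sigma_x$. Apply the results about the mean and SD of a linear transformation to get

$$\mu_z = a\mu_x + b = \frac{1}{\sigma_x} \mu_x - \frac{\mu_x}{\sigma_x} = 0$$

and

$$\sigma_z = |a|\sigma_x = \frac{1}{\sigma_x} \sigma_x = 1$$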

Note: We are assuming $\sigma_x > 0$, because if $\sigma_x$ were equal to 0 then we could not divide by it. But if $\sigma_x = 0$ then all the x's would be equal, which is not a case that requires analysis by conversion into other units.
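Conversion to standard units can be sketched in code, together with a check of Chebychev's inequality, which in standard units says that the proportion of entries with $|z_i| \ge k$ is at most $1/k^2$. The list `xs`, the values of `k`, and the helper names are illustrative assumptions; the function refuses a list whose SD is 0, in line with the note above.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sd(values):
    """Root mean square of the deviations from average (divisor n)."""
    mu = mean(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

def standard_units(values):
    """Convert each entry to standard units: (x - mean) / SD."""
    mu, sigma = mean(values), sd(values)
    if sigma == 0:
        raise ValueError("SD is 0: all entries are equal, cannot standardize")
    return [(x - mu) / sigma for x in values]

def proportion_k_sds_away(zs, k):
    """Proportion of entries that are k or more SDs away from the average."""
    return sum(1 for z in zs if abs(z) >= k) / len(zs)

xs = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical list
zs = standard_units(xs)

# The list in standard units should have average 0 and SD 1.
print(mean(zs), sd(zs))

# Chebychev: the observed proportion never exceeds 1 / k^2.
chebychev_checks = {k: (proportion_k_sds_away(zs, k), 1 / k ** 2) for k in (1, 1.5, 2, 3)}
for k, (proportion, bound) in chebychev_checks.items():
    print(k, proportion, bound, proportion <= bound)
```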