
Statistics Study Guide

Measures of Central Tendency


Arithmetic Mean


Let the random variable X have variate values x_i for i = 1, 2, \ldots, n. The arithmetic mean is

\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i

If there are different frequencies for the values of x_i, denoted f_i, we can rewrite the arithmetic mean as a weighted average where the weights are the frequencies:

\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{1}{n} \sum_{i=1}^{n} f_i x_i

Geometric Mean
When the variate values x_i are all strictly greater than zero, the geometric mean is defined as

G = (x_1 x_2 \cdots x_n)^{1/n} = \left[ \prod_{i=1}^{n} x_i \right]^{1/n}
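A quick sketch of the product form next to the equivalent log form (unit frequencies assumed; the log form matches the ln(G) simplification given later in the guide):

```python
import math

def geometric_mean(xs):
    """Product form: (x_1 * x_2 * ... * x_n) ** (1/n)."""
    prod = 1.0
    for x in xs:
        prod *= x
    return prod ** (1.0 / len(xs))

def geometric_mean_log(xs):
    """Log form: exp((1/n) * sum(ln x_i)); avoids overflow for long lists."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

g = geometric_mean([2.0, 8.0])  # (2 * 8) ** (1/2) = 4
```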

Moments About the Origin
The rth moment about the origin is

E(X^r) = \mu_r' = \frac{1}{n} \sum_{i=1}^{n} f_i x_i^r

Mode
The variate value which occurs most often. When given a table of ordered data, we can use the following formula to find the mode:

\mathrm{Mode} = l + \frac{f_m - f_{m-1}}{2f_m - (f_{m+1} + f_{m-1})}\, h

where l = the lower limit of the modal class, f_m = the frequency of the modal class, f_{m-1} and f_{m+1} = the frequencies of the neighboring classes, and h = the width of the modal class.
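A minimal sketch of the grouped-data mode formula; the class limits and frequencies below are invented for illustration:

```python
def grouped_mode(l, f_m, f_prev, f_next, h):
    """Mode = l + (f_m - f_prev) / (2*f_m - (f_next + f_prev)) * h."""
    return l + (f_m - f_prev) / (2 * f_m - (f_next + f_prev)) * h

# Hypothetical modal class 20-30 (h = 10) with frequency 12,
# preceding class frequency 8, following class frequency 6.
m = grouped_mode(l=20, f_m=12, f_prev=8, f_next=6, h=10)
```

The result lands inside the modal class, pulled toward whichever neighbor has the larger frequency.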

Measures of Dispersion
Range
\mathrm{Range} = x_{\max} - x_{\min}

Inter-Quartile Range
When given a table of ordered data, we can use the following formula to find the quartile values:

Q_i = l + \frac{iN/4 - C}{f_m}\, h

where i = 1, 2, 3, l = the lower limit of the group which contains Q_i, N = the total number of observations, C = the cumulative frequency up to the upper limit of the group preceding that class, f_m = the frequency of that class, and h = the width of that class.

\mathrm{IQR} = Q_U - Q_L = Q_3 - Q_1

Geometric Mean with Frequencies
If there are different frequencies for the values of x_i, denoted f_i, the geometric mean is

G = (x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n})^{1/n} = \left[ \prod_{i=1}^{n} x_i^{f_i} \right]^{1/n}
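The quartile formula can be sketched the same way; the table below (class boundaries, frequencies, cumulative counts) is hypothetical:

```python
def grouped_quartile(i, l, N, C, f, h):
    """Q_i = l + (i*N/4 - C) / f * h for grouped (tabled) data, i = 1, 2, 3."""
    return l + (i * N / 4 - C) / f * h

# Hypothetical table of N = 40 observations:
# Q1 falls in class 10-20 (C = 6 below it, class frequency 8);
# Q3 falls in class 30-40 (C = 25 below it, class frequency 10).
q1 = grouped_quartile(1, l=10, N=40, C=6, f=8, h=10)
q3 = grouped_quartile(3, l=30, N=40, C=25, f=10, h=10)
iqr = q3 - q1
```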

Moments About the Mean
The rth moment about the mean is

E\left[ (X - \mu)^r \right] = \mu_r = \frac{1}{n} \sum_{i=1}^{n} f_i (x_i - \mu)^r
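A quick numeric check of the two kinds of moments, using the identity \mu_2 = \mu_2' - (\mu_1')^2 from the Moment Identities section (the data values are my own, with unit frequencies):

```python
def raw_moment(xs, r):
    """r-th moment about the origin: (1/n) * sum(x_i ** r)."""
    return sum(x ** r for x in xs) / len(xs)

def central_moment(xs, r):
    """r-th moment about the mean: (1/n) * sum((x_i - mu) ** r)."""
    mu = raw_moment(xs, 1)
    return sum((x - mu) ** r for x in xs) / len(xs)

xs = [2.0, 4.0, 4.0, 6.0]  # invented data
lhs = central_moment(xs, 2)
rhs = raw_moment(xs, 2) - raw_moment(xs, 1) ** 2
```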

Factorial Moments
The rth factorial moment is

E\left[ X_{(r)} \right] = \mu_{(r)}' = E\left[ X(X-1)(X-2) \cdots (X - r + 1) \right]

Moment Identities
About the Mean

\mu_0 = 1
\mu_1 = 0
\mu_2 = \mu_2' - (\mu_1')^2
\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3
\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4

Harmonic Mean
When the variate values x_i are all nonzero, to find the harmonic mean, first invert all the variate values and then find the arithmetic mean. This gives the inverse of the harmonic mean:

\frac{1}{H} = \frac{1}{n} \sum_{i=1}^{n} \frac{f_i}{x_i} = \frac{\sum_{i=1}^{n} f_i / x_i}{\sum_{i=1}^{n} f_i}

H = \frac{\sum_{i=1}^{n} f_i}{\sum_{i=1}^{n} f_i / x_i}
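A sketch of the invert-average-invert recipe (unit frequencies assumed; data values are my own):

```python
def harmonic_mean(xs):
    """Invert, take the arithmetic mean, invert back: H = n / sum(1 / x_i)."""
    return len(xs) / sum(1.0 / x for x in xs)

h = harmonic_mean([1.0, 4.0, 4.0])  # 3 / (1 + 1/4 + 1/4) = 2
```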

Factorial

\mu_{(1)}' = \mu_1'
\mu_{(2)}' = \mu_2' - \mu_1'
\mu_{(3)}' = \mu_3' - 3\mu_2' + 2\mu_1'

Skewness
To measure the departure or lack of symmetry of the probability distribution:

\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \sqrt{\beta_1}, \qquad \beta_1 = \frac{\mu_3^2}{\mu_2^3}

The geometric mean can be simplified in the following way:

\ln(G) = \frac{1}{n} \ln\left( \prod_{i=1}^{n} x_i \right) = \frac{1}{n} \sum_{i=1}^{n} \ln(x_i) = \frac{1}{n} \sum_{i=1}^{n} f_i \ln(x_i)

Variance
Let us first define the mean square deviation from any arbitrary value a:

s^2 = \frac{1}{n} \sum_{i=1}^{n} f_i (x_i - a)^2

This equation sums the squared differences between each variate value and the value a, capturing how dispersed the data are from a. When we measure the deviation from the origin (i.e., a = 0), the formula reduces to

s^2 = \frac{1}{n} \sum_{i=1}^{n} f_i x_i^2

When we find the mean square deviation from the mean, \mu, the mean square deviation is minimized.

Kurtosis
To measure the kurtosis, or flatness, of the probability distribution:

\beta_2 = \frac{\mu_4}{\mu_2^2}, \qquad \gamma_2 = \beta_2 - 3
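Both shape measures can be computed directly from the central moments; a sketch with made-up data (a symmetric sample should give \gamma_1 = 0):

```python
def central_moment(xs, r):
    """r-th moment about the mean: (1/n) * sum((x_i - mean) ** r)."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** r for x in xs) / len(xs)

def skewness(xs):
    """gamma_1 = mu_3 / mu_2 ** (3/2)."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """gamma_2 = mu_4 / mu_2 ** 2 - 3."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3

sym = [-2.0, -1.0, 0.0, 1.0, 2.0]  # symmetric invented sample
```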

Coefficient of Variation

cv = \frac{\sigma}{\bar{x}}

Mean and SD of Combined Data

\bar{x} = \frac{\sum_i n_i \bar{x}_i}{\sum_i n_i}

\ln(G_{\text{mean}}) = \frac{n_1 \ln(G_1) + n_2 \ln(G_2)}{n_1 + n_2}

\mathrm{var} = \frac{n_1 \sigma_1^2 + n_2 \sigma_2^2}{n_1 + n_2} + \frac{n_1 n_2}{(n_1 + n_2)^2} (\bar{x}_1 - \bar{x}_2)^2

Median
The numerical value separating the higher half of the variates from the lower half:

P(X \le x_m) = P(X \ge x_m) = 0.50

Other ways that the variates can be divided are into 4 equal parts (i.e. quartiles), 10 equal parts (i.e. deciles), or 100 equal parts (i.e. percentiles). When given a table of ordered data, the quartile formula in the Inter-Quartile Range section applies.
Moments
The rth moment about a is

\frac{1}{n} \sum_{i=1}^{n} f_i (x_i - a)^r

Probability

The sample space is the set of all possible outcomes of an experiment, denoted \Omega. A subset of interest is called an event, denoted A, B, \ldots. The probability of an event A, denoted P(A), is the ratio of the number of elements favorable to event A to the total number of elements in \Omega.

Properties

0 \le P(A) \le 1
P(\Omega) = 1
P(A) + P(A^c) = 1
P(A \cup B) = P(A) + P(B) - P(A \cap B)

If A and B are mutually exclusive events, then P(A \cap B) = 0. We refer to the probability of event A given that event B has occurred as the conditional probability, P(A \mid B) = \frac{P(A \cap B)}{P(B)}, so P(A \cap B) = P(A \mid B) P(B). If events A and B are independent, then P(A \mid B) = P(A) and P(A \cap B) = P(A) P(B).
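The inclusion-exclusion property can be checked by brute force on a small sample space, e.g. two fair dice (the events A and B are my own choices):

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered rolls of two fair dice (36 equally likely outcomes).
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(A) = favorable outcomes / total outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def A(w):  # first die shows a six
    return w[0] == 6

def B(w):  # the two dice sum to seven
    return w[0] + w[1] == 7

p_union = prob(lambda w: A(w) or B(w))
p_incl_excl = prob(A) + prob(B) - prob(lambda w: A(w) and B(w))
```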

F(-\infty) = 0
F(\infty) = 1
F'(x) \ge 0

The last property means that the CDF is a nondecreasing function. To find P(a < X \le b), compute F(b) - F(a).

where E(X) = np, \mathrm{Var}(X) = npq, M_X(t) = (q + pe^t)^n, W_X(t) = (1 + pt)^n, and \mu_{(r)}' = \frac{n!}{(n-r)!}\, p^r.
This formula arises from the following sum:

(p + q)^n = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x}

where \binom{n}{x} = \frac{n!}{x!(n-x)!}.

Properties of Expected Value and Variance Operators

E(aX + bY) = aE(X) + bE(Y)
\mathrm{Var}(aX + b) = a^2 \mathrm{Var}(X)
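A numeric check of the binomial facts above, computing E(X) and Var(X) directly from the pmf (the values of n and p are arbitrary choices):

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) = C(n, x) * p**x * q**(n - x), with q = 1 - p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]
total = sum(pmf)                              # (p + q)**n with p + q = 1
mean = sum(x * pmf[x] for x in range(n + 1))  # should equal n*p
var = sum((x - mean) ** 2 * pmf[x] for x in range(n + 1))  # should equal n*p*q
```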

Random Variables
A random variable X associates a real number to each outcome in \Omega. There are two types:

Discrete: the set of all possible outcomes is a finite set or countably infinite.
Continuous: the set of all possible outcomes is uncountably infinite, e.g. an interval of real numbers.

Moment Generating Function
The MGF allows us to find the moments about the origin. We define M_X(t) = E(e^{tX}). To find the rth moment, take the rth derivative of M and then set t = 0: \mu_r' = M_X^{(r)}(0).
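The moments-from-derivatives idea can be sketched numerically: approximate M'(0) for a fair die with a central finite difference and compare with E(X) = 3.5 (the die and the step size are my own choices):

```python
import math

# MGF of a fair six-sided die: M(t) = E(e^{tX}) = (1/6) * sum_{x=1}^{6} e^{tx}.
def mgf(t):
    return sum(math.exp(t * x) for x in range(1, 7)) / 6.0

# Central finite difference at t = 0 approximates M'(0) = E(X).
eps = 1e-6
first_moment = (mgf(eps) - mgf(-eps)) / (2 * eps)
```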

Discrete PDF and CDF
The discrete probability distribution function is f(x) = P(X = x), with

f(x) \ge 0, \qquad \sum_x f(x) = 1

The discrete cumulative distribution function is

F(x) = P(X \le x) = \sum_{t \le x} f(t)
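A sketch of a discrete PDF and CDF for a fair six-sided die, using exact fractions:

```python
from fractions import Fraction

# PMF of a fair six-sided die: f(x) = P(X = x) = 1/6 for x = 1..6.
f = {x: Fraction(1, 6) for x in range(1, 7)}

def F(x):
    """CDF: F(x) = P(X <= x) = sum of f(t) over t <= x."""
    return sum((p for t, p in f.items() if t <= x), Fraction(0))
```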

Factorial Moment Generating Function
The FMGF allows us to find the factorial moments. We define W_X(t) = E((1 + t)^X). To find the rth factorial moment, take the rth derivative of W and then set t = 0: \mu_{(r)}' = W_X^{(r)}(0).

Discrete Probability Distributions
Binomial Distribution
To find the probability of x successes in n trials where the probability of success is p:

P(X = x) = \binom{n}{x} p^x q^{n-x}

Poisson Distribution
Used to model the number of random occurrences in a specified unit of space or time:

P(X = x) = \frac{m^x e^{-m}}{x!}

where m = E(X) = \mathrm{Var}(X).
t

The MGF is MX (t) = em(e 1) . This arises from the following


sum

X
m2
m3
mx
em =
=1+m+
+
+
x!
2!
3!
x=0
Also note that (r) = mr .