
# Statistics Study Guide

## Arithmetic Mean

Let the random variable $X$ have variate values $x_i$ for $i = 1, 2, \ldots, n$. The arithmetic mean is

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

If there are different frequencies for the values $x_i$, denoted $f_i$, we can rewrite the arithmetic mean as a weighted average where the weights are the frequencies:

$$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{1}{n}\sum_{i=1}^{n} f_i x_i$$

where, in the last expression, $n$ denotes the total frequency $\sum f_i$.
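Both forms can be checked with a short computation (a minimal sketch; the data values are made up for illustration):

```python
# Arithmetic mean of raw values, then the frequency-weighted form.
x = [2, 4, 4, 5, 7]
mean = sum(x) / len(x)                      # (1/n) * sum of x_i

# Same data as (value, frequency) pairs: 4 appears twice.
pairs = [(2, 1), (4, 2), (5, 1), (7, 1)]
weighted = sum(f * v for v, f in pairs) / sum(f for _, f in pairs)

print(mean, weighted)                       # both 4.4
```

The two agree because expanding each value by its frequency reproduces the raw list.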

## Geometric Mean

When the variate values $x_i$ are all strictly greater than zero, the geometric mean is defined as

$$G = (x_1 x_2 \cdots x_n)^{1/n} = \left[\prod_{i=1}^{n} x_i\right]^{1/n}$$

If there are different frequencies for the values $x_i$, denoted $f_i$, the geometric mean is

$$G = (x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n})^{1/n} = \left[\prod_{i=1}^{n} x_i^{f_i}\right]^{1/n}$$

Taking logarithms turns the product into a sum:

$$\ln(G) = \frac{1}{n}\ln\left(\prod_{i=1}^{n} x_i^{f_i}\right) = \frac{1}{n}\sum_{i=1}^{n} f_i \ln(x_i)$$

## Mode

The variate value which occurs most often. When given a table of grouped, ordered data, we can use the following formula to find the mode:

$$\text{Mode} = l + \frac{f_m - f_{m-1}}{2f_m - (f_{m+1} + f_{m-1})}\, h$$

where $l$ = the lower limit of the class with the highest frequency (the modal class), $f_m$ = the frequency of the modal class, $f_{m-1}$ and $f_{m+1}$ = the frequencies of the classes immediately before and after it, and $h$ = the width of the modal class.
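As a quick check of the log identity, the geometric mean can be computed either directly or through logarithms (a sketch with illustrative values):

```python
import math

# Geometric mean directly and via ln(G) = (1/n) * sum(ln x_i).
x = [1, 2, 4, 8]
n = len(x)
G_direct = math.prod(x) ** (1 / n)
G_logs = math.exp(sum(math.log(v) for v in x) / n)
print(G_direct)   # 2.828... (= 2 * sqrt(2), since the product is 64)
```

The log route is preferable in practice: it avoids overflow when the $x_i$ are many or large.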

## Measures of Dispersion

## Range

$$\text{Range} = x_{\max} - x_{\min}$$

## Inter-Quartile Range

When given a table of ordered, grouped data, the quartile values can be found from

$$Q_i = l + \frac{iN/4 - C}{f_m}\, h$$

where $i = 1, 2, 3$, $l$ = the lower limit of the group which contains $Q_i$, $N$ = the total number of observations, $C$ = the cumulative frequency up to the upper limit of the group preceding that group, $f_m$ = the frequency of that group, and $h$ = the width of that group. Then

$$IQR = Q_U - Q_L = Q_3 - Q_1$$
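The quartile formula can be applied mechanically: walk the cumulative frequencies until the class containing $iN/4$ is found. A sketch with a hypothetical frequency table (class width 10):

```python
# Grouped-data quartiles via Q_i = l + (i*N/4 - C) / f_m * h, for a
# hypothetical frequency table with class width 10.
classes = [(0, 5), (10, 8), (20, 12), (30, 5)]   # (lower limit, frequency)
h = 10
N = sum(f for _, f in classes)                   # 30 observations

def quartile(i):
    target = i * N / 4                           # i*N/4
    cum = 0                                      # C: cumulative freq so far
    for l, f in classes:
        if cum + f >= target:                    # class containing Q_i
            return l + (target - cum) / f * h
        cum += f

Q1, Q3 = quartile(1), quartile(3)
print(Q1, Q3, Q3 - Q1)                           # 13.125 27.91... 14.79...
```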

## Moments

The rth moment about an arbitrary value $a$ is

$$\frac{1}{n}\sum_{i=1}^{n} f_i (x_i - a)^r$$

The rth moment about the origin is

$$E(X^r) = \mu'_r = \frac{1}{n}\sum_{i=1}^{n} f_i x_i^r$$

The rth moment about the mean is

$$E[(X - \mu)^r] = \mu_r = \frac{1}{n}\sum_{i=1}^{n} f_i (x_i - \mu)^r$$

## Factorial Moments

The rth factorial moment is

$$E[X_{(r)}] = \mu'_{(r)} = E[X(X-1)(X-2)\cdots(X-r+1)]$$

## Moment Identities

$$\mu_0 = 1 \qquad \mu_1 = 0 \qquad \mu'_1 = \mu$$

$$\mu_2 = \mu'_2 - \mu'^2_1$$

$$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2\mu'^3_1$$

$$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2\mu'^2_1 - 3\mu'^4_1$$

Factorial:

$$\mu'_{(1)} = \mu'_1 \qquad \mu'_{(2)} = \mu'_2 - \mu'_1 \qquad \mu'_{(3)} = \mu'_3 - 3\mu'_2 + 2\mu'_1$$
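The identities can be verified numerically for any data set, since $\mu'_r$ and $\mu_r$ are just finite sums (a sketch; the data are arbitrary):

```python
# Raw moments mu'_r, central moments mu_r, and the factorial moment
# mu'_(2) = E[X(X-1)], checked against the identities; data are arbitrary.
x = [1, 2, 2, 3, 5]
n = len(x)
raw = lambda r: sum(v**r for v in x) / n              # moment about the origin
m1 = raw(1)                                           # mu'_1 = mean
central = lambda r: sum((v - m1)**r for v in x) / n   # moment about the mean

assert abs(central(2) - (raw(2) - m1**2)) < 1e-9                    # mu_2
assert abs(central(3) - (raw(3) - 3*raw(2)*m1 + 2*m1**3)) < 1e-9    # mu_3

fact2 = sum(v * (v - 1) for v in x) / n               # E[X(X-1)]
assert abs(fact2 - (raw(2) - m1)) < 1e-9              # mu'_(2) identity
print("moment identities verified")
```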

## Harmonic Mean

When the variate values $x_i$ are all nonzero, to find the harmonic mean, first invert all the variate values and then find the arithmetic mean. This gives the inverse of the harmonic mean:

$$\frac{1}{H} = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{x_i} = \frac{\sum_{i=1}^{n} f_i / x_i}{\sum_{i=1}^{n} f_i} \qquad\Longrightarrow\qquad H = \frac{\sum_{i=1}^{n} f_i}{\sum_{i=1}^{n} f_i / x_i}$$
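A sketch of the computation (values are illustrative):

```python
# Harmonic mean: invert, average, invert back.
x = [1, 2, 4]
H = len(x) / sum(1 / v for v in x)
print(H)   # 3 / 1.75 = 12/7 = 1.714...
```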

## Variance

Let us first define the mean square deviation from any arbitrary value $a$:

$$s^2 = \frac{1}{n}\sum_{i=1}^{n} f_i (x_i - a)^2$$

This equation sums the squared differences between each variate value and the value $a$; it captures how dispersed the data are about $a$. When we want to measure the deviation from the origin (i.e., $a = 0$), the formula reduces to

$$s^2 = \frac{1}{n}\sum_{i=1}^{n} f_i x_i^2$$

When we find the mean square deviation from the mean $\mu$, the mean square deviation is minimized, since $s^2 = \mu_2 + (\mu - a)^2$. This minimum value is the variance:

$$\sigma^2 = \mu_2 = \mu'_2 - \mu^2$$

## Skewness

To measure the departure from, or lack of, symmetry of the probability distribution:

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}}$$

## Kurtosis

To measure the kurtosis, or flatness, of the probability distribution:

$$\beta_2 = \frac{\mu_4}{\mu_2^2}, \qquad \gamma_2 = \beta_2 - 3$$
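These shape measures, and the claim that the mean square deviation is minimized at the mean, can be checked on a small data set (a sketch; values are illustrative):

```python
# Central-moment based dispersion and shape measures, plus a check that the
# mean square deviation is smallest about the mean; data are illustrative.
x = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(x)
mean = sum(x) / n
mu = lambda r: sum((v - mean) ** r for v in x) / n   # r-th central moment

var = mu(2)                          # sigma^2
gamma1 = mu(3) / mu(2) ** 1.5        # skewness
gamma2 = mu(4) / mu(2) ** 2 - 3      # excess kurtosis

msd = lambda a: sum((v - a) ** 2 for v in x) / n
assert all(msd(mean) <= msd(a) for a in (0, 1, 3.9, 6))
print(var, gamma1, gamma2)           # 4.0 0.65625 -0.21875
```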

## Coefficient of Variation

$$c_v = \frac{\sigma}{\bar{x}}$$

## Median

The numerical value separating the higher half of the variates from the lower half:

$$P(X \le x_m) = P(X \ge x_m) = 0.50$$

Other ways the variates can be divided are into 4 equal parts (i.e., quartiles), 10 equal parts (i.e., deciles), or 100 equal parts (i.e., percentiles).

## Mean and SD of Combined Data

$$\bar{x} = \frac{\sum_i n_i \bar{x}_i}{\sum_i n_i}$$

$$\ln(G) = \frac{n_1 \ln(G_1) + n_2 \ln(G_2)}{n_1 + n_2}$$

$$\sigma^2 = \frac{n_1 \sigma_1^2 + n_2 \sigma_2^2}{n_1 + n_2} + \frac{n_1 n_2}{(n_1 + n_2)^2}\,(\bar{x}_1 - \bar{x}_2)^2$$
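The combined-data formulas can be validated against pooling the raw observations directly (a sketch with made-up groups):

```python
# Combined mean and variance of two groups, checked against pooling
# the raw data directly.
g1 = [2.0, 4.0, 6.0]
g2 = [1.0, 3.0]
n1, n2 = len(g1), len(g2)

def mean_var(xs):
    m = sum(xs) / len(xs)
    return m, sum((v - m) ** 2 for v in xs) / len(xs)

m1, v1 = mean_var(g1)
m2, v2 = mean_var(g2)
m_comb = (n1 * m1 + n2 * m2) / (n1 + n2)
v_comb = (n1 * v1 + n2 * v2) / (n1 + n2) \
    + n1 * n2 / (n1 + n2) ** 2 * (m1 - m2) ** 2

m_direct, v_direct = mean_var(g1 + g2)
assert abs(m_comb - m_direct) < 1e-9 and abs(v_comb - v_direct) < 1e-9
print(m_comb, v_comb)
```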

## Probability

The sample space is the set of all possible outcomes of an experiment, denoted $\Omega$. A subset of interest is called an event, denoted $A, B, \ldots$. The probability of an event $A$, denoted $P(A)$, is the ratio of the number of elements that are favorable to event $A$ over the total number of elements in $\Omega$.

## Properties

$$0 \le P(A) \le 1$$

$$P(\Omega) = 1$$

$$P(A) + P(A^c) = 1$$

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

If $A$ and $B$ are mutually exclusive events, then $P(A \cap B) = 0$. We refer to the probability of event $A$ given that event $B$ has occurred as the conditional probability, $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, so $P(A \cap B) = P(A \mid B)P(B)$. If events $A$ and $B$ are independent, then $P(A \mid B) = P(A)$ and $P(A \cap B) = P(A)P(B)$.
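These properties can be confirmed exactly on a finite sample space; here a single fair die, with exact arithmetic via fractions (a sketch; note that `|` and `&` below are Python set union and intersection, not conditioning):

```python
from fractions import Fraction

# One roll of a fair die; '|' and '&' are set union / intersection.
omega = set(range(1, 7))
A = {2, 4, 6}                 # even outcome
B = {4, 5, 6}                 # outcome greater than 3

P = lambda E: Fraction(len(E), len(omega))

assert P(A | B) == P(A) + P(B) - P(A & B)     # inclusion-exclusion
assert P(A) + P(omega - A) == 1               # complement rule
cond = P(A & B) / P(B)                        # conditional P(A given B)
print(cond)                                   # 2/3
```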

## Properties of Expected Value and Variance Operators

$$E(aX + bY) = aE(X) + bE(Y)$$

$$Var(aX + b) = a^2\,Var(X)$$
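A simulation sketch illustrating both operator properties (the distributions and constants are arbitrary choices):

```python
import random

# E(aX + bY) = aE(X) + bE(Y) and Var(aX + b) = a^2 Var(X), checked on
# sample moments, where they hold exactly up to floating-point error.
random.seed(0)
N = 10_000
X = [random.uniform(0, 1) for _ in range(N)]
Y = [random.uniform(0, 2) for _ in range(N)]

E = lambda zs: sum(zs) / len(zs)
Var = lambda zs: E([z * z for z in zs]) - E(zs) ** 2

a, b = 3.0, 5.0
lhs = E([a * x + b * y for x, y in zip(X, Y)])
assert abs(lhs - (a * E(X) + b * E(Y))) < 1e-9

shifted = [a * x + b for x in X]
assert abs(Var(shifted) - a * a * Var(X)) < 1e-9
print("operator properties hold on sample moments")
```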

## Random Variables

A random variable $X$ associates a real number to each outcome in $\Omega$. There are two types:

Discrete: the set of all possible outcomes is a finite set or countably infinite.
Continuous: the set of all possible outcomes is an interval of real numbers (uncountably infinite).

## Discrete PDF and CDF

The discrete probability distribution function is $f(x) = P(X = x)$, with

$$f(x) \ge 0, \qquad \sum_x f(x) = 1$$

The discrete cumulative distribution function is

$$F(x) = P(X \le x) = \sum_{t \le x} f(t)$$

The CDF satisfies $F(-\infty) = 0$, $F(\infty) = 1$, and $F'(x) \ge 0$; the last property just means that the CDF is a nondecreasing function. To find $P(a < X \le b) = F(b) - F(a)$.

## Moment Generating Function

The MGF allows us to find the moments about the origin. We define $M_X(t) = E(e^{tX})$. To find the rth moment, take the rth derivative of $M_X$ and then set $t = 0$: $\mu'_r = M_X^{(r)}(0)$.

## Factorial Moment Generating Function

The FMGF allows us to find the factorial moments. We define $W_X(t) = E((1 + t)^X)$. To find the rth factorial moment, take the rth derivative of $W_X$ and then set $t = 0$: $\mu'_{(r)} = W_X^{(r)}(0)$.
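Both generating functions can be sanity-checked numerically: differentiate $E(e^{tX})$ or $E((1+t)^X)$ at $t = 0$ by a central difference and compare with the directly computed first moment. A sketch with a made-up three-point pmf:

```python
import math

# First moment recovered from the MGF and the FMGF by central differences;
# the three-point pmf here is made up for illustration.
pmf = {0: 0.2, 1: 0.5, 3: 0.3}                     # probabilities sum to 1

M = lambda t: sum(p * math.exp(t * x) for x, p in pmf.items())   # E[e^{tX}]
W = lambda t: sum(p * (1 + t) ** x for x, p in pmf.items())      # E[(1+t)^X]

h = 1e-5
first_moment = (M(h) - M(-h)) / (2 * h)            # M'(0) = E(X)
first_fact = (W(h) - W(-h)) / (2 * h)              # W'(0) = mu'_(1) = E(X)

exact = sum(p * x for x, p in pmf.items())         # 1.4
print(first_moment, first_fact, exact)
```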

## Discrete Probability Distributions

## Binomial Distribution

To find the probability of $x$ successes in $n$ trials, where the probability of success is $p$ and $q = 1 - p$:

$$P(X = x) = \binom{n}{x} p^x q^{n-x}, \qquad \binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$

Here $E(X) = np$, $Var(X) = npq$, $M_X(t) = (q + pe^t)^n$, $W_X(t) = (1 + pt)^n$, and $\mu'_{(r)} = \frac{n!}{(n-r)!}\,p^r$. This formula arises from the following sum:

$$(p + q)^n = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x}$$

## Poisson Distribution

Used to model the number of random occurrences on a specified unit of space or time:

$$P(X = x) = \frac{m^x e^{-m}}{x!}, \qquad \text{where } m = E(X) = Var(X)$$

The MGF is $M_X(t) = e^{m(e^t - 1)}$. This arises from the following sum:

$$e^m = \sum_{x=0}^{\infty} \frac{m^x}{x!} = 1 + m + \frac{m^2}{2!} + \frac{m^3}{3!} + \cdots$$

Also note that $\mu'_{(r)} = m^r$.
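A sketch checking both pmfs against their stated means and variances (parameters are illustrative; the Poisson series is truncated where the tail is negligible):

```python
import math

# Binomial(n=10, p=0.3) and Poisson(m=2.5) pmfs checked against the stated
# mean and variance formulas; parameters are illustrative.
n, p = 10, 0.3
q = 1 - p
binom = [math.comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]
mean = sum(x * f for x, f in enumerate(binom))
var = sum(x * x * f for x, f in enumerate(binom)) - mean**2
assert abs(sum(binom) - 1) < 1e-12        # (p + q)^n = 1
assert abs(mean - n * p) < 1e-9 and abs(var - n * p * q) < 1e-9

m = 2.5
poisson = [m**x * math.exp(-m) / math.factorial(x) for x in range(60)]
mean_p = sum(x * f for x, f in enumerate(poisson))
assert abs(sum(poisson) - 1) < 1e-9 and abs(mean_p - m) < 1e-9
print(mean, var, mean_p)
```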