
Dear Students,

Please take your time to read this short reviewer. It tackles some basic concepts of statistics and mathematics that will be essential in our succeeding discussions. I fashioned this module to be an easy read. I will not discuss these in class anymore!
Also, please bring a 3x5 index card next meeting and a 1x1 picture of yourself.
Have a good weekend and start this term right!
Note: We define X & Y as random variables.

Moments of The Probability Distribution

a. The First Moment- The Mean ($\mu_X$)

$\mu_X = E(X) = \sum_{X} X \, f(X)$

*Simply put, the equation says that the expected value of X is the weighted sum of the actual/realized values of X, where each value is weighted by its probability of occurring, f(X).
The mean is the long-run average value of the random variable over many trials.
It is a measure of centrality (another one is the median).
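The weighted sum can be sketched in a few lines of Python. The fair die used here is only an illustrative assumption, and the function name `expected_value` is made up for this example.

```python
# Hypothetical example: X is the face shown by a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]   # realized values of X
probs = [1 / 6] * 6           # f(X): probability of each value


def expected_value(values, probs):
    """Return the sum of each value times its probability."""
    return sum(x * p for x, p in zip(values, probs))


mean = expected_value(values, probs)
print(f"E(X) = {mean:.4f}")
```

Each face contributes its value times 1/6, so the result sits at the center of the six faces.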

b. The Second Moment- The Variance ($\sigma_X^2$)

$\sigma_X^2 = \mathrm{Var}(X) = E[(X - \mu_X)^2] = \sum_{X} (X - \mu_X)^2 f(X)$

The variance is a measure of dispersion of the actual values around the long-run average.
As you can see from the equation, the variance is the weighted sum of the squared deviations of our actual data from our computed average. So to say, it is just the expected squared distance of the actual data from the mean.
So why is it squared? Squaring keeps positive and negative deviations from cancelling each other out. Remember that the square of a negative number is positive.
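A small sketch of why the squaring matters, again assuming a fair die purely for illustration. The variable names are hypothetical.

```python
# Hypothetical example: X is the face shown by a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

mean = sum(x * p for x, p in zip(values, probs))

# Without squaring, positive and negative deviations cancel out.
weighted_deviations = sum((x - mean) * p for x, p in zip(values, probs))

# With squaring, every term is non-negative, so the spread is captured.
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

print(f"sum of weighted deviations = {weighted_deviations:.4f}")
print(f"Var(X) = {variance:.4f}")
```

The first sum is zero up to floating-point rounding, which is exactly the cancelling out that squaring avoids.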

c. Covariance

$\mathrm{Cov}(X, Y) = E\{(X - \mu_X)(Y - \mu_Y)\} = \sum_{X} \sum_{Y} (X - \mu_X)(Y - \mu_Y) f(X, Y)$

From the equation, it can be seen that the covariance is the weighted sum of the products of the differences of the random variables (X & Y) from their average values ($\mu_X$ and $\mu_Y$).
It measures how the variables move together.
For example, looking at the equation it can be seen that if the difference of X from its mean is
positive and the difference of Y from its mean is negative, the product of the two will be a
negative number. Repeating this for each realized pair of values of X & Y and summing up the weighted products will
draw a clear picture of how X & Y move together. If the sum results in a positive (negative)
number, then the variables move in the same (opposite) direction.
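The pair-by-pair summation described above can be sketched as follows. The joint distribution is an invented example, chosen so that X and Y tend to be high or low together.

```python
# Hypothetical joint distribution f(X, Y): keys are (x, y) pairs,
# values are the probabilities of observing that pair.
joint = {
    (1, 1): 0.4,
    (1, 2): 0.1,
    (2, 1): 0.1,
    (2, 2): 0.4,
}

mean_x = sum(x * p for (x, y), p in joint.items())
mean_y = sum(y * p for (x, y), p in joint.items())

# Weighted sum of the products of the deviations from each mean.
covariance = sum(
    (x - mean_x) * (y - mean_y) * p for (x, y), p in joint.items()
)
print(f"Cov(X, Y) = {covariance:.4f}")
```

Most of the probability sits on the pairs where both deviations share a sign, so the sum comes out positive.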

d. Correlation Coefficient

-The correlation coefficient is the standardized covariance, meaning we divide the covariance by the product of the standard deviations of X & Y: $\rho = \mathrm{Cov}(X, Y) / (\sigma_X \sigma_Y)$.
-It measures the degree/strength of linear association between the two variables.
-Why do we standardize? It is to eliminate the units, because it is possible that X & Y have
different units.
-The correlation coefficient is a number that lies between -1 and 1, inclusive.
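To illustrate that standardizing removes the units, here is a minimal sketch. The joint distribution and the helper name `correlation` are assumptions made for this example.

```python
import math


def correlation(joint):
    """Correlation of X and Y given a joint distribution {(x, y): prob}."""
    mean_x = sum(x * p for (x, y), p in joint.items())
    mean_y = sum(y * p for (x, y), p in joint.items())
    cov = sum((x - mean_x) * (y - mean_y) * p for (x, y), p in joint.items())
    var_x = sum((x - mean_x) ** 2 * p for (x, y), p in joint.items())
    var_y = sum((y - mean_y) ** 2 * p for (x, y), p in joint.items())
    return cov / (math.sqrt(var_x) * math.sqrt(var_y))


# Hypothetical joint distribution of X and Y.
joint = {(1, 1): 0.4, (1, 2): 0.1, (2, 1): 0.1, (2, 2): 0.4}
# Same distribution, but X measured in a unit 100 times smaller.
rescaled = {(100 * x, y): p for (x, y), p in joint.items()}

rho = correlation(joint)
rho_rescaled = correlation(rescaled)
print(f"rho = {rho:.4f}, rho after rescaling X = {rho_rescaled:.4f}")
```

Rescaling X multiplies the covariance by 100 but also multiplies the standard deviation of X by 100, so the ratio does not change.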

Hypothesis Testing

We define two hypotheses, the null and the alternative:

The null hypothesis is our a priori expectation or our conventional wisdom, while our alternative
hypothesis is our challenge to this belief.
Therefore we can define hypothesis testing as trying to disprove current beliefs (null) about the population
by using test statistics computed from our sample at hand (based on the mean, variance, etc.) as our proof.
Possible outcomes of hypothesis testing

                        Reject           Do Not Reject
When null is true       Type I Error     No Error
When null is false      No Error         Type II Error

The P-value and rejection rules

The p-value is the lowest significance level at which a null could be rejected.
The significance level ($\alpha$) is the chance of committing a Type I error that we are willing to accept.
We usually set the significance level at .01, .05 or .10.

The rejection rule: if the computed p-value (from testing) is less than our pre-set significance level (either .01, .05
or .10), then we reject the null hypothesis.

Think of our pre-set significance level as our maximum tolerable chance of committing a mistake. If the
computed p-value is greater than this threshold, then rejecting the null would carry a chance of committing a Type I error
that is greater than what we tolerate, and therefore we do not reject the null!
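The rejection rule can be sketched in code. This example assumes, purely for illustration, a two-sided test whose test statistic follows a standard normal distribution. The value z = 2.2 and the function names are hypothetical.

```python
import math


def two_sided_p_value(z):
    """P-value of a two-sided test for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))


def reject_null(p_value, alpha):
    """Rejection rule: reject when the p-value is below the threshold alpha."""
    return p_value < alpha


z = 2.2  # hypothetical computed test statistic
p = two_sided_p_value(z)
for alpha in (0.01, 0.05, 0.10):
    decision = "reject" if reject_null(p, alpha) else "do not reject"
    print(f"alpha = {alpha:.2f}: p = {p:.4f}, {decision} the null")
```

The same computed p-value can lead to different decisions depending on the threshold chosen before testing, which is why the threshold should be set in advance.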

Some important mathematical concepts to remember

These properties will be very helpful to you once we start deriving in our discussions.

a. Summation Notation Properties

Taken from : (You can review the summation notation from this
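As a quick reminder, these are the standard summation properties that usually appear under this heading. The notation (a constant c, data points x_i and y_i, sample size n) is assumed here and is not taken from the referenced material.

```latex
\begin{align*}
\sum_{i=1}^{n} c &= nc \\
\sum_{i=1}^{n} c\,x_i &= c \sum_{i=1}^{n} x_i \\
\sum_{i=1}^{n} (x_i + y_i) &= \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i \\
\sum_{i=1}^{n} (x_i - \bar{x}) &= 0, \quad \text{where } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\end{align*}
```

The last line is the sample counterpart of the cancelling out of deviations mentioned in the variance section.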

b. Properties of Expected Values

You can check out the first page of this pdf file:
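Independently of that file, the standard properties of expected values are listed here for constants a, b and c. This notation is assumed for the sketch and may differ from the referenced file.

```latex
\begin{align*}
E(c) &= c \\
E(aX + b) &= a\,E(X) + b \\
E(X + Y) &= E(X) + E(Y) \\
\mathrm{Var}(X) &= E(X^2) - [E(X)]^2
\end{align*}
```

The last identity follows from expanding $E[(X - \mu_X)^2]$ and applying the first three properties.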