
General Course Summary

All the formulas…

Prof. David Moreno Vincent
Facultad de Ciencias de la Economía y de la Empresa
Key Definitions

• A population is the collection of all items of interest or under investigation
• N represents the population size
• A sample is an observed subset of the population
• n represents the sample size
• A parameter is a specific characteristic of a population
• A statistic is a specific characteristic of a sample
Types of Data

Data is either categorical or numerical; numerical data is either discrete or continuous.

• Categorical (defined categories or groups). Examples: marital status, eye color
• Numerical, discrete (counted items). Example: number of children
• Numerical, continuous (measured characteristics). Example: weight
Arithmetic Mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

Geometric Mean “G”

$$G = \sqrt[N]{x_1^{n_1} \times x_2^{n_2} \times \cdots \times x_n^{n_n}} = \sqrt[N]{\prod_{i=1}^{n} x_i^{n_i}}$$

Harmonic Mean “H”

$$H = \frac{N}{\frac{n_1}{x_1} + \frac{n_2}{x_2} + \cdots + \frac{n_n}{x_n}} = \frac{N}{\sum_{i=1}^{n} \frac{n_i}{x_i}}$$
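A minimal Python sketch of the three means for grouped data, using hypothetical values $x_i$ and absolute frequencies $n_i$:

```python
# Sketch: arithmetic, geometric and harmonic means for values x_i
# with absolute frequencies n_i (hypothetical data); N = sum of n_i.
from math import prod

x = [2.0, 4.0, 8.0]   # observed values
n = [3, 2, 5]         # absolute frequencies
N = sum(n)

arithmetic = sum(xi * ni for xi, ni in zip(x, n)) / N
geometric  = prod(xi ** ni for xi, ni in zip(x, n)) ** (1 / N)
harmonic   = N / sum(ni / xi for xi, ni in zip(x, n))

print(arithmetic, geometric, harmonic)  # arithmetic >= geometric >= harmonic
```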
Finding the Median (iii)

• If data are given in intervals:

| Data | Frequency | Cum. freq. |
| --- | --- | --- |
| $I_1$ | $n_1$ | $N_1$ |
| … | … | … |
| $I_i$ | $n_i$ | $N_i$ |
| … | … | … |
| $I_k$ | $n_k$ | $N_k = N$ |

Where:
• $I_i$ are the intervals
• $L_i$ is the upper limit of interval $I_i$
• $c_i$ is the range (width) of the interval
• $N_i$ is the cumulative frequency

• Step 1 - Identify the position: N/2
• Step 2 - Determine the interval containing it (say, $I_i$)
• Step 3 - Determine its range ($c_i$) and the upper limit of the previous interval, $L_{i-1}$

$$Me = L_{i-1} + \frac{\frac{N}{2} - N_{i-1}}{n_i} \cdot c_i$$
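A short Python sketch of the interval-median rule, with hypothetical interval limits and frequencies:

```python
# Sketch of Me = L_{i-1} + (N/2 - N_{i-1}) / n_i * c_i for grouped data,
# using hypothetical intervals [L_{i-1}, L_i) with frequencies n_i.
limits = [0, 10, 20, 30, 40]   # interval boundaries L_0..L_4
freq   = [5, 12, 18, 5]        # n_i for each interval
N = sum(freq)

cum = 0
for i, ni in enumerate(freq):
    if cum + ni >= N / 2:      # first interval whose cumulative freq reaches N/2
        L_prev, c = limits[i], limits[i + 1] - limits[i]
        Me = L_prev + (N / 2 - cum) / ni * c
        break
    cum += ni

print(Me)  # 20 + (20 - 17)/18 * 10 ≈ 21.67
```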
Variance

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, n_i}{N}$$

Standard Deviation

$$S = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, n_i}{N}}$$
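A Python sketch of these two formulas, reusing hypothetical frequency data:

```python
# Sketch: variance and standard deviation for values x_i with
# frequencies n_i, matching s^2 = sum((x_i - xbar)^2 * n_i) / N.
from math import sqrt

x = [2.0, 4.0, 8.0]
n = [3, 2, 5]
N = sum(n)

xbar = sum(xi * ni for xi, ni in zip(x, n)) / N
s2 = sum((xi - xbar) ** 2 * ni for xi, ni in zip(x, n)) / N
s = sqrt(s2)
print(xbar, s2, s)
```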
Chebyshev’s Theorem

For any population with mean μ and standard deviation σ, and k > 1, the percentage of observations that fall within the interval

$$[\mu - k\sigma,\; \mu + k\sigma]$$

is at least

$$100\left[1 - \frac{1}{k^2}\right]\%$$
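For example, with k = 2 the theorem guarantees that at least 100[1 − (1/2²)]% = 75% of observations fall within two standard deviations of the mean, whatever the shape of the distribution.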
Coefficient of Variation

• Measures relative variation
• Shows variation relative to the mean
• Can be used to compare two or more sets of data measured in different units

$$CV = \left(\frac{s}{\bar{x}}\right) \times 100\%$$
Asymmetry measures

• Coefficients of asymmetry of PEARSON:

$$A_p = \frac{\bar{x} - Mo}{S} = \frac{3(\bar{x} - Me)}{S}$$

• Coefficient of asymmetry of FISHER:

$$g_1 = \frac{m_3}{S^3} = \frac{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^3 \, n_i}{N}}{\left(\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, n_i}{N}\right)^{3/2}}$$
Coefficient of Kurtosis

$$g_2 = \frac{m_4}{S^4} - 3$$

• Because, in a Normal distribution:

$$m_4 = 3S^4$$
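A Python sketch that computes $g_1$ and $g_2$ from hypothetical frequency data via the central moments:

```python
# Sketch: Fisher skewness g1 = m3/S^3 and kurtosis g2 = m4/S^4 - 3
# from values x_i with frequencies n_i (hypothetical data).
x = [1.0, 2.0, 3.0, 10.0]
n = [4, 3, 2, 1]
N = sum(n)

xbar = sum(xi * ni for xi, ni in zip(x, n)) / N

def m(r):  # r-th central moment
    return sum((xi - xbar) ** r * ni for xi, ni in zip(x, n)) / N

S = m(2) ** 0.5
g1 = m(3) / S ** 3
g2 = m(4) / S ** 4 - 3
print(g1, g2)  # g1 > 0: right-skewed; g2 > 0: heavier tails than normal
```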
Conditional distribution

• The distribution of values of one variable when the value of the other variable is fixed.

| X | $n_{ij}$ | $f_{i/j}$ |
| --- | --- | --- |
| $x_1$ | $n_{1j}$ | $f_{1/j}$ |
| … | … | … |
| $x_i$ | $n_{ij}$ | $f_{i/j}$ |
| … | … | … |

$$n_{\cdot j} = \sum_i n_{ij} \qquad f_{i/j} = \frac{n_{ij}}{n_{\cdot j}}$$
Covariance

$$Cov(x, y) = S_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n}$$

Coefficient of Correlation

$$r = \frac{Cov(x, y)}{s_X \, s_Y}$$
Obtaining Linear Relationships

• An equation can be fit to show the best linear relationship between two variables:

$$Y = a + bX$$

• where Y is the dependent variable and X is the independent variable
Linear regression

• Regression line of y on x: $Y = a + bX$, with

$$b_{y/x} = \frac{S_{xy}}{S_x^2} \qquad a = \bar{y} - \frac{S_{xy}}{S_x^2}\,\bar{x}$$

• Correlation coefficient:

$$r = \frac{S_{xy}}{S_x \, S_y}$$
Coefficient of determination

• The coefficient of determination (R²) is a number that measures how well a statistical model predicts an outcome. You can interpret R² as the proportion of variation in the dependent variable that is predicted by the statistical model.

$$R^2 = r^2 = \frac{S_{xy}^2}{S_x^2 \, S_y^2}$$

ONLY IN LINEAR REGRESSION!

• It is normally expressed as a percentage and is also known as “goodness of fit”
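A Python sketch putting the regression formulas together — $S_{xy}$, $b$, $a$, $r$ and $R^2$ — on hypothetical paired data:

```python
# Sketch of the least-squares line Y = a + bX and R^2 from paired data
# (hypothetical values); b = S_xy / S_x^2, a = ybar - b*xbar, R^2 = r^2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(xs)

xbar = sum(xs) / n
ybar = sum(ys) / n
s_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / n
s_x2 = sum((x - xbar) ** 2 for x in xs) / n
s_y2 = sum((y - ybar) ** 2 for y in ys) / n

b = s_xy / s_x2
a = ybar - b * xbar
r = s_xy / (s_x2 ** 0.5 * s_y2 ** 0.5)
print(a, b, r, r ** 2)  # R^2 close to 1: the line explains most variation
```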
Simple indexes

• They measure the variation of a single magnitude

$$I_0^t = \frac{p_t}{p_0}$$

where $p_t$ is the price in the current period and $p_0$ the price in the base period
Complex indexes

• Arithmetic mean:

$$I_0^t = \frac{\sum_{i=1}^{n} I_{i0}^t}{n} = \frac{\sum_{i=1}^{n} \frac{p_{it}}{p_{i0}}}{n}$$

• Geometric mean:

$$I_0^t = \sqrt[n]{\prod_{i=1}^{n} I_{i0}^t} = \sqrt[n]{\prod_{i=1}^{n} \frac{p_{it}}{p_{i0}}}$$
LASPEYRES

• Weights by the quantities of the base year
• GOODS: $x_1, x_2, \ldots, x_n$
• WEIGHTS: $q_{10}, q_{20}, \ldots, q_{n0}$ ($q_{i0}$)

$$L_p = \frac{\sum_{i=1}^{n} I_i w_i}{\sum_{i=1}^{n} w_i} = \frac{\sum_{i=1}^{n} \frac{p_{it}}{p_{i0}}\, p_{i0} q_{i0}}{\sum_{i=1}^{n} p_{i0} q_{i0}} = \frac{\sum_{i=1}^{n} p_{it} q_{i0}}{\sum_{i=1}^{n} p_{i0} q_{i0}}$$
PAASCHE

• Weights by the quantities of the current year
• GOODS: $x_1, x_2, \ldots, x_n$
• WEIGHTS: $q_{1t}, q_{2t}, \ldots, q_{nt}$ ($q_{it}$)

$$P_p = \frac{\sum_{i=1}^{n} I_i w_i}{\sum_{i=1}^{n} w_i} = \frac{\sum_{i=1}^{n} \frac{p_{it}}{p_{i0}}\, p_{i0} q_{it}}{\sum_{i=1}^{n} p_{i0} q_{it}} = \frac{\sum_{i=1}^{n} p_{it} q_{it}}{\sum_{i=1}^{n} p_{i0} q_{it}}$$
EDGEWORTH

• Weights by the average of the quantities of the base and current years

$$E_p = \frac{\sum_{i=1}^{n} p_{it} w_i}{\sum_{i=1}^{n} p_{i0} w_i} = \frac{\sum_{i=1}^{n} p_{it} (q_{i0} + q_{it})}{\sum_{i=1}^{n} p_{i0} (q_{i0} + q_{it})}$$
FISHER

• It is obtained from the LASPEYRES and PAASCHE indexes:

$$F_p = \sqrt{L_p \times P_p}$$
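A Python sketch of the three indexes on hypothetical prices and quantities for three goods:

```python
# Sketch: Laspeyres, Paasche and Fisher price indexes for n goods,
# with hypothetical base-year (0) and current-year (t) prices/quantities.
p0 = [1.0, 2.0, 5.0]   # base-year prices
pt = [1.2, 2.5, 5.5]   # current-year prices
q0 = [10, 5, 2]        # base-year quantities
qt = [9, 6, 3]         # current-year quantities

laspeyres = sum(p * q for p, q in zip(pt, q0)) / sum(p * q for p, q in zip(p0, q0))
paasche   = sum(p * q for p, q in zip(pt, qt)) / sum(p * q for p, q in zip(p0, qt))
fisher    = (laspeyres * paasche) ** 0.5
print(laspeyres, paasche, fisher)  # Fisher lies between the other two
```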
Value of the n items in monetary units of year t:

$$v_t = \sum_{i=1}^{n} p_{it} q_{it}$$

Deflating $v_t$ by the price index $P_0^t = \dfrac{\sum_{i=1}^{n} p_{it} q_{it}}{\sum_{i=1}^{n} p_{i0} q_{it}}$ gives

$$\frac{v_t}{P_0^t} = \frac{\sum_{i=1}^{n} p_{it} q_{it}}{\dfrac{\sum_{i=1}^{n} p_{it} q_{it}}{\sum_{i=1}^{n} p_{i0} q_{it}}} = \sum_{i=1}^{n} p_{i0} q_{it}$$

i.e. the value of the year-t quantities at base-year prices. $P_0^t$ acts as a DEFLATOR.
Probability

Prof. David Moreno Vincent
Facultad de Ciencias de la Economía y de la Empresa
Assessing Probability

• There are three approaches to assessing the probability of an uncertain event:

1. Classical probability

$$\text{probability of event } A = \frac{N_A}{N} = \frac{\text{number of outcomes that satisfy the event}}{\text{total number of outcomes in the sample space}}$$

• Assumes all outcomes in the sample space are equally likely to occur
Counting the Possible Outcomes

• Use the Combinations formula to determine the number of combinations of n things taken k at a time:

$$C_k^n = \frac{n!}{k!\,(n - k)!}$$

• where
  • n! = n(n−1)(n−2)…(1)
  • 0! = 1 by definition
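A one-line check in Python; `math.comb` applies the same formula:

```python
# Sketch: combinations of n things taken k at a time, C(n, k) = n!/(k!(n-k)!).
from math import comb, factorial

n, k = 5, 2
print(comb(n, k))                                         # 10, via the standard library
print(factorial(n) // (factorial(k) * factorial(n - k)))  # 10, from the formula
```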
Probability Postulates

• 1. If A is any event in the sample space S, then $0 \le P(A) \le 1$

• 2. Let A be an event in S, and let $O_i$ denote the basic outcomes. Then

$$P(A) = \sum_{O_i \in A} P(O_i)$$

(the notation means that the summation is over all the basic outcomes in A)

• 3. P(S) = 1
Probability Rules

• The Complement rule:

$$P(\bar{A}) = 1 - P(A), \quad \text{i.e.} \quad P(\bar{A}) + P(A) = 1$$

• The Addition rule: the probability of the union of two events is

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
Conditional Probability

• A conditional probability is the probability of one event, given that another event has occurred:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{(the conditional probability of A given that B has occurred)}$$

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} \quad \text{(the conditional probability of B given that A has occurred)}$$
Odds: Example

• Calculate the probability of winning if the odds of winning are 3 to 1:

$$\text{odds} = \frac{P(A)}{1 - P(A)} = 3 \text{ to } 1$$

• Which is the same as saying that

3 × (1 − P(A)) = P(A)
3 − 3P(A) = P(A)
3 = 4P(A)
P(A) = 0.75
Bayes’ Theorem

• Conditional probability of a random event A, given B:

$$P(A_j \mid B) = \frac{P(B \mid A_j)\,P(A_j)}{\sum_{i=1}^{n} P(B \mid A_i)\,P(A_i)} = \frac{P(B \mid A_j)\,P(A_j)}{P(B)}$$

• where:
  $A_j$ = jth event of n mutually exclusive and collectively exhaustive events
  B = new event that might impact $P(A_j)$
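A Python sketch of the theorem with two hypothetical events and freely chosen priors and likelihoods:

```python
# Sketch of Bayes' theorem with two hypothetical mutually exclusive,
# collectively exhaustive events A1, A2; priors and likelihoods chosen freely.
priors      = [0.3, 0.7]   # P(A1), P(A2)
likelihoods = [0.8, 0.1]   # P(B|A1), P(B|A2)

p_b = sum(p * l for p, l in zip(priors, likelihoods))            # P(B), total probability
posteriors = [p * l / p_b for p, l in zip(priors, likelihoods)]  # P(Aj|B)
print(posteriors)  # posteriors sum to 1
```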
Expected Value

• Expected Value (or mean) of a discrete distribution (weighted average):

$$\mu = E(x) = \sum_x x\,P(x)$$

• Example: Toss 2 coins, let x = # of heads, and compute the expected value of x:

| x | P(x) |
| --- | --- |
| 0 | .25 |
| 1 | .50 |
| 2 | .25 |

E(x) = (0 × .25) + (1 × .50) + (2 × .25) = 1.0
Variance and Standard Deviation

• Variance of a discrete random variable X:

$$\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 P(x)$$

• Standard Deviation of a discrete random variable X:

$$\sigma = \sqrt{\sigma^2} = \sqrt{\sum_x (x - \mu)^2 P(x)}$$
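A Python sketch applying both definitions to the two-coin example above:

```python
# Sketch: mean, variance and standard deviation of the two-coin example
# (x = number of heads) straight from the definitions.
xs = [0, 1, 2]
ps = [0.25, 0.50, 0.25]

mu = sum(x * p for x, p in zip(xs, ps))
var = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))
sigma = var ** 0.5
print(mu, var, sigma)  # 1.0, 0.5, 0.707...
```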
Linear Functions of Random Variables

• Let a be any constant.

• a) $E(a) = a$ and $Var(a) = 0$
  i.e., if a random variable always takes the value a, it will have mean a and variance 0

• b) $E(aX) = a\mu_X$ and $Var(aX) = a^2\sigma_X^2$
  i.e., the expected value of a·X is a·E(X)
Linear Functions of Random Variables

• Let random variable X have mean $\mu_X$ and variance $\sigma_X^2$
• Let a and b be any constants.
• Let Y = a + bX
• Then the mean and variance of Y are

$$\mu_Y = E(a + bX) = a + b\mu_X$$
$$\sigma_Y^2 = Var(a + bX) = b^2\sigma_X^2$$

• so that the standard deviation of Y is

$$\sigma_Y = |b|\,\sigma_X$$
Bernoulli Distribution: Mean and Variance

• The mean is μ = P

$$\mu = E(X) = \sum_x x\,P(x) = (0)(1 - P) + (1)(P) = P$$

• The variance is σ² = P(1 − P)

$$\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 P(x) = (0 - P)^2(1 - P) + (1 - P)^2 P = P(1 - P)$$
Binomial Distribution Formula

$$P(x) = \frac{n!}{x!\,(n - x)!}\,P^x (1 - P)^{n - x}$$

P(x) = probability of x successes in n trials, with probability of success P on each trial
x = number of ‘successes’ in sample (x = 0, 1, 2, ..., n)
n = sample size (number of trials or observations)
P = probability of “success”

Example: Flip a coin four times, let x = # heads:
n = 4, P = 0.5, 1 − P = (1 − 0.5) = 0.5, x = 0, 1, 2, 3, 4
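A Python sketch of the formula for the coin example on this slide:

```python
# Sketch: binomial probabilities for the slide's coin example (n = 4, P = 0.5).
from math import comb

n, P = 4, 0.5
for x in range(n + 1):
    prob = comb(n, x) * P ** x * (1 - P) ** (n - x)
    print(x, prob)  # 0.0625, 0.25, 0.375, 0.25, 0.0625
```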
Hypergeometric Distribution Formula

$$P(x) = \frac{C_x^S \, C_{n-x}^{N-S}}{C_n^N} = \frac{\dfrac{S!}{x!\,(S - x)!} \times \dfrac{(N - S)!}{(n - x)!\,(N - S - n + x)!}}{\dfrac{N!}{n!\,(N - n)!}}$$

Where
N = population size
S = number of successes in the population
N − S = number of failures in the population
n = sample size
x = number of successes in the sample
n − x = number of failures in the sample
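A Python sketch of the formula with hypothetical values of N, S, n and x:

```python
# Sketch: hypergeometric probability with hypothetical N, S, n, x.
from math import comb

N, S, n, x = 10, 4, 3, 2   # population 10, 4 successes, sample 3, want 2 successes
prob = comb(S, x) * comb(N - S, n - x) / comb(N, n)
print(prob)  # C(4,2)*C(6,1)/C(10,3) = 36/120 = 0.3
```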
Poisson Distribution Formula

$$P(x) = \frac{e^{-\lambda}\,\lambda^x}{x!}$$

where:
x = number of successes per unit
λ = expected number of successes per unit
e = base of the natural logarithm system (2.71828...)
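A Python sketch of the formula for a hypothetical λ:

```python
# Sketch: Poisson probabilities P(x) = e^(-lam) * lam^x / x! for a hypothetical lam.
from math import exp, factorial

lam = 2.0
for x in range(5):
    print(x, exp(-lam) * lam ** x / factorial(x))
```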
Poisson Distribution Characteristics

• Mean: $\mu = E(x) = \lambda$

• Variance and Standard Deviation:

$$\sigma^2 = E[(X - \mu)^2] = \lambda \qquad \sigma = \sqrt{\lambda}$$

where λ = expected number of successes per unit

The Uniform Distribution

The Continuous Uniform Distribution:

$$f(x) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

where
f(x) = value of the density function at any x value
a = minimum value of x
b = maximum value of x
Properties of the Uniform Distribution

• The mean of a uniform distribution is $\mu = \dfrac{a + b}{2}$

• The variance is $\sigma^2 = \dfrac{(b - a)^2}{12}$
Linear Functions of Variables

• Let W = a + bX, where X has mean $\mu_X$ and variance $\sigma_X^2$, and a and b are constants

• Then the mean of W is $\mu_W = E(a + bX) = a + b\mu_X$

• the variance is $\sigma_W^2 = Var(a + bX) = b^2\sigma_X^2$

• the standard deviation of W is $\sigma_W = |b|\,\sigma_X$
Linear Functions of Variables

• An important special case of the previous results is the standardized random variable

$$Z = \frac{X - \mu_X}{\sigma_X}$$

• which has mean 0 and variance 1
The Normal Probability Density Function

• The formula for the normal probability density function is

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x - \mu)^2 / 2\sigma^2}$$

Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean
σ = the population standard deviation
x = any value of the continuous variable, −∞ < x < ∞
The Standardized Normal

• Any normal distribution (with any mean and variance combination) can be transformed into the standardized normal distribution (Z), with mean 0 and variance 1:

$$Z \sim N(0, 1)$$

• Need to transform X units into Z units by subtracting the mean of X and dividing by its standard deviation:

$$Z = \frac{X - \mu}{\sigma}$$
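A Python sketch of the transformation, with hypothetical μ, σ and x; the standard normal CDF is evaluated through the error function, $\Phi(z) = \tfrac{1}{2}(1 + \operatorname{erf}(z/\sqrt{2}))$:

```python
# Sketch: standardize X to Z and evaluate the standard normal CDF via erf;
# mu, sigma and x are hypothetical values.
from math import erf, sqrt

mu, sigma, x = 100.0, 15.0, 120.0
z = (x - mu) / sigma
phi = (1 + erf(z / sqrt(2))) / 2
print(z, phi)  # z ≈ 1.33, P(X < 120) ≈ 0.909
```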
Normal Distribution Approximation for Binomial Distribution

• The shape of the binomial distribution is approximately normal if n is large

• The normal is a good approximation to the binomial when nP(1 − P) > 9

• Standardize to Z from a binomial distribution:

$$Z = \frac{X - E(X)}{\sqrt{Var(X)}} = \frac{X - nP}{\sqrt{nP(1 - P)}}$$
The Exponential Distribution

• Defined by a single parameter, λ (lambda), the mean number of arrivals per unit of time

• The cumulative distribution function (the probability that an arrival time is less than some specified time t) is

$$F(t) = 1 - e^{-\lambda t}$$

where e = mathematical constant approximated by 2.71828
λ = the population mean number of arrivals per unit
t = any value of the continuous variable, where t > 0
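A Python sketch of the CDF for a hypothetical arrival rate:

```python
# Sketch: P(arrival time < t) = 1 - e^(-lam * t) for a hypothetical rate lam.
from math import exp

lam = 3.0            # mean arrivals per unit of time
t = 0.5
print(1 - exp(-lam * t))  # ≈ 0.777
```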
Types of convergence

Almost sure convergence

$$\xi_n \xrightarrow{a.s.} \xi$$

A sequence of random variables {ξn}, “not necessarily independent”, is said to converge almost surely to a random variable ξ if

$$P\left(\lim_{n \to \infty} \xi_n = \xi\right) = 1$$
Convergence in probability

$$\xi_n \xrightarrow{P} \xi$$

A sequence of random variables {ξn} is said to converge in probability to a random variable ξ if, for any ε > 0,

$$\lim_{n \to \infty} P(|\xi_n - \xi| > \varepsilon) = 0 \quad \Longleftrightarrow \quad \lim_{n \to \infty} P(|\xi_n - \xi| < \varepsilon) = 1$$
Convergence in distribution

$$\xi_n \xrightarrow{d} \xi$$

A sequence of random variables {ξn}, with distribution functions Fn(t), converges in distribution to the random variable ξ, with distribution function F(t), if and only if the sequence {Fn(t)} converges to F(t). That is:

$$\lim_{n \to \infty} F_{\xi_n}(t) = F_{\xi}(t)$$

for all t at which $F_{\xi}(t)$ is continuous.
Convergence in mean of rth order

$$\xi_n \xrightarrow{m_r} \xi$$

• A sequence of random variables {ξn} converges in rth mean to a random variable ξ if

$$\lim_{n \to \infty} E|\xi_n - \xi|^r = 0$$

provided $E|\xi_n|^r$ is finite.

• If the rth moment exists, all lower-order moments exist as well. Thus, if s ≤ r:

$$\xi_n \xrightarrow{m_r} \xi \;\Rightarrow\; \xi_n \xrightarrow{m_s} \xi$$

• If r = 2, it is called mean-square convergence.
Relation between different types of convergence

• Almost sure convergence ⇒ convergence in probability
• Convergence in mean of order r ⇒ convergence in probability
• Convergence in probability ⇒ convergence in distribution
Moivre-Laplace theorem

Given a sequence of random variables $\xi_n \sim B(n, p)$ with

$$E(\xi_n) = np \quad \text{and} \quad V(\xi_n) = npq$$

we can define a new random variable $\eta_n$ such that

$$\eta_n = \frac{\xi_n - E(\xi_n)}{\sigma_{\xi_n}} = \frac{\xi_n - np}{\sqrt{npq}} \xrightarrow{d} N(0, 1)$$

so that, asymptotically, $\xi_n$ behaves as $N(np, \sqrt{npq})$ (mean $np$, standard deviation $\sqrt{npq}$).
Central limit theorem

Let {ξn} be a sequence of random variables and let Yn be the sum

Yn = ξ1 + ξ2 + … + ξn

The standardization of Yn converges in distribution to a normally distributed random variable N(0, 1):

$$\frac{Y_n - E(Y_n)}{\sigma_{Y_n}} \xrightarrow{d} N(0, 1)$$

The CLT allows us to compute probabilities without knowing the corresponding distribution, provided the sample size is large enough.
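A small simulation sketch illustrating the theorem with uniform variables; the sample size and number of replications are arbitrary choices:

```python
# Sketch: empirical illustration of the CLT — standardized sums of
# Uniform(0,1) variables behave like N(0,1) for large n.
import random
from math import sqrt

n, reps = 100, 10_000
mu, var = 0.5, 1 / 12          # mean and variance of Uniform(0,1)

zs = []
for _ in range(reps):
    y = sum(random.random() for _ in range(n))
    zs.append((y - n * mu) / sqrt(n * var))   # (Y_n - E(Y_n)) / sd(Y_n)

inside = sum(-1.96 <= z <= 1.96 for z in zs) / reps
print(inside)  # ≈ 0.95, as for a standard normal
```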
