ECON240 Formula help sheet

Measures of central tendency and location


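Population mean ($\mu$): $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$

Sample mean ($\bar{x}$): $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

Geometric sample mean ($\bar{x}_g$): $\bar{x}_g = \sqrt[n]{x_1 \times x_2 \times \cdots \times x_n} = (x_1 \times x_2 \times \cdots \times x_n)^{1/n}$

Weighted mean ($\bar{x}_w$): $\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{n}$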

Median (𝑄2 ) position: 0.5 ∙ (𝑛 + 1)th position of ordered data


First quartile (𝑄1) position: 0.25 ∙ (𝑛 + 1)th position of ordered data
Third quartile (𝑄3) position: 0.75 ∙ (𝑛 + 1)th position of ordered data
Where 𝑁: population size; 𝑛: sample size; and 𝑤𝑖 is the weight of the ith observation.
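
The location formulas above can be checked numerically. The following is a minimal Python sketch on made-up data. The helper `position_value` and the linear interpolation it applies to fractional positions are assumptions, since the sheet only gives the position formulas.

```python
import math

def position_value(ordered, position):
    """Return the value at a 1-based position in ordered data.

    Fractional positions are resolved by linear interpolation between
    neighbouring values (an assumption; the sheet gives positions only).
    """
    lower = math.floor(position)
    fraction = position - lower
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)

data = [9, 4, 2, 7, 10, 5, 4]          # hypothetical sample
ordered = sorted(data)                 # [2, 4, 4, 5, 7, 9, 10]
n = len(ordered)

sample_mean = sum(ordered) / n
geometric_mean = math.prod(ordered) ** (1 / n)

q1 = position_value(ordered, 0.25 * (n + 1))      # position 2
median = position_value(ordered, 0.50 * (n + 1))  # position 4
q3 = position_value(ordered, 0.75 * (n + 1))      # position 6

print(sample_mean, geometric_mean, q1, median, q3)
```

With n = 7 the three positions are whole numbers, so no interpolation is needed in this example.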

Measures of variability
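Population variance ($\sigma^2$): $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

Sample variance ($s^2$): $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$

Population standard deviation ($\sigma$): $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$

Sample standard deviation ($s$): $s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$

Population coefficient of variation (CV): $CV = \left(\frac{\sigma}{\mu}\right) \cdot 100\%$

Sample coefficient of variation (CV): $CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$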

Interquartile range (𝐼𝑄𝑅): 𝐼𝑄𝑅 = 𝑄3 – 𝑄1


Z-score ($z$): $z = \frac{x_i - \mu}{\sigma}$

Where 𝑁: population size; 𝑛: sample size; 𝜇: population mean; 𝑥̅ : sample mean.
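
As a rough illustration of the variability measures, the sketch below computes them by hand on a small made-up data set. It treats the same five values first as a population and then as a sample, purely to show the difference between dividing by N and by n − 1.

```python
import math

data = [4, 8, 6, 5, 7]                 # hypothetical values
n = len(data)
mean = sum(data) / n                   # 6.0

# Sum of squared deviations from the mean: 4 + 4 + 0 + 1 + 1 = 10
sum_sq = sum((x - mean) ** 2 for x in data)

pop_variance = sum_sq / n              # divide by N
sample_variance = sum_sq / (n - 1)     # divide by n - 1
pop_sd = math.sqrt(pop_variance)
sample_sd = math.sqrt(sample_variance)

sample_cv = sample_sd / mean * 100     # coefficient of variation, in percent
z_of_8 = (8 - mean) / pop_sd           # z-score of the observation x = 8

print(pop_variance, sample_variance, sample_cv, z_of_8)
```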


Chebyshev Theorem:
For any population with mean ($\mu$) and standard deviation ($\sigma$), and $k > 1$, the percentage of
observations that fall within the interval $[\mu - k\sigma,\ \mu + k\sigma]$ is at least $100\left[1 - \frac{1}{k^2}\right]\%$
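
A small sketch of the Chebyshev bound, assuming a hypothetical helper name `chebyshev_lower_bound`; it returns the minimum percentage of observations within k standard deviations of the mean.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of observations within mu +/- k*sigma, for k > 1."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 100 * (1 - 1 / k ** 2)

for k in (1.5, 2, 3):
    print(k, chebyshev_lower_bound(k))
```

For k = 2 the bound is 75%, which is much weaker than the 95% of the empirical rule because it makes no assumption about the shape of the distribution.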

Empirical rule:
If the data distribution is bell-shaped with mean (𝜇) and standard deviation (𝜎),
approximately 68% of the values in the data are within the 𝜇 ± 𝜎 range
approximately 95% of the values in the data are within the 𝜇 ± 2 ∙ 𝜎 range
approximately 99.7% of the values in the data are within the 𝜇 ± 3 ∙ 𝜎 range

Measures of relationships between two variables (𝒙 and 𝒚)


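Population covariance ($\sigma_{xy}$): $Cov(x, y) = \sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$

Sample covariance ($s_{xy}$): $Cov(x, y) = s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1}$

Population correlation coefficient ($\rho$): $\rho = \frac{Cov(x, y)}{\sigma_x \sigma_y} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N \sigma_x \sigma_y}$

Sample correlation coefficient ($r$): $r = \frac{Cov(x, y)}{s_x s_y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n-1) s_x s_y}$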

Where $N$: population size; $n$: sample size; $\mu_x$: population mean of variable $x$; $\mu_y$: population mean of variable $y$; $\bar{x}$: sample mean of variable $x$; $\bar{y}$: sample mean of variable $y$.
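
To make the sample covariance and correlation formulas concrete, here is a short sketch on made-up paired data. The variable names are illustrative only.

```python
import math

x = [1, 2, 3, 4, 5]                    # hypothetical paired observations
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n                     # 3.0
y_bar = sum(y) / n                     # 4.0

# Sum of cross-products of deviations: 4 + 0 + 0 + 0 + 2 = 6
cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

s_xy = cross / (n - 1)                                         # sample covariance
s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))  # sample sd of x
s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))  # sample sd of y
r = s_xy / (s_x * s_y)                                         # sample correlation

print(s_xy, r)
```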

Probability rules:

Complement rule: $P(\bar{A}) = 1 - P(A)$

Addition rule: 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵)


Conditional probability: $P(A|B) = \frac{P(A \cap B)}{P(B)}$ and $P(B|A) = \frac{P(A \cap B)}{P(A)}$

Multiplication rule: 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴|𝐵)𝑃(𝐵) and 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐵|𝐴)𝑃(𝐴)

From joint distribution table: $P(A) = P(A \cap B) + P(A \cap \bar{B})$


Statistical independence if: 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴)𝑃(𝐵)
Bayes rule: $P(A|B) = \frac{P(B|A)P(A)}{P(B)}$
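
The rules above chain together naturally. The sketch below uses made-up probabilities to build P(B) from the multiplication rule and the joint-table rule, then applies Bayes rule.

```python
# Hypothetical inputs
p_a = 0.30                  # P(A)
p_b_given_a = 0.80          # P(B | A)
p_b_given_not_a = 0.10      # P(B | not A)

p_not_a = 1 - p_a                              # complement rule
p_a_and_b = p_b_given_a * p_a                  # multiplication rule
p_not_a_and_b = p_b_given_not_a * p_not_a      # multiplication rule
p_b = p_a_and_b + p_not_a_and_b                # joint distribution table rule

p_a_given_b = p_b_given_a * p_a / p_b          # Bayes rule

# A and B are independent only if P(A and B) equals P(A) * P(B)
independent = abs(p_a_and_b - p_a * p_b) < 1e-12

print(p_b, p_a_given_b, independent)
```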

Discrete probability distributions:


Probability distribution function: 𝑃(𝑋 = 𝑥)
Cumulative distribution function: 𝐹(𝑥) = 𝑃(𝑋 ≤ 𝑥)
Expected value of discrete random variable: $\mu = E[X] = \sum_x x \cdot P(X = x)$
Variance of discrete random variable: $\sigma^2 = Var[X] = \sum_x (x - E[X])^2 \cdot P(X = x)$

Std. deviation of discrete random variable: $\sigma = \sqrt{Var[X]} = \sqrt{\sum_x (x - E[X])^2 \cdot P(X = x)}$

Mean of transformed random variable: 𝐸[𝑌] = 𝐸[𝑎 + 𝑏 ∙ 𝑋] = 𝑎 + 𝑏 ∙ 𝐸[𝑋]


Variance of transformed random variable: $Var[Y] = Var[a + b \cdot X] = b^2 \cdot Var[X]$
*where 𝑎 and 𝑏: constants; 𝑋 is a random variable.
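
As an illustration, the sketch below computes the expected value and variance of a made-up discrete distribution and then checks the two transformation rules for Y = a + b·X by recomputing them directly from the distribution of Y. The function names are hypothetical.

```python
def expected_value(pmf):
    """E[X] for a distribution given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var[X] for a distribution given as {value: probability}."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

pmf_x = {0: 0.2, 1: 0.5, 2: 0.3}       # hypothetical distribution of X
a, b = 3, 2                            # constants in Y = a + b*X

# Distribution of Y: same probabilities, transformed values
pmf_y = {a + b * x: p for x, p in pmf_x.items()}

mean_x, var_x = expected_value(pmf_x), variance(pmf_x)
mean_y, var_y = expected_value(pmf_y), variance(pmf_y)

print(mean_y, a + b * mean_x)          # both sides of the mean rule
print(var_y, b ** 2 * var_x)           # both sides of the variance rule
```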

Covariance of joint probability distribution: $Cov(X, Y) = \sum_x \sum_y (x - \mu_x)(y - \mu_y) P(x, y)$

Correlation of joint probability distribution: $Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$

Bernoulli distribution: 𝑃(𝑋 = 0) = (1 − 𝜌) and 𝑃(𝑋 = 1) = 𝜌


Mean of Bernoulli distribution: 𝐸[𝑋] = 𝜌
Variance of Bernoulli distribution: 𝑉𝑎𝑟[𝑋] = 𝜌(1 − 𝜌)

Std. deviation of Bernoulli distribution: $SD[X] = \sqrt{\rho(1 - \rho)}$

Binomial distribution: $P(X = x) = \rho^x (1 - \rho)^{(n-x)} \cdot \frac{n!}{x!(n-x)!}$

Mean of Binomial distribution: $E[X] = n \cdot \rho$

Variance of Binomial distribution: $Var[X] = n \cdot \rho(1 - \rho)$

Std. deviation of Binomial distribution: $SD[X] = \sqrt{n \cdot \rho(1 - \rho)}$

Poisson distribution: $P(X = x) = \frac{e^{-\lambda} \cdot \lambda^x}{x!}$

Mean of Poisson distribution: 𝐸[𝑋] = 𝜆


Variance of Poisson distribution: 𝑉𝑎𝑟[𝑋] = 𝜆

Std. deviation of Poisson distribution: 𝑆𝐷[𝑋] = √𝜆


Poisson approximation to Binomial: 𝜆 = 𝑛𝜌 if 𝑛𝜌 ≤ 7
Where n: sample size; 𝜌: success probability; 𝜆: average number of occurrences in
period/space; 0! = 1; 𝑛! = 𝑛 ∙ (𝑛 − 1) ∙ (𝑛 − 2) ∙ … ∙ 1; and 𝑒 = 2.71828 (Euler’s number)
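
The binomial and Poisson formulas, and the approximation rule, can be compared numerically. This is a minimal sketch with made-up parameters (n = 100, success probability 0.03, so nρ = 3 ≤ 7); the function names are illustrative.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials, success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

n, p = 100, 0.03                       # hypothetical parameters
lam = n * p                            # Poisson approximation: lambda = n * rho

exact = binomial_pmf(2, n, p)
approx = poisson_pmf(2, lam)

# The binomial mean computed from the definition should equal n * rho
binomial_mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))

print(exact, approx, binomial_mean)
```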

Continuous probability distributions:


Uniform distribution probability density function: $f(x) = \begin{cases} \frac{1}{b-a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$
*where $a$: minimum; $b$: maximum.

Uniform distribution mean: $\mu = \frac{a+b}{2}$

Uniform distribution variance: $\sigma^2 = \frac{(b-a)^2}{12}$

Normal distribution probability density function: $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

Exponential distribution probability density function: $f(t) = \lambda e^{-\lambda t}$ for $t > 0$

Exponential cumulative distribution function: $F(t) = 1 - e^{-\lambda t}$
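
A brief sketch of the continuous distributions as plain Python functions, with made-up parameter values. Returning 0 from `exponential_cdf` for t ≤ 0 is an assumption that follows from the density being defined only for t > 0.

```python
import math

def uniform_pdf(x, a, b):
    """Uniform density on [a, b]."""
    return 1 / (b - a) if a <= x <= b else 0.0

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def exponential_cdf(t, lam):
    """P(T <= t) for an exponential distribution with rate lam."""
    return 1 - math.exp(-lam * t) if t > 0 else 0.0

a, b = 2, 8                            # hypothetical minimum and maximum
uniform_mean = (a + b) / 2             # 5.0
uniform_variance = (b - a) ** 2 / 12   # 3.0

# Probability that an exponential waiting time with rate 0.5 is at most 2
p_wait = exponential_cdf(2, 0.5)       # 1 - e^(-1)

print(uniform_mean, uniform_variance, normal_pdf(0, 0, 1), p_wait)
```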

Sampling distributions:
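Expected value of sample mean: $E[\bar{X}] = \mu_{\bar{X}} = \frac{\sum_{i=1}^{S} \bar{x}_i}{S} = \mu$

Variance of sample mean: $Var[\bar{X}] = \sigma_{\bar{X}}^2 = \frac{1}{S} \sum_{i=1}^{S} (\bar{x}_i - E[\bar{X}])^2$

Standard error of sample mean: $SE[\bar{X}] = \sigma_{\bar{X}} = \sqrt{\frac{1}{S} \sum_{i=1}^{S} (\bar{x}_i - E[\bar{X}])^2} = \frac{\sigma}{\sqrt{n}}$

Standard error of sample mean (with finite population correction): $SE[\bar{X}] = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}$

Standardized value ($z$) for normally distributed sample mean value $\bar{X}$: $z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

Expected value of sample variance: $E(s^2) = \sigma^2$

Variance of sample variance: $Var(s^2) = \frac{2\sigma^4}{n-1}$

If the population distribution is normal, then $\frac{(n-1)s^2}{\sigma^2}$ has a chi-square distribution with $n - 1$ degrees of freedom ($\chi^2_{n-1}$)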

Where $S$: number of possible samples; $\bar{x}$: sample mean; $\mu$: population mean; $s$: sample standard deviation; $\sigma$: population standard deviation; $n$: sample size.
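
The sampling-distribution formulas can be verified by brute force on a tiny made-up population. The sketch below lists every possible sample of size 2, first with replacement (ordered samples, an assumption under which the standard error is σ/√n) and then without replacement (where the finite population correction applies).

```python
import itertools
import math

population = [2, 4, 6, 8]              # hypothetical population
N, n = len(population), 2
mu = sum(population) / N                                 # 5.0
sigma2 = sum((x - mu) ** 2 for x in population) / N      # 5.0

def mean_and_variance(values):
    """Mean and (divide-by-count) variance of a list of numbers."""
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / len(values)

# All S = N**n ordered samples drawn with replacement
means_with = [sum(s) / n for s in itertools.product(population, repeat=n)]
mean_with, var_with = mean_and_variance(means_with)

# All samples drawn without replacement
means_without = [sum(s) / n for s in itertools.combinations(population, n)]
mean_without, var_without = mean_and_variance(means_without)

se = math.sqrt(sigma2) / math.sqrt(n)                    # sigma / sqrt(n)
se_fpc = se * math.sqrt((N - n) / (N - 1))               # with correction

print(mean_with, math.sqrt(var_with), se)
print(mean_without, math.sqrt(var_without), se_fpc)
```

In both cases the mean of the sample means equals the population mean, and the enumerated standard deviations match the two standard error formulas.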

Confidence intervals:
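Population mean, $\sigma$ known: $\bar{x} \pm z_{(\alpha/2)} \frac{\sigma}{\sqrt{n}}$

Population mean, $\sigma$ unknown: $\bar{x} \pm t_{(\alpha/2,\ n-1)} \frac{s}{\sqrt{n}}$

Population variance: $\frac{(n-1)s^2}{\chi^2_{(n-1,\ \alpha/2)}}$ and $\frac{(n-1)s^2}{\chi^2_{(n-1,\ 1-\alpha/2)}}$

Finite population correction (when sample covers more than 5% of the population): $\sqrt{\frac{N-n}{N-1}}$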

Where $1 - \alpha$: confidence level; $\bar{x}$: sample mean; $\sigma$: population standard deviation; $s$: sample standard deviation; $n$: sample size; $z_{(.)}$: value in standard normal distribution; $t_{(.)}$: value in Student's t-distribution; $\chi^2_{(.)}$: chi-square value.
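
For the case where σ is known, the interval can be sketched with Python's standard library, which provides the standard normal quantile through `statistics.NormalDist().inv_cdf`. The function name `z_interval` and the input numbers are made up. The t and chi-square critical values are not available in the standard library, so those intervals would need a table or a statistics package.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the population mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)    # z value with upper-tail area alpha/2
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample: mean 50, known sigma 10, n = 25
lower, upper = z_interval(50, 10, 25)
print(lower, upper)
```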

Hypothesis tests (test statistics):


Test statistic for population mean, $\sigma$ known: $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$

Test statistic for population mean, $\sigma$ unknown: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

Test statistic for population variance: $\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma_0^2}$

Where $\bar{x}$: sample mean; $\sigma$: population standard deviation; $s$: sample standard deviation; $n$: sample size; $\mu_0$: value of population mean in null hypothesis; $\sigma_0^2$: value of population variance in null hypothesis.
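
The three test statistics translate directly into code. The sketch below uses hypothetical function names and made-up inputs; it only computes the statistics and leaves the comparison with critical values to the relevant table.

```python
import math

def mean_test_statistic(x_bar, mu_0, sd, n):
    """(x_bar - mu_0) / (sd / sqrt(n)).

    This is the z statistic when sd is the known population sigma,
    and the t statistic (n - 1 d.f.) when sd is the sample s.
    """
    return (x_bar - mu_0) / (sd / math.sqrt(n))

def variance_test_statistic(s2, sigma2_0, n):
    """Chi-square statistic (n - 1 d.f.) for a test on the population variance."""
    return (n - 1) * s2 / sigma2_0

# Hypothetical examples
z = mean_test_statistic(x_bar=52, mu_0=50, sd=6, n=36)        # sigma known
chi2 = variance_test_statistic(s2=30, sigma2_0=20, n=25)

print(z, chi2)
```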

Regression analysis
Linear regression model: 𝑦𝑖 = 𝛽0 + 𝛽1 𝑥𝑖 + 𝜀𝑖
Estimated regression model: 𝑦̂𝑖 = 𝑏0 + 𝑏1 𝑥𝑖
Residuals from estimation: 𝑒𝑖 = 𝑦𝑖 − (𝑏0 + 𝑏1 𝑥𝑖 )
Least squares coefficients (OLS):
Slope: $b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{Cov(x, y)}{Var(x)}$

Intercept: 𝑏0 = 𝑦̄ − 𝑏1 𝑥̄
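
A minimal sketch of the least squares slope and intercept on made-up data; the function name `ols_simple` is hypothetical.

```python
def ols_simple(x, y):
    """Return (b0, b1) for the fitted line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx                     # slope
    b0 = y_bar - b1 * x_bar            # intercept
    return b0, b1

x = [1, 2, 3, 4, 5]                    # hypothetical data
y = [2, 4, 5, 4, 5]
b0, b1 = ols_simple(x, y)

# Residuals e_i = y_i - (b0 + b1 * x_i); with an intercept they sum to zero
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(b0, b1, sum(residuals))
```

The (n − 1) divisors in Cov(x, y) and Var(x) cancel, which is why the slope can be computed from the raw sums of deviations.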

Regression sum of squares (or model sum of squares) = $\sum (\hat{y}_i - \bar{y})^2$

Error sum of squares (or residual sum of squares) = $\sum (y_i - \hat{y}_i)^2$

Total sum of squares = regression sum of squares + error sum of squares

R-squared ($R^2$): $R^2 = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$
Variance of regression slope ($s_{b_1}^2$): $s_{b_1}^2 = \frac{\sum (y_i - \hat{y}_i)^2 / (n-2)}{\sum (x_i - \bar{x})^2}$

Standard error of regression slope ($s_{b_1}$): $s_{b_1} = \sqrt{s_{b_1}^2}$

Test statistic for the slope: $t = \frac{b_1 - \beta_1}{s_{b_1}}$, with $(n - 2)$ degrees of freedom
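
The sums of squares, R-squared, and the slope test can be traced on a small made-up data set. The sketch below refits the simple regression and tests the null hypothesis that the population slope is zero; that choice of null value is an assumption for illustration.

```python
import math

x = [1, 2, 3, 4, 5]                    # hypothetical data
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # regression sum of squares
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # error sum of squares
sst = sum((yi - y_bar) ** 2 for yi in y)                 # total sum of squares
r_squared = ssr / sst

var_b1 = (sse / (n - 2)) / sxx         # variance of the slope
se_b1 = math.sqrt(var_b1)              # standard error of the slope
t_stat = (b1 - 0) / se_b1              # H0: beta_1 = 0, with n - 2 d.f.

print(ssr, sse, sst, r_squared, se_b1, t_stat)
```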

Multiple regression model: 𝑦𝑖 = 𝛽0 + 𝛽1 𝑥1,𝑖 + ⋯ + 𝛽𝑘 𝑥𝑘,𝑖 + 𝜀𝑖

Estimated multiple regression model: 𝑦𝑖 = 𝑏0 + 𝑏1 𝑥1,𝑖 + ⋯ + 𝑏𝑘 𝑥𝑘,𝑖 + 𝑒𝑖

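Multiple regression OLS coefficients: $\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$

where $\mathbf{Y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}_{n \times 1}$, $\mathbf{X} = \begin{bmatrix} 1 & x_{1,1} & \cdots & x_{K,1} \\ 1 & x_{1,2} & \cdots & x_{K,2} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1,n} & \cdots & x_{K,n} \end{bmatrix}_{n \times (K+1)}$, $\mathbf{b} = \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_K \end{bmatrix}_{(K+1) \times 1}$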

Variance of the errors: $s_e^2 = \frac{\sum_{i=1}^{n} e_i^2}{n-K-1} = \frac{\text{error sum of squares}}{n-K-1}$

Adjusted R-squared: $\bar{R}^2 = 1 - \frac{SSE/(n-K-1)}{SST/(n-1)} = 1 - \frac{\text{error sum of squares}/(n-K-1)}{\text{total sum of squares}/(n-1)}$

Coefficient of multiple correlation: $R = r(\hat{y}, y) = \sqrt{R^2}$

Confidence interval of population slope: $b_j \pm t_{(n-K-1,\ \alpha/2)} \cdot S_{b_j}$

Test statistic for population slope: $t = \frac{b_j - b_0}{S_{b_j}}$ (with d.f. $n - K - 1$)

where 𝑏𝑗 : slope estimator; 𝑛: sample size; 𝐾: number of independent variables; 𝑆𝑏𝑗 : standard error of
slope estimator 𝑏𝑗 ; 𝑏0 : value of population slope in null hypothesis; d.f.: degrees of freedom

F-test: $F = \frac{MSR}{s_e^2} = \frac{(\text{regression sum of squares})/K}{(\text{error sum of squares})/(n-K-1)}$
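
Given the sums of squares from a fitted multiple regression, the summary measures follow mechanically. The sketch below uses a hypothetical function `regression_summary` and made-up inputs; here the error variance is the sum of squared residuals divided by n − K − 1.

```python
def regression_summary(ssr, sse, n, K):
    """Summary measures from the regression and error sums of squares.

    ssr: regression sum of squares; sse: error (residual) sum of squares;
    n: sample size; K: number of independent variables.
    """
    sst = ssr + sse                                    # total sum of squares
    s_e2 = sse / (n - K - 1)                           # variance of the errors
    r_squared = ssr / sst
    adj_r_squared = 1 - (sse / (n - K - 1)) / (sst / (n - 1))
    f_stat = (ssr / K) / s_e2                          # MSR / s_e^2
    return {"s_e2": s_e2, "r_squared": r_squared,
            "adj_r_squared": adj_r_squared, "f_stat": f_stat}

# Hypothetical model: two independent variables, 23 observations
summary = regression_summary(ssr=80, sse=20, n=23, K=2)
print(summary)
```

Adjusted R-squared is lower than R-squared whenever K ≥ 1 and the fit is not perfect, because the residual variation is penalized by the degrees of freedom used.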
