You are on page 1of 7

I.

Basic probability formulas


● P(A ⋃ B) = P(A) + P(B) - P(A ⋂ B)
P ( A ⋂ B)
● P(A | B) =
P(B)
P (B∨ A). P( A)
● P(A | B) =
P(B)
● If A, B independent: P(A ⋂ B) = P(A) . P(B)

II. Discrete random variables


● ℳ = E(x) = ∑ xi . P(x=xi)

● σ 2 = V(x) =∑ (xi - ℳ )2 . P(x=xi)


2
= ∑ xi2 . P(x=xi) - ℳ

● E(ax + by) = a.E(x) + b.E(y)


● V(ax + by) = a2 . V(x) + b2 . V(y)
● Probability mass function: f(xi) = P(x=xi)
● Cumulative distribution function: F(xi) = P(x≤ xi)
● Some special distribution:
1. Discrete uniform distribution
1
○ P(x=Xi) =
n
a+b
○ ℳ=
2
2
(b−a+ 1) −1
○ σ2 =
12
2. Binomial distribution
○ P(x=k) = nCk . pk . (1-p)n-k
○ ℳ = n.p
○ σ 2 = n.p . (1-p)
3. Poisson distribution
−λ .T
e
○ P(x=k) = ( λ .T)k
k!
○ ℳ = λ .T
○ σ 2 = λ .T
4. Hypergeometric distribution
KCk .(N−K )C(n−k)
○ P(x=k) =
NCn
○ ℳ = n.p
N −n
○ σ 2 = n.p.(1-p).
N−1
5. Geometric distribution
○ P(x=k) = (1-p)k-1 . p
1
○ ℳ=
p
1− p
○ σ2 = 2
p
6. Negative binomial distribution
○ P(x=k) = (k-1)C(r-1) . pr . (1-p)k-r
r
○ ℳ=
p
r .(1− p)
○ σ2 =
p2

III. Continuous random variable


b
● Probability density function f(x): P(a<x<b) = ∫ ❑f(x) dx
a

● Cumulative distribution function F(x):


○ F(xi) = P(x≤xi)
○ F(xi)’ = f(xi)
+∞
● ℳ = E(x) = ∫ x . f (x )d x
−∞

+∞
● E(xn) = ∫ x n . f (x )d x
−∞

+∞
2
● σ = V(x) = ∫ x 2 . f (x)d x - ℳ
2

−∞

● Some special distribution:


1. Continuous uniform distribution
1
○ f(x) = , a≤x≤b
b−a
= 0 , elsewhere
a+b
○ ℳ=
2
2
2 (b−a)
○ σ =
12
2
2. Normal distribution N( ℳ , σ )
x−ℳ
○ z=
σ
1 x
2

○ f(z) = 2
√❑ . e
○ ϕ (x) = p(z<xi)
○ ϕ (-x) = 1 - ϕ (x)
3. Normal distribution approximate binomial and poisson distribution
a. Binomial (np > 5 and n(1-p) > 5)
x−n . p
■ z=
√❑
■ P(XBINORM ≤ a) = P(XNORMAL ≤ a+0.5)
■ P(XBINORM ≧ a) = P(XNORMAL ≧ a-0.5)
b. Poisson
x−λ
■ z=
√❑
■ P(XPOISSON ≤ a) = P(XNORMAL ≤ a+0.5)
■ P(XPOISSON ≧ a) = P(XNORMAL ≧ a-0.5)

4. Exponential distribution
○ f(x) = λ . e− λ. T , x > 0

○ = 0 , elsewhere
○ P(x ≧ a) = e
− λ. a
,(a > 0)
1
○ ℳ=
λ
1
○ σ2 = 2
λ

IV. Descriptive statistic (Take a sample of size n from population N)


∑ xi
● Sample mean: x =
n
n+1 x ceil(L) + x floor (L)
● Sample median: L = so Median =
2 2
● Mode: Số phần tử xuất hiện nhiều nhất
● Range: max - min
2
∑( x−x i )
● Sample variance: s2 =
n−1
● Quatiles:
n+1 x ceil(L ) + x floor (L )
○ L1 = so Q1 = 1 1

4 2

n+1 x ceil(L ) + x floor (L )


○ L2 = so Q2 = 2 2

2 2

3.(n+1) x ceil(L ) + x floor (L )


○ L3 = so Q3 = 3 3

4 2

V. Sampling distribution
2
● Population mean ℳ , variance σ . Sample size n. (Normal distribution or n > 30):
2
σ
○ Phân phối của X có dạng: N( ℳ , )
n
2 2
σ1 σ2
○ Phân phối của X 1 - X 2 có dạng: N( ℳ 1 - ℳ 2 , + )
n1 n2
● For proportion of population p, sample size n. (np ≧ 5 or n.(1-p) ≧ 5):
P .(1−P)
○ Phân phối của ^
P có dạng: N( P , )
n
P 1 .(1−P1) P 2 .(1−P2)
○ Phân phối của ^
P1 - ^
P2 có dạng: N( P1 - P2 , + )
n1 n2

VI. Statistical intervals - Test claims for one sample


● (l, u) = ( X - E, X + E)
● width = 2E
● P-value = 2 . P(Z > |Z0|)
1. Population variance known
σ
○ E = z α/ 2 .
√❑
X− ℳ
○ z0 =
σ / √❑
2. Population variance unknown
○ n > 30:
S
■ E = z α/ 2 .
√❑
X− ℳ
■ z0 =
S / √❑
○ n ≤ 30:
S
■ E = t α / 2 ,n−1 .
√❑
X− ℳ
■ t0 =
S / √❑
● For propotion:
○ (l, u) = ( ^
P - E, ^
P + E)
○ E = z α/ 2 . √ ❑
^
○ z 0 = P−P
√❑
○ Nếu đề không cho ^
P, mặc định ^
P = 0.5
● Nếu là one-side thì tương tự nhưng thay α /2 thành α

VII. Test claims for 2 samples (2 population independent, normal distribution or both n1, n2 > 30)
● (l, u) = ( X 1 - X 2 - E , X 1 - X 2 + E)
1. Population variance known
○ E = z α/ 2 . √ ❑

X 1− X 2−Δ 0
○ z0 =
√❑
2. Population variance unknown
2 2
○ Assume σ 1 = σ 2

■ Degree of freedom: df = n1 + n1 + 2
2 2
2 (n 1−1). S 1 +(n2−1) . S 2
■ Sp =
n1+ n2−2
■ E = t α / 2 ,df . √ ❑

X 1− X 2−Δ 0
■ t0 =
√❑
2 2
○ Not assume σ 1 = σ 2

( )
2 2 2
S1 S2
+
n1 n2
■ Degree of freedom: df =
S14 S24
2
+ 2
n 1 .(n1 −1) n2 .(n2−1)
■ E = t α / 2 ,df . √ ❑
X 1− X 2−Δ 0
■ t0 =
√❑
● For propotion:
○ (l, u) = ( ^
P1 - ^
P2 - E , ^
P 1- ^
P2 + E)
○ E = z α/ 2 . √ ❑
x1 + x 2
○ ^
P= (trong đó xi = n . ^
Pi )
n1 +n 2
^ ^
○ z 0 = P 1− P2−Δ 0
√❑
VIII. Linear Regression

● SXY = ∑( x i−x ) ( y i− y ) = ∑ x i y i −n . x . y
2 2 2
● SXX = ∑( x i−x ) = ∑ x i −n . x
2 2 2
● SYY = ∑( y i− y ) = ∑ y i −n . y

S XY ∑ xi y i−n . x . y
● Slope: ^
β1 = = 2 2
S XX ∑ x i −n . x
● Intercept: ^
β0 = y - ^
β1 . x
2
● Error sum of square: SSE = ∑( y i− ^
y i)
2
● Regression sum of square: SSR = ∑( ^
y i− y )
2
● Total sum of square: SST = ∑( y i− y )
● SSE + SSR = SST
● Standard error: σ^ = √ ❑
S XY
● Coefficient of correlation: R = √ ❑ =
√❑
● Test claims about the slope (df = n-2):
○ se( ^
β 1) = √ ❑
^
β1 −β1 , 0
○ t0 =
se ( β^ )
1

● Test claims about the intercept (df = n-2):


○ se( ^
β 0) = √ ❑
^
β 0−β 0 ,0
○ t0 =
se ( β^0 )
R−0
● Test claims about the coefficient of correlation (df = n-2): t 0 =
√❑

You might also like