
MATERIALS SB

CHAPTER 1
❖ Descriptive statistics and inferential statistics
❖ Critical thinking
+ Conclusions from small samples
+ Conclusions from non-random samples
+ Conclusions from rare events
+ Poor survey methods
+ Post hoc fallacy

CHAPTER 2
❖ Level of measurement
+ Nominal measurement
+ Ordinal measurement
+ Interval measurement
+ Ratio measurement
❖ Sampling method
+ Random sampling methods: simple random sample, systematic sample, stratified sample, cluster sample
+ Non-random sampling methods: judgment sample, convenience sample, focus group

CHAPTER 3
❖ Stem-and-leaf plot
❖ Dot plot
❖ Sturges’ Rule: k = 1 + 3.3 \log_{10}(n)

❖ Bin width: \frac{x_{max} - x_{min}}{k}

❖ Histogram and shape


CHAPTER 4: Descriptive Statistics
❖ Sample mean: \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i

❖ Geometric mean: G = \sqrt[n]{x_1 x_2 \cdots x_n}

❖ Range: R = x_{max} - x_{min}


❖ Midrange: \frac{x_{min} + x_{max}}{2}


❖ Sample standard deviation: s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}
❖ Coefficient of variation: population CV = 100 \times \frac{\sigma}{\mu}; sample CV = 100 \times \frac{s}{\bar{x}}
❖ Standardized variable: population z_i = \frac{x_i - \mu}{\sigma}; sample z_i = \frac{x_i - \bar{x}}{s}
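As a quick sketch of the sample versions of these two formulas, with toy data values that are assumed for illustration only:

```python
# Coefficient of variation and z-scores for a sample (assumed toy data),
# following CV = 100 * s / x-bar and z_i = (x_i - x-bar) / s.
import statistics

data = [10, 12, 9, 11, 13, 8, 14]
x_bar = statistics.mean(data)
s = statistics.stdev(data)             # sample std. dev. (n - 1 denominator)
cv = 100 * s / x_bar                   # coefficient of variation, in percent
z = [(xi - x_bar) / s for xi in data]  # standardized values
```

Note that the z-scores of a sample always sum to zero, which is a handy sanity check.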
Q1+Q 2
❖ Midhinge =
2
n

∑ ❑(x i − x )( y i − y)
i=1
❖ Sample correlation coefficient: r =

√ √∑
n n

∑ ❑(xi − x ) 2 2
❑( y i − y )
i =1 i=1

❖ Fences and Unusual Data Values


Inner fences Outer fences
Lower fence Q1 – 1.5(Q3 – Q1) Q1 – 3(Q3 – Q1)
Upper fence Q3 + 1.5(Q3 – Q1) Q3 + 3(Q3 – Q1)
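A minimal sketch of the inner-fence rule above, flagging values beyond the inner fences as unusual (the data values and the quartile method are assumptions for illustration):

```python
# Flag values outside the inner fences as outliers, per the table above.
import statistics

data = [12, 14, 14, 15, 16, 17, 18, 19, 21, 45]
q1, _, q3 = statistics.quantiles(data, n=4)  # Q1, median, Q3
iqr = q3 - q1
lower_inner, upper_inner = q1 - 1.5 * iqr, q3 + 1.5 * iqr
lower_outer, upper_outer = q1 - 3.0 * iqr, q3 + 3.0 * iqr
outliers = [x for x in data if x < lower_inner or x > upper_inner]
```

Values beyond the outer fences would be considered extreme rather than merely unusual.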

CHAPTER 5: Probability
❖ Odds for A: \frac{P(A)}{1 - P(A)}; odds against A: \frac{1 - P(A)}{P(A)}
❖ General Law of Addition: P(A \cup B) = P(A) + P(B) - P(A \cap B)
❖ Conditional probability: P(A \mid B) = \frac{P(A \cap B)}{P(B)}

❖ Independence property: P(A \cap B) = P(A) \cdot P(B)


❖ Bayes’ Theorem: P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A \mid B)\,P(B) + P(A \mid B')\,P(B')}
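A worked sketch of Bayes’ Theorem with assumed numbers (a screening test with 99% sensitivity, a 5% false-positive rate, and 1% prevalence):

```python
# Bayes' Theorem: P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B')P(B')].
# All three input probabilities below are assumed example values.
p_b = 0.01               # P(B): prevalence of the condition
p_a_given_b = 0.99       # P(A|B): probability of a positive test given B
p_a_given_not_b = 0.05   # P(A|B'): false-positive rate
posterior = (p_a_given_b * p_b) / (
    p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)
)                        # P(B|A): probability of B given a positive test
```

Even with a highly accurate test, the low prior drags the posterior down to about 1/6, which is the classic lesson of this theorem.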
❖ Permutation: {}_nP_r = \frac{n!}{(n - r)!}
❖ Combination: {}_nC_r = \frac{n!}{r!(n - r)!}
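The two counting formulas above map directly onto the standard-library helpers `math.perm` and `math.comb`:

```python
# nPr = n! / (n - r)! and nCr = n! / (r! (n - r)!) via the stdlib.
import math

n_p_r = math.perm(10, 3)  # ordered selections of 3 from 10
n_c_r = math.comb(10, 3)  # unordered selections of 3 from 10
```

Dividing nPr by r! gives nCr, since each unordered selection corresponds to r! orderings.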
CHAPTER 6: Discrete Probability Distributions
❖ Uniform PDF: P(X = x) = \frac{1}{b - a + 1}; \quad x = a, a+1, \ldots, b
❖ Binomial PDF: P(X = x) = \frac{n!}{x!(n - x)!}\pi^x(1 - \pi)^{n - x}; \quad x = 0, 1, 2, \ldots, n
❖ Poisson PDF: P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}; \quad x = 0, 1, 2, \ldots
❖ Expected value of a lottery ticket: E(X) = (value if win) \times P(win) + (value if lose) \times P(lose)
❖ Hypergeometric PDF: P(X = x) = \frac{{}_sC_x \; {}_{N-s}C_{n-x}}{{}_NC_n}

❖ Poisson approximation to the binomial: set \lambda = n\pi; P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!} (use when n \geq 20 and \pi \leq 0.05)
❖ Binomial approximation to the hypergeometric: when \frac{n}{N} < 0.05, use n as the sample size and \pi = \frac{s}{N}
❖ Geometric PDF: P(X = x) = \pi(1 - \pi)^{x - 1}; CDF: P(X \leq x) = 1 - (1 - \pi)^x
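A small sketch of the binomial and Poisson PDFs above, including the Poisson approximation to the binomial (the n and π values are assumed and satisfy the n ≥ 20, π ≤ 0.05 condition):

```python
# Binomial and Poisson PDFs, coded directly from the formulas above.
import math

def binom_pmf(x, n, pi):
    return math.comb(n, x) * pi**x * (1 - pi) ** (n - x)

def poisson_pmf(x, lam):
    return lam**x * math.exp(-lam) / math.factorial(x)

# Poisson approximation: lambda = n * pi (assumed n = 100, pi = 0.02)
n, pi = 100, 0.02
exact = binom_pmf(1, n, pi)
approx = poisson_pmf(1, n * pi)
```

For these parameters the approximation agrees with the exact binomial probability to about four decimal places.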

CHAPTER 7: Continuous Probability Distributions


❖ Normal distribution: f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-0.5\left(\frac{x - \mu}{\sigma}\right)^2}
❖ Standard normal distribution: f(z) = \frac{1}{\sqrt{2\pi}} e^{-0.5 z^2}
❖ Exponential distribution PDF: f(x) = \lambda e^{-\lambda x}; standard deviation = \frac{1}{\lambda}
❖ Exponential distribution CDF: P(X \leq x) = 1 - e^{-\lambda x}
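The exponential CDF above can be evaluated in one line; the rate λ below is an assumed example (0.5 arrivals per minute):

```python
# P(X <= x) = 1 - e^(-lambda * x), per the exponential CDF above.
import math

lam = 0.5  # assumed rate: 0.5 arrivals per minute

def exp_cdf(x, lam):
    return 1.0 - math.exp(-lam * x)

p = exp_cdf(2.0, lam)  # probability of waiting at most 2 minutes
```

Since λx = 1 here, the result is 1 − e⁻¹ ≈ 0.632, a useful benchmark: the probability that an exponential waiting time falls below its mean.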
❖ Normal approximation to the binomial: \mu = n\pi; \sigma = \sqrt{n\pi(1 - \pi)}, for n\pi \geq 10 and n(1 - \pi) \geq 10
❖ Normal Approximation to Poisson: μ= λ; σ =√ λ for λ ≥ 10
❖ Linear transformation, Rule 1: \mu_{aX+b} = a\mu_X + b (mean of a transformed variable)
❖ Linear transformation, Rule 2: \sigma_{aX+b} = |a|\sigma_X (standard deviation of a transformed variable)
❖ Linear transformation, Rule 3: \mu_{X+Y} = \mu_X + \mu_Y (mean of a sum of two random variables X and Y)
❖ Linear transformation, Rule 4: \sigma_{X+Y} = \sqrt{\sigma_X^2 + \sigma_Y^2} (standard deviation of a sum, if X and Y are independent)
❖ If X and Y are correlated, with covariance \sigma_{XY}: \sigma_{X+Y} = \sqrt{\sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY}}

CHAPTER 8: Sampling Distributions and Estimation


Commonly Used Formulas in Sampling Distributions
❖ Sample proportion: p = \frac{x}{n}
❖ Standard error of the sample mean: \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
❖ Confidence interval for \mu, known \sigma: \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
❖ Confidence interval for \mu, unknown \sigma: \bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}, with d.f. = n - 1
❖ Standard error of the sample proportion: \sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}

❖ Confidence interval for \pi: p \pm z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}}
❖ Sample size to estimate \mu: n = \left(\frac{z\sigma}{E}\right)^2

❖ Sample size to estimate \pi: n = \left(\frac{z}{E}\right)^2 \pi(1 - \pi)
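A sketch of both sample-size formulas, rounding up to the next whole observation; the desired error E, the planning values σ and π, and the 95% confidence level are all assumed for illustration:

```python
# Sample sizes n = (z*sigma/E)^2 and n = (z/E)^2 * pi*(1 - pi),
# with z from the standard normal (95% two-sided confidence).
import math
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)          # ~1.96 for 95% confidence

# Estimating a mean: assumed sigma = 10, desired error E = 2
n_mean = math.ceil((z * 10 / 2) ** 2)

# Estimating a proportion: conservative pi = 0.5, desired error E = 0.03
n_prop = math.ceil((z / 0.03) ** 2 * 0.5 * 0.5)
```

Using π = 0.5 is the conservative choice when no planning value is available, since it maximizes π(1 − π).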

❖ Interpretation:

P-value     Interpretation
P > 0.05    No evidence against H0
P < 0.05    Moderate evidence against H0
P < 0.01    Strong evidence against H0
P < 0.001   Very strong evidence against H0

CHAPTER 9: One – Sample Hypothesis Test


Commonly Used Formulas in One-Sample Hypothesis Tests
❖ Test statistic for a mean (known \sigma): z_{calc} = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}
❖ Test statistic for a mean (unknown \sigma): t_{calc} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}
❖ Test statistic for a proportion: z_{calc} = \frac{p - \pi_0}{\sigma_p} = \frac{p - \pi_0}{\sqrt{\pi_0(1 - \pi_0)/n}}
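A sketch of the one-sample proportion test above, with an assumed example of 56 successes in 100 trials against π₀ = 0.50:

```python
# z test for a proportion: z_calc = (p - pi0) / sqrt(pi0*(1 - pi0)/n),
# with a two-tailed p-value from the standard normal.
import math
from statistics import NormalDist

x, n, pi0 = 56, 100, 0.50          # assumed example data
p = x / n                          # sample proportion
z_calc = (p - pi0) / math.sqrt(pi0 * (1 - pi0) / n)
p_value = 2 * (1 - NormalDist().cdf(abs(z_calc)))  # two-tailed
```

Here z_calc = 1.2 and the p-value is about 0.23, so by the interpretation table in Chapter 8 there is no evidence against H0.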
❖ Test for one variance (the chi-square distribution compares the sample variance with a benchmark):
\chi^2_{calc} = \frac{(n - 1)s^2}{\sigma^2}, where s^2 is the sample variance, \sigma^2 the population variance, and n the sample size

CHAPTER 10: Two-sample Hypothesis Tests


Commonly Used Formulas in Two-Sample Hypothesis Tests
❖ Test statistic for zero difference of means: t_{calc} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}

❖ Confidence interval for \mu_1 - \mu_2: (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
*Note: for the paired t test, n is the number of pairs
❖ Paired t test: \bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i (mean of the n differences)
❖ Standard deviation of the n differences: s_d = \sqrt{\frac{\sum_{i=1}^{n}(d_i - \bar{d})^2}{n - 1}}
❖ Test statistic for paired samples: t_{calc} = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}
❖ Degrees of freedom: d.f. = n - 1
❖ The ith paired difference is D_i = X_{1i} - X_{2i}
❖ Confidence interval for \mu_D: \bar{D} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}}
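The paired-sample statistics above can be sketched directly; the before/after measurements are assumed toy data:

```python
# Mean difference, standard deviation of differences, and t_calc
# for a paired t test (H0: mu_d = 0), per the formulas above.
import math
import statistics

before = [120, 118, 130, 125, 140]   # assumed paired measurements
after = [115, 117, 126, 122, 135]
d = [b - a for b, a in zip(before, after)]  # paired differences D_i
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)            # n - 1 in the denominator
t_calc = (d_bar - 0) / (s_d / math.sqrt(len(d)))
```

With n = 5 pairs, t_calc is compared against a t critical value with d.f. = n − 1 = 4.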

❖ Test statistic for equality of proportions: z_{calc} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}},
where the pooled proportion p = \frac{x_1 + x_2}{n_1 + n_2}; \quad p_1 = \frac{x_1}{n_1}; \quad p_2 = \frac{x_2}{n_2}
❖ Confidence interval for the difference of two proportions, \pi_1 - \pi_2:
(p_1 - p_2) \pm z_{\alpha/2}\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}
❖ Test statistic for two variances: F_{calc} = \frac{s_1^2}{s_2^2}, with df_1 = n_1 - 1, df_2 = n_2 - 1
❖ F-test critical values: F_R = F_{df_1, df_2}; \quad F_L = \frac{1}{F_{df_2, df_1}}

CHAPTER 11: Analysis of Variance


❖ One-Factor ANOVA Table:

❖ Tukey’s test: a two-tailed test for equality of paired means from c groups, compared simultaneously
H0: \mu_j = \mu_k
H1: \mu_j \neq \mu_k
T_{calc} = \frac{|\bar{y}_j - \bar{y}_k|}{\sqrt{MSE\left(\frac{1}{n_j} + \frac{1}{n_k}\right)}}
If T_{calc} > T_{c, n-c}, reject H0; T_{c, n-c} is the critical value (p. 449)

❖ Hartley’s test:
H0: \sigma_1^2 = \sigma_2^2 = \ldots = \sigma_c^2
H1: the variances are not all equal
H_{calc} = \frac{s_{max}^2}{s_{min}^2}; reject H0 if H_{calc} > H_{critical} = H_{c, n/c - 1} (p. 451)
CHAPTER 12: Simple Regression
Commonly Used Formulas in Simple Regression
❖ Sample correlation coefficient: r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}

❖ Test statistic for zero correlation: t_{calc} = r\sqrt{\frac{n - 2}{1 - r^2}}, with d.f. = n - 2

❖ Slope of fitted regression: b_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}

❖ Intercept of fitted regression: b_0 = \bar{y} - b_1\bar{x}


❖ Sum of squared residuals: SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = \sum_{i=1}^{n}(y_i - b_0 - b_1 x_i)^2
❖ Coefficient of determination: R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2} = 1 - \frac{SSE}{SST} = \frac{SSR}{SST}
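The slope, intercept, SSE, and R² formulas can be computed directly from their definitions; the x and y values below are assumed toy data:

```python
# Least-squares slope b1, intercept b0, SSE, SST, and R^2,
# coded term by term from the simple-regression formulas above.
x = [1, 2, 3, 4, 5]   # assumed toy data
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx                   # slope
b0 = y_bar - b1 * x_bar          # intercept
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - y_bar) ** 2 for yi in y)
r2 = 1 - sse / sst               # coefficient of determination
```

For this data the fitted line is ŷ = 2.2 + 0.6x with R² = 0.6, i.e. 60% of the variation in y is explained by x.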


❖ Standard error of the estimate: s = \sqrt{\frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n - 2}} = \sqrt{\frac{SSE}{n - 2}}

❖ Standard error of the slope: s_{b_1} = \frac{s}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}}, with d.f. = n - 2
❖ t test for zero slope: t_{calc} = \frac{b_1 - 0}{s_{b_1}}

❖ Confidence interval for the true slope: b_1 - t_{\alpha/2}\,s_{b_1} \leq \beta_1 \leq b_1 + t_{\alpha/2}\,s_{b_1}, with d.f. = n - 2


❖ Confidence interval for the conditional mean of Y: \hat{y}_i \pm t_{\alpha/2}\,s\sqrt{\frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}


❖ Prediction interval for Y: \hat{y}_i \pm t_{\alpha/2}\,s\sqrt{1 + \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}

Excel Output:

Regression Statistics
R Square         R^2 = \frac{SSR}{SST}
Standard Error   s_e = \sqrt{\frac{SSE}{n - 2}}

ANOVA
            df         SS      MS                           F
Regression  k          (SSR)   MSR = \frac{SSR}{k}          \frac{MSR}{MSE}
Residual    n - k - 1  (SSE)   MSE = \frac{SSE}{n - k - 1}
Total       n - 1      (SST)

             Coefficient  Standard Error  t Stat
Intercept    (b_0)        (s_{b_0})       \frac{b_0}{s_{b_0}}
Square feet  (b_1)        (s_{b_1})       \frac{b_1 - \beta_1}{s_{b_1}}
