
MATERIALS SB

CHAPTER 1
❖ Descriptive statistics and inferential statistics
❖ Critical thinking
+ Conclusions from small samples
+ Conclusions from non-random samples
+ Conclusions from rare events
+ Poor survey methods
+ Post hoc fallacy

CHAPTER 2
❖ Level of measurement
+ Nominal measurement
+ Ordinal measurement
+ Interval measurement
+ Ratio measurement
❖ Sampling method
+ Random sampling methods: simple random sample, systematic sample, stratified sample, cluster sample
+ Non-random sampling methods: judgment sample, convenience sample, focus group

CHAPTER 3
❖ Stem-and-leaf plot
❖ Dot plot
❖ Sturges’ Rule: k = 1 + 3.3 \log_{10}(n)

❖ Bin width: \frac{x_{max} - x_{min}}{k}

❖ Histogram and shape


CHAPTER 4: Descriptive Statistics
❖ Sample mean: \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i

❖ Geometric mean: G = \sqrt[n]{x_1 x_2 \cdots x_n}

❖ Range: R = x_{max} - x_{min}


❖ Midrange: \frac{x_{min} + x_{max}}{2}


❖ Sample standard deviation: s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}
❖ Coefficient of variation: population CV = 100 \times \frac{\sigma}{\mu}; sample CV = 100 \times \frac{s}{\bar{x}}
❖ Standardized variable: population z_i = \frac{x_i - \mu}{\sigma}; sample z_i = \frac{x_i - \bar{x}}{s}
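As a quick sketch of the sample versions of these two formulas, with toy data values that are assumed for illustration only:

```python
# Coefficient of variation and z-scores for a sample (assumed toy data),
# following CV = 100 * s / x-bar and z_i = (x_i - x-bar) / s.
import statistics

data = [10, 12, 9, 11, 13, 8, 14]
x_bar = statistics.mean(data)
s = statistics.stdev(data)             # sample std. dev. (n - 1 denominator)
cv = 100 * s / x_bar                   # coefficient of variation, in percent
z = [(xi - x_bar) / s for xi in data]  # standardized values
```

Note that the z-scores of a sample always sum to zero, which is a handy sanity check.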
Q1+Q 2
❖ Midhinge =
2
n

∑ ❑(x i − x )( y i − y)
i=1
❖ Sample correlation coefficient: r =

√ √∑
n n

∑ ❑(xi − x ) 2 2
❑( y i − y )
i =1 i=1

❖ Fences and Unusual Data Values


Inner fences Outer fences
Lower fence Q1 – 1.5(Q3 – Q1) Q1 – 3(Q3 – Q1)
Upper fence Q3 + 1.5(Q3 – Q1) Q3 + 3(Q3 – Q1)
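A minimal sketch of the inner-fence rule above, flagging values beyond the inner fences as unusual (the data values and the quartile method are assumptions for illustration):

```python
# Flag values outside the inner fences as outliers, per the table above.
import statistics

data = [12, 14, 14, 15, 16, 17, 18, 19, 21, 45]
q1, _, q3 = statistics.quantiles(data, n=4)  # Q1, median, Q3
iqr = q3 - q1
lower_inner, upper_inner = q1 - 1.5 * iqr, q3 + 1.5 * iqr
lower_outer, upper_outer = q1 - 3.0 * iqr, q3 + 3.0 * iqr
outliers = [x for x in data if x < lower_inner or x > upper_inner]
```

Values beyond the outer fences would be considered extreme rather than merely unusual.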

CHAPTER 5: Probability
❖ Odds for A: \frac{P(A)}{1 - P(A)}; odds against A: \frac{1 - P(A)}{P(A)}
❖ General Law of Addition: P(A \cup B) = P(A) + P(B) - P(A \cap B)
❖ Conditional probability: P(A \mid B) = \frac{P(A \cap B)}{P(B)}

❖ Independence property: P(A \cap B) = P(A) \cdot P(B)


❖ Bayes’ Theorem: P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A \mid B)\,P(B) + P(A \mid B')\,P(B')}
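A worked sketch of Bayes’ Theorem with assumed numbers (a screening test with 99% sensitivity, a 5% false-positive rate, and 1% prevalence):

```python
# Bayes' Theorem: P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B')P(B')].
# All three input probabilities below are assumed example values.
p_b = 0.01               # P(B): prevalence of the condition
p_a_given_b = 0.99       # P(A|B): probability of a positive test given B
p_a_given_not_b = 0.05   # P(A|B'): false-positive rate
posterior = (p_a_given_b * p_b) / (
    p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)
)                        # P(B|A): probability of B given a positive test
```

Even with a highly accurate test, the low prior drags the posterior down to about 1/6, which is the classic lesson of this theorem.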
❖ Permutation: {}_nP_r = \frac{n!}{(n - r)!}
❖ Combination: {}_nC_r = \frac{n!}{r!(n - r)!}
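The two counting formulas above map directly onto the standard-library helpers `math.perm` and `math.comb`:

```python
# nPr = n! / (n - r)! and nCr = n! / (r! (n - r)!) via the stdlib.
import math

n_p_r = math.perm(10, 3)  # ordered selections of 3 from 10
n_c_r = math.comb(10, 3)  # unordered selections of 3 from 10
```

Dividing nPr by r! gives nCr, since each unordered selection corresponds to r! orderings.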
CHAPTER 6: Discrete Probability Distributions
❖ Uniform PDF: P(X = x) = \frac{1}{b - a + 1}; \quad x = a, a+1, \ldots, b
❖ Binomial PDF: P(X = x) = \frac{n!}{x!(n - x)!}\pi^x(1 - \pi)^{n - x}; \quad x = 0, 1, 2, \ldots, n
❖ Poisson PDF: P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}; \quad x = 0, 1, 2, \ldots
❖ Expected value of a lottery ticket: E(X) = (value if win) \times P(win) + (value if lose) \times P(lose)
❖ Hypergeometric PDF: P(X = x) = \frac{{}_sC_x \; {}_{N-s}C_{n-x}}{{}_NC_n}

❖ Poisson approximation to the binomial: set \lambda = n\pi; P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!} (use when n \geq 20 and \pi \leq 0.05)
❖ Binomial approximation to the hypergeometric: when \frac{n}{N} < 0.05, use n as the sample size and \pi = \frac{s}{N}
❖ Geometric PDF: P(X = x) = \pi(1 - \pi)^{x - 1}; CDF: P(X \leq x) = 1 - (1 - \pi)^x
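A small sketch of the binomial and Poisson PDFs above, including the Poisson approximation to the binomial (the n and π values are assumed and satisfy the n ≥ 20, π ≤ 0.05 condition):

```python
# Binomial and Poisson PDFs, coded directly from the formulas above.
import math

def binom_pmf(x, n, pi):
    return math.comb(n, x) * pi**x * (1 - pi) ** (n - x)

def poisson_pmf(x, lam):
    return lam**x * math.exp(-lam) / math.factorial(x)

# Poisson approximation: lambda = n * pi (assumed n = 100, pi = 0.02)
n, pi = 100, 0.02
exact = binom_pmf(1, n, pi)
approx = poisson_pmf(1, n * pi)
```

For these parameters the approximation agrees with the exact binomial probability to about four decimal places.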

CHAPTER 7: Continuous Probability Distributions


❖ Normal distribution: f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-0.5\left(\frac{x - \mu}{\sigma}\right)^2}
❖ Standard normal distribution: f(z) = \frac{1}{\sqrt{2\pi}} e^{-0.5 z^2}
❖ Exponential distribution PDF: f(x) = \lambda e^{-\lambda x}; standard deviation = \frac{1}{\lambda}
❖ Exponential distribution CDF: P(X \leq x) = 1 - e^{-\lambda x}
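The exponential CDF above can be evaluated in one line; the rate λ below is an assumed example (0.5 arrivals per minute):

```python
# P(X <= x) = 1 - e^(-lambda * x), per the exponential CDF above.
import math

lam = 0.5  # assumed rate: 0.5 arrivals per minute

def exp_cdf(x, lam):
    return 1.0 - math.exp(-lam * x)

p = exp_cdf(2.0, lam)  # probability of waiting at most 2 minutes
```

Since λx = 1 here, the result is 1 − e⁻¹ ≈ 0.632, a useful benchmark: the probability that an exponential waiting time falls below its mean.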
❖ Normal approximation to the binomial: \mu = n\pi; \sigma = \sqrt{n\pi(1 - \pi)}, for n\pi \geq 10 and n(1 - \pi) \geq 10
❖ Normal Approximation to Poisson: μ= λ; σ =√ λ for λ ≥ 10
❖ Linear transformation, Rule 1: \mu_{aX+b} = a\mu_X + b (mean of a transformed variable)
❖ Linear transformation, Rule 2: \sigma_{aX+b} = |a|\sigma_X (standard deviation of a transformed variable)
❖ Linear transformation, Rule 3: \mu_{X+Y} = \mu_X + \mu_Y (mean of a sum of two random variables X and Y)
❖ Linear transformation, Rule 4: \sigma_{X+Y} = \sqrt{\sigma_X^2 + \sigma_Y^2} (standard deviation of a sum, if X and Y are independent)
❖ If X and Y are correlated, with covariance \sigma_{XY}: \sigma_{X+Y} = \sqrt{\sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY}}

CHAPTER 8: Sampling Distributions and Estimation


Commonly Used Formulas in Sampling Distributions
❖ Sample proportion: p = \frac{x}{n}
❖ Standard error of the sample mean: \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
❖ Confidence interval for \mu, known \sigma: \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
❖ Confidence interval for \mu, unknown \sigma: \bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}, with d.f. = n - 1
❖ Standard error of the sample proportion: \sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}

❖ Confidence interval for \pi: p \pm z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}}
❖ Sample size to estimate \mu: n = \left(\frac{z\sigma}{E}\right)^2

❖ Sample size to estimate \pi: n = \left(\frac{z}{E}\right)^2 \pi(1 - \pi)
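A sketch of both sample-size formulas, rounding up to the next whole observation; the desired error E, the planning values σ and π, and the 95% confidence level are all assumed for illustration:

```python
# Sample sizes n = (z*sigma/E)^2 and n = (z/E)^2 * pi*(1 - pi),
# with z from the standard normal (95% two-sided confidence).
import math
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)          # ~1.96 for 95% confidence

# Estimating a mean: assumed sigma = 10, desired error E = 2
n_mean = math.ceil((z * 10 / 2) ** 2)

# Estimating a proportion: conservative pi = 0.5, desired error E = 0.03
n_prop = math.ceil((z / 0.03) ** 2 * 0.5 * 0.5)
```

Using π = 0.5 is the conservative choice when no planning value is available, since it maximizes π(1 − π).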

❖ Interpretation:

P-value     Interpretation
P > 0.05    No evidence against H0
P < 0.05    Moderate evidence against H0
P < 0.01    Strong evidence against H0
P < 0.001   Very strong evidence against H0

CHAPTER 9: One – Sample Hypothesis Test


Commonly Used Formulas in One-Sample Hypothesis Tests
❖ Test statistic for a mean (known \sigma): z_{calc} = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}
❖ Test statistic for a mean (unknown \sigma): t_{calc} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}
❖ Test statistic for a proportion: z_{calc} = \frac{p - \pi_0}{\sigma_p} = \frac{p - \pi_0}{\sqrt{\pi_0(1 - \pi_0)/n}}
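A sketch of the one-sample proportion test above, with an assumed example of 56 successes in 100 trials against π₀ = 0.50:

```python
# z test for a proportion: z_calc = (p - pi0) / sqrt(pi0*(1 - pi0)/n),
# with a two-tailed p-value from the standard normal.
import math
from statistics import NormalDist

x, n, pi0 = 56, 100, 0.50          # assumed example data
p = x / n                          # sample proportion
z_calc = (p - pi0) / math.sqrt(pi0 * (1 - pi0) / n)
p_value = 2 * (1 - NormalDist().cdf(abs(z_calc)))  # two-tailed
```

Here z_calc = 1.2 and the p-value is about 0.23, so by the interpretation table in Chapter 8 there is no evidence against H0.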
❖ Test for one variance (the chi-square distribution compares the sample variance with a benchmark):
\chi^2_{calc} = \frac{(n - 1)s^2}{\sigma^2}, where s^2 is the sample variance, \sigma^2 the population variance, and n the sample size

CHAPTER 10: Two-sample Hypothesis Tests


Commonly Used Formulas in Two-Sample Hypothesis Tests
❖ Test statistic for zero difference of means: t_{calc} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}

❖ Confidence interval for \mu_1 - \mu_2: (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
*Note: for the paired t test, n is the number of pairs
❖ Paired t test: \bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i (mean of the n differences)
❖ Standard deviation of the n differences: s_d = \sqrt{\frac{\sum_{i=1}^{n}(d_i - \bar{d})^2}{n - 1}}
❖ Test statistic for paired samples: t_{calc} = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}
❖ Degrees of freedom: d.f. = n - 1
❖ The ith paired difference is D_i = X_{1i} - X_{2i}
❖ Confidence interval for \mu_D: \bar{D} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}}
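The paired-sample statistics above can be sketched directly; the before/after measurements are assumed toy data:

```python
# Mean difference, standard deviation of differences, and t_calc
# for a paired t test (H0: mu_d = 0), per the formulas above.
import math
import statistics

before = [120, 118, 130, 125, 140]   # assumed paired measurements
after = [115, 117, 126, 122, 135]
d = [b - a for b, a in zip(before, after)]  # paired differences D_i
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)            # n - 1 in the denominator
t_calc = (d_bar - 0) / (s_d / math.sqrt(len(d)))
```

With n = 5 pairs, t_calc is compared against a t critical value with d.f. = n − 1 = 4.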

❖ Test statistic for equality of proportions: z_{calc} = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}},
where the pooled proportion p = \frac{x_1 + x_2}{n_1 + n_2}; \quad p_1 = \frac{x_1}{n_1}; \quad p_2 = \frac{x_2}{n_2}
❖ Confidence interval for the difference of two proportions, \pi_1 - \pi_2:
(p_1 - p_2) \pm z_{\alpha/2}\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}
❖ Test statistic for two variances: F_{calc} = \frac{s_1^2}{s_2^2}, with df_1 = n_1 - 1, df_2 = n_2 - 1
❖ F-test critical values: F_R = F_{df_1, df_2}; \quad F_L = \frac{1}{F_{df_2, df_1}}

CHAPTER 11: Analysis of Variance


❖ One-Factor ANOVA Table:

❖ Tukey’s test: a two-tailed test for equality of paired means from c groups, compared simultaneously
H0: \mu_j = \mu_k
H1: \mu_j \neq \mu_k
T_{calc} = \frac{|\bar{y}_j - \bar{y}_k|}{\sqrt{MSE\left(\frac{1}{n_j} + \frac{1}{n_k}\right)}}
If T_{calc} > T_{c, n-c}, reject H0; T_{c, n-c} is the critical value (p. 449)

❖ Hartley’s test:
H0: \sigma_1^2 = \sigma_2^2 = \ldots = \sigma_c^2
H1: the variances are not all equal
H_{calc} = \frac{s_{max}^2}{s_{min}^2}; reject H0 if H_{calc} > H_{critical} = H_{c, n/c - 1} (p. 451)
CHAPTER 12: Simple Regression
Commonly Used Formulas in Simple Regression
❖ Sample correlation coefficient: r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}

❖ Test statistic for zero correlation: t_{calc} = r\sqrt{\frac{n - 2}{1 - r^2}}, with d.f. = n - 2

❖ Slope of fitted regression: b_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}

❖ Intercept of fitted regression: b_0 = \bar{y} - b_1\bar{x}


❖ Sum of squared residuals: SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = \sum_{i=1}^{n}(y_i - b_0 - b_1 x_i)^2
❖ Coefficient of determination: R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2} = 1 - \frac{SSE}{SST} = \frac{SSR}{SST}
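The slope, intercept, SSE, and R² formulas can be computed directly from their definitions; the x and y values below are assumed toy data:

```python
# Least-squares slope b1, intercept b0, SSE, SST, and R^2,
# coded term by term from the simple-regression formulas above.
x = [1, 2, 3, 4, 5]   # assumed toy data
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx                   # slope
b0 = y_bar - b1 * x_bar          # intercept
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - y_bar) ** 2 for yi in y)
r2 = 1 - sse / sst               # coefficient of determination
```

For this data the fitted line is ŷ = 2.2 + 0.6x with R² = 0.6, i.e. 60% of the variation in y is explained by x.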


❖ Standard error of the estimate: s = \sqrt{\frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n - 2}} = \sqrt{\frac{SSE}{n - 2}}

❖ Standard error of the slope: s_{b_1} = \frac{s}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}}, with d.f. = n - 2
❖ t test for zero slope: t_{calc} = \frac{b_1 - 0}{s_{b_1}}

❖ Confidence interval for the true slope: b_1 - t_{\alpha/2}\,s_{b_1} \leq \beta_1 \leq b_1 + t_{\alpha/2}\,s_{b_1}, with d.f. = n - 2


❖ Confidence interval for the conditional mean of Y: \hat{y}_i \pm t_{\alpha/2}\,s\sqrt{\frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}


❖ Prediction interval for Y: \hat{y}_i \pm t_{\alpha/2}\,s\sqrt{1 + \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}

Excel Output:

Regression Statistics
R Square         R^2 = \frac{SSR}{SST}
Standard Error   s_e = \sqrt{\frac{SSE}{n - 2}}

ANOVA
            df         SS      MS                           F
Regression  k          (SSR)   MSR = \frac{SSR}{k}          \frac{MSR}{MSE}
Residual    n - k - 1  (SSE)   MSE = \frac{SSE}{n - k - 1}
Total       n - 1      (SST)

             Coefficient  Standard Error  t Stat
Intercept    (b_0)        (s_{b_0})       \frac{b_0}{s_{b_0}}
Square feet  (b_1)        (s_{b_1})       \frac{b_1 - \beta_1}{s_{b_1}}
