
24

26
27 26, 30, 24, 32, 32, 31, 27 and 29?
29
30
31 4.5 4
32 2
32 5
37 4
5
2
6
2.333333

0.4 0.2
0.5 0.2375
CHAPTER 6: Discrete Probability Distributions
A discrete random variable has a countable number of distinct values.
PDF
CDF + ax0.99=25
E(X) = µ = Σ x.P(x)

UNIFORM DISTRIBUTION is one of the simplest discrete models.


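It describes a random variable with a finite number of consecutive integer values from a to b.
a     lower limit
b     upper limit
PDF   P(X = x) = 1/(b - a + 1)
CDF   P(X <= x) = (x - a + 1)/(b - a + 1)
mean = µ = (a + b)/2
Std = sqrt(((b - a + 1)^2 - 1)/12)

Example: a = 20, b = 60 -> mean = µ = 40, Std = 11.83216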
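A minimal Python sketch of the discrete uniform calculations for the a = 20, b = 60 example. The value x = 29 is an assumption: it is not stated in the notes, but it is consistent with the listed CDF value 0.2439024 (10/41). The function name is hypothetical.

```python
import math

def discrete_uniform(a: int, b: int, x: int):
    """Return (pdf, cdf_le, cdf_gt, mean, std) for a discrete uniform on a..b."""
    k = b - a + 1                      # number of equally likely integer values
    pdf = 1 / k                        # P(X = x)
    cdf_le = (x - a + 1) / k           # P(X <= x)
    mean = (a + b) / 2
    std = math.sqrt((k ** 2 - 1) / 12)
    return pdf, cdf_le, 1 - cdf_le, mean, std

# x = 29 is assumed, see the note above
pdf, cdf_le, cdf_gt, mean, std = discrete_uniform(20, 60, 29)
print(pdf, cdf_le, cdf_gt, mean, std)
```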

BINOMIAL DISTRIBUTION describes the number of successes in a fixed number of independent trials,
where each trial has only two possible outcomes: success or failure.
P(0)+P(1) = (1-p)+p = 1

p   prob of success
n   number of trials
mean = µ = n.p
std = σ = sqrt(n.p.(1-p))

Example: p = 0.9, n = 200, x = 6 -> mean = µ = 180, std = σ = 4.242641

Skewed right if π < 0.5
Skewed left if π > 0.5
Symmetric if π = 0.5
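The binomial example (p = 0.9, n = 200, x = 6) can be checked with a short Python sketch. It uses only the standard library; the helper name `binom_pmf` is hypothetical.

```python
import math

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p, x = 200, 0.9, 6
mean = n * p
std = math.sqrt(n * p * (1 - p))
pdf = binom_pmf(x, n, p)                                   # P(X = 6)
cdf_le = sum(binom_pmf(k, n, p) for k in range(x + 1))     # P(X <= 6)
print(mean, std, pdf, cdf_le)
```

With π = 0.9 > 0.5 the distribution is skewed left, which is why almost no probability sits at x = 6.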
APPROXIMATION "BINO & POISSON": if n >= 20 and π <= 0.05, use Poisson with λ = n.p
n         500
p         0.003
λ = n.p   1.5
x         0
PDF       0.2231
CDF       0.2231
1 - CDF   0.7769
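A small Python sketch comparing the Poisson approximation with the exact binomial value for n = 500, p = 0.003. It assumes x = 0, which is not written in the notes but matches PDF = CDF = 0.2231.

```python
import math

n, p, x = 500, 0.003, 0          # x = 0 is assumed, see the note above
lam = n * p                      # λ = n.p = 1.5

poisson = math.exp(-lam) * lam ** x / math.factorial(x)
binomial = math.comb(n, x) * p ** x * (1 - p) ** (n - x)
print(round(poisson, 4), round(binomial, 4))
```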

APPROXIMATION "BINO & HYPERGEO" pi=s/N if n/N < 0.05 and symmetric if π = 0.05
p= s/N
n/N #REF!
N s
n x


Uniform example results (a = 20, b = 60): PDF 0.0243902, CDF(few) 0.2439024, CDF 0.7560976


π = s/N
std = √(n.π(1−π)).√((N−n)/(N−1))
Binomial example results (p = 0.9, n = 200, x = 6): PDF 4.38E-184, CDF (few) 4.395E-184, CDF 1
POISSON DISTRIBUTION describes the number of times an event occurs in a fixed interval of time
or space, given the average rate of occurrence and assuming that the events occur independently of each other.
λ = 1 / (time needed for 1 event)

x: number of events
mean = λ
std = σ = √λ

Example: x = 2, λ = 1.8 -> std = σ = 1.341641, PDF 0.267784, CDF (few) 0.730621, CDF 0.269379

Always right-skewed;
the larger λ is, the less steep the curve.
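The Poisson example (λ = 1.8, x = 2) can be reproduced with a short standard-library sketch; `poisson_pmf` is a hypothetical helper name.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam, x = 1.8, 2
pdf = poisson_pmf(x, lam)                                  # P(X = 2)
cdf_le = sum(poisson_pmf(k, lam) for k in range(x + 1))    # P(X <= 2)
print(pdf, cdf_le, 1 - cdf_le, math.sqrt(lam))
```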

HYPERGEOMETRIC DISTRIBUTION (items are drawn in sequence)
describes the probability of obtaining a certain number of successes when the events being studied are non-independent.

N: pop size
n: sample size
s: number of successes in the population
x: number of successes in the sample
π = s/N
std = √(n.π(1−π)).√((N−n)/(N−1))

Example: N = 9, n = 5, s = 4, x = 4 -> PDF 0.040, CDF(less) 1.000, CDF 0.000, π = 0.444, std 0.786
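A minimal sketch of the hypergeometric example (N = 9, n = 5, s = 4, x = 4), assuming sampling without replacement. The function name is hypothetical.

```python
import math

def hypergeom_pmf(x: int, N: int, n: int, s: int) -> float:
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items that contains s successes."""
    return math.comb(s, x) * math.comb(N - s, n - x) / math.comb(N, n)

N, n, s, x = 9, 5, 4, 4
pi = s / N
std = math.sqrt(n * pi * (1 - pi)) * math.sqrt((N - n) / (N - 1))
pdf = hypergeom_pmf(x, N, n, s)
print(round(pdf, 3), round(pi, 3), round(std, 3))
```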

GEOMETRIC DISTRIBUTION models the number of trials required to achieve the first success in a sequence of independent trials,
where the probability of success is constant across all trials.
π: probability of success
mean = µ = 1/π
std = σ = sqrt((1−π)/π^2)

Example: p = 0.5, x = 10 -> mean = µ = 2, std = σ = 1.4142136, PDF 0.00098, CDF (few) 0.99902, CDF 0.000977
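The geometric example (p = 0.5, x = 10) as a short Python sketch, using the convention that X counts the trial on which the first success occurs.

```python
import math

p, x = 0.5, 10
pdf = (1 - p) ** (x - 1) * p          # first success exactly on trial x
cdf_le = 1 - (1 - p) ** x             # first success within x trials
mean = 1 / p
std = math.sqrt((1 - p) / p ** 2)
print(pdf, cdf_le, 1 - cdf_le, mean, std)
```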


x   P(x)   x.P(x)   E(x) = µ   x-µ   (x-µ)^2   (x-µ)^2.P(x)   var   std
CHAPTER 7: CONTINUOUS PROBABILITY DISTRIBUTIONS

APPROXIMATION Normal = Binomial (continuity correction)

Binomial P(X >= 18) -> Normal P(X > 17.5)
Binomial P(X <= 18) -> Normal P(X < 18.5)
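A sketch of the half-unit shift used when a normal curve approximates a binomial probability. The notes give no n or p for this rule, so n = 50 and p = 0.3 below are hypothetical values chosen only for illustration.

```python
import math
from statistics import NormalDist

n, p = 50, 0.3                       # hypothetical values, not from the notes
approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))

# Exact binomial P(X >= 18)
exact_ge_18 = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
                  for k in range(18, n + 1))
normal_gt_17_5 = 1 - approx.cdf(17.5)   # Binomial P(X >= 18) -> Normal P(X > 17.5)
normal_lt_18_5 = approx.cdf(18.5)       # Binomial P(X <= 18) -> Normal P(X < 18.5)
print(exact_ge_18, normal_gt_17_5, normal_lt_18_5)
```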

APPROXIMATION Normal = Poisson


If λ >= 10,
UNIFORM CONTINUOUS DISTRIBUTION describes a continuous random variable
that is equally likely to take on any value within a specified interval.
PDF has constant height, CDF increases linearly to 1
For a continuous variable, >= and > are the same
U(a,b)
a     lower limit
b     upper limit
PDF   f(x) = 1/(b - a)
CDF   P(X <= x) = (x - a)/(b - a)
Mean = µ = (a + b)/2
Std = sqrt((b - a)^2/12)

Example: a = 10, b = 16, x = 13 -> Mean = µ = 13, Std = 1.732050808

P(c < X < d) = (d - c)/(b - a)
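The U(10, 16) example as a short sketch. The interval (11, 14) passed to `prob_between` is a hypothetical choice for illustration; the function name is also hypothetical.

```python
import math

a, b, x = 10, 16, 13
pdf = 1 / (b - a)                 # constant height of the density
cdf_le = (x - a) / (b - a)        # P(X <= x)
mean = (a + b) / 2
std = math.sqrt((b - a) ** 2 / 12)

def prob_between(c: float, d: float) -> float:
    """P(c < X < d), valid for a <= c <= d <= b."""
    return (d - c) / (b - a)

print(pdf, cdf_le, mean, std, prob_between(11, 14))
```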

NORMAL DISTRIBUTION
Bell-shaped / Symmetric / Mesokurtic
Mean, median, and mode are all equal and are located at the center of the curve.
Z = (x - µ)/σ
Example: µ = 14, σ = 3, x = 25 -> z = 3.666667
PDF (casio)

STANDARD NORMAL DISTRIBUTION


z = (x - µ)/σ
µ 7000, σ 420, x 6000
z -1.72
PDF
CDF = NORM.S.DIST(z, 1)
abc was 2.7 std above the mean -> z = 2.7
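A sketch of standardizing a value and reading the standard normal CDF with Python's `statistics.NormalDist`, which plays the role of NORM.S.DIST(z, 1). It uses the µ = 14, σ = 3, x = 25 values and the z = -1.72 value from these notes; `z_score` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    return (x - mu) / sigma

std_normal = NormalDist()                    # mean 0, standard deviation 1
z = z_score(25, 14, 3)
cdf_few = std_normal.cdf(z)                  # P(Z <= z)
left = std_normal.cdf(-1.72)                 # P(Z <= -1.72)
print(z, cdf_few, left, 1 - left)
```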

EXPONENTIAL DISTRIBUTION
median = ln(2)/λ
1/λ: time needed for 1 event

Mean = 1/λ
std = σ = 1/λ
PDF   f(x) = λ.e^(−λx)
CDF   P(X <= x) = 1 − e^(−λx)

Example: mean rate λ = 3.60, x = 0.5
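A minimal sketch of the exponential calculations for mean rate λ = 3.60 and x = 0.5. The interval c = 2, d = 4 is taken from the "between" row that appears later in these notes.

```python
import math

lam, x = 3.6, 0.5
cdf_le = 1 - math.exp(-lam * x)                     # P(X <= x)
cdf_gt = math.exp(-lam * x)                         # P(X > x)
between = math.exp(-lam * 2) - math.exp(-lam * 4)   # P(2 < X < 4)
median = math.log(2) / lam
print(cdf_le, cdf_gt, between, median)
```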

note: 7.61 / 305



PDF -9.9375
CDF(few) 0.5
CDF 0.5
P(c<x<d) 166.6667 c 3000 d 4000


CDF(few) 0.999877 inv 25,4,3


CDF(more) 25

CDF(few) 0.042716
CDF(more) 0.957284
between cd 3.921E-55 c 225 d 450

CDF (few) 0.834701


CDF 0.165299
between 0.000746 c 2 d 4
CHAPTER 8: SAMPLING DISTRIBUTION

CI
90%
95%
99%
alpha
0.1

alpha
0.01

Margin error
75

alpha
0.1

Margin error
0.5

THIS TABLE IS ONLY FOR Z; FOR T, LOOK IT UP IN APPENDIX D
CENTRAL LIMIT THEOREM
The distribution of the sample mean X̄ approaches a normal distribution with mean μ and standard deviation σ/√n as the sample size increases.
Expected range of Sample Means

CI    alpha   z_a/2       Note: for Z, use alpha/2
90%   0.1     -1.644854   0.024998
95%   0.05    -1.959964
99%   0.01    -2.575829

µ    σ      n    z           Std error   lower      upper
25   1.25   16   -1.644854   0.3125      24.48598   25.51402
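A sketch of the expected range of sample means for µ = 25, σ = 1.25, n = 16, assuming a 90% range (alpha = 0.10, so the z value uses alpha/2 in each tail).

```python
import math
from statistics import NormalDist

mu, sigma, n, alpha = 25, 1.25, 16, 0.10
z = NormalDist().inv_cdf(1 - alpha / 2)      # z_a/2 for the upper tail
std_error = sigma / math.sqrt(n)
lower = mu - z * std_error
upper = mu + z * std_error
print(std_error, lower, upper)
```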

CI for a mean (z)
x̄      s      n    z_a/2   Std error   lower      upper      width
36.4   14.5   40   2.576   2.292       30.49581   42.30419   -5.904192 / 5.904192

t interval inputs: alpha 0.05, x̄ 45.66
Margin error   t   4

Sample size determination (mean)
alpha   s     z_a/2       Margin error   n
0.01    300   -2.575829   75             106.1583
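A sketch of the z interval for a mean and of the sample size needed for a target margin of error. The margin E = 75 is taken from the "Margin error 75" entry in these notes; the function names are hypothetical. The interval differs from the listed bounds in the third decimal because the notes use a rounded standard error of 2.292.

```python
import math
from statistics import NormalDist

def ci_mean_z(xbar: float, s: float, n: int, z: float):
    """Return (lower, upper) for xbar +/- z * s / sqrt(n)."""
    std_error = s / math.sqrt(n)
    return xbar - z * std_error, xbar + z * std_error

def sample_size_mean(alpha: float, sigma: float, margin: float) -> float:
    """n = (z_a/2 * sigma / E)^2, before rounding up."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (z * sigma / margin) ** 2

lower, upper = ci_mean_z(36.4, 14.5, 40, 2.576)
n_needed = sample_size_mean(0.01, 300, 75)
print(lower, upper, n_needed, math.ceil(n_needed))
```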


METHOD TO ESTIMATE SIGMA (for p)
M1: Assume that π = 0.5
M2: If π differs a lot from 0.5, use p in place of π
M3

CI for a proportion
x    n    p      z_a/2       std error   MOE        lower      upper      width
12   25   0.48   -1.644854   0.09992     0.164354   0.315646   0.644354   -0.164354 / 0.164354

Sample size determination (proportion)
alpha   p          z_a/2       Margin error   n
0.05    0.789474   -1.959964   0.5            2.553878

The width of the confidence interval for π depends on:
• Sample size
• Confidence level
• Sample proportion p
E is inversely proportional to n: to reduce E, increase the sample size.

margin of error E   0.09992
n                   250
π or p              0.06
std propor          0.01502
CHECK NORMAL?       Yes
CHECK NORMAL?       Yes

FPCF (used when n/N is greater than 5%)
N            1000
n            90
Check > 5%   TRUE
FPCF         0.954417


CI for a mean (t)
s       n    t          df   estimated std error   lower       upper      width
27.79   21   2.085963   20   6.06427516965823      33.010144   58.30986   -12.649856 / 12.64986
sample size determination   0

METHOD TO ESTIMATE SIGMA (for mean)
M2: Assume Uniform POP   σ = √[(b - a)^2/12]
M3: Assume Normal POP    σ = (b - a)/6
M4: Poisson Arrivals     σ = √λ



CHAPTER 9: ONE SAMPLE HYPOTHESES TEST
Tests whether the sample mean differs significantly from the hypothesized population mean.

TYPE I AND TYPE II ERROR

Type I error (also called a false positive).


Type II error (also called a false negative).

DECISION RULES AND CRITICAL VALUES

Find Critical value of Z

alpha   Right tailed   Left tailed   Two tailed
0.05    1.645          -1.645        -1.960 / 1.960
0.1     1.282          -1.282        -1.645 / 1.645
0.01    2.326          -2.326        -2.576 / 2.576
0.025   1.960          -1.960        -2.241 / 2.241
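The critical values in this table can be regenerated with the inverse standard normal CDF. A minimal sketch, with a hypothetical helper name:

```python
from statistics import NormalDist

def z_critical(alpha: float):
    """Return (right-tailed, left-tailed, (two-tailed lower, upper)) critical z values."""
    std_normal = NormalDist()
    right = std_normal.inv_cdf(1 - alpha)        # all of alpha in the right tail
    two = std_normal.inv_cdf(1 - alpha / 2)      # alpha/2 in each tail
    return right, -right, (-two, two)

for alpha in (0.05, 0.1, 0.01, 0.025):
    right, left, two = z_critical(alpha)
    print(alpha, round(right, 3), round(left, 3), round(two[0], 3), round(two[1], 3))
```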
TESTING A MEAN: KNOWN POPULATION VARIANCE
Critical value is the boundary between two regions (reject Ho, do not reject Ho)

Z-Test
if two tailed P > alpha, cannot reject Ho

alpha   x̄       µo   s      n
0.05    55.82   56   0.77   49

Left       Z_crit -1.645   Z_calc -1.636   P-value 0.0509
Right      Z_crit 1.645    Z_calc -1.636   P-value 0.9491
Two Tail   Z_a/2 -1.960    Z_calc -1.636   P-value 0.1018
           CI: lower 55.604, upper 56.036 (width -0.216 / 0.216)
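A sketch of the one-sample z test for x̄ = 55.82, µo = 56, s = 0.77, n = 49. It computes the two-tailed p-value as twice the smaller tail area, which always stays between 0 and 1.

```python
import math
from statistics import NormalDist

alpha, xbar, mu0, sigma, n = 0.05, 55.82, 56, 0.77, 49
std_normal = NormalDist()

z_calc = (xbar - mu0) / (sigma / math.sqrt(n))
p_left = std_normal.cdf(z_calc)          # left-tailed p-value
p_right = 1 - p_left                     # right-tailed p-value
p_two = 2 * min(p_left, p_right)         # two-tailed p-value
print(round(z_calc, 3), round(p_left, 4), round(p_right, 4), round(p_two, 4))
print("reject Ho (two tailed):", p_two <= alpha)
```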

alpha   x̄     µo   s    n    df = n-1
0.05    209   60   13   20   19

Left       T_crit -1.729   T_calc 51.258   P-value 1.0000
Right      T_crit 1.7291   T_calc 51.258   P-value 0.0000
Two Tail   T_crit 2.0930   T_calc 51.258   P-value 0.0000
           CI: lower 202.916, upper 215.084 (width -6.084 / 6.084)

alpha   x    n     p      p_0
0.05    39   150   0.26   0.02

Left       z_crit -1.644854   z_calc 20.995626   p_value 1
Right      z_crit 1.644854    z_calc 20.995626   p_value 0
Two Tail   z_crit -1.959964   z_calc 20.995626   p_value 0

Check normal
x    30
n    150
π    0.02
Check normal   no
Check normal   Yes
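A sketch of the one-sample proportion z test for x = 39, n = 150, p_0 = 0.02. The normality check uses the rule of thumb n.π >= 10 and n.(1−π) >= 10, which is an assumption here: the notes record only the "no" and "Yes" outcomes, not the rule itself.

```python
import math
from statistics import NormalDist

alpha, x, n, pi0 = 0.05, 39, 150, 0.02
p = x / n
z_calc = (p - pi0) / math.sqrt(pi0 * (1 - pi0) / n)
p_right = 1 - NormalDist().cdf(z_calc)
p_two = 2 * min(p_right, 1 - p_right)
# Assumed rule of thumb: both expected counts should be at least 10
normal_ok = (n * pi0 >= 10, n * (1 - pi0) >= 10)
print(p, round(z_calc, 6), p_right, p_two, normal_ok)
```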
INDEPENDENT SAMPLE
Z_Test KNOWN VARIANCE
alpha 0.01
s1 3       s2 3
x̄1 13.4    x̄2 15.2
n1 18      n2 18
z_crit -2.575829
z_calc -1.8000
p_value

INDEPENDENT SAMPLE
T_Test UNKNOWN VARIANCE
alpha 0.05
x̄1 240.000    x̄2 252.000
S1 20.000     S2 15.000
n1 10         n2 14
n1-1 9        n2-1 13
s1^2/n1 40    s2^2/n2 16.07143

LEFT
Equal variance     d.f 22   T_crit -1.717   T_calc -1.603   p_value 0.061648   Sp^2 296.5909
Unequal variance   d.f 16   T_crit -1.753   T_calc -1.603   p_value 0.06494
RIGHT
Equal variance     T_crit 1.717    T_calc -1.603   p_value 0.938352
Unequal variance   T_crit 1.753    T_calc -1.603   p_value 0.93506
TWO TAIL
Equal variance     T_crit 2.074    T_calc -1.603   p_value 0.123297
Unequal variance   T_crit 2.1314   T_calc -1.603   p_value 0.129880
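A sketch of the two-sample t statistics for x̄1 = 240, S1 = 20, n1 = 10 and x̄2 = 252, S2 = 15, n2 = 14. It computes both the unequal-variance (Welch) statistic and the pooled statistic so they can be compared. With these inputs the listed T_calc of -1.603 corresponds to the unequal-variance formula, while the pooled formula gives a slightly different value. Variable names are hypothetical.

```python
import math

x1, s1, n1 = 240.0, 20.0, 10
x2, s2, n2 = 252.0, 15.0, 14

v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2      # s1^2/n1 and s2^2/n2
# Pooled variance Sp^2 with n1 + n2 - 2 degrees of freedom
sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
t_unequal = (x1 - x2) / math.sqrt(v1 + v2)
t_pooled = (x1 - x2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
# Welch-Satterthwaite degrees of freedom for the unequal-variance case
df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
print(sp2, math.sqrt(sp2), t_unequal, t_pooled, df_welch)
```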

PAIRED
Ho: u_d = 0
H1: u_d ≠ 0

alpha   d̄        S_d     n   d.f = n-1
0.05    0.8286   1.755   7   6

COMPARE PROPORTION

alpha 0.05
x1 70    x2 104
n1 140   n2 260
p1 0.5   p2 0.4
pc, z_crit, z_calc, p_value
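A sketch of the two-proportion z test for x1 = 70, n1 = 140 and x2 = 104, n2 = 260, using the pooled proportion pc in the standard error.

```python
import math
from statistics import NormalDist

x1, n1, x2, n2 = 70, 140, 104, 260
p1, p2 = x1 / n1, x2 / n2
pc = (x1 + x2) / (n1 + n2)                               # pooled proportion
z_calc = (p1 - p2) / math.sqrt(pc * (1 - pc) * (1 / n1 + 1 / n2))
p_right = 1 - NormalDist().cdf(z_calc)                   # right-tailed p-value
p_two = 2 * min(p_right, 1 - p_right)                    # two-tailed p-value
print(pc, round(z_calc, 6), round(p_right, 6), round(p_two, 6))
```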

F- TEST : COMPARE TWO VARIANCE


If the test statistic F is much less than 1 or much greater than 1, we would reject the hypothesis of equal population variances
LEFT a s1 s2
0.05 n1 3 n2
df1 2 df2
RIGHT a s1 s2
0.05 n1 n2
df1 df2

TWO TAIL a s1 s2
0.05 n1 n2
df1 df2

FOLDED F TEST
Ho: u1 - u2 = 0
H1: u1 - u2 ≠ 0

Sp 17.22181
LEFT

INDEPENDENT SAMPLE

TWO TAIL
sample 1   5.1   3     12.1   6.2    11.5   7.8   2.2
sample 2   3.2   2.2   8.7    7.7    9.4    7.8   3.1
diff       1.9   0.8   3.4    -1.5   2.1    0     -0.9

mean   0.82857143
Std    1.75472152

Criti T   T_calc   p_value
-1.943    1.249    0.1290
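A sketch of the paired-difference calculation using the seven pairs that appear in these notes (the seventh pair, 2.2 and 3.1, is listed separately further down). The list names `sample_1` and `sample_2` are hypothetical labels.

```python
import math
from statistics import mean, stdev

sample_1 = [5.1, 3, 12.1, 6.2, 11.5, 7.8, 2.2]
sample_2 = [3.2, 2.2, 8.7, 7.7, 9.4, 7.8, 3.1]

diff = [a - b for a, b in zip(sample_1, sample_2)]
d_bar = mean(diff)                         # mean difference
s_d = stdev(diff)                          # sample standard deviation (n - 1)
n = len(diff)
t_calc = d_bar / (s_d / math.sqrt(n))
print(round(d_bar, 6), round(s_d, 6), round(t_calc, 3), n - 1)
```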

pc 0.435   CI: -0.002008 to 0.2020084
          Z-LEFT     Z-RIGHT    Z-TWO TAIL
z_crit    -1.6449    1.644854   -1.95996398
z_calc    1.924207   1.924207   1.92420724
p_value   0.9728     0.027164   0.05432861

F_crit 0.05218
4 F_calc #DIV/0!
3 p-value
F_crit
F_calc
p-value

F_crit
F_calc
p-value
2.2
3.1

-0.9
CHAPTER 11 Analysis of Variance
ANOVA Assumptions
• Observations on Y are independent.
• Populations being sampled are normal.
• Populations being sampled have equal variances

Ho: u1=u2=u3=u4
H1: Not all the means are equal
F-calc > F-crit -> reject Ho

c (number of groups)   5      df1 = c-1   4
n (overall size)       22     df2 = n-c   15
alpha                  0.05

group mean   overall mean   group size
22.7         489.1169
20.5         489.1169
#DIV/0!      489.1169

SSB = sumS   744.000   MSB 186
SSE = sumS   751.500   MSE 50.1
SS_Total     1495.5
F_calc 3.713   F_crit 3.056   P_value 0.027136
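A sketch of how the ANOVA F statistic follows from the sums of squares and degrees of freedom listed here. F_crit is copied from the notes rather than computed, since the standard library has no F distribution.

```python
ssb, sse = 744.0, 751.5        # between-group and within-group sums of squares
df1, df2 = 4, 15               # c - 1 and n - c as listed in the notes

msb = ssb / df1                # mean square between groups
mse = sse / df2                # mean square error
f_calc = msb / mse
f_crit = 3.056                 # taken from the notes (alpha = 0.05)
ss_total = ssb + sse
print(msb, mse, round(f_calc, 3), ss_total, f_calc > f_crit)
```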

630.83300
CHAPTER 12: SIMPLE REGRESSION 2.9999838

Correlation Coefficient is denoted r. Its value falls in the interval [-1; 1].
It measures the degree of linearity in the relationship between two random variables X and Y.

4601

r 0.803546

array1 (xi)   array2 (yi)
1             0.1
3             0.4
5             0.3
2             0.5
6             0.8
7             0.87
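The listed r can be reproduced from the two arrays with a short Pearson correlation sketch; `pearson_r` is a hypothetical helper name.

```python
import math

xs = [1, 3, 5, 2, 6, 7]
ys = [0.1, 0.4, 0.3, 0.5, 0.8, 0.87]

def pearson_r(x, y):
    """Sample correlation: Sxy / sqrt(Sxx * Syy)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(xs, ys)
print(round(r, 6))
```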

Tests for Significant Correlation Using Student’s t


The sample correlation is used to estimate the population correlation
H0: ρ = 0
H1: ρ ≠ 0

alpha 0.05   r 0.6   n 25   df = n-2 = 23
t_crit = T.INV.2T(alpha, deg_freedom)   2.068658
t_calc                                  3.596874
p_value = T.DIST.2T(t_calc, df)         0.000761
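A sketch of the test statistic for a significant correlation, t = r.sqrt((n−2)/(1−r^2)), with r = 0.6 and n = 25. The critical value is copied from the notes rather than computed, since the standard library has no Student's t distribution.

```python
import math

r, n = 0.6, 25
df = n - 2
t_calc = r * math.sqrt(df / (1 - r ** 2))
t_crit = 2.068658            # T.INV.2T(0.05, 23), taken from the notes
print(df, round(t_calc, 6), abs(t_calc) > t_crit)
```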

SLOPE & INTERCEPT

alpha
n
df=n-2

t-crit= T.INV.2T(alpha,df)

It ranges from -1.0 to +1.0 inclusive


It measures the strength of the relationship between two variables
A value of 0.00 indicates two variables are not related
Whether population correlation is zero
alpha 0.05   n 25   df = n-2 = 23
Slope       b1 1.9641   S_b1 0.86095   t_slope 2.281317      T-crit 1.713872   p 0.016056
Intercept   b0          Sb0            t_intercept #DIV/0!   T-crit 2.068658
