
Chap 2: Types of variables:

4 levels of measurement:
Nominal (label, name, coded...)
Ordinal (order, ranking; distances between values are not meaningful)
Interval (meaningful intervals but no true zero value, e.g. degrees Celsius)
Ratio (height, weight, money, income, age, return on investment): meaningful zero point.

Types of data:
Numerical (quantitative) (e.g. gallons, miles)
-discrete and continuous (5 vs 0.5)
Categorical (qualitative)
-verbal, coded
+time series (comparison across days, months)
+cross-sectional (comparison across observations at one point in time, e.g. in a class)

Chap 1 There are two primary kinds of statistics:


• Descriptive statistics refers to the collection,
organization, presentation, and summary of
data (either using charts and graphs or using a
numerical summary).
• Inferential statistics refers to generalizing from a
sample to a population, estimating
unknown population parameters, drawing
conclusions, and making decisions.

Chapter 3 and 4: Numerical Descriptive Measures

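Topic: Formula

Deviations from the mean sum to zero: $\sum (x_i - \bar{x}) = 0$

Sample Mean: $\bar{x} = \frac{\sum x_i}{n}$

Population Mean: $\mu = \frac{\sum x_i}{N}$

Weighted Mean: $\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$

Percentile Location: $L_p = (n + 1)\frac{p}{100}$

Range: Range = Max - Min

Sample MAD: $\frac{\sum |x_i - \bar{x}|}{n}$

Population MAD: $\frac{\sum |x_i - \mu|}{N}$

Sample Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

Sample Standard Deviation: $s = \sqrt{s^2}$

Population Variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$

Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$

Sample Coefficient of Variation: $CV = \frac{s}{\bar{x}}$

Population Coefficient of Variation: $CV = \frac{\sigma}{\mu}$

Sharpe Ratio: $\frac{\bar{x}_I - \bar{R}_f}{s_I}$

z-Score: $z = \frac{x - \bar{x}}{s}$

Sample Mean for Grouped Data: $\bar{x} = \frac{\sum m_i f_i}{n}$

Sample Variance for Grouped Data: $s^2 = \frac{\sum (m_i - \bar{x})^2 f_i}{n - 1}$

Population Mean for Grouped Data: $\mu = \frac{\sum m_i f_i}{N}$

Population Variance for Grouped Data: $\sigma^2 = \frac{\sum (m_i - \mu)^2 f_i}{N}$

Sample Covariance: $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

Population Covariance: $\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$

Sample Correlation Coefficient: $r_{xy} = \frac{s_{xy}}{s_x s_y}$

Population Correlation Coefficient: $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$

The second quartile Q2 is the median.

The interquartile range, IQR = Q3 - Q1, measures the spread of the middle 50 percent of the data.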
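A minimal sketch of the Chapter 3 and 4 measures in Python, assuming a small made-up sample (the lists `x` and `y` are hypothetical and only illustrate the arithmetic). The quartiles use `statistics.quantiles`, whose default "exclusive" method interpolates at position (n + 1)p/100.

```python
import math
import statistics

# Hypothetical paired sample, chosen only for illustration.
x = [2, 4, 4, 5, 7, 9]
y = [1, 3, 2, 5, 6, 8]
n = len(x)

x_bar = sum(x) / n                                  # sample mean
y_bar = sum(y) / n
mad = sum(abs(v - x_bar) for v in x) / n            # sample MAD
s2 = sum((v - x_bar) ** 2 for v in x) / (n - 1)     # sample variance
s_x = math.sqrt(s2)                                 # sample standard deviation
cv = s_x / x_bar                                    # coefficient of variation
z_scores = [(v - x_bar) / s_x for v in x]           # z-score of each value

# Sample covariance and correlation coefficient.
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)
s_y = statistics.stdev(y)
r_xy = s_xy / (s_x * s_y)

# Quartiles and interquartile range (Q2 is the median).
q1, q2, q3 = statistics.quantiles(x, n=4)
iqr = q3 - q1

print(x_bar, s2, cv, r_xy, q1, q2, q3, iqr)
```

The same numbers can be cross-checked against `statistics.mean`, `statistics.variance`, and `statistics.median`.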

Chapter 5: Introduction to Probability

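Intersection ($A \cap B$, "and"), union ($A \cup B$, "or").

Topic: Formula

Complement Rule: $P(A^c) = 1 - P(A)$

Addition Rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Addition Rule for Mutually Exclusive Events: $P(A \cup B) = P(A) + P(B)$, because $P(A \cap B) = 0$ when A and B are mutually exclusive

Conditional Probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

Multiplication Rule: $P(A \cap B) = P(A \mid B)\,P(B)$

A and B independent: $P(A \cap B) = P(A)\,P(B)$ (note: use this to recognize independent events)

Independent events: $P(A \mid B) = P(A)$, $P(B \mid A) = P(B)$

Total Probability Rule: $P(A) = P(A \cap B) + P(A \cap B^c) = P(A \mid B)P(B) + P(A \mid B^c)P(B^c)$

Bayes' Theorem: $P(B \mid A) = \frac{P(A \cap B)}{P(A \cap B) + P(A \cap B^c)}$
or $P(B \mid A) = \frac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid B^c)P(B^c)}$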
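A short sketch of the complement rule, total probability rule, Bayes' theorem, and the independence check. The three input probabilities are hypothetical values picked for illustration.

```python
import math

# Hypothetical inputs: P(B), P(A|B) and P(A|B^c).
p_b = 0.01
p_a_given_b = 0.95
p_a_given_not_b = 0.05

p_not_b = 1 - p_b                                        # complement rule
p_a = p_a_given_b * p_b + p_a_given_not_b * p_not_b      # total probability rule
p_b_given_a = p_a_given_b * p_b / p_a                    # Bayes' theorem

p_a_and_b = p_a_given_b * p_b                            # multiplication rule
p_a_or_b = p_a + p_b - p_a_and_b                         # addition rule

# A and B are independent only if P(A and B) equals P(A) * P(B).
independent = math.isclose(p_a_and_b, p_a * p_b)

print(p_a, p_b_given_a, p_a_or_b, independent)
```

With these inputs A and B are not independent, since P(A|B) differs from P(A).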

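Chapter 6: Discrete Probability Distributions

Probabilities sum to one: $\sum P(x) = 1$

Topic: Formula

Expected Value of a Discrete Random Variable: $E(X) = \mu = \sum x_i P(X = x_i)$
Variance of a Discrete Random Variable: $Var(X) = \sigma^2 = \sum (x_i - \mu)^2 P(X = x_i)$
Standard Deviation of a Discrete Random Variable: $\sigma = \sqrt{\sigma^2}$

Binomial Distribution (n Bernoulli trials): $P(X = x) = \binom{n}{x} \pi^x (1 - \pi)^{n - x}$
At least 2: $P(X \ge 2) = 1 - P(X < 2) = 1 - P(X \le 1)$
At most 2: $P(X \le 2)$
Fewer than 2: $P(X < 2) = P(X \le 1)$
Fewer than 2 or more than 2: $P(X < 2) + P(X > 2)$
Excel* PDF =BINOM.DIST(x, n, π, 0) = P(X = x)
Excel* CDF =BINOM.DIST(x, n, π, 1) = P(X ≤ x)
π < .50 skewed right
π = .50 symmetric
π > .50 skewed left

Expected Value of a Binomial Random Variable: $E(X) = n\pi$
Variance of a Binomial Random Variable: $Var(X) = n\pi(1 - \pi)$
Standard Deviation of a Binomial Random Variable: $\sigma = \sqrt{n\pi(1 - \pi)}$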
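A minimal sketch of the binomial "at least / at most / fewer than" cases, assuming hypothetical values n = 10 and π = 0.3. The helpers `binom_pmf` and `binom_cdf` are illustrative names that mirror the PDF and CDF forms of Excel's BINOM.DIST.

```python
from math import comb, sqrt

def binom_pmf(x, n, pi):
    """P(X = x) for a binomial with n trials and success probability pi."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

def binom_cdf(x, n, pi):
    """P(X <= x), the running sum of the PMF from 0 to x."""
    return sum(binom_pmf(k, n, pi) for k in range(x + 1))

n, pi = 10, 0.3  # hypothetical parameters

p_at_least_2 = 1 - binom_cdf(1, n, pi)       # P(X >= 2) = 1 - P(X <= 1)
p_at_most_2 = binom_cdf(2, n, pi)            # P(X <= 2)
p_fewer_than_2 = binom_cdf(1, n, pi)         # P(X < 2) = P(X <= 1)
p_not_2 = p_fewer_than_2 + (1 - p_at_most_2) # P(X < 2) + P(X > 2)

mean = n * pi
variance = n * pi * (1 - pi)
sd = sqrt(variance)

print(p_at_least_2, p_at_most_2, p_fewer_than_2, p_not_2, mean, sd)
```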

Poisson Distribution: $P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}$
Always skewed right; the skew is smaller when λ is large.
Excel* PDF =POISSON.DIST(x, λ, 0)
Excel* CDF =POISSON.DIST(x, λ, 1)

Expected Value of a Poisson Random Variable: $E(X) = \lambda$
Variance of a Poisson Random Variable: $Var(X) = \lambda$
Standard Deviation of a Poisson Random Variable: $\sigma = \sqrt{\lambda}$
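A small sketch of the Poisson PMF and CDF, assuming a hypothetical rate λ = 2.5. The function names are illustrative and correspond to the 0 and 1 settings of the last argument of POISSON.DIST.

```python
from math import exp, factorial, sqrt

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return lam**x * exp(-lam) / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x), the running sum of the PMF from 0 to x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lam = 2.5  # hypothetical mean number of events per interval

p_exactly_3 = poisson_pmf(3, lam)
p_at_most_3 = poisson_cdf(3, lam)
p_more_than_3 = 1 - p_at_most_3

mean, variance, sd = lam, lam, sqrt(lam)  # mean and variance are both lambda

print(p_exactly_3, p_at_most_3, p_more_than_3, sd)
```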
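Hypergeometric Distribution: $P(X = x) = \frac{\binom{S}{x}\binom{N - S}{n - x}}{\binom{N}{n}}$
Sampling is without replacement from a finite population (N = population size, S = successes in the population, n = sample size).
Excel* PDF =HYPGEOM.DIST(x, n, S, N, 0)

If n/N < .05, it is safe to use the binomial with sample size n and success probability π = S/N.

Expected Value of a Hypergeometric Random Variable: $E(X) = n\frac{S}{N}$
Variance of a Hypergeometric Random Variable: $Var(X) = n\frac{S}{N}\left(1 - \frac{S}{N}\right)\left(\frac{N - n}{N - 1}\right)$
Standard Deviation of a Hypergeometric Random Variable: $\sigma = \sqrt{Var(X)}$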
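A sketch comparing the exact hypergeometric probability with its binomial approximation when n/N < .05. The values N = 500, S = 50, n = 10 are hypothetical; `hypergeom_pmf` is an illustrative helper, not a library function.

```python
from math import comb, sqrt

def hypergeom_pmf(x, n, S, N):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items containing S successes."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)

N, S, n = 500, 50, 10   # hypothetical population and sample; n/N = 0.02 < 0.05
pi = S / N              # success probability used by the binomial approximation

exact = hypergeom_pmf(1, n, S, N)
approx = comb(n, 1) * pi**1 * (1 - pi) ** (n - 1)

mean = n * S / N
variance = n * pi * (1 - pi) * (N - n) / (N - 1)
sd = sqrt(variance)

print(exact, approx, mean, sd)
```

The two probabilities differ only in the third decimal place here, which is why the approximation is considered safe for small sampling fractions.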

Chapter 7: Continuous Probability Distributions

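Topic: Formula

Continuous Uniform Distribution: $f(x) = \frac{1}{b - a}$ for $a \le x \le b$, and 0 otherwise
Cumulative Distribution Function: $P(X \le x) = \frac{x - a}{b - a}$
Expected Value (mean) of a Uniform Distribution: $E(X) = \mu = \frac{a + b}{2}$
Standard Deviation of a Uniform Distribution: $\sigma = \sqrt{\frac{(b - a)^2}{12}}$
Interval probability: $P(c < X < d) = \frac{d - c}{b - a}$

Standard Transformation of the Normal Random Variable: $z = \frac{x - \mu}{\sigma}$
PDF in Excel* =NORM.DIST(x,μ,σ,0)
CDF in Excel* =NORM.DIST(x,μ,σ,1)
Random data in Excel =NORM.INV(RAND(), μ, σ)

Inverse Transformation of the Normal Random Variable: $x = \mu + z\sigma$
=NORM.INV(area, μ, σ)
=NORM.S.INV(area)

Standard normal:
CDF in Excel* =NORM.S.DIST(z,1)
Random data in Excel =NORM.S.INV(RAND())

Mean of an Exponential Distribution: $E(X) = \frac{1}{\lambda}$
Standard Deviation of an Exponential Distribution: $\sigma = \frac{1}{\lambda}$
Exponential Cumulative Distribution Function: $P(X \le x) = 1 - e^{-\lambda x}$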
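A minimal sketch of the uniform, normal, and exponential calculations using only the Python standard library. All parameter values are hypothetical. `statistics.NormalDist` plays the role of NORM.DIST (via `cdf`) and NORM.INV (via `inv_cdf`).

```python
from math import exp, sqrt
from statistics import NormalDist

# Uniform on [a, b]: hypothetical limits and sub-interval (c, d).
a, b = 0.0, 10.0
c, d = 2.0, 5.0
uniform_mean = (a + b) / 2
uniform_sd = sqrt((b - a) ** 2 / 12)
p_c_to_d = (d - c) / (b - a)            # P(c < X < d)

# Normal: hypothetical mean and standard deviation.
mu, sigma = 100.0, 15.0
dist = NormalDist(mu, sigma)
x = 130.0
z = (x - mu) / sigma                    # standard transformation
p_below = dist.cdf(x)                   # like =NORM.DIST(x, mu, sigma, 1)
x_95 = dist.inv_cdf(0.95)               # like =NORM.INV(0.95, mu, sigma)
z_95 = NormalDist().inv_cdf(0.95)       # like =NORM.S.INV(0.95)

# Exponential with hypothetical rate lam: mean and SD are both 1/lam.
lam = 0.5
exp_mean = 1 / lam
p_within_3 = 1 - exp(-lam * 3)          # P(X <= 3)

print(p_c_to_d, z, p_below, x_95, p_within_3)
```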

Chapter 8: Sampling and Sampling Distributions

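Topic: Formula

Margin of error: $E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

Expected Value of the Sample Mean: $E(\bar{X}) = \mu$

Standard Error of the Sample Mean: $se(\bar{X}) = \frac{\sigma}{\sqrt{n}}$

Standard Transformation of the Sample Mean: $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

Expected Value of the Sample Proportion: $E(p) = \pi$

Standard Error of the Sample Proportion: $se(p) = \sqrt{\frac{\pi(1 - \pi)}{n}}$

Standard Transformation of the Sample Proportion: $Z = \frac{p - \pi}{\sqrt{\pi(1 - \pi)/n}}$

Finite Population Correction Factor for the Sample Mean: $se(\bar{X}) = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$

Finite Population Correction Factor for the Sample Proportion: $se(p) = \sqrt{\frac{\pi(1 - \pi)}{n}}\sqrt{\frac{N - n}{N - 1}}$

Confidence Interval for $\mu$ when $\sigma$ is Known: $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

Confidence Interval for $\mu$ when $\sigma$ is Unknown: $\bar{x} \pm t_{\alpha/2,df}\frac{s}{\sqrt{n}}$; $df = n - 1$

Confidence Interval for $\pi$: $p \pm z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}}$, where $p = x/n$ is the sample proportion

Required Sample Size when Estimating the Population Mean: $n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$

Required Sample Size when Estimating the Population Proportion: $n = \left(\frac{z_{\alpha/2}}{E}\right)^2 \pi(1 - \pi)$; if π is not given, use 0.5
Estimating σ from the range: $\sigma \approx (a - b)/6$, where a and b are the largest and smallest values

Interval for a proportion: $p \pm z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}}$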
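A sketch of the z-based confidence intervals and required sample sizes, assuming hypothetical sample figures. The helper `z_crit` is an illustrative name. The Python standard library has no t distribution, so the σ-unknown interval is not shown; the critical t value would come from a table or Excel.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_crit(confidence):
    """Two-tailed critical value z_(alpha/2) for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

z = z_crit(0.95)

# Confidence interval for the mean, sigma known (hypothetical numbers).
x_bar, sigma, n = 50.0, 8.0, 64
E = z * sigma / sqrt(n)                        # margin of error
ci_mean = (x_bar - E, x_bar + E)

# Confidence interval for a proportion (hypothetical counts).
successes, n_p = 120, 400
p = successes / n_p
E_p = z * sqrt(p * (1 - p) / n_p)
ci_prop = (p - E_p, p + E_p)

# Required sample sizes, always rounded up.
n_for_mean = ceil((z * sigma / 2.0) ** 2)          # target margin of error E = 2
n_for_prop = ceil((z / 0.03) ** 2 * 0.5 * 0.5)     # E = 0.03, pi unknown so use 0.5

print(ci_mean, ci_prop, n_for_mean, n_for_prop)
```

Using π = 0.5 maximizes π(1 - π), so it gives the most conservative (largest) sample size.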

Chapter 9: Hypothesis Testing

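Topic: Formula

A Type II error has a higher cost than a Type I error.
When α increases, β decreases, and vice versa.
When n increases, both α and β decrease.

Test Statistic for $\mu$ when $\sigma$ is Known: $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

Test Statistic for $\mu$ when $\sigma$ is Unknown: $t_{df} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$; $df = n - 1$

Test Statistic for $\pi$: $z = \frac{p - \pi_0}{\sqrt{\pi_0(1 - \pi_0)/n}}$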
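A minimal sketch of a two-tailed z test for a mean (σ known) and the z statistic for a proportion. Every number is hypothetical; the decision rule compares the calculated statistic with the critical value, and the p-value gives the same decision.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05

# Test for a mean with sigma known: H0: mu = mu_0 (hypothetical sample).
x_bar, mu_0, sigma, n = 52.0, 50.0, 8.0, 64
z_calc = (x_bar - mu_0) / (sigma / sqrt(n))
z_critical = std_normal.inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - std_normal.cdf(abs(z_calc)))   # two-tailed p-value
reject_h0 = abs(z_calc) > z_critical

# Test statistic for a proportion: H0: pi = pi_0 (hypothetical sample).
p, pi_0, n_p = 0.56, 0.50, 400
z_prop = (p - pi_0) / sqrt(pi_0 * (1 - pi_0) / n_p)

print(z_calc, z_critical, p_value, reject_h0, z_prop)
```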

Chapter 10: Comparisons Involving Means

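Topic: Formula

Confidence Interval for $\mu_1 - \mu_2$ if $\sigma_1^2$ and $\sigma_2^2$ are known: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

Confidence Interval for $\mu_1 - \mu_2$ if $\sigma_1^2$ and $\sigma_2^2$ are unknown but assumed equal: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,df}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$; $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$; $df = n_1 + n_2 - 2$

Confidence Interval for $\mu_1 - \mu_2$ if $\sigma_1^2$ and $\sigma_2^2$ are unknown and cannot be assumed equal: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$; $df = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{(s_1^2/n_1)^2/(n_1 - 1) + (s_2^2/n_2)^2/(n_2 - 1)}$

Test Statistic for $\mu_1 - \mu_2$ if $\sigma_1^2$ and $\sigma_2^2$ are known: $z = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$

Test Statistic for $\mu_1 - \mu_2$ if $\sigma_1^2$ and $\sigma_2^2$ are unknown but assumed equal: $t_{df} = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{s_p^2(1/n_1 + 1/n_2)}}$; $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$; $df = n_1 + n_2 - 2$

Test Statistic for $\mu_1 - \mu_2$ if $\sigma_1^2$ and $\sigma_2^2$ are unknown and cannot be assumed equal: $t_{df} = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$; $df = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{(s_1^2/n_1)^2/(n_1 - 1) + (s_2^2/n_2)^2/(n_2 - 1)}$

Confidence Interval for $\mu_D$: $\bar{d} \pm t_{\alpha/2,df}\frac{s_D}{\sqrt{n}}$; $df = n - 1$

Test Statistic for $\mu_D$: $t_{df} = \frac{\bar{d} - d_0}{s_D/\sqrt{n}}$; $df = n - 1$

Confidence Interval for $\pi_1 - \pi_2$: $(p_1 - p_2) \pm z_{\alpha/2}\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$

Test Statistic for Testing $\pi_1 - \pi_2$ if $d_0$ is zero: $z = \frac{p_1 - p_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$

Test Statistic for Testing $\pi_1 - \pi_2$ if $d_0$ is not zero: $z = \frac{(p_1 - p_2) - d_0}{\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}}$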

Chapter 11: Comparisons Involving Proportions


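Dependent variable (numerical)
Independent variable (categorical)
When testing for differences between treatment means, the t statistic is based on the error degrees of freedom, and a confidence interval is computed with the mean square error.
The F statistic is never negative (it ranges from 0 upward); its p-value lies between 0 and 1.

Grand Mean for ANOVA: $\bar{\bar{x}} = \frac{\sum_i \sum_j x_{ij}}{n_T}$

Sum of Squares Due to Treatments SSTR: $SSTR = \sum_{i=1}^{c} n_i(\bar{x}_i - \bar{\bar{x}})^2$

Mean Square for Treatments MSTR: $MSTR = \frac{SSTR}{c - 1}$

Error Sum of Squares: $SSE = \sum_{i=1}^{c} (n_i - 1)s_i^2$

Mean Square Error: $MSE = \frac{SSE}{n_T - c}$

Test Statistic for a One-way ANOVA Test: $F_{(df_1, df_2)} = \frac{MSTR}{MSE}$; $df_1 = c - 1$; $df_2 = n_T - c$
Compare Fcalc with the critical value $F_\alpha(df_1, df_2)$. The F test is a right-tailed test.
Fcr =F.INV.RT(α,df1,df2). Use α = .05.
The p-value is =F.DIST.RT(Fcalc,df1,df2), e.g. =F.DIST.RT(Fcalc,3,16).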
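A sketch of the one-way ANOVA arithmetic (grand mean, SSTR, SSE, MSTR, MSE, F), assuming three small made-up treatment groups. The standard library has no F distribution, so the critical value and p-value would still come from F.INV.RT and F.DIST.RT in Excel.

```python
from statistics import mean, variance

# Hypothetical observations for c = 3 treatments.
groups = [
    [10, 12, 11, 13],
    [14, 15, 13, 16],
    [9, 8, 10, 9],
]

c = len(groups)                                   # number of treatments
n_T = sum(len(g) for g in groups)                 # total sample size
grand_mean = sum(sum(g) for g in groups) / n_T

# Between-treatment variation.
sstr = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
mstr = sstr / (c - 1)

# Within-treatment (error) variation, using each group's sample variance.
sse = sum((len(g) - 1) * variance(g) for g in groups)
mse = sse / (n_T - c)

f_calc = mstr / mse
df1, df2 = c - 1, n_T - c

# Total variation, which should equal SSTR + SSE.
sst = sum((v - grand_mean) ** 2 for g in groups for v in g)

print(f_calc, df1, df2)
```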

Chapter 12: Basics of Regression Analysis

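Topic: Formula

r measures the strength of the linear relationship between two variables. An inverse relationship between X and Y gives a negative correlation.
Compute r (sample correlation coefficient): =CORREL(array1,array2)
P-values:
=T.DIST.2T(t,deg_freedom), df = n-2 (two-tailed)
=T.DIST.RT(Tcalc, df) (right-tailed)
Tcr =T.INV.2T(prob, df)

Simple Linear Regression Model: $y = \beta_0 + \beta_1 x + \varepsilon$

Sample Regression Equation for the Simple Linear Regression Model: $\hat{y} = b_0 + b_1 x$

Slope of the Sample Regression Equation: $b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$

Intercept of the Sample Regression Equation: $b_0 = \bar{y} - b_1\bar{x}$

Standard Error of the Estimate: $s_e = \sqrt{\frac{SSE}{n - k - 1}}$; k = number of independent variables

Total Sum of Squares: $SST = \sum (y_i - \bar{y})^2$

Sum of Squares due to Regression: $SSR = \sum (\hat{y}_i - \bar{y})^2$

Sum of Squares due to Error: $SSE = \sum (y_i - \hat{y}_i)^2$

Coefficient of Determination: $R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
R2 = 0.36 means the model explains 36 percent of the variation in Y.

Critical value for a correlation coefficient: $r_{crit} = \frac{t_{\alpha/2}}{\sqrt{t_{\alpha/2}^2 + n - 2}}$; $df = n - 2$

Adjusted $R^2$: $1 - (1 - R^2)\frac{n - 1}{n - k - 1}$

Test Statistic for the Test of Individual Significance: $t_{calc} = \frac{b_j - \beta_{j0}}{se(b_j)}$; $df = n - k - 1$

Confidence Interval for $\beta_j$: $b_j \pm t_{\alpha/2,df}\,se(b_j)$; $df = n - k - 1$

Confidence Interval for the Expected Value of y: $\hat{y}^0 \pm t_{\alpha/2,df}\,se(\hat{y}^0)$; $df = n - k - 1$

Prediction Interval for an Individual Value of y: $\hat{y}^0 \pm t_{\alpha/2,df}\sqrt{se(\hat{y}^0)^2 + s_e^2}$; $df = n - k - 1$

Residuals for the Regression Model: $e = y - \hat{y}$
Time series plot
b0 =INTERCEPT(YData, XData)
b1 =SLOPE(YData, XData)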
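A minimal sketch of simple linear regression by hand, assuming five made-up (x, y) pairs. It computes what SLOPE and INTERCEPT return in Excel, plus the sums of squares and R². The formula se(b1) = s_e / sqrt(Σ(x - x̄)²) is the standard simple-regression result and is an addition not listed above.

```python
from math import sqrt

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n, k = len(x), 1                       # k = number of independent variables

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = s_xy / s_xx                       # slope
b0 = y_bar - b1 * x_bar                # intercept

y_hat = [b0 + b1 * xi for xi in x]     # fitted values
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

sst = sum((yi - y_bar) ** 2 for yi in y)
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum(e ** 2 for e in residuals)

r2 = ssr / sst
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
s_e = sqrt(sse / (n - k - 1))          # standard error of the estimate

se_b1 = s_e / sqrt(s_xx)               # assumption: standard simple-regression formula
t_calc = (b1 - 0) / se_b1              # test of H0: beta_1 = 0, df = n - k - 1

print(b0, b1, r2, adj_r2, s_e, t_calc)
```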

Confidence level   1-α    α      α/2     z(α/2)
90%                0.90   0.10   0.050   1.645
95%                0.95   0.05   0.025   1.960
99%                0.99   0.01   0.005   2.576
Increasing the sample size n, or increasing α (a lower confidence level), narrows the interval.
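The critical values in the table can be reproduced with the standard normal inverse CDF. The sketch below also compares interval half-widths for two hypothetical sample sizes, assuming σ = 10.

```python
from math import sqrt
from statistics import NormalDist

# z_(alpha/2) for each confidence level, rounded to three decimals.
z_table = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    z_table[confidence] = round(NormalDist().inv_cdf(1 - alpha / 2), 3)

# Half-width z * sigma / sqrt(n) for two hypothetical sample sizes.
sigma = 10.0
half_width_n25 = z_table[0.95] * sigma / sqrt(25)
half_width_n100 = z_table[0.95] * sigma / sqrt(100)

print(z_table, half_width_n25, half_width_n100)
```

Quadrupling n halves the half-width, since the margin of error shrinks with the square root of n.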
