
Basic Statistics Formula Sheet

Steven W. Nydick
May 25, 2012

This document is only intended to review basic concepts/formulas from an introduction to statistics course. Only mean-based
procedures are reviewed, and emphasis is placed on a simple understanding of when to use each method. After reviewing
and understanding this document, one should then learn about more complex procedures and methods in statistics. However, keep
in mind the assumptions behind certain procedures, and know that statistical procedures are sometimes flexible to data that do not
necessarily match the assumptions.

Descriptive Statistics
Elementary Descriptives (Univariate & Bivariate)

Name | Population Symbol | Sample Symbol | Sample Calculation | Main Problems | Alternatives
Mean | $\mu_x$ | $\bar{x}$ | $\bar{x} = \frac{\sum x}{N}$ | Sensitive to outliers | Median, Mode
Variance | $\sigma^2_x$ | $s^2_x$ | $s^2_x = \frac{\sum (x - \bar{x})^2}{N - 1}$ | Sensitive to outliers | MAD, IQR
Standard Dev | $\sigma_x$ | $s_x$ | $s_x = \sqrt{s^2_x}$ | Biased | MAD
Covariance | $\sigma_{xy}$ | $s_{xy}$ | $s_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{N - 1}$ | Outliers, uninterpretable units | Correlation
Correlation | $\rho_{xy}$ | $r_{xy}$ | $r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{\sum z_x z_y}{N - 1}$ | Range restriction, outliers, nonlinearity |
z-score | $z_x$ | $z_x$ | $z_x = \frac{x - \bar{x}}{s_x}$; $\bar{z} = 0$; $s^2_z = 1$ | Doesn't make distribution normal |
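The sample calculations above can be checked numerically. The following is a minimal Python sketch (plain Python, no libraries); the lists x and y are made-up example data, and each quantity is computed exactly as defined in the table, with N - 1 in the denominators of the variance and covariance.

```python
from math import sqrt

# Hypothetical example data (any two equal-length lists of numbers work).
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [1.5, 3.0, 4.0, 6.5, 9.0]
N = len(x)

mean_x = sum(x) / N                                    # x-bar = (sum of x) / N
var_x = sum((xi - mean_x) ** 2 for xi in x) / (N - 1)  # s^2_x with N - 1 in the denominator
sd_x = sqrt(var_x)                                     # s_x = sqrt(s^2_x)

mean_y = sum(y) / N
sd_y = sqrt(sum((yi - mean_y) ** 2 for yi in y) / (N - 1))

# Covariance and correlation
cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (N - 1)
r_xy = cov_xy / (sd_x * sd_y)

# z-scores: their mean is 0 and their sample variance is 1 (up to rounding)
z_x = [(xi - mean_x) / sd_x for xi in x]

print(mean_x, var_x, sd_x, cov_xy, r_xy)
print(sum(z_x) / N, sum(zi ** 2 for zi in z_x) / (N - 1))
```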

Simple Linear Regression (Usually Quantitative IV; Quantitative DV)

Part | Population Symbol | Sample Symbol | Sample Calculation | Meaning
Regular Equation | $y_i = \alpha + \beta x_i + \epsilon_i$ | $y_i = a + b x_i + e_i$ | $\hat{y}_i = a + b x_i$ | Predict $y$ from $x$
Slope | $\beta$ | $b$ | $b = \frac{s_{xy}}{s^2_x} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$ | Predicted change in $y$ for unit change in $x$
Intercept | $\alpha$ | $a$ | $a = \bar{y} - b\bar{x}$ | Predicted $y$ for $x = 0$
Standardized Equation | $z_{y_i} = \rho_{xy} z_{x_i} + \epsilon_i$ | $z_{y_i} = r_{xy} z_{x_i} + e_i$ | $\hat{z}_{y_i} = r_{xy} z_{x_i}$ | Predict $z_y$ from $z_x$
Slope | $\rho_{xy}$ | $r_{xy}$ | $r_{xy} = \frac{s_{xy}}{s_x s_y} = b \left( \frac{s_x}{s_y} \right)$ | Predicted change in $z_y$ for unit change in $z_x$
Intercept | None | None | 0 | Predicted $z_y$ for $z_x = 0$ is 0
Effect Size | $P^2$ | $R^2$ | $R^2 = r^2_{y\hat{y}} = r^2_{xy}$ | Variance in $y$ accounted for by regression line
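As a sketch of the regression formulas (the paired lists x and y are made-up data, not from the source), the slope, intercept, standardized slope, and $R^2$ can be computed directly from the definitions in the table:

```python
from math import sqrt

# Hypothetical paired data (x = predictor/IV, y = outcome/DV).
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [1.5, 3.0, 4.0, 6.5, 9.0]
N = len(x)
mean_x, mean_y = sum(x) / N, sum(y) / N

# Slope: b = s_xy / s^2_x = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x           # intercept: predicted y when x = 0
y_hat = [a + b * xi for xi in x]  # predicted scores

# Standardized slope equals the correlation: r = b * (s_x / s_y)
s_x = sqrt(sum((xi - mean_x) ** 2 for xi in x) / (N - 1))
s_y = sqrt(sum((yi - mean_y) ** 2 for yi in y) / (N - 1))
r_xy = b * (s_x / s_y)

# Effect size: R^2 = r^2_xy = proportion of variance in y accounted for by the line
r_squared = r_xy ** 2
print(a, b, r_xy, r_squared)
```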

Inferential Statistics
t-tests (Categorical IV (1 or 2 Groups); Quantitative DV)

Test | Statistic | Parameter | Standard Deviation | Standard Error | df | $t_{obt}$
One Sample | $\bar{x}$ | $\mu$ | $s_x = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}$ | $\frac{s_x}{\sqrt{N}}$ | $N - 1$ | $t_{obt} = \frac{\bar{x} - \mu_0}{s_x / \sqrt{N}}$
Paired Samples | $\bar{D}$ | $\mu_D$ | $s_D = \sqrt{\frac{\sum (D - \bar{D})^2}{N_D - 1}}$ | $\frac{s_D}{\sqrt{N_D}}$ | $N_D - 1$ | $t_{obt} = \frac{\bar{D} - \mu_{D_0}}{s_D / \sqrt{N_D}}$
Independent Samples | $\bar{x}_1 - \bar{x}_2$ | $\mu_1 - \mu_2$ | $s_p = \sqrt{\frac{(n_1 - 1)s^2_1 + (n_2 - 1)s^2_2}{n_1 + n_2 - 2}}$ | $s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ | $n_1 + n_2 - 2$ | $t_{obt} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
Correlation | $r$ | $\rho = 0$ | NA | NA | $N - 2$ | $t_{obt} = \frac{r}{\sqrt{\frac{1 - r^2}{N - 2}}}$
Regression (FYI) | $a$ & $b$ | $\alpha$ & $\beta$ | $\hat{\sigma}_e = \sqrt{\frac{\sum (y - \hat{y})^2}{N - 2}}$ | $s_a$ & $s_b$ | $N - 2$ | $t_{obt} = \frac{a - \alpha_0}{s_a}$ & $t_{obt} = \frac{b - \beta_0}{s_b}$
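Below is a minimal Python sketch of the first three $t$ statistics in the table (the group lists and the null values are made-up examples). Comparing each statistic against a critical value or p-value would normally use a t table or a library such as scipy.stats, which is not shown here.

```python
from math import sqrt

def sd(v):
    """Sample standard deviation with N - 1 in the denominator."""
    m = sum(v) / len(v)
    return sqrt(sum((vi - m) ** 2 for vi in v) / (len(v) - 1))

# One-sample t: compare a sample mean with a null value mu_0; df = N - 1
x = [5.1, 4.8, 5.6, 5.0, 5.3]
mu_0 = 5.0
t_one = (sum(x) / len(x) - mu_0) / (sd(x) / sqrt(len(x)))

# Paired-samples t: difference scores, null difference of 0; df = N_D - 1
pre, post = [10, 12, 9, 11], [12, 14, 10, 13]
D = [b - a for a, b in zip(pre, post)]
t_paired = (sum(D) / len(D) - 0) / (sd(D) / sqrt(len(D)))

# Independent-samples t with pooled standard deviation; df = n1 + n2 - 2
x1, x2 = [3.0, 4.5, 5.0, 4.0], [5.5, 6.0, 7.0, 6.5, 5.0]
n1, n2 = len(x1), len(x2)
s_p = sqrt(((n1 - 1) * sd(x1) ** 2 + (n2 - 1) * sd(x2) ** 2) / (n1 + n2 - 2))
t_ind = ((sum(x1) / n1 - sum(x2) / n2) - 0) / (s_p * sqrt(1 / n1 + 1 / n2))

print(t_one, t_paired, t_ind)
```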

t-tests Hypotheses/Rejection

Question | One Sample | Paired Sample | Independent Sample | When to Reject
Greater Than? | $H_0: \mu \le \#$; $H_1: \mu > \#$ | $H_0: \mu_D \le \#$; $H_1: \mu_D > \#$ | $H_0: \mu_1 - \mu_2 \le \#$; $H_1: \mu_1 - \mu_2 > \#$ | Extreme positive numbers; $t_{obt} > t_{crit}$ (one-tailed)
Less Than? | $H_0: \mu \ge \#$; $H_1: \mu < \#$ | $H_0: \mu_D \ge \#$; $H_1: \mu_D < \#$ | $H_0: \mu_1 - \mu_2 \ge \#$; $H_1: \mu_1 - \mu_2 < \#$ | Extreme negative numbers; $t_{obt} < t_{crit}$ (one-tailed)
Not Equal To? | $H_0: \mu = \#$; $H_1: \mu \ne \#$ | $H_0: \mu_D = \#$; $H_1: \mu_D \ne \#$ | $H_0: \mu_1 - \mu_2 = \#$; $H_1: \mu_1 - \mu_2 \ne \#$ | Extreme numbers (negative and positive); $|t_{obt}| > |t_{crit}|$ (two-tailed)

t-tests Miscellaneous

Test | Confidence Interval: $C\% = (1 - \alpha)\%$ | Unstandardized Effect Size | Standardized Effect Size
One Sample | $\bar{x} \pm t_{N - 1;\, crit(\text{2-tailed})} \frac{s_x}{\sqrt{N}}$ | $\bar{x} - \mu_0$ | $d = \frac{\bar{x} - \mu_0}{s_x}$
Paired Samples | $\bar{D} \pm t_{N_D - 1;\, crit(\text{2-tailed})} \frac{s_D}{\sqrt{N_D}}$ | $\bar{D}$ | $d = \frac{\bar{D}}{s_D}$
Independent Samples | $(\bar{x}_1 - \bar{x}_2) \pm t_{n_1 + n_2 - 2;\, crit(\text{2-tailed})} s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ | $\bar{x}_1 - \bar{x}_2$ | $d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}$
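A sketch of the one-sample confidence interval and effect sizes, assuming made-up data. The critical value is looked up here with scipy.stats.t.ppf, which is an added assumption (any t table gives the same number); the formulas themselves are the ones in the table above.

```python
from math import sqrt
from scipy.stats import t  # assumption: scipy is available for the critical value

# Hypothetical one-sample data and null value
x = [5.1, 4.8, 5.6, 5.0, 5.3]
mu_0 = 5.0
N = len(x)
mean_x = sum(x) / N
s_x = sqrt(sum((xi - mean_x) ** 2 for xi in x) / (N - 1))

# 95% CI: x-bar +/- t_crit(2-tailed, df = N - 1) * s_x / sqrt(N)
alpha = 0.05
t_crit = t.ppf(1 - alpha / 2, df=N - 1)
half_width = t_crit * s_x / sqrt(N)
ci = (mean_x - half_width, mean_x + half_width)

# Effect sizes from the table
unstandardized = mean_x - mu_0    # raw mean difference
d = (mean_x - mu_0) / s_x         # standardized effect size (Cohen's d)
print(ci, unstandardized, d)
```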

One-Way ANOVA (Categorical IV (Usually 3 or More Groups); Quantitative DV)

Source | Sums of Sq. | df | Mean Sq. | $F$-stat | Effect Size
Between | $\sum_{j=1}^{g} n_j (\bar{x}_j - \bar{x}_G)^2$ | $g - 1$ | $SSB / df_B$ | $MSB / MSW$ | $\eta^2 = \frac{SSB}{SST}$
Within | $\sum_{j=1}^{g} (n_j - 1) s^2_j$ | $N - g$ | $SSW / df_W$ | |
Total | $\sum_{i,j} (x_{ij} - \bar{x}_G)^2$ | $N - 1$ | | |

1. We perform ANOVA because of family-wise error: the probability of rejecting at least one true $H_0$ during multiple tests.
2. $\bar{x}_G$ is the grand mean, the average of all scores ignoring group membership.
3. $\bar{x}_j$ is the mean of group $j$; $n_j$ is the number of people in group $j$; $g$ is the number of groups; $N$ is the total number of people.
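A minimal sketch of the one-way ANOVA decomposition, using three made-up groups; it follows the sums-of-squares formulas in the table directly, so SSB + SSW reproduces SST.

```python
# Hypothetical data: three groups (g = 3)
groups = [
    [4.0, 5.0, 6.0, 5.5],
    [7.0, 8.0, 6.5, 7.5],
    [5.0, 6.0, 5.5, 6.5],
]
g = len(groups)
N = sum(len(grp) for grp in groups)
grand_mean = sum(x for grp in groups for x in grp) / N   # x-bar_G

def var(v):
    """Sample variance with n - 1 in the denominator."""
    m = sum(v) / len(v)
    return sum((vi - m) ** 2 for vi in v) / (len(v) - 1)

# Between and within sums of squares, exactly as in the table
SSB = sum(len(grp) * (sum(grp) / len(grp) - grand_mean) ** 2 for grp in groups)
SSW = sum((len(grp) - 1) * var(grp) for grp in groups)
SST = SSB + SSW   # equals the sum over all scores of (x_ij - x-bar_G)^2

df_B, df_W = g - 1, N - g
MSB, MSW = SSB / df_B, SSW / df_W
F = MSB / MSW
eta_squared = SSB / SST
print(F, eta_squared)
```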

One-Way ANOVA Hypotheses/Rejection

Question | Hypotheses | When to Reject
Is at least one mean different? | $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$; $H_1$: At least one $\mu$ is different from at least one other | Extreme positive numbers; $F_{obt} > F_{crit}$

Remember Post-Hoc Tests: LSD, Bonferroni, Tukey (what are the rank orderings of the means?)

Chi Square ($\chi^2$) (Categorical IV; Categorical DV)

Test | Hypotheses | Observed | Expected | df | $\chi^2$ Stat | When to Reject
Independence | $H_0$: Vars are Independent; $H_1$: Vars are Dependent | From Table | $N p_j p_k$ | (Cols - 1)(Rows - 1) | $\sum_{i=1}^{R} \sum_{j=1}^{C} \frac{(f_{O_{ij}} - f_{E_{ij}})^2}{f_{E_{ij}}}$ | Extreme positive numbers; $\chi^2_{obt} > \chi^2_{crit}$
Goodness of Fit | $H_0$: Model Fits; $H_1$: Model Doesn't Fit | From Table | $N p_i$ | Cells - 1 | $\sum_{i=1}^{C} \frac{(f_{O_i} - f_{E_i})^2}{f_{E_i}}$ | Extreme positive numbers; $\chi^2_{obt} > \chi^2_{crit}$

1. Remember: the sum is over the number of cells/columns/rows (not the number of people).
2. For the Test of Independence: $p_j$ and $p_k$ are the marginal proportions of variable $j$ and variable $k$, respectively.
3. For Goodness of Fit: $p_i$ is the expected proportion in cell $i$ if the data fit the model.
4. $N$ is the total number of people.
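A sketch of both $\chi^2$ statistics, using a made-up 2 x 3 contingency table for the test of independence and made-up model proportions for goodness of fit; expected counts come from the marginal or model proportions, as in the table above.

```python
# Test of independence: observed counts in a hypothetical 2 x 3 table
observed = [
    [20, 30, 10],
    [25, 15, 20],
]
N = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]
col_totals = [sum(observed[i][j] for i in range(len(observed)))
              for j in range(len(observed[0]))]

chi_sq_ind = 0.0
for i, row in enumerate(observed):
    for j, f_o in enumerate(row):
        # Expected count: N * p_j * p_k (product of the marginal proportions)
        f_e = N * (row_totals[i] / N) * (col_totals[j] / N)
        chi_sq_ind += (f_o - f_e) ** 2 / f_e
df_ind = (len(observed) - 1) * (len(observed[0]) - 1)   # (Rows - 1)(Cols - 1)

# Goodness of fit: observed counts vs. proportions expected under the model
observed_gf = [18, 22, 40]
expected_props = [0.25, 0.25, 0.50]
N_gf = sum(observed_gf)
chi_sq_gf = sum((f_o - N_gf * p) ** 2 / (N_gf * p)
                for f_o, p in zip(observed_gf, expected_props))
df_gf = len(observed_gf) - 1   # Cells - 1

print(chi_sq_ind, df_ind, chi_sq_gf, df_gf)
```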

Assumptions of Statistical Models

Correlation
1. Estimating: Relationship is linear
2. Estimating: No outliers
3. Estimating: No range restriction
4. Testing: Bivariate normality

Regression
1. Relationship is linear
2. Bivariate normality
3. Homoskedasticity (constant error variance)
4. Independence of pairs of observations

One Sample t-test
1. x is normally distributed in the population
2. Independence of observations

Paired Samples t-test
1. Difference scores are normally distributed in the population
2. Independence of pairs of observations

Independent Samples t-test
1. Each group is normally distributed in the population
2. Homogeneity of variance (both groups have the same variance in the population)
3. Independence of observations within and between groups (random sampling & random assignment)

One-Way ANOVA
1. Each group is normally distributed in the population
2. Homogeneity of variance
3. Independence of observations within and between groups

Chi Square ($\chi^2$)
1. No small expected frequencies
   - Total number of observations at least 20
   - Expected number in any cell at least 5
2. Independence of observations
   - Each individual is only in ONE cell of the table

Central Limit Theorem

Given a population distribution with a mean $\mu$ and a variance $\sigma^2$, the sampling distribution of the mean using sample size $N$ (or, to put it another way, the distribution of sample means) will have a mean of $\mu_{\bar{x}} = \mu$ and a variance equal to $\sigma^2_{\bar{x}} = \frac{\sigma^2}{N}$, which implies that $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}$. Furthermore, the distribution will approach the normal distribution as $N$, the sample size, increases.

Possible Decisions/Outcomes

Decision | $H_0$ True | $H_0$ False
Rejecting $H_0$ | Type I Error ($\alpha$) | Correct Decision ($1 - \beta$; Power)
Not Rejecting $H_0$ | Correct Decision ($1 - \alpha$) | Type II Error ($\beta$)

Power Increases If: $N$ increases, $\alpha$ increases, $\sigma$ decreases, the Mean Difference increases, or a One-Tailed Test is used
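As a sketch of the Central Limit Theorem (the skewed population below is a made-up example, not from the source), repeatedly drawing samples of size $N$ and averaging them yields a distribution of sample means whose mean is close to $\mu$ and whose standard deviation is close to $\sigma / \sqrt{N}$.

```python
import random
from math import sqrt

random.seed(1)

# Hypothetical skewed population: exponential with mean 2,
# so mu = 2 and sigma = 2 for this particular choice.
mu, sigma = 2.0, 2.0
N = 25          # sample size
reps = 10000    # number of samples drawn

sample_means = []
for _ in range(reps):
    sample = [random.expovariate(1 / mu) for _ in range(N)]
    sample_means.append(sum(sample) / N)

mean_of_means = sum(sample_means) / reps
var_of_means = sum((m - mean_of_means) ** 2 for m in sample_means) / (reps - 1)

print(mean_of_means, "should be close to", mu)
print(sqrt(var_of_means), "should be close to", sigma / sqrt(N))
```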
