
Notation

$n$ = sample size
$\bar{x}$ = sample mean
$s$ = sample standard deviation
$Q_j$ = $j$th quartile
$N$ = population size
$\mu$ = population mean
$\sigma$ = population standard deviation
$d$ = paired difference
$\hat{p}$ = sample proportion
$p$ = population proportion
$O$ = observed frequency
$E$ = expected frequency

Chapter 3

Descriptive Measures

Sample mean: $\bar{x} = \dfrac{\sum x_i}{n}$

Range: $\text{Range} = \text{Max} - \text{Min}$

Sample standard deviation:
$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$ or $s = \sqrt{\dfrac{\sum x_i^2 - (\sum x_i)^2/n}{n - 1}}$

Interquartile range: $\text{IQR} = Q_3 - Q_1$

Lower limit $= Q_1 - 1.5 \cdot \text{IQR}$, Upper limit $= Q_3 + 1.5 \cdot \text{IQR}$

Population mean (mean of a variable): $\mu = \dfrac{\sum x_i}{N}$

Population standard deviation (standard deviation of a variable):
$\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2}{N}}$ or $\sigma = \sqrt{\dfrac{\sum x_i^2}{N} - \mu^2}$

Standardized variable: $z = \dfrac{x - \mu}{\sigma}$

Chapter 4

Probability Concepts

Probability for equally likely outcomes:
$P(E) = \dfrac{f}{N}$
where $f$ denotes the number of ways event $E$ can occur and $N$ denotes the total number of outcomes possible.

Special addition rule:
$P(A \text{ or } B \text{ or } C \text{ or } \cdots) = P(A) + P(B) + P(C) + \cdots$
($A, B, C, \ldots$ mutually exclusive)

Complementation rule: $P(E) = 1 - P(\text{not } E)$

General addition rule: $P(A \text{ or } B) = P(A) + P(B) - P(A \ \& \ B)$

Conditional probability rule: $P(B \mid A) = \dfrac{P(A \ \& \ B)}{P(A)}$

General multiplication rule: $P(A \ \& \ B) = P(A) \cdot P(B \mid A)$

Special multiplication rule:
$P(A \ \& \ B \ \& \ C \ \& \ \cdots) = P(A) \cdot P(B) \cdot P(C) \cdots$
($A, B, C, \ldots$ independent)

Rule of total probability:
$P(B) = \sum_{j=1}^{k} P(A_j) \cdot P(B \mid A_j)$
($A_1, A_2, \ldots, A_k$ mutually exclusive and exhaustive)

Bayes's rule:
$P(A_i \mid B) = \dfrac{P(A_i) \cdot P(B \mid A_i)}{\sum_{j=1}^{k} P(A_j) \cdot P(B \mid A_j)}$
($A_1, A_2, \ldots, A_k$ mutually exclusive and exhaustive)

Factorial: $k! = k(k - 1) \cdots 2 \cdot 1$

Permutations rule: $_{m}P_{r} = \dfrac{m!}{(m - r)!}$

Special permutations rule: $_{m}P_{m} = m!$

Combinations rule: $_{m}C_{r} = \dfrac{m!}{r!(m - r)!}$

Number of possible samples: $_{N}C_{n} = \dfrac{N!}{n!(N - n)!}$

Chapter 5

Discrete Random Variables

Mean of a discrete random variable $X$: $\mu = \sum x P(X = x)$

Standard deviation of a discrete random variable $X$:
$\sigma = \sqrt{\sum (x - \mu)^2 P(X = x)}$ or $\sigma = \sqrt{\sum x^2 P(X = x) - \mu^2}$

Factorial: $k! = k(k - 1) \cdots 2 \cdot 1$

Binomial coefficient: $\dbinom{n}{x} = \dfrac{n!}{x!(n - x)!}$

Binomial probability formula:
$P(X = x) = \dbinom{n}{x} p^x (1 - p)^{n - x}$
where $n$ denotes the number of trials and $p$ denotes the success probability.

Mean of a binomial random variable: $\mu = np$

Standard deviation of a binomial random variable: $\sigma = \sqrt{np(1 - p)}$

Poisson probability formula: $P(X = x) = e^{-\lambda} \dfrac{\lambda^x}{x!}$

Mean of a Poisson random variable: $\mu = \lambda$

Standard deviation of a Poisson random variable: $\sigma = \sqrt{\lambda}$

Chapter 6

The Normal Distribution

$z$-score for an $x$-value: $z = \dfrac{x - \mu}{\sigma}$

$x$-value for a $z$-score: $x = \mu + z \cdot \sigma$
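The defining and computing formulas for the sample standard deviation above are algebraically equivalent, which is easy to confirm numerically. A minimal Python sketch (function and variable names are illustrative, not from the card):

```python
import math

def sample_mean(xs):
    # x-bar = (sum of xi) / n
    return sum(xs) / len(xs)

def sample_stdev(xs):
    # defining formula: s = sqrt( sum (xi - xbar)^2 / (n - 1) )
    n, m = len(xs), sample_mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))

def sample_stdev_computing(xs):
    # computing formula: s = sqrt( (sum xi^2 - (sum xi)^2 / n) / (n - 1) )
    n = len(xs)
    return math.sqrt((sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1))

def z_score(x, mu, sigma):
    # standardized value: z = (x - mu) / sigma
    return (x - mu) / sigma

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
# both formulas agree for the same data
assert math.isclose(sample_stdev(data), sample_stdev_computing(data))
```

The computing formula needs only one pass over the data (running sums of $x_i$ and $x_i^2$), which is why reference cards list both forms.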

Chapter 7

The Sampling Distribution of the Sample Mean

Mean of the variable $\bar{x}$: $\mu_{\bar{x}} = \mu$

Standard deviation of the variable $\bar{x}$: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$

Chapter 8

Confidence Intervals for One Population Mean

Standardized version of the variable $\bar{x}$: $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

Studentized version of the variable $\bar{x}$: $t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$

$z$-interval for $\mu$ ($\sigma$ known, normal population or large sample):
$\bar{x} \pm z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$

Margin of error for the estimate of $\mu$: $E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$

Sample size for estimating $\mu$:
$n = \left( \dfrac{z_{\alpha/2} \cdot \sigma}{E} \right)^2$
rounded up to the nearest whole number.

$t$-interval for $\mu$ ($\sigma$ unknown, normal population or large sample):
$\bar{x} \pm t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}}$
with df $= n - 1$.
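The $z$-interval, margin of error, and sample-size formulas can be sketched as follows. The critical value $z_{\alpha/2}$ is an input here (about 1.96 for 95% confidence, from a standard normal table); the function names are illustrative:

```python
import math

def z_interval(xbar, sigma, n, z_crit):
    # x-bar +/- z_crit * sigma / sqrt(n)   (sigma known)
    e = z_crit * sigma / math.sqrt(n)      # margin of error E
    return (xbar - e, xbar + e)

def sample_size_for_mean(sigma, e, z_crit):
    # n = (z_crit * sigma / E)^2, rounded UP to a whole number
    return math.ceil((z_crit * sigma / e) ** 2)
```

For example, with $\bar{x} = 100$, $\sigma = 15$, $n = 36$, the 95% margin of error is $1.96 \cdot 15/6 = 4.9$.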

Chapter 9

Hypothesis Tests for One Population Mean

$z$-test statistic for $H_0\colon \mu = \mu_0$ ($\sigma$ known, normal population or large sample):
$z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$

$t$-test statistic for $H_0\colon \mu = \mu_0$ ($\sigma$ unknown, normal population or large sample):
$t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$
with df $= n - 1$.

Wilcoxon signed-rank test statistic for $H_0\colon \mu = \mu_0$ (symmetric population):
$W$ = sum of the positive ranks

Symmetry property of a Wilcoxon signed-rank distribution:
$W_{1 - A} = n(n + 1)/2 - W_A$
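The one-sample $t$-statistic is a direct translation of the formula. A minimal sketch (the resulting $t$ would then be compared against a $t$-table with $n - 1$ degrees of freedom):

```python
import math
import statistics

def t_statistic(xs, mu0):
    # t = (x-bar - mu0) / (s / sqrt(n)), df = n - 1
    n = len(xs)
    xbar = statistics.mean(xs)
    s = statistics.stdev(xs)  # sample standard deviation (n - 1 denominator)
    return (xbar - mu0) / (s / math.sqrt(n))
```

For the data 1..5 and $\mu_0 = 2$: $\bar{x} = 3$, $s = \sqrt{2.5}$, so $t = \sqrt{2}$.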

Chapter 10

Inferences for Two Population Means

Pooled sample standard deviation:
$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

Pooled $t$-test statistic for $H_0\colon \mu_1 = \mu_2$ (independent samples, normal populations or large samples, and equal population standard deviations):
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{(1/n_1) + (1/n_2)}}$
with df $= n_1 + n_2 - 2$.

Pooled $t$-interval for $\mu_1 - \mu_2$ (independent samples, normal populations or large samples, and equal population standard deviations):
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \cdot s_p \sqrt{(1/n_1) + (1/n_2)}$
with df $= n_1 + n_2 - 2$.

Degrees of freedom for nonpooled $t$-procedures:
$\Delta = \dfrac{\left[ (s_1^2/n_1) + (s_2^2/n_2) \right]^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$
rounded down to the nearest integer.

Nonpooled $t$-test statistic for $H_0\colon \mu_1 = \mu_2$ (independent samples, and normal populations or large samples):
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{(s_1^2/n_1) + (s_2^2/n_2)}}$
with df $= \Delta$.

Nonpooled $t$-interval for $\mu_1 - \mu_2$ (independent samples, and normal populations or large samples):
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \cdot \sqrt{(s_1^2/n_1) + (s_2^2/n_2)}$
with df $= \Delta$.

Mann–Whitney test statistic for $H_0\colon \mu_1 = \mu_2$ (independent samples and same-shape populations):
$M$ = sum of the ranks for sample data from Population 1

Symmetry property of a Mann–Whitney distribution:
$M_{1 - A} = n_1(n_1 + n_2 + 1) - M_A$

Paired $t$-test statistic for $H_0\colon \mu_1 = \mu_2$ (paired sample, and normal differences or large sample):
$t = \dfrac{\bar{d}}{s_d / \sqrt{n}}$
with df $= n - 1$.

Paired $t$-interval for $\mu_1 - \mu_2$ (paired sample, and normal differences or large sample):
$\bar{d} \pm t_{\alpha/2} \cdot \dfrac{s_d}{\sqrt{n}}$
with df $= n - 1$.
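The pooled two-sample procedure combines the two sample variances before standardizing the difference in means. A minimal sketch returning the test statistic and its degrees of freedom (names are illustrative):

```python
import math
import statistics

def pooled_t(x1, x2):
    # sp = sqrt( ((n1-1)s1^2 + (n2-1)s2^2) / (n1+n2-2) )
    # t  = (xbar1 - xbar2) / (sp * sqrt(1/n1 + 1/n2)), df = n1 + n2 - 2
    n1, n2 = len(x1), len(x2)
    v1, v2 = statistics.variance(x1), statistics.variance(x2)  # s1^2, s2^2
    sp = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    t = (statistics.mean(x1) - statistics.mean(x2)) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2
```

When the two samples are identical, the numerator is zero, so $t = 0$, as expected.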

Paired Wilcoxon signed-rank test statistic for $H_0\colon \mu_1 = \mu_2$ (paired sample and symmetric differences):
$W$ = sum of the positive ranks

Chapter 11

Inferences for Population Standard Deviations

$\chi^2$-test statistic for $H_0\colon \sigma = \sigma_0$ (normal population):
$\chi^2 = \dfrac{n - 1}{\sigma_0^2}\, s^2$
with df $= n - 1$.

$\chi^2$-interval for $\sigma$ (normal population):
$\sqrt{\dfrac{n - 1}{\chi^2_{\alpha/2}}} \cdot s$ to $\sqrt{\dfrac{n - 1}{\chi^2_{1 - \alpha/2}}} \cdot s$
with df $= n - 1$.

$F$-test statistic for $H_0\colon \sigma_1 = \sigma_2$ (independent samples and normal populations):
$F = s_1^2 / s_2^2$
with df $= (n_1 - 1,\, n_2 - 1)$.

$F$-interval for $\sigma_1 / \sigma_2$ (independent samples and normal populations):
$\dfrac{1}{\sqrt{F_{\alpha/2}}} \cdot \dfrac{s_1}{s_2}$ to $\dfrac{1}{\sqrt{F_{1 - \alpha/2}}} \cdot \dfrac{s_1}{s_2}$
with df $= (n_1 - 1,\, n_2 - 1)$.
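Both test statistics above are simple functions of sample variances; the reference distributions ($\chi^2$ and $F$ tables) supply the critical values separately. A minimal sketch:

```python
import statistics

def chi2_stat(xs, sigma0):
    # chi^2 = (n - 1) * s^2 / sigma0^2, df = n - 1
    n = len(xs)
    return (n - 1) * statistics.variance(xs) / sigma0 ** 2

def f_stat(x1, x2):
    # F = s1^2 / s2^2, df = (n1 - 1, n2 - 1)
    return statistics.variance(x1) / statistics.variance(x2)
```

Note the sanity check: if $\sigma_0$ equals the observed sample standard deviation, the $\chi^2$ statistic reduces to $n - 1$.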

Chapter 12

Inferences for Population Proportions

Sample proportion: $\hat{p} = \dfrac{x}{n}$
where $x$ denotes the number of members in the sample that have the specified attribute.

$z$-interval for $p$:
$\hat{p} \pm z_{\alpha/2} \cdot \sqrt{\hat{p}(1 - \hat{p})/n}$
(Assumption: both $x$ and $n - x$ are 5 or greater)

Margin of error for the estimate of $p$:
$E = z_{\alpha/2} \cdot \sqrt{\hat{p}(1 - \hat{p})/n}$

Sample size for estimating $p$:
$n = 0.25 \left( \dfrac{z_{\alpha/2}}{E} \right)^2$ or $n = \hat{p}_g (1 - \hat{p}_g) \left( \dfrac{z_{\alpha/2}}{E} \right)^2$
rounded up to the nearest whole number ($g$ = educated guess)

$z$-test statistic for $H_0\colon p = p_0$:
$z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$
(Assumption: both $np_0$ and $n(1 - p_0)$ are 5 or greater)

Pooled sample proportion: $\hat{p}_p = \dfrac{x_1 + x_2}{n_1 + n_2}$

$z$-test statistic for $H_0\colon p_1 = p_2$:
$z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_p(1 - \hat{p}_p)} \sqrt{(1/n_1) + (1/n_2)}}$
(Assumptions: independent samples; $x_1$, $n_1 - x_1$, $x_2$, $n_2 - x_2$ are all 5 or greater)

$z$-interval for $p_1 - p_2$:
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \cdot \sqrt{\hat{p}_1(1 - \hat{p}_1)/n_1 + \hat{p}_2(1 - \hat{p}_2)/n_2}$
(Assumptions: independent samples; $x_1$, $n_1 - x_1$, $x_2$, $n_2 - x_2$ are all 5 or greater)

Margin of error for the estimate of $p_1 - p_2$:
$E = z_{\alpha/2} \cdot \sqrt{\hat{p}_1(1 - \hat{p}_1)/n_1 + \hat{p}_2(1 - \hat{p}_2)/n_2}$

Sample size for estimating $p_1 - p_2$:
$n_1 = n_2 = 0.5 \left( \dfrac{z_{\alpha/2}}{E} \right)^2$ or $n_1 = n_2 = \left( \hat{p}_{1g}(1 - \hat{p}_{1g}) + \hat{p}_{2g}(1 - \hat{p}_{2g}) \right) \left( \dfrac{z_{\alpha/2}}{E} \right)^2$
rounded up to the nearest whole number ($g$ = educated guess)
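The one- and two-proportion procedures can be sketched directly from the formulas; $z_{\alpha/2}$ is again taken as an input, and the 5-or-greater count conditions should be checked before relying on the results:

```python
import math

def prop_z_interval(x, n, z_crit):
    # p-hat +/- z_crit * sqrt(p-hat (1 - p-hat) / n)
    # (requires x >= 5 and n - x >= 5)
    p_hat = x / n
    e = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - e, p_hat + e)

def two_prop_z(x1, n1, x2, n2):
    # pooled p-hat_p = (x1 + x2) / (n1 + n2)
    # z = (p1 - p2) / ( sqrt(pp (1 - pp)) * sqrt(1/n1 + 1/n2) )
    p1, p2 = x1 / n1, x2 / n2
    pp = (x1 + x2) / (n1 + n2)
    return (p1 - p2) / (math.sqrt(pp * (1 - pp)) * math.sqrt(1 / n1 + 1 / n2))
```

If the two sample proportions are equal, the pooled $z$-statistic is zero, regardless of the sample sizes.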

Chapter 13

Chi-Square Procedures

Expected frequencies for a chi-square goodness-of-fit test: $E = np$

Test statistic for a chi-square goodness-of-fit test:
$\chi^2 = \sum (O - E)^2 / E$
with df $= c - 1$, where $c$ is the number of possible values for the variable under consideration.

Expected frequencies for a chi-square independence test or a chi-square homogeneity test:
$E = \dfrac{R \cdot C}{n}$
where $R$ = row total and $C$ = column total.

Test statistic for a chi-square independence test:
$\chi^2 = \sum (O - E)^2 / E$
with df $= (r - 1)(c - 1)$, where $r$ and $c$ are the number of possible values for the two variables under consideration.

Test statistic for a chi-square homogeneity test:
$\chi^2 = \sum (O - E)^2 / E$
with df $= (r - 1)(c - 1)$, where $r$ is the number of populations and $c$ is the number of possible values for the variable under consideration.
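The goodness-of-fit statistic is a short computation once the hypothesized category probabilities are given. A minimal sketch (the category probabilities must sum to 1):

```python
def chi_square_gof(observed, probs):
    # expected frequency per category: E = n * p
    # chi^2 = sum (O - E)^2 / E, df = (number of categories) - 1
    n = sum(observed)
    expected = [n * p for p in probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

For example, observed counts [30, 20] against equal probabilities give $E = 25$ each, so $\chi^2 = 25/25 + 25/25 = 2$.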

Chapter 14

Descriptive Methods in Regression and Correlation

$S_{xx}$, $S_{xy}$, and $S_{yy}$:
$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - (\sum x_i)^2/n$
$S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - (\sum x_i)(\sum y_i)/n$
$S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - (\sum y_i)^2/n$

Regression equation: $\hat{y} = b_0 + b_1 x$, where
$b_1 = \dfrac{S_{xy}}{S_{xx}}$ and $b_0 = \dfrac{1}{n}\left( \sum y_i - b_1 \sum x_i \right) = \bar{y} - b_1 \bar{x}$

Total sum of squares: $SST = \sum (y_i - \bar{y})^2 = S_{yy}$

Regression sum of squares: $SSR = \sum (\hat{y}_i - \bar{y})^2 = S_{xy}^2 / S_{xx}$

Error sum of squares: $SSE = \sum (y_i - \hat{y}_i)^2 = S_{yy} - S_{xy}^2 / S_{xx}$

Regression identity: $SST = SSR + SSE$

Coefficient of determination: $r^2 = \dfrac{SSR}{SST}$

Linear correlation coefficient:
$r = \dfrac{\frac{1}{n - 1} \sum (x_i - \bar{x})(y_i - \bar{y})}{s_x s_y}$ or $r = \dfrac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$
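The slope and intercept formulas translate directly into code. A minimal sketch that fits the regression line from $S_{xx}$ and $S_{xy}$ (names are illustrative):

```python
def regression_line(xs, ys):
    # b1 = Sxy / Sxx, b0 = y-bar - b1 * x-bar
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1
```

A quick check: data lying exactly on $y = 2x + 1$ should return $b_0 = 1$ and $b_1 = 2$, with $SSE = 0$.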

Chapter 15

Inferential Methods in Regression and Correlation

Population regression equation: $y = \beta_0 + \beta_1 x$

Standard error of the estimate: $s_e = \sqrt{\dfrac{SSE}{n - 2}}$

Test statistic for $H_0\colon \beta_1 = 0$:
$t = \dfrac{b_1}{s_e / \sqrt{S_{xx}}}$
with df $= n - 2$.

Confidence interval for $\beta_1$:
$b_1 \pm t_{\alpha/2} \cdot \dfrac{s_e}{\sqrt{S_{xx}}}$
with df $= n - 2$.

Confidence interval for the conditional mean of the response variable corresponding to $x_p$:
$\hat{y}_p \pm t_{\alpha/2} \cdot s_e \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \sum x_i / n)^2}{S_{xx}}}$
with df $= n - 2$.

Prediction interval for an observed value of the response variable corresponding to $x_p$:
$\hat{y}_p \pm t_{\alpha/2} \cdot s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \sum x_i / n)^2}{S_{xx}}}$
with df $= n - 2$.

Test statistic for $H_0\colon \rho = 0$:
$t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$
with df $= n - 2$.

Test statistic for a correlation test for normality:
$R_p = \dfrac{\sum x_i w_i}{\sqrt{S_{xx} \sum w_i^2}}$
where $x$ and $w$ denote observations of the variable and the corresponding normal scores, respectively.
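The correlation test statistic depends only on $r$ and $n$, so it is a one-liner. A minimal sketch (the result is compared against a $t$-table with $n - 2$ degrees of freedom):

```python
import math

def corr_t(r, n):
    # t = r / sqrt( (1 - r^2) / (n - 2) ), df = n - 2
    return r / math.sqrt((1 - r * r) / (n - 2))
```

For example, $r = 0.6$ with $n = 27$ gives $t = 0.6 / \sqrt{0.64/25} = 0.6 / 0.16 = 3.75$.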

Chapter 16

Analysis of Variance (ANOVA)

Notation in one-way ANOVA:
$k$ = number of populations
$n$ = total number of observations
$\bar{x}$ = mean of all $n$ observations
$n_j$ = size of sample from Population $j$
$\bar{x}_j$ = mean of sample from Population $j$
$s_j^2$ = variance of sample from Population $j$
$T_j$ = sum of sample data from Population $j$

Defining formulas for sums of squares in one-way ANOVA:
$SST = \sum (x_i - \bar{x})^2$
$SSTR = \sum n_j (\bar{x}_j - \bar{x})^2$
$SSE = \sum (n_j - 1) s_j^2$

One-way ANOVA identity: $SST = SSTR + SSE$

Computing formulas for sums of squares in one-way ANOVA:
$SST = \sum x_i^2 - (\sum x_i)^2/n$
$SSTR = \sum (T_j^2 / n_j) - (\sum x_i)^2/n$
$SSE = SST - SSTR$

Mean squares in one-way ANOVA:
$MSTR = \dfrac{SSTR}{k - 1}$, $MSE = \dfrac{SSE}{n - k}$

Test statistic for one-way ANOVA (independent samples, normal populations, and equal population standard deviations):
$F = \dfrac{MSTR}{MSE}$
with df $= (k - 1,\, n - k)$.

Confidence interval for $\mu_i - \mu_j$ in the Tukey multiple-comparison method (independent samples, normal populations, and equal population standard deviations):
$(\bar{x}_i - \bar{x}_j) \pm \dfrac{q_{\alpha}}{\sqrt{2}} \cdot s \sqrt{(1/n_i) + (1/n_j)}$
where $s = \sqrt{MSE}$ and $q_{\alpha}$ is obtained for a $q$-curve with parameters $k$ and $n - k$.

Test statistic for a Kruskal–Wallis test (independent samples, same-shape populations, all sample sizes 5 or greater):
$H = \dfrac{SSTR}{SST/(n - 1)}$ or $H = \dfrac{12}{n(n + 1)} \sum_{j=1}^{k} \dfrac{R_j^2}{n_j} - 3(n + 1)$
where $SSTR$ and $SST$ are computed for the ranks of the data, and $R_j$ denotes the sum of the ranks for the sample data from Population $j$. $H$ has approximately a chi-square distribution with df $= k - 1$.
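The one-way ANOVA $F$-statistic can be assembled straight from the defining formulas for $SSTR$ and $SSE$. A minimal sketch, assuming each inner list is one sample (the $F$ value would then be compared against an $F$-table with $(k - 1, n - k)$ degrees of freedom):

```python
import statistics

def one_way_anova_f(samples):
    # SSTR = sum of nj * (xbar_j - xbar)^2  (treatment sum of squares)
    # SSE  = sum of (nj - 1) * sj^2         (error sum of squares)
    # F    = MSTR / MSE = (SSTR / (k - 1)) / (SSE / (n - k))
    k = len(samples)
    all_obs = [x for s in samples for x in s]
    n = len(all_obs)
    grand = sum(all_obs) / n  # mean of all n observations
    sstr = sum(len(s) * (statistics.mean(s) - grand) ** 2 for s in samples)
    sse = sum((len(s) - 1) * statistics.variance(s) for s in samples)
    mstr = sstr / (k - 1)
    mse = sse / (n - k)
    return mstr / mse
```

For the two samples [1, 2, 3] and [2, 3, 4]: $SSTR = 1.5$, $SSE = 4$, so $F = 1.5 / 1 = 1.5$; the ANOVA identity $SST = SSTR + SSE = 5.5$ also checks out.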