
CAPE MATHEMATICS

Cheat Sheet

• A population consists of all elements whose characteristics are being studied.
• A sample is a set of data drawn from a population for examination.
• A census is analysis using the entire population.
• A survey is analysis using the sample.
• A parameter is any characteristic of the population.
• A statistic, on the other hand, is a characteristic of the sample.
• A variable is some characteristic of a population or sample that assumes different values for different elements.
• Qualitative data is a categorical measurement expressed not in terms of numbers.
• Quantitative data is a numerical measurement.
• A discrete variable is a variable whose values are countable.
• A continuous variable can assume any value in a given range, and cannot be counted.
• Primary data is data observed or collected directly from first-hand experience.
• Secondary data is data published and collected in the past or from other parties.

SAMPLING

Sampling frame
The sampling frame is the actual list of individuals that the sample will be drawn from.

SAMPLING TECHNIQUES

Simple Random Sampling
Every member of the population has an equal chance of being selected. You can use random numbers or the lottery technique.

Systematic
Select every kth element, where $k = \frac{N}{n}$.

Stratified
Group items into strata with a similar characteristic. Choose proportionally from each stratum: $x \times \frac{n}{N}$.

Cluster
Form clusters with high variation within. Instead of sampling individuals from each subgroup, you randomly select entire subgroups.

Quota
Form strata just as in stratified random sampling, but choose within each using convenience.

GRAPHS

Histogram
x: class boundaries
y: frequencies

Polygon
x: midpoints
y: frequencies

Ogive
x: upper class boundaries
y: cumulative frequencies
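As a small illustration of the three graphs, the sketch below builds the points each one would plot from a grouped frequency table. The table itself and the variable names are hypothetical, chosen only for the example; the midpoint is taken as the average of the two class boundaries.

```python
from itertools import accumulate

# Hypothetical grouped frequency table:
# (lower class boundary, upper class boundary, frequency)
classes = [(0, 10, 4), (10, 20, 6), (20, 30, 8), (30, 40, 2)]

frequencies = [f for _, _, f in classes]

# Histogram: each bar spans its class boundaries; height = frequency
histogram = [((lo, hi), f) for lo, hi, f in classes]

# Frequency polygon: x = class midpoint, y = frequency
polygon = [((lo + hi) / 2, f) for lo, hi, f in classes]

# Ogive: x = upper class boundary, y = cumulative frequency
upper_boundaries = [hi for _, hi, _ in classes]
ogive = list(zip(upper_boundaries, accumulate(frequencies)))

print(polygon)  # [(5.0, 4), (15.0, 6), (25.0, 8), (35.0, 2)]
print(ogive)    # [(10, 4), (20, 10), (30, 18), (40, 20)]
```

The last ogive point always carries the total frequency, which is a quick check that the cumulative column was built correctly.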

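The systematic and stratified rules listed under SAMPLING TECHNIQUES can be tried out with a short sketch. Several things here are assumptions rather than part of the sheet: N is read as the population size, n as the sample size and x as the size of a stratum; k is rounded down when N/n is not whole; the systematic sample starts at a random position in the first interval; and the stratum names are made up.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick every k-th element, with k = N // n (assumed rounding down)."""
    N = len(population)
    k = N // n                       # sampling interval k = N / n
    rng = random.Random(seed)
    start = rng.randrange(k)         # assumed: random start in the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_allocation(strata_sizes, n):
    """Proportional allocation: a stratum of size x contributes x * n / N."""
    N = sum(strata_sizes.values())
    return {name: x * n / N for name, x in strata_sizes.items()}

population = list(range(1, 101))     # hypothetical sampling frame of 100 IDs
sample = systematic_sample(population, n=10, seed=1)
allocation = stratified_allocation({"form4": 60, "form5": 40}, n=10)

print(sample)
print(allocation)                    # {'form4': 6.0, 'form5': 4.0}
```

In practice the allocations may need rounding so that they are whole numbers that still add up to n.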
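Class Width = Upper Class Boundary - Lower Class Boundary

Midpoint = $\frac{\text{UCB} + \text{LCB}}{2}$

Mean

Raw
$\bar{x} = \frac{\sum x}{n}$

Ungrouped & Grouped
$\bar{x} = \frac{\sum xf}{\sum f}$

Mode:
$l + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times c$

Median
$Q_2 = l + \left(\frac{\frac{n+1}{2} - m}{f}\right) \times c$

Lower Quartile
$Q_1 = l + \left(\frac{\frac{n+1}{4} - m}{f}\right) \times c$

Upper Quartile
$Q_3 = l + \left(\frac{\frac{3(n+1)}{4} - m}{f}\right) \times c$

MEASURES OF SPREAD

Variance

Raw
$\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$

Ungrouped & Grouped
$\frac{\sum x^2 f}{\sum f} - \left(\frac{\sum xf}{\sum f}\right)^2$

Sample Standard Deviation
$s = \sqrt{\text{sample variance}} = \sqrt{s^2}$

Quartile Deviation:
$QD = \frac{Q_3 - Q_1}{2}$

Interquartile Range:
$IQR = Q_3 - Q_1$

Skewness: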
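The grouped-data formulas for the mean, the variance and the median on this page can be checked numerically with a short sketch. The frequency table is hypothetical, and the symbols in the interpolation formula are read in the usual way, which is an assumption since the sheet does not define them: l is the lower boundary of the class containing the required position, m the cumulative frequency before that class, f its frequency and c its width.

```python
import math

# Hypothetical grouped frequency table:
# (lower class boundary, upper class boundary, frequency)
classes = [(0, 10, 4), (10, 20, 6), (20, 30, 8), (30, 40, 2)]

def grouped_mean_variance(classes):
    """Mean = sum(xf)/sum(f); variance = sum(x^2 f)/sum(f) - mean^2,
    where x is the class midpoint."""
    total_f = sum(f for _, _, f in classes)
    sum_xf = sum((lo + hi) / 2 * f for lo, hi, f in classes)
    sum_x2f = sum(((lo + hi) / 2) ** 2 * f for lo, hi, f in classes)
    mean = sum_xf / total_f
    return mean, sum_x2f / total_f - mean ** 2

def interpolate(classes, position):
    """Value at a given position: l + ((position - m) / f) * c."""
    m = 0                                  # cumulative frequency so far
    for lo, hi, f in classes:
        if m + f >= position:
            return lo + (position - m) / f * (hi - lo)
        m += f
    raise ValueError("position exceeds the total frequency")

n = sum(f for _, _, f in classes)
mean, variance = grouped_mean_variance(classes)
std_dev = math.sqrt(variance)
median = interpolate(classes, (n + 1) / 2)  # position (n + 1) / 2

print(mean, variance, round(std_dev, 3), median)
```

The same `interpolate` helper gives a quartile when it is called with that quartile's position instead of (n + 1) / 2.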

PROBABILITY

Complement Law
$P(\bar{B}) = 1 - P(B)$

Addition Law
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Mutually Exclusive Events
$P(A \cup B) = P(A) + P(B)$
$P(A \cap B) = 0$

Conditional Probability
$P(A|B) = \frac{P(A \cap B)}{P(B)}$

Independent Events
$P(A|B) = P(A)$
$P(A \cap B) = P(A) \times P(B)$

Discrete Random Variables
$E(X) = \mu = \sum x P(x)$
$Var(X) = \sum x^2 P(x) - \mu^2$

Binomial Distribution
If the random variable X has a binomial distribution such that $X \sim Bin(n, p)$, then
$P(X = x) = {}^{n}C_{x} \, p^x (1 - p)^{n - x}$
mean $= np$
variance $= np(1 - p)$

Normal Distribution
If the random variable X has a normal distribution with mean $\mu$ and variance $\sigma^2$, i.e. $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma}$

Taking a and b as positive (with a > b in the sixth rule):
$P(Z < a) = \phi(a)$
$P(Z > a) = 1 - \phi(a)$
$P(Z < -a) = 1 - \phi(a)$
$P(Z > -a) = \phi(a)$
$P(a < Z < b) = \phi(b) - \phi(a)$
$P(-a < Z < -b) = \phi(a) - \phi(b)$
$P(-a < Z < b) = [\phi(b) + \phi(a)] - 1$

THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION
If $X \sim Bin(n, p)$ and $np > 5$ and $nq > 5$ (where $q = 1 - p$), then approximately
$X \sim N(np, np(1 - p))$ and $Z = \frac{X - np}{\sqrt{np(1 - p)}}$

Continuity Correction
In moving from a discrete binomial distribution to a continuous normal distribution, an adjustment must be made to the range of the discrete values as follows:
1. $P(X > a) \to P(X > a + 0.5)$
2. $P(X \geq a) \to P(X \geq a - 0.5)$
3. $P(X < a) \to P(X < a - 0.5)$
4. $P(X \leq a) \to P(X \leq a + 0.5)$
5. $P(X = a) \to P(a - 0.5 < X < a + 0.5)$
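To see what the continuity correction buys, the sketch below compares an exact binomial probability with its normal approximation for rule 4, P(X ≤ a) → P(X ≤ a + 0.5). The values of n, p and a are hypothetical, and the standard normal cumulative probability is computed from the error function in Python's `math` module rather than read from tables.

```python
import math

def phi(z):
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def binomial_cdf(a, n, p):
    """Exact P(X <= a) for X ~ Bin(n, p), summing nCx p^x (1-p)^(n-x)."""
    return sum(math.comb(n, x) * p ** x * (1 - p) ** (n - x)
               for x in range(a + 1))

def normal_approx_cdf(a, n, p):
    """Approximate P(X <= a) by P(X <= a + 0.5) with X ~ N(np, np(1-p))."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return phi((a + 0.5 - mu) / sigma)

# Hypothetical values: np = 20 > 5 and nq = 30 > 5, so the approximation applies
n, p, a = 50, 0.4, 22
exact = binomial_cdf(a, n, p)
approx = normal_approx_cdf(a, n, p)

print(round(exact, 4), round(approx, 4))
```

The two printed values should agree closely; dropping the 0.5 in `normal_approx_cdf` makes the agreement noticeably worse.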
Steps:
1. Define the random variable
2. Write down $X \sim Bin(n, p)$
3. Calculate $\mu = np$ and $\sigma^2 = np(1 - p)$
4. Write down $X \sim N(np, np(1 - p))$
5. Write the P statement, for example $P(Z < a)$
6. Do the continuity correction
7. Standardize and solve

THE DISTRIBUTION OF THE SAMPLE MEAN
If $X \sim N(\mu, \sigma^2)$, or if the population is not normal but n is large, then
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ and $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

THE DISTRIBUTION OF THE SAMPLE PROPORTION
$P_s \sim N\left(p, \frac{pq}{n}\right)$
$Z = \frac{\left(P_s \pm \frac{1}{2n}\right) - p}{\sqrt{\frac{pq}{n}}}$

ESTIMATION

Unbiased estimate for the Mean
$\bar{x} = \frac{\sum x}{n}$

Unbiased Estimator of the Variance
$s^2 = \frac{1}{n - 1}\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)$
$s^2 = \frac{1}{n - 1}\left(\sum x^2 f - \frac{(\sum xf)^2}{n}\right)$

CONFIDENCE INTERVAL FOR THE MEAN
To find $z_{\alpha/2}$, we look for $\frac{1 + \text{confidence level}}{2}$

Confidence Level    z
99%                 2.576
98%                 2.326
97%                 2.17
95%                 1.96
90%                 1.645

Population normal; population standard deviation known
$\bar{x} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z\frac{\sigma}{\sqrt{n}}$

Population not normal; population standard deviation not known; n ≥ 30
$\bar{x} - z\frac{\hat{\sigma}}{\sqrt{n}} < \mu < \bar{x} + z\frac{\hat{\sigma}}{\sqrt{n}}$

Population normal; population standard deviation not known; n < 30
$\bar{x} - t\frac{\hat{\sigma}}{\sqrt{n}} < \mu < \bar{x} + t\frac{\hat{\sigma}}{\sqrt{n}}$
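The estimation formulas in this section can be exercised with a few lines of code. The sketch below computes the unbiased variance estimate from raw data and a z-based confidence interval for the mean from summary figures. All numbers are hypothetical; the 95% interval uses z = 1.96 from the confidence-level table, and the summary sample is given n = 36 so that the large-sample case applies.

```python
import math

def unbiased_variance(data):
    """s^2 = (sum(x^2) - (sum(x))^2 / n) / (n - 1)."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

def z_interval(sample_mean, sd, n, z):
    """Return (lower, upper) = sample_mean -/+ z * sd / sqrt(n)."""
    margin = z * sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical raw data for the variance estimate
data = [2, 4, 4, 4, 5, 5, 7, 9]
s2 = unbiased_variance(data)

# Hypothetical summary statistics: n = 36, sample mean 50, estimated sd 6
lower, upper = z_interval(sample_mean=50, sd=6, n=36, z=1.96)

print(round(s2, 4))
print(round(lower, 2), round(upper, 2))
```

For a small sample from a normal population with unknown standard deviation, the same interval shape applies with the t value in place of z.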

HYPOTHESIS TESTING FOR THE MEAN

Null Hypothesis
$H_0: \mu = \mu_0$

Alternative hypothesis $H_1$
$H_1: \mu > \mu_0$ [Right tail test]
$H_1: \mu < \mu_0$ [Left tail test]
$H_1: \mu \neq \mu_0$ [Two-tail test]

Test Statistic:

Population normal; population standard deviation known
$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$

Population not normal; population standard deviation not known; n large
$z = \frac{\bar{x} - \mu_0}{\hat{\sigma} / \sqrt{n}}$

Population normal; population standard deviation not known; n small
$t = \frac{\bar{x} - \mu_0}{\hat{\sigma} / \sqrt{n}}$

HYPOTHESIS TESTING FOR THE PROPORTION

Null Hypothesis
$H_0: p = p_0$

Alternative hypothesis $H_1$
$H_1: p > p_0$ [Right tail test]
$H_1: p < p_0$ [Left tail test]
$H_1: p \neq p_0$ [Two-tail test]

Test Statistic:

Population normal; population standard deviation known
$z = \frac{p_s - p_0}{\sqrt{\frac{p_0 q_0}{n}}}$

REGRESSION
$y = a + bx$
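For the hypothesis tests on the mean and on the proportion, the z test statistics can be computed as in the sketch below. The sample figures are hypothetical, and the decision step assumes a right-tail test at the 5% level, for which the critical value is 1.645; neither the level nor the function names come from the sheet.

```python
import math

def z_for_mean(sample_mean, mu0, sigma, n):
    """z = (x_bar - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

def z_for_proportion(p_s, p0, n):
    """z = (p_s - p0) / sqrt(p0 * q0 / n), with q0 = 1 - p0."""
    q0 = 1 - p0
    return (p_s - p0) / math.sqrt(p0 * q0 / n)

def reject_right_tail(z, critical=1.645):
    """Right-tail test: reject H0 when z exceeds the critical value
    (1.645 assumes a 5% significance level)."""
    return z > critical

# Hypothetical data: H0: mu = 50 against H1: mu > 50
z_mean = z_for_mean(sample_mean=52, mu0=50, sigma=6, n=36)

# Hypothetical data: H0: p = 0.5 against H1: p > 0.5
z_prop = z_for_proportion(p_s=0.6, p0=0.5, n=100)

print(round(z_mean, 3), reject_right_tail(z_mean))
print(round(z_prop, 3), reject_right_tail(z_prop))
```

For a left-tail test the comparison flips to z < -critical, and for a two-tail test the absolute value of z is compared with the two-tail critical value.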

Coefficient of correlation
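The sheet gives only the form y = a + bx for the regression line and names the coefficient of correlation without a formula. The sketch below therefore assumes the usual least-squares estimates, b = Sxy/Sxx and a = ȳ − b·x̄, and the usual product-moment coefficient r = Sxy/√(Sxx·Syy); these formulas and the data are not taken from the sheet.

```python
import math

def sums_of_squares(xs, ys):
    """Return (Sxx, Syy, Sxy) about the means of xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy

def regression_line(xs, ys):
    """Least-squares a and b in y = a + bx (assumed formulas)."""
    sxx, _, sxy = sums_of_squares(xs, ys)
    b = sxy / sxx
    a = sum(ys) / len(ys) - b * sum(xs) / len(xs)
    return a, b

def correlation(xs, ys):
    """Product-moment correlation r = Sxy / sqrt(Sxx * Syy) (assumed formula)."""
    sxx, syy, sxy = sums_of_squares(xs, ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = regression_line(xs, ys)
r = correlation(xs, ys)
print(round(a, 3), round(b, 3), round(r, 4))
```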

CHI SQUARED TEST OF INDEPENDENCE

Null Hypothesis
H0: Variables are independent.

Alternative Hypothesis
H1: Variables are dependent

Test Statistic
$\chi^2 = \sum \frac{(O - E)^2}{E}$
where $E = \frac{(\text{row total})(\text{column total})}{\text{sample size}}$

Rejection Region
We reject $H_0$ if $\chi^2_{\text{test}} > \chi^2$ (the critical value), with
$df = (R - 1)(C - 1)$
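A short sketch of the chi-squared calculation: it builds the expected frequencies from the row and column totals, sums (O − E)²/E over all cells, and reports the degrees of freedom. The observed table is hypothetical, and the comparison uses 3.841, the 5% critical value for df = 1, as an assumed significance level.

```python
def chi_squared_test(observed):
    """Return (statistic, df, expected) for a contingency table given
    as a list of rows of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    sample_size = sum(row_totals)
    # E = (row total)(column total) / sample size
    expected = [[r * c / sample_size for c in col_totals] for r in row_totals]
    statistic = sum((o - e) ** 2 / e
                    for obs_row, exp_row in zip(observed, expected)
                    for o, e in zip(obs_row, exp_row))
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df, expected

# Hypothetical 2 x 2 table of observed frequencies
observed = [[20, 30],
            [30, 20]]

statistic, df, expected = chi_squared_test(observed)
critical_value = 3.841        # assumed: 5% level, df = 1
reject_h0 = statistic > critical_value

print(statistic, df, reject_h0)
```

Here every expected frequency is 25, so each cell contributes (5²)/25 = 1 and the statistic is 4 with df = 1.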

Limitations of the chi square test

• The test is limited to two variables.
• The contingency table must have at least 2 rows and 2 columns.
• Too many cells with an expected frequency less than 5 limit the accuracy of the decision arising from the test. Accordingly, the number of cells with expected frequency less than 5 must be limited to 20% of all cells; otherwise, the decision will be invalid.
• The quality of the decision is influenced by the quality of the data collection.
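The table-size and expected-frequency conditions in this list are easy to check before running the test. The sketch below is one way to do it; the function names and both example tables of expected frequencies are hypothetical.

```python
def small_expected_fraction(expected, threshold=5):
    """Fraction of cells whose expected frequency is below the threshold."""
    cells = [e for row in expected for e in row]
    return sum(1 for e in cells if e < threshold) / len(cells)

def chi_square_conditions_met(expected, max_fraction=0.20):
    """True if the table is at least 2 x 2 and no more than 20% of cells
    have an expected frequency less than 5."""
    at_least_2x2 = len(expected) >= 2 and all(len(row) >= 2 for row in expected)
    return at_least_2x2 and small_expected_fraction(expected) <= max_fraction

# Hypothetical tables of expected frequencies
ok_table = [[25.0, 25.0], [25.0, 25.0]]
sparse_table = [[3.0, 7.0], [12.0, 28.0]]   # 1 of 4 cells (25%) is below 5

print(chi_square_conditions_met(ok_table))      # True
print(chi_square_conditions_met(sparse_table))  # False
```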
