Notes A3
Greek letters: population
Latin letters: sample
n
Outliers in percentiles:
- Calculate the interquartile range: IQR = Q3 − Q1
- Multiply it by 1.5 to get 1.5 × IQR
- If a value is less than Q1 − 1.5 × IQR or more than Q3 + 1.5 × IQR, then it is an outlier
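The outlier rule can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `iqr_outliers` and the sample data are made up, and the "inclusive" quartile method is only one of several conventions, so Q1 and Q3 may differ slightly from values computed by hand or by other software.

```python
import statistics

def iqr_outliers(values):
    """Return the values lying outside the 1.5 x IQR fences."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    # The "inclusive" method is an assumed convention; others exist.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

sample = [2, 4, 4, 5, 6, 7, 8, 9, 30]  # made-up data
print(iqr_outliers(sample))  # [30]
```

Here Q1 = 4 and Q3 = 8, so IQR = 4 and the fences sit at −2 and 14; only 30 falls outside them.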
Box and whisker plot:
Measures of Dispersion
Range:
- Difference between the maximum and minimum values
- Simplest measure of spread
- Depends only on the two most extreme values, so it is highly sensitive to outliers
Mean absolute deviation (MAD):
- Measures the average absolute difference from the mean
- $\mathrm{MAD} = \frac{|x_1 - \bar{X}| + |x_2 - \bar{X}| + \dots + |x_n - \bar{X}|}{n}$
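As a small worked example of the range and the MAD, the sketch below computes both for a made-up data set. The function names `value_range` and `mean_absolute_deviation` are illustrative choices, not taken from the notes.

```python
def value_range(values):
    """Difference between the maximum and minimum values."""
    return max(values) - min(values)

def mean_absolute_deviation(values):
    """Average absolute difference from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, mean = 5
print(value_range(sample))              # 7
print(mean_absolute_deviation(sample))  # 1.5
```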
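Variance:
- Sample variance and population variance
- Measures the average squared difference from the mean
- $\sigma^2 = \frac{(x_1 - \bar{X})^2 + (x_2 - \bar{X})^2 + \dots + (x_n - \bar{X})^2}{n - 1}$ (sample form; the population variance divides by the population size instead of n − 1)
Standard deviation:
- Sample SD or population SD
- Square root of the variance value
- $\sigma = \sqrt{\sigma^2}$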
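The difference between the sample and population forms is only the divisor. The sketch below shows this on made-up data; the `variance` helper and its `sample` flag are hypothetical names, and the final line cross-checks the result against the standard library's `statistics.variance`.

```python
import math
import statistics

def variance(values, sample=True):
    """Average squared difference from the mean.

    Divides by n - 1 for a sample, or by n for a whole population.
    """
    n = len(values)
    mean = sum(values) / n
    squared_diffs = sum((x - mean) ** 2 for x in values)
    return squared_diffs / (n - 1 if sample else n)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, mean = 5
pop_var = variance(data, sample=False)  # 32 / 8
samp_var = variance(data)               # 32 / 7

print(pop_var, math.sqrt(pop_var))  # 4.0 2.0
print(math.isclose(samp_var, statistics.variance(data)))  # True
```

The standard deviation is the square root of whichever variance is used, which puts it back in the original units of the data.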
Coefficient of variation (CV):
- Adjusts for differences between the magnitudes of the means
- A unit-less measure of mean-adjusted dispersion
- Allows easy comparison across datasets
- $CV = \frac{\sigma}{\bar{X}}$
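To illustrate why the CV allows comparison across magnitudes, the sketch below uses two made-up data sets whose values differ by a factor of 100. The function name is illustrative, and the sample standard deviation (`statistics.stdev`) is assumed here; a population SD could be substituted.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation divided by the mean (sample SD assumed)."""
    return statistics.stdev(values) / statistics.mean(values)

small_scale = [10, 12, 14]        # made-up data: SD = 2, mean = 12
large_scale = [1000, 1200, 1400]  # same shape, 100 times larger

print(round(coefficient_of_variation(small_scale), 4))  # 0.1667
print(round(coefficient_of_variation(large_scale), 4))  # 0.1667
```

The standard deviations differ (2 versus 200), but the CV is the same for both, because the spread is expressed relative to the mean.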
The z-score:
- Number of standard deviations away from the mean
1. Equal to the mean: z-score of 0
2. Less than the mean: negative z-score
3. More than the mean: positive z-score
Converting values into z-scores is called standardizing the data.
Standardizing a value:
$z_i = \frac{x_i - \bar{X}}{\sigma}$
$z_{\text{value}} = \frac{\text{value} - \text{mean}}{\text{standard deviation}}$
Converting z score back to original value:
$x_i = \bar{X} + z_i \sigma$
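A short round trip shows that the two conversions are inverses of each other. The function names and the numbers (mean 70, standard deviation 10, value 85) are made up for illustration.

```python
def z_score(value, mean, sd):
    """Number of standard deviations the value lies from the mean."""
    return (value - mean) / sd

def from_z_score(z, mean, sd):
    """Convert a z-score back to the original scale."""
    return mean + z * sd

mean, sd = 70.0, 10.0  # made-up mean and standard deviation
z = z_score(85.0, mean, sd)

print(z)                          # 1.5
print(from_z_score(z, mean, sd))  # 85.0
```

A value of 85 is 1.5 standard deviations above the mean, so its z-score is positive.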
Chebyshev’s theorem:
1. If we know the standard deviation and the mean, we know that at least 75% of the data lie within 2 standard deviations of the mean
2. Empirical rule: for many datasets, we can say what fraction of observations fall within 1, 2 and 3 SDs on either side of the mean.
For any data, the proportion of observations that lie within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$, where k > 1.
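The bound can be checked numerically. In the sketch below, `chebyshev_bound` and `fraction_within` are illustrative names and the skewed data set is made up; the population standard deviation (`statistics.pstdev`) is assumed when measuring distance from the mean.

```python
import statistics

def chebyshev_bound(k):
    """Minimum proportion of observations within k SDs of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

def fraction_within(values, k):
    """Observed proportion of values within k SDs of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD assumed
    inside = [x for x in values if abs(x - mean) <= k * sd]
    return len(inside) / len(values)

skewed = [1, 1, 1, 2, 2, 3, 4, 5, 8, 40]  # made-up, strongly skewed data

print(chebyshev_bound(2))  # 0.75
print(fraction_within(skewed, 2) >= chebyshev_bound(2))  # True
```

Even for this skewed data set the observed proportion stays above the 75% guaranteed for k = 2, since the theorem gives a lower bound rather than an exact value.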
Analysis of relative location:
Chebyshev’s theorem:
1. Applies to all datasets, regardless of their distributions
2. Defines a lower bound on the percentages of observations lying in a given interval
3. Actual percentages can be much greater
Empirical rule:
1. More precise
2. Applies only to symmetrical, bell-shaped distributions
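For a normal (bell-shaped) distribution, the fractions given by the empirical rule can be computed directly. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `normal_fraction_within` is an illustrative choice.

```python
from statistics import NormalDist

def normal_fraction_within(k):
    """Fraction of a normal distribution within k SDs of the mean."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(normal_fraction_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

These are the familiar 68%, 95% and 99.7% figures. For comparison, Chebyshev's theorem guarantees only 75% for k = 2 and about 88.9% for k = 3, which is why the empirical rule is more precise when the bell-shape assumption holds.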
- Covariance ($\sigma_{xy}$): $\sigma_{xy} = \frac{\sum_{i} (x_i - \bar{X})(y_i - \bar{Y})}{n - 1}$
- Correlation ($r_{xy}$ or $\rho_{xy}$): describes both the strength and direction of the linear relationship between two variables, x and y.
$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
Note that −1 ≤ ρ ≤ 1
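The covariance and correlation formulas can be sketched as below, using the n − 1 divisor throughout. The function names and data are made up for illustration; the standard deviation of each variable is obtained as the square root of its covariance with itself, which equals its sample variance.

```python
import math

def covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return products / (n - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    sd_x = math.sqrt(covariance(xs, xs))  # covariance of x with itself = variance
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

xs = [1, 2, 3, 4, 5]   # made-up data
ys = [2, 4, 6, 8, 10]  # exactly 2 * x, a perfect positive relationship

print(covariance(xs, ys))                   # 5.0
print(round(correlation(xs, ys), 6))        # 1.0
print(round(correlation(xs, ys[::-1]), 6))  # -1.0
```

Reversing the order of `ys` turns the perfect positive relationship into a perfect negative one, giving the two extremes of the −1 to 1 range.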