Statistics
Topic 2: Descriptive Statistics for Data
1 / 25
Measures of Relative Standing
2 / 25
Measures of Relative Standing
• We have special names for the 25th, 50th, and 75th percentiles: the first, second, and third quartiles.
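As a rough illustration of the quartiles, here is a minimal Python sketch. The sample values are hypothetical (they do not come from the slides), and the `method="inclusive"` setting is only one of several interpolation conventions; other software may report slightly different quartiles for the same data.

```python
import statistics

# Hypothetical sample, invented for illustration only.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 18]

# n=4 splits the data into four equal parts and returns the three cut points:
# the 25th, 50th, and 75th percentiles.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

print("Q1 (25th percentile):", q1)
print("Q2 (50th percentile, the median):", q2)
print("Q3 (75th percentile):", q3)
```

The second quartile coincides with the median, which gives a quick sanity check on any quartile routine.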
3 / 25
Interquartile Range
• Large values of the interquartile range (Q3 − Q1) mean that the 1st and 3rd quartiles are far apart, indicating a high level of variability.
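The link between the interquartile range and variability can be sketched in Python as follows. Both samples and the helper name `iqr` are hypothetical, and the quartile convention (`method="inclusive"`) is an assumption rather than something the slides specify.

```python
import statistics

def iqr(data):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return q3 - q1

# Two hypothetical samples: one tightly clustered, one widely spread.
tight = [9, 10, 10, 11, 11, 12, 12, 13]
spread = [1, 4, 8, 10, 12, 15, 19, 25]

print("IQR of tight sample: ", iqr(tight))
print("IQR of spread sample:", iqr(spread))
```

The more dispersed sample produces the larger interquartile range.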
4 / 25
Measures of Linear Relationship
• Covariance
• Coefficient of correlation
5 / 25
Covariance
• Examples:
6 / 25
Covariance
• Population covariance:
$\sigma_{xy} = \dfrac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$
• Sample covariance:
$s_{xy} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
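The two formulas differ only in the divisor. A minimal Python sketch, using hypothetical paired data and illustrative function names, shows both side by side:

```python
def population_covariance(xs, ys):
    """Sum of cross-deviations from the means, divided by N."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def sample_covariance(xs, ys):
    """Same sum of cross-deviations, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical paired observations, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(population_covariance(xs, ys))  # 1.2
print(sample_covariance(xs, ys))      # 1.5
```

Here the cross-deviations sum to 6, so dividing by N = 5 gives 1.2 and dividing by n − 1 = 4 gives 1.5.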
7 / 25
Covariance
8 / 25
Correlation Coefficient
$r = \dfrac{s_{xy}}{s_x \, s_y}$
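The formula above can be sketched in Python as the sample covariance divided by the product of the two sample standard deviations. The data and function names are hypothetical; this is an illustration of the formula, not a reference implementation.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance s_xy with divisor n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """r = s_xy / (s_x * s_y), where s_x is the sample standard deviation."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # covariance of x with itself is its variance
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)

# Hypothetical paired observations, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(round(correlation(xs, ys), 4))

# An exact increasing linear relationship should give r very close to 1.
print(round(correlation(xs, [2 * x + 1 for x in xs]), 4))
```

Unlike the covariance, r has no units and always lies between −1 and 1, which makes it comparable across data sets.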
9 / 25
Correlation Coefficient
10 / 25
Using Excel
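• Covariance
=covar(range of X, range of Y)
(COVAR divides by N, so it returns the population covariance; COVARIANCE.S returns the sample covariance.)
• Correlation
=correl(range of X, range of Y)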
11 / 25
Parameters and Statistics
                     Population   Sample
Size                 N            n
Mean                 µ            x̄
Variance             σ²           s²
Standard deviation   σ            s
Covariance           σxy          sxy
Correlation          ρ            r
12 / 25
Summary of Data
13 / 25
Bar Chart
14 / 25
Pie Chart
15 / 25
Line Chart
16 / 25
Scatter Plots
• Shows the relationship between two numerical variables.
• Example: income vs. food consumption
17 / 25
Scatter plots
18 / 25
Box Plot
• Shows the positions of the quartiles, the outliers, and the largest and smallest values that are not outliers.
• Example: food consumption expenditure (US 1941 Family Budget Survey data).
• Whiskers: each stretches from the box to the furthest data point within 1.5 × IQR of the box, so its maximum length is 1.5 × IQR.
• Points further out from the box are marked with circles and are called outliers.
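The whisker and outlier rule described above can be sketched in Python. The data, the function name `box_plot_summary`, and the quartile convention (`method="inclusive"`) are all assumptions for illustration; plotting software may use a different quartile rule and so place the fences slightly differently.

```python
import statistics

def box_plot_summary(data):
    """Return the quartiles, whisker ends, and outliers under the 1.5 * IQR rule."""
    values = sorted(data)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1

    # Whiskers may reach at most 1.5 * IQR beyond the box on each side.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr

    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # furthest point within the lower fence
        "whisker_high": inside[-1],  # furthest point within the upper fence
        "outliers": outliers,
    }

# Hypothetical sample with one unusually large value.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 40]
summary = box_plot_summary(data)
print(summary)
```

In this sample the value 40 lies beyond the upper fence, so the upper whisker stops at 15 and 40 is reported as an outlier.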
19 / 25
Histogram
20 / 25
Example
Household incomes in a hypothetical Sydney suburb.
21 / 25
Other Characteristics of a “Distribution”
22 / 25
Kurtosis (K): a measure of the weight in the tails
• Mesokurtic: K = 3 (“Normal distribution”)
• Platykurtic: K < 3
• Leptokurtic: K > 3
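The slides do not state a formula for K. The sketch below assumes the common moment-based definition, K = m4 / m2², where m2 and m4 are the second and fourth central moments computed with divisor n; under that definition a normal distribution has K = 3. The data, the function names, and the tolerance parameter `tol` are hypothetical.

```python
def kurtosis(data):
    """Moment-based kurtosis m4 / m2**2 (assumed definition; equals 3 for a normal distribution)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2

def classify(k, tol=1e-9):
    """Label a kurtosis value relative to the benchmark of 3."""
    if abs(k - 3) <= tol:
        return "mesokurtic"
    return "platykurtic" if k < 3 else "leptokurtic"

# Hypothetical samples: evenly spread values versus a sample with heavy tails.
flat = [1, 2, 3, 4, 5]
heavy_tailed = [-10, -1, 0, 0, 0, 0, 0, 0, 1, 10]

print(kurtosis(flat), classify(kurtosis(flat)))
print(kurtosis(heavy_tailed), classify(kurtosis(heavy_tailed)))
```

The evenly spread sample has little weight in its tails and falls below 3, while the sample with two extreme values rises above 3.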
23 / 25
Skewness and Kurtosis
25 / 25