
SEHH107

COMPUTATIONAL TOOLS FOR STATISTICS
DESCRIPTIVE STATISTICS - NUMERICAL MEASURE
OUTLINE

• Measure of Central Tendency

• Measure of Variation

• Measure of Position
DIFFERENT TYPES OF MEASUREMENT

Quantitative Data

• Graph
– Line Chart
– Scatter Plot

• Measure
– Measure of Central Tendency: Mean, Weighted Mean, Mode, Median
– Measure of Variation: Range, Variance, Standard Deviation
– Measure of Position: Percentile, Quartile
MEASURE OF CENTRAL TENDENCY

• Mean: the average value of the observations

(Figure caption: "This 6 feet tall guy got drowned in a 5 feet deep swimming pool")

• Why Mean?
– Widely used in daily context
– Can be performed by someone without statistical background

• Why Not Mean?


– Affected by outliers

Population: $\mu = \dfrac{x_1 + x_2 + \cdots + x_N}{N}$
Sample: $\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n}$
MEASURE OF CENTRAL TENDENCY
• Mode: value(s) with the highest frequency

• Why Mode?
– Can be obtained by direct observation

• Why Not Mode?


– May not be unique

Special Case                                                  Mode
All observations occur once                                   No mode
All observations have the same frequency (more than once)     All observations are modes
Two or more values share the highest frequency                More than 1 mode
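The special cases above can be sketched in Python. This is only an illustration of the slide's conventions: the function name `modes` is hypothetical, and the data are assumed to be non-empty.

```python
from collections import Counter

def modes(values):
    """Return the modes of a non-empty dataset, following the slide's special cases."""
    counts = Counter(values)            # frequency of each distinct value
    highest = max(counts.values())      # the highest frequency in the data
    if highest == 1:
        return []                       # all observations occur once: no mode
    # every value that reaches the highest frequency is a mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 2, 3, 4]))      # all observations occur once
print(modes([1, 1, 2, 2]))      # all observations have the same frequency
print(modes([1, 1, 2, 2, 3]))   # two values share the highest frequency
print(modes([1, 2, 2, 3]))      # a single mode
```

Returning a list rather than a single number keeps the "may not be unique" problem visible to whoever uses the result.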
MEASURE OF CENTRAL TENDENCY

• Median: the value that separates the upper and lower halves of a dataset

• Why Median?
– Not affected by outliers

• Why not Median?


– Does not use all the observations in calculation

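Position of the median $= \dfrac{n+1}{2}$ (in the data sorted in ascending order)
Median = the value at that position; when the position is not a whole number, take the average of the two neighbouring values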
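The position rule can be tried out with a short Python sketch. The function name `median_by_position` is made up for this illustration; positions are counted from 1, as on the slide, and the data are assumed to be non-empty.

```python
def median_by_position(values):
    """Median found through the (n + 1) / 2 position rule."""
    data = sorted(values)               # the rule needs sorted data
    n = len(data)
    position = (n + 1) / 2              # 1-based position of the median
    if position.is_integer():
        return data[int(position) - 1]  # odd n: a single middle value
    lower = data[int(position) - 1]     # value just below the position
    upper = data[int(position)]         # value just above the position
    return (lower + upper) / 2          # even n: average the two neighbours

print(median_by_position([5, 1, 3]))       # n = 3, position 2
print(median_by_position([4, 1, 3, 2]))    # n = 4, position 2.5
```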
MEASURE OF CENTRAL TENDENCY

• The following data show the number of bowls of minced pork rice sold by Ching Ching shop on
20 selected days

350 345 348 390 392 387 367 298 333 312
324 389 303 402 1001 945 322 336 334 356

• Find the mean, median and mode of the number of bowls of minced pork rice sold
– Mean: 411.7
– Median: 349
– Mode: No mode
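These answers can be checked with Python's standard `statistics` module. The variable names below are illustrative only; "no mode" is detected by checking that no value occurs more than once, which follows the slide's convention rather than the library's.

```python
import statistics
from collections import Counter

bowls = [350, 345, 348, 390, 392, 387, 367, 298, 333, 312,
         324, 389, 303, 402, 1001, 945, 322, 336, 334, 356]

mean = statistics.mean(bowls)        # sum of the observations divided by n
median = statistics.median(bowls)    # average of the 10th and 11th sorted values

# Slide convention: if every observation occurs once, there is no mode
has_mode = max(Counter(bowls).values()) > 1

print("Mean:", mean)
print("Median:", median)
print("Has a mode:", has_mode)
```

The mean sits well above the median here because the two very large observations (945 and 1001) pull it upwards, while the median is unaffected.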
MEASURE OF CENTRAL TENDENCY

• Weighted average: the average of values that are scaled by their weightings

Population: $\dfrac{x_1 w_1 + x_2 w_2 + \cdots + x_N w_N}{w_1 + w_2 + \cdots + w_N}$
Sample: $\dfrac{x_1 w_1 + x_2 w_2 + \cdots + x_n w_n}{w_1 + w_2 + \cdots + w_n}$

– The work schedule of Elsa (part-time staff in Ching Ching shop) differs every week due to
her school work. In the past semester, she worked:
• 14 hours a week for 3 weeks
• 18 hours a week for 4 weeks
• 20 hours a week for 2 weeks
• 25 hours a week for 4 weeks
– What is the average number of working hours for Elsa per week?
$\dfrac{14 \times 3 + 18 \times 4 + 20 \times 2 + 25 \times 4}{3 + 4 + 2 + 4} = \dfrac{254}{13} \approx 19.54$ hours
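Elsa's example can be reproduced in a few lines of Python. This is a minimal sketch: the list names `hours` and `weeks` are chosen for this illustration, with the number of weeks acting as the weights.

```python
hours = [14, 18, 20, 25]   # x_i: hours worked per week
weeks = [3, 4, 2, 4]       # w_i: number of weeks with that schedule

# Weighted mean = sum of (value * weight) divided by sum of weights
weighted_total = sum(h * w for h, w in zip(hours, weeks))
weighted_mean = weighted_total / sum(weeks)

print(round(weighted_mean, 2))   # 19.54
```

A plain average of the four hour values would ignore how many weeks each schedule lasted, which is why the weights are needed.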
MEASURE OF VARIATION

• Range: largest observation − smallest observation

• Find the range for the example on p.7
– Range = 1001 − 298 = 703

• What is the problem with range? It is affected by outliers
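A short Python sketch shows how strongly the range reacts to outliers. The cut-off of 900 bowls used to set aside the two unusually large observations is an assumption made only for this illustration; it does not come from the slides.

```python
bowls = [350, 345, 348, 390, 392, 387, 367, 298, 333, 312,
         324, 389, 303, 402, 1001, 945, 322, 336, 334, 356]

# Range = largest observation - smallest observation
full_range = max(bowls) - min(bowls)

# Assumed cut-off (900) that excludes the two largest observations
typical_days = [b for b in bowls if b < 900]
trimmed_range = max(typical_days) - min(typical_days)

print(full_range)      # 703
print(trimmed_range)   # 104
```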


MEASURE OF VARIATION
• Variance: measurement of the spread between numbers in a dataset

Population ($\sigma^2$)
1st formula: $\sigma^2 = \dfrac{\sum (x_i - \mu)^2}{N}$
2nd formula: $\sigma^2 = \dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}$

Sample ($s^2$)
3rd formula: $s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1}$
4th formula: $s^2 = \dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}$

• What are the differences between the formulas?
– Difference 1
• 1st and 2nd formula
– For population
• 3rd and 4th formula
– For sample
– Difference 2
• 1st and 3rd formula
– Defining formula
• 2nd and 4th formula
– Computational formula
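The defining and computational formulas should give the same answer. The Python sketch below checks this on the Ching Ching shop data. The variable names are illustrative, and `statistics.pvariance` / `statistics.variance` from the standard library are used only as a cross-check.

```python
import math
import statistics

bowls = [350, 345, 348, 390, 392, 387, 367, 298, 333, 312,
         324, 389, 303, 402, 1001, 945, 322, 336, 334, 356]
n = len(bowls)
mean = sum(bowls) / n

# Numerator of the defining formulas: sum of squared deviations from the mean
ss_defining = sum((x - mean) ** 2 for x in bowls)

# Numerator of the computational formulas: sum of x^2 minus (sum of x)^2 / n
ss_computational = sum(x * x for x in bowls) - sum(bowls) ** 2 / n

pop_var = ss_defining / n              # treat the 20 days as a population
sample_var = ss_defining / (n - 1)     # treat the 20 days as a sample

print(math.isclose(ss_defining, ss_computational))           # same numerator
print(math.isclose(pop_var, statistics.pvariance(bowls)))    # population check
print(math.isclose(sample_var, statistics.variance(bowls)))  # sample check
print(round(pop_var, 2), round(sample_var, 2))
```

Only the divisor differs between the population and sample versions: N for a population, n − 1 for a sample.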
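MEASURE OF VARIATION

Standard deviation

Population ($\sigma$): $\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2}{N}} = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}}$
Sample ($s$): $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}}$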

• Why do we need the standard deviation? The variance is measured in squared units; taking the
square root brings the measure back to the original unit of the data

• If another shop selling minced pork rice has a mean of 400 bowls and a standard
deviation of 15 bowls, which shop has the relatively more stable customer flow?
– It is better to use the coefficient of variation (CV) to compare the two shops' stability
– $CV = \dfrac{\text{standard deviation}}{\text{mean}} \times 100\%$
– CV for Ching Ching Shop $= \dfrac{194.605972}{411.7} \times 100\% = 47.27\%$
– CV for the other shop $= \dfrac{15}{400} \times 100\% = 3.75\%$

The other shop has the more stable flow because its CV is smaller than that of Ching Ching Shop
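The comparison can be reproduced in Python. This sketch assumes the 20 days are treated as a sample, so `statistics.stdev` (divisor n − 1) is used. The helper name `coefficient_of_variation` is made up for this illustration.

```python
import statistics

bowls = [350, 345, 348, 390, 392, 387, 367, 298, 333, 312,
         324, 389, 303, 402, 1001, 945, 322, 336, 334, 356]

def coefficient_of_variation(sd, mean):
    """CV in percent: standard deviation relative to the mean."""
    return sd / mean * 100

s = statistics.stdev(bowls)   # sample standard deviation, about 194.606
cv_ching_ching = coefficient_of_variation(s, statistics.mean(bowls))
cv_other_shop = coefficient_of_variation(15, 400)

print(round(cv_ching_ching, 2), round(cv_other_shop, 2))
print("More stable:", "other shop" if cv_other_shop < cv_ching_ching else "Ching Ching Shop")
```

Because the CV is unit-free, it also allows comparison between datasets measured on different scales.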
MEASURE OF POSITION
• Percentile: the pth percentile is the value such that p% of the data have values equal to or below it

• Calculate the position: $i = \dfrac{p}{100} \times n$ (n = number of observations, data sorted in ascending order)
– If i is an integer: take the average of the values at position i and position (i+1)
– If i is not an integer: round i up to the next integer and take the value at that position

• Find the 67th percentile and 80th percentile for the problem on p.7
– Position of 67th percentile $= \dfrac{67}{100} \times 20 = 13.4$, rounded up to 14
– 67th percentile = 387
– Position of 80th percentile $= \dfrac{80}{100} \times 20 = 16$
– 80th percentile $= \dfrac{390 + 392}{2} = 391$
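The position rule can be written as a small Python function. This is a sketch under stated assumptions: the name `percentile` is chosen for this illustration, p is a whole number with 0 < p < 100, and positions are counted from 1 in the sorted data. Library routines often use other interpolation conventions, so their results may differ from this rule.

```python
def percentile(values, p):
    """pth percentile using the position rule i = (p / 100) * n."""
    data = sorted(values)
    n = len(data)
    scaled = p * n                    # equals 100 * i; keeps the arithmetic in integers
    if scaled % 100 == 0:             # i is an integer
        i = scaled // 100
        return (data[i - 1] + data[i]) / 2   # average of positions i and i + 1
    i = scaled // 100 + 1             # i is not an integer: round up
    return data[i - 1]                # value at that position

bowls = [350, 345, 348, 390, 392, 387, 367, 298, 333, 312,
         324, 389, 303, 402, 1001, 945, 322, 336, 334, 356]

print(percentile(bowls, 67))   # position 13.4 -> 14th value
print(percentile(bowls, 80))   # position 16 -> average of 16th and 17th values
```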
MEASURE OF POSITION
• Quartile: the 25th, 50th and 75th percentiles, which divide the whole dataset into 4 equal parts
• For the Ching Ching Shop example
– Position of 25th percentile (Q1) $= \dfrac{25}{100} \times 20 = 5$, so $Q1 = \dfrac{324 + 333}{2} = 328.5$
– Position of 50th percentile (Q2) $= \dfrac{50}{100} \times 20 = 10$, so $Q2 = \dfrac{348 + 350}{2} = 349$
– Position of 75th percentile (Q3) $= \dfrac{75}{100} \times 20 = 15$, so $Q3 = \dfrac{389 + 390}{2} = 389.5$

• Interquartile range (IQR) = Q3 − Q1 (a measure of variation)


• For the Ching Ching Shop example
– IQR = 389.5-328.5=61

25% of the data | Q1 | 25% of the data | Q2 | 25% of the data | Q3 | 25% of the data
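The quartiles and the IQR for the Ching Ching Shop example can be checked with the same position rule used for percentiles. This is a minimal sketch: the helper name `percentile` is hypothetical and assumes a whole-number p with 0 < p < 100. Built-in quantile functions in statistics libraries may use a different convention and return slightly different quartiles.

```python
def percentile(values, p):
    """pth percentile using the position rule i = (p / 100) * n (1-based positions)."""
    data = sorted(values)
    scaled = p * len(data)            # equals 100 * i
    if scaled % 100 == 0:             # integer position: average positions i and i + 1
        i = scaled // 100
        return (data[i - 1] + data[i]) / 2
    return data[scaled // 100]        # non-integer position: round up, then take that value

bowls = [350, 345, 348, 390, 392, 387, 367, 298, 333, 312,
         324, 389, 303, 402, 1001, 945, 322, 336, 334, 356]

q1, q2, q3 = (percentile(bowls, p) for p in (25, 50, 75))
iqr = q3 - q1                         # spread of the middle 50% of the data

print(q1, q2, q3, iqr)
```

Unlike the range (703), the IQR is not inflated by the two very large observations, because it only looks at the middle half of the data.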
ADDITIONAL NOTES

• What is the five-number summary? Minimum, Q1, Q2, Q3, Maximum

• Skewness of the distribution

(Figure: three distribution shapes: symmetric, left-skewed and right-skewed)
