LearnStat

Session on

Learning Statistics the Easy Way

MEASURES OF CENTRAL TENDENCY, DISPERSION AND SKEWNESS

BUREAU OF LABOR AND EMPLOYMENT STATISTICS


OBJECTIVES
At the end of the session, the participants should be able to:

1. Describe data using the common measures of central tendency;
2. Describe data in terms of their variability and skewness; and
3. Determine the most applicable measure of central tendency given different types of distribution.
2011 LearnStat Sessions

OUTLINE
1. Measures of Central Tendency
   • Mean
   • Median
   • Mode
2. Measures of Dispersion
3. Skewness
4. Types of Distribution


Measures of Central Tendency

A. MEAN
- commonly referred to as the average or arithmetic mean.
- the most widely used measure of central location.

$$\bar{X} = \frac{\text{Sum of all values in the data set}}{\text{Total number of observations}}$$


Example of mean computation

Ages of 13 Job Applicants

Applicant Number    Age
1                   30
2                   28
3                   25
4                   35
5                   25
6                   34
7                   20
8                   19
9                   26
10                  18
11                  17
12                  16
13                  25
Total               318

Mean Age:

$$\bar{X} = \frac{30 + 28 + \dots + 25}{13} = \frac{318}{13} \approx 24.5$$
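The same computation can be sketched in Python. This is only an illustration of the formula for the mean; the function name `mean` and the variable names are chosen here for the example and do not come from the session materials.

```python
# Ages of the 13 job applicants, listed in applicant-number order.
ages = [30, 28, 25, 35, 25, 34, 20, 19, 26, 18, 17, 16, 25]


def mean(values):
    """Arithmetic mean: sum of all values divided by the number of observations."""
    return sum(values) / len(values)


total_age = sum(ages)
mean_age = mean(ages)

print(total_age)           # 318
print(round(mean_age, 1))  # 24.5
```

The unrounded result is about 24.46, which becomes 24.5 when rounded to one decimal place.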

B. MEDIAN
- the value of the middle item in a set of observations that has been arranged in ascending or descending order of magnitude.
- the centermost value in a distribution.


Example of finding the median (number of observations is odd)

Ages of 13 Job Applicants

Applicant Number    Age
12                  16
11                  17
10                  18
8                   19
7                   20
13                  25
5                   25   (middle value, 7th of 13)
3                   25
9                   26
2                   28
1                   30
6                   34
4                   35

The median is the middlemost value in the ordered data set.

Median age = 25

Example of finding the median (number of observations is even)

Ages of 14 Job Applicants

Applicant Number    Age
12                  16
11                  17
10                  18
8                   19
7                   20
13                  25
5                   25   (7th of 14)
3                   26   (8th of 14)
9                   26
2                   28
1                   30
6                   34
4                   35
14                  35

The median is the sum of the two middlemost values in the ordered data set divided by 2.

$$\text{Median age} = \frac{25 + 26}{2} = 25.5$$
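Both median cases can be sketched with one short Python function. This is an illustrative sketch, not part of the session materials; the function name `median` is chosen for the example, and the 14 ages are taken as they read on the slide for the even case.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                     # odd n: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: mean of the two middle values


ages_13 = [16, 17, 18, 19, 20, 25, 25, 25, 26, 28, 30, 34, 35]
ages_14 = [16, 17, 18, 19, 20, 25, 25, 26, 26, 28, 30, 34, 35, 35]

median_odd = median(ages_13)
median_even = median(ages_14)

print(median_odd)   # 25
print(median_even)  # 25.5
```

Sorting inside the function means the data do not have to be arranged beforehand.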

C. MODE
- the value in the data set that occurs most frequently.

Example of finding the mode

Ages of 13 Job Applicants

Applicant Number    Age
12                  16
11                  17
10                  18
8                   19
7                   20
13                  25
5                   25
3                   25
9                   26
2                   28
1                   30
6                   34
4                   35

Mode = 25, the value that occurs most frequently.
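Finding the mode amounts to counting how often each value occurs. The sketch below uses Python's standard `collections.Counter`; the function name `modes` is chosen for this example, and returning every tied value is an assumption of the sketch, since the slides do not discuss data sets with more than one mode.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency (sorted)."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


ages = [16, 17, 18, 19, 20, 25, 25, 25, 26, 28, 30, 34, 35]
age_modes = modes(ages)

print(age_modes)  # [25]
```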

Advantages of the MEAN:
• takes into account all observations.
• can be used for further statistical calculations and mathematical manipulations.

Disadvantages of the MEAN:
• easily affected by extreme values.
• cannot be computed if there are missing values due to omission or non-response.
• cannot be computed for grouped data with open-ended class intervals.

Advantages of the MEDIAN:
• not affected by extreme values.
• can be computed even for grouped data with open-ended class intervals.

Disadvantages of the MEDIAN:
• observations from different data sets have to be merged to obtain a new median, whether grouped or ungrouped data are involved.

Advantage of the MODE:
• can be easily identified through ocular inspection.

Disadvantages of the MODE:
• does not possess the desired algebraic property of the mean that allows further manipulations.
• like the median, observations from different data sets have to be merged to obtain a new mode, whether grouped or ungrouped data are involved.

MEASURES OF DISPERSION

Let us take 5 sets of observations:

Set 1:   45   45   47   48   50
Set 2:   45   46   46   48   50
Set 3:   44   45   46   49   51
Set 4:   41   43   48   48   55
Set 5:   44   45   48   49   49

Mean of each set: $\bar{X} = 47$

Questions remain unanswered even after getting the mean:
• How variable are the data sets?
• How do the values in each data set differ from each other?
• How are the values in each data set clustered or dispersed from each other?
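A short Python sketch makes the point concrete: the five sets share the same mean, yet their spread differs. The range (largest value minus smallest value) is used here only as a simple first indicator, and the helper name `describe` is chosen for this example. The assignment of values to sets assumes each set holds five observations whose mean is 47.

```python
# The five sets of observations, one list per set.
data_sets = {
    "Set 1": [45, 45, 47, 48, 50],
    "Set 2": [45, 46, 46, 48, 50],
    "Set 3": [44, 45, 46, 49, 51],
    "Set 4": [41, 43, 48, 48, 55],
    "Set 5": [44, 45, 48, 49, 49],
}


def describe(values):
    """Return the mean and the range (max minus min) of a data set."""
    return {
        "mean": sum(values) / len(values),
        "range": max(values) - min(values),
    }


summaries = {name: describe(values) for name, values in data_sets.items()}

for name, summary in summaries.items():
    print(f"{name}: mean = {summary['mean']:.1f}, range = {summary['range']}")
```

Every set reports a mean of 47.0, while the range runs from 5 (Sets 1, 2 and 5) up to 14 (Set 4), so the mean alone does not describe how the values are spread.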

Measures of Dispersion
- a group of analytical tools that describe the spread or variability of a data set.

Importance of the measures of dispersion:
• supplement an average or a measure of central tendency
• compare one group of data with another
• indicate how representative the average is

A measure of dispersion can be expressed in several ways:

• Based on the position of an observation in a distribution:
  - Range
  - Quartile Deviation
• Measures the dispersion around an average:
  - Mean Absolute Deviation
  - Variance / Standard Deviation
• Expressed in a relative value:
  - Coefficient of Variation
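The slides name these measures without giving their formulas, so the sketch below relies on common textbook definitions, all of which are assumptions of this example: quartiles from Python's `statistics.quantiles` with the "inclusive" method, the quartile deviation as half the interquartile range, the population form of the variance (dividing by n), and the coefficient of variation as the standard deviation expressed as a percentage of the mean. The data are the values 41, 43, 48, 48, 55 (Set 4 from the five sets of observations).

```python
import math
import statistics

values = [41, 43, 48, 48, 55]
count = len(values)
mean = sum(values) / count

# Position-based measures
value_range = max(values) - min(values)
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
quartile_deviation = (q3 - q1) / 2

# Measures of dispersion around the average
mean_abs_dev = sum(abs(x - mean) for x in values) / count
variance = sum((x - mean) ** 2 for x in values) / count  # population variance (assumption)
std_dev = math.sqrt(variance)

# Relative measure, in percent of the mean
coeff_var = std_dev / mean * 100

print(value_range)          # 14
print(quartile_deviation)   # 2.5
print(mean_abs_dev)         # 4.0
print(round(variance, 1))   # 23.6
print(round(std_dev, 2))    # 4.86
print(round(coeff_var, 2))  # 10.34
```

Other quartile conventions, or the sample variance (dividing by n - 1), would give slightly different figures.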

SKEWNESS
• describes the degree to which the data deviate from symmetry.
• when the distribution of the data is not symmetrical, it is said to be asymmetrical or skewed.

Types of Distribution (in Relation to Mean, Median and Mode)

Symmetrical/Normal Distribution
• Bell-shaped distribution
• The mean, median and mode are all located at one point.

Mean = Median = Mode

Positively Skewed Distribution
• Observations are mostly concentrated towards the smaller values and there are some extremely high values.
• Also called skewed to the right distribution.

[Figure: frequency curve with Income on the horizontal axis and No. of observations on the vertical axis; Mode, Median and Mean marked from left to right]

Mode < Median < Mean

Negatively Skewed Distribution
• Observations are mostly concentrated towards the larger values and there are some extremely low values.
• Also called skewed to the left distribution.

[Figure: frequency curve with Age of BLES staff on the horizontal axis and No. of observations on the vertical axis; Mean, Median and Mode marked from left to right]

Mean < Median < Mode
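The orderings of mean, median and mode suggest a quick check of the direction of skewness by comparing the mean with the median. The sketch below is only a rule-of-thumb illustration of those orderings, not a formal measure of skewness; the function name `skew_direction` and all three data sets are hypothetical and were made up for this example.

```python
import statistics


def skew_direction(values):
    """Compare mean and median to suggest the direction of skewness (rule of thumb)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positively skewed (skewed to the right)"
    if mean < median:
        return "negatively skewed (skewed to the left)"
    return "symmetrical"


# Hypothetical data: a few extremely high incomes pull the mean above the median.
incomes = [10, 12, 12, 13, 15, 18, 60]
# Hypothetical data: one very young staff member pulls the mean below the median.
staff_ages = [22, 48, 50, 52, 53, 55, 55]
# Hypothetical data: evenly spread values.
balanced = [1, 2, 3, 4, 5]

print(skew_direction(incomes))     # positively skewed (skewed to the right)
print(skew_direction(staff_ages))  # negatively skewed (skewed to the left)
print(skew_direction(balanced))    # symmetrical
```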

Considerations to be made when using the three most common measures of central tendency:

Distribution | Level of Measurement | Measure to Use | Other Considerations
Normal       | Interval or Ratio    | Mean           | When further statistical calculations or mathematical manipulations are needed; when all observations are considered in the computation
Skewed       | Ordinal              | Median         | When distribution has open-ended intervals
Skewed       | Nominal              | Mode           | When interested in the most frequently occurring observation

Special Topic on Rounding Off

Rules for Rounding off Numbers:
• If the first digit to be dropped is less than 5, round down.
• If the first digit to be dropped is greater than or equal to 5, round up.

Examples:
• Round off 185.5 into a whole number: 186
• Round off 185.468 into a whole number: 185
• Round off 184.51 into a whole number: 185
• Round off 2.0547 into one decimal place: 2.1
• Round off 2.073 into two decimal places: 2.07
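The rounding rules above can be sketched in Python. Note that Python's built-in `round` sends exact halves to the nearest even number (for example, `round(184.5)` gives 184), which differs from the rule stated here, so this sketch uses the standard `decimal` module with `ROUND_HALF_UP` instead. The function name `round_half_up` is chosen for this example, and numbers are passed as strings so that they are read exactly as written.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, places=0):
    """Round so that a first dropped digit of 5 or more rounds up.

    `value` is given as a string to avoid binary floating-point artifacts.
    """
    quantum = Decimal(10) ** -places  # places=2 -> Decimal('0.01'), places=0 -> Decimal('1')
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


print(round_half_up("185.5"))      # 186
print(round_half_up("185.468"))    # 185
print(round_half_up("184.51"))     # 185
print(round_half_up("2.0547", 1))  # 2.1
print(round_half_up("2.073", 2))   # 2.07
```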

More Examples:

1. Manual Computation

• 2010 labor productivity (at constant 2000 prices) = GDP / Employed

$$\frac{5{,}701{,}539\text{M}}{36.035\text{M}^{*}} = 158{,}222.26 \approx 158{,}222$$

• Region VI employment growth rate (2009-2010):

$$\text{Growth Rate} = \left(\frac{2{,}974^{*}}{2{,}883^{*}} - 1\right) \times 100 = (1.03156 - 1) \times 100 = 0.03156 \times 100 = 3.156\% \approx 3.2\%$$

*In LFS, figures are expressed in thousands.
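The two manual computations can be checked with a few lines of Python. The variable names are chosen for this sketch. GDP is entered in millions and employment in thousands, as the figures are reported, so the ratio is multiplied by 1,000 to put both on the same scale. Python's built-in `round` is adequate here because neither result falls exactly on a half.

```python
# 2010 labor productivity at constant 2000 prices
gdp_million = 5_701_539        # GDP, in millions
employed_thousand = 36_035     # employed persons, in thousands (36.035 million)

labor_productivity = gdp_million / employed_thousand * 1_000
print(round(labor_productivity, 2))  # 158222.26
print(round(labor_productivity))     # 158222

# Region VI employment growth rate, 2009-2010
employed_2010 = 2_974          # in thousands
employed_2009 = 2_883          # in thousands

growth_rate = (employed_2010 / employed_2009 - 1) * 100
print(round(growth_rate, 3))   # 3.156
print(round(growth_rate, 1))   # 3.2
```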

2. Electronic Computation

In Microsoft Excel, you can use the following syntax:

=round(value to be rounded off, number of decimal places to be retained)

The value to be rounded off can be a single number or a formula that yields a single number.

Examples:
• Round off 275.689 into two decimal places: =round(275.689, 2) = 275.69
• 2010 labor productivity at constant 2000 prices: =round((5701539/36035)*1000, 0) = 158,222

Labor Productivity Worksheet

Growth Rate Worksheet