
Measures of central tendency & dispersion
Session 7
Descriptive Analysis
• Provides simple summaries about the sample and the measures.

• Helps to determine the normality of the distribution.

• Comprises:
• Measures of central tendency
• the extent to which the data values group around the central value.

• Measures of dispersion
• the amount of dispersion or scatter of values away from a central value

• Measures of Divergence from Normality


• the pattern of the distribution of values from the lowest value to the highest value.
Measures of central tendency
• Arithmetic Mean

• Weighted Mean

• Geometric Mean

• Median

• Mode
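Measures of central tendency
• Arithmetic Mean - the sum of all the values divided by the number of values

x̄ = Σxᵢ / n

where xᵢ = the i-th observation and n = the number of observations

• Weighted Mean - takes into account the importance of each value to
the overall total by assigning weights

x̄w = Σ(wᵢ xᵢ) / Σwᵢ

where wᵢ = the weight assigned to the observation xᵢ

• Geometric Mean – used to measure the rate of change of a variable over time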
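The three means can be sketched in Python as below. This is a minimal illustration, not part of the original slides: the function names, the scores, the weights, and the growth factors are all made up for the example. The geometric mean is applied to growth factors (1 + rate of change), which is the usual way it is used for change over time.

```python
import math

def arithmetic_mean(values):
    # Sum of all values divided by the number of values.
    return sum(values) / len(values)

def weighted_mean(values, weights):
    # Each value contributes in proportion to its weight.
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def geometric_mean(factors):
    # n-th root of the product of n positive factors,
    # computed through logarithms to avoid overflow on long series.
    return math.exp(sum(math.log(f) for f in factors) / len(factors))

scores = [70, 80, 90]          # hypothetical marks
credits = [1, 1, 2]            # hypothetical weights
growth = [1.10, 1.20, 0.95]    # hypothetical yearly growth factors

print(arithmetic_mean(scores))         # 80.0
print(weighted_mean(scores, credits))  # 82.5
print(geometric_mean(growth))          # average growth factor per year
```

The weighted mean is higher than the arithmetic mean here because the largest score carries the largest weight.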
CODING AND FINDING MEAN
• Median - divides the data into two equal parts so that one half of the items
falls above it and the other half below it.

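• The median can be computed by using the following statistical formula:

Median = value of the ((n + 1) / 2)th item, with the n items arranged in ascending or descending order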

• Mode - most frequent, repeated or common value in the data.
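A minimal Python sketch of the median and the mode follows, assuming ungrouped data. The function names and the data are illustrative only. For an even number of items the sketch takes the average of the two middle items, and it returns every value that ties for the highest frequency, since a data set can have more than one mode.

```python
from collections import Counter

def median(values):
    # Sort first: the median is defined on ordered data.
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # single middle item
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of the two middle items

def modes(values):
    # All values that share the highest frequency.
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

data = [3, 7, 7, 2, 9, 4]      # hypothetical observations
print(median(data))            # 5.5
print(modes(data))             # [7]
```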


Measures of dispersion
• Range

• Standard deviation

• Variance

• Quartiles, Deciles, Percentiles, Interquartile range

• Quartile deviation, Coefficient of range, Coefficient of quartile deviation, Coefficient of variation

• Box-plots
• Range - The difference between the highest and lowest observed values

• Mathematically, Range = value of highest observation – value of lowest observation

• Heavily influenced by extreme values

• Ignores the nature of the variation among all the other observations.
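The sensitivity of the range to extreme values can be seen in a short Python sketch. The data and the function name are hypothetical; the second list simply adds one outlier to the first.

```python
def value_range(values):
    # Highest observation minus lowest observation.
    return max(values) - min(values)

typical = [48, 50, 51, 52, 49]     # hypothetical observations
with_outlier = typical + [120]     # same data plus one extreme value

print(value_range(typical))        # 4
print(value_range(with_outlier))   # 72
```

A single extreme value moves the range from 4 to 72, although the other five observations are unchanged.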
Fractiles
• In a frequency distribution, a given fraction or proportion of the data lie at or
below a fractile.

• The interfractile range is a measure of the spread between two fractiles in a
frequency distribution.

• Fractiles have special names depending on the number of equal parts into which
they divide the data.

• 10 equal parts = deciles

• 4 equal parts = quartiles

• 100 equal parts = percentiles
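As a rough illustration, Python's standard `statistics.quantiles` function returns the cut points that divide data into `n` equal parts, so dividing into `n` parts always gives `n - 1` fractiles. The data below (the integers 1 to 100) is made up for the example, and the function's default interpolation method is only one of several conventions in use.

```python
import statistics

data = list(range(1, 101))   # hypothetical data: 1, 2, ..., 100

quartiles = statistics.quantiles(data, n=4)      # 3 cut points
deciles = statistics.quantiles(data, n=10)       # 9 cut points
percentiles = statistics.quantiles(data, n=100)  # 99 cut points

print(len(quartiles), len(deciles), len(percentiles))  # 3 9 99

# Interfractile range between the 10th and 90th percentiles.
print(percentiles[89] - percentiles[9])
```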


Quartiles
• Divide the data into 4 equal parts

25% 25% 25% 25%

Q1 Q2 Q3

• The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are
larger.

• The second quartile, Q2, is the same as the median (50% of the observations are smaller and
50% are larger)

• The third quartile, Q3, is the value for which 75% of the observations are smaller and 25% are
larger.
Computation of Quartiles
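One common way to compute quartiles is sketched below, under an assumed convention: the k-th quartile sits at position k(n + 1)/4 in the ordered data, with linear interpolation when the position is not a whole number. Other textbooks and software use slightly different rules, so results can differ a little between sources. The function name and the data are illustrative.

```python
def quartile(values, k):
    """k-th quartile (k = 1, 2, 3), assuming position k * (n + 1) / 4."""
    ordered = sorted(values)
    n = len(ordered)
    pos = k * (n + 1) / 4          # 1-based position in the ordered data
    lower = int(pos)               # whole part of the position
    frac = pos - lower             # fractional part, used to interpolate
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]   # hypothetical observations

q1, q2, q3 = (quartile(data, k) for k in (1, 2, 3))
print(q1, q2, q3)    # 12.5 16.0 19.5
print(q3 - q1)       # interquartile range: 7.0
```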


Variance
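• Population Variance (σ²)
• Average of squared variation of values from the population mean.
• σ² = Σ(xᵢ − μ)² / N
• Where μ = population mean, N = population size, xᵢ = the i-th value

• Sample Variance (s²)
• Average (approximately) of squared variation of values from the sample mean.
• s² = Σ(xᵢ − x̄)² / (n − 1)
• Where x̄ = sample mean, n = sample size, xᵢ = the i-th value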
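The difference between the two variances is only the divisor: N for a population, n − 1 for a sample. A minimal Python sketch, with illustrative function names and made-up data:

```python
def population_variance(values):
    # Sum of squared deviations from the mean, divided by N.
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

def sample_variance(values):
    # Same sum of squared deviations, divided by n - 1.
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical observations, mean = 5

print(population_variance(data))   # 4.0
print(sample_variance(data))       # 32 / 7, about 4.57
```

The sample variance is always a little larger than the population variance computed from the same numbers, and the gap shrinks as n grows.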
Standard Deviation
• Most commonly used measure of variation.

• Shows variation around the mean.

• Square root of variance.

• Has the same units as the original data.

• The larger the standard deviation is, the more the observations deviate, on average, from
the mean.

• The smaller the standard deviation is, the less the observations deviate, on average, from
the mean.
Standard Deviation
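• Population Standard deviation (σ)
• A measure of the “average” scatter around the population mean
• σ = √( Σ(xᵢ − μ)² / N )
• Where μ = population mean, N = population size, xᵢ = the i-th value

• Sample Standard deviation (s)
• A measure of the “average” scatter around the sample mean
• s = √( Σ(xᵢ − x̄)² / (n − 1) )
• Where x̄ = sample mean, n = sample size, xᵢ = the i-th value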
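Python's standard library provides both versions: `statistics.pstdev` for a population and `statistics.stdev` for a sample. The sketch below uses made-up data and checks that each result is the square root of the corresponding variance.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical observations, mean = 5

sigma = statistics.pstdev(data)      # population standard deviation
s = statistics.stdev(data)           # sample standard deviation

print(sigma)   # 2.0
print(s)       # about 2.14

# Each standard deviation is the square root of the matching variance.
print(math.isclose(sigma, math.sqrt(statistics.pvariance(data))))  # True
print(math.isclose(s, math.sqrt(statistics.variance(data))))       # True
```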
Comparing Standard Deviations

[Figure: two data distributions, one labelled "Smaller standard deviation" and one labelled "Larger standard deviation"]


Coefficient of variation (CV)
• Measures relative variation.

• Shows variation relative to mean.

• Always in percentage (%).

• Can be used to compare the variability of two or more sets of data
measured in different units.
Coefficient of variation (CV)
• CV for population: CV = (σ / μ) × 100%
• CV for sample: CV = (s / x̄) × 100%
Comparing Coefficients of Variation
• Stock A:
• Average price last year = $50
• Standard deviation = $5
• CVA = (s / x̄) × 100% = ($5 / $50) × 100% = 10%
• Stock B:
• Average price last year = $100
• Standard deviation = $5
• CVB = (s / x̄) × 100% = ($5 / $100) × 100% = 5%
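The stock comparison can be reproduced with a few lines of Python. The function name is illustrative; the prices and standard deviations are the ones given for Stock A and Stock B.

```python
def coefficient_of_variation(std_dev, mean):
    # Standard deviation expressed as a percentage of the mean.
    return 100 * std_dev / mean

cv_a = coefficient_of_variation(5, 50)    # Stock A
cv_b = coefficient_of_variation(5, 100)   # Stock B

print(cv_a)   # 10.0
print(cv_b)   # 5.0
```

Both stocks have the same standard deviation, but Stock B varies less relative to its average price.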
Summary Characteristics
• The more the data are spread out, the greater the range, variance, and standard
deviation.

• The less the data are spread out, the smaller the range, variance, and standard
deviation.

• If the values are all the same (no variation), all these measures will be zero.

• None of these measures are ever negative.


BOX-PLOT CHART: (5 POINT SUMMARY)
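The five numbers behind a box-plot are the minimum, Q1, the median (Q2), Q3, and the maximum. A minimal Python sketch, assuming the default quartile method of the standard `statistics.quantiles` function (other tools may place Q1 and Q3 slightly differently); the function name and the data are made up for the example.

```python
import statistics

def five_number_summary(values):
    # Quartiles from the standard library, plus the two extremes.
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]   # hypothetical observations

print(five_number_summary(data))   # (11, 12.5, 16.0, 19.5, 22)
```

In the box-plot, the box runs from Q1 to Q3 with a line at the median, and the whiskers extend toward the minimum and maximum.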
Pitfalls in Numerical Descriptive Measures

• Data analysis is objective


• Should report the summary measures that best describe
and communicate the important aspects of the data set

• Data interpretation is subjective


• Should be done in a fair, neutral and clear manner
