
MEASURES OF DISPERSION

• Measures of dispersion describe the spread of the data. They include the range, interquartile range, standard deviation, and variance.

Range and Interquartile Range

The range is the difference between the largest and smallest observations. It is the simplest measure of variability. The interquartile range is the difference between the third and first quartiles (Q3 - Q1).
Machine 1    Machine 2
10.07        8.01
 5.85        7.99
 8.15        7.95
 9.52        8.03
 6.41        8.02
Mean: 8.0    Mean: 8.0
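The table shows two data sets with the same mean but very different spread. The sketch below computes the mean and the range (largest minus smallest) for each machine. The function names are illustrative, and the Machine 1 value 8.15 is an assumed reading chosen to be consistent with the stated mean of 8.0.

```python
# Values from the Machine 1 / Machine 2 table.
# The 8.15 entry for Machine 1 is an assumed reading (consistent with mean 8.0).
machine_1 = [10.07, 5.85, 8.15, 9.52, 6.41]
machine_2 = [8.01, 7.99, 7.95, 8.03, 8.02]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def value_range(values):
    """Range: largest observation minus smallest observation."""
    return max(values) - min(values)


for name, data in (("Machine 1", machine_1), ("Machine 2", machine_2)):
    print(f"{name}: mean = {mean(data):.2f}, range = {value_range(data):.2f}")
```

Under these assumptions both means print as 8.00, while the ranges are 4.22 and 0.08, which is why the mean alone does not describe the data.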


STANDARD DEVIATION

What is standard deviation?

- The standard deviation of a set of numerical data makes use of the amount by which each individual data value deviates from the mean.

If x₁, x₂, ..., xₙ is a population of n numbers with a mean of μ, then the standard deviation of the population is

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$$

If x₁, x₂, ..., xₙ is a sample of n numbers with a mean of x̄, then the standard deviation of the sample is

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
Standard deviation of the population

Population values: 8.01, 7.99, 7.95, 8.03, 8.02

$$\mu = \frac{8.01 + 7.99 + 7.95 + 8.03 + 8.02}{5} = \frac{40}{5} = 8$$

$$\begin{aligned}
\sum (x_i - \mu)^2 &= (8.01 - 8)^2 + (7.99 - 8)^2 + (7.95 - 8)^2 + (8.03 - 8)^2 + (8.02 - 8)^2 \\
&= (0.01)^2 + (-0.01)^2 + (-0.05)^2 + (0.03)^2 + (0.02)^2 \\
&= 0.0001 + 0.0001 + 0.0025 + 0.0009 + 0.0004 \\
&= 0.004
\end{aligned}$$

$$\sigma = \sqrt{\frac{0.004}{5}} \approx 0.03$$
Standard deviation of the sample

Sample values: 8.01, 7.99, 7.95, 8.03, 8.02

$$\bar{x} = \frac{8.01 + 7.99 + 7.95 + 8.03 + 8.02}{5} = \frac{40}{5} = 8$$

$$\begin{aligned}
\sum (x_i - \bar{x})^2 &= (8.01 - 8)^2 + (7.99 - 8)^2 + (7.95 - 8)^2 + (8.03 - 8)^2 + (8.02 - 8)^2 \\
&= (0.01)^2 + (-0.01)^2 + (-0.05)^2 + (0.03)^2 + (0.02)^2 \\
&= 0.0001 + 0.0001 + 0.0025 + 0.0009 + 0.0004 \\
&= 0.004
\end{aligned}$$

$$s = \sqrt{\frac{0.004}{n - 1}} = \sqrt{\frac{0.004}{5 - 1}} \approx 0.03$$
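The two worked calculations differ only in the divisor: n for a population and n - 1 for a sample. A minimal sketch that repeats both from the definitions, using the five values above and cross-checking against Python's standard `statistics` module (variable names are illustrative):

```python
import math
import statistics

data = [8.01, 7.99, 7.95, 8.03, 8.02]

n = len(data)
mean = sum(data) / n
# Sum of squared deviations from the mean (about 0.004 for this data).
sum_sq_dev = sum((x - mean) ** 2 for x in data)

sigma = math.sqrt(sum_sq_dev / n)    # population: divide by n
s = math.sqrt(sum_sq_dev / (n - 1))  # sample: divide by n - 1

print(f"sum of squared deviations = {sum_sq_dev:.4f}")
print(f"population sigma = {sigma:.4f}, sample s = {s:.4f}")

# Cross-check with the standard library.
print(f"pstdev = {statistics.pstdev(data):.4f}, stdev = {statistics.stdev(data):.4f}")
```

To four decimal places the population value is 0.0283 and the sample value is 0.0316; both round to the 0.03 shown in the worked examples.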
Variance

- A statistic known as the variance is also used as a measure of dispersion.

- The variance for a given set of data is the square of the standard deviation of the data.
Sample Variance Formula

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

x          x - x̄     (x - x̄)²
7.95       -0.05     0.0025
7.99       -0.01     0.0001
8.01        0.01     0.0001
8.02        0.02     0.0004
8.03        0.03     0.0009
Sum: 40              0.004

$$\bar{x} = \frac{40}{5} = 8$$

$$s^2 = \frac{0.004}{5 - 1} = \frac{0.004}{4} = 0.001$$
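As a quick check of the table, the sketch below computes the sample variance with the standard library and confirms that it equals the square of the sample standard deviation. Variable names are illustrative.

```python
import statistics

data = [7.95, 7.99, 8.01, 8.02, 8.03]

sample_variance = statistics.variance(data)  # divides by n - 1
sample_std = statistics.stdev(data)          # square root of the sample variance

print(f"s^2 = {sample_variance:.4f}")
print(f"s   = {sample_std:.4f}, s squared = {sample_std ** 2:.4f}")
```

`statistics.pvariance` is the population counterpart, which divides by n instead of n - 1.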
MEASURES OF RELATIVE POSITION

The number of standard deviations between the data value and the mean is known as the data value's z-score or standard score.

Z-Scores

The z-score for a given data value x is the number of standard deviations that x is above or below the mean of the data.

Always remember:
• If a z-score is equal to 0, the raw score is equal to the mean.
• A positive z-score indicates the raw score is higher than the mean. For example, if a z-score is equal to +1, it is 1 standard deviation above the mean.
• A negative z-score indicates the raw score is below the mean. For example, if a z-score is -2, it is 2 standard deviations below the mean.
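The following formulas show how to calculate the z-score for a data value x in a population and in a sample.

Population:

$$z_x = \frac{x - \mu}{\sigma}$$

where x is the raw score, μ is the mean of the population, and σ is the standard deviation of the population.

Sample:

$$z_x = \frac{x - \bar{x}}{s}$$

where x is the raw score, x̄ is the mean of the sample, and s is the standard deviation of the sample.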
EXAMPLE 1
Compare z-Scores

Raul has taken two tests in his chemistry class. He scored 72 on the first test, for which the mean of all scores was 65 and the standard deviation was 8. He received a 60 on a second test, for which the mean of all scores was 45 and the standard deviation was 12. In comparison to the other students, did Raul do better on the first test or the second test?

SOLUTION:

Given for the first test: x = 72, μ = 65, σ = 8

$$z_{72} = \frac{72 - 65}{8} = 0.875$$

Raul scored 0.875 standard deviations above the mean on the first test.

Given for the second test: x = 60, μ = 45, σ = 12

$$z_{60} = \frac{60 - 45}{12} = 1.25$$

Raul scored 1.25 standard deviations above the mean on the second test.

Because 1.25 > 0.875, Raul did better on the second test in comparison to the other students.
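The comparison in this example can be sketched in a few lines. The function name `z_score` is illustrative; the inputs are the scores, means, and standard deviations given in the problem.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


z_first = z_score(72, mean=65, std_dev=8)
z_second = z_score(60, mean=45, std_dev=12)

# The larger z-score marks the better result relative to the other students.
better_test = "second" if z_second > z_first else "first"

print(f"first test z = {z_first}, second test z = {z_second}")
print(f"Raul did relatively better on the {better_test} test")
```

Although the raw score on the first test is higher (72 versus 60), the z-scores put the second test ahead.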
PERCENTILE
Most standardized examinations provide scores
in terms of percentiles, which are defined as
follows:

pth Percentile
A value x is called the pth percentile of a data
set provided p% of the data values are less
than x.
The following formula can be used to find the percentile that corresponds to a particular data value in a set of data.

PERCENTILE FOR A GIVEN DATA VALUE

Given a set of data and a data value x,

$$\text{Percentile of score } x = \frac{\text{number of data values less than } x}{\text{total number of data values}} \cdot 100$$
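A minimal sketch of this percentile formula follows. The function name and the list of scores are hypothetical, made up only to illustrate the calculation; they do not come from the slides.

```python
def percentile_of(x, data):
    """Percentile of x: share of data values strictly less than x, times 100."""
    below = sum(1 for value in data if value < x)
    return 100 * below / len(data)


# Hypothetical test scores, for illustration only.
scores = [55, 60, 62, 68, 70, 72, 75, 80, 85, 90]

# Seven of the ten scores are less than 80.
print(percentile_of(80, scores))  # 70.0
```

The result is left unrounded here; reported percentiles are often rounded to the nearest whole number.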
QUARTILES
The keyword in "quartile" is "quarter": to divide into four.

Quartiles partition a data set into 4 approximately equal groups. The medians that occupy the demarcation lines are the quartiles (Q1, Q2, Q3).

Q1 is the "middle" value in the first half of the rank-ordered data set.

Q2 is the median value in the set.

Q3 is the "middle" value in the second half of the rank-ordered data set.
How to find the Quartiles using the Medians

A. Rank the data.
B. Find the median and label it Q2.
C. Find the median Q1 of the group of data values less than Q2.
D. Find the median Q3 of the group of data values greater than Q2.
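EXAMPLE

FIND THE QUARTILES USING THE MEDIAN

Data: 35 31 29 28 29 31 27 33 32 39 31

Sorted from lowest to highest:

27 28 29 29 31 31 31 32 33 35 39

The median of the 11 values is the 6th value, so Q2 = 31. Q1 and Q3 are the medians of the lower and upper groups on either side of Q2.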
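The median-based steps can be sketched as below for the 11 values in this example. One assumption is made: the lower and upper groups are split by position in the ranked list (the middle value is excluded when the count is odd), so a value tied with the median can still fall in either group. Other quartile conventions can give different Q1 and Q3. The function name is illustrative.

```python
import statistics


def quartiles_by_medians(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: halves are split by position in the ranked data, and the
    middle value is left out of both halves when the count is odd.
    """
    ranked = sorted(data)                       # Step A: rank the data
    n = len(ranked)
    half = n // 2
    q2 = statistics.median(ranked)              # Step B: median is Q2
    lower = ranked[:half]                       # values ranked below Q2
    upper = ranked[half + 1:] if n % 2 else ranked[half:]
    q1 = statistics.median(lower)               # Step C
    q3 = statistics.median(upper)               # Step D
    return q1, q2, q3


data = [35, 31, 29, 28, 29, 31, 27, 33, 32, 39, 31]
q1, q2, q3 = quartiles_by_medians(data)
print(q1, q2, q3)  # 29 31 33
```

Under this positional convention the lower group is 27 28 29 29 31 and the upper group is 31 32 33 35 39.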
Normal Distribution

A normal distribution is a continuous probability distribution that is symmetrical around its mean. Most of the observations cluster around the central peak, and the probabilities for values further away from the mean taper off equally in both directions.
Frequency Distributions

A frequency distribution is a representation that displays the number of observations within a given interval. ...

Frequency distributions are particularly useful for normal distributions, which show how the probabilities of observations are divided among standard deviations.
Histogram

A histogram is a bar graph-like representation of data that buckets a range of outcomes into columns along the x-axis.

The y-axis represents the count or percentage of occurrences in the data for each column and can be used to visualize data distributions.
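To make the bucketing concrete, here is a small sketch that builds a frequency distribution and prints it as a text histogram. It reuses the 11 values from the quartile example; the bin width of 5 and the function name are assumptions chosen for illustration.

```python
from collections import Counter

data = [35, 31, 29, 28, 29, 31, 27, 33, 32, 39, 31]
bin_width = 5  # assumed interval width


def frequency_distribution(values, width):
    """Count how many values fall in each interval [start, start + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


freq = frequency_distribution(data, bin_width)

# One row per interval; the bar length is the count for that interval.
for start, count in freq.items():
    print(f"{start}-{start + bin_width - 1}: {'#' * count} ({count})")
```

The bars are drawn horizontally for simplicity. A conventional histogram places the same intervals along the x-axis and the counts on the y-axis.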
What is a histogram and what are its uses?

A histogram allows you to see the frequency distribution of a data set. It offers an “at a glance” picture of a distribution pattern, charted in specific categories. Histograms are one of the most frequently used methods for charting historical data. ... It's a simple chart that employs a horizontal and a vertical axis.
"THE NORMAL
DISTRIBUTIONS
AND EMPIRICAL
RULE."
A NORMAL DISTRIBUTION FORMS A BELL-SHAPED CURVE THAT IS SYMMETRIC ABOUT A VERTICAL LINE THROUGH THE MEAN OF THE DATA. IT IS SOMETIMES CALLED THE BELL CURVE OR NORMAL CURVE.
PROPERTIES OF A NORMAL DISTRIBUTION

🐶 THE GRAPH IS SYMMETRIC ABOUT A VERTICAL LINE THROUGH THE MEAN OF THE DISTRIBUTION.
🐶 THE MEAN, MEDIAN, AND MODE ARE EQUAL.
🐶 THE Y-VALUE OF EACH POINT ON THE CURVE IS THE PERCENT (EXPRESSED AS A DECIMAL) OF THE DATA AT THE CORRESPONDING X-VALUE.
🐶 AREAS UNDER THE CURVE THAT ARE SYMMETRIC ABOUT THE MEAN ARE EQUAL.
🐶 THE TOTAL AREA UNDER THE CURVE IS 1.
THE FOLLOWING RULE, CALLED THE EMPIRICAL RULE, DESCRIBES THE PERCENTAGES OF DATA THAT LIE WITHIN 1, 2, AND 3 STANDARD DEVIATIONS OF THE MEAN IN A NORMAL DISTRIBUTION.

EMPIRICAL RULE OF A NORMAL DISTRIBUTION

🐶 ABOUT 68% OF THE DATA LIE WITHIN 1 STANDARD DEVIATION OF THE MEAN.

🐶 ABOUT 95% OF THE DATA LIE WITHIN 2 STANDARD DEVIATIONS OF THE MEAN.

🐶 ABOUT 99.7% OF THE DATA LIE WITHIN 3 STANDARD DEVIATIONS OF THE MEAN.
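The three percentages of the empirical rule are rounded values. A short sketch, assuming an exactly normal distribution, recovers them from the error function in Python's `math` module (the function name `fraction_within` is illustrative):

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * fraction_within(k):.1f}%")
```

The unrounded figures are close to 68.3%, 95.4%, and 99.7%, which the rule states as 68%, 95%, and 99.7%.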
The Standard Normal Distribution

►If the original distribution of X is a normal distribution, then the corresponding distribution of z-scores will also be a normal distribution. This normal distribution of z-scores is called the Standard Normal Distribution. See Figure 4.7. It has a mean of 0 and a standard deviation of 1.
►The standard normal distribution is the normal distribution that has a mean of 0 and a standard deviation of 1.

►Tables and calculators are often used to determine the area under a portion of the standard normal curve.
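In place of a printed table, the area under part of the standard normal curve can be computed with `statistics.NormalDist` from Python's standard library. The sketch below is one way to do it; the z-values are chosen only for illustration, and `area_between` is a hypothetical helper name.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def area_between(z_low, z_high):
    """Area under the standard normal curve between two z-scores."""
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)


area_left = standard_normal.cdf(1.25)  # area to the left of z = 1.25
area_middle = area_between(-1, 1)      # area within 1 standard deviation

print(f"area left of z = 1.25: {area_left:.4f}")
print(f"area between z = -1 and z = 1: {area_middle:.4f}")
```

The second value is about 0.68, in line with the empirical rule for 1 standard deviation.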
