
Measures Of Dispersion And Relative Standing

BS Biology (Aklan State University)



Downloaded by IES Officer (iesofficer19@gmail.com)

The data sets may have the same mean, but when we look at their graphs, the data sets look
different from each other because of their variability.

1. Measures of Dispersion

Measures of central tendency give us good information about the scores in our distribution.
However, two distributions can have very different shapes yet share the same central
tendency.

The measures of dispersion, or measures of variation, show how observations in a data set
vary from the mean. They give us information about the spread of the scores in our distribution.
Are the scores clustered close together over a small portion of the scale, or are the scores
spread out over a large segment of the scale?

A. Range. The range is the difference between the highest and lowest scores in a distribution.
Subtract the lowest score from the highest to find it. So, in the distribution 1, 3, 5, 9, 11 the
range is 11 – 1 = 10. The result is a single number. However, the range does not use the
concept of deviation. Because it depends only on the two extreme values, it is strongly affected
by outliers and does not consider all values in the data set.
Example: Find the range of the numbers of ounces (oz) dispensed by Machine 1 and Machine 2.

Machine 1     Machine 2
9.52          8.01
6.41          7.99
10.07         7.95
5.85          8.03
8.15          8.02

Machine 1: R = 10.07 – 5.85 = 4.22 oz
Machine 2: R = 8.03 – 7.95 = 0.08 oz

Range of a set of negative numbers

If your set includes negative numbers, the range will still be positive because subtracting a
negative is the same as adding.

When dealing with range, imagine the numbers on the number line. The range is simply the
space between the two extreme values.
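The range calculation above can be sketched in a few lines of Python. This is only an illustration: the function name value_range and the list with negative numbers are assumptions made for the example, while the machine data comes from the example above.

```python
def value_range(values):
    """Return the range: the largest value minus the smallest value."""
    return max(values) - min(values)


# Ounces dispensed by each machine (data from the example above).
machine_1 = [9.52, 6.41, 10.07, 5.85, 8.15]
machine_2 = [8.01, 7.99, 7.95, 8.03, 8.02]

# Round to two decimals to hide floating-point noise.
range_1 = round(value_range(machine_1), 2)
range_2 = round(value_range(machine_2), 2)
print("Machine 1 range:", range_1, "oz")
print("Machine 2 range:", range_2, "oz")

# Hypothetical set that includes negative numbers: 4 - (-6) = 10,
# so the range is still positive.
negative_example = [-6, -2, 0, 4]
print("Range with negatives:", value_range(negative_example))
```

Machine 2 has a far smaller range than Machine 1, which matches the hand calculation.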

B. Variance

The variance is a measure of variability. It is calculated by taking the average of the squared
deviations from the mean. Equivalently, the variance of a data set is the square of its standard
deviation. Variance tells you the degree of spread in your data set. The more
spread out the data, the larger the variance is in relation to the mean.

Population variance
When you have collected data from every member of the population that you’re interested in,
you can get an exact value for population variance.

Sample variance
When you collect data from a sample, the sample variance is used to make estimates or
inferences about the population variance.

NOTE: Be careful to identify what kind of data you're dealing with in the given problem: is it
population data or sample data?

Variance of a Population

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

σ² = population variance
∑ = summation or sum of…
X = each value
μ = population mean
N = number of values in the population

Variance of a Sample

$$s^2 = \frac{\sum (X - \bar{x})^2}{n - 1}$$

s² = sample variance
∑ = summation or sum of…
X = each value
x̅ = sample mean
n = number of values in the sample

Example: Find the variance of the following sample data: 46, 69, 32, 60, 52, 41

So we will use the sample variance formula:


$$s^2 = \frac{\sum (X - \bar{x})^2}{n - 1}$$

Step 1: Find the mean
To find the mean, add up all the scores, then divide them by the number of scores/observation.

Mean: x̅ = ∑X / n = (46 + 69 + 32 + 60 + 52 + 41) / 6 = 300 / 6 = 50

Where did you get the 6? It is n, the number of scores in the sample.

Step 2: Find each score’s deviation from the mean


Subtract the mean from each score to get the deviations from the mean. Since x̅ = 50, take
away 50 from each score.

Scores or Observations (X)     Deviation from the mean (X - x̅)
46                             46 – 50 = -4
69                             69 – 50 = 19
32                             32 – 50 = -18
60                             60 – 50 = 10
52                             52 – 50 = 2
41                             41 – 50 = -9

Step 3: Square each deviation from the mean


Multiply each deviation from the mean by itself. This will result in positive numbers.

Scores or Observations (X)     Deviation from the mean (X - x̅)     Squared deviation from the mean (X - x̅)²
46                             46 – 50 = -4                         (-4)² = -4 x -4 = 16
69                             69 – 50 = 19                         (19)² = 19 x 19 = 361
32                             32 – 50 = -18                        (-18)² = -18 x -18 = 324
60                             60 – 50 = 10                         (10)² = 10 x 10 = 100
52                             52 – 50 = 2                          (2)² = 2 x 2 = 4
41                             41 – 50 = -9                         (-9)² = -9 x -9 = 81


Step 4: Find the sum of squares


Add up all of the squared deviations. This is called the sum of squares.

Sum of squares: 16 + 361 + 324 + 100 + 4 + 81 = 886


Step 5: Divide the sum of squares by n – 1 or N


Divide the sum of the squares by n – 1 (for a sample) or N (for a population). Since
we’re working with a sample, we’ll use n – 1, where n = 6.

Variance: s² = 886 ÷ (6 – 1) = 886 ÷ 5 = 177.2

FINAL ANSWER: s² = 177.2
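The five steps above can be mirrored in Python as a rough check of the hand calculation. The variable names are illustrative assumptions. The last lines cross-check the result against statistics.variance from the Python standard library, which also divides by n – 1.

```python
import math
import statistics

scores = [46, 69, 32, 60, 52, 41]

# Step 1: find the mean.
n = len(scores)
mean = sum(scores) / n

# Step 2: find each score's deviation from the mean.
deviations = [x - mean for x in scores]

# Step 3: square each deviation.
squared_deviations = [d ** 2 for d in deviations]

# Step 4: add the squared deviations to get the sum of squares.
sum_of_squares = sum(squared_deviations)

# Step 5: divide by n - 1 because the data are a sample.
sample_variance = sum_of_squares / (n - 1)

# For comparison only: dividing by N treats the same data as a population.
population_variance = sum_of_squares / n

print("mean =", mean)
print("sum of squares =", sum_of_squares)
print("sample variance =", sample_variance)

# Cross-check against the standard library.
assert math.isclose(sample_variance, statistics.variance(scores))
assert math.isclose(population_variance, statistics.pvariance(scores))
```

Dividing by N instead of n – 1 gives a smaller value, which is why it matters whether the data are a population or a sample.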

What is variance used for in statistics?

Statistical tests such as variance tests or the analysis of variance (ANOVA) use sample variance
to assess group differences of populations. They use the variances of the samples to assess
whether the populations they come from significantly differ from each other.

C. Standard Deviation

The standard deviation is the average amount of variability in your dataset. It tells you, on
average, how far each value lies from the mean.

A high standard deviation means that values are generally far from the mean, while a low
standard deviation indicates that values are clustered close to the mean.
Standard deviation is a useful measure of spread for normal distributions.

In normal distributions, data is symmetrically distributed with no skew. Most values cluster
around a central region, with values tapering off as they go further away from the center. The
standard deviation tells you how spread out from the center of the distribution your data is on
average.

Many scientific variables follow normal distributions, including height, standardized test scores,
or job satisfaction ratings. When you have the standard deviations of different samples, you can
compare their distributions using statistical tests to make inferences about the larger
populations they came from.

Example: Comparing different standard deviations. You collect data on job satisfaction ratings
from three groups of employees using simple random sampling.

The mean (M) ratings are the same for each group – it’s the value on the x-axis when the curve
is at its peak. However, their standard deviations (SD) differ from each other. The standard
deviation reflects the dispersion of the distribution. The curve with the lowest standard deviation
has a high peak and a small spread, while the curve with the highest standard deviation is more
flat and widespread.


Image from: https://cdn.scribbr.com/wp-content/uploads/2020/09/Job-satisfaction-ratings-of-three-groups.svg

Population standard deviation


When you have collected data from every member of the population that you’re interested in,
you can get an exact value for population standard deviation.

Sample standard deviation


When you collect data from a sample, the sample standard deviation is used to make estimates
or inferences about the population standard deviation.

Standard Deviation of Population

$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$

σ = population standard deviation
∑ = summation or sum of…
X = value in the data distribution
μ = population mean
N = population size or the total number of observations

Standard Deviation of Sample

$$s = \sqrt{\frac{\sum (X - \bar{x})^2}{n - 1}}$$

s = sample standard deviation
∑ = summation or sum of…
X = value in the data distribution
x̅ = sample mean
n = sample size or the total number of observations

Example: Find the standard deviation of the following sample data: 46, 69, 32, 60, 52, 41

So we will use the sample standard deviation formula:

$$s = \sqrt{\frac{\sum (X - \bar{x})^2}{n - 1}}$$

Step 1: Find the mean
To find the mean, add up all the scores, then divide them by the number of scores/observation.

Mean: x̅ = ∑X / n = (46 + 69 + 32 + 60 + 52 + 41) / 6 = 300 / 6 = 50

Where did you get the 6? It is n, the number of scores in the sample.

Step 2: Find each score’s deviation from the mean


Subtract the mean from each score to get the deviations from the mean. Since x̅ = 50, take
away 50 from each score.

Scores or Observations (X)     Deviation from the mean (X - x̅)
46                             46 – 50 = -4
69                             69 – 50 = 19
32                             32 – 50 = -18
60                             60 – 50 = 10
52                             52 – 50 = 2
41                             41 – 50 = -9

Step 3: Square each deviation from the mean


Multiply each deviation from the mean by itself. This will result in positive numbers.

Scores or Observations (X)     Deviation from the mean (X - x̅)     Squared deviation from the mean (X - x̅)²
46                             46 – 50 = -4                         (-4)² = -4 x -4 = 16
69                             69 – 50 = 19                         (19)² = 19 x 19 = 361
32                             32 – 50 = -18                        (-18)² = -18 x -18 = 324
60                             60 – 50 = 10                         (10)² = 10 x 10 = 100
52                             52 – 50 = 2                          (2)² = 2 x 2 = 4
41                             41 – 50 = -9                         (-9)² = -9 x -9 = 81

Step 4: Find the sum of squares


Add up all of the squared deviations. This is called the sum of squares.

Sum of squares: 16 + 361 + 324 + 100 + 4 + 81 = 886

Step 5: Divide the sum of squares by n – 1 or N


Divide the sum of the squares by n – 1 (for a sample) or N (for a population). Since
we’re working with a sample, we’ll use n – 1, where n = 6.

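Variance: 886 ÷ (6 – 1) = 886 ÷ 5 = 177.2

The standard deviation is the square root of the variance:

$$s = \sqrt{\frac{\sum (X - \bar{x})^2}{n - 1}} = \sqrt{\frac{886}{5}} = \sqrt{177.2} \approx 13.31$$

FINAL ANSWER: s = 13.31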

From learning that SD = 13.31, we can say that each score deviates from the mean by 13.31
points on average.
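As a rough check of the worked example, the same calculation can be written in Python. The variable names are illustrative assumptions. statistics.stdev from the standard library is used only to confirm the manual result.

```python
import math
import statistics

scores = [46, 69, 32, 60, 52, 41]

# Mean, sum of squares and sample variance, as in Steps 1 to 5.
n = len(scores)
mean = sum(scores) / n
sum_of_squares = sum((x - mean) ** 2 for x in scores)
sample_variance = sum_of_squares / (n - 1)

# The standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_variance)
print("sample SD =", round(sample_sd, 2))

# Cross-check against the standard library (also uses n - 1).
assert math.isclose(sample_sd, statistics.stdev(scores))
```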

Why is standard deviation a useful measure of variability?

Although there are simpler ways to calculate variability, the standard deviation formula weighs
unevenly spread out samples more than evenly spread samples. A higher standard deviation
tells you that the distribution is not only more spread out, but also more unevenly spread out.

D. Coefficient of Variation

The coefficient of variation (CV) is a measure of relative variability. It is the ratio of the standard
deviation to the mean (average). For example, the expression “The standard deviation is 15% of
the mean” is a CV.


The CV is particularly useful when you want to compare results from two different surveys or
tests that have different measures or values. For example, suppose you are comparing the results
from two tests that have different scoring mechanisms. If sample A has a CV of 12% and sample B
has a CV of 25%, you would say that sample B has more variation relative to its mean.

The formula for the coefficient of variation is:

Coefficient of Variation for a population

$$CV = \frac{\sigma}{\mu} \times 100\%$$

CV = Coefficient of Variation
σ = Standard Deviation for population
μ = Mean for population data

Coefficient of Variation for a sample

$$CV = \frac{s}{\bar{x}} \times 100\%$$

CV = Coefficient of Variation
s = Standard Deviation for sample
x̅ = Mean for sample data

or simply

$$CV = \frac{SD}{\bar{x}} \times 100\%$$

where
SD = Standard Deviation
x̅ = Mean of the data series

Example: Suppose we have the sample data 60.25, 62.38, 65.32, 61.41 and 63.23 drawn from a
population. Let's calculate the coefficient of variation for this data.

Step 1: Calculate the mean of the data set.

$$\bar{x} = \frac{\sum X}{n} = \frac{60.25 + 62.38 + 65.32 + 61.41 + 63.23}{5} = \frac{312.59}{5} = 62.518 \approx 62.52$$

FINAL ANSWER: x̅ = 62.52

Step 2: Calculate the standard deviation for the same values by placing them in the sample standard deviation formula.

$$s = \sqrt{\frac{\sum (X - \bar{x})^2}{n - 1}} = \sqrt{\frac{5.1529 + 0.0196 + 7.8400 + 1.2321 + 0.5041}{5 - 1}} = \sqrt{\frac{14.7487}{4}} \approx \sqrt{3.6872} \approx 1.92$$

FINAL ANSWER: s = 1.92

Step 3: Calculate the coefficient of variation from the mean and SD.

$$CV = \frac{SD}{\bar{x}} \times 100\% = \frac{1.92}{62.52} \times 100\% \approx 0.0307 \times 100\% = 3.07\%$$

FINAL ANSWER: CV ≈ 3.07%
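The coefficient of variation for this sample can also be computed in Python as a rough check. This sketch relies on statistics.mean and statistics.stdev from the standard library, where stdev uses the sample (n – 1) formula. The variable names are illustrative assumptions.

```python
import statistics

data = [60.25, 62.38, 65.32, 61.41, 63.23]

# Step 1: mean of the sample.
mean = statistics.mean(data)

# Step 2: sample standard deviation (divides by n - 1).
sd = statistics.stdev(data)

# Step 3: coefficient of variation, expressed as a percentage.
cv_percent = sd / mean * 100

print("mean =", round(mean, 2))
print("SD =", round(sd, 2))
print("CV =", round(cv_percent, 2), "%")
```

Because the CV is a ratio of two quantities in the same unit, the unit cancels out, which is what allows comparison across data sets measured on different scales.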

NOTE: A CV of less than 10% is generally regarded as very good, and a CV between 10% and 20%
is also regarded as good, but a CV greater than 30% is regarded as not acceptable.

The coefficient of variation is important when assessing the variation in a data set because
the standard deviation must always be interpreted in light of the mean value. The value of the
CV does not depend on the unit in which the measurements are taken. The coefficient of variation
can therefore be used instead of the SD to compare data sets measured in different units.

2. Measures of Relative Standing
Measures of relative standing can be used to compare values from different data sets, or to
compare values within the same data set.

Empirical Rule

The Empirical Rule is a statement about normal distributions. Your textbook uses an abbreviated
form of this, known as the 95% Rule, because 95% is the most commonly used interval. The
95% Rule states that approximately 95% of observations fall within two standard deviations of
the mean on a normal distribution.

A normal distribution is a specific type of symmetrical distribution, also known as a
bell-shaped distribution.

On a normal distribution about 68% of data will be within one standard deviation of the
mean, about 95% will be within two standard deviations of the mean, and about 99.7% will
be within three standard deviations of the mean.
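The three percentages in the Empirical Rule can be checked numerically. The following is a minimal sketch that assumes the standard normal distribution and uses NormalDist from Python's statistics module. The helper name share_within is an assumption made for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k) * 100:.1f}%")
```

The printed values are close to 68%, 95% and 99.7%, which is why the rule is stated with "about".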

Often we want to describe an observation in relation to the distribution of all observations. We
can do this using a z-score. By converting observations to z-scores, we can compare
observations from different distributions. The z-score measures the relative standing of a
particular measurement in a data set. The z-score is the distance between an individual score
and the mean in standard deviation units; it is also known as a standardized score.

Population z-score

$$z = \frac{x - \mu}{\sigma}$$

x = original data value
μ = mean of the original distribution
σ = standard deviation of the original distribution

Sample z-score

$$z = \frac{x - \bar{x}}{s}$$

x = original data value
x̅ = mean of the original distribution
s = standard deviation of the original distribution


z-distribution
A bell-shaped distribution with a mean of 0 and standard deviation of 1, also known as the
standard normal distribution.

Example: Milk
A study of 66,831 dairy cows found that the mean milk yield was 12.5 kg per milking with a
standard deviation of 4.3 kg per milking (data from Berry, et al., 2013).

a. A cow produces 18.1 kg per milking. What is this cow’s z-score?


$$z = \frac{x - \bar{x}}{s} = \frac{18.1 - 12.5}{4.3} = \frac{5.6}{4.3} \approx 1.302$$

FINAL ANSWER: z = 1.302

Interpretation: This cow’s z-score is 1.302; her milk production was 1.302 standard deviations
above the mean.

b. A cow produces 12.5 kg per milking. What is this cow’s z-score?


$$z = \frac{x - \bar{x}}{s} = \frac{12.5 - 12.5}{4.3} = \frac{0}{4.3} = 0$$

FINAL ANSWER: z = 0

Interpretation: This cow’s z-score is 0; her milk production was the same as the mean.

c. A cow produces 8 kg per milking. What is this cow’s z-score?


$$z = \frac{x - \bar{x}}{s} = \frac{8 - 12.5}{4.3} = \frac{-4.5}{4.3} \approx -1.047$$

FINAL ANSWER: z = -1.047


Interpretation: This cow’s z-score is -1.047; her milk production was 1.047 standard
deviations below the mean.
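The three z-scores from the milk example can be reproduced with a short Python sketch. The function name z_score and the dictionary layout are assumptions made for illustration. The mean and standard deviation are the values quoted in the example.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, measured in standard deviations."""
    return (x - mean) / sd


MEAN_YIELD = 12.5  # kg per milking, from the example
SD_YIELD = 4.3     # kg per milking, from the example

# z-score for each of the three cows, rounded to three decimals.
z_scores = {
    yield_kg: round(z_score(yield_kg, MEAN_YIELD, SD_YIELD), 3)
    for yield_kg in (18.1, 12.5, 8)
}

for yield_kg, z in z_scores.items():
    print(f"{yield_kg} kg per milking -> z = {z}")
```

A positive z-score means the cow is above the mean, zero means she is exactly at the mean, and a negative z-score means she is below it.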

3. Symmetry and Measure of Skewness

We say that a distribution is symmetric if it can be folded along a vertical axis so that the two
sides of the graph coincide. Below are examples of histograms that show symmetric
distributions.

Image from: https://mathbitsnotebook.com/JuniorMath/Statistics/shapeUh3a.jpg
Image from: https://mathbitsnotebook.com/JuniorMath/Statistics/shape1a33.jpg

If a distribution lacks symmetry with respect to a vertical axis, the distribution is said to be
asymmetric or skewed.

Left-Skewed Right-Skewed
Image from: https://blog.minitab.com/hubfs/Imported_Blog_Media/skewedhistograms.jpg

A distribution that is skewed to the right, or positively skewed, has a right tail longer than
the left tail. A positively skewed distribution indicates that the mean is greater than the
median of the data set. On the other hand, a distribution with the left tail longer than the
right tail is called negatively skewed or skewed to the left. A negatively skewed distribution
indicates that the mean is less than the median of the data set.
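The relationship between the mean and the median can be illustrated with small made-up data sets. The three lists below are hypothetical and were chosen only to show the typical pattern. They do not come from the lesson.

```python
import statistics

# Hypothetical data sets chosen to illustrate each shape.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 15]   # long tail to the right
left_skewed = [-x for x in right_skewed]   # mirror image: long tail to the left

for name, data in [
    ("symmetric", symmetric),
    ("right-skewed", right_skewed),
    ("left-skewed", left_skewed),
]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{name}: mean = {mean}, median = {median}")
```

In the right-skewed list the single large value pulls the mean above the median, and in the mirrored list it pulls the mean below the median.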
