
STATISTICS AND PROBABILITY

WEEK 1

FUNCTIONS AND RELATIONS

STATISTICS – came from the Latin word “status”, meaning state. It is a branch of mathematics
concerned with the collection, classification, analysis and interpretation of numerical data
with a definite purpose in any study, especially as it relates to the analysis of population
characteristics by inference from sampling.
STATISTICS is also defined as a science that studies data in order to make a decision. Hence,
it is a tool in the decision-making process.

IMPORTANCE OF STATISTICS
• It helps each and every one to use proper methods in collecting data.
• It helps us to properly analyze the data collected from each respondent and to effectively
  present the result to the public.
• It helps us to make new discoveries and fix issues that are currently happening in our society.
• It helps us to make decisions based on the data gathered and to make predictions.

Importance of Statistics in different fields
1) Politics – pollsters survey and record voters’ preferences to predict the outcome of an
   election. The way we count and tally the votes is itself an application of statistics.
2) Market Place – to determine the best brands, surveys provide information for predicting the
   choices of consumers.
3) Medicine – medical researchers conduct studies to determine the effectiveness of various
   medicinal drugs for the treatment of different diseases. Statistics is used to establish the
   accuracy and efficacy rate of proposed medicines or vaccines.
4) Engineering – to determine and test the quality of a product by inspecting some items and
   recording the outcome.
5) Economy – economists develop formulas to predict and forecast the economic growth of a
   country.
6) Education – a teacher might focus on the latest set of students’ test scores and use
   statistics to determine the average score of the students.

TWO KINDS OF STATISTICS
1) DESCRIPTIVE STATISTICS – can be defined as methods for organizing, summarizing and
   presenting data in a descriptive way. These are numbers that are used to summarize and
   describe data. Any number we choose to compute also counts as a descriptive statistic for
   the data from which the statistic is computed.

Example: The Philippine government gives the following report about the population of the
Philippines:

YEAR    POPULATION
1990    48,098,460
2000    60,703,206
2010    92,337,852

If we compute the growth rate from one decade to another, then it is descriptive statistics.
Descriptive statistics also includes statistical techniques such as measures of central
location, dispersion, and other measures used to describe data. The results are usually
presented in tabular and graphical form.

2) INFERENTIAL STATISTICS – consists of generalizing from samples to populations, performing
   hypothesis testing, determining relationships among variables and making predictions. It
   consists of techniques for reaching conclusions about a population based upon information
   contained in a sample. It uses statistical techniques for the analysis of data and for
   testing the reliability of the estimates.

Example: If you want to know the percentage of unemployed in our country, a random sample
taken from the population can be used to estimate the proportion of the unemployed, and an
inference about the population is made from the sample. Estimating the efficacy of vaccines
for COVID-19 is also done through inferential statistics, that is, by generating inferences
from samples.

DATA – raw pieces of evidence collected, organized and analyzed by a statistician with the
hope of establishing facts. These are facts or figures from which conclusions may be drawn.
A data set provides data about a collection of elements and contains, for each element,
information about one or more characteristics of interest.

TYPES OF DATA
1) PRIMARY DATA – information collected from an original source of data, which is first-hand
   in nature.
Example: Data collected from interviews and surveys

2) SECONDARY DATA – information collected from published or unpublished sources.
Example: Data collected from books, newspapers, journals, theses, class records, etc.

QUALITATIVE DATA                        QUANTITATIVE DATA
Data that can be placed in              Data that can be ordered and ranked.
categories like gender, civil           Values of a variable that are recorded
status and educational attainment.      as meaningful numbers.
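Returning to the population report in the descriptive statistics example, the decade-to-decade growth rate can be worked out with a few lines of Python. This is only a minimal sketch: the figures are the ones given in the report, and the function name `growth_rates` is a hypothetical helper introduced here for illustration.

```python
# Population figures as given in the report (year -> population).
population = {1990: 48_098_460, 2000: 60_703_206, 2010: 92_337_852}


def growth_rates(series):
    """Return {(start_year, end_year): percent change} for consecutive years."""
    years = sorted(series)
    rates = {}
    for start, end in zip(years, years[1:]):
        change = series[end] - series[start]
        # Growth rate = change relative to the starting value, as a percent.
        rates[(start, end)] = change / series[start] * 100
    return rates


for (start, end), rate in growth_rates(population).items():
    print(f"{start} to {end}: {rate:.2f}% growth")
```

Each growth rate is a single number that summarizes the data, which is why it counts as a descriptive statistic.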
VARIABLES – the characteristics that differentiate one subject from another.
Examples: grades, age, height, weight and income

TYPES OF VARIABLE
1) QUALITATIVE VARIABLES – variables that are non-numeric by nature.
Example: blood type, gender, religious affiliation, eye color and marital status.

2) QUANTITATIVE VARIABLES – variables that can be expressed numerically.
Example: number of children in the family, income of the parents, age, etc.

TYPES OF QUANTITATIVE VARIABLES:
A. DISCRETE VARIABLE – can assume distinct values, which usually result from counting.
Example: number of students in each section of a Mathematics course, the number of cars
arriving at and departing from a shopping mall.

B. CONTINUOUS VARIABLE – can take an infinite number of values and may not be measured
exactly.
Example: weight, height, age, and the time it takes a student to solve a mathematics problem.

TERMS IN STATISTICS
SAMPLE – a portion that is representative of the population; it can be small or large.

POPULATION – total collection of observations, measurements, individuals or objects under
study.

MEASUREMENT – assignment of numbers to objects or events according to rules.

PARAMETER – a number calculated on population data that quantifies a characteristic of the
population. The most common parameter being estimated is the population mean.

STATISTIC – a number calculated on sample data that quantifies a characteristic of the
sample. The most common statistic is the sample mean.

LEVEL OF MEASUREMENT SCALES
1) NOMINAL SCALE – lowest level of measurement, where the data collected are simply labels,
   names or categories without any implicit ordering of the labels. At this level, numbers are
   assigned only to identify and classify individuals or objects. It is also known as the
   weakest form of measurement.
Examples: sex, religion, marital status, color, PIN code, password, bank account.

2) ORDINAL SCALE – higher than the nominal scale; the data collected are labels or names with
   an implied ordering of the labels. These are objects or individuals that are arranged in
   rank or order.
Examples: socio-economic status, difficulty of questions on an exam, sibling position,
military rank, class rank, Likert scale indicator

3) INTERVAL – we can set up inequalities and form differences, but not multiply or divide.
   The interval scale gives a more precise measurement because the difference between values
   is meaningful, so addition and subtraction are valid mathematical operations. The zero
   point is arbitrary: it does not mean that the quantity does not exist. Zero only represents
   an additional measurement point.
Examples: temperature, IQ scores

4) RATIO – we can set up inequalities and form differences, and can also multiply or divide.
   This is the most powerful level of measurement. The data are compared by multiplication
   and division. The zero point is meaningful: it indicates the absence of the quantity.
Examples: height, weight, age, test scores, electric charge, amount of money

SUMMARY
LEVEL      PROPERTY                                  BASIC EMPIRICAL OPERATION
NOMINAL    No order, distance, or origin             Determination of equivalence
ORDINAL    Has order but no distance or              Determination of greater or
           unique origin                             lesser values
INTERVAL   Has both order and distance but           Determination of equality of
           no unique origin                          intervals or differences
RATIO      Has order, distance and unique            Determination of equality of
           origin                                    ratios or means

The level of measurement depends mainly on the method of measurement, not on the property
measured. The weight of primary school students measured in kilograms has a ratio level, but
the students can also be categorized as overweight, normal or underweight, in which case the
weight is measured at an ordinal level. Also, many scales are only interval because their
zero point is arbitrarily chosen.

Examples:
Variable                           Qualitative/    Discrete,      Level of
                                   Quantitative    Continuous,    measurement
                                                   or N/A
1. Strand you want to take up      Qualitative     N/A            Nominal
   in Senior High School
2. IQ of a Person                  Quantitative    Continuous     Interval
3. Grade in General Mathematics    Quantitative    Continuous     Interval
4. Place of Birth                  Qualitative     N/A            Nominal
5. Life span of Samsung Battery    Quantitative    Continuous     Ratio
6. License Number                  Qualitative     N/A            Nominal
7. Number of Vendors in            Quantitative    Discrete       Ratio
   Don Domingo
8. Brands of T-Shirts              Qualitative     N/A            Nominal
9. Size of T-Shirts                Qualitative     N/A            Ordinal
10. Weekly Allowance               Quantitative    Continuous     Ratio
11. Address of USLT                Qualitative     N/A            Nominal
12. Type of School you are         Qualitative     N/A            Nominal
    enrolled in
13. Internet Promos                Qualitative     N/A            Nominal
14. Room Assignment                Qualitative     N/A            Nominal
15. Educational Attainment         Qualitative     N/A            Ordinal

WEEK 2

MEASURES OF CENTRAL TENDENCY, LOCATION, AND VARIATION

Parameters – measures found by using all the data values in the population.

Statistics – measures obtained by using the data values from samples.

Example: A measure of the health status of students in one strand is a statistic, and the
same measure for all senior high school students is a parameter.

MEASURES OF CENTRAL TENDENCY
CENTRAL TENDENCY
• The statistical measure that identifies a single value as representative of an entire
  distribution. It aims to provide an accurate description of the entire data. It is the
  single value that is most typical/representative of the collected data.

THREE COMMONLY USED MEASURES OF CENTRAL TENDENCY
1. MEAN (average) – used to describe a set of data where the measures concentrate at a point.

$$\bar{x} = \frac{\sum x}{n}$$

where x̄ (read as x bar) denotes the sample mean, ∑x is the sum of all the data values and n
is the total number of observations in the sample.

Example: A student got the following grades in the 1st semester. What is the average of the
student in the first semester?
89, 89, 91, 94, 98, 93, 90, 89, 91, 97

Solution:
x̄ = (89 + 89 + 91 + 94 + 98 + 93 + 90 + 89 + 91 + 97) / 10
x̄ = 921 / 10
x̄ = 92.10

Properties and uses of the mean (x̄):
1. The mean is found by using all the values of the data.
2. The mean varies less than the median or mode when samples are taken from the same
   population and all three measures are computed for these samples.
3. The mean is used in computing other statistics, such as the variance.
4. The mean for the data set is unique and not necessarily one of the data values.
5. The mean cannot be computed for the data in a frequency distribution that has an
   open-ended class.
6. The mean is affected by extremely high or low values, called outliers, and may not be the
   appropriate average to use in some situations.

2. MEDIAN (middle) – the halfway point in a data set. Before you can find this point, the
   data must be arranged in order (decreasing or increasing). When the data set is ordered,
   it is called a data array. The middle of the array is either a single value or a pair of
   values.

Example: Find the median of the following test scores: 15, 25, 17, 10, 13.

Solution:
Arrange in ascending order: 10, 13, 15, 17, 25
Get the 3rd value: 15

TAKE NOTE: If there are two numbers in the middle, get their average. The answer will be the
median.

Properties and uses of the median (Md):
1. The median is used to find the center or middle value of a data set.
2. The median is used when it is necessary to find out whether the data values fall into the
   upper half or lower half of the distribution.
3. The median is used for an open-ended distribution.
4. The median is affected less than the mean by extremely high or extremely low values.

3. MODE (most popular) – the value that occurs most often in the data set. It is sometimes
   said to be the most typical case.
• Unimodal – a data set with only one mode
• Bimodal – a data set with two modes
• Multimodal – a data set with more than two modes
• No mode – when no value occurs more than once
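The three measures of central tendency can be checked with Python's standard `statistics` module. This is a minimal sketch using the grades and test scores from the examples above; the helper `modes` is a hypothetical function written here so that a data set in which no value repeats is reported as having no mode.

```python
import statistics
from collections import Counter

grades = [89, 89, 91, 94, 98, 93, 90, 89, 91, 97]
test_scores = [15, 25, 17, 10, 13]


def modes(data):
    """Return the sorted list of modes, or [] when no value repeats."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: no mode
    return sorted(value for value, count in counts.items() if count == highest)


mean_grade = statistics.mean(grades)           # sum of values / number of values
median_score = statistics.median(test_scores)  # sorts the data, takes the middle
grade_modes = modes(grades)

print("Mean of grades:", mean_grade)
print("Median of test scores:", median_score)
print("Mode(s) of grades:", grade_modes)
```

A list with one entry corresponds to a unimodal data set, two entries to a bimodal one, and more than two to a multimodal one.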
Example: Find the mode of the following test scores:
89, 89, 89, 90, 91, 91, 93, 94, 97, 98

Solution: Mode = 89 (Unimodal)

Properties and uses of the mode (Mo):
1. The mode is used when the most typical case is desired.
2. The mode is the easiest to compute.
3. The mode can be used when the data are nominal, such as religious preference, gender, or
   political affiliation.
4. The mode is not always unique. A data set can have more than one mode, or the mode may not
   exist for a data set.

WEIGHTED MEAN – used to find the mean of a data set whose values are not equally
represented. To solve for the weighted mean, multiply each value (x) by its corresponding
weight (w) and divide the sum of the products by the sum of the weights.

FORMULA:
$$\bar{x} = \frac{\sum wx}{\sum w}$$

Example: A college instructor grades recitation, 20%; term paper, 30%; final exam, 50%. A
student had grades of 83, 72, and 90, respectively, for recitation, term paper, and final
exam. Find the student’s final average.

Solution:
x̄ = [0.20(83) + 0.30(72) + 0.50(90)] / (0.20 + 0.30 + 0.50)
x̄ = 83.2 / 1.00
x̄ = 83.2

Example: Find the average number of grams of fat per ounce of meat or fish that a person
would consume over a 5-day period, given the following servings.

MEAT OR FISH                   FAT (g/oz)
3 oz fried shrimp              3.33
3 oz veal cutlet (broiled)     3.00
2 oz roast beef (lean)         2.50
2.5 oz fried chicken           4.40
2.6 oz tuna (canned)           1.75

Solution: The weights (w) are the serving sizes in ounces and the values (x) are the grams of
fat per ounce.
x̄ = [3(3.33) + 3(3.00) + 2(2.50) + 2.5(4.40) + 2.6(1.75)] / (3 + 3 + 2 + 2.5 + 2.6)
WM = 39.54 / 13.1
WM ≈ 3.02

The average number of grams of fat per ounce of meat or fish that a person would consume
over a 5-day period is about 3.02.

SHAPES OF DISTRIBUTION
1. POSITIVELY SKEWED (Mean > Median)
• The majority of the data values fall to the left of the mean and cluster at the lower end
  of the distribution.

2. SYMMETRICAL DISTRIBUTION (Mean = Median)
• The data values are evenly distributed on both sides of the mean. Also, when the
  distribution is unimodal, the mean, median and mode are the same and are at the center of
  the distribution.

3. NEGATIVELY SKEWED (Mean < Median)
• The majority of the data values fall to the right of the mean and cluster at the upper end
  of the distribution.

SUMMARY OF MEASURES OF CENTRAL TENDENCY
MEASURE          DEFINITION                                         SYMBOL
Mean             Sum of values, divided by total number of values   x̄
Median           Middle point in a data set that has been ordered   Md
Mode             Most frequent data value                           Mo
Weighted Mean    Multiply each value (x) by its corresponding
                 weight (w) and divide the sum of the products
                 by the sum of the weights

MEASURES OF LOCATION FOR UNGROUPED DATA
MEASURES OF POSITION/LOCATION give us a way to see where a certain data point or value falls
in a sample or distribution. A measure can tell us whether a value is about the average, or
whether it is unusually high or low. Measures of position are used for quantitative data that
fall on some numerical scale.

A) QUARTILES
• Values that divide a distribution into four equal parts.
• Q1 (1st quartile) – 25% of the distribution falls at or below it
• Q2 (2nd quartile) – 50% of the distribution falls at or below it
• Q3 (3rd quartile) – 75% of the distribution falls at or below it

STEPS IN SOLVING FOR QUARTILES:
Example: Find the Q1, Q2 and Q3 of the data:
10 18 13 14 15 17 12 10 15 16 15

1. Array the data according to magnitude/size in ascending or descending order.
   10 10 12 13 14 15 15 15 16 17 18

2. Compute the position using the formula:

$$Q_n = \frac{n(N + 1)}{4}$$

   Where:
   n = desired nth quartile
   N = number of items/scores

   SOLUTION FOR Q1.
   Q1 = 1(11 + 1) / 4 = 12 / 4 = 3
   Q1 = 3rd score

   SOLUTION FOR Q2.
   Q2 = 2(11 + 1) / 4 = 24 / 4 = 6
   Q2 = 6th score

   SOLUTION FOR Q3.
   Q3 = 3(11 + 1) / 4 = 36 / 4 = 9
   Q3 = 9th score

3. Locate the item (or score) corresponding to the obtained position in the distribution.
   Always start from the lowest score.
   Based on the arranged data: 10 10 12 13 14 15 15 15 16 17 18

   Q1 = 3rd score, therefore Q1 is 12.
   Interpretation: 12 is higher than 25% of the items in the distribution.

   Q2 = 6th score, therefore Q2 is 15.
   Interpretation: 15 is higher than 50% of the items in the distribution.

   Q3 = 9th score, therefore Q3 is 16.
   Interpretation: 16 is higher than 75% of the items in the distribution.

4. If the obtained position is not exact, interpolate.

B) DECILES
• Values that divide the distribution into 10 equal parts.
• The deciles are D1, D2, D3, …, D9.

STEPS IN SOLVING FOR DECILES:
Example: Find the D1 and D5 of the data:
10 18 13 14 15 17 12 10 15 16 15

1. Array the data according to magnitude/size in ascending or descending order.
   10 10 12 13 14 15 15 15 16 17 18

2. Compute the position using the formula:

$$D_n = \frac{n(N + 1)}{10}$$

   Where:
   n = desired nth decile
   N = number of items/scores

   SOLUTION FOR D1.
   D1 = 1(11 + 1) / 10 = 12 / 10 = 1.2
   D1 = 1.2th score (the result is not exact, so interpolate)

   SOLUTION FOR D5.
   D5 = 5(11 + 1) / 10 = 60 / 10 = 6
   D5 = 6th score

3. If the obtained position is not exact, interpolate.
   Note: The result for D1 must be interpolated.
   a) Get the difference between the 1st and 2nd scores from the lowest score, since 1.2 is
      between the 1st and 2nd scores: 10 − 10 = 0
   b) Multiply the difference between the 1st and 2nd scores (0) by the decimal part of 1.2:
      0 × 0.2 = 0
   c) Add this product (0) to the lower score (10, the 1st score) to obtain D1:
      D1 = 10 + 0 = 10
   Interpretation: 10 is higher than 10% of the items in the distribution.

4. Locate the item (or score) corresponding to the obtained position in the distribution.
   Always start from the lowest score.
   Based on the arranged data: 10 10 12 13 14 15 15 15 16 17 18

   D5 = 6th score, therefore D5 is 15.
   Interpretation: 15 is higher than 50% of the items in the distribution.

C) PERCENTILES
• Values that divide the distribution into 100 equal parts.
• The percentiles are P1, P2, P3, …, P99.

STEPS IN SOLVING FOR PERCENTILES:
Example: Find the P26 and P80 of the data:
10 18 13 14 15 17 12 10 15 16 15

1. Array the data according to magnitude/size in ascending or descending order.
   10 10 12 13 14 15 15 15 16 17 18

2. Compute the position using the formula:

$$P_n = \frac{n(N + 1)}{100}$$

   Where:
   n = desired nth percentile
   N = number of items/scores

   SOLUTION FOR P26.
   P26 = 26(11 + 1) / 100 = 312 / 100 = 3.12
   P26 = 3.12th score (the result is not exact, so interpolate)

   SOLUTION FOR P80.
   P80 = 80(11 + 1) / 100 = 960 / 100 = 9.6
   P80 = 9.6th score (the result is not exact, so interpolate)

3. If the obtained position is not exact, interpolate.
   Note: The result for P26 must be interpolated.
   a) Get the difference between the 3rd and 4th scores from the lowest score, since 3.12 is
      between the 3rd and 4th scores: 13 − 12 = 1
   b) Multiply the difference between the 3rd and 4th scores (1) by the decimal part of 3.12:
      1 × 0.12 = 0.12
   c) Add this product (0.12) to the lower score (12, the 3rd score) to obtain P26:
      P26 = 12 + 0.12 = 12.12
   Interpretation: 12.12 is higher than 26% of the items in the distribution.

4. Interpolate the second position in the same way.
   Note: The result for P80 must be interpolated.
   a) Get the difference between the 9th and 10th scores from the lowest score, since 9.6 is
      between the 9th and 10th scores: 17 − 16 = 1
   b) Multiply the difference between the 9th and 10th scores (1) by the decimal part of 9.6:
      1 × 0.6 = 0.6
   c) Add this product (0.6) to the lower score (16, the 9th score) to obtain P80:
      P80 = 16 + 0.6 = 16.6
   Interpretation: 16.6 is higher than 80% of the items in the distribution.

NOTE: The illustration shows that some of the measures of position have the same value as
each other. Q1 and P25 are equal because their results are higher than 25% of the items in
the distribution. Q2, D5 and P50 are equal because their results are higher than 50% of the
items in the distribution. Q3 and P75 are equal because their results are higher than 75% of
the items in the distribution.

MEASURES OF VARIABILITY OF UNGROUPED DATA

INTRODUCTION
A measure of central tendency (or average) is not enough to describe a set of scores
adequately. It tells you what the “typical” score is, but it does not tell you how typical
the typical score is, that is, how accurately the average represents the individual scores.
If individual scores are mostly close to the average, then the average represents the
individual scores accurately. If the scores are spread out (some very high, some very low,
and some near the average), then the average does not represent the individual scores very
accurately.

Measures of Central Tendency – convey information about the commonalities of measured
properties.
Measures of Variability – quantify the degree to which they differ. If not all values of
data are the same, they differ and variability exists. The measures of central tendency
should be complemented by measures of variability for the same reason that objective
descriptions of events should contain accounts of both centripetal and centrifugal forces,
of consenting and opposing opinions, of shared and conflicting views.
Measures of Variability – determine the range of the distribution, relative to the measures
of central tendency. Where the measures of central tendency are specific data points,
measures of variability are lengths between various points within the distribution. The
spread of these data points tells you about variability. Variation or variability is
measured in terms of range, mean deviation, variance, and standard deviation.

MEASURES OF VARIABILITY
Although the average score in a distribution is important in many research contexts, so is
another set of statistics that quantify how variable (or how dispersed) the scores tend to
be. Do the scores vary a lot, or do they tend to be very similar or near each other in
value? Sometimes variability in scores is the central issue in a research question.
Variability is a quantitative concept, so none of this applies to distributions of
qualitative data.

Further description of a set of test scores is given by measures of variability, or the
extent of individual differences around the central tendency. These measures tell us how
spread out the scores in a distribution are. Suppose you got 75 out of 100 on a statistics
test, and you know that the mean score in the class was 55. Now you know you have done
better than the average, so that is a good thing, but you do
not know how much better than the average you have done. For example, if most people scored
at or near the mean, then your score may actually be quite high in comparison to all others.
On the other hand, if the scores were quite spread out, then your score may be only a little
better than average.

A distribution of scores has high variability if the scores are widely distributed around
the mean; it has low variability if most of the scores lie fairly close to the mean. As an
example, consider two distributions with a mean of about 55: one distribution with high
variability, which includes scores like 4, 15, 27, 36, 72, 75, 76, 98, 99, and another
distribution with low variability, which includes scores like 48, 50, 52, 54, 56, 56, 58,
59, 62.

Oftentimes in the social sciences, this variability is precisely what we are interested in.
Why are some people smarter than others, why do some commit more crimes, why do some
companies make more profits than others? We often search for factors which allow us to
explain variability. When data are described by a measure of central tendency (mean, median,
or mode), all the scores are summarized by a single value. Reports of central tendency are
commonly supplemented and complemented by measures of variability.

MEASURES TO COMMUNICATE HOW SCORES ARE SPREAD AROUND THE MEAN IN A PARTICULAR DISTRIBUTION:
1) RANGE – difference between the highest and the lowest scores in a distribution. It is the
   easiest to determine.
   Range = Highest Score – Lowest Score

Example: 12, 25, 27, 29, 36, 38, 40, 43, 50, 54, 62
Solution: Range = 62 – 12
          Range = 50

OTHER CHARACTERISTICS OF RANGE:
• It gives a quick approximation of the variability of the data, but it is not very
  sophisticated.
• It is limited: if extreme scores are not representative of the sample but are included
  among the scores, the range will be unrepresentative of the sample as well.
• It is used when the mode is the preferred measure of central tendency (i.e., when you have
  nominal-level data).
• It is the simplest measure of variability.
• It is not very informative, because it is based only on the most extreme scores.
• It is severely affected by extreme scores in your data distribution. Just one of these
  extreme scores can significantly alter the range. Therefore, it is not used as a reliable
  measure of variability.

2) MEAN ABSOLUTE DEVIATION – the average amount by which each score deviates from the mean.

STEPS:
1st – Calculate the mean.
2nd – Subtract the mean from all the raw scores. These scores are now called the deviation
      scores or simply deviations and are represented by “d”.
3rd – Sum up all the absolute deviation scores and divide by the total number of scores.
4th – The resulting quotient is the mean absolute deviation.

$$MAD = \frac{\sum |X - \bar{x}|}{n} \quad \text{or} \quad MAD = \frac{\sum |d|}{n}$$

Example: Given the numbers 1, 2, 3, 4, 5, find the MAD.

Solution:
X           X − x̄            |d| = |X − x̄|
1           1 − 3 = −2        2
2           2 − 3 = −1        1
3           3 − 3 = 0         0
4           4 − 3 = 1         1
5           5 − 3 = 2         2
∑X = 15     n = 5             ∑|X − x̄| = 6

x̄ = 15 / 5 = 3
MAD = 6 / 5 or 1.2

Notice that when the deviations in the second column are totaled, the sum is equal to zero.
To overcome this problem, we use the absolute values of the deviations rather than their
signed values. The absolute value of a deviation from the mean is the difference between a
score and the mean without regard to the sign (+ or −).

OTHER CHARACTERISTICS OF MEAN ABSOLUTE DEVIATION:
• Applications of the average deviation are very limited, since absolute values have little
  use in most statistical procedures; in addition, the mean absolute deviation of a sample is
  a biased estimator of the mean absolute deviation of the population.
• It is easier for new researchers to understand than the standard deviation (SD), being
  simply the average of the deviations: the amount by which, on the average, any figure
  differs from the overall mean.
• It is actually more efficient than the standard deviation in the realistic situation where
  some of the measurements are in error, more efficient for distributions other than the
  perfect normal, closely related to a number of other useful analytical techniques, and
  easier to understand.
• If the set of data is “normally distributed”, there is a definite relationship between the
  average deviation and the standard deviation: average deviation ≈ 0.80 × standard
  deviation.

3) VARIANCE – the average of the squared differences of scores from the mean score of the
   distribution.

4) STANDARD DEVIATION – the square root of the variance. The variance and the standard
   deviation are used when the mean is the preferred measure of central tendency.
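The range and the mean absolute deviation steps described above can be sketched in Python as follows. This is a minimal illustration using the two worked examples; the function names `score_range` and `mean_absolute_deviation` are hypothetical helpers, not part of any library.

```python
def score_range(scores):
    """Range = highest score - lowest score."""
    return max(scores) - min(scores)


def mean_absolute_deviation(scores):
    """Average of the absolute deviations from the mean."""
    n = len(scores)
    mean = sum(scores) / n                        # 1st: calculate the mean
    deviations = [x - mean for x in scores]       # 2nd: deviation scores (d)
    total = sum(abs(d) for d in deviations)       # 3rd: sum the absolute deviations
    return total / n                              # 4th: divide by the number of scores


range_example = [12, 25, 27, 29, 36, 38, 40, 43, 50, 54, 62]
mad_example = [1, 2, 3, 4, 5]

print("Range:", score_range(range_example))
print("MAD:", mean_absolute_deviation(mad_example))
```

Summing the signed deviations instead of their absolute values would always give zero, which is why step 3 uses `abs`.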
Sample Variance:
$$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$$

Sample Standard Deviation:
$$s = \sqrt{s^2}$$

Where:
s² = sample variance
s = sample standard deviation
x = the value of any particular observation or measurement
∑x = sum of all x’s
∑x² = sum of all squares of x’s

Example: Beth gathered the following data. It shows the number of Smart TVs sold for the
month of October.
20 45 36 102 8 17 42 23 25 32
Find the variance and standard deviation.

Solution:
n = 10, ∑x = 350, ∑x² = 18,420
s² = (18,420 − 350²/10) / (10 − 1) = (18,420 − 12,250) / 9 = 6,170 / 9 ≈ 685.56
s = √685.56 ≈ 26.18

SAMPLE STANDARD DEVIATION = 26.18
RANGE = 102 − 8 = 94

COEFFICIENT OF VARIATION (CV)
When it is necessary to compare the variability of two or more groups, the task is easy if
the means are the same. For example, we can easily compare which group is more varied in
height between the following groups:
Group 1: Mean = 156 cm, standard deviation = 6
Group 2: Mean = 156 cm, standard deviation = 10

Clearly one can see that Group 2 is more varied because it has the higher standard
deviation. The task becomes more difficult if the means are not equal or the units are
different, such as comparing the weights of two groups belonging to different age brackets
or different genders. How can we compare the variability of the weights of 9 girls, with a
mean weight of 100 pounds and a standard deviation of 5, with that of the weights of 12
boys, with a mean weight of 160 pounds and a standard deviation of 8? A statistic called the
coefficient of variation helps us answer the question.

THE FORMULA IS:
$$CV = \frac{s}{\bar{x}} (100\%)$$

Where:
s = sample standard deviation
x̄ = mean
n = sample size

Example 1:
Suppose two groups of students are to be compared in terms of height.

Group     Mean Height    Standard Deviation    CV
Male      162 cm         10 cm                 6.17%
Female    148 cm         4 cm                  2.70%

Solution for CV:
CV male = (s / x̄)(100%) = (10 cm / 162 cm)(100%) = 6.17%
CV female = (s / x̄)(100%) = (4 cm / 148 cm)(100%) = 2.70%

Comparing the relative variations in height of the male and female students, it can be seen
that the male students have a higher coefficient of variation in height than the female
students. Thus, the male students’ heights are more varied.

Example 2: Compare the variability of the height and weight of the students given the
following data.

                  Mean        Standard Deviation    CV
Height in cm      168 cm      12 cm                 7.14%
Weight in lbs.    200 lbs.    20 lbs.               10.00%

From the results, it can be seen that the weight of the students is more varied than the
height.
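The sample variance, sample standard deviation and coefficient of variation can be verified with a short Python sketch. It applies the computational formula for the sample variance to the Smart TV data and the CV formula to the summary figures from the examples; the function names are hypothetical helpers, and `statistics.variance` from the standard library is used only as a cross-check.

```python
import math
import statistics


def sample_variance(data):
    """Computational formula: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


def coefficient_of_variation(sd, mean):
    """CV = (standard deviation / mean) x 100, expressed in percent."""
    return sd / mean * 100


tv_sales = [20, 45, 36, 102, 8, 17, 42, 23, 25, 32]
variance = sample_variance(tv_sales)
std_dev = math.sqrt(variance)

print(f"Variance: {variance:.2f}")
print(f"Standard deviation: {std_dev:.2f}")
print(f"Library cross-check: {statistics.variance(tv_sales):.2f}")

# Summary figures (standard deviation, mean) taken from the CV examples.
for label, sd, mean in [("Male height", 10, 162), ("Female height", 4, 148),
                        ("Girls' weight", 5, 100), ("Boys' weight", 8, 160)]:
    print(f"{label}: CV = {coefficient_of_variation(sd, mean):.2f}%")
```

For the girls and boys in the opening question, the two coefficients of variation come out equal (5% each), so the two groups are equally varied relative to their means even though the boys have the larger standard deviation.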

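As a recap of the measures of location for ungrouped data, the position formula n(N + 1)/k (with k = 4 for quartiles, 10 for deciles and 100 for percentiles) and the interpolation steps can be sketched in Python. This is a minimal illustration under the handout's position convention, which differs from the defaults of some software; `position`, `score_at` and `measure` are hypothetical helper names. For the data set 10 18 13 14 15 17 12 10 15 16 15, the 3rd and 4th scores of the arrayed data are 12 and 13, so the sketch gives P26 = 12 + 0.12(13 − 12) = 12.12.

```python
import math


def position(n, total, parts):
    """Position of the nth quartile/decile/percentile: n(N + 1) / parts."""
    return n * (total + 1) / parts


def score_at(ordered, pos):
    """Score at a (possibly fractional) position, counting from 1."""
    if pos <= 1:
        return ordered[0]
    if pos >= len(ordered):
        return ordered[-1]
    lower_rank = math.floor(pos)
    fraction = pos - lower_rank            # decimal part of the position
    lower = ordered[lower_rank - 1]        # e.g. the 3rd score for 3.12
    upper = ordered[lower_rank]            # the next score up
    return lower + fraction * (upper - lower)


def measure(data, n, parts):
    """Array the data, compute the position, then locate or interpolate."""
    ordered = sorted(data)
    return score_at(ordered, position(n, len(ordered), parts))


scores = [10, 18, 13, 14, 15, 17, 12, 10, 15, 16, 15]

print("Q1, Q2, Q3:", [measure(scores, n, 4) for n in (1, 2, 3)])
print("D1, D5:", [measure(scores, n, 10) for n in (1, 5)])
print("P26, P80:", [round(measure(scores, n, 100), 2) for n in (26, 80)])
```

Because Q2, D5 and P50 all map to the same position (the 6th score), the function returns the same value for each of them.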