

In everyday life, whether at home or at work, records are kept and reports are read. An item in a
record or report is a fact that is expressed as a numerical value or described by its quality or kind. A
single item or fact is referred to as a datum; all the facts in a record or report are called data. The color of a person's
hair, the number of basketball players on a team, and the number of times you were absent from class are all examples of
data. Data, and how to handle it scientifically, is the major reason why we study Statistics.


Population – is a collection of all the units from which the data is to be collected. A unit in a population is also
called an element of the population.
Sample – A subset or representative part of the population.

When information is gathered for all the units in the population, the process is called a census. When only
part of the population is used to obtain data, the process is called sampling or a sample survey. When the size of
the population is large, a census becomes a long and tedious process aside from having a prohibitive cost. To save
on cost and time, a sample survey is a convenient alternative. The information derived from the data in the sample
is then used to make some generalizations about the population. However, in choosing this option, errors are
unavoidable. The role of Statistics is to provide the procedures that will minimize the errors that are bound to occur.

Statistics – branch of science that deals with the development of methods for a more effective way of collecting,
organizing, presenting, and analyzing data.


Statistics is equipped with methods (how to do it) and theories (why it is done that way). Statistical Methods
refer to the procedures and techniques used from the collection of data to the proper presentation and analysis of
the results. Statistical Theory refers to the development of the formulas used in the computation and the development
of the scientific procedures that constitute the basis of the statistical methods.

The study of Statistics is classified into two major areas, namely:

Descriptive Statistics – deals largely with summary calculations, graphical and tabular displays, and describing
important features of a set of data. It does not attempt to draw conclusions about
anything that pertains to more than the data themselves.
Inferential Statistics – concerned with making generalizations for a bigger group of observations called
population based on information gathered from a small group of observations or sample
drawn from the given population.


The basic element of a statistical analysis is data. These are usually obtained by measuring some
characteristics or properties of the objects, people or things.


Quantitative Data – are those data that can be expressed in numbers. These are the things that can be measured,
like a person’s age, height and weight, or a family’s annual income and a merchant’s profit.
These data can also be counted, like the number of students who failed the Elementary
Statistics, number of pupils enrolled in Montessori High School, and the number of female
Qualitative Data – are those data for which no numerical measures exist and are usually expressed in categories
or kind. Examples of qualitative data are the color of the eyes which can be brown, black,
gray or blue; a person’s gender which is male or female; a person’s educational level which
can be elementary, secondary, college, masters or doctorate.

Variables are the characteristics or properties measured from the objects, persons or things. These variables
can be either discrete or continuous.

STATISTICS 1, 2nd Semester 2016-2017

Discrete Variable – assumes values that are whole numbers. For example, the number of passers and failures
in a Nursing Board Examination.
Continuous Variable – can be measured using some unit of measurement, and may take decimal
values. For example, the heights, weights and ages of the students are continuous variables.


Another way of looking at data is on the way they are measured. Measurement is the process of assigning
a number or a numerical value to a characteristic of the object that is being measured.

Nominal measurement – possesses only the property of identity and does not possess the properties of order and
equality of scale. This is the lowest form of measurement.
Example 1.1: color of the dress, sex of the newborn baby, occupation of parents, and religion of the Manobo tribes
in Bukidnon province.

Ordinal measurement – possesses the properties of both identity and order but not the equality of scale property.
Example 1.2: when students are ranked according to class performance, an order of 1st, 2nd, 3rd, … can be
established. Other examples are ranking of military positions, DepEd ranking of teachers, and taste preference.

Interval measurement – possesses the properties of identity, order, and equality of scale but does not have the
property of absolute zero. The ‘absolute zero property’ means that a value of zero indicates the complete absence of the
characteristic that is being measured.
Example 1.3: temperature recorded in degrees Celsius or Fahrenheit, score of a student on an intelligence exam.

Ratio measurement – possesses all the properties of identity, order, equality of scale and absolute zero. This is
the highest form of measurement.
Example 1.4: height, weight, age, volume.


In sampling, only a relatively small number of respondents or experimental units will be involved, thus,
it is commonly used in practice. We examine some of the advantages for doing so.

The following are the advantages of sampling procedures:

1. it entails less cost and is less time-consuming;

If you survey every person in the population, you are taking a census. However, this method is often
impracticable, as it is very costly in terms of time and money. For example, a survey that asks
complicated questions may need trained interviewers to ensure that the questions are understood. This may
be too expensive if every person in the population is to be included.

2. it is less cumbersome and more practical to administer since you need to gather data from a smaller
number of respondents; and
3. some experiments are destructive, so it is not possible to involve the whole population.

Sometimes taking a census is impossible. For example, a car manufacturer might want to test the
strength of the cars being produced. Obviously, not every car can be crash-tested to determine its strength. To
overcome these problems, samples are taken from populations, and estimates are made about the total
population based on information derived from the sample.

Sampling also has disadvantages, the biggest of which is that the sample may not truly reflect the
characteristics of the population, and this would lead to wrong conclusions. Hence, care must be taken in choosing
a sample. Also, a sample must be large enough to give a good representation of the population, but small enough
to be manageable.

2.1 PROBABILITY SAMPLING – also known as random sampling. This is one in which the elements of the
sample are chosen on the basis of known probabilities. Each element in the
population has an equal and independent chance of being selected as a
sample point. This means that the choice of an element is not influenced by
other considerations such as personal preference, and that the choice of one
element is not dependent upon the choice of another element in the
population.

Simple Random Sampling (SRS) – may be done with or without replacement. With simple random sampling,
each item in a population has an equal chance of inclusion in the sample.
This can be done using the fishbowl method or using random numbers.

Step 1: Assign a number to each element of the population using the numbers from 1 to 𝑁.
Step 2: Select 𝑛 numbers from 1 to 𝑁 using a random process like fishbowl method or draw lots.

Example 2.1: Choose a random sample of 5 students from the following 20 students using draw lots method.

Voltaire Hanalein Bryan Ruffa KP

Ryan Anamie Semplecio Ervina Jonniel
Jhonry Jelord Michael Hammiel Dennis
Kirby Roger Froilan Septemberly Jenny
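
The two steps above can be sketched in Python for Example 2.1. This is only an illustration: the function name `simple_random_sample` and the `seed` argument are assumptions introduced here, and the pseudo-random generator merely stands in for drawing lots.

```python
import random

# The 20 students listed in Example 2.1 (the population, N = 20).
students = [
    "Voltaire", "Hanalein", "Bryan", "Ruffa", "KP",
    "Ryan", "Anamie", "Semplecio", "Ervina", "Jonniel",
    "Jhonry", "Jelord", "Michael", "Hammiel", "Dennis",
    "Kirby", "Roger", "Froilan", "Septemberly", "Jenny",
]

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements without replacement (hypothetical helper)."""
    rng = random.Random(seed)          # seed only makes the draw repeatable
    return rng.sample(population, n)   # every element has the same chance

sample = simple_random_sample(students, 5, seed=2017)
print(sample)
```

A different seed, or no seed at all, gives a different but equally valid sample of 5 students.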

The advantage of simple random sampling is that it is simple and easy to apply when small populations
are involved. However, because every person or item in a population has to be listed before the corresponding
random numbers can be read, this method is very cumbersome to use for large populations.

Another disadvantage of simple random sampling is that we can never be assured that all sectors or groups
are represented in the sample.

Systematic Random Sampling – sometimes called interval sampling, means that there is a gap, or interval,
between each selection. Here we select every $k$th element in the population,
the first unit being chosen at random.
Step 1: Assign a number to each element of the population using the numbers from 1 to $N$.
Step 2: Determine the sampling interval $k$: $k = N/n$, where $N$ = population size and $n$ = sample size.
Step 3: Select a random start $r$ where $1 \le r \le k$. The first unit of the sample is the unit corresponding to $r$; the succeeding units are those numbered $r + k$, $r + 2k$, and so on.
Note: If $k$ is not a whole number, then it is rounded off to the nearest whole number.

Example 2.2: In a population of 120 individuals, choose a systematic random sample of size 10.
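
A minimal Python sketch of Example 2.2 follows, assuming the sampling interval is the population size divided by the sample size (120/10 = 12) and that the random start may be any unit from 1 to the interval. The function name `systematic_sample` and the `seed` argument are hypothetical, introduced only for illustration.

```python
import random

def systematic_sample(N, n, seed=None):
    """Return (k, r, units) for a systematic sample from units numbered 1..N."""
    k = int(N / n + 0.5)                 # sampling interval, rounded to nearest whole number
    rng = random.Random(seed)
    r = rng.randint(1, k)                # random start between 1 and k, inclusive
    units = [r + i * k for i in range(n)]
    # Guard: if rounding k up pushes a unit past N, drop it.
    return k, r, [u for u in units if u <= N]

k, r, units = systematic_sample(120, 10, seed=1)
print("interval:", k, "start:", r)
print("selected units:", units)
```

Whatever the random start, the selected units are always 12 apart and never exceed 120.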

Stratified Random Sampling – the population of 𝑁 units is first divided into homogeneous subpopulations (called
strata) and then a sample is drawn from each stratum. This type of sampling
assures that all groups or strata are represented in the sample. Some examples of
strata commonly used by the SWS Survey are location, age and sex. Other strata
may be religion, academic ability or marital status.
Step 1: Classify the population into at least two homogeneous strata. The basis for classification must be closely
related to the variable of interest. For example, if we are interested in determining the students' opinion on a tuition fee
increase, it may be logical to subdivide the population of students by college, by year level, by tribe, or by a
combination of these.
Step 2: Draw a sample from each stratum by simple or systematic random sampling.

If the population of size $N$ is divided into $k$ homogeneous subpopulations or strata of sizes
$N_1, N_2, \ldots, N_k$, then the sample size $n_i$ to be taken from stratum $i$ is obtained using one of the following formulas:

A. Proportional Allocation: $n_i = \left(\frac{N_i}{N}\right) \times n$ for $i = 1, 2, 3, \ldots, k$.
B. Equal Allocation: $n_i = \frac{n}{k}$ for $i = 1, 2, 3, \ldots, k$.

If $n_i$ is not a whole number, then it is rounded off to the nearest whole number.

Example 2.3: At a small private college, the students may be classified according to the following scheme:

Classification Number of Students

Freshmen 220
Sophomore 195
Junior 163
Senior 150

If we use proportional allocation to select stratified random sample of size 𝑛 = 40, how large a sample must be
taken from each stratum?
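
One way to work out Example 2.3 is sketched below in Python, assuming proportional allocation means each stratum receives its share of the population times the total sample size, rounded to the nearest whole number. The function name `proportional_allocation` is an assumption made for this illustration.

```python
# Stratum sizes from Example 2.3.
strata = {"Freshmen": 220, "Sophomore": 195, "Junior": 163, "Senior": 150}
n = 40  # total sample size

def proportional_allocation(strata, n):
    """Allocate n sample units to strata in proportion to stratum size."""
    N = sum(strata.values())                       # population size, 728 here
    return {name: round(N_i / N * n) for name, N_i in strata.items()}

allocation = proportional_allocation(strata, n)
print(allocation)  # {'Freshmen': 12, 'Sophomore': 11, 'Junior': 9, 'Senior': 8}
```

Here the rounded stratum sample sizes add up to 40. In general, rounding can make the total differ from the intended sample size by a unit or two, so the sum is worth checking.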

Cluster Sampling – divides the population into groups, or clusters. A number of clusters are selected randomly
to represent the population, and then all units within selected clusters are included in the
sample. No units from non-selected clusters are included in the sample. They are represented
by those from selected clusters. This differs from stratified sampling, where some units are
selected from every group.
Step 1: Divide the population area into heterogeneous sections or clusters.
Step 2: Select randomly a few from these clusters.

Example 2.4: Suppose the population of a study is all registered voters of the country. The population may be
considered to be clustered or segregated into 16 regions. From these 16 regions, we may select 5 regions
randomly. From the regions drawn, we select all its registered voters.

2.2 NON-PROBABILITY SAMPLING – is one in which individuals or items are chosen without regard to their
probability of occurrence. This is usually used when the size of the
population is either unknown or cannot be individually identified.
Here, personal preferences are applied.

Quota Sampling – also known as convenience sampling. The main consideration directing quota sampling is the
researcher’s ease of access to the sample population. In addition to convenience, he/she is
guided by some visible characteristic, such as gender or race of the study population that is of
interest to him/her. The sample is selected from the location convenient to the researcher and
whenever a person with this visible relevant characteristic is seen, the person is asked to
participate in the study. This process continues until the required number of respondents
(quota) is obtained.

Example 2.5: Suppose you want to select a sample of 20 male students. You may stand at a convenient location
and whenever you see a male student, you collect the required information. You continue until you have 20 male students.

Accidental Sampling – similar to quota sampling except that the researcher is not guided by any obvious
characteristic. This is common among market research and newspaper reports.

Example 2.6: Suppose you want to get a sample of 20 users of soap “A”. You stand at a convenient location and
ask the person you see if he/she is a user of soap “A”. If he/she is, then get the required information. You continue
until you have 20 respondents.

Judgment or Purposive Sampling – the researcher purposely chooses who can provide the best information
to achieve the objectives of the study. The researcher only goes to people
who in his/her opinion are likely to have the required information and are
willing to share it. This is important when you want to construct a historical
reality, describe a phenomenon or develop something about which only a
little is known.

Example 2.7: A student conducted a study on the history of Tagum City. To get proper information, he interviewed
past mayors, city officials and pioneering employees and staff of City Hall. He also interviewed the City
Information Office since they may have also the past records about Tagum City.

Snowball Sampling – a process of selecting a sample using networks.

Example 2.8: A researcher wanted to study the factors that lead some students to occasionally use prohibited drugs. He
intended to get 50 students, but he only knew 5 students who used them. By getting the cooperation of these 5
students, he was referred to other drug users, who in turn also provided additional contacts. In this way, he was
able to get the number of students he needed.


There are generally two methods of presenting data, namely, tabular presentation and graphical
presentation.

Percentage or Frequency Tables

Table 3.1 Distribution of Ethnic Origin of Residents in Iligan City

Ethnic Origin    Frequency    Percentage (%)
Boholano            8,964          4.02
Cebuano            74,147         33.28
Iliganon           74,292         33.34
Ilonggo             5,075          2.28
Luzonian            6,065          2.72
Maranao             9,661          4.33
Misamisnon         14,723          6.61
Siquihodnon        12,780          5.74
Waray               4,799          2.15
Others             12,312          5.53
Source: 1994 Iligan Census Summary Report

Table 3.2 Sample Data Set of 50 Students

No. Gender Year Level Score No. Gender Year Level Score
01 Female Junior 18 26 Female Freshman 40
02 Male Senior 31 27 Male Senior 39
03 Male Junior 37 28 Female Senior 54
04 Female Senior 21 29 Male Freshman 23
05 Male Junior 43 30 Male Sophomore 25
06 Male Freshman 16 31 Male Sophomore 10
07 Male Senior 48 32 Male Freshman 58
08 Female Sophomore 20 33 Female Freshman 54
09 Male Freshman 18 34 Male Sophomore 32
10 Female Sophomore 47 35 Male Junior 29
11 Female Senior 28 36 Male Freshman 32
12 Male Senior 32 37 Male Senior 22
13 Female Freshman 49 38 Female Senior 30
14 Male Freshman 24 39 Female Freshman 26
15 Male Sophomore 12 40 Male Freshman 38
16 Male Junior 42 41 Male Sophomore 47
17 Female Sophomore 36 42 Male Sophomore 48
18 Male Freshman 63 43 Female Senior 56
19 Male Senior 30 44 Female Senior 61
20 Male Senior 19 45 Female Junior 28
21 Male Sophomore 15 46 Female Freshman 26
22 Male Freshman 29 47 Male Junior 40
23 Male Sophomore 26 48 Male Junior 17
24 Female Freshman 28 49 Female Junior 36
25 Male Sophomore 23 50 Female Sophomore 35

Note: Most of the presentations in the succeeding sections will be based on the information given by a sample of
fifty (50) students with their corresponding gender, year level and final examination scores as summarized in
Table 3.2 above.

Cross Tabulation Table

When data are in categories, a table listing the frequencies for the different combination of values of two
categorical variables is called a cross tabulation table.

Table 3.3 Distribution of Gender by Year Level

Gender    Freshman    Sophomore    Junior    Senior
Male          9            9          6         7
Female        6            4          3         6

Table 3.4 A 4x2 Contingency Table

Year Level    Male    Female    Total
Freshman        9        6        15
Sophomore       9        4        13
Junior          6        3         9
Senior          7        6        13
TOTAL          31       19        50
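
A cross tabulation can also be produced by tallying the raw records. The Python sketch below does this for the gender and year level of the 50 students in Table 3.2. The short codes (F/M, Fr/So/Jr/Sr) are abbreviations introduced here for compactness, not part of the original data set.

```python
from collections import Counter

# Gender and year level of the 50 students in Table 3.2, in order of student
# number (F/M = gender; Fr, So, Jr, Sr = Freshman, Sophomore, Junior, Senior).
records = """
F-Jr M-Sr M-Jr F-Sr M-Jr M-Fr M-Sr F-So M-Fr F-So
F-Sr M-Sr F-Fr M-Fr M-So M-Jr F-So M-Fr M-Sr M-Sr
M-So M-Fr M-So F-Fr M-So F-Fr M-Sr F-Sr M-Fr M-So
M-So M-Fr F-Fr M-So M-Jr M-Fr M-Sr F-Sr F-Fr M-Fr
M-So M-So F-Sr F-Sr F-Jr F-Fr M-Jr M-Jr F-Jr F-So
""".split()

# Count each (gender, year level) combination.
counts = Counter(tuple(r.split("-")) for r in records)

levels = ["Fr", "So", "Jr", "Sr"]
for gender in ("M", "F"):
    row = [counts[(gender, lv)] for lv in levels]
    print(gender, row, "total:", sum(row))
# M [9, 9, 6, 7] total: 31
# F [6, 4, 3, 6] total: 19
```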

Frequency Distribution Table (FDT)

A frequency distribution table is a grouping of all the observations into intervals or classes together with
a count of the number of observations that fall in each interval or class. In frequency distributions the data is
presented in a more compact and usable manner. However, this process brings about some loss of details.

Steps in Constructing a Frequency Distribution Table

Step 1: Find the range $R$, where $R$ = highest value – lowest value.
Step 2: Estimate the number of classes or intervals, $k$, using $k = \sqrt{n}$ or $k = 1 + 3.322 \log_{10} n$,
where $n$ is the number of observations. You may also choose any number of classes such that $5 < k < 20$. (Note: The
results are rounded off to the nearest whole number.)
Step 3: Estimate the class width $c$ of the interval by dividing the range $R$ by the number of classes $k$, that is,
$c = R/k$. (Note: Round up this estimate to the same number of significant decimal places as the original set of
data.)
Step 4: List the lower and upper class limits of the first class interval.
Step 5: List all the succeeding lower and upper class limits by adding the class width $c$ to the lower limit of the
preceding class interval. The upper class limit of the first interval should be the number just before the lower class limit of
the second interval. The highest class should contain the largest observation in the data set.
Step 6: Make a tally. From the raw data, determine the interval in which a data value belongs, and then add one
to the tally of that interval. Repeat this process for all data values, and then get the total frequency for each class.
Step 7: Compute the class boundaries of each interval. The lower class boundary $L_i$ is obtained by subtracting half of the
unit of precision of the data from the lower class limit, and the upper class boundary $U_i$ is obtained by adding half of the
unit of precision to the upper class limit. For data recorded as whole numbers, the unit of precision is 1. This
is done in order to close the gap between two adjacent intervals.


Step 8: Compute the class marks ($\bar{x}$). The class midpoint or class mark is the midpoint of an interval. It is
computed by $\bar{x}_i = \frac{l_i + u_i}{2}$, where $l_i$ is the lower class limit for the $i$th interval and $u_i$ is the upper class limit for
the $i$th interval.
Step 9: Make a column containing relative frequencies to obtain a relative frequency distribution. The relative
frequency for each interval is found by dividing the class frequency by the total frequency. Another variation may
be obtained by multiplying the relative frequency of each class by 100% to get a percentage distribution.
Step 10: Make another column for cumulative frequency distribution. The cumulative frequency associated with
the upper class boundary of a particular interval is computed by summing the frequency for that interval and the
frequencies of all the intervals below it.

Example 3.1: Make a frequency distribution table using test scores in Table 3.2.
Step 1: Compute the range, 𝑅 = 63 − 10 = 53.
Step 2: Estimate the number of classes. $k = \sqrt{50} = 7.07 \approx 7$ or $k = 1 + 3.322 \log_{10} 50 = 6.64 \approx 7$. We use 7 classes.
Step 3: Estimate $c$, the width of the interval. $c = \frac{R}{k} = \frac{53}{7} = 7.57 \approx 8$.
Step 4 and Step 5: List the lower and upper class limits of the first class interval. Then, list all the succeeding
lower and upper class limits using 𝑐.
Step 6: Make a tally, then get the total frequencies of each interval.
Step 7: Compute the class boundaries. Subtract $\frac{1}{2}(1) = 0.5$ from each lower class limit and add $\frac{1}{2}(1) = 0.5$ to each upper class limit.
Step 8: Compute the class mark 𝑥̅ .
Step 9: Make a column for relative frequency.
Step 10: Make a column for cumulative frequency.
Class Limits   Tally            Frequency   Class Boundaries   Class Mark ($\bar{x}$)   Relative Frequency   Cumulative Frequency
10 – 17        11111            5           9.5 – 17.5         13.5                     0.1                  5
18 – 25        1111111111       10          17.5 – 25.5        21.5                     0.2                  15
26 – 33        11111111111111   14          25.5 – 33.5        29.5                     0.28                 29
34 – 41        11111111         8           33.5 – 41.5        37.5                     0.16                 37
42 – 49        1111111          7           41.5 – 49.5        45.5                     0.14                 44
50 – 57        111              3           49.5 – 57.5        53.5                     0.06                 47
58 – 65        111              3           57.5 – 65.5        61.5                     0.06                 50
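
The ten steps of Example 3.1 can be traced with a short Python sketch. It assumes that the first class starts at the lowest score, that the number of classes is estimated with the logarithmic rule, and that the data are whole numbers (unit of precision 1). The dictionary keys are names chosen for this illustration only.

```python
import math

# Final examination scores of the 50 students in Table 3.2, by student number.
scores = [
    18, 31, 37, 21, 43, 16, 48, 20, 18, 47,
    28, 32, 49, 24, 12, 42, 36, 63, 30, 19,
    15, 29, 26, 28, 23, 40, 39, 54, 23, 25,
    10, 58, 54, 32, 29, 32, 22, 30, 26, 38,
    47, 48, 56, 61, 28, 26, 40, 17, 36, 35,
]

n = len(scores)
R = max(scores) - min(scores)            # Step 1: range
k = round(1 + 3.322 * math.log10(n))     # Step 2: number of classes
c = math.ceil(R / k)                     # Step 3: class width, rounded up

table = []
cumulative = 0
lower = min(scores)                      # assumption: first class starts at the minimum
for _ in range(k):
    upper = lower + c - 1                # Steps 4-5: class limits for whole-number data
    f = sum(lower <= x <= upper for x in scores)    # Step 6: tally
    cumulative += f
    table.append({
        "limits": (lower, upper),
        "frequency": f,
        "boundaries": (lower - 0.5, upper + 0.5),   # Step 7
        "class_mark": (lower + upper) / 2,          # Step 8
        "relative": f / n,                          # Step 9
        "cumulative": cumulative,                   # Step 10
    })
    lower += c

for row in table:
    print(row)
```

The frequencies, class marks and cumulative frequencies printed by this sketch agree with the table of Example 3.1.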


No report is customarily complete without an accompanying picture or graph. This is readily defensible
by the saying “a picture paints a thousand words”. Thus, a frequency distribution table is further enhanced
through its graphical presentation.

Bar Chart
A bar chart is a graph where the different classes are represented by rectangles or bars. The width of each
rectangle along the horizontal axis corresponds to the class limits, or to the categories for nominal variables, while the
length of the rectangle corresponds to the class frequency.

A graph that closely resembles the bar chart is the histogram. The basic difference between the two
graphs is that a bar chart uses the class limits for the horizontal axis while the histogram employs the class
boundaries. Using the class boundaries eliminates the spaces between the rectangles, giving it a solid appearance.

Frequency Ogive
A cumulative frequency distribution can be represented graphically by a frequency ogive. An ogive is
obtained by plotting the upper class boundaries on the horizontal scale and the corresponding cumulative
frequency in the vertical scale.


Frequency Polygon
Another useful method of presenting data graphically is the frequency polygon. A frequency
polygon is constructed by plotting the class marks on the horizontal scale against their frequencies on
the vertical scale. To complete the polygon, which is mathematically defined as a closed figure, an additional
class mark is added at the beginning and at the end of the distribution. These additional class marks are each
assigned a frequency of zero.

Pie Chart
A pie chart is a circle divided into pie-shaped sectors, which look like slices of a pizza. The angle of
each sector is proportional to the frequency or percentage it represents.
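
As a small illustration of how sector angles follow from frequencies, the Python sketch below converts the year-level totals of Table 3.4 into angles, assuming each angle is the class's share of the total multiplied by 360 degrees.

```python
# Year-level totals from Table 3.4.
frequencies = {"Freshman": 15, "Sophomore": 13, "Junior": 9, "Senior": 13}

total = sum(frequencies.values())
# Each sector angle is the relative frequency times the full circle (360 degrees).
angles = {level: 360 * f / total for level, f in frequencies.items()}

for level, angle in angles.items():
    print(f"{level}: {angle:.1f} degrees")
```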


Many of the computations in statistics involve the summation notation applied to the observed data.
The summation notation $\sum_{i=1}^{n} x_i$, read as “the sum of the $x_i$'s where $i$ ranges from 1 to $n$”, is defined as
follows: $\sum_{i=1}^{n} x_i = x_1 + x_2 + x_3 + \cdots + x_n$, where $i$ is called the index of summation, 1 is the lower limit and $n$
is the upper limit of the summation.

Example 4.1:
a. $\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5$
b. $\sum_{i=1}^{3} (x_i + y_i) = (x_1 + y_1) + (x_2 + y_2) + (x_3 + y_3)$

Rules of Summation:
a. $\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i$
b. $\sum_{i=1}^{n} a x_i = a \sum_{i=1}^{n} x_i$, where $a$ is any constant.
c. $\sum_{i=1}^{n} a = na$, where $a$ is any constant.

Example 4.2:
Given $x_1 = 3$, $x_2 = 4$, $x_3 = 8$, $x_4 = -2$, $y_1 = -6$, $y_2 = -1$, $y_3 = 5$, $y_4 = 0$, find the value of the following:
a. $\sum_{i=1}^{4} x_i^2 = x_1^2 + x_2^2 + x_3^2 + x_4^2 = 3^2 + 4^2 + 8^2 + (-2)^2 = 9 + 16 + 64 + 4 = 93$
b. $\left(\sum_{i=1}^{4} x_i\right)^2 = (x_1 + x_2 + x_3 + x_4)^2 = (3 + 4 + 8 + (-2))^2 = (13)^2 = 169$
c. $\sum_{i=1}^{2} x_i y_i = x_1 y_1 + x_2 y_2 = 3(-6) + (4)(-1) = -18 - 4 = -22$
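
The three sums in Example 4.2 can be checked with a few lines of Python. The variable names are chosen for this illustration. The sketch also shows that the sum of squares and the square of the sum are different quantities.

```python
x = [3, 4, 8, -2]
y = [-6, -1, 5, 0]

sum_of_squares = sum(xi ** 2 for xi in x)          # a. square each value, then add
square_of_sum = sum(x) ** 2                        # b. add the values, then square
# c. products of the first two pairs only (i = 1 to 2)
sum_of_products = sum(xi * yi for xi, yi in zip(x[:2], y[:2]))

print(sum_of_squares, square_of_sum, sum_of_products)  # 93 169 -22
```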

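Given $x_1 = 3$, $x_2 = 4$, $x_3 = 8$, $x_4 = -2$, $y_1 = -6$, $y_2 = 1$, $y_3 = 5$, $y_4 = 0$, find the value of the following:
a. $\sum_{i=1}^{4} x_i^2$
b. $\sum_{i=1}^{3} (3x_i + y_i)$
c. $\left(\sum_{i=2}^{4} x_i\right)\left(\sum_{i=2}^{4} y_i\right)$
d. $\left(\sum_{i=1}^{4} y_i\right) + 5$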


We will examine different statistical measures that are computed when given a set of data. Some of these
measures are applicable for both numerical and non-numerical data (categorical data) but many of these are
applicable only to numerical data.

Statistic – is a characteristic or measure obtained by using the data values from a sample.
Parameter – is a characteristic or measure obtained by using all the data values for a specific population.


These are also called measures of average; they usually indicate the center of a set of data. Measures of the
center are very important because they usually represent the common value of the observations.

A. Arithmetic Mean
The mean is the sum of values divided by the total number of values. This is commonly called the average
in layman’s term. In statistics, all measures of center are called average.


Parameter: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$        Statistic: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

It can be seen from the formulas that the procedure for finding the mean of a population or of a sample is
the same. The mean is the most common measure of the center used for numerical data.

Example 5.1: The ages in weeks of six kittens at an animal shelter are 3, 8, 5, 12, 14 and 12. Find the mean.
$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{3 + 8 + 5 + 12 + 14 + 12}{6} = \frac{54}{6} = 9$
Thus, the mean age of the kittens is 9 weeks.

Example 5.2: The fat contents in grams for one serving of 11 brands of packaged foods, as determined by the U.S.
Department of Agriculture, are given as follows: 6.5, 6.5, 9.5, 8.0, 14.0, 8.5, 3.0, 7.5, 16.5, 7.0, 8.0. Find the
mean.
$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{6.5 + 6.5 + 9.5 + 8.0 + 14.0 + 8.5 + 3.0 + 7.5 + 16.5 + 7.0 + 8.0}{11} = \frac{95}{11} \approx 8.64$
Thus, the mean of fat contents in grams for one serving of 11 brands of packaged foods is 8.64 grams.
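
Both examples can be reproduced with a short Python sketch. The helper name `mean` is introduced here only for illustration; it divides the sum of the values by the number of values, as in the formula.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)

kitten_ages = [3, 8, 5, 12, 14, 12]                                   # Example 5.1
fat_contents = [6.5, 6.5, 9.5, 8.0, 14.0, 8.5, 3.0, 7.5, 16.5, 7.0, 8.0]  # Example 5.2

print(mean(kitten_ages))              # 9.0
print(round(mean(fat_contents), 2))   # 8.64
```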

Properties of the Mean:

1. It is unique, meaning it has only one value.
2. It can be computed for numerical data only, that is interval or ratio level data.
3. It is easily affected by extreme values in the data. Thus, one should be cautious in using the mean when
there are extreme observations or outliers. If the outlier is extremely low, it pulls down the value of the
mean. If the outlier is extremely high, it pulls the mean up. If the mean is greatly affected, then our
summary description of the data is distorted.

B. Median
When the data are arranged in increasing or decreasing order, the median is the halfway point or middle
value in a data set. Meaning, the median is the data point which divides the distribution into two equal parts.

Steps in Computing the Median from the set of data:

Step 1: Arrange the data in increasing (or decreasing) order of magnitude.
Step 2: Select the middle point.
CASE 1: When the number of observations is odd, there is only one middle value and this is the median. This
value is located at the $\left(\frac{n+1}{2}\right)$th data point.

Parameter: $\tilde{\mu} = X_{\frac{N+1}{2}}$        Statistic: $\tilde{x} = X_{\frac{n+1}{2}}$
Example 5.3: The weights (in pounds) of a sample of seven army recruits are 180, 201, 220, 191, 219, 209 and
186. Find the median.
Step 1: Arrange the data in order.
180, 186, 191, 201, 209, 219, 220
Step 2: Select the middle value.
Since there are seven (7) observations, $\tilde{x} = X_{\frac{n+1}{2}} = X_{\frac{7+1}{2}} = X_{\frac{8}{2}} = X_4$, the 4th observation, which is
the weight 201 pounds.

CASE 2: When the number of observations is even, there are two middle values. The median is the mean
or average of the two middle values. The positions of the two data points are $\left(\frac{n}{2}\right)$ and $\left(\frac{n}{2} + 1\right)$.

Parameter: $\tilde{\mu} = \frac{X_{\frac{N}{2}} + X_{\frac{N}{2}+1}}{2}$        Statistic: $\tilde{x} = \frac{X_{\frac{n}{2}} + X_{\frac{n}{2}+1}}{2}$
Example 5.4: The ages of a sample of 10 college students are 18, 24, 20, 35, 19, 23, 26, 23, 19, 20. Find the
median.
Step 1: Arrange the data in order: that is, 18, 19, 19, 20, 20, 23, 23, 24, 26, 35.
Step 2: Select the middle value.
The two (2) middle values give $\tilde{x} = \frac{X_{\frac{n}{2}} + X_{\frac{n}{2}+1}}{2} = \frac{X_{\frac{10}{2}} + X_{\frac{10}{2}+1}}{2} = \frac{X_5 + X_{5+1}}{2} = \frac{X_5 + X_6}{2} = \frac{20 + 23}{2} = \frac{43}{2} = 21.5$.
Therefore, the median age is 21.5 years.
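
The two cases can be combined in a single Python sketch. The function name `median` is an assumption made for this illustration; positions in the text are counted from 1, while Python list indices start at 0, hence the shifts by one.

```python
def median(values):
    """Median by the two-case rule (odd or even number of observations)."""
    ordered = sorted(values)                 # Step 1: arrange in increasing order
    n = len(ordered)
    if n % 2 == 1:
        # CASE 1: odd n, the single middle value at position (n + 1) / 2
        return ordered[(n + 1) // 2 - 1]
    # CASE 2: even n, average of the values at positions n/2 and n/2 + 1
    mid = n // 2
    return (ordered[mid - 1] + ordered[mid]) / 2

recruits = [180, 201, 220, 191, 219, 209, 186]        # Example 5.3
ages = [18, 24, 20, 35, 19, 23, 26, 23, 19, 20]       # Example 5.4

print(median(recruits))  # 201
print(median(ages))      # 21.5
```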

The median is a good alternative measure of the center when there are extreme values. It is easy to compute
if there are few observations. However, if we have a large set of data, the use of computers is essential in arranging
these data.

Properties of Median:
a. It is unique (for numerical data).
b. It can be computed for ordinal, interval or ratio level.
c. It is not affected by extreme values since the median uses only the middle values.

C. Mode
The third measure of average is called the mode. The mode is the value that occurs most often in
a data set; in this sense, it is the most typical value. A data set can have more than one mode or no mode
at all.
Parameter: $\hat{\mu}$        Statistic: $\hat{x}$
Example 5.5: The following data represent the duration (in days) of US space shuttle voyages for the years 1992-
1994 (Source: The Universal Almanac 1995, p. 563). Find the mode.
8, 9, 9, 14, 8, 8, 19, 7, 6, 9, 7, 8, 10, 14, 11, 8, 14, 11

It is helpful to arrange the data in order, although it is not necessary.

6, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 10, 11, 11, 14, 14, 14, 19
Since 8-day voyages occurred five times – a frequency larger than any other number – the mode for the
data set is 8.

Example 5.6: Ten (10) students were asked for their opinion on the tuition fee increase, and their responses are: in favor,
in favor, not in favor, not in favor, not in favor, neutral, in favor, in favor, not in favor, and neutral.

Since in favor and not in favor both occur four times, the data set has two modes: in favor and
not in favor.
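
Examples 5.5 and 5.6 can be checked with the `multimode` function of Python's standard `statistics` module (available from Python 3.8), which returns every value tied for the highest frequency. Note one difference from the text: when all values occur equally often, `multimode` returns all of them rather than reporting that no mode exists.

```python
from statistics import multimode

# Example 5.5: durations (in days) of the space shuttle voyages, as listed.
voyages = [8, 9, 9, 14, 8, 8, 19, 7, 6, 9, 7, 8, 10, 14, 11, 8, 14, 11]

# Example 5.6: opinions on the tuition fee increase (categorical data).
opinions = [
    "in favor", "in favor", "not in favor", "not in favor", "not in favor",
    "neutral", "in favor", "in favor", "not in favor", "neutral",
]

print(multimode(voyages))   # [8]
print(multimode(opinions))  # ['in favor', 'not in favor']
```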

Properties of Mode:
a. It can be computed for any type of data whether it is nominal, ordinal, interval or ratio level data.
b. It may not be unique, since a data set can have more than one mode.
c. It may not exist.

D. Weighted Mean
Sometimes, one must find the mean of a data set in which not all values have the same degree of
importance. An example is a data set containing the scores of a student in the quizzes, exams and assignments of a particular
subject, where scores in major exams weigh more than those in quizzes. A mean that takes this
additional factor into account is called the weighted mean.
$\bar{x}_w = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i}$
where $w_i$ = weight of observation $i$, $k$ = number of distinct observations and $x_i$ = the values.

Example 5.7: Anna is a DOST scholar at the University of the Philippines – Diliman. She got the following
grades in her subjects last semester:

Subject       Grade    Units
Math 17       2.25     6
English 1     1.50     3
History 1     1.75     3
Filipino 1    2.00     3
English 3     1.75     3
P.E. 1        1.25     2

Compute the grade point average (GPA) of Anna. Will she be able to maintain her scholarship if the required
maintaining grade is at least 1.75 (on this grading scale, a lower number is a better grade)?

Subject       GRADE ($x_i$)    UNITS ($w_i$)    GRADE × UNITS ($w_i x_i$)
Math 17       2.25             6                13.5
English 1     1.50             3                4.5
History 1     1.75             3                5.25
Filipino 1    2.00             3                6.0
English 3     1.75             3                5.25
P.E. 1        1.25             2                2.5
TOTAL                          20               37

$\bar{x}_w = \frac{\sum_{i=1}^{6} w_i x_i}{\sum_{i=1}^{6} w_i} = \frac{37}{20} = 1.85$
Therefore, she is not able to maintain her scholarship since her GPA is 1.85.
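
The same computation can be written as a short Python sketch, with the units acting as weights. The function name `weighted_mean` is an assumption introduced for this illustration.

```python
def weighted_mean(values, weights):
    """Sum of weight times value, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

grades = [2.25, 1.50, 1.75, 2.00, 1.75, 1.25]   # x_i, from Example 5.7
units = [6, 3, 3, 3, 3, 2]                      # w_i, from Example 5.7

gpa = weighted_mean(grades, units)
print(gpa)  # 1.85
```

For comparison, the unweighted mean of the six grades is 1.75; giving the 6-unit Math 17 grade its proper weight is what moves the GPA to 1.85.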

Example 5.8: Suppose a survey asked a sample of 30 respondents to rate a movie on its cinematography. The rating scale
is from 1 to 5, with 1 being the lowest. A summary of the data shows that twelve (12) gave a rating of five (5), eight (8)
gave a rating of four (4), seven (7) gave a rating of three (3), and three (3) gave a rating of two (2). Find the
average rating.


These are also called measures of variation. They measure the degree to which numerical data are scattered
or spread.

A. Range
The range is the simplest measure of dispersion. It is the difference between the highest and lowest values in a set of data. That is,
Range, 𝑹 = highest value – lowest value
The range is considered a poor measure of dispersion in the sense that it only considers two values in its
computation. Thus, it cannot accurately determine how spread the values are in a given data set.

Example 5.9: The given data below represents the lifespan of the paints expressed in terms of months.
BRAND A: 45, 60, 50, 55, 48, 56, 57
BRAND B: 35, 25, 45, 28, 39, 40, 44
Find the range for each brand.

For BRAND A: 𝑅 = 60 − 45 = 15 months.

For BRAND B: 𝑅 = 45 − 25 = 20 months.
Therefore, the lifespans of BRAND A are less varied than those of BRAND B.

B. Variance
Variance is the average of the squares of the distances of each data value from the mean.

Definitional Formula:
Parameter: $\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$        Statistic: $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$

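You may ask, why should each term of the numerator be squared? This is because $\sum_{i=1}^{N}(x_i - \mu) = 0$ and
$\sum_{i=1}^{n}(x_i - \bar{x}) = 0$; that is, the sum of the deviations from the mean is always zero, so without squaring, the positive and negative deviations would cancel out.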

STATISTICS 1, 2nd Semester 2016-2017

The formulas for the population variance and the sample variance are almost the same except for the
denominator. The denominator of the sample variance, $s^2$, is $n-1$ and not $n$ because dividing by $n-1$ makes the sample
variance an unbiased estimator of the population variance, whereas dividing by $n$ does not. For a large sample
size $n$ (say over 30), however, it hardly matters whether we divide by $n$ or $n-1$ because the results are almost
the same.

The disadvantage of the definitional formula is that it can lead to serious rounding-off errors, especially
when the value of the mean is itself a rounded-off value. Hence, we have the alternative formulas below, which
minimize this error. These formulas are derived by expanding the definitional formula.

Computational Formula:
Parameter: $\sigma^2 = \frac{N\sum_{i=1}^{N} x_i^2 - \left(\sum_{i=1}^{N} x_i\right)^2}{N^2}$
Statistic: $s^2 = \frac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n-1)}$

Example 5.10: A comparison of coffee prices at 4 randomly selected grocery stores showed increases from the
previous month of 12, 15, 17 and 20 cents for a 200-gram jar. Find the variance of this random sample of price
increases.

Note that the data were collected from a random sample of 4 grocery stores. If we use the definitional
formula we have the following computations:

Mean, $\bar{x} = \frac{\sum_{i=1}^{4} x_i}{4} = \frac{12+15+17+20}{4} = \frac{64}{4} = 16$ cents.

Sample variance, $s^2 = \frac{\sum_{i=1}^{4}(x_i-\bar{x})^2}{4-1} = \frac{(12-16)^2+(15-16)^2+(17-16)^2+(20-16)^2}{4-1} = \frac{16+1+1+16}{3} = 11.3$ cents$^2$.

If we use the computational formula, we have the following computations for the sample variance:

$n\sum_{i=1}^{4} x_i^2 = 4(12^2 + 15^2 + 17^2 + 20^2) = 4(144 + 225 + 289 + 400) = 4(1058) = 4232$
$\left(\sum_{i=1}^{4} x_i\right)^2 = (12 + 15 + 17 + 20)^2 = (64)^2 = 4096$

Thus, $s^2 = \frac{4\sum_{i=1}^{4} x_i^2 - \left(\sum_{i=1}^{4} x_i\right)^2}{4(4-1)} = \frac{4232-4096}{4(3)} = \frac{136}{12} = 11.3$ cents$^2$.

1. The result of the computations will be the same as long as we don't round off the computations except in
the final answer.
2. The value of the variance cannot be negative.
3. The unit of measurement for the variance is in square units of the original measure.
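
The agreement between the definitional and computational formulas for the sample variance can be illustrated with the coffee data. The sketch below uses exact fractions so that no rounding occurs before the final answer. The function names are assumptions made for this example, and the code assumes integer data with at least two observations.

```python
from fractions import Fraction


def sample_variance_definitional(data):
    """s^2 = sum((x_i - mean)^2) / (n - 1), computed exactly."""
    n = len(data)
    mean = Fraction(sum(data), n)
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_variance_computational(data):
    """s^2 = (n * sum(x_i^2) - (sum(x_i))^2) / (n * (n - 1)), computed exactly."""
    n = len(data)
    return Fraction(n * sum(x * x for x in data) - sum(data) ** 2, n * (n - 1))


# Price increases in cents from Example 5.10
increases = [12, 15, 17, 20]
var_def = sample_variance_definitional(increases)
var_comp = sample_variance_computational(increases)
print(var_def, var_comp, round(float(var_def), 1))  # 34/3 34/3 11.3
```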

C. Standard Deviation
The standard deviation is the positive square root of the variance. It has the same unit of measurement
with the given data. It can be used to compare variability of two or more sets of data having the same units of
measurement with approximately the same mean. It enables us to determine, with a great deal of accuracy, where
the values of a distribution are located in relation to the mean.

Parameter: 𝝈 = √𝝈𝟐 Statistic: 𝒔 = √𝒔𝟐

Example 5.11: Find the standard deviation of the coffee price increase in Example 5.10.

𝑠 = √𝑠 2 = √11.3 𝑐𝑒𝑛𝑡𝑠 2 = 3.36 𝑐𝑒𝑛𝑡𝑠.

Therefore, the standard deviation of the coffee price increases is 3.36 cents.
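
As a cross-check, the sample standard deviation can be computed with Python's standard `statistics` module, which divides by n − 1 in `variance` and `stdev`. This is only a sketch. Note that carrying the unrounded variance 34/3 through the square root gives about 3.37, whereas the worked example rounds the variance to 11.3 first and obtains 3.36.

```python
import math
import statistics

# Price increases in cents from Example 5.10
increases = [12, 15, 17, 20]

s_squared = statistics.variance(increases)  # sample variance, divides by n - 1
s = statistics.stdev(increases)             # positive square root of the sample variance

print(round(s_squared, 2), round(s, 2))
print(round(math.sqrt(11.3), 2))  # square root of the rounded variance used in the example
```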


D. Coefficient of Variation
Whenever two or more samples have the same units of measure and approximately the same mean, the
standard deviation for each can be compared directly. The coefficient of variation is a statistic that allows one to
compare standard deviations when the observations are expressed in different units of measurement or
when the sets have different means.

The coefficient of variation expresses the standard deviation as a fraction (or percent) of the mean. The
result is expressed as a percentage.
Parameter: $CV = \frac{\sigma}{\mu} \times 100\%$
Statistic: $CV = \frac{s}{\bar{x}} \times 100\%$

Example 5.12: The mean of the number of cars sold over a three-month period in branches of Toyota,
Incorporation is 87 and the standard deviation is 5. The mean of the commissions is $5225 and the standard
deviation is $773. Compare the variations of the two.

Since the units of measurement are different, we use the coefficient of variation to compare their relative
variation.

Cars sold: $CV = \frac{\sigma}{\mu} \times 100\% = \frac{5}{87} \times 100\% = 5.75\%$
Commission: $CV = \frac{\sigma}{\mu} \times 100\% = \frac{773}{5225} \times 100\% = 14.79\%$
Since the coefficient of variation is larger for commissions, the commissions are more varied than the
number of cars sold.
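
The comparison in Example 5.12 can be sketched in Python. The function name `coefficient_of_variation` is an assumption for this illustration; it simply expresses the standard deviation as a percentage of the mean.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (standard deviation / mean) * 100, as a percentage."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100


cv_cars = coefficient_of_variation(5, 87)
cv_commission = coefficient_of_variation(773, 5225)

print(f"{cv_cars:.2f}% {cv_commission:.2f}%")  # 5.75% 14.79%
```

Because the CV has no units, the two results can be compared directly even though one variable is a count and the other is an amount of money.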

Exercise: The mean of the number of pages of a sample of women’s fitness magazines is 132, with a variance of
23; the mean of the number of pages of a sample of men’s fitness magazines is 182, with a variance of 62.
Compare the variations of pages of the two magazines.


In addition to measures of central tendency and measures of variation, there are also measures of position,
which locate a value at the center or at any other point in the distribution of the data. These measures include percentiles,
deciles, quartiles, and the z-score.

A. Percentiles
Percentiles are values that divide a set of observations (arranged in increasing order) into 100 equal parts.
We use $P_k$ ($k = 1, 2, 3, \ldots, 99$) to denote the $k$th percentile, the value such that $k\%$ of the observations fall below it.

Steps in computing $P_k$:
Step 1: Arrange the data in increasing order of magnitude.
Step 2: Find the location $L$ of the $k$th percentile by computing $L = \frac{k}{100} \times n$, where $n$ is the number of observations.
Step 3: If $L$ is an integer, then the desired value is the average of the $L$th observation and the $(L+1)$th observation. If $L$
is not an integer, round $L$ up to the next integer. The desired value is the observation located at the rounded-up value
of $L$.

Example 5.13: The number of movies attended last month by a random sample of 12 students are recorded as
follows: 2, 0, 3, 1, 6, 4, 7, 5, 8, 9, 10, and 11. Find the following:

1. $P_{48}$
Arrange the data in increasing order. That is, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
$L = \frac{48}{100} \times 12 = 5.76$, and since this is not a whole number, we round it up to 6. Then
$P_{48}$ = 6th observation = 5.
Therefore, 48% of the observations fall below 5.


2. $P_{75}$
$L = \frac{75}{100} \times 12 = 9$, and since this is a whole number,
$P_{75} = \frac{\text{9th observation} + \text{10th observation}}{2} = \frac{8+9}{2} = \frac{17}{2} = 8.5$. Therefore, 75% of the observations fall below 8.5.

B. Deciles
Deciles are values that divide the set of observations into 10 equal parts. The $k$th decile is denoted by $D_k$
($k = 1, 2, 3, \ldots, 9$), where $D_k$ is the value such that $10k\%$ of the observations fall below it.

Example 5.14: From Example 5.13, find 𝑫𝟑 .

$L = \frac{3}{10} \times 12 = 3.6$, and since this is not a whole number, we round it up to 4. Then
$D_3$ = 4th observation = 3. Therefore, we can say that 30% of the observations fall below 3.

Example 5.15: A teacher gives a 20-point test to 10 students. The scores are 18, 15, 12, 6, 8, 2, 3, 5, 20, and 10.
Find 𝐷8 .
Arrange the scores in ascending order. That is, 2, 3, 5, 6, 8, 10, 12, 15, 18, 20.
$L = \frac{8}{10} \times 10 = 8$. Since it is a whole number,
$D_8 = \frac{\text{8th observation} + \text{9th observation}}{2} = \frac{15+18}{2} = \frac{33}{2} = 16.5$. Therefore, 80% of the observations fall below
16.5.

C. Quartiles
Quartiles are the values that divide the set of observations into 4 equal parts. The $k$th quartile is denoted by $Q_k$
($k = 1, 2, 3$), where $Q_k$ is the value such that $25k\%$ of the observations fall below it.

Example 5.15: From Example 5.13, find the 𝑄1.

$L = \frac{1}{4} \times 12 = 3$. Since it is a whole number,
$Q_1 = \frac{\text{3rd observation} + \text{4th observation}}{2} = \frac{2+3}{2} = \frac{5}{2} = 2.5$. Therefore, 25% of the observations fall below 2.5.

Example 5.16: Find $Q_3$ for the test scores 5, 12, 15, 16, 20, and 21.
$L = \frac{3}{4} \times 6 = 4.5$, and since this is not a whole number, we round it up to 5. Then $Q_3$ = 5th observation,
which is 20. Therefore, 75% of the observations fall below 20.
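
Deciles and quartiles follow the same location rule as percentiles, with 100 replaced by 10 or 4. A single hypothetical helper, here called `position_measure` with a `parts` argument (both names are assumptions for this sketch), reproduces the decile and quartile examples above.

```python
def position_measure(data, k, parts):
    """k-th value dividing the ordered data into `parts` equal parts (L = k * n / parts)."""
    if not data:
        raise ValueError("data must not be empty")
    if not 1 <= k < parts:
        raise ValueError("k must satisfy 1 <= k < parts")
    ordered = sorted(data)
    whole, remainder = divmod(k * len(ordered), parts)
    if remainder == 0:
        # L is a whole number: average the L-th and (L+1)-th observations
        return (ordered[whole - 1] + ordered[whole]) / 2
    # L is not a whole number: round up and take that observation
    return ordered[whole]


movies = [2, 0, 3, 1, 6, 4, 7, 5, 8, 9, 10, 11]
test_scores_10 = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
test_scores_6 = [5, 12, 15, 16, 20, 21]

d3 = position_measure(movies, 3, parts=10)          # third decile of the movie data
d8 = position_measure(test_scores_10, 8, parts=10)  # eighth decile of the 10 test scores
q1 = position_measure(movies, 1, parts=4)           # first quartile of the movie data
q3 = position_measure(test_scores_6, 3, parts=4)    # third quartile of the 6 test scores
print(d3, d8, q1, q3)  # 3 16.5 2.5 20
```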

There is an old saying that states, “You can’t compare apples and oranges”. But with the use of statistics,
it can be done to some extent. Suppose that a student scored 90 on a music test and 45 on an English exam. Direct
comparison of raw scores is impossible, since the exams are not equivalent in terms of number of questions, value
of each question, and so on. However, a comparison of a relative standard similar to both can be made.
This comparison uses the mean and the standard deviation and is called a z-score (standard score).

D. z-score or Standard Score

The z-score represents the number of standard deviations a data value falls above or below the mean. A standard
score or z-score for a value is obtained by subtracting the mean from the value and dividing the result by the
standard deviation. The symbol for the standard score is $z$. That is,
$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x-\mu}{\sigma}$$

Example 5.17: A student scored 65 on a calculus test that had a mean of 50 and a standard deviation of 10. She
scored 30 on a history test with a mean of 25 and a standard deviation of 5. Compare her relative positions on
the two tests.
First, find the z-scores. For calculus, $z = \frac{x-\mu}{\sigma} = \frac{65-50}{10} = 1.5$.
For history, $z = \frac{x-\mu}{\sigma} = \frac{30-25}{5} = 1.0$. Since the z-score for calculus is larger, her relative
position in the calculus class is higher than her relative position in the history class.
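
The z-score comparison in Example 5.17 can be sketched in Python. The function name `z_score` is an assumption made for this illustration.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that `value` lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


z_calculus = z_score(65, mean=50, std_dev=10)
z_history = z_score(30, mean=25, std_dev=5)
print(z_calculus, z_history)  # 1.5 1.0
```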


For data summarized in a frequency distribution table, the individual observations are unknown, so we
need a different way of computing statistical measures. Each observation in a class is estimated by its class
mark. We use this approach only if the raw data are not available. If they are available, we compute the
statistical measures using the formulas that have been discussed previously.

A. Arithmetic Mean
For computational convenience, we construct an additional column in the frequency distribution table.
The entries of this column contain the product of the frequency and its corresponding class mark, denoted
by $f_i x_i$. The mean is then obtained using the formula
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{k} f_i x_i$$

where 𝑓𝑖 = the class frequency of the 𝑖 𝑡ℎ class interval

𝑥𝑖 = the class mark of the 𝑖 𝑡ℎ class interval
𝑘 = the number of class intervals
𝑛 = the total frequency

Note: The arithmetic mean cannot be computed from an open-ended frequency distribution.

Example 5.18: More and more employers are using psychological testing as an aid in determining whether the
applicant is fit for the work in the company. The following data shows the distribution of the scores of applicants
who took the psychological test administered by a company.

Score Class Frequency

41 – 50 5
51 – 60 7
61 – 70 10
71 – 80 16
81 – 90 11
91 – 100 9
Total 58

Estimate the mean score.

First, we compute the class mark for each class and then multiply them by the corresponding frequency
as shown below.

Score Class Frequency (𝒇𝒊 ) Class Mark (𝒙𝒊 ) 𝒇𝒊 𝒙 𝒊

41 – 50 5 45.5 227.5
51 – 60 7 55.5 388.5
61 – 70 10 65.5 655
71 – 80 16 75.5 1208
81 – 90 11 85.5 940.5
91 – 100 9 95.5 859.5
Total 58 4279

Then the mean is

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{k} f_i x_i = \frac{1}{58}(4279) = 73.78$$
Therefore, the mean score of the applicants is 73.78.
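
The grouped mean of Example 5.18 can be reproduced with a short Python sketch. The data layout (a list of `(lower limit, upper limit, frequency)` tuples) and the function name `grouped_mean` are assumptions made for this illustration.

```python
# (lower class limit, upper class limit, frequency) from Example 5.18
score_classes = [
    (41, 50, 5),
    (51, 60, 7),
    (61, 70, 10),
    (71, 80, 16),
    (81, 90, 11),
    (91, 100, 9),
]


def grouped_mean(classes):
    """Estimate the mean as sum(f_i * x_i) / n, where x_i is the class mark."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("total frequency must be positive")
    total = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total / n


mean_score = grouped_mean(score_classes)
print(round(mean_score, 2))  # 73.78
```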


B. Median
To compute the median, we need the cumulative frequency distribution table. As an initial step, we
compute $\frac{n}{2}$ and, from the cumulative frequency column, determine the interval containing the $\left(\frac{n}{2}\right)$th observation.
This is called the median class or median interval. From this interval, compute the median using the formula

$$\tilde{x}_g = LCB_m + c\left[\frac{\frac{n}{2} - F_{(m-1)}}{f_m}\right]$$

where 𝐿𝐶𝐵𝑚 = the lower class boundary of the median class

𝐹(𝑚−1) = the cumulative frequency of the class interval immediately preceding the median class
𝑓𝑚 = the frequency of the median class
𝑐 = the class width

Note: The median of a grouped data can be calculated even with open-ended intervals provided the median class
is not open-ended.

Example 5.19: Find the median using Example 5.18.

Score Class   Frequency (𝒇𝒊)   Class Mark (𝒙𝒊)   𝒇𝒊 𝒙𝒊   Class Boundary   Cumulative Frequency (𝑭𝒊)
41 – 50 5 45.5 227.5 40.5 – 50.5 5
51 – 60 7 55.5 388.5 50.5 – 60.5 12
61 – 70 10 65.5 655 60.5 – 70.5 22
71 – 80 16 75.5 1208 70.5 – 80.5 38
81 – 90 11 85.5 940.5 80.5 – 90.5 49
91 – 100 9 95.5 859.5 90.5 – 100.5 58
Total 58 4279

Finding the median class, $\frac{n}{2} = \frac{58}{2} = 29$. The 29th observation is located in the 4th class. Therefore,
$$\tilde{x}_g = LCB_m + c\left[\frac{\frac{n}{2} - F_{(m-1)}}{f_m}\right] = 70.5 + 10\left[\frac{29 - 22}{16}\right] = 70.5 + 4.375 = 74.88$$
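
The grouped median computation can be sketched in Python. The representation of the table as `(lower class boundary, frequency)` pairs and the function name `grouped_median` are assumptions for this example; the code assumes all classes share the same width.

```python
# (lower class boundary, frequency) from Example 5.18; class width c = 10
boundary_classes = [(40.5, 5), (50.5, 7), (60.5, 10), (70.5, 16), (80.5, 11), (90.5, 9)]
class_width = 10


def grouped_median(classes, width):
    """Median = LCB_m + c * (n/2 - F_(m-1)) / f_m for the median class m."""
    n = sum(f for _, f in classes)
    if n == 0:
        raise ValueError("total frequency must be positive")
    target = n / 2
    cumulative = 0  # cumulative frequency before the current class, F_(m-1)
    for lcb, f in classes:
        if cumulative + f >= target:
            # this is the median class: it contains the (n/2)-th observation
            return lcb + width * (target - cumulative) / f
        cumulative += f
    raise ValueError("median class not found")


median_score = grouped_median(boundary_classes, class_width)
print(median_score)  # 74.875
```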

C. Mode
To compute the mode, we first have to locate the modal class. This is the interval having the highest
frequency. From this modal class, compute the mode using the formula

$$\text{Mode}_g = LCB_{mo} + c\left[\frac{f_{mo} - f_1}{2f_{mo} - f_1 - f_2}\right]$$

where 𝐿𝐶𝐵𝑚𝑜 = the lower class boundary of the modal class

𝑓𝑚𝑜 = the frequency of the modal class
𝑓1 = the frequency of the interval before the modal class
𝑓2 = the frequency of the interval after the modal class
𝑐 = the class width

Note: The mode can also be computed with open-ended intervals provided the modal class is not open-ended.


Example 5.20: From Example 5.18, find the mode.

Score Class   Frequency (𝒇𝒊)   Class Mark (𝒙𝒊)   𝒇𝒊 𝒙𝒊   Class Boundary   Cumulative Frequency (𝑭𝒊)
41 – 50 5 45.5 227.5 40.5 – 50.5 5
51 – 60 7 55.5 388.5 50.5 – 60.5 12
61 – 70 10 65.5 655 60.5 – 70.5 22
71 – 80 16 75.5 1208 70.5 – 80.5 38
81 – 90 11 85.5 940.5 80.5 – 90.5 49
91 – 100 9 95.5 859.5 90.5 – 100.5 58
Total 58 4279

The modal class is the 4th interval since it has the highest frequency, which is 16. Hence,

$$\text{Mode}_g = LCB_{mo} + c\left[\frac{f_{mo} - f_1}{2f_{mo} - f_1 - f_2}\right] = 70.5 + 10\left[\frac{16 - 10}{2(16) - 10 - 11}\right] = 70.5 + 5.45 = 75.95$$
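
The grouped mode formula can be sketched in Python as follows. The function name `grouped_mode` and the data layout are assumptions for this illustration. Two choices in the sketch are also assumptions not stated in the notes: when several classes tie for the highest frequency the first one is used, and a missing neighbouring class (when the modal class is the first or last) is treated as having frequency 0.

```python
# (lower class boundary, frequency) from Example 5.18; class width c = 10
boundary_classes = [(40.5, 5), (50.5, 7), (60.5, 10), (70.5, 16), (80.5, 11), (90.5, 9)]
class_width = 10


def grouped_mode(classes, width):
    """Mode = LCB_mo + c * (f_mo - f1) / (2 * f_mo - f1 - f2)."""
    freqs = [f for _, f in classes]
    m = freqs.index(max(freqs))                      # first class with the highest frequency
    f_mo = freqs[m]
    f1 = freqs[m - 1] if m > 0 else 0                # class before the modal class (assumed 0 if absent)
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0   # class after the modal class (assumed 0 if absent)
    denominator = 2 * f_mo - f1 - f2
    if denominator == 0:
        raise ValueError("mode is undefined: modal class and its neighbours have equal frequencies")
    return classes[m][0] + width * (f_mo - f1) / denominator


mode_score = grouped_mode(boundary_classes, class_width)
print(round(mode_score, 2))  # 75.95
```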

D. Variance
To make computation faster, additional columns for $f_i x_i$ and $f_i x_i^2$ have to be added to the frequency
distribution table, with their sums computed. To compute the variance, we use the formula

$$s_g^2 = \frac{n\sum_{i=1}^{k} f_i x_i^2 - \left(\sum_{i=1}^{k} f_i x_i\right)^2}{n(n-1)}$$

where 𝑛 = the number of observations

𝑓𝑖 = the frequency of the 𝑖 𝑡ℎ class interval
𝑥𝑖 = the class mark of the 𝑖 𝑡ℎ class interval
𝑘 = the number of class intervals

Example 5.21: Using the data from Example 5.18, compute the variance.

Score Class   Frequency (𝒇𝒊)   Class Mark (𝒙𝒊)   𝒇𝒊 𝒙𝒊   𝒙𝒊²   𝒇𝒊 𝒙𝒊²
41 – 50 5 45.5 227.5 2070.25 10351.25
51 – 60 7 55.5 388.5 3080.25 21561.75
61 – 70 10 65.5 655 4290.25 42902.50
71 – 80 16 75.5 1208 5700.25 91204.00
81 – 90 11 85.5 940.5 7310.25 80412.75
91 – 100 9 95.5 859.5 9120.25 82082.25
Total 58 4279 328514.50

$$s_g^2 = \frac{n\sum_{i=1}^{k} f_i x_i^2 - \left(\sum_{i=1}^{k} f_i x_i\right)^2}{n(n-1)} = \frac{58(328514.50) - (4279)^2}{58(58-1)} = \frac{744000}{3306} = 225.04$$

E. Standard Deviation
The computing formula for the standard deviation is
𝑠𝑔 = √𝑠𝑔 2

where 𝑠𝑔 2 = the grouped variance


Example 5.22: Compute the standard deviation of Example 5.21.

𝑠𝑔 = √𝑠𝑔 2 = √225.04 = 15.00
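
The grouped variance and standard deviation can be sketched in Python using the class marks and frequencies of Example 5.18. The function name `grouped_variance` and the `(class mark, frequency)` layout are assumptions for this illustration.

```python
import math

# (class mark x_i, frequency f_i) from Example 5.18
mark_classes = [(45.5, 5), (55.5, 7), (65.5, 10), (75.5, 16), (85.5, 11), (95.5, 9)]


def grouped_variance(classes):
    """s_g^2 = (n * sum(f x^2) - (sum(f x))^2) / (n * (n - 1))."""
    n = sum(f for _, f in classes)
    if n < 2:
        raise ValueError("at least two observations are required")
    sum_fx = sum(f * x for x, f in classes)
    sum_fx2 = sum(f * x * x for x, f in classes)
    return (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))


variance_g = grouped_variance(mark_classes)
std_dev_g = math.sqrt(variance_g)  # grouped standard deviation
print(variance_g, std_dev_g)
```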

F. Range
The range can be measured by getting the difference between the highest class boundary and lowest class
boundary, that is,
𝑅 = ℎ𝑖𝑔ℎ𝑒𝑠𝑡 𝑐𝑙𝑎𝑠𝑠 𝑏𝑜𝑢𝑛𝑑𝑎𝑟𝑦 − 𝑙𝑜𝑤𝑒𝑠𝑡 𝑐𝑙𝑎𝑠𝑠 𝑏𝑜𝑢𝑛𝑑𝑎𝑟𝑦

Example 5.23: Find the range 𝑅 of the given data in Example 5.18.

𝑅 = ℎ𝑖𝑔ℎ𝑒𝑠𝑡 𝑐𝑙𝑎𝑠𝑠 𝑏𝑜𝑢𝑛𝑑𝑎𝑟𝑦 − 𝑙𝑜𝑤𝑒𝑠𝑡 𝑐𝑙𝑎𝑠𝑠 𝑏𝑜𝑢𝑛𝑑𝑎𝑟𝑦 = 100.5 − 40.5 = 60.0

G. Percentile
For the computation of the $m$th percentile, $P_m$, we need the cumulative frequency column. First, compute
$\frac{mn}{100}$, then locate the $\left(\frac{mn}{100}\right)$th observation in the cumulative frequency column and identify the corresponding interval,
called the $m$th percentile class. From this interval, compute $P_m$ using the formula

$$P_m = LCB_m + c\left[\frac{\frac{mn}{100} - F_{(m-1)}}{f_m}\right]$$

where 𝐿𝐶𝐵𝑚 = the lower class boundary of the 𝑚𝑡ℎ percentile interval
𝐹(𝑚−1) = the cumulative frequency of the class interval immediately preceding the 𝑚𝑡ℎ
percentile interval
𝑓𝑚 = the frequency of the 𝑚𝑡ℎ percentile interval
𝑐 = the class width

H. Decile
For the computation of the $m$th decile, $D_m$, first compute $\frac{mn}{10}$, then locate the $\left(\frac{mn}{10}\right)$th observation in the
cumulative frequency column and identify the corresponding interval, called the $m$th decile class. From this
interval, compute $D_m$ using the formula

$$D_m = LCB_m + c\left[\frac{\frac{mn}{10} - F_{(m-1)}}{f_m}\right]$$

where 𝐿𝐶𝐵𝑚 = the lower class boundary of the 𝑚𝑡ℎ decile interval
𝐹(𝑚−1) = the cumulative frequency of the class interval immediately preceding the 𝑚𝑡ℎ
decile interval
𝑓𝑚 = the frequency of the 𝑚𝑡ℎ decile interval
𝑐 = the class width

I. Quartile
For the computation of the $m$th quartile, $Q_m$, first compute $\frac{mn}{4}$, then locate the $\left(\frac{mn}{4}\right)$th observation in the
cumulative frequency column and identify the corresponding interval, called the $m$th quartile class. From this
interval, compute $Q_m$ using the formula

$$Q_m = LCB_m + c\left[\frac{\frac{mn}{4} - F_{(m-1)}}{f_m}\right]$$

where 𝐿𝐶𝐵𝑚 = the lower class boundary of the 𝑚𝑡ℎ quartile interval
𝐹(𝑚−1) = the cumulative frequency of the class interval immediately preceding the 𝑚𝑡ℎ
quartile interval
𝑓𝑚 = the frequency of the 𝑚𝑡ℎ quartile interval
𝑐 = the class width

Example 5.24: Using the data given in Example 5.18, compute the following and interpret these values.

Score Class   Frequency (𝒇𝒊)   Class Mark (𝒙𝒊)   𝒇𝒊 𝒙𝒊   Class Boundary   Cumulative Frequency (𝑭𝒊)
41 – 50 5 45.5 227.5 40.5 – 50.5 5
51 – 60 7 55.5 388.5 50.5 – 60.5 12
61 – 70 10 65.5 655 60.5 – 70.5 22
71 – 80 16 75.5 1208 70.5 – 80.5 38
81 – 90 11 85.5 940.5 80.5 – 90.5 49
91 – 100 9 95.5 859.5 90.5 – 100.5 58
Total 58 4279

a. $P_{45}$: $L = \frac{45(58)}{100} = 26.1 \approx 27$. The 27th value arranged in order falls in the 4th class interval. Therefore,
$$P_{45} = 70.5 + 10\left[\frac{26.1 - 22}{16}\right] = 70.5 + 2.5625 = 73.06$$
Interpretation: 45% of all the observations fall below 73.06.

b. $D_6$: $L = \frac{6(58)}{10} = 34.8 \approx 35$. The 35th value arranged in order falls in the 4th class interval. Therefore,
$$D_6 = 70.5 + 10\left[\frac{34.8 - 22}{16}\right] = 70.5 + 8 = 78.5$$
Interpretation: 60% of all the observations fall below 78.5.

c. $Q_3$: $L = \frac{3(58)}{4} = 43.5 \approx 44$. The 44th value arranged in order falls in the 5th class interval. Therefore,
$$Q_3 = 80.5 + 10\left[\frac{43.5 - 38}{11}\right] = 80.5 + 5 = 85.5$$
Interpretation: 75% of all the observations fall below 85.5.
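
The percentile, decile and quartile formulas for grouped data differ only in the divisor (100, 10 or 4), so one hypothetical helper can cover all three. In the sketch below, the function name `grouped_quantile` and its `parts` argument are assumptions for this illustration. Following the worked example, the location L is rounded up only to identify the class, while the unrounded L is used inside the formula.

```python
import math

# (lower class boundary, frequency) from Example 5.18; class width c = 10
boundary_classes = [(40.5, 5), (50.5, 7), (60.5, 10), (70.5, 16), (80.5, 11), (90.5, 9)]
class_width = 10


def grouped_quantile(classes, width, m, parts):
    """LCB + c * (m * n / parts - F_before) / f for the class holding the located observation."""
    n = sum(f for _, f in classes)
    if n == 0:
        raise ValueError("total frequency must be positive")
    if not 1 <= m < parts:
        raise ValueError("m must satisfy 1 <= m < parts")
    location = m * n / parts        # L = mn/100, mn/10 or mn/4
    target = math.ceil(location)    # observation number used to find the class
    cumulative = 0                  # cumulative frequency before the current class
    for lcb, f in classes:
        if cumulative + f >= target:
            return lcb + width * (location - cumulative) / f
        cumulative += f
    raise ValueError("quantile class not found")


p45 = grouped_quantile(boundary_classes, class_width, 45, parts=100)
d6 = grouped_quantile(boundary_classes, class_width, 6, parts=10)
q3 = grouped_quantile(boundary_classes, class_width, 3, parts=4)
print(round(p45, 2), round(d6, 2), round(q3, 2))  # 73.06 78.5 85.5
```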


The arithmetic mean and standard deviation are closely related to a family of descriptive statistics known
as moments. The first four central moments about the arithmetic mean are the following:

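$$m_1 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x}) = 0$$
$$m_2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{n-1}{n}s^2$$
$$m_3 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^3$$
$$m_4 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^4$$

In general, the $r$th central moment about the mean is given by:

$$m_r = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^r$$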


The first and second moments are measures of central location and variability respectively, while the third
and fourth moments are used in determining skewness and kurtosis, two other numerical descriptions of data that are
discussed in this section.
A distribution that is perfectly symmetric implies that the three measures of central tendency, namely, the
mean, median, and mode are equal. This condition is shown in Figure 5.1. When extreme observations are found
on the right or left end of the distribution, then the distribution is said to be asymmetric or skewed, i.e., a
distribution is skewed if it departs from symmetry.
There are two kinds of skewed distributions. One is the positively skewed distribution, which means that there
are extremely high observations that tend to pull the mean to the right. This results in a frequency distribution that
is more elongated to the right side as shown in Figure 5.2. The other one is the negatively skewed distribution,
which has extremely low observations that tend to pull the mean to the left. This gives a frequency distribution
that is elongated to the left side as shown in Figure 5.3.

Figure 5.1 Symmetric Distribution

Figure 5.2 Positively Skewed Distribution

Figure 5.3 Negatively Skewed Distribution

In Figure 5.1 above, the mean, the median and the mode have the same numerical value. This is true when
the distribution is symmetric. In Figure 5.2, the mean is at the right of the median, while in Figure 5.3, the mean is
at the left of the median. Thus, for a positively skewed distribution, Mode < Median < Mean, while for a negatively
skewed distribution, Mean < Median < Mode. An absolute measure of skewness is the expression (Mean – Mode),
but a change in measurement units will give varying values. A relative measure, which is not affected by a
change in measurement units and is easy to compute, is Karl Pearson’s coefficient of skewness, defined by

$$g_1 = \frac{3(\bar{x} - \tilde{x})}{\sigma} \ \text{(population)} \quad \text{or} \quad g_1 = \frac{3(\bar{x} - \tilde{x})}{s} \ \text{(sample)}$$

When $g_1$ is positive, the distribution is positively skewed, and when $g_1$ is negative, the distribution is negatively skewed. There
are many other measures of skewness and each measure may give a different numerical value, but they will lead
to similar interpretations, which is the ultimate objective in making the computations.

When the distribution is symmetric, it possesses the property that all odd-ordered central moments ($r$ is
odd in $m_r = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^r$) are equal to zero. A measure of skewness based on the moments is

$$g_1 = \frac{m_3}{m_2\sqrt{m_2}}$$


where $m_2$ is the 2nd central moment and $m_3$ is the 3rd central moment. For grouped data, a measure of skewness
is

$$g_1 = \frac{1}{n}\sum f_i\left(\frac{x_i - \bar{x}}{s}\right)^3$$

Example 5.25: Consider the set of numbers 6, 8, 10, 11, 15. Find its measure of skewness.

First we compute the mean, $\bar{x} = \frac{6+8+10+11+15}{5} = \frac{50}{5} = 10$. Using the moment-based measure of skewness, we need to
solve for $m_2$ and $m_3$.

$$\sum(x_i - \bar{x})^2 = (-4)^2 + (-2)^2 + (0)^2 + (1)^2 + (5)^2 = 46$$

$$\sum(x_i - \bar{x})^3 = (-4)^3 + (-2)^3 + (0)^3 + (1)^3 + (5)^3 = 54$$

Thus, $m_2 = \frac{1}{5}\sum_{i=1}^{5}(x_i - \bar{x})^2 = \frac{46}{5} = 9.2$ and $m_3 = \frac{1}{5}\sum_{i=1}^{5}(x_i - \bar{x})^3 = \frac{54}{5} = 10.8$.
Therefore, $g_1 = \frac{m_3}{m_2\sqrt{m_2}} = \frac{10.8}{9.2\sqrt{9.2}} = 0.387$. Since $g_1$ is greater than zero, this implies that the distribution
of this set of numbers is skewed to the right.
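
The moment-based skewness of Example 5.25 can be sketched in Python. The function names `central_moment` and `moment_skewness` are assumptions for this illustration; the central moments divide by n, as in the definitions given earlier in this section.

```python
import math


def central_moment(data, r):
    """r-th central moment: (1/n) * sum((x_i - mean)^r)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** r for x in data) / n


def moment_skewness(data):
    """g1 = m3 / (m2 * sqrt(m2)); undefined when all values are equal (m2 = 0)."""
    m2 = central_moment(data, 2)
    m3 = central_moment(data, 3)
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / (m2 * math.sqrt(m2))


numbers = [6, 8, 10, 11, 15]
g1 = moment_skewness(numbers)
print(round(g1, 3))  # 0.387
```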

Two distributions may have the same variability, but one may be relatively flatter at the top than the normal
curve. A curve is normal if it has a bell-shaped form with mean = median = mode. To measure the flatness of
the distribution, we use the coefficient of kurtosis. This measure is denoted by $g_2$ and is defined by

$$g_2 = \frac{m_4}{m_2^2}$$

where $m_2$ is the 2nd central moment and $m_4$ is the 4th central moment. For large $n$, where $m_2 \approx s^2$, the formula above can be written approximately as

$$g_2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^4}{n(s^2)^2}$$

As an aid to interpreting kurtosis, the value of 𝑔2 is 3 for a normal distribution and when this value is
attained the distribution is said to be mesokurtic. Distributions with 𝑔2 greater than 3 are called leptokurtic (sharp
top) and those with 𝑔2 less than 3 are called platykurtic (flat top).

Example 5.26: Find the measure of kurtosis of the set of numbers given in Example 5.25.

From the previous example, the mean is $\bar{x} = 10$ and $m_2 = 9.2$.

$$\sum(x_i - \bar{x})^4 = (-4)^4 + (-2)^4 + (0)^4 + (1)^4 + (5)^4 = 898$$

Hence, $m_4 = \frac{\sum_{i=1}^{5}(x_i - \bar{x})^4}{5} = \frac{898}{5} = 179.6$.
Therefore, the measure of kurtosis is $g_2 = \frac{m_4}{m_2^2} = \frac{179.6}{(9.2)^2} = 2.12$.
Since $g_2 = 2.12$, which is less than 3, the distribution for this set of numbers has a flat top.
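
The kurtosis computation of Example 5.26 can be sketched in Python, together with the classification rule relative to the value 3. The function names and the tolerance used to decide whether a value counts as equal to 3 are assumptions made for this illustration.

```python
import math


def central_moment(data, r):
    """r-th central moment: (1/n) * sum((x_i - mean)^r)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** r for x in data) / n


def moment_kurtosis(data):
    """g2 = m4 / m2^2; undefined when all values are equal (m2 = 0)."""
    m2 = central_moment(data, 2)
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return central_moment(data, 4) / m2 ** 2


def describe_kurtosis(g2, tolerance=1e-9):
    """Classify g2 relative to 3, the value for a normal distribution."""
    if math.isclose(g2, 3.0, abs_tol=tolerance):
        return "mesokurtic"
    return "leptokurtic" if g2 > 3 else "platykurtic"


numbers = [6, 8, 10, 11, 15]
g2 = moment_kurtosis(numbers)
shape = describe_kurtosis(g2)
print(round(g2, 2), shape)  # 2.12 platykurtic
```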
