INTRODUCTORY CONCEPT
In everyday life, whether at home or at work, records are being kept or reports are being read. An item in a
record or report is a fact that is expressed in terms of a numerical value or described by its quality or kind. That
single item or fact is referred to as a datum. All these facts in a record or report are called data. The color of a person's
hair, the number of basketball players, and the number of times you were absent from class are all examples of
data. Data, and how to handle it scientifically, is the major reason why we study Statistics.
When information is gathered for all the units in the population, the process is called a census. When only
part of the population is used to obtain data, the process is called sampling or a sample survey. When the size of
the population is large, a census becomes a long and tedious process aside from having a prohibitive cost. To save
on cost and time, a sample survey is a convenient alternative. The information derived from the data in the sample
is then used to make some generalizations about the population. However, in making this option, errors are
unavoidable. The role of Statistics is to provide the procedures that will minimize the errors that are bound to
happen.
1.2 STATISTICS
Statistics – branch of science that deals with the development of methods for a more effective way of collecting,
organizing, presenting, and analyzing data.
TYPES OF VARIABLES
Variables are the characteristics or properties measured from objects, persons, or things. These variables
can be either discrete or continuous.
Nominal measurement – possesses only the property of identity and does not possess the properties of order and
equality of scales. This is the lowest form of measurement.
Example 1.1: color of the dress, sex of the newborn baby, occupation of parents, and religion of the Manobo tribes
in Bukidnon province.
Ordinal measurement – possesses the properties of both identity and order but not the equality of scale property.
Example 1.2: when students are ranked according to class performance, an order of 1st, 2nd, 3rd, … can be
established. Other examples are the ranking of military positions, the DepEd ranking of teachers, and taste preference.
Interval measurement – possesses the properties of identity, order, and equality of scale but does not have the
property of absolute zero. An ‘absolute zero property’ means that a value of zero indicates none of the
characteristic that is being measured.
Example 1.3: recording of temperature, score of a student on an intelligence exam.
Ratio measurement – possesses all the properties of identity, order, equality of scales and absolute zero. This is
the highest form of measurement.
Example 1.4: height, weight, age, volume.
If you survey every person in the population, you are taking a census. However, this method is often
impracticable, as it is often very costly in terms of time and money. For example, a survey that asks
complicated questions may need to use trained interviewers to ensure questions are understood. This may
be too expensive if every person in the population is to be included.
2. it is less cumbersome and more practical to administer since you will need to gather data from a lesser
number of respondents; and
3. some experiments are destructive so it is not possible to involve the whole population.
Sometimes taking a census can be impossible. For example, a car manufacturer might want to test the
strength of the cars being produced. Obviously, each car could not be crash tested to determine its strength. To
overcome these problems, samples are taken from populations, and estimates are made about the total
population based on information derived from the sample.
Sampling also has disadvantages, the biggest of which is that the sample may not truly reflect the
characteristic of the population and this would lead to wrong conclusions. Hence, care must be taken in choosing
a sample. Also, a sample must be large enough to give a good representation of the population, but small enough
to be manageable.
2.1 PROBABILITY SAMPLING – also known as random sampling. This is one in which the elements of the
sample are chosen on the basis of known probabilities. Each element in the
population has an equal and independent chance of being selected as a
STATISTICS 1, 2nd Semester 2016-2017
sample point. This means that the choice of an element is not influenced by
other considerations such as personal preference, and that the choice of one
element is not dependent upon the choice of another element in the
sampling.
Simple Random Sampling (SRS) – may be done with or without replacement. With simple random sampling,
each item in a population has an equal chance of inclusion in the sample.
This can be done using the fishbowl method or using random numbers.
Procedure:
Step 1: Assign a number to each element of the population using the numbers from 1 to 𝑁.
Step 2: Select 𝑛 numbers from 1 to 𝑁 using a random process like fishbowl method or draw lots.
Example 2.1: Choose a random sample of 5 students from the following 20 students using draw lots method.
The advantage of simple random sampling is that it is simple and easy to apply when small populations
are involved. However, because every person or item in a population has to be listed before the corresponding
random numbers can be read, this method is very cumbersome to use for large populations.
Another disadvantage of simple random sampling is that we can never be assured that all sectors or groups
are represented in the sample.
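The two-step procedure above can be sketched in Python. This is only an illustration: the population is a hypothetical list of students numbered 1 to 20, and the function name `simple_random_sample` and the `seed` argument are assumptions introduced here, not part of the original notes.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements, i.e. sampling without replacement."""
    rng = random.Random(seed)  # the seed only makes the draw repeatable
    return rng.sample(population, n)

# Step 1: number the population from 1 to N (hypothetical N = 20 students).
students = list(range(1, 21))

# Step 2: select n = 5 numbers at random, like drawing lots from a fishbowl.
chosen = simple_random_sample(students, 5, seed=1)
print(sorted(chosen))
```

Every student has the same chance of being drawn, but nothing guarantees that each group within the population appears in the result.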
Systematic Random Sampling – sometimes called interval sampling, means that there is a gap, or interval,
between each selection. Here we select every \( k \)th element in the population,
the first unit being chosen at random.
Procedure:
Step 1: Assign a number to each element of the population using the numbers from 1 to 𝑁.
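Step 2: Determine the sampling interval \( k = \frac{N}{n} \), where \( N \) = population size and \( n \) = sample size.
Step 3: Select a random start \( r \) where \( 1 \le r \le k \). The first unit of the sample is the unit corresponding to \( r \);
the succeeding units are those numbered \( r + k \), \( r + 2k \), and so on, until \( n \) units are selected.
Note: If \( k \) is not a whole number, then it is rounded off to the nearest whole number.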
Example 2.2: In a population of 120 individuals, choose a systematic random sample of size 10.
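A minimal Python sketch of the systematic procedure applied to Example 2.2, assuming the units are simply numbered 1 to 120. The function name `systematic_sample` and the `seed` argument are illustrative assumptions.

```python
import random

def systematic_sample(N, n, seed=None):
    """Return the unit numbers of a systematic sample of size n from 1..N."""
    k = round(N / n)            # sampling interval (Python rounds exact halves to even)
    rng = random.Random(seed)
    r = rng.randint(1, k)       # random start between 1 and k, inclusive
    # Take r, r + k, r + 2k, ... and keep only numbers that exist in the population.
    return [r + j * k for j in range(n) if r + j * k <= N]

sample = systematic_sample(N=120, n=10, seed=7)
print(sample)  # ten unit numbers spaced k = 12 apart
```

Here \( k = 120/10 = 12 \), so once the random start is fixed the rest of the sample is determined.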
Stratified Random Sampling – the population of 𝑁 units is first divided into homogenous subpopulations (called
strata) and then a sample is drawn from each stratum. This type of sampling
assures that all groups or strata are represented in the sample. Some examples of
strata commonly used by the SWS Survey are location, age and sex. Other strata
may be religion, academic ability or marital status.
Procedure:
Step 1: Classify the population into at least two homogenous strata. The basis for classification must be closely
related to the variable of interest. For example, if we want to determine the students' opinion on the tuition fee
increase, it may be logical to subdivide the population of students by college, or by year level, or by tribe or a
combination of these.
Step 2: Draw a sample from each stratum by simple or systematic random sampling.
If the population of size \( N \) is divided into \( k \) homogeneous subpopulations or strata of sizes
\( N_1, N_2, \ldots, N_k \), then the sample size \( n_i \) to be taken from stratum \( i \) is obtained using one of the following formulas:
A. Proportional Allocation: \( n_i = \left( \frac{N_i}{N} \right) \times n \) for \( i = 1, 2, 3, \ldots, k \).
B. Equal Allocation: \( n_i = \frac{n}{k} \) for \( i = 1, 2, 3, \ldots, k \).
Example 2.3: At a small private college, the students may be classified according to the following scheme:
If we use proportional allocation to select a stratified random sample of size 𝑛 = 40, how large a sample must be
taken from each stratum?
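Proportional allocation can be sketched as follows. The classification table for this example is not reproduced in these notes, so the strata names and sizes below are hypothetical values chosen only to show the arithmetic; the function name is also an assumption.

```python
def proportional_allocation(strata_sizes, n):
    """Allocate a total sample of size n in proportion to each stratum size."""
    N = sum(strata_sizes.values())
    # n_i = (N_i / N) * n, computed as N_i * n / N and rounded to a whole unit.
    return {name: round(N_i * n / N) for name, N_i in strata_sizes.items()}

# Hypothetical strata sizes (NOT the data of Example 2.3).
strata = {"Freshman": 200, "Sophomore": 150, "Junior": 100, "Senior": 50}
allocation = proportional_allocation(strata, n=40)
print(allocation)
```

With sizes that do not divide evenly, the rounded allocations may not add up to exactly \( n \), in which case one stratum is usually adjusted by hand.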
Cluster Sampling – divides the population into groups, or clusters. A number of clusters are selected randomly
to represent the population, and then all units within selected clusters are included in the
sample. No units from non-selected clusters are included in the sample. They are represented
by those from selected clusters. This differs from stratified sampling, where some units are
selected from every group.
Procedure:
Step 1: Divide the population area into heterogeneous sections or clusters.
Step 2: Select randomly a few from these clusters.
Example 2.4: Suppose the population of a study is all registered voters of the country. The population may be
considered to be clustered or segregated into 16 regions. From these 16 regions, we may select 5 regions
randomly. From the regions drawn, we select all its registered voters.
2.2 NON-PROBABILITY SAMPLING – is one in which individuals or items are chosen without regard to their
probability of occurrence. This is usually used when the size of the
population is either unknown or cannot be individually identified.
Here, personal preferences are applied.
Quota Sampling – also known as convenience sampling. The main consideration directing quota sampling is the
researcher’s ease of access to the sample population. In addition to convenience, he/she is
guided by some visible characteristic, such as gender or race of the study population that is of
interest to him/her. The sample is selected from the location convenient to the researcher and
whenever a person with this visible relevant characteristic is seen, the person is asked to
participate in the study. This process continues until the required number of respondents
(quota) is obtained.
Example 2.5: Suppose you want to select a sample of 20 male students. You may stand at a convenient location
and whenever you see a male student, you collect the required information. You continue until you have 20 male
students.
Accidental Sampling – similar to quota sampling except that the researcher is not guided by any obvious
characteristic. This is common among market research and newspaper reports.
Example 2.6: Suppose you want to get a sample of 20 users of soap “A”. You stand at a convenient location and
ask the person you see if he/she is a user of soap “A”. If he/she is, then get the required information. You continue
until you have 20 respondents.
Judgment or Purposive Sampling – the researcher purposely chooses those who can provide the best information
to achieve the objectives of the study. The researcher only goes to people
who in his/her opinion are likely to have the required information and are
Example 2.7: A student conducted a study on the history of Tagum City. To get proper information, he interviewed
past mayors, city officials and pioneering employees and staff of City Hall. He also interviewed the City
Information Office since they may have also the past records about Tagum City.
Example 2.8: A researcher wanted to study the reasons why some students occasionally use prohibited drugs. He
intended to get 50 students, but he only knew 5 students who used them. By getting the cooperation of these 5
students, he was referred to other drug users, who in turn also provided additional contacts. In this way, he was
able to get the number of students he needed.
Note: Most of the presentations in the succeeding sections will be based on the information given by a sample of
fifty (50) students with their corresponding gender, year level and final examination scores as summarized in
Table 3.2 above.
Example 3.1: Make a frequency distribution table using test scores in Table 3.2.
Step 1: Compute the range, 𝑅 = 63 − 10 = 53.
Step 2: Estimate the number of classes. 𝑘 = √50 = 7.07 ≈ 7 or 𝑘 = 1 + 3.322 log 50 = 6.64 ≈ 7. We use 7
intervals.
Step 3: Estimate \( c \), the width of the interval: \( c = \frac{R}{k} = \frac{53}{7} = 7.57 \approx 8 \).
Step 4 and Step 5: List the lower and upper class limits of the first class interval. Then, list all the succeeding
lower and upper class limits using 𝑐.
Step 6: Make a tally, then get the total frequencies of each interval.
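Step 7: Compute the class boundaries. The lower class boundary is \( L_i = \text{lower limit} - \frac{1}{2}(1) = \text{lower limit} - 0.5 \) and the upper class boundary is \( U_i = \text{upper limit} + \frac{1}{2}(1) = \text{upper limit} + 0.5 \).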
Step 8: Compute the class mark 𝑥̅ .
Step 9: Make a column for relative frequency.
Step 10: Make a column for cumulative frequency.
CLASS       TALLY              FREQUENCY   CLASS          CLASS MARK   RELATIVE    CUMULATIVE
INTERVAL                                   BOUNDARIES     (x̄)          FREQUENCY   FREQUENCY
10 – 17     |||||              5           9.5 – 17.5     13.5         0.10        5
18 – 25     ||||| |||||        10          17.5 – 25.5    21.5         0.20        15
26 – 33     ||||| ||||| ||||   14          25.5 – 33.5    29.5         0.28        29
34 – 41     ||||| |||          8           33.5 – 41.5    37.5         0.16        37
42 – 49     ||||| ||           7           41.5 – 49.5    45.5         0.14        44
50 – 57     |||                3           49.5 – 57.5    53.5         0.06        47
58 – 65     |||                3           57.5 – 65.5    61.5         0.06        50
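The derived columns of the table can be reproduced with a short Python sketch. The raw scores of Table 3.2 are not reproduced here, so the sketch starts from the tallied frequencies; rounding the class width up with `math.ceil` is an assumption that happens to match Step 3.

```python
import math

n = 50                                   # number of students
R = 63 - 10                              # range from Step 1
k_sqrt = math.sqrt(n)                    # square-root rule
k_sturges = 1 + 3.322 * math.log10(n)    # the second rule in Step 2
k = round(k_sqrt)                        # 7 classes
c = math.ceil(R / k)                     # class width, 53/7 rounded up to 8

freqs = [5, 10, 14, 8, 7, 3, 3]          # tallied frequencies from Step 6
rows, cumulative = [], 0
for i, f in enumerate(freqs):
    lower = 10 + i * c                   # lower class limit
    upper = lower + c - 1                # upper class limit
    cumulative += f
    rows.append((lower, upper, f, lower - 0.5, upper + 0.5,
                 (lower + upper) / 2, f / n, cumulative))

for row in rows:
    print(row)
```

Each printed tuple holds the class limits, frequency, class boundaries, class mark, relative frequency, and cumulative frequency, in that order.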
Bar Chart
A bar chart is a graph where the different classes are represented by rectangles or bars. The width of each
rectangle along the horizontal axis corresponds to the class limits or categories for nominal variables, while the
length of the rectangle corresponds to the class frequency.
Histogram
A graph that closely resembles the bar chart is the histogram. The basic difference between the two
graphs is that a bar chart uses the class limits for the horizontal axis while the histogram employs the class
boundaries. Using the class boundaries eliminates the spaces between the rectangles, giving it a solid appearance.
Frequency Ogive
A cumulative frequency distribution can be represented graphically by a frequency ogive. An ogive is
obtained by plotting the upper class boundaries on the horizontal scale and the corresponding cumulative
frequency on the vertical scale.
Pie Chart
A pie chart is a circle divided into pie-shaped sectors, which look like slices of pizza. The angle of
a sector is proportional to the frequency or percentage it represents.
Example 4.1:
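a. \( \sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5 \)
b. \( \sum_{i=1}^{3} (x_i + y_i) = (x_1 + y_1) + (x_2 + y_2) + (x_3 + y_3) \)
Rules of Summation:
a. \( \sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i \)
b. \( \sum_{i=1}^{n} a x_i = a \sum_{i=1}^{n} x_i \), where \( a \) is any constant.
c. \( \sum_{i=1}^{n} a = na \), where \( a \) is any constant.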
Example 4.2:
Given \( x_1 = 3 \), \( x_2 = 4 \), \( x_3 = 8 \), \( x_4 = -2 \), \( y_1 = -6 \), \( y_2 = -1 \), \( y_3 = 5 \), \( y_4 = 0 \), find the value of the
following:
a. \( \sum_{i=1}^{4} x_i^2 = x_1^2 + x_2^2 + x_3^2 + x_4^2 = 3^2 + 4^2 + 8^2 + (-2)^2 = 9 + 16 + 64 + 4 = 93 \)
b. \( \left( \sum_{i=1}^{4} x_i \right)^2 = (x_1 + x_2 + x_3 + x_4)^2 = (3 + 4 + 8 + (-2))^2 = (13)^2 = 169 \)
c. \( \sum_{i=1}^{2} x_i y_i = x_1 y_1 + x_2 y_2 = (3)(-6) + (4)(-1) = -18 - 4 = -22 \)
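The three results of Example 4.2 can be checked with Python's built-in `sum`. The variable names are illustrative; note the difference between the sum of squares in (a) and the square of the sum in (b).

```python
x = [3, 4, 8, -2]
y = [-6, -1, 5, 0]

sum_of_squares = sum(xi ** 2 for xi in x)          # (a) square first, then add
square_of_sum = sum(x) ** 2                        # (b) add first, then square
# (c) only the first two terms, i = 1 to 2
sum_of_products = sum(xi * yi for xi, yi in zip(x[:2], y[:2]))

# Rule (a) of summation: the sum of (x_i + y_i) equals sum x_i + sum y_i.
rule_a_holds = sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y)

print(sum_of_squares, square_of_sum, sum_of_products, rule_a_holds)
```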
Exercises:
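Given \( x_1 = 3 \), \( x_2 = 4 \), \( x_3 = 8 \), \( x_4 = -2 \), \( y_1 = -6 \), \( y_2 = 1 \), \( y_3 = 5 \), \( y_4 = 0 \), find the value of the following:
a. \( \sum_{i=1}^{4} x_i^2 \)
b. \( \sum_{i=1}^{3} (3x_i + y_i) \)
c. \( \left( \sum_{i=2}^{4} x_i \right) \left( \sum_{i=2}^{4} y_i \right) \)
d. \( \left( \sum_{i=1}^{4} y_i \right) + 5 \)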
Statistic – is a characteristic or measure obtained by using the data values from a sample.
Parameter – is a characteristic or measure obtained by using all the data values for a specific population.
A. Arithmetic Mean
The mean is the sum of values divided by the total number of values. This is commonly called the average
in layman’s term. In statistics, all measures of center are called average.
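Parameter: \( \mu = \frac{\sum_{i=1}^{N} x_i}{N} \)   Statistic: \( \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \)
It can be seen from the formulas that the procedure for finding the mean is the same for a population and for a
sample. The mean is the most common measure of the center used for numerical data.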
Example 5.1: The ages in weeks of six kittens at an animal shelter are 3, 8, 5, 12, 14 and 12. Find the mean.
\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{3 + 8 + 5 + 12 + 14 + 12}{6} = \frac{54}{6} = 9 \]
Thus, the mean age of the kittens is 9 weeks.
Example 5.2: The fat contents in grams for one serving of 11 brands of packaged foods, as determined by the U.S.
Department of Agriculture, are given as follows: 6.5, 6.5, 9.5, 8.0, 14.0, 8.5, 3.0, 7.5, 16.5, 7.0, 8.0. Find the
mean.
\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{6.5 + 6.5 + 9.5 + 8.0 + 14.0 + 8.5 + 3.0 + 7.5 + 16.5 + 7.0 + 8.0}{11} = \frac{95}{11} = 8.64 \]
Thus, the mean of fat contents in grams for one serving of 11 brands of packaged foods is 8.64 grams.
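As a quick check of Examples 5.1 and 5.2, the mean is the sum of the values divided by how many there are. The helper name `arithmetic_mean` is an assumption for illustration.

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)

kitten_ages = [3, 8, 5, 12, 14, 12]
fat_contents = [6.5, 6.5, 9.5, 8.0, 14.0, 8.5, 3.0, 7.5, 16.5, 7.0, 8.0]

mean_age = arithmetic_mean(kitten_ages)     # 54 / 6
mean_fat = arithmetic_mean(fat_contents)    # 95 / 11
print(mean_age, round(mean_fat, 2))
```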
B. Median
When the data are arranged in increasing or decreasing order, the median is the halfway point or middle
value in a data set. Meaning, the median is the data point which divides the distribution into two equal parts.
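CASE 1: When the number of observations is odd, the median is the single middle value.
Parameter: \( \tilde{\mu} = X_{\frac{N+1}{2}} \)   Statistic: \( \tilde{x} = X_{\frac{n+1}{2}} \)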
Example 5.3: The weights (in pounds) of a sample of seven army recruits are 180, 201, 220, 191, 219, 209 and
186. Find the median.
Step 1: Arrange the data in order.
180, 186, 191, 201, 209, 219, 220
Step 2: Select the middle value.
Since there are seven (7) observations, \( \tilde{x} = X_{\frac{n+1}{2}} = X_{\frac{7+1}{2}} = X_{\frac{8}{2}} = X_4 \), the 4th observation, which is
the weight 201 pounds.
CASE 2: When the number of observations is even, there are two middle values. The median is the mean
or average of the two middle values. The positions of the two data points are \( \frac{N}{2} \) and \( \frac{N}{2} + 1 \).
Parameter: \( \tilde{\mu} = \frac{X_{\frac{N}{2}} + X_{\frac{N}{2}+1}}{2} \)   Statistic: \( \tilde{x} = \frac{X_{\frac{n}{2}} + X_{\frac{n}{2}+1}}{2} \)
Example 5.4: The ages of a sample of 10 college students are 18, 24, 20, 35, 19, 23, 26, 23, 19, 20. Find the
median.
Step 1: Arrange the data in order: that is, 18, 19, 19, 20, 20, 23, 23, 24, 26, 35.
Step 2: Select the middle value.
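The two (2) middle values are the 5th and 6th observations, so
\[ \tilde{x} = \frac{X_{\frac{n}{2}} + X_{\frac{n}{2}+1}}{2} = \frac{X_{\frac{10}{2}} + X_{\frac{10}{2}+1}}{2} = \frac{X_5 + X_{5+1}}{2} = \frac{X_5 + X_6}{2} = \frac{20 + 23}{2} = \frac{43}{2} = 21.5 \]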
Therefore, the median age is 21.5 years.
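Both cases of the median can be handled by one small function. This is a sketch; the name `median` is chosen for illustration, and positions in the comments are 1-indexed as in the formulas.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                    # position (n + 1) / 2
    return (s[mid - 1] + s[mid]) / 2     # positions n / 2 and n / 2 + 1

recruit_weights = [180, 201, 220, 191, 219, 209, 186]        # Example 5.3, n = 7
student_ages = [18, 24, 20, 35, 19, 23, 26, 23, 19, 20]      # Example 5.4, n = 10

print(median(recruit_weights), median(student_ages))
```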
The median is a good alternative measure of the center when there are extreme values. It is easy to compute
if there are few observations. However, if we have a large set of data, the use of computers is essential in arranging
these data.
Properties of Median:
a. It is unique (for numerical data).
b. It can be computed for ordinal, interval or ratio level.
c. It is not affected by extreme values since the median uses only the middle values.
C. Mode
The third measure of average is called the mode. The mode is the value that occurs most often in
a data set. In that sense, the mode is the most typical value. A data set can have more than one mode or no mode
at all.
Parameter: \( \hat{\mu} \)   Statistic: \( \hat{x} \)
Example 5.5: The following data represent the duration (in days) of US space shuttle voyages for the years 1992-
1994 (Source: The Universal Almanac 1995, p. 563). Find the mode.
8, 9, 9, 14, 8, 8, 19, 7, 6, 9, 7, 8, 10, 14, 11, 8, 14, 11
Example 5.6: Ten (10) students were asked for their opinion on the tuition fee increase, and their responses are: in favor,
in favor, not in favor, not in favor, not in favor, neutral, in favor, in favor, not in favor, and neutral.
Since "in favor" and "not in favor" both occur four times, the data set has two modes: "in favor" and
"not in favor".
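Counting occurrences is enough to find the mode for numerical or categorical data. The sketch below uses `collections.Counter`; treating a data set in which every distinct value occurs equally often as having no mode is an assumed convention, and the function name `modes` is illustrative.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs most often (empty list if there is no mode)."""
    counts = Counter(values)
    top = max(counts.values())
    # Assumed convention: if all distinct values are equally frequent, there is no mode.
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return [value for value, c in counts.items() if c == top]

voyage_days = [8, 9, 9, 14, 8, 8, 19, 7, 6, 9, 7, 8, 10, 14, 11, 8, 14, 11]  # Example 5.5
opinions = ["in favor", "in favor", "not in favor", "not in favor", "not in favor",
            "neutral", "in favor", "in favor", "not in favor", "neutral"]    # Example 5.6

print(modes(voyage_days))   # 8 occurs five times, more than any other value
print(modes(opinions))      # two modes, each occurring four times
```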
Properties of Mode:
a. It can be computed for any type of data whether it is nominal, ordinal, interval or ratio level data.
b. It may not be unique, since a data set can have more than one mode.
c. It may not exist.
D. Weighted Mean
Sometimes, one must find the mean of a data set in which not all values have the same degree of
importance. An example is a data set containing the scores of a student in the quizzes, exams and assignments of a particular
subject, where scores in major exams weigh more than those in quizzes. The mean that takes this
additional factor into account is called the weighted mean.
\[ \bar{x}_w = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i} \]
where \( w_i \) = weight of observation \( i \), \( k \) = number of distinct observations and \( x_i \) = the values.
Example 5.7: Anna is a DOST Scholar student at University of the Philippines – Diliman. She got the following
grades in her subjects last semester:
Compute the grade point average (GPA) of Anna. Will she be able to maintain her scholarship if the required
maintaining grade is at least 1.75?
SUBJECT      GRADE (x_i)   UNITS (w_i)   GRADE × UNITS (w_i x_i)
Math 17      2.25          6             13.5
English 1    1.50          3             4.5
History 1    1.75          3             5.25
Filipino 1   2.00          3             6.0
English 3    1.75          3             5.25
P.E. 1       1.25          2             2.5
TOTAL                      20            37
\[ \bar{x}_w = \frac{\sum_{i=1}^{6} w_i x_i}{\sum_{i=1}^{6} w_i} = \frac{37}{20} = 1.85 \]
Therefore, she is not able to maintain her scholarship since her GPA is 1.85.
Example 5.8: Suppose a survey asked a sample of 30 respondents to rate a movie on its cinematography. The rate
is from 1 to 5 with 1 being the lowest. A summary of data shows that twelve (12) gave rating of five (5), eight (8)
gave a rating of four (4) and, seven (7) and three (3) gave a rating of three (3) and two (2) respectively. Find the
average rating.
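The weighted mean formula translates directly into code. The sketch below applies it to the grades and units of Example 5.7 and to the ratings of Example 5.8, where the number of respondents giving each rating acts as the weight; the function name `weighted_mean` is an illustrative assumption.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights w_i."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Example 5.7: grades weighted by units.
grades = [2.25, 1.50, 1.75, 2.00, 1.75, 1.25]
units = [6, 3, 3, 3, 3, 2]
gpa = weighted_mean(grades, units)              # 37 / 20

# Example 5.8: ratings weighted by the number of respondents who gave them.
ratings = [5, 4, 3, 2]
respondents = [12, 8, 7, 3]
average_rating = weighted_mean(ratings, respondents)   # 119 / 30

print(gpa, round(average_rating, 2))
```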
A. Range
The range is the simplest measure of dispersion. It is the difference between the highest and lowest values in a set of data. That is,
Range, \( R \) = highest value – lowest value
The range is considered a poor measure of dispersion in the sense that it only considers two values in its
computation. Thus, it cannot accurately determine how spread out the values are in a given data set.
Example 5.9: The given data below represents the lifespan of the paints expressed in terms of months.
BRAND A: 45, 60, 50, 55, 48, 56, 57
BRAND B: 35, 25, 45, 28, 39, 40, 44
Find the range for each brand.
B. Variance
Variance is the average of the squares of the distances of each data value from the mean.
Definitional Formula:
Parameter: \( \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \)   Statistic: \( s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \)
You may ask, why should each term of the numerator be squared? This is because \( \sum_{i=1}^{N} (x_i - \mu) = 0 \) and
\( \sum_{i=1}^{n} (x_i - \bar{x}) = 0 \); that is, the sum of the deviations from the mean will always be zero.
The disadvantage of the above formula is that it could lead to serious rounding-off errors, especially
when the value of the mean is also a rounded-off value. Hence, we have the alternative formulas below, which
minimize this error. These formulas are derived by expanding the original formulas above.
Computational Formula:
Parameter: \( \sigma^2 = \frac{N \sum_{i=1}^{N} x_i^2 - \left( \sum_{i=1}^{N} x_i \right)^2}{N^2} \)   Statistic: \( s^2 = \frac{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}{n(n-1)} \)
Example 5.10: A comparison of coffee prices at 4 randomly selected grocery stores showed increases from the
previous month of 12, 15, 17 and 20 cents for 200-gram jar. Find the variance of this random sample of price
increases.
Note that the data were collected from a random sample of 4 grocery stores. If we use the definitional
formula we have the following computations:
Mean, \( \bar{x} = \frac{\sum_{i=1}^{4} x_i}{4} = \frac{12 + 15 + 17 + 20}{4} = \frac{64}{4} = 16 \) cents.
If we use the computational formula, we have the following computations for the sample variance:
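\( n \sum_{i=1}^{4} x_i^2 = 4(12^2 + 15^2 + 17^2 + 20^2) = 4(144 + 225 + 289 + 400) = 4(1058) = 4232 \)
\( \left( \sum_{i=1}^{4} x_i \right)^2 = (12 + 15 + 17 + 20)^2 = (64)^2 = 4096 \)
Thus, \( s^2 = \frac{4 \sum_{i=1}^{4} x_i^2 - \left( \sum_{i=1}^{4} x_i \right)^2}{4(4-1)} = \frac{4232 - 4096}{4(3)} = \frac{136}{12} \approx 11.3 \) cents\( ^2 \).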
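The definitional and computational formulas for the sample variance give the same result on the coffee price data, which the following sketch confirms. The function names are assumptions introduced for illustration.

```python
def sample_variance_definitional(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_computational(xs):
    """(n * sum of x^2 - (sum of x)^2) / (n * (n - 1)); no mean is needed."""
    n = len(xs)
    return (n * sum(x * x for x in xs) - sum(xs) ** 2) / (n * (n - 1))

price_increases = [12, 15, 17, 20]   # cents, from Example 5.10
v_def = sample_variance_definitional(price_increases)    # 34 / 3
v_comp = sample_variance_computational(price_increases)  # 136 / 12
print(round(v_def, 2), round(v_comp, 2))
```

With the definitional formula the squared deviations are 16, 1, 1 and 16, whose sum 34 divided by 3 also gives about 11.3 square cents.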
Remarks:
1. The results of the two formulas will be the same as long as we don’t round off the intermediate computations, only
the final answer.
2. The value of the variance cannot be negative.
3. The unit of measurement for the variance is in square units of the original measure.
C. Standard Deviation
The standard deviation is the positive square root of the variance. It has the same unit of measurement
with the given data. It can be used to compare variability of two or more sets of data having the same units of
measurement with approximately the same mean. It enables us to determine, with a great deal of accuracy, where
the values of a distribution are located in relation to the mean.
The coefficient of variation expresses the standard deviation as a fraction of the mean. The
result is expressed as a percentage.
Parameter: \( CV = \frac{\sigma}{\mu} \times 100\% \)   Statistic: \( CV = \frac{s}{\bar{x}} \times 100\% \)
Example 5.12: The mean of the number of cars sold over a three-month period in branches of Toyota,
Incorporated is 87 and the standard deviation is 5. The mean of the commissions is $5225 and the standard
deviation is $773. Compare the variations of the two.
Since the units of measurement are different, we use the coefficient of variation to compare their relative
variability.
Cars sold: \( CV = \frac{\sigma}{\mu} \times 100\% = \frac{5}{87} \times 100\% = 5.75\% \)
Commission: \( CV = \frac{\sigma}{\mu} \times 100\% = \frac{773}{5225} \times 100\% = 14.79\% \)
Since the coefficient of variation is larger for commissions, the commissions are more varied than the
sales.
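A short sketch of the coefficient of variation for the two quantities in this example. The function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return std_dev / mean * 100

cv_cars = coefficient_of_variation(5, 87)            # cars sold
cv_commission = coefficient_of_variation(773, 5225)  # commissions in dollars
print(round(cv_cars, 2), round(cv_commission, 2))
```

Because the coefficient of variation has no unit, the two values can be compared directly even though one quantity is a count and the other is money.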
Exercise: The mean of the number of pages of a sample of women’s fitness magazines is 132, with a variance of
23; the mean of the number of pages of a sample of men’s fitness magazines is 182, with a variance of 62.
Compare the variations of pages of the two magazines.
A. Percentiles
The percentiles are values that divide a set of observations (arranged in increasing order) into 100 equal parts.
We use \( P_k \) (\( k = 1, 2, 3, \ldots, 99 \)) to denote the \( k \)th percentile, the value such that \( k\% \) of the observations fall below it.
Steps in computing \( P_k \):
Step 1: Arrange the data in increasing order of magnitude.
Step 2: Find the location \( L \) of the \( k \)th percentile by computing \( L = \frac{nk}{100} \).
Step 3: If \( L \) is an integer, then the desired value is the average of the \( L \)th observation and the \( (L + 1) \)th observation. If \( L \)
is not an integer, round \( L \) up to the next integer. The desired value is the observation located at the rounded-up value
of \( L \).
Example 5.13: the number of movies attended last month by a random sample of 12 students are recorded as
follows: 2, 0, 3, 1, 6, 4, 7, 5, 8, 9, 10, and 11. Find the following:
1. 𝑷𝟒𝟖
Arrange the data in increasing order. That is 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
\( L = \frac{48}{100} \times 12 = 5.76 \) and since this is not a whole number, we round it up to 6. Then
\( P_{48} \) = 6th observation = 5.
Therefore, 48% of the observations fall below 5.
B. Deciles
Deciles are values that divide the set of observations into 10 equal parts. The \( k \)th decile is denoted by \( D_k \)
(\( k = 1, 2, 3, \ldots, 9 \)), the value such that \( 10k\% \) of the observations fall below it.
Example 5.15: A teacher gives a 20-point test to 10 students. The scores are 18, 15, 12, 6, 8, 2, 3, 5, 20, and 10.
Find 𝐷8 .
Arrange the following scores in ascending order. That is, 2, 3, 5, 6, 8, 10, 12, 15, 18, 20.
\( L = \frac{8}{10} \times 10 = 8 \). Since it is a whole number,
\( D_8 = \frac{\text{8th observation} + \text{9th observation}}{2} = \frac{15 + 18}{2} = \frac{33}{2} = 16.5 \). Therefore, 80% of the observations fall below
16.5.
C. Quartiles
Quartiles are the values that divide the set of observations into 4 equal parts. The \( k \)th quartile is denoted by \( Q_k \)
(\( k = 1, 2, 3 \)), the value such that \( 25k\% \) of the observations fall below it.
Example 5.16: Find the 𝑄3 for the test scores 5, 12, 15, 16, 20, and 21.
\( L = \frac{3}{4} \times 6 = 4.5 \) and since this is not a whole number, we round it up to 5. Then \( Q_3 \) = 5th observation,
which is 20. Therefore, 75% of the observations fall below 20.
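Percentiles, deciles and quartiles all follow the same location rule, so one function can cover the three worked examples. In this sketch the fraction is passed as a numerator and a denominator (48/100, 8/10, 3/4) so that the whole-number test uses exact integer arithmetic; the function name and this calling convention are assumptions.

```python
def quantile_by_location(values, numerator, denominator):
    """Value below which numerator/denominator of the sorted data fall."""
    s = sorted(values)
    n = len(s)
    # L = n * k / d; divmod keeps the whole part and the remainder exact.
    L, remainder = divmod(n * numerator, denominator)
    if remainder == 0:
        # L is whole: average the L-th and (L + 1)-th observations (1-indexed).
        return (s[L - 1] + s[L]) / 2
    # L is not whole: round up to position L + 1, which is index L.
    return s[L]

movies = [2, 0, 3, 1, 6, 4, 7, 5, 8, 9, 10, 11]
scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
tests = [5, 12, 15, 16, 20, 21]

p48 = quantile_by_location(movies, 48, 100)
d8 = quantile_by_location(scores, 8, 10)
q3 = quantile_by_location(tests, 3, 4)
print(p48, d8, q3)
```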
There is an old saying that states, “You can’t compare apples and oranges”. But with the use of statistics,
it can be done to some extent. Suppose that a student scored 90 on a music test and 45 on an English exam. Direct
comparison of raw scores is impossible, since the exams are not equivalent in terms of number of questions, value
of each question, and so on. However, a comparison using a relative standard common to both can be made.
This comparison uses the mean and the standard deviation and is called a z-score (standard score).
Example 5.17: A student scored 65 on a calculus test that had a mean of 50 and a standard deviation of 10. She
scored 30 on a history test with a mean of 25 and a standard deviation of 5. Compare her relative positions on
the two tests.
First, find the z-scores. For calculus, \( z = \frac{x - \mu}{\sigma} = \frac{65 - 50}{10} = 1.5 \).
For history, the z-score is \( z = \frac{30 - 25}{5} = 1.0 \). Since the z-score for calculus is larger, her relative
position in the calculus class is higher than her relative position in the history class.
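The comparison can be sketched in a few lines. The history test has a standard deviation of 5, so that is the divisor for its z-score; the function name `z_score` is illustrative.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

z_calculus = z_score(65, mean=50, std_dev=10)
z_history = z_score(30, mean=25, std_dev=5)
print(z_calculus, z_history)
```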
A. Arithmetic Mean
For computational convenience, we construct an additional column in the frequency distribution table.
The entries for this column contain the product of the frequency and its corresponding class mark, denoted
by \( f_i x_i \). The mean is then obtained using the formula
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{k} f_i x_i \]
Note: The arithmetic mean cannot be computed from an open-ended frequency distribution.
Example 5.18: More and more employers are using psychological testing as an aid in determining whether the
applicant is fit for the work in the company. The following data shows the distribution of the scores of applicants
who took the psychological test administered by a company.
First, we compute the class mark for each class and then multiply them by the corresponding frequency
as shown below.
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{k} f_i x_i = \frac{1}{58}(4279) = 73.78 \]
Therefore, the mean score of the applicants is 73.78.
\[ \tilde{x}_g = LCB_m + c \left[ \frac{\frac{n}{2} - F_{(m-1)}}{f_m} \right] \]
Note: The median of a grouped data can be calculated even with open-ended intervals provided the median class
is not open-ended.
Score       Class Frequency   Class Mark   f_i x_i   Class            Cumulative
            (f_i)             (x_i)                  Boundary         Frequency (F_i)
41 – 50     5                 45.5         227.5     40.5 – 50.5      5
51 – 60     7                 55.5         388.5     50.5 – 60.5      12
61 – 70     10                65.5         655       60.5 – 70.5      22
71 – 80     16                75.5         1208      70.5 – 80.5      38
81 – 90     11                85.5         940.5     80.5 – 90.5      49
91 – 100    9                 95.5         859.5     90.5 – 100.5     58
Total       58                             4279
To find the median class, compute \( \frac{n}{2} = \frac{58}{2} = 29 \). The 29th observation is located in the 4th class. Therefore,
\[ \tilde{x}_g = LCB_m + c \left[ \frac{\frac{n}{2} - F_{(m-1)}}{f_m} \right] = 70.5 + 10 \left[ \frac{29 - 22}{16} \right] = 70.5 + 4.375 \approx 74.88 \]
C. Mode
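To compute the mode, we first locate the modal class. This is the interval having the highest
frequency. From this modal class, compute the mode using the formula
\[ Mode_g = LCB_{mo} + c \left[ \frac{f_{mo} - f_1}{2 f_{mo} - f_1 - f_2} \right] \]
where \( f_{mo} \) is the frequency of the modal class, \( f_1 \) is the frequency of the class preceding it, and \( f_2 \) is the frequency of the class following it.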
Note: The mode can also be computed with open-ended intervals provided the modal class is not open-ended.
Score       Class Frequency   Class Mark   f_i x_i   Class            Cumulative
            (f_i)             (x_i)                  Boundary         Frequency (F_i)
41 – 50     5                 45.5         227.5     40.5 – 50.5      5
51 – 60     7                 55.5         388.5     50.5 – 60.5      12
61 – 70     10                65.5         655       60.5 – 70.5      22
71 – 80     16                75.5         1208      70.5 – 80.5      38
81 – 90     11                85.5         940.5     80.5 – 90.5      49
91 – 100    9                 95.5         859.5     90.5 – 100.5     58
Total       58                             4279
The modal class is the 4th interval since it has the highest frequency, which is 16. Hence,
\[ Mode_g = LCB_{mo} + c \left[ \frac{f_{mo} - f_1}{2 f_{mo} - f_1 - f_2} \right] = 70.5 + 10 \left[ \frac{16 - 10}{2(16) - 10 - 11} \right] = 70.5 + 5.45 = 75.95 \]
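The grouped mean, median and mode of the psychological test scores can be recomputed from the frequency table alone. This is a sketch using the lower class boundaries, the frequencies and the class width \( c = 10 \); the variable names are illustrative, and missing neighbours of the modal class are treated as frequency 0 by assumption.

```python
lower_bounds = [40.5, 50.5, 60.5, 70.5, 80.5, 90.5]   # lower class boundaries
freqs = [5, 7, 10, 16, 11, 9]                         # class frequencies
c = 10                                                # class width
n = sum(freqs)                                        # 58 applicants

# Mean: class marks weighted by their frequencies.
marks = [lb + c / 2 for lb in lower_bounds]
mean_g = sum(f * x for f, x in zip(freqs, marks)) / n

# Median: first class whose cumulative frequency reaches n / 2.
cumulative = 0
for lb, f in zip(lower_bounds, freqs):
    if cumulative + f >= n / 2:
        median_g = lb + c * (n / 2 - cumulative) / f
        break
    cumulative += f

# Mode: class with the highest frequency and its two neighbours.
mo = freqs.index(max(freqs))
f1 = freqs[mo - 1] if mo > 0 else 0                   # class before the modal class
f2 = freqs[mo + 1] if mo < len(freqs) - 1 else 0      # class after the modal class
mode_g = lower_bounds[mo] + c * (freqs[mo] - f1) / (2 * freqs[mo] - f1 - f2)

print(round(mean_g, 2), median_g, round(mode_g, 2))
```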
D. Variance
To make computation faster, additional columns for \( f_i x_i \) and \( f_i x_i^2 \) have to be added to the frequency
distribution table, with their sums computed. To compute the variance, we use the formula
\[ s_g^2 = \frac{n \sum_{i=1}^{k} f_i x_i^2 - \left( \sum_{i=1}^{k} f_i x_i \right)^2}{n(n - 1)} \]
Example 5.21: Using the data from Example 5.18, compute the variance.
Score       Class Frequency   Class Mark   f_i x_i   x_i^2      f_i x_i^2
            (f_i)             (x_i)
41 – 50     5                 45.5         227.5     2070.25    10351.25
51 – 60     7                 55.5         388.5     3080.25    21561.75
61 – 70     10                65.5         655       4290.25    42902.50
71 – 80     16                75.5         1208      5700.25    91204.00
81 – 90     11                85.5         940.5     7310.25    80412.75
91 – 100    9                 95.5         859.5     9120.25    82082.25
Total       58                             4279                 328514.50
Then,
\[ s_g^2 = \frac{n \sum_{i=1}^{k} f_i x_i^2 - \left( \sum_{i=1}^{k} f_i x_i \right)^2}{n(n - 1)} = \frac{58(328514.50) - (4279)^2}{58(58 - 1)} = \frac{744000}{3306} \approx 225.05 \]
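The grouped variance, and the standard deviation as its square root, can be checked with the class marks and frequencies. The variable names in this sketch are illustrative.

```python
import math

marks = [45.5, 55.5, 65.5, 75.5, 85.5, 95.5]   # class marks x_i
freqs = [5, 7, 10, 16, 11, 9]                  # frequencies f_i
n = sum(freqs)

sum_fx = sum(f * x for f, x in zip(freqs, marks))        # 4279
sum_fx2 = sum(f * x * x for f, x in zip(freqs, marks))   # 328514.5

variance_g = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))  # 744000 / 3306
std_dev_g = math.sqrt(variance_g)
print(round(variance_g, 2), round(std_dev_g, 2))
```

The quotient 744000/3306 is about 225.045, so its square root, the grouped standard deviation, is about 15.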
E. Standard Deviation
The computing formula for the standard deviation is
𝑠𝑔 = √𝑠𝑔 2
F. Range
The range can be measured by getting the difference between the highest class boundary and lowest class
boundary, that is,
𝑅 = ℎ𝑖𝑔ℎ𝑒𝑠𝑡 𝑐𝑙𝑎𝑠𝑠 𝑏𝑜𝑢𝑛𝑑𝑎𝑟𝑦 − 𝑙𝑜𝑤𝑒𝑠𝑡 𝑐𝑙𝑎𝑠𝑠 𝑏𝑜𝑢𝑛𝑑𝑎𝑟𝑦
Example 5.23: Find the range 𝑅 of the given data in Example 5.18.
G. Percentile
For the computation of the \( m \)th percentile \( P_m \), we need the cumulative frequency column. First, compute
\( \frac{mn}{100} \), then locate the \( \frac{mn}{100} \)th observation in the cumulative frequency column and identify the corresponding interval,
called the \( m \)th percentile class. From this interval, compute \( P_m \) using the formula
\[ P_m = LCB_m + c \left[ \frac{\frac{mn}{100} - F_{(m-1)}}{f_m} \right] \]
where 𝐿𝐶𝐵𝑚 = the lower class boundary of the 𝑚𝑡ℎ percentile interval
𝐹(𝑚−1) = the cumulative frequency of the class interval immediately preceding the 𝑚𝑡ℎ
percentile interval
𝑓𝑚 = the frequency of the 𝑚𝑡ℎ percentile interval
𝑐 = the class width
H. Decile
For the computation of the \( m \)th decile \( D_m \), first compute \( \frac{mn}{10} \), then locate the \( \frac{mn}{10} \)th observation in the
cumulative frequency column and identify the corresponding interval, called the \( m \)th decile class. From this
interval, compute \( D_m \) using the formula
\[ D_m = LCB_m + c \left[ \frac{\frac{mn}{10} - F_{(m-1)}}{f_m} \right] \]
where 𝐿𝐶𝐵𝑚 = the lower class boundary of the 𝑚𝑡ℎ decile interval
𝐹(𝑚−1) = the cumulative frequency of the class interval immediately preceding the 𝑚𝑡ℎ
decile interval
𝑓𝑚 = the frequency of the 𝑚𝑡ℎ decile interval
𝑐 = the class width
I. Quartile
For the computation of the \( m \)th quartile \( Q_m \), first compute \( \frac{mn}{4} \), then locate the \( \frac{mn}{4} \)th observation in the
cumulative frequency column and identify the corresponding interval, called the \( m \)th quartile class. From this
interval, compute \( Q_m \) using the formula
\[ Q_m = LCB_m + c \left[ \frac{\frac{mn}{4} - F_{(m-1)}}{f_m} \right] \]
where 𝐿𝐶𝐵𝑚 = the lower class boundary of the 𝑚𝑡ℎ quartile interval
𝐹(𝑚−1) = the cumulative frequency of the class interval immediately preceding the 𝑚𝑡ℎ
quartile interval
𝑓𝑚 = the frequency of the 𝑚𝑡ℎ quartile interval
𝑐 = the class width
Example 5.24: Using the data given in Example 5.18, compute the following and interpret these values.
Score       Class Frequency   Class Mark   f_i x_i   Class            Cumulative
            (f_i)             (x_i)                  Boundary         Frequency (F_i)
41 – 50     5                 45.5         227.5     40.5 – 50.5      5
51 – 60     7                 55.5         388.5     50.5 – 60.5      12
61 – 70     10                65.5         655       60.5 – 70.5      22
71 – 80     16                75.5         1208      70.5 – 80.5      38
81 – 90     11                85.5         940.5     80.5 – 90.5      49
91 – 100    9                 95.5         859.5     90.5 – 100.5     58
Total       58                             4279
a. \( P_{45} \): \( L = \frac{45(58)}{100} = 26.1 \approx 27 \). The 27th value arranged in order falls in the 4th class interval. Therefore,
\[ P_{45} = 70.5 + 10 \left[ \frac{26.1 - 22}{16} \right] = 70.5 + 2.5625 = 73.06 \]
Interpretation: 45% of all the observations fall below 73.06.
b. \( D_6 \): \( L = \frac{6(58)}{10} = 34.8 \approx 35 \). The 35th value arranged in order falls in the 4th class interval. Therefore,
\[ D_6 = 70.5 + 10 \left[ \frac{34.8 - 22}{16} \right] = 70.5 + 8 = 78.5 \]
Interpretation: 60% of all the observations fall below 78.5.
c. \( Q_3 \): \( L = \frac{3(58)}{4} = 43.5 \approx 44 \). The 44th value arranged in order falls in the 5th class interval. Therefore,
\[ Q_3 = 80.5 + 10 \left[ \frac{43.5 - 38}{11} \right] = 80.5 + 5 = 85.5 \]
Interpretation: 75% of all the observations fall below 85.5.
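The percentile, decile and quartile formulas for grouped data differ only in the fraction of \( n \) being located, so a single function can serve all three. In this sketch the fraction is given as a numerator and denominator, and the function name is an assumption. For \( Q_3 \) the located class is the 5th, whose lower class boundary is 80.5, so the formula gives 80.5 + 5 = 85.5.

```python
def grouped_quantile(lower_bounds, freqs, c, numerator, denominator):
    """Interpolate the value below which numerator/denominator of the data fall."""
    n = sum(freqs)
    target = numerator * n / denominator      # e.g. 45 * 58 / 100 = 26.1
    cumulative = 0                            # F of the classes before the current one
    for lb, f in zip(lower_bounds, freqs):
        if cumulative + f >= target:          # the target falls in this class
            return lb + c * (target - cumulative) / f
        cumulative += f
    raise ValueError("fraction must be between 0 and 1")

lower_bounds = [40.5, 50.5, 60.5, 70.5, 80.5, 90.5]
freqs = [5, 7, 10, 16, 11, 9]

p45 = grouped_quantile(lower_bounds, freqs, 10, 45, 100)
d6 = grouped_quantile(lower_bounds, freqs, 10, 6, 10)
q3 = grouped_quantile(lower_bounds, freqs, 10, 3, 4)
print(round(p45, 2), round(d6, 2), round(q3, 2))
```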
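\[ m_1 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x}) = 0 \]
\[ m_2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{n-1}{n} s^2 \]
\[ m_3 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^3 \]
\[ m_4 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^4 \]
\[ m_r = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^r \]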
In Figure 5.1 above, the mean, the median and the mode have the same numerical value. This is true when
the distribution is symmetric. In Figure 5.2, the mean is at the right of the median while in Figure 5.3, the mean is
at the left of the median. Thus for a positively skewed distribution, Mode < Median < Mean while for a negatively
skewed distribution, Mean < Median < Mode. An absolute measure of skewness is the expression (Mean – Mode),
but a change in measurement units will give varying values. A relative measure, which will not be affected by
a change in measurement units and is easy to compute, is Karl Pearson’s coefficient of skewness and is defined by
When 𝑔1 is positive, it is positively skewed, and the distribution is negatively skewed when 𝑔1 is negative. There
are many other measures of skewness and each measure may give a different numerical value, but they will lead
to similar interpretations, which is the ultimate objective in making the computations.
When the distribution is symmetric, it possesses the property that all odd-ordered central moments (\( r \) is
odd in \( m_r = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^r \)) are equal to zero. A measure of skewness based on the moments is
\[ g_1 = \frac{m_3}{m_2 \sqrt{m_2}} \]
\[ g_1 = \frac{1}{n} \sum_{i=1}^{k} f_i \left( \frac{x_i - \bar{x}}{s} \right)^3 \]
Example 5.25: Consider the set of numbers 6, 8, 10, 11, 15. Find its measure of skewness.
First we compute the mean, \( \bar{x} = \frac{50}{5} = 10 \). Using Karl Pearson’s coefficient of skewness, we need to
solve for \( m_2 \) and \( m_3 \).
Two distributions may have the same variability, but one may be relatively flatter at the top than the normal
curve. A curve is normal if it has a bell-shaped form with mean = median = mode. To measure the flatness of
a distribution we use the coefficient of kurtosis. This measure is denoted by \( g_2 \) and is defined by
\[ g_2 = \frac{m_4}{m_2^2} \]
where \( m_2 \) is the 2nd moment and \( m_4 \) is the 4th moment. The formula above can be written alternatively as
\[ g_2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n (s^2)^2} \]
As an aid to interpreting kurtosis, the value of 𝑔2 is 3 for a normal distribution and when this value is
attained the distribution is said to be mesokurtic. Distributions with 𝑔2 greater than 3 are called leptokurtic (sharp
top) and those with 𝑔2 less than 3 are called platykurtic (flat top).
Example 5.26: Find the measure of kurtosis of the set of numbers given in Example 5.25.
Hence, \( m_4 = \frac{\sum_{i=1}^{5} (x_i - \bar{x})^4}{5} = \frac{898}{5} = 179.6 \).
Therefore, the measure of kurtosis is \( g_2 = \frac{m_4}{m_2^2} = \frac{179.6}{(9.2)^2} = 2.12 \).
Since 𝑔2 = 2.12 which is less than 3, the distribution for this set of numbers has a flat top.
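The moment-based measures for the set 6, 8, 10, 11, 15 can be verified with a small helper that computes the \( r \)th central moment. The sketch uses the moment definitions \( g_1 = m_3 / (m_2 \sqrt{m_2}) \) and \( g_2 = m_4 / m_2^2 \); the function name `central_moment` is an assumption.

```python
import math

def central_moment(xs, r):
    """r-th central moment: the average of (x - mean) ** r over all n values."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** r for x in xs) / n

data = [6, 8, 10, 11, 15]
m2 = central_moment(data, 2)      # 46 / 5
m3 = central_moment(data, 3)      # 54 / 5
m4 = central_moment(data, 4)      # 898 / 5

g1 = m3 / (m2 * math.sqrt(m2))    # skewness
g2 = m4 / m2 ** 2                 # kurtosis
print(round(g1, 2), round(g2, 2))
```

The skewness comes out at about 0.39, which is positive, so under these formulas the set is slightly skewed to the right as well as flatter than the normal curve.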