
Republic of the Philippines

BICOL STATE COLLEGE OF APPLIED SCIENCES AND TECHNOLOGY


City of Naga

COLLEGE OF ENGINEERING AND ARCHITECTURE


There are two types of statistical data, classified by whether they are gathered from primary or secondary sources.

PRIMARY data are data coming from primary sources, which include government agencies, business establishments, organizations and individuals who carry original data or have first-hand information relevant to a given problem.

SECONDARY data are those coming from secondary sources, which include newspapers, magazines, journals and other published materials.

METHODS OF COLLECTING DATA
Marketing studies usually employ the interview or questionnaire method to gather information on consumer preferences and certain buying habits.
Feasibility studies make use of records of data available in various government agencies such as the PSA, the Central Bank, NEDA and the Securities and Exchange Commission.

Primary Data can be obtained through:

1. DIRECT or INTERVIEW METHOD – A person-to-person encounter between the one soliciting information (interviewer) and the one supplying the data (interviewee). The researcher can use either a personal or a telephone interview. The advantage of the interview method is that questions can be repeated, rephrased or modified for better understanding.

2. INDIRECT or QUESTIONNAIRE METHOD – The researcher distributes the questionnaire either by mail or by hand to the intended persons and collects them by the same process.
- This method is more economical than the interview because it can reach a greater number of individuals in the population with the same amount of funds. However, the researcher cannot expect that all the questionnaires mailed will be retrieved, since many respondents will simply ignore them.

3. THE OBSERVATION METHOD – The researcher obtains data pertaining to the behavior of an individual or a group of individuals at the time a given situation occurs. Subjects may be observed individually or collectively depending on the objectives of the investigator. One limitation of this method is that, in most cases, observation can be made only at the time the appropriate events occur.

4. THE USE OF DOCUMENTS OR REGISTRATION
- This method of collecting data makes use of important documents, such as records of the number of households, birth rates, death rates and marriages, that can be found in both private and government offices. It is very economical not only in terms of cost but also in terms of time and effort. Those in business make their projections of trends and patterns based on data from the National Economic and Development Authority. The PSA takes care of keeping birth, death and marriage records. The Commission on Elections takes care of updating the list of registered voters. In companies, records of individual employees are kept and later used as a basis for promotion and salary increases.

5. EXPERIMENTAL METHOD
This method of collecting data is used to find cause and effect relationships. For example, a farmer is interested in finding out the effect of a gasoline additive on the gasoline consumption of cars.



Data needed to find out the cause and effect relationship may be obtained through a series of experiments.

SECONDARY DATA can be obtained from:
1. JOURNALS & PERIODICALS
2. NEWSPAPERS
3. TABLES
4. UNPUBLISHED RESEARCH PAPERS
5. THESES & DISSERTATIONS

DETERMINING THE SAMPLE SIZE
Most surveys are conducted on a sample basis because of financial and economic considerations, time, and the manageability of the data involved. When sampling from a finite population of N individuals, the sample size (n) may be obtained by SLOVIN's formula:

$$n = \frac{N}{1 + Ne^{2}}$$

$n$ = required sample size
$N$ = the size of the finite population
$e$ = margin of error

Example: A group of researchers was tasked by the House of Representatives to survey whether students in Metro Naga favor moving the start of classes from June to August. If there are 1,000,000 students and a 10% margin of error is allowed, compute the sample size.

$N = 1{,}000{,}000$
$e = 0.10$

$$n = \frac{1{,}000{,}000}{1 + (1{,}000{,}000)(0.10)^{2}} = \frac{1{,}000{,}000}{10{,}001} = 99.99 \approx 100$$

PROBABILITY SAMPLING TECHNIQUES

A sample should not be selected in a haphazard way if the data and information obtained are to be reliable and realistic.
A sampling technique is a procedure used to determine which elements are to be included in the sample. A procedure in which the elements are chosen by chance is called a random sampling technique or probability sampling technique.

A random sample is a sample drawn in such a way that each individual has an equal probability of being observed or chosen.

1. SYSTEMATIC SAMPLING – Used when a population is arranged in some order: goods in a warehouse, drugs in a pharmacy, crops in a field, or names listed in a telephone directory. In such cases, it may be easier to draw the sample systematically than to draw a random sample that would entail looking for particular items within the population. The sample is obtained by selecting one unit or member on a random basis and then choosing additional units at equally spaced intervals until the desired sample size is reached.
Example: To select 15 students from a list of 80 students, select the first name at random, say the 15th, then select every third name until the sample of 15 names is completed: 15th, 18th, 21st, 24th, 27th, …

2. STRATIFIED SAMPLING
A stratified sample is obtained by independently selecting a separate random sample from each population stratum or class. This is appropriate in studies where the research problem requires comparison between various subgroups. It assures the researcher that the sample will be representative of the population in terms of certain critical factors that have been used as a basis for



stratification and assures the researcher of adequate cases for subgroup analysis.

3. CLUSTER SAMPLING – Groups are selected rather than individuals. Example: select 5 schools from among the 20 schools in Bicol.

4. MULTI-STAGE SAMPLING
A combination of several sampling techniques used in getting a sample from a large population. This is done by dividing the whole area into smaller areas and then each area into strata. Then, from each stratum, the sample is drawn using the simple random sampling technique.

NON-PROBABILITY SAMPLING
- A sampling technique in which the participants of the investigation do not all have an equal chance of being selected. Certain parts of the overall group are deliberately not included in the selection of the representative subgroup.
This technique is also called non-random sampling or judgment sampling because it makes use of judgment in the selection of items to be included in the subgroup.

Non-random or judgment sampling is classified into:
(1) convenience sampling
(2) quota sampling
(3) purposive sampling.

1. Convenience or Accidental Sampling – The researcher simply includes the convenient cases in his or her sample and excludes the inconvenient cases. (For example, in an interview by telephone, only those who have a telephone can be interviewed.)

2. Quota Sampling – In this sampling procedure, diverse characteristics of a population, such as age, gender or social class, are sampled in the proportions which they occupy in the population. Suppose, for instance, that we are going to draw a quota sample from the students attending a university where 42% are females and 58% are males. Using this method, interviewers are given a quota of students to locate, so that 42% of the sample consists of females and 58% of the sample consists of males. The same percentages are included in the sample.

3. PURPOSIVE (JUDGMENT or DELIBERATE) SAMPLING
Purposive (judgment or deliberate) sampling is a more sophisticated type of sampling which emerges when personal judgment, presumably based on prior experience, plays a major role in selecting a group of observations. In this type of sampling, logic, common sense or sound judgment can be used to select a sample that is representative of a large population. An example of this technique is when you are interested in finding out the reaction of some students to the devaluation of the peso. Instead of asking the opinion of all the students in various colleges and universities in Bicol, you may purposely ask only the student leaders of a particular college or university. This is sampling with a purpose.

METHOD OF DATA PRESENTATION

1. THE TEXTUAL FORM
The "PARAGRAPH METHOD" combines text and figures in a statistical report. This method presents data in paragraph form and becomes effective when the objective is to call the reader's attention to some data that require special emphasis.

2. TABULAR FORM
Data are presented in rows and columns. It is more convenient and understandable than the



textual method because the numerical information is displayed in a more concise and systematic manner, using vertical and horizontal lines that set off the corresponding headings. A table has four essential components:
a. Table heading – shows the table number and the title.
b. Body – the main part of the table, which contains the quantitative information.
c. Stub – the left part of the table, listing the classifications or categories which are presented as values of a variable.
d. Box head – the captions that appear above the columns.

3. GRAPHICAL FORM
The most effective way of presenting statistical data because important relationships are brought out clearly. Comparisons and trends of quantitative values are readily visible, which eases the communication of results or information.

a. BAR GRAPH
Used to organize data/information visually. Bar graphs are helpful in comparing quantities. The bars can be horizontal or vertical.

b. LINE GRAPH
The most practical and effective device for showing a general trend, pattern or change over a given time. It plots ordered pairs in a coordinate plane. The categories or time periods are chronologically arranged on the horizontal axis and the relevant values are indicated on the vertical axis.


c. CIRCLE GRAPH or PIE CHART
It consists of a circular region divided into sections that do not overlap; each section represents a part or percentage of the whole being considered.

d. PICTOGRAPH
A very effective tool for attracting attention since it uses pictures or symbols to convey the message of the obtained numerical information.

THE FREQUENCY DISTRIBUTION

A tabular arrangement of the data by categories or classes and their corresponding frequencies. The frequency of a particular observation is the number of times the observation occurs in a category or class.

Example of raw data

Sturges' Rule
$$K = 1 + 3.3 \log N$$
$K$ = number of class intervals
$N$ = total number of observations
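Sturges' rule, like Slovin's formula from the section on determining the sample size, is a one-line computation. The sketch below is only an illustration: the function names are not part of the handout, a base-10 logarithm is assumed, and both results are rounded up to the next whole number, which matches how the worked examples treat 99.99 and 6.61.

```python
import math


def slovin_sample_size(population, margin_of_error):
    """Slovin's formula n = N / (1 + N * e^2), rounded up to a whole respondent."""
    return math.ceil(population / (1 + population * margin_of_error ** 2))


def sturges_classes(n_observations):
    """Sturges' rule K = 1 + 3.3 * log10(N), rounded up to a whole class count."""
    return math.ceil(1 + 3.3 * math.log10(n_observations))


# Figures taken from the handout's own examples.
survey_n = slovin_sample_size(1_000_000, 0.10)
class_count = sturges_classes(50)
print(survey_n, class_count)
```

Rounding up rather than to the nearest integer is a design choice here: it never produces a sample smaller than the formula asks for, and never fewer classes than the rule suggests.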



A sample of fifty customers at a newly opened market has been selected at random. The following data show the customers' ages:

12 19 35 28 19
10 27 21 20 18
23 23 13 18 50
41 21 39 17 46
59 27 60 26 14
23 21 37 29 29
21 23 33 27 53
16 11 23 19 47
32 21 25 34 21
42 28 48 47 52

For 50 observations, determine the number of classes:

$$K = 1 + 3.3 \log 50 = 6.61 \approx 7$$

The procedures/steps in constructing a frequency distribution are as follows:

1. Decide on the number of class intervals to use (between 5 and 15). Use Sturges' formula, $K = 1 + 3.3 \log N$.

2. Find the range (the difference between the highest value and the lowest value in the set of data).
$R = 60 - 10 = 50$

3. Divide the range by the desired number of class intervals. The result is rounded up to the next unit if the scores to be grouped are expressed as whole numbers. Otherwise, it is rounded up to the next number with the same number of decimal places as the given measurements. The resulting number is called the interval size, class size or class width.

$$C = \frac{R}{K} = \frac{50}{7} = 7.14, \text{ rounded up to the next whole number: } C = 8$$

4. Choose an appropriate lower limit for the first class interval. This number is less than or equal to the lowest value in the data. It is more convenient to use a lower limit that is divisible by the class width. Add the class width to obtain the next lower class limit. Keep on adding the class width to get all the other lower class limits.

5. Find the upper class limits. If the class size is rounded off to the units place, subtract one from the second lower class limit to arrive at the first upper class limit. Subtract 0.1 if it is rounded off to the tenths place, or 0.01 if it is rounded to the hundredths place.

6. Determine the class boundaries. The class boundaries are the true limits of a class interval, made up of the lower class boundary and the upper class boundary. A class boundary is the point midway between the upper limit of one class interval and the lower limit of the next higher class interval.

7. Find the class mark or midpoint of each class interval, as follows:

$$\text{Class mark} = \frac{\text{lower limit} + \text{upper limit}}{2}$$

8. Tally the raw scores and indicate the frequency for each of the class intervals.
9. Add the frequencies and indicate the sum.
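Steps 2 to 7 above can be sketched in code for whole-number data. This is an illustration under stated assumptions, not the handout's prescribed procedure: the function name `build_classes` is hypothetical, the first lower limit is taken to be the lowest observation (as in the worked example), and the class width is the range divided by K, rounded up.

```python
import math

# Customer ages from the worked example above.
ages = [
    12, 19, 35, 28, 19, 10, 27, 21, 20, 18,
    23, 23, 13, 18, 50, 41, 21, 39, 17, 46,
    59, 27, 60, 26, 14, 23, 21, 37, 29, 29,
    21, 23, 33, 27, 53, 16, 11, 23, 19, 47,
    32, 21, 25, 34, 21, 42, 28, 48, 47, 52,
]


def build_classes(data, k):
    """Return (class width, list of classes) for whole-number data."""
    data_range = max(data) - min(data)          # step 2
    width = math.ceil(data_range / k)           # step 3, rounded up
    classes = []
    lower = min(data)                           # step 4 (assumed starting point)
    while lower <= max(data):
        upper = lower + width - 1               # step 5, units place
        classes.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),   # step 6
            "mark": (lower + upper) / 2,                # step 7
        })
        lower += width
    return width, classes


width, classes = build_classes(ages, k=7)
for c in classes:
    print(c["limits"], c["boundaries"], c["mark"])
```

The loop keeps adding classes until the highest observation is covered, so the sketch still works when the rounded width would otherwise leave the maximum outside the last interval.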



MEASURES OF CENTRAL TENDENCY

A measure of central tendency, commonly referred to as an average, is a single value that represents a data set.

The arithmetic mean, or simply the mean, is the most frequently used measure of central tendency. It is the only common measure in which all values play an equal role, meaning that to determine its value you need to consider all the values in the data set. The mean is appropriate for determining the central tendency of interval or ratio data. The symbol $\bar{x}$ is used to represent the mean of a sample, and $\mu$ is used to denote the mean of a population.

Mean for Ungrouped Data:

Mean = (sum of all values) / (number of values)

Sample mean
$$\bar{x} = \frac{\sum X}{n}$$
$\bar{x}$ = sample mean
$X$ = the value of any particular observation or measurement
$\sum X$ = sum of all Xs
$n$ = total number of values in the sample

Population mean
$$\mu = \frac{\sum X}{N}$$
$\mu$ = population mean
$X$ = the value of any particular observation or measurement
$\sum X$ = sum of all Xs
$N$ = total number of values in the population

Example 1: Consider the sample 7, 9, 10, 8, 20, 18, 19, 16 and 13. Find the mean.

$$\bar{x} = \frac{\sum X}{n} = \frac{7 + 9 + 10 + 8 + 20 + 18 + 19 + 16 + 13}{9} = \frac{120}{9} = 13.33$$

Example: Find the population mean of the ages of 9 middle-management employees of a certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58 and 55.

$$\mu = \frac{53 + 45 + 59 + 48 + 54 + 46 + 51 + 58 + 55}{9} = \frac{469}{9} = 52.11$$

The population mean age of the middle-management employees is 52.11.

Sample Mean for Grouped Data

$$\bar{x} = \frac{\sum fX}{n}$$
$\bar{x}$ = sample mean
$f$ = frequency
$X$ = the midpoint (class mark) of each class interval
$\sum fX$ = sum of all the products of f and X
$n$ = total number of values in the sample

Class Limits    f     Midpoint (X)    fX
10-17           7     13.5            94.5
18-25           18    21.5            387
26-33           10    29.5            295
34-41           5     37.5            187.5
42-49           5     45.5            227.5
50-57           2     53.5            107
58-65           3     61.5            184.5
                                      $\sum fX$ = 1483
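The ungrouped and grouped mean formulas above can be checked numerically. The snippet below is a minimal sketch: the helper names are illustrative, and the grouped version simply reuses the midpoints and frequencies exactly as printed in the table.

```python
def mean(values):
    """Arithmetic mean: sum of all values divided by the number of values."""
    return sum(values) / len(values)


def grouped_mean(midpoints, frequencies):
    """Grouped mean: sum of f * X divided by the total frequency n."""
    total = sum(f * x for f, x in zip(frequencies, midpoints))
    return total / sum(frequencies)


sample = [7, 9, 10, 8, 20, 18, 19, 16, 13]
employee_ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]
midpoints = [13.5, 21.5, 29.5, 37.5, 45.5, 53.5, 61.5]
frequencies = [7, 18, 10, 5, 5, 2, 3]

sample_mean = mean(sample)
population_mean = mean(employee_ages)
market_mean = grouped_mean(midpoints, frequencies)
print(round(sample_mean, 2), round(population_mean, 2), round(market_mean, 2))
```

The same `mean` function serves both the sample and the population case; the two differ only in notation ($\bar{x}$ versus $\mu$), not in arithmetic.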



$$\bar{x} = \frac{\sum fX}{n} = \frac{1483}{50} = 29.66$$

The mean age of the customers in the market is 29.66.

WEIGHTED MEAN – useful when various classes or groups contribute differently to the total. The weighted mean is found by multiplying each value by its corresponding weight, adding the products, and dividing by the sum of the weights.

$$\bar{X}_w = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i} = \frac{w_1 X_1 + w_2 X_2 + \dots + w_n X_n}{w_1 + w_2 + \dots + w_n}$$

$\bar{X}_w$ = weighted mean
$w_i$ = corresponding weight
$X_i$ = the value of any particular observation or measurement

Example:
If 9,500 Algebra books were sold at P235 each, 1,300 Trigonometry books at P220 each, 500 Business books at P340 each and 4,500 Statistics books at P270 each, find the weighted mean selling price of the four books.

$$\bar{X}_w = \frac{9500(235) + 1300(220) + 500(340) + 4500(270)}{9500 + 1300 + 500 + 4500} = \frac{3{,}903{,}500}{15{,}800} = 247.06$$

GEOMETRIC MEAN – the nth root of the product of n numbers.

Application:
a. To average percents, indexes and relatives;
b. To establish the average percent increase in production, sales or other business transactions or economic series from one period of time to another.

$$GM = \sqrt[n]{X_1 X_2 X_3 \cdots X_n}$$

$GM$ = geometric mean
$X_i$ = the value of any particular observation or measurement
$n$ = number of observations

Example:
Suppose the profits earned by the MSS Construction Company on five projects were 5, 6, 4, 8 and 10 percent, respectively. What is the geometric mean profit?

$X_1 = 5,\ X_2 = 6,\ X_3 = 4,\ X_4 = 8,\ X_5 = 10,\ n = 5$

$$GM = \sqrt[5]{5 \cdot 6 \cdot 4 \cdot 8 \cdot 10} = \sqrt[5]{9600} = 6.26$$

For the average percent increase over a period of time:

$$GM = \sqrt[n-1]{\frac{\text{value at the end of the period}}{\text{value at the start of the period}}} - 1$$

Example:
Badminton as a sport grew rapidly in 2008. From January to December 2017 the number of badminton clubs in Metro Manila increased from 25 to 165. Compute the monthly percent increase in the number of badminton clubs.

$n = 12$

$$GM = \sqrt[12-1]{\frac{165}{25}} - 1 = 0.18714 = 18.714\% \text{ increase per month}$$

MEDIAN – the midpoint of the data array. When the data set is ordered, whether ascending or descending, it is called a data array. The median is an appropriate measure of central tendency for data that are ordinal or above, but it is more valuable for ordinal data.

MEDIAN for UNGROUPED DATA

To determine the value of the median for ungrouped data, consider two rules:

a. If n is odd, the median is the middle-ranked value.



b. If n is even, the median is the average of the two middle-ranked values.

$$\text{Median (rank value)} = \frac{n + 1}{2}$$
$n$ = the population or sample size

Example:
Find the median of the ages of 9 middle-management employees of a certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58, 55.

Arrange the data in order:
45, 46, 48, 51, 53, 54, 55, 58, 59

$$\text{Median (rank value)} = \frac{9 + 1}{2} = 5\text{th}$$

Therefore, the median age is 53.

Example: The daily rates of a sample of eight employees at CGK Inc. are P550, P420, P560, P500, P700, P670, P860, P480.

Arrange the data in order:
P420, P480, P500, P550, P560, P670, P700, P860

$$\text{Median (rank value)} = \frac{8 + 1}{2} = 4.5\text{th rank}$$

$$\text{Median} = \frac{550 + 560}{2} = 555$$

Therefore, the median daily rate is P555.

MEDIAN for GROUPED DATA:

For N = 50:

$$\text{Median (rank value)} = \frac{N}{2} = \frac{50}{2} = 25\text{th rank}$$

$$\text{Median} = LB + \left(\frac{\frac{N}{2} - cf}{f}\right) i$$

Where:
$LB$ = lower boundary of the median class
$N$ = sample (or population) size
$cf$ = cumulative frequency before the median class
$f$ = frequency of the median class
$i$ = class interval (class width)

Class Limits    f     cf
10-17           7     7
18-25           18    25
26-33           10    35
34-41           5     40
42-49           5     45
50-57           2     47
58-65           3     50
                N = 50

The 25th rank falls in the class 18-25, so this is the median class, and the cumulative frequency before it is cf = 7.

$LB = 18 - 0.5 = 17.5$
$i = 18 - 10 = 8$

$$\text{Median} = 17.5 + \left(\frac{\frac{50}{2} - 7}{18}\right) 8 = 17.5 + \left(\frac{18}{18}\right) 8 = 25.5$$

Thus, the median is 25.5. Observe that the median falls within the class boundaries of the median class (17.5 to 25.5).

MODE – For ungrouped data, the mode is defined as the value that appears with the highest frequency, that is, the item that appears most often. It is used with nominal data. It can be easily identified by inspecting an ungrouped set of data and finding the score or item which occurs most frequently.

When all values appear with the same frequency, the mode does not exist. A distribution with only one mode is called unimodal, a distribution with two modes is bimodal, and a set of data with three or more modes is multimodal.
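The weighted mean, geometric mean and median computations worked out above can be cross-checked with a short script. This is a sketch only: the function names are illustrative, `math.prod` assumes Python 3.8 or later, and the growth-rate helper follows the handout's convention of taking the (n − 1)th root.

```python
import math


def weighted_mean(values, weights):
    """Sum of w * X divided by the sum of the weights."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


def geometric_mean(values):
    """nth root of the product of n numbers."""
    return math.prod(values) ** (1 / len(values))


def average_growth_rate(start, end, n_periods):
    """(end / start) ** (1 / (n - 1)) - 1, as in the badminton example."""
    return (end / start) ** (1 / (n_periods - 1)) - 1


def median(values):
    """Middle-ranked value, or the average of the two middle-ranked values."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def grouped_median(lower_boundary, n, cf_before, f, width):
    """LB + ((N/2 - cf) / f) * i for the median class."""
    return lower_boundary + ((n / 2 - cf_before) / f) * width


book_price = weighted_mean([235, 220, 340, 270], [9500, 1300, 500, 4500])
profit_gm = geometric_mean([5, 6, 4, 8, 10])
club_growth = average_growth_rate(25, 165, 12)
age_median = median([53, 45, 59, 48, 54, 46, 51, 58, 55])
rate_median = median([550, 420, 560, 500, 700, 670, 860, 480])
market_median = grouped_median(17.5, 50, 7, 18, 8)
print(round(book_price, 2), round(profit_gm, 2), round(club_growth, 5))
print(age_median, rate_median, market_median)
```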



Example: Determine the mode of the following distribution:

3, 5, 8, 8, 9, 10
• The mode is 8 – UNIMODAL

Example: Find the mode of the following: 13, 13, 14, 12, 15, 18, 17, 17.

The values 13 and 17 each appear twice, so the modes are 13 and 17 – BIMODAL

Example: Find the mode of the following:

3, 5, 7, 7, 8, 5, 5, 8, 8, 9, 10, 9, 9

The values 5, 8 and 9 each appear three times, so the modes are 5, 8 and 9 – TRIMODAL or MULTIMODAL.

MODE of GROUPED DATA – the class mark or class midpoint of the class interval with the highest frequency (the MODAL CLASS). The mode obtained in this manner is called a crude mode because it is just a rough approximation of the actual mode. To determine the true mode, use the formula:

$$M_o = L_{CbMo} + \left(\frac{d_1}{d_1 + d_2}\right) c$$

$M_o$ = mode
$L_{CbMo}$ = lower class boundary of the modal class
$d_1$ = difference between the frequency of the modal class and the frequency of the class interval before the modal class
$d_2$ = difference between the frequency of the modal class and the frequency of the class interval after the modal class
$c$ = class size (class width)

The modal class is the class interval with the highest frequency.

Class Limits    f     cf
10-17           7     7
18-25           18    25
26-33           10    35
34-41           5     40
42-49           5     45
50-57           2     47
58-65           3     50
                N = 50

The modal class is 18-25, with f = 18.

$d_1 = 18 - 7 = 11$
$d_2 = 18 - 10 = 8$

$$M_o = 17.5 + \left(\frac{11}{11 + 8}\right) 8 = 22.13$$

The mode is 22.13.

MIDRANGE
The midrange is the average of the lowest and highest values in a data set.

$$\text{Midrange} = \frac{X_{lowest} + X_{highest}}{2}$$
$X_{lowest}$ = the lowest value in the data set
$X_{highest}$ = the highest value in the data set

Example: Find the midrange of the ages of 9 middle-management employees of a certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58 and 55.

$X_{lowest} = 45$
$X_{highest} = 59$

$$\text{Midrange} = \frac{45 + 59}{2} = 52$$

The midrange is 52.
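The mode and midrange examples above can be sketched as follows. The helper names are illustrative, and returning an empty list when every value occurs equally often is one possible way to encode the handout's statement that the mode does not exist in that case.

```python
from collections import Counter


def modes(values):
    """All values sharing the highest frequency; [] when no mode exists."""
    counts = Counter(values)
    top = max(counts.values())
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []  # every value appears equally often
    return sorted(v for v, c in counts.items() if c == top)


def grouped_mode(lower_boundary, f_modal, f_before, f_after, width):
    """L + (d1 / (d1 + d2)) * c for the modal class."""
    d1 = f_modal - f_before
    d2 = f_modal - f_after
    return lower_boundary + d1 / (d1 + d2) * width


def midrange(values):
    """Average of the lowest and highest values."""
    return (min(values) + max(values)) / 2


unimodal = modes([3, 5, 8, 8, 9, 10])
bimodal = modes([13, 13, 14, 12, 15, 18, 17, 17])
multimodal = modes([3, 5, 7, 7, 8, 5, 5, 8, 8, 9, 10, 9, 9])
market_mode = grouped_mode(17.5, 18, 7, 10, 8)
age_midrange = midrange([53, 45, 59, 48, 54, 46, 51, 58, 55])
print(unimodal, bimodal, multimodal, round(market_mode, 2), age_midrange)
```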



Activity:
Complete the table to find the mean, median, mode and midrange for the grouped frequency distribution:

Class Limits    f     X     fX     cf
37-39           3
40-42           6
43-45           10
46-48           5
49-51           4
52-54           2
55-57           1

MEASURES OF DISPERSION AND LOCATION

The standard deviation is a statistical term that provides a good indication of volatility. It measures how widely values are dispersed from the average.

Dispersion is the difference between the actual value and the average value.

Range is the difference between the highest value and the lowest value in the data set.

Average Deviation is built on the absolute deviation of each element of a data set, that is, the absolute difference between that element and a given point. The measure of central tendency is typically the point from which the deviation is measured.

Mean absolute deviation is a summary statistic of statistical dispersion or variability.

AVERAGE DEVIATION FOR UNGROUPED DATA

$$AD = \frac{\sum |x - \bar{x}|}{n}$$

Example: The daily rates of a sample of eight employees at ASEAN BICOL UNIVERSITY are P1200, 1500, 1350, 1400, 1550, 1300, 1250, 1150. Find the average deviation.

a. Find the mean of the data set.

$$\bar{x} = \frac{\sum x}{n} = \frac{1200 + 1500 + 1350 + 1400 + 1550 + 1300 + 1250 + 1150}{8} = \frac{10700}{8} = 1337.5$$

b. Subtract the mean from each of the values in the data set and take the absolute value.

x        x - x̄      |x - x̄|
1200     -137.5     137.5
1500     162.5      162.5
1350     12.5       12.5
1400     62.5       62.5
1550     212.5      212.5
1300     -37.5      37.5
1250     -87.5      87.5
1150     -187.5     187.5
$\sum x = 10700$     $\sum (x - \bar{x}) = 0$     $\sum |x - \bar{x}| = 900$

$$AD = \frac{\sum |x - \bar{x}|}{n} = \frac{900}{8} = 112.5$$

AVERAGE DEVIATION FOR GROUPED DATA

$$AD = \frac{\sum f|x - \bar{x}|}{n}$$
$AD$ = average deviation
$f$ = frequency
$x$ = the value of any particular observation or measurement (the class midpoint for grouped data)
$\bar{x}$ = sample mean
$N$ = population size
$n$ = sample size
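The two average deviation formulas above can be sketched in code. The function names are illustrative. For the grouped case the sketch uses the frequencies of the activity table, with each midpoint computed as (lower limit + upper limit) / 2, and it keeps the mean unrounded, so intermediate figures can differ slightly from hand computations that round the mean to two decimals first.

```python
def average_deviation(values):
    """Mean of the absolute deviations from the mean (ungrouped data)."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


def grouped_average_deviation(midpoints, frequencies):
    """Sum of f * |x - mean| divided by n (grouped data)."""
    n = sum(frequencies)
    m = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    return sum(f * abs(x - m) for f, x in zip(frequencies, midpoints)) / n


daily_rates = [1200, 1500, 1350, 1400, 1550, 1300, 1250, 1150]
class_midpoints = [38, 41, 44, 47, 50, 53, 56]   # from limits 37-39, ..., 55-57
class_frequencies = [3, 6, 10, 5, 4, 2, 1]

ad_ungrouped = average_deviation(daily_rates)
ad_grouped = grouped_average_deviation(class_midpoints, class_frequencies)
print(ad_ungrouped, round(ad_grouped, 2))
```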



Class Limits    f     x     fx     x - x̄     |x - x̄|    f|x - x̄|
37-39           3     38    114    -7.06     7.06       21.18
40-42           6     41    246    -4.06     4.06       24.36
43-45           10    44    440    -1.06     1.06       10.6
46-48           5     47    235    1.94      1.94       9.7
49-51           4     50    200    4.94      4.94       19.76
52-54           2     53    106    7.94      7.94       15.88
55-57           1     56    56     10.94     10.94      10.94
                            $\sum fx = 1397$                   $\sum f|x - \bar{x}| = 112.42$

$$\bar{x} = \frac{\sum fx}{n} = \frac{1397}{31} = 45.06$$

$$AD = \frac{\sum f|x - \bar{x}|}{n} = \frac{112.42}{31} = 3.63$$

3.63 is the average deviation of the data set.

VARIANCE & STANDARD DEVIATION:

Standard deviation is one of the most widely used measures of dispersion. It is calculated as the square root of the variance.
Variance is the mathematical expectation of the squared deviation from the mean.
Volatility is a measure of risk, so this statistic can help determine the risk an investor might take on when purchasing a specific security.
The Range Rule of Thumb uses the range to approximate the standard deviation:

$$s \approx \frac{\text{range}}{4} \quad \text{(a rough estimate of the standard deviation)}$$

For ungrouped data:

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Example: The daily rates of a sample of eight employees at ASEAN BICOL UNIVERSITY are P1200, 1500, 1350, 1400, 1550, 1300, 1250, 1150. Find the variance and standard deviation.

x        x - x̄      (x - x̄)²
1200     -137.5     18906.25
1500     162.5      26406.25
1350     12.5       156.25
1400     62.5       3906.25
1550     212.5      45156.25
1300     -37.5      1406.25
1250     -87.5      7656.25
1150     -187.5     35156.25
$\sum x = 10700$     $\sum (x - \bar{x}) = 0$     $\sum (x - \bar{x})^2 = 138{,}750$

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{138{,}750}{8 - 1} = 19{,}821.43$$

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{19{,}821.43} = 140.79$$

The variance is 19,821.43 and the standard deviation is 140.79.

VARIANCE AND STANDARD DEVIATION FOR GROUPED DATA

$$s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1}$$

$$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}}$$

Class Limits    f     x     fx     x - x̄     (x - x̄)²     f(x - x̄)²
37-39           3     38    114    -7.06     49.8436      149.5308
40-42           6     41    246    -4.06     16.4836      98.9016
43-45           10    44    440    -1.06     1.1236       11.236
46-48           5     47    235    1.94      3.7636       18.818
49-51           4     50    200    4.94      24.4036      97.6144
52-54           2     53    106    7.94      63.0436      126.0872
55-57           1     56    56     10.94     119.6836     119.6836
                            $\sum fx = 1397$                     $\sum f(x - \bar{x})^2 = 621.87$

$$s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} = \frac{621.87}{31 - 1} = 20.729$$

$$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}} = \sqrt{20.729} = 4.55$$

The variance is 20.729 and the standard deviation is 4.55.

QUARTILE, DECILE AND PERCENTILES

When presenting or analyzing a data set, it is sometimes helpful to group subjects into several equal groups.
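Before going further into quartiles, the variance and standard deviation examples worked above can be reproduced with a short script. This is a sketch with illustrative function names. It uses the n − 1 divisor from the handout's formulas and keeps the mean unrounded, and it also evaluates the range rule of thumb for the same daily rates to show how rough that estimate is.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


def grouped_sample_variance(midpoints, frequencies):
    """Sum of f * (x - mean)^2, divided by n - 1."""
    n = sum(frequencies)
    m = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    return sum(f * (x - m) ** 2 for f, x in zip(frequencies, midpoints)) / (n - 1)


daily_rates = [1200, 1500, 1350, 1400, 1550, 1300, 1250, 1150]
class_midpoints = [38, 41, 44, 47, 50, 53, 56]
class_frequencies = [3, 6, 10, 5, 4, 2, 1]

var_rates = sample_variance(daily_rates)
sd_rates = math.sqrt(var_rates)
var_grouped = grouped_sample_variance(class_midpoints, class_frequencies)
sd_grouped = math.sqrt(var_grouped)
rule_of_thumb = (max(daily_rates) - min(daily_rates)) / 4   # range / 4

print(round(var_rates, 2), round(sd_rates, 2))
print(round(var_grouped, 3), round(sd_grouped, 2))
print(rule_of_thumb)
```

For the daily rates the rule of thumb gives 100 against a computed standard deviation of about 140.79, which illustrates why the handout calls it only a rough estimate.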



To create four equal groups we need values that split the data such that 25% of the observations are in each group. The cut-off points are called quartiles.
DECILES split the data into 10 parts and PERCENTILES (centiles) split the data into 100 parts. The median is the 50th percentile or the 5th decile.

QUARTILES for ungrouped data

$$Q_k = \frac{k(N + 1)}{4}$$

$Q_k$ = position (rank) of the kth quartile
$N$ = number of observations
$k$ = quartile location (1, 2 or 3)

Example:

Find the first, second and third quartiles of the ages of 9 middle-management employees of a certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58 and 55.

Arrange the data in order:

45, 46, 48, 51, 53, 54, 55, 58, 59

$$Q_1 \text{ position} = \frac{1(9 + 1)}{4} = 2.5 \qquad Q_2 \text{ position} = \frac{2(9 + 1)}{4} = 5 \qquad Q_3 \text{ position} = \frac{3(9 + 1)}{4} = 7.5$$

$Q_1$ = 2.5th value $= \frac{46 + 48}{2} = 47$
$Q_2$ = 5th value $= 53$
$Q_3$ = 7.5th value $= \frac{55 + 58}{2} = 56.5$

For GROUPED DATA:

$$Q_k = LB + \left(\frac{\frac{kN}{4} - cf}{f}\right) i$$

Class Limits    f     cf
37-39           3     3
40-42           6     9
43-45           10    19
46-48           5     24
49-51           4     28
52-54           2     30
55-57           1     31

$$Q_1 \text{ ranked value} = \frac{1N}{4} = \frac{31}{4} = 7.75$$

$$Q_1 = 39.5 + \left(\frac{\frac{31}{4} - 3}{6}\right) 3 = 41.875$$

$$Q_2 \text{ ranked value} = \frac{2(31)}{4} = \frac{31}{2} = 15.5$$

$$Q_2 = 42.5 + \left(\frac{\frac{2(31)}{4} - 9}{10}\right) 3 = 44.45$$

$$Q_3 \text{ ranked value} = \frac{3(31)}{4} = \frac{93}{4} = 23.25$$

$$Q_3 = 45.5 + \left(\frac{\frac{3(31)}{4} - 19}{5}\right) 3 = 48.05$$

DECILE
DETERMINE THE $D_7$ CLASS

$$D_7 \text{ ranked value} = \frac{7N}{10} = \frac{7(50)}{10} = 35\text{th}$$

Class Limits    f     cf
10-17           7     7
18-25           18    25
26-33           10    35
34-41           5     40
42-49           5     45
50-57           2     47
58-65           3     50

$$D_k = LB + \left(\frac{\frac{kN}{10} - cf}{f}\right) i$$

$$D_7 = 25.5 + \left(\frac{\frac{7(50)}{10} - 25}{10}\right) 8 = 33.5$$

PERCENTILE
DETERMINE THE $P_{22}$ CLASS

$$P_{22} \text{ ranked value} = \frac{22(50)}{100} = 11\text{th}$$

Class Limits    f     cf
10-17           7     7
18-25           18    25
26-33           10    35
34-41           5     40
42-49           5     45
50-57           2     47
58-65           3     50



$$P_{22} = 17.5 + \left(\frac{\frac{22(50)}{100} - 7}{18}\right) 8 = 19.28$$

MIDHINGE – the mean of the first and third quartiles in the data set. Using the grouped-data quartiles $Q_1 = 41.875$ and $Q_3 = 45.5 + \left(\frac{23.25 - 19}{5}\right) 3 = 48.05$:

$$\text{Midhinge} = \frac{Q_1 + Q_3}{2} = \frac{41.875 + 48.05}{2} = 44.96$$

Interquartile Range (IQR) – also called the midspread or middle fifty; a measure of statistical dispersion equal to the difference between the third and first quartiles. Using the ungrouped-data quartiles $Q_1 = 47$ and $Q_3 = 56.5$:

$$IQR = Q_3 - Q_1 = 56.5 - 47 = 9.5$$

$$\text{Quartile Deviation (QD)} = \frac{Q_3 - Q_1}{2} = \frac{9.5}{2} = 4.75$$

Coefficient of Variation (CV) = standard deviation / mean:

$$CV = \frac{s}{\bar{x}} \times 100\%$$

Interpretation and Uses of Standard Deviation

As already mentioned, the variance and standard deviation of a variable can be used to determine the dispersion, or spread, of a variable. Specifically, the larger the variance and standard deviation, the more the data values are spread or dispersed. The Russian mathematician P. L. Chebyshev (1821-1894) developed a theorem that specifies the proportions of the spread in terms of the standard deviation.

A. Chebyshev's Theorem. For any set of observations, the proportion of the values that lie within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$, where k is any constant greater than 1.

Example 1: The mean price of a laptop computer is P25,500 and the standard deviation is P2,500. Find the price range within which at least 88.89% of the laptops will sell.

Solution:

$\mu = 25{,}500$ and $\sigma = 2{,}500$

When k = 3,

$$1 - \frac{1}{3^2} = 1 - \frac{1}{9} = 88.89\%$$

Chebyshev's theorem states that at least 88.89% of the data values will fall within 3 standard deviations of the mean. Hence,

P25,500 + 3(P2,500) = P25,500 + P7,500 = P33,000

P25,500 - 3(P2,500) = P25,500 - P7,500 = P18,000

Therefore, at least 88.89% of all laptops sold will have a price between P18,000 and P33,000.

Example 2: Water bills in January average P230 with a standard deviation of P58. What percent have a bill between P56 and P404?

Example 3: In a town, the average income is 34,200 with a standard deviation of 2,200. What percent of homes earn between 29,800 and 38,600?

B. Empirical Rule. For a symmetrical, bell-shaped frequency distribution,

1. About 68% of the area under the normal distribution curve is within one standard deviation of the mean. This can be written as $\mu \pm 1\sigma$.
2. About 95% of the area under the normal distribution curve is within two standard deviations of the mean. This can be written as $\mu \pm 2\sigma$.
3. About 99.7% of the area under the normal distribution curve is within



three standard deviations of the mean. This can be written as $\mu \pm 3\sigma$.

Example 3. The distribution of the weights of newborn babies is bell shaped, with a mean of 3325 grams and a standard deviation of 571 grams. What percent of newborn babies weigh between 2183 grams and 4467 grams?

$\mu = 3325$ grams
$\sigma = 571$ grams

$\mu + 1\sigma = 3325 + 571 = 3896$ grams
$\mu - 1\sigma = 3325 - 571 = 2754$ grams
$\mu + 2\sigma = 3325 + 2(571) = 4467$ grams
$\mu - 2\sigma = 3325 - 2(571) = 2183$ grams

Therefore, from the empirical rule, about 95% of newborn babies weigh between 2183 and 4467 grams.

SKEWNESS

The coefficient of skewness measures the general shape of the distribution, or its lack of symmetry. It ranges from -3 to +3 and relates the difference between the mean and the median to the standard deviation. The direction of the long tail of the distribution indicates the direction of the skewness.

Skewness is extremely important in finance and investing. Most sets of data, including stock prices and asset returns, have either positive or negative skew rather than following the balanced normal distribution (which has a skewness of zero). By knowing which way data are skewed, one can better estimate whether a given (or future) data point will be more or less than the mean. Most advanced economic analysis models study data for skewness and incorporate this into their calculations. Skewness risk is the risk that a model assumes a normal distribution of data when in fact the data are skewed to the left or right of the mean.
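Two calculations from the preceding pages lend themselves to a quick numerical check: the grouped-data formula for quartiles, deciles and percentiles, and the intervals used with Chebyshev's theorem and the empirical rule. The sketch below is an illustration with hypothetical function names. The caller is assumed to have already located the class containing the ranked value and to pass in its lower boundary, the cumulative frequency before it, its frequency and the class width.

```python
def grouped_fractile(k, parts, n, lower_boundary, cf_before, f, width):
    """LB + ((k*N/parts - cf) / f) * i; parts is 4, 10 or 100."""
    return lower_boundary + ((k * n / parts - cf_before) / f) * width


def chebyshev_bound(k):
    """Minimum proportion of values within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2


def k_sigma_interval(mean, sd, k):
    """Interval from mean - k*sd to mean + k*sd."""
    return mean - k * sd, mean + k * sd


q1 = grouped_fractile(1, 4, 31, 39.5, 3, 6, 3)
q3 = grouped_fractile(3, 4, 31, 45.5, 19, 5, 3)
d7 = grouped_fractile(7, 10, 50, 25.5, 25, 10, 8)
p22 = grouped_fractile(22, 100, 50, 17.5, 7, 18, 8)

laptop_share = chebyshev_bound(3)
laptop_range = k_sigma_interval(25500, 2500, 3)
baby_range = k_sigma_interval(3325, 571, 2)

print(q1, round(q3, 2), d7, round(p22, 2))
print(round(laptop_share, 4), laptop_range, baby_range)
```

Using one function with a `parts` argument reflects the fact that the three fractile formulas differ only in the divisor applied to kN.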



Types of Distribution:

1. Symmetrical Distribution. The data values are evenly distributed on both sides of the mean. Also, the distribution is unimodal, and the mean, median and mode are similar and lie at the center of the distribution.

2. Positively Skewed Distribution (or Right-Skewed Distribution). Most of the values in the data fall to the left of the mean and group at the lower end of the distribution; the tail is to the right. Also, the mean is to the right of the median and the mode is to the left of the median.

3. Negatively Skewed Distribution (or Left-Skewed Distribution). The mass of the data values falls to the right of the mean and groups at the upper end of the distribution, with the tail to the left. In addition, the mean is to the left of the median and the mode is to the right of the median.

PEARSON's Coefficient of Skewness

The coefficient of skewness is positive when the median is less than the mean; in that case the tail of the distribution is skewed to the right (notionally the positive section of a Cartesian frame). When the median is more than the mean, the coefficient of skewness is negative and the tail of the distribution is skewed to the left: it is longer on the left side than on the right.

$$sk = \frac{3(\bar{x} - \text{median})}{s}$$

$sk$ = coefficient of skewness
$\bar{x}$ = sample mean
$s$ = sample standard deviation

1. A motorcycle dealership pays its salespersons a salary plus a commission on sales. The mean monthly commission is P8,800, the median is P9,000 and the standard deviation is P1,200. Determine the coefficient of skewness. Comment on the shape of the distribution.

2. Calculate the skewness of the following frequency distributions, given their corresponding means, medians and standard deviations:

a. Frequency Distribution A
Mean = 50.7
Median = 49.1
Standard Deviation = 9.2



b. Frequency Distribution B
Mean = 75.3
Median = 76.7
Standard Deviation = 6.8

MEASURES OF KURTOSIS
The degree of peakedness of the frequency curve of a distribution in relation to a normal distribution is known as kurtosis (Ku).

$$Ku = \frac{\sum (x - \bar{x})^4}{n s^4} \quad \text{for ungrouped data}$$

$$Ku = \frac{\sum f(x - \bar{x})^4}{n s^4} \quad \text{for grouped data}$$

Ku > 3 (leptokurtic)     Ku = 3 (mesokurtic)     Ku < 3 (platykurtic)

A frequency distribution with a relatively high peak is called leptokurtic. This means that the values in the distribution are heavily concentrated or piled up in the center. A leptokurtic curve has long tails.

A flat-topped distribution, where the values are relatively evenly distributed about the center, is known as a platykurtic curve. This type of curve possesses short tails.

A normal distribution curve, which has neither a relatively high peak nor a flat top, is called mesokurtic.
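Pearson's coefficient of skewness and the ungrouped kurtosis formula can be sketched as below. The function names are illustrative, s is assumed to be the sample standard deviation (n − 1 divisor), and the list passed to `kurtosis` is a made-up toy data set used only to exercise the formula, not data from the handout.

```python
def pearson_skewness(mean, median, sd):
    """sk = 3 * (mean - median) / s."""
    return 3 * (mean - median) / sd


def kurtosis(values):
    """Ku = sum((x - mean)^4) / (n * s^4), with s^2 using the n - 1 divisor."""
    n = len(values)
    m = sum(values) / n
    s_squared = sum((x - m) ** 2 for x in values) / (n - 1)
    return sum((x - m) ** 4 for x in values) / (n * s_squared ** 2)


def describe_kurtosis(ku):
    """Classify against the normal-curve benchmark Ku = 3."""
    if ku > 3:
        return "leptokurtic"
    if ku < 3:
        return "platykurtic"
    return "mesokurtic"


# Motorcycle dealership exercise: mean 8,800, median 9,000, s = 1,200.
sk_commission = pearson_skewness(8800, 9000, 1200)

# Hypothetical toy data, evenly spread, so a flat (platykurtic) result is expected.
toy_ku = kurtosis([1, 2, 3, 4, 5])

print(sk_commission, round(toy_ku, 3), describe_kurtosis(toy_ku))
```

A negative coefficient, as in the commission exercise, indicates that the median exceeds the mean and the longer tail lies to the left.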
