The interquartile range is the difference between the third and first quartiles of a set of
data. This measure considers the spread of the middle fifty percent of the data; therefore,
it is not influenced by the extreme values of the data set.
The interquartile range is obtained by subtracting the first quartile from the third quartile:
Interquartile range = Q3 – Q1
The median is the middle value of an ordered array of data. It splits the ordered numbers
in half: 50% of the observations are smaller and 50% are larger than the median. The
quartiles are descriptive measures that split the ordered data into four quarters.
Quartiles are used to describe positional values of large sets of numerical data.
The first quartile is the positional value for which 25% of the data are smaller and 75% are
larger. It is given by the following formula:
$Q_1 = \left(\frac{n+1}{4}\right)^{\text{th}}$ value of the ordered data array.
The third quartile is the positional value for which 75% of the data are smaller and 25% are
larger. It is given by the following formula:
$Q_3 = \left(\frac{3(n+1)}{4}\right)^{\text{th}}$ value of the ordered data array.
Solution
Arrange the numbers in ascending order:
3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21 22
Location of Q1: $\frac{n+1}{4} = \frac{17+1}{4} = 4.5$th, i.e. the average of the 4th and 5th items.
Thus $Q_1 = \frac{7+8}{2} = 7.5$
Location of the median: $\frac{17+1}{2} = 9$th item, in this case 15.
Location of Q3: $\frac{3(n+1)}{4} = \frac{3(17+1)}{4} = 13.5$th, i.e. the average of the 13th and 14th items.
Thus $Q_3 = \frac{19+19}{2} = 19$
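The position rule used in this solution can be sketched in Python. This is a minimal sketch, not a library routine: the function names `value_at_position` and `quartiles` are illustrative, and a fractional position is assumed to mean linear interpolation between the two neighbouring items (so 4.5 gives the average of the 4th and 5th items).

```python
def value_at_position(ordered, position):
    """Return the value at a 1-based, possibly fractional, position.

    Assumption: a fractional position is resolved by linear interpolation
    between the two neighbouring items of the ordered data.
    """
    if not 1 <= position <= len(ordered):
        raise ValueError("position must lie between 1 and n")
    lower = int(position)          # whole part: rank of the lower neighbour
    fraction = position - lower    # fractional part: how far to move upward
    if fraction == 0:
        return ordered[lower - 1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


def quartiles(data):
    """Q1, median and Q3 at positions (n+1)/4, (n+1)/2 and 3(n+1)/4."""
    ordered = sorted(data)
    n = len(ordered)
    q1 = value_at_position(ordered, (n + 1) / 4)
    median = value_at_position(ordered, (n + 1) / 2)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, median, q3


data = [3, 4, 5, 7, 8, 9, 11, 14, 15, 16, 16, 17, 19, 19, 20, 21, 22]
q1, median, q3 = quartiles(data)
print("Q1 =", q1, "median =", median, "Q3 =", q3, "IQR =", q3 - q1)
```

For the 17 values above this reproduces Q1 = 7.5, median = 15 and Q3 = 19. Other quartile conventions exist and can give slightly different values for the same data.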
$\frac{n+1}{4} = \frac{8+1}{4} = 2.25$
The first quartile is located between the 2nd and 3rd position.
Q1 = 5 + (0.25)(6 − 5) = 5.25
$\frac{3(n+1)}{4} = \frac{3(8+1)}{4} = 6.75$
The third quartile is located between the 6th and 7th position.
Q3 = 10 + (0.75)(10 − 10) = 10
Step 3: Once the intervals are determined, the following formula is used:
$Q_1 = L_1 + \left[\frac{\frac{n+1}{4} - F_1}{f_1}\right] \times c_1$
Example 24
Using the data in Example 13, compute the quartiles and median of the employees’ years of
working experience.
Solution
Step 1: Obtain the cumulative frequencies (F)
Years of experience (X)   Number of employees, f   Cumulative frequency, F   Position of data
1 – 4                     16                       16                        1 – 16
5 – 8                     20                       36                        17 – 36      (Q1)
9 – 12                    28                       64                        37 – 64      (Q2)
13 – 16                   24                       88                        65 – 88
17 – 20                   16                       104                       89 – 104     (Q3)
21 – 24                   11                       115                       105 – 115
25 – 28                   5                        120                       116 – 120
Total                     120
Step 2: Identify the class containing:
the $\left(\frac{n+1}{4}\right)$th value for Q1: $\frac{120+1}{4} = 30.25$th, which lies in the interval 5 – 8, the first quartile class;
the $\left(\frac{n+1}{2}\right)$th value for the median: $\frac{120+1}{2} = 60.5$th, which lies in the interval 9 – 12, the median class;
the $\left(\frac{3(n+1)}{4}\right)$th value for Q3: $\frac{3(120+1)}{4} = 90.75$th, which lies in the interval 17 – 20, the third quartile class.
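$\text{Median} = L_m + \left[\frac{\frac{n+1}{2} - F_{m-1}}{f_m}\right] \times c = 8.5 + \left[\frac{\frac{120+1}{2} - 36}{28}\right] \times 4 = 12$ years
$Q_3 = L_3 + \left[\frac{\frac{3(n+1)}{4} - F_3}{f_3}\right] \times c_3 = 16.5 + \left[\frac{\frac{3(120+1)}{4} - 88}{16}\right] \times 4 = 17.2$ years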
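The grouped-data calculation above can be checked with a short Python sketch. The function name `grouped_quantile` is illustrative, and the sketch assumes class boundaries 0.5, 4.5, 8.5, ... with a common width of 4, as implied by the values 8.5 and 16.5 used in the worked formulas.

```python
def grouped_quantile(classes, position):
    """Interpolate the value at a 1-based position within grouped data.

    classes: list of (lower_boundary, width, frequency) in ascending order.
    Applies L + [(position - F) / f] * c, where F is the cumulative
    frequency of all classes before the class containing the position.
    """
    cumulative_before = 0
    for lower_boundary, width, frequency in classes:
        if position <= cumulative_before + frequency:
            return lower_boundary + (position - cumulative_before) / frequency * width
        cumulative_before += frequency
    raise ValueError("position exceeds the total frequency")


# Example 24: classes 1-4, 5-8, ..., 25-28 (assumed boundaries 0.5, 4.5, ...).
frequencies = [16, 20, 28, 24, 16, 11, 5]
classes = [(0.5 + 4 * i, 4, f) for i, f in enumerate(frequencies)]
n = sum(frequencies)

q1 = grouped_quantile(classes, (n + 1) / 4)
median = grouped_quantile(classes, (n + 1) / 2)
q3 = grouped_quantile(classes, 3 * (n + 1) / 4)
print(round(q1, 2), round(median, 2), round(q3, 2))
```

Under these assumptions the median comes out as 12 years and Q3 as 17.1875, which rounds to the 17.2 years shown above. The same call with position (n+1)/4 gives the first quartile, which the worked solution does not show.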
Example 25
Determine the first quartile, median and third quartile of the following data, and state
the inter quartile range. The table shows the distribution of the scores of 100 students
taking their final Math examination.
Scores      Number of students
30 – 39     6
40 – 49     9
50 – 59     20
60 – 69     35
70 – 79     20
80 – 89     6
90 – 99     4
Total       100
Solution
Step 1:
Scores      Number of students, f   Cumulative frequency, F
30 – 39     6                       6
40 – 49     9                       15
50 – 59     20                      35
60 – 69     35                      70
70 – 79     20                      90
80 – 89     6                       96
90 – 99     4                       100
Total       100
Step 2:
Step 3:
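One possible way to complete Steps 2 and 3 is sketched below. It assumes the same position rule and interpolation formula as Example 24, class boundaries 29.5, 39.5, ..., 99.5 with class width 10, and cumulative frequencies 6, 15, 35, 70, 90, 96, 100 obtained by adding the frequencies in order.

```latex
\begin{align*}
\text{Position of } Q_1 &= \tfrac{100+1}{4} = 25.25\text{th} \;\Rightarrow\; \text{class } 50\text{--}59 \\
\text{Position of median} &= \tfrac{100+1}{2} = 50.5\text{th} \;\Rightarrow\; \text{class } 60\text{--}69 \\
\text{Position of } Q_3 &= \tfrac{3(100+1)}{4} = 75.75\text{th} \;\Rightarrow\; \text{class } 70\text{--}79 \\[4pt]
Q_1 &= 49.5 + \left[\frac{25.25 - 15}{20}\right] \times 10 = 54.625 \\
\text{Median} &= 59.5 + \left[\frac{50.5 - 35}{35}\right] \times 10 \approx 63.93 \\
Q_3 &= 69.5 + \left[\frac{75.75 - 70}{20}\right] \times 10 = 72.375 \\[4pt]
\text{Interquartile range} &= Q_3 - Q_1 = 72.375 - 54.625 = 17.75
\end{align*}
```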
Percentiles are position measures used in educational and health-related fields to
indicate the position of an individual in a group. Percentiles are denoted
P1, P2, P3, ..., P99 and divide the data or observations into 100 equal groups.
[Figure: number line from the smallest data value to the largest data value, marked at P1, P2, P3, P4, ..., P98, P99, with 1% of the data in each segment.]
Percentile Formula
The percentile corresponding to a given value X is computed by using the following
formula:
$\text{Percentile} = \frac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \cdot 100$
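As a quick illustration of this formula, here is a minimal Python sketch. The function name `percentile_rank` is illustrative, and the sample data are the seven scores from part a) of Example 30.

```python
def percentile_rank(data, x):
    """Percentile of x: ((number of values below x) + 0.5) / n * 100."""
    below = sum(1 for value in data if value < x)
    return (below + 0.5) / len(data) * 100


scores = [12, 28, 35, 42, 47, 49, 50]
rank = percentile_rank(scores, 49)
print(round(rank))
```

Five of the seven scores lie below 49, so the formula gives (5 + 0.5) / 7 × 100 ≈ 78.6, which rounds to the 79th percentile.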
Step 3:
The 60th percentile is located between the 5th and 6th position.
P60 = 8 + (0.4)(10 − 8) = 8.8
where $k$ denotes the $k$th percentile.
Find $P_{50}$, the 50th percentile, for the following frequency distribution.
$n = \sum f = 71$
$\frac{k(n+1)}{100} = \frac{(50)(71+1)}{100} = 36$. The location of the percentile on the cumulative frequency is 39.
Therefore $P_{50}$ lies in the class 18 – 23.
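Step 2: By using the formula:
$P_k = L_k + \left(\frac{\frac{k(n+1)}{100} - F_k}{f_k}\right) \times c_k$,
$P_{50} = L_{50} + \left(\frac{\frac{50(n+1)}{100} - F_{50}}{f_{50}}\right) \times c_{50}$
$P_{50} = 17.5 + \left(\frac{36 - 27}{12}\right) \times 6 = 22$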
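The substitution above can be verified with a small Python sketch. The function name `grouped_percentile` and its parameter names are illustrative; the arguments are the values read from the worked formula (lower boundary 17.5, cumulative frequency before the class 27, class frequency 12, class width 6).

```python
def grouped_percentile(k, n, lower_boundary, cumulative_before, frequency, width):
    """P_k = L_k + [(k(n+1)/100 - F_k) / f_k] * c_k for the class containing P_k."""
    position = k * (n + 1) / 100   # location of the kth percentile
    return lower_boundary + (position - cumulative_before) / frequency * width


p50 = grouped_percentile(k=50, n=71, lower_boundary=17.5,
                         cumulative_before=27, frequency=12, width=6)
print(p50)
```

With these inputs the position is 36 and the result is 17.5 + (9 / 12) × 6 = 22.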
Solution
Step 1: Find the cumulative frequencies column
Step 2: Find the cumulative percentages using
$\text{Cumulative \%} = \frac{\text{cumulative frequency, } F}{n} \cdot 100$
Class boundaries    Frequency, f   Cumulative frequency, F   Cumulative percentage
89.5 – 104.5        24             24                        12
104.5 – 119.5       62             86                        43
119.5 – 134.5       72             158                       79
134.5 – 149.5       26             184                       92
149.5 – 164.5       12             196                       98
164.5 – 179.5       4              200                       100
Total               200
[Figure: cumulative percentage graph. Horizontal axis: class boundaries from 89.5 to 164.5; vertical axis: cumulative percentage, scale 0 to 120.]
Step 3:
To find the percentile rank of a blood pressure reading of 130, find 130 on the x axis,
draw a vertical line to the graph, and read the percentage across on the y axis. A blood
pressure of 130 corresponds to approximately the 70th percentile. If the value that
corresponds to the 40th percentile is desired, start on the y axis at 40, draw a horizontal
line to the graph, and read the value down on the x axis; it is approximately 118.
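The two graph readings can be approximated numerically. This is a minimal sketch that assumes the graph joins the points (class boundary, cumulative percentage) with straight lines, starting from 0% at the lower boundary 89.5; the function name `interpolate` is illustrative.

```python
# Class boundaries and cumulative percentages from the table above;
# (89.5, 0) is the assumed starting point of the graph.
boundaries = [89.5, 104.5, 119.5, 134.5, 149.5, 164.5, 179.5]
cumulative_pct = [0, 12, 43, 79, 92, 98, 100]


def interpolate(x, xs, ys):
    """Linear interpolation of y at x along the points (xs[i], ys[i])."""
    for i in range(1, len(xs)):
        if xs[i - 1] <= x <= xs[i]:
            share = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + share * (ys[i] - ys[i - 1])
    raise ValueError("x lies outside the graph")


# Read upward from the x axis: percentile rank of a reading of 130.
rank_of_130 = interpolate(130, boundaries, cumulative_pct)
# Read across from the y axis: value at the 40th percentile.
value_at_p40 = interpolate(40, cumulative_pct, boundaries)
print(round(rank_of_130, 1), round(value_at_p40, 1))
```

Under the straight-line assumption the results are about 68.2 and 118.0, in line with the approximate graph readings of the 70th percentile and a value of 118.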
Example 30
a) Find the percentile rank of a score of 49 in the data set
12 28 35 42 47 49 50
What value corresponds to the 60th percentile?
b) Find the approximate values that correspond to the given percentiles.
Class         Frequency, f
66 – 86       4
87 – 107      2
108 – 128     3
129 – 149     2
150 – 170     1
171 – 191     2
192 – 212     3
213 – 233     4
Total         21
Outliers
A data set should be checked for extremely high or extremely low values. These values
are called outliers.
• An outlier is an extremely high or an extremely low data value when compared
with the rest of the data values.
• An outlier can strongly affect the mean and standard deviation of a variable. For
example, suppose a researcher mistakenly recorded an extremely high data value.
This value would then make the mean and standard deviation of the variable much
larger than they really were.
Procedure for identifying outliers
Step 1: Arrange the data in order from lowest to highest and find Q1 and Q3.
Step 2: Find the interquartile range: IQR = Q3 – Q1.
Step 3: Multiply the IQR by 1.5.
Step 4: Find the values Q1 – 1.5(IQR) and Q3 + 1.5(IQR).
Step 5: Check the data set for any data value that is smaller than Q1 – 1.5(IQR) or larger
than Q3 + 1.5(IQR).
Example 31
Check the following data set for outliers. 5 6 12 13 15 18 22 50
Solution
Step 1: Arrange the data in order from lowest to highest and find Q1 and Q3.
Location of Q1: $\frac{8+1}{4} = 2.25$th, so $Q_1 = 6 + (0.25)(12 - 6) = 7.5$
Location of Q3: $\frac{3(8+1)}{4} = 6.75$th, so $Q_3 = 18 + (0.75)(22 - 18) = 21$
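The remaining steps of the procedure can be sketched in Python, taking Q1 = 7.5 and Q3 = 21 from Step 1 as given. The function name `find_outliers` is illustrative.

```python
def find_outliers(data, q1, q3):
    """Return the two fences and the values outside Q1 - 1.5(IQR) .. Q3 + 1.5(IQR)."""
    iqr = q3 - q1                      # Step 2: interquartile range
    margin = 1.5 * iqr                 # Step 3: multiply the IQR by 1.5
    lower_fence = q1 - margin          # Step 4: the two limits
    upper_fence = q3 + margin
    # Step 5: keep any value below the lower fence or above the upper fence
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, outliers


data = [5, 6, 12, 13, 15, 18, 22, 50]
lower_fence, upper_fence, outliers = find_outliers(data, q1=7.5, q3=21)
print(lower_fence, upper_fence, outliers)
```

Here IQR = 13.5 and 1.5(IQR) = 20.25, so the limits are –12.75 and 41.25, and the only value outside them is 50.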
Find Q1 and Q3: 30, 39, 47, 48, 78, 89, 138, 164, 215, 296
Q1 = 47 and Q3 = 164.
The minimum data value is 30 and the maximum data value is 296.
Step 2:
[Figure: horizontal scale marked 0, 100, 200, 300.]
Step 3: Draw the box above the scale using Q1 and Q3. Draw a vertical line through
the median, and draw lines from the lowest data value to the box and from the
highest data value to the box.
[Figure: box-plot with lowest value 30, Q1 = 47, median = 83.5, Q3 = 164 and highest value 296.]
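The five values marked on the box-plot (30, 47, 83.5, 164, 296) are consistent with taking Q1 and Q3 as the medians of the lower and upper halves of the ordered data. The sketch below assumes that convention, which differs from the (n+1)/4 position rule used earlier; the function name `five_number_summary` is illustrative.

```python
from statistics import median


def five_number_summary(data):
    """Minimum, Q1, median, Q3, maximum.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    with the middle item left out of both halves when n is odd.
    Requires at least two data values.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])


data = [30, 39, 47, 48, 78, 89, 138, 164, 215, 296]
print(five_number_summary(data))
```

These five numbers are exactly what is needed to draw the box (Q1 to Q3), the line through the median, and the two lines out to the lowest and highest values.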
Example 33
The data shown are the speeds in miles per hour of a sample of wooden roller
coasters and a sample of steel roller coasters. Compare the distribution by using box-
plots.
Wood:  50 56 60 48 35 67 72 68
Steel: 55 70 48 28 100 106 102 120
Solution
Note: The box-plots show that the median of the speeds of the steel coasters is much
higher than that of the wooden coasters. The interquartile range (spread) of the steel
coasters is much larger than that of the wooden coasters. Finally, the range of the
speeds of the steel coasters is larger than that of the wooden coasters.
Exercise 7
1. A sample of 10 students in UTM showed the following credit hours taken during the
first year of their programme.
17 18 21 18 20 19 22 21 18 24
a) What are the mean, median and mode for their credit hours?
b) Find the first quartile and third quartile.
c) Draw a box-plot for the data given.
2. The data below show the number of vehicles that arrive at the Batu toll booth during 16
intervals of 10 minutes each.
25 55 34 32 25 18 25 32
29 28 44 40 34 28 25 42
a) What are the mean, median and mode for data given?
b) Find the first quartile and third quartile.
c) Draw a box-plot for the data given.
3. The following are data about 63 bowlers who took part in a bowling competition at
ABC Bowling Centre.
Score (x)               60   80   110   120   130   140
Number of bowlers (f)   2    8    10    15    20    8
a) What are the mean, median and mode for data given?
b) Find the first quartile and third quartile.
c) Draw a box-plot for the data given.
4. The number of students in a school, according to age, that read the Women’s Weekly
magazine during the first week of June 2016, is indicated in the table below.
a) Find the mean, median and mode age of the students who read the Women’s Weekly
magazine and find the first and third quartiles.
Age (year)        Number of students
10 – below 12     5
12 – below 14     10
14 – below 16     20
16 – below 28     50