
B B AD MC

102
BWU-BBD-21-014
• What Are The Outliers: definition & example
• Types of Outliers: describing the 3 types
• Reasons for Outliers: 4 reasons for outliers
• How to Find Outliers: steps & example
• Introduction to Boxplot: what is a box plot
• Representing the Data Via Boxplot
• Effect on Mean, Median, Mode: example & theory
What Are The Outliers?
• Outliers are extreme values: extremely high or low values in a data set.
• In a data set, outliers may include the sample maximum, the sample minimum, or both.
• They can indicate that the distribution is heavy-tailed or highly skewed.
Example 1.
1.5, 2.1, 2.5, 3, 3.1, 3.7, 4.1, 5, 9.5

Example 2.
1, 3, 5, 6, 7, 9, 10, 12, 22

Example 3.
15, 1, 3, 3.5, 2.1, 4.2, 3.1, 5
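Each example list contains one value that sits far from the rest. As a rough illustration only, the sketch below picks out that value with an informal gap check: it compares how far the smallest and the largest value sit from their nearest neighbour. This is not the 1.5(IQR) rule used in these slides, and the function name `most_isolated_extreme` is hypothetical.

```python
def most_isolated_extreme(values):
    """Return the end value (min or max) that is furthest from its neighbour.

    Informal heuristic for illustration only; it always returns a value,
    even when the data has no real outlier.
    """
    ordered = sorted(values)
    low_gap = ordered[1] - ordered[0]      # distance of the minimum from the rest
    high_gap = ordered[-1] - ordered[-2]   # distance of the maximum from the rest
    return ordered[0] if low_gap > high_gap else ordered[-1]


example_1 = [1.5, 2.1, 2.5, 3, 3.1, 3.7, 4.1, 5, 9.5]
example_2 = [1, 3, 5, 6, 7, 9, 10, 12, 22]
example_3 = [15, 1, 3, 3.5, 2.1, 4.2, 3.1, 5]

for data in (example_1, example_2, example_3):
    print(most_isolated_extreme(data))
```

In all three lists the isolated value is the maximum (9.5, 22 and 15), and in Example 3 it only becomes obvious once the data is sorted.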
Types of Outliers
Outliers fall into three types:
• Global outlier
• Contextual outlier
• Collective outlier
a. Global outliers.
When a data object differs significantly from the rest of the given data set, it is considered a global outlier.

b. Contextual outliers.
A contextual outlier is a data object that is anomalous within its context or neighborhood, even if its value would be normal elsewhere.

c. Collective outliers.
If a collection of related data instances is anomalous with respect to the entire data set, it is termed a collective outlier.
[Figure: examples of (a) a global outlier, (b) a contextual outlier, and (c) a collective outlier]
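The difference between a global and a contextual outlier can be easier to see with numbers. The sketch below is a minimal illustration under stated assumptions: the temperature readings are made-up data, the season label plays the role of the context, and the `tolerance` threshold of 10 is an arbitrary choice for this example rather than a standard value.

```python
from statistics import median

# Hypothetical temperature readings (degrees Celsius) tagged with a context.
readings = [
    ("summer", 29), ("summer", 31), ("summer", 30), ("summer", 32),
    ("winter", 2), ("winter", 4), ("winter", 3), ("winter", 30),
]


def contextual_outliers(rows, tolerance=10):
    """Flag values that lie far from the median of their own context group."""
    groups = {}
    for context, value in rows:
        groups.setdefault(context, []).append(value)
    centers = {context: median(vals) for context, vals in groups.items()}
    return [(c, v) for c, v in rows if abs(v - centers[c]) > tolerance]


flagged = contextual_outliers(readings)
print(flagged)
```

A reading of 30 is unremarkable across the whole data set, because summer values are around 30, so it is not a global outlier. Within the winter group it is far from the other readings, which is what makes it contextual.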
Reasons For Outliers
• Data entry error.
• Instrumental error.
• System faults.
• Measurement error.
How To Identify Outliers?
A data value less than Q1 – 1.5(IQR) or greater than Q3 + 1.5(IQR) can be considered an outlier.
Steps To Find Outliers
1. Arrange the data in order from lowest to highest and find Q1 and Q3.
2. Find the interquartile range: IQR = Q3 – Q1.
3. Multiply the IQR by 1.5.
4. Subtract the result of step 3 from Q1 and add it to Q3 to get the lower and upper limits.
5. Check the data set for any data value that is smaller than Q1 – 1.5(IQR) or larger than Q3 + 1.5(IQR).
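The steps above can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: it assumes Q1 and Q3 are taken as the medians of the lower and upper halves of the sorted data (the convention that matches the quartile values used in these slides), and the names `middle_value` and `find_outliers` are hypothetical.

```python
def middle_value(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def find_outliers(data):
    """Apply the 1.5(IQR) rule and return the intermediate results."""
    if len(data) < 4:
        raise ValueError("need at least 4 values to form two halves")
    ordered = sorted(data)                    # step 1: order the data
    half = len(ordered) // 2
    q1 = middle_value(ordered[:half])         # median of the lower half
    q3 = middle_value(ordered[-half:])        # median of the upper half
    iqr = q3 - q1                             # step 2: interquartile range
    step = 1.5 * iqr                          # step 3: multiply by 1.5
    lower, upper = q1 - step, q3 + step       # step 4: lower and upper limits
    outliers = [x for x in ordered if x < lower or x > upper]  # step 5
    return {"q1": q1, "q3": q3, "iqr": iqr,
            "lower": lower, "upper": upper, "outliers": outliers}


result = find_outliers([10, 11, 15, 25, 35, 30, 7, 68])
print(result)
```

When the number of values is odd, the middle value is left out of both halves. That is one common choice among several.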
Example
Data: 10, 11, 15, 25, 35, 30, 7, 68

1. Arrange the data and find Q1 and Q3.
7, 10, 11, 15, 25, 30, 35, 68
Q1 = 10.5 (median of the lower half: (10 + 11) / 2)
Q3 = 32.5 (median of the upper half: (30 + 35) / 2)
2. Find the IQR (Q3 – Q1).
= 32.5 – 10.5
= 22
3. Multiply the IQR by 1.5.
= 1.5 × 22
= 33
4. Subtract 1.5(IQR) from Q1 and add it to Q3.
10.5 – 33 = –22.5
32.5 + 33 = 65.5
5. Check the data set for any data value that is smaller than Q1 – 1.5(IQR) or larger than Q3 + 1.5(IQR).
No value is smaller than –22.5, and 68 is larger than 65.5, so 68 is the outlier.
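The quartiles in this example are the medians of the lower and upper halves of the data. Software often interpolates quartiles instead, and the limits then shift slightly. As a hedged illustration, the sketch below uses `statistics.quantiles` from the Python standard library with its two built-in methods; the helper name `limits` is hypothetical.

```python
from statistics import quantiles

data = [7, 10, 11, 15, 25, 30, 35, 68]


def limits(values, method):
    """Lower and upper 1.5(IQR) limits using an interpolated quartile method."""
    q1, _, q3 = quantiles(values, n=4, method=method)
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step


results = {}
for method in ("inclusive", "exclusive"):
    lower, upper = limits(data, method)
    flagged = [x for x in data if x < lower or x > upper]
    results[method] = (lower, upper, flagged)
    print(method, lower, upper, flagged)
```

With the "inclusive" method the upper limit is 62.0 and 68 is still flagged. With the "exclusive" method the upper limit is 69.0, so 68 falls just inside it. It is worth stating which quartile convention was used when reporting outliers.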
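Box Plots At A Glance
[Figure: a box plot with labels for the min. value, Q1, Q2 (median), Q3, and max. value; the box spans the IQR (Q3 – Q1), and outliers are plotted as separate points beyond either end.]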
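Represent In Box Plot
Min value = 7
Q1 = 10.5
Q2 (median) = 20
Q3 = 32.5
Max value = 35
Outlier = 68
[Figure: box plot of the example data, with whiskers from 7 (min) to 35 (max), the box from Q1 to Q3 with the median at 20, and 68 plotted as a separate outlier point.]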
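The values a box plot needs can be computed directly from the data. The following is a minimal sketch, assuming the median-of-halves quartiles used in the worked example and assuming the whiskers end at the smallest and largest values that are not outliers; the function name `box_plot_summary` is hypothetical.

```python
def box_plot_summary(data):
    """Return the values a box plot would display for this data."""
    ordered = sorted(data)
    half = len(ordered) // 2

    def middle(vals):
        # Median of an already sorted list.
        m = len(vals) // 2
        return vals[m] if len(vals) % 2 else (vals[m - 1] + vals[m]) / 2

    q1 = middle(ordered[:half])
    q2 = middle(ordered)
    q3 = middle(ordered[-half:])
    step = 1.5 * (q3 - q1)
    lower, upper = q1 - step, q3 + step
    inside = [x for x in ordered if lower <= x <= upper]
    return {
        "min": inside[0],            # end of the lower whisker
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": inside[-1],           # end of the upper whisker
        "outliers": [x for x in ordered if x < lower or x > upper],
    }


summary = box_plot_summary([10, 11, 15, 25, 35, 30, 7, 68])
print(summary)
```

For the example data this reproduces the values listed above: the whiskers run from 7 to 35, and 68 is drawn as a separate point.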
Effect On Mean, Median, Mode
[Figure: the same data shown without the outlier and with the outlier]
Effects
• The mode is not affected by an outlier.
• The median is only slightly affected.
• The mean is affected the most, because it is calculated from every value in the data set.
   • A low outlier tends to pull the mean down more than the median.
   • A high outlier tends to pull the mean up more than the median.
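A short numeric check makes these effects concrete. The data below is hypothetical and chosen only for illustration: seven small values, then the same values with one high outlier (50) added. The helper name `summarize` is also an assumption of this sketch.

```python
from statistics import mean, median, mode

without_outlier = [2, 3, 3, 4, 5, 6, 7]      # hypothetical data
with_outlier = without_outlier + [50]        # same data plus one high outlier


def summarize(values):
    """Mean, median and mode of a list of numbers."""
    return {"mean": mean(values), "median": median(values), "mode": mode(values)}


before = summarize(without_outlier)
after = summarize(with_outlier)
print(before)
print(after)
```

The mode stays at 3 and the median moves only from 4 to 4.5, while the mean rises from about 4.29 to 10.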
Quote
"A single death is a tragedy; a million deaths is a statistic."
- Joseph Stalin

Thank you…
