
BOX PLOT

What it is:
The box plot is a graphical representation of data that shows a data set's lowest value, highest
value, median value, and the size of the first and third quartiles. The box plot is useful in analyzing
small data sets that do not lend themselves easily to histograms. Because of the small size of a box
plot, it is easy to display and compare several box plots in a small space. A box plot is a good
alternative or complement to a histogram and is usually better for showing several simultaneous
comparisons.

How to use it:


Collect and arrange data. Collect the data and arrange it into an ordered set from lowest value
to highest.

Calculate the depth of the median. d(M) = (n + 1) / 2
where
d = depth; the number of observations to count from the beginning of the ordered
data set
M = median
n = number of observations in the set of data
If the ordered data set contains an odd number of values, the formula will identify which of the
values will be the median. If the ordered data set contains an even number of values, the median
will be midway between two of the values.
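
The median step can be sketched in Python as follows. This is a minimal illustration, not part of the original procedure: the function name `median_by_depth` is hypothetical, and the sample values are the Process A column from the example data set in this document.

```python
def median_by_depth(values):
    """Return (depth, median) using d(M) = (n + 1) / 2 on the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the data set must contain at least one value")
    depth = (n + 1) / 2
    lower = ordered[int(depth) - 1]  # depths are counted from 1, list indexes from 0
    if depth == int(depth):
        # Odd n: the depth points at a single value.
        return depth, lower
    # Even n: the median is midway between the two neighbouring values.
    upper = ordered[int(depth)]
    return depth, (lower + upper) / 2


process_a = [12, 15, 23, 24, 30, 31, 33, 36, 50, 73]
depth, median = median_by_depth(process_a)
print(depth, median)  # 5.5 30.5
```

With ten observations the depth is 5.5, so the median falls midway between the fifth and sixth values (30 and 31).
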
Calculate the depth of the first quartile. d(Q1) = (n + 2) / 4
where
d = depth; the number of observations to count from the beginning of the ordered
data set
Q1 = the first quartile
n = number of observations in the set of data

The first quartile will be the value of the data item identified by this formula.
Calculate the depth of the third quartile. d(Q3) = (3n + 2) / 4
where
d = depth; the number of observations to count from the beginning of the ordered
data set
Q3 = the third quartile
n = number of observations in the set of data

The third quartile will be the value of the data item identified by this formula.
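
The two quartile depth formulas can be sketched the same way. The helper names `value_at_depth` and `quartiles_by_depth` are hypothetical. The text does not say what to do when a quartile depth is not a whole number, so the linear interpolation below is an assumption made for this sketch; for the ten-value example columns the depths are whole numbers (3 and 8) and no interpolation occurs.

```python
def value_at_depth(ordered, depth):
    """Value at a 1-based depth in an ordered list.

    Assumption: a fractional depth is resolved by linear interpolation
    between the two neighbouring values.
    """
    whole = int(depth)
    frac = depth - whole
    if frac == 0:
        return ordered[whole - 1]
    return ordered[whole - 1] + frac * (ordered[whole] - ordered[whole - 1])


def quartiles_by_depth(values):
    """Return (Q1, Q3) using d(Q1) = (n + 2) / 4 and d(Q3) = (3n + 2) / 4."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("at least two observations are needed")
    d_q1 = (n + 2) / 4
    d_q3 = (3 * n + 2) / 4
    return value_at_depth(ordered, d_q1), value_at_depth(ordered, d_q3)


# Process C column from the example data set in this document.
process_c = [2, 3, 6, 8, 13, 14, 19, 23, 60, 69]
q1, q3 = quartiles_by_depth(process_c)
print(q1, q3)  # 6 23
```
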
Calculate the interquartile range (IQR). This range is the difference between the third and
first quartile values. IQR = Q3 - Q1
Calculate the upper adjacent limit. This is the largest data value that is less than or equal to
the third quartile plus 1.5 X IQR, that is, the largest value not above Q3 + [(Q3 - Q1) X 1.5].
Calculate the lower adjacent limit. This is the smallest data value that is greater than or
equal to the first quartile minus 1.5 X IQR, that is, the smallest value not below
Q1 - [(Q3 - Q1) X 1.5].
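
The IQR and adjacent-limit steps could be sketched as below. The function name `adjacent_limits` is hypothetical, and the quartiles are passed in as already known values (36 and 51 are the third and eighth values of the Process D column in the example data set).

```python
def adjacent_limits(values, q1, q3):
    """Return (IQR, lower adjacent limit, upper adjacent limit, outliers)."""
    ordered = sorted(values)
    iqr = q3 - q1
    upper_cutoff = q3 + 1.5 * iqr
    lower_cutoff = q1 - 1.5 * iqr
    # The adjacent limits are actual data values, not the cutoffs themselves.
    upper_adjacent = max(v for v in ordered if v <= upper_cutoff)
    lower_adjacent = min(v for v in ordered if v >= lower_cutoff)
    outliers = [v for v in ordered if v < lower_adjacent or v > upper_adjacent]
    return iqr, lower_adjacent, upper_adjacent, outliers


process_d = [1, 22, 36, 37, 45, 47, 48, 51, 52, 69]
print(adjacent_limits(process_d, q1=36, q3=51))  # (15, 22, 69, [1])
```

Here the cutoffs are 13.5 and 73.5, so the value 1 lies below the lower adjacent limit of 22 and would be drawn as an outlier.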

Draw and label the axes of the graph. The scale of the vertical axis must be large enough to
encompass the greatest value of the data sets. The horizontal axis must be large enough to
encompass the number of box plots to be drawn.
Draw the box plots. Construct the boxes, insert median points, and attach upper and lower
adjacent limits. Identify outliers (values outside the upper and lower adjacent limits) with
asterisks.
Analyze the results. A box plot shows the distribution of data. The line between the lower
adjacent limit and the bottom of the box represents one-fourth of the data. One-fourth of the data
falls between the bottom of the box and the median, and another one-fourth between the median
and the top of the box. The line between the top of the box and the upper adjacent limit represents
the final one-fourth of the data observations. Once the pattern of data variation is clear, the next
step is to develop an explanation for the variation.
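
For the drawing step, one possible approach (an assumption of this sketch, since the text describes drawing by hand) is matplotlib's `Axes.bxp`, which draws boxes from summary values that have already been calculated. The helper names `to_bxp_stats` and `draw_box_plots` and the output file name are hypothetical. The sample values are those worked out by hand for the Process A column of the example data set.

```python
def to_bxp_stats(label, median, q1, q3, lower_adjacent, upper_adjacent, outliers):
    """Package one box plot's summary values in the dict layout Axes.bxp reads."""
    return {
        "label": label,
        "med": median,
        "q1": q1,
        "q3": q3,
        "whislo": lower_adjacent,   # end of the lower line
        "whishi": upper_adjacent,   # end of the upper line
        "fliers": list(outliers),   # points drawn individually
    }


def draw_box_plots(stats, path="box_plots.png"):
    """Draw one box per stats dict and save the figure to a file."""
    # Imported here so the helper above can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # Mark outliers with asterisks, as the procedure describes.
    ax.bxp(stats, showfliers=True, flierprops={"marker": "*"})
    ax.set_xlabel("Processes")
    ax.set_ylabel("Number of Errors")
    fig.savefig(path)
    plt.close(fig)


process_a_stats = to_bxp_stats("A", 30.5, 23, 36, 12, 50, [73])
```

Calling `draw_box_plots([process_a_stats])` would then save a single box for Process A; passing one dict per process places the boxes side by side for comparison.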

Box Plot Example


Ordered Data Set: Number of Process Errors Per Run

        Process A   Process B   Process C   Process D
  1        12           6           2           1
  2        15          22           3          22
  3        23          26           6          36
  4        24          33           8          37
  5        30          35          13          45
  6        31          47          14          47
  7        33          54          19          48
  8        36          55          23          51
  9        50          62          60          52
 10        73          63          69          69

Box Plots of Data

[Figure: box plots of the data by process. Vertical axis: Number of Errors (scale 10 to 80).
Horizontal axis: Processes. Asterisks mark outliers.]
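
As a check on the example, the whole procedure can be run over the four columns of the data set. This is a sketch under stated assumptions: the names `at_depth`, `box_plot_summary`, and `errors_per_run` are hypothetical, and fractional depths are resolved by linear interpolation, which for the median reproduces the "midway between two values" rule.

```python
def at_depth(ordered, depth):
    """Value at a 1-based depth; fractional depths are interpolated (assumption)."""
    whole = int(depth)
    frac = depth - whole
    if frac == 0:
        return ordered[whole - 1]
    return ordered[whole - 1] + frac * (ordered[whole] - ordered[whole - 1])


def box_plot_summary(values):
    """Median, quartiles, IQR, adjacent limits, and outliers for one data set."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("at least two observations are needed")
    median = at_depth(ordered, (n + 1) / 2)
    q1 = at_depth(ordered, (n + 2) / 4)
    q3 = at_depth(ordered, (3 * n + 2) / 4)
    iqr = q3 - q1
    lower = min(v for v in ordered if v >= q1 - 1.5 * iqr)
    upper = max(v for v in ordered if v <= q3 + 1.5 * iqr)
    outliers = [v for v in ordered if v < lower or v > upper]
    return {"median": median, "q1": q1, "q3": q3, "iqr": iqr,
            "lower": lower, "upper": upper, "outliers": outliers}


# Number of process errors per run, copied from the ordered data set.
errors_per_run = {
    "A": [12, 15, 23, 24, 30, 31, 33, 36, 50, 73],
    "B": [6, 22, 26, 33, 35, 47, 54, 55, 62, 63],
    "C": [2, 3, 6, 8, 13, 14, 19, 23, 60, 69],
    "D": [1, 22, 36, 37, 45, 47, 48, 51, 52, 69],
}

summaries = {name: box_plot_summary(v) for name, v in errors_per_run.items()}
for name, s in summaries.items():
    print(name, s)
```

Under these assumptions the outliers come out as 73 for Process A, none for Process B, 60 and 69 for Process C, and 1 for Process D.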
