
7. Box and whisker plots


A box-and-whisker plot is a visual representation of how the data is spread out and how much
variation there is. Its main advantage is that it is not cluttered by showing all the data
values: it highlights only a few important features of the data, which makes it easier to focus
on the median, the extremes and the quartiles, and to compare them. Another advantage is that
the plot does not become more complicated as the number of data values grows. A disadvantage
is that it is of limited use when there are only a few data values.
It is a convenient way of graphically depicting groups of numerical data through
their quartiles. Box plots may also have lines extending horizontally from the boxes
(whiskers) indicating variability outside the upper and lower quartiles, hence the terms
box-and-whisker plot and box-and-whisker diagram.
Box and whisker plots are uniform in their use of the box: the left and right edges of the box
are always the first and third quartiles, and the band inside the box is always the
second quartile (the median). In simple diagrams, the ends of the whiskers represent the
minimum and maximum values.

Exercise 1
Steps to plot the box and whisker plot
Use the data below:
16,6
17,4
23,6
13,5
20,8
13,9
24,6
14,9
20,5
17,1

1. Arrange the data in ascending order in a single column.

2. Using Excel formulas (for example MIN, QUARTILE, MEDIAN and MAX), calculate the five-point
summary of the data, arranged in the following order:
MIN
1st Quartile
MEDIAN
3rd Quartile
MAX
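
As a cross-check on the spreadsheet, the five-point summary can also be sketched in Python. This is only a sketch and not part of the original exercise: it assumes Python 3.8 or later and assumes that quartiles are found by inclusive linear interpolation, which is the convention that reproduces the 15,325 and 20,725 used in this exercise. The function name `five_number_summary` is made up for illustration, and the data is written with decimal points rather than decimal commas.

```python
import statistics

# Exercise 1 data, written with decimal points instead of decimal commas.
data = [16.6, 17.4, 23.6, 13.5, 20.8, 13.9, 24.6, 14.9, 20.5, 17.1]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    ordered = sorted(values)  # step 1: ascending order
    # method="inclusive" interpolates between neighbouring data points and
    # treats the smallest and largest values as the 0th and 100th percentiles.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

summary = five_number_summary(data)
labels = ["MIN", "1st Quartile", "MEDIAN", "3rd Quartile", "MAX"]
for label, value in zip(labels, summary):
    print(f"{label:<13}{value:.3f}")
```

If a spreadsheet uses a different quartile convention (for example an exclusive method), the 1st and 3rd quartiles may differ slightly from the values in this exercise.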

3. Create a column titled Original and enter the corresponding values, as shown below.

               Original
MIN            13,5
1st Quartile   15,325
MEDIAN         17,25
3rd Quartile   20,725
MAX            24,6

4. Create a third column with the five-point summary labels again, as shown below.

               Original
MIN            13,5       MIN
1st Quartile   15,325     1st Quartile
MEDIAN         17,25      MEDIAN
3rd Quartile   20,725     3rd Quartile
MAX            24,6       MAX

5. Create a fourth column titled Plot values.

a. In the Plot values column, the only value that stays the same as the original is the
minimum.
b. For the 1st Quartile value: in the Plot values column, type an equals sign, click on the
original 1st Quartile value, type a minus sign, click on the original minimum value, then press Enter.
c. For the Median value: in the Plot values column, type an equals sign, click on the original
Median value, type a minus sign, click on the original 1st Quartile value, then press Enter.
d. For the 3rd Quartile value: subtract the original Median from the original 3rd Quartile value.
e. For the MAX value: subtract the original 3rd Quartile value from the original MAX value.

               Original                  Plot values
MIN            13,5       MIN            13,5
1st Quartile   15,325     1st Quartile   1,825
MEDIAN         17,25      MEDIAN         1,925
3rd Quartile   20,725     3rd Quartile   3,475
MAX            24,6       MAX            3,875

The plot values in the table above are used to create the box and whisker plot.
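
The subtraction rule behind the Plot values column can be sketched in a few lines of Python. This is an illustration only: the function name `plot_values` is hypothetical, and the numbers are the Original column written with decimal points. Each plot value after the first is the gap between a summary value and the one before it, which is what a stacked bar needs.

```python
# Five-point summary from the "Original" column (decimal points, not commas).
original = [13.5, 15.325, 17.25, 20.725, 24.6]

def plot_values(summary):
    """Keep the minimum; replace each later value by its gap to the previous one."""
    gaps = [later - earlier for earlier, later in zip(summary, summary[1:])]
    return [summary[0]] + gaps

# Round to three decimals to hide floating-point noise before printing.
print([round(value, 3) for value in plot_values(original)])
```

Adding the plot values back up from left to right returns the original summary, which is why stacking them as bars places each boundary at the correct position on the axis.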
6. To create the box plot, highlight the last two columns above (excluding the column title).
a. Click on Insert, click on Bar, choose 2-D Bar, then click on Stacked Bar, which is the second bar type.
b. Click on the Design tab, then click on Switch Row/Column.

c. The legend explains what each of the coloured bars on your graph represents.
d. Some of the bars are unnecessary and need to be hidden.
e. Put your mouse cursor on the first bar, right-click, click Format Data Series, click Fill and
choose No fill. This makes the first box invisible.
f. Do the same for the second box and the very last box.
g. The remaining 2 boxes form the box of the box and whisker plot, which runs from the
1st Quartile to the 3rd Quartile and is divided at the median.
h. To produce the whiskers, which are the lines that extend to the minimum and maximum
values, insert error bars:
i. Click on the 1st Quartile box (which is now invisible).
ii. Click on Layout, click on Error Bars, click on More Error Bar Options, then click on Minus
because the whisker goes to the left of the bar, towards the minimum. Under Error Amount,
select Percentage and type 100. The left whisker is displayed on your chart.
iii. Click on the 3rd Quartile box (which is still visible).
iv. Click on Layout, click on Error Bars, click on More Error Bar Options, then click on Plus
because the whisker goes to the right of the bar, towards the maximum. Under Error Amount click
on Custom, click on Specify Value, and under Positive Error Value type the MAX plot value,
which is 3,875. Click on OK. The second whisker is plotted on your chart.
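
To see why hiding three bars and adding two error bars produces a box plot, the geometry of the stacked bar can be worked through in Python. This is a hedged sketch rather than a description of how Excel draws the chart internally: it assumes each stacked segment ends at the running total of the plot values and that an error bar starts at the right-hand edge of its segment. All variable names are illustrative.

```python
from itertools import accumulate

# Plot values from Exercise 1 (decimal points instead of decimal commas).
plot_values = [13.5, 1.825, 1.925, 3.475, 3.875]

# Right-hand edge of each stacked segment = running total of the plot values.
edges = list(accumulate(plot_values))
minimum, q1, median, q3, maximum = edges

# Segments 1, 2 and 5 are set to "No fill"; segments 3 and 4 form the box.
box_left_half = (q1, median)    # third segment
box_right_half = (median, q3)   # fourth segment

# Left whisker: a minus error bar of 100% on the second segment (length 1.825),
# reaching from that segment's right edge (Q1) back to the minimum.
left_whisker = (q1 - 1.00 * plot_values[1], q1)

# Right whisker: a plus error bar with custom value 3.875 on the fourth segment,
# reaching from that segment's right edge (Q3) out to the maximum.
right_whisker = (q3, q3 + plot_values[4])

for name, span in [("left whisker", left_whisker), ("box, left half", box_left_half),
                   ("box, right half", box_right_half), ("right whisker", right_whisker)]:
    print(f"{name:<16}{span[0]:.3f} to {span[1]:.3f}")
```

Under these assumptions the left whisker starts at 13,5 and the right whisker ends at 24,6, so the chart spans exactly the minimum and maximum of the data.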
7. The box and whisker plot is almost complete. You now need to make your chart easier to
read by doing the following:
a. Right-click on any of the numbers on the horizontal axis and click on Format Axis. Under Axis
Options, the Minimum setting lets you specify the start value of your chart. Since the
lowest value is 13,5, your horizontal axis doesn't have to start at 0: click on Fixed and type 10.
b. Since the maximum value is 24,6, in the Maximum setting click on Fixed and type 26.
Your graph is now spread out and much easier to read.
c. The vertical axis is not required in a box plot, as we are only examining the distribution of
one set of data.
To get rid of the vertical axis, right-click on the vertical axis numbers, go to Format Axis and,
in the Axis Options tab, under Axis labels choose None.
d. You may keep the coloured boxes as they are, or you may remove the colour from the
boxes.
To do this, right-click on any of the coloured boxes and go to Format Data Series, then:
Click on Fill and choose No fill.
Click on Border Colour, choose Solid line, click on the Colour tab and choose black or any
other suitable colour.
Click on Border Style, click on Width and set the width to 2, for example.
Right-click on the other box and do the same.
e. For the whiskers, right-click on the error bar, click on Format Error Bars, click on Line Style
and specify a width of 2.
8. Specify a suitable chart title and horizontal axis title.

Exercise 2 - Data with more than one series

The table below shows student test results for Test A and Test B. Plot the box and whisker
plot to compare the performance of the students in the two tests.
TEST A   TEST B
34       47
48       50
51       51
52       55
56       62
58       62
63       65
64       66
64       69
66       71
70       71
70       72
71       73
74       76
74       77
75       80
76       82
80       82
85       85
86       89
86       89
88       90

1. Prepare the columns for the five-point summary of the data for each series; the result is
as shown below. (N.B. You only need to type the Excel formulas for Test A. For Test B, type the
formula for MIN only, then highlight the values you calculated for Test A, excluding the MIN,
and copy and paste them into the column for Test B. The corresponding values for Test B will be
calculated automatically, which saves time when dealing with several series.)

               Original                            Plot values
               Test A    Test B                    Test A    Test B
MIN            34        47        MIN             34        47
1st Quartile   59,25     62,75     1st Quartile    25,25     15,75
MEDIAN                             MEDIAN
3rd Quartile                       3rd Quartile
MAX                                MAX

2. Select the data in the last 3 columns, including the column titles Test A and Test B but
excluding the Plot values column heading. Click on Insert, click on Bar, and choose the second
2-D bar type, Stacked Bar.
3. Click on the Design tab, then click on Switch Row/Column.
4. Edit the graph for each test as outlined in Exercise 1 above.
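
For several series, the same calculation is simply repeated per column. The Python sketch below is an optional cross-check, not part of the original exercise: it assumes Python 3.8 or later and inclusive quartile interpolation, and the helper names `five_number_summary` and `plot_values` are made up for illustration.

```python
import statistics

test_a = [34, 48, 51, 52, 56, 58, 63, 64, 64, 66, 70,
          70, 71, 74, 74, 75, 76, 80, 85, 86, 86, 88]
test_b = [47, 50, 51, 55, 62, 62, 65, 66, 69, 71, 71,
          72, 73, 76, 77, 80, 82, 82, 85, 89, 89, 90]

def five_number_summary(values):
    """Return [min, Q1, median, Q3, max] using inclusive quartiles."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return [ordered[0], q1, median, q3, ordered[-1]]

def plot_values(summary):
    """Keep the minimum; replace each later value by its gap to the previous one."""
    return [summary[0]] + [b - a for a, b in zip(summary, summary[1:])]

for name, scores in (("Test A", test_a), ("Test B", test_b)):
    original = five_number_summary(scores)
    print(name, "original:", original, "plot values:", plot_values(original))
```

With these assumptions, the first two entries of each list agree with the MIN and 1st Quartile rows given in the exercise (34 and 59,25 for Test A, 47 and 62,75 for Test B).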

Exercise 3
Six water samples were taken at a sewage treatment plant over a period of 9 days, and the
concentration of contamination was tested in g/L, as shown below:
1. Produce the box and whisker plot.
2. Describe the spread of contamination.
Sample 1   Sample 2   Sample 3   Sample 4   Sample 5   Sample 6
50,5       46,7       43,2       62,5       52,0       53,2
51,3       45,0       45,3       64,2       52,3       58,6
55,3       45,6       43,2       66,1       55,0       55,4
50,3       46,3       43,5       66,7       54,3       53,5
55,0       49,7       45,6       63,4       52,6       56,0
59,6       49,8       43,1       67,7       53,9       57,6
51,3       48,5       45,4       62,2       51,2       54,5
56,1       48,7       46,0       68,4       52,1       58,7
59,7       48,8       44,1       62,7       54,5       55,7
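
When describing the spread, it can help to put numbers next to what the plot shows. The following Python sketch is one possible way to do that and is not part of the original exercise: it assumes that each Sample column holds that sample's nine daily readings, uses inclusive quartile interpolation (Python 3.8 or later), and the function name `spread` is hypothetical.

```python
import statistics

# Exercise 3 readings in g/L (decimal points instead of decimal commas).
# Assumption: each "Sample n" column holds that sample's nine daily readings.
samples = {
    "Sample 1": [50.5, 51.3, 55.3, 50.3, 55.0, 59.6, 51.3, 56.1, 59.7],
    "Sample 2": [46.7, 45.0, 45.6, 46.3, 49.7, 49.8, 48.5, 48.7, 48.8],
    "Sample 3": [43.2, 45.3, 43.2, 43.5, 45.6, 43.1, 45.4, 46.0, 44.1],
    "Sample 4": [62.5, 64.2, 66.1, 66.7, 63.4, 67.7, 62.2, 68.4, 62.7],
    "Sample 5": [52.0, 52.3, 55.0, 54.3, 52.6, 53.9, 51.2, 52.1, 54.5],
    "Sample 6": [53.2, 58.6, 55.4, 53.5, 56.0, 57.6, 54.5, 58.7, 55.7],
}

def spread(values):
    """Return (median, interquartile range, range) using inclusive quartiles."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return median, q3 - q1, max(values) - min(values)

for name, readings in samples.items():
    median, iqr, value_range = spread(readings)
    print(f"{name}: median={median:.2f}  IQR={iqr:.2f}  range={value_range:.2f}")
```

The interquartile range corresponds to the width of the box and the range to the distance between the whisker ends, so a sample with a wide box and long whiskers has the most variable contamination.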
