
[Figure: three scatter plots of wins, average points for, and average points against]
The three scatter plots above show the same data as the original example
bubble chart. While it is easier to read the specific win counts for each team
from this series of plots, the relationship between all three variables is not
conveyed as clearly as in the bubble chart.

Example of data structure


AVG POINTS AGAINST    AVG POINTS FOR    WINS
26.56
14.06
26.44 25.88
17.94 24.31 10
23.38 16.81

A bubble chart is created from a data table with three columns. Two columns
will correspond with the horizontal and vertical positions of each point, while
the third will indicate each point's size. One point will be plotted for each
row in the table.
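
As a minimal sketch of this row-to-point mapping, the Python snippet below
turns each table row into one point record. The `BubblePoint` type, the
`rows_to_points` helper, and the dictionary keys are hypothetical names
introduced for illustration. The single row uses the values of the one complete
row in the example table, assuming they follow the header's column order, and
the choice of which column goes on which axis is arbitrary here.

```python
from dataclasses import dataclass


@dataclass
class BubblePoint:
    x: float      # horizontal position
    y: float      # vertical position
    value: float  # third variable, later shown as bubble size


def rows_to_points(rows, x_col, y_col, size_col):
    """Return one BubblePoint per table row."""
    return [
        BubblePoint(float(r[x_col]), float(r[y_col]), float(r[size_col]))
        for r in rows
    ]


# One row laid out like the example table (keys are hypothetical).
rows = [{"avg_points_against": 17.94, "avg_points_for": 24.31, "wins": 10}]

# Assumed axis assignment: points for on x, points against on y, wins as size.
points = rows_to_points(rows, "avg_points_for", "avg_points_against", "wins")
print(points)
```

Each record carries the raw third-variable value; converting that value into a
drawn size is a separate step.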

Best practices for using a bubble chart

Scale bubble area by value

One easy mistake that can be made is to scale the points' diameters or radii to
the third variable's values. When this kind of scaling is performed, a point
with twice the value of another point will end up with four times the area,
making its value look much larger than is actually warranted.
Instead, make sure that the bubbles' areas correspond with the third
variable's values. In the same scenario as above, a point with twice the value
of another point should have sqrt(2) ≈ 1.41 times the diameter or radius so
that its area is twice the smaller point's.


Depending on how you are creating your bubble chart, you may need to scale
your data to account for how data values are mapped to point sizes. Many
visualization tools will automatically match value to area, but be careful of
those cases where value is matched to diameter or radius instead.
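
The arithmetic behind this can be sketched in Python. The helper names
`radius_for_value` and `prescale_for_diameter_tool` are hypothetical, and the
second one assumes a tool that draws diameter (or radius) in direct proportion
to the number it is given; check how your own tool maps values before applying
it.

```python
import math


def radius_for_value(value, max_value, max_radius):
    """Radius that makes bubble area proportional to value."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    # Area grows with radius squared, so radius grows with sqrt(value).
    return max_radius * math.sqrt(value / max_value)


def prescale_for_diameter_tool(values):
    """Square-root transform for tools that map value to diameter or radius."""
    return [math.sqrt(v) for v in values]


# Two points where one has twice the value of the other.
small = radius_for_value(10, max_value=20, max_radius=1.0)
large = radius_for_value(20, max_value=20, max_radius=1.0)

radius_ratio = large / small
area_ratio = radius_ratio ** 2
print(round(radius_ratio, 2))  # 1.41
print(round(area_ratio, 2))    # 2.0
```

If the tool already maps value to area, pass the raw values through unchanged;
applying the square root in that case would understate the larger values.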

Limit number of points to plot

Bubble charts are commonly drawn with transparency on points, since overlaps
occur much more easily than when all points are a small size. This overlapping
also means that there are limits to the number of data points that can be
plotted while keeping a plot readable.

Other charts related to bubble charts

Scatter plot
The bubble chart is, of course, built upon the scatter plot as a base, just with
the addition of a third variable through point size. It's worth mentioning,
however, that third variables can be added to scatter plots through other point
encodings. Most common among these is color. When we have a categorical
third variable (taking discrete values that may or may not be ordered), we can
assign a distinct hue to each category of points. It is actually possible to use
hue as a fourth variable in conjunction with point size, but this should be used
carefully since it can result in information overload; the earlier cautions
regarding presenting a clear trend are magnified greatly with a fourth
variable.

Color can also be used as an encoding for numeric variables. If we have
a color palette where colors have a continuous relationship (e.g. light to
dark), we can use color to indicate value for a third variable, rather than size.
Note that perception of value based on color has similar limitations to using
size, so a legend is just as necessary when using color as it is for point size.
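
A light-to-dark encoding of this kind can be sketched in Python as a linear
blend between two colors. The function names and the two default RGB endpoints
are assumptions chosen for illustration, and a plain RGB blend is only a rough
stand-in for the perceptually tuned palettes that charting tools usually
provide.

```python
def lerp_color(light, dark, t):
    """Blend two RGB tuples linearly; t=0 gives light, t=1 gives dark."""
    return tuple(round(a + (b - a) * t) for a, b in zip(light, dark))


def value_to_color(value, vmin, vmax,
                   light=(222, 235, 247), dark=(8, 48, 107)):
    """Map a numeric value in [vmin, vmax] onto the light-to-dark ramp."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    return lerp_color(light, dark, t)


def to_hex(rgb):
    """Format an RGB tuple as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Legend entries: a few reference values with their colors.
for v in (0, 5, 10):
    print(v, to_hex(value_to_color(v, vmin=0, vmax=10)))
```

Printing a few reference values alongside their colors, as the loop does, is
the textual counterpart of the legend the chart itself needs.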

[Figure: example plots; recoverable labels: number of sites, number of herb species]
