
The visual display of quantitative information: The use of graphs in research and manuscripts

David L. Schriger, MD, MPH
Richelle J. Cooper, MD, MSHS
For a copy of these slides and a bibliography:
E-mail: mikulich@ucla.edu
Put “graphing lecture” on the subject line.
Any comments about the lecture would be appreciated. Mr. Mikulich will anonymously forward these to us (yeah right).
Much of this material can be found in the January 2001 issue of Annals of Emergency Medicine.
Goals of the session
• Appreciate the importance and advantages of graphical data display
• Master the use of basic features of graphs
• Learn advanced techniques
• Gain the ability to critique graphs
Importance of graphs:

• Exploratory Data Analysis


• Presentation of data
– Values of data elements
– Relationship of data elements
Importance of graphs:

• Exploratory data analysis


– Picture worth 978 words
– How the investigator learns about the data
– Seeing is believing
– If we only had n dimensions
– Make multiple slices
– Advanced computer methods - “Fantastic voyage”
– Can’t cover in this hour
-1.79 -.97 -.64 -.23 .16 .57 1.25
-1.7 -.95 -.63 -.22 .2 .57 1.25
-1.68 -.89 -.63 -.22 .21 .59 1.26
-1.57 -.87 -.59 -.2 .24 .64 1.27
-1.55 -.85 -.59 -.19 .27 .71 1.53
-1.44 -.85 -.57 -.12 .28 .82 1.62
-1.36 -.8 -.55 -.1 .3 .87 1.85
-1.34 -.8 -.55 -.06 .31 .87 1.9
-1.29 -.78 -.38 -.04 .32 .9 1.95
-1.27 -.76 -.35 .01 .33 .96 1.98
-1.23 -.75 -.33 .05 .37 1
-1.23 -.75 -.33 .1 .46 1.08
-1.18 -.73 -.32 .14 .48 1.13
-1.12 -.66 -.31 .14 .52 1.13
-1.09 -.66 -.29 .15 .55 1.14
. summ z

Variable |  Obs      Mean   Std. Dev.     Min     Max
---------+-----------------------------------------------------
       z |  100    -.0692   .9138869    -1.79    1.98
. summ z,d

                              z
-------------------------------------------------------------
      Percentiles      Smallest
 1%       -1.745          -1.79
 5%       -1.495           -1.7
10%        -1.25          -1.68       Obs                 100
25%        -.755          -1.57       Sum of Wgt.         100

50%        -.155                      Mean             -.0692
                        Largest       Std. Dev.      .9138869
75%          .56           1.85
90%        1.195            1.9       Variance       .8351893
95%        1.575           1.95       Skewness       .2684074
99%        1.965           1.98       Kurtosis       2.395736
[Figure: plot of z; axis from -1.787513 to 1.983905, with the median (-.155) marked]
Stem-and-leaf plot for x
plot in units of .01
-1** | 79,70,68
-1** | 57,55,44
-1** | 36,34,29,27,23,23
-1** | 18,12,09
-0** | 97,95,89,87,85,85,80,80
-0** | 78,76,75,75,73,66,66,64,63,63
-0** | 59,59,57,55,55
-0** | 38,35,33,33,32,31,29,23,22,22,20
-0** | 19,12,10,06,04
0** | 01,05,10,14,14,15,16
0** | 20,21,24,27,28,30,31,32,33,37
0** | 46,48,52,55,57,57,59
0** | 64,71
0** | 82,87,87,90,96
1** | 00,08,13,13,14
1** | 25,25,26,27
1** | 53
1** | 62
1** | 85,90,95,98
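The stem-and-leaf display above can be approximated in a few lines of code. The following is a minimal sketch, not the statistical package's own algorithm: it assumes leaves in units of .01 and uses one row per whole-number stem, whereas the plot above splits each stem into several rows. The function name `stem_and_leaf` is hypothetical, and the sample is a handful of the z values listed earlier.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return stem-and-leaf rows in units of .01, one row per whole-number stem."""
    rows = defaultdict(list)
    for value in sorted(values):
        hundredths = round(value * 100)
        negative = hundredths < 0
        stem, leaf = divmod(abs(hundredths), 100)
        # Negative stems sort before positive ones; "-0" and "0" stay separate.
        order = -stem - 1 if negative else stem
        label = f"{'-' if negative else ''}{stem}**"
        rows[(order, label)].append(f"{leaf:02d}")
    return [f"{label} | {','.join(leaves)}" for (_, label), leaves in sorted(rows.items())]


# A few of the z values from the data listing (illustrative subset only).
sample = [-1.79, -1.7, -1.68, -0.97, -0.95, 0.01, 0.05, 1.95, 1.98]
for row in stem_and_leaf(sample):
    print(row)
```

Because every observation stays visible as a leaf, the display works as a table and a histogram at once.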
[Figure: histogram of z (y-axis: Fraction, 0 to .1; x-axis: z, -1.79 to 1.98) shown alongside the stem-and-leaf plot above]
Importance of graphs:
• Why not just use summary statistics?
Summary statistics for selected variables

Variable |  Obs      Mean   Std. Dev.
---------+--------------------------------
       z |  100    -.0692   .9138869
       a |  100    -.0692   .9136105
       u |  100    -.0692   .9131416
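A small sketch of why matching summary statistics say little about shape. The two lists below are hypothetical (they are not the lecture's z, a, or u); they were chosen so that both have a mean of 5 and a population standard deviation of 2. Note that the table above reports the sample standard deviation, while `pstdev` is used here only to keep the numbers round.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical data: both lists share mean 5 and population SD 2.
spread_out = [2, 4, 4, 4, 5, 5, 7, 9]
two_clusters = [3, 3, 3, 3, 7, 7, 7, 7]


def text_histogram(values):
    """Return one line per distinct value, with one '#' per observation."""
    counts = Counter(values)
    return [f"{value:>2} | {'#' * counts[value]}" for value in sorted(counts)]


for name, data in [("spread_out", spread_out), ("two_clusters", two_clusters)]:
    print(f"{name}: mean={mean(data):.2f} sd={pstdev(data):.2f}")
    print("\n".join(text_histogram(data)))
```

The printed means and standard deviations agree, but the text histograms do not: one list is a single spread-out clump and the other is two tight clusters.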
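[Figure: histograms of a, u, and z (y-axis: Fraction). Panel A: a, -1.79 to 1.98, Fraction up to .2. Panel U: u, -1 to 1.04, Fraction up to .34. Panel Z: z, -1.79 to 1.98, Fraction up to .1]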
Importance of graphs:
• Why not just use analytic statistics?
. regress o n

                                   Number of obs =     100
                                   F(  1,    98) =  411.93
                                   Prob > F      =  0.0000
                                   R-squared     =  0.8078
                                   Adj R-squared =  0.8059
                                   Root MSE      =  .40336
----------------------------------------------------
       o |     Coef.   Std. Err.        t    P>|t|
---------+------------------------------------------
       n |  2.818721   .1388798    20.296    0.000
   _cons |  -1.41951   .0816387   -17.388    0.000

-----------------------------------
       o |   [95% Conf. Interval]
---------+-------------------------
       n |  2.543118    3.094323
   _cons | -1.581519     -1.2575
-----------------------------------
[Figure: scatterplot of o (-2.10937 to 2.46622) against n (.005723 to .984407)]
Principles of graphing:

• Graph the data, not statistics


• Exploit the dimensionality of the
graphic format
• Depict the unit of analysis
• Stratify on confounders
• Show the trees and the forest
[Figure: bar graph of Mean VAS (mm) by Treatment Group (A: 77, B: 82; p < .05), y-axis from 75 to 82]
[Figure: bar graph of Mean VAS (mm) by Treatment Group (A: N = 200, B: N = 200; p < .05), y-axis from 0 to 100]
• The mean VAS in group A was 77 (SD 30, N = 200) and in group B was 82 (SD 7, N = 200) (p < .05).
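The arithmetic behind a sentence like this can be checked from the summary values alone. The sketch below assumes Welch's unequal-variance t statistic; the slide does not say which test produced the p-value, so that choice is an assumption. It also compares a mean ± 2 SD band for group A with the 0 to 100 mm range of the VAS axis.

```python
from math import sqrt

# Summary values quoted on the slide (VAS in mm).
mean_a, sd_a, n_a = 77, 30, 200
mean_b, sd_b, n_b = 82, 7, 200

# Welch's t statistic is an assumption here; the slide does not name the test.
standard_error = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
t_statistic = (mean_b - mean_a) / standard_error

# A mean +/- 2 SD band for group A, to compare with the 0 to 100 mm VAS scale.
band_a = (mean_a - 2 * sd_a, mean_a + 2 * sd_a)

print(f"t = {t_statistic:.2f}")
print(f"group A mean +/- 2 SD: {band_a}")
```

The statistic comes out at about 2.3, which is consistent with the reported p < .05. The band for group A runs to 137 mm, beyond the top of a 100 mm scale, which hints that group A cannot be roughly normal and that the mean and SD alone hide its shape.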
[Figure: back-to-back histograms of VAS score (mm), 0 to 100; y-axis: Number of subjects, 0 to 20 in each direction. Treatment A: N = 200, Mean 77, Median 89. Treatment B: N = 200, Mean 82, Median 83]
[Figure: Group A by treating physician. Two panels, Low pain group and High pain group; x-axis: VAS (mm); y-axis: Number of cases; individual cases plotted as digits]
Principles of graphing:
[Figures: the four displays of the VAS data shown together: the bar graph with y-axis from 75 to 82, the bar graph with y-axis from 0 to 100, the back-to-back histograms for Treatments A and B, and Group A by treating physician]

• Graph the data, not statistics ✓ ✓
• Depict the unit of analysis ✓
• Stratify on confounders
• Show the trees and the forest ✓ ✓
Importance of showing data:

• No assumptions needed
• Efficiency
• Empowers readers to:
– draw their own conclusions
– determine whether the authors’ analyses are appropriate
– do their own analyses
Elements of the graph:
Title
• The title should state what is being shown or compared.
• It should focus readers on what they are about to see.
NOT - “Change in Respiratory Function”
BUT - “Change in FEV1 by group and baseline FEV1”
Elements of the graph:
Legend
• Makes the figure self-explanatory.
• Defines abbreviations, symbols, and
methods
– any regression line, p-value, or other symbol
based on calculations should be explained
• Defines sample size if not shown in graph
Elements of the graph:
• Axes
Elements of the graph:
Axes - scale
• Appropriate boundaries: Do not overly
compress or expand the data.
• Uniformity: Distance along axis must
retain consistent interpretation
throughout graph.
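One way to put a number on "overly expand" is to compare how tall the bars are drawn with how large the values really are. The sketch below uses the mean VAS values (77 and 82 mm) and the 75 mm axis start from the lecture's bar graphs; the function name `drawn_height_ratio` is hypothetical.

```python
def drawn_height_ratio(value_a, value_b, axis_start):
    """Ratio of bar B's drawn height to bar A's when the axis starts at axis_start."""
    return (value_b - axis_start) / (value_a - axis_start)


mean_a, mean_b = 77, 82  # mean VAS (mm) from the lecture's bar graphs

truncated = drawn_height_ratio(mean_a, mean_b, axis_start=75)
full_scale = drawn_height_ratio(mean_a, mean_b, axis_start=0)

print(f"axis from 75: bar B is {truncated:.2f} times as tall as bar A")
print(f"axis from 0:  bar B is {full_scale:.2f} times as tall as bar A")
```

With the axis starting at 75, bar B is drawn 3.5 times as tall as bar A, although the means differ by only about 6%.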
[Figure: Mean Pain Score (VAS) mm, y-axis from 57 to 73, against Post-op day, 1 to 9]
Elements of the graph:
Axes – tick marks and labels
• Avoid clutter
• Align ticks, labels and data points
• Consider specific label for first and last
point of the data set
Elements of the graph:
• Data points
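[Figures: three versions of a scatterplot of b against a. The first two label the axis extremes with unrounded values (b: .074068 to .99277; a: -.007491 to 3.27872); the third rounds them (b: .07 to .99; a: -.01 to 3.28)]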
Elements of the graph:
Data points
• Is it the pattern or the individual points
that you want readers to see?
• Consider using jitter or altering graph
dimensions to avoid clutter.
• Consider symbols to further
differentiate strata in the data.
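Jitter can be sketched as adding a small random offset to each plotted coordinate so that tied points no longer sit on top of one another. The example below is a minimal illustration under stated assumptions: the scores are hypothetical, the `jitter` helper and its `width` and `seed` parameters are made up for this sketch, and the offsets are for display only and should never be fed back into the analysis.

```python
import random


def jitter(values, width=0.2, seed=0):
    """Return values shifted by uniform noise in [-width, +width]."""
    rng = random.Random(seed)  # fixed seed so the picture is reproducible
    return [value + rng.uniform(-width, width) for value in values]


# Hypothetical satisfaction scores on an integer scale with many ties.
scores = [7, 7, 7, 8, 8, 10, 10, 10, 10]
jittered = jitter(scores)

print(len(set(scores)), "distinct positions before jitter")
print(len(set(jittered)), "distinct positions after jitter")
```

Keeping the width well below the spacing between scale values lets readers still tell which true value each point belongs to.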
Elements of the graph:
• Chartjunk!
• Any ink that does not show or explain data
[Figure: bar graph of Mean VAS (mm) by Treatment Group (A: 77, B: 82; p < .05), y-axis from 75 to 82]
Elements of the graph:
Background and Shading
• Efficiency is the key to a good
graphic.
• Avoid
- background shadings, background
grid lines, or unnecessary axes
- moiré patterns
- 3-D effects
Elements of the graph:
Other pitfalls to avoid

• Avoid redundancy within the graphic

• Check for errors


Elements of the graph:
Matching the graph with the text
• Avoid redundancy.
• Use a graph to provide information
beyond what can be conveyed tersely in
text.
• Be sure text and figures are congruent.
• Work with copy editor - get the graphic
on the same page as the relevant text.
Choosing the type of graph
• Choice depends on the type of data
collected, number of observations, and
the message you wish to convey.
– Numeric details or overall pattern?
– Number of subjects or measurements?
– Data form a distribution?
– Data paired?
Graph type - examples
• Univariate simple display - pie chart,
bar graph, point graph
• Univariate distribution – one-way,
histogram, stem-and-leaf plot, box-
and-whisker, survival curve
• Bivariate display – scatterplot and its
variations (ROC, Bland-Altman)
Simple Univariate Display

[Figures: pie chart with six categories numbered 1 to 6 (slices labeled 4%, 12%, 12%, 10%, 26%, 36%) and bar graph of Mean VAS (mm) by Treatment Group (N = 200 in each group; p < .05), y-axis from 0 to 100]
Univariate distributions
[Figures: the stem-and-leaf plot of z and the back-to-back histograms of VAS score (mm) for Treatment A (N = 200, Mean 77, Median 89) and Treatment B (N = 200, Mean 82, Median 83)]
[Figure: PEFR (L / min), 0 to 600, Pre and Post for Drug S and Drug N]
[Figures 6a to 6d: PEFR (L / min), 0 to 600, Pre and Post. Figure 6a - Drug S; Figure 6b - Drug N, v.1; Figure 6c - Drug N, v.2; Figure 6d - Drug N, v.3]
[Figure 7 - Change in PEFR by subject. Panels: Not on Steroids, On Steroids; y-axis: PEFR (L / min), 100 to 600; x-axis: Subjects (labels N = 360 and N = 180)]
[Figure: Change in PEFR (L / min), -110 to 470, against Initial PEFR (L / min), 110 to 510; groups: Not on steroids, On steroids]
Special Features
• Allow extra detail or strata to be
portrayed.
• Convey complex relationships simply
• Increase information content while
maintaining visual clarity
Examples of special features
• Illustration of pairing
• Symbolic dimensionality
• Small multiples
• Layering of two graphic types to convey
detail and summary measures (e.g., scatterplot
with box-and-whisker plots)
[Figure: Patient Satisfaction (1 to 10) against Length of ED Stay - hours (0 to 8). Panels: Linear Regression, Lowess Regression]
[Figure: Patient Satisfaction (1 to 10) against Length of ED Stay - hours (0 to 8); symbols distinguish Discharged and Admitted patients]
[Figure: small multiples of Patient Satisfaction (1 to 10) against Length of Stay - hours (0 to 8). Panels: ankle injury, laceration, wrist/hand fracture, back pain, bronchospasm, vomiting, don't feel well, weakness, headache]