
The visual display of quantitative information:
The use of graphs in research and manuscripts
David L. Schriger, MD, MPH
Richelle J. Cooper, MD, MSHS

For a copy of these slides and a bibliography:


e-mail: mikulich@ucla.edu
Put "graphing lecture" in the subject line.
Any comments about the lecture would be
appreciated. Mr. Mikulich will anonymously
forward these to us (yeah right).
Much of this material can be found in the
January, 2001 issue of Annals of Emergency
Medicine.

Goals of the session


Importance and advantages of graphical
data display
Master use of basic features of graphs
Learn advanced techniques
Gain ability to critique graphs

Importance of graphs:
Exploratory Data Analysis
Presentation of data
Values of data elements
Relationship of data elements

Importance of graphs:
Exploratory data analysis

Picture worth 978 words


How the investigator learns about the data
Seeing is believing
If we only had n dimensions
Make multiple slices
Advanced computer methods - Fantastic voyage
Can't cover in this hour

-1.79  -1.7  -1.68  -1.57  -1.55  -1.44  -1.36  -1.34  -1.29  -1.27  -1.23  -1.23  -1.18  -1.12  -1.09
-.97  -.95  -.89  -.87  -.85  -.85  -.8  -.8  -.78  -.76  -.75  -.75  -.73  -.66  -.66
-.64  -.63  -.63  -.59  -.59  -.57  -.55  -.55  -.38  -.35  -.33  -.33  -.32  -.31  -.29
-.23  -.22  -.22  -.2  -.19  -.12  -.1  -.06  -.04  .01  .05  .1  .14  .14  .15
.16  .2  .21  .24  .27  .28  .3  .31  .32  .33  .37  .46  .48  .52  .55
.57  .57  .59  .64  .71  .82  .87  .87  .9  .96  1  1.08  1.13  1.13  1.14
1.25  1.25  1.26  1.27  1.53  1.62  1.85  1.9  1.95  1.98

. summ z
Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
       z |     100      -.0692   .9138869      -1.79       1.98

. summ z,d
                              z
-------------------------------------------------------------
      Percentiles      Smallest
 1%       -1.745          -1.79
 5%       -1.495           -1.7
10%        -1.25          -1.68       Obs                 100
25%        -.755          -1.57       Sum of Wgt.         100

50%        -.155                      Mean             -.0692
                        Largest       Std. Dev.      .9138869
75%          .56           1.85
90%        1.195            1.9       Variance       .8351893
95%        1.575           1.95       Skewness       .2684074
99%        1.965           1.98       Kurtosis       2.395736

[Figure: plot of z; labelled values -.155, -1.787513, and 1.983905]

Stem-and-leaf plot for x


plot in units of .01
-1** | 79,70,68
-1** | 57,55,44
-1** | 36,34,29,27,23,23
-1** | 18,12,09
-0** | 97,95,89,87,85,85,80,80
-0** | 78,76,75,75,73,66,66,64,63,63
-0** | 59,59,57,55,55
-0** | 38,35,33,33,32,31,29,23,22,22,20
-0** | 19,12,10,06,04
0** | 01,05,10,14,14,15,16
0** | 20,21,24,27,28,30,31,32,33,37
0** | 46,48,52,55,57,57,59
0** | 64,71
0** | 82,87,87,90,96
1** | 00,08,13,13,14
1** | 25,25,26,27
1** | 53
1** | 62
1** | 85,90,95,98
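
As a rough illustration of how a stem-and-leaf display keeps every observation visible, the sketch below builds one for non-negative integers. It is a simplified version, not the layout used in the plot above (which splits each stem across several lines and handles negative values). The function name `stem_and_leaf`, the `stem_unit` parameter, and the sample scores are assumptions made for illustration.

```python
from collections import defaultdict


def stem_and_leaf(values, stem_unit=10):
    """Build stem-and-leaf rows for non-negative integers.

    Hypothetical helper: each value is split into a stem
    (value // stem_unit) and a leaf (value % stem_unit).
    """
    if not values:
        return []
    leaves_by_stem = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, stem_unit)
        leaves_by_stem[stem].append(leaf)
    rows = []
    # Include empty stems so that gaps in the data remain visible.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = "".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        rows.append(f"{stem:>2} | {leaves}")
    return rows


# Made-up scores, for illustration only.
for row in stem_and_leaf([12, 15, 21, 40, 47, 47]):
    print(row)
```

Each row doubles as a histogram bar, so the reader sees the shape of the distribution and can still read off the individual values.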

[Figure: histogram of z (y-axis: Fraction; x-axis: z, from -1.79 to 1.98), shown beside the stem-and-leaf plot]
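
The "Fraction" axis of a histogram is simply the share of observations that fall in each bin. The sketch below computes those shares for equal-width bins. The function name `histogram_fractions`, the bin count, and the sample values are assumptions for illustration, and statistical packages may place bin edges differently.

```python
def histogram_fractions(values, n_bins):
    """Return the fraction of observations in each of n_bins equal-width bins."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_bins
    if width == 0:
        # All values are identical: everything falls in the first bin.
        return [1.0] + [0.0] * (n_bins - 1)
    counts = [0] * n_bins
    for value in values:
        # The maximum value is assigned to the last bin.
        index = min(int((value - lowest) / width), n_bins - 1)
        counts[index] += 1
    return [count / len(values) for count in counts]


# Made-up values, for illustration only.
print(histogram_fractions([0, 1, 2, 3, 4, 5, 6, 7, 8, 10], n_bins=5))
```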

Importance of graphs:
Why not just use summary statistics?

Summary statistics for selected variables

Variable |     Obs        Mean   Std. Dev.
---------+--------------------------------
       z |     100      -.0692   .9138869
       a |     100      -.0692   .9136105
       u |     100      -.0692   .9131416

[Figure: three histograms (y-axis: Fraction) of the variables a, z, and u]
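
A small numerical sketch of why matching summary statistics do not imply matching data: the two made-up samples below are constructed to share a mean and a standard deviation, yet one has no values near its mean and the other has half of them there. The sample values and the helper `describe` are assumptions for illustration. They are not the variables z, a, and u from the slides, and the population standard deviation is used here for simplicity.

```python
import math
import statistics

# Two made-up samples, constructed to share a mean and a standard deviation.
clustered_at_ends = [-1.0, -1.0, 1.0, 1.0]
clustered_in_middle = [-math.sqrt(2), 0.0, 0.0, math.sqrt(2)]


def describe(sample):
    """Return (mean, population SD, share of values within 0.5 of the mean)."""
    mean = statistics.mean(sample)
    sd = statistics.pstdev(sample)
    near_mean = sum(abs(x - mean) <= 0.5 for x in sample) / len(sample)
    return mean, sd, near_mean


for name, sample in [("ends", clustered_at_ends), ("middle", clustered_in_middle)]:
    mean, sd, near_mean = describe(sample)
    print(f"{name}: mean={mean:.3f} sd={sd:.3f} share near mean={near_mean:.2f}")
```

Only a display of the individual values (or a histogram) reveals the difference that the mean and standard deviation conceal.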

Importance of graphs:
Why not just use analytic statistics?

. regress o n
                                   Number of obs =     100
                                   F(  1,    98) =  411.93
                                   Prob > F      =  0.0000
                                   R-squared     =  0.8078
                                   Adj R-squared =  0.8059
                                   Root MSE      =  .40336
-----------------------------------------------------
       o |      Coef.   Std. Err.        t     P>|t|
---------+-------------------------------------------
       n |   2.818721    .1388798   20.296     0.000
   _cons |   -1.41951    .0816387  -17.388     0.000
-----------------------------------------------------
       o |   [95% Conf. Interval]
---------+-----------------------
       n |   2.543118    3.094323
   _cons |  -1.581519     -1.2575
---------------------------------

[Figure: scatterplot of o (-2.10937 to 2.46622) against n (.005723 to .984407)]
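
To illustrate how a strong regression summary can coexist with a poor description of the data, the sketch below fits a straight line to deliberately curved, made-up data (y = x squared). The helper `least_squares` and the data are assumptions for illustration and have no connection to the variables o and n in the output above.

```python
def least_squares(xs, ys):
    """Fit y = intercept + slope * x; return (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, (sxy * sxy) / (sxx * syy)


# Made-up, deliberately curved data: y = x squared.
xs = list(range(11))
ys = [x * x for x in xs]
slope, intercept, r_squared = least_squares(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(round(r_squared, 3))
print([round(r, 1) for r in residuals])
```

The R-squared is high, yet the residuals are positive at both ends and negative in the middle, a pattern that a scatterplot would show at a glance.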

Principles of graphing:
Graph the data, not statistics
Exploit the dimensionality of the
graphic format
Depict the unit of analysis
Stratify on confounders
Show the trees and the forest

[Figure: bar graph of mean VAS (mm) by treatment group (A = 77, B = 82; p < .05), y-axis running from 75 to 82]

[Figure: bar graph of mean VAS (mm) by treatment group (A, N = 200; B, N = 200; p < .05), y-axis running from 0 to 100]

The mean VAS in group A was 77 (SD 30, N = 200) and in group B was 82 (SD 7, N = 200) (p < .05).

[Figure: histograms of number of subjects by VAS score (mm), 0 to 100. Treatment A: N = 200, mean 77, median 89. Treatment B: N = 200, mean 82, median 83.]

[Figure: Group A by treating physician. Number of cases by VAS (mm), with a low pain group and a high pain group; individual cases are plotted as digits.]

Principles of graphing:
Graph the data, not statistics
Depict the unit of analysis
Stratify on confounders
Show the trees and the forest
[Figure: the four preceding VAS figures (bar graph with y-axis from 75 to 82, bar graph with y-axis from 0 to 100, histograms by treatment, and Group A by treating physician) shown together]

Importance of showing data:


No assumptions needed
Efficiency
Empowers readers to:
make their own conclusions
determine whether the authors' analyses are
appropriate
do their own analyses

Elements of the graph:


Title
Title should state what is being shown or
compared.
Focus reader on what they are about to
see
NOT - Change in Respiratory Function
BUT - Change in FEV1 by group and
baseline FEV1

Elements of the graph:


Legend
Makes the figure self-explanatory.
Defines abbreviations, symbols, and
methods
any regression line, p-value, or other symbol
based on calculations should be explained

Defines sample size if not shown in graph

Elements of the graph:

Axes

Elements of the graph:


Axes - scale
Appropriate boundaries: Do not overly
compress or expand the data.
Uniformity: Distance along axis must
retain consistent interpretation
throughout graph.
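
A quick arithmetic sketch of what "overly expanding" the data does to a bar graph: it compares how tall two bars are drawn relative to each other when the axis starts at zero and when it starts just below the data. The values 77 and 82 are the group means reported earlier; the function name `apparent_ratio` and the choice of 75 as a truncated axis minimum are assumptions for illustration.

```python
def apparent_ratio(value_a, value_b, axis_min):
    """Ratio of the drawn heights of bar B to bar A on an axis starting at axis_min."""
    return (value_b - axis_min) / (value_a - axis_min)


true_ratio = apparent_ratio(77, 82, axis_min=0)    # bars drawn from zero
drawn_ratio = apparent_ratio(77, 82, axis_min=75)  # bars drawn from 75

print(f"axis from 0:  B looks {true_ratio:.2f} times as tall as A")
print(f"axis from 75: B looks {drawn_ratio:.2f} times as tall as A")
```

A difference of about 6% in the means is drawn as a bar 3.5 times as tall once the axis is truncated at 75.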

[Figure: mean pain score (VAS, mm) by post-op day, y-axis running from 57 to 73]

Elements of the graph:


Axes - tick marks and labels
Avoid clutter
Align ticks, labels and data points
Consider specific label for first and last
point of the data set

Elements of the graph:

Data points

[Figure: the same plot shown with unrounded axis labels (.99277, .074068, -.007491, 3.27872) and with rounded axis labels (.99, .07, -.01, 3.28)]
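
A minimal sketch of the label clean-up, assuming the goal is two decimal places with the leading zero dropped. The function name `format_tick` and its `decimals` parameter are hypothetical; most graphing packages offer their own axis-format option for this.

```python
def format_tick(value, decimals=2):
    """Round a tick value and drop the leading zero (0.99277 becomes '.99')."""
    text = f"{value:.{decimals}f}"
    if float(text) == 0:
        # Avoid labels such as '-.00' for values that round to zero.
        return "0"
    if text.startswith("0."):
        return text[1:]
    if text.startswith("-0."):
        return "-" + text[2:]
    return text


for raw in [0.99277, 0.074068, -0.007491, 3.27872]:
    print(raw, "->", format_tick(raw))
```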

Elements of the graph:


Data points
Is it the pattern or the individual points
that you want readers to see?
Consider using jitter or altering the graph
dimensions to avoid clutter.
Consider symbols to further
differentiate strata in the data.
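
Jitter can be sketched as a small random displacement added to each plotted coordinate so that tied points no longer sit on top of one another. The function name `jitter`, the uniform noise, the width of 0.4, and the sample scores are all assumptions for illustration; the displacement should stay small relative to the spacing of the data so that points are not misread.

```python
import random


def jitter(values, width, seed=0):
    """Add uniform noise in [-width/2, +width/2] to each value.

    Hypothetical helper: the fixed seed makes the displacement
    reproducible, so the same figure can be regenerated exactly.
    """
    rng = random.Random(seed)
    return [value + rng.uniform(-width / 2, width / 2) for value in values]


# Made-up scores with many ties, which would overlap if plotted as-is.
scores = [7, 7, 7, 8, 8, 9]
plotted = jitter(scores, width=0.4)
print([round(p, 2) for p in plotted])
```

The jittered coordinates are for display only; any analysis should still use the original values.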

Elements of the graph:

Chartjunk!
Any ink that does not
show or explain data

[Figure: bar graph of mean VAS (mm) by treatment group (A = 77, B = 82; p < .05), y-axis running from 75 to 82]

Elements of the graph:


Background and Shading
Efficiency is the key to a good
graphic.
Avoid
- background shadings, background
grid lines, or unnecessary axes
- moiré patterns
- 3-D effects

Elements of the graph:


Other pitfalls to avoid
Avoid redundancy within the graphic
Check for errors

Elements of the graph:


Matching the graph with the text
Avoid redundancy.
Use a graph to provide information
beyond what can be conveyed tersely in
text.
Be sure text and figures are congruent.
Work with copy editor - get the graphic
on the same page as the relevant text.

Choosing the type of graph


Choice depends on the type of data
collected, number of observations, and
the message you wish to convey.
Numeric details or overall pattern?
Number of subjects or measurements?
Data form a distribution?
Data paired?

Graph type - examples


Univariate simple display - pie chart,
bar graph, point graph
Univariate distribution - one-way plot,
histogram, stem-and-leaf plot, box-and-whisker plot, survival curve
Bivariate display - scatterplot and its
variations (ROC, Bland-Altman)

Simple Univariate Display


[Figure: pie chart with six categories (4%, 12%, 12%, 10%, 26%, 36%) and bar graph of mean VAS (mm) by treatment group (A, N = 200; B, N = 200; p < .05), y-axis running from 0 to 100]

Univariate distributions
[Figure: histograms of number of subjects by VAS score (mm) for Treatment A (N = 200, mean 77, median 89) and Treatment B (N = 200, mean 82, median 83), shown with the stem-and-leaf plot from earlier in the lecture]

[Figure: PEFR (L / min), 0 to 600, Pre and Post, for Drug S and Drug N]

[Figure 6a - Drug S; Figure 6b - Drug N, v.1; Figure 6c - Drug N, v.2; Figure 6d - Drug N, v.3. Each panel shows PEFR (L / min), 0 to 600, Pre and Post.]

[Figure 7 - Change in PEFR by subject. PEFR (L / min), 100 to 600, plotted for each subject; panels: Not on Steroids and On Steroids; subject counts shown: N = 360, N = 180.]

[Figure: change in PEFR (L / min), -110 to 470, against initial PEFR (L / min), 110 to 510; panels: Not on steroids and On steroids]

Special Features
Allow extra detail or strata to be
portrayed.
Convey complex relationships simply
Increase information content while
maintaining visual clarity

Examples of special features

Illustration of pairing
Symbolic dimensionality
Small multiples
Layering of two graphic types to convey
detail and summary measures (e.g., scatterplot
with box-and-whisker plots)
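
As a small sketch of why pairing matters, the example below computes each subject's change from aligned before and after measurements, which is the quantity a paired display lets the reader see directly. The function name `paired_changes` and the PEFR values are made up for illustration.

```python
def paired_changes(pre, post):
    """Return post - pre for each subject; the lists must be aligned by subject."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have one value per subject")
    return [after - before for before, after in zip(pre, post)]


# Made-up PEFR values (L/min) for five subjects, for illustration only.
pre = [200, 250, 300, 350, 400]
post = [300, 350, 300, 330, 380]

changes = paired_changes(pre, post)
improved = sum(change > 0 for change in changes)
worsened = sum(change < 0 for change in changes)
print(changes, improved, worsened)
```

Two separate bars for the pre and post means would show only a modest average rise, whereas the per-subject changes show that some subjects improved substantially and others declined.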

[Figure: patient satisfaction (1 to 10) against length of ED stay (hours), fitted with linear regression and with lowess regression; discharged and admitted patients are marked separately]

[Figures: patient satisfaction (1 to 10) against length of ED stay (hours), by presenting complaint: ankle injury, laceration, wrist/hand fracture, back pain, bronchospasm, vomiting, don't feel well, weakness, headache]

