Graphics and Data Visualization in R

Overview Thomas Girke

May 25, 2012


Outline

Overview
Graphics Environments
  Base Graphics
  Grid Graphics
  lattice
  ggplot2
Specialty Graphics


Graphics in R

Powerful environment for visualizing scientific data
Integrated graphics and statistics infrastructure
Publication quality graphics
Fully programmable
Highly reproducible
Full LaTeX and Sweave support
Vast number of R packages with graphics utilities

Documentation on Graphics in R

General
  Graphics Task Page
  R Graph Gallery
  R Graphical Manual
  Paul Murrell's book R (Grid) Graphics
Interactive graphics
  rggobi (GGobi)
  iplots
  Open GL (rgl)

Graphics Environments

Viewing and saving graphics in R
  On-screen graphics
  postscript, pdf, svg
  jpeg/png/wmf/tiff/...
Four major graphic environments
  Low-level infrastructure
    R Base Graphics (low- and high-level)
    grid (Manual, Book)
  High-level infrastructure
    lattice (Manual, Intro, Book)
    ggplot2 (Manual, Intro, Book)


Base Graphics: Overview

Important high-level plotting functions
  plot: generic x-y plotting
  barplot: bar plots
  boxplot: box-and-whisker plot
  hist: histograms
  pie: pie charts
  dotchart: Cleveland dot plots
  image, heatmap, contour, persp: functions to generate image-like plots
  qqnorm, qqline, qqplot: distribution comparison plots
  pairs, coplot: display of multivariate data
Help on these functions
  ?myfct
  ?plot
  ?par

Base Graphics: Preferred Input Data Objects

Matrices and data frames
Vectors
Named vectors
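The following sketch illustrates how base graphics functions treat each of these input types. The object names (v, nv, m, df) and their values are made up for illustration, and the temporary PDF device is only an assumption so that the code also runs without a screen device.

```r
# Open a throwaway PDF device so the sketch also runs non-interactively.
pdf(file = tempfile(fileext = ".pdf"))

set.seed(1)                      # arbitrary seed, for reproducibility only
v  <- runif(5)                   # plain numeric vector
nv <- c(a = 3, b = 5, c = 2)     # named vector
m  <- matrix(1:6, nrow = 2,
             dimnames = list(c("r1", "r2"), c("A", "B", "C")))
df <- as.data.frame(m)           # data frame built from the matrix

plot(v)                          # vector: values are plotted against their index
bar_mid <- barplot(nv)           # named vector: the names become the bar labels
barplot(m, beside = TRUE,        # matrix: one group of bars per column
        legend.text = rownames(m))
plot(df$A, df$B)                 # data frame: columns used as x and y

invisible(dev.off())
```

barplot() returns the x positions of the bar midpoints (stored here in bar_mid), which is what later slides use to place labels and error bars.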

Scatter Plot: very basic

Sample data set for subsequent plots
> set.seed(1410)
> y <- matrix(runif(30), ncol=3, dimnames=list(letters[1:10], LETTERS[1:3]))
> plot(y[,1], y[,2])
[Figure: scatter plot of y[,2] against y[,1]]

Scatter Plot: all pairs

> pairs(y)
[Figure: scatter plot matrix for all pairs of the columns A, B and C]

Scatter Plot: with labels

> plot(y[,1], y[,2], pch=20, col="red", main="Symbols and Labels")
> text(y[,1]+0.03, y[,2], rownames(y))
[Figure: scatter plot titled "Symbols and Labels" with each point labeled by its row name, a to j]

Scatter Plots: more examples

Print the row names instead of symbols
> plot(y[,1], y[,2], type="n", main="Plot of Labels")
> text(y[,1], y[,2], rownames(y))
Usage of important plotting parameters
> grid(5, 5, lwd = 2)
> op <- par(mar=c(8,8,8,8), bg="lightblue")
> plot(y[,1], y[,2], type="p", col="red", cex.lab=1.2, cex.axis=1.2,
+      cex.main=1.2, cex.sub=1, lwd=4, pch=20, xlab="x label",
+      ylab="y label", main="My Main", sub="My Sub")
> par(op)
Important arguments
  mar: specifies the margin sizes around the plotting area in order: c(bottom, left, top, right)
  col: color of symbols
  pch: type of symbols, samples: example(points)
  lwd: line width (for plotting symbols, the width of their outline)
  cex.*: control font sizes
  For details see ?par

Scatter Plots: more examples

Add a regression line to a plot
> plot(y[,1], y[,2])
> myline <- lm(y[,2]~y[,1]); abline(myline, lwd=2)
> summary(myline)
Same plot as above, but on log scale
> plot(y[,1], y[,2], log="xy")
Add a mathematical expression to a plot
> plot(y[,1], y[,2]); text(y[1,1], y[1,2],
+      expression(sum(frac(1,sqrt(x^2*pi)))), cex=1.3)
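As a small illustration of what abline() draws from the fitted model, the sketch below extracts the intercept and slope explicitly. It reuses the sample matrix y from the earlier slides; the names fit and cf and the temporary PDF device are assumptions made for this example.

```r
# Recreate the sample data used throughout the base graphics slides.
set.seed(1410)
y <- matrix(runif(30), ncol = 3, dimnames = list(letters[1:10], LETTERS[1:3]))

fit <- lm(y[, 2] ~ y[, 1])   # simple linear regression of column 2 on column 1
cf  <- coef(fit)             # cf[1] is the intercept, cf[2] is the slope

pdf(file = tempfile(fileext = ".pdf"))   # throwaway device for non-interactive runs
plot(y[, 1], y[, 2])
abline(a = cf[1], b = cf[2], lwd = 2)    # same line that abline(fit) would draw
legend("topleft", bty = "n",
       legend = sprintf("slope = %.2f", cf[2]))
invisible(dev.off())
```

Passing the lm object directly, as on the slide, is shorter; spelling out a and b is mainly useful when the coefficients come from somewhere else.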

Exercise 1: Scatter Plots

Task 1 Generate scatter plot for first two columns in iris data frame and color dots by its Species column.
Task 2 Use the xlim/ylim arguments to set limits on the x- and y-axes so that all data points are restricted to the left bottom quadrant of the plot.
Structure of iris data set:
> class(iris)
[1] "data.frame"
> iris[1:4,]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
> table(iris$Species)
    setosa versicolor  virginica
        50         50         50
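One possible approach to both tasks is sketched below; it is not the official solution. The variable names and the choice to extend each axis to twice the data range are assumptions made for illustration.

```r
# Task 1: map the Species factor to integer color codes (1, 2, 3).
species_col <- as.integer(iris$Species)

# Task 2: start each axis at the data minimum and extend it to twice the
# data range, so every point falls into the bottom-left quadrant.
xr <- range(iris[, 1])
yr <- range(iris[, 2])
x_limits <- c(xr[1], xr[1] + 2 * diff(xr))
y_limits <- c(yr[1], yr[1] + 2 * diff(yr))

pdf(file = tempfile(fileext = ".pdf"))   # throwaway device for non-interactive runs
plot(iris[, 1], iris[, 2], col = species_col, pch = 20,
     xlab = colnames(iris)[1], ylab = colnames(iris)[2],
     xlim = x_limits, ylim = y_limits)
legend("topright", legend = levels(iris$Species), col = 1:3, pch = 20)
invisible(dev.off())
```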

Line Plot: Single Data Set

> plot(y[,1], type="l", lwd=2, col="blue")
[Figure: line plot of y[,1] against its index]

Line Plots: Many Data Sets

> split.screen(c(1,1))
[1] 1
> plot(y[,1], ylim=c(0,1), xlab="Measurement", ylab="Intensity", type="l",
+      lwd=2, col=1)
> for(i in 2:length(y[1,])) {
+     screen(1, new=FALSE)
+     plot(y[,i], ylim=c(0,1), type="l", lwd=2, col=i, xaxt="n", yaxt="n",
+          ylab="", xlab="", main="", bty="n")
+ }
> close.screen(all=TRUE)
[Figure: overlaid line plots of all columns of y, Intensity against Measurement]

Bar Plot Basics

> barplot(y[1:4,], ylim=c(0, max(y[1:4,])+0.3), beside=TRUE,
+         legend=letters[1:4])
> text(labels=round(as.vector(as.matrix(y[1:4,])), 2), x=seq(1.5, 13, by=1)
+      +sort(rep(c(0,1,2), 4)), y=as.vector(as.matrix(y[1:4,]))+0.04)
[Figure: grouped bar plot of rows a to d for columns A, B and C, with the value printed above each bar]

Bar Plots with Error Bars

> bar <- barplot(m <- rowMeans(y) * 10, ylim=c(0, 10))
> stdev <- apply(y, 1, sd) # one standard deviation per row; sd() no longer works column-wise on a matrix
> arrows(bar, m, bar, m + stdev, length=0.15, angle = 90)
[Figure: bar plot of the row means a to j with upper error bars]

Histograms

> hist(y, freq=TRUE, breaks=10)
[Figure: "Histogram of y", Frequency against y]

Density Plots

> plot(density(y), col="red")
[Figure: density curve of y, titled "density.default(x = y)"; N = 30, Bandwidth = 0.136]

Pie Charts

> pie(y[,1], col=rainbow(length(y[,1]), start=0.1, end=0.8), clockwise=TRUE)
> legend("topright", legend=row.names(y), cex=1.3, bty="n", pch=15, pt.cex=1.8,
+        col=rainbow(length(y[,1]), start=0.1, end=0.8), ncol=1)
[Figure: pie chart of y[,1] with slices a to j and a matching legend]

Color Selection Utilities

Default color palette and how to change it
> palette()
[1] "black" "red" "green3" "blue" "cyan" "magenta" "yellow" "gray"
> palette(rainbow(5, start=0.1, end=0.2))
> palette()
[1] "#FF9900" "#FFBF00" "#FFE600" "#F2FF00" "#CCFF00"
> palette("default")
The gray function selects shades of gray from values between 0 and 1
> gray(seq(0.1, 1, by= 0.2))
[1] "#1A1A1A" "#4D4D4D" "#808080" "#B3B3B3" "#E6E6E6"
Color gradients with colorpanel function from gplots library
> library(gplots)
> colorpanel(5, "darkblue", "yellow", "white")
For much more on colors in R see Earl Glynn's color chart

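Color gradients similar to those from colorpanel can also be built without the gplots package. The sketch below swaps in colorRampPalette from base R (grDevices) as an alternative; it is not the function used on the slide, and the names ramp, cols and grays are chosen for illustration.

```r
# colorRampPalette() returns a function that interpolates between the
# given colors; calling it with n returns n hex color codes.
ramp <- colorRampPalette(c("darkblue", "yellow", "white"))
cols <- ramp(5)          # five-step gradient from dark blue via yellow to white

# gray() maps values between 0 (black) and 1 (white) to hex codes.
grays <- gray(c(0.1, 0.5, 0.9))

print(cols)
print(grays)
```

The resulting vectors can be passed to any col argument, or installed as the session palette with palette(cols).
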
Arranging Several Plots on Single Page

With par(mfrow=c(nrow,ncol)) one can define how several plots are arranged next to each other.
> par(mfrow=c(2,3)); for(i in 1:6) { plot(1:10) }
[Figure: six identical scatter plots of 1:10 arranged in two rows and three columns]

c(3.show(nf) > for(i in 1:3) { barplot(1:10) } 10 10 8 6 4 2 0 Graphics and Data Visualization in R 0 2 4 6 8 10 Graphics Environments 0 2 4 6 8 Base Graphics Slide 26/76 .7). + respect=TRUE) > # layout. 2. c(5.3. > nf <.layout(matrix(c(1. byrow=TRUE).5).2. 2.3).Arranging Plots with Variable Width The layout function allows to divide the plotting device into variable numbers of rows and columns with the column-widths and the row-heights specified in the respective arguments.

Saving Graphics to Files

After the pdf() command all graphs are redirected to file test.pdf. Works for all common formats similarly: jpeg, png, ps, tiff, ...
> pdf("test.pdf"); plot(1:10, 1:10); dev.off()
Generates Scalable Vector Graphics (SVG) files that can be edited in vector graphics programs, such as InkScape.
> library("RSvgDevice"); devSVG("test.svg"); plot(1:10, 1:10); dev.off()
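The same open, plot, close pattern is sketched below for two devices that ship with base R. The output directory (tempdir()), the file names and the width and height values are assumptions for this example.

```r
# Write the same plot to a PDF and a PostScript file in a temporary directory.
out_dir  <- tempdir()
pdf_file <- file.path(out_dir, "example_plot.pdf")
ps_file  <- file.path(out_dir, "example_plot.ps")

pdf(pdf_file, width = 6, height = 4)   # size in inches
plot(1:10, 1:10, main = "Saved to PDF")
invisible(dev.off())                   # closing the device writes the file

postscript(ps_file)
plot(1:10, 1:10, main = "Saved to PostScript")
invisible(dev.off())

print(file.exists(c(pdf_file, ps_file)))
```

Nothing is written completely until dev.off() is called, so a missing or empty file usually means the device was never closed.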

Exercise 2: Bar Plots

Task 1 Calculate the mean values for the Species components of the first four columns in the iris data set. Organize the results in a matrix where the row names are the unique values from the iris Species column and the column names are the same as in the first four iris columns.
Task 2 Generate two bar plots: one with stacked bars and one with horizontally arranged bars.
Structure of iris data set:
> class(iris)
[1] "data.frame"
> iris[1:4,]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
> table(iris$Species)
    setosa versicolor  virginica
        50         50         50
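One possible approach is sketched below; it is not the official solution. It reads "horizontally arranged" as bars placed side by side (beside=TRUE), which is an assumption, and the names iris_means and mean_mat are made up for this example.

```r
# Task 1: per-species means of the four measurement columns, as a matrix.
iris_means <- aggregate(iris[, 1:4], by = list(Species = iris$Species), FUN = mean)
mean_mat <- as.matrix(iris_means[, -1])                  # drop the Species column
rownames(mean_mat) <- as.character(iris_means$Species)   # species as row names

# Task 2: stacked bars, then bars arranged side by side.
pdf(file = tempfile(fileext = ".pdf"))   # throwaway device for non-interactive runs
barplot(mean_mat, legend.text = rownames(mean_mat))
barplot(mean_mat, beside = TRUE, legend.text = rownames(mean_mat))
invisible(dev.off())

print(mean_mat)
```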


grid Graphics Environment

What is grid?
  Low-level graphics system
  Highly flexible and controllable system
  Does not provide high-level functions
  Intended as development environment for custom plotting functions
  Pre-installed on new R distributions
Documentation and Help
  Manual
  Book
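To give a feel for grid as a basis for custom plotting functions, here is a minimal sketch. The helper labelled_box is hypothetical (it is not part of grid), and the two-column layout and temporary PDF device are assumptions for illustration.

```r
library(grid)

# Hypothetical helper: combines a filled rectangle and a centered text
# label into a single graphical object (a gTree).
labelled_box <- function(label, fill = "lightblue") {
  grobTree(
    rectGrob(gp = gpar(fill = fill, col = "black")),
    textGrob(label, gp = gpar(fontsize = 14))
  )
}

pdf(file = tempfile(fileext = ".pdf"))   # throwaway device for non-interactive runs
grid.newpage()
# Divide the page into one row and two columns, then draw into each cell.
pushViewport(viewport(layout = grid.layout(nrow = 1, ncol = 2)))
for (i in 1:2) {
  pushViewport(viewport(layout.pos.row = 1, layout.pos.col = i))
  grid.draw(labelled_box(paste("Panel", i)))
  popViewport()
}
popViewport()
invisible(dev.off())

box <- labelled_box("example")
```

Viewports play the role that par(mfrow) and layout() play in base graphics, but they can be nested and positioned freely.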


lattice Environment

What is lattice?
  High-level graphics system
  Developed by Deepayan Sarkar
  Implements Trellis graphics system from S-Plus
  Simplifies high-level plotting tasks: arranging complex graphical features
  Syntax similar to R's base graphics
Documentation and Help
  Manual
  Intro
  Book
  library(help=lattice) opens a list of all functions available in the lattice package
  Accessing and changing global parameters: ?lattice.options and ?trellis.device
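The similarity to base graphics syntax can be sketched by drawing the same iris columns both ways. The choice of columns, the layout and the temporary PDF device are assumptions for illustration.

```r
library(lattice)

pdf(file = tempfile(fileext = ".pdf"))   # throwaway device for non-interactive runs

# Base graphics: formula interface, a single panel.
plot(Sepal.Width ~ Sepal.Length, data = iris)

# lattice: same formula, with "| Species" adding one panel per species.
p <- xyplot(Sepal.Width ~ Sepal.Length | Species, data = iris,
            layout = c(3, 1), xlab = "Sepal.Length", ylab = "Sepal.Width")
print(p)   # lattice plots are objects and are drawn when printed

invisible(dev.off())
```

Because xyplot returns an object rather than drawing immediately, the plot can be stored, modified with update(), and printed later.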

Scatter Plot Sample

> library(lattice)
> p1 <- xyplot(1:8 ~ 1:8 | rep(LETTERS[1:4], each=2), as.table=TRUE)
> plot(p1)
[Figure: four panels A to D, each showing two points of 1:8 against 1:8]

Line Plot Sample

> library(lattice)
> p2 <- parallelplot(~iris[1:4] | Species, iris, horizontal.axis = FALSE,
+                    layout = c(1, 3, 1))
> plot(p2)
[Figure: parallel coordinate plots of the four iris measurements, scaled from Min to Max, in one panel per species]


ggplot2 Environment

What is ggplot2?
  High-level graphics system
  Implements grammar of graphics from Leland Wilkinson
  Streamlines many graphics workflows for complex plots
  Syntax centered around main ggplot function
  Simpler qplot function provides many shortcuts
Documentation and Help
  Manual
  Intro
  Book

ggplot2 Usage

ggplot function accepts two arguments
  Data set to be plotted
  Aesthetic mappings provided by aes function
Additional parameters such as geometric objects (e.g. points, lines, bars) are passed on by appending them with + as separator.
A list of the available geom_* functions is given in the ggplot2 documentation.
Settings of the plotting theme can be accessed with the command theme_get() and changed with opts() (replaced by theme() in ggplot2 0.9.2 and later).
Preferred input data object
  qplot: data.frame (support for vector, matrix, ...)
  ggplot: data.frame
Packages with convenience utilities to create expected inputs
  plyr
  reshape

qplot Function

qplot syntax is similar to R's basic plot function
Arguments:
  x: x-coordinates (e.g. col1)
  y: y-coordinates (e.g. col2)
  data: data frame with corresponding column names
  xlim, ylim: e.g. xlim=c(0,10)
  log: e.g. log="x" or log="xy"
  main: main title; see ?plotmath for mathematical formula
  xlab, ylab: labels for the x- and y-axes
  color, shape, size
  ...: many arguments accepted by plot function


qplot: Scatter Plots
Create sample data
> library(ggplot2)
> x <- sample(1:10, 10); y <- sample(1:10, 10); cat <- rep(c("A", "B"), 5)
Simple scatter plot
> qplot(x, y, geom="point")
Prints dots with different sizes and colors
> qplot(x, y, geom="point", size=x, color=cat,
+       main="Dot Size and Color Relative to Some Values")
Drops legend
> qplot(x, y, geom="point", size=x, color=cat) +
+       opts(legend.position = "none")
Plot different shapes
> qplot(x, y, geom="point", size=5, shape=cat)


qplot: Scatter Plot with qplot

> p <- qplot(x, y, geom="point", size=x, color=cat,
+            main="Dot Size and Color Relative to Some Values") +
+      opts(legend.position = "none")
> print(p)
[Figure: scatter plot of y against x titled "Dot Size and Color Relative to Some Values"]

qplot: Scatter Plot with Regression Line

> set.seed(1410)
> dsmall <- diamonds[sample(nrow(diamonds), 1000), ]
> p <- qplot(carat, price, data = dsmall, geom = c("point", "smooth"),
+            method = "lm")
> print(p)
[Figure: scatter plot of price against carat with a linear regression line]

qplot: Scatter Plot with Local Regression Curve (loess)

> p <- qplot(carat, price, data=dsmall, geom=c("point", "smooth"), span=0.4)
> print(p) # Setting 'se=FALSE' removes error shade
[Figure: scatter plot of price against carat with a loess curve and error shade]

ggplot Function

More important than qplot to access full functionality of ggplot2
Main arguments
  data set, usually a data.frame
  aesthetic mappings provided by aes function
General ggplot syntax
  ggplot(data, aes(...)) + geom_*() + ... + stat_*() + ...
Layer specifications
  geom_*(mapping, data, ..., geom, position)
  stat_*(mapping, data, ..., stat, position)
Additional components
  scales
  coordinates
  facet
aes() mappings can be passed on to all components (ggplot, geom_*, etc.). Effects are global when passed on to ggplot() and local for other components.
  x, y
  color: grouping vector (factor)
  group: grouping vector (factor)
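The difference between global and local aes() mappings can be sketched as follows, assuming a current ggplot2 installation. The object names p_global and p_local are made up for this example, and formula = y ~ x is passed only to keep geom_smooth from printing a message.

```r
library(ggplot2)

set.seed(1410)
dsmall <- diamonds[sample(nrow(diamonds), 1000), ]

# Global mapping: colour is set in ggplot(), so every layer inherits it
# and geom_smooth fits one regression line per diamond color.
p_global <- ggplot(dsmall, aes(carat, price, colour = color)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Local mapping: colour is set only in geom_point(), so the points are
# colored but geom_smooth fits a single line through all of them.
p_local <- ggplot(dsmall, aes(carat, price)) +
  geom_point(aes(colour = color)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

pdf(file = tempfile(fileext = ".pdf"))   # throwaway device for non-interactive runs
print(p_global)
print(p_local)
invisible(dev.off())
```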

Changing Plotting Themes with ggplot

Theme settings can be accessed with theme_get()
Their settings can be changed with opts()
Some examples
Change background color to white
  ... + opts(panel.background=theme_rect(fill = "white", colour = "black"))
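In ggplot2 0.9.2 and later, theme() takes the place of opts() and element_rect() the place of theme_rect(). A minimal sketch of the white-background example under that newer interface follows; the iris scatter plot is only a stand-in chosen for illustration.

```r
library(ggplot2)

# Same theme change as above, written for current ggplot2 versions:
# theme() instead of opts(), element_rect() instead of theme_rect().
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width, colour = Species)) +
  geom_point() +
  theme(panel.background = element_rect(fill = "white", colour = "black"),
        legend.position = "none")        # also drops the legend

current_theme <- theme_get()             # the active default theme settings

pdf(file = tempfile(fileext = ".pdf"))   # throwaway device for non-interactive runs
print(p)
invisible(dev.off())
```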

Storing ggplot Specifications

Plots and layers can be stored in variables
> p <- ggplot(dsmall, aes(carat, price)) + geom_point()
> p # or print(p)
Returns information about data and aesthetic mappings followed by each layer
> summary(p)
Stores a regression line layer in a variable and adds it to the plot
> bestfit <- geom_smooth(method = "lm", se = FALSE, color = alpha("steelblue", 0.5))
> p + bestfit # Plot with custom regression line
Syntax to pass on other data sets
> p %+% diamonds[sample(nrow(diamonds), 100),]
Saves plot stored in variable p to file
> ggsave(p, file="myplot.pdf")

ggplot: Scatter Plot

> p <- ggplot(dsmall, aes(carat, price, color=color)) +
+      geom_point(size=4)
> print(p)
[Figure: scatter plot of price against carat, points colored by diamond color D to J]

ggplot: Scatter Plot with Regression Line

> p <- ggplot(dsmall, aes(carat, price)) + geom_point() +
+      geom_smooth(method="lm", se=FALSE)
> print(p)
[Figure: scatter plot of price against carat with a linear regression line]

ggplot: Scatter Plot with Several Regression Lines

> p <- ggplot(dsmall, aes(carat, price, group=color)) +
+      geom_point(aes(color=color), size=2) +
+      geom_smooth(aes(color=color), method = "lm", se=FALSE)
> print(p)
[Figure: scatter plot of price against carat with one regression line per diamond color D to J]

ggplot: Scatter Plot with Local Regression Curve (loess)

> p <- ggplot(dsmall, aes(carat, price)) + geom_point() + geom_smooth()
> print(p) # Setting 'se=FALSE' removes error shade
[Figure: scatter plot of price against carat with a loess curve and error shade]

ggplot: Line Plot

> p <- ggplot(iris, aes(Petal.Length, Petal.Width, group=Species,
+      color=Species)) + geom_line()
> print(p)
[Figure: line plot of Petal.Width against Petal.Length, one line per species]

ggplot: Faceting

> p <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
+      geom_line(aes(color=Species), size=1) +
+      facet_wrap(~Species, ncol=1)
> print(p)
[Figure: line plots of Sepal.Width against Sepal.Length in three stacked panels: setosa, versicolor, virginica]

Exercise 3: Scatter Plots

Task 1 Generate scatter plot for first two columns in iris data frame and color dots by its Species column.
Task 2 Use the xlim, ylim functions to set limits on the x- and y-axes so that all data points are restricted to the left bottom quadrant of the plot.
Task 3 Generate corresponding line plot with faceting to show individual data sets in separate plots.
Structure of iris data set:
> class(iris)
[1] "data.frame"
> iris[1:4,]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
> table(iris$Species)
    setosa versicolor  virginica
        50         50         50

ggplot: Bar Plots

Sample Set: the following transforms the iris data set into a ggplot2-friendly format
> ## Calculate mean values for aggregates given by Species column
> ## in iris data set
> iris_mean <- aggregate(iris[,1:4], by=list(Species=iris$Species), FUN=mean)
> ## Calculate standard deviations for aggregates given by Species
> ## column in iris data set
> iris_sd <- aggregate(iris[,1:4], by=list(Species=iris$Species), FUN=sd)
> ## Define function to convert data frames into ggplot2-friendly format.
> convertDF <- function(df=df, mycolnames=c("Species", "Values", "Samples")) {
+     myfactor <- rep(colnames(df)[-1], each=length(df[,1]))
+     mydata <- as.vector(as.matrix(df[,-1]))
+     df <- data.frame(df[,1], mydata, myfactor)
+     colnames(df) <- mycolnames
+     return(df)
+ }
> ## Convert iris_mean
> df_mean <- convertDF(iris_mean, mycolnames=c("Species", "Values", "Samples"))
> ## Convert iris_sd
> df_sd <- convertDF(iris_sd, mycolnames=c("Species", "Values", "Samples"))
> ## Define standard deviation limits
> limits <- aes(ymax = df_mean[,2] + df_sd[,2], ymin=df_mean[,2] - df_sd[,2])
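The same wide-to-long conversion can be sketched with base R's stack() function. This is an alternative to the convertDF helper, not the method used on the slide, and the names long and df_long are made up for illustration.

```r
# Per-species means in wide format: one row per species, one column per measurement.
iris_mean <- aggregate(iris[, 1:4], by = list(Species = iris$Species), FUN = mean)

# stack() concatenates the measurement columns into a single 'values'
# column and records the source column name in the factor 'ind'.
long <- stack(iris_mean[, -1])

# Assemble the three-column layout that convertDF produces; the species
# labels repeat once for each of the four stacked columns.
df_long <- data.frame(Species = rep(iris_mean$Species, times = 4),
                      Values  = long$values,
                      Samples = long$ind)

print(head(df_long, 4))
```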

ggplot: Bar Plot

> p <- ggplot(df_mean, aes(Samples, Values, fill = Species)) +
+      geom_bar(position="dodge")
> print(p)
[Figure: grouped bar plot of Values against Samples, bars filled by Species]

ggplot: Bar Plot Sideways

> p <- ggplot(df_mean, aes(Samples, Values, fill = Species)) +
+      geom_bar(position="dodge") + coord_flip() +
+      opts(axis.text.y=theme_text(angle=0, hjust=1))
> print(p)
[Figure: horizontal grouped bar plot of Values against Samples, bars filled by Species]

ggplot: Bar Plot with Faceting

> p <- ggplot(df_mean, aes(Samples, Values)) + geom_bar(aes(fill = Species)) +
+      facet_wrap(~Species, ncol=1)
> print(p)
[Figure: bar plots of Values against Samples in three stacked panels: setosa, versicolor, virginica]

ggplot: Bar Plot with Error Bars

> p <- ggplot(df_mean, aes(Samples, Values, fill = Species)) +
+      geom_bar(position="dodge") + geom_errorbar(limits, position="dodge")
> print(p)
[Figure: grouped bar plot of Values against Samples with error bars, bars filled by Species]

ggplot: Changing Color Settings

> library(RColorBrewer)
> # display.brewer.all()
> p <- ggplot(df_mean, aes(Samples, Values, fill=Species, color=Species)) +
+      geom_bar(position="dodge") + geom_errorbar(limits, position="dodge") +
+      scale_fill_brewer(palette="Blues") + scale_color_brewer(palette = "Greys")
> print(p)
[Figure: grouped bar plot with error bars, filled with the "Blues" palette and outlined with the "Greys" palette]

ggplot: Using Standard Colors

> p <- ggplot(df_mean, aes(Samples, Values, fill=Species, color=Species)) +
+      geom_bar(position="dodge") + geom_errorbar(limits, position="dodge") +
+      scale_fill_manual(values=c("red", "green3", "blue")) +
+      scale_color_manual(values=c("red", "green3", "blue"))
> print(p)
[Figure: grouped bar plot with error bars in red, green3 and blue]

Exercise 4: Bar Plots

Task 1 Calculate the mean values for the Species components of the first four columns in the iris data set. Use the convertDF function from one of the previous slides to bring the results into the expected format for ggplot.
Task 2 Generate two bar plots: one with stacked bars and one with horizontally arranged bars.
Structure of iris data set:
> class(iris)
[1] "data.frame"
> iris[1:4,]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
> table(iris$Species)
    setosa versicolor  virginica
        50         50         50

1053735 2 -0. paste("Sample".frame(count=1:length(y[.2866302 -0.3695197 -0.ggplot: Data Reformatting Example for Line Plot > y <.2724114 df <.7674733 2. Values)) + geom_line(aes(color=Samples)) + facet_wrap(~Samples.202619 -0. aes(Position. ] # First rows of input format expected by convertDF() g1 g2 g3 g4 > > > > > count Sample1 Sample2 Sample3 Sample4 Sample5 1 0.matrix(rnorm(500).6096398 1.6556324 1.1]).2370991 4 -0.357860 -1.6815086 -1.0326720 -0. ncol=1) print(p) ## Represent same data in box plot ## ggplot(df.1169518 1. 100.data.1699970 -0.3429578 1. sep=""))) > y <.174138 -0. mycolnames=c("Position". 1:100.8991985 3 -0. sep=""). 1:5. aes(Samples.6544649 -0. dimnames=list(paste("g". Values. "Samples")) p <.173512 -1.convertDF(y. y) > y[1:4. "Values".ggplot(df. fill=Samples)) + geom_boxplot() Sample1 2 0 −2 Sample2 2 0 −2 Sample3 2 0 −2 Sample4 2 0 −2 Sample5 2 0 −2 Samples Sample1 Sample2 Sample3 Sample4 Sample5 Graphics and Data Visualization in R Values Graphics Environments ggplot2 Slide 61/76 .4875971 -1. 5.

ggplot: Jitter Plots

> p <- ggplot(dsmall, aes(color, price/carat)) +
+      geom_jitter(alpha = I(1 / 2), aes(color=color))
> print(p)

Figure: Jitter plot of price/carat (about 2000 to 12000) by diamond color (D to J), points colored by color.

ggplot: Box Plots

> p <- ggplot(dsmall, aes(color, price/carat, fill=color)) + geom_boxplot()
> print(p)

Figure: Box plots of price/carat (about 2000 to 12000) by diamond color (D to J), filled by color.

ggplot: Density Plot with Line Coloring

> p <- ggplot(dsmall, aes(carat)) + geom_density(aes(color = color))
> print(p)

Figure: Density curves of carat, one line per diamond color (D to J).

ggplot: Density Plot with Area Coloring

> p <- ggplot(dsmall, aes(carat)) + geom_density(aes(fill = color))
> print(p)

Figure: Density curves of carat, with the area under each curve filled by diamond color (D to J).

ggplot: Histograms

> p <- ggplot(iris, aes(x=Sepal.Width)) + geom_histogram(aes(y = ..density..,
+      fill = ..count..), binwidth=0.2) + geom_density()
> print(p)

Figure: Histogram of Sepal.Width on a density scale, bars filled by count, with an overlaid density curve.

ggplot: Pie Chart

> df <- data.frame(variable=rep(c("cat", "mouse", "dog", "bird", "fly")),
+                  value=c(1,3,3,4,2))
> p <- ggplot(df, aes(x = "", y = value, fill = variable)) +
+      geom_bar(width = 1) +
+      coord_polar("y", start=pi / 3) + opts(title = "Pie Chart")
> print(p)

Figure: Pie chart titled "Pie Chart", with value on the angular axis and slices filled by variable (bird, cat, dog, fly, mouse).

ggplot: Wind Rose Pie Chart

> p <- ggplot(df, aes(x = variable, y = value, fill = variable)) +
+      geom_bar(width = 1) + coord_polar("y", start=pi / 3) +
+      opts(title = "Pie Chart")
> print(p)

Figure: Wind rose chart titled "Pie Chart", with one ring per variable (bird, cat, dog, fly, mouse) and value on the angular axis.

ggplot: Arranging Graphics on One Page

> library(grid)
> a <- ggplot(dsmall, aes(color, price/carat)) + geom_jitter(size=4, alpha = I(1 / 1.5), aes(color=color))
> b <- ggplot(dsmall, aes(color, price/carat, color=color)) + geom_boxplot()
> c <- ggplot(dsmall, aes(color, price/carat, fill=color)) + geom_boxplot() + opts(legend.position = "none")
> grid.newpage() # Open a new page on grid device
> pushViewport(viewport(layout = grid.layout(2, 2))) # Assign to device viewport with 2 by 2 grid layout
> print(a, vp = viewport(layout.pos.row = 1, layout.pos.col = 1:2))
> print(b, vp = viewport(layout.pos.row = 2, layout.pos.col = 1))
> print(c, vp = viewport(layout.pos.row = 2, layout.pos.col = 2, width=0.3, height=0.3, x=0.8, y=0.8))

ggplot: Arranging Graphics on One Page

Figure: Three plots of price/carat by diamond color (D to J) on one page: the jitter plot spanning the top row, and the two box plots side by side in the bottom row.
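The layout mechanism can be tried without ggplot2 or the dsmall data. The following is a minimal sketch using only the grid package: it sets up the same 2 by 2 layout and fills the same three regions with labeled rectangles in place of the plots. The helper name cell and the use of a null PDF device are assumptions made for this illustration.

```r
library(grid)

# Hypothetical helper (not from the slides): viewport addressing one cell,
# or a span of cells, in the enclosing layout.
cell <- function(row, col) viewport(layout.pos.row = row, layout.pos.col = col)

pdf(file = NULL)                                   # null device, writes no file
grid.newpage()                                     # open a new page on the grid device
pushViewport(viewport(layout = grid.layout(2, 2))) # 2 by 2 layout

regions <- list(list(1, 1:2, "a"), list(2, 1, "b"), list(2, 2, "c"))
for (spec in regions) {
  pushViewport(cell(spec[[1]], spec[[2]]))         # enter the target cell(s)
  grid.rect(gp = gpar(col = "grey50"))             # outline the region
  grid.text(spec[[3]])                             # label it with the plot name
  popViewport()                                    # return to the layout viewport
}
dev.off()
```

Replacing pdf(file = NULL) with an on-screen device or a real file name shows the three regions: one spanning the top row and two in the bottom row.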

ggplot: Inserting Graphics into Plots

> # pdf("insert.pdf")
> print(a)
> print(b, vp=viewport(width=0.3, height=0.3, x=0.8, y=0.8))
> # dev.off()

Figure: Jitter plot of price/carat by diamond color (D to J) with the box plot inserted as a small inset in the upper right corner.

Outline

Overview

Graphics Environments Base Graphics Grid Graphics lattice ggplot2

Specialty Graphics

Trees and Heatmaps

> library(gplots)
> y <- matrix(rnorm(500), 100, 5, dimnames=list(paste("g", 1:100, sep=""),
+             paste("t", 1:5, sep="")))
> hr <- hclust(as.dist(1-cor(t(y), method="pearson")), method="complete")
> hc <- hclust(as.dist(1-cor(y, method="spearman")), method="complete")
> heatmap.2(y, Rowv=as.dendrogram(hr), Colv=as.dendrogram(hc), col=redgreen(75),
+           scale="row", density.info="none", trace="none")

Figure: Heatmap of y with row dendrogram (g1 to g100) and column dendrogram (t1 to t5), plus a color key for the row Z-score (about -1 to 1).
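The clustering step can be inspected on its own, without gplots. The sketch below repeats the two hclust() calls from the slide with base R only and then looks at the resulting orderings. The seed, the object name row_clusters, and the choice of k = 4 are assumptions added for illustration.

```r
set.seed(1)  # seed added only to make this sketch reproducible
y <- matrix(rnorm(500), 100, 5,
            dimnames = list(paste("g", 1:100, sep = ""), paste("t", 1:5, sep = "")))

# Correlation-based distances: 1 - r is 0 for perfectly correlated profiles
# and 2 for perfectly anti-correlated ones.
hr <- hclust(as.dist(1 - cor(t(y), method = "pearson")), method = "complete")  # rows
hc <- hclust(as.dist(1 - cor(y, method = "spearman")), method = "complete")    # columns

hr$labels[hr$order][1:5]  # first five rows in dendrogram order
hc$labels[hc$order]       # columns in dendrogram order

# Cut the row tree into groups; k = 4 is an arbitrary choice for illustration.
row_clusters <- cutree(hr, k = 4)
table(row_clusters)
```

Note that cor() correlates columns, which is why the row tree is built from t(y) and the column tree from y directly.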

Venn Diagrams (Code)

> source("http://faculty.ucr.edu/~tgirke/Documents/R_BioCond/My_R_Scripts/overLapper.R")
> setlist5 <- list(A=sample(letters, 18), B=sample(letters, 16), C=sample(letters, 20), D=sample(letters, 22), E=sample(letters, 18))
> OLlist5 <- overLapper(setlist=setlist5, sep="_", type="vennsets")
> counts <- sapply(OLlist5$Venn_List, length)
> # pdf("venn.pdf")
> vennPlot(counts=counts, ccol=c(rep(1,30),2), lcex=1.5, ccex=c(rep(1.5,5), rep(0.6,25),1.5))
> # dev.off()

Venn Diagram (Plot)

Figure: Venn Diagram of the five sets A to E. Unique objects: All = 26, S1 = 18, S2 = 16, S3 = 20, S4 = 22, S5 = 18.
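The set sizes and the total number of unique objects reported with the plot can be checked with base R set operations. The sketch below does not use overLapper or vennPlot. It draws five sets with the sizes shown above (18, 16, 20, 22, 18) and computes a few of the quantities a Venn diagram summarizes. The seed and the object names all_objects, core and only_A are assumptions for this example.

```r
set.seed(1)  # seed added only to make this sketch reproducible
setlist5 <- list(A = sample(letters, 18), B = sample(letters, 16),
                 C = sample(letters, 20), D = sample(letters, 22),
                 E = sample(letters, 18))

sapply(setlist5, length)                 # set sizes
all_objects <- unique(unlist(setlist5))  # union of all five sets
length(all_objects)                      # at most 26, the number of letters

core <- Reduce(intersect, setlist5)                  # members shared by all five sets
only_A <- setdiff(setlist5$A, unlist(setlist5[-1]))  # members found in A only
```

Because the sets are random samples, the individual intersection counts differ from run to run unless a seed is fixed.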

Compound Depictions with ChemmineR

> library(ChemmineR)
> data(sdfsample)
> plot(sdfsample[1], print=FALSE)

Figure: Structure depiction of compound CMP1.
