Lecture 8: graphing in R
and intro to ggplot

Ben Fanson

Simeon Lisovski

Lecture Outline

1) introduction to R graphics

2) introduction to ggplot

Helpful references

- http://www.cookbook-r.com/Graphs/

- ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham

R graphics

Pros

1) You can make almost any graph that you can think of

2) Graphics are publishable quality

3) Combined with the previous programming learned, you can 'easily' make

very complex graphs to visualize your data and statistical models

4) You can make lots of graphs easily [e.g. plot for each individual]

Cons

1) it takes some effort to learn the language and quirks of the graphing

approach

Overview of R main

graphics

R graphics

base plot

[original R graphics]

- plot()

- hist()

- barplot()

- pairs()

plot(...) image(...)

barplot (...)

persp(...)

pairs(...)

and lots more....

Some advantages of base plot

1) I find it the easiest way to build very customized plots, since you build the plots one element at a time

#--- example code to build a plot by each element ---#

plot.new()

points(seq(0,1,0.1),seq(0,1,0.1), pch=1:10)

axis(1,at=c(0.2,0.7))

axis(2,at=c(0.1,0.8))

mtext('xlab',1,line=2)

mtext('ylab',2,line=2)

box()

abline(0,1, col='red')

title(main='Title')   # base R; mtitle() would need the Hmisc package loaded

2) being the original, it is the most integrated with packages

base plot and methods

ds <- data.frame(x=1:10,y=rnorm(10,1:10,3))

plot(ds$y ~ ds$x)

lm_model <- lm( y ~ x, data=ds)

par(mfrow=c(2,2))

plot(lm_model)

how can plot() give you such different results?

plot() is not a single function

How does plot() work?

1) plot() looks at the class of the object it is given and then chooses another function (a 'method') to do the work

e.g. plot( y ~ x )

plot asks what class(y ~ x) is; since it is a 'formula', it runs plot.formula(), and because y and x are both numeric vectors, that makes a scatterplot

e.g. plot( lm_model )

plot asks what class(lm_model) is, and since it is 'lm', it runs the function plot.lm(), which makes four graphs by default

methods(plot)
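The dispatch described above can be sketched with a made-up class. The class name 'growth_curve' and the function plot.growth_curve() are hypothetical, invented only for illustration; everything else is base R.

```r
# class() is what plot() inspects to choose a method
ds <- data.frame(x = 1:10, y = rnorm(10, 1:10, 3))
lm_model <- lm(y ~ x, data = ds)
class(y ~ x)      # "formula", so plot() hands over to plot.formula()
class(lm_model)   # "lm", so plot() hands over to plot.lm()

# hypothetical object with its own class
curve1 <- structure(list(age = 1:5, mass = c(2, 4, 7, 9, 10)),
                    class = "growth_curve")

# hypothetical method: plot() calls this for any 'growth_curve' object
plot.growth_curve <- function(x, ...) {
  plot(x$age, x$mass, type = "b", xlab = "age", ylab = "mass", ...)
  invisible(x)
}

plot(curve1)   # dispatches to plot.growth_curve()
```

The same mechanism is how packages add their own plot() behaviour for their own object types.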

base plot and methods

Overview of R main

graphics

R graphics

grid graphics

[ alternative framework]

base plot

[original R graphics]

- plot()

- hist()

- barplot()

- pairs()

Overview of R main

graphics

R graphics

grid graphics

[ alternative framework]

base plot

[original R graphics]

lattice

- plot()

- hist()

- barplot()

- pairs()

- xyplot()

- barchart()

- wireframe()

xyplot(...)

Faceting (aka Trellising)

barchart(...)

Lattice can also do most things that base plot can

wireframe(...)
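A minimal sketch of lattice faceting, assuming the lattice package (shipped with standard R installations) is available. The data are simulated and the column names are made up for illustration.

```r
library(lattice)

set.seed(100)
ds <- data.frame(x   = rep(1:15, 2),
                 y   = rnorm(30),
                 sex = rep(c('m', 'f'), each = 15))

# 'y ~ x | sex' reads: plot y against x, one panel per level of sex
p <- xyplot(y ~ x | sex, data = ds)
print(p)   # lattice plots are drawn when the object is printed
```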

Overview of R main

graphics

R graphics

grid graphics

[ alternative framework]

base plot

[original R graphics]

lattice ggplot2

- plot()

- hist()

- barplot()

- pairs()

- ggplot() + geom_line()

- ggplot() + geom_point()

- xyplot()

- barchart()

- wireframe()

http://mandymejia.wordpress.com/2013/11/13/10-reasons-to-switch-to-ggplot-7/

ggplot(...) + geom_point(...) + facet_wrap(...)

ggmap(...) + geom_tile()

why I use ggplot?

1) I like the faceting and grouping...makes it easy to make quick, yet

complex graphs for data exploration

2) I find it easier to add a new layer

3) I like the grouping options and colour schemes in ggplot

4) You can make up your own 'theme' that you can use over and over

again

5) Lots of active development in the area
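A reusable theme (point 4) might look like the sketch below. The name theme_mine and the particular settings are hypothetical choices, not taken from the lecture.

```r
library(ggplot2)

# hypothetical reusable theme: start from theme_bw() and override a few elements
theme_mine <- theme_bw() +
  theme(legend.position  = 'bottom',
        panel.grid.minor = element_blank(),
        axis.title       = element_text(size = 12))

ds <- data.frame(x = 1:10, y = (1:10)^2)

# add the stored theme to any plot, as often as you like
p <- ggplot(ds, aes(x = x, y = y)) + geom_point() + theme_mine
print(p)
```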

cons of ggplot...

1) I find working with grid graphics more difficult than base plot. This

makes it harder to do some of those final touches on the graph.

[Note: the ggplot2 community is active, so you can often find the answer or get help easily enough]

2) no 3d plotting

3) Customising axis labels for facetted graphs can be annoying

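4) cannot do independent double axes

a) Hadley Wickham refuses to add this feature due to philosophical objections

b) though there is a workaround: ggplot2 2.2.0 and later provide sec_axis(), a secondary axis that is a transformation of the primary one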
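A minimal sketch of a secondary axis, assuming ggplot2 2.2.0 or later, where sec_axis() is available. The second axis is only a relabelling of the first (here Celsius to Fahrenheit), not an independent scale. The data are made up.

```r
library(ggplot2)

# made-up temperatures in Celsius
ds <- data.frame(day = 1:10,
                 temp_c = c(12, 14, 15, 13, 17, 19, 21, 20, 18, 16))

p <- ggplot(ds, aes(x = day, y = temp_c)) +
  geom_line() +
  scale_y_continuous(name = 'temperature (C)',
                     # right-hand axis shows the same values converted to Fahrenheit
                     sec.axis = sec_axis(~ . * 9 / 5 + 32, name = 'temperature (F)'))
print(p)
```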

Saving a graph

jpeg(filename, height=, width=, units=,res= )

jpeg('figures/test1.jpg', height=6, width=6, units='cm', res=1000)

plot(....)

dev.off()

pdf(filename, height=, width=)

pdf('figures/test1.pdf', height=6, width=6)

plot(....)

dev.off()
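A runnable version of the two patterns above, as a sketch. It writes to a temporary folder (an assumption made so the example does not depend on a 'figures/' folder existing) and assumes your R build supports the jpeg device.

```r
# the graphics device will not create a missing folder, so make it first
out_dir <- file.path(tempdir(), 'figures')
dir.create(out_dir, showWarnings = FALSE)

jpg_file <- file.path(out_dir, 'test1.jpg')
jpeg(jpg_file, height = 6, width = 6, units = 'cm', res = 300)
plot(1:10, (1:10)^2)
dev.off()   # closes the device; the file is not complete until this is called

pdf_file <- file.path(out_dir, 'test1.pdf')
pdf(pdf_file, height = 6, width = 6)   # pdf() sizes are in inches
plot(1:10, (1:10)^2)
dev.off()

file.exists(c(jpg_file, pdf_file))
```

Forgetting dev.off() is the usual reason for an empty or unreadable output file.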

Raster vs. vector graphics

Raster images

- method: based on a grid of dots (pixels). Each pixel is assigned a

colour.

- file formats: jpg, tiff, bitmap, psd

- use: best for photographs

Vector images

- method: based on mathematical equations to redraw the image

- file formats: eps, ps, pdf, ai

- use: best for drawings, logos, graphics. Much easier to do post-

processing revisions

Adobe Illustrator for post-processing

Illustrator is great for minor little touches to the graphs or collating

multiple graphs into a single page.

<< illustrator quick demo >>

Short introduction to ggplot

geoms – geometric objects [think of as plot type]

e.g. scatterplot, line graph, histogram

ggplot jargon

geom_point() geom_line() geom_bar()

aes – aesthetics are the attributes associated with each geometric

object

aesthetics

x-value = 2.4

y-value = 0.4

shape = dot

colour = black

transparency = opaque

aesthetics

x-value = c(1.7,2.4,2.7...)

y-value = c(-0.5, 0.4,0.6...)

line type = solid

colour = black

transparency = opaque
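In code, an aesthetic can either be mapped to a column (inside aes()) or set to a fixed value (outside aes()). A minimal sketch, with a made-up three-row dataset and a hypothetical 'sex' column:

```r
library(ggplot2)

ds <- data.frame(x   = c(1.7, 2.4, 2.7),
                 y   = c(-0.5, 0.4, 0.6),
                 sex = c('m', 'f', 'm'))

# inside aes(): colour is mapped to a column, so it varies with the data
p_mapped <- ggplot(ds, aes(x = x, y = y)) +
  geom_point(aes(colour = sex))

# outside aes(): colour, shape and transparency are fixed for the whole layer
p_fixed <- ggplot(ds, aes(x = x, y = y)) +
  geom_point(colour = 'black', shape = 16, alpha = 1)
```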

scales – attributes of the x-axis and y-axis [and any z-axis]

scales

continuous

ranges from -1.5 to 2.1

tick marks at every 0.5

scales

continuous

ranges from -1.0 to 1.0

tick marks at every 0.5

set.seed(100)

ds <- data.frame(x=1:10,y=rnorm(10))

ggplot(ds, aes(x=x, y=y)) + geom_point(aes(size=y))
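The range and tick marks of a scale can be set explicitly. A minimal sketch; the limits and breaks below are arbitrary choices for illustration.

```r
library(ggplot2)

set.seed(100)
ds <- data.frame(x = 1:10, y = rnorm(10))

p <- ggplot(ds, aes(x = x, y = y)) +
  geom_point(aes(size = y)) +
  # continuous y scale: fix the range and put a tick mark at every 0.5
  scale_y_continuous(limits = c(-1.5, 2), breaks = seq(-1.5, 2, by = 0.5)) +
  # continuous x scale: a tick mark at every integer
  scale_x_continuous(breaks = 1:10)
print(p)
```

Note that points falling outside 'limits' are dropped from the plot with a warning.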

facets – making separate plots broken up by one or two variables

facets

set.seed(100)

ds <- data.frame(x=1:30,y=rnorm(30), sex=rep(c('m','f'),each=15))

ggplot(ds, aes(x=x, y=y)) + geom_point(aes(size=y)) + facet_grid(.~sex)
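Faceting by two variables works the same way. In this sketch the 'trt' column is a hypothetical second grouping variable added for illustration, and facet_wrap() is shown as the one-variable alternative.

```r
library(ggplot2)

set.seed(100)
ds <- data.frame(x   = rep(1:10, 4),
                 y   = rnorm(40),
                 sex = rep(c('m', 'f'), each = 20),
                 trt = rep(rep(c('t1', 't2'), each = 10), 2))

# facet_grid(rows ~ columns): one row of panels per trt, one column per sex
p_grid <- ggplot(ds, aes(x = x, y = y)) + geom_point() + facet_grid(trt ~ sex)

# facet_wrap(): panels for a single variable, wrapped into rows as needed
p_wrap <- ggplot(ds, aes(x = x, y = y)) + geom_point() + facet_wrap(~ sex)
```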

ggplot grammar

similar to dplyr grammar, think of it as a sentence that you are building

'specify dataset' +   # ggplot(ds,...)

'specify x, y, grouping variables' +   # aes(x=,y=,col=, shape=)

'specify plot layers (e.g. point, line, stat function)' +   # geom_name()

'specify if you want faceting' +   # facet_grid()

'specify minor details/options [labels, position of legend..]'   # scale_name(), theme(), labs()
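Put together, the whole 'sentence' might read as in this sketch. The dataset and labels are made up for illustration.

```r
library(ggplot2)

ds <- data.frame(x   = rep(1:5, 2),
                 y   = c(1, 3, 2, 5, 4, 2, 2, 4, 6, 7),
                 sex = rep(c('m', 'f'), each = 5))

p <- ggplot(ds,                                      # specify dataset
            aes(x = x, y = y, col = sex)) +          # specify x, y, grouping variables
  geom_point() +                                     # specify plot layers
  geom_line() +
  facet_grid(. ~ sex) +                              # specify faceting
  labs(x = 'x value', y = 'y value', col = 'Sex') +  # minor details/options
  theme(legend.position = 'bottom')
print(p)
```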

example dataset

bird_id  sex     trt  growth_rate
1        male    t1   12.3
2        male    t2   10.3
3        male    t3   14.5
4        female  t1   14.3
5        female  t2   9.3
6        female  t3   15.6

= ds
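To follow along, the example dataset can be typed in directly. This sketch uses the lower-case column names (sex, trt, growth_rate) that the plotting code refers to.

```r
ds <- data.frame(bird_id     = 1:6,
                 sex         = rep(c('male', 'female'), each = 3),
                 trt         = rep(c('t1', 't2', 't3'), times = 2),
                 growth_rate = c(12.3, 10.3, 14.5, 14.3, 9.3, 15.6))
str(ds)
```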

ggplot( ds ) + geom_point( aes(x=sex, y= growth_rate ) )

scatterplot of data

ggplot( ds ) + geom_point( aes(x=trt, y= growth_rate ) )

scatterplot of data

ggplot( ds ) + geom_point( aes(x=trt, y= growth_rate, col=sex ) )

scatterplot of data – colour

by sex

ggplot( ds ) + geom_point( aes(x=trt, y= growth_rate, col=sex ) ) +

geom_line( aes(x=trt, y= growth_rate, col=sex, group=sex ) )

what if we want to add a line

group = sex is needed only because trt is categorical. If trt were numeric, then it would not be needed
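The grouping rule can be checked with a small sketch. The numeric column trt_num is a hypothetical recoding of t1..t3 as 1..3, added only for this comparison.

```r
library(ggplot2)

ds <- data.frame(sex         = rep(c('male', 'female'), each = 3),
                 trt         = rep(c('t1', 't2', 't3'), times = 2),
                 growth_rate = c(12.3, 10.3, 14.5, 14.3, 9.3, 15.6))

# categorical x: without group=, every trt-by-sex combination is its own group
# of one point, so geom_line() has nothing to join; group = sex fixes that
p_cat <- ggplot(ds, aes(x = trt, y = growth_rate, col = sex, group = sex)) +
  geom_point() +
  geom_line()

# numeric x: col = sex alone is enough to define the two lines
ds$trt_num <- as.numeric(sub('t', '', ds$trt))
p_num <- ggplot(ds, aes(x = trt_num, y = growth_rate, col = sex)) +
  geom_point() +
  geom_line()
```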

ggplot( ds, aes(x=trt, y= growth_rate, col=sex, group=sex ) ) +

geom_point( ) +

geom_line( )

you can move aes() to

ggplot()

ggplot( ds, aes(x=trt, y= growth_rate, col=sex, group=sex ) ) +

geom_point( ) +

geom_line( ) +

facet_grid(.~sex)

adding a facet

row ~ column

['.' just means no grouping variable]

1) I do not have to specify the dataset again for each layer

2) all the geoms adopt the same scales (no specifying x-range or y-range)

3) grouping by colour, shape, fill, etc. is easy

4) faceting is quick

5) a common language for everything (i.e. not a bunch of separate packages for different plot types)

key points so far

Learning about base plot

- introducing basics of plot()

- overlaying plots and customizing your plots

- discuss some more advanced plotting functions

What's next

Lecture 8: Hands on Section

1) get Lecture8.R from github

2) make sure that you have data/lecture7/ [same files as last week]

3) open up Lecture8.R in Rcourse_proj.Rproj

4) start working through the example and then try the exercise

Lecture 8 files
