ggplot2 : Quick correlation matrix heatmap - R software and d... http://www.sthda.com/english/wiki/ggplot2-quick-correlation...

ggplot2 : Quick correlation matrix heatmap - R software and data visualization

Prepare the data
Compute the correlation matrix
Create the correlation heatmap with ggplot2
Get the lower and upper triangles of the correlation matrix
Finished correlation matrix heatmap
Reorder the correlation matrix
Add correlation coefficients on the heatmap

This R tutorial describes how to compute and visualize a correlation matrix using R and the ggplot2 package.

Prepare the data


The built-in mtcars data set is used :

mydata <- mtcars[, c(1,3,4,5,6,7)]
head(mydata)

##                    mpg disp  hp drat    wt  qsec
## Mazda RX4         21.0  160 110 3.90 2.620 16.46
## Mazda RX4 Wag     21.0  160 110 3.90 2.875 17.02
## Datsun 710        22.8  108  93 3.85 2.320 18.61
## Hornet 4 Drive    21.4  258 110 3.08 3.215 19.44
## Hornet Sportabout 18.7  360 175 3.15 3.440 17.02
## Valiant           18.1  225 105 2.76 3.460 20.22
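
Selecting columns by position is compact but breaks silently if the column order changes. As a minimal sketch (the names vars and mydata_by_name are introduced here only for illustration), the same subset can be selected by column name:

```r
# Select the same six mtcars columns by name instead of by position
vars <- c("mpg", "disp", "hp", "drat", "wt", "qsec")
mydata_by_name <- mtcars[, vars]

# Check that this matches the positional selection used in this tutorial
identical(mydata_by_name, mtcars[, c(1, 3, 4, 5, 6, 7)])
```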

Compute the correlation matrix


The correlation matrix can be computed using the R function cor() :

cormat <- round(cor(mydata),2)
head(cormat)

##        mpg  disp    hp  drat    wt  qsec
## mpg   1.00 -0.85 -0.78  0.68 -0.87  0.42
## disp -0.85  1.00  0.79 -0.71  0.89 -0.43
## hp   -0.78  0.79  1.00 -0.45  0.66 -0.71
## drat  0.68 -0.71 -0.45  1.00 -0.71  0.09
## wt   -0.87  0.89  0.66 -0.71  1.00 -0.17
## qsec  0.42 -0.43 -0.71  0.09 -0.17  1.00
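
cor() computes Pearson correlations by default. The sketch below illustrates two of its standard arguments, method and use. The objects cormat_spearman, mydata_na and cormat_pairwise are hypothetical names used only here, and the missing value is introduced artificially for the example:

```r
mydata <- mtcars[, c(1, 3, 4, 5, 6, 7)]

# Spearman rank correlation instead of the default Pearson coefficient
cormat_spearman <- round(cor(mydata, method = "spearman"), 2)

# Artificially introduce a missing value to illustrate the 'use' argument
mydata_na <- mydata
mydata_na[1, "mpg"] <- NA

# With the default use = "everything", every pair involving mpg becomes NA
cor(mydata_na)["mpg", "wt"]

# Pairwise deletion keeps, for each pair of variables, all complete pairs
cormat_pairwise <- round(cor(mydata_na, use = "pairwise.complete.obs"), 2)
cormat_pairwise["mpg", "wt"]
```

Which option is appropriate depends on how the missing values arose; pairwise deletion is shown here only as one possibility.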

Read more about correlation matrix data visualization : correlation data visualization in R

Create the correlation heatmap with ggplot2


The package reshape2 is required to melt the correlation matrix :

library(reshape2)
melted_cormat <- melt(cormat)
head(melted_cormat)

##   Var1 Var2 value
## 1  mpg  mpg  1.00
## 2 disp  mpg -0.85
## 3   hp  mpg -0.78
## 4 drat  mpg  0.68
## 5   wt  mpg -0.87
## 6 qsec  mpg  0.42
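
If reshape2 is not available, a similar long format can be obtained with base R. This is a sketch of an alternative, not the method used in this tutorial; melted_base is a hypothetical name:

```r
mydata <- mtcars[, c(1, 3, 4, 5, 6, 7)]
cormat <- round(cor(mydata), 2)

# as.table() followed by as.data.frame() gives one row per (row, column) pair;
# responseName sets the name of the value column (the default is "Freq")
melted_base <- as.data.frame(as.table(cormat), responseName = "value")
head(melted_base)
```

The resulting data frame has the columns Var1, Var2 and value, so it can be passed to the ggplot() calls in this tutorial in the same way as the output of melt().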

The function geom_tile() from the ggplot2 package is used to visualize the correlation matrix :

library(ggplot2)
ggplot(data = melted_cormat, aes(x=Var1, y=Var2, fill=value)) +
  geom_tile()

The default plot is not very attractive. The next sections show how to change the appearance of the heatmap.

Note that if you have a lot of data, it is preferable to use the function geom_raster(), which can be much faster.
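
As a minimal sketch of the geom_raster() variant, the example below builds a larger correlation matrix from simulated data. The data and the names sim, big_cormat and big_melted are assumptions made for illustration, and the long format is produced with base R so that the block is self-contained:

```r
library(ggplot2)

# Simulated data: 200 observations of 50 variables (hypothetical example data)
set.seed(123)
sim <- matrix(rnorm(200 * 50), ncol = 50)
colnames(sim) <- paste0("V", 1:50)

# 50 x 50 correlation matrix converted to long format with base R
big_cormat <- cor(sim)
big_melted <- as.data.frame(as.table(big_cormat), responseName = "value")

# geom_raster() draws all cells with the same size
p <- ggplot(big_melted, aes(x = Var1, y = Var2, fill = value)) +
  geom_raster()
print(p)
```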

Get the lower and upper triangles of the correlation matrix


Note that a correlation matrix contains redundant information, because it is symmetric. We'll use the functions below to set half of it to NA.

Helper functions :

# Get lower triangle of the correlation matrix
get_lower_tri <- function(cormat){
  cormat[upper.tri(cormat)] <- NA
  return(cormat)
}
# Get upper triangle of the correlation matrix
get_upper_tri <- function(cormat){
  cormat[lower.tri(cormat)] <- NA
  return(cormat)
}

Usage :

upper_tri <- get_upper_tri(cormat)
upper_tri

##      mpg  disp    hp  drat    wt  qsec
## mpg    1 -0.85 -0.78  0.68 -0.87  0.42
## disp  NA  1.00  0.79 -0.71  0.89 -0.43
## hp    NA    NA  1.00 -0.45  0.66 -0.71
## drat  NA    NA    NA  1.00 -0.71  0.09
## wt    NA    NA    NA    NA  1.00 -0.17
## qsec  NA    NA    NA    NA    NA  1.00
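
The diagonal of a correlation matrix is always 1, so it carries no information either. As a sketch of a possible variant (get_upper_tri_nodiag is a hypothetical helper, not part of this tutorial), the diag argument of lower.tri() can be used to blank the diagonal as well:

```r
# Variant of get_upper_tri() that also sets the diagonal to NA
# (get_upper_tri_nodiag is a hypothetical name used for illustration)
get_upper_tri_nodiag <- function(cormat) {
  cormat[lower.tri(cormat, diag = TRUE)] <- NA
  cormat
}

cormat <- round(cor(mtcars[, c(1, 3, 4, 5, 6, 7)]), 2)
upper_nodiag <- get_upper_tri_nodiag(cormat)

# For p variables, p * (p - 1) / 2 coefficients remain (15 for p = 6)
sum(!is.na(upper_nodiag))
```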

Finished correlation matrix heatmap


Melt the correlation data and drop the rows with NA values :

# Melt the correlation matrix
library(reshape2)
melted_cormat <- melt(upper_tri, na.rm = TRUE)
# Heatmap
library(ggplot2)
ggplot(data = melted_cormat, aes(Var2, Var1, fill = value))+
  geom_tile(color = "white")+
  scale_fill_gradient2(low = "blue", high = "red", mid = "white",
    midpoint = 0, limit = c(-1,1), space = "Lab",
    name="Pearson\nCorrelation") +
  theme_minimal()+
  theme(axis.text.x = element_text(angle = 45, vjust = 1,
    size = 12, hjust = 1))+
  coord_fixed()

In the figure above :

Negative correlations are shown in blue and positive correlations in red. The function scale_fill_gradient2() is used with the argument limit = c(-1,1) because correlation coefficients range from -1 to 1.
coord_fixed() : this function ensures that one unit on the x-axis is the same length as one unit on the y-axis.

Reorder the correlation matrix


This section describes how to reorder the correlation matrix according to the correlation coefficient. This is useful to identify hidden patterns in the matrix. Hierarchical clustering (hclust) is used to determine the order in the example below.

Helper function to reorder the correlation matrix :

reorder_cormat <- function(cormat){
  # Use correlation between variables as distance
  dd <- as.dist((1-cormat)/2)
  hc <- hclust(dd)
  cormat <- cormat[hc$order, hc$order]
  return(cormat)
}
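
To make the distance transformation concrete, here is a small sketch on a hand-made 3 x 3 correlation matrix. The matrix toy and its values are hypothetical and chosen only for illustration:

```r
# Small hand-made correlation matrix (hypothetical values for illustration)
toy <- matrix(c( 1.0,  0.9, -0.8,
                 0.9,  1.0, -0.7,
                -0.8, -0.7,  1.0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# (1 - r) / 2 maps r = 1 to distance 0 and r = -1 to distance 1,
# so strongly positively correlated variables end up close together
dd <- as.dist((1 - toy) / 2)
dd

# Hierarchical clustering on these distances; hc$order is the leaf order
hc <- hclust(dd)
toy[hc$order, hc$order]
```

Here a and b (r = 0.9) are at distance 0.05, while a and c (r = -0.8) are at distance 0.9.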

Reordered correlation data visualization :

# Reorder the correlation matrix
cormat <- reorder_cormat(cormat)
upper_tri <- get_upper_tri(cormat)
# Melt the correlation matrix
melted_cormat <- melt(upper_tri, na.rm = TRUE)
# Create a ggheatmap
ggheatmap <- ggplot(melted_cormat, aes(Var2, Var1, fill = value))+
  geom_tile(color = "white")+
  scale_fill_gradient2(low = "blue", high = "red", mid = "white",
    midpoint = 0, limit = c(-1,1), space = "Lab",
    name="Pearson\nCorrelation") +
  theme_minimal()+ # minimal theme
  theme(axis.text.x = element_text(angle = 45, vjust = 1,
    size = 12, hjust = 1))+
  coord_fixed()
# Print the heatmap
print(ggheatmap)

Add correlation coefficients on the heatmap


1. Use geom_text() to add the correlation coefficients on the graph
2. Use a blank theme (remove axis labels, panel grids and background, and axis ticks)
3. Use guides() to change the position of the legend title

ggheatmap +
  geom_text(aes(Var2, Var1, label = value), color = "black", size = 4) +
  theme(
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    panel.grid.major = element_blank(),
    panel.border = element_blank(),
    panel.background = element_blank(),
    axis.ticks = element_blank(),
    legend.justification = c(1, 0),
    legend.position = c(0.6, 0.7),
    legend.direction = "horizontal")+
  guides(fill = guide_colorbar(barwidth = 7, barheight = 1,
    title.position = "top", title.hjust = 0.5))
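
When a numeric column is used as the label, trailing zeros are typically dropped (1.00 is drawn as 1). The sketch below shows one possible way to control this by formatting the labels with sprintf() before plotting. It is a self-contained variant, not the tutorial's own code: the names melted and label are hypothetical, and the long format is built with base R instead of reshape2:

```r
library(ggplot2)

# Upper triangle of the correlation matrix in long format (base R)
cormat <- round(cor(mtcars[, c(1, 3, 4, 5, 6, 7)]), 2)
cormat[lower.tri(cormat)] <- NA
melted <- as.data.frame(as.table(cormat), responseName = "value")
melted <- melted[!is.na(melted$value), ]

# Format the labels explicitly so that every coefficient has two decimals
melted$label <- sprintf("%.2f", melted$value)

p <- ggplot(melted, aes(Var2, Var1, fill = value)) +
  geom_tile(color = "white") +
  geom_text(aes(label = label), color = "black", size = 4) +
  scale_fill_gradient2(low = "blue", high = "red", mid = "white",
                       midpoint = 0, limits = c(-1, 1)) +
  coord_fixed()
print(p)
```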

Read more about correlation matrix data visualization : correlation data visualization in R

Infos

This analysis has been performed using R software (ver. 3.2.1) and ggplot2 (ver. 1.0.1)
