12 Density Graphs

Density plots are built in ggplot2 with the geom_density geom. Only one numeric variable is needed as input.
# Libraries
library(ggplot2)
library(dplyr)
# Load dataset from GitHub
data <- read.table(
  "https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv",
  header = TRUE
)

# Make the density plot
data %>%
  filter(price < 300) %>%
  ggplot(aes(x = price)) +
  geom_density(fill = "#69b3a2", color = "#e9ecef", alpha = 0.8)
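As a rough illustration of what the curve represents: to my understanding, the default statistic behind geom_density relies on a kernel density estimate comparable to base R's density(). The sketch below uses simulated values (a hypothetical stand-in for the price column, not the Airbnb data) and checks that the area under the estimated curve is close to 1.

```r
# Hypothetical stand-in for the price column: 500 simulated positive values
set.seed(42)
price <- rexp(500, rate = 1 / 80)

# Base R kernel density estimate (Gaussian kernel, default bandwidth)
kde <- density(price)

# Area under the curve, approximated with the trapezoidal rule on the grid
area <- sum(diff(kde$x) * (head(kde$y, -1) + tail(kde$y, -1)) / 2)
print(round(area, 2))
```

Because the y axis is a density rather than a count, the height of the curve depends on the units of the x variable, while the total area stays at about 1.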
Custom with theme_ipsum
The hrbrthemes package offers a set of pre-built themes for your charts. I am personally a big fan of
theme_ipsum: it is easy to use and makes your chart look more professional:
# Libraries
library(ggplot2)
library(dplyr)
library(hrbrthemes)

# Load dataset from GitHub
data <- read.table(
  "https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/1_OneNum.csv",
  header = TRUE
)

# Make the density plot
data %>%
  filter(price < 300) %>%
  ggplot(aes(x = price)) +
  geom_density(fill = "#69b3a2", color = "#e9ecef", alpha = 0.8) +
  ggtitle("Night price distribution of Airbnb apartments") +
  theme_ipsum()
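Since hrbrthemes is a separate package that may not be installed everywhere, one possible pattern is to fall back to a theme that ships with ggplot2. The sketch below is only an assumption about how you might want to handle this; chart_theme is a hypothetical name, and theme_minimal() is just one reasonable substitute.

```r
library(ggplot2)

# Use theme_ipsum() when hrbrthemes is available, otherwise a built-in theme.
# chart_theme is a hypothetical helper name used only for this sketch.
chart_theme <- if (requireNamespace("hrbrthemes", quietly = TRUE)) {
  hrbrthemes::theme_ipsum()
} else {
  theme_minimal()
}

# Simulated values stand in for the price column used in the text
set.seed(1)
demo <- data.frame(price = rexp(300, rate = 1 / 80))

p_theme <- ggplot(demo, aes(x = price)) +
  geom_density(fill = "#69b3a2", color = "#e9ecef", alpha = 0.8) +
  chart_theme
```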
Density with geom_density
A density chart is built with the geom_density geom of ggplot2 (see the basic example above). It is possible to plot this
density upside down by mapping y to the negated density, i.e. y = -..density.. inside aes(). It is advised to use
geom_label to indicate variable names.
# Libraries
library(ggplot2)
library(hrbrthemes)
# Dummy data
data <- data.frame(
var1 = rnorm(1000),
var2 = rnorm(1000, mean=2)
)
# Chart: var1 is drawn above the axis, var2 is mirrored below it.
# No default x mapping is set because each layer supplies its own x.
p <- ggplot(data) +
  # Top
  geom_density(aes(x = var1, y = ..density..), fill = "#69b3a2") +
  geom_label(aes(x = 4.5, y = 0.25, label = "variable1"), color = "#69b3a2") +
  # Bottom
  geom_density(aes(x = var2, y = -..density..), fill = "#404080") +
  geom_label(aes(x = 4.5, y = -0.25, label = "variable2"), color = "#404080") +
  theme_ipsum() +
  xlab("value of x")
#p
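In recent ggplot2 releases (3.4.0 and later, as far as I know) the ..density.. notation used above is deprecated in favour of after_stat(density), so it may trigger a warning. The sketch below is one possible equivalent of the mirror density chart written with after_stat(); it also swaps geom_label for annotate("label", ...), which is my own substitution so that each label is drawn once rather than once per data row. theme_ipsum() is left out to keep the example free of extra dependencies.

```r
library(ggplot2)

# Dummy data, same shape as in the text
set.seed(1)
data <- data.frame(
  var1 = rnorm(1000),
  var2 = rnorm(1000, mean = 2)
)

p_mirror <- ggplot(data) +
  # Top: density of var1
  geom_density(aes(x = var1, y = after_stat(density)), fill = "#69b3a2") +
  annotate("label", x = 4.5, y = 0.25, label = "variable1", color = "#69b3a2") +
  # Bottom: negated density of var2
  geom_density(aes(x = var2, y = after_stat(-density)), fill = "#404080") +
  annotate("label", x = 4.5, y = -0.25, label = "variable2", color = "#404080") +
  xlab("value of x")
```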
Histogram with geom_histogram
Of course it is possible to apply exactly the same technique using geom_histogram instead of geom_density to get
a mirror histogram:
# Chart: same mirror layout, with histograms instead of densities.
# No default x mapping is set because each layer supplies its own x.
p <- ggplot(data) +
  geom_histogram(aes(x = var1, y = ..density..), fill = "#69b3a2") +
  geom_label(aes(x = 4.5, y = 0.25, label = "variable1"), color = "#69b3a2") +
  geom_histogram(aes(x = var2, y = -..density..), fill = "#404080") +
  geom_label(aes(x = 4.5, y = -0.25, label = "variable2"), color = "#404080") +
  theme_ipsum() +
  xlab("value of x")
#p
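When no bin setting is given, geom_histogram falls back to 30 bins and prints a message suggesting you pick a value yourself. A minimal sketch of the mirror histogram with an explicit binwidth follows; the value 0.25 is an arbitrary choice for this dummy data, and after_stat(density) is used as the newer spelling of ..density...

```r
library(ggplot2)

# Dummy data, same shape as in the text
set.seed(1)
data <- data.frame(
  var1 = rnorm(1000),
  var2 = rnorm(1000, mean = 2)
)

bin_width <- 0.25  # arbitrary choice for this illustration

p_hist <- ggplot(data) +
  # Top: histogram of var1 scaled to density
  geom_histogram(aes(x = var1, y = after_stat(density)),
                 binwidth = bin_width, fill = "#69b3a2") +
  # Bottom: negated density of var2
  geom_histogram(aes(x = var2, y = after_stat(-density)),
                 binwidth = bin_width, fill = "#404080") +
  xlab("value of x")
```

Using the same binwidth on both sides keeps the two halves directly comparable.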
Multi density chart
A multi density chart is a density chart where several groups are represented. It lets you compare their distributions.
The issue with this kind of chart is that it gets cluttered easily: groups overlap each other and the figure becomes
unreadable.
An easy workaround is to use transparency. However, it won’t solve the issue completely, and it is often better to
consider the examples suggested further in this document.
# Libraries
library(ggplot2)
library(hrbrthemes)
library(dplyr)
library(tidyr)
library(viridis)
# The diamonds dataset ships with ggplot2.

# Without transparency (left)
p1 <- ggplot(data = diamonds, aes(x = price, group = cut, fill = cut)) +
  geom_density(adjust = 1.5) +
  theme_ipsum()
#p1

# With transparency (right)
p2 <- ggplot(data = diamonds, aes(x = price, group = cut, fill = cut)) +
  geom_density(adjust = 1.5, alpha = .4) +
  theme_ipsum()
#p2
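The adjust argument used above is a multiplier on the default smoothing bandwidth: values above 1 give a smoother curve, values below 1 a wigglier one. The sketch below shows the effect with base R's density(), on the assumption that geom_density forwards adjust to the same kind of estimator.

```r
library(ggplot2)  # only needed here for the diamonds dataset

price <- diamonds$price

# Bandwidth chosen by default, and with the multiplier used in the text
bw_default <- density(price)$bw
bw_smooth  <- density(price, adjust = 1.5)$bw

# Ratio between the two bandwidths
print(bw_smooth / bw_default)
```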
Here is an example with another dataset where it works much better. The groups have very distinct distributions, so they
are easy to tell apart even on the same chart. Note that it is much better to put each group name next to its
distribution than to rely on a legend beside the chart.
data <- read.table(
  "https://raw.githubusercontent.com/zonination/perceptions/master/probly.csv",
  header = TRUE,
  sep = ","
)
data <- data %>%
  gather(key = "text", value = "value") %>%
  mutate(text = gsub("\\.", " ", text)) %>%
  mutate(value = round(as.numeric(value), 0))

# A dataframe for annotations
annot <- data.frame(
  text = c("Almost No Chance", "About Even", "Probable", "Almost Certainly"),
  x = c(5, 53, 65, 79),
  y = c(0.15, 0.4, 0.06, 0.1)
)

# Plot
data %>%
  filter(text %in% c("Almost No Chance", "About Even", "Probable", "Almost Certainly")) %>%
  ggplot(aes(x = value, color = text, fill = text)) +
  geom_density(alpha = 0.6) +
  scale_fill_viridis(discrete = TRUE) +
  scale_color_viridis(discrete = TRUE) +
  geom_text(data = annot, aes(x = x, y = y, label = text, color = text), hjust = 0, size = 4.5) +
  theme_ipsum() +
  theme(
    legend.position = "none"
  ) +
  ylab("") +
  xlab("Assigned Probability (%)")
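The reshaping step above uses gather(), which current tidyr versions mark as superseded in favour of pivot_longer(). A minimal sketch of the same wide-to-long step with pivot_longer() is shown below on a tiny made-up data frame; the column names mimic the layout I assume probly.csv has, and the values are invented for illustration.

```r
library(dplyr)
library(tidyr)

# Made-up wide data: one column per phrase, one row per respondent
wide <- data.frame(
  Almost.No.Chance = c(2, 5, 1),
  About.Even = c(50, 50, 45)
)

# One row per (phrase, value) pair, with dots in the names turned into spaces
long <- wide %>%
  pivot_longer(everything(), names_to = "text", values_to = "value") %>%
  mutate(text = gsub("\\.", " ", text))

print(long)
```

The rows come out in a different order than with gather(), which does not matter for the density plot.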
Small Multiple with facet_wrap()
Using small multiples is often the best option in my opinion. The distribution of each group becomes easy to read, and
comparing groups is still possible as long as the panels share the same X axis boundaries.
# Using small multiples
ggplot(data = diamonds, aes(x = price, group = cut, fill = cut)) +
  geom_density(adjust = 1.5) +
  theme_ipsum() +
  facet_wrap(~cut) +
  theme(
    legend.position = "none",
    panel.spacing = unit(0.1, "lines"),
    axis.ticks.x = element_blank()
  )
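If the goal is mainly to compare where each group sits along the X axis, one variation worth trying is to place the panels in a single column so that they line up vertically on a shared X axis. This is only a suggestion on my part; the sketch drops theme_ipsum() to stay dependency-free.

```r
library(ggplot2)

# One panel per cut, stacked in a single column on a shared X axis
p_facets <- ggplot(diamonds, aes(x = price, fill = cut)) +
  geom_density(adjust = 1.5) +
  facet_wrap(~cut, ncol = 1) +
  theme(legend.position = "none")
```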
Stacked density chart
Another solution is to stack the groups. This lets you see which group is the most frequent for a given value, but it
makes it hard to read the distribution of any group that is not at the bottom of the chart. The code below uses
position="fill", which stacks the groups and rescales the total to 1 at each value.
Visit data to viz for a complete explanation on this matter.
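# Stacked density plot: position = "fill" stacks the groups and
# rescales them so that they sum to 1 at each price
p <- ggplot(data = diamonds, aes(x = price, group = cut, fill = cut)) +
  geom_density(adjust = 1.5, position = "fill") +
  theme_ipsum()
#p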
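To make the difference between the two stacking modes concrete, the sketch below builds both variants. The position="fill" version matches the chart above; the position="stack" version, combined with after_stat(count), is my own addition and keeps absolute heights so that larger groups take more room.

```r
library(ggplot2)

# Proportional version: groups sum to 1 at every price
p_fill <- ggplot(diamonds, aes(x = price, fill = cut)) +
  geom_density(adjust = 1.5, position = "fill")

# Absolute version: densities scaled by group size, then stacked
p_stack <- ggplot(diamonds, aes(x = price, fill = cut)) +
  geom_density(aes(y = after_stat(count)), adjust = 1.5, position = "stack")

# Top edge of the tallest stack in each variant
top_fill  <- max(layer_data(p_fill)$ymax, na.rm = TRUE)
top_stack <- max(layer_data(p_stack)$ymax, na.rm = TRUE)
```

In both variants only the bottom group sits on a flat baseline, which is why the other groups remain harder to read.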
