
Loading today's data

For today we will be making boxplots of some numerical data. I will be using McDonald's nutritional
information, available in the McDonaldsMenu.csv file on Canvas. To start, let's load the data:
mcd = read.csv("FILEPATH")
Remember that the file path is the location of the file on your computer. You can often drag the
file into R to see the path.
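A quick check right after loading can catch a wrong path early. The sketch below is self-contained, so it first writes a tiny stand-in CSV to a temporary file; its two rows are made-up placeholder values, not real menu data, and the column names Category and Calories are simply the ones used by the commands later in this handout.

```r
# Stand-in data: made-up values, only so the example runs on its own.
tmp = tempfile(fileext = ".csv")
writeLines(c("Category,Calories",
             "Breakfast,300",
             "Beef & Pork,500"), tmp)

# Same call as above, with the temporary file as the path.
demo = read.csv(tmp)

# Quick checks that the load worked.
head(demo)    # first few rows
str(demo)     # column names and types
nrow(demo)    # number of rows read
```

With your own file, replace tmp with the real path. If typing the path is awkward, read.csv(file.choose()) opens a file picker instead.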
Making a single boxplot
To make a single boxplot, we can use the boxplot command and give the column of data we want
to use. For example:
boxplot(mcd$Calories)
If you want it to be horizontal instead of vertical, you can do:
boxplot(mcd$Calories, horizontal = TRUE)
In addition you can add titles, axis labels and change the colors in the same way as when you made
histograms:
boxplot(mcd$Calories, horizontal = TRUE, main = "McDonald's Calories",
        col = "red", xlab = "Calories")
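If you want the numbers behind the picture, boxplot can return them instead of drawing. This is a minimal sketch with made-up calorie values standing in for mcd$Calories; the variable names cals and b are just for illustration.

```r
# Made-up calorie values, standing in for mcd$Calories.
cals = c(150, 250, 300, 340, 420, 500, 1880)

# Five-number summary: minimum, lower hinge, median, upper hinge, maximum.
fivenum(cals)

# plot = FALSE returns the values boxplot would draw, without drawing.
b = boxplot(cals, plot = FALSE)
b$stats   # whisker ends, box edges and median
b$out     # points drawn individually as outliers
```

Here the value 1880 sits far above the rest, so it shows up in b$out rather than at the end of a whisker.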
Putting boxplots side by side
Often we will want to not just display a single boxplot, but put many boxplots on the same graph
so we can compare them easily. To do this we need to specify the data column (numerical) we
want to make the boxplot of, and a data column (categorical) we want to use to separate the data.
boxplot(mcd$Calories ~ mcd$Category)
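The same formula can also be written with bare column names plus a data argument, which saves typing mcd$ twice. A minimal sketch, using a small made-up data frame (demo) in place of mcd so it runs on its own:

```r
# Made-up stand-in for mcd: two categories, three items each.
demo = data.frame(
  Category = c("Breakfast", "Breakfast", "Breakfast",
               "Beef & Pork", "Beef & Pork", "Beef & Pork"),
  Calories = c(300, 450, 520, 400, 540, 750)
)

# Same plot as boxplot(demo$Calories ~ demo$Category),
# but the columns are looked up inside demo.
b = boxplot(Calories ~ Category, data = demo)

# One box per category; by default they appear in alphabetical order.
b$names
```

With the real data this would be boxplot(Calories ~ Category, data = mcd).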
This will make a boxplot for each different category of food on the menu (e.g. breakfast, beef &
pork). Right now, however, it doesn't look great: we can only see a few of the labels for the
different boxes. To fix that, we can make the labels go perpendicular to the axis
like so:
boxplot(mcd$Calories ~ mcd$Category, las = 2)
Now this looks better, but we might notice that a lot of the labels are cut off on the bottom of the
graph. To fix this we need to adjust the margins of the graph. To do this we can use the par
command, and then redo the boxplot:
par(mar = c(9, 4, 1, 2))  # 9 is the bottom margin, 4 is left, 1 is top and 2 is right
boxplot(mcd$Calories ~ mcd$Category, las = 2)
You might want to adjust those values a bit to get the margins to be exactly how you want.
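Margins set with par stay in effect for every later plot in the same graphics window. One way to avoid surprises, sketched here with a made-up stand-in for mcd, is to keep the old settings that par returns and put them back afterwards; the name old is just for illustration.

```r
# Made-up stand-in for mcd.
demo = data.frame(
  Category = c("Breakfast", "Breakfast", "Chicken & Fish", "Chicken & Fish"),
  Calories = c(300, 450, 530, 700)
)

# par() returns the previous settings, so keep them.
old = par(mar = c(9, 4, 1, 2))
boxplot(demo$Calories ~ demo$Category, las = 2)

# Put the margins back for later plots.
par(old)
```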
Looking at fewer categories
Right now we're looking at all of the different food categories. But maybe we want to only consider
a few (e.g. breakfast, chicken & fish, and beef & pork) and not have all of the different categories on our
boxplot. To do that we can just make a subset of the data, and then make a boxplot of it. First
we can do:
sand = subset(mcd, Category == "Breakfast" | Category == "Beef & Pork" | Category
== "Chicken & Fish")
Then we can make a boxplot just like we did before:
boxplot(sand$Calories ~ sand$Category, las = 2)
However, you might notice that all the categories are still there, even though most of them don't
have any data. To fix this we need to get rid of the empty levels:
sand = droplevels(sand)

This will remove all of the levels that don't have any data points. Now we should be able to make
our boxplot:
boxplot(sand$Calories ~ sand$Category, las = 2)
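To see what droplevels actually changed, you can look at the levels before and after. The sketch below uses a small made-up data frame with Category stored as a factor; "Category X" and "Category Y" are placeholder names, and %in% is used as a shorter way of writing the three == comparisons joined by |. Note that droplevels only matters when Category is a factor: since R 4.0, read.csv reads text columns as plain character strings unless you pass stringsAsFactors = TRUE, in which case the empty boxes may not appear in the first place.

```r
# Made-up stand-in for mcd, with Category stored as a factor.
demo = data.frame(
  Category = factor(c("Breakfast", "Beef & Pork", "Chicken & Fish",
                      "Category X", "Category Y")),
  Calories = c(300, 540, 480, 250, 350)
)

# Keep only the rows whose Category is one of these three.
keep = c("Breakfast", "Beef & Pork", "Chicken & Fish")
sand_demo = subset(demo, Category %in% keep)

levels(sand_demo$Category)    # still lists all five levels
sand_demo = droplevels(sand_demo)
levels(sand_demo$Category)    # only the three kept levels
table(sand_demo$Category)     # number of items in each level
```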
Don't forget that you can add additional labels or change the orientation:
boxplot(sand$Calories ~ sand$Category, las = 2, horizontal = TRUE, main =
"McDonald's Calories")
Just remember you might need to adjust the margins to get it to look nice:
par(mar = c(3, 7, 2, 1))
boxplot(sand$Calories ~ sand$Category, las = 2, horizontal = TRUE, main =
"McDonald's Calories")
