
R: Graphics, Visualization, and Basic Data Analysis

Comma-separated values (CSV) files

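Data files come in many formats, and R provides different functions for loading them.
For a CSV file, use read.csv(). On Windows, write the path with either doubled
backslashes or forward slashes:
> data <- read.csv("C:\\Users\\admin\\Desktop\\data.csv")
or
> data <- read.csv("C:/Users/admin/Desktop/stud .csv")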
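To try read.csv() without depending on a particular folder, the following sketch writes a few rows of a built-in data set to a temporary CSV file and reads them back. The temporary file and the object name cars_sample are assumptions made for illustration; substitute the path to your own file.

```r
# Write a few rows of the built-in mtcars data to a temporary CSV file.
csv_path <- tempfile(fileext = ".csv")
write.csv(head(mtcars), csv_path, row.names = FALSE)

# Read the file back in; read.csv() treats the first line as column names.
cars_sample <- read.csv(csv_path)

# Inspect the structure of the imported data frame.
str(cars_sample)
```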

This presentation provides the most basic information to get started
producing plots in R. First of all, there is a three-line code example that
demonstrates the fundamental steps involved in producing a plot. This is followed
by a series of figures to demonstrate the range of images that R can produce. There
is also a section on the organization of R graphics giving information on where to
look for a particular function. The final section describes the different graphical
output formats that R can produce and how to obtain a particular output format.
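As a minimal sketch of those fundamental steps (not necessarily the example the presentation itself used), a plot can be produced by opening a graphics device, drawing, and closing the device. The built-in pressure data set and the temporary output file are assumptions chosen for illustration; swapping pdf() for png() or svg() selects a different output format.

```r
# Hypothetical output location, used only for this illustration.
out_file <- file.path(tempdir(), "pressure.pdf")

pdf(out_file)    # 1. open a graphics device (here, a PDF file)
plot(pressure)   # 2. draw the plot on the open device
dev.off()        # 3. close the device so the file is written
```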

Use the Animals data in R (part of the MASS package, so run library(MASS) first) to
practice accessing summary statistics.

Find the number of observations in Animals using the structure command, str(). Note that
this also gives you information about the class of the object and the types of its variables.
> str(Animals)

Output:

'data.frame': 28 obs. of 2 variables:
$ body : num 1.35 465 36.33 27.66 1.04 ...
$ brain: num 8.1 423 119.5 115 5.5 ...
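Beyond str(), a few other base functions give quick summaries. The sketch below assumes the MASS package is installed; the name n_obs is illustrative.

```r
library(MASS)            # Animals ships with the MASS package

n_obs <- nrow(Animals)   # number of observations (rows)
n_obs
dim(Animals)             # rows and columns together
summary(Animals)         # min, quartiles, median, mean, and max per variable
sapply(Animals, sd)      # standard deviation of each column
```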

You may also want to explore the relationships between variables in your data. For
example, you could investigate the correlation between body and brain weight in
Animals.
> cor(Animals$brain, Animals$body)

Output:

[1] -0.005341163
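A Pearson correlation this close to zero can be misleading when a few extreme observations dominate, which is plausible here because the body weights span several orders of magnitude. The sketch below, which assumes the MASS package is installed, compares the raw correlation with two alternatives that are less sensitive to extreme values; the variable names are illustrative.

```r
library(MASS)

r_raw  <- cor(Animals$brain, Animals$body)                       # Pearson, raw scale
r_log  <- cor(log(Animals$brain), log(Animals$body))             # Pearson, log scale
r_rank <- cor(Animals$brain, Animals$body, method = "spearman")  # rank-based

c(raw = r_raw, log = r_log, spearman = r_rank)

# A log-log scatterplot makes the overall pattern easier to see.
plot(log(Animals$body), log(Animals$brain),
     xlab = "log(body weight)", ylab = "log(brain weight)")
```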

For a more detailed look at which values in your data tend to go together, you could also
use cross tabulation. There are several ways to do this in R. Use the UScereal data (also
in the MASS package) to practice cross tabulation.
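Two common ways to cross tabulate are table() and xtabs(). The sketch below assumes the MASS package is installed and uses the mfr (manufacturer) and shelf columns of UScereal; the choice of these two variables is only an illustration.

```r
library(MASS)

# Counts of cereals by manufacturer (rows) and display shelf (columns).
shelf_by_mfr <- table(UScereal$mfr, UScereal$shelf)
shelf_by_mfr

# The same counts via the formula interface, which labels the dimensions.
xtabs(~ mfr + shelf, data = UScereal)

# Row proportions: the share of each manufacturer's cereals on each shelf.
round(prop.table(shelf_by_mfr, margin = 1), 2)
```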

Basic Data Visualization


One of the many advantages of R is its flexible and extensive set of graphical capabilities.
However, with great power comes great responsibility. This section covers some
common ways of summarizing your data visually, but some graphical choices will be
more informative than others, and some methods of visualizing your data will be actively
misleading.
For each of these examples, use one of R’s built-in datasets and then try with your own
data.
Note: If you are using the R console, by default R will open a graphics viewing window
when you create your first plot. Each later plot command you run will then overwrite the
prior plot. Unlike vectors or data frames, a base plot is not returned as an object
(myplot <- plot(x) typically stores NULL); to keep one, capture the current plot with
recordPlot() and redraw it with replayPlot(). You can also manually open new viewing
windows before running a new plot command (windows() on Windows, quartz() on Mac,
or dev.new() on any platform).
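A minimal sketch of keeping a base plot for later, using recordPlot() and replayPlot() from grDevices. The off-screen device opened with pdf(NULL) is a choice made here so the example also runs in a non-interactive session; on an interactive screen device the dev.control() call is normally unnecessary.

```r
pdf(NULL)                              # off-screen device; no file is created
dev.control(displaylist = "enable")    # allow the plot to be recorded

result <- plot(1:10)                   # draws the plot; the return value is NULL
saved_plot <- recordPlot()             # capture what is currently drawn

plot(pressure)                         # a later plot overwrites the first one
replayPlot(saved_plot)                 # redraw the captured plot
invisible(dev.off())                   # close the device
```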

A simple plot, plot(X), has each element of a numeric vector X plotted on the
y-axis and the element's index on the x-axis.
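For instance, with a short made-up vector (the values are arbitrary and only for illustration):

```r
x <- c(3, 7, 2, 9, 4)

# With a single numeric vector, the values go on the y-axis and
# their positions (1, 2, ..., length(x)) go on the x-axis.
plot(x)

# The call above is equivalent to supplying the index explicitly.
plot(seq_along(x), x, xlab = "Index")
```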

Pie Charts

Pie charts are commonly used to visualize the proportion of data belonging to
certain categories. While common in news media, pie charts are controversial
among data visualizers, statisticians, and data scientists. Experts argue that
humans have a very difficult time visually assessing the differences among
categories in a single pie chart (because it requires evaluating area rather than
length, as in a bar chart), and that comparing across pie charts is nearly impossible.
Good alternatives to a pie chart include bar charts and dot plots.
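To compare the three chart types on the same data, the sketch below draws cylinder counts from the built-in mtcars data set side by side. The data set and the panel layout are assumptions chosen for illustration.

```r
cyl_counts <- table(mtcars$cyl)   # number of cars with 4, 6, and 8 cylinders

old_par <- par(mfrow = c(1, 3))   # three panels side by side
pie(cyl_counts, main = "Pie chart")
barplot(cyl_counts, main = "Bar chart", xlab = "Cylinders", ylab = "Cars")
dotchart(as.numeric(cyl_counts), labels = names(cyl_counts),
         main = "Dot plot", xlab = "Cars")
par(old_par)                      # restore the previous layout
```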

Box Plots

Box/whisker plots summarize the distribution of data in a collapsed format, typically
showing the median of the data in each of several categories, the quartiles, and
any outliers. Violin plots are a variation on boxplots that use kernel density
estimates to show the shape of the data distribution rather than a collapsed summary.
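A minimal box plot sketch using the built-in mtcars data set (an assumption chosen for illustration). Violin plots need an add-on package and are not shown here.

```r
# Miles per gallon grouped by number of cylinders.
bp <- boxplot(mpg ~ cyl, data = mtcars,
              xlab = "Cylinders", ylab = "Miles per gallon")

# boxplot() invisibly returns the numbers it drew. Each column of bp$stats
# holds: lower whisker, lower hinge, median, upper hinge, upper whisker.
bp$stats
bp$out    # values drawn as individual outlier points
```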

Editing Base Graphics

You may have noticed that R includes a significant amount of information in
plots by default, often including axis labels and titles or colors, but that whether
and what is included varies. To edit these characteristics of plots yourself, you will
commonly pass arguments such as main, xlab, ylab, and col to plot(), or set
graphical parameters with par().
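The sketch below shows a few of the most frequently adjusted settings, using the built-in mtcars data set. The titles, labels, and colour are arbitrary choices for illustration.

```r
plot(mtcars$wt, mtcars$mpg,
     main = "Fuel economy by weight",   # plot title
     xlab = "Weight (1000 lbs)",        # x-axis label
     ylab = "Miles per gallon",         # y-axis label
     col  = "steelblue",                # point colour
     pch  = 19)                         # solid circles

# par() changes settings for subsequent plots; save and restore the old values.
old_par <- par(mar = c(4, 4, 2, 1), las = 1)
hist(mtcars$mpg, main = "Distribution of mpg", xlab = "Miles per gallon")
par(old_par)
```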

More Advanced Base Graphics

In addition to having built-in data, R also has built-in demos for some of its more
advanced graphics options. You can walk through them with the demo() function,
for example demo(graphics).
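A short sketch of finding and running the demos. The guard on interactive() is an addition made here so the code can also be sourced from a script; in the console you can call demo(graphics) directly.

```r
# List the demos shipped with the graphics package.
graphics_demos <- demo(package = "graphics")
graphics_demos$results[, "Item"]

# Run one of them; guarded so it only starts in an interactive session,
# where R pauses between plots.
if (interactive()) {
  demo(graphics)
}
```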

Even More Advanced Graphics: ggplot2


In addition to base graphics, R also has several packages that you can use to make
even more advanced graphics. ggplot2 is one popular option. Created by Hadley
Wickham (chief scientist at RStudio, now Posit, and creator of the tidyverse, which
we will explore later), ggplot2 has a different syntax from base plot commands,
which can make it challenging to master, but the payoffs are great.
The quickest entry point to ggplot2 is the qplot() function. This is ggplot2’s analog
to base plot() and accepts many options, although recent ggplot2 releases deprecate
qplot() in favor of the full ggplot() interface.
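A minimal ggplot2 sketch, assuming the package is installed (install.packages("ggplot2")). It uses the full ggplot() interface rather than qplot(), and the built-in mtcars data set and labels are chosen only for illustration.

```r
# Run only if ggplot2 is available on this machine.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Build the plot in layers: data and aesthetic mapping first, then a geom.
  p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
    geom_point() +
    labs(title = "Fuel economy by weight",
         x = "Weight (1000 lbs)", y = "Miles per gallon",
         colour = "Cylinders")

  print(p)   # a ggplot object is drawn when it is printed
}
```

Unlike a base plot, the ggplot object p can be stored, modified by adding further layers, and printed again later.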
