One of the easiest and most reliable ways of getting data into R is to use CSV files.
The CSV file (Comma Separated Values file) is a widely supported file format
used to store tabular data. It uses commas to separate the different values in a
line, where each line is a row of data.
R's built-in CSV parser makes it easy to read, write, and process data from CSV
files.
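As a minimal sketch of reading a CSV file, assuming a small sample file created on the spot in a temporary location (the file and the variable names are illustrative only):

```r
# Write a small sample CSV to a temporary file (illustrative data)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,age,job,city",
             "Bob,25,Manager,Seattle",
             "Sam,30,Developer,New York"),
           csv_path)

# read.csv() treats the first line as the header by default
mydata <- read.csv(csv_path)
str(mydata)
```

The result is an ordinary data frame, so columns can be accessed as mydata$age and so on.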
Specify a File
When you specify only the filename, R assumes that the file is located in the
current working directory. If it is somewhere else, specify the full path to the
file.
Remember: when you specify the exact path, characters prefaced by \ (such as \n, \r
and \t) are interpreted as special characters.
You can avoid this by:
Changing the backslashes to forward slashes, like: "C:/data/myfile.csv"
Using double backslashes, like: "C:\\data\\myfile.csv"
# or like this
mydata <- read.csv("C:\\data\\mydata.csv")
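To see why both spellings work, here is a small sketch that only inspects the path strings and does not touch the file system (the variable names are illustrative):

```r
# Both spellings denote the same path; each doubled backslash in the
# source code is an escape sequence for one backslash character.
p_forward  <- "C:/data/myfile.csv"
p_backward <- "C:\\data\\myfile.csv"

nchar(p_backward)        # each escaped backslash counts as one character
cat(p_backward, "\n")    # prints the path with single backslashes

# Converting backslashes to forward slashes yields the first form
same_path <- identical(gsub("\\", "/", p_backward, fixed = TRUE), p_forward)
same_path
```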
If you want to read CSV data from the web, substitute a URL for the file name.
The read.csv() function reads directly from the remote server.
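The sketch below shows the shape of such a call. To stay runnable offline it uses a file:// URL pointing at a temporary local file as a stand-in for a remote server; the https address in the comment is hypothetical.

```r
# Stand-in for a remote file: a local CSV exposed through a file:// URL
local_csv <- tempfile(fileext = ".csv")
writeLines(c("name,age", "Bob,25", "Sam,30"), local_csv)
csv_url <- paste0("file://", normalizePath(local_csv, winslash = "/"))

# An http(s) URL is passed the same way, for example
# read.csv("https://example.com/mydata.csv")   # hypothetical address
web_data <- read.csv(csv_url)
web_data
```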
write.csv(df, "mydata.csv")
mydata.csv
"","name","age","job","city"
"1","Bob","25","Manager","Seattle"
"2","Sam","30","Developer","New York"
Notice that the write.csv() function prepends each row with a row name by default.
If you don’t want row labels in your CSV file, set row.names to FALSE.
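A short sketch of the difference, assuming a data frame like the one shown above and a temporary output file (both illustrative):

```r
df <- data.frame(name = c("Bob", "Sam"),
                 age  = c(25, 30),
                 job  = c("Manager", "Developer"),
                 city = c("Seattle", "New York"))

# Suppress the row-name column when writing
out_path <- tempfile(fileext = ".csv")
write.csv(df, out_path, row.names = FALSE)

# Inspect the raw lines that were written
csv_lines <- readLines(out_path)
print(csv_lines)
```

The header line now starts directly with "name" instead of an empty "" column.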
df
name age job city
1 Amy 20 Developer Houston
write.table(df, "mydata.csv",
append = TRUE,
sep = ",",
col.names = FALSE,
row.names = FALSE,
quote = FALSE)
mydata.csv
name,age,job,city
Bob,25,Manager,Seattle
Sam,30,Developer,New York
Amy,20,Developer,Houston
In order to get started you first need to install and load the package.
# or like this
mydata <- read.xlsx("C:\\data\\mydata.xlsx", sheetIndex = 1)
Specify Worksheet
When you use the read.xlsx() function, along with a filename you also need to specify
the worksheet that you want to import data from.
To specify the worksheet, you can pass either an integer indicating the position
of the worksheet (for example, sheetIndex=1) or the name of the worksheet (for
example, sheetName="Sheet1").
The following two lines do exactly the same thing; they both import the data in
the first worksheet (called Sheet1):
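The two calls themselves are not reproduced here. As a sketch of what such a pair looks like, assuming the xlsx package (whose read.xlsx() accepts sheetIndex and sheetName) and a throwaway workbook created on the spot; the code only runs if that package is installed:

```r
by_index <- by_name <- NULL

if (requireNamespace("xlsx", quietly = TRUE)) {
  # Build a temporary workbook with one sheet named "Sheet1" (illustrative data)
  xlsx_path <- tempfile(fileext = ".xlsx")
  xlsx::write.xlsx(data.frame(name = c("Bob", "Sam"), age = c(25, 30)),
                   xlsx_path, sheetName = "Sheet1", row.names = FALSE)

  # Select the worksheet by position ...
  by_index <- xlsx::read.xlsx(xlsx_path, sheetIndex = 1)
  # ... or by name
  by_name <- xlsx::read.xlsx(xlsx_path, sheetName = "Sheet1")
}
```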
If you want your text data interpreted as strings rather than factors, set
the stringsAsFactors parameter to FALSE.
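The effect of this parameter can be sketched without an Excel file, because read.csv() accepts the same stringsAsFactors argument. The example below uses read.csv() on inline text purely as a stand-in:

```r
csv_text <- "name,job\nBob,Manager\nSam,Developer"

# Text columns become factors when stringsAsFactors = TRUE
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
# ... and stay character vectors when it is FALSE
as_string <- read.csv(text = csv_text, stringsAsFactors = FALSE)

class(as_factor$job)
class(as_string$job)
```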
Notice that the write.xlsx() function prepends each row with a row name by default.
If you don't want row labels in your Excel file, set row.names to FALSE.
R has very strong graphics capabilities that can help you visualize your data.
Syntax
The syntax for the plot() function is:
plot(x,y,type,main,xlab,ylab,pch,col,las,bty,bg,cex,…)
Parameters
Parameter Description
… Other graphical parameters
For symbols 21 through 25, you can specify the border color using the col argument
and the fill color using the bg argument.
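A minimal sketch of a filled symbol, using the built-in pressure dataset and drawing to a temporary PDF file so that no display is needed (the colors and the file name are arbitrary choices):

```r
pch_file <- tempfile(fileext = ".pdf")
pdf(pch_file)              # draw to a file instead of the screen
plot(pressure,
     pch = 21,             # filled circle
     col = "blue",         # border color
     bg  = "yellow")       # fill color
invisible(dev.off())       # close the device to finish the file
```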
Here’s a list of all the different types that you can use.
Value Description
"p" Points
"l" Lines
"n" No plotting
For example, to create a plot with lines between data points, use type="l"; to draw
both lines and points, use type="b".
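Both variants can be sketched as follows, again with the built-in pressure dataset and a temporary PDF file as the output device (the file name is illustrative):

```r
type_file <- tempfile(fileext = ".pdf")
pdf(type_file)               # each plot() call starts a new page

plot(pressure, type = "l")   # lines only
plot(pressure, type = "b")   # both lines and points

invisible(dev.off())
```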
plot(pressure,
main = "Vapor Pressure of Mercury",
xlab = "Temperature (deg C)",
ylab = "Pressure (mm of Hg)")
The Axes Label Style
By specifying the las (label style) argument, you can change the axes label style.
This changes the orientation angle of the labels.
Value Description
1 Always horizontal
3 Always vertical
For example, to make all the axis text horizontal,
use las=1.
plot(pressure, las = 1)
The Box Type
Specify the bty (box type) argument to change the type of box around the plot area.
Value Description
"l", "7", "c", "u", or "]" Draws a shape around the plot area.
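As a short sketch, the two calls below draw an L-shaped box and no box at all. The value "n" is a standard bty setting that does not appear in the table above; the output file is a temporary PDF chosen for illustration.

```r
bty_file <- tempfile(fileext = ".pdf")
pdf(bty_file)

plot(pressure, bty = "l")   # box with only the left and bottom sides
plot(pressure, bty = "n")   # no box around the plot area

invisible(dev.off())
```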
Add a Grid
The plot() function does not draw a grid automatically, although a grid helps the
viewer read some plots. Call the grid() function after you call plot() to add
one.
plot(pressure)
grid()
Add a Legend
You can add a legend to your plot: a little box that decodes the graphic for
the viewer. Call the legend() function after you call plot().
The position of the legend can be specified using the following keywords:
"bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right" and
"center".
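A minimal sketch of a legend for two series. The second series and the legend labels are made up for illustration, and the plot goes to a temporary PDF file:

```r
legend_file <- tempfile(fileext = ".pdf")
pdf(legend_file)

plot(pressure, type = "b", col = "red")
# Illustrative second series: the same pressures at half the temperature
lines(pressure$temperature / 2, pressure$pressure, type = "b", col = "blue")

# Place the legend by keyword; labels and colors match the two series
leg <- legend("topleft",
              legend = c("Original", "Half temperature"),
              col    = c("red", "blue"),
              lty    = 1,
              pch    = 1)

invisible(dev.off())
```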
plot(pressure)
lines(pressure$temperature/2, pressure$pressure)
You can change the line type using the lty argument, and the line width
using the lwd argument.
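For instance, a dashed line three times the default width could be sketched like this (the values and the temporary output file are arbitrary choices):

```r
lty_file <- tempfile(fileext = ".pdf")
pdf(lty_file)

plot(pressure,
     type = "l",
     lty  = 2,    # dashed line
     lwd  = 3)    # three times the default width

invisible(dev.off())
```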
# Change the axis limits so that the x-axis and y-axis both range from 0 to 500
plot(pressure, ylim=c(0,500), xlim=c(0,500))
To use this parameter, you need to pass a two-element vector, specifying the
number of rows and columns. Then fill each cell in the matrix by repeatedly
calling plot.
Once your plot is complete, you need to reset your par() options. Otherwise, all
your subsequent plots will appear side by side.
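A sketch of the whole cycle, assuming the parameter in question is par()'s mfrow setting and using a temporary PDF file as the device. Saving the value returned by par() makes the reset straightforward:

```r
grid_file <- tempfile(fileext = ".pdf")
pdf(grid_file)

# 1 row, 2 columns; par() returns the previous settings
old_par <- par(mfrow = c(1, 2))

plot(pressure, type = "p")   # fills the left cell
plot(pressure, type = "l")   # fills the right cell

# Restore the previous layout so later plots are not placed side by side
par(old_par)
mfrow_after <- par("mfrow")

invisible(dev.off())
```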
myPlot.png