Professional Documents
Culture Documents
R Programming Cheat Sheet: Ata Tructures
R Programming Cheat Sheet: Ata Tructures
• ddply(), llply(), ldply(), etc. (1st letter = the type of • Only one connection may be open at a time. The Default basic graphic 3.3 data.table
input, 2nd = the type of output connection automatically closes if R closes or another
connection is opened. hist(df1$col1, main = 'title', xlab = 'x dt1 <- data.table(df1, key = c('1',
• plyr can be slow, most of the functionality in plyr axis label')
can be accomplished using base function or other • If table name has space, use [ ] to surround the table '2')), dt2 <- ...‡
packages, but plyr is easier to use name in the SQL string. plot(col2 ~ col1, data = df1),
aka y ~ x or plot(x, y) • Left Join
ddply • which() in R is similar to ‘where’ in SQL
Takes a data.frame, splits it according to some Included Data lattice and ggplot2 (more popular) dt1[dt2]
variable(s), performs a desired action on it and returns a R and some packages come with data included.
data.frame • Initialize the object and add layers (points, lines, ‡ Data table join requires specifying the keys for the data
List Available Datasets data() histograms) using +, map variable in the data to an
List Available Datasets in data(package = tables
llply
a Specific Package 'ggplot2')
axis or aesthetic using ‘aes’
• Can use this instead of lapply ggplot(data = df1) + geom_histogram(aes(x
• For sapply, can use laply (‘a’ is array/vector/matrix), Missing Data (NA and NULL) = col1)) Created by Arianne Colton and Sean Chen
however, laply result does not include the names. NULL is not missing, it’s nothingness. NULL is atomical data.scientist.info@gmail.com
and cannot exist within a vector. If used inside a vector, it • Normalized histogram (pdf, not relative frequency
DPLYR (for data.frame ONLY) simply disappears. histogram) Based on content from
• Basic functions: filter(), slice(), arrange(), select(), ggplot(data = df1) + geom_density(aes(x = 'R for Everyone' by Jared Lander
Check Missing Data is.na()
rename(), distinct(), mutate(), summarise(), col1), fill = 'grey50')
Avoid Using is.null() Updated: December 2, 2015