Pipes/Apply in R
We will use the %>% operator, which is part of the magrittr package.
With the piping operator %>%, we can restructure a nested
statement as follows:
m %>% log() %>% diff() %>% exp() %>% round(1)
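As a rough illustration of that restructuring, the sketch below compares a nested call with its piped equivalent. The vector m is a hypothetical input chosen only for the example, and the nested form is an assumption about what the original statement looked like.

```r
library(magrittr)

# Hypothetical input vector, chosen only for illustration
m <- c(1, 2, 4, 8, 16)

# Nested form: has to be read from the inside out
nested <- round(exp(diff(log(m))), 1)

# Piped form: the same steps, read from left to right
piped <- m %>% log() %>% diff() %>% exp() %>% round(1)

# Both forms run the same operations in the same order
identical(nested, piped)
```

Each pipe step passes its result as the first argument of the next call, so round(1) is evaluated as round(<previous result>, 1).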
Piping
Four reasons why you should be using pipes in R:
● You'll structure the sequence of your data operations from left to right,
as opposed to from the inside out
● You'll minimize the need for local variables and function definitions
mean(select(filter(mpg,model=="a4"),displ)$displ)
mpg %>% filter(., model=="a4") %>% select(., displ) %>% .$displ %>% mean(.)
Or
We can use the output of one statement as an input for the next statement in several places
x %>% {cos(.) * sin(.)}
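A minimal sketch of using the placeholder more than once, assuming the two function calls are meant to be multiplied. The value of x is hypothetical.

```r
library(magrittr)

x <- pi / 4  # hypothetical value for illustration

# Wrapping the right-hand side in braces lets the placeholder .
# appear in several places within one expression
res <- x %>% {cos(.) * sin(.)}

# Same result as writing the expression out by hand
all.equal(res, cos(x) * sin(x))
```

Without the braces, magrittr would also insert x as the first argument of the outermost call, which is usually not what is wanted here.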
To overwrite the left-hand side with the result of the pipeline, use the compound
assignment operator %<>%
x <- x %>% sqrt()
Becomes
x %<>% sqrt()
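A short runnable sketch of the compound assignment operator, with a hypothetical vector x:

```r
library(magrittr)

x <- c(1, 4, 9)  # hypothetical values for illustration

# Pipe x into sqrt() and assign the result back to x in one step
x %<>% sqrt()

x
```

Note that %<>% is exported by magrittr itself, so it is only available after library(magrittr) has been called.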
Tee Operator
The tee operator %T>% returns the left-hand side value rather than the result of
the right-hand side operation, so the pipeline can continue with the original value.
The tee operator can come in handy in situations where you have included functions that
are used for their side effect, such as plotting with plot() or printing to a file.
> set.seed(123)
> rnorm(200) %>% matrix(ncol = 2) %T>% plot %>% colSums
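The same idea can be sketched with a side effect that does not need a graphics device. Here message() stands in for plot(); that substitution is only for illustration.

```r
library(magrittr)

set.seed(123)

sums <- rnorm(200) %>%
  matrix(ncol = 2) %T>%
  # Side effect only: report the dimensions; the return value is discarded
  { message("dim: ", paste(dim(.), collapse = " x ")) } %>%
  # colSums() receives the matrix, not the result of message()
  colSums()

length(sums)
```

With a plain %>% in place of %T>%, colSums() would receive the NULL returned by message() and fail.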
Exposing Data Variables
For functions that don’t have a data argument, such as the cor() function, it's still handy if you
can expose the variables in the data. The magrittr exposition operator %$% does this.
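A minimal sketch using magrittr's exposition operator %$%, which makes the columns of the left-hand side visible by name on the right-hand side. The built-in iris data set is used here only as an example.

```r
library(magrittr)

# cor() has no data argument, so the columns are exposed with %$%
r <- iris %$% cor(Sepal.Length, Sepal.Width)

# Equivalent call without the pipe
r_base <- cor(iris$Sepal.Length, iris$Sepal.Width)

all.equal(r, r_base)
```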
● apply(X, MARGIN, FUN) takes a data frame or matrix as input and returns a vector, list or array
- X: an array or matrix
- MARGIN: 1, 2 or c(1, 2), defining where the function is applied:
- MARGIN=1: the manipulation is performed on rows
- MARGIN=2: the manipulation is performed on columns
- MARGIN=c(1,2): the manipulation is performed on rows and columns, i.e. on every element
- FUN: the function to apply. Built-in functions like mean, median, sum, min, max and
even user-defined functions can be applied
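The three MARGIN settings can be sketched on a small matrix. M and the anonymous doubling function are hypothetical, chosen only for illustration.

```r
# 2 rows, 3 columns, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
M <- matrix(1:6, nrow = 2)

# MARGIN = 1: apply sum to each row
row_sums <- apply(M, 1, sum)

# MARGIN = 2: apply sum to each column
col_sums <- apply(M, 2, sum)

# MARGIN = c(1, 2): apply a user-defined function to every element
doubled <- apply(M, c(1, 2), function(v) v * 2)
```

Here row_sums has one entry per row, col_sums has one entry per column, and doubled keeps the shape of M.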
Apply
● lapply returns a list of the same length as X; each element of the output is the
result of applying FUN to the corresponding element of X
● lapply(X, FUN)
● Arguments:
- X: a vector or a list
- FUN: function applied to each element of X
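A small sketch of lapply on a named list. The list scores is hypothetical.

```r
# Hypothetical input: a named list of numeric vectors
scores <- list(a = 1:3, b = c(10, 20))

# Apply mean() to each element; the result is always a list
means <- lapply(scores, mean)

str(means)
```

The output keeps the names of the input, so means$a and means$b hold the mean of each vector.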
lapply
● sapply works like lapply but simplifies the result to a vector or matrix when possible
● sapply(X, FUN)
● Arguments:
- X: a vector or a list
- FUN: function applied to each element of X
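For comparison with lapply, a sketch of sapply on a hypothetical named list. Where lapply returns a list, sapply simplifies the result to a vector when every result has length one.

```r
# Hypothetical input: a named list of numeric vectors
scores <- list(a = 1:3, b = c(10, 20))

# Each call to mean() returns a single number,
# so sapply() collapses the results into a named numeric vector
res <- sapply(scores, mean)

res
```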
tapply
● The main idea is to develop reusable, structured code that is easy to maintain and to
extend.
Exercise
● Create a function modulo that returns an object x modulo 10 (use the operator %%, i.e. x %% 10)
● Create a 10x10 matrix M with values going from 2 to 200 with a step of 2. Use the function
seq to create such a matrix.