Using Functions in R
A function is a piece of code written to carry out a
specified task. It may or may not accept arguments (also called
parameters), and it may or may not return one or more values.
Generically, its arguments constitute the input and its
return values the output.
Functions
R comes with many functions that you can use to do sophisticated tasks
like random sampling.
For example:
You can round a number with the round function, or calculate its
factorial with the factorial function.
Using a function is pretty simple. Just write the name of the function
and then the data you want the function to operate on in
parentheses:
> factorial(3)
[1] 6
> round(3.1415)
[1] 3
The data that you pass into the function is called the function's
argument. The argument can be raw data, an R object, or even the
results of another R function. In this last case, R will work from the
innermost function to the outermost, as in Figure 5:
> mean(1:6)
[1] 3.5
> mean(die)
[1] 3.5
> round(mean(die))
[1] 4
>
Figure 5. When you link functions together, R will resolve them from
the innermost operation to the outermost. Here R first looks up die, then
calculates the mean of one through six, then rounds the mean.
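The innermost-to-outermost order can be made explicit by unrolling the nested call into separate steps. This is a minimal sketch; it assumes die holds the integers 1 to 6, which the text implies but does not define here, and the names nested, step1 and step2 are made up for illustration.

```r
# Assumption: die holds the six faces of a die.
die <- 1:6

# Nested call: R evaluates die, then mean(die), then round() on that result.
nested <- round(mean(die))

# The same computation written as explicit steps.
step1 <- mean(die)     # 3.5
step2 <- round(step1)  # 4

identical(nested, step2)  # TRUE
```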
There is an R function that can help roll the die. You can simulate a roll
of the die with R's sample function. sample takes two arguments: a vector
named x and a number named size. sample will return size elements from the
vector:
> sample(x = 1:4, size = 2)
[1] 1 4
> sample(x = 1:4, size = 1)
[1] 4
> sample(x = 1:4, size = 4)
[1] 1 4 2 3
>
To roll your die and get a number back, set x to die and sample one
element from it. You'll get a new (maybe different) number each time
you roll it:
> sample(x = die, size = 1)
[1] 4
> sample(x = die, size = 1)
[1] 4
> sample(x = die, size = 1)
[1] 5
> sample(x = die, size = 1)
[1] 1
> sample(x = die, size = 1)
[1] 4
>
Many R functions take multiple arguments that help them do their job.
You can give a function as many arguments as you like as long as you
separate each argument with a comma.
If you're not sure which names to use with a function, you can look up the
function's arguments with args. To do this, place the name of the function in the
parentheses behind args. For example, you can see that the round function takes
two arguments, one named x and one named digits:
> args(round)
function (x, digits = 0)
NULL
>
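Once args has shown the argument names, you can use them in a call. A small sketch using the two arguments of round reported above; the variable names are only for illustration.

```r
# x is matched by position, digits by name.
two_places <- round(3.1415, digits = 2)
two_places   # 3.14

# With digits left out, the default digits = 0 applies.
no_places <- round(3.1415)
no_places    # 3
```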
Sampling with replacement is an easy way to create independent random samples.
Each value in your sample will be a sample of size one that is independent of the
other values. This is the correct way to simulate a pair of dice:
>
> sample(die, size = 2, replace = TRUE)
[1] 4 3
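To see what replace = TRUE changes, it can help to compare it with the default behaviour of sample, which draws without replacement. This is a sketch that assumes die is 1:6; the object names are hypothetical, and the drawn values differ from run to run.

```r
die <- 1:6  # assumption: the die object used throughout the text

# Without replacement (the default), each face can be drawn only once,
# so drawing all six values gives a shuffle of 1:6 with no repeats.
no_repeat <- sample(die, size = 6)

# With replacement, every draw sees the full die again, so doubles can occur.
pair <- sample(die, size = 2, replace = TRUE)

# Drawing more values than the die has faces is only possible with replacement.
many <- sample(die, size = 10, replace = TRUE)
```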
Congratulate yourself; you've just run your first simulation in R! You now have a
method for simulating the result of rolling a pair of dice. If you want to add up
the dice, you can feed your result straight into the sum function:
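The code for this step is not shown here. A minimal sketch, assuming die is 1:6 and that the roll is saved in an object named dice (the name the following text uses); total is a made-up name.

```r
die <- 1:6  # assumption

# Roll once and keep the result.
dice <- sample(die, size = 2, replace = TRUE)
dice

# Add up the two saved values.
total <- sum(dice)
total
```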
What would happen if you call dice multiple times? Would R generate a new pair of
dice values each time? Let's give it a try:
> dice
[1] 5 5
> dice
[1] 5 5
> dice
[1] 5 5
>
Nope. Each time you call dice, R will show you the result of that one time you called sample and
saved the output to dice. R won't rerun sample(die, 2, replace = TRUE) to create a new roll of the
dice. This is a relief in a way. Once you save a set of
results to an R object, those results do not change. Programming would be quite hard if the
values of your objects changed each time you called them. However, it would be convenient to
have an object that can re-roll the dice whenever you call it. You can make such an object by
writing your own R function.
The Function Constructor
Every function in R has three basic parts:
a name, a body of code, and a set of arguments.
To make your own function, you need to replicate these parts and store
them in an R object, which you can do with the function function. To do
this, call function() and follow it with a pair of braces, {}:
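The definition itself is not shown at this point. A sketch that matches the body R prints when roll is typed without parentheses would be the following; result is a hypothetical name.

```r
# Store a function with no arguments in an object named roll.
roll <- function() {
  die <- 1:6
  dice <- sample(die, size = 2, replace = TRUE)
  sum(dice)  # last expression: this value is returned
}

result <- roll()
result
```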
> roll()
[1] 3
You can think of the parentheses as the trigger that causes R to run
the function. If you type in a function's name without the parentheses,
R will show you the code that is stored inside the function. If you type
in the name with the parentheses, R will run that code:
> roll
function() {
  die <- 1:6
  dice <- sample(die, size = 2, replace = TRUE)
  sum(dice)
}
>
The code that you place inside your function is known as the body of
the function. When you run a function in R, R will execute all of the
code in the body and then return the result of the last line of code. If
the last line of code doesn't return a value, neither will your function, so
you want to ensure that your final line of code returns a value.
One way to check this is to think about what would happen if you ran
the body of code line by line in the command line. Would R display a
result after the last line, or would it not? In the code below, the
assignments display nothing, while typing an object's name displays a result:
> dice <- sample(die, size = 2, replace = TRUE)
> two <- 1 + 1
> a <- sqrt(2)
>
> a
[1] 1.414214
> two
[1] 2
> dice
[1] 5 2
>
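The same distinction applies inside a function body. The sketch below uses two hypothetical functions, quiet_roll and loud_roll, and assumes die is 1:6. Strictly speaking, a function that ends in an assignment still returns the assigned value, but invisibly, so nothing is displayed unless you capture it.

```r
die <- 1:6  # assumption

# Last line is an assignment: calling the function displays nothing,
# although the value is returned invisibly and can be captured.
quiet_roll <- function() {
  dice <- sample(die, size = 2, replace = TRUE)
  total <- sum(dice)
}

# Last line is a bare expression: calling the function displays the sum.
loud_roll <- function() {
  dice <- sample(die, size = 2, replace = TRUE)
  sum(dice)
}

quiet_roll()          # displays nothing at the console
kept <- quiet_roll()  # the value is available once assigned
loud_roll()           # displays the sum
```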
Arguments
What if we removed one line of code from our function and changed the
name die to bones, like this?
> roll2 <- function() {
+   dice <- sample(bones, size = 2, replace = TRUE)
+   sum(dice)
+ }
Now I'll get an error when I run the function. The function needs the
object bones to do its job, but there is no object named bones to be found:
> roll2()
Error in sample(bones, size = 2, replace = TRUE) :
  object 'bones' not found
>
You can supply bones when you call roll2 if you make bones an
argument of the function. To do this, put the name bones in the
parentheses that follow function when you define roll2:
Now roll2 will work as long as you supply bones when you call the
function. You can take advantage of this to roll different types of dice
each time you call roll2. Remember, we're rolling pairs of dice:
> roll2 <- function(bones) {
+   dice <- sample(bones, size = 2, replace = TRUE)
+   sum(dice)
+ }
> roll2(bones = 1:4)
[1] 8
> roll2(bones = 1:6)
[1] 8
> roll2(bones = 1:20)
[1] 33
>
Notice that roll2 will still give an error if you do not supply a value for
the bones argument when you call roll2:
> roll2 <- function(bones) {
+   dice <- sample(bones, size = 2, replace = TRUE)
+   sum(dice)
+ }
> roll2()
Error in sample(bones, size = 2, replace = TRUE) :
argument "bones" is missing, with no default
>
You can prevent this error by giving the bones argument a default
value. To do this, set bones equal to a value when you define roll2:
> roll2 <- function(bones = 1:6) {
+   dice <- sample(bones, size = 2, replace = TRUE)
+   sum(dice)
+ }
> roll2()
[1] 8
>
You can give your functions as many arguments as you like. Just list their
names, separated by commas, in the parentheses that follow function.
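As an illustration of a function with more than one argument, the sketch below adds a second argument to the dice roller. The function name roll_n and the argument n are hypothetical and do not appear in the source.

```r
# bones: the faces of the die; n: how many dice to roll (hypothetical argument).
roll_n <- function(bones = 1:6, n = 2) {
  dice <- sample(bones, size = n, replace = TRUE)
  sum(dice)
}

two_dice <- roll_n()                     # both defaults: two six-sided dice
five_d20 <- roll_n(bones = 1:20, n = 5)  # five twenty-sided dice
```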
Figure 3. Every function in R has the same parts, and you can use
function to create these parts.
Summary of Using Functions in R, with More Examples
Functions in R
In R, according to the base docs, you define a function
with the construct:
function ( arglist ) {body}
where the code in between the curly braces is the body
of the function. Note that when using built-in functions,
the only things you need to worry about are how to
effectively communicate the correct input arguments
(arglist) and how to manage the return value(s), if any.
What are the most popular functions in R?
R has many built-in functions, and you can access many
more by installing new packages. So there's no doubt
you already use functions. This guide will show how to
write your own functions, and explain why this is
helpful for writing nice R code.
The procedure for writing any other functions is similar, involving three
key steps:
arg1, arg2, arg3: these are the arguments of the function, also
called formals. You can write a function with any number of
arguments. These can be any R object: numbers, strings, arrays,
data frames, or even pointers to other functions; anything that
is needed for the function.name function to run.
Some arguments have default values specified, such as arg3 in
our example. Arguments without a default must have a value
supplied for the function to run. You do not need to provide a
value for those arguments with a default, as the function will
use the default value.
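The template that this passage describes is not shown here. A hypothetical version might look like the sketch below: the names function.name, arg1, arg2 and arg3 follow the text, while the body and the default value of 2 are assumptions made up for illustration.

```r
# Hypothetical template: arg3 has a default, arg1 and arg2 do not.
function.name <- function(arg1, arg2, arg3 = 2) {
  new_value <- arg1 + arg2  # do something useful with the inputs
  new_value / arg3          # the last expression is the return value
}

with_default  <- function.name(1, 3)            # arg3 falls back to 2, giving 2
with_override <- function.name(1, 3, arg3 = 4)  # arg3 supplied, giving 1
```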
There are two methods for loading functions into memory:
1. Copy the function text and paste it into the console.
2. Use the source() function to load your functions from file.
Our recommendation for writing nice R code is that in most cases, you
should use the second of these options. Put your functions into a file
with an intuitive name, like plotting-fun.R and save this file within the R
folder in your project. You can then read the function into memory by
calling:
source("R/plotting-fun.R")
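A self-contained way to try this out is sketched below. It assumes a temporary file as a stand-in for R/plotting-fun.R so that it runs in any working directory; the function double_it is made up for the example.

```r
# Assumption: a temporary file stands in for R/plotting-fun.R.
fun_file <- file.path(tempdir(), "plotting-fun.R")

# Write a one-line function definition into the file.
writeLines("double_it <- function(x) x * 2", fun_file)

# source() runs the file, which defines double_it in the workspace.
source(fun_file)

doubled <- double_it(21)
doubled  # 42
```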
data.frame(a=1, b=2)
returns a data.frame with two columns and
> pow(2, 8)
[1] "2 raised to the power 8 is 256"
Here, the arguments used in the function declaration (x and
y) are called formal arguments and those used while calling
the function are called actual arguments.
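The definition of pow is not shown here. A definition along the following lines would produce the output above; the body is an assumption, and only the name pow and the formal arguments x and y come from the text.

```r
# Hypothetical definition: x and y are the formal arguments.
pow <- function(x, y) {
  result <- x^y
  print(paste(x, "raised to the power", y, "is", result))
}

# 2 and 8 are the actual arguments, matched to x and y by position.
message_text <- pow(2, 8)
```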
Built-in Function
Simple examples of built-in functions are seq(), mean(), max(),
sum(x) and paste(...). They are called directly by user-written
programs. You can refer to the most widely used R functions.
# Create a sequence of numbers from 32 to 44.
print(seq(32,44))
[1] 32 33 34 35 36 37 38 39 40 41 42 43 44
[1] 53.5
[1] 1526
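The commands behind the last two outputs are not shown. As an assumption, calls such as the following reproduce those values with mean() and sum(); the ranges 25:82 and 41:68 are chosen only because they give 53.5 and 1526, and the variable names are made up.

```r
# A sequence of numbers from 32 to 44.
seq_out <- seq(32, 44)

# Mean of the numbers from 25 to 82 (assumed input).
mean_out <- mean(25:82)
mean_out  # 53.5

# Sum of the numbers from 41 to 68 (assumed input).
sum_out <- sum(41:68)
sum_out   # 1526
```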
User-defined Function
We can create user-defined functions in R. They are specific to what a
user wants and once created they can be used like the built-in
functions. Below is an example of how a function is created and used.
# Create a function to print squares of numbers in sequence.
new.function <- function(a) {
  for (i in 1:a) {
    b <- i^2
    print(b)
  }
}
Calling a Function
> new.function(5)
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
>
Calling a Function with Argument Values (by position and by
name)
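new.function(6)
When we execute the above code against a version of new.function that
also prints a second argument b (that definition is not shown here), it
produces the following result: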
> new.function(6)
[1] 36
[1] 6
Error in print(b) : argument "b" is missing, with no default
>
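The output above is what a two-argument function prints when its second argument is omitted. The sketch below uses a hypothetical function, lazy.function, whose definition is an assumption and not taken from the source. It shows calls by position and by name, and then the missing-argument case, where R only complains once b is actually used.

```r
# Hypothetical two-argument function (not shown in the source).
lazy.function <- function(a, b) {
  print(a^2)
  print(a)
  print(b)  # b is first evaluated here, so a missing b fails at this line
}

# Arguments matched by position, then by name; both calls print 36, 6, 2.
lazy.function(6, 2)
lazy.function(b = 2, a = 6)

# Omitting b prints 36 and 6 before the error is raised; the error
# message is captured here instead of stopping the script.
outcome <- tryCatch(lazy.function(6), error = function(e) conditionMessage(e))
outcome
```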
Control Structures in R
As the name suggests, a control structure controls the flow of
code / commands written inside a function. A function is a set
of multiple commands written to automate a repetitive coding
task.
Example: You have 10 data sets. You want to find the mean of
the Age column present in every data set. This can be done in 2
ways: either you write the code to compute the mean 10 times or
you simply create a function and pass each data set to it.
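The second approach could be sketched as follows. The function name mean_age and the two small data frames are made up for illustration; only the Age column name comes from the example above.

```r
# Hypothetical helper: mean of the Age column of one data set.
mean_age <- function(data) {
  mean(data$Age, na.rm = TRUE)
}

# Two toy data sets stand in for the ten mentioned in the text.
data_sets <- list(
  data.frame(Age = c(21, 35, 40)),
  data.frame(Age = c(18, 22))
)

# Apply the same function to every data set instead of repeating the code.
age_means <- sapply(data_sets, mean_age)
age_means  # 32 20
```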
if (<condition>) {
  ## do something
} else {
  ## do something else
}
Example
# initialize a variable
N <- 10
# check whether this variable * 5 is > 40
if (N * 5 > 40) {
  print("This is easy!")
} else {
  print("It's not easy!")
}
[1] "This is easy!"
for: This structure is used when a loop is to be executed
a fixed number of times. It is commonly used for iterating over
the elements of an object (list, vector). Below is the syntax: