Functions in R

In-Built Functions in R
Numeric Functions

R Introduction 2
# To install a package in R
install.packages("package_name")
# To see all installed packages in R
installed.packages()
# To load (attach) a package
library(package_name)
# To bring up help
help(command)
?command
# To list the attached packages and environments on the search path
search()
# To detach a package so it is no longer available for use
detach("package:package_name", unload = TRUE)
apropos("plot") # shows all commands that contain "plot"
[1] ".rs.api.savePlotAsImage" ".rs.replayNotebookPlots" "assocplot"
[4] "barplot" "barplot.default" "biplot"
[7] "boxplot" "boxplot.default" "boxplot.matrix"
[10] "boxplot.stats" "cdplot" "coplot"
[13] "fourfoldplot" "interaction.plot" "lag.plot"
[16] "matplot" "monthplot" "mosaicplot"
[19] "plot" "plot.default" "plot.design"
[22] "plot.ecdf" "plot.function" "plot.new"
[25] "plot.spec.coherency" "plot.spec.phase" "plot.stepfun"
[28] "plot.ts" "plot.window" "plot.xy"
[31] "preplot" "qqplot" "recordPlot"
[34] "replayPlot" "savePlot" "screeplot"
[37] "spineplot" "sunflowerplot" "termplot"
[40] "ts.plot"

Combining vectors
F<-c(2,3,5,7,6)
F1<-c(F,8,9,3,2)
F1

Creating Contingency Tables
table(F1)

Sorting and ranking the data
sort(F1)
rank(F1)
rank(sort(F1))

Note that order() is different from rank(). rank() returns the rank of each element, while order() returns the positions in the original vector of the elements taken in sorted order:
a <- c(45,50,10,96)
sort(a)
[1] 10 45 50 96
order(a)
[1] 3 1 2 4
rank(a)
[1] 2 3 1 4

Declare a data.frame
n<-data.frame(x=1:5,m1=c(2,3,4,3,2))

To see the structure of the data object
str(n)

To get the statistical summary of data
summary(n)
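The difference between order() and rank() can be checked directly. The following is a minimal sketch using the same vector a as above; the variable names sorted_by_order and r are chosen here for illustration only.

```r
a <- c(45, 50, 10, 96)

# order() gives the positions that would sort the vector,
# so indexing a with it reproduces sort(a)
sorted_by_order <- a[order(a)]
print(sorted_by_order)     # 10 45 50 96

# rank() gives the rank of each element, reported in the original positions
r <- rank(a)
print(r)                   # 2 3 1 4

# For a vector without ties, applying order() twice gives the ranks
print(order(order(a)))     # 2 3 1 4
```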
Cumulative Statistics
f1<-c(3,4,5,7,6,5,4,6,10,4,5)
Cumulative Sum, Max, Min & Product
cumsum(f1)
[1] 3 7 12 19 25 30 34 40 50 54 59
cummax(f1)
[1] 3 4 5 7 7 7 7 7 10 10 10
cumprod(f1)
[1] 3 12 60 420 2520 12600 50400 302400 3024000 12096000 60480000
Objects with NA
y<-c(2,NA,7,9)
cumprod(y)
[1] 2 NA NA NA

cumsum(y)
[1] 2 NA NA NA

Descriptive Statistics
R provides a wide range of functions for obtaining summary statistics. One
method of obtaining descriptive statistics is to use the sapply( ) function
with a specified summary statistic. Possible functions used in sapply
include mean, sd, var, min, max, median, range, and quantile.
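As a minimal sketch of this approach, the example below applies summary statistics to each column of a small data frame. The data frame df and its values are hypothetical and made up purely for illustration.

```r
# Hypothetical example data: two numeric columns, one with a missing value
df <- data.frame(age    = c(23, 35, 41, 29, NA),
                 income = c(30, 52, 61, 45, 38))

# Apply one summary statistic to every column;
# na.rm = TRUE is passed on to mean() so the NA is ignored
col_means <- sapply(df, mean, na.rm = TRUE)
print(col_means)

# Other statistics are used the same way
print(sapply(df, sd, na.rm = TRUE))
print(sapply(df, range, na.rm = TRUE))   # one column per variable: min, max
```

Because each column is passed to the function as a whole vector, na.rm = TRUE behaves as expected here.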

Generating random numbers in R:

rnorm(n, mean, sd)


rnorm(10,5,1)
 [1] 4.675599 7.163377 6.605899 3.900403 5.671785 4.717104 5.207579 3.812271 6.382171
[10] 4.711080
Descriptive Statistics

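Dealing with missing values
x<-c(2,3,4,6,NA)
sapply(x, mean, na.rm=TRUE)
[1] 2 3 4 6 NaN
Note: sapply() applies mean to each element of x separately, so the NA element becomes NaN. To get the mean of the whole vector with the NA removed, use mean(x, na.rm=TRUE), which returns 3.75.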

Summary Command for Data Objects
x<-c(2,3,4,6,NA)
summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   2.00    2.75    3.50    3.75    4.50    6.00       1
Summary of Data Frames
n<-data.frame(x=1:5,m1=c(2,3,4,3,2))
summary(n)
density(n$x) # kernel density estimate of one column; used mostly for plotting
max(n)
min(n)
length(n) # number of columns
rowMeans(n)
colMeans(n)
rowSums(n)

Apply command for summaries on Rows or Columns
apply(n,1,mean)
[1] 1.5 2.5 3.5 3.5 3.5
apply(n,2,mean)
  x  m1
3.0 2.8
sapply(n,mean,na.rm=TRUE)
  x  m1
3.0 2.8

Exercise: Create a data file of 20 subjects with age and income &
perform the descriptive analysis.
apply(), sapply(), tapply() in R
• apply() is the most basic function of the apply family. sapply(), tapply() and lapply() are also commonly used.
• The apply family can be viewed as a substitute for loops.
• These functions take an input list, matrix or array and apply a function to it.
• Any function can be passed into apply().
apply() function
apply() takes a matrix or array as input (a data frame is coerced to a matrix) and applies a function over its rows, columns, or both. Any function can be passed into apply().
apply(X, MARGIN, FUN)
X: an array or matrix
MARGIN=1: the manipulation is performed on rows
MARGIN=2: the manipulation is performed on columns
MARGIN=c(1,2): the manipulation is performed on rows and columns, that is, on each element
FUN: the function to apply. Built-in functions like mean, median, sum, min, max and even user-defined functions can be applied.
apply ()
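A minimal sketch of the three MARGIN settings follows. The matrix m and the result names are made up for illustration.

```r
# A small 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

# MARGIN = 1: apply the function to each row
row_sums <- apply(m, 1, sum)
print(row_sums)    # 9 12

# MARGIN = 2: apply the function to each column
col_sums <- apply(m, 2, sum)
print(col_sums)    # 3 7 11

# MARGIN = c(1, 2): apply a user-defined function to each element
doubled <- apply(m, c(1, 2), function(v) v * 2)
print(doubled)
```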
lapply() Function
lapply(X, FUN)
Arguments:
X: A vector or an object
FUN: Function applied to each element of X

• The l in lapply() stands for list.
• The difference between lapply() and apply() lies in the output returned.
• The output of lapply() is a list. lapply() can be used for other objects like data frames and lists.
• lapply() does not need MARGIN.
lapply() Function
A simple example is converting string values to lower case with the tolower() function. Construct a character vector or matrix containing names in upper case format, then apply tolower() to each element.
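The example described above might look like the following sketch. The vector movies and its contents are assumptions chosen for illustration, since the original names are not given.

```r
# Hypothetical upper-case names
movies <- c("SPIDERMAN", "BATMAN", "VERTIGO", "CHINATOWN")

# lapply() applies tolower() to each element and returns a list
movies_lower <- lapply(movies, tolower)
str(movies_lower)

# unlist() flattens the list back into a character vector
movies_vec <- unlist(movies_lower)
print(movies_vec)
```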
sapply() Function

sapply() does the same job as lapply() but returns a vector (or a matrix) instead of a list.

sapply(X, FUN)
Arguments:
X: A vector or an object
FUN: Function applied to each element of X

Example: find the minimum speed and the minimum stopping distance of cars in the cars dataset.
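The cars example can be sketched as follows, using the built-in cars dataset (columns speed and dist). The names min_list and min_vec are chosen here for illustration; the lapply() call is included only to contrast the return types.

```r
# lapply() returns a list with one element per column
min_list <- lapply(cars, min)
str(min_list)

# sapply() simplifies the same result to a named numeric vector
min_vec <- sapply(cars, min)
print(min_vec)
```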
tapply() Function
• tapply() splits a vector into groups based on specified data, usually factor levels, and then applies the function to each group.

For example, in the mtcars dataset:

tapply(mtcars$wt, mtcars$cyl, mean)

The tapply() function first groups the cars together based on the number of cylinders they have, and then calculates the mean weight for each group.
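To make the grouping step easy to verify by hand, here is a small sketch with made-up data in the same spirit. The vectors weight and cyl are hypothetical and are not taken from mtcars.

```r
# Hypothetical weights and cylinder counts for six cars
weight <- c(2.0, 3.0, 2.5, 4.0, 3.5, 3.0)
cyl    <- c(4,   6,   4,   8,   6,   8)

# Group weight by cyl, then take the mean of each group:
#   4 cylinders: (2.0 + 2.5) / 2 = 2.25
#   6 cylinders: (3.0 + 3.5) / 2 = 3.25
#   8 cylinders: (4.0 + 3.0) / 2 = 3.50
group_means <- tapply(weight, cyl, mean)
print(group_means)
```

The result is named by the group values, so group_means["6"] picks out the mean for six-cylinder cars.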
mapply() Function
• mapply() is a multivariate version of sapply().
• It applies the specified function to the first elements of all the arguments, then to the second elements, and so on.
• For example, given one vector starting 1, 2, ... and another starting 6, 7, ..., an addition function adds 1 to 6, 2 to 7, and so on.
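A minimal sketch of such a call is shown below. The input ranges 1:5 and 6:10 are an assumption consistent with the description (1 with 6, 2 with 7), not necessarily the original example.

```r
# Add the elements of two vectors pairwise: 1+6, 2+7, 3+8, 4+9, 5+10
result <- mapply(function(x, y) x + y, 1:5, 6:10)
print(result)    # 7 9 11 13 15
```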


Summary: apply(), lapply() & sapply()

Function   Arguments              Objective                                           Input                        Output
apply()    apply(X, MARGIN, FUN)  Apply a function to the rows or columns or both     Data frame or matrix         vector, list, array
lapply()   lapply(X, FUN)         Apply a function to all the elements of the input   List, vector or data frame   list
sapply()   sapply(X, FUN)         Apply a function to all the elements of the input   List, vector or data frame   vector or matrix
Statistical Probability Functions
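R provides functions related to probability distributions. For each distribution there is a density (d), cumulative probability (p), quantile (q), and random number generator (r) function, such as dnorm(), pnorm(), qnorm(), and rnorm() for the normal distribution.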
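As an illustration for the normal distribution, the sketch below calls each of the four function types. The seed value 1234 and the variable names are arbitrary choices made here; set.seed() is used so that the random draws are reproducible.

```r
# Density of the standard normal at 0
d0 <- dnorm(0)
print(d0)

# Cumulative probability P(Z <= 1.96)
p196 <- pnorm(1.96)
print(p196)

# Quantile: the value below which 97.5% of the distribution lies
q975 <- qnorm(0.975)
print(q975)

# Random numbers: fix the seed so the same draws are produced each time
set.seed(1234)
draws <- rnorm(5, mean = 5, sd = 1)
set.seed(1234)
draws_again <- rnorm(5, mean = 5, sd = 1)
print(identical(draws, draws_again))   # TRUE
```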
Class Exercise
Explain the following functions with suitable examples (using a laptop):
1. cbind()/rbind()
2. str()/summary()
3. head()/tail()
4. order()/sort()
5. read.csv()/ read
6. colSums()/rowSums()
7. apply()/sapply()
8. hist()/boxplot()
9. var()/cov()
10. save()/load()
Create Functions
• Create a function to calculate mean and standard deviation
• Create a function to concatenate characters and numbers
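One possible way to write these two functions is sketched below. The function names mean_sd and concat, their arguments, and the sample inputs are all hypothetical choices made for illustration.

```r
# Return the mean and standard deviation of a numeric vector
# as a named vector; missing values are dropped by default
mean_sd <- function(x, na.rm = TRUE) {
  c(mean = mean(x, na.rm = na.rm), sd = sd(x, na.rm = na.rm))
}

# Concatenate a character label and a number into one string
concat <- function(label, number, sep = " ") {
  paste(label, number, sep = sep)
}

stats <- mean_sd(c(2, 4, 4, 4, 5, 5, 7, 9))
print(stats)

label <- concat("Subject", 7)
print(label)    # "Subject 7"
```

Note that sd() computes the sample standard deviation (dividing by n - 1).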
Calculating p Values (T-Test)
To calculate p values for many tests at once, assume a two-sided hypothesis test for each of a number of comparisons. In particular, three different hypotheses will be framed, each of the following form:
H0: µ1 - µ2 = 0
Ha: µ1 - µ2 ≠ 0
               Mean   Std. Dev.   Number (pop.)
Comparison 1
  Group 1      10     3           300
  Group 2      10.5   2.5         230
Comparison 2
  Group 1      12     4           210
  Group 2      13     5.3         340
Comparison 3
  Group 1      30     4.5         420
  Group 2      28.5   3           400
Calculating t values and p values
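The following sketch computes the t values and two-sided p values for the three comparisons from the summary statistics in the table. It assumes unpooled standard errors and the conservative choice of min(n1, n2) - 1 degrees of freedom; these are assumptions made here, and other choices (such as the Welch approximation) give slightly different p values.

```r
# Summary statistics for the three comparisons, one entry per comparison
m1   <- c(10, 12, 30);   m2   <- c(10.5, 13, 28.5)   # means
sd1  <- c(3, 4, 4.5);    sd2  <- c(2.5, 5.3, 3)      # standard deviations
num1 <- c(300, 210, 420); num2 <- c(230, 340, 400)   # sample sizes

# Standard error of the difference in means (unpooled variances)
se <- sqrt(sd1^2 / num1 + sd2^2 / num2)

# t statistic for H0: mu1 - mu2 = 0
t_values <- (m1 - m2) / se
print(t_values)

# Assumption: conservative degrees of freedom, min(n1, n2) - 1
df <- pmin(num1, num2) - 1

# Two-sided p value: probability in both tails beyond |t|
p_values <- 2 * pt(-abs(t_values), df = df)
print(p_values)
```

Because R arithmetic is vectorised, all three comparisons are handled in a single pass without a loop.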
T-test Using R Function
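When the raw observations are available, the built-in t.test() function performs the test directly. The sketch below uses the mtcars dataset that ships with R, not the comparison data from the table, which is only available as summary statistics. It compares fuel consumption (mpg) between automatic and manual cars (am).

```r
# Two-sample, two-sided t-test (Welch, unequal variances by default)
result <- t.test(mpg ~ am, data = mtcars)
print(result)

# Individual pieces of the result can be extracted by name
print(result$statistic)   # t value
print(result$parameter)   # degrees of freedom
print(result$p.value)     # p value
```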