
Module 2: R Programming Basics

Topics covered in Module-2


Introduction to R and RStudio
Looping in R
R data types
Processing time computation

Facilitator: Dr Sathiya Narayanan S, VIT-Chennai - SENSE, Fall Semester 2021-22



Introduction to R and RStudio

R is a programming language and free software environment
(available freely under the GNU General Public License) for statistical
computing.
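R's predecessor, the S language, was started in 1976 at Bell
Laboratories, originally built around Fortran libraries.
R itself was created and developed by Ross Ihaka and Robert
Gentleman at the University of Auckland, New Zealand, and was
released as free software in 1995.
It is maintained by the R Core Team and distributed, together with
contributed packages, through the Comprehensive R Archive Network
(CRAN).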
It is widely used among statisticians and data miners for developing
statistical software and data analysis.


R and its libraries implement a wide variety of techniques including
linear and nonlinear modeling, classical statistical tests, time-series
analysis, regression, classification, clustering, etc.
It has a very active community that contributes user-created
packages, and only a little programming knowledge is required to get
started.
RStudio is an Integrated Development Environment (IDE) for R with
an advanced and more user-friendly GUI.
R is the substrate on which further features are mounted, either
through packages such as Rcmdr (R Commander) or through front ends
such as RStudio.


Looping in R

for loop:
The for loop consists of the following parts: (i) the keyword for,
followed by parentheses; (ii) an identifier between the parentheses
(say i); (iii) the keyword in, which follows the identifier; (iv) a vector
with values to loop over; and (v) a code block between braces that
is carried out once for every value in that vector.
if-else:
An if-else statement contains the same elements as an if statement,
plus two more: (i) the keyword else, placed after the first code
block; and (ii) a second block of code, contained within braces, that
is carried out if and only if the result of the condition in the
if () statement is FALSE.
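
The parts listed above can be seen together in a minimal sketch. The vector values and the variable labels are illustrative names, not part of the course material.

```r
# Classify each value of a vector as even or odd.
values <- c(3, 8, 11)       # (iv) the vector with values to loop over
labels <- character(0)      # collects one label per value

for (i in values) {         # (i) keyword for, (ii) identifier i, (iii) keyword in
  if (i %% 2 == 0) {        # condition of the if statement
    labels <- c(labels, paste(i, "is even"))
  } else {                  # carried out only when the condition is FALSE
    labels <- c(labels, paste(i, "is odd"))
  }
}

print(labels)
# [1] "3 is odd"  "8 is even" "11 is odd"
```

Note that else is written on the same line as the closing brace of the first block; at the top level of a script, R would otherwise treat the if statement as already complete.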


R data types
R has a wide variety of data types including (but not limited to) the
following. Refer to Figure 10.

Basic data types: numeric (double precision), integer, character,
logical and complex.
Vectors: Defined with the c() function.
Matrices: Defined with the matrix(c(), ...) function.
Lists: An ordered collection of components that may have different
types (numeric, character, etc.) and different lengths. Defined with
the list() function.
Data frames: A list of vectors of usually different types but of the
same length. Defined with the data.frame() function.

Matrices and data frames can be extended by columns or by rows using
the cbind and rbind functions.
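
As a rough illustration of the constructors named above, the following sketch builds one object of each kind and then extends a matrix and a data frame. All variable names and values are made up for the example.

```r
v  <- c(10, 20, 30)                              # numeric vector
m  <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)      # 2 x 3 matrix, filled column by column
l  <- list(id = 1L, name = "a", flags = c(TRUE, FALSE))  # components of different types
df <- data.frame(id = 1:3, score = v)            # columns of equal length

m2  <- rbind(m, c(7, 8, 9))                      # add a row: now 3 x 3
df2 <- cbind(df, passed = df$score > 15)         # add a logical column

print(class(v))     # [1] "numeric"
print(dim(m2))      # [1] 3 3
print(names(df2))   # [1] "id"     "score"  "passed"
```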

Figure 10: R Data Types. Source:
https://medium.com/@tiwarigaurav2512/r-data-types-847fffb01d5b


Processing time computation


proc.time() determines how much real and CPU time (in seconds) the
currently running R process has already taken.
It returns five elements for backwards compatibility, but its print
method prints a named vector of length 3.
The first two entries are the total user and system CPU times of the
current R process and any child processes on which it has waited, and
the third entry is the ‘real’ elapsed time since the process was started.
The last two entries are the cumulative sum of user and system times
of any child processes spawned by it on which it has waited.
The ‘user time’ is the CPU time charged for the execution of user
instructions of the calling process. The ‘system time’ is the CPU time
charged for execution by the system on behalf of the calling process.
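
A common way to use proc.time() is to take a snapshot before and after a piece of work and subtract the two. The sketch below assumes an arbitrary loop as the work being timed; the printed times will differ from machine to machine.

```r
start <- proc.time()              # snapshot before the work

total <- 0
for (i in 1:100000) {             # some work to time
  total <- total + sqrt(i)
}

elapsed <- proc.time() - start    # difference of the two snapshots
print(elapsed)                    # prints user, system and elapsed times
print(elapsed[["elapsed"]])       # wall-clock seconds only
```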


Module-2 Summary
Introduction to R and RStudio
Looping in R: for and if-else
R data types: basic data types (numeric, character, etc), vectors,
matrices and data frames
Processing time computation using proc.time()



Revisitation: Control Statements in R

Control Structure in R
IF

if (condition) {
  # do something
} else {
  # do something else
}

Example:

x <- 1:15
value <- sample(x, 1)   # draw one number from 1 to 15
if (value <= 10) {
  print("value is at most 10")
} else {
  print("value is greater than 10")
}

if-else statement:

x <- 5
if (x > 1) {
  print("x is greater than 1")
} else {
  print("x is not greater than 1")
}

Vectorization with ifelse

ifelse(x <= 10, "x less than 10", "x greater than 10")
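
Unlike if, ifelse tests every element of a vector and returns a vector of the same length. A small sketch with an illustrative three-element vector:

```r
x <- c(4, 10, 15)
labels <- ifelse(x <= 10, "at most 10", "greater than 10")
print(labels)
# [1] "at most 10"      "at most 10"      "greater than 10"
```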

Other valid ways of writing if/else

if (sample(x, 1) < 10) {
  y <- 5
} else {
  y <- 0
}
y <- if (sample(x, 1) < 10) {
  5
} else {
  0
}

x = 10
if (x > 1 & x < 7) {
  print("x is between 1 and 7")
} else if (x > 8 & x < 15) {
  print("x is between 8 and 15")
}

[1] "x is between 8 and 15"

for
A for loop works on an iterable variable and assigns successive values
until the end of the sequence is reached.

for (i in 1:10) {
  print(i)
}
x <- c("apples", "oranges", "bananas", "strawberries")
for (i in x) {
  print(i)   # i is the element itself, not an index
}


x = c(1, 2, 3, 4, 5)
for (i in 1:5) {
  print(x[i])
}

Output:

[1] 1
[1] 2
[1] 3
[1] 4
[1] 5


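# x is assumed to be an existing vector with at least four elements
for (i in 1:4) {
  print(x[i])
}
# seq_along(x) gives the indices 1, 2, ..., length(x)
for (i in seq_along(x)) {
  print(x[i])
}
for (i in 1:4) print(x[i])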

Nested loops

m <- matrix(1:10, 2)

for (i in seq_len(nrow(m))) {
  for (j in seq_len(ncol(m))) {
    print(m[i, j])
  }
}

While

i <- 1
while (i < 10) {
  print(i)
  i <- i + 1
}

Be sure there is a way to exit out of a while loop.

Example:

x = 2.987
while (x <= 4.987) {
  x = x + 0.987
  print(c(x, x - 2, x - 1))
}

Output:

[1] 3.974 1.974 2.974
[1] 4.961 2.961 3.961
[1] 5.948 3.948 4.948

Repeat and break

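# Simulation sketch: keep generating a value until it falls within
# threshold of the expectation, then exit the loop.
# The values of expectation and threshold and the rnorm() draw are
# illustrative assumptions.
expectation <- 0
threshold <- 0.1
repeat {
  value <- rnorm(1)
  if (abs(value - expectation) <= threshold) {
    break
  }
}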

Repeat Loop:
The repeat loop is an infinite loop and is used in association with a
break statement.

# The code below shows a repeat loop:
a = 1
repeat {
  print(a)
  a = a + 1
  if (a > 4) break
}

Output:
[1] 1
[1] 2
[1] 3
[1] 4
Break Statement
A break statement is used in a loop to stop the iterations and pass
control to the first statement after the loop.

# The code below shows a break statement:
x = 1:10
for (i in x) {
  if (i == 2) {
    break
  }
  print(i)
}

Output:
[1] 1
Next

for (i in 1:20) {
  if (i %% 2 == 1) {
    next
  } else {
    print(i)
  }
}

This loop will only print even numbers and skip over odd numbers.

The next statement skips the current iteration of a loop without
terminating the loop.

x = 1:4
for (i in x) {
  if (i == 2) {
    next
  }
  print(i)
}

Output:
[1] 1
[1] 3
[1] 4
Switch Statement
❑ A switch statement tests a variable for equality against a list of
case values.

❑ The value being switched on is checked against each case in turn.
The statement is generally used to select one of several alternatives
based on a condition.

Syntax:

switch (test_expression, case1, case2, case3 .... caseN)


i=2
gk<-switch (
i,
"First",
"Second",
"Third",
"Fourth")
print (gk)

## [1] "Second"
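
switch can also select by name when the test expression is a character string, and an unnamed final argument then acts as the default. The helper describe_fruit below is hypothetical and only serves to illustrate this.

```r
# Hypothetical helper: map a fruit name to a colour.
describe_fruit <- function(fruit) {
  switch(fruit,
         apple  = "red",
         banana = "yellow",
         "unknown")            # unnamed last argument is the default
}

print(describe_fruit("banana"))   # [1] "yellow"
print(describe_fruit("kiwi"))     # [1] "unknown"

# With a numeric index that is out of range, switch returns NULL.
print(is.null(switch(5, "First", "Second")))   # [1] TRUE
```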

Scan Function
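Read data from the screen by setting the file name to "" or by calling
scan() without any argument. Because what="int" is a character string,
the values are read as character; use what=integer() to read integers.

> x <- scan("", what="int")
1: 43     # input 43 from the screen
2:
Read 1 item
> x
[1] "43"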

30
Control Structure in R
> x <- scan("", what="int")
1: 43     # input 43 from the screen
2: 22
3: 67
4:
Read 3 items
> x
[1] "43" "22" "67"

Large data can be scanned in by copy and paste, for example from
Excel.

> x <- scan()

Then use "ctrl+v" to paste the data and press Enter on an empty line to
finish. With no arguments, scan() expects numeric values.
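
The effect of the what argument can be checked without typing at the console by passing the input through scan's text argument. This is only a sketch for experimentation; quiet = TRUE suppresses the "Read 3 items" message.

```r
nums  <- scan(text = "43 22 67", quiet = TRUE)                    # default: what = double()
ints  <- scan(text = "43 22 67", what = integer(), quiet = TRUE)  # read as integers
chars <- scan(text = "43 22 67", what = "", quiet = TRUE)         # read as character

print(class(nums))    # [1] "numeric"
print(class(ints))    # [1] "integer"
print(chars)          # [1] "43" "22" "67"
```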
Example #1
Ramu wants to buy a house in an even year, preferably a leap year only. Help him
check if the year is even and a leap year.
#Solution
# Program to check if the input year is a leap year or not
year = as.integer(readline(prompt="Enter a year: "))
if((year %% 4) == 0) {
  if((year %% 100) == 0) {
    if((year %% 400) == 0) {
      print(paste(year,"is a leap year"))
    } else {
      print(paste(year,"is not a leap year"))
    }
  } else {
    print(paste(year,"is a leap year"))
  }
} else {
  print(paste(year,"is not a leap year"))
}

# Program to check if the input number is odd or even.
# A number is even if division by 2 gives a remainder of 0.
# If the remainder is 1, it is odd.
num = as.integer(readline(prompt="Enter a number: "))
if((num %% 2) == 0) {
  print(paste(num,"is Even"))
} else {
  print(paste(num,"is Odd"))
}
Example #2
Rahul is learning string manipulation and wants to concatenate strings with and
without separators. He also wants to find the string length before and after
concatenation. Can you help him?
# create two strings
string1 <- "Programiz"
string2 <- "Pro"

# using paste() to concatenate two strings
result = paste(string1, string2)
print(result)

# concatenate two strings using separator
result = paste(string1, string2, sep = "-")
print(result)

# create a string
string1 <- "Programiz"

# use nchar() to find length of string1
result <- nchar(string1)
cat("Total Length:", result)

# import stringr package
library(stringr)
string1 <- "Programiz"

# use str_length() of stringr package to find length
result <- str_length(string1)
cat("Total length:", result)
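
Note that paste() inserts a space by default, so it does not concatenate "without a separator". A minimal sketch using base R's paste0(), comparing lengths before and after concatenation (the variable names joined_space and joined_plain are illustrative):

```r
string1 <- "Programiz"
string2 <- "Pro"

joined_space <- paste(string1, string2)    # default sep = " "
joined_plain <- paste0(string1, string2)   # no separator

print(nchar(string1) + nchar(string2))     # [1] 12
print(nchar(joined_space))                 # [1] 13
print(nchar(joined_plain))                 # [1] 12
```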


Task
An Electrician joins a firm on Jan 01, 2023 with a promised
compensation of Rs.2400 per day. He was promised to be given an
annual increment of [n! * one day pay] every year; ‘n’ being the year.
Help the Electrician calculate his yearly compensation at the end of 10
years using R-statements.
R functions

A function is a set of statements organized together to perform a specific task. R
has a large number of in-built functions and the user can create their own
functions.

In R, a function is an object so the R interpreter is able to pass control to the
function, along with arguments that may be necessary for the function to
accomplish the actions.

The function in turn performs its task and returns control to the interpreter as
well as any result which may be stored in other objects.
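
A user-defined function might look like the following sketch. The name rect_area and its arguments are hypothetical and chosen only for illustration.

```r
# Hypothetical function: area of a rectangle; height defaults to 1.
rect_area <- function(width, height = 1) {
  width * height            # the value of the last expression is returned
}

a <- rect_area(3, 4)        # result stored in another object
b <- rect_area(5)           # uses the default height

print(a)                    # [1] 12
print(b)                    # [1] 5
print(class(rect_area))     # [1] "function"
```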
Task #2

A customer heads to the billing section of a supermarket after
loading his trolley with the required groceries. After his bill
was made, his child brought three additional Kit-Kats to be
billed. Help the supermarket biller append the new items to
the existing bill using R-functions.
