
UNIT IV: DATA ANALYTICS USING R

Introduction to Data Science - Introduction to R, Getting Started - R Console,
Data types and Structures, Exploring and Visualizing Data,
Programming Structures, Functions, and Data Relationships.
Introduction to R
• R is a programming language and software environment for statistical
analysis, graphical representation and reporting.
• R was created by Ross Ihaka and Robert Gentleman at the University
of Auckland, New Zealand, and is currently developed by the R
Development Core Team.
• The language is named R after the first letter of the first names of
its two authors (Robert Gentleman and Ross Ihaka), and partly as a
play on the name of the Bell Labs language S.
Features of R

• R is a well-developed, simple and effective programming language
which includes conditionals, loops, user-defined recursive functions,
and input and output facilities.
• R has an effective data handling and storage facility.
• R provides a suite of operators for calculations on
arrays, lists, vectors and matrices.
• R provides graphical facilities for data analysis and
display, either directly on screen or printed on paper.
R-Installation Guide
• Download R from https://cran.r-project.org/
• Select Download R for Windows (these slides used R 3.4.2, 75 megabytes, 32/64 bit).
• The plotrix package is required for 3D charts. Install it from the R prompt:
• install.packages("plotrix")
R Basic Syntax
• Starting R launches the interpreter and shows the
prompt >, where you can start typing your program.
• > print("hello world")
• [1] "hello world"
• > print(2+3)
• [1] 5
• > mystring <- "hello world"
• > print(mystring)
• [1] "hello world"
• The <- symbol is the assignment operator.
R Data types

• Variables are names for reserved memory locations that store values.
• Variables are assigned R-objects, and the data type of the R-object
becomes the data type of the variable.
• R has five commonly used basic or "atomic" classes of objects
(a sixth, raw, stores bytes):
• character
• numeric (real numbers)
• integer
• complex
• logical (TRUE/FALSE)
R- Data Objects
• There are many types of R-objects. The frequently used ones are:
• Vectors
• Lists
• Matrices
• Arrays
• Factors
• Data Frames
• The simplest of these objects is the vector. Atomic vectors come in six
data types (also termed classes): logical, integer, numeric (double),
complex, character and raw.
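As a minimal sketch of the six object types listed above, the following creates one of each using base R only. The variable names (vec, lst, mat, arr, fac, df) and the sample values are chosen purely for illustration.

```r
# One small object of each kind (names and values are illustrative only)
vec <- c(1, 2, 3)                          # vector: elements of one type
lst <- list(1, "a", TRUE)                  # list: elements of mixed types
mat <- matrix(1:6, nrow = 2, ncol = 3)     # matrix: two dimensions
arr <- array(1:24, dim = c(2, 3, 4))       # array: any number of dimensions
fac <- factor(c("low", "high", "low"))     # factor: categorical values
df  <- data.frame(id = 1:3, name = c("a", "b", "c"))  # data frame: table of columns

# Inspect the structure of each object
str(vec)
str(lst)
print(dim(mat))
print(dim(arr))
print(levels(fac))
print(df)
```

str() and dim() are convenient for checking what kind of object you are holding before operating on it.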
R Data Types Example

Data Type   Example values                Verify                         Output
Logical     TRUE, FALSE                   v <- TRUE; print(class(v))     [1] "logical"
Numeric     12.3, 5, 999                  v <- 23.5; print(class(v))     [1] "numeric"
Integer     2L, 34L, 0L                   v <- 2L; print(class(v))       [1] "integer"
Complex     3 + 2i                        v <- 2+5i; print(class(v))     [1] "complex"
Character   'a', "good", "TRUE", '23.4'   v <- "TRUE"; print(class(v))   [1] "character"
Executed example in R
> v <- TRUE
> print(class(v))
[1] "logical"
> r <- 23.5
> print(class(r))
[1] "numeric"
> u <- "hello"
> print(class(u))
[1] "character"
What is the use of c()?
• The c() function (lower-case c; R is case-sensitive) combines the values
you provide explicitly into a vector.
• Example: v1 <- c(1,2,3,4,5)
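Because a vector holds a single data type, c() converts mixed inputs to a common type. The sketch below illustrates this with made-up values; the names v2, v3 and v4 are not from the slides.

```r
v1 <- c(1, 2, 3, 4, 5)     # numeric vector
v2 <- c(1, "a", TRUE)      # mixed input: everything becomes character
v3 <- c(1, TRUE, FALSE)    # logical values become 1 and 0
v4 <- c(v1, 6, 7)          # c() also appends to an existing vector

print(class(v2))
print(v3)
print(length(v4))
```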
R-Operators
• An operator is a symbol that tells the interpreter to perform a specific mathematical
or logical manipulation.
• R is rich in built-in operators and provides the following types of operators.
• Types of Operators
• Arithmetic Operators
• Relational Operators
• Logical Operators
• Assignment Operators
• Miscellaneous Operators
• Arithmetic Operators (+, -, *, /, ^, %%, %/%)
• + adds two vectors element by element:
• v <- c( 2,5.5,6)
• t <- c(8, 3, 4)
• print(v+t)
• Result : [1] 10.0 8.5 10.0
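The other arithmetic operators work element by element in the same way. A short sketch using the same two vectors follows; the result names (remainder, quotient and so on) are illustrative.

```r
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)

difference <- v - t     # element-wise subtraction
product    <- v * t     # element-wise multiplication
ratio      <- v / t     # element-wise division
remainder  <- v %% t    # remainder after division
quotient   <- v %/% t   # integer part of the division
power      <- v ^ 2     # each element squared

print(remainder)
print(quotient)
```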
Contd.. R-Operators
• Relational Operators (<,>,<=,>=,==,!=)
• Each element of the first vector is compared with the corresponding
element of the second vector.
• > Checks if each element of the first vector is greater than the
corresponding element of the second vector.
• v <- c(2,5.5,6,9)
• t <- c(8,2.5,14,9)
• print(v>t)
• Result: [1] FALSE TRUE FALSE FALSE

• != Checks if each element of the first vector is unequal to the corresponding
element of the second vector.
• v <- c(2,5.5,6,9)
• t <- c(8,2.5,14,9)
• print(v!=t)
• Result: [1] TRUE TRUE TRUE FALSE
Contd.. R-Operators
Logical Operators (&,|,!,&&,||)
• These are the logical operators supported by R.
• They are applicable only to vectors of type logical, numeric or complex.
• All non-zero numbers are treated as the logical value TRUE; zero is treated as FALSE.
• Each element of the first vector is compared with the corresponding
element of the second vector.
& (Logical AND): gives TRUE only if both elements are TRUE.
Example,
v <- c(3,1,TRUE,2+3i)
t <- c(4,1,FALSE,2+3i)
print(v&t)
[1] TRUE TRUE FALSE TRUE
Contd..Logical Operators (&,|,!,&&,||)

| (Logical OR):
• It combines each element of the first vector with the corresponding element of
the second vector and gives TRUE if at least one of the elements is TRUE.
Example,
v <- c(3,0,TRUE,2+2i)
t <- c(4,0,FALSE,2+3i)
print(v|t)
[1] TRUE FALSE TRUE TRUE
! (Logical NOT):
• Takes each element of the vector and gives the opposite logical value.
Example,
v <- c(3,0,TRUE,2+2i)
print(!v)
[1] FALSE TRUE FALSE FALSE
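The heading also lists && and ||, which the examples do not show. As a hedged sketch: these forms work on single logical values rather than element by element, and they stop evaluating as soon as the result is known. The variable names below are illustrative.

```r
a <- TRUE
b <- FALSE

and_result <- a && b    # single-value AND
or_result  <- a || b    # single-value OR
print(and_result)
print(or_result)

# Short-circuit: the right-hand side is skipped when the left side decides the result
x <- NULL
safe <- !is.null(x) && x > 0   # x > 0 is never evaluated because x is NULL
print(safe)
```

This is why && and || are the usual choice inside if() conditions, while & and | are used on whole vectors.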
Contd.. R-Operators

Assignment Operators
These operators are used to assign values to variables.
• Left assignment: <-, = and <<-
v1 <- c(3,1,TRUE,2+3i)
print(v1)
[1] 3+0i 1+0i 1+0i 2+3i
Contd.. R-Operators
• Miscellaneous Operators (: and %in%)
These operators are used for specific purposes and not for general mathematical or logical computation.
: (Colon Operator)
It creates a series of numbers in sequence for a vector.
v <- 2:8
print(v)
Result - [1] 2 3 4 5 6 7 8

%in%
This operator is used to identify if an element belongs to a vector.
v1 <- 8
v2 <- 12
t <- 1:10
print(v1 %in% t)
print(v2 %in% t)
[1] TRUE
[1] FALSE
Demonstrate R program to take input from user.
my.name <- readline(prompt="Enter name: ")
my.age <- readline(prompt="Enter age: ")

# convert character into integer
my.age <- as.integer(my.age)

print(paste("Hi,", my.name, "next year you will be", my.age+1, "years old."))

Output :
Enter name: Ram
Enter age: 17
[1] "Hi, Ram next year you will be 18 years old."
Declare Array in R programming
• Arrays are R data objects which can store data in more than two
dimensions.
• For example: if we create an array of dimension (2, 3, 1), it
creates 1 rectangular matrix with 2 rows and 3 columns.
• Arrays can store only one data type.
• An array is created using the array() function.
• It takes vectors as input and uses the values in the dim parameter to
create an array.
Declare Array in R programming – Example
R- Arrays:
# Create two vectors of different lengths.
vector1 <- c(5,9,3)
vector2 <- c(10,11,12,13,14,15)
column.names <- c("COL1","COL2","COL3")
row.names <- c("ROW1","ROW2","ROW3")
matrix.names <- c("Matrix1","Matrix2")

# Take these vectors as input to the array.
result <- array(c(vector1,vector2), dim = c(3,3,2),
                dimnames = list(row.names, column.names, matrix.names))
print(result)
Contd..Declare Array in R programming – Example
Output (the nine input values are recycled to fill all 18 cells, so Matrix2 repeats Matrix1):
, , Matrix1

     COL1 COL2 COL3
ROW1    5   10   13
ROW2    9   11   14
ROW3    3   12   15

, , Matrix2

     COL1 COL2 COL3
ROW1    5   10   13
ROW2    9   11   14
ROW3    3   12   15
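Once an array exists, elements are accessed as [row, column, matrix]. The sketch below rebuilds the same array so that it runs on its own; the names single_value, first_row and second_matrix are illustrative.

```r
# Rebuild the array from the example above
result <- array(c(5, 9, 3, 10, 11, 12, 13, 14, 15), dim = c(3, 3, 2),
                dimnames = list(c("ROW1", "ROW2", "ROW3"),
                                c("COL1", "COL2", "COL3"),
                                c("Matrix1", "Matrix2")))

single_value  <- result[3, 3, 2]              # row 3, column 3 of the second matrix
first_row     <- result["ROW1", , "Matrix2"]  # an empty index selects everything
second_matrix <- result[, , 2]                # the whole second matrix

print(single_value)
print(first_row)
print(dim(second_matrix))
```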
Control Structures in R programming

These allow you to control the flow of execution of a script typically inside of a
function. Common ones include:
if, else
for
while
repeat
break
next
return
These are mostly used inside functions and scripts rather than typed at the interactive prompt.
If

Syntax:
if (condition) {
# do something
} else {
# do something else
}
Example:
> x <- 1.5
> if (x > 1) {
+   print("x is greater than 1")
+ } else {
+   print("x is not greater than 1")
+ }
[1] "x is greater than 1"
Another way of writing a condition is the vectorized ifelse() function:
ifelse(x <= 10, "x is at most 10", "x is greater than 10")
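Unlike if, ifelse() tests every element of a vector and returns one result per element. A small sketch with made-up values:

```r
x <- c(3, 10, 15)

# One label per element of x
labels <- ifelse(x <= 10, "at most 10", "greater than 10")
print(labels)
```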
for
• A for loop iterates over a sequence (such as a vector), assigning each
successive value to the loop variable until the end of the sequence.
Example 1:
for (i in 1:4) {
  print(i)
}
Output:
[1] 1
[1] 2
[1] 3
[1] 4
Contd..for

Example 2:
x <- c("a", "b", "c", "d")
for (i in 1:4) {
  print(x[i])
}
Output:
[1] "a"
[1] "b"
[1] "c"
[1] "d"
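Hard-coding 1:4 ties the loop to the vector's length. The sketch below shows two common alternatives, seq_along() and looping over the elements directly; the names empty and count are illustrative.

```r
x <- c("a", "b", "c", "d")

# Loop over the index positions of x, whatever its length
for (i in seq_along(x)) {
  print(x[i])
}

# Loop over the elements themselves
for (letter in x) {
  print(letter)
}

# seq_along() is safe for empty vectors, whereas 1:length() is not
empty <- character(0)
count <- 0
for (i in seq_along(empty)) {
  count <- count + 1      # never runs
}
print(count)
```

With an empty vector, 1:length(empty) would produce the sequence 1, 0 and run the loop body twice.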
While
• A while loop repeats its body as long as the condition is TRUE.
i <- 1
while (i < 5) {
  print(i)
  i <- i + 1
}
Output:
[1] 1
[1] 2
[1] 3
[1] 4
repeat
• The repeat loop is an infinite loop and is used together with a
break statement.
Example:
a = 1
repeat {
  print(a)
  a = a + 1
  if(a > 4)
    break
}
Output:
[1] 1
[1] 2
[1] 3
[1] 4
break
• A break statement is used in a loop to stop the iterations and transfer control to the
first statement after the loop.
Example:
x = 1:10
for (i in x) {
  if (i == 2) {
    break
  }
  print(i)
}
Output:
[1] 1
next statement
• The next statement skips the current iteration of a loop without terminating the loop.
x = 1:4
for (i in x) {
  if (i == 2) {
    next
  }
  print(i)
}
Output:
[1] 1
[1] 3
[1] 4
Write R Program to check whether a number is positive or negative

num = as.double(readline(prompt="Enter a number: "))

# At the top level, "else" must be on the same line as the closing brace
if(num > 0) {
  print("Positive number")
} else if(num == 0) {
  print("Zero")
} else {
  print("Negative number")
}
Write a Program to Check Armstrong number
# take input from the user
num = as.integer(readline(prompt="Enter a number: "))

# initialize sum
sum = 0

# find the sum of the cube of each digit
# (cubes give the correct test for three-digit numbers such as 153)
temp = num
while(temp > 0) {
  digit = temp %% 10
  sum = sum + (digit ^ 3)
  temp = floor(temp / 10)
}

# display the result
if(num == sum) {
  print(paste(num, "is an Armstrong number"))
} else {
  print(paste(num, "is not an Armstrong number"))
}
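The program above cubes each digit, which fits three-digit inputs. As a sketch under the usual definition (each digit raised to the number of digits), the check can be wrapped in a function; is_armstrong is a hypothetical helper name, not part of the slides.

```r
# Hypothetical helper: TRUE if num equals the sum of its digits,
# each raised to the power of the number of digits
is_armstrong <- function(num) {
  num <- as.integer(num)
  n_digits <- nchar(as.character(num))
  total <- 0
  temp <- num
  while (temp > 0) {
    digit <- temp %% 10
    total <- total + digit ^ n_digits
    temp <- temp %/% 10
  }
  total == num
}

print(is_armstrong(153))
print(is_armstrong(9474))
print(is_armstrong(123))
```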
Write R Program to Find the Sum of Digits
# take input from the user
num = as.integer(readline(prompt="Enter a number: "))

# initialize sum
sum = 0

temp = num

# add the last digit, then drop it, until no digits remain
while(temp > 0) {
  digit = temp %% 10
  sum = sum + digit
  temp = floor(temp / 10)
}
# display the result
print(paste("The sum of the digits is", sum))
Write R Program to Reverse a Number
# take input from the user
num = as.integer(readline(prompt="Enter a number: "))

reverse = 0
temp = num

# move the last digit of temp to the end of reverse
while(temp > 0) {
  digit = temp %% 10
  reverse = reverse*10 + digit
  temp = floor(temp / 10)
}
# display the result
print(paste("The reversed number is", reverse))
Logic of Armstrong Number, Sum of Digits, Reverse Number
• Armstrong Number
while(temp > 0) {
digit = temp %% 10
sum = sum + (digit ^ 3)
temp = floor(temp / 10)
}
• Sum of Digits
while(temp > 0) {
digit = temp %% 10
sum = sum + digit
temp = floor(temp / 10)
}

• Reverse Number
while(temp > 0) {
digit = temp %% 10
reverse = reverse*10 + digit
temp = floor(temp / 10)
}
Write R Program to Check Palindrome Number
# take input from the user
num = as.integer(readline(prompt="Enter a number: "))

reverse = 0
temp = num

# build the reversed number digit by digit
while(temp > 0) {
  digit = temp %% 10
  reverse = reverse*10 + digit
  temp = floor(temp / 10)
}
# display the result (== compares; a single = would be an error here)
if(num == reverse) {
  print(paste(num, "is a Palindrome"))
} else {
  print(paste(num, "is not a Palindrome"))
}
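Since the reverse and palindrome programs share the same loop, one option is to put that loop in a function and reuse it. This is only a sketch; reverse_number and is_palindrome are hypothetical names, and the inputs are assumed to be non-negative whole numbers.

```r
# Hypothetical helper: reverse the digits of a non-negative whole number
reverse_number <- function(num) {
  reversed <- 0
  temp <- num
  while (temp > 0) {
    reversed <- reversed * 10 + temp %% 10
    temp <- temp %/% 10
  }
  reversed
}

# A number is a palindrome if it equals its own reverse
is_palindrome <- function(num) {
  num == reverse_number(num)
}

print(reverse_number(1234))
print(is_palindrome(121))
print(is_palindrome(123))
```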
Functions in R
• A function is a set of statements organized together to perform a specific task.
• R has a large number of built-in functions, and users can create their own
functions.
• Functions are therefore either built-in or user-defined.
Function Components:
• Function Definition and Function Calling
• Function Name
• Arguments
• Function Body
• Return Value
Built In functions
• Simple examples of built-in functions are seq(), mean(), max(), sum()
and paste().
• # Create a sequence of numbers from 32 to 44.
print(seq(32,44))
Output : [1] 32 33 34 35 36 37 38 39 40 41 42 43 44

• # Find sum of numbers from 41 to 68.
print(sum(41:68))
Output : [1] 1526
User-defined Function
• We can create user-defined functions in R. They are specific to what a
user wants and once created they can be used like the built-in functions.
# Create a function to print squares of numbers in sequence.
# Function Definition
new.square <- function(a) {
  for(i in 1:a) {
    b <- i^2
    print(b)
  }
}
# Call the function new.square supplying 6 as an argument.
new.square(6)
Contd.. Create a function to print squares of numbers in sequence.

Output:
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
[1] 36
#1 Calling a Function with Default Argument Values

# Create a function whose arguments have default values.
new.multiply <- function(a = 3, b = 6) {
  result <- a * b
  print(result)
}
# Call the function without giving any argument (the defaults are used).
new.multiply()

# Call the function giving new values for the arguments.
new.multiply(9,5)

Output:
[1] 18
[1] 45
#2 Calling a Function with Argument Values
# Create a function with arguments.
new.function <- function(a,b,c) {
  result <- a * b + c
  print(result)
}

# Call the function by position of arguments.
new.function(5,3,11)

# Call the function by names of the arguments.
new.function(a = 11, b = 5, c = 3)

Output:
[1] 26
[1] 58
Output???
# Create a function with arguments.
new.function <- function(a, b) {
print(a^2)
print(a)
print(b)
}

# Evaluate the function without supplying one of the arguments.


new.function(6)
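The question above turns on R's lazy evaluation: an argument is only evaluated when the function body first uses it. The sketch below wraps the call in tryCatch() so the script keeps running; the variable name outcome is illustrative, and the exact wording of the error message may differ between R versions and locales.

```r
new.function <- function(a, b) {
  print(a^2)
  print(a)
  print(b)    # b is needed only here, so the error is raised at this line
}

# The first two print() calls succeed before the missing argument is reached
outcome <- tryCatch({
  new.function(6)
  "no error"
}, error = function(e) conditionMessage(e))

print(outcome)
```

The first two values (36 and 6) are printed, and only then does R report that argument b is missing.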
Write R Program to display numbers up to N

num = as.integer(readline(prompt = "Enter a number: "))

# 1:num counts backwards when num is less than 1, so reject such input
if(num < 1) {
  print("Enter a positive number")
} else {
  for(i in 1:num) {
    print(paste("The answer is", i))
  }
}
Output (when 5 is entered):
[1] "The answer is 1"
[1] "The answer is 2"
[1] "The answer is 3"
[1] "The answer is 4"
[1] "The answer is 5"
R Programming – Even or Odd
num = as.integer(readline(prompt="Enter a number: "))
if((num %% 2) == 0) {
print(paste(num,"is Even"))
} else {
print(paste(num,"is Odd"))
}
R Programming – Factorial of Number
# take input from the user
num = as.integer(readline(prompt="Enter a number: "))
factorial = 1

# check if the number is negative, positive or zero
if(num < 0) {
print("Sorry, factorial does not exist for negative numbers")
} else if(num == 0) {
print("The factorial of 0 is 1")
} else {
for(i in 1:num) {
factorial = factorial * i
}
print(paste("The factorial of", num ,"is",factorial))
}
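For comparison, here is a sketch of the same calculation written as a recursive user-defined function, checked against R's built-in factorial(). The name fact is hypothetical.

```r
# Hypothetical recursive version: n! = n * (n - 1)!
fact <- function(n) {
  if (n < 0) {
    stop("factorial does not exist for negative numbers")
  }
  if (n == 0) {
    return(1)
  }
  n * fact(n - 1)
}

print(fact(5))
print(factorial(5))   # built-in function, same result
```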
R Programming – Pie Charts
• R has numerous libraries for creating charts and graphs.
• A pie chart represents values as slices of a circle, each in a
different color.
• The slices are labelled, and the number corresponding to each slice is
also shown in the chart.
• In R the pie chart is created using the pie() function, which takes
a vector of positive numbers as input.
• Additional parameters control the labels, color, title etc.
Syntax
pie(x, labels, radius, main, col, clockwise)
Contd..R Programming – Pie Charts

• x is a vector containing the numeric values used in the pie chart.
• labels is used to give descriptions to the slices.
• radius indicates the radius of the circle of the pie chart (value between −1 and +1).
• main indicates the title of the chart.
• col indicates the color palette.
• clockwise is a logical value indicating whether the slices are drawn clockwise or
anticlockwise.

# Simple Pie Chart
slices <- c(10, 12, 4, 16, 8)
lbls <- c("US", "UK", "Australia", "Germany", "France")
pie(slices, labels = lbls, main="Pie Chart of Countries")
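To show the number for each slice, one approach is to compute percentages and paste them onto the labels. This is a sketch using the same data; pct and lbls_pct are illustrative names, and rainbow() is just one possible palette.

```r
slices <- c(10, 12, 4, 16, 8)
lbls <- c("US", "UK", "Australia", "Germany", "France")

# Share of each slice as a whole-number percentage
pct <- round(slices / sum(slices) * 100)
lbls_pct <- paste0(lbls, " ", pct, "%")

pie(slices, labels = lbls_pct, col = rainbow(length(slices)),
    main = "Pie Chart of Countries")
```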
Contd..R Programming – Bar Charts
• A bar chart is created using the barplot() function. Use main to give the title, xlab and ylab to
label the axes, names.arg to name each bar, and col to define the color.
#1 Simple Bar Chart
temp <- c(22, 27, 26, 24, 23, 26, 28)
barplot(temp)

#2 Bar Chart Added Parameters
# horiz = TRUE draws horizontal bars, so the values run along the x axis
max.temp <- c(22, 27, 26, 24, 23, 26, 28)
barplot(max.temp,
        main = "Maximum Temperatures in a Week",
        xlab = "Degree Celsius",
        ylab = "Day",
        names.arg = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"),
        col = "darkred",
        horiz = TRUE)
Plotting Categorical Data
> age <- c(17,18,18,17,18,19,18,16,18,18)
> table(age)
age
16 17 18 19
 1  2  6  1
> barplot(table(age),
+ main="Age Count of 10 Students",
+ xlab="Age",
+ ylab="Count",
+ border="red",
+ col="blue",
+ density=10
+ )
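The table() result can also be summarised before plotting. A short sketch using the same age data follows; counts, props and most_common are illustrative names.

```r
age <- c(17, 18, 18, 17, 18, 19, 18, 16, 18, 18)

counts <- table(age)           # frequency of each distinct age
props  <- prop.table(counts)   # the same counts as proportions of the total

# The age that occurs most often
most_common <- names(counts)[which.max(counts)]

print(counts)
print(props)
print(most_common)
```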
