What is R?
• R is an interpreted programming language.
• Software environment for statistical analysis, graphics representation and reporting.
• Created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand.
• Currently developed by the R Core Team.

Variables
• Name given to a memory location, which is used to store values in a computer program.
• Variables in R programming can be used to store numbers (real and complex), words, matrices, and even tables.
• Example: x <- 5
o Variable x has 5 as its value
o Variable names are case sensitive (X != x)

DATA TYPES
#numeric variable
mygrade <- 95
mygrade
#logical variable
canvote <- TRUE
canvote
#remove a variable from memory
x <- 5
rm(x)
#display data type
class(mygrade)
class(canvote)
#integer variable
score <- 46L
score
class(score)
#complex variable
mydata <- 3+5i
class(mydata)
#character variable
mychar <- "Hello World"
mychar
myaverage <- "95"
class(myaverage)
#multiple assignment
r <- f <- 5
print(r)
print(f)
# create a sequence
myseq <- 1:250
myseq
#built-in functions in R
pi
sqrt(16)
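
The points about case sensitivity and data types can be checked with a short sketch. It uses only base R functions; the names total, Total and avg_num are made up for illustration and do not come from the slides.

```r
# Variable names are case sensitive: these are two different variables.
total <- 10
Total <- 20
print(total == Total)          # FALSE

# A number written in quotes is stored as text, not as a number.
myaverage <- "95"
print(class(myaverage))        # "character"
print(is.numeric(myaverage))   # FALSE

# as.numeric() converts the text to a number so it can be used in arithmetic.
avg_num <- as.numeric(myaverage)
print(class(avg_num))          # "numeric"
print(avg_num + 5)             # 100

# The L suffix creates an integer; without it the value is numeric.
print(class(46L))              # "integer"
print(class(46))               # "numeric"
```
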

Data Structures in R
What is a Data Structure?
• Collection of data (collection of logical, numeric, integer, complex, character or raw data)
• Deals with how the data is stored together
• R data structures include Vectors, Lists, Matrices, Arrays, Data Frames and Factors

Vectors
• Collection of similar types of objects
• Each element must belong to the same data type
• Example:
> vehicles = c("car","bike","bus")
• c means combine. When the values have different types, they are coerced (converted) to the simplest type required to represent all information
• The coercion ordering is logical < integer < numeric < complex < character

Vector with Consecutive Numbers
• The : operator creates a vector of consecutive numbers

Operators in R
Arithmetic
+ = Addition
- = Subtraction
* = Multiplication
/ = Division
%% = Remainder
^ = Exponentiation
Relational
< = Less than
> = Greater than
== = Equal to
<= = Less than or equal to
>= = Greater than or equal to
!= = Not equal to

Accessing Vector Elements
• Use [] brackets to access elements. This is known as indexing. Indexing starts at position 1.
• Negative values in the index are used to drop elements
• Logical (Boolean) values, TRUE or FALSE, can be used for indexing
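
The three kinds of index described for vectors (positive, negative and logical) can be sketched as follows. The vector myvector and its values are hypothetical and chosen only for illustration.

```r
# Hypothetical example vector; the values are made up for illustration.
myvector <- c(10, 20, 30, 40, 50)

# Positive index: indexing starts at 1, so this is the first element.
first <- myvector[1]
print(first)                      # 10

# Negative index: drop the first element and keep the rest.
without_first <- myvector[-1]
print(without_first)              # 20 30 40 50

# Logical index: keep the positions marked TRUE.
picked <- myvector[c(TRUE, FALSE, TRUE, FALSE, TRUE)]
print(picked)                     # 10 30 50

# A relational operator produces a logical vector that can be used as an index.
large <- myvector[myvector > 25]
print(large)                      # 30 40 50
```
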
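#example vector (values are illustrative)
myvector = c(10, 20, 30, 40, 50)
#select the 4th and 5th elements
lasttwo = myvector[c(4,5)]
print(lasttwo)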
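
The coercion ordering and the : operator described earlier for vectors can also be checked directly. This is a minimal sketch; the variable names a, b, d and s are illustrative only.

```r
# logical + integer: TRUE is converted to 1, so the result is integer.
a <- c(TRUE, 2L)
print(class(a))        # "integer"

# integer + numeric: the result is numeric.
b <- c(1L, 2.5)
print(class(b))        # "numeric"

# Anything mixed with a character value becomes character.
d <- c(1, "two", TRUE)
print(d)               # "1" "two" "TRUE"
print(class(d))        # "character"

# The : operator creates consecutive integers from 3 to 7.
s <- 3:7
print(s)               # 3 4 5 6 7
print(class(s))        # "integer"
print(length(s))       # 5
```
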
#3 x 3 matrix filled column by column (the default)
m = matrix(1:9, 3, 3)
m
#same values filled row by row
m = matrix(1:9, 3, 3, byrow = TRUE)
print(m)
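
The two matrix() calls above differ only in fill order. The sketch below compares them element by element; the names by_col and by_row are made up for illustration.

```r
# Same values, two fill orders.
by_col <- matrix(1:9, nrow = 3, ncol = 3)                # column by column
by_row <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)  # row by row

# Both matrices have 3 rows and 3 columns.
print(dim(by_col))      # 3 3

# Element in row 2, column 1 depends on the fill order.
print(by_col[2, 1])     # 2
print(by_row[2, 1])     # 4

# Leaving the column index empty selects the whole row.
print(by_col[1, ])      # 1 4 7
print(by_row[1, ])      # 1 2 3

# Transposing one matrix gives the other.
print(all(t(by_col) == by_row))   # TRUE
```
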
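#remainder
r = 7%%2
print(r)
#exponentiation
ex = 5^2
print(ex)
#element-wise addition of two vectors
v = c(1,2,3)
t = c(4,5,6)
class(v)
class(t)
vt = v+t
print(vt)
#relational operators return TRUE or FALSE
answer <- 22 == 22
print(answer)
#22 is converted to the text "22" before it is compared
answer <- 22 == "Twenty two"
print(answer)
answer <- 22 != "Twenty two"
print(answer)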
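
For reference, the arithmetic and relational operators listed earlier can be tried on a pair of numbers and on a whole vector. The values of x, y and scores are hypothetical.

```r
# Hypothetical values for illustration.
x <- 17
y <- 5

# Arithmetic operators
print(x + y)    # 22
print(x - y)    # 12
print(x * y)    # 85
print(x / y)    # 3.4
print(x %% y)   # 2
print(x ^ 2)    # 289

# Relational operators work element by element on vectors.
scores <- c(46, 95, 75)
passed <- scores >= 75
print(passed)         # FALSE TRUE TRUE
print(scores != 95)   # TRUE FALSE TRUE
```
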

Data Structures Part 2

Data Frame
• A table or two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column

Accessing Data Frames
• Selecting 1 column
• Selecting 1 row
• Getting the summary

Arrays
Arrays with Dimension Names
Accessing Arrays
- Element of row 2, column 1, matrix 1
- Use dimension name

List
R objects which contain elements of different types, such as numbers, strings, and vectors

Factors
Data objects which are used to categorize the data and store it as levels
Factors are created using the factor() function by taking a vector as input

Data Frame
sname = c("Sam", "Dominic", "Diony")
block = c("C", "A", "B")
code = c(3000, 1000, 2000)
class(sname)
class(block)
class(code)
#combine the three vectors into a data frame
df = data.frame(sname, block, code)
#selecting 1 column
df[1]
#selecting 1 row
df[3,]
#selecting rows 1 to 3
df[c(1,2,3),]
#add 1 to every value in the code column
df$code <- df$code + 1
df
#getting the summary
print(summary(df))
str(df)
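
A few more ways of selecting from a data frame are sketched below, using the same three vectors. The stringsAsFactors = FALSE argument is added here as an assumption so that the name column stays character on older R versions. The threshold 1500 is arbitrary.

```r
# Same vectors as in the slide example.
sname <- c("Sam", "Dominic", "Diony")
block <- c("C", "A", "B")
code <- c(3000, 1000, 2000)
df <- data.frame(sname, block, code, stringsAsFactors = FALSE)

# Number of rows and columns
print(dim(df))                      # 3 3

# One cell: row 2 of the column named "code"
print(df[2, "code"])                # 1000

# Rows that satisfy a condition (Sam and Diony)
high <- df[df$code > 1500, ]
print(high)

# Names sorted by their code value
by_code <- df[order(df$code), "sname"]
print(by_code)                      # "Dominic" "Diony" "Sam"
```
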
Array
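The slide code for arrays is not available here, so the following is only a sketch of an array with dimension names, assuming a 3 x 3 x 2 array. The values and the names r1..r3, c1..c3, m1 and m2 are hypothetical.

```r
# Hypothetical 3 x 3 x 2 array: two 3 x 3 matrices holding 1..18.
arr <- array(1:18,
             dim = c(3, 3, 2),
             dimnames = list(c("r1", "r2", "r3"),
                             c("c1", "c2", "c3"),
                             c("m1", "m2")))
print(dim(arr))                 # 3 3 2

# Element of row 2, column 1, matrix 1
print(arr[2, 1, 1])             # 2

# Same position in matrix 2, selected by dimension name
print(arr["r2", "c1", "m2"])    # 11

# The whole first matrix
print(arr[, , "m1"])
```
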
List
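The slide code for lists is not available here, so this is only a minimal sketch of a list holding elements of different types. The list student and its contents are hypothetical.

```r
# Hypothetical list with a string, a number and a character vector.
student <- list(name = "Sam", code = 3000, blocks = c("C", "A"))
print(class(student))             # "list"
print(length(student))            # 3

# $ and [[ ]] return the element itself.
print(student$name)               # "Sam"
next_code <- student[["code"]] + 1
print(next_code)                  # 3001
print(student$blocks[2])          # "A"

# Single brackets return a smaller list, not the element.
sub <- student["name"]
print(class(sub))                 # "list"
```
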
Factors
direction = c("N", "S", "E", "W", "N",
"E", "W", "N")
f_direction = factor(direction)
str(f_direction)
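
To see how the factor stores its categories as levels, the same direction vector can be inspected with a few base R functions. This is a small sketch; the variable counts is introduced only for illustration.

```r
direction <- c("N", "S", "E", "W", "N", "E", "W", "N")
f_direction <- factor(direction)

# Levels are the distinct categories, sorted alphabetically by default.
print(levels(f_direction))       # "E" "N" "S" "W"
print(nlevels(f_direction))      # 4

# Internally each value is stored as the position of its level.
print(as.integer(f_direction))   # 2 3 1 4 2 1 4 2

# Count how many times each level occurs.
counts <- table(f_direction)
print(counts)
```
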
From LMS