[1] 2 3 4
[1] 1 3 9
Numeric (real numbers)
Character (letters)
Sequences
· Inclusive sequences of integers can be generated using the : operator
[1] 10 11 12 13 14 15 16 17 18
[1] 2 6 10 14 18
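The two outputs above are consistent with calls like the following. This is a minimal sketch: the object names `s1` and `s2` are illustrative, and the step of 4 is inferred from the printed values rather than stated on the slide.

```r
# Inclusive integer sequence with the : operator
s1 <- 10:18
print(s1)

# Sequence with a step other than 1, using seq()
# (the step of 4 is an assumption inferred from the printed output)
s2 <- seq(from = 2, to = 18, by = 4)
print(s2)
```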
· Sequences of repeated entries: the rep() function
[1] 4 4 4 4 4
rep(c(2, 5), 3) # repeat the series 2 & 5 three times
[1] 2 5 2 5 2 5
rep(c(2, 5), c(3, 2)) # '2' three times and '5' twice
[1] 2 2 2 5 5
In the last two examples, functions are nested within functions. In that case, the
innermost function is evaluated first.
Character vectors
months2 <- c("January/2020", "February/2020", "March/2020",
"April/2020", "May/2020", "June/2020",
"July/2020", "August/2020", "September/2020",
"October/2020", "November/2020", "December/2020")
months2
· A character vector can be used to name the elements of another vector
[1] 0.600 0.660 0.663 0.721 0.742 0.790 0.805 0.852 0.865 0.870 0.870 0.877
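One way to attach names is the `names()` function. The sketch below reuses the values printed above; using the built-in constant `month.name` as the content of `months1` is an assumption, since the original definition is not shown here.

```r
# Values copied from the output above
branch.length <- c(0.600, 0.660, 0.663, 0.721, 0.742, 0.790,
                   0.805, 0.852, 0.865, 0.870, 0.870, 0.877)

# month.name is a built-in constant; it stands in for months1 (assumption)
months1 <- month.name

# Attach the names to the numeric vector
names(branch.length) <- months1

# Named elements can now be printed or looked up by name
branch.length
branch.length["March"]
```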
· substr() function: it is used to extract parts of a string (set of characters)
substr(months1, 1, 3)
[1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
Factor
· factor() function: converts a vector into a factor (an additional
class of vector) so that categorical variables are handled properly
[1] "F" "F" "F" "F" "M" "M" "M" "M" "M"
gender2
[1] F F F F M M M M M
Levels: F M
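Output of this kind could be produced as sketched below. The name `gender1` for the character vector is an assumption; only `gender2` appears above.

```r
# Character vector: four females followed by five males
gender1 <- c(rep("F", 4), rep("M", 5))
gender1

# Convert to a factor; levels are sorted alphabetically by default
gender2 <- factor(gender1)
gender2
levels(gender2)
```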
· gl() command: it can be used to generate a factor vector when each level
of the factor has an equal number of entries (replicates)
[1] F F F F F M M M M M
Levels: F M
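A call along these lines would give a factor with five entries per level. The object name `gender3` is hypothetical.

```r
# 2 levels, 5 replicates of each, labelled "F" and "M"
gender3 <- gl(n = 2, k = 5, labels = c("F", "M"))
gender3

# Count the entries in each level
table(gender3)
```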
Vector indexing
· We can use [ ] to extract a subset of a vector. This can be done in
several ways (Logan, 2010). We will use the vector branch.length, which
was created earlier.
branch.length
Vector of positive integers
March
0.663
Vector of negative integers
Vector of character strings: it is necessary that the vector elements have
been named
December
0.877
September October
0.865 0.870
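The three index forms above can be sketched as follows. The vector is rebuilt here so the example is self-contained; naming it with the built-in `month.name` is an assumption based on the month labels in the outputs.

```r
branch.length <- c(0.600, 0.660, 0.663, 0.721, 0.742, 0.790,
                   0.805, 0.852, 0.865, 0.870, 0.870, 0.877)
names(branch.length) <- month.name   # assumed names

# Positive integers: keep the listed positions
branch.length[3]
branch.length[c(1, 3)]

# Negative integers: drop the listed positions
branch.length[-1]

# Character strings: select by element name
branch.length["December"]
branch.length[c("September", "October")]
```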
Vector of logical values
As a reminder:
branch.length
branch.length[branch.length > 0.8] # logical condition is true
branch.length[branch.length > 0.7 & branch.length < 0.8] # both logical conditions are true
Matrices
· Vector: single dimension - it has length
x <- 1:12
xmat1 <- matrix(x, ncol=4)
xmat1
xmat2 <- matrix(x, nrow=3)
xmat2
· A matrix is used to store vectors of the same type and length
X Y
[1,] 16.92 8.37
[2,] 24.03 12.93
[3,] 7.61 16.65
[4,] 15.49 12.20
[5,] 11.77 13.12
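A matrix like the one printed above can be built by binding two vectors as columns with `cbind()`. This is a sketch using the printed values; the name `XY` matches the object used later for matrix indexing.

```r
# Two numeric vectors of the same length (values copied from the output above)
X <- c(16.92, 24.03, 7.61, 15.49, 11.77)
Y <- c(8.37, 12.93, 16.65, 12.20, 13.12)

# Bind them as columns; the column names are taken from the vector names
XY <- cbind(X, Y)
XY
dim(XY)
```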
rbind(X, Y) # to combine rows
matrix(0,nrow=4,ncol=3)
· Dimensions of a matrix
xmat3
dim(xmat3)
[1] 3 4
· Setting row and column names: the rownames() and colnames()
functions
xmat3
C1 C2 C3 C4
R1 1 2 3 4
R2 5 6 7 8
R3 9 10 11 12
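The named matrix above could be produced as sketched below. The construction of `xmat3` (values 1 to 12 filled row by row) is an assumption inferred from the printed output.

```r
# Assumed construction: 1..12 filled row by row, matching the printed matrix
xmat3 <- matrix(1:12, nrow = 3, byrow = TRUE)

# Assign row and column names
rownames(xmat3) <- c("R1", "R2", "R3")
colnames(xmat3) <- c("C1", "C2", "C3", "C4")
xmat3

# Names can then be used for indexing
xmat3["R2", "C3"]
```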
Matrix indexing
· Like vectors, matrices can be indexed with vectors of positive integers,
negative integers, character strings and logical values. However, because
matrices have two dimensions (rows and columns), matrix indexing takes the
form [row.indices, col.indices].
XY
X Y
[1,] 16.92 8.37
[2,] 24.03 12.93
[3,] 7.61 16.65
[4,] 15.49 12.20
[5,] 11.77 13.12
Y
16.65
XY[3,] # select the entire 3rd row
X Y
7.61 16.65
X Y
[1,] 16.92 8.37
[2,] 7.61 16.65
[3,] 11.77 13.12
XY[, -2] # select all columns except the 2nd
X Y
[1,] 16.92 8.37
[2,] 24.03 12.93
[3,] 15.49 12.20
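The [row.indices, col.indices] forms can be tried on the XY matrix as sketched below. The matrix is rebuilt from the printed values so the example runs on its own; the particular selections are illustrative and are not claimed to be the exact calls behind each output above.

```r
XY <- cbind(X = c(16.92, 24.03, 7.61, 15.49, 11.77),
            Y = c(8.37, 12.93, 16.65, 12.20, 13.12))

XY[3, 2]               # single element: row 3, column 2
XY[3, ]                # entire 3rd row (column index left empty)
XY[c(1, 3, 5), ]       # rows 1, 3 and 5, all columns
XY[, "Y"]              # column selected by name
XY[-c(3, 5), ]         # all rows except the 3rd and 5th
XY[XY[, "X"] > 15, ]   # logical index: rows where X exceeds 15
```

Leaving an index empty selects everything along that dimension.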
Lists
Constructing lists
· Lists are used to store collections of objects that can be of different lengths
and types
$Age
[1] 32
$Name
[1] "Aline"
$Grades
[1] 98 85 96
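A list with these three components could be created with `list()`, as in this sketch. The list name `student` is hypothetical.

```r
# Components of different types and lengths in one object
student <- list(Age = 32, Name = "Aline", Grades = c(98, 85, 96))
student

# Number of components and their names
length(student)
names(student)
```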
Objects of a list
· An object within a list can be referred to by appending the $ character
followed by the name of the object to the list name, that is,
list_name$object_name
[1] 98 85 96
· An object or object elements within a list can also be referred to by
appending an index vector (enclosed in double square brackets, [[]])
[1] 98 85 96
[1] 98 85 96
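Both access styles can be compared side by side. This sketch rebuilds the list under the hypothetical name `student` so that it runs on its own.

```r
student <- list(Age = 32, Name = "Aline", Grades = c(98, 85, 96))

student$Grades         # component selected by name with $
student[["Grades"]]    # same component, name inside [[ ]]
student[[3]]           # same component, by position

# An element inside a component: second grade
student[[3]][2]
student$Grades[2]
```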
Data frames
What are Data frames?
· Similar to matrices (rows and columns, two-dimensional), but different
columns can store different types of vectors. However, the vectors must
all have the same length
· Example (Mello and Peternelli, 2013):
Age
[1] 17 17 16 15 15 13
Gender
[1] M F F F F M
Levels: F M
Grades
[1] 92 75 81 87 90 88
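The vectors above can be combined into a data frame with `data.frame()`. In this sketch, Age, Gender and Grades use the printed values, and the Name values are taken from the split() output shown later in these slides; the exact original call is not shown, so treat this as one plausible reconstruction.

```r
Name   <- c("José", "Angela", "Aline", "Mayara", "Lara", "Nicolas")
Age    <- c(17, 17, 16, 15, 15, 13)
Gender <- factor(c("M", "F", "F", "F", "F", "M"))
Grades <- c(92, 75, 81, 87, 90, 88)

# Each vector becomes a column; all must have the same length
data1 <- data.frame(Name, Age, Gender, Grades)
data1
str(data1)
```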
Importing (reading) data
· Many methods can be used to import data from a wide variety of sources
· We will present the simplest methods of importing data from the most
popular sources
· The most common text files are comma- or semicolon-delimited and
tab-delimited
· To read this data file, first save
example1_phenological_data.xlsx from Excel as a text file
· To read a semicolon-delimited text file, you can use the commands:
data2 <- read.table("example1_phenological_data.csv",
                    header = TRUE, sep = ";", dec = ",")
str(data2)
head(data2)
· To read a tab delimited text file, you can use the commands:
# You can omit the 'sep="\t"' argument
# and just use the command
data4 <- read.table("example1_phenological_data.txt",
                    header = TRUE, dec = ",")
head(data4)
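The effect of the sep and dec arguments can be checked without the course data file. The sketch below writes two tiny temporary files and reads them back; the column names and values are made up purely for illustration.

```r
# Semicolon-delimited file with decimal commas (made-up contents)
csv_file <- tempfile(fileext = ".csv")
writeLines(c("plant;height", "A;1,5", "B;2,25"), csv_file)
demo_csv <- read.table(csv_file, header = TRUE, sep = ";", dec = ",")
str(demo_csv)

# Same data, tab-delimited; with the default sep = "" read.table()
# splits on any whitespace, which includes tabs
txt_file <- tempfile(fileext = ".txt")
writeLines(c("plant\theight", "A\t1,5", "B\t2,25"), txt_file)
demo_txt <- read.table(txt_file, header = TRUE, dec = ",")
str(demo_txt)
```

Relying on the default separator only works when no field contains spaces; otherwise passing sep = "\t" explicitly is the safer choice.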
Reviewing a data frame
· fix() function: it is used to view a data frame as a simple spreadsheet in a
separate window
Indexing data frames
· A vector or vector elements within a data frame can be referred to by
appending an index vector (enclosed in square brackets, [ ]) or by using
data_frame_name$column_name
[1] "Angela"
data1$Name # select the entire variable 'Name'
[1] "Angela"
· Indexing by conditions
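The index forms for data frames, including logical conditions, can be sketched as follows. The data frame is rebuilt from values shown elsewhere in these slides so the example is self-contained; the specific conditions are illustrative assumptions.

```r
data1 <- data.frame(
  Name   = c("José", "Angela", "Aline", "Mayara", "Lara", "Nicolas"),
  Age    = c(17, 17, 16, 15, 15, 13),
  Gender = factor(c("M", "F", "F", "F", "F", "M")),
  Grades = c(92, 75, 81, 87, 90, 88),
  stringsAsFactors = FALSE
)

data1[2, 1]         # row 2, column 1
data1[2, "Name"]    # same element, column selected by name
data1$Name          # the entire variable 'Name'

# Indexing by conditions: a logical vector selects the rows
data1[data1$Gender == "F", ]
data1[data1$Grades > 85 & data1$Age < 17, c("Name", "Grades")]

# subset() is an alternative to [ ] for the same task
subset(data1, Grades >= 90)
```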
Sorting datasets
· order() function: sorts datasets according to one or more variables
data1
data1[order(data1$Gender, data1$Name),]
Manipulation of data frames
· The cbind() and rbind() functions can also be used with data frames
data1$Grade2 <- c("A","C", "B", "B","A", "B")
data1
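As a sketch of cbind() and rbind() on a data frame, the example below uses a reduced version of data1 (three columns only) so that it runs on its own. The added student "Pedro" is hypothetical and exists only to illustrate appending a row.

```r
data1 <- data.frame(
  Name   = c("José", "Angela", "Aline", "Mayara", "Lara", "Nicolas"),
  Age    = c(17, 17, 16, 15, 15, 13),
  Grades = c(92, 75, 81, 87, 90, 88),
  stringsAsFactors = FALSE
)

# cbind(): add a new column (one value per existing row)
data_c <- cbind(data1, Grade2 = c("A", "C", "B", "B", "A", "B"))
data_c

# rbind(): append a new row; column names must match the data frame
new_row <- data.frame(Name = "Pedro", Age = 16, Grades = 84,  # hypothetical student
                      stringsAsFactors = FALSE)
data_r <- rbind(data1, new_row)
data_r
```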
· split() function: splits a data frame by groups
split(data1, Gender)
$F
Name Age Gender Grades Grade Grade2
2 Angela 17 F 75 C C
3 Aline 16 F 81 B B
4 Mayara 15 F 87 B B
5 Lara 15 F 90 A A
$M
Name Age Gender Grades Grade Grade2
1 José 17 M 92 A A
6 Nicolas 13 M 88 B B
Object information and conversion
Object’s attributes
· All R objects are of a certain type or class
[1] "character"
class(Age)
[1] "numeric"
class(Gender)
[1] "factor"
· Family of functions prefixed with is.: to evaluate whether or not an object
is of a particular class
is.data.frame(Age)
[1] FALSE
is.data.frame(data1)
[1] TRUE
is.numeric(Age)
[1] TRUE
· Size or length of an object:
[1] 39
· Other characteristics of an object can be viewed using str()
str(data2)
str(data1)
· attributes() function: accesses an object’s attributes
attributes(data2)
$names
[1] "Student" "Male" "Female"
$dim
[1] 5 2
$dimnames
$dimnames[[1]]
NULL
$dimnames[[2]]
[1] "X" "Y"
attributes(data1)
$names
[1] "Name" "Age" "Gender" "Grades" "Grade" "Grade2"
$row.names
[1] 1 2 3 4 5 6
$class
[1] "data.frame"
Object conversion
· Objects can be converted into other objects using a family of functions
with an as. prefix
· To convert a matrix into a vector
as.vector(x1)
[1] 1 2 3 4 5 6 7 8 9 10 11 12
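The conversion above, plus a couple of other members of the as. family, can be sketched as follows. The definition of `x1` is an assumption chosen to be consistent with the printed output.

```r
# Assumed definition: a 3 x 4 matrix filled column by column
x1 <- matrix(1:12, ncol = 4)

# as.vector() flattens a matrix column by column
as.vector(x1)

# Transpose first to read the values row by row instead
as.vector(t(x1))

# Other as. conversions
as.character(1:3)            # numbers to character strings
as.numeric(c("1.5", "2"))    # character strings to numbers
```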
str(data1)
data1$Grade <- as.factor(data1$Grade)
str(data1)
References
LOGAN, M. (2010) Biostatistical Design and Analysis Using R: A Practical Guide.
Hoboken, NJ: Wiley-Blackwell.