Professional Documents
Culture Documents
Introduction To The R Language: Data Types and Basic Operations
Introduction To The R Language: Data Types and Basic Operations
1 / 27
Objects
2 / 27
Numbers
3 / 27
Attributes
4 / 27
Entering Input
At the R prompt we type expressions. The <- symbol is the assignment operator.
> x <- 1
> print(x)
[1] 1
> x
[1] 1
> msg <- "hello"
The # character indicates a comment. Anything to the right of the # (including the #
itself) is ignored.
5 / 27
Evaluation
When a complete expression is entered at the prompt, it is evaluated and the result of
the evaluated expression is returned. The result may be auto-printed.
6 / 27
Printing
7 / 27
Creating Vectors
8 / 27
Mixing Objects
When different objects are mixed in a vector, coercion occurs so that every element in
the vector is of the same class.
9 / 27
Explicit Coercion
Objects can be explicitly coerced from one class to another using the as.* functions, if
available.
10 / 27
Explicit Coercion
11 / 27
Matrices
Matrices are vectors with a dimension attribute. The dimension attribute is itself an
integer vector of length 2 (nrow, ncol)
12 / 27
Matrices (cont’d)
13 / 27
Matrices (cont’d)
Matrices can also be created directly from vectors by adding a dimension attribute.
14 / 27
cbind-ing and rbind-ing
15 / 27
Lists
Lists are a special type of vector that can contain elements of different classes. Lists
are a very important data type in R and you should get to know them well.
> x <- list(1, "a", TRUE, 1 + 4i)
> x
[[1]]
[1] 1
[[2]]
[1] "a"
[[3]]
[1] TRUE
[[4]]
[1] 1+4i
16 / 27
Factors
Factors are used to represent categorical data. Factors can be unordered or ordered.
One can think of a factor as an integer vector where each integer has a label.
Factors are treated specially by modelling functions like lm() and glm()
Using factors with labels is better than using integers because factors are
self-describing; having a variable that has values “Male” and “Female” is better
than a variable that has values 1 and 2.
17 / 27
Factors
18 / 27
Factors
The order of the levels can be set using the levels argument to factor(). This can
be important in linear modelling because the first level is used as the baseline level.
19 / 27
Missing Values
20 / 27
Missing Values
21 / 27
Data Frames
22 / 27
Data Frames
23 / 27
Names
R objects can also have names, which is very useful for writing readable code and
self-describing objects.
24 / 27
Names
$b
[1] 2
$c
[1] 3
25 / 27
Names
And matrices.
26 / 27
Summary
Data Types
atomic classes: numeric, logical, character, integer, complex
vectors, lists
factors
missing values
data frames
names
27 / 27