You are on page 1of 72

Department of Statistics

Ambo University

Statistical Computing II (Stat 3022)

Using

R & SAS
By: Senahara Korsa (PhD)

3/29/2023
Introduction to R
 To support your learning and understanding of R, the first thing
you will need to do is download and install both R and RStudio on
your computer.
 Take a look at the Setup link
https://alexd106.github.io/intro2R/setup.html for further
instructions on how to set up your computer and download the
required datasets.
 To get up and running the first thing you need to do is install R.
 R is freely available for Windows from the Comprehensive R
Archive Network (CRAN) website.
 Click on the link below for step-by-step instructions on how to
download and install R and Rstudio
https://alexd106.github.io/intro2R/howto.html#install
Senahara Korsa (PhD) 3/29/2023 2
What is R?
• R is a free language and environment for statistical computing and
graphics.
• R was initially written in the 1990s by Ross Ihaka and Robert
Gentleman.
• The name is partly based on the (first) names of the first two R
authors (Robert Gentleman and Ross Ihaka).
• R is widely used programming language for statistical analysis,
predictive modeling and data science
Some tasks of R are:
• Exploring and Manipulating Data
• Building and validating predictive models
• Applying machine learning and text mining algorithms
• Creating visual appealing graphs
• Connecting with Databases
Senahara Korsa (PhD) 3/29/2023 3
Download R and Rstudio

Download R from:
http://cran.r-project.org/bin/

Download Rstudio from:


http://www.rstudio.com/ide/download/desktop

Senahara Korsa (PhD) 3/29/2023 4


To Download R:
1. Visit official site of R - http://www.r-project.org/.
2. Click on "CRAN" located at the left hand side of the
page.
3. Choose your country and click on the link available for
your location.
4. Click Download R for Windows (For Windows)
5. Click base
Follow the steps shown in the image below.

Senahara Korsa (PhD) 3/29/2023 5


Senahara Korsa (PhD) 3/29/2023 6
Structure of R
R Packages
• A package is a collection of previously programmed functions,
often including functions for specific tasks.
• There are two types of packages: those that come with the base
installation of R and packages that you must manually download
and install.
• To get the list of all the packages installed, run the following R
script:
>library()

Senahara Korsa (PhD) 3/29/2023 7


Installing and Loading R Packages
• There are two ways to add new R packages.
• One is installing directly from the CRAN (Comprehensive R
Archive Network) directory and
• Another is downloading the package to your local system and
installing it manually.
• Install directly from CRAN
• The following command gets the packages directly from the
CRAN webpage and installs the package in the R environment.
#install a new package from webpage
>install.packages(“Package Name")

Senahara Korsa (PhD) 3/29/2023 8


• Loading Packages
• Whenever you want to use an R package, you must not only
have installed locally you must also load the package during the
session you are using it.
• To load an installed package, select the “Load package” option
from the packages menu.
• To load packages to your workspace: for instance,
#load the package “foreign” to your workspace
>library(foreign)
#load the package “Hmisc” to your workspace
>library(Hmisc)

Senahara Korsa (PhD) 3/29/2023 9


R Window
• R Editor window
• It is where you write code or program.
• R Console
• It is where you can see result or output of your code.
• You can write code here as well but it's difficult to edit
or make changes in the R Console.

Senahara Korsa (PhD) 3/29/2023 10


Senahara Korsa (PhD) 3/29/2023 11
Running R
• When you start R the first thing you will see is the prompt (>) in GUI
(graphical user interface), which is R's way of saying ―”Go Ahead ...
Do something”.
• The instructions you give to R are called commands.
• Commands are separated either by a semicolon (;), or by a newline.
• Comments can be put almost anywhere, starting with a hashmark (#);
everything after # is a comment.
• If you see a “+” in place of the prompt that means your last command
was not completed.
• To get more information on any specific named function, e.g sqrt(),
the command is
>help(sqrt) or
> ?sqrt

Senahara Korsa (PhD) 3/29/2023 12


R - Ugly Interface?
• Many novice R users hate R interface.
• It looks boring as compared to SAS/SPSS.
• It is not user friendly at all.
• It makes writing program complex.
• How to make it less boring (or interesting)? RStudio
comes to rescue!

Senahara Korsa (PhD) 3/29/2023 13


R Studio Interface

Senahara Korsa (PhD) 3/29/2023 14


RStudio…
• After installing R and RStudio, launch RStudio from your
computer “application folders”.
• RStudio is a four panel work-space for
1. creating file containing R script
2. typing R commands
3. viewing command histories
4. viewing plots and more.

Senahara Korsa (PhD) 3/29/2023 15


Rstudio Screen
1. Top-left panel: Code editor allowing you to create and open a file
containing R script.
The R script is where you keep a record of your work.
R script can be created as: File –> New –> R Script.
2. Bottom-left panel: R console for typing R commands
3. Top-right panel:
• Workspace tab: shows the list of R objects you created.
• History tab: shows the history of all previous commands.
4. Bottom-right panel:
• Files tab: show files in your working directory
• Plots tab: show the history of plots you created. From this tab,
you can export a plot to a PDF or an image files
• Packages tab: show external R packages available on your
system. If checked, the package is loaded in R.

Senahara Korsa (PhD) 3/29/2023 16


Senahara Korsa (PhD) 3/29/2023 17
R and Rstudio

• R and RStudio are not two different versions of the same thing.
• In fact, they work together.
• R is a programming language for statistical calculation.
• And RStudio is an Integrated Development Environment (IDE)
that helps you develop programs in R.
• You can use R without using RStudio, but you cannot use
RStudio without using R, so R comes first.
• Rstudio is not R or a type of R, rather it is a program that runs R
and provides extra tools that are helpful when writing R code.
• Everything that is conducted in Rstudio can also be done
directly on R.

Senahara Korsa (PhD) 3/29/2023 18


Some Useful RStudio Shortcuts
1. Press CTRL + Enter to submit/run code
2. Press CTRL + SHIFT + C to comment/ uncomment code
3. Press CTRL + SHIFT + N to create a new R script

Senahara Korsa (PhD) 3/29/2023 19


Fundamentals of R

1. R is a case-sensitive package.
• For instance: AGE, Age, and age are different objects!
2. R as calculator
• You can use R as a powerful calculator for a wide range of
numerical computations.
Example:
> log2(32)
[1] 5
>seq(0, 5, length=6)
[1] 0 1 2 3 4 5
>plot(sin(seq(0, 2*pi, length=100)))
The [1] tells you the resulting value is the first result.

Senahara Korsa (PhD) 3/29/2023 20


3. The # character at the beginning of a line signifies a comment.

4. The operator "<–" (without quotes) is equivalent to "=" sign .


You can use either of the operators.

5. The getwd() function shows the working directory

Senahara Korsa (PhD) 3/29/2023 21


6. R uses forward slashes instead of backward slashes in
filenames (as shown in the image above).
7. The setwd() function tells R where you would like your files to
save (changes the working directory).
setwd ("C:/Users/Deepanshu/Downloads")
8. The c function is widely used to combine values to form a
vector.

9. In RStudio, press CTRL + SHIFT + A to format your code.


10. R uses NA to represent Not Available, or missing values.

Senahara Korsa (PhD) 3/29/2023 22


11. To calculate sum excluding NA, use na.rm = TRUE (By default, it is
FALSE).

12. The form 1:10 generates the integers from 1 to 10.

13. R is case-sensitive, so you have to use the exact case


that the program requires.
14. To get help for a certain function such as sum, use the form:
>help (sum).

Senahara Korsa (PhD) 3/29/2023 23


15. Object names in R can be any length consisting of letters,
numbers, underscores ‘‘_’’ or the period ‘‘.’’
16. Object names in R should begin with a letter.
17. Unlike SAS and SPSS, R has several different data structures
including vectors, factors, data frames, matrices, arrays, and lists.

Senahara Korsa (PhD) 3/29/2023 24


Variable Declaration and Assignment
• To assign a value to a variable, use the assignment
command “<-“.
• A variable name can be any name made up of letters,
numbers, and or _ provided it starts with a letter, or then a
letter.
• Note that names are case sensitive.
• Do not start a variable name with a number.
Example:
>x <-2 #Assigning 2 to the variable x
>y <- “Hello!!” #Assigning a string to the variable y

Senahara Korsa (PhD) 3/29/2023 25


Working with Variable

• Finding Variables
• To know all the variables currently available in the workspace
we use the ls() function.
• Deleting Variables
• Variables can be deleted by using the rm() function.
• To remove object x, for instance, use:
>rm(x) #to delete specific object x
>rm(list = ls())# To remove all currently defined objects

Senahara Korsa (PhD) 3/29/2023 26


Save works on R
 To save works on R use the following

> save(file="file-name.RData") or

> save.image("filen-name.RData") or

 From the file menu:

File Save workspace browse to the folder where


you want to save supply the file name.

Senahara Korsa (PhD) 3/29/2023 27


Save R output
 To save your output from a session in R can be saved
using the sink command:

> sink(“Rintro”)#to save the output in R

> sink(“output.txt")#to save the output in txt form

 To turn off the sink command:

> sink()

Senahara Korsa (PhD) 3/29/2023 28


Getting online help in R
 Help is also available in HTML format by running
> help.start()
which will launch a web browser that allows the help
pages to be browsed with hyperlinks.
 The help.search command (alternatively ?? )

allows searching for help in various ways.


Example:
> ?? Solve #Or
> help.search(“solve”)
 Try ?help.search for details and more examples.

Senahara Korsa (PhD) 3/29/2023 29


Getting online help in R…

 The examples on a help topic can normally be run by

> example("topic")

 For example: to know what lm does with


demonstration type the following command:

>example(lm)

 For further information about online help: use

> ? help

Senahara Korsa (PhD) 3/29/2023 30


Assigning values to variables
 Assignment can be made using any of the following operations:
 Using “<-”
 Example: > x <-5
 Using “=”
 Example: >x = 5
 Using the function “assign()”
 Example: > assign(“x”, 5+10)
 Using “->”
 Example: > 5->X

Senahara Korsa (PhD) 3/29/2023 31


Objects and Simple Manipulation
 The entities that R creates and manipulates are known as
objects.

 These may be variables, arrays of numbers, character strings,


functions, or more general structures built from such
components.
 To list the objects you have created in a session use
either of the following commands:

> objects()

> ls()

Senahara Korsa (PhD) 3/29/2023 32


Data types in R
 R has several different data types
 Vectors
 Lists
 Data frames
 Matrices
 Arrays
 Factors

Senahara Korsa (PhD) 3/29/2023 33


Vectors
 Vectors are the simplest type of object in R.

 They can easily be created with c, the combined function.

 There are 3 main types of vectors:

 Numeric vectors

 Character vectors

 Logical vectors

Senahara Korsa (PhD) 3/29/2023 34


Numeric Vectors
 It is a single entity consisting of an ordered collection of
numbers.
 Example: to set up a numeric vector X consisting of 5
numbers, 10, 6, 3, 6, 22, we use any one of the following
commands:

> x<-c(10, 6, 3, 6, 22) #OR


> x= c(10, 6, 3, 6, 22) #OR

> assign(“x”, c(10, 6, 3, 6, 22))#OR


> c(10, 6, 3, 6, 22)->x

Senahara Korsa (PhD) 3/29/2023 35


Numeric Vectors…
 The further assignment

> y<-c(x,0,x)

 This would create a vector y with 11 entries consisting of


two copies of x with a zero in the middle place.

> y<-c(x,0,x)

> y

[1] 10 6 3 6 22 0 10 6 3 6 22

Note: The [1] in front of the result is the index of the first
row in the vector x.

Senahara Korsa (PhD) 3/29/2023 36


Numeric Vectors…
Functions that return a single value
> length(x) # the number of elements in x

> sum(x) # the sum of the values of x


> mean(x) # the mean of the values of x
> var(x) # the variance of the values of x
> sd(x) # the standard deviation of the values of x
> min(x) # the minimum value from the values of x
> max(x) # the maximum value from the values of x
> prod(x) # the product of the values of x
> range(x) # the range of the values of x

Senahara Korsa (PhD) 3/29/2023 37


Numeric Vectors…
Functions that return vectors with the same length
 To print the rank of the values of x:
> order(x)
> sort.list(x)
 To print the values of x in increasing order
> sort(x)
> x[order(x)]
> x[sort.list(x)]
 To print the reciprocals of the values of x
> 1/x

Senahara Korsa (PhD) 3/29/2023 38


Numeric Vectors…
 To print the sin, cos, tan, asin, acos, atan, log, exp, … of
the values of x:

> sin(x)

> cos(x)

> exp(x)

 The parallel maximum and minimum functions pmax and


pmin return a vector (of equal to their largest argument)
that contains in each element the largest (smallest) element in
that position in any of the input vectors.
Senahara Korsa (PhD) 3/29/2023 39
Let x be:
> x=c(1:10)
> x
[1] 1 2 3 4 5 6 7 8 9 10
> pmax(x,6)
[1] 6 6 6 6 6 6 7 8 9 10
 Returns the values of x but values that are less than 6
will be replaced by 6.
> pmin(x,6)
[1] 1 2 3 4 5 6 6 6 6 6

 Returns the values of x but values that are greater than


6 will be replaced by 6.

Senahara Korsa (PhD) 3/29/2023 40


Character vectors
Character strings are another common data type, used to
represent text.

 To set up a character/string vector z consisting of 3 place


names use:

 Character strings are entered using either matching double ("


") or single (' ') quotes, but are printed using double quotes
(or sometimes without quotes).
Senahara Korsa (PhD) 3/29/2023 41
Escape Character
 Most characters can be used in a string, with a couple of
exceptions, one being the backslash character, "\".
 This character is called the escape character and is used to
insert characters that would otherwise be difficult to add.
 The table below shows some of the other characters that can
be "escaped" in this way.

Example

Senahara Korsa (PhD) 3/29/2023 42


Concatenation

 Character vectors can be concatenated using c( )

 Example:

Senahara Korsa (PhD) 3/29/2023 43


Logical Vectors
 A logical vector is a vector whose elements are TRUE,
FALSE or NA.
 Logical vectors are generated by conditions.
 Example: temp <- x>13

 Sets temp as a vector of the same length as x with values


FALSE corresponding to elements of x where the
condition is not met and TRUE where it is met.

 The logical operators are <, <=, >, >=, = = for exact
equality and != for inequality.
 In addition if c1 and c2 are logical expressions, then c1&c2 is
their intersection(“and”), c1|c2 is their union(“or”), and !c1 is
the negation of c1.
Senahara Korsa (PhD) 3/29/2023 44
Missing Values inVectors
 The function is.na(x) gives a logical vector of the same size as x
with value TRUE if and only if the corresponding element in x is
NA = not available or a missing value. Example:

 Note that there is a second kind of ―missingǁ values which are


produced by numerical computation, the so-called Not a Number,
NaN, values.
Example:

Note: is.na(xx) is TRUE both for NA and NaN

Senahara Korsa (PhD) 3/29/2023 45


Indexing Vectors
 Vectors indices are placed with square brackets:[]

 Vectors can be indexed in any of the following ways:

 Vector of positive integers


 Vector of negative integers
 Vector of named items
 Logical vector

Example:

Senahara Korsa (PhD) 3/29/2023 46


 Printing all elements of const except the 1st and the 2nd.
> const[c(-1,-2)]
sqrt2 golden
1.4142 1.618

 Printing TRUE or FALSE if respective value is greater than 2.

> const>2
pi euler sqrt2 golden
TRUE TRUE FALSE FALSE

 Printing Truth valued vallues.


> const[const>2]
pi euler
3.1416 2.7183

Senahara Korsa (PhD) 3/29/2023 47


Modifying Vectors
 To alter the contents of a vector, similar methods can
be used.
Example: Create a variable x with 5 elements: 10 5 3 6 21
> x=c(10,5,3,6,21)
 Now, to modify the first element of x and assign it a value 7
use
> x[1]<-7
> x
[1] 7 5 3 6 21
 The following command replaces any NA (missing)
values in the vector w with the value 0:
> w[is.na(w)]<-0

Senahara Korsa (PhD) 3/29/2023 48


Generating sequences
 R has a number of ways to generate sequences of
numbers.

 These includes by using colon (:)

 Example:

 To generate numbers from 1 to 10 in increasing order

> 1:10

[1] 1 2 3 4 5 6 7 8 9 10

 To generate numbers from 10 to 1 in decreasing order

> 10:1

[1] 10 9 8 7 6 5 4 3 2 1
Senahara Korsa (PhD) 3/29/2023 49
Generating sequences…
 The colon operator has high priority within an expression.

 Example: 2*1:10 is equivalent to 2*(1:10)

> 2*1:10
[1] 2 4 6 8 10 12 14 16 18 20
> 2*(1:10)
[1] 2 4 6 8 10 12 14 16 18 20
 The other option to generate numbers is using seq() function

Example:

> seq(1:10)

> seq(from=1, to=10)

> seq(to=10, from=1)

are all equivalent to 1:10


Senahara Korsa (PhD) 3/29/2023 50
Generating sequences…
 The parameters by=value and length=value specify a step
size and length for the sequence respectively.
 If neither of these is given, the default by=1 is assumed.

Example:
>seq(1,5, by=2)
[1] 1 3 5
> seq(1,10, length=5)

[1] 1.00 3.25 5.50 7.75 10.00


>seq(from=1, by=2.25, length=5)
[1] 1.00 3.25 5.50 7.75 10.00

Senahara Korsa (PhD) 3/29/2023 51


Vector Replication
 The function rep() can be used for replicating an object in
various complicated ways.
 The ff command will print 5 copies of x end to end
> rep(x, times=5) #or
> rep(x, 5)

 While the command :


> rep(x, each=5)

will print each element of x five times before moving onto the next.
 Further more, the command
> rep(c(1,4), c(2,3))

will print 1 two times and then 4 three times (1 1 4 4 4 ).

Senahara Korsa (PhD) 3/29/2023 52


Matrices
 A matrix can be regarded as a generalization of a vector.

 As with vectors, all the elements of a matrix must be of the


same data type.

 A matrix can be generated in two ways.

 Method 1: Using the function dim:

 Example

Senahara Korsa (PhD) 3/29/2023 53


Matrices…
Method 2: Using the function matrix:
> x <- matrix(c(1:8),2,4,byrow=F)

> x

[,1] [,2] [,3] [,4]

[1,] 1 2 3 4

[2,] 5 6 7 8

An equivalent expression:
> x<-matrix(c(1:8),nrow=2,ncol=4)

Senahara Korsa (PhD) 3/29/2023 54


Matrices…
 By default the matrix is filled by column.

 To fill the matrix by row specify byrow = T as


argument in the matrix function.

 Use the function cbind to create a matrix by binding


two or more vectors as column vectors.

 The function rbind is used to create a matrix


by binding two or more vectors as row vectors.

 Example:
> cbind(c(1,2,3),c(4,5,6))

> rbind(c(1,2,3),c(4,5,6))

Senahara Korsa (PhD) 3/29/2023 55


Matrix operations
 Matrix operations (multiplication, transpose, etc.) can easily
be performed in R using a few simple functions like:

Name Operation
dim() Dimension of the matrix (number of rows and columns)
as.matrix() Used to convert an argument into a matrix object
%*% Matrix multiplication
t() Matrix transpose
det() Determinant of a square matrix
solve() Matrix inverse; also solves a system of linear equations
eigen() Computes eigenvalues and eigenvectors

Senahara Korsa (PhD) 3/29/2023 56


Matrix Functions

Senahara Korsa (PhD) 3/29/2023 57


Arrays
 It can be considered as a multiply subscripted collection of

data entries, for example numeric.


 Arrays are generalizations of vectors and matrices.

 That means, vectors in the mathematical sense are one-

dimensional arrays where as matrices are two-dimensional


arrays; higher dimensions are also possible.
 There are two methods of creating arrays in R

 Method 1: Using vectors

 A vector can be used by R as an array only if it has a

dimension vector as its dimattribute.


 A dimension vector is a vector of non-negative integers.

 If its length is k then the array is k dimensional.


Senahara Korsa (PhD) 3/29/2023 58
Arrays…
 An array can be created by giving a vector structure, a dim,
which has the form > z=data_vector
> dimention_vector
 Example: The following is a 3 X 5 X 100 (3-dimentional) array
with dimension vector c(3,5,100) and a vector of 1500 elements.
> z=c(1:1500)

> dim(z) <- c(3,5,100)


 If the dimension vector for an array, say A, is c(3,4,2) then
there are 3X4X2 = 24 entries in A and the data vector holds
them in the order A[1,1,1], A[2,1,1], ..., A[2,4,2], A[3,4,2].
> A=c(5:28)
> dim(A)=c(3,4,2)

Senahara Korsa (PhD) 3/29/2023 59


Arrays…
Method 2: Using the function array()

 As well as giving a vector structure a dim attribute, arrays can be


constructed from vectors by the array function, which has the form

> Z<- array(data_vector, dim_vector)

 Example: if the vector h contains 24 or fewer, numbers then


the command

> Z<- array(h, dim=c(3,4,2))

would use h to set up 3 by 4 by 2 array in Z. If the size of h is


exactly 24 the result is the same as

> Z<- h ; dim(Z) <- c(3,4,2)

Senahara Korsa (PhD) 3/29/2023 60


Lists
 Lists: are collections of arbitrary objects.

 That is, the elements of a list can be objects of any type and
structure.

 Consequently, a list can contain another list and therefore it can


be used to construct arbitrary data structures.

 A list could consist of a numeric vector, a logical value, a matrix,


a complex vector, a character array, a function, and so on.

 Lists are created with the list()command:

L<-list(object-1,object-2,…,object-m)

Senahara Korsa (PhD) 3/29/2023 61


Lists example

Senahara Korsa (PhD) 3/29/2023 62


Lists…
 The elements of the list are accessed with the [[ ]] operator.

 Examples: Consider the previous L elements


>L <- list( c(1,5,3), matrix(1:6, nrow=3), c("Hello", "world"))

> L[[1]] # To obtain the 1st element of L

[1] 1 5 3

>L[[2]][2,1] # Element [2,1] of the second element of L

[1] 2
>L[[c(3,2)]] # Recursively: element 3 of L, hereof the 2nd element [1] "world”

> #OR

> L[[3]][2]

[1] "world"

Senahara Korsa (PhD) 3/29/2023 63


Data Frames
 Data frames: regarded as an extension to matrices.

 Data frames can have columns of different data types and are
the most convenient data structure for data analysis in R.

 Data frames are lists with the constraint that all elements are
vectors of the same length.

 The command data.frame() creates a data frame:

>dat<-data.frame(object-1,object-2,…,object-m)

Senahara Korsa (PhD) 3/29/2023 64


Example of Data Frames
 The data frame below contains the results of an experiment to
determine the effect of removing the tip of petunia plants grown
at 3 levels of nitrogen on various measures of growth.
 The data frame has 8 variables (columns) and each row
represents an individual plant.
 The variables treat and nitrogen are factors
(categorical variables).
 The treat variable has 2 levels (tip and notip) and the nitrogen
variable has 3 levels (low, medium and high).
 Variables height, weight, leafarea and shootarea are numeric
and the variable flowers is an integer representing the number of
flowers.

Senahara Korsa (PhD) 3/29/2023 65


Example…
treat nitrogen block height weight leafarea shootarea flowers

tip medium 1 7.5 7.62 11.7 31.9 1

tip medium 1 10.7 12.14 14.1 46 10

tip medium 1 11.2 12.76 7.1 66.7 10

tip medium 1 10.4 8.78 11.9 20.3 1

tip medium 1 10.4 13.58 14.5 26.9 4

tip medium 1 9.8 10.08 12.2 72.7 9


notip low 2 3.7 8.1 10.5 60.5 6
notip low 2 3.2 7.45 14.1 38.1 4
notip low 2 3.9 9.19 12.4 52.6 9
notip low 2 3.3 8.92 11.6 55.2 6
notip low 2 5.5 8.44 13.5 77.6 9
notip low 2 4.4 10.6 16.2 63.3 6

Senahara Korsa (PhD) 3/29/2023 66


Data Frame Construction
 We can construct a data frame from existing data objects
such as vectors using the data.frame() function.
 As an example, let’s create three
vectors p.height, p.weight and p.names and include all of
these vectors in a data frame object called dataf.
>p.height <- c(180, 155, 160, 167, 181)
>p.weight <- c(65, 50, 52, 58, 70)
>p.names <- c("Joanna", "Charlotte", "Helen", "Karen", "Amy")
>dataf <- data.frame(height = p.height, weight = p.weight, names =
p.names)
>dataf

Senahara Korsa (PhD) 3/29/2023 67


Data Frames…
 A data frame is a powerful two-dimensional object made
up of rows and columns which looks superficially very
similar to a matrix.
 However, whilst matrices are restricted to containing data
all of the same type, data frames can contain a mixture of
different types of data.
 Typically, in a data frame each row corresponds to an
individual observation and each column corresponds to a
different measured or recorded variable.

Senahara Korsa (PhD) 3/29/2023 68


Data Frames…
Example:
> name= c("Eden","Solomon","Zelalem","Kidist")
> age=c(18,22,25,27)
> sex=c("F","M","M","F")
> stud=data.frame(name,age,sex)
> stud
name age sex
1 Eden 18 F
2 Solomon 22 M
3 Zelalem 25 M
4 kidist 27 F

 To display the column names:

> names(stud) #OR colnames(stud)


[1] "name" "age" "sex"

 To display the row names:


> rownames(stud)
[1] "1" "2" "3" "4"

Senahara Korsa (PhD) 3/29/2023 69


Data Frames…
 You can use the “ names“ function to change the column names:
> names(stud)<-c("name","age","sex")
> stud
name age sex
1 Eden 18 F
Variable name
2 Solomon 22 M
3 Zelalem 25 M
4 kidist 27 F

 Similarly, use the “row.names” function to change the row names:


> row.names(stud)<("Wrt","Ato1","Ato2","Wro")
> stud
Wrt Eden 18 F
Ato1 Solomon 22 M
Ato2 Zelalem 25 M
Wro Kidist 27 F

Senahara Korsa (PhD) 3/29/2023 70


Factors
• R has a special data structure to store categorical variables,
which is factor.
• It tells R that a variable is nominal or ordinal by making it a
factor.
• Simplest form of the factor function:

Senahara Korsa (PhD) 3/29/2023 71


• Ideal form of the factor function:

The factor function has 3 parameters:


1. Vector Name
2. Values (Optional)
3. Value labels (Optional)

Senahara Korsa (PhD) 3/29/2023 72

You might also like