Management Information Systems

Lesson 2
Introduction to R (1)
INTRODUCTION TO R
• Software for Statistical Data Analysis
• Based on S
• Programming Environment
• Interpreted Language
• Data Storage, Analysis, Graphing
• Free and Open Source Software

Data Science
Data science is an interdisciplinary field that uses scientific methods,
processes, algorithms and systems to extract knowledge and insights from data
in various forms, both structured and unstructured, similar to data mining…

INTRODUCTION TO R
• Comprehensive R Archive Network:
• http://cran.r-project.org
• Free and Open Source
• Strong User Community
• Highly extensible, flexible
• Implementation of high end statistical methods
• Flexible graphics and intelligent defaults

INTRODUCTION TO R
• Highly Functional
• Everything done through functions
• Arguments can be matched by name or by position
• Argument names may be abbreviated when unambiguous; T and F are shorthand for TRUE and FALSE
• Object Oriented
• Everything is an object
• “<-” is an assignment operator
• “X <- 5”: X GETS the value 5
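The points above can be sketched in a few lines of base R. This is only an illustration; the variable names (x, values, s) are not from the slides.

```r
# Assignment: x gets the value 5
x <- 5

# Work is done by calling functions; arguments can be matched
# by position or by name.
values <- c(3.14159, 2.71828)
round(values, 2)            # positional
round(values, digits = 2)   # named
#> [1] 3.14 2.72

# Argument names can be abbreviated when the abbreviation is unambiguous:
# 'len' partially matches 'length.out'.
s <- seq(0, 1, len = 5)
s
#> [1] 0.00 0.25 0.50 0.75 1.00

# T and F are predefined variables holding TRUE and FALSE. They can be
# reassigned, so spelling out TRUE and FALSE is safer in scripts.
isTRUE(T)
#> [1] TRUE

# Even a function is an object with a class
class(round)
#> [1] "function"
```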
FUNCTIONS
• Functions: objects created by the user and reused to perform specific
operations.
• Functions form the core of R; everything you do in R uses a function in one
way or another.
• More importantly, the way functions work in R allows you to carry out
multiple complex operations in one step or a few simple steps.
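As a rough illustration of a user-defined function being reused and combined with a built-in one, consider the sketch below. The helper name standardise and the vector scores are hypothetical, chosen only for this example.

```r
# Hypothetical helper (not from the slides): standardise a numeric vector
# by subtracting its mean and dividing by its standard deviation.
standardise <- function(v) {
  (v - mean(v)) / sd(v)
}

scores <- c(10, 20, 30)

# Function calls can be nested, so several operations run in one step:
# standardise the scores, then round the result to two decimals.
round(standardise(scores), 2)
#> [1] -1  0  1
```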

Data Structures
• Supports virtually any type of data
• Numbers, characters, logicals (TRUE/FALSE)
• Arrays of virtually unlimited sizes
• Simplest: Vectors and Matrices
• Lists: can contain variables of mixed types
• Data Frame: Rectangular Data Set

Data Structures
• R is an object-oriented language: an object in R is
anything (constants, data structures, functions,
graphs) that can be assigned to a variable:
• Data Objects: used to store real or complex numerical
values, logical values or characters. These objects are
always vectors: there are no scalars in R.
• Language Objects: functions, expressions
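A short check of the "no scalars" statement, assuming a plain R session; the names x and f are illustrative.

```r
# A single number is a vector of length 1
x <- 5
is.vector(x)
#> [1] TRUE
length(x)
#> [1] 1

# Functions are objects too and can be assigned to a variable
f <- sqrt
f(16)
#> [1] 4
```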

Data Structures
Data structure types
• Vectors: one-dimensional arrays used to store collections of data of the
same mode
• Numeric Vectors (mode: numeric)
• Complex Vectors (mode: complex)
• Logical Vectors (mode: logical)
• Character Vectors or text strings (mode: character)
• Matrices: two-dimensional arrays to store collections of data of the
same mode. They are accessed by two integer indices.
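The vector modes and matrix indexing described above might look like this in practice. All object names (num, cpx, lgl, chr, m) are made up for the example.

```r
num <- c(1.5, 2, 3.25)        # numeric vector
cpx <- c(1+2i, 3-1i)          # complex vector
lgl <- c(TRUE, FALSE, TRUE)   # logical vector
chr <- c("a", "b", "c")       # character vector

mode(num); mode(cpx); mode(lgl); mode(chr)
#> [1] "numeric"
#> [1] "complex"
#> [1] "logical"
#> [1] "character"

# Mixing modes in c() coerces everything to one common mode
mode(c(1, "a", TRUE))
#> [1] "character"

# A 2 x 3 matrix, filled column by column
m <- matrix(1:6, nrow = 2, ncol = 3)
m[2, 3]   # row 2, column 3
#> [1] 6
```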

Data Structures
• Arrays: similar to matrices but they can be multi-dimensional (more
than two dimensions)
• Factors: vectors of categorical variables designed to group the
components of another vector with the same size
• Lists: ordered collection of objects, where the elements can be of
different types
• Data Frames: generalization of matrices where different columns can
store data of different modes.
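A minimal sketch of the four structures listed above, using invented example data (arr, group, height, person, df are not from the slides).

```r
# Array with three dimensions: 2 rows, 3 columns, 2 layers
arr <- array(1:12, dim = c(2, 3, 2))
arr[1, 2, 2]
#> [1] 9

# Factor: categorical labels that group another vector of the same length
group  <- factor(c("a", "b", "a", "b"))
height <- c(170, 180, 160, 175)
levels(group)
#> [1] "a" "b"
group_means <- tapply(height, group, mean)   # a = 165, b = 177.5

# List: an ordered collection whose elements can have different types
person <- list(name = "Ada", age = 36, passed = TRUE)
person$age
#> [1] 36

# Data frame: columns of different modes, all of the same length
df <- data.frame(id = 1:3, ok = c(TRUE, FALSE, TRUE))
class(df$id)
#> [1] "integer"
class(df$ok)
#> [1] "logical"
```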

Data Structures in R

                 Linear    Rectangular
All Same Type    VECTOR    MATRIX
Mixed            LIST      DATA FRAME

R and R Studio
• R runs on all major operating systems (Mac, Linux, or Windows).
• When you download R, you automatically download a console
application that’s suitable for your operating system.
• RStudio is a cross‐platform Integrated Development Environment (IDE)
with some very neat features to support R.
• RStudio provides a common user interface across the major operating
systems.
• For this reason, we use RStudio to demonstrate some of the concepts
rather than any specific operating‐system version of R.

Performing multiple calculations with vectors
• R is a vector‐based language.
• You can think of a vector as a row or column of numbers or text.
• The list of numbers {1,2,3,4,5}, for example, could be a vector.
• Unlike most other programming languages, R allows you to apply functions
to the whole vector in a single operation without the need for an explicit
loop.
• First, assign the values 1:5 to a vector called x:
> x <- 1:5
> x
[1] 1 2 3 4 5

Performing multiple calculations with vectors
• Next, add the value 2 to each element in the vector x
> x + 2
[1] 3 4 5 6 7

• You can also add one vector to another. To add the values 6:10
element‐wise to x, you do the following:
> x + 6:10
[1] 7 9 11 13 15
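To make the "no explicit loop" point concrete, the sketch below compares a loop with the vectorised form and shows what happens when the two vectors differ in length. The names y_loop and y_vec are illustrative.

```r
x <- 1:5

# Explicit loop: add 2 to each element, one at a time
y_loop <- numeric(length(x))
for (i in seq_along(x)) {
  y_loop[i] <- x[i] + 2
}

# Vectorised form: the same result in a single operation
y_vec <- x + 2
all(y_loop == y_vec)
#> [1] TRUE

# Many built-in functions also work on whole vectors
sqrt(c(1, 4, 9))
#> [1] 1 2 3
sum(x)
#> [1] 15

# If the vectors differ in length, the shorter one is recycled
1:6 + c(10, 20)
#> [1] 11 22 13 24 15 26
```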

Exploring RGui

R Studio
• RStudio is a code editor and development environment with some very
nice features that make code development in R easy and fun:
• Code highlighting that gives different colors to keywords
and variables, making it easier to read
• Automatic bracket and parenthesis matching
• Code completion, so you don’t have to type out all
commands in full
• Easy access to R Help, with some nice features for exploring
functions and parameters of functions
• Easy exploration of variables and values

R Studio
• Source: The top‐left corner of
the screen contains a text editor
that lets you work with source
script files. Here, you can enter
multiple lines of code, save your
script file to disk, and perform
other tasks on your script.
• Console: In the bottom‐left
corner, you find the console. This
is where you do all the
interactive work with R.
R Studio
• Environment and History: The
top‐right corner is a handy
overview of your environment,
where you can inspect the
variables you created in your
session, as well as their values.
• This is also the area where you
can see a history of the
commands you’ve issued in R.

R Studio
• Files, Plots, Packages, Help, and Viewer: In the bottom‐right corner, you
have access to several tools:
  • Files: This is where you can browse the folders and files on your
  computer.
  • Plots: This is where R displays your plots (charts or graphs).
  • Packages: You can view a list of all installed packages. A package is a
  self‐contained set of code that adds functionality to R, similar to the
  way that add‐ins add functionality to Microsoft Excel.

Get or Set Working Directory
• getwd() returns an absolute file path representing the current working
directory of the R process; setwd() is used to set the working directory.
Usage
• getwd()
• setwd(dir)
> setwd("C:/Users/user1/Desktop/R Proje")
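A cautious way to try these two functions is to save the current directory first and restore it afterwards. The sketch uses tempdir() only so that it runs on any machine; old_dir is an illustrative name.

```r
old_dir <- getwd()        # remember the current working directory

# setwd() needs the path of an existing directory. On Windows, write paths
# with forward slashes or doubled backslashes.
setwd(tempdir())
getwd()                   # now prints the temporary directory

setwd(old_dir)            # restore the original working directory
```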

Help
• The help() function and ? help
operator in R provide access to
the documentation pages for R
functions, data sets, and other
objects, both for packages in
the standard R distribution and
for contributed packages.
• Examples:
• ?median
• help(median)
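Beyond ? and help(), base R has a few related lookup functions. The sketch below shows some of them for median; how the help page is displayed depends on the environment (console, RStudio Help pane, or browser).

```r
help("median")        # same page as ?median
args(median)          # show the arguments a function accepts
example(median)       # run the examples from the help page
apropos("median")     # names of visible objects that contain "median"
```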

First Codes, Variables and Data Entry
• Data Entry
• > x <- 10

Scalar or vector
• > a <- c(10, 20, 30)

First Codes, Variables and Data Entry
• Vector
• Created with c()
• c stands for "combine"
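A small sketch of c() in use; the vectors b and prices are invented for the example.

```r
a <- c(10, 20, 30)
b <- c(a, 40, 50)          # an existing vector can be combined with more values
b
#> [1] 10 20 30 40 50
length(b)
#> [1] 5

# Elements can be given names while combining
prices <- c(tea = 2.5, coffee = 3)
prices[["coffee"]]
#> [1] 3
```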

First Codes, Variables and Data Entry
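• With (): wrapping an assignment in parentheses prints the assigned value
• Without (): the assignment produces no output
• "#" starts a comment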

x <- 10
a <- c(10, 20, 30)   # No output is produced
b <- 2 * a           # No output is produced
(b <- 2 * a)         # The variable is shown in the console
[1] 20 40 60

First Codes, Variables and Data Entry
• The data type of a variable is dynamic (changeable).
• R determines the data type of the variable dynamically according to the
type of data we assign.
• The class() function is used to see a variable’s type.
• The ls() function is used to see a list of all variables.

x <- 4.15
class(x)
x <- "xyzt"
class(x)
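The dynamic typing described above can be followed step by step in the sketch below, which reuses the name x for three different kinds of value.

```r
x <- 4.15
class(x)
#> [1] "numeric"

x <- "xyzt"      # the same name now holds text
class(x)
#> [1] "character"

x <- 10 > 3      # and now a logical value
class(x)
#> [1] "logical"

ls()             # names of all objects in the current environment
```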

First Codes, Variables and Data Entry
• rm() can be used to remove objects.
• The value of the internal evaluation of a top-level R expression is
always assigned to .Last.value

> 45/5+sqrt(81)+exp(pi*log(10))
> t <- .Last.value
> t
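A short sketch of rm(), with a note on .Last.value. The object names tmp1, tmp2 and res are illustrative. The name res is used instead of t because t is also the name of R's transpose function.

```r
tmp1 <- 1:3
tmp2 <- "text"

rm(tmp1)                          # remove a single object
exists("tmp1", inherits = FALSE)
#> [1] FALSE
exists("tmp2", inherits = FALSE)
#> [1] TRUE

rm(list = "tmp2")                 # remove objects named in a character vector

# .Last.value is mainly useful at the interactive prompt, for example to
# keep a result that was computed but not assigned:
# > 45/5 + sqrt(81)
# [1] 18
# > res <- .Last.value
```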

Basic Arithmetic Operations
> 25 + 32 - 17 + 3
[1] 43

> 17 + 27 / 3 - 1/2 * 4/6
[1] 25.66667
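Results like these depend on operator precedence: multiplication and division are evaluated before addition and subtraction, and parentheses override the default order. A few further base R operators are sketched below.

```r
2 + 3 * 4        # multiplication before addition
#> [1] 14
(2 + 3) * 4      # parentheses change the order
#> [1] 20
2^3              # exponentiation
#> [1] 8
-2^2             # ^ binds more tightly than unary minus
#> [1] -4
7 %/% 2          # integer division
#> [1] 3
7 %% 2           # remainder
#> [1] 1
```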

Basic Arithmetic Operations
print()
• Prints its argument
• digits: minimal number of significant digits

print(exp(2), digits = 2)
[1] 7.4
print(exp(2), digits = 3)
[1] 7.39

round()
• Rounds the values in its first argument to the specified number of
decimal places (default 0).

round(pi, digits = 2)
[1] 3.14
round(pi, digits = 3)
[1] 3.142
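The difference between the two functions is that print() only affects how a value is displayed, while round() returns a new value. The sketch below also uses signif(), a related base R function not covered on the slide, for rounding to significant digits; v is an illustrative name.

```r
v <- exp(2)              # 7.389056...

print(v, digits = 3)     # display only; v itself is unchanged
#> [1] 7.39
round(v, digits = 2)     # new value, rounded to 2 decimal places
#> [1] 7.39
signif(v, digits = 2)    # new value, rounded to 2 significant digits
#> [1] 7.4
round(1234.567, digits = -2)   # negative digits round to tens, hundreds, ...
#> [1] 1200
round(2.567)             # default digits = 0
#> [1] 3
```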
