Introduction to R for Statistical Analysis

R is a free and open-source language and environment for statistical analysis and graphics. It contains thousands of pre-programmed statistical functions and can import data from various formats like Excel, SPSS, and STATA. R allows users to easily perform data manipulation, generate publication-quality graphs, conduct statistical tests, and develop statistical models.

11/14/2016

A Brief Introduction to R

Dr. Norberto E. Milla

What is R?
• R is a language and environment for statistical computing and graphics
• R is a free, open-source implementation of the S language, on which the commercial S-PLUS is also based
• Initially developed by Robert Gentleman and Ross Ihaka of the University of Auckland (early 1990s)
• R is written by statisticians for statisticians (and the rest of us)
• An environment: a huge library of algorithms for data access, data manipulation, analysis and graphics
• A community
–Thousands of contributors, 2 million users
–Resources and help in every domain


Awesome thing #1: It's FREE!

• Open Source, licensed under GPL (like Linux!)
–Free as in freedom
• Flexible and runs on a wide array of platforms,
including Windows, Unix, and Mac OS X
• Open for integration
–Data (SAS, SPSS, STATA, Excel, …)
• Broad user-base
–De-facto standard for data analysis and
teaching statistics

Awesome thing #2: Language

• Programming, not dialogs or cell formulas
–Freedom to combine methods
–Repeatable results
–Reliable and reusable
• Language designed for data analysis
–Object-oriented: vector, matrix, model, …
–Built-in library of algorithms
• Get more done, faster


Awesome thing #3: Graphics

• Functions for standard graphs
–Scatterplot, boxplot, histogram, smoothing
–Bar plot, pie chart, dot chart, …
–Image plot, 3-D surface, map, …
• Customize without limits
–Combine graph types
–Create entirely new graphics
– Use of colors

Awesome thing #4: Statistics

• All standard statistical methods built in
–Mean, median, covariance, distributions, …
–Regression, ANOVA, cross-tabulations, …
–Survival, nonlinear mixed effects, GLM, …
–Neural networks, trees, GAM, …
• Object-oriented functions
–Access all parts of the analysis results
–Combine analytic methods
• Over 3,000 contributed packages for specialized
applications (as of 2011)


Caveat

“Using R is a bit akin to smoking. The beginning is difficult, one may get headaches and even gag the first few times. But in the long run, it becomes pleasurable and even addictive.”
--Francois Pinard

Downloading and installing R

Step 1: Go to the R homepage: http://www.r-project.org

Step 2: Select a CRAN mirror site

Step 3: Select the appropriate installer for your operating system

Step 4: Select the “base” installer

Step 5: Download the R installer

Step 6: Double-click the downloaded R installer

Step 7: In the pop-up dialog, click OK

Step 8: Click Next in each subsequent window until you reach the Finish window

The R console


Data types in R
• R has varied data types: scalars (vectors of length one), vectors, matrices, data frames and lists
• A vector is a single entity consisting of an ordered collection of values of the same type (numeric, character or logical)
• A matrix is a vector that is indexed by two indices (row and column); an array generalizes this to more indices
• Data frames are matrix-like structures in which the columns can be of different types
• Data frames are ‘data matrices’ with one row per observational unit but with (possibly) both numerical and categorical variables
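As a quick illustration of the structures listed above, the following sketch builds one of each and inspects it. The object names (v, m, df, lst) and the values are made up for this example.

```r
# A vector: ordered values of a single type
v <- c(2.5, 4, 7)

# A matrix: a vector with row and column dimensions
m <- matrix(1:6, nrow = 2, ncol = 3)

# A data frame: columns may have different types
df <- data.frame(id = 1:3, group = c("a", "b", "a"))

# A list: can hold objects of any kind, including the ones above
lst <- list(numbers = v, table = m, data = df)

class(v)           # type of the vector elements
dim(m)             # number of rows and columns
sapply(df, class)  # class of each column
names(lst)         # names of the list components
```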

Vector
• R is case-sensitive
• Assignment operators in R: <-, =
a <- c(1, 2, 5, 3, 6, -2, 4)     # numeric vector
b = c("one", "two", "three")     # character vector
c = c(TRUE, FALSE, TRUE, TRUE)   # logical vector
• Elements of a vector can be referred to using
subscripts
• The following command will display the 2nd and 4th
elements of vector a
a[c(2, 4)]
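Subscripts can take several forms besides a list of positions. The sketch below reuses the numeric vector a from this slide and shows a few common ones.

```r
a <- c(1, 2, 5, 3, 6, -2, 4)

a[3]          # third element
a[c(2, 4)]    # 2nd and 4th elements
a[2:4]        # 2nd through 4th elements
a[-1]         # everything except the first element
a[a > 3]      # elements greater than 3
```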


Matrix
• All columns in a matrix must have the same mode
(numeric, character, etc.) and the same length
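mymatrix=matrix(vector, nrow=r, ncol=c, byrow=FALSE,
dimnames=list(char_vector_rownames, char_vector_colnames))

Example:
matrix1=matrix(1:20, 4, 5) #generates a 4x5 matrix, filled column by column

x=c(1:9)
rnames=c("r1","r2","r3") #named rnames and cnames to avoid masking the rownames() and colnames() functions
cnames=c("c1","c2","c3")
matrix2=matrix(x, 3, 3, byrow=TRUE,
dimnames=list(rnames,cnames)) #3x3 matrix filled row by row, with row and column names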
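Once a matrix exists, its elements are referred to with two subscripts, [row, column]. The sketch below rebuilds the 3x3 example and indexes it by position and by name; the variable names rnames and cnames are this sketch's own choice.

```r
x <- 1:9
rnames <- c("r1", "r2", "r3")
cnames <- c("c1", "c2", "c3")
matrix2 <- matrix(x, nrow = 3, ncol = 3, byrow = TRUE,
                  dimnames = list(rnames, cnames))

matrix2[2, 3]         # element in row 2, column 3
matrix2["r2", "c3"]   # same element, by name
matrix2[, "c1"]       # first column
matrix2[1, ]          # first row
dim(matrix2)          # number of rows and columns
```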

Data Frame
• In a data frame different columns can have different
modes
• Similar to SAS and SPSS data sets
• Example:
x=c(1,2,3,4)
y=c("red", "white", "red", NA)
z=c(TRUE, TRUE, FALSE, FALSE)
mydata=data.frame(x,y,z) #will create the data frame mydata
names(mydata)=c("ID", "Color", "Passed") #sets the column names of mydata
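After creating a data frame it is usually worth checking its structure. The following sketch rebuilds the example above and shows a few ways of inspecting and indexing it.

```r
x <- c(1, 2, 3, 4)
y <- c("red", "white", "red", NA)
z <- c(TRUE, TRUE, FALSE, FALSE)
mydata <- data.frame(x, y, z)
names(mydata) <- c("ID", "Color", "Passed")

str(mydata)               # compact summary of columns and their types
mydata$Color              # one column, by name
mydata[2, ]               # second row
mydata[mydata$Passed, ]   # rows where Passed is TRUE
nrow(mydata)              # number of observations
ncol(mydata)              # number of variables
```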


R built-in data editor

• One can enter data interactively into R using its
built-in spreadsheet
mydata=data.frame() #will create an empty data frame
mydata=edit(mydata) #will open the spreadsheet for data entry

Importing data from Excel

• For Excel 2003 or earlier, save the file in csv format and use either of the following commands to import the file into R
read.csv("D:/DMPS/R Training/QUICK-R/import1.csv", header=TRUE, sep=",")
or,

read.table("D:/DMPS/R Training/QUICK-R/import1.csv", header=TRUE, sep=",")
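The import commands return a data frame, so the result should be assigned to a name and then checked. The sketch below is self-contained: it writes a small, made-up data set to a temporary csv file (csv_path and imported are hypothetical names) and reads it back.

```r
# Hypothetical data, written to a temporary file so the example runs anywhere
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, score = c(90, 85, 77)),
          csv_path, row.names = FALSE)

# Import the file and keep the result as a data frame
imported <- read.csv(csv_path, header = TRUE)

str(imported)     # check that the columns have the expected types
head(imported)    # look at the first rows
```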


Importing data from Excel

• For Excel 2007 or 2010, first load the xlsx package using the following command
library(xlsx)

• Then use the following command to import the file into R
read.xlsx("D:/DMPS/R Training/QUICK-R/import2.xlsx", sheetIndex=1)
or, simply
read.xlsx("D:/DMPS/R Training/QUICK-R/import2.xlsx", 1)

Importing data from SPSS

• There are two packages which can be used to import SPSS data sets into R: foreign and Hmisc
• Load the foreign package
library(foreign)
• Use the following command to import the data into R
myspssdata=read.spss("D:/DMPS/R Training/QUICK-R/ched_complete.sav", use.value.labels=TRUE, to.data.frame=TRUE)


Importing data from SPSS

• Save the SPSS data set in portable (*.por) format
• Load the Hmisc package
library(Hmisc)
• Use the following command to import the data into R
myspssdata=spss.get("D:/DMPS/R Training/QUICK-R/ched_complete.por", use.value.labels=TRUE, to.data.frame=TRUE)

Importing data from STATA

• Load the foreign package
library(foreign)
• Use the following command to import the data into R
mystatadata=read.dta("D:/DMPS/R Training/QUICK-R/statadata.dta", convert.factors=TRUE)
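To see what convert.factors=TRUE does without needing an existing Stata file, the sketch below writes a small, made-up data frame to a temporary .dta file with write.dta() from the same foreign package and reads it back. The names demo, dta_path and back are hypothetical, and the example assumes the foreign package is installed.

```r
library(foreign)

# Hypothetical data set with one categorical variable
demo <- data.frame(id = c(1, 2, 3),
                   grp = factor(c("a", "b", "a")))

# Write a temporary Stata file; factor levels are stored as value labels
dta_path <- tempfile(fileext = ".dta")
write.dta(demo, dta_path)

# Read it back; labelled variables are converted to factors
back <- read.dta(dta_path, convert.factors = TRUE)
str(back)
```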


Variable labels
• Using the edit() function we can specify the variable labels in the R spreadsheet

• An alternative is to use the following command:

names(mydata)[3]="age" # this assigns age as the name of the 3rd column of mydata

Value labels
• Use the factor() function for nominal data and the
ordered() function for ordinal data
• Suppose the variable v1 is coded 1, 2 or 3 and we
want to attach value labels 1=red, 2=blue and
3=green

mydata$v1=factor(mydata$v1, levels=c(1,2,3),
labels=c("red", "blue", "green"))


Value labels
• Suppose the variable y is coded 1, 3 or 5 and we
want to attach value labels 1=Low, 3=Medium, and
5=High

mydata$y=ordered(mydata$y, levels=c(1,3,5),
labels=c("Low", "Medium", "High"))
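The difference between factor() and ordered() shows up when the values are compared. The sketch below applies both to small, made-up code vectors (v1, y, color and level are names used only here).

```r
# Hypothetical coded values
v1 <- c(1, 3, 2, 1)
y  <- c(5, 1, 3, 5)

# Nominal variable: labels without an ordering
color <- factor(v1, levels = c(1, 2, 3),
                labels = c("red", "blue", "green"))

# Ordinal variable: labels ordered as Low < Medium < High
level <- ordered(y, levels = c(1, 3, 5),
                 labels = c("Low", "Medium", "High"))

table(color)      # counts per label
levels(level)     # labels in their stated order
level > "Low"     # order comparisons are meaningful only for ordered factors
```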

Creating new variables

• There are three ways to create new variables from existing variables in an R data set
• Suppose the R data set mydata has two variables x1 and x2 and we want to create two new variables: the sum and the mean of x1 and x2
• One way to accomplish this is as follows:
attach(mydata)
mydata$sum=x1+x2
mydata$mean=(x1+x2)/2
detach(mydata)
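The slide mentions three ways but shows only the attach() approach. Two other common approaches, sketched here on a made-up data set, are to refer to the columns explicitly and to use transform(); whether these are the exact alternatives the author had in mind is an assumption.

```r
mydata <- data.frame(x1 = c(1, 2, 3), x2 = c(10, 20, 30))  # hypothetical data

# Referring to the columns explicitly, without attach()
mydata$sum  <- mydata$x1 + mydata$x2
mydata$mean <- (mydata$x1 + mydata$x2) / 2

# Using transform(), which evaluates the expressions inside the data frame
mydata2 <- transform(mydata[c("x1", "x2")],
                     sum  = x1 + x2,
                     mean = (x1 + x2) / 2)
```

Both versions avoid attach(), so there is no risk of forgetting the matching detach().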


Recoding variables
• Suppose we want to categorize age as follows: over 75 = Old, over 45 up to 75 = Middle Aged, and 45 or below = Young
• This can be done as follows:
attach(mydata)
mydata$agecat[age<=45]="Young"
mydata$agecat[age>45 & age<=75]="Middle Aged"
mydata$agecat[age>75]="Old"
detach(mydata)
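The same recoding can also be written in a single step with cut(), which is a different technique from the one on the slide. The sketch below uses made-up ages and returns agecat as a factor rather than a character vector.

```r
mydata <- data.frame(age = c(30, 45, 46, 75, 80))  # hypothetical ages

# With the default right = TRUE, each interval includes its upper limit,
# so 45 falls in "Young" and 75 falls in "Middle Aged"
mydata$agecat <- cut(mydata$age,
                     breaks = c(-Inf, 45, 75, Inf),
                     labels = c("Young", "Middle Aged", "Old"))

table(mydata$agecat)   # counts per age category
```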

Renaming variables
• There are many ways to do this
• The simplest is using the fix() function
fix(mydata) # opens mydata in the data editor; results are saved on close
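Because fix() is interactive, it cannot be used inside a script. One of the other ways is to assign to names() directly, sketched here on a made-up data frame with hypothetical column names.

```r
mydata <- data.frame(v1 = 1:3, v2 = c("a", "b", "c"))  # hypothetical data

# Rename by position
names(mydata)[2] <- "group"

# Rename by current name, which does not depend on the column order
names(mydata)[names(mydata) == "v1"] <- "id"

names(mydata)   # show the resulting column names
```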


Merging data sets

• We can merge data sets horizontally using the
merge() function
newdata=merge(data1,data2,by="id") #assuming id is common to data1 and data2
• Vertical merging can be done using the rbind() function
newdata=rbind(data1,data2) #assuming data1 and data2 have the same variables
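A small worked example may help to show what each kind of merge keeps. The data sets below are made up for illustration.

```r
data1 <- data.frame(id = c(1, 2, 3), age = c(23, 35, 41))
data2 <- data.frame(id = c(2, 3, 4), score = c(88, 92, 75))

# Horizontal merge: by default only ids present in both data sets are kept
inner <- merge(data1, data2, by = "id")

# all = TRUE keeps every id and fills the gaps with NA
outer <- merge(data1, data2, by = "id", all = TRUE)

# Vertical merge: both data sets must have the same variables
data3 <- data.frame(id = c(5, 6), age = c(52, 60))
stacked <- rbind(data1, data3)
```

Here inner has rows for ids 2 and 3 only, outer has rows for ids 1 through 4, and stacked has five rows.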

Selecting variables
• The following command can be used to select
variables
newdata=mydata[c("v1","v3","v15")] # this selects variables v1, v3, and v15 in mydata

Or,

newdata=mydata[c(5:10)] # this will select the 5th through the 10th variables in mydata


Excluding/removing variables
• The following command can be used to exclude variables from the analysis
newdata=mydata[c(-1, -3)] # this will remove the 1st and 3rd variables in mydata

Or,

mydata$v1=mydata$v3=NULL # this will delete the variables v1 and v3 in mydata
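Negative subscripts work only with positions, not with names. To exclude variables by name without modifying mydata itself, one option is a logical test on names(), sketched here on a made-up data frame (keep, drop_names, dropped and dropped2 are hypothetical names).

```r
mydata <- data.frame(v1 = 1:2, v2 = 3:4, v3 = 5:6, v4 = 7:8)  # hypothetical data

# Keep variables by name
keep <- mydata[c("v2", "v4")]

# Drop variables by name using a logical test on the column names
drop_names <- c("v1", "v3")
dropped <- mydata[!(names(mydata) %in% drop_names)]

# Same result using negative positions
dropped2 <- mydata[c(-1, -3)]
```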

Selecting observations
• Use the following commands to select observations
newdata=mydata[1:5,] #will select the first 5 observations in mydata

attach(mydata)
newdata=mydata[which(gender=="male" & age>=65),] #will select males aged 65 and over
detach(mydata)
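The same selection can be written without attach(), either by naming the data frame in the condition or with subset(). The sketch below uses a made-up data set; the column score and the result names are hypothetical.

```r
mydata <- data.frame(gender = c("male", "female", "male", "male"),
                     age    = c(70, 68, 40, 65),
                     score  = c(1, 2, 3, 4))   # hypothetical data

# Logical indexing with explicit column references;
# which() drops rows where the condition evaluates to NA
older_men <- mydata[which(mydata$gender == "male" & mydata$age >= 65), ]

# subset() evaluates the condition inside the data frame
# and can select variables at the same time
older_men2 <- subset(mydata, gender == "male" & age >= 65,
                     select = c(gender, age))
```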
