
Week 10

Introduction to Non-Tabular Data Types: Time series, spatial data, Network data.
Data Transformations: Converting Numeric Variables into Factors, Date Operations,
String Parsing, Geocoding

Aim: To work with time series, spatial data and network data, and to apply data
transformations: converting numeric variables into factors, date operations, string parsing and geocoding.

TIME SERIES:

Description:

A time series is a sequence of data points in which each data point is associated with a
timestamp. A simple example is the price of a stock at different points in time on a given
day. Another example is the amount of rainfall in a region in different months of the year.
R provides many functions to create, manipulate and plot time series data. The data for a
time series is stored in an R object called a time-series object, which is an R data object
like a vector or data frame.
A time-series object is created by using the ts() function.
Syntax
The basic syntax of the ts() function is:
timeseries.object.name <- ts(data, start, end, frequency)

Arguments:

• data is a vector or matrix containing the values used in the time series.
• start specifies the time of the first observation in the time series.
• end specifies the time of the last observation in the time series.
• frequency specifies the number of observations per unit of time (for example, 12 for monthly data when the unit is one year).
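As a minimal sketch of how these arguments fit together, the following builds a monthly series for the rainfall example mentioned in the description. The rainfall values and the 2012 start year are made up for illustration; only base R is used.

```r
# Hypothetical monthly rainfall totals (mm); the numbers are illustrative only.
rainfall <- c(799, 1174.8, 865.1, 1334.6, 635.4, 918.5,
              685.5, 998.6, 784.2, 985, 882.8, 1071)

# frequency = 12 means 12 observations per unit of time (one year);
# start = c(2012, 1) places the first observation in January 2012.
rain_ts <- ts(rainfall, start = c(2012, 1), frequency = 12)

print(rain_ts)        # printed as a year-by-month table
start(rain_ts)        # 2012 1
end(rain_ts)          # 2012 12
frequency(rain_ts)    # 12

# window() extracts a sub-period, here June to August 2012.
summer <- window(rain_ts, start = c(2012, 6), end = c(2012, 8))
summer
```

Because end is not given, ts() derives it from start, frequency and the number of values.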

Source code:

# Weekly data of COVID-19 positive cases from
# 22 January, 2020 to 15 April, 2020
x <- c(580, 7813, 28266, 59287, 75700,
       87820, 95314, 126214, 218843, 471497,
       936851, 1508725, 2072113)

# library required for decimal_date() and ymd()
library(lubridate)

# output to be created as png file
png(file = "timeSeries.png")

# creating time series object
# from date 22 January, 2020
mts <- ts(x, start = decimal_date(ymd("2020-01-22")),
          frequency = 365.25 / 7)

# plotting the graph
plot(mts, xlab = "Weekly Data",
     ylab = "Total Positive Cases",
     main = "COVID-19 Pandemic",
     col.main = "darkgreen")

# saving the file
dev.off()
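The start and frequency values used above can be unpacked with a small base-R sketch. The helper decimal_date_base() is hypothetical and written only for illustration; for a plain date it should give the same value as lubridate's decimal_date(), namely the year plus the fraction of the year that has elapsed.

```r
# Base-R sketch of a decimal date: year + (days elapsed) / (days in year).
# decimal_date_base() is a hypothetical helper, not a lubridate function.
decimal_date_base <- function(d) {
  d <- as.Date(d)
  year <- as.integer(format(d, "%Y"))
  year_start <- as.Date(paste0(year, "-01-01"))
  next_start <- as.Date(paste0(year + 1, "-01-01"))
  days_in_year <- as.numeric(next_start - year_start)
  year + as.numeric(d - year_start) / days_in_year
}

start_time <- decimal_date_base("2020-01-22")   # 2020 + 21/366
weekly_frequency <- 365.25 / 7                   # about 52.18 weeks per year

# First three weekly counts from the example above.
cases_demo <- c(580, 7813, 28266)
demo_ts <- ts(cases_demo, start = start_time, frequency = weekly_frequency)

# Consecutive time stamps differ by 1 / weekly_frequency years (one week).
time(demo_ts)
```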

Multivariate Time Series Analysis


# Weekly data of COVID-19 positive cases and
# weekly deaths from 22 January, 2020 to
# 15 April, 2020
positiveCases <- c(580, 7813, 28266, 59287,
                   75700, 87820, 95314, 126214,
                   218843, 471497, 936851,
                   1508725, 2072113)

deaths <- c(17, 270, 565, 1261, 2126, 2800,
            3285, 4628, 8951, 21283, 47210,
            88480, 138475)

# library required for decimal_date() function
library(lubridate)

# output to be created as png file
png(file = "multivariateTimeSeries.png")

# creating multivariate time series object
# from date 22 January, 2020
mts <- ts(cbind(positiveCases, deaths),
          start = decimal_date(ymd("2020-01-22")),
          frequency = 365.25 / 7)

# plotting the graph
plot(mts, xlab = "Weekly Data",
     main = "COVID-19 Cases",
     col.main = "darkgreen")

# saving the file
dev.off()
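A multivariate time-series object behaves like a matrix whose columns share one time axis. The sketch below reuses the two vectors from the example and writes the start time as a plain number (2020 + 21/366, which corresponds to 22 January 2020) so that no extra package is needed. The object names covid, deaths_ts and death_pct are chosen here for illustration.

```r
positiveCases <- c(580, 7813, 28266, 59287, 75700, 87820, 95314,
                   126214, 218843, 471497, 936851, 1508725, 2072113)
deaths <- c(17, 270, 565, 1261, 2126, 2800, 3285,
            4628, 8951, 21283, 47210, 88480, 138475)

# Start time as a decimal year so this sketch needs only base R.
covid <- ts(cbind(positiveCases, deaths),
            start = 2020 + 21 / 366, frequency = 365.25 / 7)

dim(covid)        # 13 rows (weeks) and 2 columns (series)
colnames(covid)   # "positiveCases" "deaths"

# Each column is itself a univariate time series.
deaths_ts <- covid[, "deaths"]
is.ts(deaths_ts)

# Series on the same time axis can be combined element-wise,
# for example deaths as a percentage of positive cases.
death_pct <- 100 * covid[, "deaths"] / covid[, "positiveCases"]
round(death_pct, 2)
```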

SPATIAL DATA:

Description:

tmap element: A building block for drawing thematic maps. All element functions have the
prefix tm_.

Quick thematic map plot: qtm() draws a thematic map quickly. It is a convenient wrapper
around the main plotting method of stacking tmap elements. Called without arguments or
with a search term, it draws an interactive map.
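The idea of stacking elements can be sketched as follows, assuming the tmap package and its bundled World data set are available. The code is guarded so that it is skipped when tmap is not installed, and the object names map_stacked and map_quick are chosen here for illustration.

```r
# Sketch of stacking tmap elements; skipped if tmap is not installed.
if (requireNamespace("tmap", quietly = TRUE)) {
  library(tmap)
  data(World)

  # tm_shape() names the spatial object; each tm_ element that follows
  # adds one layer drawn from that object.
  map_stacked <- tm_shape(World) +
    tm_polygons("HPI")

  # The quick-plot wrapper applied to the same object and variable.
  map_quick <- qtm(World, fill = "HPI")

  # Printing either object (for example, print(map_stacked)) draws the map.
}
```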

Syntax:

qtm(shp, fill = NA, symbols.size = NULL, symbols.col = NULL,
    symbols.shape = NULL, dots.col = NULL, text = NULL, text.size = 1,
    text.col = NA, lines.lwd = NULL, lines.col = NULL, raster = NA,
    borders = NA, by = NULL, scale = NA, title = NA, projection = NULL,
    bbox = NULL, basemaps = NA, overlays = NA, style = NULL, format = NULL, ...)
Source code:

> library(tmap)
> data(World, rivers, metro)
> qtm(World)

> qtm(World, fill = "HPI", fill.n = 9, fill.palette = "div",
      fill.title = "Happy Planet Index", fill.id = "name",
      style = "gray", format = "World", projection = "+proj=eck4")

> qtm(World, fill = "continent", format = "World", style = "col_blind",
      projection = "+proj=eck4")

> qtm(World, fill = "economy", format = "World", style = "col_blind",
      projection = "+proj=eck4")

> qtm(World, borders = NULL) +
      qtm(metro, symbols.size = "pop2010",
          symbols.title.size = "Metropolitan Areas",
          symbols.id = "name", format = "World")

old-style crs object detected; please recreate object with a recent sf::st_crs()

old-style crs object detected; please recreate object with a recent sf::st_crs()

old-style crs object detected; please recreate object with a recent sf::st_crs()

> current.mode <- tmap_mode("view")

tmap mode set to interactive viewing

> # The search term below is misspelled (the city is usually spelled
> # "Visakhapatnam"), so the geocoder finds no match.
> qtm("Viskhapatnam")

tmaptools::geocode_OSM didn't found any results for: "Viskhapatnam".

> tm_shape(World) + tm_polygons("HPI")

> tm1 <- tm_shape(World, projection = "+proj=eck4", simplify = 0.05) +
      tm_polygons() +
      tm_layout("Simplification: 0.05")

> tm1

NETWORK DATA:

Description:

igraph is a library and R package for network analysis.

The description of an igraph object starts with up to four letters:
• D or U, for a directed or undirected graph
• N for a named graph (where nodes have a name attribute)
• W for a weighted graph (where edges have a weight attribute)
• B for a bipartite (two-mode) graph (where nodes have a type attribute)
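A short sketch of how these letters relate to a graph's properties, assuming the igraph package is installed. The vertex names and edge weights are made up for illustration.

```r
library(igraph)

# A small undirected graph with named vertices and weighted edges.
# The vertex names and weights are made up for illustration.
g <- graph_from_literal(A - B, B - C, C - A)
E(g)$weight <- c(1, 2, 3)

# The header line of the printed description should carry the code "UNW-".
print(g)

is_directed(g)    # FALSE: first letter is U
is_named(g)       # TRUE:  second letter is N
is_weighted(g)    # TRUE:  third letter is W
is_bipartite(g)   # FALSE: fourth position is "-"
```

A dash in any position means the corresponding property is absent, as in the "U---" header of an unnamed, unweighted, undirected graph.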

Functions used in the Social Network Analysis

• library() function
The library() function loads and attaches add-on packages.
Syntax:
library(package, help, logical.return = FALSE, ...)

• make_full_graph() function
This function creates a full graph, in which every pair of vertices is connected by an edge.
Syntax:
make_full_graph(n, directed = FALSE, loops = FALSE)

• make_ring() function
This function creates a ring graph. A ring is a one-dimensional lattice, and make_ring() is a
special case of make_lattice(), which can create lattices of arbitrary dimensions, periodic or
non-periodic.
Syntax:
make_ring(n, directed = FALSE, mutual = FALSE, circular = TRUE)

• make_star() function
This function creates a star graph, in which every vertex is connected to the center vertex
and to no other vertex.
Syntax:
make_star(n, center = 1, mode = c("in", "out", "mutual", "undirected"))

• sample_gnp() function
This function generates a random graph from a simple model in which every possible edge is
created independently with the same constant probability p.
Syntax:
sample_gnp(n, p, directed = FALSE, loops = FALSE)

• plot() function
This function draws the given graph in the active graphics window.
Syntax:
plot(defined_graph_name)

Full Graph
Syntax:
make_full_graph()
Parameters:
• n = Number of vertices.
• directed = TRUE/FALSE Whether to create a directed graph or not.
• loops = TRUE/FALSE Whether to add self-loops to the graph or not.

Ring Graph
The ring graph is a one-dimensional lattice; make_ring() is a special case of the make_lattice() function.
Syntax:
make_ring()
Parameters:
• n = Number of vertices.
• directed = TRUE/FALSE Whether to create a directed graph or not.
• mutual = TRUE/FALSE Whether directed edges are mutual or not. It is ignored in
undirected graphs.
• circular = TRUE/FALSE Whether to create a circular ring.

Star Graph
A star graph is one in which every vertex is connected to the center vertex and to no other vertex.
Syntax:
make_star()
Parameters:
• n = Number of vertices.
• center = ID of the center vertex.
• mode = Direction of the edges: in, out, mutual or undirected.
  ▪ in – The edges point to the center.
  ▪ out – The edges point from the center.
  ▪ mutual – A directed star graph is created with mutual edges.
  ▪ undirected – The edges are undirected.
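The effect of these parameters can be checked by counting edges. The following is a minimal sketch, assuming igraph is installed; the graph sizes, the probability 0.3 and the seed are arbitrary choices for illustration.

```r
library(igraph)

full <- make_full_graph(8)                   # every pair of vertices joined
ring <- make_ring(12)                        # closed cycle of 12 vertices
star <- make_star(10, mode = "undirected")   # vertex 1 is the default center

# Edge counts: n(n-1)/2 for a full graph, n for a circular ring,
# n - 1 for a star.
ecount(full)   # 28
ecount(ring)   # 12
ecount(star)   # 9

# In the star, the center has degree 9 and every other vertex degree 1.
degree(star)

# sample_gnp() is random: each of the 45 possible edges among 10 vertices
# is included independently with probability p. set.seed() makes the draw
# repeatable; the seed value is arbitrary.
set.seed(42)
random_graph <- sample_gnp(10, p = 0.3)
ecount(random_graph)   # varies with the seed
```

Note that make_star() defaults to mode = "in", which gives a directed graph; the sketch passes mode = "undirected" explicitly.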

Source code:

## Download and install the package
> install.packages("igraph")
## Load package
> library(igraph)
Attaching package: ‘igraph’

The following objects are masked from ‘package:stats’:

decompose, spectrum

The following object is masked from ‘package:base’:

union
> # Create networks
> g1 <- graph(edges = c(1,2, 2,3, 3,1), n = 3, directed = FALSE)
> plot(g1)
> class(g1)
[1] "igraph"
> g1
IGRAPH ac528bb U--- 3 3 --
+ edges from ac528bb:
[1] 1--2 2--3 1--3

> g4 <- graph( c("John", "Jim", "Jim", "Jack", "Jim", "Jack", "John", "John"),
+ isolates=c("Jesse", "Janis", "Jennifer", "Justin") )
> # In named graphs we can specify isolates by providing a list of their names.
> plot(g4, edge.arrow.size=.5, vertex.color="gold", vertex.size=15,
+ vertex.frame.color="gray", vertex.label.color="black",
+ vertex.label.cex=0.8, vertex.label.dist=2, edge.curved=0.2)

> plot(graph_from_literal(a---b, b---c)) # the number of dashes doesn't matter

> plot(graph_from_literal(a:b:c---c:d:e))

> library(igraph)

> Full_Graph <- make_full_graph(8, directed = FALSE)

> plot(Full_Graph)

> library(igraph)
> Ring_Graph <- make_ring(12, directed = FALSE, mutual = FALSE, circular = TRUE)
> plot(Ring_Graph)

> library(igraph)
> Star_Graph <- make_star(10, center = 1)
> plot(Star_Graph)

CONVERTING NUMERIC VARIABLES INTO FACTORS:

Description:

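cut() divides the range of x into intervals and codes the values in x according to the
interval into which they fall. The leftmost interval corresponds to level one, the next
leftmost to level two, and so on. The result is a factor. When each distinct value should
become its own level instead, as.factor() converts the variable directly, as in the source
code of this section.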

Syntax:
cut(x, ...)

## Default S3 method:
cut(x, breaks, labels = NULL,
include.lowest = FALSE, right = TRUE, dig.lab = 3, ordered_result = FALSE, ...)
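x: a numeric vector to be converted to a factor by cutting.
breaks: either a numeric vector of two or more unique cut points, or a single number (at
least 2) giving the number of intervals into which x is to be cut.
include.lowest: logical. If TRUE, a value equal to the lowest break point (or the highest,
when right = FALSE) is included in the first (or last) interval.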
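A minimal sketch of cut() applied to the points values used in this section. The break points and the labels "low", "medium" and "high" are chosen here for illustration only.

```r
# Points values from the data frame used in this section.
team_points <- c(12, 15, 22, 29, 35, 24, 11, 24)

# Three bins: (10,20], (20,30], (30,40]. The break points and labels
# are assumptions made for this example.
points_group <- cut(team_points,
                    breaks = c(10, 20, 30, 40),
                    labels = c("low", "medium", "high"))

points_group           # low low medium medium high medium low medium
levels(points_group)   # "low" "medium" "high"
table(points_group)    # low 3, medium 4, high 1

# right = TRUE (the default) closes each interval on the right, so a
# value of exactly 20 falls in (10,20] and is labelled "low".
boundary <- cut(20, breaks = c(10, 20, 30, 40),
                labels = c("low", "medium", "high"))
boundary
```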

Source code:

# create data frame
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'),
                 points = c(12, 15, 22, 29, 35, 24, 11, 24))

# view data frame
df

# view structure of data frame
str(df)

# convert points column from numeric to factor
df$points <- as.factor(df$points)

# view updated data frame
df

# view updated structure of data frame
str(df)

Output:
  team points
1    A     12
2    A     15
3    B     22
4    B     29
5    C     35
6    C     24
7    C     11
8    D     24
'data.frame': 8 obs. of 2 variables:
 $ team  : chr "A" "A" "B" "B" ...
 $ points: num 12 15 22 29 35 24 11 24
  team points
1    A     12
2    A     15
3    B     22
4    B     29
5    C     35
6    C     24
7    C     11
8    D     24
'data.frame': 8 obs. of 2 variables:
 $ team  : chr "A" "A" "B" "B" ...
 $ points: Factor w/ 7 levels "11","12","15",..: 2 3 4 6 7 5 1 5
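The integers at the end of the last line are level codes, not the original values. The sketch below, which reuses the same points values in base R, shows what the codes mean and one common way to recover the numbers from a factor.

```r
team_points <- c(12, 15, 22, 29, 35, 24, 11, 24)
points_factor <- as.factor(team_points)

nlevels(points_factor)       # 7, because the value 24 occurs twice
levels(points_factor)        # "11" "12" "15" "22" "24" "29" "35"
as.integer(points_factor)    # level codes: 2 3 4 6 7 5 1 5

# Converting back: go through character first, otherwise the level
# codes are returned instead of the original values.
recovered <- as.numeric(as.character(points_factor))
recovered                    # 12 15 22 29 35 24 11 24
```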
