Introduction to Non-Tabular Data Types: Time series, spatial data, Network data.
Data Transformations: Converting Numeric Variables into Factors, Date Operations,
String Parsing, Geocoding
Aim: To work with time series, spatial data and network data in R, and to perform data
transformations: converting numeric variables into factors, date operations, string parsing and geocoding.
TIME SERIES:
Description:
A time series is a sequence of data points in which each data point is associated with a timestamp.
A simple example is the price of a stock at different points in time on a given day. Another
example is the amount of rainfall in a region in different months of the year. R provides many
functions to create, manipulate and plot time series data. The data for a time series is stored
in an R object called a time-series object, which is an R data object like a vector or a data frame.
The time-series object is created with the ts() function.
Syntax:
The basic syntax of the ts() function is:
timeseries.object.name <- ts(data, start, end, frequency)
Arguments:
data is a vector or matrix containing the values used in the time series.
start specifies the start time for the first observation in time series.
end specifies the end time for the last observation in time series.
frequency specifies the number of observations per unit of time, for example 12 for monthly data or 4 for quarterly data when the unit is one year.
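As an illustration of these arguments, the following minimal sketch builds a monthly series with base R only. The rainfall values and the names rainfall, rain_ts and first_half_2021 are made up for this example and do not come from the experiment's data.

```r
# Illustrative monthly rainfall values (made-up numbers, not from the source)
rainfall <- c(79, 65, 58, 52, 48, 40, 35, 38, 55, 70, 82, 90,
              75, 60, 57, 50, 45, 42, 33, 36, 58, 72, 85, 88)

# start = c(2020, 1) means "first period of 2020"; frequency = 12 means
# twelve observations per unit of time (one year), i.e. monthly data
rain_ts <- ts(rainfall, start = c(2020, 1), frequency = 12)

print(frequency(rain_ts))  # observations per year
print(start(rain_ts))      # year and period of the first observation
print(end(rain_ts))        # year and period of the last observation

# window() extracts a sub-period using the same (year, period) notation
first_half_2021 <- window(rain_ts, start = c(2021, 1), end = c(2021, 6))
print(first_half_2021)
```

Because end is not given, R derives it from start, frequency and the number of values, so the 24 values are labelled January 2020 to December 2021.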
Source code:
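library(lubridate)  # provides ymd() and decimal_date()
# x: numeric vector or matrix of weekly observations (not defined in this listing)
# creating a weekly time series object starting from 22 January 2020
mts <- ts(x, start = decimal_date(ymd("2020-01-22")),
          frequency = 365.25 / 7)
# open a PNG device; the plotting call is missing from the listing,
# so plot(mts) and dev.off() below are an assumed completion
png(file = "multivariateTimeSeries.png")
plot(mts)
dev.off()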
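The start value and the frequency of 365.25 / 7 used for weekly data can be checked without extra packages. The sketch below is only an illustration: decimal_year is a hypothetical helper written for this example as a base-R stand-in for decimal_date() on whole dates, and the weekly values are made up.

```r
# Hypothetical helper (not part of base R or lubridate): converts a calendar
# date to a decimal year, i.e. year + fraction of the year already elapsed.
decimal_year <- function(d) {
  d <- as.Date(d)
  year <- as.integer(format(d, "%Y"))
  year_start <- as.Date(paste0(year, "-01-01"))
  next_start <- as.Date(paste0(year + 1, "-01-01"))
  year + as.numeric(d - year_start) / as.numeric(next_start - year_start)
}

# 22 January 2020 is 21 days into a 366-day year
start_dec <- decimal_year("2020-01-22")
print(start_dec)

# Illustrative weekly values (made-up numbers, not from the source)
weekly_counts <- c(5, 8, 13, 21, 34, 55, 89, 144)

# 365.25 / 7 is the average number of weeks in a year
weekly_ts <- ts(weekly_counts, start = start_dec, frequency = 365.25 / 7)

# time() returns the decimal-year timestamp attached to each observation
print(round(as.numeric(time(weekly_ts)), 4))
```

Consecutive timestamps differ by 7 / 365.25 of a year, which is why a non-integer frequency is used for weekly observations.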
SPATIAL DATA:
Description:
Spatial data can be drawn as thematic maps with the tmap package.
tmap element: The building block for drawing thematic maps. All element functions have the
prefix tm_.
Quick thematic map plot: qtm() draws a thematic map quickly. It is a convenient wrapper
around the main plotting method of stacking tmap elements. Without arguments, or with a
search term, this function draws an interactive map.
Syntax:
> library(tmap)
> data(World,rivers,metro)
> qtm(World)
> qtm(World, fill = "HPI", fill.n = 9, fill.palette = "div",
+ fill.title = "Happy Planet Index", fill.id = "name", style = "gray",
+ format = "World", projection = "+proj=eck4")
> qtm(World, fill = "continent", format = "World", style = "col_blind",
+ projection = "+proj=eck4")
> qtm(World, fill = "economy", format = "World", style = "col_blind",
+ projection = "+proj=eck4")
> qtm(World, borders = NULL) + qtm(metro, symbols.size = "pop2010",
+ symbols.title.size = "Metropolitan Areas", symbols.id = "name", format = "World")
old-style crs object detected; please recreate object with a recent sf::st_crs()
old-style crs object detected; please recreate object with a recent sf::st_crs()
old-style crs object detected; please recreate object with a recent sf::st_crs()
> qtm("Viskhapatnam")
> tm1 <- tm_shape(World) + tm_polygons("HPI")
> tm1
NETWORK DATA:
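Description:
Network data describes vertices (nodes) and the edges that connect them. The igraph package
provides the functions below for creating and plotting graphs.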
make_star() function
This function creates a star graph, in which every vertex is connected to the center
vertex and to no other vertex.
Syntax:
make_star(n, mode = c("in", "out", "mutual", "undirected"), center = 1)
sample_gnp() function
This function generates a random graph from the G(n, p) (Erdős-Rényi) model, in which each
possible edge is created independently with the same constant probability p.
Syntax:
sample_gnp(n, p, loops = FALSE, directed = FALSE)
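A short sketch of sample_gnp() in use, assuming the igraph package is installed. The seed, the values of n and p, and the names g_random, g_empty and g_complete are arbitrary choices for this illustration.

```r
library(igraph)

set.seed(42)  # arbitrary seed so that the random graph is reproducible

# 20 vertices; each of the 20 * 19 / 2 = 190 possible edges is created
# independently with probability 0.2
g_random <- sample_gnp(n = 20, p = 0.2)
print(vcount(g_random))       # number of vertices
print(ecount(g_random))       # number of edges (varies with the seed)
print(is_directed(g_random))  # directed = FALSE is the default

# The two extreme probabilities give deterministic results
g_empty <- sample_gnp(n = 10, p = 0)     # no edge is ever created
g_complete <- sample_gnp(n = 10, p = 1)  # every possible edge is created
print(ecount(g_empty))
print(ecount(g_complete))
```

With p = 1 and loops = FALSE the result has all 10 * 9 / 2 = 45 undirected edges, which is the same edge set as a full graph on 10 vertices.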
plot() function
This function is used to draw the given graph in the active graphics window.
Syntax:
plot(defined_graph_name)
Full Graph
A full (complete) graph connects every pair of vertices.
Syntax:
make_full_graph(n, directed = FALSE, loops = FALSE)
Parameters:
n = Number of vertices.
directed = TRUE/FALSE Whether to create a directed graph or not.
loops = TRUE/FALSE Whether to add self-loops to the graph or not.
Ring Graph
A ring graph is a one-dimensional lattice and is a special case of the make_lattice() function.
Syntax:
make_ring(n, directed = FALSE, mutual = FALSE, circular = TRUE)
Parameters:
n = Number of vertices.
directed = TRUE/FALSE Whether to create a directed graph or not.
mutual = TRUE/FALSE Whether directed edges are mutual or not. It is ignored in
undirected graphs.
circular = TRUE/FALSE Whether to create a circular ring. If FALSE, the result is an open path.
Star Graph
A star graph is a graph in which every vertex is connected to the center vertex and to no other vertex.
Syntax:
make_star(n, mode = c("in", "out", "mutual", "undirected"), center = 1)
Parameters:
n = Number of vertices.
center = ID of the center vertex.
mode = Direction of the edges: in, out, mutual or undirected.
in – The edges point to the center.
out – The edges point from the center.
mutual – A directed star graph is created with mutual edges.
undirected – The edges are undirected.
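The effect of these parameters can be checked by counting edges and degrees. The following is a minimal sketch, assuming the igraph package is installed; the vertex counts and variable names are arbitrary choices for this illustration.

```r
library(igraph)

# Full graph: every pair of the 5 vertices is joined, 5 * 4 / 2 = 10 edges
full_g <- make_full_graph(5)
print(ecount(full_g))

# Ring graph: 6 vertices joined in a closed cycle, 6 edges
ring_g <- make_ring(6)
print(ecount(ring_g))

# circular = FALSE leaves the ring open, giving a path with 5 edges
path_g <- make_ring(6, circular = FALSE)
print(ecount(path_g))

# Star graph, mode = "in": the 4 outer vertices all point to the center
star_in <- make_star(5, mode = "in", center = 1)
print(degree(star_in, mode = "in"))  # in-degree of every vertex

# mode = "mutual": each outer vertex has an edge in both directions
star_mutual <- make_star(5, mode = "mutual", center = 1)
print(ecount(star_mutual))
```

In the "in" star only the center has incoming edges, while the "mutual" star has twice as many edges as the undirected one because each connection is stored in both directions.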
Source code:
> library(igraph)
> #Create networks
> g1 <- make_graph(edges = c(1,2, 2,3, 3,1), n = 3, directed = FALSE)
> plot(g1)
> class(g1)
[1] "igraph"
> g1
IGRAPH ac528bb U--- 3 3 --
+ edges from ac528bb:
[1] 1--2 2--3 1--3
> g4 <- make_graph(c("John", "Jim", "Jim", "Jack", "Jim", "Jack", "John", "John"),
+ isolates = c("Jesse", "Janis", "Jennifer", "Justin"))
> # In named graphs we can specify isolates by providing a list of their names.
> plot(g4, edge.arrow.size=.5, vertex.color="gold", vertex.size=15,
+ vertex.frame.color="gray", vertex.label.color="black",
+ vertex.label.cex=0.8, vertex.label.dist=2, edge.curved=0.2)
> plot(graph_from_literal(a---b, b---c)) # the number of dashes doesn't matter
> plot(graph_from_literal(a:b:c---c:d:e))
> library(igraph)
> Full_Graph <- make_full_graph(8, directed = FALSE)
> plot(Full_Graph)
> library(igraph)
> Ring_Graph <- make_ring(12, directed = FALSE, mutual = FALSE, circular = TRUE)
> plot(Ring_Graph)
> library(igraph)
> Star_Graph <- make_star(10, center = 1)
> plot(Star_Graph)
CONVERTING NUMERIC VARIABLES INTO FACTORS:
Description:
cut() divides the range of x into intervals and codes the values in x according to the
interval into which they fall. The leftmost interval corresponds to level one, the next
leftmost to level two, and so on.
Syntax:
cut(x, ...)
## Default S3 method:
cut(x, breaks, labels = NULL,
include.lowest = FALSE, right = TRUE, dig.lab = 3, ordered_result = FALSE, ...)
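Arguments:
x: a numeric vector that is to be converted to a factor by cutting.
breaks: either a numeric vector of two or more unique cut points, or a single number (at least 2) giving the number of intervals into which x is to be cut.
include.lowest: logical; if TRUE, a value equal to the lowest break (or the highest break, when right = FALSE) is included in the first (or last) interval.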
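As an illustration of breaks and include.lowest, the sketch below bins the points values that appear in the Output listing of this section. The cut points 10, 20, 30 and 40, the labels low, medium and high, and all variable names are assumptions made for this example only.

```r
# The eight points values from the Output listing of this section
points_scored <- c(12, 15, 22, 29, 35, 24, 11, 24)

# With the default right = TRUE the intervals are (10,20], (20,30], (30,40]
level <- cut(points_scored,
             breaks = c(10, 20, 30, 40),
             labels = c("low", "medium", "high"))
print(level)
print(table(level))  # how many values fall in each interval

# A value exactly on the lowest break is outside (10,20] and becomes NA ...
on_edge <- c(10, 20)
without_lowest <- cut(on_edge, breaks = c(10, 20, 30))
print(without_lowest)

# ... unless include.lowest = TRUE closes the first interval on the left
with_lowest <- cut(on_edge, breaks = c(10, 20, 30), include.lowest = TRUE)
print(with_lowest)
```

Unlike as.factor(), which creates one level per distinct value, cut() groups nearby values into a small number of interval levels.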
Source code:
Output:
  team points
1    A     12
2    A     15
3    B     22
4    B     29
5    C     35
6    C     24
7    C     11
8    D     24
'data.frame': 8 obs. of 2 variables:
$ team : chr "A" "A" "B" "B" ...
$ points: num 12 15 22 29 35 24 11 24
  team points
1    A     12
2    A     15
3    B     22
4    B     29
5    C     35
6    C     24
7    C     11
8    D     24
'data.frame': 8 obs. of 2 variables:
$ team : chr "A" "A" "B" "B" ...
$ points: Factor w/ 7 levels "11","12","15",..: 2 3 4 6 7 5 1 5
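The source listing for this part is missing from the document. The following is a minimal sketch that would produce output of the kind shown above, assuming the conversion was done with as.factor(); the variable names df and points_numeric are assumptions, while the data values are taken from the output itself.

```r
# Rebuild the data frame shown in the output above
df <- data.frame(team = c("A", "A", "B", "B", "C", "C", "C", "D"),
                 points = c(12, 15, 22, 29, 35, 24, 11, 24),
                 stringsAsFactors = FALSE)
print(df)
str(df)  # points is numeric at this stage

# Convert the numeric column into a factor: every distinct value becomes
# one level, and the levels are sorted in increasing order
df$points <- as.factor(df$points)
print(df)
str(df)  # points is now a factor with 7 levels

# To recover the numbers, convert through character first; calling
# as.numeric() directly on a factor returns the level codes instead
points_numeric <- as.numeric(as.character(df$points))
print(points_numeric)
```

The value 24 occurs twice, so eight observations yield only seven levels, and the integer codes printed by str() are the positions of each value in the sorted list of levels.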