
Spatial Analysis Using Vector Data in R: Introduction
Putu Arya Wigita
Vector vs Raster Data
• Vector and raster data are the two common data models in spatial analysis and
Geographic Information Systems (GIS).
• Vector data contains X, Y coordinates (longitude, latitude) that can be formed
into points, lines, and polygons. Vector data is useful for storing and
representing features that have discrete boundaries.
• Raster data, on the other hand, is divided into a regular grid of cells, where
each cell has an associated value.
• Vector data is better suited to cartographic representation, whereas raster
data is better suited to mathematical modelling and analysis.
• We can use the "sf" package to process vector data and the "raster" package to
process raster data in R.
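To make the distinction concrete, here is a minimal sketch of both data models. It assumes the sf package is installed; all coordinates are made up for illustration, and a plain matrix stands in for a raster so that no raster package is needed.

```r
library(sf)

# Vector data: geometries built from X, Y coordinates (made-up values).
pt <- st_point(c(110.4, -7.8))                         # a single point
ln <- st_linestring(rbind(c(0, 0), c(1, 1), c(2, 1)))  # a line through three vertices
pg <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1),
                            c(0, 1), c(0, 0))))        # closed ring: first vertex equals last

# Raster data: a regular grid in which every cell holds a value.
# A plain matrix (3 rows x 4 columns) stands in for a raster here.
grid <- matrix(1:12, nrow = 3, ncol = 4)
grid[2, 3]   # value stored in the cell at row 2, column 3
```

The vector objects store only the vertices that define each shape, while the grid stores one value for every cell whether or not anything of interest is there.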
Reading Vector Data in R
• The st_read() function from the sf package reads vector data.
• Vector data can be stored in several formats, for instance:
1. .shp (shapefile) -> the most common
2. GeoJSON -> based on the JSON format
3. GPS
4. netCDF (Network Common Data Form)
• For example, use the Indonesia data from the link.
• st_read("Indonesia.shp") # For example
• Vector data can be downloaded from
https://gadm.org/download_country_v3.html
• The result has a column called geometry, which is stored as a list
(a list-column).
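As a sketch of what st_read() returns, the example below reads the nc.shp demo shapefile that ships with the sf package. This file is an assumption used only because "Indonesia.shp" may not be available locally; the same calls should apply to any shapefile.

```r
library(sf)

# Assumption: nc.shp is the demo shapefile bundled with sf,
# standing in for "Indonesia.shp".
path <- system.file("shape/nc.shp", package = "sf")
nc <- st_read(path, quiet = TRUE)

class(nc)                 # an sf object is also a data.frame
names(nc)                 # attribute columns plus the geometry column
class(st_geometry(nc))    # the geometry column has class "sfc"
is.list(st_geometry(nc))  # and is stored as a list, one geometry per row
```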
Knowing your Vector Data
• Because vector data is read as a data frame in R, you can conveniently
use functions from the Tidyverse packages to clean and process
the data.
• The geometry column stores the longitude and latitude information for
every row, hence it is stored as a list.
• A list is a type of vector that can contain multiple types of
information; we can even put a list inside a list. We can create a list
with the list() function. For example:
a <- list(2, TRUE, "UGM", 2:5)
b <- list(1, "ITB", a)
Knowing your Vector Data: Slicing Lists
• Slicing lists can be tricky; we should know how the information in the list is structured. For example:

a:
[[1]]
[1] 2

[[2]]
[1] TRUE

[[3]]
[1] "UGM"

[[4]]
[1] 2 3 4 5

b:
[[1]]
[1] 1

[[2]]
[1] "ITB"

[[3]]
[[3]][[1]]
[1] 2

[[3]][[2]]
[1] TRUE

[[3]][[3]]
[1] "UGM"

[[3]][[4]]
[1] 2 3 4 5
Knowing your Vector Data: Slicing Lists
• Slicing lists is a little different from slicing vectors and data frames: we use
[[ ]] instead of [ ] to extract an element. To access the value 5 in list a we can use this code:
a[[4]][4]
• Likewise, to access the value 5 in list b we can use this code:
b[[3]][[4]][4]
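The difference between [ ] and [[ ]] can be sketched with the same two lists. The variable names sub, vec, five_a, and five_b are introduced here only for illustration.

```r
a <- list(2, TRUE, "UGM", 2:5)
b <- list(1, "ITB", a)

str(b)          # compact overview of the nested structure

sub <- a[4]     # single brackets keep the list wrapper: a list of length 1
vec <- a[[4]]   # double brackets extract the element itself: the vector 2:5

five_a <- a[[4]][4]        # 4th value of the vector stored in a[[4]]
five_b <- b[[3]][[4]][4]   # same value, reached through the nested list b[[3]]
```

Calling str() before slicing is a convenient way to see how deep the nesting goes and which index leads to which element.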
Plotting Data
• First, let's read a shapefile by using st_read()
indo <- st_read("prov.shp")
object.size(indo)
27659880 bytes
indo <- ms_simplify(indo) ## from the rmapshaper package; simplifies the geometries to reduce the object's size

st_area(indo) ## to get each province's area

• Simple plotting of the data

plot(st_geometry(indo)) ## st_geometry() extracts only the geometry, dropping the attribute columns
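st_area() returns values with measurement units attached, which can be converted before they are stored as a column. The sketch below assumes the nc.shp demo file bundled with sf in place of "prov.shp"; the column name area_km2 is hypothetical.

```r
library(sf)

# Assumption: the nc.shp demo file shipped with sf stands in for "prov.shp".
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

area_m2 <- st_area(nc)                         # one value per feature, in square metres
area_km2 <- units::set_units(area_m2, km^2)    # convert to square kilometres

nc$area_km2 <- as.numeric(area_km2)            # drop the units to store a plain numeric column
head(nc[order(-nc$area_km2), c("NAME", "area_km2")], 3)   # the three largest features
```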
Plotting Data
Coordinate Reference System
• A Coordinate Reference System (CRS) defines how the coordinates in the data
(such as latitude and longitude) relate to locations on the earth.
• Since we are dealing with spatial data, we require X/Y coordinates
that are based on a mathematical model of the shape of the earth.
• Our file usually has a CRS; however, it is not always defined.
• st_crs() is used to print out a vector object's CRS
• crs() is used to print out a raster object's CRS
• If the shapefile does not have a CRS, we should do some background
research to find out the correct CRS, and then assign it with the st_crs()
function, for example st_crs(x) <- 4326.
• To convert a vector object from one CRS to another, use st_transform()
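The CRS workflow described above can be sketched as follows. The two points use approximate, illustrative coordinates, and the choice of EPSG:4326 (WGS 84) and EPSG:3857 (Web Mercator) is an assumption made for the example, not something stated in the slides.

```r
library(sf)

# Illustrative points with approximate coordinates (longitude, latitude).
cities <- data.frame(name = c("Jakarta", "Yogyakarta"),
                     lon = c(106.85, 110.37),
                     lat = c(-6.21, -7.80))

pts <- st_as_sf(cities, coords = c("lon", "lat"))   # no CRS supplied
is.na(st_crs(pts))                                  # TRUE: the CRS is undefined

# Assumption: the coordinates are known to be WGS 84 (EPSG:4326).
pts_wgs84 <- st_set_crs(pts, 4326)   # assigns a CRS; the coordinates are unchanged

# Reproject to Web Mercator (EPSG:3857); the coordinates are recomputed in metres.
pts_merc <- st_transform(pts_wgs84, 3857)
st_crs(pts_merc)$epsg
```

Assigning a CRS only labels the existing coordinates, whereas st_transform() recomputes them, so st_transform() requires that a CRS is already defined.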
Preparing the Data
• Use “Kemiskinan.csv” as the data.

• Clean the data by using gsub() to change "," into "." and then as.numeric() to
convert the column to numeric.

• Merge the shp and csv data by using merge() or a dplyr join such as left_join()
• Note: the identifier column should have the same name in both datasets, so that they can be merged and
plotted correctly

• Remove "_" from the column names
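The cleaning steps above can be sketched in base R. All values below are made up and merely stand in for "Kemiskinan.csv"; the identifier column PROVINSI and the plain data frame shp (standing in for the shapefile's attribute table) are hypothetical. The sketch also assumes that removing "_" means replacing it with a space, as the column name used later in the plotting code suggests.

```r
# Made-up values standing in for "Kemiskinan.csv".
pov <- data.frame(PROVINSI = c("ACEH", "BALI"),
                  Kemiskinan_Maret_2020 = c("15,0", "4,5"),
                  stringsAsFactors = FALSE)

# Decimal commas -> decimal points, then character -> numeric.
pov$Kemiskinan_Maret_2020 <- as.numeric(
  gsub(",", ".", pov$Kemiskinan_Maret_2020, fixed = TRUE)
)

# A plain data frame stands in for the shapefile's attribute table.
shp <- data.frame(PROVINSI = c("BALI", "ACEH"), id = 1:2)

# Both tables share the identifier column "PROVINSI".
df <- merge(shp, pov, by = "PROVINSI")

# Replace "_" with a space in the column names.
names(df) <- gsub("_", " ", names(df), fixed = TRUE)
names(df)
```

With a real sf object as the first argument, merge() should keep the geometry column, so the merged result can be passed straight to the plotting functions.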


Preparing the Data

• The data should be formed as a tidy data frame, where the columns
represent the variables and the rows
represent observations.
• Note that there are several regions
with name!
Plotting Data: Static

• Use tm_shape() from the tmap package to specify the main df, then add the main layers:
• tm_fill() to fill the polygons according to a column's values;
• tm_borders() to draw the borders between areas;
• tm_layout() to adjust the title and legend

tm_shape(df) +
  tm_fill("Kemiskinan Maret 2020", palette = "Blues", title = "dalam %") +  ## column: poverty, March 2020; legend title: "in %"
  tm_borders(alpha = 0.3) +
  tm_layout(main.title = "Kemiskinan Indonesia Berdasarkan Provinsi, Maret 2020",  ## "Poverty in Indonesia by Province, March 2020"
            main.title.position = "center",
            legend.title.size = 1,
            legend.text.size = 0.6,
            legend.position = c("left", "bottom"))
Static Plot
Plotting Data: Interactive
• The difference between a static plot and an interactive plot lies in tmap_mode(),
which accepts two modes:
• tmap_mode("plot") ## static plot
• tmap_mode("view") ## interactive map viewer

tmap_mode("plot")

map1 <- tm_shape(df) +
  tm_fill("Kemiskinan Maret 2020", palette = "Blues", title = "dalam %") +
  tm_borders(alpha = 0.3) +
  tm_layout(main.title = "Kemiskinan Indonesia Berdasarkan Provinsi, Maret 2020",
            main.title.position = "center",
            legend.title.size = 1,
            legend.text.size = 0.6,
            legend.position = c("left", "bottom"))

tmap_mode("view")
map1
Thank You
