
SPATIAL DATA ANALYSIS, VECTOR OPERATIONS AND RASTER OPERATIONS
GEO-SAFER SOUTHEASTERN MINDANAO, UNIVERSITY OF THE PHILIPPINES MINDANAO
OUTLINE

I. SPATIAL DATA ANALYSIS


• SPATIAL DATA
• SPATIAL ANALYSIS
• ROLES OF GIS IN SPATIAL ANALYSIS
• SPATIAL DATA ANALYSIS IN GIS
• COMMON GIS ANALYSIS FUNCTIONS
OUTLINE CONT..

II. VECTOR OPERATIONS


• REVIEW OF CONCEPTS
• MEASUREMENTS (LENGTHS, PERIMETERS, AREAS)
• QUERYING
• RECLASSIFICATION
• BUFFERING
• TYPES OF VECTOR OVERLAYS
• SPATIAL INTERPOLATION
• THIESSEN/VORONOI POLYGONS
• SPATIAL QUERYING
• LIST OF VECTOR TOOLS IN QGIS
OUTLINE CONT..

III. RASTER OPERATIONS


• DATA ANALYSIS ENVIRONMENT
• COMMON TYPES OF RASTER DATA ANALYSIS
• LOCAL OPERATIONS AND RECLASSIFICATION
• NEIGHBORHOOD OPERATIONS
• ZONAL OPERATIONS
• DISTANCE MEASURES
• OTHER COMMON OPERATIONS
• SURFACE OPERATIONS
I. SPATIAL DATA ANALYSIS
SPATIAL DATA
• It can be most simply defined as
information that describes the
distribution of things upon the
surface of the earth.
• In effect, it involves any
information concerning the
location, shape of, and
relationships among,
geographic features.
SPATIAL ANALYSIS

• Manipulation of spatial data into various forms in order to
extract additional, meaningful information for understanding
the real world.
SPATIAL ANALYSIS
• It is easy to feel that a pattern is
present in a map.
• Spatial analysis allows us to test
that visual insight in a more
systematic way.
Example
Dr. John Snow
Father of Epidemiology

• 1854 cholera outbreak; 11,495 died

Geospatial Revolution
Episode 4, Chapter 3
3 Fundamental Questions Regarding Spatial Relationships
• How can variations in geographic properties over a single area or data set be described and/or analyzed?
• How can two (or more) spatial distributions be compared with each other?
• How can we use what we have learned from an analysis(es) to predict future spatial distributions?
ROLE OF GIS IN SPATIAL ANALYSIS

[Figure: Applied to the Real World / Decision Making / Planning; Evaluation / Validation]
ROLE OF GIS IN SPATIAL ANALYSIS

Five Functional Elements of a GIS
• Data Acquisition
• Preprocessing
• Database Management
• Manipulation/Analysis
• Final Product Output
PROCESS, PATTERN, AND ANALYSIS
• Spatial analysis is
aimed at:
• Identifying and
describing the pattern
• Identifying and
understanding the
process
• Description: display info. (query); what, where, when, how much?
• Explanation: why, how, who; what processes create observed results?
• Exploration: explore new ideas; prelim. analysis to find patterns
• Prediction: forecast location and magnitude of future change or events; projection
• Judgment: "decision support" systems (DSS); build consensus
SPATIAL DATA ANALYSIS IN GIS
[Figures: workflow diagrams in which an INPUT LAYER passes through Spatial Operations No. 1 to No. 3 to produce OUTPUT LAYERS]
Common GIS Analysis Functions
• Query (e.g., identify, select)
• Recoding (reclassify)
• Proximity analysis (buffering, distance)
• Terrain analysis (e.g., slope, aspect, viewsheds)
• Neighborhood analysis (e.g., average, variety)
• Arithmetic operations
• Overlay analysis (logical and arithmetic)
• Network analysis (e.g., routing, allocation)
• Spatial modeling (simulation, projection)
II. VECTOR OPERATIONS
CHARACTERISTICS OF SPATIAL DATA

• Simple definition of spatial data analysis: the quantitative procedures
employed in the study of the spatial arrangement of features (points,
lines, polygons and surfaces)

• Spatial Data has two kinds of attributes:


• Spatial attributes (location information)
• Non-spatial attributes (characteristics)
CHARACTERISTICS OF SPATIAL DATA

• We are mainly interested in the non-spatial attributes, but want to
study them while taking their location (spatial attributes) into
consideration.
• Objects with similar attributes are usually located near one another
(spatial autocorrelation).
COMMON EXAMPLES OF SPATIAL DATA ANALYSIS
FOR VECTOR DATA
• Measurements (lengths, perimeters, areas)
• Querying
• Reclassification
• Buffering
• Vector Overlay
• Spatial Interpolation
• Spatial Querying
MEASUREMENT IN GIS
• Only an approximation of the true distance on the ground
• Vector data are stored as line segments; even curved lines are stored as
short line segments
• Measurements can be stored as attributes in vector GIS
DISTANCE MEASUREMENT
MEASUREMENTS OF VECTOR DATA
• Euclidean distance
• shortest (straight-line) distance between two (x, y) points, obtained
using the Pythagorean theorem
• Perimeter
• sum of the straight line lengths of the boundary
• Area
• sum of areas of simple geometric shapes formed by subdividing
the feature of interest
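The three measurements above can be sketched in a few lines. This is only an illustration: the function names and the sample square are assumptions, coordinates are assumed to be planar (projected) rather than latitude/longitude, and the shoelace formula is used as one common way of summing simple shapes, since the slide does not name a specific method.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two (x, y) points (Pythagorean theorem)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def perimeter(ring):
    """Sum of the straight segment lengths around a closed boundary.

    `ring` is a list of (x, y) vertices; the closing segment back to the
    first vertex is added automatically.
    """
    return sum(euclidean_distance(ring[i], ring[(i + 1) % len(ring)])
               for i in range(len(ring)))

def area(ring):
    """Polygon area by the shoelace formula."""
    total = 0.0
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical 3 x 3 square, in map units
square = [(0, 0), (3, 0), (3, 3), (0, 3)]
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
print(perimeter(square))                   # 12.0
print(area(square))                        # 9.0
```

Because curved features are stored as short straight segments, results like these remain approximations of the true ground measurements.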
QUERYING
• Can be performed either on
data that are part of the GIS
database or on new data
produced as a result of data
analysis
RECLASSIFICATION
• May be referred to as recoding or renumbering
• Uses rules to reclassify or assign new values
• Transformation from one classification system to another
(e.g. soil types to agricultural land use suitability)
• Results in a new layer
• May be Boolean or Weighted
RECLASSIFICATION IN VECTOR DATA
Household Income        Income Group
0 – 5000                Low Income Group
5000 – 10,000           Low Income Group
10,000 – 20,000         Middle Income Group
20,000 – 40,000         Middle Income Group
40,000 – 60,000         Middle Income Group
50,000 – 75,000         High Income Group
> 75,000                High Income Group
RECLASSIFICATION IN VECTOR DATA
• Usually results in a smaller number of classes
RECLASSIFICATION
BUFFERING
• Creation of a zone of interest
around an entity
• Point entity: circular buffer zone
• Line entity: elongated buffer zone
• Polygon entity: buffer zone has
the same shape as original
polygon, but larger
BUFFERING
BUFFERING
• Sample questions that
proximity analysis (buffering)
can address:
• How many evacuation
sites lie within 5 km of the
river?
• What structures are
within 10 km of high
erosion risk areas?
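A question such as "how many sites lie within 5 km of the river?" can be answered without drawing the buffer polygon at all, by testing each point's distance to the line. The sketch below assumes planar coordinates in kilometers; the river vertices, site names and function names are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the segment, clamped to the range [0, 1]
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def within_buffer(point, line, radius):
    """True if the point lies inside the buffer of `radius` around the line."""
    return any(point_segment_distance(point, line[i], line[i + 1]) <= radius
               for i in range(len(line) - 1))

# Hypothetical data, coordinates in kilometers
river = [(0, 0), (10, 0), (20, 5)]
sites = {"Site A": (5, 3), "Site B": (5, 8), "Site C": (22, 6)}

inside = [name for name, xy in sites.items() if within_buffer(xy, river, 5.0)]
print(inside)  # ['Site A', 'Site C']
```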
BUFFERING
• Problems may occur when buffering very
convoluted lines or areas; or for large
datasets
• May have to increase the virtual memory of your
system
• Or break the job up into a number of smaller
pieces
APPLICATIONS OF BUFFERING
• Site selection
• Determine the location of a
new well, making sure it
does not fall within 10 km
of chemical factories
• Find all stream segments
within 300 feet of a
proposed logging area
APPLICATIONS OF BUFFERING
• Environmental
pollution
• Zone of noise
pollution around
major roads
• Buffers around
contaminated land
to prioritize sites
(according to land
use, water courses
& ground water
protection zones)
VECTOR OVERLAY
• Perhaps the most important or key GIS analysis
function
• Integrating data from two or more sources to
produce a new map layer
• Individual data layers to be overlaid have to be
topologically correct (lines should meet at nodes
and polygon boundaries are closed)
• Seldom used in isolation
• commonly used with querying
TYPES OF VECTOR OVERLAY
1.Point-in-polygon
2.Line-in-polygon
3.Polygon-on-polygon
POINT-IN-POLYGON OVERLAY
• used to find the polygon within which a point falls,
or to find the point or points that fall within a
certain polygon
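One widely used way to decide whether a point falls within a polygon is the ray-casting test, sketched below. It is offered as an illustration rather than the method any particular GIS uses; the polygon, the point names and the function name are hypothetical, and points lying exactly on the boundary are not handled specially.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: count edge crossings of a ray going right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means inside
    return inside

# Hypothetical polygon and points
barangay = [(0, 0), (4, 0), (4, 4), (0, 4)]
wells = {"W1": (1, 1), "W2": (5, 2), "W3": (3.5, 3.9)}
selected = [name for name, xy in wells.items() if point_in_polygon(xy, barangay)]
print(selected)  # ['W1', 'W3']
```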
LINE-IN-POLYGON OVERLAY
• used to find the polygon within which a line or
lines fall, or to find the line or lines that fall
within a certain polygon
POLYGON-ON-POLYGON OVERLAY
• used to determine which polygons from
two layers intersect or are within another
polygon
COMMON TYPES OF
POLYGON-ON-POLYGON OVERLAY
• Union
• Clip
• Intersect
• Dissolve
• Merge
UNION
• Corresponds to the Boolean
operator OR
• Output layer contains
polygons from both input
layers
CLIP
• Keeps only the part of the
input layer that overlaps the
clip layer (input AND clip)
• A polygon layer is used to
cut out the portion of
another polygon layer that
falls within the first polygon
• Masking - analysis takes
place only inside the
unmasked areas, not in the
masked areas
• Opposite: Difference
(Boolean operator NOT)
INTERSECT
• Corresponds to the Boolean
operator AND
• Output is the polygon of
intersection of two polygon
layers
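The Boolean reading of union (OR) and intersect (AND) can be checked on a deliberately simple case. The sketch below uses two axis-aligned rectangles instead of general polygons, which is a simplifying assumption; the function names and coordinates are hypothetical.

```python
def rect_area(r):
    """Area of an axis-aligned rectangle given as (xmin, ymin, xmax, ymax)."""
    return (r[2] - r[0]) * (r[3] - r[1])

def intersect(a, b):
    """Rectangle covered by a AND b, or None if they do not overlap."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)

def union_area(a, b):
    """Area covered by a OR b (the overlap is counted only once)."""
    overlap = intersect(a, b)
    return rect_area(a) + rect_area(b) - (rect_area(overlap) if overlap else 0)

a = (0, 0, 4, 4)
b = (2, 2, 6, 6)
print(intersect(a, b))   # (2, 2, 4, 4)
print(union_area(a, b))  # 28
```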
DISSOLVE
• Used to remove boundaries
or nodes between adjacent
polygons or lines that have
the same values for a
specified attribute.
• Ex. Barangays with the same
political affiliation.
MERGE
• To create a new theme
containing two or more
adjacent themes of the
same shapefile type.
• Ex. Merge several barangay
themes with census data to
make a metropolitan area.
SPATIAL INTERPOLATION
• Procedure of estimating values at unsampled sites within
an area with existing observations
• Used to fill the gaps between observed data points
• Interpolated data is only an approximation of the true
value
• Common Methods:
1. Thiessen Polygons or Voronoi Polygons
2. Triangulated Irregular Network (TIN)
THIESSEN/VORONOI POLYGONS
• Used to establish
area territories for a
set of points.
• Example:
• rainfall distribution
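The defining property of a Thiessen (Voronoi) polygon is that every location inside it is closer to its own sample point than to any other. That property can be used directly, without constructing the polygons, as in this sketch; the gauge names, coordinates and rainfall values are hypothetical.

```python
import math

def nearest_station(location, stations):
    """Name of the station whose Thiessen polygon contains `location`.

    A location belongs to the polygon of the sample point nearest to it.
    """
    return min(stations, key=lambda name: math.dist(location, stations[name]))

# Hypothetical rain gauges: name -> (x, y), with rainfall in mm
stations = {"G1": (0, 0), "G2": (10, 0), "G3": (5, 8)}
rainfall_mm = {"G1": 12.0, "G2": 30.0, "G3": 21.5}

site = (6, 5)
gauge = nearest_station(site, stations)
print(gauge, rainfall_mm[gauge])  # G3 21.5
```

The unsampled site simply takes the value of its nearest gauge, which is why the resulting surface changes abruptly at polygon boundaries.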
THIESSEN/VORONOI POLYGONS
TRIANGULATED IRREGULAR NETWORK
• A way of constructing a
surface from a set of
irregularly spaced data
points
• Often used to generate DEMs
• Interpolation is based on
local data points
• Adjacent data points are
connected by lines to form a
network of irregular triangles
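Once the triangles exist, a value inside any one of them can be estimated from its three corner points. A minimal sketch of that local step, assuming linear (barycentric) interpolation on a single triangle; building the triangulation itself is not shown, and the survey points are hypothetical.

```python
def interpolate_in_triangle(p, a, b, c):
    """Linearly interpolate z at p = (x, y) inside the triangle a, b, c.

    Each vertex is (x, y, z). Returns None if p lies outside the triangle.
    """
    (x, y), (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        return None  # degenerate (zero-area) triangle
    # Barycentric weights: the influence of each vertex on the point
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3

# Hypothetical elevations (m) at three survey points
a, b, c = (0, 0, 100.0), (10, 0, 120.0), (0, 10, 140.0)
print(interpolate_in_triangle((2.5, 2.5), a, b, c))  # 115.0
```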
SPATIAL QUERYING
• “Select Features”
• Where the feature:
• Contains
• Equals
• Overlaps
• Crosses
• Intersects
• Is disjoint
• Touches
• Within
VECTOR TOOLS IN QGIS
III. RASTER OPERATIONS
RASTER DATA ANALYSIS
• Can be performed at the level of:
• Individual cells
• Groups of cells
• Cells within an entire raster
• ...belonging to:
• Single raster
• Two or more rasters
DATA ANALYSIS ENVIRONMENT
• Refers to the area for analysis and the output cell size
• Area extent for analysis:
• Specific raster
• Minimum and maximum x-, y- coordinates
• Combination of rasters (intersection; union)
• Output cell size
• Typically set to be equal to, or larger than, the largest cell size
among the input rasters
LOCAL OPERATIONS (SINGLE RASTER)
• Given a single raster as the input, a local operation
computes each cell value in the output raster as a
mathematical function of the cell value in the input raster.

Arithmetic, logarithmic,
trigonometric, and power
functions for
local operations
LOCAL OPERATIONS (SINGLE RASTER)
• E.g.: Slope_d = 57.296 x arctan (Slope_p/100)

A local operation can convert a slope raster from percent (a) to degrees (b).
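The percent-to-degrees conversion can be sketched as a cell-by-cell function. The constant 57.296 in the formula is approximately 180/π, so `math.degrees` does the same job; the small raster below is hypothetical.

```python
import math

def slope_percent_to_degrees(slope_percent):
    """Convert slope from percent rise to degrees."""
    return math.degrees(math.atan(slope_percent / 100.0))

# Hypothetical 2 x 3 slope raster, in percent
slope_p = [[0.0, 10.0, 50.0],
           [100.0, 25.0, 5.0]]

# A local operation: every output cell depends only on the same input cell
slope_d = [[round(slope_percent_to_degrees(v), 2) for v in row] for row in slope_p]
print(slope_d)
```

A 100 percent slope (rise equal to run) corresponds to 45 degrees, which is a quick sanity check on the output.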
LOCAL OPERATIONS (MULTIPLE RASTERS)
• Compositing, overlaying, or superimposing maps (Tomlin
1990)
• Map algebra, a term that refers to algebraic operations
with raster map layers. (Tomlin 1990, Pullar 2001)
• Besides mathematical functions that can be used on
individual rasters, other measures that are based on the
cell values or their frequencies in the input rasters can also
be derived and stored on the output raster of a local
operation with multiple rasters.
LOCAL OPERATIONS (MULTIPLE RASTERS)

The cell value in (d) is the mean


calculated from three input rasters (a,
b, and c) in a local operation. The
shaded cells have no data.
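A local mean over several rasters might look like the sketch below. The figure's cell values are not reproduced in the text, so the rasters here are hypothetical, and the rule that any no-data input yields a no-data output is an assumption about how the operation treats the shaded cells.

```python
NODATA = None

def local_mean(*rasters):
    """Cell-by-cell mean of several rasters of identical shape.

    Assumption: an output cell is no data if any input cell is no data.
    """
    rows, cols = len(rasters[0]), len(rasters[0][0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [raster[r][c] for raster in rasters]
            if any(v is NODATA for v in values):
                out_row.append(NODATA)
            else:
                out_row.append(sum(values) / len(values))
        out.append(out_row)
    return out

# Hypothetical 2 x 2 input rasters
a = [[1, 2], [3, NODATA]]
b = [[3, 2], [3, 4]]
c = [[5, 2], [6, 4]]
result = local_mean(a, b, c)
print(result)  # [[3.0, 2.0], [4.0, None]]
```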
LOCAL OPERATIONS (MULTIPLE RASTERS)

Each cell value in (c)


represents a unique
combination of cell values
in (a) and (b).
The combination codes
and their representations
are shown in (d).
RECLASSIFICATION (A LOCAL OPERATION)
• Recoding, or transforming through lookup tables (Tomlin
1990)
• 2 methods:
• One-to-one change
• Assigning a range of input cell values to a new value
• 3 main purposes:
• Create a simplified raster
• Create a new raster with a unique category of value
• Create a new raster showing ranking
(1) RECLASSIFICATION (A LOCAL OPERATION)
• May be referred to
as recoding or renumbering
• Uses reclassification rules to
reclassify or assign new
values to cells
• Results in a new image
• May be Boolean or Weighted
(1) RECLASSIFICATION
• Tool functions
a. generalize complex dataset
e.g. map of all land ownership reclassified to public and private
(5000+ classes to 2 classes)
b. reclassify data based on attributes
e.g. soils types recoded to soil suitability for septic systems
c. assign map classes to ordinal values
e.g. best, worst
d. produce a mask
e.g. map of Metro Manila recoded to binary map showing only
Quezon City (1) and everything else (0)
(1A) BOOLEAN RECLASSIFICATION
• Produces a two-coded image from a complex image
• Original image is reclassified to an image with only 0 or 1
as cell values
(1B) RECLASSIFICATION BY WEIGHTING
• A different weight is assigned to different feature types or
classes, based on the purpose of the reclassification
• Higher weights can be assigned to priority classes while
lower weights to those of lower priority
• For example, if the purpose is forest conservation, a weight
of 5 can be assigned to forestry and lower weights to
other features
(1) RECLASSIFICATION EXAMPLE
Land use            Original value   New value: Boolean   New value: Weighted
Forestry            10               1                    4
Water               11               0                    2
Settlement          12               0                    1
Agricultural Land   13               0                    3
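Both reclassifications in the example table amount to a lookup from the original cell value to a new one. A minimal sketch, using the codes and new values from the table; the small land-use raster and the function name are hypothetical.

```python
# Lookup tables taken from the example table: original value -> new value
BOOLEAN = {10: 1, 11: 0, 12: 0, 13: 0}   # forestry = 1, everything else = 0
WEIGHTED = {10: 4, 11: 2, 12: 1, 13: 3}  # forestry weighted highest

def reclassify(raster, lookup):
    """Return a new raster with each cell value replaced via the lookup table."""
    return [[lookup[value] for value in row] for row in raster]

# Hypothetical land-use raster using the original codes 10 to 13
land_use = [[10, 10, 11],
            [13, 12, 11],
            [13, 13, 10]]

print(reclassify(land_use, BOOLEAN))   # [[1, 1, 0], [0, 0, 0], [0, 0, 1]]
print(reclassify(land_use, WEIGHTED))  # [[4, 4, 2], [3, 1, 2], [3, 3, 4]]
```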
RECLASSIFICATION (A LOCAL OPERATION)
NEIGHBORHOOD OPERATIONS
• A neighborhood operation involves a
focal cell and a set of its surrounding
cells. The surrounding cells are chosen
for their distance and/or directional
relationship to the focal cell.
NEIGHBORHOOD OPERATIONS

Four common
neighborhood types:
rectangle (a), circle
(b), annulus (c), and
wedge (d). The cell
marked with an x is
the focal cell.
NEIGHBORHOOD OPERATIONS
The cell values in (b) are
the neighborhood
majority statistics of the
shaded cells in (a) using
a 3 x 3 neighborhood.
For example, the upper left
cell in the output
raster has a cell value of
2 because there are five
2s and four 1s in its
neighborhood.
NEIGHBORHOOD OPERATIONS

The cell values in (b) are
the neighborhood means
of the shaded cells in (a)
using a 3 x 3
neighborhood. For
example, 1.56 in the
output raster is calculated
from (1 + 2 + 2 + 1 + 2 + 2 + 1
+ 2 + 1) / 9.
NEIGHBORHOOD OPERATIONS
The cell values in (b) are
the neighborhood range
statistics of the shaded
cells in (a) using a 3 x 3
neighborhood. For
example, the upper-left
cell in the output raster
has a cell value of 100,
which is calculated from
(200 – 100).
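The three neighborhood statistics illustrated above (majority, mean and range) differ only in the function applied to the window of cells. The sketch below assumes a 3 x 3 rectangular neighborhood that is clipped at the raster edge, which is one possible edge rule and not necessarily the one used in the figures; the grid values are hypothetical apart from the nine values quoted in the mean example.

```python
from collections import Counter

def neighborhood(raster, row, col):
    """Values in the 3 x 3 window centered on (row, col), clipped at the edges."""
    return [raster[r][c]
            for r in range(max(0, row - 1), min(len(raster), row + 2))
            for c in range(max(0, col - 1), min(len(raster[0]), col + 2))]

def focal(raster, statistic):
    """Apply `statistic` to the 3 x 3 neighborhood of every cell."""
    return [[statistic(neighborhood(raster, r, c))
             for c in range(len(raster[0]))]
            for r in range(len(raster))]

def mean(values):
    return round(sum(values) / len(values), 2)

def majority(values):
    return Counter(values).most_common(1)[0][0]

def value_range(values):
    return max(values) - min(values)

# The nine values quoted in the mean example: (1+2+2+1+2+2+1+2+1) / 9
window = [1, 2, 2, 1, 2, 2, 1, 2, 1]
print(mean(window))      # 1.56
print(majority(window))  # 2, since this window holds five 2s and four 1s

# Hypothetical raster; the center cell sees the whole grid
grid = [[100, 150, 200],
        [120, 160, 180],
        [110, 130, 170]]
print(focal(grid, value_range)[1][1])  # 100, from 200 - 100
```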
ZONAL OPERATIONS
• A zonal operation works with groups of cells of the same value or
like features. These groups are called zones. Zones may be
contiguous or non-contiguous.
• A zonal operation may work with a single raster or two rasters.
• Given a single input raster, zonal operations measure the
geometry of each zone in the raster, such as area, perimeter,
thickness, and centroid.
• Given two rasters in a zonal operation, one input raster and
one zonal raster, a zonal operation produces an output raster,
which summarizes the cell values in the input raster for each
zone in the zonal raster.
ZONAL OPERATIONS

Thickness and centroid for two
large watersheds (zones). Area
is measured in square kilometers,
and perimeter and thickness are
measured in kilometers. The
centroid of each zone is marked
with an x.
ZONAL OPERATIONS
The cell values in (c)
are the zonal means
derived from an input
raster (a) and a zonal
raster (b). For
example,
2.17 is the mean of
{1, 1, 2, 2, 4, 3} for
zone 1.
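The zonal mean can be sketched by grouping input cells by their zone and writing each zone's mean back to every cell of that zone. The cell layout below is hypothetical; only the set of values in zone 1, {1, 1, 2, 2, 4, 3}, comes from the example.

```python
from collections import defaultdict

def zonal_mean(values, zones):
    """Output raster in which each cell holds the mean of its zone."""
    groups = defaultdict(list)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            groups[zone].append(value)
    means = {zone: round(sum(v) / len(v), 2) for zone, v in groups.items()}
    return [[means[zone] for zone in zone_row] for zone_row in zones]

# Hypothetical layout; zone 1 holds the values {1, 1, 2, 2, 4, 3}
values = [[1, 1, 5],
          [2, 2, 7],
          [4, 3, 6]]
zones = [[1, 1, 2],
         [1, 1, 2],
         [1, 1, 2]]

output = zonal_mean(values, zones)
print(output[0])  # [2.17, 2.17, 6.0]
```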
DISTANCE MEASURES
• Physical distance is measured as the straight-line
(Euclidean) distance.
• Physical distance measure operations calculate
straight-line distances away from cells designated
as the source cells.
DISTANCE MEASURES

A straight-line distance is measured


from a cell center to another cell
center. This illustration shows the
straight-line distance between cell
(1,1) and cell (3,3).
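The distance between cell (1,1) and cell (3,3) follows from the row and column offsets of the two cell centers. A small sketch, assuming cells are addressed as (row, column); the 30 m cell size in the second call is a hypothetical value used to show the conversion to ground units.

```python
import math

def cell_distance(cell_a, cell_b, cell_size=1.0):
    """Straight-line distance between two cell centers, given (row, column)."""
    d_row = cell_b[0] - cell_a[0]
    d_col = cell_b[1] - cell_a[1]
    return math.hypot(d_row, d_col) * cell_size

print(round(cell_distance((1, 1), (3, 3)), 3))        # 2.828 cell units
print(round(cell_distance((1, 1), (3, 3), 30.0), 1))  # 84.9 with 30 m cells
```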
ALLOCATION AND DIRECTION
• Allocation produces a raster in which the cell value
corresponds to the closest source cell for the cell.
• Direction produces a raster in which the cell value
corresponds to the direction in degrees that the cell is
from the closest source cell.
ALLOCATION AND DIRECTION

Based on the source cells denoted as 1 and 2, (a) shows the physical distance measures
in cell units from each cell to the closest source cell; (b) shows the allocation of each cell
to the closest source cell; and (c) shows the direction in degrees from each cell to the
closest source cell. The cell in a dark shade (row 3, column 3) has the same distance to
both source cells. Therefore, the cell can be allocated to either source cell. The direction
of 243° is to source cell 1.
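The distance and allocation rasters can be sketched together, since both come from the same search for the closest source cell. The grid size and source positions below are hypothetical and do not reproduce the figure; ties are given to the first source listed, which is one arbitrary but valid choice. The direction raster is left out because its angle convention is not stated in the text.

```python
import math

def distance_and_allocation(sources, rows, cols):
    """Physical distance and allocation rasters for a set of source cells.

    `sources` maps a source id to its (row, col), counted from 0.
    """
    distance, allocation = [], []
    for r in range(rows):
        d_row, a_row = [], []
        for c in range(cols):
            best_id, best_d = None, math.inf
            for source_id, (sr, sc) in sources.items():
                d = math.hypot(r - sr, c - sc)
                if d < best_d:  # strict comparison keeps the first source on ties
                    best_id, best_d = source_id, d
            d_row.append(round(best_d, 2))
            a_row.append(best_id)
        distance.append(d_row)
        allocation.append(a_row)
    return distance, allocation

# Hypothetical 3 x 3 grid: source 1 at the top left, source 2 at the bottom right
dist, alloc = distance_and_allocation({1: (0, 0), 2: (2, 2)}, 3, 3)
print(dist)   # [[0.0, 1.0, 2.0], [1.0, 1.41, 1.0], [2.0, 1.0, 0.0]]
print(alloc)  # [[1, 1, 1], [1, 1, 2], [1, 2, 2]]
```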
DISTANCE MEASURES

• Continuous distance
measures from a
stream network.
• (Euclidean Distance
tool)
APPLICATIONS OF BUFFERING
• Epidemiology
• Disease clusters around certain
features (e.g. asthma surrounding
incinerators)
• Crime
• To examine if car crime is more
prominent in certain areas (e.g.
close to major roads, street
corners, car parks)
APPLICATIONS OF BUFFERING
• Density
– Spreads point values over a
surface
– When added together, the
values (e.g. population) of all
the cells equal the sum of the
values in the original point layer.
OTHER RASTER DATA OPERATIONS
• For raster data management
• includes Clip and Mosaic.
• For raster data extraction
• includes use of a data set, a graphic object, or a query
expression to create a new raster by extracting data from an
existing raster.
• For raster data generalization
• includes Aggregate and RegionGroup.
CLIPPING

An analysis mask (b) is used to clip an input raster (a). The output raster is (c),
which has the same area extent as the analysis mask.
EXTRACTION
A circle, shown in white, is
used to extract cell values
from the input
raster (a). The output (b)
has the same area extent
as the input raster
but has no data outside
the circular area.
AGGREGATE

An Aggregate operation creates a lower-resolution raster (b) from the input (a). The
operation uses the mean statistic and a factor of 2 (i.e., a cell in b covers 2 x 2 cells in a).
For example, the cell value of 4 in (b) is the mean of {2, 2, 5, 7} in (a).
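The aggregation described above can be sketched as a block mean. The 4 x 4 input is hypothetical except for the top-left block, which holds the values {2, 2, 5, 7} from the example; raster dimensions are assumed to be divisible by the factor.

```python
def aggregate_mean(raster, factor):
    """Lower-resolution raster: each output cell is the mean of a
    factor x factor block of input cells."""
    out = []
    for r in range(0, len(raster), factor):
        out_row = []
        for c in range(0, len(raster[0]), factor):
            block = [raster[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

# Hypothetical 4 x 4 input; the top-left block holds {2, 2, 5, 7}
raster = [[2, 2, 1, 3],
          [5, 7, 1, 3],
          [4, 4, 6, 6],
          [4, 4, 8, 8]]
coarse = aggregate_mean(raster, 2)
print(coarse)  # [[4.0, 2.0], [4.0, 7.0]]
```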
REGION GROUP

Each cell in the output (b) has a unique number that identifies the connected region
to which it belongs in the input (a). For example, the connected region that has the
same cell value of 3 in (a) has a unique number of 4 in (b).
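Region grouping is a connected-component labeling problem. The sketch below uses a flood fill and assumes 4-connectivity (cells sharing an edge); whether diagonal neighbors also count is a setting the text does not specify, and the input raster is hypothetical.

```python
def region_group(raster):
    """Label connected regions of equal value (4-connectivity assumed)."""
    rows, cols = len(raster), len(raster[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue  # already assigned to a region
            next_label += 1
            labels[r][c] = next_label
            stack = [(r, c)]
            while stack:  # flood fill over neighbors sharing the same value
                i, j = stack.pop()
                for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not labels[ni][nj]
                            and raster[ni][nj] == raster[i][j]):
                        labels[ni][nj] = next_label
                        stack.append((ni, nj))
    return labels

raster = [[1, 1, 2],
          [3, 1, 2],
          [3, 3, 1]]
regions = region_group(raster)
print(regions)  # [[1, 1, 2], [3, 1, 2], [3, 3, 4]]
```

Note that the two separate patches of value 1 receive different region numbers (1 and 4), which is the point of the operation.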
(3) SURFACE ANALYSIS
• Users can build and analyze complex surfaces to identify
patterns or features within the data.
• Many patterns that are not readily apparent in the original
data can be derived from the existing surface.
• Shaded relief
• Contours
• Angle of slope
• Aspect
• Hillshade
• Viewshed
• Curvature
• Cut/Fill.
(3) SURFACE ANALYSIS
• Topographic datasets can be utilized to
generate a number of products
such as:
• Digital Elevation Models
• Slope/Aspect Polygons
• Terrain profiles
• Inter-visibility models
PRODUCTS FROM DEM
• slope
• aspect
• elevation
• stream networks
• drainage basins
(3) SURFACE ANALYSIS TOOLS
a. Contour
b. Slope
c. Aspect
ADDED INFORMATION
DATA SOURCE
ACKNOWLEDGEMENT
THANK YOU…
