
2021/08/30

Lecture 5

GIS220
Exploratory Spatial Data
Analysis

Prof Gregory Breetzke


greg.breetzke@up.ac.za
Room 1-19, Geography

Lecture outline
⚫ Spatial sampling and frameworks

⚫ What is ESDA?

⚫ ESDA tools


Spatial sampling
⚫ Different to ‘normal’ statistical sampling. Why?

⚫ Tobler’s first law of geography

⚫ Two aspects of statistical sampling supported by GIS packages:
 The selection (or sampling) of specific point (vector) or grid cell (raster) locations within an existing dataset
 The removal of spatial bias from collected datasets using declustering

Point sampling within raster layer


Point sampling within vector polygons


Two stage process

Grids:
• Size
• Shape
• Orientation

Environmental and social


Sampling frameworks within zones


Selection of 5 random points per zone
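
One way to select a fixed number of random points per zone is rejection sampling: draw uniform points in the zone's bounding box and keep those that fall inside the polygon. The sketch below is an illustration only, not the method of any particular GIS package; the function names and the triangular zone are hypothetical.

```python
import random

def point_in_polygon(x, y, poly):
    """Ray-casting test: True if (x, y) lies inside the polygon (list of vertices)."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Does a horizontal ray from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def random_points_in_zone(poly, k=5, seed=None):
    """Return k uniformly random points inside one zone polygon."""
    rng = random.Random(seed)
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    points = []
    while len(points) < k:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, poly):  # reject points outside the zone
            points.append((x, y))
    return points

# Hypothetical triangular zone, for illustration only
zone = [(0, 0), (10, 0), (0, 10)]
sample = random_points_in_zone(zone, k=5, seed=1)
```

Repeating the call for each zone polygon gives the same number of points per zone regardless of zone area.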

Grid generation - square grid within field boundaries

Grid generation (hexagonal) - selection of 1 point per cell, random offset from centre
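
The "one point per cell, random offset from centre" idea can be sketched as follows. For simplicity this sketch uses a square grid over a rectangular extent rather than the hexagonal grid shown on the slide; the function name and the `max_offset` parameter are assumptions made for illustration.

```python
import random

def grid_sample(xmin, ymin, xmax, ymax, cell, max_offset=0.5, seed=None):
    """Return one point per square cell, randomly offset from the cell centre.

    max_offset is a fraction of the cell size; 0.5 lets the point fall
    anywhere in the cell, 0 places it exactly at the centre.
    """
    rng = random.Random(seed)
    nx = int((xmax - xmin) // cell)  # number of whole cells along x
    ny = int((ymax - ymin) // cell)  # number of whole cells along y
    points = []
    for row in range(ny):
        for col in range(nx):
            cx = xmin + (col + 0.5) * cell  # cell centre
            cy = ymin + (row + 0.5) * cell
            dx = rng.uniform(-max_offset, max_offset) * cell
            dy = rng.uniform(-max_offset, max_offset) * cell
            points.append((cx + dx, cy + dy))
    return points

pts = grid_sample(0, 0, 100, 100, cell=25, seed=42)
```

Changing `cell` alters the grid size; grid shape and orientation would need a different cell generator.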

Spatial data exploration


A. 10% random sample from existing point set
800 radio-activity monitoring sites in Germany. Random sample of 80 (red/large dots)

B. Stratified random selection, 30% of each stratum
200 radio-activity monitoring sites in Germany. Random sample of 30 (red/large dots) <100 units of radiation and 30 (crosses) >=100 units of radiation
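
A stratified random selection like example B can be sketched as below: split the sites into two strata at 100 units of radiation, then draw 30% of each stratum at random. The site records here are synthetic placeholders, not the German monitoring data, and the function name is hypothetical.

```python
import random

def stratified_sample(sites, threshold=100.0, fraction=0.3, seed=None):
    """Split (site_id, value) records at the threshold and sample each stratum."""
    rng = random.Random(seed)
    low = [s for s in sites if s[1] < threshold]
    high = [s for s in sites if s[1] >= threshold]

    def pick(stratum):
        # Sample without replacement, same fraction from every stratum
        return rng.sample(stratum, round(len(stratum) * fraction))

    return pick(low), pick(high)

# Synthetic data: 100 sites below the threshold, 100 at or above it
sites = [(i, 50.0) for i in range(100)] + [(i, 150.0) for i in range(100, 200)]
low_sample, high_sample = stratified_sample(sites, seed=0)
```

Unlike the simple 10% random sample in example A, this guarantees both strata are represented in proportion to their size.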


Random points on a network

100 points in Tripolis, Greece


Declustering

⚫ Point samples may be unduly spatially clustered
 Boreholes, access in built-up or industrial zones

⚫ Solution: spatial declustering

⚫ Removing or reducing the known or estimated adverse effects of clustering
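
One common declustering approach (not necessarily the one used in this lecture) is cell declustering: overlay a grid, then give each point a weight inversely proportional to the number of points sharing its cell, so that densely sampled cells do not dominate summary statistics. The function name, cell size and sample values below are assumptions for illustration.

```python
from collections import Counter

def cell_declustering_weights(points, cell):
    """Return one weight per point; weights sum to 1.

    Each occupied cell receives equal total weight, shared equally
    among the points that fall inside it.
    """
    cells = [(int(x // cell), int(y // cell)) for x, y in points]
    counts = Counter(cells)
    n_occupied = len(counts)
    return [1.0 / (counts[c] * n_occupied) for c in cells]

# Three clustered points and one isolated point (made-up values)
points = [(1, 1), (2, 2), (3, 3), (15, 15)]
values = [10.0, 10.0, 10.0, 50.0]

weights = cell_declustering_weights(points, cell=10)
naive_mean = sum(values) / len(values)
declustered_mean = sum(w * v for w, v in zip(weights, values))
```

Here the naive mean is pulled towards the clustered values, while the declustered mean treats the cluster and the isolated point as two equally weighted cells. The result depends on the chosen cell size.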

What is ESDA?
⚫ Description and exploration of spatial datasets

⚫ Mapping – geovisualisation

⚫ Data mining (no spatial visualisation)


ESDA allows you to examine your data in different ways.
Before empirical analysis, ESDA lets you gain a deeper
understanding of the phenomena you are investigating so that
you can make better decisions on issues relating to your data.


EDA and ESDA


⚫ Basic aims
 Maximise insight into a data set
 Uncover underlying structure
 Extract important variables
 Detect outliers and anomalies
 Test underlying assumptions
 Develop parsimonious models

Goal or expected outcome of exploration usually unknown in advance

Descriptive statistics (last lecture)

⚫ Counts and specific values
 Count, majority, minority, min, max, sum (at neighbourhood or zonal level)

⚫ Measures of centrality
 Mean, mode, median

⚫ Measures of spread
 Range, upper and lower quartile, IQR, variance, SD, correlation coefficient

⚫ Measures of distribution shape
 Skewness (<0 left; >0 right), kurtosis
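
Several of these measures can be computed with Python's standard library, as in the minimal sketch below. Skewness is computed here as the population moment coefficient (third central moment divided by the 1.5 power of the variance); other software may apply a sample correction. The function name and example values are hypothetical.

```python
import statistics

def describe(values):
    """Return basic measures of centrality, spread and shape."""
    n = len(values)
    mean = statistics.fmean(values)
    q1, _, q3 = statistics.quantiles(values, n=4)  # lower and upper quartiles
    m2 = sum((v - mean) ** 2 for v in values) / n  # population variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return {
        "count": n,
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "mean": mean,
        "median": statistics.median(values),
        "iqr": q3 - q1,
        "sd": statistics.stdev(values),  # sample standard deviation
        "skewness": m3 / m2 ** 1.5,      # <0 left-skewed, >0 right-skewed
    }

symmetric = describe([1, 2, 3, 4, 5, 6, 7, 8, 9])
right_skewed = describe([1, 1, 1, 2, 10])
```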


What is ESDA?

⚫ Simplest form = basic statistics

⚫ Graphical analysis
1. Histograms
2. Quantile map
3. Percentile map
4. Box plots
5. Box map
6. Scatterplots
7. Parallel co-ordinate plot
8. Variography

OWNER_OCCUPANCY 86 census areas


Histograms

What can we infer?


Mapped histograms

Global and local outliers


OWNER_OCCUPANCY 86 census areas

Quantile map


Percentile map
⚫ Percentile maps in GeoDa highlight extreme values, i.e., observations that are in the bottom and top 1% of a data distribution.
⚫ These maps group a ranked distribution into six fixed categories:
 0-1%,
 1-10%,
 10-50%,
 50-90%,
 90-99%, and
 99-100%.


Box plots
Johannesburg homicide sample data set for 78 neighbourhoods

Hinge
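
The hinge sets how far beyond the quartiles a value must lie to be flagged as an outlier: the fences sit at the quartiles plus or minus the hinge multiplied by the IQR. A minimal sketch follows, assuming the commonly used hinge of 1.5; the function name and data are made up, and quartile conventions differ slightly between packages.

```python
import statistics

def box_plot_outliers(values, hinge=1.5):
    """Return (lower fence, upper fence, outliers) for a box plot."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower = q1 - hinge * iqr
    upper = q3 + hinge * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers

# Made-up values with one extreme observation
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
lower, upper, outliers = box_plot_outliers(data)
```

A larger hinge (for example 3.0) widens the fences and flags only the most extreme observations.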

Mapped box plots


Example
% Female ~ Normal


Example
% Foreign Born Skewed to Right


Example
% Unemployed Skewed to Right


Box Map


Scatter plot
Johannesburg homicide sample data set for 78 neighbourhoods

Parallel coordinate plot (PCP) & star plot


Advantages of PCPs
⚫ Peculiarities of the data can be revealed and an appreciation obtained as to how the data should be further processed (e.g. filtered, transformed, split, combined, etc.).

⚫ Hypotheses can be generated for further testing (e.g. using statistical methods).

⚫ Proper methods can be selected for in-depth analysis of the data.


Brushing and linking ESDA tools


ESDA and GIS

⚫ What currently can/cannot be done in standard GIS?

⚫ CAN
 Some non-spatial stats
 Presentation graphics

⚫ CAN’T
 Identify outliers/spatial outliers
 Implement most ESDA techniques
 Employ visualisation methods

