
Data entry and preparation

A. Arko-Adjei
Department of Geomatic Engineering
KNUST, Kumasi, Ghana
arkoadjei@hotmail.com

March 2013
Course content
• Introduction to GIS
• Spatial data types and representation
• Data input and methods of data capture
• Spatial referencing
• Fundamentals of remote sensing
• Sensors and platforms
• Image data characteristics and image interpretation
• Remote sensing applications

Lecture overview
• Spatial data input
• Examples of existing data
• Spatial data formats
• Digitizing
• Data preparation
• Clearing house

Spatial Data Input

• Spatial data is ideally gathered through
• Direct spatial data capture
• Indirect spatial data capture

Spatial Data Input
Direct spatial data capture
•Direct observation of the relevant geographic phenomena.
•This can be done through
• ground-based field surveys, (GPS, theodolites, tapes, etc.)
• remote sensing
• photogrammetry

Spatial Data Input
Direct spatial data capture
•Ground-based techniques remain the most important source
for reliable data in many cases.
•Data which is captured directly from the environment is
known as primary data.
•The core concern with primary data is knowing its properties:
the process by which it was captured, the parameters of any
instruments used, and the rigor with which quality
requirements were observed.

Spatial Data Input
• Remotely sensed imagery is usually not fit for immediate use,
as various sources of error and distortion may have been
present, and the imagery should first be corrected for these errors.
• In practice, it is not always feasible to obtain spatial data by
direct spatial data capture.
• Factors of cost and available time may be a hindrance, or
previous projects sometimes have acquired data that may fit
the current project’s purpose.
• The increasing availability of geographic data, and the
growing pressure on organisations to perform more
efficiently, have made it common practice to use data from
existing sources, such as existing maps and digital data sets.

Indirect spatial data capture
• Spatial data can also be sourced indirectly
• This type of data is known as secondary data
• Sources of secondary data
• data derived from existing paper maps through scanning
• data digitized from a satellite image
• processed data purchased from data capture firms or
international agencies, and so on.

Examples of existing data sets

• Base map data (road, stream, city, town, boundary
features, etc.)
• Natural resources (land use, land cover, soil type, geology,
water, etc)
• Digital Elevation data (heights and slopes)
• Census-related data (population distribution, information
on income, employment, etc)

Spatial data formats

Categories of vector data formats:


•GIS import/export formats (e.g. ArcView Shape, etc.)
•CAD/CAM formats (e.g. AutoCAD-dxf, etc.)
•Graphic formats
•Generic spatial data formats (e.g. TIGER files, etc.)
•Standard exchange formats (e.g. SDTS, etc.)

Spatial data formats

Categories of raster data formats:


•Graphic formats (typically not georeferenced), e.g. TIFF,
BMP, etc.
•Imagery formats (typically georeferenced), e.g. GeoTIFF,
etc.
•Generic compression formats (e.g. JPEG, etc.)

Digitising
• A traditional method of obtaining spatial data is through
digitizing existing paper maps.
• Digitising is the conversion of an analogue map into a
digital vector map
• It is a cost-effective method of data capture
• Before adopting this approach, one must be aware that
positional errors already in the paper map will further
accumulate, and one must be willing to accept these errors.
• A number of digitising techniques exist

Digitising process

[Flow diagram: Input document -> on-board/tablet digitizing, on-screen digitizing, or scanner -> vectorization process -> spatial database]

On-tablet manual digitising

[Figure: digitising tablet]

On-screen manual digitising

[Figure: scanned image with cursor]

On-screen and on-tablet digitizing
• In both approaches, an operator follows the map’s features
(mostly lines) with a mouse device, thereby tracing the
lines and storing location coordinates relative to a number
of previously defined control points.
• The function of these points is to ‘lock’ a coordinate
system onto the digitized data:
• Control points on the map have known coordinates, and by
digitizing them we tell the system implicitly where all
other digitized locations are.
• At least three control points are needed, but preferably
more should be digitized to allow a check on the positional
errors made.

Map registration using control points

• For the coordinates to be useful, they must be in the
same coordinate system as that used by the map being
digitized.
• That system is based on a map projection, usually yielding
units of metres on the ground, and is called the grid
coordinate system.
• The process of solving for the parameters of a predefined
set of equations is called map registration, a
method of geo-referencing.
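
As an illustration of map registration, the sketch below assumes the predefined equations are a six-parameter affine transformation (X = a*x + b*y + c, Y = d*x + e*y + f), which is one common choice and not necessarily the one used by a given GIS. The control point values and the function names `fit_affine`, `apply_affine` and `rms_error` are made up for this example. With more than three control points, the fit is a least-squares solution and the RMS error gives a check on positional errors.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve3(m: List[List[float]], v: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear or too few")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def fit_affine(digitized: Sequence[Point], ground: Sequence[Point]):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f."""
    if len(digitized) < 3 or len(digitized) != len(ground):
        raise ValueError("need at least three matching control points")
    # Normal equations (A^T A) p = A^T t, where each row of A is [x, y, 1].
    rows = [(x, y, 1.0) for x, y in digitized]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atx = [sum(r[i] * g[0] for r, g in zip(rows, ground)) for i in range(3)]
    aty = [sum(r[i] * g[1] for r, g in zip(rows, ground)) for i in range(3)]
    return solve3(ata, atx), solve3(ata, aty)


def apply_affine(params, point: Point) -> Point:
    """Transform one digitized point into grid coordinates."""
    (a, b, c), (d, e, f) = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


def rms_error(params, digitized, ground) -> float:
    """Root-mean-square residual at the control points."""
    total = 0.0
    for d, (gx, gy) in zip(digitized, ground):
        px, py = apply_affine(params, d)
        total += (gx - px) ** 2 + (gy - py) ** 2
    return (total / len(digitized)) ** 0.5


# Hypothetical control points: tablet coordinates and their grid coordinates.
tablet = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
grid = [(1000.0, 2000.0), (2000.0, 2000.0), (1000.0, 3000.0), (2000.0, 3000.0)]

params = fit_affine(tablet, grid)
centre = apply_affine(params, (5.0, 5.0))
residual = rms_error(params, tablet, grid)
```

Four control points are used here instead of the minimum of three, so the residual is meaningful as a quality check.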

On-screen versus Manual Digitising

Advantages of on-screen digitising over tablet digitising:
• More comfortable for the operator
• More accurate (zooming facilities)
• Faster (digitising and editing at the same time)
• Easier updating (geometrically corrected satellite
imagery and scanned aerial photos can be overlaid with
the old vector data)
Requirement:
• Source documents have to be scanned

Digitising of features

Points
Point | X,Y coordinates
1 | 2,4
2 | 3,2
3 | 5,3
4 | 6,2

Lines/Arcs
Line | X,Y coordinates
1 | 2,4: 3,3: 4,2: 5,2: 6,1: 7,2.5
2 | 1,4: 2,3: 3,2: 4,1

Polygons
Polygon | X,Y coordinates
1 | 2,4: 2,5: 3,6: 4,5: 3,4: 2,4
2 | 3,2: 3,3: 4,3: 4,4: 5,4: 5,1: 3,2
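
The coordinate lists above map naturally onto simple data structures. The following is a minimal sketch of one way to hold them in Python; the helper names `is_closed`, `line_length` and `ring_area` are illustrative and do not come from any particular GIS package. It shows that a polygon is stored as a line whose last vertex repeats the first.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float]

# Features keyed by their identifier, using the coordinates from the slide.
points: Dict[int, Coord] = {1: (2, 4), 2: (3, 2), 3: (5, 3), 4: (6, 2)}
lines: Dict[int, List[Coord]] = {
    1: [(2, 4), (3, 3), (4, 2), (5, 2), (6, 1), (7, 2.5)],
    2: [(1, 4), (2, 3), (3, 2), (4, 1)],
}
polygons: Dict[int, List[Coord]] = {
    1: [(2, 4), (2, 5), (3, 6), (4, 5), (3, 4), (2, 4)],
    2: [(3, 2), (3, 3), (4, 3), (4, 4), (5, 4), (5, 1), (3, 2)],
}


def is_closed(ring: List[Coord]) -> bool:
    """A polygon ring must end on the vertex it started from."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def line_length(line: List[Coord]) -> float:
    """Sum of the straight-line distances between consecutive vertices."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(line, line[1:])
    )


def ring_area(ring: List[Coord]) -> float:
    """Unsigned area of a closed ring (shoelace formula)."""
    twice_area = sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return abs(twice_area) / 2.0


all_closed = all(is_closed(ring) for ring in polygons.values())
areas = {pid: ring_area(ring) for pid, ring in polygons.items()}
lengths = {lid: line_length(line) for lid, line in lines.items()}
```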

Copyright: A. Arko-Adjei
Semi-automatic and automatic digitizing
• Another set of techniques also works from a scanned
image of the original map
• These techniques use the GIS to find features in the image
• These techniques are known as semi-automatic or
automatic digitizing, depending on how much operator
interaction is required.
• This procedure is less labour-intensive, but can only be
applied on relatively simple sources.
• If vector data is to be distilled from this procedure, a
process known as vectorization follows the scanning
process.

Scanning processes
• The basic principle is a light source which illuminates
the document, and a sensor which measures the
intensity of the reflected or transmitted light
• Scanners are of various types with various resolutions
• The minimum required resolution depends on the details
in the map and the digitising technique
• 200-300 dpi for manual on-screen map digitising
• 800-2400 dpi for photogrammetric applications
• Documents can be scanned in colour, grey-scale, or line art
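
To judge whether a scan resolution is sufficient, it can help to convert dpi into the size of one pixel on paper and on the ground. The sketch below uses the fact that one inch is 25.4 mm; the map scale of 1:50,000 is an assumed example value, not a figure from this lecture, and the function names are illustrative.

```python
MM_PER_INCH = 25.4


def pixel_size_mm(dpi: float) -> float:
    """Size of one scanned pixel on the paper document, in millimetres."""
    return MM_PER_INCH / dpi


def ground_pixel_size_m(dpi: float, scale_denominator: float) -> float:
    """Ground distance covered by one pixel, in metres, for a 1:scale map."""
    return pixel_size_mm(dpi) / 1000.0 * scale_denominator


# Assumed example: a 1:50,000 map scanned at the resolutions listed above.
for dpi in (200, 300, 800, 2400):
    size = ground_pixel_size_m(dpi, 50_000)
    print(f"{dpi:>5} dpi -> {pixel_size_mm(dpi):.3f} mm on paper, "
          f"{size:.2f} m on the ground")
```

A higher dpi gives a smaller ground pixel, at the cost of a larger image file.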

Scanner Output
• The scanner output is only a digital copy of the
source document resolved into a matrix of cells
(pixels)
• Data are not structured into classified and coded
objects
• To obtain this, the data have to be vectorized and
further structured

Vectorization
• The conversion from raster to vector
• The process converts the pixel values of the scanned
document to points, lines and polygons with attributes
equivalent to the pixel values
• The data has to be structured after vectorization
• Splitting lines to form line segments and nodes
• Joining line segments to form objects
• Object coding
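
A full vectorizer is beyond a short example, but one early step can be sketched: grouping adjacent pixels of equal value into regions, each carrying the pixel value as its attribute. The code below is a simplified illustration using 4-connected flood fill on a tiny made-up raster; it is not the algorithm of any specific GIS, and boundary tracing into polygon outlines is deliberately left out. The name `group_regions` and the background value 0 are assumptions for this example.

```python
from collections import deque
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column)


def group_regions(raster: List[List[int]], background: int = 0) -> List[Dict]:
    """Group 4-connected cells of equal value into regions.

    Each region keeps the pixel value as its attribute, together with
    the list of cells that belong to it.
    """
    n_rows, n_cols = len(raster), len(raster[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    regions: List[Dict] = []
    for r in range(n_rows):
        for c in range(n_cols):
            value = raster[r][c]
            if seen[r][c] or value == background:
                continue
            # Breadth-first flood fill from this unvisited cell.
            seen[r][c] = True
            queue = deque([(r, c)])
            cells: List[Cell] = []
            while queue:
                cr, cc = queue.popleft()
                cells.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc),
                               (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < n_rows and 0 <= nc < n_cols
                            and not seen[nr][nc]
                            and raster[nr][nc] == value):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append({"value": value, "cells": cells})
    return regions


# Made-up scanned raster: 0 is background, other values are map classes.
scan = [
    [1, 1, 0, 2],
    [1, 0, 0, 2],
    [0, 0, 3, 3],
]
regions = group_regions(scan)
summary = [(reg["value"], len(reg["cells"])) for reg in regions]
```

Each resulting region would still need its outline traced and coded before it becomes a polygon object in the database.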

Selecting a digitising technique
• The technique to be preferred depends on
• quality of the map sheet
• contents of the map sheet

• For complex documents full of details and symbols
(topographical maps, aerial photographs), manual
digitising is preferred

Selecting a digitising technique

Digitising technique | Type of document | Requirements
Manual digitising on tablets | Documents: complex maps / interpretation from satellite imagery or aerial photographs | Digitising tablet
Manual digitising on-screen | Documents: complex maps / interpretation from satellite imagery or aerial photographs | Scanner or scanned document
Semi-automatic (or interactive) digitising | Simple documents that require some interpretation | Scanner or scanned documents / processing software for semi-automatic line tracing
(Fully) automatic digitising | Simple documents or separates with one type of information | Scanner or scanned document / processing software for the vectorization process

Digitising errors
• Some common digitizing errors that occur in GIS
include
• undershoots: a line stops short of the entity it was
supposed to connect to, for example failing to close a polygon
• overshoots: a line goes beyond the entity it was
supposed to connect to
• Distortions can be corrected by rubber sheeting
  • based on the movement of known control points
  to new locations
  • for example, by comparing accurate ground survey data
  with aerial survey data
• Edge matching
  • Needed where distorted edges result from maps digitized
  at different times or with different coordinate systems
  • Causes include changes in humidity, inherent
  digitizer tablet inaccuracies, and missing or
  overlapping map coverage

Data preparation
• Spatial data preparation consists of editing data that is
to be entered into the GIS database
• Vector data may require a lot of time-consuming
editing, such as trimming overshoots of lines at
intersections, deleting duplicate lines, closing gaps in
lines, and generating polygons
• Data may need to be vectorized or rasterized to match
existing data sets
• Additionally, processing includes associating attribute
data with objects through either manual input or
reading digital attribute files into the GIS
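
Two of the editing tasks listed above, deleting duplicate lines and closing gaps, can be sketched in a few lines. This is a simplified illustration with made-up coordinates and an assumed snapping tolerance; the function names are hypothetical and real GIS editing tools handle many more cases, such as trimming overshoots at intersections.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float]
Line = List[Coord]


def remove_duplicate_lines(lines: List[Line]) -> List[Line]:
    """Drop lines digitized twice, including copies traced in reverse."""
    seen = set()
    unique: List[Line] = []
    for line in lines:
        forward = tuple(line)
        # Use one canonical direction so reversed copies compare equal.
        key = min(forward, forward[::-1])
        if key not in seen:
            seen.add(key)
            unique.append(line)
    return unique


def close_gap(ring: Line, tolerance: float) -> Line:
    """Snap the last vertex onto the first if the gap is within tolerance."""
    (x0, y0), (xn, yn) = ring[0], ring[-1]
    if (x0, y0) == (xn, yn):
        return list(ring)
    if math.hypot(xn - x0, yn - y0) <= tolerance:
        return list(ring[:-1]) + [ring[0]]
    return list(ring)  # gap too large: leave it for manual editing


# Made-up digitized data.
digitized = [[(0, 0), (1, 1)], [(1, 1), (0, 0)], [(0, 0), (2, 0)]]
cleaned = remove_duplicate_lines(digitized)

open_ring = [(0, 0), (4, 0), (4, 3), (0.05, 0.02)]
closed_ring = close_gap(open_ring, tolerance=0.1)
still_open = close_gap(open_ring, tolerance=0.01)
```

The tolerance has to be chosen with care: too small and real gaps stay open, too large and distinct features are merged.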

Clearinghouse
• A clearinghouse is a distributed set of computer servers
connected with a network of systems that produce, manage
and use spatial data.
• The availability of metadata (data descriptions) allows
users to determine what spatial data exists, where to find
the data they need, evaluate the usefulness of the data for
their applications, and obtain or order the data as
economically as possible.
• Each data provider describes available data and provides
these metadata over the network. Besides metadata, the
data provider also offers access to its geographic data.
