Lecture 4:

MSIT 218

Data Input Methods & Techniques



Data Input
need to have tools to transform spatial data of various types into digital format

data input is the major bottleneck in the construction of a GIS
costs of input often consume 80% or more of project costs
data input is labor intensive, tedious, error-prone
there is a danger that construction of the database may become an end in itself and the project may not move on to analysis of the data collected
essential to find ways to reduce costs, maximize accuracy

need to automate the input process as much as possible, but:
automated input often creates bigger editing problems later
source documents (maps) may often have to be redrafted to meet rigid quality requirements of automated input

data input to a GIS involves encoding both the locational and attribute data

the locational data is encoded as coordinates on a particular Cartesian coordinate system
source maps may have different projections, scales
several stages of data transformation may be needed to bring all data to a common coordinate system
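
One common stage in bringing digitized coordinates to a map coordinate system is an affine transformation fitted from control points. The following is a minimal sketch, assuming exactly three non-collinear control points and no reprojection; the function names and the sample coordinates are hypothetical and only illustrate the idea.

```python
def solve_affine(src, dst):
    """Fit x' = a*x + b*y + c and y' = d*x + e*y + f from three control points."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if det == 0:
        raise ValueError("control points are collinear")

    def fit(v1, v2, v3):
        # Cramer's rule on the 3x3 system [x y 1] . [p q r] = v
        p = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        q = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        r = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return p, q, r

    a, b, c = fit(*(pt[0] for pt in dst))
    d, e, f = fit(*(pt[1] for pt in dst))
    return a, b, c, d, e, f


def apply_affine(params, point):
    """Transform one (x, y) point with the fitted parameters."""
    a, b, c, d, e, f = params
    x, y = point
    return a * x + b * y + c, d * x + e * y + f


# Hypothetical control points: digitizer units -> map coordinates in metres.
tablet = [(0, 0), (10, 0), (0, 10)]
ground = [(500000, 1000000), (501000, 1000000), (500000, 1001000)]

params = solve_affine(tablet, ground)
map_x, map_y = apply_affine(params, (5, 5))
```

In practice more than three control points are usually collected and the parameters are fitted by least squares, so that the residuals indicate how well the source map registers.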

attribute data is often obtained and stored in tables

Modes of Data Input

Keyboard Entry
Used to directly type or enter the tabular (non-spatial) attributes of the data into the GIS database.

Digitizing Operation
Used to produce digital data from analog data, e.g. scanning of paper documents and maps, digitizing of satellite or photographic images.

Spatial Data Integration
Involves collating existing digital data and transforming them into the one format set by the GIS.

Criteria for Choosing Modes of Input

the type of data source: images favor scanning; maps can be scanned or digitized
the database model of the GIS: scanning is easier for raster, digitizing for vector
the density of data
expected applications of the GIS implementation


Rasterization and Vectorization


Rasterization: the process of converting vector data (points, lines, and polygons) into raster data (a grid of cells, each holding a discrete value), i.e. vector-to-raster data conversion. Scanning is the most commonly used method of generating raster data from paper documents or maps.
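
Vector-to-raster conversion of a polygon can be sketched by testing each cell center against the polygon boundary. This is an illustrative sketch only, assuming a grid whose lower-left corner is at (0, 0) and a simple polygon without holes; the function names and the sample polygon are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygon, n_rows, n_cols, cell_size, value=1, background=0):
    """Assign `value` to every cell whose center falls inside the polygon."""
    grid = []
    for row in range(n_rows):
        # Row 0 is the top of the grid, so y decreases as the row index grows.
        y = (n_rows - row - 0.5) * cell_size
        grid.append([
            value if point_in_polygon((col + 0.5) * cell_size, y, polygon)
            else background
            for col in range(n_cols)
        ])
    return grid


# Hypothetical parcel: a rectangle from (1, 1) to (4, 3) on a 4 x 5 grid.
parcel = [(1, 1), (4, 1), (4, 3), (1, 3)]
grid = rasterize(parcel, n_rows=4, n_cols=5, cell_size=1)
for row in grid:
    print(row)
```

The cell size controls how closely the raster follows the original boundary: smaller cells give a closer fit at the cost of a larger grid.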

Factors affecting the Accuracy of Scanning Method

The quality of the scanner.
The quality of the image processing software used.
The quality and complexity of the source document.

Guidelines in Scanning Paper Docs/ Maps

The document must be clean, with no smudges or extra markings.
Lines should be at least 0.1 mm wide.
For topographic maps, contour lines should be continuous and not broken by text; where they are, the scanned image needs to be edited manually.


Vectorization: the process of converting raster images to vector features, i.e. raster-to-vector data conversion. It aims to extract features and objects, such as roads, from scanned images, so that a road is represented as a series of coordinate points joined by a line instead of as a collection of contiguous pixels. May be done by manual digitizing or with computer assistance.
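
The idea of turning contiguous pixels into an ordered series of coordinate points can be sketched as a simple line trace. This is a minimal sketch, assuming the raster holds a single one-pixel-wide line with no branches or loops; real vectorization software also thins, smooths, and handles junctions. The function name and sample grid are hypothetical.

```python
# Offsets to the eight neighbouring cells (row, column).
NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1),
              (-1, -1), (-1, 1), (1, -1), (1, 1)]


def trace_line(grid, cell_size=1.0):
    """Follow a thin raster line and return its cell centers as (x, y) points."""
    n_rows = len(grid)
    cells = {(r, c)
             for r, row in enumerate(grid)
             for c, v in enumerate(row) if v}

    def neighbours(cell):
        r, c = cell
        return [(r + dr, c + dc) for dr, dc in NEIGHBOURS
                if (r + dr, c + dc) in cells]

    # An endpoint of the line has exactly one neighbouring line cell.
    endpoints = sorted(cell for cell in cells if len(neighbours(cell)) == 1)
    if not endpoints:
        return []

    current = endpoints[0]
    visited = {current}
    path = [current]
    while True:
        unvisited = [n for n in neighbours(current) if n not in visited]
        if not unvisited:
            break
        current = unvisited[0]
        visited.add(current)
        path.append(current)

    # Convert (row, col) to map coordinates; row 0 is the top of the grid.
    return [((c + 0.5) * cell_size, (n_rows - r - 0.5) * cell_size)
            for r, c in path]


# Hypothetical scanned road, one pixel wide.
road_pixels = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
]
road_line = trace_line(road_pixels)
```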

Data Editing
Common Errors in Geographic Data
Error: Description
Missing entities: missing points, lines or boundary segments
Duplicate entities: features that have been digitized twice
Mislocated entities: features digitized in the wrong place
Missing labels: unidentified polygons
Duplicate labels: two or more identification labels for a single polygon
Digitizing artifacts: undershoots, overshoots, misplaced nodes, loops and spikes
Noise: irrelevant data entered during digitizing, scanning or data transfer
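
Undershoots are typically corrected by snapping a dangling end to a nearby feature when the gap is within a tolerance. The following is a simplified sketch, assuming lines are lists of (x, y) tuples and that snapping to existing vertices is sufficient (production tools also snap to the nearest point along a segment). The function name and sample roads are hypothetical.

```python
import math


def snap_endpoints(lines, tolerance):
    """Move each line end onto the nearest vertex of another line within tolerance."""
    snapped = [list(line) for line in lines]
    for i, line in enumerate(snapped):
        for end in (0, -1):  # first and last vertex of the line
            px, py = line[end]
            best, best_dist = None, tolerance
            for j, other in enumerate(snapped):
                if j == i:
                    continue
                for vx, vy in other:
                    d = math.hypot(vx - px, vy - py)
                    if d <= best_dist:
                        best, best_dist = (vx, vy), d
            if best is not None:
                line[end] = best
    return snapped


# Hypothetical roads: the first stops 0.2 units short of the second (an undershoot).
roads = [
    [(0.0, 0.0), (9.8, 0.0)],
    [(10.0, -5.0), (10.0, 0.0), (10.0, 5.0)],
]
fixed = snap_endpoints(roads, tolerance=0.5)
```

The tolerance is a trade-off: too small and gaps remain open, too large and features that are genuinely separate get joined.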

Data Editing Functions

Raster Data Editing
Filling holes and gaps
Edge smoothing
Boundary simplification
Deskewing
Speckle removal
Filtering
Clipping/subsetting
Drawing and rasterization

Vector Data Editing
Setting the editing environment and tolerances for editing (Edit, Weed, Grain)
Topology building
Data editing and error correction
Joining adjacent layers

Attribute Data Editing
Correcting attribute errors manually
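
As an illustration of an editing tolerance, a weed tolerance can be read as the minimum distance allowed between successive vertices along a line. The sketch below assumes that reading and always keeps the two end nodes; the function name and sample line are hypothetical, and actual GIS packages may apply the tolerance differently.

```python
import math


def weed(line, weed_tolerance):
    """Drop interior vertices closer than weed_tolerance to the last kept vertex."""
    if len(line) < 3:
        return list(line)
    kept = [line[0]]  # always keep the start node
    for x, y in line[1:-1]:
        last_x, last_y = kept[-1]
        if math.hypot(x - last_x, y - last_y) >= weed_tolerance:
            kept.append((x, y))
    kept.append(line[-1])  # always keep the end node
    return kept


# Hypothetical over-digitized line with vertices bunched together.
digitized = [(0.0, 0.0), (0.2, 0.0), (0.4, 0.1), (1.0, 0.0), (1.1, 0.0), (2.0, 0.0)]
weeded = weed(digitized, weed_tolerance=0.5)
```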


Spatial Data Integration

The final step in building the GIS database.
Types of spatial data integration:
horizontal integration (or tiling) merges all adjacent datasets
vertical integration (map overlay) stacks all the dataset layers
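
The two types of integration can be sketched on small raster layers stored as lists of rows. This is only an illustration, assuming the layers already share the same cell size and registration; the function names, layer names, and cell values are hypothetical.

```python
def tile_horizontally(left, right):
    """Horizontal integration: join two adjacent tiles that have the same number of rows."""
    if len(left) != len(right):
        raise ValueError("tiles must have the same number of rows")
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def overlay(layer_a, layer_b, combine):
    """Vertical integration: combine two co-registered layers cell by cell."""
    same_shape = len(layer_a) == len(layer_b) and all(
        len(a) == len(b) for a, b in zip(layer_a, layer_b)
    )
    if not same_shape:
        raise ValueError("layers must have the same dimensions")
    return [
        [combine(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


# Two adjacent land-use tiles (hypothetical class codes).
west = [[1, 1], [1, 2]]
east = [[2, 2], [2, 3]]
landuse = tile_horizontally(west, east)

# A second layer covering the same area: 1 = suitable slope, 0 = too steep.
slope_ok = [[1, 0, 1, 1], [0, 1, 1, 0]]

# Keep the land-use code only where the slope is suitable.
suitable = overlay(landuse, slope_ok, lambda lu, ok: lu if ok else 0)
```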


Parameters of the geographic data, such as format, projection, scale, and (for rasters) resampling parameters, must be checked and made to conform before the datasets are integrated.
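
A check of this kind can be sketched by comparing each layer's parameters against a reference layer. The metadata keys, layer names, and values below are hypothetical; a real GIS would read these parameters from the datasets themselves.

```python
def find_mismatches(layers, keys=("format", "projection", "scale")):
    """List (layer name, parameter, found, expected) for every parameter
    that differs from the first layer, which is used as the reference."""
    reference = layers[0]
    problems = []
    for layer in layers[1:]:
        for key in keys:
            if layer.get(key) != reference.get(key):
                problems.append(
                    (layer["name"], key, layer.get(key), reference.get(key))
                )
    return problems


# Hypothetical layer metadata.
layers = [
    {"name": "roads", "format": "vector", "projection": "UTM 51N", "scale": 50000},
    {"name": "soils", "format": "vector", "projection": "UTM 51N", "scale": 250000},
    {"name": "rivers", "format": "vector", "projection": "geographic", "scale": 50000},
]

problems = find_mismatches(layers)
for name, key, found, expected in problems:
    print(f"{name}: {key} is {found!r}, expected {expected!r}")
```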