
GIS Data Structures


GIS data represents real-world objects. Real-world objects can be divided into two abstractions: discrete (soil, land use, cities) and continuous (elevation or rainfall). Traditionally, two broad methods are used to store data in a GIS for both abstractions: raster and vector.




- The cell or pixel is the basic spatial unit of raster (grid) data
- Pixels are generally square (square tessellation)
- Pixels are organized into an array of rows and columns called a grid or raster
- Rows and columns are numbered from 0, and the origin for raster data is the upper-left corner
- Pixel locations are referenced by row and column position, so every pixel can be uniquely identified
- Pixels are assigned an integer, floating-point, or NO DATA value
- Each pixel represents some kind of geographic phenomenon
- The number of rows and columns does not have to be the same
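
The row and column addressing described above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the class name `Grid`, the corner coordinates and the 10-unit cell size are assumptions made for the example and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Grid:
    """A raster grid whose origin is its upper-left corner."""
    x_min: float      # map x of the upper-left corner (hypothetical value)
    y_max: float      # map y of the upper-left corner (hypothetical value)
    cell_size: float  # side length of one square pixel, in map units
    values: list      # list of rows; None stands for NO DATA

    def cell_centre(self, row, col):
        """Map coordinates of the centre of pixel (row, col)."""
        x = self.x_min + (col + 0.5) * self.cell_size
        y = self.y_max - (row + 0.5) * self.cell_size  # rows count downwards
        return x, y

    def locate(self, x, y):
        """(row, col) of the pixel that contains map point (x, y)."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        return row, col


# 2 rows x 3 columns: the row and column counts need not match
grid = Grid(x_min=100.0, y_max=200.0, cell_size=10.0,
            values=[[1, 2, 3],
                    [4, None, 6]])
row, col = grid.locate(125.0, 185.0)
value = grid.values[row][col]
```

Because rows are counted from the top, the y coordinate decreases as the row number increases, which is why `cell_centre` subtracts from `y_max`.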

Point representation

Line / Arc representation

Polygon / Area representation

Raster dataset attribute table

Raster Data Types

Continuous Raster

Thematic Raster


Advantages
- Simple data structure
- Resolution is set by cell size
- Easily modified
- Display/output good for images
- Fast and very efficient for overlay operations
- Raster data is mainly obtained from satellite images and scanning
- Raster is used when data change continuously across a region (high spatial variability is efficiently represented)

Disadvantages
- Not all phenomena map directly onto a raster representation
- Requires large storage
- Errors in perimeter and shape
- Displays jagged edges at large scale
- Topology is difficult to implement
- Network analysis is difficult
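
The claim that raster overlay is efficient follows from the fact that two grids with the same dimensions can be combined cell by cell, with no geometric intersection to compute. The sketch below assumes two small hypothetical grids (`land_use`, `elevation`) and a helper named `overlay`; none of these come from the slides.

```python
NODATA = None  # stand-in for the NO DATA value


def overlay(grid_a, grid_b, combine):
    """Cell-by-cell overlay of two grids with identical dimensions."""
    same_shape = len(grid_a) == len(grid_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(grid_a, grid_b)
    )
    if not same_shape:
        raise ValueError("grids must have the same number of rows and columns")
    result = []
    for row_a, row_b in zip(grid_a, grid_b):
        result.append([
            NODATA if a is NODATA or b is NODATA else combine(a, b)
            for a, b in zip(row_a, row_b)
        ])
    return result


# Hypothetical thematic (land use class) and continuous (elevation) rasters
land_use = [[1, 1, 2],
            [2, 2, NODATA]]
elevation = [[120.0, 340.0, 90.0],
             [510.0, 80.0, 75.0]]

# Flag cells of land use class 2 that lie below 100 m
low_class_2 = overlay(land_use, elevation,
                      lambda lu, z: int(lu == 2 and z < 100.0))
```

A cell that is NO DATA in either input stays NO DATA in the output, which is one common convention rather than the only possible one.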

Vector Model
Primitive Features


Lines / Arcs




- Derived from spatial concepts that emphasize real-world objects
- The geometry primitives of the vector data model are point, line and polygon; objects can be built from these primitives
- An object's location is determined by the coordinates of the points that represent it
- The uniqueness of the vector data model lies in how it manages and stores geometry primitives
- The origin for vector data is the lower-left corner


Spaghetti model
Topology model

The Spaghetti Model

- The spaghetti model is the simplest vector data model
- The model is a direct representation of a graphical image
- No explicit topological information is stored
- The geometries may be points, lines or polygons
- There are no constraints on how geometries may be positioned (for example, two lines may intersect without a point being added at the intersection, and two polygons may intersect without restriction)


Advantages
- Simple
- Easy to edit
- Efficient for display and plotting

Disadvantages
- Redundant storage of data
- Major deficiencies in dealing with neighbourhood and inclusion, though connectivity is possible
- Computationally expensive to build topological or network relationships among features
- Cannot be used to represent surfaces effectively
- Inefficient for most types of spatial analysis
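
The redundancy and the cost of deriving neighbourhood relationships can be seen in a small sketch. Two adjacent squares are stored spaghetti-style as independent coordinate rings, so their common boundary is stored twice and adjacency has to be discovered by comparing coordinates. The parcels and the helper names are hypothetical.

```python
# Each polygon is an independent closed ring; nothing links the two
parcel_a = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
parcel_b = [(2, 0), (4, 0), (4, 2), (2, 2), (2, 0)]


def segments(ring):
    """Undirected line segments of a closed ring."""
    return {frozenset(pair) for pair in zip(ring, ring[1:])}


def shared_segments(ring_1, ring_2):
    """Segments stored in both rings, found by comparing coordinates."""
    return segments(ring_1) & segments(ring_2)


shared = shared_segments(parcel_a, parcel_b)
```

This comparison only works when both rings digitize the shared boundary with identical vertices. Real spaghetti data rarely guarantees that, which is part of why spatial analysis on this model is inefficient.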

Topological Model
- Connections and relationships between objects are independent of their coordinates
- Overcomes the major weakness of the spaghetti model, allowing GIS analysis (overlay, network, contiguity, connectivity)
- Requires all lines to be connected, polygons closed, and loose ends removed
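
One of these requirements, the removal of loose ends, can be checked with a simple count: an endpoint touched by only one line end is a dangle. The sketch below is a simplified check under the assumption that lines meet only at exactly matching endpoint coordinates; the function name and the sample lines are hypothetical.

```python
from collections import Counter


def dangling_endpoints(lines):
    """Endpoints that are touched by only one line end (loose ends)."""
    counts = Counter()
    for line in lines:
        counts[line[0]] += 1    # start point of the line
        counts[line[-1]] += 1   # end point of the line
    return sorted(point for point, n in counts.items() if n == 1)


lines = [
    [(0, 0), (4, 0)],
    [(4, 0), (4, 3)],
    [(4, 3), (0, 0)],
    [(4, 3), (6, 5)],   # overshoot that ends in a loose end
]
dangles = dangling_endpoints(lines)
```

A production topology builder would also snap nearly coincident endpoints within a tolerance and split lines at intersections; those steps are left out here.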

Arc-Node Topology

Figure: an arc running from its From Node to its To Node, with the Left Polygon and Right Polygon on either side.

Vector Topologic Data Model

Figure: polygons A, B and C bounded by arcs a1 to a4, which meet at nodes n1 and n2.

Arc Coordinate Data
Arc   Start XY   Intermediate XY               End XY
a1    4,5        (4,8), (8,8), (8,1), (4,1)    4,3
a2    4,5        (6,7), (6,3)                  4,3
a3    4,5        (1,3)                         4,3
a4    4,3                                      4,5

Arc Topology
Arc   Start   End   Left   Right
a1    n1      n2
a2    n1      n2
a3    n1      n2

Node Topology
Node  Arcs
n1    a4, a2, a1, a3
n2    a2, a4, a3, a1

Polygon Topology
ID    Arcs
A     a1, a2
B     a2, a4
C     a3, a4
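
As a rough illustration of what the stored topology buys, the sketch below derives polygon adjacency from the polygon topology table alone (A: a1, a2; B: a2, a4; C: a3, a4), without touching any coordinates. The dictionary layout and the function name are assumptions made for the example.

```python
from itertools import combinations

# Polygon topology as given in the table: polygon ID -> bounding arcs
polygon_arcs = {
    "A": {"a1", "a2"},
    "B": {"a2", "a4"},
    "C": {"a3", "a4"},
}


def adjacent_polygons(polygon_arcs):
    """Pairs of polygons that share at least one arc, with the shared arcs."""
    result = {}
    for p, q in combinations(sorted(polygon_arcs), 2):
        common = polygon_arcs[p] & polygon_arcs[q]
        if common:
            result[(p, q)] = sorted(common)
    return result


adjacency = adjacent_polygons(polygon_arcs)
```

Polygons A and B are adjacent across arc a2, and B and C across arc a4, while A and C share no arc. The same question asked of spaghetti data would require comparing coordinate lists.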




Planar Enforcement: No two individual features can overlap. There are no holes or islands that are not themselves features. Every feature is represented as a record in the attribute table.

Topological Model
(The Intelligent mode of representation)

Where is it? (location)
What is it next to? (adjacency)
Is it inside or outside? (containment)
How far is it? (connectivity)

Topology represents the structuring of coordinate data which clearly describes adjacency, containment, and connectivity.

Advantages
- More efficient data storage (compact data structure)
- Topological encoding is more efficient
- Suitable for most uses and compatible with data
- Good graphic presentation
- Efficient projection transformation
- Efficient for network analysis
- Accurate map output

Disadvantages
- Overlay operations are not efficient
- Complex data structure
- High spatial variability is inefficiently represented

3-D Data Representation

Triangulated Irregular Network (TIN)

TIN is a vector data structure that partitions geographic space into contiguous, non-overlapping triangles. The vertices of each triangle are sample data points with x, y and z values. These points are connected by lines to form Delaunay triangles.
- TIN is a vector topological data model for representing surfaces
- TIN represents a surface as a set of interconnected triangular facets derived from sample points
- Associated data tables:
  - Node table: lists each triangle and its defining nodes
  - Edge table: lists the 3 adjacent triangles for each facet
  - XY coordinate table: stores node coordinates


Node Face




Advantages
- Slope and aspect are calculated for each triangle and stored as attributes of the facet
- For areas of complex relief, TIN works better than raster
- More detailed representation where the density of data points is higher

Disadvantages
- Significantly more processing is required to generate the TIN file in the first place (although the representation is then more efficient)
- Errors along edges often need correction
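
Slope and aspect of a single facet can be derived from its three nodes, since three (x, y, z) points define a plane. The sketch below uses the normal vector of that plane. It assumes that +y is north and that aspect is reported as a compass bearing of the downslope direction; the function name is hypothetical, and this is one common formulation rather than the method of any particular GIS package.

```python
import math


def facet_slope_aspect(p1, p2, p3):
    """Slope and aspect (both in degrees) of the plane through three nodes.

    Aspect is the compass bearing of the downslope direction, measured
    clockwise from north (+y); it is None for a horizontal facet.
    """
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # Normal vector of the facet: cross product of two edge vectors
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    if nx == 0 and ny == 0 and nz == 0:
        raise ValueError("the three nodes are collinear")
    if nz < 0:  # flip the normal so that it points upwards
        nx, ny, nz = -nx, -ny, -nz
    horizontal = math.hypot(nx, ny)
    slope = math.degrees(math.atan2(horizontal, nz))
    if horizontal == 0:
        return slope, None
    # The horizontal part of the upward normal points downslope
    aspect = math.degrees(math.atan2(nx, ny)) % 360.0
    return slope, aspect


# A facet that rises towards the north (z = y), so it faces south
slope, aspect = facet_slope_aspect((0, 0, 0), (1, 0, 0), (0, 1, 1))
```

For the sample facet the surface climbs one unit per unit of northing, which gives a 45 degree slope facing due south.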

What is a Database?

A system whose overall purpose is to record and maintain data. The data concerned can be anything that is deemed to be of significance to the organization.


(Describes conceptual structuring of data stored in database)

Hierarchical Data Model

Network Data Model

Relational Data Model

Object Data Model
(Object oriented and Object relational models)

Hierarchical Data Model (parts superior to suppliers)


- Now obsolete, a hierarchical DBMS assumes hierarchical relationships between data, i.e. a tree structure
- The root may have any number of dependents, each of which may have any number of lower-level dependents, and so on, to any number of levels (examples are IBM's IMS and Informatics' Mark IV)
- Many-to-many relationships cannot be modelled directly with this structure
- A true model for representing hierarchical structures from the real world
- Asymmetry is a major drawback
- Update operations are difficult
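
The asymmetry and the update problem can be illustrated with a small tree in which parts are superior to suppliers. A supplier that supplies two parts has to be stored under each of them. The records below are hypothetical, and the nested dictionary is only a stand-in for a hierarchical DBMS.

```python
# Parts are the parents; each part owns its own copies of supplier records
hierarchy = {
    "P1": {"name": "Nut",
           "suppliers": [{"id": "S1", "city": "London"},
                         {"id": "S2", "city": "Paris"}]},
    "P2": {"name": "Bolt",
           "suppliers": [{"id": "S1", "city": "London"}]},
}


def suppliers_of(tree, part_id):
    """Easy direction: follow the tree downwards from a part."""
    return sorted(s["id"] for s in tree[part_id]["suppliers"])


def parts_supplied_by(tree, supplier_id):
    """Hard direction: every part must be scanned to answer this."""
    return sorted(part_id for part_id, part in tree.items()
                  if any(s["id"] == supplier_id for s in part["suppliers"]))


def move_supplier(tree, supplier_id, new_city):
    """Update every stored copy of a supplier; return the number touched."""
    touched = 0
    for part in tree.values():
        for supplier in part["suppliers"]:
            if supplier["id"] == supplier_id:
                supplier["city"] = new_city
                touched += 1
    return touched
```

Finding the suppliers of a part is a single lookup, while finding the parts of a supplier means walking the whole tree, and one change of city has to be applied to every duplicated record.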

Network Data Model


- Network DBMSs allowed complex data structures to be built but were inflexible and required careful design
- The network model represents many-to-many relationships more directly than the hierarchical approach does
- The network structure is more symmetric than the hierarchical structure
- Very efficient in storage and fast
- The best-known examples are airline booking systems
- A precursor to, and largely superseded by, relational DBMSs
- Technically obsolete (although many are still in commercial use)
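
A rough sketch of the symmetry: in a network-style layout each supplier and each part is stored once, and connector records link them, so a many-to-many relationship can be followed equally well from either side. The records and names are hypothetical, and plain Python collections stand in for the linked record sets of a real network DBMS.

```python
# Each record is stored exactly once
suppliers = {"S1": {"city": "London"}, "S2": {"city": "Paris"}}
parts = {"P1": {"name": "Nut"}, "P2": {"name": "Bolt"}}

# Connector records, one per supplier-part pairing
shipments = [("S1", "P1"), ("S1", "P2"), ("S2", "P1")]


def parts_of(supplier_id):
    """Follow the links from a supplier to its parts."""
    return sorted(p for s, p in shipments if s == supplier_id)


def suppliers_of_part(part_id):
    """Follow the same links in the opposite direction."""
    return sorted(s for s, p in shipments if p == part_id)
```

Both queries have the same shape, and updating a supplier's city touches a single record.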

Relational Data Model


- The relational model of data is based on the realization that files that obey certain constraints may be considered as mathematical relations
- In much of the relational literature, tables are referred to as relations
- Rows of such tables are referred to as tuples, more generally known as records
- Columns are referred to as attributes
- The most popular type of DBMS in use; very simple and easy to understand
- Relational DBMSs have to employ many tables to conform fully to the various normalization rules
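
A minimal sketch of these terms using Python's built-in `sqlite3` module: each table is a relation, each row a tuple (record) and each column an attribute. The `parcel` and `land_use_class` tables and their contents are hypothetical; splitting the land use description into its own table is the kind of extra table that normalization produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE land_use_class (
        code        INTEGER PRIMARY KEY,
        description TEXT NOT NULL
    );
    CREATE TABLE parcel (
        parcel_id TEXT PRIMARY KEY,
        area_ha   REAL NOT NULL,
        code      INTEGER NOT NULL REFERENCES land_use_class(code)
    );
    INSERT INTO land_use_class VALUES (1, 'Forest'), (2, 'Cropland');
    INSERT INTO parcel VALUES ('A', 12.5, 1), ('B', 3.0, 2), ('C', 7.25, 2);
""")

# A join recombines the two relations through the shared 'code' attribute
rows = conn.execute("""
    SELECT p.parcel_id, c.description
    FROM parcel AS p
    JOIN land_use_class AS c ON c.code = p.code
    ORDER BY p.parcel_id
""").fetchall()
```

Each element of `rows` is a Python tuple, which matches the relational sense of the word.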

Object Data Model

Object orientation for a database means the capability of storing and retrieving objects in addition to mere data.

Objects are complex and not well handled by standard Relational DBMS.
Most systems can handle images, video and other objects but do so in a non-standard way in many cases. The first system to announce the use of an Object Oriented DBMS is Taos from Data Research Associates.

An Object-Relational database (ORDBMS) adds features associated with object oriented systems to a RDBMS.

- It enables you to make the features in GIS datasets smarter by endowing them with natural behaviors and relationships among features
- It brings a physical model closer to its logical model
- It lets you implement the majority of custom behaviors without writing any code (e.g., overpasses and underpasses)

Basic OO Characteristics

Polymorphism
The behaviors (or methods) of an object class can adapt to variations of objects.

Encapsulation
An object is accessed only through a well-defined set of software methods, organized into software interfaces.

Inheritance
An object class can be defined to include the behavior of another object class and have additional behaviors.
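
The three characteristics can be seen together in a short Python sketch. The class names, the `measure` method and the default lane width are all hypothetical; they illustrate the ideas and are not the object model of any GIS product.

```python
import math


class Feature:
    """Base class: coordinates are reached only through methods."""

    def __init__(self, coords):
        self._coords = list(coords)   # internal state (encapsulation)

    def vertex_count(self):
        return len(self._coords)

    def measure(self):
        raise NotImplementedError


class LineFeature(Feature):
    def measure(self):
        """Length of the polyline."""
        return sum(math.dist(p, q)
                   for p, q in zip(self._coords, self._coords[1:]))


class PolygonFeature(Feature):
    def measure(self):
        """Area of a closed ring (shoelace formula)."""
        ring = self._coords
        twice_area = sum(x1 * y2 - x2 * y1
                         for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
        return abs(twice_area) / 2


class Road(LineFeature):
    """Inherits the behavior of LineFeature and adds its own."""

    def __init__(self, coords, lanes):
        super().__init__(coords)
        self.lanes = lanes

    def paved_area(self, lane_width=3.5):
        return self.measure() * self.lanes * lane_width


features = [
    LineFeature([(0, 0), (3, 4)]),
    PolygonFeature([(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]),
    Road([(0, 0), (3, 4)], lanes=2),
]
# Polymorphism: the same call adapts to each kind of feature
measures = [f.measure() for f in features]
```

The caller never inspects the coordinates directly and never needs to know which kind of feature it holds.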

GIS Standards & Interoperability

Standards are documented agreements containing technical specifications or other precise criteria to be used consistently as rules, guidelines, or definitions of characteristics, to ensure that materials, products, procedures, and services are fit for their purpose (as defined by ISO).

Standards facilitate data sharing and increase interoperability among geographic information systems.

Interoperability enables sharing and exchange of information and processes in heterogeneous, autonomous, and distributed computing environments. However, interoperability presents a much greater challenge in GIS than in other fields of information science because of the greater complexity of geographic information.

GIS Standards include

- Spatial data standards
- Metadata standards
- Database standards
- User interface standards
- Networking standards
- Database query standards
- Display and plotting standards
- Data exchange standards

ISO/TC 211

Digital Geographic Information Working Group


Organizations involved in developing geospatial standards and interoperability, both national and international, are:
- OGC: Open Geospatial Consortium
- ISO: International Organization for Standardization
- ANSI: American National Standards Institute
- W3C: World Wide Web Consortium
- WS-I: Web Services Interoperability Organization
- IHO: International Hydrographic Organization
- LIF: Location Interoperability Forum
- GSDI: Global Spatial Data Infrastructure
- CEN: European Committee for Standardization
- DGIWG: Digital Geographic Information Working Group

National Spatial Data Infrastructure (NSDI)
NNRMS standards

National Spatial Data Infrastructure (NSDI)

A clearing house for information on spatial data (metadata) generated by various national and state agencies.

NSDI Stakeholders


For interoperability in GIS, the Open Geospatial Consortium (OGC) is the key.

The Open Geospatial Consortium (OGC), an international voluntary consensus standards organization, originated in 1994. In the OGC, more than 400 commercial, governmental, nonprofit and research organizations worldwide collaborate in a consensus process that encourages the development and implementation of open standards for geospatial content and services, GIS data processing and data sharing. The OGC standards baseline comprises more than 30 standards.

ISO/TC 211 is a standard technical committee formed within ISO, tasked with covering the areas of digital geographic information and geomatics. It is responsible for preparation of a series of International Standards and Technical Specifications numbered in the range starting at 19101.

ISO/TC 211

The Infrastructure for Geospatial Standardization

Data Models for Geographic Information

Geographic Information Management Geographic Information Services

Encoding of Geographic Information

Specific Thematic Areas
