
GIS DATA MODELLING AND MANAGEMENT
Prof. Ganesh D Bhutkar
Subject Teacher

Student Group: -
Sohan Pachhade BE IT 2008-09 J-29
Vivek Bamne BE IT 2008-09 J-06
Sanyog Salve BE IT 2008-09 J-34

Reference:
Chapters 8 & 9: Spatial Data Modeling and GIS Data Management
M. Anji Reddy, Remote Sensing and GIS, B S Publications,
Second Edition, 2006.

GIS Data Modelling and Management 1


SPATIAL DATA MODELLING
• Spatial data modelling is the process of turning data
about spatial entities into graphical representations in a
precise, well-defined way.
• The two main approaches by which a computer can handle
and display spatial entities are:
1. Raster Approach
2. Vector Approach

• A map contains spatial elements such as monuments, roads,
rivers and parks.
• Spatial modelling is very useful in understanding
geographical problems.



STAGES OF GIS
DATA MODELLING
• Identifying the spatial features from the real world
that are of interest in the context of the application.
• Representing the conceptual data model by an
appropriate spatial data model. This involves
choosing one of the two approaches: raster
or vector.
• Selecting an appropriate spatial data structure to
store the model within the computer. The spatial data
structure is the physical way in which entities are
coded for the purpose of storage and manipulation.



GRAPHIC REPRESENTATION
OF SPATIAL DATA
• An entity is an element of the real world.
• Geographical entities can be represented by three main
entity types: points, lines and areas.
• There are two additional spatial entities:
1. Surface: It is used to represent continuous features or
phenomena. For these features, there is a value at
every location, e.g. temperature, population density.
2. Network: It is a series of interconnecting lines along
which there is a flow of data, objects or materials, e.g.
a road network along which traffic flows to and
from the areas.



CHALLENGES IN
DEFINITION OF ENTITIES
1. How to select the proper entity type to provide an
appropriate representation?

2. How to represent changes over time?

e.g. Vegetation in a forest may be a continuous feature,
represented by a surface, for ecologists, whereas
government officials may represent it as a series of
discrete area entities.



RASTER DATA REPRESENTATION
• The terrain is divided into a number of parcels or units called
grid cells. Each grid cell is the same size and hence
occupies the same amount of geographical space.
• Because a location is known only to the nearest grid cell, the
raster model does not provide precise locational information.
• The simplest way of including attribute data for each entity is
to assign each cell a number representing the attribute, such
as a class of land cover, e.g. 0 for water and 1 for land.
• The resolution is given by m * n, i.e. columns * rows.

Problems with raster representation:
1. Lack of absolute locational information.
2. Reduced spatial accuracy and reliability of distance.
3. Need for large storage capacity.
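The raster idea above can be sketched in a few lines of Python. This is a minimal illustration, not a GIS implementation: the 4 x 5 grid, the 10 m cell size and the helper names are all assumptions made for the example; only the codes 0 (water) and 1 (land) come from the slide.

```python
# Hypothetical land-cover grid: 0 = water, 1 = land (codes from the text).
WATER, LAND = 0, 1

grid = [
    [0, 0, 1, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
]

CELL_SIZE = 10.0  # assumed ground length of one cell edge, in metres


def dimensions(g):
    """Return (columns, rows), i.e. m * n in the text's notation."""
    return len(g[0]), len(g)


def cell_of(x, y, cell_size=CELL_SIZE):
    """Map a ground coordinate to the (column, row) of the cell containing it."""
    return int(x // cell_size), int(y // cell_size)


def land_area(g, cell_size=CELL_SIZE):
    """Total land area; every cell contributes the same fixed area."""
    return sum(row.count(LAND) for row in g) * cell_size ** 2
```

Two different ground positions such as (12.0, 3.0) and (19.9, 9.9) fall into the same cell, which is the loss of precise location listed among the problems above.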
VECTOR DATA REPRESENTATION
• Vector representation allows spatial locations to be
given explicitly.
• All entities are represented using points (the basic building
blocks), each having x and y co-ordinates.
• Line and area entities are constructed by connecting a
series of points into chains and polygons.
• Attributes are linked to the entities through software.

Problems with vector representation:
1. Selecting an appropriate number of points to construct an
entity.
2. Representation of networks and surfaces is complex.
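A minimal Python sketch of the vector idea follows. The coordinates, the entity ids and the attribute table are made up for illustration; linking attributes by a shared id is one common way to realise the "software linkage" mentioned above, assumed here rather than taken from the slide.

```python
import math

# Points are the basic building blocks: (x, y) coordinate pairs.
well = (2.0, 3.0)                                         # point entity
road = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]               # chain: ordered points
park = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]   # polygon: closed ring

# Assumed linkage: geometry and attributes share an entity id.
geometries = {1: well, 2: road, 3: park}
attributes = {1: {"type": "well"}, 2: {"type": "road"}, 3: {"type": "park"}}


def chain_length(points):
    """Length of a line entity: sum of the distances between successive points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def polygon_area(ring):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```

Adding more points to `road` would follow a curved road more closely at the cost of more storage, which is the first problem listed above.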



TYPES OF
RASTER GIS MODELS

1. GRID Model

2. IMGRID Model

3. MAP (Map Analysis Package) Model



COMPACT RASTER DATA
MODELS
• Compaction reduces the stored data to the minimum
needed to carry the information content.
• Compact data is needed for efficient storage and
faster retrieval.
• Based on the nature of the GIS data and the facilities
available, the compaction methods are grouped as:
1. Run-Length Coding
2. Raster Chain Codes
3. Block Codes
4. Quad trees



COMPACT RASTER DATA
MODELS (Contd..)
RUN LENGTH CODES
• Each grid cell has a numerical value corresponding to a category of
data.
• If there are 500 * 500 grid cells, then 250,000 numbers have to be
entered.
• Each row tends to contain long strings of the same number. Such a
string is called a run. Every run has a length, which is used to
store it compactly as a pair (R, N).
• Its disadvantage is that it works on a row-by-row basis, so it is tedious.
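Run-length coding of a single row can be sketched as below. The pair is read here as (cell value, run length), which is an assumption about the slide's (R, N) notation, and the function names are illustrative.

```python
def rle_encode_row(row):
    """Collapse a row of cell values into (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # same value: extend the current run
        else:
            runs.append([value, 1])   # new value: start a new run
    return [tuple(run) for run in runs]


def rle_decode_row(runs):
    """Expand (value, run_length) pairs back into the original row."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row
```

For a row such as `[0, 0, 0, 1, 1, 0]` the encoder returns `[(0, 3), (1, 2), (0, 1)]`. A whole grid is handled by encoding each row in turn, which reflects the row-by-row nature noted above.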
RASTER CHAIN CODES
• This method of data reduction works by defining the boundary of the
entity.
• Directions are represented by numbers to avoid mistakes (0
is North, 1 is East, 2 is South, 3 is West).
• Data is stored as (X, Y, N, D), where (X, Y) is the start
point, N is the number of cells and D is the direction.
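A chain code can be decoded back into boundary cells as in this sketch. The direction numbers are those given above; treating X as the column and Y as the row, with North reducing the row index, is an assumption, as is storing the boundary as one start cell plus a list of (N, D) moves.

```python
# Direction codes from the text: 0 = North, 1 = East, 2 = South, 3 = West.
# Assumed grid convention: x is the column, y is the row, rows increase southward.
STEPS = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def trace(x, y, moves):
    """Follow (n_cells, direction) moves from the start cell (x, y).

    Returns the list of cells visited, including the start cell.
    """
    path = [(x, y)]
    for n_cells, direction in moves:
        dx, dy = STEPS[direction]
        for _ in range(n_cells):
            x, y = x + dx, y + dy
            path.append((x, y))
    return path


# Hypothetical square boundary: 2 cells East, 2 South, 2 West, 2 North.
boundary = trace(1, 1, [(2, 1), (2, 2), (2, 3), (2, 0)])
```

The trace ends on the cell where it started, so the four moves describe a closed boundary with far fewer numbers than listing every interior cell.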
COMPACT RASTER DATA
MODELS (Contd..)
BLOCK CODES
• A modified run-length code: it selects a square group of cells,
assigns a starting point (the centre or a corner), picks the grid cell
value and tells the computer how wide the square is in number of
cells.
• An effective method of reducing the storage space for most thematically
layered digital data in GIS.
QUADTREES
• It is a more difficult approach that also works on square groups of cells.
• The map is divided into 4 quadrants: NW, NE, SW and SE.
• Each quadrant is successively divided in the same way until it is a
uniform square group of grid cells with the same attribute value.
• This method is only possible with the raster data model and is quite
innovative because it uses recursion, dividing the image into quads
or quarters down to the smallest unit cell if necessary.
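The recursive subdivision can be sketched as follows. This is a minimal sketch assuming a square grid whose side is a power of two; the nested-dictionary tree and the sample 4 x 4 grid are illustrative choices, not a storage format from the text.

```python
def build_quadtree(grid, x=0, y=0, size=None):
    """Recursively split a square block until every block is uniform.

    A uniform block becomes a leaf holding its cell value; otherwise the
    block is split into the four quadrants NW, NE, SW and SE.
    """
    if size is None:
        size = len(grid)
    first = grid[y][x]
    uniform = all(
        grid[r][c] == first
        for r in range(y, y + size)
        for c in range(x, x + size)
    )
    if uniform:
        return first
    half = size // 2
    return {
        "NW": build_quadtree(grid, x, y, half),
        "NE": build_quadtree(grid, x + half, y, half),
        "SW": build_quadtree(grid, x, y + half, half),
        "SE": build_quadtree(grid, x + half, y + half, half),
    }


def count_leaves(node):
    """Number of stored blocks in the tree."""
    if not isinstance(node, dict):
        return 1
    return sum(count_leaves(child) for child in node.values())


grid = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tree = build_quadtree(grid)
```

Here three quadrants are uniform and only the SE quadrant has to be split again, so 7 blocks are stored instead of 16 cells.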
TYPES OF VECTOR GIS
MODELS & COMPACT MODELS
1. Spaghetti model
2. Topological Models (GBF / DIME, TIGER &
POLYVRT)
3. Shapefile
Compact Models:
1. Galton’s Model
2. Freeman-Huffman Chain Codes



COMPARISON OF DATA MODELS
Parameter                       RASTER                              VECTOR
1. Data Structure               Simple                              Complex
2. Data Structure Compactness   Lesser                              More
3. Overlay Operations           Easily & efficiently implemented    More difficult to implement
4. High Spatial Variability     Efficiently represented             Inefficient
5. Topological Relationships    More difficult to represent         Efficient encoding of topology
6. Graphical Output             Less aesthetically pleasing         Better suited
7. Base                         Location-based                      Object-based



DATABASE MANAGEMENT
SYSTEM (DBMS)
A DBMS is software that controls the storage,
retrieval and modification of data in a database.

It is designed for:
• File handling and management
• Record maintenance
• Extraction of information from data (queries)
• Maintenance of data security and integrity
• Application building (reports)



DBMS APPLICATIONS

• Travel agency system
• Banking system
• Library management system
• Railway reservation system
• Student admission system
• Financial accounting system, etc.



FUNCTIONS OF DBMS
• Security: Protection of data against accidental or
intentional disclosure to unauthorized persons, and
against unauthorized access, modification or destruction
of the database.
• Integrity: The ability to protect data from system
problems through a variety of assurance measures such
as range checking, backup and recovery.
• Synchronization: Protection against inconsistencies
that can result from multiple simultaneous users.
FUNCTIONS OF DBMS (Contd..)
• Physical data independence: The underlying data
storage and manipulation hardware should not matter
to the user.
• Minimization of redundancy: Redundancy is generally
not advisable in a database, because storing and
manipulating the resulting dependencies makes the data
harder to work with. Normalization is used to reduce it.
• Efficiency: The efficiency of data retrieval operations
depends mainly on the volume of data, the method of data
encoding, the design of the database structures and the
complexity of the query.
COMPONENTS OF DBMS
1. Data definition
2. Storage definition
3. Database administration
4. Data manipulation

• In data retrieval, a mapping must be made between the
high-level objects in a query language statement and
the physical location of the data on the storage device.
• A query compiler or optimizer is used to optimize the
code so that retrieval performance is improved.
GIS DATA FILE MANAGEMENT
The following basic file structures are used in GIS:
Simple List:
Records are placed in the order in which they are
entered. The main advantage is that adding a record
only requires appending it. The disadvantage is the lack
of structure, which makes searching very inefficient.
Ordered Sequential Files:
Records are kept in order, for example alphabetically. Data is
arranged in recognizable sequences against which individual
records can be compared. The normal search strategy is a
divide-and-conquer approach, which reduces the search time
needed to get data.
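The divide-and-conquer search on an ordered file can be sketched as a binary search. The place names and payload strings below are made-up sample records, and the function also returns the number of comparisons made so the saving over a record-by-record scan is visible.

```python
def binary_search(sorted_records, key):
    """Search records ordered by key; return (position, comparisons made).

    Position is -1 when the key is absent. Each step halves the range
    that still has to be searched.
    """
    lo, hi = 0, len(sorted_records) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        mid_key = sorted_records[mid][0]
        if mid_key == key:
            return mid, steps
        if mid_key < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps


# Hypothetical records, ordered alphabetically by key.
records = [
    ("Agra", "rec-1"), ("Delhi", "rec-2"), ("Mumbai", "rec-3"),
    ("Nagpur", "rec-4"), ("Pune", "rec-5"), ("Surat", "rec-6"),
    ("Thane", "rec-7"),
]
```

`binary_search(records, "Pune")` finds the record at position 4 after 3 comparisons, whereas scanning a simple list from the start would need 5.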



GIS DATA FILE MANAGEMENT
(Contd..)
Indexed Files:
These are superior to the other methods.
They are based on an index or code, which uses a
pointer to locate a record. This type of search has three
requirements: first, the search criterion must be known
beforehand; second, the index must be recalculated from the
original data; third, sequential search methods are still
needed to obtain the information.
Relative File:
These are like indexed files, but the index used is the
record number.
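A rough sketch of both ideas is given below. The records are sample data, a Python dictionary stands in for the index, and the relative-file helper assumes fixed-length records; none of these details come from the slide.

```python
# Hypothetical records stored in entry order (a simple list).
records = [("Pune", "rec-A"), ("Agra", "rec-B"), ("Surat", "rec-C")]


def build_index(recs):
    """Indexed file: map each key to the position (pointer) of its record."""
    return {key: position for position, (key, _) in enumerate(recs)}


def lookup(index, recs, key):
    """Follow the pointer stored in the index; None if the key is absent."""
    position = index.get(key)
    return None if position is None else recs[position]


def relative_offset(record_number, record_size):
    """Relative file: byte offset of a record, assuming fixed-length records."""
    return record_number * record_size


index = build_index(records)
```

If records are added or reordered, `build_index` has to be run again, which corresponds to the second requirement listed above.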



BUILDING GIS MODELS
The four options for building a GIS real-world model are:
LCGU (Least Common Geographical Unit) based GIS:
It integrates all pertinent spatial data records into a
single set of all classes.
Layer based GIS: Each layer reflects a different set of
attributes. It is a series of thematic layers. GIS data is
broken down into logical terrain units related to layers.
Feature based GIS: It is a new approach where GIS
features are stored as spatial or non-spatial data.
Object oriented GIS: Features are not divided into layers,
but grouped into classes and hierarchies of objects. The
advantage is reusability, but implementation is
complex.
DATABASE MODELS
• An implementation issue is the integration of GIS with
existing internal databases. Most databases are
relational.
• Other models by which a real-world database model is
built are the hierarchical and network database models.
• Almost all existing and most widely used GIS software,
such as ARC/INFO, is based on an RDBMS.
• An RDBMS (Relational DBMS) is easy to learn and well
suited to ad hoc queries, using a relational query
language such as SQL.
• The three most popular data modelling approaches are
record-based, object-based and object-relational, based
on the ER diagram.
HIERARCHICAL
DATABASE MODEL
• When the data has a parent-child (one-to-many)
relationship, the model is called hierarchical.
• This model has advantages such as:
- easy to understand,
- easy to update or expand,
- good for quick data retrieval.
• This model has disadvantages such as:
- large index files have to be maintained,
- certain attribute values are repeated, so redundancy
increases, more storage space is occupied and
data access becomes slow.
NETWORK
DATABASE MODEL
• When the data has many-to-many relationships, the model
is called a network model.
• This model has advantages such as:
- more flexibility,
- avoids redundancy.
• This model has disadvantages such as:
- overhead of pointers,
- complex system,
- a larger number of pointers, so more storage space.



RELATIONAL DATABASE MODEL
• It is a collection of tabular relations, each with a set of
attributes.
• Data is stored as a set of rows called tuples, each consisting
of a value for each attribute.
• There are two schemas upon which the entire database
depends: the relation schema and the database schema.
• Relation schema: It is usually declared when the database is
set up and does not change much during the life span of the
system.
• Database schema: It is a set of relation schemas, together
with some constraints on the relational database.
• Operations of relational algebra include Union,
Difference, Intersection and Join.
RELATIONAL DATABASE MODEL
(Contd..)
• Relational algebra provides a specific set of rules for the
design and function of these systems.
• A relational join is a linking mechanism that matches or relates
data in one table to data in another.
• One or more columns can be used to identify rows and define the
search criterion; this criterion is called the primary key.
• When a primary key in one table is related to a column in a
second table, the column in the second table to which the
primary key is linked is called a foreign key.
• Relational joins often create redundancy. A set of rules
called Normal Forms has been established to reduce it.
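The primary key, foreign key and join ideas can be sketched with Python's built-in sqlite3 module. The `parcel` and `owner` tables and their contents are made up for illustration; SQLite is used only because it ships with Python, not because the text refers to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcel (
        parcel_id INTEGER PRIMARY KEY,                   -- primary key
        land_use  TEXT
    );
    CREATE TABLE owner (
        owner_id  INTEGER PRIMARY KEY,
        name      TEXT,
        parcel_id INTEGER REFERENCES parcel(parcel_id)   -- foreign key
    );
    INSERT INTO parcel VALUES (1, 'residential'), (2, 'forest');
    INSERT INTO owner  VALUES (10, 'Asha', 1), (11, 'Ravi', 2), (12, 'Meera', 1);
""")

# Relational join: match each owner row to its parcel row through the key.
rows = conn.execute("""
    SELECT owner.name, parcel.land_use
    FROM owner
    JOIN parcel ON owner.parcel_id = parcel.parcel_id
    ORDER BY owner.owner_id
""").fetchall()
```

The land use 'residential' is stored once in `parcel` and reached from two owner rows through the key, rather than being repeated in every owner record.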
RELATIONAL DATABASE
MODEL (Contd..)
There are THREE basic normal forms.
• 1st Normal Form - Each row and column location should
hold only a single value.
• 2nd Normal Form - Every column that is not a primary
key must be totally dependent on the primary key.
• 3rd Normal Form - Columns should depend on
primary keys, but primary keys should not depend on
any non-primary key.

There are more advanced normal forms available.
They can be used to improve the quality of the database.
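As a small illustration of the 1st Normal Form only, the sketch below splits a multi-valued column into one row per value. The hotel records and the `facilities` column are hypothetical sample data.

```python
# Not in 1st Normal Form: the "facilities" cell holds several values.
unnormalized = [
    {"hotel": "Hotel A", "facilities": "pool, gym"},
    {"hotel": "Hotel B", "facilities": "gym"},
]


def to_first_normal_form(rows):
    """Emit one row per (hotel, facility) so every cell holds a single value."""
    return [
        {"hotel": row["hotel"], "facility": facility.strip()}
        for row in rows
        for facility in row["facilities"].split(",")
    ]


normalized = to_first_normal_form(unnormalized)
```

After the split, each facility can be searched or joined on directly, without parsing text inside a cell.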
STRUCTURED QUERY LANGUAGE
(SQL)
• The tables stored in the database are queried using SQL, and the
results are presented as virtual views.
• Queries may relate to one table, e.g. Which hotels in the city
are five star? The answer to the query could be Hotel Taj.
• Queries may also relate to many tables, e.g. Which
tourists originating from Europe stay more in five star hotels in
the city? (The two tables involved may be Tourist and Occupancy.)
• Advantages of SQL: completeness, simplicity and a pseudo
English language style.
• SQL was not developed to handle geographical concepts like
“near to”, “far from”, “connected to” etc.
• Software supporting SQL: ARC/INFO, ORACLE,
Geovision.
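The two kinds of query can be tried with Python's built-in sqlite3 module. The table layouts and rows below are invented for the example (only the Tourist and Occupancy table names and Hotel Taj come from the slide), and the second query also joins a hotel table to obtain the star rating.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hotel     (name TEXT PRIMARY KEY, stars INTEGER);
    CREATE TABLE tourist   (tourist_id INTEGER PRIMARY KEY, origin TEXT);
    CREATE TABLE occupancy (tourist_id INTEGER, hotel TEXT, nights INTEGER);
    INSERT INTO hotel     VALUES ('Hotel Taj', 5), ('Hotel Sagar', 3);
    INSERT INTO tourist   VALUES (1, 'Europe'), (2, 'Asia'), (3, 'Europe');
    INSERT INTO occupancy VALUES (1, 'Hotel Taj', 4), (2, 'Hotel Taj', 2),
                                 (3, 'Hotel Sagar', 1);
""")

# Single-table query: which hotels are five star?
five_star = [row[0] for row in
             conn.execute("SELECT name FROM hotel WHERE stars = 5")]

# Multi-table query: nights spent by European tourists in five star hotels.
european_nights = conn.execute("""
    SELECT SUM(o.nights)
    FROM occupancy AS o
    JOIN tourist AS t ON t.tourist_id = o.tourist_id
    JOIN hotel   AS h ON h.name = o.hotel
    WHERE t.origin = 'Europe' AND h.stars = 5
""").fetchone()[0]
```

Both queries filter on stored attribute values. A question such as "which hotels are near the station" has no counterpart in the SQL used here, which is the limitation noted above.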
LOCATION BASED REPRESENTATION
FOR SPATIO-TEMPORAL DATA
• It is a layered approach where each layer holds information
about a single thematic domain at a single known time.
• Data is stored in terms of “snapshots”.
• Drawbacks:
1. Data volume is enormous.
2. Accessing data is a time-consuming process.
3. Changes to individual cells cannot be determined directly.
TEMPORAL GRID APPROACH
• A variable-length list is associated with each pixel.
• Each entry records a change at that location, with the new
value and the time (event history).
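The temporal grid idea can be sketched with one change list per cell. The dictionary keyed by (row, column), the year values and the land-cover names are assumptions for illustration, and changes are assumed to be recorded in chronological order.

```python
from bisect import bisect_right

# (row, col) -> variable-length list of (time, new_value), oldest first.
history = {}


def record_change(cell, time, value):
    """Append an entry only when the value at this cell actually changes."""
    events = history.setdefault(cell, [])
    if not events or events[-1][1] != value:
        events.append((time, value))


def value_at(cell, time):
    """Value of a cell at a given time, or None before its first entry."""
    events = history.get(cell, [])
    times = [t for t, _ in events]
    i = bisect_right(times, time)
    return events[i - 1][1] if i else None


record_change((0, 0), 2000, "forest")
record_change((0, 0), 2005, "forest")   # no change, so nothing is stored
record_change((0, 0), 2010, "urban")
```

Unlike full snapshots, the cell stores two entries for three observations, and its change history can be read off directly.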
ENTITY BASED REPRESENTATION
FOR SPATIO - TEMPORAL DATA
Also called the AMENDMENT VECTOR APPROACH
• It tracks the changes in the geometry of entities with
respect to time.
• Changes are recorded incrementally (as amendment vectors).
• As time progresses, the number of amendment vectors
grows, which increases complexity.
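A much simplified sketch of the amendment idea follows. Representing an amendment as (time, vertex index, new position) is an assumption made to keep the example short; the base polygon and the years are made up.

```python
# Geometry of one entity at the start time (a hypothetical polygon).
base = [(0, 0), (4, 0), (4, 3), (0, 3)]

# Incremental amendments in time order: (time, vertex index, new position).
amendments = [
    (2001, 2, (5, 3)),
    (2004, 1, (5, 0)),
]


def geometry_at(time):
    """Rebuild the geometry by applying every amendment up to `time`."""
    shape = list(base)
    for amended_at, vertex, position in amendments:
        if amended_at <= time:
            shape[vertex] = position
    return shape
```

Every query replays the amendment list from the start, so the work grows with the number of amendments, in line with the complexity concern above.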



TIME BASED REPRESENTATION FOR
SPATIO-TEMPORAL DATA
• Time-based representations for spatio-temporal data use
time as the organizational basis.

• With this type of representation, the changes related to
time are explicitly stored.

• This type of representation has the unique advantage of
facilitating time-based queries.

• Adding new events as time progresses is also straightforward:
they are simply added to the end of the timeline.
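A timeline organised by time can be sketched as an ordered list of events. The event descriptions and years are invented, and the sketch assumes events arrive in chronological order so that appending keeps the list sorted.

```python
from bisect import bisect_left, bisect_right

timeline = []  # (time, description of the change), kept in time order


def add_event(time, change):
    """Add a new event to the end of the timeline."""
    if timeline and time < timeline[-1][0]:
        raise ValueError("events must be added in time order")
    timeline.append((time, change))


def events_between(start, end):
    """Time-based query: all events with start <= time <= end."""
    times = [t for t, _ in timeline]
    return timeline[bisect_left(times, start):bisect_right(times, end)]


add_event(1995, "road built")
add_event(2001, "forest cleared")
add_event(2008, "lake drained")
```

`events_between(2000, 2008)` returns the last two events without touching any spatial layer, which is the kind of time-based query this organisation is meant to make easy.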



THANK YOU !
