CSC468 Geographical Information System | Prithivi Narayan Campus | Dev Timilsina

Spatial Data Structure and Database Design


3.1. Concepts of geographic phenomena and data modeling
Geographic phenomena are often classified according to the spatial dimension best used
to describe their nature. These include points, lines, areas, and volumes (3D). As you
likely remember, we used the spatial dimension of map elements (e.g., line vs. point) in
the last lab to decide how to symbolize and apply feature labels to our maps.

Geographic phenomena are drastic, observable changes that take place in nature. They can
occur abruptly and are capable of transforming the environment, so that a new reality
arises after they occur.

Geographic phenomena are complemented by geographical facts, which refer to elements that
are stable and whose variations are perceived over longer periods.

devtimilsina@pncampus.edu.np

Whatever exists in nature at a given moment is therefore part of a geographic fact. When a
phenomenon produces an abrupt variation in the environment, the new reality that results
later becomes a new geographic fact.

A GIS operates under the assumption that the spatial phenomena involved occur in a
two- or three-dimensional Euclidean space. Euclidean space can be informally defined
as a model of space in which locations are represented by coordinates, (x, y) in 2D and
(x, y, z) in 3D, and in which distance and direction can be defined with geometric
formulas. In 2D, this is known as the Euclidean plane. To represent relevant aspects of
real-world phenomena inside a GIS, we first need to define what it is we are referring to.
We might define a geographic phenomenon as a manifestation of an entity or process of
interest that:

➢ can be named or described;
➢ can be georeferenced; and
➢ can be assigned a time (interval) at which it is/was present.
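
The three criteria above, together with the Euclidean-space assumption, can be sketched as a
small record type. This is only an illustration: the class name `Phenomenon`, its field
names, and the sample coordinates and dates are assumptions made here, not part of any GIS
package.

```python
from dataclasses import dataclass
from datetime import date
from math import atan2, degrees, hypot


@dataclass(frozen=True)
class Phenomenon:
    """Hypothetical record for a geographic phenomenon."""
    name: str      # it can be named or described
    x: float       # it can be georeferenced (2D Euclidean plane)
    y: float
    start: date    # it can be assigned a time interval
    end: date


def distance(a: Phenomenon, b: Phenomenon) -> float:
    """Euclidean distance between two georeferenced phenomena."""
    return hypot(b.x - a.x, b.y - a.y)


def direction(a: Phenomenon, b: Phenomenon) -> float:
    """Direction from a to b in degrees, counter-clockwise from the +x axis."""
    return degrees(atan2(b.y - a.y, b.x - a.x))


# Made-up sample values.
flood = Phenomenon("river overflow", 0.0, 0.0, date(2020, 7, 1), date(2020, 7, 9))
slide = Phenomenon("landslide", 3.0, 4.0, date(2020, 7, 2), date(2020, 7, 2))

print(distance(flood, slide))    # 5.0
print(direction(flood, slide))   # angle of the vector (3, 4), about 53.13 degrees
```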

Geographic phenomena can be classified according to the elements from which they
occur. This classification includes three types: physical, biological and human.

Characteristics of geographical phenomena

Physical geographic phenomena: Physical geographic phenomena are those that are generated
without involving any living organism. These drastic changes usually occur as a result of
naturally generated climatic, physical or chemical elements, among others.

Physical geographic changes include storms, tornadoes, heavy rains and earthquakes, among
others. Such changes are able to transform the landscape and generate a new reality.

Examples of physical geographic changes can be:

Rivers overflow: A river can overflow as a result of different natural causes. Some of the
possible causes may be the following:

➢ Heavy rain sustained over a short time
➢ Consistent rainfall for a long time
➢ Obstruction of channels by landslides
➢ Rise of sea level
➢ Warming

When a river overflows, it can generate lasting changes in the landscape. The river may
permanently expand its channel and flood the surrounding vegetation, and if there are
human communities nearby, it may destroy houses, buildings, roads and other constructions.

Biological geographical phenomena

Biological geographic phenomena are those that are generated by living beings,
excluding humans. Within this classification are the geographic variations produced by
plants, animals, insects and microorganisms. Some examples of biological geographic
changes can be:

Deforestation by pests

The appearance of pests can destroy large areas of vegetation. Pests can appear, for
example, as a result of an imbalance in fauna: if there are no natural predators, a species
can become a pest.

Pests especially affect plants located on land with few nutrients, which can lead to the
deforestation of entire regions and alter the environment completely. Pests can also
greatly reduce the number of animals in an area.

Human Geographical Phenomena

These phenomena are the most obvious and, in many cases, the most invasive that can be
found on the planet. Human geographic phenomena are caused exclusively by human action on
the environment.

Like physical and biological phenomena, human geographic phenomena alter the
environment in a lasting way. As a result of these transformations, positive and, in many
cases, negative consequences can be generated. Some examples of human geographic
phenomena:

Construction of roads

As a result of the need to expand their communication channels, human beings have
transformed their environment. This has involved the construction of roads and highways
that visibly alter the environment. The construction of this type of structure has been
beneficial for human development, extending interaction between people and making
communication more effective.

However, in some cases the intervention has been detrimental to nature, because some
ecosystems have been affected.

Construction of dams

Hydraulic dams are structures, made of walls and containment elements, whose main
function is to store or divert water from a river to serve different purposes.

Among the functions of a water dam are the regulation of water supply in a particular
region and the storage of water for irrigation or energy production.

Constructing a dam interferes with nature to a great extent. These constructions generate
positive consequences for human life, such as renewable energy production, flood control
in certain areas and easier access to water for human consumption.

On the other hand, the construction of dams is considered a geographical phenomenon
because it transforms the environment permanently:

➢ It generates still water, which can bring disease.
➢ It blocks the passage of different aquatic species, affecting migratory movements.
➢ It promotes the extinction of whole colonies of organisms that live in the rivers.

Data modeling in GIS

Data modeling is the process of creating a simplified diagram of a software system and
the data elements it contains, using text and symbols to represent the data and how it
flows. Data models provide a blueprint for designing a new database or reengineering a
legacy application. Overall, data modeling helps an organization use its data effectively
to meet business needs for information.

A data model can be thought of as a flowchart that illustrates data entities, their attributes
and the relationships between entities. It enables data management and analytics teams
to document data requirements for applications and identify errors in development plans
before any code is written.

Alternatively, data models can be created through reverse-engineering efforts that extract
them from existing systems. That's done to document the structure of relational databases
that were built on an ad hoc basis without upfront data modeling and to define schemas
for sets of raw data stored in data lakes or NoSQL databases to support specific analytics
applications.

Data modeling can also help establish common data definitions and internal data
standards, often in connection with data governance programs. In addition, it plays a big
role in data architecture processes that document data assets, map how data moves
through IT systems and create a conceptual data management framework. Data models
are a key data architecture component, along with data flow diagrams, architectural
blueprints, a unified data vocabulary and other artifacts.

What are the different types of data models?

Data modelers use three types of models to separately represent business concepts and
workflows, relevant data entities and their attributes and relationships, and the technical
structures for managing the data. The models typically are created in a progression as
organizations plan new applications and databases. These are the different types of data
models and what they include:

Conceptual data model.

This is a high-level visualization of the business or analytics processes that a system will
support. It maps out the kinds of data that are needed, how different business entities
interrelate and associated business rules. Business executives are the main audience for
conceptual data models, to help them see how a system will work and ensure that it meets
business needs. Conceptual models aren't tied to specific database or application
technologies.

Logical data model.

Once a conceptual data model is finished, it can be used to create a less-abstract logical
one. Logical data models show how data entities are related and describe the data from
a technical perspective. For example, they define data structures and provide details on
attributes, keys, data types and other characteristics. The technical side of an
organization uses logical models to help understand required application and database
designs. But like conceptual models, they aren't connected to a particular technology
platform.

Physical data model.

A logical model serves as the basis for the creation of a physical data model. Physical
models are specific to the database management system (DBMS) or application software
that will be implemented. They define the structures that the database or a file system will
use to store and manage the data. That includes tables, columns, fields, indexes,
constraints, triggers and other DBMS elements. Database designers use physical data
models to create designs and generate schema for databases.
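
As a rough illustration of the step from a logical to a physical model, the sketch below
turns a "street" entity into concrete DBMS structures using Python's built-in `sqlite3`
module. The table name, column names, sample row, and the constraint are assumptions chosen
for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Physical model: a concrete table, column types, a key, a constraint, and an index.
conn.executescript("""
CREATE TABLE street (
    street_id   INTEGER PRIMARY KEY,
    name        TEXT    NOT NULL,
    street_type TEXT    NOT NULL,
    speed_limit INTEGER CHECK (speed_limit > 0)
);
CREATE INDEX idx_street_name ON street (name);
""")

conn.execute(
    "INSERT INTO street (name, street_type, speed_limit) VALUES (?, ?, ?)",
    ("Lakeside Road", "arterial", 40),
)

# The CHECK constraint rejects a row that violates the rule.
try:
    conn.execute(
        "INSERT INTO street (name, street_type, speed_limit) VALUES (?, ?, ?)",
        ("Bad Road", "local", -5),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM street").fetchone()[0]
print(row_count, rejected)   # 1 True
```

The same logical entity could be mapped to a different DBMS with different types and index
options, which is why the physical model is the only one of the three that is
platform-specific.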

Data modeling techniques

Data modeling emerged in the 1960s as databases became more widely used on
mainframes and then minicomputers. It enabled organizations to bring consistency,
repeatability and disciplined development to data processing and management. That's
still the case, but the techniques used to create data models have evolved along with the
development of new types of databases and computer systems.

These are the data modeling approaches used most widely over the years, including
several that have largely been supplanted by newer techniques.

1. Hierarchical data modeling

Hierarchical data models organize data in a treelike arrangement of parent and child
records. A child record can have only one parent, making this a one-to-many modeling
method. The hierarchical approach originated in mainframe databases; IBM's Information
Management System (IMS) is the best-known example. Although hierarchical data models were
mostly superseded by relational ones beginning in the 1980s, IMS is still available and
used by many organizations. A similar hierarchical method is also used today in XML,
formally known as Extensible Markup Language.
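
Because XML follows the same parent and child arrangement, a small XML document is a
convenient way to see the one-parent rule in action. The sketch below uses Python's standard
`xml.etree.ElementTree` module; the element names and the sample hierarchy are assumptions
made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical hierarchy: a country contains districts, which contain cities.
doc = """
<country name="Nepal">
  <district name="Kaski">
    <city name="Pokhara"/>
  </district>
  <district name="Kathmandu">
    <city name="Kathmandu"/>
    <city name="Kirtipur"/>
  </district>
</country>
"""

root = ET.fromstring(doc)

# Map every child element to its single parent (one-to-many seen from the parent).
parent_of = {child: parent for parent in root.iter() for child in parent}


def path_to_root(element):
    """Walk upward through the unique chain of parents."""
    names = [element.get("name")]
    while element in parent_of:
        element = parent_of[element]
        names.append(element.get("name"))
    return names


pokhara = root.find("./district/city[@name='Pokhara']")
print(path_to_root(pokhara))   # ['Pokhara', 'Kaski', 'Nepal']
```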

2. Network data modeling

This was also a popular data modeling option in mainframe databases that isn't used as
much now. Network data models expanded on hierarchical ones by allowing child records
to be connected to multiple parent records. The Conference on Data Systems
Languages, a now-defunct technical standards group commonly called CODASYL,
adopted a network data model specification in 1969. Because of that, the network
technique is often referred to as the CODASYL model.
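
The difference from the hierarchical model can be sketched with plain dictionaries: a member
record may be linked to more than one owner record. The supplier and part names below are
hypothetical, and the code illustrates only the many-parents idea, not the CODASYL
specification itself.

```python
from collections import defaultdict

# Hypothetical owner -> member links; one member may have several owners.
links = [
    ("Supplier A", "Part 1"),
    ("Supplier B", "Part 1"),   # Part 1 has two parents
    ("Supplier B", "Part 2"),
]

children = defaultdict(set)
parents = defaultdict(set)
for owner, member in links:
    children[owner].add(member)
    parents[member].add(owner)

print(sorted(parents["Part 1"]))        # ['Supplier A', 'Supplier B']
print(sorted(children["Supplier B"]))   # ['Part 1', 'Part 2']
```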

3. Relational data modeling

The relational data model was created as a more flexible alternative to hierarchical and
network ones. First described in a 1970 technical paper by IBM researcher Edgar F.
Codd, the relational model maps the relationships between data elements stored in
different tables that contain sets of rows and columns. Relational modeling set the stage
for the development of relational databases, and their widespread use made it the
dominant data modeling technique by the mid-1990s.
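
A minimal sketch of the relational idea, again using Python's built-in `sqlite3` module: two
tables of rows and columns are related through a shared key and combined with a join. The
table and column names and the sample rows are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE district (district_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE city (
    city_id     INTEGER PRIMARY KEY,
    name        TEXT,
    district_id INTEGER REFERENCES district (district_id)
);
INSERT INTO district VALUES (1, 'Kaski');
INSERT INTO district VALUES (2, 'Kathmandu');
INSERT INTO city VALUES (1, 'Pokhara', 1);
INSERT INTO city VALUES (2, 'Kirtipur', 2);
""")

# The relationship lives in the data (district_id), not in a fixed record hierarchy.
rows = conn.execute("""
    SELECT city.name, district.name
    FROM city JOIN district ON city.district_id = district.district_id
    ORDER BY city.name
""").fetchall()
print(rows)   # [('Kirtipur', 'Kathmandu'), ('Pokhara', 'Kaski')]
```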

4. Entity-relationship data modeling

A variation of the relational model that can also be used with other types of databases,
entity-relationship (ER) models visually map entities, their attributes and the relationships
between different entities. For example, the attributes of an employee data entity could
include last name, first name, years employed and other relevant data. ER models
provide an efficient approach for data capture and update processes, making them
particularly suitable for transaction processing applications.

Geographic objects and fields

The two fundamental ways of representing geography are discrete objects and fields.

Discrete objects

The discrete object view represents the world as objects with well-defined boundaries in
empty space. Just as the desktop may be littered with books, pencils, or computers, the
geographic world is littered with cars, houses, forest stands, and other discrete objects.
One characteristic of the discrete object view is that objects can be counted. For
example, there may be 49 houses in a particular subdivision.

Geographic objects are identified by their dimensionality. Objects that occupy area,
including lakes and forest stands, are termed two-dimensional and generally referred to as
areas or polygons. Other objects that are linear, including roads, railways, and rivers,
are termed one-dimensional and generally referred to as lines. Objects that are single
locations, including individual animals and buildings, are termed zero-dimensional and
generally referred to as points.

The discrete object view leads to a powerful way of representing geographic information
about objects. Consider a class of objects of the same dimensionality, for example, all the
grizzly bears in the Kenai Peninsula of Alaska. We would naturally think of these objects
as points. We might want to know the sex of each bear and its date of birth if our
interests were in monitoring the bear population. We might also have a collar on each bear
that transmitted the bear's location at regular intervals.

All of this information could be expressed in a table, like the one shown below, with each
row corresponding to a different discrete object and each column to an attribute of the
object.
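
In code, such a table is simply a list of records with one record per object. The sketch
below is illustrative only: the `Bear` class, its field names, and every value in it are
made up, and the coordinates are in an arbitrary local grid.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Bear:
    bear_id: int        # one row per discrete object
    sex: str            # attribute column
    birth_date: date    # attribute column
    x: float            # last collar fix, arbitrary local coordinates
    y: float


# Every value below is invented for illustration.
bears = [
    Bear(1, "F", date(2015, 2, 1), 12.4, 30.1),
    Bear(2, "M", date(2012, 1, 20), 10.7, 28.4),
    Bear(3, "F", date(2018, 2, 11), 11.9, 29.3),
]

print(len(bears))                                    # objects can be counted: 3
females = [b.bear_id for b in bears if b.sex == "F"]
print(females)                                       # [1, 3]
```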

Fields

While we might think of land as composed of discrete mountain peaks, valleys, ridges,
slopes, etc., and think of listing them in tables and counting them, in practice there are
unresolvable problems of definition for all of these objects. Instead, it is much more useful
to think of terrain as a continuous surface in which elevation can be defined rigorously at
every point. Such continuous surfaces form the basis of the other common view of
geographic phenomena, known as the field view. The field view represents the real
world as a finite number of variables, each one defined at every possible position.
Discrete objects are distinguished by their dimensions and naturally fall into categories of
points, lines, and areas. Fields, on the other hand, can be distinguished by what varies and
how smoothly. A field of elevation, for example, varies much more smoothly in a
landscape that has been worn down by glaciation or flattened by blowing sand than in
one recently created by cooling lava. Cliffs are places in fields where elevation changes
suddenly rather than smoothly.
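
One way to picture the field view is as a function that returns a value for any position,
rather than a list of stored objects. The two functions below are hypothetical terrains
invented for this sketch: one varies smoothly, the other contains a cliff.

```python
from math import exp


def smooth_elevation(x: float, y: float) -> float:
    """Hypothetical smooth terrain: a single 1000 m hill centred at (5, 5)."""
    return 1000.0 * exp(-((x - 5.0) ** 2 + (y - 5.0) ** 2) / 8.0)


def cliff_elevation(x: float, y: float) -> float:
    """Hypothetical terrain with a cliff: elevation jumps by 50 m at x = 5."""
    return 200.0 if x < 5.0 else 250.0


# A field has a value at every possible position.
print(smooth_elevation(5.0, 5.0))                                # 1000.0 at the summit
print(smooth_elevation(4.9, 5.0) > smooth_elevation(4.0, 5.0))   # True
print(cliff_elevation(4.999, 2.0), cliff_elevation(5.0, 2.0))    # 200.0 250.0
```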

Fields can also be created from classifications of land into categories of land use or
soil type. Such fields change suddenly at the boundaries between different classes. Other
types of fields can be defined by continuous variation along lines rather than across
space. Traffic density, for example, can be defined everywhere on a road network, and flow
volume can be defined everywhere on a river. Below is an example of field-like phenomena.

3.2 Vector data and Raster data model

There are two essential methods used to store information in a Geographic Information
System (GIS): the raster data model and the vector data model. GIS data represents
real-world objects such as roads, land use, and elevation with digital data. Real-world
objects or features of the earth can be divided into two abstractions: discrete objects
(such as a tree) and continuous fields (such as elevation).

To work in a GIS environment, real world observations (objects or events that can be
recorded in 2D or 3D space) need to be reduced to spatial entities. These spatial entities
can be represented in a GIS as a vector data model or a raster data model.

Raster Data Model

A raster data type is made up of pixels or cells, and each pixel has an associated value.
Digital photography is a familiar example of the raster data model: anyone who is familiar
with digital photography will recognize the pixel as the smallest individual unit of an
image, where each pixel value corresponds to a particular color and the combination of
these pixels creates an image. A commonly used kind of raster data is aerial photography,
which typically serves to display a detailed image on a map or as a base for digitization.
A raster consists of rows and columns of (usually square) cells, and each cell stores a
single value. The model can be extended with raster bands, for example to represent RGB
(red, green, blue) colors.
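
The grid structure described above can be sketched with nested lists. The cell values, the
origin, and the cell size below are assumptions for illustration; real raster formats store
the same kind of georeferencing in a header or a world file.

```python
# A 3-row by 4-column single-band raster; the values are made-up elevations.
grid = [
    [10, 12, 15, 20],
    [11, 13, 18, 25],
    [12, 14, 21, 30],
]

# Georeferencing assumed for this sketch: top-left corner and square cell size.
x_min, y_max, cell = 100.0, 200.0, 10.0


def cell_value(x: float, y: float) -> int:
    """Return the value of the cell that contains map coordinate (x, y)."""
    col = int((x - x_min) // cell)
    row = int((y_max - y) // cell)   # rows count downward from the top edge
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("coordinate lies outside the raster")
    return grid[row][col]


print(cell_value(125.0, 185.0))   # row 1, column 2 -> 18

# With three bands, each cell can hold one colour as (red, green, blue).
rgb_pixel = (255, 128, 0)
```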

Raster Data Model Advantages

➢ Better for storing image data.
➢ A powerful format for statistical and spatial analysis.
➢ Easy and efficient overlaying.
➢ Simple data structure.
➢ Same grid cell for several attributes.

Raster Data Model Disadvantages

➢ Datasets can be large, so storage space can be a problem.
➢ Network analysis is difficult to perform.
➢ Information is lost when large cells are used.
➢ Projection transformations are inefficient.
➢ Topological connections are difficult to represent.

Vector Data Model

Vector data represent features as individual points, which are stored as pairs of (x, y)
coordinates. If these points are joined, they create a line feature; if they are joined
into a closed ring, they create a polygon. All vector data fundamentally consists of lists
of coordinates that define vertices and paths. Vectors are frequently used in all kinds of
applications. One common area is urban planning, where land parcels and buildings are often
represented as polygons, roads as polylines or polygons (road edge), and small features
like telephone poles as points.

Geographical features are best represented by the following types of geometry:

Points: When geographic features are too small to represent as polygons, point features
are used; in other words, a simple location. Examples include the locations of trees, depth
measurements, and points of interest. These vector points are simply XY coordinates.

Lines or polylines: Vector lines or polylines connect vertices with paths. They usually
represent features that are linear, such as rivers, roads, railroads, and pipelines.

Polygons: Cartographers use polygons to display geographic features that have an area.
Examples include lakes, park boundaries, buildings, city boundaries, or land uses.
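
Since all three geometry types reduce to lists of coordinates, they can be sketched with
tuples and lists. The coordinates below are hypothetical (in metres), and the two helper
functions show the kind of measurement that follows directly from the stored vertices.

```python
from math import hypot

# Hypothetical coordinates in metres.
pole = (2.0, 3.0)                                            # point
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]                 # polyline
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]    # polygon ring


def length(line):
    """Sum of the segment lengths of a polyline."""
    return sum(hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(line, line[1:]))


def area(ring):
    """Polygon area by the shoelace formula (the ring is closed implicitly)."""
    closed = ring + ring[:1]
    twice = sum(x1 * y2 - x2 * y1
                for (x1, y1), (x2, y2) in zip(closed, closed[1:]))
    return abs(twice) / 2.0


print(length(road))    # 5.0 + 6.0 = 11.0
print(area(parcel))    # 4 x 3 rectangle = 12.0
```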

Vector features are grouped into layers, and the features in a specific layer have the same
geometry type. For example, if a layer contains polygon features, then a GIS application
will only allow a user to create new polygon features in that layer. Each vector feature
is stored in a database along with its attributes. For example, a database that describes
a street may contain the street's name, type, and speed limit. The user can perform
spatial analysis with different geometries.

Vector Data Model Advantages

➢ Compact data structure: less space is needed for storing data.
➢ Accurate graphic output.
➢ Since most information, e.g. printed maps, is in vector form, no data conversion is
required.
➢ Exact geographic location of data is maintained.
➢ Topology and network connections are easy to represent, which makes the model
efficient for network analysis.

Vector Data Model Disadvantages

➢ The location of each vertex needs to be stored explicitly.
➢ It has a complex data structure.
➢ Overlay operations are difficult.
➢ High spatial variability is inefficiently represented.
➢ Spatial analysis and filtering within polygons are impossible.

Raster or Vector?

Whether to use a vector or a raster data model in your work depends entirely on the data
you have as input and on your goals for displaying or analyzing the data. Many analyses
make use of both data models, or require the conversion of one data model to another.
While conversion is a common procedure, any translation between data models should be kept
to a minimum to avoid accumulating error in your spatial model. The size of the dataset
should also be a consideration, as raster datasets can be quite large and difficult for
some workstations to process in a timely manner. The vector data model is recommended for
analysis unless a continuous surface is being modeled. When using a raster data model, it
is important to use cell sizes that are appropriate for the analysis.
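
The effect of cell size on a vector-to-raster conversion can be sketched in a few lines.
The approach below (mark a cell when its centre falls inside the polygon, using a
ray-casting test) is one common simplification chosen for this example, not the method of
any particular GIS package, and the triangle is hypothetical.

```python
def inside(x, y, ring):
    """Ray-casting point-in-polygon test for a simple polygon ring."""
    hit = False
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):                     # edge crosses the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit
    return hit


def rasterize(ring, width, height, cell):
    """Return a grid with 1 where a cell centre falls inside the polygon."""
    rows, cols = int(height / cell), int(width / cell)
    return [[1 if inside((c + 0.5) * cell, (r + 0.5) * cell, ring) else 0
             for c in range(cols)]
            for r in range(rows)]


# Hypothetical right triangle with a true area of 8.0 inside a 4 x 4 extent.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]

estimates = {}
for cell in (2.0, 1.0, 0.5):
    grid = rasterize(triangle, 4.0, 4.0, cell)
    estimates[cell] = sum(map(sum, grid)) * cell * cell

print(estimates)   # {2.0: 4.0, 1.0: 6.0, 0.5: 7.0}
```

The estimated area moves towards the true value of 8.0 as the cells shrink, at the cost of
four times as many cells for each halving of the cell size.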

3.3. Spatial Relationships and Topology

The term “spatial relationships” refers to the way objects are arranged in relation to
one another in geographic space. For example, we can describe them as adjacency,
contiguity, overlap, and proximity.

Here are a few quotes that all good geographers should remember to better
understand the concept of spatial relationships.

“Everything is related to everything else. But near things are more related than distant
things.” -Waldo R. Tobler

“GIS is the nervous system for the planet.” -Jack Dangermond

“Geography has made us neighbors. History has made us friends. Economics has made
us partners, and necessity has made us allies” -John F Kennedy

Each quote implies that geography is connected. Spatial relationships express the same
idea, because they identify how features relate to one another in geographic space.

Spatial Relationships Types

Adjacency, contiguity, overlap, and proximity are four ways of describing the
relationship between two or more entities.

Adjacency: Adjacency is when two polygon entities are directly connected to each other
sharing a boundary.

Contiguity: Contiguity is the relationship between two or more entities when they share
an edge such as the 48 contiguous states.

Overlap: An overlap occurs when an entity shares all or part of its location with another
entity.

Proximity: Finally, proximity is defined as two entities being close enough to establish
contact with each other without being in physical contact.

There is no single rule for what a spatial relationship between two objects is, but the
common types of spatial relationships include overlap, proximity, contiguity, and
adjacency.
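
These relationships can be tested numerically. The sketch below uses axis-aligned
rectangles to keep the geometry simple; the rectangles, the function names, and the 5-unit
proximity threshold are all assumptions made for illustration.

```python
from math import hypot

# Axis-aligned rectangles as (xmin, ymin, xmax, ymax); all values hypothetical.
a = (0.0, 0.0, 2.0, 2.0)
b = (2.0, 0.0, 4.0, 2.0)   # shares the edge x = 2 with a
c = (1.0, 1.0, 3.0, 3.0)   # covers part of a
d = (5.0, 0.0, 6.0, 1.0)   # separate from a, but nearby


def overlaps(r, s):
    """True when the interiors of the two rectangles intersect."""
    return r[0] < s[2] and s[0] < r[2] and r[1] < s[3] and s[1] < r[3]


def adjacent(r, s):
    """True when the rectangles share a boundary segment but no interior."""
    share_x = (r[2] == s[0] or s[2] == r[0]) and min(r[3], s[3]) > max(r[1], s[1])
    share_y = (r[3] == s[1] or s[3] == r[1]) and min(r[2], s[2]) > max(r[0], s[0])
    return share_x or share_y


def gap(r, s):
    """Shortest distance between two rectangles (0 if they touch or overlap)."""
    dx = max(s[0] - r[2], r[0] - s[2], 0.0)
    dy = max(s[1] - r[3], r[1] - s[3], 0.0)
    return hypot(dx, dy)


print(adjacent(a, b), overlaps(a, b))   # True False
print(overlaps(a, c))                   # True
print(gap(a, d), gap(a, d) <= 5.0)      # 3.0 True (within the proximity threshold)
```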

Spatial Joins and Geographic Relationships

A spatial join is a powerful tool in the GIS world. It allows the attributes of one layer
to be joined to the features of another layer based on location rather than on a shared
attribute. In GIS, a spatial join can be achieved through different spatial relationships.

Spatial Join Type     Description

Intersect             Two features touch at any location. An intersect is commonly
                      used to describe a point where two or more lines, planes, or
                      surfaces meet.
Within a distance     Two features are within a set distance of each other. This
                      spatial relationship is defined by how far apart two or more
                      entities are on a map.
Completely within     The join feature is within the target feature. This type of
                      geographic relationship involves one feature being completely
                      inside the other.
Identical             Both features match identically. When two features match
                      exactly, without any difference or variation of any kind, the
                      features are identical.
Closest               The join feature is closest to the target feature. When there are
                      several possible features to join, this spatial join takes only
                      the closest feature.
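
Two of the join types above, "within a distance" and "closest", can be sketched for point
layers with nothing more than a distance function. The layer contents and the function
names are hypothetical; a GIS package would normally use a spatial index instead of
comparing every pair.

```python
from math import hypot

# Hypothetical layers: target points (schools) and join points (bus stops).
schools = {"School A": (0.0, 0.0), "School B": (10.0, 0.0)}
stops = {"Stop 1": (1.0, 1.0), "Stop 2": (9.0, 0.0), "Stop 3": (4.0, 0.0)}


def dist(p, q):
    return hypot(p[0] - q[0], p[1] - q[1])


def join_within(targets, joins, max_dist):
    """'Within a distance' join: every join feature inside the search radius."""
    return {t: sorted(j for j, q in joins.items() if dist(p, q) <= max_dist)
            for t, p in targets.items()}


def join_closest(targets, joins):
    """'Closest' join: only the single nearest join feature per target."""
    return {t: min(joins, key=lambda j: dist(p, joins[j]))
            for t, p in targets.items()}


within = join_within(schools, stops, 5.0)
closest = join_closest(schools, stops)
print(within)    # {'School A': ['Stop 1', 'Stop 3'], 'School B': ['Stop 2']}
print(closest)   # {'School A': 'Stop 1', 'School B': 'Stop 2'}
```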

Today, topology in GIS is generally defined as the spatial relationships between adjacent
or neighboring features. Mathematical topology assumes that geographic features occur
on a two-dimensional plane.

Geospatial topology is the study and application of qualitative spatial relationships
between geographic features, or between representations of such features in geographic
information, such as in geographic information systems (GIS). For example, the fact that
two regions overlap, or that one contains the other, is a topological relationship. It is
thus the application of the mathematics of topology to GIS, and is distinct from, but
complementary to, the many aspects of geographic information that are based on
quantitative spatial measurements through coordinate geometry. Topology appears in many
aspects of geographic information science and GIS practice, including the discovery of
inherent relationships through spatial query, vector overlay and map algebra; the
enforcement of expected relationships as validation rules stored in geospatial data; and
the use of stored topological relationships in applications such as network analysis.
Spatial topology is the generalization of geospatial topology for non-geographic domains.
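
A stored topological relationship can be as simple as a table of which polygons share an
edge. The sketch below derives that adjacency from vertex rings; it assumes a clean dataset
in which neighbouring polygons use exactly the same vertices along a shared edge, and the
parcel names and coordinates are hypothetical.

```python
from collections import defaultdict

# Hypothetical parcels as vertex rings; A and B share the edge (2, 0)-(2, 2).
polygons = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
    "C": [(6, 0), (8, 0), (8, 2), (6, 2)],
}

# Record which polygons use each edge; the edge key ignores direction.
edge_users = defaultdict(set)
for name, ring in polygons.items():
    for p, q in zip(ring, ring[1:] + ring[:1]):
        edge_users[frozenset((p, q))].add(name)

# Two polygons are adjacent when they use the same edge.
adjacency = defaultdict(set)
for users in edge_users.values():
    for u in users:
        adjacency[u] |= users - {u}

result = {name: sorted(adjacency[name]) for name in polygons}
print(result)   # {'A': ['B'], 'B': ['A'], 'C': []}
```

Once this table is stored, a question such as "which parcels border parcel A?" is answered
by a lookup, without any coordinate geometry.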

3.4 GIS data formats and data conversion

Geospatial data is created, shared, and stored in many different formats.

Vector Data File Formats

SDC: Smart Data Compression: SDC is a highly compressed format from ESRI (Environmental
Systems Research Institute), which is directly readable by ArcGIS software, but not by
ArcView 3.x. Many ESRI Data and Maps datasets are natively in SDC format.

With ESRI Data and Maps 2006, a standalone "Data Distribution Application" was
included that converts SDC data files directly to shapefiles.

LYR: Layer File: A .lyr file is directly readable only by ArcGIS software and other newer
software applications. This file does not contain actual geographic data, but rather
contains specifications for the presentation of other datasets. Such specifications include
color shading, naming, and label properties (font, color, placement, etc.). Such
presentation properties are usually time-consuming to create, so a .lyr file allows these
settings to be saved and shared. In order to use a .lyr file, you must also have a separate
data file with the same prefix name saved in the same filespace.

SHP: Shapefile: The ESRI Shapefile has become an industry standard geospatial data
format, and is compatible to some extent with practically all recently released GIS
software. To have a complete shapefile, you must have at least 3 files with the same
prefix name and with the following extensions: .shp = main file (feature geometry), .shx =
index file and .dbf = associated database file. Additionally, you may have a .prj =
projection file, a .lyr = layer file, and other index files. All these files must be saved
in the same workspace.
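
Because a shapefile is really several files sharing one prefix, a quick completeness check
before loading can save confusion. The helper below is a hypothetical utility written for
this note using only Python's standard library; it checks lowercase extensions only, and the
file names are invented.

```python
import tempfile
from pathlib import Path

REQUIRED = (".shp", ".shx", ".dbf")   # the three mandatory parts


def missing_parts(shp_path):
    """Return the required extensions that are absent next to shp_path."""
    base = Path(shp_path)
    return [ext for ext in REQUIRED if not base.with_suffix(ext).exists()]


# Demonstration in a temporary folder with an incomplete, empty "roads" shapefile.
with tempfile.TemporaryDirectory() as folder:
    for ext in (".shp", ".dbf"):
        (Path(folder) / f"roads{ext}").touch()
    result = missing_parts(Path(folder) / "roads.shp")

print(result)   # ['.shx']
```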

ArcInfo Coverage: An ArcInfo coverage does not have an individual file extension.
Instead, it is composed of two folders within a "workspace" which each contain multiple
files. One of the two folders carries the name of the coverage, and contains a number of
various .adf files. The other folder is an "info" folder, which typically contains .dat and .nit
files for all the coverages and grids in the workspace. The best way to manage (copy,
move, delete, rename) ArcInfo coverages is with ArcCatalog or ArcInfo Workstation
(command line).

E00: Arc Export or Interchange Format: .e00 (pronounced e-o-o or e-zero-zero) files
are ArcInfo Interchange or export files, used to conveniently copy and move ArcInfo GIS
coverages (see above) and grids (see below). An .e00 file must be "imported" in order to
use the data in ArcView or other GIS software.

MDB: Geodatabase: The geodatabase is a collection of geographic datasets of various
types, with the most basic types being vector, raster, and tabular data. There are three
types of geodatabases: file, personal, and ArcSDE. Geodatabases are the native data
format for ESRI's ArcGIS. A full discussion and online help is available at ESRI Support
Center.

TIN: Triangular Irregular Network: A TIN is a vector-based model which represents
geographic surfaces as contiguous non-overlapping triangles. The vertices of each
triangle are known data points (x, y) with values in the third dimension (z) taken from
surveys, topographic maps, or digital elevation models (DEMs). The surface of each
triangle has a slope, aspect, surface area, and continuous, interpolated elevation values.

The selective inclusion of points within a TIN gives the triangles their irregular pattern and
reduces the amount of data storage required relative to the regularly distributed points in
a DEM.
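
The "continuous, interpolated elevation values" inside one TIN triangle can be sketched with
barycentric weights, which amount to linear interpolation across the triangle's plane. The
function name and the vertex values are assumptions for this example.

```python
def interpolate_z(p, tri):
    """Linearly interpolate z at p = (x, y) inside a triangle of (x, y, z) vertices."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = tri
    x, y = p
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    # Barycentric weights: each is 1 at its own vertex and 0 on the opposite edge.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        raise ValueError("point lies outside the triangle")
    return w1 * z1 + w2 * z2 + w3 * z3


# Hypothetical surveyed vertices (x, y, z).
triangle = [(0.0, 0.0, 100.0), (10.0, 0.0, 120.0), (0.0, 10.0, 140.0)]

print(interpolate_z((0.0, 0.0), triangle))   # 100.0 at the first vertex
print(interpolate_z((5.0, 5.0), triangle))   # 130.0, midway between 120 and 140
```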

Raster Data File Formats:

ArcInfo Grid: An ArcInfo Grid does not have an individual file extension. Instead it is
composed of two folders within a "workspace" which each contain multiple files. One of
the two folders carries the name of the grid, and contains a number of various .adf files.
The other folder is an "info" folder, which typically contains .dat and .nit files for all the
coverages and grids in the workspace. The best way to manage (copy, move, delete,
rename) ArcInfo Grids is with ArcCatalog or ArcInfo Workstation (command line).

MrSID: MrSID (pronounced "mister sid") is a proprietary format of LizardTech's
GeoExpress software for imagery compression, and is commonly used on orthoimages.
See the DOQQ page for MrSID content. The MrSID file extension is .sid. A companion
file with a .sdw extension and the same prefix name as the .sid is used as a world file for
georeferencing a MrSID image. Most greyscale TIFF images are compressed with MrSID
to 10:1 or 15:1. Color images are usually compressed to 30:1 or 40:1. GeoExpress is also
commonly used to create image mosaics.

Most recent GIS software, including ArcGIS, is able to read MrSID-compressed images
without any additional extensions. ArcView 3.x, however, requires a MrSID Extension for
image access. Plugins for other software, such as AutoCAD and Photoshop, may or may
not be required.

ECW: ECW is a proprietary format of ER Mapper for imagery compression. It is a more
recent format than MrSID, but is gaining popularity because of free compression utilities
available from ER Mapper's website.

JPEG 2000: JPEG 2000 is a non-proprietary image compression format based on ISO
standards, and typically uses .jp2 as the file extension. Its advantages are that it offers
lossy and lossless compression, and world files (.j2w) can be used to georeference an
image in GIS software. Compression ratios are similar to MrSID and ECW formats. For
more information, see the Wikipedia entry for JPEG 2000.
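
The world files mentioned above (.sdw, .j2w) are small plaintext files holding six numbers that define an affine transformation from pixel positions to map coordinates. The sketch below shows how such a file could be read and applied. It assumes the conventional six-line order (A, D, B, E, C, F); the function names and the sample values are hypothetical.

```python
def parse_world_file(text):
    """Parse the six values of a world file, given its content as text.

    Line order: A (pixel width), D (row rotation), B (column rotation),
    E (pixel height, normally negative), C and F (x and y map coordinates
    of the centre of the upper-left pixel).
    """
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError(f"expected 6 values, found {len(values)}")
    a, d, b, e, c, f = values
    return a, d, b, e, c, f


def pixel_to_map(params, col, row):
    """Map coordinates of the centre of the pixel at (col, row)."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical world file for an image with 30 m pixels and no rotation
SAMPLE = "30.0\n0.0\n0.0\n-30.0\n500000.0\n4100000.0\n"
params = parse_world_file(SAMPLE)
print(pixel_to_map(params, 10, 20))  # (500300.0, 4099400.0)
```

Because E is negative, map y decreases as the row number increases, which matches image rows being counted from the top.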

File Formats for GIS | Name | Description

Vector Data
.kml | Google KML File | An XML-based plaintext file that may contain geometry, data, or a pointer to a web service.
.kmz | Compressed KML | Same as above, but compressed. You can uncompress this with 7zip (free/open) and you will see plaintext KML.
.gpx | GPS data file | XML-based GPS data file, usually coming from a GPS device.
.gdb | File Geodatabase (Directory) | Esri File Geodatabase, which can be used to store vector and raster data as well as more complex data containing topologies and other supporting files. The File Geodatabase is a directory containing many files that cannot be read on their own.
.mdb | Personal Geodatabase/MS Access Database | Esri Personal Geodatabase format.
.sqlite | SQLite/SpatiaLite Database | The extension is optional for this format, but is often .sqlite. SpatiaLite is an extension to SQLite which spatially enables it. This format is often used with QGIS.

Shapefile: Has traditionally been the “standard” vector format for ArcGIS. It is actually a collection of files, not a single file, all in the same directory, with the same file base name (the name without the extension). When viewed in an ArcGIS filesystem dialog, it is usually displayed as a single file.
.shp | Shapefile | Main file that stores the data geometry. Sometimes, somewhat confusingly, referred to as a shapefile.
.shx | Shapefile index file | Shapefile index file.
.dbf | Shapefile data file | Shapefile data file which stores attribute data. It is actually a tabular format called dBASE, which can be read on its own in other programs.
.prj | Shapefile projection file | (Optional) Stores spatial reference and projection metadata. It is a plaintext file that can be read on its own.
Other | Shapefile supporting files | Shapefiles can also include many other file formats such as .xml, .sbn, .sbx, etc., which are often used for indexes or other metadata.
.E00 | ArcInfo Coverage | A legacy file format.

Raster Data
.tif, .tfw | TIFF and TIFF World File, GeoTIFF | Image data (often uncompressed) which, when associated in a directory with a .tfw (“world file”) of the same file basename, is a georeferenced image. A GeoTIFF stores the georeferencing inside the .tif file itself.
.jp2, .jpw, .jpx | JPEG 2000 | Compressed image data, often georeferenced. Sometimes includes an associated .jpw (“world file”) of the same basename (see .tfw for info). .jpx can contain additional metadata.
.jpg | JPEG | Lossy compressed image data which, when associated in a directory with a .jpw (“world file”) of the same file basename, is a GeoJPEG, a georeferenced image.
.png | Ping, Portable Network Graphic | Non-georeferenced, lossless compressed image data which, when associated in a directory with a .pgw (“world file”) of the same file basename, is a georeferenced image.

ArcInfo ASCII Grid: Directory-based
.asc | ASCII grid | A grid file in plaintext (ASCII) format.
.bnd | Grid Boundary file | The extent/boundary metadata for the grid.
.hdr | Grid Header file | Grid metadata such as cell size.
.sta | Grid Statistics file | Grid cell statistics.
.vat | Grid attribute table | The equivalent of a grid attribute table, for integer grids only. Stores attributes based on zones.

Tabular
.dbf | dBASE data file | Is often associated with a shapefile (see above), but is sometimes standalone tabular data which can be opened in older versions of Excel or Access.
.csv | comma delimited file | Plaintext tabular data delimited by commas. Can sometimes be directly imported (e.g., QGIS, ArcGIS).
.xls | Excel file | MS Excel data/spreadsheet format. Can be directly read into some programs (e.g., ArcGIS) but is sometimes buggy.
.txt | text file | Plaintext, often delimited with tabs, commas, pipes, or semicolons. Can sometimes be directly imported, but will often need to be translated to a native tabular format for the software being used (e.g., dbf).

GIS Program Files: These do not contain data, only configuration information.
.qgs | QGIS project | Similar to an ArcGIS .mxd. Saved workspace configuration (a “map”) including pointers to layer data, symbology, data frame properties, and layout properties, as well as other miscellaneous settings.

ArcGIS
.mxd | Map Document | Saved workspace configuration (a “map”) including pointers to layer data, symbology, data frame properties, and layout properties, as well as other miscellaneous settings.
.mxt | Map Template | Contains settings that would normally be made to a map document, but for reuse in new map documents.
.lyr | Layer File | Symbology settings for a data layer. If the name is the same as a related data source in the same directory, the symbology will automatically display.
.loc | Locator File | Used for geocoding (addresses).
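
Since a shapefile is a collection of files sharing one base name, a quick check that the pieces are all present can prevent problems when copying or sharing data. The following is a minimal sketch using only the Python standard library. The function name `shapefile_components` and the layer name used in the usage note are hypothetical; treating .shp, .shx, and .dbf as the mandatory parts follows the shapefile specification, while .prj is optional as noted in the table.

```python
from pathlib import Path

# The three parts every shapefile needs; .prj is optional but recommended
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def shapefile_components(directory, base_name):
    """Report which component files exist for one shapefile base name."""
    found = sorted(
        path.suffix.lower()
        for path in Path(directory).iterdir()
        if path.is_file() and path.stem == base_name
    )
    return {
        "found": found,
        "missing_required": [ext for ext in REQUIRED_EXTENSIONS if ext not in found],
        "has_projection": ".prj" in found,
    }
```

For example, `shapefile_components("data", "roads")` would look for roads.shp, roads.shx, roads.dbf, and roads.prj inside a folder named data, and list any required part that is missing.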

Data Conversion: Data conversion is a critical process in migrating information from
existing databases to new ones, which often requires changes in data formats. It refers to
the alteration and transfer of data between different systems when those systems are
replaced or updated.

It is also of great importance in the insurance sector. Companies can make use of different
strategies for converting data to ensure that the data is compatible with their systems.

Proper data conversion should ensure the following:

➢ Data is converted into an appropriate format and is transferred correctly.
➢ Data works in the new destination database.
➢ Data retains its quality, and data consistency is maintained at all times across all
the systems.
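
The checks listed above can be made concrete with a small example. The sketch below converts tabular point records (CSV) into GeoJSON features and then verifies that no records were lost and that coordinates and attributes survived unchanged. It is only an illustration under stated assumptions: the field names `lon` and `lat`, the function names, and the sample records are all hypothetical.

```python
import csv
import io


def read_rows(csv_text):
    """Read CSV text into a list of dictionaries, one per record."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def rows_to_geojson(rows, x_field="lon", y_field="lat"):
    """Convert point records to a GeoJSON FeatureCollection (lon, lat order)."""
    features = []
    for row in rows:
        properties = {k: v for k, v in row.items() if k not in (x_field, y_field)}
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [float(row[x_field]), float(row[y_field])],
            },
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


def conversion_problems(rows, collection, x_field="lon", y_field="lat"):
    """Return a list of differences between the source rows and the output."""
    problems = []
    features = collection["features"]
    if len(features) != len(rows):
        problems.append(f"record count changed: {len(rows)} -> {len(features)}")
    for index, (row, feature) in enumerate(zip(rows, features)):
        x, y = feature["geometry"]["coordinates"]
        if (x, y) != (float(row[x_field]), float(row[y_field])):
            problems.append(f"row {index}: coordinates differ")
        for key, value in feature["properties"].items():
            if row.get(key) != value:
                problems.append(f"row {index}: attribute {key!r} differs")
    return problems


# Hypothetical sample records
SAMPLE = "name,lon,lat\nWell 1,83.98,28.21\nWell 2,84.01,28.25\n"
rows = read_rows(SAMPLE)
collection = rows_to_geojson(rows)
print(conversion_problems(rows, collection))  # []
```

An empty list means the converted data passed all three checks; any entry in the list points to a record that needs attention before the new dataset is used.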

GIS Consortium has the capability to convert hard copy into a wide range of electronic
formats. GISC’s team of experienced and multi-skilled specialists has extensive knowledge
of converting various types of geospatial data. It can deal with present-day challenges
such as complexity of data, project timelines and effect on the quality and accessibility of
the data, ensuring a smooth and successful data conversion.

3.5. Spatial database design with the concepts of geo-database.

GIS design involves organizing geographic information into a series of data themes (layers)
that can be integrated using geographic location. So it makes sense that geodatabase
design begins by identifying the data themes to be used, then specifying the contents and
representations of each thematic layer. This involves defining:

➢ How the geographic features are to be represented for each theme (for example,
as points, lines, polygons, or rasters) along with their tabular attributes.
➢ How the data will be organized into datasets, such as feature classes, attributes,
raster datasets, and so forth.
➢ What additional spatial and database elements will be needed for integrity rules,
for implementing rich GIS behavior (such as topologies, networks, and raster
catalogs), and defining spatial and attribute relationships between datasets.

That is why organizations require an enterprise GIS geodatabase. Infographic offers the
following geodatabase specializations:

➢ Geospatial data modeling and geodatabase design.
➢ Spatial data modeling, spatial and topology business rules.
➢ Network data models.
➢ Geospatial data migration, validation, auditing.
➢ Metadata development.
➢ GIS data maintenance workflow design.
➢ Geodatabase versioning plans.

Geodatabase design is based on a common set of fundamental GIS design steps, so it's
important to have a basic understanding of these GIS design goals and methods.

Representation
Each GIS database design begins with a decision as to what the geographic
representations will be for each dataset. Individual geographic entities can be
represented as:
➢ Feature classes (sets of points, lines, and polygons)
➢ Imagery and rasters
➢ Continuous surfaces that can be represented using features (such as
contours), rasters (digital elevation models [DEM]), or triangulated irregular
networks (TINs) using terrain datasets
➢ Attribute tables for descriptive data

Data themes

Geographic representations are organized in a series of data themes (sometimes referred
to as thematic layers). A key concept in GIS is that of data layers, or themes. A data
theme is a collection of common geographic elements such as a road network, a
collection of parcel boundaries, soil types, an elevation surface, satellite imagery for a
certain date, well locations, and so on.

The concept of a thematic layer was one of the early notions in GIS. Practitioners thought
about how the geographic information in maps could be partitioned into logical information
layers as more than a random collection of individual objects (such as a road, a bridge, a
hill, a house, a peninsula). These early GIS users organized information in thematic layers
that described the distribution of a phenomenon and how it should be portrayed across a
geographic extent. These layers also provided a protocol (capture rules) for collecting the
representations (as feature sets, raster layers, attribute tables, and so on).

In GIS, thematic layers are one of the main organizing principles for GIS database design.

Each GIS will contain multiple themes for a common geographic area. The collection of
themes acts as layers in a stack. Each theme can be managed as an information set
independent of other themes. Each has its own representations (points, lines, polygons,
surfaces, rasters, and so on). Because the various independent themes are spatially
referenced, they overlay one another and can be combined in a common map display.
In addition, GIS analysis operations, such as overlay, can fuse information between themes.
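
To make the idea of overlay concrete, the sketch below combines two raster themes that cover the same area with the same cell size, cell by cell. It is a simplified illustration, not the overlay tool of any GIS package: the function name `overlay`, the theme names, and the cell values are hypothetical, and the grids are plain Python lists.

```python
def overlay(theme_a, theme_b, combine):
    """Combine two aligned raster themes cell by cell using `combine`."""
    same_shape = len(theme_a) == len(theme_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(theme_a, theme_b)
    )
    if not same_shape:
        raise ValueError("themes must have the same number of rows and columns")
    return [
        [combine(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(theme_a, theme_b)
    ]


# Hypothetical themes: soil type codes and slope in degrees
soil = [[1, 1],
        [2, 2]]
slope = [[5, 20],
         [5, 20]]

# A cell is suitable where the soil code is 1 and the slope is below 10 degrees
suitable = overlay(soil, slope, lambda s, g: s == 1 and g < 10)
print(suitable)  # [[True, False], [False, False]]
```

The overlay works only because both themes are spatially referenced to the same grid, which is exactly the property of thematic layers described above.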

Eleven steps to geodatabase design


1. Identify the information products that you will create and manage with your GIS. Your
GIS database design should reflect the work of your organization. Consider compiling and
maintaining an inventory of map products, analytic models, Web mapping applications, data
flows, database reports, key responsibilities, 3D views, and other mission-based
requirements for your organization. List the data sources you currently use in this work. Use
these to drive your data design needs. Define the essential 2D and 3D digital base maps
for your applications. Identify the set of map scales that will appear in each base map as
you pan, zoom, and explore its contents.
2. Identify the key data themes based on your information requirements. Define more
completely some of the key aspects of each data theme. Determine how each dataset will
be used—for editing, GIS modeling and analysis, representing your business workflows,
and mapping and 3D display. Specify the map use, data sources, and spatial
representations for each specified map scale; data accuracy and collection guidelines for
each map view and 3D view; and how the theme is displayed—its symbology, text labels,
and annotation. Consider how each map layer will be displayed in an integrated fashion
with other key layers. For modeling and analysis, consider how information will be used
with other datasets (for example, how they are combined and integrated). This will help you
identify some key spatial relationships and data integrity rules. Ensure that these 2D and
3D map display and analysis properties are considered part of your database design.
3. Specify the scale ranges and the spatial representations of each data theme at each
scale. Data is compiled for use at a specific range of map scales. Associate your
geographic representation for each map scale. Geographic representation will often change
between map scales (for example, from polygon to line or point). In many cases, you may
need to generalize the feature representations for use at smaller scales. Rasters can be
resampled using image pyramids. In other situations, you may need to collect alternative
representations for different map scales.
4. Decompose each representation into one or more geographic datasets. Discrete
features are modeled as feature classes of points, lines, and polygons. You can consider
advanced data types such as topologies, networks, and terrains to model the relationships
between elements in a layer as well as across datasets. For raster datasets, mosaics and
catalog collections are options for managing very large collections. Surfaces can be
modeled using features, such as contours, as well as using rasters and terrains.
5. Define the tabular database structure and behavior for descriptive attributes. Identify
attribute fields and column types. Tables also might include attribute domains, relationships,
and subtypes. Define any valid values, attribute ranges, and classifications (for use as
domains). Use subtypes to control behaviors. Identify tabular relationships and associations
for relationship classes.
6. Define the spatial behavior, spatial relationships, and integrity rules for your
datasets. For features, you can add spatial behavior and capabilities and also characterize
the spatial relationships inherent in your related features for a number of purposes using
topologies, address locators, networks, terrains, and so on. For example, use topologies to
model the spatial relationships of shared geometry and enforce integrity rules. Use address
locators to support geocoding. Use networks for tracing and pathfinding. For rasters, you
can decide if you need a raster dataset or raster catalog.
7. Propose a geodatabase design. Define the set of geodatabase elements you want in your
design for each data theme. Study existing designs for ideas and approaches that work.
Copy patterns and best practices from the ArcGIS data models.
8. Design editing workflows and map display properties. Define the editing procedures
and integrity rules (for example, all streets are split where they intersect other streets, and
street segments connect at endpoints). Design editing workflows that help you meet these
integrity rules for your data. Define display properties for maps and 3D views. Determine
the map display properties for each map scale. These will be used to define map layers.
9. Assign responsibilities for building and maintaining each data layer. Determine who
will be assigned the data maintenance work within your organization or assigned to other
organizations. Understanding these roles is important. You will need to design how data
conversion and transformation is used to import and export data across various partner
organizations.
10. Build a working prototype. Review and refine your design. Test your prototype design.
Build a sample geodatabase copy of your proposed design using a file, personal, or
enterprise geodatabase. Build maps, run key applications, and perform editing operations
to test the design's utility. Based on your prototype test results, revise and refine your
design. Once you have a working schema, load a larger set of data (such as loading it into
an enterprise geodatabase) to check out production, performance, scalability, and data
management workflows. This is an important step. Settle on your design before you begin
to populate your geodatabase.
11. Document your geodatabase design. Various methods can be used to describe your
database design and decisions. Use drawings, map layer examples, schema diagrams,
simple reports, and metadata documents. Some users like working with UML. However,
UML is not sufficient on its own. UML cannot represent all the geographic properties and
decisions to be made. Also, UML does not convey the key GIS design concepts such as
thematic organization, topology rules, and network connectivity. UML provides no spatial
insight into your design. Many users create a graphic representation of their geodatabase
schema with Visio, such as those published with the ArcGIS data models. Esri provides a
tool that can help you capture these kinds of graphics of your data model elements using
Visio.
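
The attribute domains mentioned in step 5 (valid values and attribute ranges) can be illustrated with a short sketch. This is not how a geodatabase implements domains internally; it only mimics the idea in plain Python. The class names `RangeDomain` and `CodedValueDomain`, the function `validate_row`, and the road attributes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RangeDomain:
    """Allows numeric values between minimum and maximum, inclusive."""
    minimum: float
    maximum: float

    def allows(self, value):
        return self.minimum <= value <= self.maximum


@dataclass
class CodedValueDomain:
    """Allows only values that appear as keys in the code table."""
    codes: dict

    def allows(self, value):
        return value in self.codes


def validate_row(row, domains):
    """Return one message per field that is missing or outside its domain."""
    errors = []
    for field, domain in domains.items():
        if field not in row:
            errors.append(f"{field}: missing value")
        elif not domain.allows(row[field]):
            errors.append(f"{field}: {row[field]!r} is outside the domain")
    return errors


# Hypothetical domains for a road feature class
road_domains = {
    "lanes": RangeDomain(1, 8),
    "surface": CodedValueDomain({"P": "Paved", "G": "Gravel"}),
}

print(validate_row({"lanes": 2, "surface": "P"}, road_domains))   # []
print(validate_row({"lanes": 12, "surface": "X"}, road_domains))
```

Defining the domains once and checking every edited row against them is one simple way to enforce the integrity rules that steps 5 and 8 ask you to specify.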

The 11 steps presented above outline a general GIS database design process. The initial
design steps 1 through 3 help you identify and characterize each thematic layer. In steps
4 through 7, you begin to develop representation specifications, relationships, and
ultimately, geodatabase elements and their properties. In steps 8 and 9, you will define
the data capture procedures and assign data collection responsibilities. In the final stage
(steps 10 and 11), you will test and refine your design through a series of initial
implementations. In this final phase, you will also document your design.
