GIS Database Creation and Design

KADUNA POLYTECHNIC
COLLEGE OF ENVIRONMENTAL STUDIES

DEPARTMENT OF CARTOGRAPHY AND GIS
LECTURE NOTE FOR NDII
ON
GIS DATABASE DESIGN AND CREATION (CAG 103)
BY
A.A USMAN
2021
Introduction
The aim of this course is to expose students to database creation and design, and to the
management of the database after creation. The course is divided into five (5) chapters in
order to make it easier for the students. A mobile GPS app and ArcGIS software will be used
as the practical tools for this course.
The first chapter explains what a database is, together with its structure, components, and
classifications. Definitions of some important terms and the advantages of the database
system will also be discussed.
Chapter two explains what a data layer and a data file are. The chapter focuses on the
principles and procedures for data capture and the creation of a data file. We will look at
different data layer and file types, and at the principles of referencing common features.
The procedure for linking a data layer and a data file will also be taught. Lastly in this
chapter, we are going to create a data file for different layers.
Chapter three details how to capture GIS data and explains the principles and procedures of
data capture using different methods. This chapter is mostly practical, as we will be
visiting the field and downloading data from different GIS sources (to capture both primary
and secondary data).
Chapter four looks at the two data types, spatial and non-spatial data, and at their
storage. We are going to describe both data types and their characteristics, which will
help in differentiating them. We will also capture both kinds of data and create a database.
Here the students will be grouped and a mini project will be assigned to each group, to see
how familiar they are with the creation of a database.
Chapter five briefly explains the basic operations on a geographic database. From there
we are going to display the data captured by each group and carry out some basic analysis,
so that the students can see how to derive information from the raw data. Lastly
we will look at some ways of requesting information from the database (queries).
CHAPTER ONE
Introduction
Data can be the facts related to any object under consideration. For example, your name, age,
height, weight, etc. are some data related to you. A picture, image, file, PDF, etc. can also
be considered data. So the term data can be defined as known facts that can be
recorded and stored on computer media. It is also defined as the raw facts from which the
required information is produced.
Data and information are closely related and the terms are often used interchangeably.
Information is simply refined data: data that have been processed, organized, or summarized.
According to Burch et al., information is data that have been put into a meaningful and
useful context and communicated to a recipient who uses it to make decisions.
What is a Database?
A database is a systematic collection of data. Databases support the electronic storage and
manipulation of data and make data management easy. A database system
simplifies the tasks of managing the data and extracting useful information in a timely
fashion. A database system is an integrated collection of related files along with the
details of the interpretation of the data. A database is organized into fields, records, and files.
i. Fields: a field is the smallest unit of data that has meaning to its users and is
also called a data item or data element. Name, address, and telephone number
are examples of fields. Each field is represented in the database by a value.
ii. Records: a record is a collection of logically related fields, where each field occupies
a fixed number of bytes and has a fixed data type. A record is a complete set of
fields, and each field holds a value. For example, the information about a
particular phone number in a database represents a record. Records are of two
types: fixed-length records and variable-length records.
iii. Files: a file is a collection of related records. The records in a file may or may not
all be of the same size (that is, they may be fixed-length or variable-length records,
as stated under Records).
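The field, record, and file hierarchy described above can be sketched in a few lines of Python. This is only an illustration: the class name PhoneRecord and the sample names, addresses, and numbers are made up for the example and do not come from any real database.

```python
from dataclasses import dataclass, fields


@dataclass
class PhoneRecord:
    # Each attribute below is a field (data item) that holds one value.
    name: str
    address: str
    telephone: str


# A file is a collection of related records (hypothetical sample values).
phone_file = [
    PhoneRecord("Amina Bello", "12 Tudun Wada, Kaduna", "0800-000-0001"),
    PhoneRecord("John Musa", "5 Barnawa, Kaduna", "0800-000-0002"),
]

# The names of the fields that make up every record in the file.
field_names = [f.name for f in fields(PhoneRecord)]
```

Here each PhoneRecord instance plays the role of one record, and the list phone_file plays the role of the file.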
DATABASE MANAGEMENT SYSTEM (DBMS)
A DBMS is a software system or program that allows access to the data contained in a
database. The objective of the DBMS is to provide a convenient and effective method of
defining, storing, and retrieving the information in the database.
Databases and DBMSs have become essential for managing businesses, governments,
schools, banks, etc. The primary function of the database is to provide timely and reliable
information that supports the daily operations of an organization. A database is perceived
not merely as a collection of data files, but as an important asset of an organization.
Database systems comprise complex hardware and software that serve as a total
data library capable of managing conventional text-based and numerical data as well as
raster images, vector graphics, and multimedia files. More advanced systems also include
data analysis functions to support decision making.
TYPES OF DATABASE
i. Hierarchical database; this type of DBMS employs a parent-child
relationship for storing data. Its structure is like a tree, with nodes representing
records and branches representing fields.
ii. Relational database; this type of database defines database relationships in the
form of tables. It is also called a relational DBMS (RDBMS), which is the most popular
DBMS type in the market. Examples of RDBMSs include MySQL,
Oracle, and Microsoft SQL Server.
iii. Object-oriented database; this type of database supports the storage
of all data types. The data are stored in the form of objects, and each object held
in the database has attributes and methods that define what to do with the data.
PostgreSQL, an object-relational DBMS, is an example that supports object-oriented features.
iv. Centralized database; the data are stored in one central location, and users from different
backgrounds can access them. This type of database stores
application procedures that help users access the data even from a remote
location. Examples include Google, Yahoo, WhatsApp, etc.
v. Open-source database; this kind of database stores information related to
operations. It is mainly used in the fields of marketing, employee relations, and
customer service.
vi. Cloud database; a cloud database is a database that is optimized or built for
a virtualized environment. A cloud database has many advantages, one of which is that
users can pay for storage capacity and bandwidth as they need them. It also offers
scalability on demand, along with high availability.
vii. Graph database; this type of database uses graph theory to store, map, and
query relationships. It is mostly used for
analyzing interconnections. For example, an organization can use a graph
database to mine data about customers from social media.
viii. Personal database; this is used to store data on personal computers, where the data are
smaller and easily manageable. The data are mostly used by the same department
of the company and are accessed by a small group of people.
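To make the difference between the hierarchical (tree) and relational (table) types above more concrete, here is a small sketch in plain Python. It is not how a real DBMS stores data; the department "Urban Planning", the student names, and the helper students_in are hypothetical and used only for illustration.

```python
# Hierarchical view: a parent-child tree. Each parent holds its children.
hierarchical = {
    "College of Environmental Studies": {
        "Cartography and GIS": ["Student A", "Student B"],
        "Urban Planning": ["Student C"],
    }
}

# Relational view: two tables (lists of rows) linked by a department id.
departments = [(1, "Cartography and GIS"), (2, "Urban Planning")]
students = [("Student A", 1), ("Student B", 1), ("Student C", 2)]


def students_in(dept_name):
    """Look up students by matching the key shared by the two tables."""
    dept_ids = {dept_id for dept_id, name in departments if name == dept_name}
    return [student for student, dept_id in students if dept_id in dept_ids]
```

In the tree, a student is reached by walking down from the parent. In the tables, the same answer comes from matching the department id in both tables.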
DATABASE COMPONENTS
The components of a database system are computer hardware, software, data, procedures, and a
data access language.
[Figure: the five components of a database system: hardware, software, data, procedures, and data access language]
1. Hardware; the hardware consists of physical, electronic devices such as computers,
storage devices, scanners, etc. These offer the interface between the computer and
real-world systems.
2. Software; this is the set of programs used to manage and control the overall
database. It includes the database software itself, the operating system, the
network software used to share the data among users, and the application programs
for accessing data in the database.
3. Data; these are raw, unorganized facts that must be processed to make them
meaningful. Generally, data comprise facts, observations, perceptions, numbers,
characters, symbols, images, etc.
4. Procedures; these are the sets of instructions and rules that help you to use the
DBMS. They cover designing and running the database using documented methods, which
guide the users who operate and manage it.
5. Data access language; this is used to move data to and from the database. It allows
the entry of new data, the updating of existing data, and the retrieval of required data from
the DBMS. The user writes specific commands in the data access
language and submits them to the database.
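As an illustration of a data access language, the sketch below uses SQL through Python's built-in sqlite3 module to enter new data, update existing data, and retrieve data. SQL and SQLite are chosen here only as a convenient example; the table name wells and its values are hypothetical.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wells (id INTEGER PRIMARY KEY, name TEXT, depth_m REAL)")

# Entry of new data.
conn.execute("INSERT INTO wells (name, depth_m) VALUES (?, ?)", ("Well A", 35.0))
conn.execute("INSERT INTO wells (name, depth_m) VALUES (?, ?)", ("Well B", 42.5))

# Update of already existing data.
conn.execute("UPDATE wells SET depth_m = ? WHERE name = ?", (40.0, "Well A"))

# Retrieval of required data.
rows = conn.execute("SELECT name, depth_m FROM wells ORDER BY name").fetchall()
conn.close()
```

Each string passed to execute is a command written in the data access language and submitted to the database.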
CLASSIFICATION OF DATABASE SYSTEMS
Database systems can be classified in a variety of ways according to different criteria.
Conventionally, they were classified according to the data models on which they
were built. These models fell into three categories that characterize the evolution of
database systems, namely hierarchical, network, and relational. A fourth class, called
object-oriented database systems, emerged in the 1990s as a result of advances in
what is now commonly known as object-orientation technology.
Another way of classifying database systems is to use the characteristics of the data in the
database as the principal criterion. Database systems classified in this
way can be labeled either spatial or non-spatial in terms of their contents. Databases
can also be classified by the number of users supported, where the data are located,
the type of data stored, the intended data usage, and the degree to which the data are
structured.
The number of users determines whether the database is classified as single-user or
multiuser. A single-user database supports only one user at a time. A single-user database that runs
on a personal computer is called a DESKTOP DATABASE. A multiuser database
supports multiple users at the same time. When a multiuser database supports a
relatively small number of users (usually fewer than 50), such as a specific department within
an organization, it is called a WORKGROUP DATABASE. When the
database is used by the entire organization and supports many users (more than 50) across
many departments, it is known as an ENTERPRISE DATABASE.
Location might also be used to classify a database. For example, a database that
supports data located at a single site is called a CENTRALIZED DATABASE. A database
that supports data distributed across several different sites is called a DISTRIBUTED
DATABASE.
In research environments, a popular way of classifying databases is according to the type of
data stored in them. Using this criterion, databases are grouped into two categories:
general-purpose and discipline-specific databases. A general-purpose database contains
a wide variety of data used in multiple disciplines. Examples are a census database that
contains general demographic data, and the LexisNexis and ProQuest databases that
contain newspaper, magazine, and journal articles on a variety of topics.
A discipline-specific database contains data focused on a specific subject area. The data in
this type of database are used mainly for academic or research purposes within a small set
of disciplines. Examples are a Geographic Information System database that stores
geospatial and other related data, and a medical database that stores confidential medical
history data.
The most popular way of classifying databases today is based on how they will be used
and on the time sensitivity of the information gathered from them.
ADVANTAGES OF DATABASE SYSTEM
1. Controlled redundancy: In a traditional file system, each application program has
its own data, which causes duplication of common data items in more than one
file. This duplication (redundancy) requires multiple updates for a single
transaction and wastes a lot of storage space. We cannot eliminate all redundancy,
for technical reasons, but in a database this duplication can be carefully
controlled. This means that the database system is aware of the redundancy and
assumes the responsibility for propagating updates.
2. Improved data sharing: The DBMS helps create an environment in which end
users have better access to more and better-managed data. Such access makes it
possible for end users to respond quickly to changes in their environment.
3. Improved data security: The more users access the data, the greater the risk of
data security breaches. Corporations invest considerable amounts of time, effort,
and money to ensure that corporate data are used properly. A DBMS provides a
framework for better enforcement of data privacy and security policies.
4. Better data integration: Wider access to well-managed data promotes an integrated
view of the organization’s operations and a clearer view of the big picture. It
becomes much easier to see how actions in one segment of the organization affect
other segments.
5. Minimized data inconsistency: Data inconsistency exists when different versions
of the same data appear in different places. For example, data inconsistency exists
when the Central Admin record for an NDII student in the Cartography & GIS department
(identified by name and registration number) shows that the student has not paid the
school fee, while the same database as shared by CES shows that the same student has
paid. The probability of such data inconsistency is greatly reduced in a
properly designed database.
6. Improved data access: The DBMS makes it possible to produce quick answers to
ad hoc queries. A query is a specific request issued to the DBMS for data
manipulation. For example, when dealing with a huge amount of data (student
data), end users might want quick answers to questions like:
a. What are the percentages of male and female students in Kadpoly?
b. What are the names and departments of the students who did not pay their school fee?
7. Improved decision making: Better-managed data and improved data access make
it possible to generate better-quality information, on which better decisions are
based. A DBMS does not guarantee data quality, but it provides a framework to facilitate
data quality initiatives.
8. Increased end-user productivity: The availability of data, combined with the tools
that transform data into usable information, empowers end users to make quick,
informed decisions that can make the difference between success and failure, for
example in the global economy.
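The two ad hoc questions listed under "Improved data access" can be written as queries. The sketch below uses SQL through Python's built-in sqlite3 module purely as an example; the students table, its column names, and the five sample rows are invented for illustration and are not real Kadpoly data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (name TEXT, department TEXT, sex TEXT, fee_paid INTEGER)"
)
# Hypothetical sample rows: fee_paid is 1 for paid and 0 for not paid.
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?)",
    [
        ("Student A", "Cartography and GIS", "F", 1),
        ("Student B", "Cartography and GIS", "M", 0),
        ("Student C", "Urban Planning", "M", 1),
        ("Student D", "Urban Planning", "F", 0),
        ("Student E", "Cartography and GIS", "M", 1),
    ],
)

# Question (a): percentage of male and female students.
percentages = dict(
    conn.execute(
        "SELECT sex, 100.0 * COUNT(*) / (SELECT COUNT(*) FROM students) "
        "FROM students GROUP BY sex"
    ).fetchall()
)

# Question (b): names and departments of students who have not paid.
unpaid = conn.execute(
    "SELECT name, department FROM students WHERE fee_paid = 0 ORDER BY name"
).fetchall()
conn.close()
```

Because the fee status is stored once in one table, every user who runs query (b) sees the same answer, which is also the point made under "Minimized data inconsistency".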
CHAPTER TWO
Introduction
In this chapter we are going to discuss the data layer and the data file, together with their
types. We will look at the principles and procedures for data capture and the creation of a
data file. Finally, we will link a data layer and a data file together by displaying them in
GIS software.
What is a Layer?
Layers are the mechanism used to display geographic datasets. Each layer references a
dataset and specifies how that dataset is portrayed using symbols and text labels. When
you add a layer to a map, you specify its dataset and set its map symbols and labeling
properties. A dataset is a collection of homogeneous features. Geographic representations
are organized in a series of datasets or layers. Most datasets are collections of simple
geographic elements such as a road network, a collection of parcel boundaries, soil types,
an elevation surface, satellite imagery for a certain date, well locations, etc.
Layer File (.lyr): This is a file that stores the path to a source dataset and other layer
properties, including symbology. In comparison to a shapefile, a layer file is just a
link or reference to actual data, such as a shapefile, feature class, etc. It is not actual data
because it does not store the data's attributes or geometry. A layer file primarily stores the
symbology for a feature and other layer properties related to what is seen when the data are
viewed in a GIS application.
Data File: This is any file that contains information, but not code. It is only meant to be
read or viewed and not executed. For example, a web page, a letter you write in a
word processor, and a text file are all considered data files. Programs may also rely on data
files to get information. For instance, a data file may contain the settings of a program that
tell the program how to display information. In other words, a data file is a computer
file that stores data, including input and output data, to be used by a computer application
or system. Examples are .txt for text, .xls for Excel, .img for image, and .shp for shapefile.
In the GIS world, you will encounter many different GIS file formats. Some file formats
are unique to specific GIS applications; others are universal. For example, the shapefile is used for
vector data, and the image (.img) and GeoTIFF formats for raster data, while the file geodatabase
can be used for both vector and raster data.
A shapefile is a file-based data format that stores a feature class: a collection of
features that have the same geometry type (point, line, or polygon), the same attributes, and a
common spatial extent. A shapefile is actually composed of at least three files and as many
as eight. Each file that makes up a shapefile has a common file name but a different
extension.
File extension    Content
.dbf              Attribute information
.shp              Feature geometry
.shx              Feature geometry index
.aih              Attribute index
.ain              Attribute index
.prj              Coordinate system information
.sbn              Spatial index file
.sbx              Spatial index file
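Because a shapefile is really a group of files sharing one name, it is useful to check that none of them is missing before copying or sharing the data. The sketch below is one possible way to do this with Python's standard library. It assumes that the three mandatory files are .shp, .shx, and .dbf (as in the published shapefile specification); the function name shapefile_parts is hypothetical.

```python
from pathlib import Path

# Assumed mandatory parts of a shapefile, plus the optional ones from the table.
REQUIRED = (".shp", ".shx", ".dbf")
OPTIONAL = (".prj", ".sbn", ".sbx", ".ain", ".aih")


def shapefile_parts(shp_path):
    """Return (present, missing_required) extensions for the given .shp path."""
    shp_path = Path(shp_path)
    # Swap only the last extension so that the common file name is preserved.
    present = [
        ext for ext in REQUIRED + OPTIONAL if shp_path.with_suffix(ext).exists()
    ]
    missing = [ext for ext in REQUIRED if ext not in present]
    return present, missing
```

For example, calling shapefile_parts("roads.shp") in a folder that holds only roads.shp and roads.dbf would report .shx as missing.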
Image file format; this format was originally created by an image processing software company
called ERDAS. It consists of a single .img file and is a simpler file format
than the shapefile. It is sometimes accompanied by an .xml file, which usually stores
metadata about the raster layer.
GeoTIFF format; this is a popular public domain raster data format. It has the extension .tif or .tiff,
and offers maximum portability and platform independence, which is very important.
File Geodatabase; this can store both vector and raster files. It has the benefit of defining image
mosaic structures, thus allowing the user to create a “stitched” image from multiple image
files stored in the geodatabase. The file geodatabase is a relational database storage format
and consists of a .gdb folder housing dozens of files.
GEOGRAPHIC REPRESENTATION
In GIS, spatial data collections are typically organized as feature class datasets or
raster-based datasets. Raster datasets are used to represent georeferenced imagery as well as
continuous surfaces such as elevation, slope, aspect, etc. Vector feature classes
represent geographic features in the form of points, lines, and polygons.
Theme                        Geographic representation
Hydrography                  Lines
Road centerlines             Lines
Vegetation                   Polygons
Urban areas                  Polygons
Administrative boundaries    Polygons
Elevation contours           Lines
Well locations               Points
Orthophotography             Rasters
Satellite imagery            Rasters
Land parcels                 Polygons
Parcel tax records           Tables
CREATION OF DATA FILE/LAYER FILE IN ARCGIS

This part is practical: the students will be taken to the GIS lab, or will come along
with their personal computers, for the practical. They will learn how to create a vector data
file and a raster data file and also display them as layers.
CHAPTER THREE
DATA CAPTURE
There are three distinct phases in the data input process. First comes database design,
where you identify and conceptually code all the needed features and attributes. Second
comes data acquisition, which involves obtaining the needed data from
various agencies, storehouses, organizations, etc. and getting them into a format that your GIS
program reads. Finally comes data capture, where you digitize hard-copy maps and data
directly into your GIS and transform existing digital data into a format your GIS reads.
PHASE 1: GIS DATABASE DESIGN
In designing your database, there are certain questions you need to ask yourself at the
start. These questions are:
What is your goal or research question?
How should you proceed?
You need to define your objective at the very beginning. Having a well-defined research
question, goal, or even multiple goals is the key to a successful GIS project because it
guides the project’s input, analysis, and output stages. Spend time and thought on the
design of your GIS because good planning results in successful projects.
Start by thinking about the people, land, and the issues in your study. This has a direct
bearing on what datasets (features and attributes) are needed. Next, think about how you
will analyze the data. This could affect your choice of GIS software and your data model
(vector or raster). All components of GIS project need to be planned, as said before. You
need to know what software and hardware you will use and what procedures and people
will guide your operation.
KEY QUESTIONS TO ASK YOURSELF IN PLANNING A GIS DATABASE
1. Determine Your Features
What features are necessary? Think back to your project’s goals. For example, you want
to analyze a particular species distribution. It may be necessary to have a feature devoted
to the specific plant type. Equally important, however, are the other features—nearby
plant species, soil types, climate conditions, land tenure practices, and landform
conditions like slope and aspect. These other features, along with many others, play a role
in the distribution of your plant. If you are developing a GIS database for a city’s
planning department, you will want layers for many features including streets, parcels,
parks, water, sewer, electricity, and buildings.
2. Determine the Project’s Spatial Extent, Scale, and Temporal Extent
You must determine the area and the period on which your project focuses. Sometimes it
is obvious. Along with the project’s spatial extent, you should think about an appropriate
scale. Small-scale maps depict large territories, but they usually are less precise and may
require that some reference layers be left out. Large-scale maps show smaller areas but
include comparatively more detail. Similarly, you may want to define a
temporal extent for your study. Is time an important variable in your study? Most GIS projects focus on
the contemporary scene and ignore the past. If, however, you want to determine how
much an area has changed, you need to define a period for your project. Determining
the temporal extent also helps you determine the attributes your project needs.
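The effect of scale can be checked with simple arithmetic: at a scale of 1:N, one unit on the map stands for N of the same units on the ground. The short sketch below shows this; the function name and the two example scales are chosen only for illustration.

```python
def ground_distance_m(map_distance_cm, scale_denominator):
    """Convert a distance measured on the map (cm) to a ground distance (m)."""
    # 1 cm on a 1:N map is N cm on the ground, and 100 cm make 1 m.
    return map_distance_cm * scale_denominator / 100.0


# A large-scale map (1:10,000) and a small-scale map (1:1,000,000).
large_scale = ground_distance_m(1, 10_000)
small_scale = ground_distance_m(1, 1_000_000)
```

One centimetre covers 100 m on the 1:10,000 map but 10 km on the 1:1,000,000 map, which is why the small-scale map shows a larger territory with less detail.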
3. Determine the Attributes for Each Feature Type
Attributes are the characteristics of features. You need to identify the required attributes
for each feature type. The more you can do this before you collect your data, the less you
will retrace your steps and collect additional attributes later. You cannot use some
analytical processes (like many statistical tests) if the attribute values that you collect are
in a form that is unsuited to that process. One other thing to
consider at this point is that some attributes (like a polygon’s area, a line’s length, and
even the number of point features falling within polygon features) can be generated
automatically by the GIS software. Additional attributes can be created by multiplying,
dividing, adding, and subtracting, truncating, or concatenating attributes with other
attributes, numbers, or characters.
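The attributes that GIS software generates automatically, such as a line's length and a polygon's area, come from the coordinates of the features. A minimal sketch of both calculations is given below. It assumes planar (projected) coordinates in metres, and uses the shoelace formula for area; GIS packages may compute these differently, and the coordinates shown are made up.

```python
import math


def line_length(coords):
    """Sum of the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(ring):
    """Area of a simple polygon from its vertex list (shoelace formula)."""
    total = 0.0
    # Pair every vertex with the next one, wrapping round to the first.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical features: a road with two segments and a 100 m by 50 m parcel.
road_length = line_length([(0, 0), (3, 4), (3, 10)])
parcel_area = polygon_area([(0, 0), (100, 0), (100, 50), (0, 50)])
```

A further attribute could then be derived by arithmetic on existing ones, for example dividing a parcel's population by parcel_area to obtain a density.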
4. Determine How the Features and Their Attributes should be Coded
Once you have decided on the features and their attributes, determine how they will be
coded in the GIS database. Decide whether to code each feature type as a point, line, or
polygon. Then define the format and storage requirements for each of the feature’s
attributes. For instance, is the attribute going to be in characters (string) or numbers? If
it is going to be numbers, are they byte, integer, or real numbers? You will have to
establish these database parameters before you enter data into the GIS.
It is critical that you think about the values of your attributes before you code. For
example, if one street segment needs room for 9 digits to report its length, then a field
width of 8 is not enough, and the correct value cannot be entered without modifying the field’s
length. Also, while thinking about your attribute values, consider where each attribute fits on the
“levels of measurement” scale with its four different levels: nominal, ordinal,
interval, and ratio.
Nominal; data use characters or numbers to establish identity or categories within a
series. They do not suggest a rank order or relative value. Nominal data are usually coded
as character (string) data in a GIS database.
Ordinal; datasets establish rank order and are measured on an ordinal scale. The
ranks ‘high’, ‘medium’, and ‘low’, or ‘first’, ‘second’, and ‘last’, are ordinal. So
while we know the rank order, we do not know the interval between ranks. Usually both numeric and
character ordinal data are coded with characters because ordinal data cannot be added,
subtracted, multiplied, or divided in a meaningful way.
Interval; this scale pertains only to numbers, and character data are not used. Interval values
show differences (for example in time) and can be added and subtracted in a meaningful way.
Unlike ratio data, however, interval data do not have a true zero as a starting point.
Ratio; this scale is similar to interval. The difference is that ratio values have an absolute or natural
zero point, so values on this scale are measured with reference to that absolute zero.
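The coding decisions discussed in this step (storage type, field width, and level of measurement) can be written down as a simple field definition before any data are entered. The sketch below is one hypothetical way to do so in Python; the class FieldDef, the table LEVELS, and the field name LENGTH_M are assumptions made for the example, not part of any GIS package.

```python
from dataclasses import dataclass

# Assumed summary of the four levels: how each is usually stored, and
# whether arithmetic on the values is treated as meaningful.
LEVELS = {
    "nominal": {"storage": "string", "arithmetic": False},
    "ordinal": {"storage": "string", "arithmetic": False},
    "interval": {"storage": "number", "arithmetic": True},
    "ratio": {"storage": "number", "arithmetic": True},
}


@dataclass
class FieldDef:
    name: str
    level: str  # one of the keys of LEVELS
    width: int  # maximum number of characters or digits the field can hold

    def storage(self):
        return LEVELS[self.level]["storage"]

    def fits(self, value):
        """True if the value can be entered without widening the field."""
        return len(str(value)) <= self.width


# The street-length example: a 9-digit value does not fit a width of 8.
too_narrow = FieldDef("LENGTH_M", "ratio", 8).fits(123456789)
wide_enough = FieldDef("LENGTH_M", "ratio", 9).fits(123456789)
```

Checking values against such a definition before data entry shows early whether a field has to be widened or its type changed.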
5. Determine the Base Map Reference Features

What features are helpful to include? Add reference features that help people orient
themselves within your study area even if you are not going to analyze these features.
Major roads, rivers, and principal buildings are good examples of features that help orient
map readers. In short, having these base-map features may not be important for analysis,
but they are important for clarity.
6. Determine your Project’s Projection, Coordinate System, and Datum
Before you collect or look for data, you should decide on which projection, coordinate
system, and datum to use. These three items are collectively termed “projection parameters”,
and it is important that they remain consistent throughout your layers. Consistency
enables you to properly overlay your feature layers to produce maps and analyze feature
relationships.
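A consistency check of this kind is easy to automate once the coordinate system of each layer is recorded. The sketch below compares each layer against the project's chosen coordinate system, identified here by EPSG codes. The layer names are hypothetical, and the choice of EPSG:32632 (WGS 84 / UTM zone 32N) as the project system is only an assumption for the example.

```python
def inconsistent_layers(layers, project_crs):
    """Return the names of layers whose CRS differs from the project CRS."""
    return sorted(name for name, crs in layers.items() if crs != project_crs)


# Hypothetical project: two layers in UTM zone 32N, one still in geographic WGS 84.
layers = {
    "roads": "EPSG:32632",
    "parcels": "EPSG:32632",
    "wells": "EPSG:4326",
}
to_reproject = inconsistent_layers(layers, "EPSG:32632")
```

Any layer reported by the check would need to be reprojected before it is overlaid with the others.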
PHASE 2: GIS DATA ACQUISITION
In the data acquisition phase, you obtain the data for your GIS. Getting all the data
together (and in a suitable format) is the most costly and time-consuming task for any
GIS project. Most estimates suggest that between 75 and 80 percent of your time is spent
collecting, entering, cleaning, and converting data. There are four methods of acquiring
data;
1. Collecting new data
2. Converting/transforming legacy data
3. Sharing/exchanging data
4. Purchasing data
1. Collecting new data: this is a technique in which the information on various map
attributes, facilities, assets, and organizational data is digitized and organized on a target
GIS system in appropriate layers. New data are usually derived from experiments or from field
work. Here the data are collected first hand by the researcher or group of researchers, or
tested in the lab to assess or produce the data. Data collection is an area where cost-saving
mechanisms are needed. For instance, Global Positioning Systems and mobile units are
now being used to take field data and enter them directly from the source. Before data are
initially collected, strict controls must be in place. All of the analysis,
definitions, and standards need to be in place prior to any field information collection.
While this may seem obvious, it is not always practiced. Good planning will reduce this
heavy budget item. Data must be reviewed and updated on a regular schedule to maintain
a high standard of quality.
2. Converting/transforming legacy data; Data conversion is the process of moving data
from one form into another, whether from one data model to another or from one
data format to another. With data formats, you are moving data from another format
altogether, such as shapefiles, coverages, or Vector Product Format (VPF) sources, into a
geodatabase, that is, converting data from one GIS format to another. GIS data obtained
from the Internet or from other sources often require extensive preprocessing to put them
into a usable format.
3. Sharing/exchanging data; Data Sharing Agreements need to include provisions
concerning access and dissemination. It is not wise to enter into a data sharing agreement
where privacy information may be disclosed to non-Federal organizations since they are
not subject to the Privacy Act. When thinking about storing and sharing digital geospatial
data for the long-term, we need to think about how to ensure data remain usable in the
future. Data formats that are popular and easy to read at one point in time may later be
rendered unreadable by changes in software and updates to the format definitions. Data
can also come in a proprietary format that can only be opened by particular software. The
Internet is a great place to start looking for data. If you find existing GIS datasets that
serve your purpose and meet your specifications, it saves you time and money. A search
may reveal multiple copies of what seems to be the same data, but check the details—
examine the metadata—because minor differences might make one dataset better than the
other. Much base map data (countries, states, counties, major roads, rivers, township and
range) exists on the Internet. It would be convenient to retrieve all of your GIS datasets
from the Internet, and although more and more data are available, the Internet will not
provide you with everything you need.
4. Purchasing data; Purchase Agreements: Data purchases require a Purchasing
Agreement. By purchasing data, you are endorsing the data. Such data then becomes
subject to the Information Quality Act, which covers all data, not just geospatial data.
Many data companies modify “public” data to create a “value-added” product that you
can purchase and load directly into your GIS. “Value-added” datasets usually originated
from a government agency or an organization that creates the basic GIS dataset, but a
commercial company obtains the data and “improves” it by adding attributes or
improving its spatial precision. The commercial company can then sell the “value-added”
portion of the data. Many of these datasets can also be obtained over the Internet.
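Returning to the second method above (converting data from one format to another), the sketch below shows one simple conversion using only Python's standard library: a table of point coordinates in CSV text is turned into a GeoJSON feature collection, a format that many GIS programs can read. The column names lon and lat, the function name csv_to_geojson, and the sample well are assumptions made for this example.

```python
import csv
import io
import json


def csv_to_geojson(csv_text, x_field="lon", y_field="lat"):
    """Convert CSV rows with coordinate columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Remove the coordinate columns; what is left becomes the attributes.
        x = float(row.pop(x_field))
        y = float(row.pop(y_field))
        features.append(
            {
                "type": "Feature",
                # GeoJSON stores coordinates in longitude, latitude order.
                "geometry": {"type": "Point", "coordinates": [x, y]},
                "properties": dict(row),
            }
        )
    return {"type": "FeatureCollection", "features": features}


# Hypothetical input: one well with a made-up position.
sample = "name,lon,lat\nWell A,7.44,10.52\n"
geojson_text = json.dumps(csv_to_geojson(sample))
```

The same pattern (read one format, rebuild geometry and attributes, write another format) underlies most format conversions, although real datasets usually need more cleaning first.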
The data acquired by these methods can be classified as either primary or secondary, and as either observable or non-observable:

a. Primary Data; are measurements that you or your team collect. They are usually
derived from experiments or from field work.
b. Secondary Data; are datasets that someone else collects. These datasets, collected
from experiments or fieldwork, were collected for a purpose other than your own.
Most researchers prefer primary data because they have not been previously
conceived and shaped. Still, secondary datasets are tremendously valuable if you
determine how and why they were collected and if your project can accept those
preconceptions.
c. Observable Data; are the type of data where someone or something observes the
characteristic or the behavior of an object.
d. Non-observable Data; are collected when respondents are asked questions in an interview
or on a questionnaire, but the data gatherer does not physically observe the
characteristic or behavior.
PHASE 3: DATA CAPTURE
When you have exhausted your contacts and the Internet, it is time to capture the data
yourself. In this phase, you create new GIS datasets from both digital data that are not
currently in a GIS format and from non-digital, hard-copy data sources. Examples of
digital and non-digital data sources include maps (hard-copy and digital), aerial
photographs (hardcopy and digital), questionnaires, field observations, digital satellite
imagery, survey data, and Global Positioning System (GPS) coordinates. The data
capturing phase is often tedious, laborious, and frustrating, but necessary. The key steps
in the data capture phase are to digitize hard-copy maps and data directly into your GIS or
to transform existing digital data into a format your GIS reads.
Converting Digital Data
Here we look at digital datasets that are not currently in a GIS format, but that are often
manipulated to create GIS layers. These sources include automated surveying,
photogrammetry, GPS, and Light Detection and Ranging (LIDAR).
a. Automated surveying; this uses electronic data-capturing instruments such as theodolites,
electronic distance measurement (EDM) systems, and total stations to capture
spatial and attribute data. The most sophisticated of these instruments is the total
station, which combines the theodolite’s angle-measuring capabilities with the
EDM’s distance calculations. Surveyors download the distance and direction data
from their instruments directly into many vector-based GIS programs. The data,
however, usually require preprocessing before they can be used to make a map.
b. Photogrammetry; this obtains accurate measurements from aerial photographs.
Photogrammetric techniques determine ground distances and directions, heights
of features, and terrain elevations. Photogrammetry creates GIS data through 3-D
stereo digitizing and by producing spatially rectified aerial photographs that can
be entered into the GIS as a layer.
c. GPS (Global Positioning System); this is a radio-based navigation system in which
GPS receivers compute accurate locations on the Earth’s surface from the signals of a
series of orbiting satellites. With a small, inexpensive, hand-held GPS receiver you can
determine your location usually within about three meters.
d. LIDAR (Light Detection and Ranging); this is a remote sensing technology that uses
laser light pulses to measure the distance to a surface. It is similar to radar, but uses
light instead of radio waves. Airborne LIDAR systems depict the tops of ground-based
features better than traditional remote sensing and radar methods, and this results in
topographic layers that portray the shape of our cities (including the widths and heights
of buildings) and forest canopies more accurately.
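The preprocessing mentioned under automated surveying often includes turning each measured distance and direction into map coordinates. The Python sketch below shows one way this can be done, assuming whole-circle bearings measured clockwise from north and a flat (plane) coordinate system; the starting coordinates and the three observations are hypothetical.

```python
import math

def traverse(start_e, start_n, legs):
    """Return the coordinates reached after each (bearing_deg, distance_m) leg.

    Bearings are whole-circle bearings measured clockwise from north.
    """
    points = [(start_e, start_n)]
    e, n = start_e, start_n
    for bearing_deg, distance in legs:
        bearing = math.radians(bearing_deg)
        e += distance * math.sin(bearing)  # change in easting
        n += distance * math.cos(bearing)  # change in northing
        points.append((e, n))
    return points

# Hypothetical observations: start at E = 1000, N = 2000 and measure three legs.
stations = traverse(1000.0, 2000.0, [(90.0, 50.0), (0.0, 30.0), (270.0, 50.0)])
for e, n in stations:
    print(f"E = {e:.2f}, N = {n:.2f}")
```

The resulting list of easting and northing pairs is the kind of coordinate data that can then be loaded into a vector layer.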
Converting Non-Digital Data

Existing, hard-copy maps and aerial photographs (physical paper documents) are a major
source of spatial data for GIS. Different processes, including digitizing, scanning, and
“heads up” digitizing, exist to input these hard-copy sources into GIS. For most projects,
you will need to capture both the spatial location of the feature and some of its attributes.
Data input is usually the major bottleneck in the development of a GIS database, and
converting hard-copy data from maps, aerial photographs, printed reports, and field
notebooks is often the least desirable option because it is tedious and time consuming.
Still, it is a way of making sure that you get a certain level of accuracy and precision for
your project.
a. Scanning; this is a popular way to convert hard-copy maps and aerial photographs into
digital images. The resultant scanned image is a raster file, arranged as an array of
pixels in columns and rows. Scanners capture what is on the original document by
assigning a color or grayscale value to each pixel in the array. Scanner types
include flatbed, sheet-fed, drum, and video. Flatbed (or desktop) scanners are the
most common and consist of a glass plate on which you lay the documents you want
to scan.
b. Heads Up Digitizing; After you create a scanned image, you georeference it and
use it as a background image within your vector system. Then with the image at
its proper geographic location, trace the features that appear on the scanned
image. This process, called “heads-up” digitizing (or on-screen digitizing), is like
manual digitizing (described below) but without a physical digitizing board.
Instead, you see on the screen a scanned image in its correct geographic position,
and, with your mouse, you trace the position of features into new or existing
point, line, and polygon layers.
c. Digitizing; this involves tracing by hand the extent of features directly from a
hard-copy map or photograph that is mounted onto a digitizer, a large table or board
with an embedded electronic grid that senses the position of a pointer called a
puck (a mouse-like device). Each GIS package has its own procedure for
manual digitizing. Generally, it involves three steps: mounting the map on the
digitizer, establishing control points, and adding map features.
d. GPS digitizing; this involves using a GPS receiver to record feature data in the field.
Using GPS for point locations (waypoints) was described above, but mapping-grade
GPS units, like Trimble’s GeoXT, are capable of recording the nodes and
locations of points, lines and polygons by following the feature’s extent and
registering a waypoint at each of its vertices.
e. Pilot Project; A pilot project is a rehearsal. Here you collect a small subset of the
GIS datasets you require for the larger project. Then you input the data into the
GIS, preprocess the datasets, analyze them, and create some output. When
something goes wrong, you tweak the project’s parameters until the process
works smoothly. Pilot projects give you the opportunity to “ground truth” your
secondary data. Remember, it is foolish to believe these datasets are without
flaws. You need to ground truth your GIS data to ensure that the datasets are
representative of what’s on the ground. It is done by traveling to your study area
and using your eyes to verify your datasets.
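Georeferencing a scanned image and establishing control points on a digitizer both rest on the same idea: a few points with known map coordinates are used to convert image or table positions into map positions. The Python sketch below fits a simple affine transformation from exactly three control points. It is only an illustration under stated assumptions: the pixel positions and map coordinates are hypothetical, and GIS software typically uses more control points and reports the fitting error.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    if d == 0:
        raise ValueError("control points must not lie on one straight line")
    solution = []
    for i in range(3):
        # Replace column i of the matrix with the right-hand side.
        replaced = [row[:i] + [rhs[r]] + row[i + 1:]
                    for r, row in enumerate(matrix)]
        solution.append(det3(replaced) / d)
    return solution

def fit_affine(control_points):
    """Fit X = a*col + b*row + c and Y = d*col + e*row + f.

    Each control point is ((col, row), (map_x, map_y)).
    """
    matrix = [[col, row, 1.0] for (col, row), _ in control_points]
    a, b, c = solve3(matrix, [xy[0] for _, xy in control_points])
    d, e, f = solve3(matrix, [xy[1] for _, xy in control_points])
    return a, b, c, d, e, f

def pixel_to_map(params, col, row):
    """Apply the fitted affine transformation to one pixel position."""
    a, b, c, d, e, f = params
    return a * col + b * row + c, d * col + e * row + f

# Hypothetical control points: pixel (col, row) -> map (x, y) in meters.
control = [
    ((0, 0), (330000.0, 1165000.0)),
    ((1000, 0), (332000.0, 1165000.0)),
    ((0, 1000), (330000.0, 1163000.0)),
]
params = fit_affine(control)
center = pixel_to_map(params, 500, 500)
print(center)
```

In this example each pixel covers 2 m on the ground, and the map y value decreases as the row number increases because image rows are counted from the top.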
Assignment
1. Explain the term GIS data source and state five GIS data sources.
2. Explain three primary data capture methods and three secondary data sources.
3. Explain, in relation to your project:
i. The source of your data
ii. What features are necessary
iii. What the spatial and temporal extent of your project is
iv. How you are going to analyze your data
CHAPTER FOUR
Storage of spatial and non-spatial data
The database is an important part of the GIS and plays a major role in providing inputs in
a GIS environment. The features of the real-world system are converted into different
themes or layers of the database. These layers hold the characteristics of those
features as spatial and non-spatial data. So, in GIS, two types of
database are used: one is spatial and the other non-spatial. All the geographical features
of the Earth’s surface, or spatial features, are represented by point, line and polygon
features. These features can be stored in vector and raster data structures. In the vector
data structure, features are stored as pairs of x–y coordinates; a single point, for
example, has a position but no dimension. The spatial features
in raster format are a group of pixels or cells arranged in rows and columns.
The raster and vector data models both represent geographical data but have different uses;
they are complementary to and inter-convertible with each other. There are different
sources of spatial databases, in the form of analogue maps, remote-sensing imagery,
aerial photographs, GPS data and field surveys.
A non-spatial database holds the characteristics of the spatial features, stored
alphanumerically, and provides information in tabular form. The three most common
non-spatial database structures are:
1. Hierarchical, where the data are organized in a tree structure, often viewed as an
upside-down tree.
2. Networked, where multiple connections exist among the data, so a record can be linked to
more than one other record.
3. Relational, where tables are related to one another through common fields, so that the
various databases can be joined or appended.
Non-Spatial Data in GIS
Non-spatial data are stored in GIS as tables. Such tables are known as non-spatial
(attribute) tables. A non-spatial table is represented by rows and columns in which
each row shows a spatial feature and each column represents a characteristic. The
intersection of a row and a column gives the value of a specific characteristic for a
particular feature. A row is also known as a record or a tuple and a column is known as a
field or item.
Arrangement of rows and columns in a non-spatial table
ID  NAME    LOCATION        QUALIFICATION
1   VIVIAN  KADUNA BARNAWA  DIPLOMA
2   AISHA   KANO DAMBATTA   DEGREE
3   DAVID   ABUJA MAITAMA   HIGH DIPLOMA
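The same table can be sketched as a relational table, which makes the terms record, field, and value concrete. The Python example below uses the standard sqlite3 module with an in-memory database; the table name `staff` is a hypothetical choice, and the rows are those of the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database held in memory
conn.execute(
    """CREATE TABLE staff (
           id INTEGER PRIMARY KEY,
           name TEXT,
           location TEXT,
           qualification TEXT)"""
)
rows = [
    (1, "VIVIAN", "KADUNA BARNAWA", "DIPLOMA"),
    (2, "AISHA", "KANO DAMBATTA", "DEGREE"),
    (3, "DAVID", "ABUJA MAITAMA", "HIGH DIPLOMA"),
]
conn.executemany("INSERT INTO staff VALUES (?, ?, ?, ?)", rows)

# A row (record or tuple): everything stored about one feature.
record = conn.execute("SELECT * FROM staff WHERE id = 2").fetchone()

# A column (field or item): one characteristic for every feature.
names = [r[0] for r in conn.execute("SELECT name FROM staff ORDER BY id")]

# The intersection of a row and a column: a single value.
value = conn.execute(
    "SELECT qualification FROM staff WHERE id = 3"
).fetchone()[0]

print(record)
print(names)
print(value)
```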
Spatial data is any data with a direct or indirect reference to a specific location or
geographical area. Spatial data is often referred to as geospatial data or geographic
information. Spatial data can help us make better predictions about human behavior and
understand what variables may influence an individual's choices. By
performing spatial analysis on our communities, we can ensure that neighborhoods are
accessible and usable by everyone. Spatial data comprise the relative geographic
information about the earth and its features. A pair of latitude and longitude coordinates
defines a specific location on earth. Spatial data are of two types according to the storing
technique, namely, raster data and vector data.
Raster data are composed of grid cells identified by row and column. The whole
geographic area is divided into groups of individual cells, which represent an image.
Satellite images, photographs, scanned images, etc., are examples of raster data.
Vector data are composed of points, polylines, and polygons. Wells, houses, etc., are
represented by points. Roads, rivers, streams, etc., are represented by polylines. Villages
and towns are represented by polygons.
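The difference between the two storage techniques can be illustrated with a short Python sketch. Vector features are held as lists of x-y coordinate pairs, and a very simple conversion marks the raster cells that contain those coordinates. The coordinates, grid size, and cell size are hypothetical, and only the stored vertices are converted, not the line segments between them.

```python
# Vector data: geometry stored as x-y coordinate pairs (hypothetical values).
wells = [(2.5, 3.5)]                         # point features
road = [(0.5, 0.5), (2.5, 1.5), (4.5, 1.5)]  # vertices of one polyline

def rasterize_points(points, n_rows, n_cols, cell_size):
    """Mark each grid cell that contains at least one point.

    The grid origin (0, 0) is the lower-left corner of the area;
    row 0 is the top row, as in an image.
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int(x // cell_size)
        row = n_rows - 1 - int(y // cell_size)
        # Ignore points that fall outside the grid.
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] = 1
    return grid

# Raster data: the same area as a 5 x 5 grid of cells, each 1 unit wide.
grid = rasterize_points(wells + road, n_rows=5, n_cols=5, cell_size=1.0)
for row in grid:
    print(" ".join(str(cell) for cell in row))
```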
Attribute Data
Attribute data comprise the pertinent information about the spatial data. Because attribute
data are attached to the geospatial features, the querying function of a GIS works on them.
The types of attribute data are: nominal data, ordinal data, interval data, and ratio data.
PRACTICAL PART
• Acquire spatial data
• Correct the problems arising from the acquired data
• Input non-spatial data into a tabular database
• Correct the errors arising from inputting the non-spatial data
• Link spatial and non-spatial data
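The last step, linking spatial and non-spatial data, depends on a common identifier in both datasets. The Python sketch below is a simplified illustration, not the ArcGIS procedure: it joins a layer of point geometries to an attribute table by feature ID and lists the IDs that fail to match, which is one way to detect input errors. All coordinates and IDs are hypothetical.

```python
# Spatial layer: feature ID -> point geometry as (longitude, latitude).
# The coordinates are hypothetical.
spatial_layer = {
    1: (7.44, 10.48),
    2: (8.48, 12.43),
    4: (7.49, 9.08),
}

# Non-spatial (attribute) table keyed by the same feature ID.
attribute_table = {
    1: {"NAME": "VIVIAN", "QUALIFICATION": "DIPLOMA"},
    2: {"NAME": "AISHA", "QUALIFICATION": "DEGREE"},
    3: {"NAME": "DAVID", "QUALIFICATION": "HIGH DIPLOMA"},
}

def link_tables(spatial, attributes):
    """Join geometries to attribute records that share the same ID."""
    linked = {}
    unmatched_features = []
    for fid, geometry in spatial.items():
        if fid in attributes:
            linked[fid] = {"geometry": geometry, **attributes[fid]}
        else:
            unmatched_features.append(fid)  # geometry without attributes
    # Attribute records that have no geometry.
    unmatched_records = [fid for fid in attributes if fid not in spatial]
    return linked, unmatched_features, unmatched_records

linked, unmatched_features, unmatched_records = link_tables(
    spatial_layer, attribute_table
)
print(linked)
print("Features without attributes:", unmatched_features)
print("Records without geometry:", unmatched_records)
```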
CHAPTER FIVE
Operations on Geographic Database (Practical Class)
• Understand basic operations on a geographic database.
• Explain the basic operations on a geographic database.
• Select various draining features (one after the other) and display graphically.
• Carry out simple analysis of information derivable from the graphic displays.
• Request displays and their associated attributes.
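Two of the basic operations listed above, selecting features and requesting their attributes, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the ArcGIS workflow used in class: the layer is a plain list of hypothetical features, and the field names `kind`, `name`, `x`, and `y` are invented for the example.

```python
# Hypothetical layer: each feature has a position (x, y) and attributes.
features = [
    {"id": 1, "kind": "river", "name": "River A", "x": 2.0, "y": 8.0},
    {"id": 2, "kind": "stream", "name": "Stream B", "x": 5.0, "y": 3.0},
    {"id": 3, "kind": "river", "name": "River C", "x": 9.0, "y": 6.0},
]

def select_by_attribute(layer, field, value):
    """Attribute query: keep the features whose field equals the value."""
    return [f for f in layer if f.get(field) == value]

def select_by_extent(layer, xmin, ymin, xmax, ymax):
    """Spatial query: keep the features located inside a rectangle."""
    return [f for f in layer
            if xmin <= f["x"] <= xmax and ymin <= f["y"] <= ymax]

rivers = select_by_attribute(features, "kind", "river")
in_window = select_by_extent(features, 0.0, 0.0, 6.0, 10.0)

# Display each selected feature together with its attributes.
for f in rivers:
    print(f["id"], f["name"], (f["x"], f["y"]))
```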