
A GemStone GIS

Mary Garvey+, Mike Jackson+, Martin Roberts*


University of Wolverhampton
+ School of Computing & IT, Wolverhampton, West Midlands, WV1 1EL, UK
* School of Applied Sciences, Wolverhampton, West Midlands, WV1 1SB, UK
{M.Garvey, M.S.Jackson, M.G.Roberts}@wlv.ac.uk

Abstract: Object-oriented databases have been portrayed as the solution for complex applications such as
Geographical Information Systems. Traditional database systems, such as relational ones, are not adequate
for the rich data types typically required by such systems. This paper discusses an application built using
object-oriented database technology, detailing the schema used, the problems encountered and the benefits
perceived.

Key words: object-oriented databases, GIS, MVC, GemStone

1. INTRODUCTION

This paper describes the construction of a geographical information system (GIS) using an
object-oriented database management system. The motivation for this approach is that geographical
data does not conform to the traditional record structures found in a conventional relational DBMS.
Geographical datasets are not distinct (i.e. made up of unconnected strings and numbers) but rather
are multi-dimensional and continuous. Kent [1979] comments that "records are an excellent tool for
processing information that fits a certain pattern"; geographical information, however, does not
match this pattern and hence needs a DBMS that can handle complex information. Object-oriented data
models possess rich modelling structures to deal with non-conventional data types.

An additional claim made for object-oriented software systems is that they make the task of
constructing software considerably easier. This should be particularly relevant to software that
manipulates complex data structures and creates visualisations of the data in these data structures.
A successful GIS will invariably embody this type of data structure and will almost certainly
require a visual representation of map data. A further motivation for the use of an object-oriented
approach to the production of such a system is therefore the expectation that the approach will
result in a system which has a clean interface and is easier to maintain than an equivalent system
built using conventional programming techniques.

The paper briefly discusses the essential objects in a GIS. This data model is then used as the
application model within the Model-View-Controller (MVC) paradigm in order to create a visualisation
of the data. The resulting representation has been made persistent in the GemStone database, but
this has led to performance problems. Finally, an outline of current work to add non-spatial data
to the map representation is given.

2. GIS AND DATABASES

2.1 Geographical Features

Geographical features are usually defined with respect to two main data categories: spatial and
aspatial. A spatial database describes a collection of entities, some of which have a permanent
location in a global, dimensioned space. Normally a system will contain a mixture of spatial and
aspatial entity types. Spatial data types have the basic topographical properties of location,
dimension and shape.
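As an illustration only, the mixture of spatial and aspatial entity types might be sketched as follows. The sketch is in Python rather than the Smalltalk used for the system described in this paper, and every name in it (Entity, SpatialEntity, spatial_only, the 0/1/2 coding of dimension and the sample records) is an assumption made for the example, not part of the schema discussed later.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float, float]


@dataclass
class Entity:
    """Any entity in the database; aspatial unless it is a SpatialEntity."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class SpatialEntity(Entity):
    """An entity with the basic properties of location, dimension and shape."""
    location: Coordinate = (0.0, 0.0, 0.0)
    dimension: int = 0                   # assumed coding: 0 point, 1 line, 2 area
    shape: Tuple[Coordinate, ...] = ()   # outline vertices, empty for a point


def spatial_only(entities: List[Entity]) -> List[SpatialEntity]:
    """Select the entities that have a location in the dimensioned space."""
    return [e for e in entities if isinstance(e, SpatialEntity)]


# A mixed database: one spatial and one aspatial entity (made-up sample data).
database = [
    SpatialEntity("lighthouse", location=(1.0, 2.0, 0.0)),
    Entity("roadworks record", {"status": "planned"}),
]
spatial_names = [e.name for e in spatial_only(database)]
```

Keeping both kinds of entity in one collection mirrors the idea of a single database holding a mixture of spatial and aspatial types, with the spatial ones distinguished by the extra properties they carry.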

44 ICEIS 2000
For GIS, spatial data is a key feature and can be regarded as more important than aspatial data,
since this differentiates GIS from vanilla information systems [Maguire and Dangermond 1991]. A good
system, however, will contain an integrated database combining both the spatial and aspatial
entities. In particular, the data model and data language should permit the use of spatial and
aspatial relationships.

2.2 What Object-Orientation Offers

An OODB must first provide the features and functionality that we traditionally expect from a DBMS.
Database features are typically not found in the current breed of object-oriented programming
languages, such as Smalltalk. OODBs behave like object-oriented languages; however, the objects are
persistent, that is, they remain after the program has terminated [Garvey and Jackson 1989].
Object-oriented languages offer richer data structures, which is important for applications with
complex data such as GIS, and an environment where systems can be developed rapidly through reuse.

Most commercial OODBs have either a C++, Smalltalk or, recently, Java interface. A C++ interface has
the drawback that once the data structures have been set up, it is difficult to add new ones
dynamically. This has important ramifications for aspatial data in a GIS. A user may wish to add new
structures at a later point to hold aspatial data, and this would involve recompiling the schema.
Smalltalk, on the other hand, allows classes to be created dynamically. This has the disadvantage
that the code is not compiled, so it will run more slowly.

3. PROBLEM OUTLINE

The problem area studied was the representation of typical information found on a map, for example,
roads, rivers, urban areas, etc. The data source was the Bartholomew (BART) datasets exported from
Genamap in a simple portable format. The platform for implementing the GIS initially was GemStone
[Butterworth, Otis and Stein 1991], which is a commercial object-oriented database management system
with a Smalltalk interface. The format of the data suggested a design for the initial
object-oriented schema. Each dataset contains three types of data: Points, which are
three-dimensional points representing features such as spot heights or lighthouses; Lines, which are
a set of three-dimensional points, each line representing a segment of a feature such as a road or
river; and Areas, which are a set of three-dimensional points where the start and end points are the
same, such as lakes or urban areas.

For example, point data:

    1POINT 156 1
    348033.75407000 318661.908200000 .00000000

line data:

    1LINE 230 2
    342380.00000000 21300.0000000000 .00000000
    342320.00000000 321620.000000000 .00000000

area data:

    1AREA 164 6
    350000.00000000 314460.00000000 .00000000
    350220.00000000 315160.00000000 .00000000
    350060.00000000 315220.00000000 .00000000
    350000.00000000 315080.00000000 .00000000
    349820.00000000 314520.00000000 .00000000
    350000.00000000 314460.00000000 .00000000

Simply adding the map data as it stands to the GIS is not sufficient. A point on the map may appear
in a number of lines or areas. This occurs when more than one object can be found at a map
reference. If the map consists of just collections of points, lines and areas, then answering the
query "What objects are at point X?" will be a complex task. To deal with this adequately, each new
point added to the GIS is added to a collection which contains a point representing every place in
the map where there is at least one feature. Curiously, this is a case where the relational concept
of a unique key is of more help than the O-O concept of object identifiers.

A user of a GIS will wish to investigate the relationships between different objects. In Smalltalk
and GemStone, relationships are

Enterprise Database Technology and its Applications 45


defined by creating an attribute (instance variable) which contains a reference to another object. A
relationship of this type is, however, uni-directional. For example, a map item has an attribute,
tag, which is constrained to be of type feature code. This allows a user to find the feature details
of a given map item. A user looking at the feature code, however, could not find which map objects
are related to the feature unless another relationship from the feature code to the map item is set
up. To gain maximum flexibility, all classes in the schema have been defined with two-way
relationships. This idea, which appears as the inverse relationship in the ODMG standard [Cattell et
al. 1997], is not explicitly represented in GemStone.

4. GIS SCHEMA

Given the data identified above, there are basically three strands to the schema: Point details, Map
Features and Feature Code definitions.

Point is fundamental to the whole schema. Smalltalk has a predefined class called Point, which has
much of the required functionality already defined. Point, however, only applies to two-dimensional
data; some datasets may require the capability to support three-dimensional data, so a new
ThreeDPoint class was introduced as a subclass of Point, with an extra instance variable z.

The discussion in section 3 highlighted a need for every point on the map to be related to all the
features that can be found at that point. For example, at a given point there could be a river,
bridge, traffic lights, etc. An extra class, GISPoint, inherits from ThreeDPoint with an additional
instance variable, mapFeatures, which holds the set of map features that are at that geographical
location.

Figure 1 shows the structure, using ROME (a tool for object-oriented modelling [Barclay and Savage
1997]). The collection classes are explicitly modelled in this representation, because they exhibit
behaviour which differs from the classes they contain. The distinction between classes and
collection classes is not often made in modelling tools, but is important because each has its own
separate properties and behaviour.

Figure 1 Point Classes

Finally, a Map class is used to tie the points, lines, areas and network together from a particular
map set. The network class is used for network analysis calculations.

5. GemStone SMALLTALK INTERFACE

Servio's GemStone and the ParcPlace VisualWorks Smalltalk interface have been used to implement the
GIS.

Within the GemStone Smalltalk Interface (GSI), there are two different worlds: Smalltalk and
GemStone. The GIS application could be developed in either or both environments. Smalltalk provided
a rich set of classes, so the front-end was developed using this. GemStone was used to store the
data and provide the standard database functions required.

5.1 Model-View-Controller Architecture

Interactive applications in Smalltalk are built using an architecture that identifies a model, a
view and a controller. This is known as the Model-View-Controller (MVC) paradigm [Krasner and Pope
1988] and is the key to the rapid development of graphical applications.
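The roles of model, view and controller can be sketched in a few lines. This is a minimal illustration in Python, not the Smalltalk code of the system described here; the class names MapModel, TextMapView and MapController, and the methods on them, are hypothetical and chosen only for the example. The view renders text rather than graphics so that the sketch stays self-contained.

```python
class MapModel:
    """Model: holds the map points and notifies its dependents of changes."""

    def __init__(self):
        self._points = []
        self._dependents = []

    def add_dependent(self, view):
        self._dependents.append(view)

    def add_point(self, x, y):
        self._points.append((x, y))
        # Tell every registered view that the model has changed.
        for view in self._dependents:
            view.update(self)

    @property
    def points(self):
        return tuple(self._points)


class TextMapView:
    """View: renders the model, here as lines of text instead of a drawing."""

    def __init__(self, model):
        self.rendered = []
        model.add_dependent(self)

    def update(self, model):
        self.rendered = [f"point at ({x}, {y})" for x, y in model.points]


class MapController:
    """Controller: turns user input into messages sent to the model."""

    def __init__(self, model):
        self._model = model

    def click(self, x, y):
        self._model.add_point(x, y)


model = MapModel()
view = TextMapView(model)
controller = MapController(model)
controller.click(3, 4)   # simulated user input
print(view.rendered)
```

The model knows nothing about how it is displayed; it only announces changes. A second view could be registered on the same model without touching the model or the controller.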

MVC separates a graphical application into two parts: the abstract application (or model) and the
interface. This means the application and interface can be developed separately.

5.2 Data Storage using GemStone

Smalltalk on its own is sufficient for a small single-user system, as the data could be stored in an
image file. To realise the benefits of a DBMS, the Smalltalk environment must be linked to GemStone.

GemStone gives more efficient access when it knows the data types of the instance variables
(attributes). In contrast, Smalltalk is a typeless language, which means it does not have to know
what the data types are. This leads to some maintenance problems if any GemStone classes are
regenerated from the Smalltalk environment, as this loses the type constraints.

In the system described in this paper, Smalltalk handles all the application code and GemStone
stores the data. Therefore the classes and behaviour were created in the Smalltalk environment and
the matching classes (without behaviour) were defined in GemStone. The option of using forwarders
was chosen. Here the classes required for storing the data are kept solely in the GemStone database,
whilst a dummy class is created in the Smalltalk environment, acting as a link to the real data. If
any messages are sent to the dummy class, they are forwarded on to GemStone for processing. This
saves the overhead of transferring the datasets from one environment to another. The non-dataset
classes used in the view and controller were not affected.

5.3 Performance

Initially the data was loaded into Smalltalk and then committed to GemStone. With small datasets the
time to do this could be measured in minutes rather than hours. Problems with speed started
occurring when the structures for the network analysis were added and large datasets were used. The
data could no longer be added via Smalltalk because it took too long to commit; instead it was
loaded overnight using the GemStone non-graphical front-end, Topaz. Loading large datasets into
GemStone takes a long time, for example, 340 hours for a 6 MB dataset representing the north west of
England. Once loaded, starting the system is also slow: approximately twenty minutes for the map of
the north west of England (cold start). A cold start when drawing a map is significantly slower than
when the data has been loaded into memory. At the start of the project the performance of the system
was acceptable, but adding more complex structures has revealed that the system is not scalable.

Different experiments using either the objects or symbols as the key for lookups on the network
tables have not improved performance.

6. ASPATIAL DATA

Spatial data by itself is not sufficient for most GIS users; they also want the ability to add
attribute, or aspatial, data. For example, a Local Authority might want to store details of
roadworks and link these to the actual roads on a map. Such data may already be stored in a
relational database, such as Oracle. It is not envisaged that this system will provide a direct link
to an external database; instead, existing data could be exported in a standard format such as XML
and loaded into the GIS.

The current system allows users to build their own classes. This involves using the meta-class
classes, the important ones being Class, Behavior and ClassDescription, found under the
Kernel-Classes class category. The meta-class classes are the classes used by the system for
creating classes, methods, etc.

Standard accessing methods (setters and getters) are created next for storing and updating each of
the instance variables. Anything additional to these would have to be added by the user
independently.

7. CURRENT STATUS

Currently, the GIS displays two windows: an information window for invoking options and the map
window for drawing the required map.



When adding new functionality to the GIS, some amendments have been made to the data structures to
make them more efficient. This has been chiefly to the classes directly involved in the MVC; the
domain model classes have been relatively unchanged.

A true GIS does not simply store the map details; it must provide the facilities to manipulate the
data too. The functionality added so far includes nearest point, shortest path using Dijkstra's
algorithm [Dijkstra 1976], and zooming in and out.

The next planned enhancement is to provide default windows for inserting, editing and viewing the
aspatial data, and to provide the ability to link the aspatial data to the spatial.

8. CONCLUSIONS

It has been claimed that object-oriented database products are suitable for the development of
systems similar to the GIS developed at Wolverhampton. This research supports that claim to an
extent. The effort required to implement this system has been considerably less than that
anticipated if a relational, or object-relational, database system had been used. The main drawback
so far has been the speed of the system. Further work is being carried out in this area to resolve
the problem. An investigation is currently underway to port the system to Java, using PJama as the
persistent store [Jordan 1999]. The schema has transferred successfully and initial results
regarding performance are proving promising.

Standard algorithms have been used for typical GIS functionality, and these have fitted in with the
schema. The only amendment to the data structures so far was during the addition of the network
analysis. The methods that read the data in had to be amended to test for road segments, which were
added to the map's network graph. This network graph was added for convenience rather than
necessity, and acts like an index to the road objects. The algorithms as implemented do not depend
on the existence of these indexes, but work faster when they are present.

One peculiar feature of the implementation method chosen, which would not be available either in
relational systems or in most other object-oriented systems, is the ability to define new data types
dynamically. This has been shown to be central to the integration of aspatial data with the
representation of the spatial as described in this paper.

REFERENCES

Barclay K. and Savage J., 1997. Object-Oriented Design with C++. Prentice Hall.

Butterworth P., Otis A. and Stein J., 1991. The GemStone Object Database Management System.
Communications of the ACM, 34(10), Oct. 1991, 64-77.

Cattell R.G.G. (Ed.), 1997. The Object Database Standard: ODMG 2.0. Morgan Kaufmann.

Dijkstra E.W., 1976. A Discipline of Programming. Englewood Cliffs, Prentice-Hall.

Garvey M.A. and Jackson M.S., 1989. Introduction to Object-Oriented Databases. Information and
Software Technology, 31(10), Dec. 1989, 521-528.

Jordan M., 1999. The Java™ Platform as a Database. Proceedings of Java and Databases: Persistence
Options, Workshop 27, Conference on Object-Oriented Programming, Systems, Languages and Applications
(OOPSLA '99), November 1-5, 1999, Denver, USA.

Kent W., 1979. Limitations of Record-Based Information Systems. ACM TODS, 4(1), 85-97.

Krasner G. and Pope S.T., 1988. A Cookbook for Using the Model-View-Controller User Interface in
Smalltalk-80. Journal of Object-Oriented Programming, 1(3), Aug-Sept. 1988, 26-49.

Maguire D.J. and Dangermond J., 1991. The Functionality of GIS. In Maguire D.J., Goodchild M.F. and
Rhind D.W. (Eds.), GIS Vol. 1: Principles. Longman Scientific and Technical, 319-335.

