You are on page 1of 16

Database Modelling in UML

By Geoffrey Sparks, sparks@sparxsystems.com.au : http://www.sparxsystems.com.au


Originally published in Methods & Tools e-newsletter : http://www.martinig.ch/mt/index.html

Introduction Data Profile (as proposed by Rational


Software).
When it comes to providing reliable, Some familiarity with object-oriented
flexible and efficient object persistence design, UML and relational database
for software systems, today's designers modelling is assumed.
and architects are faced with many
choices. From the technological The Class Model
perspective, the choice is usually
between pure Object-Oriented, Object- The Class Model in the UML is the
Relational hybrids, pure Relational and main artefact produced to represent the
custom solutions based on open or logical structure of a software system.
proprietary file formats (eg. XML, It captures the both the data
OLE structured storage). From the requirements and the behaviour of
vendor aspect Oracle, IBM, Microsoft, objects within the model domain. The
POET and others offer similar but techniques for discovering and
often-incompatible solutions. elaborating that model are outside the
scope of this article, so we will assume
This article is about only one of those the existence of a well designed class
choices, that is the layering of an model that requires mapping onto a
object-oriented class model on top of a relational database.
purely relational database. This is not
to imply this is the only, best or The class is the basic logical entity in
simplest solution, but pragmatically it the UML. It defines both the data and
is one of the most common, and one the behaviour of a structural unit. A
that has the potential for the most class is a template or model from
misuse. which instances or objects are created
at run time. When we develop a logical
We will begin with a quick tour of the model such as a structural hierarchy in
two design domains we are trying to UML we explicitly deal with classes.
bridge: firstly the object-oriented class
model as represented in the UML, and When we work with dynamic
secondly the relational database model. diagrams, such as sequence diagrams
and collaborations, we work with
For each domain we look only at the objects or instances of classes and their
main features that will affect our task. inter-actions at run-time.
We will then look at the techniques
and issues involved in mapping from The principal of data hiding or
the class model to the database model, encapsulation is based on localisation
including object persistence, object of effect. A class has internal data
behaviour, relationships between elements that it is responsible for.
objects and object identity. We will Access to these data elements should
conclude with a review of the UML be through the class's exposed
behaviour or interface. Adherence to
Person Person Class attributes :
the encapsulated
- Address: CAddress
data
# Name:
# St i double
Age:

+ getAge() : int
+ setAge(n)
+ getName() :
+ St i
setName(s)
A simple person class
with no state or Class operations:
behaviour shown the behaviour

Attributes and operations define the state of an object


t
run-time and the capabilities or behaviour of the
bj t

Figure 1 - Classes, attributes and operations

this principal results in more relationship that is most interesting: for


maintainable code. example an Address class may be
associated with a Person class. The
Behaviour mapping of this relationship into the
relational data space requires some
Behaviour is captured in the class care.
model using the operations that are
defined for the class. Operations may Aggregation is a form of association
be externally visible (public), visible to that implies the collection of one class
children (protected) or hidden of objects within another. Composition
(private). By combining hidden data is a stronger form of aggregation that
with a publicly accessible interface and implies one object is actually
hidden or protected data manipulation, composed of others. Like the
a class designer can create highly association relationship, this implies a
maintainable structural units that complex class attribute that requires
support rather than hinder change. careful consideration in the process of
mapping to the relational domain.
Relationships and Identity
While a class represents the template
Association is a relationship between 2 or model from which many object
classes indicating that at least one side instances may be created, an object at
of the relationship knows about and run time requires some means of
somehow uses or manipulates the other identifying itself such that associated
side. This relationship may by objects may act upon the correct object
functional (do something for me) or instance. In a programming language
structural (be something for me). For like C++, object pointers may be
this article it is the structural passed around and held to allow
objects access to a unique object with which to locate or re-create the
instance. required object.
Often though, an object will be
destroyed and require that it be re- The Relational Model
created as it was during its last active
instance. These objects require a The relational data model has been
storage mechanism to save their around for many years and has a
internal state and associations into and proven track record of providing
to retrieve that state as required. performance and flexibility. It is
essentially set based and has as its
Inheritance provides the class model fundamental unit the 'table', which is
with a means of factoring out common composed of a set of one or more
behaviour into generalised classes that 'columns', each of which contains a
then act as the ancestors of many data element.
variations on a common theme.
Inheritance is a means of managing Tables and Columns
both re-use and complexity. As we will
see, the relational model has no direct A relational table is collection of one
counterpart of inheritance, which or more columns each of which has a
creates a dilemma for the data unique name within the table construct.
modeller mapping an object model Each column is defined to be of a
onto a relational framework. certain basic data type, such as a
number, text or binary data. A table
Navigation from one object at run time definition is a template from which
to another is based on absolute table rows are created, each row being
references. One object has some form an instance of a possible table instance.
of link (a pointer or unique object ID)

A class hierarchy Person Aggregation captures the concept Family


showing a generalised of
collection or composition between classes
person class from
which other classes
are derived

Parent Child The main relationships we are


interested in are Association,
2 Aggregation and Inheritance.
1..n These
describe the ways classes interact
Association captures
a having or using or relate to each
other
relationship between
classes

Figure 2 - UML Class model notation


Share
A Person may
own a set of
Shares

A Person is composed of
a strict set of ID
0..n documents (having n
A Person may
reside at zero elements)
or more
addresses 0..1
Address Person ID Document

0..n 0..n
1 n

An Address may Three forms of the Aggregation relationship. The weak form is
have zero or depicted with an unfilled diamond head, the strong form
more Persons in (composition) with a filled head.
residence

Figure 3- Aggregation Relationships

Public Data Access Database stored procedures provide a


means of extending database
The relational model only offers a functionality through proprietary
public data access model. All data is language extensions used to construct
equally exposed and open to any functional units (scripts). These
process to update, query or manipulate functional procedures do not map
it. Information hiding is unknown. directly to entities, nor have a logical
relationship to them.
Behaviour
Navigation through relational data sets
The behaviour associated with a table is based on row traversal and table
is usually based on the business or joins. SQL is the primary language
logical rules applied to that entity. used to select rows and locate
instances from a table set.
Constraints may be applied to columns
in the form of uniqueness Relationships and Identity
requirements, relational integrity
constraints to other tables/rows, The primary key of a table provides the
allowable values and data types. unique identifying value for a
particular row. There are two kinds of
Triggers provide some additional primary key that we are interested in:
behaviour that can be associated with firstly the meaningful key, made up of
an entity. Typically this is used to data columns which have a meaning
enforce data integrity before or after within the business domain, and
updates, inserts and deletes. second the abstract unique identifier,
such as a counter value, which have no
business meaning but uniquely identify Having looked at the two domains of
a row. We will discuss this and the interest and compared some of the
implications of meaningful keys later. important features of each, we will
digress briefly to look at the notation
A table may contain columns that map proposed to represent relational data
to the primary key of another table. models in the UML.
This relationship between tables
defines a foreign key and implies a The UML Data Model Profile
structural relationship or association
between the two tables. The Data Model Profile is a proposed
UML extension (and currently under
Summary review - Jan 2001) to support the
modelling of relational databases in
From the above overview we can see UML. It includes custom extensions
that the object model is based on for such things as tables, data base
discrete entities having both state schema, table keys, triggers and
(attributes/data) and behaviour, with constraints. While this is not a ratified
access to the encapsulated data extension, it still illustrates one
generally through the class public possible technique for modelling a
interface only. The relational model relational database in the UML.
exposes all data equally, with limited
support for associating behaviour with Tables
data elements through triggers, indexes
and constraints. Customer

You navigate to distinct information in


the object model by moving from
object to object using unique object
identifiers and established object
relationships (similar to a network
A table in the UML Data Profile is a
data model). In the relational model
class with the Table stereotype,
you find rows by joining and filtering
displayed as above with a table icon in
result sets using SQL using generalised
the top right corner.
search criteria.
Columns
Identity in the object model is either a
run-time reference or persistent unique
ID (termed an OID). In the relational Customer

world, primary keys define the PK OID: int


Name: VARCHAR2
uniqueness of a data set in the overall Address: VARCHAR2
data space.

In the object model we have a rich set


of relationships: inheritance,
aggregation, association, composition, Database columns are modelled as
dependency and others. In the attributes of the Table class. For
relational model we can really only example, the figure above shows some
specify a relationship using foreign attributes associated with the Customer
keys. table. In the example, an object id has
been defined as the primary key, as
well as two other columns, Name and Customer
Address. Note that the example above PK OID: int
defines the column type in terms of the Name: VARCHAR2
Address: VARCHAR2
native DBMS data types.
+ PK idx_customer00()
+ FK idx_customer02()
Behaviour + Index idx_customer01()
+ Trigger trg_customer00()
+ Unique unq_customer00()
So far we have only defines the logical + Proc spUpdateCustomer()
(static) structure of the table; in + Check chk_customer00()

addition we should describe the


behaviour associated with columns,
including indexes, keys, triggers, The example illustrates the following
procedures & etc. Behaviour is possible behaviour:
represented as stereotyped operations.
1. A primary key constraint (PK);
The figure below shows our table 2. A Foreign key constraint (FK);
above with a primary key constraint 3. An index constraint (Index);
and index, both defined as stereotyped 4. A trigger (Trigger);
operations: 5. A uniqueness constraint (Unique);
6. A stored procedure (Proc) - not
formally part of the data profile,
Customer
but an example of a possible
PK OID: int
Name: VARCHAR2 modelling technique; and a
Address: VARCHAR2 7. Validity check (Check).
+ PK idx_customer00()
+ index idx_customer01() Using the notation provided above, it is
possible to model complex data
structures and behaviour at the DBMS
level. In addition to this, the UML
Note that the PK flag on the column provides the notation to express
'OID' defines the logical primary key, relationships between logical entities.
while the stereotyped operation "PK
idx_customer00" defines the Relationships
constraints and behaviour associated
with the primary key implementation The UML data modelling profile
(that is, the behaviour of the primary defines a relationship as a dependency
key). of any kind between two tables. It is
represented as a stereotyped
Adding to our example, we may now association and includes a set of
define additional behaviour such as primary and foreign keys.
triggers, constraints and stored
procedures as in the example below: The data profile goes on to require that
a relationship always involves a parent
and child, the parent defining a
primary key and the child
implementing a foreign key based on
all or part of the parent primary key.

The relationship is termed 'identifying'


if the child foreign key includes all the
elements of the parent primary key and hardware (a 'node' in UML).
'non-identifying' if only some elements
of the primary key are included. To represent schema within the
database, use the schema stereotype
The relationship may include on a package. A table may be placed in
cardinality constraints and be modelled a schema to establish its scope and
with the relevant PK - FK pair named location within a database.
as association roles. Figure 4 illustrates
this kind of relationship modelling schema
using UML. User

Child
Grandchild
The Physical Model Grandparent
Parent
Person
UML also provides some mechanisms
for representing the overall physical
structure of the database, its contents
and deployed location. To represent a
physical database in UML, use a
stereotyped component as in the figure Mapping from the Class Model to
below: the Relational Model

Database Having described the two domains of


MainOraDB interest and the notation to be used, we
can now turn our attention as to how to
map or translate from one domain to
the other. The strategy and sequence
A component represents a discrete and presented below is meant to be
deployable entity within the model. In suggestive rather than proscriptive -
the physical model, a component may adapt the steps and procedures to your
be mapped on to a physical piece of personal requirements and

Parent Child
PK_PersonID FK_PersonID
2 identifying 0..n

An identifying relationship between child and parent, with role names


based on primary to foreign key relationship.

Figure 4 - UML relationship


environment. construct from the object-oriented
model that requires translating into the
1. Model Classes relational model. The relational space
is essentially flat, every entity being
Firstly we will assume we are complete in its self, while the object
engineering a new relational database model is often quite deep with a well-
schema from a class model we have developed class hierarchy.
created. This is obviously the easiest
direction as the models remain under The deep class model may have many
our control and we can optimise the layers of inherited attributes and
relational data model to the class behaviour, resulting in a final, fully
model. In the real world it may be that featured object at run-time. There are
you need to layer a class model on top three basic ways to handle the
of a legacy data model - a more translation of inheritance to a relational
difficult situation and one that presents model:
its own challenges. For the current
discussion will focus on the first 1. Each class hierarchy has a single
situation. At a minimum, your class corresponding table that contains
model should capture associations, all the inherited attributes for all
inheritance and aggregation between elements - this table is therefore the
elements. union of every class in the
hierarchy. For example, Person,
2. Identify persistent objects Parent, Child and Grandchild may
all form a single class hierarchy,
Having built our class model we need and elements from each will appear
to separate it into those elements that in the same relational table;
require persistence and those that do 2. Each class in the hierarchy has a
not. For example, if we have designed corresponding table of only the
our application using the Model-View- attributes accessible by that class
Controller design pattern, then only (including inherited attributes). For
classes in the model section would example, if Child is inherited from
require persistent state. Person only, then the table will
contain elements of Person and
3. Assume each persistent class maps Child only;
to one relational table 3. Each generation in the class
hierarchy has a table containing
A fairly big assumption, but one that only that generation's actual
works in most cases (leaving the attributes. For example, Child will
inheritance issue aside for the map to a single table with Child
moment). In the simplest model a class attributes only
from the logical model maps to a
relational table, either in whole or in There are cases to be made for each
part. The logical extension of this is approach, but I would suggest the
that a single object (or instance of a simplest, easiest to maintain and less
class) maps to a single table row. error prone is the third option. The first
option provides the best performance
4. Select an inheritance strategy at run-time and the second is a
compromise between the first and last.
Inheritance is perhaps the most
problematic relationship and logical
The first option flattens the hierarchy be replicated in many child tables.
and locates all attributes in one table - Even worse, the parental data in two or
convenient for updates and retrievals more child classes may be redundantly
of any class in the hierarchy, but stored in many tables; if a parent's
difficult to authenticate and maintain. attributes are modified, there is
Business rules associated with a row considerable effort in locating
are hard to implement, as each row dependent children and updating the
may be instantiated as any object in the affected rows.
hierarchy. The dependencies between
columns can become quite The third option more accurately
complicated. In addition, an update to reflects the object model, with each
any class in the hierarchy will class in the hierarchy mapped to its
potentially impact every other class in own independent table. Updates to
the hierarchy, as columns are added, parents or children are localised in the
deleted or modified from the table. correct space. Maintenance is also
relatively easier, as any modification
The second option is a compromise of an entity is restricted to a single
that provides better encapsulation and relational table also. The down side is
eliminates empty columns. However, a the need to re-construct the hierarchy
change to a parent class may need to at run-time to accurately re-create a

Parent tbl_Parent
A Parent class with unique ID (OID)
- OID: GUID and Name and Sex attributes maps to AddressOID: VARCHAR
# Name: String a relational table. Name: VARCHAR
# Sex: Gender PK OID: VARCHAR
Sex: VARCHAR
+ setName(String)
+ getName() : String
+ setSex(String) <<realises>>
+ getSex() : String

m_Address 0..n
The Address association from the logical model becomes
a foreign key relationship in the data model

1
Address
- OID: GUID
# City: String The Address class in the logical tbl_Address
# Phone: String model becomes a table in the City: VARCHAR
# State: String data model PK OID: VARCHAR
# Street: String Phone: VARCHAR
State: VARCHAR
+ getCity() : String Street: VARCHAR
+ getStreet() : String
+ setCity(String) <<realises>>
+ setStreet(String)

Figure 5 - Class to Table mapping


child class's state. A Child object may A complete run-time scenario can then
require a Person member variable to be loaded from storage reasonably
represent their model parentage. As efficiently.
both require loading, two database
calls are required to initialise one An important point about the OID
object. As the hierarchy deepens, with values above is that they have no
more generations, the number of inherent meaning beyond simple
database calls required to initialise or identity. They are only logical pointers
update a single object increases. and nothing more. In the relational
model, the situation is often quite
It is important to understand the issues different.
that arise when you map inheritance
onto a relational model, so you can Identity in the relational model is
decide which solution is right for you. normally implemented with a primary
key. A primary key is a set of columns
in a table that together uniquely
5. For each class add a unique identify a row. For example, name and
object identifier address may uniquely identify a
'Customer'. Where other entities, such
In both the relational and the object as a 'Salesperson', reference the
world, there is the need to uniquely 'Customer', they implement a foreign
identify an object or entity. key based on the 'Customer' primary
key.
In the object model, non-persistent
objects at run-time are typically The problem with this approach for our
identified by direct reference or by a purposes is the impact of having
pointer to the object. Once an object is business information (such as customer
created, we can refer to it by its run- name and address) embedded in the
time identity. However, if we write out identifier. Imagine three or four tables
an object to storage, the problem is all have foreign keys based on the
how to retrieve the exact same instance customer primary key, and a system
on demand. change requires the customer primary
key to change (for example to include
The most convenient method is to 'customer type'). The work required to
define an OID (object identifier) that is modify both the 'customer' table and
guaranteed to be unique in the the entities related by foreign key is
namespace of interest. This may be at quite large.
the class, package or system level,
depending on actual requirements. On the other hand, if an OID was
implemented as the primary key and
An example of a system level OID formed the foreign key for other tables,
might be a GUID (globally unique the scope of the change is limited to
identifier) created with Microsoft's the primary table and the impact of the
'guidgen' tool; eg. {A1A68E8E-CD92- change is therefore much less.
420b-BDA7-118F847B71EB}. A class
level OID might be implemented using Also, in practice, a primary key based
a simple numeric (eg. 32 bit counter). on business data may be subject to
change. For example a customer may
If an object holds references to other change address or name. In this case
objects, it may do so using their OID. the changes must be propagated
correctly to all other related entities, detailed below for handling
not to mention the difficulty of associations and aggregation.
changing information that is part of the
primary key. 7. Map associations to foreign keys

An OID always refers to the same More complex class attributes (ie.
entity - no matter what other those which represent other classes),
information changes. In the above are usually modelled as associations.
example, a customer may change name An association is a structural
or address and the related tables relationship between objects. For
require no change. example, a Person may live at an
Address. While this could be modelled
When mapping object models into as a Person has City, Street and Zip
relational tables, it is often more attributes, in both the object and the
convenient to implement absolute relational world we are inclined to
identity using OID's rather than structure this information as a separate
business related primary keys. The entity, an Address.
OID as primary and foreign key
approach will usually give better load In the object domain an address
and update times for objects and represents a unique physical object,
minimise maintenance effort. In possibly with a unique OID. In the
practice, a business related primary relational, an address may be a row in
key might be replaced with: an Address table, with other entities
having a foreign key to the Address
1. A uniqueness constraint or index primary key.
on the columns concerned;
2. Business rules embedded in the In both models then, there is the
class behaviour; tendency to move the address
3. A combination of 1 and 2. information into a separate entity. This
helps to avoid redundant data and
Again, the decision to use meaningful improves maintainability.
keys or OID's will depend on the exact
requirements of the system being So for each association in the class
developed. model, consider creating a foreign key
from the child to the parent table.
6. Map attributes to columns

In general we will map the simple data


attributes of a class to columns in the
relational table. For example a text and
number field may represent a person's
name and age respectively. This sort of
direct mapping should pose no
problem - simply select the appropriate
data type in the vendor's relational
model to host your class attribute.

For complex attributes (ie. attributes


that are other objects) use the approach
Salesperson Customer

PK OID: int PK OID: int


Name: VARCHAR2 PK_Salesperson FK_Salesperson Name: VARCHAR2
Department: VARCHAR2 Address: VARCHAR2
1 0..* Salesperson: int
+ PK PK_Salesperson()
+ PK PK_Customer()
+ FK FK_SalesPerson()

Relationships are based on the PK- FK pair. This example relates a Salesperson to a
Customer by the appropriate primary and foreign keys. The assumption is that a
customer may only be associated with one salesperson.

Figure 6 - Table relationships in UML

8. Map Aggregation and shares from a Share table, but each


Composition Share may be associated with zero or
one Person. If the Person ceases to
Aggregation and composition exist, the Shares become un-owned or
relationships are similar to the are passed to another Person. In the
association relationship and map to relational world, this could be
tables related by primary-foreign key implemented as each Share having an
pairs. There are however, some points 'owner' column which stored a Person
to bear in mind. ID (or OID) .

Ordinary aggregation (the weak form) The strong form of aggregation,


models relationships such as a Person however, has important integrity
resides at one or more Addresses. In constraints associated with it.
this instance, more than one person Composition, implies that an entity is
could live at the same address, and if composed of parts, and those parts
the Person ceased to exist, the have a dependent relationship to the
Addresses associated with them would whole. For example, a Person may
still exist. This example parallels the have identifying documents such as a
many-to-many relationship in Passport, Birth Certificate, Driver's
relational terminology, and is usually License & etc. A Person entity may be
implemented as a separate table composed of the set of such identifying
containing a mapping of primary keys documents. If the Person is deleted
from one table to the primary keys of from the system, then the identifying
another. documents must be deleted also, as
they are mapped to a unique
A second example of the weak form of individual.
aggregation is where an entity has use
or exclusive ownership of another. For If we ignore the OID issue for the
example, a Person entity aggregates a moment, a weak aggregation could be
set of shares. This implies a Person implemented using either an
may be associated with zero or more intermediate table (for the many-to-
many case) or with a foreign key in the uniqueness and data constraints, and
aggregated class/table (one-to-many relational integrity.
case). In the case of the many-to-many
relationship, if the parent is deleted, A non-persistent object model would
the entries in the intermediate table for typically implement all the behaviour
that entity must also be deleted also. In required in one or more programming
the case of the one-to-many languages (eg. Java or C++). Each
relationship, if the parent is deleted, class will be given its required
the foreign key entry (ie. 'owner') must behaviour and responsibilities in the
be cleared. form of public, protected and private
methods.
In the case of composition, the use of a
foreign key is mandatory, with the Relational databases from different
added constraint that on deletion of the vendors typically include some form of
parent the part must be deleted also. programmable SQL based scripting
Logically there is also the implication language to implement data
with composition that the primary key manipulation. The two common
of the part forms part of the primary examples are triggers and stored
key of the whole - for example, a procedures.
Person's primary key may composed of
their identifying documents ID's. In When we mix the object and relational
practice this would be cumbersome, models, the decision is usually whether
but the logical relationship holds true. to implement all the business logic in
the class model, or to move some to
the often more efficient triggers and
9. Define relationship roles stored procedures implemented in the
relational DBMS.
For each association type relationship,
each end of the relationship may be From a purely object-oriented point of
further specified with role information. view, the answer is obviously to avoid
Typically, you will include the Primary triggers and stored procedures and
Key constraint name and the Foreign place all behaviour in the classes. This
Key Constraint name. Figure 6 localises behaviour, provides for a
illustrates this concept. This logically cleaner design, simplifies maintenance
defines the relationship between the and provides good portability between
two classes. DBMS vendors.

In addition, you may specify additional In the real world, the bottom line may
constraints (eg. {Not NULL}) on the be scaling to 100's or 1000's of
role and cardinality constraints (eg. transactions per second, something
0..n). stored procedures and triggers are
purpose designed for.
10. Model behaviour
If purity of design, portability,
We now come to another difficult maintenance and flexibility are the
issue: whether to map some or all class main drivers, localise all behaviour in
behaviour to the functional capabilities the object methods.
provided by database vendors in the
form of triggers, stored procedures, If performance is an over-riding
concern, consider delegating some
behaviour to the more efficient DBMS In UML, the physical model describes
scripting languages. Be aware though how something will be deployed into
that the extra time taken to integrate the real world - the hardware platform,
the object model with the stored network connectivity, software,
procedures in a safe way, including operating system, dll's and other
issues with remote effects and components. You produce a physical
debugging, may cost more in model to complete the cycle - from an
development time than simply initial use case or domain model,
deploying to more capable hardware. through the class model and data
models and finally the deployment
As mentioned earlier, the UML Data model.
Profile provides the following
extensions (stereotyped operations) Typically for this model you will
with which you can model DBMS create one or more nodes that will host
behaviour: the database(s) and place DBMS
software components on them. If the
! Primary key constraint (PK); database is split over more than one
! Foreign key constraint (FK); DBMS instance, you can assign
! Index constraint (Index); packages (schema) of tables to a
! Trigger (Trigger); single DBMS component to indicate
! Uniqueness constraint (Unique); where the data will reside.
! Validity check (Check).
Conclusion

11. Produce a physical model This concludes this short article on


database modelling using the UML. As

schema
System
MainServer sys_aliases
sys_procs
sys_queries
sys_tables
sys_users
Database
MainOraDB

schema
User
Child
Grandchild
Grandparent
Parent
Person

A Node is a physical piece of hardw are (such as a Unix server) on w hich components are deployed. The
database component in this example is also mapped to tw o logical schema, each of w hich contains a
number of tables.

Figure 7 - The Physical Model


you can see, there are quite a few
issues to consider when mapping from
the object world to the relational. The
UML provides support for bridging the
gap between both domains, and
together with extensions such as the
UML Data Profile is a good language
for successfully integrating both
worlds.

References

Muller, Robert J., Database Design for


Smarties, Morgan Kaufman, 1999.

Rational Software, The UML and Data


Modelling, Rational Software

Ambler, Scott W., Mapping Objects to


Relational Databases, AmbySoft inc,
1999

About the Author

Geoffrey Sparks is the director of


Sparx Systems, an Australian company
that specialises in UML tools.

The Sparx Systems web site is at:


www.sparxsystems.com.au

Geoffrey may be contacted at


sparks@sparxsystems.com.au
A quick summary guide to data modelling in UML

1. Create a class model for your development domain

2. Identify persistent classes from the model

3. Assume each persistent class in the model will map to one relational table

4. Select a suitable inheritance strategy for each class hierarchy

5. For each class add a unique ID (OID) or select a suitable primary key

6. For each class map simple data types to table columns

7. For each class, map complex attributes (association, aggregation) to PK/ FK pairs. Take
special note of the strong and weak forms of aggregation.

8. For related classes, map PK, FK pairs naming the role ends according to selected key.

9. Label relationship roles with their appropriate cardinality and stereotype: <<identifying>>
or <<non-identifying>>

10. Add stereotyped operations for table behaviour (keys, indexes, uniqueness, checks, and
triggers)

11. Divide persistent classes into logical schema

12. Create a deployment model and link database components to physical nodes

You might also like