
Topic 2.4: The Evolution of Data Models


The quest for better data management has led to different models that attempt to
resolve the file system’s critical shortcomings. Because each data model evolved
from its predecessors, it is essential to examine the major data models in roughly
chronological order.

2.4.1 The Hierarchical Model


The first data model was developed by Rockwell and IBM in the 1960s. It is
known as the hierarchical model. The hierarchical database is a collection of
records that is logically organized to conform to the upside-down tree
(hierarchical) structure. Within the hierarchy, the top layer (the root) is perceived
as the parent of the segment directly beneath it. While this model represents
1:M relationships well, it does not represent M:N relationships.

Basic Structure
Given its manufacturing heritage, the hierarchical model’s basic logical
structure is best understood by examining a manufacturing process. For
example, let’s examine a somewhat simplified production process that creates a
filing cabinet:

1. A filing cabinet has many components: a frame, a set of drawers, and
sliding bars for those drawers.
2. A component may be composed of many smaller assemblies. For
example, each drawer has a handle with a latching mechanism, a set of
rollers that fits into the frame’s sliding bars, and a divider blade.
3. An assembly may contain many parts. For instance, each roller is
composed of a small wheel, an axle, and a brace.
4. The production process is based on data relationships that remain fixed
over time. Whether a given filing cabinet model is produced today or
tomorrow, the same parts are put together in the same ways to produce
the same assemblies that are combined to produce the same components
that are assembled in the same way to create the filing cabinet.

Tracking the parts, the assemblies, and the components we have just described
is facilitated by understanding the logical process that is represented by the
upside-down “tree,” known as a hierarchical structure, shown in Figure 2.1. We
have labeled the structure’s components to help you understand the basic
hierarchical model’s vocabulary.

As you examine Figure 2.1, note that the user perceives the hierarchical
database as a hierarchy of segments. A segment is the equivalent of a file
system’s record type. In other words, the hierarchical database is a collection
of record segment structures that is logically organized to conform to the
upside-down tree (hierarchical) structure shown in Figure 2.1. Within the
hierarchy, the top layer (the root) is perceived as the parent of the segment
directly beneath it.

For example, in Figure 2.1, the root segment is the parent of the level 1
segments, which, in turn, are the parents of the level 2 segments, and so on.
Conversely, the segments below a given segment are its children. In
short:

° Each parent can have many children
° Each child has only one parent

In this hierarchical structure, it is easy to trace both the database’s components
and the 1:M relationships among them.
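
The two rules above can be sketched in a few lines of Python. This is only an illustration of the logical structure, not of how a hierarchical DBMS stores data; the `Segment` class and the segment names (taken loosely from the filing cabinet example) are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Segment:
    """One segment (node) in a hierarchical structure. Hypothetical sketch."""
    name: str
    # A single parent reference: each child has only one parent.
    parent: Optional["Segment"] = field(default=None, repr=False, compare=False)
    # A list of children: each parent can have many children.
    children: list["Segment"] = field(default_factory=list)

    def add_child(self, name: str) -> "Segment":
        # A child is always created under exactly one parent.
        child = Segment(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> list[str]:
        # Follow the single chain of parents up to the root.
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))

    def walk(self, level: int = 0) -> Iterator[tuple[int, str]]:
        # Preorder traversal: a parent is visited before its children.
        yield level, self.name
        for child in self.children:
            yield from child.walk(level + 1)


cabinet = Segment("FILING_CABINET")       # root segment
frame = cabinet.add_child("FRAME")        # level 1
drawer = cabinet.add_child("DRAWER")      # level 1
bars = cabinet.add_child("SLIDING_BARS")  # level 1
roller = drawer.add_child("ROLLER")       # level 2
wheel = roller.add_child("WHEEL")         # level 3

for level, name in cabinet.walk():
    print("  " * level + name)
print(" > ".join(wheel.path()))
```

Because every segment stores exactly one parent reference, the path from any segment back to the root is unique, which is why 1:M relationships are easy to trace and M:N relationships do not fit.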

Advantages
– Conceptual simplicity
– Database security
– Data independence (the data characteristics of the database structure are
not defined in the programs that access the database; instead, the
database structure and its data characteristics are defined in the data
dictionary component of the DBMS, so the programs become independent
of the data's characteristics)
– Database integrity (because data duplication or data redundancy is
minimized as a result of relating the segments or records)
– Efficiency (the hierarchical DBMS file storage organization and access
methods are based on the hierarchical database structure, which is
much faster than the file storage organization and access methods used in
the old file system)

Disadvantages
– Complex implementation
– Difficult to manage
– Lacks structural independence (because the programmer still needs to
write instructions on how and where to find the data stored on the
computer disk, which depends on the database structure)
– Complex applications programming and use
– Implementation limitations (the hierarchical data model does not
support segments that have multiple parents, so it cannot directly
model M:N relationships between two or more entities)
– Lack of standards among the implementation software (DBMS) developed
by various software vendors

2.4.2 The Network Model


The network model was created to represent complex data more effectively than
the hierarchical model could, to improve database performance, and to impose a
database standard.

Basic Structure
In many respects the network model resembles the hierarchical model. For
example, as in the hierarchical model, the user perceives the network database
as a collection of records in 1:M relationships. However, unlike the hierarchical
model, the network model allows a record to have more than one parent or
multiple parents. This feature allows the network model to handle complex (M:N)
relationships between two or more entities, such as the commonly encountered M:N
relationship depicted in Figure 2.2.

In Figure 2.2, the M:N relationship between ORDER and PART is resolved
by the introduction of the ORDER_LINE bridge entity.

In network database terminology, a relationship is called a set. Each set is
composed of at least two record types: an owner record and a member record.
The difference between the hierarchical model and the network model is that the
latter might include a condition in which a record can appear (as a member) in
more than one set. In other words, a member may have several owners. A set
represents a 1:M relationship between the owner and the member. An example
of such a relationship is depicted in Figure 2.3.
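
The owner/member idea can be sketched as follows. This is a toy in-memory structure, assuming hypothetical set names (`ORDER_HAS_LINE`, `PART_ON_LINE`) and record labels; it is not how a real network DBMS is implemented. It shows an ORDER_LINE record appearing as a member in two sets, and therefore having two owners.

```python
from collections import defaultdict


class NetworkDB:
    """Toy in-memory store of sets; a sketch, not a real network DBMS."""

    def __init__(self) -> None:
        # set name -> owner record -> list of member records
        self.sets: dict[str, dict[str, list[str]]] = defaultdict(
            lambda: defaultdict(list)
        )

    def connect(self, set_name: str, owner: str, member: str) -> None:
        """Add a member record to the owner's occurrence of a set."""
        self.sets[set_name][owner].append(member)

    def members(self, set_name: str, owner: str) -> list[str]:
        """1:M navigation: from one owner down to its members."""
        return list(self.sets[set_name][owner])

    def owners_of(self, member: str) -> dict[str, str]:
        """Return the member's owner in every set it belongs to."""
        found: dict[str, str] = {}
        for set_name, by_owner in self.sets.items():
            for owner, members in by_owner.items():
                if member in members:
                    found[set_name] = owner
        return found


db = NetworkDB()
# ORDER owns ORDER_LINE records in one set; PART owns them in another.
db.connect("ORDER_HAS_LINE", "ORDER 1001", "LINE 1001-1")
db.connect("ORDER_HAS_LINE", "ORDER 1001", "LINE 1001-2")
db.connect("PART_ON_LINE", "PART WHEEL", "LINE 1001-1")
db.connect("PART_ON_LINE", "PART AXLE", "LINE 1001-2")
db.connect("ORDER_HAS_LINE", "ORDER 1002", "LINE 1002-1")
db.connect("PART_ON_LINE", "PART WHEEL", "LINE 1002-1")

print(db.members("ORDER_HAS_LINE", "ORDER 1001"))
print(db.members("PART_ON_LINE", "PART WHEEL"))
print(db.owners_of("LINE 1001-1"))
```

Each set on its own is still a 1:M relationship; the M:N relationship between ORDER and PART emerges only because the same member record takes part in both sets.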

Advantages
– Conceptual simplicity
– Handles more relationship types
– Data access flexibility
– Promotes database integrity
– Data independence
– Conformance to standards

Disadvantages
– System complexity
– Lack of structural independence (because the programmer still needs to
write instructions on how and where to find the data stored on the
computer disk, which depends on the database structure)

2.4.3 The Relational Model

The basic building block of the relational model is the table, which is a matrix of
rows and columns. Tables are related to each other via a common entity
characteristic or attribute (primary key in the parent table is a foreign key in the
child table). The parent table is the table which maps to the entity of the “1” side
of the relationship and the child table maps to the entity of the “many” side of the
relationship between the two tables.

All three relationship types (1:1, 1:M, and M:N) are easily represented in this
model. One of the disadvantages of the relational model is that it requires
substantial system overhead to run the relational DBMS (RDBMS). However, with
the advanced computer hardware and software now available, the processing
requirements of relational databases are no longer a significant overhead
problem.

Basic Structure
The relational data model is implemented through a very sophisticated relational
database management system (RDBMS). The RDBMS performs the same
basic functions provided by the hierarchical and network database systems, plus
a host of other functions that make the relational data model easier to understand
and to implement.

The most important advantage of the RDBMS is its ability to let the user/designer
operate in a human logical environment. The RDBMS manages all of the
complex physical details. Thus, the relational database is perceived by the user
to be a collection of tables in which data are stored.

Each table is a matrix consisting of a series of row/column intersections. Tables,
also called relations, are related to each other by sharing a common entity
characteristic/attribute. For example, the CUSTOMER table in Figure 2.4 might
contain a sales agent’s number, which provides a common link to the AGENT
table.

The common link between the CUSTOMER and AGENT tables thus enables us
to match the customer to his/her sales agent, even though the customer data are
stored in another table. Although the tables are completely independent of one
another, we can easily connect the data between tables. The relational model
thus provides a minimum level of controlled redundancy to eliminate most of the
redundancies found in old file systems.

The relationship type (1:1, 1:M, or M:N) is often shown in a relational schema, an
example of which is depicted in Figure 2.5. A relational schema is a visual
representation of the relational database’s entities, the attributes within those
entities, and the relationship between those entities.

As you examine Figure 2.5, note that the relational schema shows the
connecting fields (in this case, AGENT_CODE) and the relationship type, 1:M.
MS Access, the DBMS software used to generate Figure 2.5, employs the ∞ symbol
to indicate the “many” side. In this example, the CUSTOMER represents the
“many” side because an AGENT can serve many CUSTOMERS. The AGENT
represents the “1” side, because each CUSTOMER is served by only one
AGENT.
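
The 1:M link between AGENT and CUSTOMER can be tried out with Python's built-in `sqlite3` module. The sketch below is an assumption-laden illustration: only the table names and AGENT_CODE come from the text, while the other column names (CUS_CODE, CUS_NAME, AGENT_NAME) and all sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# AGENT is the parent ("1" side); CUSTOMER is the child ("many" side).
conn.executescript("""
CREATE TABLE AGENT (
    AGENT_CODE  INTEGER PRIMARY KEY,
    AGENT_NAME  TEXT NOT NULL
);
CREATE TABLE CUSTOMER (
    CUS_CODE    INTEGER PRIMARY KEY,
    CUS_NAME    TEXT NOT NULL,
    AGENT_CODE  INTEGER NOT NULL REFERENCES AGENT(AGENT_CODE)
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO AGENT VALUES (?, ?)",
                 [(501, "Alby"), (502, "Hahn")])
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?)",
                 [(10010, "Ramas", 502), (10011, "Dunne", 501), (10012, "Smith", 502)])

# Match each customer to his or her agent through the common AGENT_CODE value.
rows = conn.execute("""
    SELECT C.CUS_NAME, A.AGENT_NAME
    FROM CUSTOMER AS C
    JOIN AGENT AS A ON C.AGENT_CODE = A.AGENT_CODE
    ORDER BY C.CUS_CODE
""").fetchall()
print(rows)

# An ad hoc query: how many customers does each agent serve?
counts = conn.execute("""
    SELECT A.AGENT_NAME, COUNT(*)
    FROM AGENT AS A
    JOIN CUSTOMER AS C ON C.AGENT_CODE = A.AGENT_CODE
    GROUP BY A.AGENT_CODE
    ORDER BY A.AGENT_CODE
""").fetchall()
print(counts)

# A customer that points to a nonexistent agent is rejected by the foreign key.
try:
    conn.execute("INSERT INTO CUSTOMER VALUES (10013, 'Orphan', 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Note that the queries say what data are wanted, not where the data are stored on disk; that is the structural independence listed among the advantages below.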

Advantages
– Structural independence
– Improved conceptual simplicity
– Easier database design, implementation, management, and use
– Ad hoc query capability
– Powerful database management system

Disadvantages
– Substantial hardware and system software overhead (this is not an issue
any more because of the currently available hardware and software)
– Can facilitate poor design and implementation (less experienced
database designers may develop poor database designs)
– May promote “islands of information” problems (because various users in
different departments will be developing their own database applications)

2.4.4 The Entity Relationship Model


An alternate model is the Entity Relationship (ER) model. In this model, entities
are drawn by using diagrams with line connectors that depict their relationships.
This model has the advantage of visually depicting relationships. A disadvantage
is that there is no corresponding data manipulation language (DML).

The ER model or ERM is a widely accepted and adapted graphical tool for data
modeling. Peter Chen first introduced the ER data model in 1976 in his landmark
paper “The Entity Relationship Model: Toward a Unified View of Data”. The ERM
yielded a graphical representation that popularized the use of the ER diagrams
as a tool for conceptual-level data modeling. Better yet, the ER model
complemented the relational model concepts, thus providing the foundation for a
tightly structured database design environment to ensure the proper design of
relational databases.

Basic Structure

ER models are normally represented in an entity relationship diagram (ERD),
which uses graphical representations to model the database requirements.

° An entity is represented in the ERD model by a rectangle, also known as
an entity box. The name of the entity, a noun, is written in the center of the
rectangle. The name of the entity is generally written in capital letters and
is written in singular form: PAINTER rather than PAINTERS. Normally,
when applying the ERD to the relational model, an entity is mapped to a
relational table. Each row in the relational table is known as an entity
instance or entity occurrence in the ER model. Each entity is described
by a set of attributes that capture particular characteristics of the entity.
For example, the entity EMPLOYEE will have attributes such as a Social
Security number, a last name, and a first name.

° Relationships describe associations among data entities. Most
relationships describe associations between two entities. ERD modelers
use the term connectivity to label the types of relationships (1:M, M:N,
1:1). The entity connectivity is written next to each entity box.
Relationships are represented by a diamond connected to the related
entities through a relationship line. The name of the relationship, an active
or passive verb, is written inside the diamond. For example, each of the
company’s DEPARTMENTs has many EMPLOYEEs, and a PAINTER
paints many PAINTINGs.

Figure 2.6 shows some basic ERD models that illustrate these relationships and
their connectivity types.

The ERD shown in Figure 2.6 is based on the so-called Chen model. Although
the entities and relationships are shown in a horizontal format in Figure 2.6, they
also may be oriented vertically. The entity location and the order in which the
entities are presented are immaterial – just remember to always read a 1:M
relationship from the “1” side to the “M” side.
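
The reading rule can be made concrete with a small sketch. The `Relationship` class below is hypothetical and covers only binary 1:1 and 1:M relationships; it stores the “1” side first so that the relationship is always read in the same direction, regardless of how the boxes are laid out on the page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A binary ER relationship, stored with the "1" side first. Hypothetical sketch."""
    one_side: str      # entity on the "1" side
    verb: str          # relationship name (active or passive verb)
    other_side: str    # entity on the other side
    connectivity: str  # "1:1" or "1:M"

    def read(self) -> str:
        # Always read from the "1" side toward the other side.
        if self.connectivity == "1:M":
            return f"Each {self.one_side} {self.verb} many {self.other_side}s"
        return f"Each {self.one_side} {self.verb} one {self.other_side}"


paints = Relationship("PAINTER", "paints", "PAINTING", "1:M")
employs = Relationship("DEPARTMENT", "has", "EMPLOYEE", "1:M")

print(paints.read())
print(employs.read())
```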

A more current version of the ERD is the Crow’s Foot Model shown in Figure
2.7. The label “Crow’s Foot” is derived from the three-pronged symbol used to
represent the “many” side of the relationship. The Crow’s Foot model places the
relationship name in the relationship line.

As you examine the basic Crow’s Foot ERD in Figure 2.7, note that the
connectivity is represented by symbols. For example, the “1” is represented by a
short line segment and the “M” is represented by the three-pronged “crow’s foot.”
Like the Chen ERD, the entities and the relationships may be represented
horizontally or vertically. And again like the Chen ERD, the location and the order
in which the entities are presented in a Crow’s Foot ERD are immaterial.

Advantages
– Exceptional conceptual simplicity
– Visual representation
– Effective communication tool
– Integrated with the relational data model

Disadvantages
– Limited constraint representation
– Limited relationship representation (relationships between attributes cannot
be modeled)
– No data manipulation language
– Loss of information content (limited space is available to draw a large number
of entities in Chen's original ERD notation)

2.4.5 The Object Oriented Model

In the Object Oriented model entities are represented as objects that contain
both data and operations. An advantage of this model is the addition of semantic
content. A disadvantage is the steeper learning curve.

The semantic data model (SDM) modeled both data and their relationships in a
single structure known as an object. Because its basic modeling structure is an
object, the SDM is said to be an object oriented data model (OODM). In turn,
the OODM becomes the basis for the object oriented database management
system (OODBMS).

An OODM reflects a very different way to define and use entities. Like the
relational model’s entity, an object is described by its factual content. But, quite
unlike an entity, an object includes information about relationships between the
facts within the object, as well as information about relationships with other
objects. Therefore, the facts within the objects are given greater meaning.

Basic Structure

The object oriented data model is based on the following components:

° An object is an abstraction of a real-world thing. An object class is a
representation of a set of objects with shared attributes and behavior. For
example, an object class student is a model of all students in an
educational institution. An object class may be considered equivalent to
an ER model’s entity. More precisely, an object represents only one
individual occurrence of an entity.
° Attributes describe the properties of an object. For example, a PERSON
object class includes the attributes ID, Name, Social Security Number and
Date of Birth.
° Objects that share similar characteristics are grouped in classes. A class
is a collection of similar objects with shared structure (attributes) and
behavior (methods). In a general sense, a class resembles the ER
model’s entity set. However, a class is different from an entity in that it
contains a set of procedures known as methods. A class’s method
represents a real-world action such as finding a selected PERSON’s
name, changing a PERSON’s name, or printing a PERSON’s address. In
other words, methods are the equivalent of procedures in traditional
programming languages. In object oriented terms methods define an
object’s behavior.
° Classes are organized in a class hierarchy. The class hierarchy resembles
an upside-down tree in which each class typically has only one parent. For
example, the CUSTOMER and EMPLOYEE classes share a parent
PERSON class. However, it is possible for a child class to have
multiple parents.
° Inheritance is the ability of an object within the class hierarchy to inherit
the attributes and methods of the classes above it. For example, we can
create two classes, CUSTOMER and EMPLOYEE, as subclasses from
the class PERSON. In this case CUSTOMER and EMPLOYEE will inherit
all attributes and methods from PERSON.
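
The components listed above map directly onto classes in an object oriented programming language. The following Python sketch is an illustration only, not an OODBMS; the attribute names `credit_limit` and `job_title` and the sample values are assumptions added so that the subclasses have something of their own.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    """Parent class: shared attributes and methods."""
    person_id: int
    name: str
    date_of_birth: date

    def change_name(self, new_name: str) -> None:
        # A method represents a real-world action on the object.
        self.name = new_name

    def describe(self) -> str:
        return f"{type(self).__name__} {self.person_id}: {self.name}"


@dataclass
class Customer(Person):
    """Subclass: inherits every Person attribute and method."""
    credit_limit: float = 0.0  # hypothetical attribute


@dataclass
class Employee(Person):
    """Subclass: inherits every Person attribute and method."""
    job_title: str = ""  # hypothetical attribute


customer = Customer(1, "Ramas", date(1980, 1, 1), credit_limit=500.0)
employee = Employee(2, "Hahn", date(1975, 6, 30), job_title="Sales agent")

# change_name and describe are defined only on Person but are usable here.
customer.change_name("Ramas-Lee")
print(customer.describe())
print(employee.describe())
```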

To illustrate the difference between the OO model and the ER model, let’s
examine their graphic representations in the simple invoicing problem shown in
Figure 2.8.

As you examine Figure 2.8, note that:

° The OO data model represents an object class as a box; all of the object’s
attributes and relationships to other objects are included within the object
class box. The object class representation of the INVOICE includes all
related objects within the same object class box.
° The ER model uses three separate entities and two relationships to
represent an invoice transaction. Because customers can purchase more than
one item at a time, each invoice references one or more lines, one item
per line. And, because invoices are generated by customers, the data-
modeling requirements include a customer entity and a relationship
between the customer and the invoice.
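
The OO side of this comparison can be sketched as a single object that carries its related objects and its behavior with it. The class and attribute names below (`Invoice`, `Line`, `unit_price`, and so on) and the sample values are assumptions for illustration; Figure 2.8 may label them differently.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    name: str


@dataclass
class Line:
    item: str
    quantity: int
    unit_price: float


@dataclass
class Invoice:
    """One object holds the customer, the lines, and the behavior."""
    number: int
    customer: Customer                                # relationship to another object
    lines: list[Line] = field(default_factory=list)   # one or more lines, one item per line

    def total(self) -> float:
        # Behavior lives with the data it operates on.
        return sum(line.quantity * line.unit_price for line in self.lines)


invoice = Invoice(
    number=1001,
    customer=Customer("Ramas"),
    lines=[Line("Roller", 4, 2.5), Line("Handle", 1, 5.0)],
)
print(invoice.customer.name, invoice.total())
```

In the ER (and relational) view, the same information would be spread over three separate entities connected by two relationships, and a total would be computed by a query rather than by a method of the invoice.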

Advantages
– Adds semantic content
– Visual presentation includes semantic content
– Database integrity
– Both structural and data independence

Disadvantages
– Slow pace of OODM standards development
– Complex navigational data access
– Steep learning curve
– High system overhead slows transactions
– Lack of market penetration

2.4.6 Other Models


Another semantic data model was developed in response to the increasing
complexity of applications: the extended relational data model (ERDM). The
ERDM, championed by many relational database researchers, constitutes the
relational model’s response to the OODM challenge. This model includes many
of the OO model’s best features within an inherently simpler relational database
structural environment. That’s why a DBMS based on the ERDM is often
described as an object/relational database management system (O/RDBMS).

The OODM and ERDM are similar in the sense that each attempts to address the
demand for more semantic information to be incorporated into the model.
However, the OODM and the ERDM differ substantially both in underlying
philosophy and in the nature of the problem to be addressed.

Although the ERDM includes a strong semantic component, it is primarily based
on the relational data model’s concepts. In contrast, the OODM is wholly based
on the OO semantic data model concepts. The ERDM is primarily geared to
business applications, while the OODM tends to focus on very specialized
engineering and scientific applications. In the database arena, the most likely
scenario appears to be an ever-increasing merging of OO and relational data
model concepts and procedures.

2.4.7 Data Models: Summary


The evolution of database management systems has always been driven by the
search for new ways of modeling increasingly complex real-world data. A
summary of the most commonly recognized data models is shown in Figure 2.9.

Concept Check
What are the major types of data models?

How does the hierarchical data model address the problem of data redundancy?

What are the features of relational data models?
