Topic 2.4: The Evolution of Data Models
2.4.1 The Hierarchical Model
Basic Structure
Given its manufacturing heritage, the hierarchical model’s basic logical
structure is best understood by examining a manufacturing process. For
example, let’s examine a somewhat simplified production process that creates a
filing cabinet:
Tracking the parts, the assemblies, and the components we have just described
is facilitated by understanding the logical process that is represented by the
upside-down “tree,” known as a hierarchical structure, shown in Figure 2.1. We
have labeled the structure’s components to help you understand the basic
hierarchical model’s vocabulary.
As you examine Figure 2.1, note that the user perceives the hierarchical
database as a hierarchy of segments. A segment is the equivalent of a file
system’s record type. In other words, the hierarchical database is a collection
of record segment structures that is logically organized to conform to the
upside-down tree (hierarchical) structure shown in Figure 2.1. Within the
hierarchy, the top layer (the root) is perceived as the parent of the segment
directly beneath it.
For example, in Figure 2.1, the root segment is the parent of the level 1
segments, which, in turn, are the parents of the level 2 segments, and so on.
Conversely, the segments below other segments are the children of the segment
above them. In short, each parent can have many children, but each child has
only one parent.
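To make the parent/child vocabulary concrete, here is a minimal Python sketch of a segment tree. The `Segment` class and the segment names (`FILING_CABINET`, `DRAWER_ASSEMBLY`, `FRAME`, `HANDLE`) are illustrative assumptions, not the contents of Figure 2.1 or the interface of any hierarchical DBMS.

```python
from typing import List, Optional


class Segment:
    """One segment (record type) in a hierarchical structure."""

    def __init__(self, name: str, parent: Optional["Segment"] = None):
        self.name = name
        self.parent = parent                  # a segment has at most one parent
        self.children: List["Segment"] = []   # but may have many children

    def add_child(self, name: str) -> "Segment":
        """Create a child segment whose single parent is this segment."""
        child = Segment(name, parent=self)
        self.children.append(child)
        return child

    def level(self) -> int:
        """The root is level 0; each step down the tree adds one."""
        return 0 if self.parent is None else self.parent.level() + 1

    def path(self) -> str:
        """Names of the segments from the root down to this segment."""
        if self.parent is None:
            return self.name
        return f"{self.parent.path()} > {self.name}"


# Hypothetical segment names, loosely based on the filing cabinet example.
root = Segment("FILING_CABINET")
drawer = root.add_child("DRAWER_ASSEMBLY")
frame = root.add_child("FRAME")
handle = drawer.add_child("HANDLE")

print(handle.level())  # 2
print(handle.path())   # FILING_CABINET > DRAWER_ASSEMBLY > HANDLE
```

Because each segment stores a single `parent` reference, there is exactly one path from the root to any segment, which is the property that makes the structure a tree.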
Advantages
– Conceptual simplicity
– Database security
– Data independence (because the data characteristics of the database
structure are defined in the data dictionary component of the DBMS rather
than in the programs accessing the database, those programs become
independent of the data characteristics)
– Database integrity (because data duplication or data redundancy is
minimized as a result of relating the segments or records)
– Efficiency (the hierarchical DBMS file storage organization and access
methods are based on the hierarchical database structure, which makes
them much faster than the file storage organization and access methods
used in the older file system)
Disadvantages
– Complex implementation
– Difficult to manage
– Lacks structural independence (because the programmer still needs to
write instructions on how and where to find the data stored on the
computer disk, which depends on the database structure)
– Complex applications programming and use
– Implementation limitations (because the hierarchical data model does not
support entities or record segments that have multiple parents, as required
to model M:M relationships between two or more entities)
– Lack of standards among the implementation software (DBMS) developed
by various software vendors
2.4.2 The Network Model
Basic Structure
In many respects the network model resembles the hierarchical model. For
example, as in the hierarchical model, the user perceives the network database
as a collection of records in 1:M relationships. However, unlike the hierarchical
model, the network model allows a record to have more than one parent. This
feature allows the network model to handle complex (M:M) relationships between
two or more entities, such as the commonly encountered M:M relationship
depicted in Figure 2.2.
In Figure 2.2, the M:M relationship between ORDER and PART is resolved
by the introduction of the ORDER_LINE bridge entity.
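The following Python sketch illustrates, under simplifying assumptions, how a bridge record with two parents resolves an M:M relationship into two 1:M relationships. The `Record` class, the helper functions, and the sample keys (`1001`, `"BOLT"`, and so on) are hypothetical and are not taken from Figure 2.2 or from any network DBMS.

```python
class Record:
    """A record that may be the child of several parent records."""

    def __init__(self, record_type, key):
        self.record_type = record_type
        self.key = key
        self.parents = []   # a list: more than one parent is allowed
        self.children = []

    def link_child(self, child):
        """Register a 1:M link from this record to `child`."""
        self.children.append(child)
        child.parents.append(self)


# Hypothetical sample data: two orders and two parts.
order_1 = Record("ORDER", 1001)
order_2 = Record("ORDER", 1002)
bolt = Record("PART", "BOLT")
nut = Record("PART", "NUT")


def add_order_line(order, part, line_no):
    """Create an ORDER_LINE whose two parents are one ORDER and one PART."""
    line = Record("ORDER_LINE", (order.key, line_no))
    order.link_child(line)
    part.link_child(line)
    return line


add_order_line(order_1, bolt, 1)
add_order_line(order_1, nut, 2)
add_order_line(order_2, bolt, 1)


def parts_on(order):
    """Navigate ORDER -> ORDER_LINE -> PART."""
    return [p.key for line in order.children
            for p in line.parents if p.record_type == "PART"]


def orders_for(part):
    """Navigate PART -> ORDER_LINE -> ORDER."""
    return [o.key for line in part.children
            for o in line.parents if o.record_type == "ORDER"]


print(parts_on(order_1))  # ['BOLT', 'NUT']
print(orders_for(bolt))   # [1001, 1002]
```

An order can reference many parts and a part can appear on many orders, yet every individual link in the structure is still a 1:M relationship.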
Advantages
– Conceptual simplicity
– Handles more relationship types
– Data access flexibility
– Promotes database integrity
– Data independence
– Conformance to standards
Disadvantages
– System complexity
– Lack of structural independence (because the programmer still needs to
write instructions on how and where to find the data stored on the
computer disk, which depends on the database structure)
2.4.3 The Relational Model
The basic building block of the relational model is the table, which is a matrix of
rows and columns. Tables are related to each other via a common entity
characteristic or attribute (primary key in the parent table is a foreign key in the
child table). The parent table is the table which maps to the entity of the “1” side
of the relationship and the child table maps to the entity of the “many” side of the
relationship between the two tables.
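As a rough illustration of how a shared attribute relates two tables, the sketch below models each table as a list of rows in plain Python. The `VENDOR` and `PRODUCT` tables, their column names, and their data are assumptions made up for this example.

```python
# Parent table (the "1" side): VENDOR_CODE is the primary key.
vendor = [
    {"VENDOR_CODE": 10, "VENDOR_NAME": "Acme Supply"},
    {"VENDOR_CODE": 20, "VENDOR_NAME": "Brill Tools"},
]

# Child table (the "many" side): VENDOR_CODE reappears as a foreign key.
product = [
    {"PROD_CODE": "P1", "PROD_NAME": "Hammer", "VENDOR_CODE": 20},
    {"PROD_CODE": "P2", "PROD_NAME": "Drawer slide", "VENDOR_CODE": 10},
    {"PROD_CODE": "P3", "PROD_NAME": "Wrench", "VENDOR_CODE": 20},
]

# Index the parent rows by primary key, then follow each foreign key.
vendor_by_code = {row["VENDOR_CODE"]: row for row in vendor}

joined = [
    (p["PROD_NAME"], vendor_by_code[p["VENDOR_CODE"]]["VENDOR_NAME"])
    for p in product
]

for prod_name, vendor_name in joined:
    print(f"{prod_name}: {vendor_name}")
```

The tables are connected only by matching values in the common attribute; no physical pointer from one row to another is stored.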
All three relationship types (1:1, 1:M, and M:M) are easily represented in this
model. One of the disadvantages of the relational model is that the Relational
DBMS (RDBMS) requires substantial system overhead to run. However, with
currently available computer hardware and software, the processing
requirements of relational databases are no longer a significant overhead
problem.
Basic Structure
The relational data model is implemented through a very sophisticated relational
database management system (RDBMS). The RDBMS performs the same
basic functions provided by the hierarchical and network database systems, plus
a host of other functions that make the relational data model easier to understand
and to implement.
The most important advantage of the RDBMS is its ability to let the user/designer
operate in a human logical environment. The RDBMS manages all of the
complex physical details. Thus, the relational database is perceived by the user
to be a collection of tables in which data are stored.
As you examine Figure 2.5, note that the relational schema shows the
connecting fields (in this case, AGENT_CODE) and the relationship type, 1:M.
The MS Access DBMS software used to generate Figure 2.5 employs the ∞ symbol
to indicate the “many” side. In this example, the CUSTOMER represents the
“many” side because an AGENT can serve many CUSTOMERS. The AGENT
represents the “1” side, because each CUSTOMER is served by only one
AGENT.
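The AGENT/CUSTOMER relationship can be sketched with Python's built-in `sqlite3` module. SQLite stands in here for the MS Access DBMS mentioned above, and every column other than AGENT_CODE, along with all of the sample rows, is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE AGENT (
    AGENT_CODE INTEGER PRIMARY KEY,
    AGENT_NAME TEXT NOT NULL
);
CREATE TABLE CUSTOMER (
    CUS_CODE   INTEGER PRIMARY KEY,
    CUS_NAME   TEXT NOT NULL,
    AGENT_CODE INTEGER NOT NULL REFERENCES AGENT (AGENT_CODE)
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO AGENT VALUES (?, ?)",
                 [(501, "Alby"), (502, "Hahn")])
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?)",
                 [(10010, "Ramas", 502),
                  (10011, "Dunne", 501),
                  (10012, "Smith", 502)])

# Ad hoc query: how many customers does each agent serve?
rows = conn.execute("""
    SELECT A.AGENT_NAME, COUNT(C.CUS_CODE)
    FROM AGENT A LEFT JOIN CUSTOMER C ON C.AGENT_CODE = A.AGENT_CODE
    GROUP BY A.AGENT_CODE
    ORDER BY A.AGENT_CODE
""").fetchall()
print(rows)  # [('Alby', 1), ('Hahn', 2)]

# A customer that points at a nonexistent agent is rejected.
try:
    conn.execute("INSERT INTO CUSTOMER VALUES (10013, 'Olowski', 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

Each CUSTOMER row carries exactly one AGENT_CODE, so a customer has one agent, while the same AGENT_CODE value may appear in many CUSTOMER rows.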
Advantages
– Structural independence
– Improved conceptual simplicity
– Easier database design, implementation, management, and use
– Ad hoc query capability
– Powerful database management system
Disadvantages
– Substantial hardware and system software overhead (no longer a major issue
given currently available hardware and software)
– Can facilitate poor design and implementation (less experienced
designers may produce poor database designs)
– May promote “islands of information” problems (because various users in
different departments will be developing their own database applications)
2.4.4 The Entity Relationship Model
The ER model, or ERM, is a widely accepted and adopted graphical tool for data
modeling. Peter Chen first introduced the ER data model in 1976 in his landmark
paper “The Entity-Relationship Model: Toward a Unified View of Data.” The ERM
yielded a graphical representation that popularized the use of the ER diagrams
as a tool for conceptual-level data modeling. Better yet, the ER model
complemented the relational model concepts, thus providing the foundation for a
tightly structured database design environment to ensure the proper design of
relational databases.
Basic Structure
Figure 2.6 shows some basic ERD models that illustrate these relationships and
their connectivity types.
The ERD shown in Figure 2.6 is based on the so-called Chen model. Although
the entities and relationships are shown in a horizontal format in Figure 2.6, they
also may be oriented vertically. The entity location and the order in which the
entities are presented are immaterial – just remember to always read a 1:M
relationship from the “1” side to the “M” side.
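The reading rule can be expressed as a small Python sketch. The `OneToMany` class is a hypothetical helper, and the AGENT/CUSTOMER pair is borrowed from the relational example rather than from Figure 2.6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OneToMany:
    """A 1:M relationship between two entities in an ERD."""
    one_side: str    # entity on the "1" side
    verb: str        # relationship name
    many_side: str   # entity on the "M" side

    def read(self) -> str:
        """Always read from the "1" side to the "M" side."""
        return f"{self.one_side} (1) {self.verb} {self.many_side} (M)"


# The order in which the entities are written down does not matter,
# just as their placement in the diagram does not matter.
drawn_left_to_right = OneToMany(one_side="AGENT", verb="serves",
                                many_side="CUSTOMER")
drawn_right_to_left = OneToMany(many_side="CUSTOMER", verb="serves",
                                one_side="AGENT")

print(drawn_left_to_right.read())                  # AGENT (1) serves CUSTOMER (M)
print(drawn_left_to_right == drawn_right_to_left)  # True
```

Only the connectivity labels decide which entity is read first; the layout carries no meaning.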
A more current version of the ERD is the Crow’s Foot Model shown in Figure
2.7. The label “Crow’s Foot” is derived from the three-pronged symbol used to
represent the “many” side of the relationship. The Crow’s Foot model places the
relationship name in the relationship line.
As you examine the basic Crow’s Foot ERD in Figure 2.7, note that the
connectivity is represented by symbols. For example, the “1” is represented by a
short line segment and the “M” is represented by the three-pronged “crow’s foot.”
Like the Chen ERD, the entities and the relationships may be represented
horizontally or vertically. And again like the Chen ERD, the location and the order
in which the entities are presented in a Crow’s Foot ERD are immaterial.
Advantages
– Exceptional conceptual simplicity
– Visual representation
– Effective communication tool
– Integrated with the relational data model
Disadvantages
– Limited constraint representation
– Limited relationship representation (relationships between attributes cannot
be modeled)
– No data manipulation language
– Loss of information content (limited space is available to draw a large number
of entities in the original Chen notation of the ERD technique)
2.4.5 The Object Oriented Model
In the Object Oriented model, entities are represented as objects that contain
both data and operations. An advantage of this model is the addition of semantic
content. A disadvantage is the steeper learning curve.
The semantic data model (SDM) modeled both data and their relationships in a
single structure known as an object. Because its basic modeling structure is an
object, the SDM is said to be an object oriented data model (OODM). In turn,
the OODM becomes the basis for the object oriented database management
system (OODBMS).
An OODM reflects a very different way to define and use entities. Like the
relational model’s entity, an object is described by its factual content. But, quite
unlike an entity, an object includes information about relationships between the
facts within the object, as well as information about relationships with other
objects. Therefore, the facts within the objects are given greater meaning.
Basic Structure
To illustrate the difference between the OO model and the ER model, let’s
examine their graphic representations in the simple invoicing problem shown in
Figure 2.8.
° The OO data model represents an object class as a box; all of the object’s
attributes and relationships to other objects are included within the object
class box. The object class representation of the INVOICE includes all
related objects within the same object class box.
° The ER model uses three separate entities and two relationships to
represent an invoice transaction. Because customers can buy more than
one item at a time, each invoice references one or more lines, one item
per line. And, because invoices are generated by customers, the
data-modeling requirements include a customer entity and a relationship
between the customer and the invoice.
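The contrast can be sketched in Python, where a single `Invoice` object holds its own attributes, its related objects, and its operations. The attribute names, the `total` operation, and the sample data are assumptions for illustration and are not taken from Figure 2.8.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Customer:
    name: str


@dataclass
class Line:
    item: str
    quantity: int
    unit_price: float

    def amount(self) -> float:
        """Operation stored alongside the line's data."""
        return self.quantity * self.unit_price


@dataclass
class Invoice:
    """Object class: attributes, related objects, and operations together."""
    number: int
    customer: Customer                               # related CUSTOMER object
    lines: List[Line] = field(default_factory=list)  # related LINE objects

    def add_line(self, item: str, quantity: int, unit_price: float) -> None:
        self.lines.append(Line(item, quantity, unit_price))

    def total(self) -> float:
        return sum(line.amount() for line in self.lines)


# Hypothetical invoice with two lines, one item per line.
invoice = Invoice(1, Customer("Ramas"))
invoice.add_line("Hammer", 2, 10.0)
invoice.add_line("Saw", 1, 15.5)

print(invoice.customer.name)  # Ramas
print(invoice.total())        # 35.5
```

In the ER view the same facts would be spread across CUSTOMER, INVOICE, and LINE entities joined by two relationships; here the related objects are reached directly through the invoice.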
Advantages
– Adds semantic content
– Visual presentation includes semantic content
– Database integrity
– Both structural and data independence
Disadvantages
– Slow pace of OODM standards development
– Complex navigational data access
– Steep learning curve
– High system overhead slows transactions
– Lack of market penetration
The OODM and ERDM are similar in the sense that each attempts to address the
demand for more semantic information to be incorporated into the model.
However, the OODM and the ERDM differ substantially both in underlying
philosophy and in the nature of the problem to be addressed.
How does the hierarchical data model address the problem of data redundancy?