LECTURE NOTE
ON
DATABASE DESIGN 1
(COM 312)
COMPILED BY:
SHEHU MOHAMMED
1.0 Introduction
Data management is one of the areas of computer science that has applications in
almost every field. In this unit, we shall examine some basic terms used in database
management systems.
2. Logical Models: Represent the logic of how a business
operates. For example, the relationship between different
entities and the flow of data through the organization. Based
on the User's model.
3. Physical Models: Represent how the database is actually
implemented on a computer system. This is based on the
logical model.
1.2 Why Database
To manage large volumes of data: You can store data in a spreadsheet, but
spreadsheets cope poorly with large volumes of data. For instance, once your data
grows to thousands of records, speed becomes a problem.
Ease of updating data: With a database, you can update data flexibly, and
multiple people can edit the data at the same time.
Security of data: Data is less secure in spreadsheets, because anyone who can get
access to the file can make changes to it. With databases, you can set up security
groups and privileges to restrict access.
Data integrity: Accuracy and consistency are hard to guarantee when data is stored
in spreadsheets. In databases, built-in integrity checks and access controls help
to ensure the accuracy and consistency of data.
Shared - Data in a database are shared among different users and applications.
Persistence - Data in a database exist permanently in the sense that the data can live
beyond the scope of the process that created it.
Validity/Integrity/Correctness - Data should be correct with respect to the real
world entity that they represent.
Security - Data should be protected from unauthorized access.
Consistency - Whenever more than one data element in a database represents
related real-world values, the values should be consistent with respect to the
relationship.
Non-redundancy - No two data items in a database should represent the same real-
world entity.
Independence - The three levels in the schema (internal, conceptual and external)
should be independent of each other so that the changes in the schema at one level
should not affect the other levels.
a. The Database;
b. The DBMS; and
c. Application Programs (what users interact with)
e. Security and Privacy
f. Multiple views of data
124 Mrs. James 15 Awolowo Ave. Lagos LA 0003 1000
a. User Data
b. Metadata
c. Indexes
d. Application metadata
i. Users work with database directly by entering, updating and viewing data.
ii. For our purposes, data will be generally stored in tables with some relationships
between tables.
iii. Each table has one or more columns. A set of columns forms a database record.
iv. Recall our example database for the bank. What were some problems we
discussed?
v. Here is one improvement - split into 2 tables:
vi. The customers table has 4 records and 5 columns. The Accounts table has 7
records and 3 columns.
vii. Note relationship between the two tables - CustomerID column.
viii. How should we split data into the tables? What are the relationships between the
tables?
These are questions that are answered by Database Modeling and Database
Design. We shall consider Database modeling in the next unit.
Data modeling is the process of creating a data model for the data to be stored in a
database. This data model is a conceptual representation of:
Data objects
The associations between different data objects
The rules
Data modeling helps in the visual representation of data and enforces business rules,
regulatory compliance, and government policies on the data. Data models ensure
consistency in naming conventions, default values, semantics, and security while ensuring
the quality of the data.
A data model emphasizes what data is needed and how it should be organized, rather than
what operations need to be performed on the data. A data model is like an architect's building
plan: it helps to build a conceptual model and set the relationships between data items.
Ensures that all data objects required by the database are accurately represented.
Omission of data will lead to creation of faulty reports and produce incorrect results.
A data model helps design the database at the conceptual, physical and logical levels.
Data Model structure helps to define the relational tables, primary and foreign keys and
stored procedures.
It provides a clear picture of the base data and can be used by database developers to
create a physical database.
It is also helpful to identify missing and redundant data.
Though the initial creation of a data model is labor-intensive and time-consuming, in the
long run it makes upgrading and maintaining your IT infrastructure cheaper and faster.
1. Conceptual: This data model defines WHAT the system contains. This model is
typically created by business stakeholders and data architects. The purpose is to
organize, scope and define business concepts and rules.
2. Logical: Defines HOW the system should be implemented regardless of the DBMS. This
model is typically created by data architects and business analysts. The purpose is to
develop a technical map of rules and data structures.
3. Physical: This data model describes HOW the system will be implemented using a
specific DBMS. This model is typically created by DBAs and developers. The
purpose is the actual implementation of the database.
Conceptual Model
The main aim of this model is to establish the entities, their attributes, and their
relationships. In this Data modeling level, there is hardly any detail available of the actual
Database structure.
For example:
Customer and Product are two entities. Customer number and name are attributes of the
Customer entity
Product name and price are attributes of product entity
Sale is the relationship between the customer and product
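As a rough illustration only, the Customer, Product and Sale example above can be sketched in Python. The class and field names and the sample values ("Alice", "Pen") are assumptions made for this sketch. A real conceptual model is normally drawn as a diagram and carries no implementation detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Entity with two attributes: customer number and name."""
    number: int
    name: str


@dataclass(frozen=True)
class Product:
    """Entity with two attributes: product name and price."""
    name: str
    price: float


@dataclass(frozen=True)
class Sale:
    """Relationship linking one customer to one product."""
    customer: Customer
    product: Product


# Hypothetical sample values, used only to show the structure.
alice = Customer(number=1, name="Alice")
pen = Product(name="Pen", price=1.5)
sale = Sale(customer=alice, product=pen)
```

The sketch records only what exists (entities, attributes, one relationship) and says nothing about tables, keys or storage, which is the level of detail a conceptual model aims for.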
Characteristics of a conceptual data model
Conceptual data models, also known as domain models, create a common vocabulary for all
stakeholders by establishing basic concepts and scope.
Logical data models add further information to the conceptual model elements. They define
the structure of the data elements and set the relationships between them.
The advantage of the logical data model is that it provides a foundation for the
physical model. However, the modeling structure remains generic.
At this data modeling level, no primary or secondary key is defined. You also need to
verify and adjust the connector details that were set earlier for relationships.
Describes data needs for a single project but could integrate with other logical data
models based on the scope of the project.
Designed and developed independently from the DBMS.
Data attributes will have datatypes with exact precisions and lengths.
Normalization is typically applied to the model up to 3NF.
A Physical Data Model describes the database specific implementation of the data model. It
offers an abstraction of the database and helps generate schema. This is because of the
richness of meta-data offered by a Physical Data Model.
This type of data model also helps to visualize database structure. It helps to model
database columns, keys, constraints, indexes, triggers, and other RDBMS features.
The physical data model describes data needs for a single project or application, though it
may be integrated with other physical data models based on project scope.
The data model contains relationships between tables that address the cardinality and
nullability of the relationships.
Developed for a specific version of a DBMS, location, data storage or technology to be
used in the project.
Columns should have exact data types, lengths assigned and default values.
Primary and Foreign keys, views, indexes, access profiles, and authorizations, etc. are
defined.
The main goal of designing a data model is to make certain that data objects offered by
the functional team are represented accurately.
The data model should be detailed enough to be used for building the physical database.
The information in the data model can be used for defining the relationships between
tables, primary and foreign keys, and stored procedures.
A data model helps the business to communicate within and across organizations.
A data model helps to document data mappings in the ETL process.
It helps to recognize the correct sources of data to populate the model.
To develop a data model, one should know the characteristics of the physically stored data.
A navigational system makes application development and management complex, so it
requires detailed knowledge of how the data is actually organized.
Even a small change in structure requires modification of the entire application.
There is no set data manipulation language in the DBMS.
A record is a collection of fields, with each field containing only one value. The type of a record
defines which fields the record contains.
The hierarchical database model mandates that each child record has only one parent, whereas
each parent record can have one or more child records. In order to retrieve data from a
hierarchical database the whole tree needs to be traversed starting from the root node. This model
is recognized as the first database model created by IBM in the 1960s.
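The root-first traversal described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular hierarchical DBMS is implemented. The `Record` class, the `find` helper, the "Company" root and the serial numbers are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Record:
    """A typed record with single-valued fields and zero or more child records."""
    record_type: str
    fields: Dict[str, object]
    children: List["Record"] = field(default_factory=list)


def find(root, record_type, **criteria):
    """Depth-first search from the root.

    Returns (matching record or None, number of records visited).
    """
    visited = 0
    stack = [root]
    while stack:
        record = stack.pop()
        visited += 1
        matches = all(record.fields.get(k) == v for k, v in criteria.items())
        if record.record_type == record_type and matches:
            return record, visited
        # Push children in reverse so they are visited left to right.
        stack.extend(reversed(record.children))
    return None, visited


# Hypothetical tree: one root, two employees, three computers.
root = Record("Company", {"name": "Bank"}, [
    Record("Employee", {"EmpNo": 10}, [
        Record("Computer", {"Serial": "PC-1"}),
        Record("Computer", {"Serial": "PC-2"}),
    ]),
    Record("Employee", {"EmpNo": 20}, [
        Record("Computer", {"Serial": "PC-3"}),
    ]),
])

found, visited = find(root, "Computer", Serial="PC-3")
```

Because each child has exactly one parent, the only way to reach a record is down from the root, so a lookup for a record stored late in the tree visits most of the other records first.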
In this model, the employee data table represents the "parent" part of the hierarchy, while the
computer table represents the "child" part of the hierarchy. In contrast to tree structures usually
found in computer software algorithms, in this model the children point to the parents. As shown,
each employee may possess several pieces of computer equipment, but each individual piece of
computer equipment may have only one employee owner.
In this, the "child" is the same type as the "parent". The hierarchy stating EmpNo 10 is boss of
20, and 30 and 40 each report to 20 is represented by the "ReportsTo" column. In Relational
database terms, the ReportsTo column is a foreign key referencing the EmpNo column. If the
"child" data type were different, it would be in a different table, but there would still be a foreign
key referencing the EmpNo column of the employees table.
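The self-referencing foreign key can be tried out with Python's built-in `sqlite3` module. The sketch below assumes a table named `employees` holding only the two columns mentioned in the text. The helper functions `direct_reports` and `chain_of_command` are names invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.execute("""
    CREATE TABLE employees (
        EmpNo     INTEGER PRIMARY KEY,
        ReportsTo INTEGER REFERENCES employees(EmpNo)  -- NULL for the top boss
    )
""")
# EmpNo 10 is boss of 20; 30 and 40 each report to 20.
conn.executemany(
    "INSERT INTO employees (EmpNo, ReportsTo) VALUES (?, ?)",
    [(10, None), (20, 10), (30, 20), (40, 20)],
)


def direct_reports(boss):
    """Employee numbers whose ReportsTo column points at `boss`."""
    rows = conn.execute(
        "SELECT EmpNo FROM employees WHERE ReportsTo = ? ORDER BY EmpNo", (boss,)
    )
    return [row[0] for row in rows]


def chain_of_command(emp_no):
    """Follow ReportsTo upward from an existing employee to the top."""
    chain = []
    current = emp_no
    while current is not None:
        chain.append(current)
        (current,) = conn.execute(
            "SELECT ReportsTo FROM employees WHERE EmpNo = ?", (current,)
        ).fetchone()
    return chain
```

For example, `direct_reports(20)` returns `[30, 40]` and `chain_of_command(40)` returns `[40, 20, 10]`. The child row points to its parent, which matches the direction of the links described above.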
Network model
The network model is a database model conceived as a flexible way of representing objects and
their relationships. Its distinguishing feature is that the schema, viewed as a graph in which
object types are nodes and relationship types are arcs, is not restricted to being a hierarchy or
lattice.
These include business analysts, scientists, engineers, others
thoroughly familiar with the system capabilities.
Many use tools in the form of software packages that work closely
with the stored database.
Stand-alone:
Mostly maintain personal databases using ready-to-use packaged
applications.
An example is a tax program user that creates its own internal
database.
Another example is a user that maintains an address book
3.5.1 Who Is A DBA (Database Administrator)
A Database Administrator (DBA) is a person or group of people responsible for managing all
the activities related to a database system. The job requires a high level of expertise. A
single person can rarely manage all the database system activities, so companies usually
have a group of people who take care of the database system.
Installing and configuring the database: The DBA is responsible for installing the database
software, configuring it, and upgrading it when needed. There are many database products
in the industry, such as Oracle, Microsoft SQL Server and MySQL, so the DBA decides how the
installation and configuration of the chosen software will take place.
Types of Attributes
Simple attribute − Simple attributes are atomic values, which cannot be divided further.
For example, a student's phone number is an atomic value of 10 digits.
Composite attribute − Composite attributes are made of more than one simple attribute.
For example, a student's complete name may have first_name and last_name.
Derived attribute − Derived attributes are attributes that do not exist in the physical
database, but whose values are derived from other attributes present in the database. For
example, average_salary in a department should not be saved directly in the database;
instead it can be derived. As another example, age can be derived from date_of_birth.
Single-value attribute − Single-value attributes contain a single value. For example −
Social_Security_Number.
Multi-value attribute − Multi-value attributes may contain more than one value. For
example, a person can have more than one phone number, email_address, etc.
These attribute types can come together, for example as simple single-valued, simple
multi-valued, composite single-valued, or composite multi-valued attributes.
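As a rough sketch of how these attribute types can appear together in one entity, consider a hypothetical `Student` class in Python. The class names, the sample student and the phone numbers are assumptions for illustration. Only first_name, last_name, phone numbers and the derivation of age from the date of birth come from the text.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class FullName:
    """Composite attribute made of two simple attributes."""
    first_name: str
    last_name: str


@dataclass
class Student:
    roll_number: int                  # simple, single-valued
    name: FullName                    # composite
    date_of_birth: date               # stored attribute
    phone_numbers: List[str] = field(default_factory=list)  # multi-valued

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        birthday_passed = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if birthday_passed else 1)


# Hypothetical sample data.
student = Student(
    roll_number=1,
    name=FullName("Ada", "Obi"),
    date_of_birth=date(2000, 6, 15),
    phone_numbers=["0800000001", "0800000002"],
)
```

Computing `age` on demand rather than storing it avoids a value that would become wrong on every birthday, which is the usual reason derived attributes are not saved.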
Entity-Set and Keys: A key is an attribute or collection of attributes that uniquely identifies
an entity within an entity set.
For example, the roll_number of a student makes him/her identifiable among students.
Super Key − A set of attributes (one or more) that collectively identifies an entity in an
entity set.
Candidate Key − A minimal super key is called a candidate key. An entity set may have
more than one candidate key.
Primary Key − A primary key is one of the candidate keys chosen by the database
designer to uniquely identify the entities in the entity set.
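The difference between a super key and a candidate key can be explored with a short Python sketch. The sample students and the helper names `is_super_key` and `candidate_keys` are assumptions for illustration. Keys are really a property of the design rather than of one data sample, so a check like this can only show which attribute sets are unique in the rows given.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows share the same values for `attrs`."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Minimal attribute sets that are unique within the sample rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(key) <= set(attrs) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_super_key(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical sample entity set.
students = [
    {"roll_number": 1, "email": "a@x.edu", "name": "Ada", "dept": "CS"},
    {"roll_number": 2, "email": "b@x.edu", "name": "Ada", "dept": "CS"},
    {"roll_number": 3, "email": "c@x.edu", "name": "Bola", "dept": "CS"},
]
attributes = ("roll_number", "email", "name", "dept")
keys = candidate_keys(students, attributes)
```

In this sample, `(roll_number, name)` is a super key but not a candidate key, because `roll_number` alone already identifies each student. The designer would then pick one of the candidate keys as the primary key.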
Relationship: The association among entities is called a relationship. For example, an
employee works_at a department, a student enrolls in a course. Here, Works_at and Enrolls are
called relationships.
Relationship Set: A set of relationships of similar type is called a relationship set. Like entities,
a relationship too can have attributes. These attributes are called descriptive attributes.
Degree of Relationship: The number of participating entities in a relationship defines the
degree of the relationship.
Binary = degree 2
Ternary = degree 3
n-ary = degree n
Mapping Cardinalities: Cardinality defines the number of entities in one entity set that can
be associated with entities of the other set via a relationship set.
One-to-one − One entity from entity set A can be associated with at most one entity of
entity set B and vice versa.
One-to-many − One entity from entity set A can be associated with more than one
entity of entity set B; however, an entity from entity set B can be associated with at
most one entity of entity set A.
Many-to-one − More than one entity from entity set A can be associated with at most
one entity of entity set B; however, an entity from entity set B can be associated with
more than one entity from entity set A.
Many-to-many − One entity from A can be associated with more than one entity from B
and vice versa.
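The four cases can be told apart mechanically from a list of (A, B) pairs. The function below is a minimal sketch with an assumed name, `classify_cardinality`, and made-up identifiers. It reports what a given sample of pairs exhibits, whereas the cardinality in a design states what is allowed.

```python
from collections import defaultdict


def classify_cardinality(pairs):
    """Classify (a, b) association pairs between entity sets A and B."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    # Does some entity of A relate to several of B, and the reverse?
    a_has_many = any(len(bs) > 1 for bs in a_to_b.values())
    b_has_many = any(len(as_) > 1 for as_ in b_to_a.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


# Hypothetical samples: customers and accounts, students and courses.
owns = [("cust1", "acct1"), ("cust1", "acct2"), ("cust2", "acct3")]
enrolls = [("stu1", "crs1"), ("stu1", "crs2"), ("stu2", "crs1")]
```

Here `classify_cardinality(owns)` gives "one-to-many" and `classify_cardinality(enrolls)` gives "many-to-many".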
ER Diagram Representation
Let us now learn how the ER Model is represented by means of an ER diagram. Any object, for
example, entities, attributes of an entity, relationship sets, and attributes of relationship sets, can
be represented with the help of an ER diagram.
Entity
Entities are represented by means of rectangles. Rectangles are named with the entity set they
represent.
Attributes:
Attributes are the properties of entities. Attributes are represented by means of ellipses. Every
ellipse represents one attribute and is directly connected to its entity (rectangle).
If the attributes are composite, they are further divided in a tree-like structure. Every node is
then connected to its attribute. That is, composite attributes are represented by ellipses that are
connected with an ellipse.
Multivalued attributes are depicted by a double ellipse.
Relationship
Relationships are represented by a diamond-shaped box. The name of the relationship is written
inside the diamond. All the entities (rectangles) participating in a relationship are
connected to it by a line.
Binary Relationship and Cardinality
A relationship where two entities are participating is called a binary relationship. Cardinality
is the number of instances of an entity from a relation that can be associated with the relation.
One-to-one − When only one instance of an entity is associated with the relationship, it
is marked as '1:1'. Only one instance of each entity can be associated with the
relationship.
One-to-many − When more than one instance of an entity is associated with the
relationship, it is marked as '1:N'. Only one instance of the entity on the left and
more than one instance of the entity on the right can be associated with the
relationship.
Many-to-one − When more than one instance of an entity is associated with the
relationship, it is marked as 'N:1'. More than one instance of the entity on the left
and only one instance of the entity on the right can be associated with the
relationship.
Many-to-many − More than one instance of the entity on the left and more than one
instance of the entity on the right can be associated with the relationship.
Participation Constraints
Total Participation − Each entity is involved in the relationship. Total participation is
represented by double lines.
Partial participation − Not all entities are involved in the relationship. Partial
participation is represented by single lines.
Relationship, which is association among entities.
Mapping Entity
An entity is a real-world object with some attributes.
Mapping Process
Mapping Process (hierarchical entities)
Create tables for all higher-level entities.
Create tables for lower-level entities.
Add primary keys of higher-level entities in the table of lower-level entities.
In lower-level tables, add all other attributes of lower-level entities.
Declare primary key of higher-level table and the primary key for lower-level table.
Declare foreign key constraints.
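The steps above can be sketched with Python's built-in `sqlite3` module. The entity names `Person` (higher-level) and `Student` (lower-level), their columns and the sample row are assumptions chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    -- Table for the higher-level entity.
    CREATE TABLE Person (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    -- Table for the lower-level entity: it carries the primary key of the
    -- higher-level entity plus its own attributes.
    CREATE TABLE Student (
        person_id   INTEGER PRIMARY KEY REFERENCES Person(person_id),
        roll_number TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO Person VALUES (1, 'Ada')")
conn.execute("INSERT INTO Student VALUES (1, 'CS-001')")

# A student is read back by joining the two levels on the shared key.
student_rows = conn.execute("""
    SELECT p.person_id, p.name, s.roll_number
    FROM Student AS s
    JOIN Person AS p ON p.person_id = s.person_id
""").fetchall()
```

Using the same column as both primary key and foreign key in the lower-level table means a `Student` row cannot exist without its `Person` row. Other mappings are possible, such as a single table holding both levels.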
c. Data Model:
i A set of primitives for defining the structure of a database.
ii A set of operations for specifying retrieval and updates on a database
iii Examples: Relational, Hierarchical, Network, Object-Oriented
d. Database Instance or State: The actual data contained in a database at a
given time.
The following is a brief outline of the database development process.
a. User needs assessment and requirements gathering: Determine what the users
are looking for, what functions should be supported, how the system should behave.
b. Data Modeling: Based on user requirements, form a logical model of the system.
This logical model is then converted to a physical data model (tables, columns,
relationships, etc.) that will be implemented.
c. Implementation: Based on the data model, a database can be created.
Applications are then written to perform the required functions.
d. Testing: The system is tested using real data.
e. Deployment: The system is deployed to users. Maintenance of the system begins.
For our Bank example, let’s assume that the managers are interested in creating a
database to track their customers and accounts.
a. Tables
CUSTOMERS
CustomerId, Name, Street, City, State, Zip
ACCOUNTS
CustomerId, AccountNumber, AccountType, DateOpened, Balance
Note that we use an artificial identifier (a number we make up) for the
customer called CustomerId. Given a CustomerId, we can uniquely
identify the remaining information. We call CustomerId a Key for the
CUSTOMERS table.
b. Relationships
The relationship between CUSTOMERS and ACCOUNTS is through CustomerId. Since
a customer may have more than one account at the bank, we call this a one-to-many
(1:N) relationship.
c. Domains
A domain is a set of values that a column may have. Domain also includes
the type and length or size of data found in each column.
CUSTOMERS
Column Domain
ACCOUNTS
Column               Data Type   Size
CustomerId (FK)      Integer     20
AccountNumber (Key)  Integer     15
AccountType          Character   2
DateOpened           Date
Balance              Real        12,2
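One way to picture a domain is as a rule that accepts or rejects a value for a column. The Python sketch below encodes the ACCOUNTS domains under stated assumptions: "Integer 20" is read as at most 20 digits, "Character 2" as at most 2 characters, and "Real 12,2" as at most 12 digits in total with 2 after the decimal point. The dictionary layout and the `in_domain` function are hypothetical.

```python
from datetime import date
from decimal import Decimal

# Assumed encoding of the ACCOUNTS column domains.
ACCOUNT_DOMAINS = {
    "CustomerId":    {"type": int, "max_digits": 20},
    "AccountNumber": {"type": int, "max_digits": 15},
    "AccountType":   {"type": str, "max_length": 2},
    "DateOpened":    {"type": date},
    "Balance":       {"type": Decimal, "max_digits": 12, "scale": 2},
}


def in_domain(column, value):
    """Return True if `value` belongs to the domain of `column`."""
    rule = ACCOUNT_DOMAINS[column]
    if not isinstance(value, rule["type"]):
        return False
    if rule["type"] is int:
        return len(str(abs(value))) <= rule["max_digits"]
    if rule["type"] is str:
        return len(value) <= rule["max_length"]
    if rule["type"] is Decimal:
        if not value.is_finite():
            return False
        # Shift the decimal point; no fractional part may remain.
        scaled = value * 10 ** rule["scale"]
        fits_scale = scaled == scaled.to_integral_value()
        return fits_scale and abs(scaled) < 10 ** rule["max_digits"]
    return True  # DateOpened: the type check is the whole rule
```

For instance, `in_domain("Balance", Decimal("4000.00"))` is accepted while `in_domain("Balance", Decimal("1.234"))` is rejected for having three decimal places. In a real DBMS these checks come from the column's declared data type and size.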
This logical model is then converted to a physical model and implemented as tables.
The following is some example data for the Accounts and Customers tables:
Customers Table
CustomerID  Name             Street           City     State  Zip
123         Mr. Sola         12 Lekki         Lagos    LA     01
124         Mrs. James       15 Awolowo Ave.  Lagos    LA     01
125         Mr. Ade          43 Gwagwa Ln.    Maitama  AB     09
127         Mr. & Mrs. Bayo  61 Zik Rd.       Garki    AB     10
Accounts Table
CustomerId  AccountNumber  AccountType  DateOpened  Balance
123         0001           Checking     10/12/08    4000.00
123         0002           Savings      10/12/08    2000.00
124         0003           Savings      01/05/09    1000.00
125         0004           Checking     12/01/09    6000.00
125         0005           Savings      12/01/09    9000.00
127         0006           Savings      08/22/09    500.00
127         0007           Checking     11/13/08    800.00
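To see the two tables and their one-to-many relationship at work, the example data can be loaded into an in-memory SQLite database from Python. This is a sketch under a few assumptions: SQLite column types stand in for the domains, AccountNumber is stored as text so that leading zeros such as 0001 survive, and dates are kept as the strings shown in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE Customers (
        CustomerId INTEGER PRIMARY KEY,
        Name TEXT, Street TEXT, City TEXT, State TEXT, Zip TEXT
    );
    CREATE TABLE Accounts (
        CustomerId    INTEGER NOT NULL REFERENCES Customers(CustomerId),
        AccountNumber TEXT PRIMARY KEY,   -- text keeps the leading zeros
        AccountType   TEXT,
        DateOpened    TEXT,
        Balance       REAL
    );
""")
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?, ?, ?, ?)", [
    (123, "Mr. Sola", "12 Lekki", "Lagos", "LA", "01"),
    (124, "Mrs. James", "15 Awolowo Ave.", "Lagos", "LA", "01"),
    (125, "Mr. Ade", "43 Gwagwa Ln.", "Maitama", "AB", "09"),
    (127, "Mr. & Mrs. Bayo", "61 Zik Rd.", "Garki", "AB", "10"),
])
conn.executemany("INSERT INTO Accounts VALUES (?, ?, ?, ?, ?)", [
    (123, "0001", "Checking", "10/12/08", 4000.00),
    (123, "0002", "Savings", "10/12/08", 2000.00),
    (124, "0003", "Savings", "01/05/09", 1000.00),
    (125, "0004", "Checking", "12/01/09", 6000.00),
    (125, "0005", "Savings", "12/01/09", 9000.00),
    (127, "0006", "Savings", "08/22/09", 500.00),
    (127, "0007", "Checking", "11/13/08", 800.00),
])
conn.commit()

# One customer, many accounts: count and total balance per customer.
summary = conn.execute("""
    SELECT c.CustomerId, COUNT(a.AccountNumber), SUM(a.Balance)
    FROM Customers AS c
    JOIN Accounts AS a ON a.CustomerId = c.CustomerId
    GROUP BY c.CustomerId
    ORDER BY c.CustomerId
""").fetchall()
```

The join on CustomerId is what ties each account back to its owner. With foreign keys switched on, an account row for a CustomerId that does not exist in Customers is rejected.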
d. Business Rules
Business rules allow us to specify constraints on what data can appear in tables and
what operations can be performed on data in tables. For example:
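As an illustration, assume two hypothetical rules for the bank: a balance may never be negative, and an account must be either Checking or Savings. These rules are invented for this sketch. It shows one way a DBMS can enforce such rules, using CHECK constraints in SQLite through Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Accounts (
        AccountNumber TEXT PRIMARY KEY,
        -- Hypothetical rule 1: only two account types exist.
        AccountType   TEXT NOT NULL
                      CHECK (AccountType IN ('Checking', 'Savings')),
        -- Hypothetical rule 2: no negative balances.
        Balance       REAL NOT NULL CHECK (Balance >= 0)
    )
""")


def try_insert(account_number, account_type, balance):
    """Insert a row; return False if a business rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Accounts VALUES (?, ?, ?)",
                (account_number, account_type, balance),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this table, `try_insert("0001", "Checking", 4000.0)` succeeds, while a negative balance or an unknown account type is refused by the database itself. Rules that span several rows or tables usually need triggers or application code instead.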
3.1 Data Modeling in the Context of Database Design
Database design is defined as: “designing the logical and physical structure of one or more
databases to accommodate the information needs of the users in an organization for a
defined set of applications”. The design process roughly follows five steps:
The data model gets its inputs from the planning and analysis stage. Here the modeler,
along with system analysts, collects information about the requirements of the database by
reviewing the existing documentation and interviewing end-users.
The data model has two outputs. The first is an entity-relationship diagram which represents
the data structure in a pictorial form. Because the diagram is easily learned, it is a valuable
tool to communicate the model to the end-user. The second component is a data document.
This is a document that describes in detail the data objects, relationships, and rules required
by the database.
The goal of the data model is to make sure that all the data objects required by the database
are completely and accurately represented. Because the data model uses easily understood
notations and natural language, it can be reviewed and verified as correct by the end-users.
The data model is also detailed enough to be used by the database developers as a
“blueprint” for building the physical database.
The information contained in a data model will be used to define the relational tables, the
primary and the foreign keys, stored procedures, and triggers.
A poorly designed database will require more time in the long run. Without careful
planning you may create a database that omits data required to create critical reports,
produces results that are incorrect or inconsistent, and is unable to accommodate changes in
user requirements.
h. Integration: How will the proposed database fit with the organization’s
existing and future databases?
Introduction to SQL
SQL is a standard language for accessing and manipulating databases.
What is SQL?
SQL stands for Structured Query Language
SQL lets you access and manipulate databases
SQL became a standard of the American National Standards Institute (ANSI) in 1986,
and of the International Organization for Standardization (ISO) in 1987
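A first taste of what "access and manipulate" means: the sketch below runs the basic SQL statements (CREATE TABLE, INSERT, SELECT, UPDATE, DELETE) against an in-memory SQLite database from Python. The cut-down Customers table and the change of city to Abuja are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define a table.
conn.execute(
    "CREATE TABLE Customers (CustomerId INTEGER PRIMARY KEY, Name TEXT, City TEXT)"
)

# Add rows.
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [(123, "Mr. Sola", "Lagos"), (125, "Mr. Ade", "Maitama")],
)

# Access data: which customers live in Lagos?
in_lagos = conn.execute(
    "SELECT Name FROM Customers WHERE City = ?", ("Lagos",)
).fetchall()

# Manipulate data: change one row, then remove another.
conn.execute("UPDATE Customers SET City = ? WHERE CustomerId = ?", ("Abuja", 125))
conn.execute("DELETE FROM Customers WHERE CustomerId = ?", (123,))

remaining = conn.execute(
    "SELECT CustomerId, Name, City FROM Customers ORDER BY CustomerId"
).fetchall()
```

The same statements work, with minor dialect differences, in other relational DBMSs.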