You are on page 1of 51

DATABASE DESIGN

CHAPTER THREE
DATABASE DESIGN

• Its the process of coming up with different kinds of specification for the data to
be stored in the database.
• The database design part is one of the middle phases we have in information
systems development where the system uses a database approach.
• Design is the part on which we would be engaged to describe how the data
should be perceived at different levels and finally how it is going to be stored in a
computer system.
DATABASE SYSTEM DEVELOPMENT LIFE CYCLE

• Planning of Information systems Design


• Requirements Analysis,
• Design (Conceptual, Logical and Physical Design)
• Implementation
• Testing and deployment
• Operation and Support
DBMS DATABASE MODELS
• A Database model defines the logical design and structure of a database and defines
how data will be stored, accessed and updated in a database management system.
While the Relational Model is the most widely used database model, there are other
models too:
• Hierarchical Model
•  Network Model
•  Entity-relationship Model
•  Relational Model
HIERARCHICAL MODEL
• This database model organises data into a tree-like-structure, with a single root, to which all the
other data is linked. The heirarchy starts from the Root data, and expands like a tree, adding child
nodes to the parent nodes.

• In this model, a child node will only have a single parent node. This model efficiently describes
many real-world relationships like index of a book, recipes etc.

• In hierarchical model, data is organised into tree-like structure with one one-to-many
relationship between two different types of data, for example, one department can have many
courses, many professors and of-course many students.
NETWORK MODEL
• This is an extension of the Hierarchical model. In this model data is organised more like a graph,
and are allowed to have more than one parent node.

• In this database model data is more related as more relationships are established in this database
model. Also, as the data is more related, hence accessing the data is also easier and fast.

• This database model was used to map many-to-many data relationships. This was the most
widely used database model, before Relational Model was introduced.
ENTITY-RELATIONSHIP MODEL
• In this database model, relationships are created by dividing object of interest into entity and its
characteristics into attributes.

• Different entities are related using relationships. E-R Models are defined to represent the
relationships into pictorial form to make it easier for different stakeholders to understand.

• This model is good to design a database, which can then be turned into tables in relational
model(explained below).

• Let's take an example, If we have to design a School Database, then Student will be


an entity with attributes name, age, address etc. As Address is generally complex, it can be
another entity with attributes street name, pincode, city etc, and there will be a relationship
between them. Relationships can also be of different types.
RELATIONAL MODEL
• In this model, data is organised in two-dimensional tables and the relationship is maintained by
storing a common field.

• This model was introduced by E.F Codd in 1970, and since then it has been the most widely used
database model, infact, we can say the only database model used around the world. The basic
structure of data in the relational model is tables. All the information related to a particular type
is stored in rows of that table.

• Hence, tables are also known as relations in relational model. In the coming tutorials we will
learn how to design tables, normalize them to reduce data redundancy and how to use Structured
Query language to access data from tables.
DATABASE DESIGN METHODOLOGY

• Three main phases


• Conceptual database design In developing a good design, one should
• Logical database design answer such questions as:
• Physical database design 1. What are the relevant Entities for the
Organization
2. What are the important features of each
Entity
3. What are the important Relationships
4. What are the important queries from the
user
5. What are the other requirements of the
Organization and the Users
DATABASE DESIGN METHODOLOGY

• Three main phases


• Conceptual database design In developing a good design, one should
• Logical database design answer such questions as:
• Physical database design 1. What are the relevant Entities for the
Organization
2. What are the important features of each
Entity
3. What are the important Relationships
4. What are the important queries from the
user
5. What are the other requirements of the
Organization and the Users
CONCEPTUAL DATABASE DESIGN
• Conceptual design is the process of constructing a model of the
information used in an enterprise, independent of any physical
considerations.
• It is the source of information for the logical design phase.
• Mostly uses an Entity Relationship Model to describe the data at this level.
• After the completion of Conceptual Design one has to go for refinement
of the schema, which is verification of Entities, Attributes, and Relationships
LOGICAL DATABASE DESIGN
• It is the process of constructing a model of the information used in an
enterprise based on a specific data model
(e.g. relational, hierarchical or network or object), but independent of a particular
DBMS and other physical considerations.
• Normalization process
• Collection of Rules to be maintained
• Discover new entities in the process
• Revise attributes based on the rules and the discovered Entities
PHYSICAL DATABASE DESIGN

• Physical design is the process of producing a description of the


implementation of the database on secondary storage. -- defines specific
storage or access methods used by database
• Describes the storage structures and access methods used to achieve efficient access
to the data.
• Tailored to a specific DBMS system -- Characteristics are function of DBMS and
operating systems
• Includes estimate of storage space
CONCEPTUAL DATABASE DESIGN
• Conceptual design revolves around discovering and analyzing
organizational and user data requirements
• The important activities are to identify
• Entities
• Attributes
• Relationships
• Constraints

• And based on these components develop the ER model using


• ER diagrams
ER MODEL

• ER model stands for an Entity-Relationship model. It is a


high-level data model. This model is used to define the data
elements and relationship for a specified system.
• It develops a conceptual design for the database. It also
develops a very simple and easy to design view of data.
• In ER modeling, the database structure is portrayed as a
diagram called an entity-relationship diagram.
• For example, Suppose we design a school database. In this database,
the student will be an entity with attributes like address, name, id, age,
etc. The address can be another entity with attributes like city, street
name, pin code, etc and there will be a relationship between them.
COMPONENT OF ER DIAGRAM
1. ENTITY:
• An entity may be any object, class, person or place. In the ER diagram, an entity can be
represented as a shape of rectangles. Entity name

• The name given to an entity should always be a singular noun descriptive of each item to be
stored in it. E.g. : student NOT students.
• Every relation has a schema, which describes the columns, or fields the relation itself
corresponds to our familiar notion of a table:
• A relation is a collection of tuples, each of which contains values for a fixed number of
attributes
• Existence Dependency: the dependence of an entity on the existence of one or more entities.
• Consider an organization as an example- manager, product, employee, department etc. can be
taken as an entity.

.
WEAK ENTITIES

• an entity that can not exist without the entity with which it has a relationship – it is
indicated by a double rectangle
• A weak entity can be identified uniquely only by considering the primary key of
another (owner) entity.
– Owner entity set and weak entity set must participate in a one-to-many
relationship set (one owner, many weak entities).
– Weak entity set must have total participation in this identifying relationship set.
Name Cost
Pname Age
SSNo lot

employees Policy Dependants

Weak entities have only a “partial key” (dashed underline)


2. ATTRIBUTE
• The attribute is used to describe the property of an entity. Oval is used to represent an
attribute.
Attribute name

• Attributes will give rise to recorded items of data in the database


• Attribute name (be explanatory words or phrases)
• The domain from which attribute values are taken (A DOMAIN is a set of values from
which attribute values may be taken.) Each attribute has values taken from a domain.
• For example, id, age, contact number, name, etc. can be attributes of a student.
Phone No
ID

Student
Name

Age
2. ATTRIBUTE...
A. Key Attribute
• The key attribute is used to represent the main characteristics of an entity. It represents a primary key.
The key attribute is represented by an oval with the text underlined.
ID Phone No

Student

Age
Name

B. Simple attribute : contains a single value (not divided into sub parts) E.g. Age, gender
ATTRIBUTE...CONTI
b. Composite Attribute

• An attribute that composed of many other attributes is known as a composite attribute. The composite
attribute is represented by an oval, and those Ovals are connected with an oval.

c. Multivalued Attribute

• An attribute can have more than one value. These attributes are known as a multivalued attribute. The
double oval is used to represent multivalued attribute.

• For example, a student can have more than one phone number
or Person may have several college degrees.

.
ATTRIBUTE...CONTI
E. Derived Attribute
• An attribute that can be derived from another attribute is known as a derived attribute. Primary Keys are
underlined, and A derived attribute is indicated by a DOTTED LINE. (……..)
• For example, A person's age changes over time and can be derived from another attribute like Date of birth.

F. Null Values
• NULL applies to attributes which are not applicable, or which do not have values.
• You may enter the value NA (meaning not applicable)
• Value of a key attribute can not be null.
• Default value - assumed value if no explicit value
3. RELATIONSHIP
• A relationship is used to describe the relation between entities.
• Relationships are represented by diamond-shaped box. Name of the relationship is written
inside the diamond-box. All the entities (rectangles) participating in a relationship, are
connected to it by a line.

• A relationship set is a mathematical relation among n  2 entities, each taken from


entity sets
{(e1, e2, … en) | e1  E1, e2  E2, …, en  En} where (e1, e2, …, en) is a relationship

Binary Relationship and Cardinality


• A relationship where two entities are participating is called a binary relationship.
• Cardinality is the number of instance of an entity from a relation that can be associated
with the relation.
major cardinalities of a relationship are as follows:
I. One-to-One Relationship
• When only one instance of an entity is associated
with the relationship, then it is known as one to one
relationship.
• For example, A female can marry to one
male, and a male can marry to one female.

II. One-to-many relationship


• When only one instance of the entity on the left, and
more than one instance of an entity on the right associates
with the relationship then this is known as a one-to-many
relationship.
• For example, Scientist can invent many inventions, but the
invention is done by the only specific scientist.
major cardinalities of a relationship are as follows:…
III. Many-to-one relationship

• When more than one instance of the entity on the left, and
only one instance of an entity on the right associates with
the relationship then it is known as a many-to-one
relationship.
• For example, Student enrolls for only one course,
but a course can have many students.

IV. Many-to-many relationship


• When more than one instance of the entity on the left, and
more than one instance of an entity on the right associates
with the relationship then it is known as a many-to-many
relationship.

• For example, Employee can assign by many projects and


project can have many employees.
Participation Constraints
•   Total Participation − Each entity is involved in the relationship. Total participation is
represented by double lines.

•   Partial participation − Not all entities are involved in the relationship. Partial participation is
represented by single lines.
DEGREE OF RELATIONSHIP SET
• An important point about a relationship is how many entities participate in it.
• The number of entities participating in a relationship is called the DEGREE
of the relationship.
Among the Degrees of relationship, the following are the basic:
• UNARY/RECURSIVE RELATIONSHIP: Tuples/records of a Single entity are
related withy each other.
• BINARY RELATIONSHIPS: Tuples/records of two entities are associated in a
relationship
• TERNARY RELATIONSHIP: Tuples/records of three different entities are associated
And a generalized one:
• N-ARY RELATIONSHIP: Tuples from arbitrary number of entity sets are
participating in a relationship.
NOTATION OF ER DIAGRAM
• Database can be represented using the notations. In ER diagram, many notations are used to express
the cardinality. These notations are as follows:

• Fig: Notations of ER diagram


DATABASE KEYS
• Keys are very important part of Relational database model.

• They are used to establish and identify relationships between tables and also to uniquely identify
any record or row of data inside a table.

• A Key can be a single attribute or a group of attributes, where the combination may act as a key.

• Why we need a Key?


• In real world applications, number of tables required for storing the data is huge, and the
different tables are related to each other as well. Also, tables store a lot of data in them.

• Tables generally extends to thousands of records stored in them, unsorted and unorganised. Now
to fetch any particular record from such dataset, you will have to apply some conditions, but what
if there is duplicate data present and every time you try to fetch some data by applying certain
condition, you get the wrong data. How many trials before you get the right data?

• To avoid all this, Keys are defined to easily identify any row of data in a table.
KEYS

• Keys play an important role in the relational database.

• It is used to uniquely identify any record or row of data from the table. It is also
used to establish and identify relationships between tables.
• For example: In Student table, ID is used as a key because it is unique for each
student. In PERSON table, passport_number, license_number, SSN are keys
since they are unique for each person.
TYPES OF KEY:

1. Primary key
 It is the first key which is used to identify one and
only one instance of an entity uniquely. An entity can
contain multiple keys as we saw in PERSON table.
The key which is most suitable from those lists
become a primary key.

 In the EMPLOYEE table, ID can be primary key


since it is unique for each employee. In the
EMPLOYEE table, we can even select
License_Number and Passport_Number as primary
key since they are also unique.

 For each entity, selection of the primary key is based


on requirement and developers.
2. CANDIDATE KEY

• A candidate key is an attribute or set of an attribute which can uniquely identify a tuple.
• The remaining attributes except for primary key are considered as a candidate key. The
candidate keys are as strong as the primary key.
• For example: In the EMPLOYEE table, id is best suited for the primary key. Rest of the attributes like SSN,
Passport_Number, and License_Number, etc. are considered as a candidate key.
3. SUPER KEY

• Super key is a set of an attribute which can uniquely identify a tuple. Super key is a superset
of a candidate key.
• For example: In the above EMPLOYEE table, for(EMPLOEE_ID, EMPLOYEE_NAME) the name of two
employees can be the same, but their EMPLYEE_ID can't be the same. Hence, this combination can also be a
key.

• The super key would be EMPLOYEE-ID, (EMPLOYEE_ID, EMPLOYEE-NAME), etc.


4. FOREIGN KEY

• Foreign keys are the column of the table which is used to point to the primary key of another table.

• In a company, every employee works in a specific department, and employee and department are two
different entities. So we can't store the information of the department in the employee table. That's why
we link these two tables through the primary key of one table.

• We add the primary key of the DEPARTMENT table, Department_Id as a new attribute in the
EMPLOYEE table.

• Now in the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related.
RELATIONAL CONSTRAINTS/INTEGRITY RULES
Relational Integrity
• Domain Integrity: No value of the attribute should be beyond the allowable limits

• Entity Integrity: In a base relation, no attribute of a Primary Key can assume a value of NULL

• Referential Integrity: If a Foreign Key exists in a relation, either the Foreign Key value must match a
Candidate Key value in its home relation or the Foreign Key value must be NULL

• Enterprise Integrity: Additional rules specified by the users or database administrators of a database
are incorporated
Mapping ER-models to relational tables
• ER diagrams can be mapped to relational schema, that is, it is possible to create relational schema
using ER diagram. We cannot import all the ER constraints into relational model, but an
approximate schema can be generated.

• There are several processes and algorithms available to convert ER Diagrams into Relational
Schema. Some of them are automated and some of them are manual. We may focus here on the
mapping diagram contents to relational basics.

• ER diagrams mainly comprise of −


• Entity and its attributes
• Relationship, which is association among entities.

• Mapping Entity
• An entity is a real-world object with some attributes.

• Mapping Process (Algorithm)


• Create table for each entity.
• Entity's attributes should become fields of tables with their respective data types.
• Declare primary key.
MAPPING ENTITY
• An entity is a real-world object with some attributes.

• Mapping Process (Algorithm)


• Create table for each entity.
• Entity's attributes should become fields of tables with their respective data types.
• Declare primary key.
MAPPING RELATIONSHIP
• A relationship is an association among entities.

• Mapping Process
• Create table for a relationship.
• Add the primary keys of all participating Entities as fields of table with their respective data
types.
• If relationship has any attribute, add each attribute as field of table.
• Declare a primary key composing all the primary keys of participating entities.
• Declare all foreign key constraints.
MAPPING WEAK ENTITY SETS
• A weak entity set is one which does not have any primary key associated with it.

• Mapping Process
• Create table for weak entity set.
• Add all its attributes to table as field.
• Add the primary key of identifying entity set.
• Declare all foreign key constraints.
MAPPING HIERARCHICAL ENTITIES
• ER specialization or generalization comes in the form of hierarchical entity sets.

• Mapping Process
·        Create tables for all higher-level entities.
·        Create tables for lower-level entities.
·        Add primary keys of higher-level entities in the table of lower-level entities.
·        In lower-level tables, add all other attributes of lower-level entities.
·        Declare primary key of higher-level table and the primary key for lower-level table.
·        Declare foreign key constraints.
REDUCTION OF ER DIAGRAM TO TABLE
• The database can be represented using the notations, and these notations can be reduced to a
collection of tables.

• In the database, every entity set or relationship set can be represented in tabular form.

• The ER diagram is given below:


POINTS TO CONVERTING THE ER DIAGRAM TO THE TABLE:

• Entity type becomes a table.


• In the given ER diagram, LECTURE, STUDENT, SUBJECT and COURSE forms individual
tables.

• All single-valued attribute becomes a column for the table.


• In the STUDENT entity, STUDENT_NAME and STUDENT_ID form the column of STUDENT
table. Similarly, COURSE_NAME and COURSE_ID form the column of COURSE table and so
on.

• A key attribute of the entity type represented by the primary key.


• In the given ER diagram, COURSE_ID, STUDENT_ID, SUBJECT_ID, and LECTURE_ID are
the key attribute of the entity.

• The multivalued attribute is represented by a separate table.


• In the student table, a hobby is a multivalued attribute. So it is not possible to represent multiple
values in a single column of STUDENT table. Hence we create a table STUD_HOBBY with
column name STUDENT_ID and HOBBY. Using both the column, we create a composite key.
POINTS TO CONVERTING THE ER DIAGRAM TO THE TABLE:......
• Composite attribute represented by components.
• In the given ER diagram, student address is a composite attribute. It contains CITY, PIN, DOOR#, STREET, and STATE. In
the STUDENT table, these attributes can merge as an individual column.

• Derived attributes are not considered in the table.


• In the STUDENT table, Age is the derived attribute. It can be calculated at any point of time by calculating the difference
between current date and Date of Birth.

• Using these rules, you can convert the ER diagram to tables and columns and assign the mapping
between the tables. Table structure for the given ER diagram is as below:

Figure: Table structure


ENHANCED-ER MODEL 
• EER is a high-level data model that incorporates the extensions to the original ER model.

It is a diagrammatic technique for displaying the following concepts

• Sub Class and Super Class

• Specialization and Generalization

• Union or Category

• Aggregation

• These concepts are used when the comes in EER schema and the resulting schema diagrams
called as EER Diagrams.
FEATURES OF EER MODEL

• EER creates a design more accurate to database schemas.


• It reflects the data properties and constraints more precisely.
• It includes all modelling concepts of the ER model.
• Diagrammatic technique helps for displaying the EER schema.
• It includes the concept of specialization and generalization.
• It is used to represent a collection of objects that is union of objects of
different of different entity types.
A. SUB CLASS AND SUPER CLASS
• Sub class and Super class relationship leads the concept of
Inheritance.

• The relationship between sub class and super class is denoted with  
symbol.
d
1. Super Class
•   Super class is an entity type that has a relationship with one or
more subtypes.

• An entity cannot exist in database merely by being member of any


super class.
• For example: Shape super class is having sub groups as Square, Circle, Triangle.
2. Sub Class

•  Sub class is a group of entities with unique attributes.

• Sub class inherits properties and attributes from its super class.
• For example: Square, Circle, Triangle are the sub class of Shape super class.
B. SPECIALIZATION AND GENERALIZATION
1. Generalization

• Generalization is the process of generalizing the entities which


contain the properties of all the generalized entities.

• It is a bottom approach, in which two lower level entities


combine to form a higher level entity.

• Generalization is the reverse process of Specialization.

• It defines a general entity type from a set of specialized entity


type.

• It minimizes the difference between the entities by identifying


the common features.
For example: look at the diagram

• In the diagram example, Employee can be specialized as


Developer or Tester, based on what role they play in an
Organization.
C. CATEGORY OR UNION

• Category represents a single super class or sub class relationship with more than one super class.

• It can be a total or partial participation.


For example Car booking, Car owner can be a person, a bank (holds a possession on a Car) or a
company. Category (sub class) → Owner is a subset of the union of the three super classes → Company,
Bank, and Person. A Category member must exist in at least one of its super classes.
D. AGGREGATION
• Aggregation is a process that represent a relationship between a whole object and its component
parts.

• It abstracts a relationship between objects and viewing the relationship as an object.

•  It is a process when two entity is treated as a single entity.

• In the above example, the relation between College and Course is acting as an Entity in Relation
with Student.
INHERITANCE
• We use all the above features of ER-Model in order to create classes of objects in object-oriented
programming. The details of entities are generally hidden from the user; this process known
as abstraction.

• Inheritance is an important feature of Generalization and Specialization. It allows lower-level


entities to inherit the attributes of higher-level entities.

• For example, the attributes of a Person class such as name, age, and gender can be inherited
by lower-level entities such as Student or Teacher.

You might also like