Please carry out the learning activity proposed for this week.
REQUESTED:
Teacher:
Daza XXX
Students
Bogota DC
2021
DEFINITIONS:
Conceptual data modeling represents the initial phase of developing the persistent data design and
persistent data storage for the system. In many cases, persistent data for the system is managed
by a relational database management system (RDBMS). The business and system entities that are
identified at a conceptual level of business models and system requirements will be developed
during use case analysis, use case design, and database design tasks to become detailed physical
table designs to be implemented in the RDBMS.
Please note that the conceptual data model discussed in this concept paper is not a separate work
product. It consists of a composite view of information contained in business modeling work
products, requirements, and analysis and design disciplines that is important to the development of
the data model.
Conceptual: This phase includes the identification of key top-level business and system
entities and their relationships, which define the scope of the problem to be addressed by the
system. These key business and system entities are defined by using modeling elements of
the UML profile for business modeling, including elements of the business analysis model and
the analysis classes of the analysis model.
A conceptual data model identifies the highest-level relationships between different entities.
A logical data model describes the data in as much detail as possible, regardless of how it will
be physically implemented in the database.
The physical data model represents how the model will be built in the database.
A physical database model displays all table structures, including column name, column data
type, column constraints, primary key, foreign key, and relationships between the tables.
The physical data model will be different for different Database Management Systems. For
example, the data type for a column can be different between MySQL and SQL Server.
The basic steps for designing the physical data model are as follows:
Convert entities to tables.
Convert relations to foreign keys.
Convert attributes to columns.
Modify the physical data model based on physical constraints / requirements.
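The four steps above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the CUSTOMER and CAR entities, their attributes and the column types are hypothetical, and SQLite is an arbitrary choice of DBMS; another system would use different data types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Steps 1 and 3: each entity becomes a table, each attribute a column.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    )
""")

# Step 2: the 1:N relation "customer buys car" becomes a foreign key on the N side.
# Step 4: NOT NULL and the column types are physical choices specific to SQLite.
conn.execute("""
    CREATE TABLE car (
        plate       TEXT PRIMARY KEY,
        model       TEXT NOT NULL,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id)
    )
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO customer VALUES (1, 'Ana')")
conn.execute("INSERT INTO car VALUES ('0001-AAA', 'Sedan', 1)")

# PRAGMA table_info lists the columns of the physical table; index 1 is the name.
car_columns = [row[1] for row in conn.execute("PRAGMA table_info(car)")]
print(car_columns)
```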
Here we compare these three types of data models. The table below compares the different
features:
Characteristic          Conceptual  Logical  Physical
Entity names            ✓           ✓        -
Entity relationships    ✓           ✓        -
Attributes              -           ✓        -
Primary keys            -           ✓        ✓
Foreign keys            -           ✓        ✓
Table names             -           -        ✓
Column names            -           -        ✓
Column data types       -           -        ✓
Tasks related to database design span the entire software development life cycle, and tasks related
to initial database design can be initiated during the initial phase.
In projects that use business modeling to describe the business context of the application, database
design can begin at a conceptual level with the identification of business actors and business use
cases in the business use case model, and the business workers and business entities in the
business analysis model.
In projects that do not use business modeling, database design can begin at the conceptual level
with the identification of system actors and system use cases in the use case model, and the
identification of analysis classes in the analysis model of the use case realizations.
OBJECTIVES:
ENTITY / RELATIONSHIP MODEL
The Entity-Relationship model is the most widely used model for the conceptual design of
databases. It was introduced by Peter Chen in 1976 and is based on the existence of objects that
are given the name of entities, and associations between them, called relationships. Its main
symbols are represented in the table below.
Entities
An entity is any object or element about which information can be stored in the DB. Entities can be
concrete like a person or abstract like a date. Entities are represented graphically by rectangles and
their name appears inside. An entity name can only appear once in the conceptual schema.
Entity types
There are two types of entities: strong and weak. A weak entity is an entity whose existence
depends on the existence of another entity. A strong entity is an entity that is not weak.
Attributes
One entity is characterized and distinguished from another by attributes, sometimes called
properties or fields, that represent the characteristics of an entity. The attributes of an entity can
take a set of allowed values known as the attribute domain. Giving values to these attributes, the
different occurrences of an entity are obtained.
In essence, there are two types of attributes:
- Entity identifiers (also called the primary key): these are attributes that uniquely identify each occurrence of an entity.
- Entity descriptors: these are attributes that show characteristics of the entity.
Examples of attributes:
Attribute types
Identifying attributes: They are attributes whose values are not repeated within
the same entity or relationship. They serve to uniquely identify each occurrence. They act
as a primary key.
For example CCC (Current Account Code) that identifies each bank account. Or ISBN
(International Standard Book Number) that identifies each book that is published. An
identifying attribute can be a composite attribute.
For example CCC could be broken down into 3 attributes: num_banco, num_sucursal and
num_cuenta.
Multi-valued attributes: They are descriptive attributes that have several values of the same
domain. For example, if we need to store several e-mails from the same person then we
must use a multivalued attribute. The same happens with the telephone. If we only need to
store a single value we will use a normal descriptive attribute.
Compound attributes: Many times they are confused with the previous ones, although they
have nothing to do with them. A compound attribute is an attribute that can be decomposed
into other attributes belonging to different domains.
Relations:
A relationship is the association that exists between two or more entities. Each relationship has a
name that describes its function. Relationships are graphically represented by rhombuses and their
name appears inside.
Normally we name a relationship using the first letter or letters of the entities that it relates. The entities that are
involved in a certain relationship are called participating entities. The number of participants in a
relationship is called the degree of the relationship.
For example, the CUSTOMER-CAR relationship is degree 2 or binary, since two entities are
involved.
Notice that the name we give the relationship uses the first letters of each entity. In this case, as
both begin with "C", a few more letters are added to refer to CUSTOMERS.
We could also have put a more descriptive name for the relationship, for example "Purchase"
(CUSTOMER buys CAR), but this nomenclature can lead to confusion when determining the
cardinality of the relationship when we are learning.
The PUBLISH relationship is degree 3 or ternary, since it involves the BOOK, EDITORIAL (publisher) and AUTHOR
entities.
Although the ER model allows relationships of any degree, most applications of the model only
consider relationships of degree 2.
Role of an entity
The role is the function that an entity has in a relationship. Roles are specified when you want to clarify
the meaning of an entity in a relationship.
Next, we show the same examples from the previous point, but including the role of each
entity in the relationships:
The Cardinality of a relationship
When the relationship is binary, which is the case in most cases, cardinality is the number of
occurrences of one entity associated with one occurrence of the other entity. There are mainly three
types of binary cardinalities:
One-to-one (1:1)
To each element of the first entity corresponds no more than one element of the second
entity, and vice versa. It is represented graphically as follows:
Example cardinality 1: 1
One-to-many (1:N)
It means that each element of an entity of type A can be related to any number of elements
of an entity of type B, and an element of an entity of type B can only be related to one
element of an entity of type A. Its graphical representation is the following: Note in this case
that the dotted end of the arrow of the relation of A and B, indicates an element of A
connected to many of B.
Example cardinality 1: N
Many-to-many (N:M)
It establishes that any number of elements of an entity of type A can be related to any
number of elements of an entity of type B. The end of the arrow that is dotted indicates the
"several" of the relationship.
Example cardinality N: M
Participation of an entity
The participation of an entity is also known as the cardinality of the entity within a relationship. The
same entity can have different cardinality within different relationships. To obtain the participation, you
must fix a specific occurrence of one entity and find out how many occurrences of the other entity
correspond to it as a minimum and as a maximum. Then do the same in the other direction. These
minimum and maximum occurrences (the participation of an entity) are written in
parentheses and with lowercase letters on the side of the relationship opposite to the entity whose
occurrences are fixed. To determine the cardinality we keep the maximum participations of both and
write them in capital letters separated by a colon next to the symbol of the relationship.
Let's see some examples:
Example 1
A driver "drives" at least 1 car and at most 1 car → Participation (1,1), which is placed on the opposite
side to DRIVER, that is, next to CAR. A car is "driven" by a minimum of 1 driver and a maximum of
1 driver → Participation (1,1), which is placed on the opposite side to CAR, that is, next to DRIVER. To
determine the cardinality we keep the two maximum participations. That is → 1:1.
Example 2
A customer "buys" at least 1 car and at most several cars. This
"several" is represented by the letter "n" → Participation (1,n), which is placed on the opposite side to
CUSTOMER, that is, next to CAR. A car is "bought" by a minimum of 1 customer and a maximum
of 1 customer → Participation (1,1), which is placed on the opposite side to CAR, that is, next to
CUSTOMER. To determine the cardinality we keep the two maximum participations and the "n" is
capitalized to "N". That is → 1:N.
Example 3
An employee "works" in at least 1 department and at most in several. This "several" is
represented by the letter "n" → Participation (1,n), which is placed on the opposite side to
EMPLOYEE, that is, next to DEPARTMENT. A department "has" at least 1 employee and at most
several → Participation (1,n), which is placed on the opposite side to DEPARTMENT, that
is, next to EMPLOYEE. To determine the cardinality we keep the two maximum participations, the "n"
is capitalized to "N", and to tell the two "several" apart we write "M" for the second one (just as in
mathematics two different variables are written x and y, not x and x). That is → N:M.
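The rule used in the three examples (keep the two maximum participations and capitalize them) can be written as a small Python function. This is only a sketch: the function name `cardinality` and the encoding of a participation as a `(min, max)` tuple with the string "n" for "several" are assumptions made for illustration.

```python
def cardinality(part_next_to_a, part_next_to_b):
    """Derive the cardinality label from two (min, max) participations.

    part_next_to_a is the pair written next to entity A and
    part_next_to_b the pair written next to entity B.
    The string "n" stands for "several".
    """
    max_a = part_next_to_a[1]
    max_b = part_next_to_b[1]
    if max_a == "n" and max_b == "n":
        return "N:M"  # the second "several" is written M to tell the two apart
    left = "N" if max_a == "n" else str(max_a)
    right = "N" if max_b == "n" else str(max_b)
    return f"{left}:{right}"


# Example 1: (1,1) next to DRIVER, (1,1) next to CAR.
driver_car = cardinality((1, 1), (1, 1))
# Example 2: (1,1) next to CUSTOMER, (1,n) next to CAR.
customer_car = cardinality((1, 1), (1, "n"))
# Example 3: (1,n) next to EMPLOYEE, (1,n) next to DEPARTMENT.
employee_department = cardinality((1, "n"), (1, "n"))

print(driver_car, customer_car, employee_department)
```

Only the maximum of each pair is used; the minimums matter later, when the schema is converted to tables.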
Attributes of a relationship
Relationships can also have attributes, which are called own attributes. They are those attributes
whose value can only be obtained in the relationship, since they depend on all the entities that
participate in the relationship. Let's look at an example.
Example:
We have the «Purchase» relationship between customer and product. Thus a customer can buy
one or more products, and a product can be bought by one or more customers. We find a series of
attributes of each of the entities [CLIENT (Client_Code, Name, Address, age, telephone) and
PRODUCT (Product_Code, Name, Description, Unit_Price)], but we can also observe how the
attribute «Quantity» is an attribute of the relationship. Why? Well, because the same customer can
buy different quantities of different products and the same product can be bought in different
quantities by different customers. In other words, the quantity attribute depends on the customer
and the product in question.
When defining the entities we spoke of two types: strong and weak. A weak entity is linked
to a strong entity through a dependency relationship. There are two types of dependency
relationships:
Existence dependency: It occurs when a weak entity needs the presence of a strong one to exist. If
the strong entity disappears, the existence of the weak entity is meaningless. It is
represented with a bar through the rhombus and the letter E inside. They are rare relationships.
Example:
The figure shows the case that an employee can have none, one or more children, so the data of
the children must be taken in a separate entity, although they are still data belonging to an
employee. If a record of an employee were deleted, it would not make sense to continue to keep
information about their children in the database.
Identification dependency: It occurs when a weak entity needs the strong one to identify itself. By itself,
the weak entity is not able to unequivocally identify its occurrences. The weak entity key is formed by
combining the strong entity key with the identifying attributes of the weak entity.
Example:
The figure shows that the province has one or more municipalities and that a municipality belongs
to a single province. Now, if what identifies the municipalities is the code that appears in the postal
code, the first two figures correspond to the province code and the last three to the municipality. For
example, the CP of Écija is 41400, where 41 is the code of the province and 400 that of the
municipality. In this way, there will be different municipalities with code 400 in different provinces.
One of these municipalities will be distinguished from the rest by placing the first two digits
corresponding to the province code before it.
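The province and municipality case can be sketched with Python's sqlite3 module: the key of the weak entity is the pair (province code, municipality code). Table and column names are assumptions for illustration, and province '99' with its town is a hypothetical row added only to show that code 400 can repeat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE province (province_code TEXT NOT NULL PRIMARY KEY, name TEXT NOT NULL)"
)
# Weak entity: its key combines the strong entity's key with its own partial identifier.
conn.execute("""
    CREATE TABLE municipality (
        province_code     TEXT NOT NULL REFERENCES province (province_code),
        municipality_code TEXT NOT NULL,
        name              TEXT NOT NULL,
        PRIMARY KEY (province_code, municipality_code)
    )
""")

conn.execute("INSERT INTO province VALUES ('41', 'Seville')")
conn.execute("INSERT INTO province VALUES ('99', 'Hypothetical province')")
conn.execute("INSERT INTO municipality VALUES ('41', '400', 'Ecija')")
# Code 400 can be reused in another province without any clash.
conn.execute("INSERT INTO municipality VALUES ('99', '400', 'Hypothetical town')")

# Province code followed by municipality code gives the full identifying code.
postal_codes = [
    province + municipality
    for province, municipality in conn.execute(
        "SELECT province_code, municipality_code FROM municipality "
        "ORDER BY province_code"
    )
]
print(postal_codes)
```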
Exclusivity restriction between two types of relationships R1 and R2 with respect to entity E1. It
means that E1 is related to either E2 or E3, but both relationships cannot occur simultaneously.
Inclusiveness constraint between two types of relationships R1 and R2 with respect to entity E1. For
entity E1 to participate in relationship R2, it must previously participate in relationship R1.
Example: For an employee to work as a product designer, they must have attended at least two
courses.
Exclusion constraint between two types of relationships R1 and R2. It means that E1 is related to
E2 either through R1 or through R2 but that both relationships cannot occur simultaneously.
Example: Employees, depending on their capabilities, either design products or are
operators who manufacture them; no employee can be a designer and a
manufacturer at the same time.
Inclusion constraint between two types of relationships R1 and R2. For entity E1 to participate in
relationship R2 with E2, it must previously participate in relationship R1.
Example: For a man to divorce a woman, he must have previously married her.
EXTENDED E/R MODEL
The extended Entity / Relationship model includes everything seen in the Entity / Relationship
model but also the Hierarchy Relationships. A hierarchical relationship occurs when an entity can
be related to others through a relationship whose role would be "It is a type of".
We want to make a DB about the animals in a Zoo. We have the entities ANIMAL, FELINE, BIRD,
REPTILE, INSECT. FELINE, BIRD, REPTILE and INSECT would have the same type of
relationship with ANIMAL: «they are a type of». Now, its representation using the classic E / R
would be quite cumbersome:
To avoid having to repeat the rhombus of the same relationship so many times, special symbols are
used for these cases: all the "is a type of" relationship rhombuses are replaced by an inverted
triangle, where the entities below are always a type of the entity above and are called subtype or
child entities.
The one above is called the supertype or parent entity. Hierarchical relationships are always
based on an attribute that is placed next to the "is_a" relationship. In the figure below it would be
"type". The example above would look like this using symbols from the extended E/R.
Hierarchy Relations
Total: We subdivide the Employee entity into: Engineer, Secretary and Technician and in
our DB there is no other employee that does not belong to one of these three types.
Partial: We subdivide the Employee entity into: Engineer, Secretary and Technician, but in
our DB there may be employees who do not belong to any of these three types.
Overlapped: We subdivide the Employee entity into: Engineer, Secretary and Technician
and in our DB there may be employees who are both Engineers and secretaries, or
secretaries and technicians, etc.
Exclusive: We subdivide the Employee entity into: Engineer, Secretary and Technician. In
our database, no employee belongs to more than one sub-entity.
Examples:
In this DB an employee can perform only one of the three occupations (exclusive): technician,
astronaut or scientist. An employee can also perform a different job, for example
physicist (partial).
RELATIONAL MODEL
The DBMS can be classified according to the logical model they support, the number of users, the
number of positions, the cost ... The most important classification of the DBMS is based on the
logical model, being the main models used in the market the following: Hierarchical, Networked,
Relational and Object Oriented.
Most of the current commercial DBMS are based on the relational model, which we are going to
focus on, while the older systems were based on the network model or the hierarchical model.
The reasons for the success of the relational model are fundamentally two:
- They are based on relational algebra, which is a mathematical model with solid foundations.
- They offer simple and efficient systems for representing and manipulating data.
In this section the relational model is presented. We will describe the basic principles of the
relational model: the relational data structure and the integrity rules.
The fundamental structure of the relational model is precisely
that, the «relation», that is to say, a two-dimensional table made up of rows (records or tuples) and
columns (attributes or fields). The relations or tables represent the entities of the E/R model,
while the attributes of the relation represent the properties or attributes of those entities. For
instance, if the entity PERSON has to be represented in the database, it will become a relation or
table called "PERSONS", whose attributes describe the characteristics of the persons (following table).
Each tuple or record in the "PERSONS" relation will represent a specific person.
PERSONS
Actually, being rigorous, a RELATION of the RELATIONAL MODEL is only the definition of the
structure of the table, that is, its name and the list of the attributes that compose it. A representation
of the definition of that relationship could be the following:
To distinguish one record from another, the primary key is used.
In a relation there may be several combinations of attributes that allow a row to be uniquely
identified (these are called «candidate keys»), but only one of them will be chosen
as the primary key. The attributes of the primary key cannot assume the null
value.
Relation (table): It represents an entity about which you want to store information in
the DB. It is formed by:
o Rows (records or tuples): They correspond to each occurrence of the entity.
o Columns (attributes or fields): They correspond to the properties of the entity.
Being rigorous, a relation is constituted only by the attributes, without the
tuples.
Relations have the following properties:
o Each relation has a name, and this is different from the name of all other
relations in the same database.
o No two attributes are named the same in the same relation.
o The order of the attributes does not matter: the attributes are not ordered.
o Each tuple is different from the others: there are no duplicate tuples. (At least they
will differ in the primary key.)
o The order of the tuples does not matter: the tuples are not ordered.
Candidate key: attribute or set of attributes that uniquely identifies a tuple. Any of the
candidate keys could be chosen as the primary key.
Primary key: Candidate key that we choose as the identifier of the tuples.
Alternative key: Any candidate key that is not the primary key (those that we have not chosen
as the primary key).
A primary key cannot assume the value null (Entity Integrity).
Domain of an attribute: Set of values that can be assumed by said attribute.
Foreign key: the attribute or set of attributes that make up the primary
key of another relation. The fact that an attribute is a foreign key in a table means that,
to enter a value in that attribute, the value must have been previously entered in the source table. That
is, the values present in the foreign key have to correspond to values present in the
corresponding primary key (Referential Integrity).
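The two integrity rules can be observed in practice with Python's sqlite3 module. This is a minimal sketch under stated assumptions: the DEPARTMENT and EMPLOYEE tables, their columns and the `rejected` helper are hypothetical. Two SQLite specifics apply: foreign keys are enforced only after `PRAGMA foreign_keys = ON`, and key columns are declared `NOT NULL` explicitly because SQLite does not always imply it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute("CREATE TABLE department (dept_id TEXT NOT NULL PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE employee (
        emp_id  TEXT NOT NULL PRIMARY KEY,
        name    TEXT,
        dept_id TEXT NOT NULL REFERENCES department (dept_id)
    )
""")
conn.execute("INSERT INTO department VALUES ('D1', 'Design')")
conn.execute("INSERT INTO employee VALUES ('E1', 'Ana', 'D1')")


def rejected(sql):
    """Return True when the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Entity integrity: the primary key cannot be NULL and cannot repeat.
null_key_rejected = rejected("INSERT INTO employee VALUES (NULL, 'Eve', 'D1')")
duplicate_key_rejected = rejected("INSERT INTO employee VALUES ('E1', 'Juan', 'D1')")
# Referential integrity: 'D9' was never entered in the source table.
unknown_fk_rejected = rejected("INSERT INTO employee VALUES ('E2', 'Eve', 'D9')")

print(null_key_rejected, duplicate_key_rejected, unknown_fk_rejected)
```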
Note
When moving from the E / R schema to the Relational schema, we must add the foreign keys
necessary to establish the interrelationships between the tables. These foreign keys are not
represented in the E / R schema.
Important
The relational diagrams must be elaborated in such a way that, after entering data, no
foreign key holds a null value (NULL). For this, the rules shown below are followed.
Entities
Each entity is transformed into a table. The identifier (or identifiers) of the entity becomes the
primary key of the table and appears underlined or with the indication PK (Primary Key). If
there is an alternative key, it is shown in bold.
Example: All entities in the previous example generate table. Specifically, the CLASSROOM entity
generates the following table:
N:M relationships: It is the simplest case. They always generate a table. A table is created that
incorporates as foreign keys (FK) each of the keys of the entities that
participate in the relationship. The primary key of this new table is made up of these fields. It is
important to note that these are not 2 primary keys, but one primary key composed of 2 fields.
If the relationship has attributes of its own, they go to the relationship table. It would be exactly the
same if there were minimum participations of 0.
Order of the attributes in composite keys: All the attributes
that make up the key must be placed on the left. The order of the attributes that make up the key
will be determined by the queries to be made. The tuples in the table are usually ordered following
the key as an index. Therefore, it is convenient to put first the attribute or attributes by which the
queries are going to be carried out.
Example:
Let's move to tables of the N: M relationship between MODULE (1, n) and STUDENT (1, n). This
type of relationship always generates a table and the attributes of the relationship are passed to the
table that it generates.
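The N:M rule can be sketched with Python's sqlite3 module. The table name `enrollment`, the column names, the `grade` attribute of the relationship and the sample rows are all assumptions made for illustration; the source does not say which attributes the relationship between MODULE and STUDENT carries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE student (student_id TEXT NOT NULL PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE module (module_id TEXT NOT NULL PRIMARY KEY, title TEXT)")
conn.execute("""
    CREATE TABLE enrollment (
        student_id TEXT NOT NULL REFERENCES student (student_id),
        module_id  TEXT NOT NULL REFERENCES module (module_id),
        grade      REAL,                     -- attribute of the relationship
        PRIMARY KEY (student_id, module_id)  -- one key made of two fields
    )
""")
conn.executemany("INSERT INTO student VALUES (?, ?)", [("S1", "Eve"), ("S2", "Ana")])
conn.executemany(
    "INSERT INTO module VALUES (?, ?)", [("M1", "Databases"), ("M2", "Networks")]
)
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [("S1", "M1", 7.5), ("S1", "M2", 6.0), ("S2", "M1", 9.0)],
)

# The same table answers the relationship in both directions.
modules_of_s1 = [r[0] for r in conn.execute(
    "SELECT module_id FROM enrollment WHERE student_id = 'S1' ORDER BY module_id")]
students_of_m1 = [r[0] for r in conn.execute(
    "SELECT student_id FROM enrollment WHERE module_id = 'M1' ORDER BY student_id")]
print(modules_of_s1, students_of_m1)
```

Placing `student_id` first in the composite key favors queries by student, in line with the advice on attribute order given above.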
1:N relationships: They do not generally generate a table. There are 2 cases:
Case 1: If the entity on the "1" side presents participation (0,1), then a new table is created
for the relationship that incorporates the keys of both entities as foreign keys. The primary
key of the new table will only be the key of the entity on the "N" side.
Case 2: For the rest of the situations, the entity on the "N" side receives the key of the entity
on the "1" side as a foreign key. The attributes of the relationship are passed to the table
where the foreign key has been incorporated.
Example of case 1: Let's move to tables the 1:N relationship between TEACHER (1,n) and
COMPANY (0,1). Since on the "1" side we find minimum participation 0, a new table will be
generated.
Example of case 2: Let's move to tables the 1:N relationship between MODULE (1,1) and TOPIC
(1,n). As there is no minimum participation of 0 on the "1" side, it does not generate a table, and the primary
key of the "1" side passes as a foreign key to the "N" side.
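Both 1:N cases can be sketched with Python's sqlite3 module. Table names such as `teacher_company`, the key column names and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Case 2: MODULE (1,1) - TOPIC (1,n). No table for the relationship;
# the key of the "1" side travels to the "N" side as a NOT NULL foreign key.
conn.execute("CREATE TABLE module (module_id TEXT NOT NULL PRIMARY KEY)")
conn.execute("""
    CREATE TABLE topic (
        topic_id  TEXT NOT NULL PRIMARY KEY,
        module_id TEXT NOT NULL REFERENCES module (module_id)
    )
""")

# Case 1: TEACHER (1,n) - COMPANY (0,1). A table for the relationship,
# whose primary key is only the key of the "N" side (the teacher).
conn.execute("CREATE TABLE company (company_id TEXT NOT NULL PRIMARY KEY)")
conn.execute("CREATE TABLE teacher (teacher_id TEXT NOT NULL PRIMARY KEY)")
conn.execute("""
    CREATE TABLE teacher_company (
        teacher_id TEXT NOT NULL PRIMARY KEY REFERENCES teacher (teacher_id),
        company_id TEXT NOT NULL REFERENCES company (company_id)
    )
""")

conn.execute("INSERT INTO company VALUES ('C1')")
conn.executemany("INSERT INTO teacher VALUES (?)", [("T1",), ("T2",)])
conn.execute("INSERT INTO teacher_company VALUES ('T1', 'C1')")  # T2 has no company

# T2 is simply absent from teacher_company, so no foreign key is ever NULL.
teachers_without_company = [r[0] for r in conn.execute("""
    SELECT t.teacher_id
    FROM teacher t
    LEFT JOIN teacher_company tc ON tc.teacher_id = t.teacher_id
    WHERE tc.teacher_id IS NULL
""")]
print(teachers_without_company)
```

Had the foreign key been placed directly in the teacher table, teacher T2 would need a NULL there, which is what case 1 avoids.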
1:1 relationships: There are 3 cases:
Case 1: If the two entities participate with participation (0,1), then a new table is created for
the relationship.
Case 2: If one entity, but not both, participates with a minimum participation of 0, that is (0,1),
then the foreign key is placed in said entity, to avoid null values as much as possible.
Case 3: If we have a 1:1 relationship and neither entity has a minimum participation of 0, we
choose the primary key of one of them and enter it as a foreign key in the other table. One
way or the other will be chosen depending on how you want to organize the information to
facilitate queries. The attributes of the relationship are passed to the table where the
foreign key is entered.
Example of case 1: There is no such situation in the scheme studied. A situation where this case can
occur is MAN (0,1) marries WOMAN (0,1). It is similar to case 1 of the previous section on 1:N
relationships, although in this case we must establish a unique-value constraint for FK2.
Example of case 2 (and 3): Let's move to tables the 1:1 relationship between STUDENT (1,1)
and SCHOLARSHIP (0,1). As SCHOLARSHIP has a minimum participation of 0, we incorporate
into it, as a foreign key, the STUDENT key. This way of proceeding is also valid for case 3, where
either of the entities can receive the foreign key.
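The STUDENT and SCHOLARSHIP case can be sketched with Python's sqlite3 module. Column names and amounts are hypothetical. The `UNIQUE` constraint on the foreign key is an addition of this sketch: it is what keeps the relationship 1:1 instead of 1:N.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE student (student_id TEXT NOT NULL PRIMARY KEY, name TEXT)")
# The foreign key lives in SCHOLARSHIP, so students without a scholarship
# never force a NULL; UNIQUE allows at most one scholarship per student.
conn.execute("""
    CREATE TABLE scholarship (
        scholarship_id TEXT NOT NULL PRIMARY KEY,
        amount         REAL,
        student_id     TEXT NOT NULL UNIQUE REFERENCES student (student_id)
    )
""")
conn.executemany("INSERT INTO student VALUES (?, ?)", [("S1", "Eve"), ("S2", "Ana")])
conn.execute("INSERT INTO scholarship VALUES ('B1', 1500.0, 'S1')")

# A second scholarship for the same student would turn the 1:1 into a 1:N.
try:
    conn.execute("INSERT INTO scholarship VALUES ('B2', 900.0, 'S1')")
    second_scholarship_accepted = True
except sqlite3.IntegrityError:
    second_scholarship_accepted = False

print(second_scholarship_accepted)
```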
Trick
Existence dependency relationships: They behave like a normal 1:N. The primary key of the
"1" side passes to the "N" side as a foreign key (where the arrow points).
Example: We do not find any example, strictly speaking, in the previous case. In the passage
to tables they behave like any other 1:N relationship. The fact of being weak in existence only
needs to be taken into account when creating the DB, to impose that when an
occurrence on the "1" side is deleted, the associated ones on the "N" side are deleted too.
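One way to impose that deletion rule is a cascading foreign key. The sketch below uses Python's sqlite3 module and the employee and children case described earlier; table names, column names and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # cascades only run with enforcement on
conn.execute("CREATE TABLE employee (emp_id TEXT NOT NULL PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE child (
        child_id INTEGER PRIMARY KEY,
        name     TEXT,
        emp_id   TEXT NOT NULL REFERENCES employee (emp_id) ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO employee VALUES ('E1', 'Isabel')")
conn.executemany(
    "INSERT INTO child (name, emp_id) VALUES (?, 'E1')", [("Eve",), ("Juan",)]
)

children_before = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]
# Deleting the strong occurrence removes its dependent weak occurrences.
conn.execute("DELETE FROM employee WHERE emp_id = 'E1'")
children_after = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]

print(children_before, children_after)
```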
N-ary relationships (we will only see up to degree 3): They always generate a table. The primary
keys of the entities participating in the relationship are passed to the new table as foreign keys, and
only those on the "N" sides form the primary key. If there are attributes of the relationship, they are
included in that table. Example: We do not find any example of a relationship of degree higher than 2
in the previous assumption. They will be seen when they appear in an exercise.
Reflexive relationships
Example: Let's move to tables the reflexive relationship of STUDENT. Since it has no
minimum participation of 0 on the "1" side, it does not generate a table. The primary key of STUDENTS will
reappear in STUDENTS as a foreign key, as in any 1:N relationship. Now, as there cannot be two
fields with the same name in the same table, we will have to rename the copy of the primary key
slightly, so that its name refers to the role it plays as a foreign key.
Hierarchies
Elimination of hierarchical relationships: Hierarchical relationships are a special case. Some
guidelines can be given as a reference, but in most cases the result will depend on the specific problem.
These guidelines are:
A table is created for the supertype entity, unless it has so few attributes
that keeping it is a complication.
If the subtype entity has no attributes and is not related to any other entity, it disappears.
If the subtype entity has any attributes, a table is created for it. If it does not have its own key, it
inherits that of the supertype entity.
If the relationship is exclusive, the attribute that generates the hierarchy is
incorporated into the table of the supertype entity. If a table has been created for each of the
subtype entities, it is not necessary to add that attribute to the supertype entity.
Example: We do not find any example of a hierarchical relationship in the previous assumption.
Its passage to tables will be seen when such relationships appear in concrete examples.
NORMALIZATION
The design of a relational database can be carried out by applying to the real world, in a first phase,
a model such as the E / R model, in order to obtain a conceptual scheme; in a second phase, said
scheme is transformed to the relational model by means of the corresponding transformation rules.
It is also possible, although perhaps less advisable, to obtain the relational schema without
performing that intermediate step that is the conceptual schema. In both cases, it is convenient
(mandatory in the direct relational model) to apply a set of rules, known as Normalization Theory,
which allow us to ensure that a relational schema meets certain properties, avoiding:
In practice, if the DB has been designed using semantic models such as the E / R model,
normalization is not usually necessary. On the other hand, if they provide us with a database
created without carrying out a prior design, it is very likely that we will need to normalize.
In relational database theory, normal forms (NF) provide the criteria for determining the degree of
vulnerability of a table to inconsistencies and logical anomalies.
The higher the normal form applicable to a table, the less vulnerable it is to inconsistencies and
anomalies. Edgar F. Codd defined the first normal form (1NF) in 1970 and the second and third
normal forms (2NF and 3NF) in 1971.
These normal forms have been summarized as requiring that all attributes be atomic, dependent on
the complete key, and in a direct (non-transitive) way. The Boyce-Codd normal form (BCNF) was
introduced in 1974 by the two authors who appear in its name. The fourth and fifth normal forms
(4NF and 5NF) deal specifically with the representation of many-to-many and one-to-many
relationships between attributes and were introduced by Fagin in 1977 and 1979 respectively. Each
normal form includes the previous ones.
Before giving the concepts of normal forms, let's see some previous definitions:
First Normal Form: 1NF
A Relation is in 1NF if and only if all its attributes take atomic values.
Example: Suppose we have the following table with student data from a secondary school.
Students
DNI        Name  Course  Enrollment Date  Tutor   Student Location  Student Province  Telephones
11111111A  Eve   1ESO-A  01-July-2016     Isabel  Ecija             Seville           660111222
22222222B  Ana   1ESO-A  09-July-2016     Isabel  Ecija             Seville           660222333 660333444 660444555
As can be seen, this table is not in 1NF, since the Telephones field contains several values within the
same cell and therefore it is not a field whose values are atomic. The solution would be the
following:
Students
DNI        Name     Course  Enrollment Date  Tutor      Student Location  Student Province
11111111A  Eve      1ESO-A  01-July-2016     Isabel     Ecija             Seville
22222222B  Ana      1ESO-A  09-July-2016     Isabel     Ecija             Seville
33333333C  Suzanne  1ESO-B  11-July-2016     Robert     Ecija             Seville
44444444D  Juan     2ESO-A  05-July-2016     Frederick  The Villar        Cordova
55555555E  Joseph   2ESO-A  02-July-2016     Frederick  The Villar        Cordova
Telephones
DNI        Telephone
11111111A  660111222
22222222B  660222333
22222222B  660333444
22222222B  660444555
55555555E  661000111
55555555E  661000222
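The step to 1NF can be sketched in plain Python: the multi-valued cell is split so that each telephone gets its own row, keyed by DNI. Representing rows as dictionaries is an assumption of this sketch; the data are the two rows of the original table.

```python
# Rows that are not in 1NF: the Telephones field packs several values into one cell.
students = [
    {"DNI": "11111111A", "Name": "Eve", "Telephones": "660111222"},
    {"DNI": "22222222B", "Name": "Ana",
     "Telephones": "660222333 660333444 660444555"},
]

# 1NF: one atomic value per cell. The student table drops the multi-valued field...
students_1nf = [{"DNI": s["DNI"], "Name": s["Name"]} for s in students]

# ...and every phone becomes its own row in a separate table, keyed by DNI.
telephones = [
    {"DNI": s["DNI"], "Telephone": phone}
    for s in students
    for phone in s["Telephones"].split()
]

for row in telephones:
    print(row["DNI"], row["Telephone"])
```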
Second Normal Form: 2NF
A Relation is in 2NF if and only if it is in 1NF and all attributes that are not part of the Primary
Key have complete functional dependency on it.
Example: We continue with the previous example. We will work with the following table:
Students
DNI        Name     Course  Enrollment Date  Tutor      Student Location  Student Province
11111111A  Eve      1ESO-A  01-July-2016     Isabel     Ecija             Seville
22222222B  Ana      1ESO-A  09-July-2016     Isabel     Ecija             Seville
33333333C  Suzanne  1ESO-B  11-July-2016     Robert     Ecija             Seville
44444444D  Juan     2ESO-A  05-July-2016     Frederick  The Villar        Cordova
55555555E  Joseph   2ESO-A  02-July-2016     Frederick  The Villar        Cordova
Let's examine the functional dependencies. The graph that represents them is the following:
Whenever a DNI appears, the corresponding Name and the corresponding Student
Location will appear. Therefore DNI → Name and DNI → StudentLocation. On the other
hand, whenever a Course appears, the corresponding Tutor will appear. Therefore Course
→ Tutor. The attributes Name and StudentLocation do not functionally depend on Course,
and the Tutor attribute does not functionally depend on DNI.
The only attribute that does depend completely on the composite key DNI and Course is
Enrollment Date: (DNI, Course) → EnrollmentDate.
When establishing the primary key of a table we must choose an attribute, or a set of attributes, on
which all the other attributes functionally depend, and the dependency must be a full functional
dependency. If we choose DNI as the primary key, there is an attribute (Tutor) that does not
functionally depend on it. If we choose Course as the primary key, there are other attributes (Name,
Student Location) that do not depend on it.
If we choose the combination (DNI, Course) as the primary key, every other attribute is functionally
dependent on the key. However, for Name, Student Location and Tutor the dependency is partial,
because each of them depends on only one part of the key; only Registration Date depends on the
whole key. Therefore, this table is not in 2NF.
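The reasoning above can be written as a small check. The sketch below is illustrative only: it takes the functional dependencies stated in the text as given (it does not discover them from data), and the attribute identifiers and the function name partial_dependencies are hypothetical.

```python
# Functional dependencies from the text, as (determinant, dependent) pairs.
# The attribute identifiers are illustrative spellings of the column names.
FDS = [
    ({"DNI"}, "Name"),
    ({"DNI"}, "StudentLocation"),
    ({"Course"}, "Tutor"),
    ({"DNI", "Course"}, "RegistrationDate"),
]


def partial_dependencies(key, fds):
    """Return the dependencies whose determinant is only a part of the key."""
    key = set(key)
    # "lhs < key" is Python's proper-subset test for sets.
    return [(lhs, rhs) for lhs, rhs in fds if lhs < key and rhs not in key]


violations = partial_dependencies({"DNI", "Course"}, FDS)
for lhs, rhs in violations:
    print(sorted(lhs), "->", rhs)
```

With the key (DNI, Course) the check reports Name, StudentLocation and Tutor as partially dependent, while RegistrationDate is not reported because it needs the whole key.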
The solution would be the following:
Students
DNI        Name     Location    Province
11111111A  Eve      Ecija       Seville
22222222B  Ana      Ecija       Seville
33333333C  Suzanne  Ecija       Seville
44444444D  Juan     The Villar  Cordova
55555555E  Joseph   The Villar  Cordova
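Enrollments
DNI        Course  Registration Date
11111111A  1ESO-A  01-July-2016
22222222B  1ESO-A  09-July-2016
33333333C  1ESO-B  11-July-2016
44444444D  2ESO-A  05-July-2016
55555555E  2ESO-A  02-July-2016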
Courses
Course Tutor
1ESO-A Isabel
1ESO-B Robert
2ESO-A Frederick
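One way to convince ourselves that this split loses no information is to project the original rows onto the three new tables and join them back. The following is a minimal sketch using the rows of the unnormalized Students table of this example; the helper name project and the variable names are hypothetical.

```python
STUDENTS = [
    # (DNI, Name, Course, RegistrationDate, Tutor, Location, Province)
    ("11111111A", "Eve", "1ESO-A", "01-July-2016", "Isabel", "Ecija", "Seville"),
    ("22222222B", "Ana", "1ESO-A", "09-July-2016", "Isabel", "Ecija", "Seville"),
    ("33333333C", "Suzanne", "1ESO-B", "11-July-2016", "Robert", "Ecija", "Seville"),
    ("44444444D", "Juan", "2ESO-A", "05-July-2016", "Frederick", "The Villar", "Cordova"),
    ("55555555E", "Joseph", "2ESO-A", "02-July-2016", "Frederick", "The Villar", "Cordova"),
]


def project(rows, indexes):
    """Keep the given column positions and drop duplicate rows."""
    return {tuple(row[i] for i in indexes) for row in rows}


students = project(STUDENTS, (0, 1, 5, 6))   # DNI, Name, Location, Province
enrollments = project(STUDENTS, (0, 2, 3))   # DNI, Course, RegistrationDate
courses = project(STUDENTS, (2, 4))          # Course, Tutor

# Join the three tables on DNI and Course, restoring the original column order.
rejoined = {
    (dni, name, course, date, tutor, location, province)
    for dni, name, location, province in students
    for e_dni, course, date in enrollments if e_dni == dni
    for c_course, tutor in courses if c_course == course
}

print(len(courses))               # 3 distinct courses
print(rejoined == set(STUDENTS))  # True: the join rebuilds the original rows
```

The Tutor of each course is now stored once per course instead of once per enrolled student.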
Third Normal Form: 3NF
A relation is in 3NF if and only if it is in 2NF and there are no transitive dependencies: every
non-key attribute must depend directly on the primary key.
Example: We continue with the previous example. We will work with the following table:
Students
DNI        Name     Location    Province
11111111A  Eve      Ecija       Seville
22222222B  Ana      Ecija       Seville
33333333C  Suzanne  Ecija       Seville
44444444D  Juan     The Villar  Cordova
55555555E  Joseph   The Villar  Cordova
The existing functional dependencies are DNI → Name, DNI → Location and Location → Province. As
we can see, there is a transitive functional dependency: DNI → Location → Province.
For the table to be in 3NF, there cannot be transitive functional dependencies. To solve the problem
we move the transitively dependent attribute into a new table. The result is:
Students
DNI        Name     Location
11111111A  Eve      Ecija
22222222B  Ana      Ecija
33333333C  Suzanne  Ecija
44444444D  Juan     The Villar
55555555E  Joseph   The Villar
Localities
Location Province
Ecija Seville
The Villar Cordova
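The dependency Location → Province can be checked against the sample rows, and the Localities table obtained by projection. This is a sketch under assumptions: the function name determines is hypothetical, the student rows follow the unnormalized Students table of this example, and a check on sample rows can refute a dependency but cannot prove that it holds in general.

```python
COLUMNS = ("DNI", "Name", "Location", "Province")
STUDENTS = [dict(zip(COLUMNS, row)) for row in [
    ("11111111A", "Eve", "Ecija", "Seville"),
    ("22222222B", "Ana", "Ecija", "Seville"),
    ("33333333C", "Suzanne", "Ecija", "Seville"),
    ("44444444D", "Juan", "The Villar", "Cordova"),
    ("55555555E", "Joseph", "The Villar", "Cordova"),
]]


def determines(rows, lhs, rhs):
    """True if no two of these rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


print(determines(STUDENTS, ["Location"], ["Province"]))  # True
print(determines(STUDENTS, ["Province"], ["DNI"]))       # False

# Remove the transitive dependency by projecting it into its own table.
localities = sorted({(r["Location"], r["Province"]) for r in STUDENTS})
students = [(r["DNI"], r["Name"], r["Location"]) for r in STUDENTS]
print(localities)
```

After the split, the province of each locality is recorded once, so it cannot become inconsistent between two students of the same locality.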
FINAL RESULT
Students
DNI        Name     Location
11111111A  Eve      Ecija
22222222B  Ana      Ecija
33333333C  Suzanne  Ecija
44444444D  Juan     The Villar
55555555E  Joseph   The Villar
Localities
Location Province
Ecija Seville
The Villar Cordova
Phones
DNI Telephone
11111111A 660111222
22222222B 660222333
22222222B 660333444
22222222B 660444555
55555555E 661000111
55555555E 661000222
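Enrollments
DNI        Course  Registration Date
11111111A  1ESO-A  01-July-2016
22222222B  1ESO-A  09-July-2016
33333333C  1ESO-B  11-July-2016
44444444D  2ESO-A  05-July-2016
55555555E  2ESO-A  02-July-2016
Courses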
Course Tutor
1ESO-A Isabel
1ESO-B Robert
2ESO-A Frederick
Boyce-Codd Normal Form: BCNF
A relation is in BCNF if it is in 3NF and every determinant is a candidate key. A relation that is
already in 3NF can only fail this condition when it has several composite candidate keys that
overlap, so we only have to take this normal form into account in that situation. This is rarely
the case.
Example: We have a table with supplier information, part codes, and the quantity of each part
provided by each supplier. Each supplier has a unique name. The data are:
Supplies
CIF          Name      Part Code  Quantity
S-11111111A  Ferroman  1          10
B-22222222B  Ferrotex  1          7
M-33333333C  Ferropet  3          4
S-11111111A  Ferroman  2          20
S-11111111A  Ferroman  3          15
B-22222222B  Ferrotex  2          8
B-22222222B  Ferrotex  3          4
The Quantity attribute is functionally dependent on two composite candidate keys, which are
(CIF, Part Code) and (Name, Part Code).
There is also a functional dependency in both directions between the supplier's name and its CIF:
Name <-> CIF. It is the reason the table has two candidate keys.
For this table there is an overlap of 2 composite candidate keys. To avoid the overlap of candidate
keys we split the table. The solution is:
Suppliers
CIF Name
S-11111111A Ferroman
B-22222222B Ferrotex
M-33333333C Ferropet
Supplies
CIF          Part Code  Quantity
S-11111111A  1          10
B-22222222B  1          7
M-33333333C  3          4
S-11111111A  2          20
S-11111111A  3          15
B-22222222B  2          8
B-22222222B  3          4
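The candidate keys of the original Supplies table, and the dependencies that break BCNF, can be derived mechanically from its functional dependencies. The sketch below is illustrative: it uses the standard attribute-closure procedure with a brute-force key search that is only practical for small tables, and the attribute identifiers and function names are hypothetical.

```python
from itertools import combinations

ATTRS = {"CIF", "Name", "PartCode", "Quantity"}
# Functional dependencies of the original Supplies table, as described in the text.
FDS = [
    ({"CIF"}, {"Name"}),
    ({"Name"}, {"CIF"}),
    ({"CIF", "PartCode"}, {"Quantity"}),
    ({"Name", "PartCode"}, {"Quantity"}),
]


def closure(attrs, fds):
    """Return every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Search, smallest first, for minimal attribute sets that determine attrs."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            is_minimal = not any(key <= candidate for key in keys)
            if is_minimal and closure(candidate, fds) == attrs:
                keys.append(candidate)
    return keys


def bcnf_violations(attrs, fds):
    """Return the dependencies whose determinant is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds if closure(lhs, fds) != attrs]


keys = candidate_keys(ATTRS, FDS)
violations = bcnf_violations(ATTRS, FDS)
print([sorted(key) for key in keys])  # [['CIF', 'PartCode'], ['Name', 'PartCode']]
print(len(violations))                # 2: CIF -> Name and Name -> CIF
```

The two keys share PartCode, and the determinants CIF and Name are not keys on their own, which is why the supplier data is moved to a separate table.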
Fourth Normal Form: 4NF
A relation is in 4NF if and only if it is in 3NF (or BCNF) and the only multivalued dependencies it
contains are those determined by its candidate keys.
Example: We have a table with the information of our students, the subjects they take, and the
sports they practice.
Student body
Student Subject Sport
To normalize this table, we must realize that the subjects on offer form a limited set of values,
and the same is true of the sports. Therefore there are two multivalued dependencies:
Student →→ Subject
Student →→ Sport
On the other hand, there is no dependency between the subject taken and the sport practiced. To
normalize to 4NF we create two tables:
Study Subject
Student Subject
11111111A Math
11111111A Language
22222222B Math
Sports practice
Student Sport
11111111A Swimming
11111111A Basketball
22222222B Soccer
22222222B Swimming
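To see why the two facts are better kept apart, we can join the two tables on Student. When the two multivalued dependencies hold, a single table has to store every subject of a student paired with every sport of that same student. The sketch below is only an illustration; the variable names are hypothetical and the subject written "Language" is an assumed English rendering of the second subject.

```python
STUDY = [
    ("11111111A", "Math"),
    ("11111111A", "Language"),
    ("22222222B", "Math"),
]
PRACTICE = [
    ("11111111A", "Swimming"),
    ("11111111A", "Basketball"),
    ("22222222B", "Soccer"),
    ("22222222B", "Swimming"),
]

# Natural join on Student: each subject is paired with each sport of that student.
combined = sorted(
    (student, subject, sport)
    for student, subject in STUDY
    for other, sport in PRACTICE
    if other == student
)
print(len(combined))  # 6 rows in one table, against 3 + 4 rows in two tables

# Projecting the joined table gives back the two 4NF tables unchanged.
study_back = {(student, subject) for student, subject, _ in combined}
practice_back = {(student, sport) for student, _, sport in combined}
print(study_back == set(STUDY), practice_back == set(PRACTICE))  # True True
```

Adding one more sport for student 11111111A would add one row to the sports table, but two rows to the combined table, one per subject.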
Equivalent E / R diagram
Fifth Normal Form: 5NF
The fifth normal form (5NF) is a generalization of the previous one. It is also known as
project-join normal form (PJ/NF). A table is said to be in 5NF if and only if it is in 4NF and
every join dependency on it is implied by the candidate keys.
Example: We have a table with several suppliers that provide us with parts for different projects.
We assume that a Supplier supplies certain Parts in particular, a Project uses certain Parts, and a
Project is supplied by certain Suppliers. We then have the following multivalued dependencies:
Supplier →→ Part
Part →→ Project
Project →→ Supplier
You can see how a cycle occurs: Supplier →→ Part →→ Project →→ Supplier.
Supplies
Supplier  Part  Project
E1        PI3   PR2
E1        PI3   PR4
E1        PI6   PR2
E1        PI6   PR4
E4        PI3   PR2
E4        PI3   PR4
E4        PI6   PR2
E4        PI6   PR4
E6        PI3   PR2
E6        PI3   PR4
E6        PI6   PR2
E6        PI6   PR4
E2        PI1   PR1
E2        PI1   PR3
E2        PI2   PR1
E2        PI2   PR3
E5        PI1   PR1
E5        PI1   PR3
E5        PI2   PR1
E5        PI2   PR3
E3        PI4   PR5
E3        PI4   PR6
E3        PI5   PR5
E3        PI5   PR6
E7        PI4   PR5
E7        PI4   PR6
E7        PI5   PR5
E7        PI5   PR6
Supplier-Part
Supplier  Part
E1        PI3
E1        PI6
E4        PI3
E4        PI6
E6        PI3
E6        PI6
E2        PI1
E2        PI2
E5        PI1
E5        PI2
E3        PI4
E3        PI5
E7        PI4
E7        PI5
Part-Project
Part  Project
PI3   PR2
PI3   PR4
PI6   PR2
PI6   PR4
PI1   PR1
PI1   PR3
PI2   PR1
PI2   PR3
PI4   PR5
PI4   PR6
PI5   PR5
PI5   PR6
Project-Supplier
Project  Supplier
PR2      E1
PR4      E1
PR2      E4
PR4      E4
PR2      E6
PR4      E6
PR1      E2
PR3      E2
PR1      E5
PR3      E5
PR5      E3
PR6      E3
PR5      E7
PR6      E7
The natural join of these three tables gives us the original table:
Supplier-Part ⋈ Part-Project ⋈ Project-Supplier = Supplies
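The statement above can be verified directly on the example data. The sketch below is illustrative: it rebuilds the 28 rows of Supplies from the three groups of suppliers, parts and projects that appear in the table, projects them onto the three pairs of columns, and joins the projections back. The variable names are hypothetical.

```python
from itertools import product

# Each group lists the suppliers, parts and projects that occur together
# in the Supplies table of the example.
GROUPS = [
    (("E1", "E4", "E6"), ("PI3", "PI6"), ("PR2", "PR4")),
    (("E2", "E5"), ("PI1", "PI2"), ("PR1", "PR3")),
    (("E3", "E7"), ("PI4", "PI5"), ("PR5", "PR6")),
]
supplies = {row for group in GROUPS for row in product(*group)}

# The three binary projections.
supplier_part = {(s, p) for s, p, _ in supplies}
part_project = {(p, j) for _, p, j in supplies}
project_supplier = {(j, s) for s, _, j in supplies}

# Natural join of the three projections.
rejoined = {
    (s, p, j)
    for s, p in supplier_part
    for p2, j in part_project
    if p2 == p
    if (j, s) in project_supplier
}

print(len(supplies), len(supplier_part), len(part_project), len(project_supplier))
# 28 14 12 14
print(rejoined == supplies)  # True: the join introduces no spurious rows
```

The three small tables hold 40 rows of two columns in place of 28 rows of three columns, and each supplier-part, part-project or project-supplier fact is stated only once.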
Equivalent E / R diagram