Schema Refinement

Chapter 3
SCHEMA REFINEMENT
Table Of Contents
● Schema Refinement
● Problems Caused By Redundancy
● Use Of Decompositions
● Problems Related To Decomposition
● Functional Dependencies
● Types Of Functional Dependencies
● Normalization
• First Normal Form
• Second Normal Form
• Third Normal Form
• Boyce-Codd Normal Form
• Fourth Normal Form
• Fifth Normal Form
SCHEMA REFINEMENT
Schema refinement refers to the process of improving or enhancing the structure and
organization of a database or data model. It's about making sure things are arranged
neatly and logically so that it's easy to find what you need and the database works
efficiently. Schema refinement aims to fix problems in how data is organized in a
database. The main issue it tackles is storing the same information multiple times, which
wastes space and makes things confusing.
To solve this, we use a method called decomposition, which means breaking down the
data into smaller, more organized pieces. However, decomposition can create its own
problems if not done carefully.
REDUNDANCY & ANOMALIES
● Redundancy refers to the repetition of the same data, or duplicate copies of the same data stored in
different locations.
● Anomalies are the problems that occur in poorly planned, un-normalized databases where all the
data is stored in one table, which is sometimes called a flat file database. Let us consider such a schema.
Here all the data is stored in a single table, which causes redundancy of data and therefore
anomalies, as SID and Sname are repeated for the same CID. Let us discuss the
anomalies one by one.
PROBLEMS CAUSED BY REDUNDANCY
Redundancy of data can cause the following problems:
1. Redundant/Wasted Storage Space: Storing the same information multiple times takes up
unnecessary space in the database.
2. Insertion Anomalies: It may not be possible to store some information unless some other
information is stored as well.
3. Update Anomalies: If one copy of redundant data is updated, an inconsistency is created
unless all redundant copies of the data are updated.
4. Deletion Anomalies: It may not be possible to delete some information without losing some
other information as well.
Update Anomaly: If the fee changes from 5000 to 7000, we have to update the FEE column in all
the rows that store it; otherwise the data becomes inconsistent.
Insertion Anomaly and Deletion Anomaly: These anomalies exist only due to redundancy;
otherwise they do not exist.
Insertion Anomaly: A new course C4 is introduced, but no student has taken C4 yet.
To insert the course, we are forced to insert some dummy student data along with it.
Deletion Anomaly: Deleting student S3 also deletes the record of the course. Deleting
some data forces us to delete other useful data as well.
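The three anomalies can be reproduced on a small flat table. The sketch below is only an illustration: the column names (SID, Sname, CID, FEE) and the values 5000, 7000, S3 and C4 follow the text, but the rows themselves and the helper `fees_by_course` are assumed, since the original table is not reproduced here.

```python
# Hypothetical flat table: one row per (student, course) enrolment.
# The rows are invented for illustration; only the column names come from the text.
flat = [
    {"SID": "S1", "Sname": "A", "CID": "C1", "FEE": 5000},
    {"SID": "S2", "Sname": "B", "CID": "C1", "FEE": 5000},
    {"SID": "S3", "Sname": "C", "CID": "C2", "FEE": 3000},
]

def fees_by_course(rows):
    """Collect the set of distinct fees recorded for each course."""
    fees = {}
    for row in rows:
        fees.setdefault(row["CID"], set()).add(row["FEE"])
    return fees

# Update anomaly: the fee is changed in only one of the two C1 rows.
updated = [dict(r) for r in flat]
updated[0]["FEE"] = 7000
inconsistent = {cid for cid, f in fees_by_course(updated).items() if len(f) > 1}

# Deletion anomaly: removing student S3 also removes the only record of C2.
after_delete = [r for r in flat if r["SID"] != "S3"]
courses_left = {r["CID"] for r in after_delete}

# Insertion anomaly: course C4 has no student, so its row needs dummy values.
c4_row = {"SID": None, "Sname": None, "CID": "C4", "FEE": 4000}

print(inconsistent)   # courses that now carry conflicting fees
print(courses_left)   # courses still known after the delete
```

In this assumed data, course C1 ends up with two different fees after the partial update, and course C2 disappears entirely once S3 is deleted.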
Solution To Anomalies: Decomposition of Tables (Schema Refinement)
USE OF DECOMPOSITIONS
Decomposition is a technique used in database design to address problems caused by
redundancy. It involves breaking down a large table into smaller, more manageable tables,
each serving a specific purpose. Here's how decompositions can be helpful:
1. Eliminating Redundancy: By breaking down large tables into smaller ones, we can
store each piece of information only once, reducing redundancy and saving storage
space.
2. Improving Data Integrity: Decomposition helps maintain data integrity by ensuring
that each piece of information is stored in only one place, reducing the risk of
inconsistencies and update anomalies.
3. Simplifying Data Maintenance: Smaller, more specialized tables are easier to update
and maintain than larger, more complex ones. Decomposition makes it easier to add,
update, and delete data without affecting other parts of the database.
4. Enhancing Query Performance: Decomposition can improve the performance of queries that
need only one of the smaller tables, because each table holds less redundant data to
scan. Queries that need data from several tables still require joins.
PROBLEMS RELATED TO DECOMPOSITION
Decomposition, while helpful in addressing redundancy issues, can also introduce its own set of
problems:
1. Data Integrity Challenges: Decomposing tables may create dependencies between the decomposed
tables, which can lead to integrity constraints being violated if not managed properly.
2. Join Overhead: Decomposition often necessitates joining tables to retrieve related information,
which can result in increased query complexity and slower performance, especially if the joins
involve large datasets.
3. Increased Storage Requirements: While decomposition reduces redundancy, it may increase the
total storage requirements due to the need for additional indexes, keys, and storage for the
decomposed tables.
4. Complexity in Database Maintenance: Managing a database with decomposed tables can be more
complex, requiring careful attention to relationships and dependencies between tables during
updates, inserts, and deletes.
5. Data Consistency Concerns: Decomposed tables may introduce challenges in maintaining data
consistency, especially in distributed or highly concurrent environments where multiple transactions
are being processed simultaneously.
6. Loss of Semantic Meaning: Breaking down tables into smaller components may result in a loss of
semantic meaning or context, making it more challenging for users to understand the relationships
between different pieces of data.
FUNCTIONAL DEPENDENCIES
Functional dependencies are a fundamental concept in database theory. Here's a simple explanation:
In a database table, certain attributes (columns) depend on other attributes. For instance, in a table of
employees, the "employee ID" uniquely identifies each employee. This means that if you know the
"employee ID," you can determine the "employee name" or "department" associated with that ID.
A functional dependency is a relationship between two sets of attributes within a table, where the value
of one set of attributes uniquely determines the value of another set of attributes. It essentially describes
how the values of certain attributes are determined by other attributes in the table.
For example:
• In a table with attributes (A, B, C), if A determines B (i.e., if you know the value of A, you can
determine the value of B), we write it as A → B.
Functional dependencies are crucial for database design, normalization, and ensuring data integrity.
They help in organizing data efficiently and avoiding redundancy.
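As a minimal sketch of the definition above, the function below tests whether A → B holds in one particular set of rows. The function name `fd_holds` and the sample rows are assumptions made for illustration. Note that a functional dependency is a rule about every valid state of the table, so a check like this can refute a dependency but cannot prove it from a single sample.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by this set of rows."""
    seen = {}  # value of the lhs attributes -> value of the rhs attributes
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first time a key is seen, remember its value; afterwards compare.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical rows for a table with attributes (A, B, C).
rows = [
    {"A": 1, "B": "x", "C": 10},
    {"A": 1, "B": "x", "C": 20},
    {"A": 2, "B": "y", "C": 10},
]

a_determines_b = fd_holds(rows, ["A"], ["B"])  # same A always gives same B
a_determines_c = fd_holds(rows, ["A"], ["C"])  # A = 1 appears with C = 10 and 20
print(a_determines_b, a_determines_c)
```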
Types of functional dependencies:
1) Trivial functional dependency: If X → Y is a functional dependency where Y ⊆ X,
it is called a trivial functional dependency.
2) Non-trivial functional dependency: If X → Y and Y is not a subset of X, it is
called a non-trivial functional dependency.
3) Completely non-trivial functional dependency: If X → Y and X ∩ Y = ∅ (the empty set),
it is called a completely non-trivial functional dependency.
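The three cases can be told apart with plain set operations. The sketch below assumes attribute sets are given as Python sets of column names; the function name `classify_fd` is hypothetical.

```python
def classify_fd(lhs, rhs):
    """Classify the functional dependency lhs -> rhs by comparing attribute sets."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:        # Y is a subset of X
        return "trivial"
    if lhs & rhs:         # Y is not a subset of X, but X and Y overlap
        return "non-trivial"
    return "completely non-trivial"   # X and Y share no attribute

print(classify_fd({"A", "B"}, {"A"}))        # AB -> A
print(classify_fd({"A", "B"}, {"B", "C"}))   # AB -> BC
print(classify_fd({"A"}, {"B"}))             # A -> B
```

Under these definitions a completely non-trivial dependency is also non-trivial; the function reports the most specific label.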
Normalization
Normalization is a technique to remove or reduce redundancy in a table.
The following six normal forms are covered:
1. First Normal form
2. Second Normal form
3. Third Normal form
4. Boyce-Codd Normal Form
5. Fourth Normal Form
6. Fifth Normal Form
First Normal form
Rules for First Normal form
• Single-valued attributes (atomic values)
• Unique name for attributes / columns
• Order does not matter
• Attribute names should not change
Example
Before (Subject holds more than one value):
ID  Name    Age  Subject
1   Fasih   18   DBMS,OS
2   Chanda  18   DBMS
3   Ali     19   JAVA
After (First Normal form):
ID  Name    Age  Subject
1   Fasih   18   DBMS
1   Fasih   18   OS
2   Chanda  18   DBMS
3   Ali     19   JAVA
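The conversion in the example can be sketched as a small function that splits the multi-valued column into one row per value. The function name `to_first_normal_form` and the comma separator are assumptions; the data is taken from the example table.

```python
def to_first_normal_form(rows, column, sep=","):
    """Emit one row per value found in a multi-valued column."""
    result = []
    for row in rows:
        for value in row[column].split(sep):
            new_row = dict(row)            # copy the other columns unchanged
            new_row[column] = value.strip()
            result.append(new_row)
    return result

students = [
    {"ID": 1, "Name": "Fasih", "Age": 18, "Subject": "DBMS,OS"},
    {"ID": 2, "Name": "Chanda", "Age": 18, "Subject": "DBMS"},
    {"ID": 3, "Name": "Ali", "Age": 19, "Subject": "JAVA"},
]

students_1nf = to_first_normal_form(students, "Subject")
for row in students_1nf:
    print(row["ID"], row["Name"], row["Age"], row["Subject"])
```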
Second Normal form
Rules for Second Normal form:
• Table should be in First Normal form
• No partial dependency
Partial dependency
A partial dependency exists when a non-key column depends on only part of a composite
primary key rather than on the whole key.
Student
S ID  S NAME  Address
Subject
Sub ID  Sub Name
Score
Sc-ID  Sc-S ID  Sc-Sub ID  Marks  Teacher
Subject (with Teacher moved out of Score)
Sub ID  Sub Name  Teacher
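A partial dependency can be detected mechanically once the key and the functional dependencies are written down. The sketch below assumes, for illustration only, that a score is identified by the composite key (S_ID, Sub_ID) and that Teacher depends on Sub_ID alone; neither assumption is stated explicitly in the tables above. It also assumes a single candidate key, so "non-key" simply means "not in that key".

```python
def partial_dependencies(candidate_key, fds):
    """List FDs whose left side is a proper subset of the key and which
    determine at least one non-key attribute."""
    key = set(candidate_key)
    found = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        non_key = rhs - key
        if lhs < key and non_key:          # proper subset of the key
            found.append((sorted(lhs), sorted(non_key)))
    return found

# Assumed key and dependencies for the Score table.
score_key = ["S_ID", "Sub_ID"]
score_fds = [
    (["S_ID", "Sub_ID"], ["Marks"]),   # depends on the whole key
    (["Sub_ID"], ["Teacher"]),         # depends on part of the key
]

partial = partial_dependencies(score_key, score_fds)
print(partial)
```

Under these assumptions only Sub_ID → Teacher is reported, which is the dependency removed by storing Teacher with the subject instead of with the score.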
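Third Normal form
Rules for Third Normal form:
• Table should be in Second Normal form
• No transitive dependency
Transitive dependency
A transitive dependency exists when a non-key column depends on another non-key column
rather than directly on the primary key.
Score (before)
Sc-ID  S ID  Sub ID  Marks  Exam-name  Total marks
101    201   1       18     Mid        20
102    201   2       69     Final      80
103    202   1       18     Mid        20
Total marks depends on Exam-name, which is not the primary key, so the exam details are
moved to their own table.
Score (after)
Sc-ID  S ID  Sub ID  Marks  Exam-ID
Exam
Exam-ID  Exam-name  Total marks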
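The split into Score and Exam can be sketched in a few lines. The rows come from the example; the function name `split_exam` and the generated Exam_ID values (1, 2, ...) are assumptions, since the source does not show the actual identifiers.

```python
score = [
    {"Sc_ID": 101, "S_ID": 201, "Sub_ID": 1, "Marks": 18, "Exam_name": "Mid", "Total_marks": 20},
    {"Sc_ID": 102, "S_ID": 201, "Sub_ID": 2, "Marks": 69, "Exam_name": "Final", "Total_marks": 80},
    {"Sc_ID": 103, "S_ID": 202, "Sub_ID": 1, "Marks": 18, "Exam_name": "Mid", "Total_marks": 20},
]

def split_exam(rows):
    """Move (Exam_name, Total_marks) into an Exam table keyed by a generated Exam_ID."""
    exam_ids = {}      # (Exam_name, Total_marks) -> Exam_ID (hypothetical numbering)
    exam_table = []
    score_table = []
    for row in rows:
        exam = (row["Exam_name"], row["Total_marks"])
        if exam not in exam_ids:
            exam_ids[exam] = len(exam_ids) + 1
            exam_table.append(
                {"Exam_ID": exam_ids[exam], "Exam_name": exam[0], "Total_marks": exam[1]}
            )
        score_table.append({
            "Sc_ID": row["Sc_ID"], "S_ID": row["S_ID"], "Sub_ID": row["Sub_ID"],
            "Marks": row["Marks"], "Exam_ID": exam_ids[exam],
        })
    return score_table, exam_table

score_table, exam_table = split_exam(score)
print(exam_table)
print(score_table)
```

Each exam's name and total marks are now stored once, and every score row refers to them through Exam_ID.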
Boyce-Codd Normal Form
Rule for BCNF
BCNF is an advanced version of 3NF and is stricter than 3NF.
A table is in BCNF if it is in 3NF and, for every non-trivial functional dependency X → Y,
the left-hand side X is a super key of the table.
Super Key: A set of one or more attributes (columns) that uniquely identifies each row
(record) in a table.
Example
Employee_ID  First_Name  Last_Name  Department
101          John        Smith      HR
102          Alice       Johnson    IT
103          Bob         Williams   Sales
104          Sarah       Brown      Marketing
Example
Employee_ID  Project_ID  Employee_Name  Project_Name  Department
101          501         John           Project X     HR
102          502         Alice          Project Y     IT
103          501         Bob            Project X     Sales
104          503         Sarah          Project Z     Marketing
This table is decomposed into the following three tables.
Employee Table:
Employee_ID Employee_Name Department
101 John HR
102 Alice IT
103 Bob Sales
104 Sarah Marketing
Project Table:
Project_ID Project_Name
501 Project X
502 Project Y
503 Project Z
Employee-Project Table:
Employee_ID Project_ID
101 501
102 502
103 501
104 503
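The BCNF rule can be checked with the standard attribute-closure computation. In the sketch below the functional dependencies are assumptions read off the example (Employee_ID determines Employee_Name and Department; Project_ID determines Project_Name); the source does not list them explicitly, and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose left side is not a super key."""
    all_attrs = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)               # skip trivial FDs
        and not all_attrs <= closure(lhs, fds)    # left side is not a super key
    ]

combined = ["Employee_ID", "Project_ID", "Employee_Name", "Project_Name", "Department"]
assumed_fds = [
    (("Employee_ID",), ("Employee_Name", "Department")),
    (("Project_ID",), ("Project_Name",)),
]

combined_violations = bcnf_violations(combined, assumed_fds)
employee_violations = bcnf_violations(
    ["Employee_ID", "Employee_Name", "Department"], assumed_fds[:1]
)
print(combined_violations)   # both assumed FDs violate BCNF in the single table
print(employee_violations)   # no violation in the decomposed Employee table
```

Under these assumed dependencies the single table violates BCNF twice, while the Employee table on its own has Employee_ID as a super key and passes the check.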
Fourth Normal Form
A relation is in 4NF if it is in BCNF and has no multi-valued dependency.
Multi-valued Dependency:
An MVD occurs when two or more independent multi-valued facts about
the same attribute occur within the same relation.
An MVD is denoted by X →→ Y, read as "X multi-determines Y" or "Y is
multi-dependent on X".
faculty subject Committee
Kailash DBMS Placement
Kailash JAVA Placement
Kailash C Placement
Kailash DBMS Scholarship
Kailash JAVA Scholarship
Kailash C Scholarship
Faculty-Committee
Faculty Committee
Kailash Placement
Kailash Scholarship
Faculty-Course
faculty subject
Kailash DBMS
Kailash JAVA
Kailash C
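The decomposition above loses no information, which can be confirmed by projecting the original table onto the two smaller tables and joining them back on faculty. This is a minimal sketch using Python sets of tuples; the variable names are chosen for illustration.

```python
# Original table: (faculty, subject, committee), taken from the example.
original = {
    ("Kailash", "DBMS", "Placement"),
    ("Kailash", "JAVA", "Placement"),
    ("Kailash", "C", "Placement"),
    ("Kailash", "DBMS", "Scholarship"),
    ("Kailash", "JAVA", "Scholarship"),
    ("Kailash", "C", "Scholarship"),
}

# Project onto the two decomposed tables (duplicates vanish because these are sets).
faculty_course = {(f, s) for f, s, c in original}
faculty_committee = {(f, c) for f, s, c in original}

# Natural join of the two projections on the shared faculty column.
rejoined = {
    (f, s, c)
    for f, s in faculty_course
    for f2, c in faculty_committee
    if f == f2
}

print(len(faculty_course), len(faculty_committee), len(rejoined))
print(rejoined == original)
```

Six rows of three columns shrink to three plus two rows of two columns, and the join reproduces exactly the original six rows.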
Fifth Normal Form
5NF stands for Fifth Normal Form.
A relation is in fifth normal form if it is in 4NF and cannot be decomposed losslessly into any
smaller tables.
5NF is reached when the tables are broken into as many tables as possible in order to avoid redundancy.
If joining all of these tables back together gives exactly the original table, the decomposition is
lossless and the design satisfies 5NF.

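Example: decomposing the faculty, subject, Committee table from the Fourth Normal Form section.
Faculty-Committee
Faculty Committee
Kailash Placement
Kailash Scholarship
Faculty-Course
faculty subject
Kailash DBMS
Kailash JAVA
Kailash C
Combined
faculty subject Committee
Kailash DBMS Placement
Kailash JAVA Placement
Kailash C Placement
Kailash DBMS Scholarship
Kailash JAVA Scholarship
Kailash C Scholarship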
