Anomalies and Dependencies
Dr. Mabruk Ali
Semantics of the Relation Attributes
GUIDELINE 1: Informally, each tuple in a relation should represent one entity or relationship instance. (Applies to individual relations and their attributes).
Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be mixed in the same relation. Only foreign keys should be used to refer to other entities. Entity and relationship attributes should be kept apart as much as possible.
Bottom Line: Design a schema that can be explained easily relation by relation. The semantics of attributes should be easy to interpret.
Redundant Information in Tuples and Update Anomalies
Mixing attributes of multiple entities may cause problems.
Information is stored redundantly, wasting storage.
Problems with update anomalies:
◦ Insertion anomalies
◦ Deletion anomalies
◦ Modification anomalies
Data redundancy
Values stored repetitively in relations (esp. poorly designed relations).
Potential for anomalous data to be stored.
This relation associates employees with projects. Assume no nulls are allowed.

Employee  Salary  Project  Budget  Role
Brown     20      Alpha    2       Technician
Green     35      Gamma    15      Designer
Green     35      Epsilon  9       Designer
Hoskins   55      Epsilon  9       Manager
Hoskins   55      Gamma    15      Consultant
Moore     48      Gamma    15      Manager
Moore     48      Epsilon  9       Designer
Update anomalies
Each person’s salary is repeated for each project they are involved with. What does this imply when we need to increase someone’s salary?
Employee  Salary    Project  Budget  Role
Brown     20        Alpha    2       Technician
Green     35 → 37   Gamma    15      Designer
Green     35 → 37   Epsilon  9       Designer
Hoskins   55        Epsilon  9       Manager
Hoskins   55 → 50   Gamma    15      Consultant
Moore     48        Gamma    15      Manager
Moore     48        Epsilon  9       Designer

Green: both values updated: OK
Hoskins: only one value updated: ANOMALY
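The modification anomaly above can be reproduced in a few lines of Python. This is a minimal sketch, not part of the original slides: the relation is held as a list of tuples, and the helper names `salaries_by_employee` and `inconsistent_employees` are made up for illustration.

```python
# Combined relation from the slide: (employee, salary, project, budget, role).
rows = [
    ("Brown", 20, "Alpha", 2, "Technician"),
    ("Green", 35, "Gamma", 15, "Designer"),
    ("Green", 35, "Epsilon", 9, "Designer"),
    ("Hoskins", 55, "Epsilon", 9, "Manager"),
    ("Hoskins", 55, "Gamma", 15, "Consultant"),
    ("Moore", 48, "Gamma", 15, "Manager"),
    ("Moore", 48, "Epsilon", 9, "Designer"),
]


def salaries_by_employee(rows):
    """Map each employee to the set of distinct salaries stored for them."""
    result = {}
    for employee, salary, _project, _budget, _role in rows:
        result.setdefault(employee, set()).add(salary)
    return result


def inconsistent_employees(rows):
    """Employees whose salary appears with more than one value."""
    return sorted(e for e, s in salaries_by_employee(rows).items() if len(s) > 1)


# Green: every copy of the salary is updated, so the data stays consistent.
rows = [(e, 37 if e == "Green" else s, p, b, r) for e, s, p, b, r in rows]

# Hoskins: only the Gamma row is updated, the Epsilon row keeps the old value.
rows = [
    (e, 50 if (e == "Hoskins" and p == "Gamma") else s, p, b, r)
    for e, s, p, b, r in rows
]

print(inconsistent_employees(rows))  # ['Hoskins']
```

The check only works because the salary is stored once per project assignment; with one row per employee there would be nothing to get out of step.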
Delete anomalies
If a project ends (i.e., is deleted), what happens to the data for employees on that project?
Employee  Salary  Project  Budget  Role
Brown     20      Alpha    2       Technician
Green     35      Gamma    15      Designer
Green     35      Epsilon  9       Designer
Hoskins   55      Epsilon  9       Manager
Hoskins   55      Gamma    15      Consultant
Moore     48      Gamma    15      Manager
Moore     48      Epsilon  9       Designer

Delete project Alpha. What happens to (Brown, 20)? ANOMALY
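As a rough illustration of the question above, the sketch below deletes every row for project Alpha and then checks which employees no longer appear anywhere in the relation. The helper `employees` is a made-up name used only for this example.

```python
# Combined relation from the slide: (employee, salary, project, budget, role).
rows = [
    ("Brown", 20, "Alpha", 2, "Technician"),
    ("Green", 35, "Gamma", 15, "Designer"),
    ("Green", 35, "Epsilon", 9, "Designer"),
    ("Hoskins", 55, "Epsilon", 9, "Manager"),
    ("Hoskins", 55, "Gamma", 15, "Consultant"),
    ("Moore", 48, "Gamma", 15, "Manager"),
    ("Moore", 48, "Epsilon", 9, "Designer"),
]


def employees(rows):
    """Set of employee names that appear in at least one row."""
    return {row[0] for row in rows}


# Project Alpha ends: remove every row that mentions it.
remaining = [row for row in rows if row[2] != "Alpha"]

# Employees whose only rows were Alpha rows disappear, salary included.
lost = employees(rows) - employees(remaining)
print(lost)  # {'Brown'}
```

Brown worked only on Alpha, so the fact (Brown, 20) is lost as a side effect of deleting a project.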
Insert anomalies
What happens when we hire a new person? (remember, no nulls allowed)
Employee  Salary  Project  Budget  Role
Brown     20      Alpha    2       Technician
Green     35      Gamma    15      Designer
Green     35      Epsilon  9       Designer
Hoskins   55      Epsilon  9       Manager
Hoskins   55      Gamma    15      Consultant
Moore     48      Gamma    15      Manager
Moore     48      Epsilon  9       Designer
Johnson   36      ???      ???     ???

Johnson hasn’t yet been assigned to a project, but no nulls are allowed. Where do we store (Johnson, 36) until then? ANOMALY
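The "no nulls allowed" rule can be modelled with NOT NULL constraints. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table name `emp_proj` and the helper `try_insert` are assumptions made for this example, not something defined in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE emp_proj (
        employee TEXT    NOT NULL,
        salary   INTEGER NOT NULL,
        project  TEXT    NOT NULL,
        budget   INTEGER NOT NULL,
        role     TEXT    NOT NULL
    )
    """
)


def try_insert(row):
    """Return True if the row is stored, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO emp_proj VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# An employee with a project assignment can be stored.
accepted_brown = try_insert(("Brown", 20, "Alpha", 2, "Technician"))

# Johnson has no project yet, so project, budget and role would be NULL.
accepted_johnson = try_insert(("Johnson", 36, None, None, None))

print(accepted_brown, accepted_johnson)  # True False
```

With this schema there is no place to record Johnson's salary until a project assignment exists.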
The solution: Normalisation
Breaking up the relation eliminates the worst of the redundancy
Employee  Salary
Brown     20
Green     35
Hoskins   55
Moore     48

Project  Budget
Alpha    2
Gamma    15
Epsilon  9

Employee  Project  Role
Brown     Alpha    Technician
Green     Gamma    Designer
Green     Epsilon  Designer
Hoskins   Epsilon  Manager
Hoskins   Gamma    Consultant
Moore     Gamma    Manager
Moore     Epsilon  Designer
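The decomposition above can be sketched as three projections of the original relation, followed by a join that rebuilds it. This is an illustrative sketch only; the variable names (`employee`, `project`, `assignment`) are assumptions, since the slides do not name the three relations.

```python
# Original combined relation: (employee, salary, project, budget, role).
combined = [
    ("Brown", 20, "Alpha", 2, "Technician"),
    ("Green", 35, "Gamma", 15, "Designer"),
    ("Green", 35, "Epsilon", 9, "Designer"),
    ("Hoskins", 55, "Epsilon", 9, "Manager"),
    ("Hoskins", 55, "Gamma", 15, "Consultant"),
    ("Moore", 48, "Gamma", 15, "Manager"),
    ("Moore", 48, "Epsilon", 9, "Designer"),
]

# Project onto each group of attributes; sets remove the duplicate rows.
employee = sorted({(e, s) for e, s, _p, _b, _r in combined})
project = sorted({(p, b) for _e, _s, p, b, _r in combined})
assignment = sorted({(e, p, r) for e, _s, p, _b, r in combined})

# Join the three relations back together on employee and project.
salary_of = dict(employee)
budget_of = dict(project)
rejoined = [(e, salary_of[e], p, budget_of[p], r) for e, p, r in assignment]

print(len(employee), len(project), len(assignment))  # 4 3 7
print(sorted(rejoined) == sorted(combined))  # True
```

Each salary and each budget is now stored exactly once, and the join shows that no information from the original relation was lost.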
Functional Dependencies (FD)
An important concept associated with normalization. Functional dependency describes the relationship between attributes. For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B), if each value of A in R is associated with exactly one value of B in R. An alternative way to describe the relationship between attributes A and B is to say that “A functionally determines B”.
A is called the determinant; B is called the dependent.
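The definition of A → B can be turned into a simple check on sample data: no two tuples may agree on A and differ on B. The function `fd_holds` below is a hypothetical helper written for this note. Keep in mind that a check on one set of rows can only refute a functional dependency; whether it truly holds depends on the meaning of the attributes.

```python
COLUMNS = ("Employee", "Salary", "Project", "Budget", "Role")
DATA = [
    ("Brown", 20, "Alpha", 2, "Technician"),
    ("Green", 35, "Gamma", 15, "Designer"),
    ("Green", 35, "Epsilon", 9, "Designer"),
    ("Hoskins", 55, "Epsilon", 9, "Manager"),
    ("Hoskins", 55, "Gamma", 15, "Consultant"),
    ("Moore", 48, "Gamma", 15, "Manager"),
    ("Moore", 48, "Epsilon", 9, "Designer"),
]
rows = [dict(zip(COLUMNS, values)) for values in DATA]


def fd_holds(rows, lhs, rhs):
    """True if, in these rows, each value of lhs maps to exactly one value of rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it later.
        if seen.setdefault(key, value) != value:
            return False
    return True


print(fd_holds(rows, ["Employee"], ["Salary"]))           # True
print(fd_holds(rows, ["Project"], ["Budget"]))            # True
print(fd_holds(rows, ["Employee"], ["Role"]))             # False
print(fd_holds(rows, ["Employee", "Project"], ["Role"]))  # True
```

Employee alone does not determine Role because Hoskins is a Manager on one project and a Consultant on another.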
Characteristics of FDs
Determinants should have the minimal number of attributes necessary to maintain the functional dependency with the attribute(s) on the right-hand side. This requirement is called full functional dependency.
Identifying FDs
Identifying all functional dependencies between a set of attributes is relatively simple if the meaning of each attribute and the relationships between the attributes are well understood. This information should be provided by the enterprise in the form of discussions with users and/or documentation such as the users’ requirements specification.
Identifying FDs (Cont)
However, if the users are unavailable for consultation and/or the documentation is incomplete then depending on the database application it may be necessary for the database designer to use their common sense and/or experience to provide the missing information.
Examples of FD constraints (1)
◦ Social security number determines employee name: SSN → ENAME
◦ Project number determines project name and location: PNUMBER → {PNAME, PLOCATION}
◦ Employee SSN and project number determine the hours per week that the employee works on the project: {SSN, PNUMBER} → HOURS
Types of functional dependency
◦ Full
◦ Partial
◦ Transitive
Full Functional Dependency
Full functional dependency indicates that, if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A but not on any proper subset of A.
A functional dependency A → B is a partial dependency if there is some attribute that can be removed from A and yet the dependency still holds. In A → B, A is the left-hand side (LHS) and B is the right-hand side (RHS).
Example of Full FD
Example: R(Year, Course_code, Course_coordinator)
◦ {Year, Course_code} → Course_coordinator
◦ (i.e., Course_coordinator is determined by the combination of a particular Year and a Course_code)
◦ If we remove either Year or Course_code from the left-hand side (LHS, the determinant), the dependency no longer holds.
Partial functional dependency
R1(StudentId, StudentName, DateOfBirth)
R2(InvoiceNumber, InvoiceDate, InvoiceTotal)
◦ {Student ID, Student Name} → Date of Birth
◦ {Invoice Number, Invoice Date} → Invoice Total
A subset of the left-hand side determines the right-hand side
◦ “extra” attributes on the LHS are unnecessary
Now Full functional dependency
◦ Student ID → Date of Birth
◦ Invoice Number → Invoice Total
The left-hand side determines the right-hand side
◦ There are no “extra” attributes on the LHS
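The difference between a full and a partial dependency can be tested mechanically: a dependency is full if it holds and stops holding for every non-empty proper subset of the determinant. The sketch below is an illustration under stated assumptions: `fd_holds` and `is_full_fd` are hypothetical helpers, and the course and student rows are invented sample data, not data from the lecture.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """True if, in these rows, each value of lhs maps to exactly one value of rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_full_fd(rows, lhs, rhs):
    """True if lhs -> rhs holds and no non-empty proper subset of lhs determines rhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, subset, rhs):
                return False  # a smaller determinant works, so lhs -> rhs is partial
    return True


# Invented sample data for R(Year, Course_code, Course_coordinator).
courses = [
    {"Year": 2023, "Course_code": "DB101", "Course_coordinator": "Smith"},
    {"Year": 2024, "Course_code": "DB101", "Course_coordinator": "Jones"},
    {"Year": 2023, "Course_code": "SE201", "Course_coordinator": "Jones"},
    {"Year": 2024, "Course_code": "SE201", "Course_coordinator": "Smith"},
]

# Invented sample data for R1(StudentId, StudentName, DateOfBirth).
students = [
    {"StudentId": 1, "StudentName": "Ann", "DateOfBirth": "2001-05-01"},
    {"StudentId": 2, "StudentName": "Ann", "DateOfBirth": "2002-09-12"},
    {"StudentId": 3, "StudentName": "Ben", "DateOfBirth": "2001-05-01"},
]

print(is_full_fd(courses, ("Year", "Course_code"), ("Course_coordinator",)))  # True
print(is_full_fd(students, ("StudentId", "StudentName"), ("DateOfBirth",)))   # False
print(is_full_fd(students, ("StudentId",), ("DateOfBirth",)))                 # True
```

In the student data StudentId alone already determines DateOfBirth, so StudentName is an "extra" attribute on the left-hand side.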
Transitive dependency
◦ Part number determines supplier number
◦ Supplier number determines supplier name
◦ Therefore, part number alone also determines supplier name
Part number → Supplier number → Supplier name
Ideally, a transitive dependency should not exist within a single relation.
Transitive Dependency
It is important to recognize a transitive dependency because its existence in a relation can potentially cause update anomalies.
Transitive dependency describes a condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).
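The condition above can be checked on sample data as follows. This is a sketch only: `fd_holds` and `is_transitive` are hypothetical helpers, and the part and supplier rows are invented for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """True if, in these rows, each value of lhs maps to exactly one value of rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_transitive(rows, a, b, c):
    """True if a -> b and b -> c hold, while a is not determined by b or by c."""
    return (
        fd_holds(rows, a, b)
        and fd_holds(rows, b, c)
        and not fd_holds(rows, b, a)
        and not fd_holds(rows, c, a)
    )


# Invented sample data: two parts share supplier S1.
parts = [
    {"part": "P1", "supplier": "S1", "supplier_name": "Acme"},
    {"part": "P2", "supplier": "S1", "supplier_name": "Acme"},
    {"part": "P3", "supplier": "S2", "supplier_name": "Bolt"},
]

print(is_transitive(parts, ("part",), ("supplier",), ("supplier_name",)))  # True
print(fd_holds(parts, ("part",), ("supplier_name",)))                      # True
```

The supplier name "Acme" is stored once per part supplied by S1, which is exactly the kind of redundancy that leads to update anomalies.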
MVD & JD
Normal Forms will be discussed next lecture. The fourth normal form makes use of a new kind of dependency, called a multivalued dependency (MVD); MVDs are a generalization of FDs.
The fifth normal form makes use of another new kind of dependency, called a join dependency (JD); JDs are a generalization of MVDs, just as MVDs are a generalization of FDs.
The End
Lecture 05 - ER to Relation Mapping