
Database Management Systems

Chapter 3
Data Normalization

Jerry Post
Copyright © 2003
Why Normalization?

• Need standardized data definition
  - Advantages of DBMS require careful design
  - Define data correctly and the rest is much easier
  - It especially makes it easier to expand the database later
  - Method applies to most models and most DBMS
    - Similar to Entity-Relationship
    - Similar to Objects (without inheritance and methods)
• Goal: Define tables carefully
  - Save space
  - Minimize redundancy
  - Protect data

Definitions

• Relational database: A collection of tables.
• Table: A collection of columns (attributes) describing an entity. Individual objects are stored as rows of data in the table.
• Property (attribute): A characteristic or descriptor of a class or entity.
• Every table has a primary key.
  - The smallest set of columns that uniquely identifies any row
  - Primary keys can span more than one column (concatenated keys)
  - We often create a primary key to ensure uniqueness (e.g., CustomerID, Product#, . . .), called a surrogate key.

Class: Employee (EmployeeID is the primary key, the other columns are properties, and each row is an object)

EmployeeID  TaxpayerID   LastName   FirstName  HomePhone       Address
12512       888-22-5552  Cartom     Abdul      (603) 323-9893  252 South Street
15293       222-55-3737  Venetiaan  Roland     (804) 888-6667  937 Paramaribo Lane
22343       293-87-4343  Johnson    John       (703) 222-9384  234 Main Street
29387       837-36-2933  Stenheim   Susan      (410) 330-9837  8934 W. Maple

Keys

• Primary key
  - Every table (object) must have a primary key
  - Uniquely identifies a row (one-to-one)
• Concatenated (or composite) key
  - Multiple columns needed for primary key
  - Identify repeating relationships (1 : M or M : N)
• Key columns are underlined
• First step
  - Collect user documents
  - Identify possible keys: unique or repeating relationships

Notation

Table name(Table columns):

Customer(CustomerID, Phone, Name, Address, City, State, ZipCode)

Primary key is underlined (here, CustomerID).

CustomerID  Phone         LastName    FirstName  Address             City               State  Zipcode
1           502-666-7777  Johnson     Martha     125 Main Street     Alvaton            KY     42122
2           502-888-6464  Smith       Jack       873 Elm Street      Bowling Green      KY     42101
3           502-777-7575  Washington  Elroy      95 Easy Street      Smith's Grove      KY     42171
4           502-333-9494  Adams       Samuel     746 Brown Drive     Alvaton            KY     42122
5           502-474-4746  Rabitz      Victor     645 White Avenue    Bowling Green      KY     42102
6           616-373-4746  Steinmetz   Susan      15 Speedway Drive   Portland           TN     37148
7           615-888-4474  Lasater     Les        67 S. Ray Drive     Portland           TN     37148
8           615-452-1162  Jones       Charlie    867 Lakeside Drive  Castalian Springs  TN     37031
9           502-222-4351  Chavez      Juan       673 Industry Blvd.  Caneyville         KY     42721
10          502-444-2512  Rojo        Maria      88 Main Street      Cave City          KY     42127

Identifying Key Columns

Orders
OrderID  Date    Customer
8367     5-5-04  6794
8368     5-6-04  9263

Each order has only one customer. So Customer is not part of the key.

OrderItems
OrderID  Item  Quantity
8367     229   2
8367     253   4
8367     876   1
8368     555   4
8368     229   1

Each order has many items. Each item can appear on many orders. So OrderID and Item are both part of the key.

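The key choices above can be tried out directly. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS; the helper name try_insert and the three probe rows are illustrative and not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (
    OrderID  INTEGER PRIMARY KEY,  -- one row per order
    Date     TEXT,
    Customer INTEGER               -- one customer per order, so not part of the key
);
CREATE TABLE OrderItems (
    OrderID  INTEGER,
    Item     INTEGER,
    Quantity INTEGER,
    PRIMARY KEY (OrderID, Item)    -- many items per order, many orders per item
);
""")
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)",
                 [(8367, "5-5-04", 6794), (8368, "5-6-04", 9263)])
conn.executemany("INSERT INTO OrderItems VALUES (?, ?, ?)",
                 [(8367, 229, 2), (8367, 253, 4), (8367, 876, 1),
                  (8368, 555, 4), (8368, 229, 1)])


def try_insert(sql, params):
    """Return True if the row is accepted, False if a key rule rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Illustrative probe rows (not from the slides).
same_item_other_order = try_insert(
    "INSERT INTO OrderItems VALUES (?, ?, ?)", (8368, 253, 1))
duplicate_pair = try_insert(
    "INSERT INTO OrderItems VALUES (?, ?, ?)", (8367, 229, 5))
second_customer = try_insert(
    "INSERT INTO Orders VALUES (?, ?, ?)", (8367, "5-5-04", 1111))
```

Item 253 may appear on a second order, but the same (OrderID, Item) pair cannot be stored twice, and an order cannot be given a second customer row.
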
Surrogate Keys

• Real world keys sometimes cause problems in a database.
• Example: Customer
  - Avoid phone numbers: people may not notify you when numbers change.
  - Avoid SSN (privacy and most businesses are not authorized to ask for verification, so you could end up with duplicate values)
• Often best to let the DBMS generate unique values
  - Access: AutoNumber
  - SQL Server: Identity
  - Oracle: Sequences (but require additional programming)
• Drawback: Numbers are not related to any business data, so the application needs to hide them and provide other look up mechanisms.

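As a rough illustration of a DBMS-generated key, the sketch below uses SQLite (through Python's sqlite3 module) rather than the products named above. In SQLite an INTEGER PRIMARY KEY column is filled in automatically when no value is supplied. The replacement phone number is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,  -- surrogate key generated by the DBMS
    Name       TEXT,
    Phone      TEXT                  -- ordinary data, free to change
)""")

ids = []
for name, phone in [("Johnson", "502-666-7777"), ("Smith", "502-888-6464")]:
    cur = conn.execute(
        "INSERT INTO Customer (Name, Phone) VALUES (?, ?)", (name, phone))
    ids.append(cur.lastrowid)  # the value the DBMS assigned

# A phone number can change without touching the key (hypothetical new number).
conn.execute("UPDATE Customer SET Phone = ? WHERE CustomerID = ?",
             ("502-000-0000", ids[0]))
new_phone = conn.execute(
    "SELECT Phone FROM Customer WHERE CustomerID = ?", (ids[0],)).fetchone()[0]
```

Because the generated numbers carry no business meaning, an application would normally look customers up by name or phone and keep CustomerID out of sight.
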
Common Order System

[Class diagram: Customer 1 to * Order; Salesperson 1 to * Order; Order 1 to * OrderItem; Item 1 to * OrderItem]

Customer(CustomerID, Name, Address, City, Phone)
Salesperson(EmployeeID, Name, Commission, DateHired)
Order(OrderID, OrderDate, CustomerID, EmployeeID)
OrderItem(OrderID, ItemID, Quantity)
Item(ItemID, Description, ListPrice)

Client Billing Example

Client(ClientID, Name, Address, BusinessType)
Partner(PartnerID, Name, Speciality, Office, Phone)
PartnerAssignment(PartnerID, ClientID, DateAcquired)
Billing(ClientID, PartnerID, Date/Time, Item, Description, Hours, AmountBilled)

Each partner can be assigned many clients.
Each client can be assigned to many partners.

Client Billing--Different Rules

Client(ClientID, Name, Address, BusinessType)
Partner(PartnerID, Name, Speciality, Office, Phone)
PartnerAssignment(PartnerID, ClientID, DateAcquired)
Billing(ClientID, PartnerID, Date/Time, Item, Description, Hours, AmountBilled)

Each client is assigned to only one partner.
Cannot key PartnerID.
Combine Client and PartnerAssignment tables, since they have the same key.

Client Billing--New Assumptions

Billing
ClientID  PartnerID  Date/Time     Item  Description      Hours  AmountBilled
115       963        8-4-04 10:03  967   Stress analysis  2      $500
295       967        8-5-04 11:15  754   New Design       3      $750
115       963        8-8-04 09:30  967   Stress analysis  2.5    $650

More realistic assumptions for a large firm.
Each partner may work with many clients.
Each client may work with many partners.
Each partner and client may work together many times.
The identifying feature is the date/time of the service.

What happens if you do not include Date/Time as a key?

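One way to explore the closing question is to load the three sample Billing rows under both key choices. This is a sketch using Python's sqlite3 module; the function name rows_accepted and the column name DateTime (standing in for Date/Time) are assumptions made for the example.

```python
import sqlite3

# The three Billing rows from the sample data.
rows = [
    (115, 963, "8-4-04 10:03", 967, "Stress analysis", 2.0, 500),
    (295, 967, "8-5-04 11:15", 754, "New Design", 3.0, 750),
    (115, 963, "8-8-04 09:30", 967, "Stress analysis", 2.5, 650),
]


def rows_accepted(key_columns):
    """Create Billing with the given primary key and count the rows it accepts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"""
        CREATE TABLE Billing (
            ClientID INTEGER, PartnerID INTEGER, DateTime TEXT, Item INTEGER,
            Description TEXT, Hours REAL, AmountBilled REAL,
            PRIMARY KEY ({key_columns})
        )""")
    accepted = 0
    for row in rows:
        try:
            conn.execute("INSERT INTO Billing VALUES (?, ?, ?, ?, ?, ?, ?)", row)
            accepted += 1
        except sqlite3.IntegrityError:
            pass  # a second visit by the same client/partner pair is rejected
    conn.close()
    return accepted


without_time = rows_accepted("ClientID, PartnerID")
with_time = rows_accepted("ClientID, PartnerID, DateTime")
```

With only ClientID and PartnerID keyed, the second session between client 115 and partner 963 cannot be stored; adding the date/time to the key lets all three rows in.
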
Sample: Video Database

[Figure: sample form with callouts marking "Possible Keys" and "Repeating section"]

Initial Objects

• Customers
  - Key: Assign a CustomerID
  - Sample Properties
    - Name
    - Address
    - Phone
• Videos
  - Key: Assign a VideoID
  - Sample Properties
    - Title
    - RentalPrice
    - Rating
    - Description
• RentalTransaction
  - Event/Relationship
  - Key: Assign TransactionID
  - Sample Properties
    - CustomerID
    - Date
• VideosRented
  - Event/Repeating list
  - Keys: TransactionID + VideoID
  - Sample Properties
    - VideoCopy#

Initial Form Evaluation

RentalForm(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode,
    (VideoID, Copy#, Title, Rent) )

• Collect forms from users
• Write down properties
• Find repeating groups ( . . .)
• Look for potential keys: key
• Identify computed values
• Notation makes it easier to identify and solve problems
• Results equivalent to diagrams, but will fit on one or two pages

Problems with Repeating Sections

RentalForm(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode,
    (VideoID, Copy#, Title, Rent) )

Storing data in this raw form would not work very well. For example, repeating sections will cause problems.
Note the duplication of data.
Also, what if a customer has not yet checked out a movie--where do we store that customer's data?

The repeating section (VideoID, Copy#, Title, Rent) causes duplication of the transaction and customer columns:

TransID  RentDate  CustomerID  LastName    Phone         Address             VideoID  Copy#  Title                  Rent
1        4/18/04   3           Washington  502-777-7575  95 Easy Street      1        2      2001: A Space Odyssey  $1.50
1        4/18/04   3           Washington  502-777-7575  95 Easy Street      6        3      Clockwork Orange       $1.50
2        4/30/04   7           Lasater     615-888-4474  67 S. Ray Drive     8        1      Hopscotch              $1.50
2        4/30/04   7           Lasater     615-888-4474  67 S. Ray Drive     2        1      Apocalypse Now         $2.00
2        4/30/04   7           Lasater     615-888-4474  67 S. Ray Drive     6        1      Clockwork Orange       $1.50
3        4/18/04   8           Jones       615-452-1162  867 Lakeside Drive  9        1      Luggage Of The Gods    $2.50
3        4/18/04   8           Jones       615-452-1162  867 Lakeside Drive  15       1      Fabulous Baker Boys    $2.00
3        4/18/04   8           Jones       615-452-1162  867 Lakeside Drive  4        1      Boy And His Dog        $2.50
4        4/18/04   3           Washington  502-777-7575  95 Easy Street      3        1      Blues Brothers         $2.00
4        4/18/04   3           Washington  502-777-7575  95 Easy Street      8        1      Hopscotch              $1.50
4        4/18/04   3           Washington  502-777-7575  95 Easy Street      13       1      Surf Nazis Must Die    $2.50
4        4/18/04   3           Washington  502-777-7575  95 Easy Street      17       1      Witches of Eastwick    $2.00

Problems with Repeating Sections

• Store repeating data
• Allocate space
  - How much?
  - Can't be short
  - Wasted space
• e.g., How many videos will be rented at one time?
• A better definition eliminates this problem.

Customer Rentals (form layout)
Name, Phone, Address, City, State, ZipCode

    VideoID  Copy#  Title             Rent
1.  6        1      Clockwork Orange  1.50
2.  8        2      Hopscotch         1.50
3.
4.
5.  {Unused Space}

Not in First Normal Form

First Normal Form

RentalForm(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode,
    (VideoID, Copy#, Title, Rent) )

RentalForm2(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode)
RentalLine(TransID, VideoID, Copy#, Title, Rent)

• Remove repeating sections
  - Split into two tables
  - Bring key from main and repeating section
• RentalLine(TransID, VideoID, Copy#, . . .)
  - Each transaction can have many videos (key VideoID)
  - Each video can be rented on many transactions (key TransID)
  - For each TransID and VideoID, only one Copy# (no key on Copy#)

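The split into RentalForm2 and RentalLine can be sketched in a few lines of Python. The nested-dictionary layout of the raw form and the function name to_first_normal_form are assumptions for illustration; only a few columns and the first two transactions are shown.

```python
# Raw forms: one record per transaction with a nested repeating section.
rental_form = [
    {"TransID": 1, "RentDate": "4/18/04", "CustomerID": 3,
     "videos": [(1, 2, "2001: A Space Odyssey", 1.50),
                (6, 3, "Clockwork Orange", 1.50)]},
    {"TransID": 2, "RentDate": "4/30/04", "CustomerID": 7,
     "videos": [(8, 1, "Hopscotch", 1.50),
                (2, 1, "Apocalypse Now", 2.00),
                (6, 1, "Clockwork Orange", 1.50)]},
]


def to_first_normal_form(forms):
    """Split each form into a main row and one row per repeating entry."""
    main_rows, line_rows = [], []
    for form in forms:
        # Main table: everything except the repeating section.
        main_rows.append({k: v for k, v in form.items() if k != "videos"})
        # Repeating table: carry the parent key (TransID) into every row.
        for video_id, copy_no, title, rent in form["videos"]:
            line_rows.append((form["TransID"], video_id, copy_no, title, rent))
    return main_rows, line_rows


rental_form2, rental_line = to_first_normal_form(rental_form)
```

Carrying TransID into each RentalLine row is what allows the two tables to be joined back together later.
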
Nested Repeating Sections

Table(Key1, . . . (Key2, . . . (Key3, . . .) ) )
  splits into Table1(Key1, . . .) and TableA(Key1, Key2, . . . (Key3, . . .) )
TableA then splits into Table2(Key1, Key2, . . .) and Table3(Key1, Key2, Key3, . . .)

• Nested: Table(Key1, aaa . . . (Key2, bbb . . . (Key3, ccc . . .) ) )
• First Normal Form (1NF)
  - Table1(Key1, aaa . . .)
  - Table2(Key1, Key2, bbb . . .)
  - Table3(Key1, Key2, Key3, ccc . . .)

First Normal Form Problems (Data)

TransID  RentDate  CustID  Phone         LastName    FirstName  Address             City               State  ZipCode
1        4/18/04   3       502-777-7575  Washington  Elroy      95 Easy Street      Smith's Grove      KY     42171
2        4/30/04   7       615-888-4474  Lasater     Les        67 S. Ray Drive     Portland           TN     37148
3        4/18/04   8       615-452-1162  Jones       Charlie    867 Lakeside Drive  Castalian Springs  TN     37031
4        4/18/04   3       502-777-7575  Washington  Elroy      95 Easy Street      Smith's Grove      KY     42171

TransID  VideoID  Copy#  Title                  Rent
1        1        2      2001: A Space Odyssey  $1.50
1        6        3      Clockwork Orange       $1.50
2        8        1      Hopscotch              $1.50
2        2        1      Apocalypse Now         $2.00
2        6        1      Clockwork Orange       $1.50
3        9        1      Luggage Of The Gods    $2.50
3        15       1      Fabulous Baker Boys    $2.00
3        4        1      Boy And His Dog        $2.50
4        3        1      Blues Brothers         $2.00
4        8        1      Hopscotch              $1.50
4        13       1      Surf Nazis Must Die    $2.50
4        17       1      Witches of Eastwick    $2.00

• 1NF splits repeating groups
• Still have problems
  - Replication
  - Hidden dependency:
  - If a video has not been rented yet, then what is its title?

Second Normal Form Definition

RentalLine(TransID, VideoID, Copy#, Title, Rent)

Copy# depends on both TransID and VideoID.
Title and Rent depend only on VideoID.

• Each non-key column must depend on the entire key.
  - Only applies to concatenated keys
  - Some columns only depend on part of the key
  - Split those into a new table.
• Dependence (definition)
  - If given a value for the key you always know the value of the property in question, then that property is said to depend on the key.
  - If you change part of a key and the questionable property does not change, then the table is not in 2NF.

Second Normal Form Example

RentalLine(TransID, VideoID, Copy#, Title, Rent)

VideosRented(TransID, VideoID, Copy#)
Videos(VideoID, Title, Rent)

• Title depends only on VideoID
  - Each VideoID can have only one title
• Rent depends on VideoID
  - This statement is actually a business rule.
  - It might be different at different stores.
  - Some stores might charge a different rent for each video depending on the day (or time).
• Each non-key column depends on the whole key.

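The dependence test can be approximated against sample data. The sketch below is illustrative only: the function name determines is made up, and sample rows can refute a dependency but never prove one, since the real answer comes from the business rules.

```python
COLS = ("TransID", "VideoID", "Copy#", "Title", "Rent")

# A few RentalLine rows from the sample data.
rental_line = [
    (1, 1, 2, "2001: A Space Odyssey", 1.50),
    (1, 6, 3, "Clockwork Orange", 1.50),
    (2, 8, 1, "Hopscotch", 1.50),
    (2, 2, 1, "Apocalypse Now", 2.00),
    (2, 6, 1, "Clockwork Orange", 1.50),
    (4, 8, 1, "Hopscotch", 1.50),
]


def determines(rows, determinant, dependent):
    """True if every value of `determinant` maps to one value of `dependent`."""
    key_idx = [COLS.index(c) for c in determinant]
    dep_idx = COLS.index(dependent)
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in key_idx)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, row[dep_idx]) != row[dep_idx]:
            return False
    return True


title_on_video = determines(rental_line, ["VideoID"], "Title")
rent_on_video = determines(rental_line, ["VideoID"], "Rent")
copy_on_video = determines(rental_line, ["VideoID"], "Copy#")
copy_on_whole_key = determines(rental_line, ["TransID", "VideoID"], "Copy#")
```

In this sample, Title and Rent are fixed by VideoID alone (part of the key), while Copy# needs the whole key, which matches the split into Videos and VideosRented.
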
Second Normal Form Example (Data)

VideosRented(TransID, VideoID, Copy#)

TransID  VideoID  Copy#
1        1        2
1        6        3
2        2        1
2        6        1
2        8        1
3        4        1
3        9        1
3        15       1
4        3        1
4        8        1
4        13       1
4        17       1

Videos(VideoID, Title, Rent)

VideoID  Title                        Rent
1        2001: A Space Odyssey        $1.50
2        Apocalypse Now               $2.00
3        Blues Brothers               $2.00
4        Boy And His Dog              $2.50
5        Brother From Another Planet  $2.00
6        Clockwork Orange             $1.50
7        Gods Must Be Crazy           $2.00
8        Hopscotch                    $1.50

(Unchanged)
RentalForm2(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode)

Second Normal Form Problems (Data)

RentalForm2(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode)

TransID  RentDate  CustID  Phone         LastName    FirstName  Address             City               State  ZipCode
1        4/18/04   3       502-777-7575  Washington  Elroy      95 Easy Street      Smith's Grove      KY     42171
2        4/30/04   7       615-888-4474  Lasater     Les        67 S. Ray Drive     Portland           TN     37148
3        4/18/04   8       615-452-1162  Jones       Charlie    867 Lakeside Drive  Castalian Springs  TN     37031
4        4/18/04   3       502-777-7575  Washington  Elroy      95 Easy Street      Smith's Grove      KY     42171

• Even in 2NF, problems remain
  - Replication
  - Hidden dependency
  - If a customer has not rented a video yet, where do we store their personal data?
• Solution: split table.

Third Normal Form Definition

RentalForm2(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode)

RentDate and CustomerID depend on TransID.
Phone, Name, Address, City, State, and ZipCode depend only on CustomerID.

• Each non-key column must depend on nothing but the key.
  - Some columns depend on columns that are not part of the key.
  - Split those into a new table.
  - Example: Customer's name does not change for every transaction.
• Dependence (definition)
  - If given a value for the key you always know the value of the property in question, then that property is said to depend on the key.
  - If you change the key and the questionable property does not change, then the table is not in 3NF.

Third Normal Form Example

RentalForm2(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode)

Rentals(TransID, RentDate, CustomerID)
Customers(CustomerID, Phone, Name, Address, City, State, ZipCode)

• Customer attributes depend only on CustomerID
  - Split them into new table (Customer)
  - Remember to leave CustomerID in Rentals table.
  - We need to be able to reconnect tables.
• 3NF is sometimes easier to see if you identify primary objects at the start--then you would recognize that Customer was a separate object.

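The point about reconnecting tables can be checked with a small experiment. This is a sketch using Python's sqlite3 module with a reduced column list (Phone and LastName stand in for all customer attributes); the CREATE TABLE ... AS SELECT shortcut is just a convenient way to perform the split for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE RentalForm2 (
    TransID INTEGER PRIMARY KEY, RentDate TEXT,
    CustomerID INTEGER, Phone TEXT, LastName TEXT
)""")
conn.executemany("INSERT INTO RentalForm2 VALUES (?, ?, ?, ?, ?)", [
    (1, "4/18/04", 3, "502-777-7575", "Washington"),
    (2, "4/30/04", 7, "615-888-4474", "Lasater"),
    (3, "4/18/04", 8, "615-452-1162", "Jones"),
    (4, "4/18/04", 3, "502-777-7575", "Washington"),
])

# Split: customer columns move out; CustomerID stays behind in Rentals.
conn.executescript("""
CREATE TABLE Customers AS
    SELECT DISTINCT CustomerID, Phone, LastName FROM RentalForm2;
CREATE TABLE Rentals AS
    SELECT TransID, RentDate, CustomerID FROM RentalForm2;
""")

original = conn.execute(
    "SELECT * FROM RentalForm2 ORDER BY TransID").fetchall()
rejoined = conn.execute("""
    SELECT r.TransID, r.RentDate, r.CustomerID, c.Phone, c.LastName
    FROM Rentals r JOIN Customers c ON c.CustomerID = r.CustomerID
    ORDER BY r.TransID""").fetchall()
customer_count = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]
```

The join on CustomerID reproduces the original rows, while each customer's details are now stored once instead of once per transaction.
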
Third Normal Form Example Data

RentalForm2(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode)

Rentals(TransID, RentDate, CustomerID)

TransID  RentDate  CustomerID
1        4/18/04   3
2        4/30/04   7
3        4/18/04   8
4        4/18/04   3

Customers(CustomerID, Phone, Name, Address, City, State, ZipCode)

CustomerID  Phone         LastName    FirstName  Address             City               State  ZipCode
1           502-666-7777  Johnson     Martha     125 Main Street     Alvaton            KY     42122
2           502-888-6464  Smith       Jack       873 Elm Street      Bowling Green      KY     42101
3           502-777-7575  Washington  Elroy      95 Easy Street      Smith's Grove      KY     42171
4           502-333-9494  Adams       Samuel     746 Brown Drive     Alvaton            KY     42122
5           502-474-4746  Rabitz      Victor     645 White Avenue    Bowling Green      KY     42102
6           615-373-4746  Steinmetz   Susan      15 Speedway Drive   Portland           TN     37148
7           615-888-4474  Lasater     Les        67 S. Ray Drive     Portland           TN     37148
8           615-452-1162  Jones       Charlie    867 Lakeside Drive  Castalian Springs  TN     37031
9           502-222-4351  Chavez      Juan       673 Industry Blvd.  Caneyville         KY     42721
10          502-444-2512  Rojo        Maria      88 Main Street      Cave City          KY     42127

VideosRented(TransID, VideoID, Copy#)
Videos(VideoID, Title, Rent)

Third Normal Form Tables (3NF)

[Class diagram: Customers 1 to * Rentals; Rentals 1 to * VideosRented; Videos 1 to * VideosRented]

Rentals(TransID, RentDate, CustomerID)
Customers(CustomerID, Phone, Name, Address, City, State, ZipCode)
VideosRented(TransID, VideoID, Copy#)
Videos(VideoID, Title, Rent)

3NF Rules/Procedure

• Split out repeating sections
  - Be sure to include a key from the parent section in the new piece so the two parts can be recombined.
• Verify that the keys are correct
  - Is each row uniquely identified by the primary key?
  - Are one-to-many and many-to-many relationships correct?
  - Check "many" for keyed columns and "one" for non-key columns.
• Make sure that each non-key column depends on the whole key and nothing but the key.
  - No hidden dependencies.

Checking Your Work (Quality Control)

• Look for one-to-many relationships.
  - Many side should be keyed (underlined).
  - e.g., VideosRented(TransID, VideoID, . . .).
  - Check each column and ask if it should be 1 : 1 or 1 : M.
  - If you add a key, renormalize.
• Verify no repeating sections (1NF)
• Check 3NF
  - Check each column and ask:
  - Does it depend on the whole key and nothing but the key?
• Verify that the tables can be reconnected (joined) to form the original tables (draw lines).
• Each table represents one object.
• Enter sample data--look for replication.

Boyce-Codd Normal Form (BCNF)

• Hidden dependency
• Example:
  - Employee-Specialty(E#, Specialty, Manager)
  - Is in 3NF now.
• Business rules.
  - Employee may have many specialties.
  - Each specialty has many managers.
  - Each manager has only one specialty.
  - Employee has only one manager for each specialty.
• Problem is hidden relationship between manager and specialty.
  - Need separate table for manager.
  - But then we don't need to repeat specialty.
• In real life, probably accept the duplication (specialty listed in both tables).

Employee-Specialty(E#, Specialty, Manager) splits into:
Employee(E#, Manager)
Manager(Manager, Specialty)

Acceptable:
Employee(E#, Specialty, Manager)
Manager(Manager, Specialty)

Fourth Normal Form (Keys)

• Technically, if you keyed every column, any table would be in 3NF, which does not solve any problems.
• In some cases, there are hidden relationships between key properties.
• Example:
  - EmployeeTasks(EID, Specialty, ToolID)
  - In 3NF (BCNF) now.
• Business Rules
  - Each employee has many specialties.
  - Each employee has many tools.
  - Tools and specialties are unrelated

EmployeeTasks(EID, Specialty, ToolID) splits into:
EmployeeSpecialty(EID, Specialty)
EmployeeTools(EID, ToolID)

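A small sketch shows why the all-key EmployeeTasks table is wasteful when tools and specialties are unrelated. The employee number, specialty names, and tool IDs below are made-up sample values, not data from the slides.

```python
from itertools import product

# Hypothetical sample data: one employee, two specialties, three tools.
specialties = {11: ["Welding", "Painting"]}
tools = {11: [501, 502, 503]}

# Because tools and specialties are unrelated, a single all-key table
# has to hold every combination for each employee.
employee_tasks = {
    (eid, spec, tool)
    for eid in specialties
    for spec, tool in product(specialties[eid], tools[eid])
}

# 4NF split: one table per independent relationship.
employee_specialty = {(eid, spec) for eid, spec, _ in employee_tasks}
employee_tools = {(eid, tool) for eid, _, tool in employee_tasks}

# Joining the two pieces on EID gives back the original combinations.
rejoined = {
    (eid, spec, tool)
    for eid, spec in employee_specialty
    for eid2, tool in employee_tools
    if eid == eid2
}
```

Here six combined rows shrink to two specialty rows plus three tool rows, and nothing is lost because the join recreates the original set.
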
Domain-Key Normal Form (DKNF)

• DKNF is ultimate goal: table will always be in 4NF, etc.
• Drawbacks
  - No mechanical method to get to DKNF
  - No guarantee a table can be converted to DKNF
• Rules
  - Table => one topic
  - All business rules explicitly written as domain constraints and key relationships.
  - No hidden relationships.

Employee(EID, Name, Speciality)
Business rule: An employee can have many specialties.
So example is not in DKNF, since EID is not unique.

DKNF Examples

Employee(EID, Name, Speciality)
Business rule: An employee can have many specialties.
Example is not in DKNF: EID is not unique.

Employee(EID, Name, Speciality)
Business rule: An employee has one name.
Example is not DKNF: hidden relationship between EID and name.

Employee(EID, Name)
EmployeeSpecialty(EID, Speciality)

DKNF Examples

Student(SID, Name, Major, Advisor)
Advisor(FID, Name, Office, Discipline)

Business rules: A student can have many advisors, but only one for each major. Faculty can only be advisors for their discipline.

Not in DKNF: Primary key and hidden rule.

Student(SID, Name)
Advisors(SID, Major, FID)
Faculty(FID, Name, Office, Discipline)

DKNF: Foreign key (Major <--> Discipline) makes advisor rule explicit.

No Hidden Dependencies

• The simple normalization rules:
  - Remove repeating sections
  - Each non-key column must depend on the whole key and nothing but the key.
  - There must be no hidden dependencies.
• Solution: Split the table.
  - Make sure you can rejoin the two pieces to recreate the original data relationships.
  - For some hidden dependencies within keys, double-check the business assumption to be sure that it is realistic. Sometimes you are better off with a more flexible assumption.

Data Rules and Integrity

• Simple business rules
  - Limits on data ranges
    - Price > 0
    - Salary < 100,000
    - DateHired > 1/12/1995
  - Choosing from a set
    - Gender = M, F, Unknown
    - Jurisdiction = City, County, State, Federal
• Referential Integrity
  - Foreign key values in one table must exist in the master table.
  - Order(O#, Odate, C#, …)
  - C# must exist in the customer table.

Order
O#    Odate   C#
1173  1-4-97  321
1174  1-5-97  938
1185  1-8-97  337
1190  1-9-97  321
1192  1-9-97  776   <-- No data for this customer yet!

Customer
C#   Name     Phone
321  Jones    9983-
337  Sanchez  7738-
938  Carson   8738-

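Range limits and choose-from-a-set rules map naturally onto CHECK constraints. The sketch below uses SQLite through Python's sqlite3 module; the Employee table layout, the helper name accepted, and the three test rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Employee (
    EID    INTEGER PRIMARY KEY,
    Salary REAL CHECK (Salary < 100000),                 -- limit on a data range
    Gender TEXT CHECK (Gender IN ('M', 'F', 'Unknown'))  -- choose from a set
)""")


def accepted(row):
    """Return True if the DBMS stores the row, False if a rule rejects it."""
    try:
        conn.execute("INSERT INTO Employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical rows: one valid, one over the salary limit, one outside the set.
valid_row = accepted((1, 50000, "F"))
salary_too_high = accepted((2, 250000, "M"))
gender_not_in_set = accepted((3, 40000, "X"))
```

Putting the rules in the table definition means every application that writes to the table is held to them, not only the forms that remember to check.
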
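SQL Foreign Key (Oracle, SQL Server)

-- ORDER is a reserved word in SQL, so in practice the table name
-- must be quoted or renamed (for example, Orders).
CREATE TABLE Order
( OID     NUMBER(9) NOT NULL,
  Odate   DATE,
  CID     NUMBER(9),
  CONSTRAINT pk_Order PRIMARY KEY (OID),
  CONSTRAINT fk_OrderCustomer
    FOREIGN KEY (CID)
    REFERENCES Customer (CID)
    ON DELETE CASCADE   -- belongs to the foreign key clause, inside the parentheses
)
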
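The behaviour of the foreign key and the cascade can be observed with a small test. This sketch substitutes SQLite (via Python's sqlite3 module) for Oracle or SQL Server, so the column types differ, the reserved table name is quoted, and foreign key enforcement has to be switched on explicitly. The rows come from the Order and Customer sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript('''
CREATE TABLE Customer (CID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE "Order" (
    OID   INTEGER PRIMARY KEY,
    Odate TEXT,
    CID   INTEGER,
    CONSTRAINT fk_OrderCustomer
        FOREIGN KEY (CID) REFERENCES Customer (CID) ON DELETE CASCADE
);
''')
conn.executemany("INSERT INTO Customer VALUES (?, ?)",
                 [(321, "Jones"), (337, "Sanchez"), (938, "Carson")])
conn.executemany('INSERT INTO "Order" VALUES (?, ?, ?)',
                 [(1173, "1-4-97", 321), (1174, "1-5-97", 938),
                  (1185, "1-8-97", 337), (1190, "1-9-97", 321)])

# Order 1192 refers to customer 776, who is not in the Customer table.
try:
    conn.execute('INSERT INTO "Order" VALUES (?, ?, ?)', (1192, "1-9-97", 776))
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

# Deleting customer 321 cascades to that customer's orders.
conn.execute("DELETE FROM Customer WHERE CID = 321")
remaining = [row[0] for row in
             conn.execute('SELECT OID FROM "Order" ORDER BY OID')]
```

The orphan order is refused, and removing customer 321 also removes orders 1173 and 1190, which is exactly what ON DELETE CASCADE asks for.
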
Effect of Business Rules

Key business rules:
A player can play on only one team.
There is one referee per match.

Business Rules 1

There is one referee per match.
A player can play on only one team.

Match(MatchID, DatePlayed, Location, RefID)
Score(MatchID, TeamID, Score)
Referee(RefID, Phone, Address)
Team(TeamID, Name, Sponsor)
Player(PlayerID, Name, Phone, DoB, TeamID)
PlayerStats(MatchID, PlayerID, Points, Penalties)

RefID and TeamID are not keys in the Match and Player tables, because of the one-to-one rules.

Business Rules 2

There can be several referees per match.
A player can play on several teams (substitute), but only on one team per match.

Match(MatchID, DatePlayed, Location, RefID)
Score(MatchID, TeamID, Score)
Referee(RefID, Phone, Address)
Team(TeamID, Name, Sponsor)
Player(PlayerID, Name, Phone, DoB, TeamID)
PlayerStats(MatchID, PlayerID, Points, Penalties)

To handle the many-to-many relationship, we need to make RefID and TeamID keys. But if you leave them in the same tables, the tables are not in 3NF. DatePlayed does not depend on RefID. Player Name does not depend on TeamID.

Business Rules 2: Normalized

There can be several referees per match.
A player can play on several teams (substitute), but only on one team per match.

Match(MatchID, DatePlayed, Location)
RefereeMatch(MatchID, RefID)
Score(MatchID, TeamID, Score)
Referee(RefID, Phone, Address)
Team(TeamID, Name, Sponsor)
Player(PlayerID, Name, Phone, DoB)
PlayerStats(MatchID, PlayerID, TeamID, Points, Penalties)

Converting a Class Diagram to Normalized Tables

[Class diagram: Supplier 1 to * Purchase Order; Employee 1 to * Purchase Order; Purchase Order * to * Item; Manager 1 to * Employee; Item subtypes: Raw Materials, Assembled Components, Office Supplies]

One-to-Many Relationships

[Class diagram: Supplier 1 to * Purchase Order; Employee 1 to * Purchase Order]

Supplier(SID, Name, Address, City, State, Zip, Phone)
Employee(EID, Name, Salary, Address, …)
PurchaseOrder(POID, Date, SID, EID)

The many side becomes a key (underlined).
Each PO has one supplier and employee. (Do not key SID or EID)
Each supplier can receive many POs. (Key PO)
Each employee can place many POs. (Key PO)

One-to-Many Sample Data

Supplier

Purchase Order
POID   Date       SID   EID
22234  9-9-2004   5676  221
22235  9-10-2004  5676  554
22236  9-10-2004  7831  221
22237  9-11-2004  8872  335

Employee

Many-to-Many Relationships

[Class diagram: Purchase Order * to * Item, resolved as Purchase Order 1 to * POItem and Item 1 to * POItem]

PurchaseOrder(POID, Date, SID, EID)
POItem(POID, ItemID, Quantity, PricePaid)
Item(ItemID, Description, ListPrice)

Each POID can have many Items (key/underline ItemID).
Each ItemID can be on many POIDs (key POID).

Need the new intermediate table (POItem) because:
You cannot put ItemID into PurchaseOrder because Date, SID, and EID do not depend on the ItemID.
You cannot put POID into Item because Description and ListPrice do not depend on POID.

Many-to-Many Sample Data

Purchase Order
POID   Date       SID   EID
22234  9-9-2004   5676  221
22235  9-10-2004  5676  554
22236  9-10-2004  7831  221
22237  9-11-2004  8872  335

POItem

Item

N-ary Associations

[Class diagram: Employee (Name, ...), Component (CompID, Type, Name), and Product (ProductID, Type, Name) are each linked 1 to * to Assembly (EmployeeID, CompID, ProductID)]

Composition

[Class diagram: Bicycle (Size, Model Type) composed of Wheels, Crank, and Stem; stored as Bicycle (SerialNumber, ModelType, WheelID, CrankID, StemID) and Components (ComponentID, Category, Description, Weight, Cost)]

Generalization or Subtypes

[Class diagram: Item with subtypes Raw Materials, Assembled Components, Office Supplies]

Item(ItemID, Description, ListPrice)
RawMaterials(ItemID, Weight, StrengthRating)
AssembledComponents(ItemID, Width, Height, Depth)
OfficeSupplies(ItemID, BulkQuantity, Discount)

Add new tables for each subtype.
Use the same key as the generic type (ItemID)--one-to-one relationship.
Add the attributes specific to each subtype.

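The subtype tables can be sketched as follows, again using SQLite through Python's sqlite3 module. Only two of the three subtypes are shown, and the item rows are invented sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (
    ItemID INTEGER PRIMARY KEY, Description TEXT, ListPrice REAL
);
-- Each subtype reuses ItemID as its own key: a one-to-one link to Item.
CREATE TABLE RawMaterials (
    ItemID INTEGER PRIMARY KEY REFERENCES Item (ItemID),
    Weight REAL, StrengthRating REAL
);
CREATE TABLE OfficeSupplies (
    ItemID INTEGER PRIMARY KEY REFERENCES Item (ItemID),
    BulkQuantity INTEGER, Discount REAL
);
-- Hypothetical sample rows.
INSERT INTO Item VALUES (1, 'Steel rod', 12.50), (2, 'Copy paper', 4.00);
INSERT INTO RawMaterials VALUES (1, 3.2, 7.0);
INSERT INTO OfficeSupplies VALUES (2, 10, 0.15);
""")

# Generic and subtype attributes are reassembled with a join on ItemID.
raw = conn.execute("""
    SELECT i.ItemID, i.Description, i.ListPrice, r.Weight, r.StrengthRating
    FROM Item i JOIN RawMaterials r ON r.ItemID = i.ItemID""").fetchall()
```

Shared attributes live once in Item, and each subtype table holds only the columns that make sense for that kind of item.
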
Subtypes Sample Data

[Sample data tables: Item, RawMaterials, AssembledComponents, OfficeSupplies]

Recursive Relationships

[Class diagram: Employee * to 1 Employee (Manager)]

Employee(EID, Name, Salary, Address, Manager)

Add a manager column that contains Employee IDs.
An employee can have only one manager. (Manager is not a key.)
A manager can supervise many employees. (EID is a key.)

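A recursive relationship is queried by joining the table to itself. The sketch below uses SQLite through Python's sqlite3 module with a shortened column list; the employee names are invented sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (
    EID     INTEGER PRIMARY KEY,
    Name    TEXT,
    Manager INTEGER REFERENCES Employee (EID)  -- holds another employee's EID
);
-- Hypothetical sample rows; the top manager has no manager (NULL).
INSERT INTO Employee VALUES
    (1, 'Avery', NULL), (2, 'Blake', 1), (3, 'Casey', 1), (4, 'Drew', 2);
""")

# Self-join: alias e is the employee, alias m is that employee's manager.
pairs = conn.execute("""
    SELECT e.Name, m.Name
    FROM Employee e JOIN Employee m ON m.EID = e.Manager
    ORDER BY e.EID""").fetchall()
```

Each employee row carries at most one Manager value, while the same EID can appear in the Manager column of many rows.
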
Normalization Examples

• Possible topics
  - Auto repair
  - Auto sales
  - Department store
  - Hair stylist
  - HRM department
  - Law firm
  - Manufacturing
  - National Park Service
  - Personal stock portfolio
  - Pet shop
  - Restaurant
  - Social club
  - Sports team

Multiple Views & View Integration

• Collect multiple views
  - Documents
  - Reports
  - Input forms
• Create normalized tables from each view
• Combine the views into one complete model.
• Keep meta-data in a data dictionary
  - Type of data
  - Size
  - Volume
  - Usage
• Example
  - Federal Emergency Management Agency (FEMA). Disaster planning and relief.
  - Make business assumptions as necessary, but try to keep them simple.

The Pet Store: Sales Form

Sales(SaleID, Date, CustomerID, Name, Address, City, State, Zip, EmployeeID, Name,
    (AnimalID, Name, Category, Breed, DateOfBirth, Gender, Registration, Color, ListPrice, SalePrice),
    (ItemID, Description, Category, ListPrice, SalePrice, Quantity))

The Pet Store: Purchase Animals

AnimalOrder(OrderID, OrderDate, ReceiveDate, SupplierID, Name, Contact, Phone, Address, City, State, Zip,
    EmployeeID, Name, Phone, DateHired,
    (AnimalID, Name, Category, Breed, Gender, Registration, Cost), ShippingCost)

The Pet Store: Purchase Merchandise

MerchandiseOrder(PONumber, OrderDate, ReceiveDate, SupplierID, Name, Contact, Phone, Address, City, State, Zip,
    EmployeeID, Name, HomePhone,
    (ItemID, Description, Category, Price, Quantity, QuantityOnHand), ShippingCost)

Pet Store Normalization

Sale(SaleID, Date, CustomerID, EmployeeID)
SaleAnimal(SaleID, AnimalID, SalePrice)
SaleItem(SaleID, ItemID, SalePrice, Quantity)
Customer(CustomerID, Name, Address, City, State, Zip)
Employee(EmployeeID, Name)
Animal(AnimalID, Name, Category, Breed, DateOfBirth, Gender, Registration, Color, ListPrice)
Merchandise(ItemID, Description, Category, ListPrice)

AnimalOrder(OrderID, OrderDate, ReceiveDate, SupplierID, EmpID, ShipCost)
AnimalOrderItem(OrderID, AnimalID, Cost)
Supplier(SupplierID, Name, Contact, Phone, Address, City, State, Zip)
Employee(EmployeeID, Name, Phone, DateHired)
Animal(AnimalID, Name, Category, Breed, Gender, Registration, Cost)

MerchandiseOrder(PONumber, OrderDate, ReceiveDate, SID, EmpID, ShipCost)
MerchandiseOrderItem(PONumber, ItemID, Quantity, Cost)
Supplier(SupplierID, Name, Contact, Phone, Address, City, State, Zip)
Employee(EmployeeID, Name, Phone)
Merchandise(ItemID, Description, Category, QuantityOnHand)

Pet Store View Integration

Sale(SaleID, Date, CustomerID, EmployeeID)
SaleAnimal(SaleID, AnimalID, SalePrice)
SaleItem(SaleID, ItemID, SalePrice, Quantity)
Customer(CustomerID, Name, Address, City, State, Zip)
Employee(EmployeeID, Name, Phone, DateHired)
Animal(AnimalID, Name, Category, Breed, DateOfBirth, Gender, Registration, Color, ListPrice, Cost)
Merchandise(ItemID, Description, Category, ListPrice, QuantityOnHand)

AnimalOrder(OrderID, OrderDate, ReceiveDate, SupplierID, EmpID, ShipCost)
AnimalOrderItem(OrderID, AnimalID, Cost)
Supplier(SupplierID, Name, Contact, Phone, Address, City, State, Zip)
Employee(EmployeeID, Name, Phone, DateHired)
Animal(AnimalID, Name, Category, Breed, Gender, Registration, Cost)

MerchandiseOrder(PONumber, OrderDate, ReceiveDate, SID, EmpID, ShipCost)
MerchandiseOrderItem(PONumber, ItemID, Quantity, Cost)
Supplier(SupplierID, Name, Contact, Phone, Address, City, State, Zip)
Employee(EmployeeID, Name, Phone)
Merchandise(ItemID, Description, Category, QuantityOnHand)

Pet Store Class Diagram

[Class diagram of the integrated Pet Store tables: AnimalOrder, AnimalOrderItem, Animal, SaleAnimal, Breed, Category, Employee, Supplier, Customer, City, Sale, SaleItem, Merchandise, MerchandiseOrder, OrderItem]

Rolling Thunder Integration Example

Bicycle Assembly form. The main EmployeeID control is not stored directly, but the value is entered in the assembly column when the employee clicks the column.

Initial Tables for Bicycle Assembly

BicycleAssembly(
    SerialNumber, Model, Construction, FrameSize, TopTube, ChainStay, HeadTube, SeatTube,
    PaintID, PaintColor, ColorStyle, ColorList, CustomName, LetterStyle, EmpFrame, EmpPaint,
    BuildDate, ShipDate,
    (Tube, TubeType, TubeMaterial, TubeDescription),
    (CompCategory, ComponentID, SubstID, ProdNumber, EmpInstall, DateInstall, Quantity, QOH) )

Bicycle(SerialNumber, Model, Construction, FrameSize, TopTube, ChainStay, HeadTube, SeatTube,
    PaintID, ColorStyle, CustomName, LetterStyle, EmpFrame, EmpPaint, BuildDate, ShipDate)
Paint(PaintID, ColorList)
BikeTubes(SerialNumber, TubeID, Quantity)
TubeMaterial(TubeID, Type, Material, Description)
BikeParts(SerialNumber, ComponentID, SubstID, Quantity, DateInstalled, EmpInstalled)
Component(ComponentID, ProdNumber, Category, QOH)

Rolling Thunder: Purchase Order

RT Purchase Order: Initial Tables

PurchaseOrder(PurchaseID, PODate, EmployeeID, FirstName, LastName, ManufacturerID,
    MfgName, Address, Phone, CityID, CurrentBalance, ShipReceiveDate,
    (ComponentID, Category, ManufacturerID, ProductNumber, Description, PricePaid, Quantity,
    ReceiveQuantity, ExtendedValue, QOH, ExtendedReceived), ShippingCost, Discount)

PurchaseOrder(PurchaseID, PODate, EmployeeID, ManufacturerID, ShipReceiveDate, ShippingCost, Discount)
Employee(EmployeeID, FirstName, LastName)
Manufacturer(ManufacturerID, Name, Address, Phone, CityID, CurrentBalance)
City(CityID, Name, ZipCode)
PurchaseItem(PurchaseID, ComponentID, Quantity, PricePaid, ReceivedQuantity)
Component(ComponentID, Category, ManufacturerID, ProductNumber, Description, QOH)

Rolling Thunder: Transactions

RT Transactions: Initial Tables

ManufacturerTransactions(ManufacturerID, Name, Phone, Contact, BalanceDue,
    (TransDate, Employee, Amount, Description) )

Manufacturer(ManufacturerID, Name, Phone, Contact, BalanceDue)
ManufacturerTransaction(ManufacturerID, TransactionDate, EmployeeID, Amount, Description)

Rolling Thunder: Components

RT Components: Initial Tables

ComponentForm(ComponentID, Product, BikeType, Category, Length, Height,
    Width, Weight, ListPrice, Description, QOH, ManufacturerID, Name,
    Phone, Contact, Address, ZipCode, CityID, City, State, AreaCode)

Component(ComponentID, ProductNumber, BikeType, Category, Length, Height,
    Width, Weight, ListPrice, Description, QOH, ManufacturerID)
Manufacturer(ManufacturerID, Name, Phone, Contact, Address, ZipCode, CityID)
City(CityID, City, State, ZipCode, AreaCode)

RT: Integrating Tables
Duplicate Manufacturer tables:

PO    Mfr(ManufacturerID, Name, Address, Phone, CityID, CurrentBalance)
Mfg   Mfr(ManufacturerID, Name, Phone, Contact, BalanceDue)
Comp  Mfr(ManufacturerID, Name, Phone, Contact, Address, ZipCode, CityID)

Note that each form can lead to duplicate tables.
Look for tables with the same keys, but do not expect them to be named exactly alike.
Find all of the data and combine it into one table.

Manufacturer(ManufacturerID, Name, Contact, Address, Phone,
  CityID, ZipCode, CurrentBalance)

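Combining tables that share a key can be sketched as a union of their column lists. The helper name integrate is hypothetical, and treating BalanceDue and CurrentBalance as the same data is an assumption based on the combined Manufacturer table, which keeps only CurrentBalance.

```python
def integrate(tables, synonyms=None):
    """Combine column lists of tables that share the same key column.

    tables: list of column lists; the first column of each is the key.
    synonyms: optional map from an alternate column name to the name to keep.
    Columns are returned in order of first appearance.
    """
    synonyms = synonyms or {}
    merged = []
    for columns in tables:
        if columns[0] != tables[0][0]:
            raise ValueError("tables do not share the same key")
        for col in columns:
            name = synonyms.get(col, col)
            if name not in merged:
                merged.append(name)
    return merged


po_mfr = ["ManufacturerID", "Name", "Address", "Phone", "CityID", "CurrentBalance"]
mfg_mfr = ["ManufacturerID", "Name", "Phone", "Contact", "BalanceDue"]
comp_mfr = ["ManufacturerID", "Name", "Phone", "Contact", "Address", "ZipCode", "CityID"]

# Assumption: BalanceDue and CurrentBalance hold the same data.
manufacturer = integrate(
    [po_mfr, mfg_mfr, comp_mfr], synonyms={"BalanceDue": "CurrentBalance"}
)
```

Deciding which differently named columns are really the same still requires checking the forms with users; the synonym map only records that decision.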
RT Example: Integrated Tables
Bicycle(SerialNumber, Model, Construction, FrameSize, TopTube, ChainStay, HeadTube,
  SeatTube, PaintID, ColorStyle, CustomName, LetterStyle, EmpFrame,
  EmpPaint, BuildDate, ShipDate)
Paint(PaintID, ColorList)
BikeTubes(SerialNumber, TubeID, Quantity)
TubeMaterial(TubeID, Type, Material, Description)
BikeParts(SerialNumber, ComponentID, SubstID, Quantity, DateInstalled, EmpInstalled)
Component(ComponentID, ProductNumber, BikeType, Category, Length, Height, Width,
  Weight, ListPrice, Description, QOH, ManufacturerID)
PurchaseOrder(PurchaseID, PODate, EmployeeID, ManufacturerID,
  ShipReceiveDate, ShippingCost, Discount)
PurchaseItem(PurchaseID, ComponentID, Quantity, PricePaid, ReceivedQuantity)
Employee(EmployeeID, FirstName, LastName)
Manufacturer(ManufacturerID, Name, Contact, Address, Phone,
  CityID, ZipCode, CurrentBalance)
ManufacturerTransaction(ManufacturerID, TransactionDate, EmployeeID, Amount,
  Description, Reference)
City(CityID, City, State, ZipCode, AreaCode)

Rolling Thunder Tables
[Figure: relationship diagram of the Rolling Thunder tables, including Customer, Bicycle, BicycleTube, BikeTubes, TubeMaterial, ModelType, ModelSize, Paint, LetterStyle, BikeParts, Component, ComponentName, Groupo, GroupCompon, CustomerTrans, Employee, RetailStore, PurchaseOrder, PurchaseItem, Manufacturer, ManufacturerTrans, City, and StateTaxRate.]

View Integration (FEMA Example 1)
Team Roster
Header fields: Team#, Date Formed, Leader, Home Base, Response time (days), Name, Address, C,S,Z, Fax, Phone, Home phone
Team Members/Crew (repeating): ID, Name, Home phone, Specialty, DoB, SSN, Salary
Total Salary

 This first form is kept for each team that can be called on to help in emergencies.

View Integration (FEMA Example 2)
On-Site Problem Report
Header fields: Disaster Name, HQ Location, Local Agency, Commander, Political Contact
Problem fields: Date Reported, Assigned Problem#, Severity, Problem Description
Reported By: Specialty, Specialty Rating
Verified By: Specialty, Specialty Rating
SubProblem Details (repeating): Sub Prob#, Category, Description, Action, Est. Cost
Total Est. Cost

 Major problems are reported to HQ to be prioritized and scheduled for correction.

View Integration (FEMA Example 3)
Location Damage Analysis
Header fields: LocationID, Address, Latitude, Longitude, Team Leader, Title, Cellular Phone, Date Evaluated, Repair Priority, Damage Description
Rooms (repeating): Room, Damage Descrip., Damage%
Items in each room (repeating): Item, Value, $Loss
Item Loss Total
Estimated Damage Total

 On-site teams examine buildings and file a report on damage at that location.

View Integration (FEMA Example 3a)
 Location Analysis(LocationID, MapLatitude, MapLongitude, Date,
  Address, Damage, PriorityRepair, Leader, LeaderPhone, LeaderTitle,
  (Room, Description, PercentDamage, (Item, Value, Loss)))

View Integration (FEMA Example 4)
Task Completion Report
Header fields: Date, Disaster Name, Disaster Rating, HQ Phone
Problems (repeating): Problem#, Supervisor, Date
SubProblems for each problem (repeating): SubProblem, Team#, Team Specialty, CompletionStatus, Comment, Expenses
Total Expenses

 Teams file task completion reports. If a task is not completed, the percentage accomplished is reported as the completion status.

View Integration (FEMA Example 4a)
 TasksCompleted(Date, DisasterName, DisasterRating, HQPhone,
  (Problem#, Supervisor, (SubProblem, Team#, CompletionStatus,
  Comments, Expenses)))

DBMS Table Definition
 Enter Tables
   Columns
   Keys
   Data Types
     Text
     Memo
     Number
       Byte
       Integer, Long
       Single, Double
     Date/Time
     Currency
     AutoNumber (Long)
     Yes/No
     OLE Object
   Descriptions
 Column Properties
   Format
   Input Mask
   Caption
   Default
   Validation Rule
   Validation Text
   Required & Zero Length
   Indexed
 Relationships
   One-to-One
   One-to-Many
   Referential Integrity
   Cascade Update/Delete
   Define before entering data

Table Definition in Access
[Figure: Access table design screen. Callouts: Key; Numeric Subtypes or text length]

Graphical Table Definition in Oracle
[Figure: Oracle table design screen. Callout: Key]
Minimal ability to change the table definition later.

Graphical Table Definition in SQL Server
[Figure: SQL Server table design screen. Callout: Keys]

SQL Table Definition
CREATE TABLE Animal
(
  AnimalID INTEGER,
  Name VARCHAR2(50),
  Category VARCHAR2(50),
  Breed VARCHAR2(50),
  DateBorn DATE,
  -- Gender is limited to three values or left empty.
  Gender VARCHAR2(50) CHECK (Gender='Male' Or Gender='Female'
    Or Gender='Unknown' Or Gender Is Null),
  Registered VARCHAR2(50),
  Color VARCHAR2(50),
  ListPrice NUMBER(38,4) DEFAULT 0,
  Photo LONG RAW,
  ImageFile VARCHAR2(250),
  ImageHeight INTEGER,
  ImageWidth INTEGER,
  CONSTRAINT pk_Animal PRIMARY KEY (AnimalID),
  -- Deleting a Breed or Category row also deletes the matching Animal rows.
  CONSTRAINT fk_BreedAnimal FOREIGN KEY (Category,Breed)
    REFERENCES Breed(Category,Breed)
    ON DELETE CASCADE,
  CONSTRAINT fk_CategoryAnimal FOREIGN KEY (Category)
    REFERENCES Category(Category)
    ON DELETE CASCADE
);

Oracle Databases
 For Oracle and SQL Server, it is best to create a text file that contains all of the SQL statements to create the tables.
   It is usually easier to modify the text table definition.
   The text file can be used to recreate the tables for backup or transfer to another system.
 To make major modifications to the tables, you usually create a new table, then copy the data from the old table, then delete the old table and rename the new one. It is much easier to create the new table using the text file definition.
 Be sure to specify Primary Key and Foreign Key constraints.
 Be sure to create tables in the correct order: any table that appears in a Foreign Key constraint must first be created. For example, create Customer before creating Order.
 In Oracle, to substantially improve performance, issue the following command once all tables have been created:
   Analyze table Animal compute statistics;

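The rule about creation order amounts to a topological sort of the foreign key references. The sketch below uses Python's standard graphlib module (Python 3.9 or later). The Animal references come from the CREATE TABLE example; the reference from Breed to Category is an assumption made for illustration.

```python
from graphlib import TopologicalSorter

# Each table maps to the set of tables named in its foreign key constraints.
# Assumption: Breed has a foreign key to Category.
references = {
    "Category": set(),
    "Breed": {"Category"},
    "Animal": {"Breed", "Category"},
}

# static_order() yields referenced tables before the tables that use them.
creation_order = list(TopologicalSorter(references).static_order())
print(creation_order)
```

Running the CREATE TABLE statements in this order means every referenced table exists before a constraint mentions it. Dropping tables would use the reverse order.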
Data Volume
 Estimate the total size of the database.
   Current.
   Future growth.
   Guide for hardware and software purchases.
 For each table.
   Use data types to estimate the number of bytes used for each row.
   Multiply by the estimated number of rows.
 Add the value for each table to get the total size.
 For concatenated keys (and similar tables).
   OrderItems(O#, Item#, Qty)
   Hard to “know” the total number of items ordered.
     Start with the total number of orders.
     Multiply by the average number of items on a typical order.
 Need to know time frame or how long to keep data.
   Do we store all customer data forever?
   Do we keep all orders in the active database, or do we migrate older ones?

Data Volume Example
Customer(C#, Name, Address, City, State, Zip)
  Row: 4 + 15 + 25 + 20 + 2 + 10 = 76
Order(O#, C#, Odate)
  Row: 4 + 4 + 8 = 16
OrderItem(O#, P#, Quantity, SalePrice)
  Row: 4 + 4 + 4 + 8 = 20

Orders in 3 yrs = 1000 Customers * (10 Orders / Customer) * 3 yrs = 30,000
OrderLines = 30,000 Orders * (5 Lines / Order) = 150,000

 Business rules
   Three year retention.
   1000 customers.
   Average 10 orders per customer per year.
   Average 5 items per order.

 Customer   76 * 1000      76,000
 Order      16 * 30,000    480,000
 OrderItem  20 * 150,000   3,000,000
 Total                     3,556,000

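The estimate can be reproduced with a few lines of Python. The byte counts and business rules are the ones on the slide; the variable names are chosen here for illustration.

```python
# Estimated bytes per row, summed from the column sizes on the slide.
row_bytes = {
    "Customer": 4 + 15 + 25 + 20 + 2 + 10,
    "Order": 4 + 4 + 8,
    "OrderItem": 4 + 4 + 4 + 8,
}

# Business rules.
customers = 1000
orders_per_customer_per_year = 10
years_retained = 3
items_per_order = 5

orders = customers * orders_per_customer_per_year * years_retained
row_counts = {
    "Customer": customers,
    "Order": orders,
    "OrderItem": orders * items_per_order,
}

# Bytes per table, then the total for the database.
table_bytes = {name: row_bytes[name] * row_counts[name] for name in row_bytes}
total_bytes = sum(table_bytes.values())
print(table_bytes, total_bytes)
```

Changing the retention period or the average order size shows how quickly the OrderItem table comes to dominate the total.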
Appendix: Formal Definitions: Terms
Formal term, definition, and informal equivalent:
Relation: A set of attributes with data that changes over time. Often denoted R. Informal: table.
Attribute: Characteristic with a real-world domain. Subsets of attributes are multiple columns, often denoted X or Y. Informal: column.
Tuple: The data values returned for specific attribute sets are often denoted as t[X]. Informal: row of data.
Schema: Collection of tables and constraints/relationships.
Functional dependency: X → Y. Informal: business rule dependency.

Appendix: Functional Dependency
Derives from a real-world relationship/constraint.

Denoted X → Y for sets of attributes X and Y.

Holds when any rows of data that have identical values for X attributes also have identical values for their Y attributes:

  If t1[X] = t2[X], then t1[Y] = t2[Y]

X is also known as a determinant if the dependency is non-trivial (Y is not a subset of X).

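The definition translates directly into a check over stored rows. This is a minimal sketch: the function name fd_holds is hypothetical, and the sample rows are borrowed from the Order example in the third normal form slide.

```python
def fd_holds(rows, x, y):
    """Return True if X -> Y holds in rows (a list of dicts).

    x and y are lists of column names. The FD fails as soon as two rows
    agree on every X column but differ on some Y column.
    """
    seen = {}
    for row in rows:
        x_val = tuple(row[c] for c in x)
        y_val = tuple(row[c] for c in y)
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


rows = [
    {"OrderID": 32, "OrderDate": "5/5/2004", "CustomerID": 1, "Name": "Jones"},
    {"OrderID": 33, "OrderDate": "5/5/2004", "CustomerID": 2, "Name": "Hong"},
    {"OrderID": 34, "OrderDate": "5/6/2004", "CustomerID": 1, "Name": "Jones"},
]
customer_determines_name = fd_holds(rows, ["CustomerID"], ["Name"])
date_determines_customer = fd_holds(rows, ["OrderDate"], ["CustomerID"])
```

Sample data can only disprove a dependency. Because an FD derives from a business rule, a check that passes on current rows does not show the rule will hold for future data.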
Appendix: Keys
Keys are attributes that are ultimately used to identify rows of data.

A key K (sometimes called candidate key) is a set of attributes
  (1) With FD K → U where U is all other attributes in the relation
  (2) If K’ is a proper subset of K, then there is no FD K’ → U

A set of key attributes functionally determines all other attributes in the relation, and it is the smallest set of attributes that will do so (there is no smaller subset of K that determines the columns).

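Both conditions can be tested with an attribute closure: a set is a key when its closure covers the relation and no smaller key is contained in it. The sketch below is a brute-force illustration suited to small relations. The function names are hypothetical, and the two FDs are assumptions taken from the OrderProduct example in the second normal form slide.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs.

    fds is a list of (left, right) pairs of attribute sets.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Return the minimal attribute sets whose closure is all_attrs."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(all_attrs):
                keys.append(candidate)
    return keys


attrs = {"OrderID", "ProductID", "Quantity", "Description"}
# Assumed business rules for the example relation.
fds = [
    ({"OrderID", "ProductID"}, {"Quantity"}),
    ({"ProductID"}, {"Description"}),
]
keys = candidate_keys(attrs, fds)
```

Candidates are tried from smallest to largest, so any set that already contains a found key is skipped as non-minimal.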
Appendix: First Normal Form
A relation is in first normal form (1NF) if and only if all attributes are atomic.

Atomic attributes are single valued, and cannot be composite, multi-valued or nested relations.

Example:
Customer(CID, Name: First + Last, Phones, Address)

CID  Name: First + Last  Phones                        Address
111  Joe Jones           111-2223, 111-3393, 112-4582  123 Main

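One way to bring the example row into 1NF is to split the composite name and move the phone list into its own rows. The target column names (FirstName, LastName, Phone) and the idea of a separate phone table are assumptions made for this sketch.

```python
customer = {
    "CID": 111,
    "Name": "Joe Jones",
    "Phones": ["111-2223", "111-3393", "112-4582"],
    "Address": "123 Main",
}

# Composite attribute: split Name into atomic parts.
first_name, last_name = customer["Name"].split(" ", 1)
customer_row = {
    "CID": customer["CID"],
    "FirstName": first_name,
    "LastName": last_name,
    "Address": customer["Address"],
}

# Multi-valued attribute: one row per phone, keyed by (CID, Phone).
phone_rows = [{"CID": customer["CID"], "Phone": p} for p in customer["Phones"]]
```

Every value in customer_row and phone_rows is now a single atomic value.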
Appendix: Second Normal Form
A relation is in second normal form (2NF) if it is in 1NF and each non-key attribute is fully functionally dependent on the primary key.

K → Ai for each non-key attribute Ai

That is, there is no proper subset K’ of K such that K’ → Ai

Example:
OrderProduct(OrderID, ProductID, Quantity, Description)

OrderID  ProductID  Quantity  Description
32       15         1         Blue Hose
32       16         2         Pliers
33       15         1         Blue Hose

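In the example, Description depends on ProductID alone, which is only part of the key. A sketch of the usual fix is to project the rows into two tables. The helper name project and the table name Product are hypothetical.

```python
rows = [
    {"OrderID": 32, "ProductID": 15, "Quantity": 1, "Description": "Blue Hose"},
    {"OrderID": 32, "ProductID": 16, "Quantity": 2, "Description": "Pliers"},
    {"OrderID": 33, "ProductID": 15, "Quantity": 1, "Description": "Blue Hose"},
]


def project(rows, columns):
    """Return the distinct rows restricted to the given columns."""
    seen = []
    for row in rows:
        reduced = {c: row[c] for c in columns}
        if reduced not in seen:
            seen.append(reduced)
    return seen


# Quantity depends on the whole key and stays with OrderProduct.
order_product = project(rows, ["OrderID", "ProductID", "Quantity"])
# Description depends on ProductID alone, so it moves to its own table.
product = project(rows, ["ProductID", "Description"])
```

After the split, "Blue Hose" is stored once instead of once per order line.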
Appendix: Transitive Dependency
Given functional dependencies X → Y and Y → Z, the transitive dependency X → Z must also hold.

Example:
There is an FD between OrderID and CustomerID. Given the OrderID key attribute, you always know the CustomerID.
There is an FD between CustomerID and the other customer data, because CustomerID is the primary key. Given the CustomerID, you always know the corresponding attributes for Name, Phone, and so on.
Consequently, given the OrderID (X), you always know the corresponding customer data by transitivity.

Appendix: Third Normal Form
A relation is in third normal form if and only if it is in 2NF and no non-key attributes are transitively dependent on the primary key.

That is, K → Ai for each attribute (2NF), and
there is no subset of attributes X such that K → X → Ai

Example:
Order(OrderID, OrderDate, CustomerID, Name, Phone)

OrderID  OrderDate  CustomerID  Name   Phone
32       5/5/2004   1           Jones  222-3333
33       5/5/2004   2           Hong   444-8888
34       5/6/2004   1           Jones  222-3333

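Name and Phone depend on CustomerID, which in turn depends on OrderID. A minimal sketch of the decomposition, using the rows from the example, moves the customer attributes into a separate table and checks that a join restores the original rows. The variable names are chosen here for illustration.

```python
orders = [
    (32, "5/5/2004", 1, "Jones", "222-3333"),
    (33, "5/5/2004", 2, "Hong", "444-8888"),
    (34, "5/6/2004", 1, "Jones", "222-3333"),
]

# Order keeps only attributes that depend directly on OrderID.
order_table = [(oid, date, cid) for oid, date, cid, _, _ in orders]
# Customer holds the attributes that depend on CustomerID, stored once each.
customer_table = {cid: (name, phone) for _, _, cid, name, phone in orders}

# Joining on CustomerID rebuilds the original rows, so nothing was lost.
rejoined = [(oid, date, cid, *customer_table[cid]) for oid, date, cid in order_table]
```

Customer 1 appears on two orders, but the name and phone are now stored in one place.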
Appendix: Boyce-Codd Normal Form
A relation is in Boyce-Codd Normal Form (BCNF) if and only if it is in 3NF and every determinant is a candidate key (or K is a superkey).

That is, K → Ai for every attribute, and there is no subset X (key or nonkey) such that X → Ai where X is different from K.

Example: Employees can have many specialties, and many employees can be within a specialty. Employees can have many managers, but a manager can have only one specialty: Mgr → Specialty
EmpSpecMgr(EID, Specialty, ManagerID)

EID  Specialty  ManagerID
32   Drill      1
33   Weld       2
34   Drill      1

In the FD ManagerID → Specialty, the determinant ManagerID is not currently a key.

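The BCNF condition can be checked by testing whether the left side of each FD is a superkey, using an attribute closure. The function names are hypothetical, and the FD from (EID, Specialty) to ManagerID is an assumption about the key of the example relation.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose left side is not a superkey."""
    return [
        (left, right)
        for left, right in fds
        if not right <= left and closure(left, fds) != set(all_attrs)
    ]


attrs = {"EID", "Specialty", "ManagerID"}
fds = [
    ({"EID", "Specialty"}, {"ManagerID"}),  # assumed key of EmpSpecMgr
    ({"ManagerID"}, {"Specialty"}),         # a manager has one specialty
]
violations = bcnf_violations(attrs, fds)
```

Only ManagerID → Specialty is reported, because ManagerID alone does not determine EID.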
Appendix: Multi-Valued Dependency
A multi-valued dependency (MVD) exists when there are at least three attributes in a relation (A, B, and C; and they could be sets), and one attribute (A) determines the other two (B and C) but the other two are independent of each other.

That is, A →→ B and A →→ C but B and C have no FDs

Example:
Employees have many specialties and many tools, but tools and specialties are not directly related.

Appendix: Fourth Normal Form
A relation is in fourth normal form (4NF) if and only if it is in BCNF and there are no multi-valued dependencies.

That is, if A →→ B, then all attributes of R are also functionally dependent on A:
A → Ai for each attribute.

Example:
EmpSpecTools(EID, Specialty, ToolID)
is split into
EmpSpec(EID, Specialty)
EmpTools(EID, ToolID)

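The split can be checked on a small data set: when specialties and tools are independent, the combined table stores every pairing, and joining the two smaller tables on EID gives back the same rows. The data values below are hypothetical.

```python
# Hypothetical data: employee 32 has two specialties and two tools.
emp_spec_tools = {
    (32, "Drill", "T1"),
    (32, "Drill", "T2"),
    (32, "Weld", "T1"),
    (32, "Weld", "T2"),
}

# Project onto the two independent facts about each employee.
emp_spec = {(eid, spec) for eid, spec, _ in emp_spec_tools}
emp_tools = {(eid, tool) for eid, _, tool in emp_spec_tools}

# Natural join on EID rebuilds the original rows.
rejoined = {
    (eid, spec, tool)
    for eid, spec in emp_spec
    for eid2, tool in emp_tools
    if eid == eid2
}
```

Four combined rows shrink to two rows in each table, and adding a third tool would need one new row instead of two.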
