
Database Design: Logical Models: Normalization and The Relational Model

University of California, Berkeley School of Information IS 257: Database Management

IS 257 Fall 2006

2006.09.14 - SLIDE 1

Announcements
I will be away next week. Instead, we will have an informal workshop on choosing and designing your personal databases.


Lecture Outline
Review
Conceptual Model and UML

Logical Model for the Diveshop database
Normalization
Relational Advantages and Disadvantages

Lecture Outline
Review
Logical Design for the Diveshop database
Normalization
Relational Advantages and Disadvantages


DiveShop ER Diagram
[Figure: ER diagram of the DiveShop database. Entities: DiveCust, DiveOrds, DiveItem, DiveStok, ShipVia, Dest, Sites, BioSite, BioLife, ShipWrck. Linking attributes: Customer No, Order No, Item No, ShipVia, Destination, Destination Name, Destination no, Site No, Species No. Relationship ends are labeled 1, n, or 1/n.]

Lecture Outline
Review
Conceptual Model and UML

Logical Model for the Diveshop database
Normalization
Relational Advantages and Disadvantages

Database Design Process


[Figure: Applications 1 through 4 each contribute conceptual requirements and each have their own External Model. The conceptual requirements feed the Conceptual Model, which is mapped to the Logical Model and then to the Internal Model.]

Logical Model: Mapping to a Relational Model


Each entity in the ER Diagram becomes a relation.
A properly normalized (next time) ER diagram will indicate where intersection relations for many-to-many mappings are needed.
Relationships are indicated by common columns (or domains) in tables that are related.
We will examine the tables for the Diveshop derived from the ER diagram.
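The mapping rules above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names are simplified assumptions loosely modeled on the BIOLIFE, SITE and BIOSITE relations, and the key values and species-to-site pairings are made up.

```python
import sqlite3

# Hypothetical, simplified schema: names and key values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE biolife (species_no INTEGER PRIMARY KEY, common_name TEXT);
    CREATE TABLE site    (site_no    INTEGER PRIMARY KEY, site_name   TEXT);
    -- Intersection relation for the many-to-many mapping of species to sites.
    CREATE TABLE biosite (
        species_no INTEGER REFERENCES biolife(species_no),
        site_no    INTEGER REFERENCES site(site_no),
        PRIMARY KEY (species_no, site_no)
    );
""")
conn.executemany("INSERT INTO biolife VALUES (?, ?)",
                 [(1, "Clown Triggerfish"), (2, "Bat Ray")])
conn.executemany("INSERT INTO site VALUES (?, ?)",
                 [(10, "Palancar Reef"), (20, "Point Lobos")])
conn.executemany("INSERT INTO biosite VALUES (?, ?)",
                 [(1, 10), (2, 10), (2, 20)])


def species_at(site_name):
    """Follow the common columns from SITE through BIOSITE to BIOLIFE."""
    rows = conn.execute(
        """SELECT b.common_name
           FROM site s
           JOIN biosite bs ON bs.site_no = s.site_no
           JOIN biolife b  ON b.species_no = bs.species_no
           WHERE s.site_name = ?
           ORDER BY b.common_name""",
        (site_name,),
    ).fetchall()
    return [name for (name,) in rows]


print(species_at("Palancar Reef"))
```

The relationship lives entirely in the shared species_no and site_no columns; no pointers or pre-specified links are stored.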


DiveShop ER Diagram
[Figure: the DiveShop ER diagram shown earlier, repeated as a reference for the tables that follow.]

Customer = DIVECUST
Columns: Customer No, Name, Street, City, State/Prov, Zip/Postal Code, Country, Phone, First Contact

Name               | City          | State/Prov | Country
Louis Jazdzewski   | New Orleans   | LA         | U.S.A.
Barbara Wright     | San Francisco | CA         | U.S.A.
Stephen Bredenburg | Indianapolis  | IN         | U.S.A.
Phillip Davoust    | Berkeley      | CA         | U.S.A.
David Burgett      | Seattle       | WA         | U.S.A.
Mary Rioux         | Pueblo        | CO         | U.S.A.
Kim Lopez          | Honolulu      | HI         | U.S.A.
Hiram Marley       | San Francisco | CA         | U.S.A.
Tanya Kulesa       | New York      | NY         | U.S.A.
Charles Sekaron    | Miller        | SD         | U.S.A.
Lowell Lutz        | Dallas        | TX         | U.S.A.
Keith Lucas        | Chicago       | IL         | U.S.A.
Karen Ng           | Klamath Falls | OR         | U.S.A.
Ken Soule          | Aurora        | CO         | U.S.A.

[Customer numbers, street addresses, postal codes, phone numbers and dates are not legible in the source.]

Dive Order = DIVEORDS


Columns: Order No, Customer No, Sale Date, Ship Via, PaymentMethod, CcNumber, CcExpDate, No Of People, Depart Date, Return Date, Destination, VacationCost

Ship Via | PaymentMethod | Destination
UPS      | Visa          | Fiji
FedEx    | Check         | Santa Barbara
Walk In  | Visa          | Cozumel
FedEx    | Check         | Monterey
FedEx    | AmEx          | Fiji
Walk In  | Cash          | Santa Barbara
Emery    | Master Card   | New Jersey
Emery    | AmEx          | New Jersey
FedEx    | Money Order   | Monterey
DHL      | Master Card   | Florida
Walk In  | Cash          | Cozumel
FedEx    | Check         | Florida
FedEx    | Money Order   | Santa Barbara
DHL      | Discover      | Fiji
FedEx    | Cash          | Great Barrier Reef

[Order and customer numbers, dates, card numbers, party sizes and costs are not legible in the source.]

Line item = DIVEITEM


Columns: Order No, Item No, Rental/Sale, Qty, Line Note

Rental/Sale values, in row order: Rental (10 rows), Sale, Rental (3 rows), Sale

Line Note values that appear in the table:
This is our most popular mask.
These are our best selling fins.
A good weight belt for beginners
Holds [number illegible] cubic feet of cargo.

[Order numbers, item numbers and quantities are not legible in the source.]

Shipping information = SHIPVIA

Columns: Ship Via, Ship Cost

Ship Via values: DHL, Emery, FedEx, UPS, US Mail

[Ship Cost values are not legible in the source.]

Dive Equipment Stock = DIVESTOK

Columns: Item No, Description, Equipment Class, On Hand, Reorder Point, Cost, Sale Price, Rental Price

Description                | Equipment Class
Shotgun Snorkel - Clear    | Snorkel
Shotgun Snorkel - Red      | Snorkel
Shotgun Snorkel - Teal     | Snorkel
Tri-Vent Mask - Clear      | Mask
Tri-Vent Mask - Red        | Mask
Tri-Vent Mask - Teal       | Mask
Quad Vision Mask - Clear   | Mask
Quad Vision Mask - Red     | Mask
Quad Vision Mask - Teal    | Mask
Sea Wing Fins - Clear      | Fins
Sea Wing Fins - Red        | Fins
Sea Wing Fins - Teal       | Fins
Jet Fin - Black            | Fins
Regulator (3 rows)         | Second Stage

[Item numbers, regulator model numbers, stock counts and prices are not legible in the source.]

Dive Locations = DEST

Columns: Destination No, Destination Name, Avg Temp (F), Avg Temp (C), Spring Temp (F), Spring Temp (C), Summer Temp (F), Summer Temp (C), Fall Temp (F), Fall Temp (C), Winter Temp (F), Winter Temp (C), Accomodations, Night Life, Body of Water, Travel Cost

Destination Name   | Accomodations | Night Life | Body of Water
Cozumel            | Cheap         | Sleepy     | Caribbean
Great Barrier Reef | Moderate      | Pleasant   | Coral Sea
Monterey           | Expensive     | Wild       | Pacific
Santa Barbara      | Expensive     | Wild       | Pacific
Florida            | Moderate      | Pleasant   | Caribbean
Fiji               | Expensive     | Sleepy     | South Pacific
New Jersey         | Expensive     | Pleasant   | Atlantic

[Destination numbers, temperatures and travel costs are not legible in the source.]

Dive Sites = SITE

Columns: Site No, Destination No, Site Name, Site Highlight, Site Notes, Distance from Town (km), Distance from Town (mi), Depth (ft), Depth (m), Visibility (ft), Visibility (m), Current, Skill Level

Site Name       | Site Highlight | Current | Skill Level
Palancar Reef   | Reef           | Strong  | Intermediate
Santa Rosa Reef | Reef           | Strong  | Intermediate
Chancanab Reef  | Reef           | Mild    | Beginning
Punta Sur       | Reef           | Strong  | Advanced
Yocab Reef      | Reef           | Mild    | Beginning
Heron Island    | Reef           | Mild    | Intermediate
Cod Hole        | Fish           | Mild    | Beginning
Butterfly Bay   | Caves          | None    | Advanced
Wheeler Reef    | Marine Life    | Mild    | Beginning
Watanabe        | Marine Life    | None    | Intermediate
Point Lobos     | Marine Life    | None    | Beginning
Macabee Beach   | Marine Life    | None    | Beginning
Pinnacles       | Pinnacle       | Mild    | Beginning
Monastery Beach | Marine Life    | Surge   | Beginning

[Site and destination numbers, distances, depths and visibility values are not legible in the source.]

Sea Life = BIOLIFE


Columns: Species No, Category, Common Name, Species Name, Length (cm), Length (in), Notes, Graphic

Category      | Common Name          | Species Name
Triggerfish   | Clown Triggerfish    | Ballistoides conspicillum
Snapper       | Red Emperor          | Lutjanus sebae
Wrasse        | Giant Maori Wrasse   | Cheilinus undulatus
Angelfish     | Blue Angelfish       | Pomacanthus nauarchus
Cod           | Lunartail Rockcod    | Variola louti
Scorpionfish  | Firefish             | Pterois volitans
Butterflyfish | Ornate Butterflyfish | Chaetodon Ornatissimus
Shark         | Swell Shark          | Cephaloscyllium ventriosum
Ray           | Bat Ray              | Myliobatis californica
Eel           | California Moray     | Gymnothorax mordax
Cod           | Lingcod              | Ophiodon elongatus

[Species numbers and lengths are not legible in the source.]

BIOSITE -- linking relation


Columns: Species No, Site No

[Each row pairs one Species No with one Site No. The individual values are not legible in the source.]

Shipwrecks = SHIPWRK

Columns: Ship Name, Site No, Category, Type, Interest, Tonnage, Length (ft), Length (m), Beam (ft), Beam (m), Cause, Date Sunk, Comments, Survivors, Condition, Passengers/Crew, Graphic

Ship Name        | Category   | Type               | Interest  | Cause      | Condition
Delaware         | Commercial | Steam Freighter    | Treasure  | Fire       | Broken
F.S. Loop        | Commercial | Steam Schooner     | Machinery | Deliberate | Scattered
Gosford          | Commercial | Barque Rigged Sail | Fixture   | Fire       | Intact
Great Isaac      | Commercial | Seagoing Tug       | Fixture   | Collision  | Intact
Lizzie D         | Commercial | Tug/Rum runner     | Treasure  | Unknown    | Intact
Mohawk           | Passenger  | Ocean Liner        | Treasure  | Collision  | Scattered
R.P. Resor       | Commercial | Oil Tanker         | Treasure  | Military   | Broken
Star of Scotland | Passenger  | British Q-Boat     | Treasure  | Weather    | Broken
Tolten           | Commercial | Freighter          | Fixture   | Military   | Intact
USS Moody        | Military   | WWI Destroyer      | Treasure  | Deliberate | Intact
Valiant          | Passenger  | Luxury Motor Yacht | Treasure  | Fire       | Intact

[Site numbers, tonnage, dimensions, dates and passenger counts are not legible in the source.]

Mapping to Other Models


Hierarchical
Need to make decisions about access paths

Network
Need to pre-specify all of the links and sets

Object-Oriented
What are the objects, datatypes, their methods and the access points for them

Object-Relational
Same as relational, but what new datatypes might be needed or useful (more on OR later)

Lecture Outline
Review
Logical Model for the Diveshop database
Normalization
Relational Advantages and Disadvantages

Normalization
Normalization theory is based on the observation that relations with certain properties are more effective in inserting, updating and deleting data than other sets of relations containing the same data.
Normalization is a multi-step process beginning with an unnormalized relation.
Hospital example from Atre, S. Data Base: Structured Techniques for Design, Performance, and Management.

Normal Forms
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)

Normalization

First Normal Form: atomic values only; functional dependency of nonkey attributes on the primary key
Second Normal Form: full functional dependency of nonkey attributes on the primary key
Third Normal Form: no transitive dependency between nonkey attributes
Boyce-Codd and Higher: all determinants are candidate keys; single multivalued dependency

Unnormalized Relations
The first step in normalization is to convert the data into a two-dimensional table.
In unnormalized relations, data can repeat within a column.

Unnormalized Relation
Columns: Patient #, Surgeon #, Surg. date, Patient Name, Patient Addr, Surgeon, Surgery, Postop drug, Drug side effects

Patient Name  | Patient Addr                  | Surg. date | Surgeon                      | Surgery                                      | Postop drug        | Drug side effects
John White    | New St., New York, NY         | Jan; June  | Beth Little; Michael Diamond | Gallstones removal; Kidney stones removal    | Penicillin; none   | rash; none
Mary Jones    | Main St., Rye, NY             | Apr; May   | Charles Field; Patricia Gold | Eye Cataract removal; Thrombosis removal     | Tetracycline; none | Fever; none
Charles Brown | Dogwood Lane, Harrison, NY    | Jan        | David Rosen                  | Open Heart Surgery                           | Cephalosporin      | none
Hal Kane      | Boston Post Road, Chester, CN | Nov        | Beth Little                  | Cholecystectomy                              | Demicillin         | none
Paul Kosher   | Blind Brook, Mamaroneck, NY   | May        | Beth Little                  | Gallstones Removal                           | none               | none
Ann Hood      | Hilton Road, Larchmont, NY    | Apr; Dec   | Charles Field                | Eye Cornea Replacement; Eye cataract removal | Tetracycline; none | Fever; none

[Patient and surgeon numbers, street numbers, and the day and year of each date are not legible in the source.]

First Normal Form


To move to First Normal Form a relation must contain only atomic values at each row and column.
No repeating groups.
A column or set of columns is called a Candidate Key when its values can uniquely identify the row in the relation.
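A minimal sketch of removing repeating groups, assuming the unnormalized data is held as nested Python records. The record layout and the function name to_first_normal_form are hypothetical; the names and procedures are taken from the hospital example.

```python
# Hypothetical unnormalized records: each patient row holds a repeating group.
unnormalized = [
    {
        "patient_name": "John White",
        "surgeries": [
            {"surgeon": "Beth Little", "surgery": "Gallstones removal"},
            {"surgeon": "Michael Diamond", "surgery": "Kidney stones removal"},
        ],
    },
    {
        "patient_name": "Charles Brown",
        "surgeries": [
            {"surgeon": "David Rosen", "surgery": "Open Heart Surgery"},
        ],
    },
]


def to_first_normal_form(records):
    """Emit one flat row per repeating-group entry so every cell is atomic."""
    rows = []
    for record in records:
        for item in record["surgeries"]:
            rows.append((record["patient_name"], item["surgeon"], item["surgery"]))
    return rows


flat_rows = to_first_normal_form(unnormalized)
for row in flat_rows:
    print(row)
```

Note that the patient name is now repeated on every row for that patient, which is the source of the storage anomalies that the later normal forms address.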


First Normal Form


Columns: Patient #, Surgeon #, Surgery Date, Patient Name, Patient Addr, Surgeon Name, Surgery, Drug admin, Side Effects

Surgery Date | Patient Name  | Patient Addr                  | Surgeon Name    | Surgery                | Drug admin    | Side Effects
Jan          | John White    | New St., New York, NY         | Beth Little     | Gallstones removal     | Penicillin    | rash
Jun          | John White    | New St., New York, NY         | Michael Diamond | Kidney stones removal  | none          | none
Apr          | Mary Jones    | Main St., Rye, NY             | Charles Field   | Eye Cataract removal   | Tetracycline  | Fever
May          | Mary Jones    | Main St., Rye, NY             | Patricia Gold   | Thrombosis removal     | none          | none
Jan          | Charles Brown | Dogwood Lane, Harrison, NY    | David Rosen     | Open Heart Surgery     | Cephalosporin | none
Nov          | Hal Kane      | Boston Post Road, Chester, CN | Beth Little     | Cholecystectomy        | Demicillin    | none
May          | Paul Kosher   | Blind Brook, Mamaroneck, NY   | Beth Little     | Gallstones Removal     | none          | none
Apr          | Ann Hood      | Hilton Road, Larchmont, NY    | Charles Field   | Eye Cornea Replacement | Tetracycline  | Fever
Dec          | Ann Hood      | Hilton Road, Larchmont, NY    | Charles Field   | Eye cataract removal   | none          | none

[Patient and surgeon numbers, street numbers, and the day and year of each date are not legible in the source.]

1NF Storage Anomalies


Insertion: A new patient has not yet undergone surgery, hence has no surgeon #. Since surgeon # is part of the key, we can't insert.
Insertion: If a surgeon is newly hired and hasn't operated yet, there will be no way to include that person in the database.
Update: If a patient comes in for a new procedure, and has moved, we need to change multiple address entries.
Deletion (type 1): Deleting a patient record may also delete all info about a surgeon.
Deletion (type 2): When there are functional dependencies (like side effects and drug), changing one item eliminates other information.

Second Normal Form


A relation is said to be in Second Normal Form when every nonkey attribute is fully functionally dependent on the primary key.
That is, every nonkey attribute needs the full primary key for unique identification
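The move to 2NF can be sketched as relational projection. In the hospital example the key is (patient #, surgeon #, surgery date), but patient name depends only on patient # and surgeon name only on surgeon #, so each is projected into its own relation. The helper name project is an assumption, and the numeric identifiers and dates below are made-up placeholders.

```python
# 1NF rows: (patient_no, surgeon_no, surgery_date, patient_name, surgeon_name, surgery)
# Identifiers and dates are placeholders, not the values from the slides.
first_nf = [
    (1, 10, "date-1", "John White", "Beth Little", "Gallstones removal"),
    (1, 20, "date-2", "John White", "Michael Diamond", "Kidney stones removal"),
    (2, 10, "date-3", "Paul Kosher", "Beth Little", "Gallstones removal"),
]


def project(rows, *columns):
    """Relational projection: keep the given column positions, drop duplicates."""
    return sorted({tuple(row[c] for c in columns) for row in rows})


patient = project(first_nf, 0, 3)        # depends on patient_no only
surgeon = project(first_nf, 1, 4)        # depends on surgeon_no only
surgery = project(first_nf, 0, 1, 2, 5)  # needs the full composite key

print(patient)
print(surgeon)
print(surgery)
```

After the projection, each patient name and each surgeon name is stored once, however many surgeries refer to them.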


Second Normal Form


Patient # | Patient Name  | Patient Address
          | John White    | New St., New York, NY
          | Mary Jones    | Main St., Rye, NY
          | Charles Brown | Dogwood Lane, Harrison, NY
          | Hal Kane      | Boston Post Road, Chester,
          | Paul Kosher   | Blind Brook, Mamaroneck, NY
          | Ann Hood      | Hilton Road, Larchmont, NY

[Patient numbers and street numbers are not legible in the source.]

Second Normal Form


Surgeon # | Surgeon Name
          | Beth Little
          | David Rosen
          | Charles Field
          | Michael Diamond
          | Patricia Gold

[Surgeon numbers are not legible in the source.]

Second Normal Form


Columns: Patient #, Surgeon #, Surgery Date, Surgery, Drug Admin, Side Effects

Surgery Date | Surgery                | Drug Admin    | Side Effects
Jan          | Gallstones removal     | Penicillin    | rash
Jun          | Kidney stones removal  | none          | none
Apr          | Eye Cataract removal   | Tetracycline  | Fever
May          | Thrombosis removal     | none          | none
Jan          | Open Heart Surgery     | Cephalosporin | none
Nov          | Cholecystectomy        | Demicillin    | none
May          | Gallstones Removal     | none          | none
Dec          | Eye cataract removal   | none          | none
Apr          | Eye Cornea Replacement | Tetracycline  | Fever

[Patient and surgeon numbers, and the day and year of each date, are not legible in the source.]

1NF Storage Anomalies Removed


Insertion: Can now enter new patients without surgery.
Insertion: Can now enter surgeons who haven't operated.
Deletion (type 1): If Charles Brown dies, the corresponding tuples from the Patient and Surgery tables can be deleted without losing information on David Rosen.
Update: If John White comes in for a third time, and has moved, we only need to change the Patient table.

2NF Storage Anomalies


Insertion: Cannot enter the fact that a particular drug has a particular side effect unless it is given to a patient.
Deletion: If John White receives some other drug because of the penicillin rash, and a new drug and side effect are entered, we lose the information that penicillin can cause a rash.
Update: If drug side effects change (a new formula) we have to update multiple occurrences of side effects.

Third Normal Form


A relation is said to be in Third Normal Form if there is no transitive functional dependency between nonkey attributes.
When one nonkey attribute can be determined by one or more other nonkey attributes, there is said to be a transitive functional dependency.

The side effect column in the Surgery table is determined by the drug administered
Side effect is transitively functionally dependent on drug so Surgery is not 3NF
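A small sketch of detecting and removing this transitive dependency. The function name holds and the row layout are assumptions, the identifiers and dates are placeholders, and the drug and side-effect pairs follow the hospital example.

```python
# 2NF Surgery rows: (patient_no, surgeon_no, surgery_date, drug, side_effect)
# Identifiers and dates are placeholders, not the values from the slides.
surgery_2nf = [
    (1, 10, "date-1", "Penicillin", "rash"),
    (2, 30, "date-2", "Tetracycline", "Fever"),
    (6, 30, "date-3", "Tetracycline", "Fever"),
    (3, 40, "date-4", "Cephalosporin", "none"),
]


def holds(rows, lhs, rhs):
    """Return True if column position lhs functionally determines column rhs."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


# In this sample, drug (a nonkey attribute) determines side_effect.
drug_determines_effect = holds(surgery_2nf, 3, 4)

# 3NF decomposition: move the dependent column into its own Drug relation.
drug = sorted({(row[3], row[4]) for row in surgery_2nf})
surgery_3nf = [row[:4] for row in surgery_2nf]

print(drug_determines_effect)
print(drug)
```

A check like this only shows that the dependency is consistent with the sample rows; whether it is a real rule of the domain is a design decision.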


Third Normal Form


Columns: Patient #, Surgeon #, Surgery Date, Surgery, Drug Admin

Surgery Date | Surgery                | Drug Admin
Jan          | Gallstones removal     | Penicillin
Jun          | Kidney stones removal  | none
Apr          | Eye Cataract removal   | Tetracycline
May          | Thrombosis removal     | none
Jan          | Open Heart Surgery     | Cephalosporin
Nov          | Cholecystectomy        | Demicillin
May          | Gallstones Removal     | none
Dec          | Eye cataract removal   | none
Apr          | Eye Cornea Replacement | Tetracycline

[Patient and surgeon numbers, and the day and year of each date, are not legible in the source.]

Third Normal Form

Drug Admin    | Side Effects
Cephalosporin | none
Demicillin    | none
none          | none
Penicillin    | rash
Tetracycline  | Fever

2NF Storage Anomalies Removed
Insertion: We can now enter the fact that a particular drug has a particular side effect in the Drug relation.
Deletion: If John White receives some other drug as a result of the rash from penicillin, the information that penicillin can cause a rash is still maintained.
Update: The side effects for each drug appear only once.

Boyce-Codd Normal Form


Most 3NF relations are also BCNF relations. A 3NF relation can fail to be in BCNF only when all of the following hold:
Candidate keys in the relation are composite keys (they are not single attributes),
There is more than one candidate key in the relation, and
The keys are not disjoint, that is, some attributes in the keys are common.
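A sketch of a BCNF check built on attribute closure: a relation is in BCNF when the determinant of every non-trivial functional dependency is a superkey. The function names are assumptions, and the student/course/teacher relation is a common textbook illustration rather than part of the lecture's hospital example. It has two composite, overlapping candidate keys, (student, course) and (student, teacher), so it matches the conditions listed above.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    all_attrs = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != all_attrs
    ]


# Textbook-style example (not from the lecture): each teacher teaches one course.
teaching = ("student", "course", "teacher")
teaching_fds = [
    (("student", "course"), ("teacher",)),
    (("teacher",), ("course",)),
]

# Patient relation from the hospital example: a single-attribute key.
patient = ("patient_no", "name", "address")
patient_fds = [(("patient_no",), ("name", "address"))]

print(bcnf_violations(teaching, teaching_fds))
print(bcnf_violations(patient, patient_fds))
```

The Patient relation has a single-attribute key, so it cannot meet the conditions above and the check reports no violations.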


Most 3NF Relations are also BCNF. Is this one?

Patient # | Patient Name  | Patient Address
          | John White    | New St., New York, NY
          | Mary Jones    | Main St., Rye, NY
          | Charles Brown | Dogwood Lane, Harrison, NY
          | Hal Kane      | Boston Post Road, Chester,
          | Paul Kosher   | Blind Brook, Mamaroneck, NY
          | Ann Hood      | Hilton Road, Larchmont, NY

[Patient numbers and street numbers are not legible in the source.]

BCNF Relations
Patient # | Patient Name
          | John White
          | Mary Jones
          | Charles Brown
          | Hal Kane
          | Paul Kosher
          | Ann Hood

Patient # | Patient Address
          | New St., New York, NY
          | Main St., Rye, NY
          | Dogwood Lane, Harrison, NY
          | Boston Post Road, Chester,
          | Blind Brook, Mamaroneck, NY
          | Hilton Road, Larchmont, NY

[Patient numbers and street numbers are not legible in the source.]

Fourth Normal Form


Any relation is in Fourth Normal Form if it is BCNF and any multivalued dependencies are trivial.
Eliminate non-trivial multivalued dependencies by projecting into simpler tables.
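A sketch of the projection step, using an invented relation in which a surgeon has two independent multivalued facts: the procedures performed and the languages spoken. The surgeon names are borrowed from the hospital example, but the procedure and language assignments are made up for illustration.

```python
from itertools import product

# Invented data: two independent multivalued facts about each surgeon.
procedures = {
    "Beth Little": ["Cholecystectomy", "Gallstones removal"],
    "Charles Field": ["Eye cataract removal"],
}
languages = {
    "Beth Little": ["English", "Spanish"],
    "Charles Field": ["English"],
}

# Stored in one relation, every procedure must be paired with every language.
combined = {
    (surgeon, proc, lang)
    for surgeon in procedures
    for proc, lang in product(procedures[surgeon], languages[surgeon])
}

# 4NF: project each multivalued dependency into its own simpler table.
surgeon_procedure = {(s, p) for s, p, _ in combined}
surgeon_language = {(s, l) for s, _, l in combined}

# Joining the projections on surgeon recreates the original relation.
rejoined = {
    (s, p, l)
    for s, p in surgeon_procedure
    for s2, l in surgeon_language
    if s == s2
}

print(len(combined), len(surgeon_procedure), len(surgeon_language))
print(rejoined == combined)
```

The single combined table needs one row per procedure and language combination, while the two projections grow only with the number of individual facts.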


Fifth Normal Form


A relation is in 5NF if every join dependency in the relation is implied by the keys of the relation.
This implies that relations that have been decomposed in previous normal forms can be recombined via natural joins to recreate the original relation.
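A sketch of a join dependency, using a small three-attribute relation in the style of the supplier/part/project examples found in database textbooks. The data values and the helper names natural_join, project and as_set are illustrative assumptions. Joining only two of the three projections produces a spurious row; joining all three recreates the original.

```python
def natural_join(left, right):
    """Natural join of two relations given as lists of dicts."""
    joined = []
    for lrow in left:
        for rrow in right:
            shared = lrow.keys() & rrow.keys()
            if all(lrow[k] == rrow[k] for k in shared):
                joined.append({**lrow, **rrow})
    return joined


def as_set(rel):
    """Order-independent view of a relation, for comparing results."""
    return {tuple(sorted(row.items())) for row in rel}


def project(rel, cols):
    """Projection onto cols, with duplicate rows removed."""
    return [dict(t) for t in as_set([{c: row[c] for c in cols} for row in rel])]


# Illustrative data: s = supplier, p = part, j = project.
original = [
    {"s": "S1", "p": "P1", "j": "J2"},
    {"s": "S1", "p": "P2", "j": "J1"},
    {"s": "S2", "p": "P1", "j": "J1"},
    {"s": "S1", "p": "P1", "j": "J1"},
]

sp = project(original, ["s", "p"])
pj = project(original, ["p", "j"])
js = project(original, ["j", "s"])

two_way = natural_join(sp, pj)         # contains a spurious (S2, P1, J2) row
three_way = natural_join(two_way, js)  # the third projection removes it

print(len(as_set(two_way)), len(as_set(three_way)))
print(as_set(three_way) == as_set(original))
```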


Effectiveness and Efficiency Issues for DBMS
Focus on the relational model.
Any column in a relational database can be searched for values.
To improve efficiency, indexes using storage structures such as B-trees and hashing are used.
But many useful functions are not indexable and require complete scans of the database.

Example: Text Fields


In a conventional RDBMS, an index on a text field supports only exact matching of the field contents (or greater-than and less-than comparisons).
Can search for individual words using pattern matching, but a full scan is required.

Text searching is still done best (and fastest) by specialized text search programs (Search Engines) that we will look at more later.
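The difference between an indexed exact match and a pattern match can be observed with SQLite's EXPLAIN QUERY PLAN, here through Python's sqlite3 module. This is a sketch on a hypothetical table, and the exact wording of the plan text varies between SQLite versions, so the code only looks for the keywords SEARCH (index lookup) and SCAN (full scan).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for illustration; not the course's actual schema.
conn.execute(
    "CREATE TABLE biolife (species_no INTEGER PRIMARY KEY, common_name TEXT, notes TEXT)"
)
conn.execute("CREATE INDEX idx_common_name ON biolife (common_name)")


def plan(sql, params=()):
    """Return the 'detail' text of each step SQLite reports for a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


# Exact match on the indexed column: the index can be used.
exact_plan = plan("SELECT notes FROM biolife WHERE common_name = ?", ("Bat Ray",))

# Word-within-field pattern match: every row has to be examined.
word_plan = plan("SELECT notes FROM biolife WHERE common_name LIKE ?", ("%Ray%",))

print(exact_plan)
print(word_plan)
```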

Normalization
Normalization is performed to reduce or eliminate Insertion, Deletion or Update anomalies. However, a completely normalized database may not be the most efficient or effective implementation. Denormalization is sometimes used to improve efficiency.


Normalizing to death
Normalization splits database information across multiple tables. To retrieve complete information from a normalized database, the JOIN operation must be used. JOIN tends to be expensive in terms of processing time, and very large joins are very expensive.


Downward Denormalization
Before:
Customer (ID, Address, Name, Telephone)
Order (Order No, Date Taken, Date Dispatched, Date Invoiced, Cust ID)

After:
Customer (ID, Address, Name, Telephone)
Order (Order No, Date Taken, Date Dispatched, Date Invoiced, Cust ID, Cust Name)

Upward Denormalization
Before:
Order (Order No, Date Taken, Date Dispatched, Date Invoiced, Cust ID, Cust Name)
Order Item (Order No, Item No, Item Price, Num Ordered)

After:
Order (Order No, Date Taken, Date Dispatched, Date Invoiced, Cust ID, Cust Name, Order Price)
Order Item (Order No, Item No, Item Price, Num Ordered)

Denormalization
Usually driven by the need to improve query speed.
Query speed is improved at the expense of more complex or problematic DML (Data Manipulation Language) for updates, deletions and insertions.
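The trade-off can be sketched with Python's sqlite3 module. The schema below is a hypothetical, cut-down version of the Customer and Order tables in which the customer name is copied down into each order row: reads need no JOIN, but an update has to touch both tables to keep the copies consistent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema for illustration.
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    -- Downward denormalization: cust_name is copied into each order row.
    CREATE TABLE orders (order_no INTEGER PRIMARY KEY, cust_id INTEGER, cust_name TEXT);
    INSERT INTO customer VALUES (1, 'Mary Jones');
    INSERT INTO orders VALUES (100, 1, 'Mary Jones'), (101, 1, 'Mary Jones');
""")


def order_names():
    """Read customer names straight from the order rows; no JOIN needed."""
    query = "SELECT cust_name FROM orders ORDER BY order_no"
    return [name for (name,) in conn.execute(query)]


def rename_customer(cust_id, new_name):
    """The cost of the copy: the update must change both tables together."""
    with conn:  # one transaction, so the two copies cannot drift apart
        conn.execute("UPDATE customer SET name = ? WHERE cust_id = ?",
                     (new_name, cust_id))
        conn.execute("UPDATE orders SET cust_name = ? WHERE cust_id = ?",
                     (new_name, cust_id))


rename_customer(1, "Mary Smith")
print(order_names())
```

If the second UPDATE were forgotten, the order rows would silently keep the old name, which is exactly the kind of update anomaly normalization was meant to prevent.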


Using RDBMS to help normalize


Example database: Cookie
Database of books, libraries, publisher and holding information for a shared (union) catalog

Cookie relationships


Cookie BIBFILE relation


Columns: ACCNO, AUTHOR, TITLE, LOC, PUBID, DATE, PRICE, PAGINATION, ILL, HEIGHT

AUTHOR                        | TITLE                                    | LOC
AMERICAN LIBRARY ASSOCIATION  | ALA BULLETIN                             | CHICAGO
ANDERSON, THEODORE            | THE TEACHING OF MODERN LANGUAGES         | PARIS
AXT, RICHARD G.               | COLLEGE SELF STUDY : LECTURES ON INSTITU | BOULDER, CO.
BALDERSTON, FREDERICK E.      | MANAGING TODAYS UNIVERSITY               | SAN FRANCISCO
BARZUN, JACQUES               | TEACHER IN AMERICA                       | GARDEN CITY
BARZUN, JACQUES               | THE AMERICAN UNIVERSITY : HOW IT RUNS, W | NEW YORK
BARZUN, JACQUES               | THE HOUSE OF INTELLECT                   | NEW YORK
BELL, DANIEL                  | THE COMING OF POST-INDUSTRIAL SOCIETY :  | NEW YORK
BENSON, CHARLES S.            | IMPLEMENTING THE LEARNING SOCIETY        | SAN FRANCISCO
BERG, IVAR                    | EDUCATION AND JOBS : THE GREAT TRAINING  | BOSTON
BERSI, ROBERT M.              | RESTRUCTURING THE BACCALAUREATE          | WASHINGTON, D.C.
BEVERIDGE, WILLIAM I.         | THE ART OF SCIENTIFIC INVESTIGATION      | NEW YORK
BIRD, CAROLINE                | THE CASE AGAINST COLLEGE                 | NEW YORK
BISSELL, CLAUDE T.            | THE STRENGTH OF THE UNIVERSITY           | TORONTO
BLAIR, GLENN MYERS            | EDUCATIONAL PSYCHOLOGY                   | NEW YORK
BLAKE, ELIAS, JR.             | THE FUTURE OF THE BLACK COLLEGES         | CAMBRIDGE, MA.
BOLAND, R.J.                  | CRITICAL ISSUES IN INFORMATION SYSTEMS R | CHICHESTER, ENG.
BROWN, SANBORN C., ED.        | SCIENTIFIC MANPOWER                      | CAMBRIDGE, MASS.
BUCKLAND, MICHAEL K.          | LIBRARY SERVICES IN THEORY AND CONTEXT   | ELMSFORD, NY
BUDIG, GENE A.                | ACADEMIC QUICKSAND : SOME TRENDS AND ISS | LINCOLN, NEBRASKA
CALIFORNIA. DEPT. OF JUSTICE  | LAW IN THE SCHOOL                        | MONTCLAIR, N.J.
CAMPBELL, MARGARET A.         | WHY WOULD A GIRL GO INTO MEDICINE?       | OLD WESTBURY, N.Y.
CARNEGIE COMMISSION ON HIGHER | A DIGEST OF REPORTS OF THE CARNEGIE COMM | NEW YORK

[Accession numbers, publisher IDs, dates, prices, pagination and heights are not legible in the source. Long titles and authors are truncated in the original table.]

How to Normalize?
Currently no way to have multiple authors for a given book, and there is duplicate data spread over the BIBFILE table.
Can we use the DBMS to help us normalize?
Access example

Database Creation in Access


Simplest to use a design view
wizards are available, but less flexible

Need to watch the default values
Helps to know what the primary key is, or if one is to be created automatically
Automatic creation is more complex in other RDBMS and ORDBMS

Need to make decision about the physical storage of the data



Database Creation in Access


Some Simple Examples


Lecture Outline
Review
Logical Model for the Diveshop database
Normalization
Relational Advantages and Disadvantages

Advantages of RDBMS
Relational Database Management Systems (RDBMS)
Possible to design complex data storage and retrieval systems with ease (and without conventional programming).
Support for ACID transactions
Atomic
Consistent
Isolated
Durable
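The atomic part of ACID can be sketched with Python's sqlite3 module. The two tables below are hypothetical, cut-down versions of DIVESTOK and DIVEITEM, and the function place_order and the CHECK rule are assumptions for illustration: either both changes are applied or neither is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified tables; stock may never go below zero.
conn.execute(
    "CREATE TABLE divestok (item_no INTEGER PRIMARY KEY, "
    "on_hand INTEGER CHECK (on_hand >= 0))"
)
conn.execute("CREATE TABLE diveitem (order_no INTEGER, item_no INTEGER, qty INTEGER)")
conn.execute("INSERT INTO divestok VALUES (1, 5)")
conn.commit()


def place_order(order_no, item_no, qty):
    """Record the line item and reduce stock as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO diveitem VALUES (?, ?, ?)",
                         (order_no, item_no, qty))
            conn.execute("UPDATE divestok SET on_hand = on_hand - ? WHERE item_no = ?",
                         (qty, item_no))
        return True
    except sqlite3.IntegrityError:
        return False


first = place_order(1, 1, 3)   # enough stock: both changes are committed
second = place_order(2, 1, 4)  # would drive stock negative: both are rolled back

line_items = conn.execute("SELECT COUNT(*) FROM diveitem").fetchone()[0]
on_hand = conn.execute("SELECT on_hand FROM divestok WHERE item_no = 1").fetchone()[0]
print(first, second, line_items, on_hand)
```

The failed order leaves no stray line item behind, even though its INSERT had already run before the stock update was rejected.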

Advantages of RDBMS
Support for very large databases
Automatic optimization of searching (when possible)
RDBMS have a simple view of the database that conforms to much of the data used in business
Standard query language (SQL)

Disadvantages of RDBMS
Until recently, no real support for complex objects such as documents, video, images, spatial or time-series data. (ORDBMS add, or make available, support for these.)
Often poor support for storage of complex objects from OOP languages (disassembling the car to park it in the garage).
Usually no efficient and effective integrated support for things like text searching within fields (MySQL does have simple keyword searching now with index support).

Next Week
Database Design Workshop

