DATABASE DESIGN

Observations about DATA

•Data are the most stable part of an organization’s information system
•Permanent data are stored in tables within a database
•Permanent storage of data is also referred to as persistent data

Why do we need database design?

•A quality I.S. demands a quality db design
•Avoid redundancy (duplication) of data
•Ensures simple db structures which allow for maximum effective utilization of the data

Analysis to Design (Logical model to Physical model)

Analysis (Logical)
  Student: iD, name
  Major: code, name

Design (Physical)
  Student: iD, name, majorCode
  Major: code, name

note: majorCode is a synonym for code

Example of Duplicate Data
(notice the redundancy in the data values)

First Name  Last Name  Student ID   Course Taken  Grade
John        Adams      123-45-6789  IDS306        B
John        Adams      123-45-6789  IDS406        A
John        Adams      123-45-6789  IDS315        B+
Susan       Baker      987-65-4321  IDS250        A
Susan       Baker      987-65-4321  IDS315        A-
Susan       Baker      987-65-4321  IDS306        B
Susan       Baker      987-65-4321  IDS480        B
Kim         Le         789-12-3456  IDS180        A
Kim         Le         789-12-3456  IDS250        A

Distribute the data into 2 tables
(notice the reduction in redundancy)

First Name  Last Name  Student ID
John        Adams      123-45-6789
Susan       Baker      987-65-4321
Kim         Le         789-12-3456

Student ID   Course Taken  Grade
123-45-6789  IDS306        B
123-45-6789  IDS406        A
123-45-6789  IDS315        B+
987-65-4321  IDS250        A
987-65-4321  IDS315        A-
987-65-4321  IDS306        B
987-65-4321  IDS480        B
789-12-3456  IDS180        A
789-12-3456  IDS250        A

Foreign Key: Student ID in the second table
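The two-table split can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names (student, course_taken) and column names are assumptions made for the example, and the rows are the sample values from the slide.

```python
import sqlite3

# Sample rows from the slide; table and column names below are hypothetical.
STUDENTS = [
    ("123-45-6789", "John", "Adams"),
    ("987-65-4321", "Susan", "Baker"),
    ("789-12-3456", "Kim", "Le"),
]
COURSES_TAKEN = [
    ("123-45-6789", "IDS306", "B"),
    ("123-45-6789", "IDS406", "A"),
    ("123-45-6789", "IDS315", "B+"),
    ("987-65-4321", "IDS250", "A"),
    ("987-65-4321", "IDS315", "A-"),
    ("987-65-4321", "IDS306", "B"),
    ("987-65-4321", "IDS480", "B"),
    ("789-12-3456", "IDS180", "A"),
    ("789-12-3456", "IDS250", "A"),
]


def build_db():
    """Create an in-memory database holding the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the foreign key
    conn.execute(
        "CREATE TABLE student ("
        " student_id TEXT PRIMARY KEY, first_name TEXT, last_name TEXT)"
    )
    conn.execute(
        "CREATE TABLE course_taken ("
        " student_id TEXT NOT NULL REFERENCES student(student_id),"
        " course TEXT NOT NULL, grade TEXT,"
        " PRIMARY KEY (student_id, course))"
    )
    conn.executemany("INSERT INTO student VALUES (?, ?, ?)", STUDENTS)
    conn.executemany("INSERT INTO course_taken VALUES (?, ?, ?)", COURSES_TAKEN)
    conn.commit()
    return conn


def transcript(conn, student_id):
    """Join the two tables through the foreign key to rebuild the wide rows."""
    return conn.execute(
        "SELECT s.first_name, s.last_name, c.course, c.grade"
        " FROM course_taken AS c"
        " JOIN student AS s ON s.student_id = c.student_id"
        " WHERE c.student_id = ? ORDER BY c.course",
        (student_id,),
    ).fetchall()


conn = build_db()
kim_rows = transcript(conn, "789-12-3456")
print(kim_rows)
```

Each student's name is now stored once, and the join recovers the original combined rows whenever they are needed.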

Hierarchical Components of Persistent Data

Bits: 01110001
Bytes: A, B, ... Z, 0,1...9, #, &, $, etc...
Attributes (Template): First Name, Middle Initial, Last Name, Social Security Number, State
Values, states, or instances: Ronald, J, Norman, 559-65-8213, CA
Records (each row is a record):

First Name  Middle Initial  Last Name  Social Security Number  State
Ronald      J               Norman     559-65-8213             CA
Rashmi      B               Kumar      371-48-4562             MI
Jame        R               Logan      559-63-                 OR

TABLES (Individual Files or all part of a database)

Table #1 Student Information
First Name  Middle Initial  Last Name  Social Security Number  State
Ronald      J               Norman     559-65-8213             CA
Rashmi      B               Kumar      371-48-4562             MI
Jame        R               Logan      559-63-                 OR

Table #2 Course Information
Course Number  Course Name            Department  Units
Act102         Accounting Principles  Accounting  3
Bio101         Intro to Biology       Biology     3
Chm109         Organic Chemistry      Chemistry   3
Eco104         Macro Economics        Economics   3
Eng100         Beginning English      English     3
MIS11          Intro. to Computers    M.I.S.      3
...            Principles of ...      Marketing   3

Table #3 Department Information
Attributes: Department, No. of Majors, Department Head, Telephone
Departments: Accounting, Biology, Chemistry, Economics, English, M.I.S., Marketing
Department Heads: J. Morgan, S. Tishman, P. Dayson, R. Kumar, J. Amar, K. Kettleman, A. Winters

Seven Table (file) Types
• Master
• Transaction
• “Table”
• Temporary
• Log
• Mirror
• Archive

Master Table: reference (foundational) data for the information system

Student Master Table
Social Security Number  First Name  Middle Initial  Last Name  Zipcode  Telephone  etc...
123-45-6789             Jim         R               Thomas     ...      464-3782   etc...
321-54-6638             Mary        J               Wilson     ...      571-2190   etc...

Transaction Table: holds the business activity for the information system

Course Registration Transaction Table
Attributes: Student #, Course Number, Section #, Course Serial #, Course Semester, Transaction Date/Time

Student #  Course Semester  Transaction Date/Time
559680843  Spr95            941115/1202
525987391  Spr95            941115/1202
371234959  Spr95            941115/1202
559680843  Spr95            941115/1203

“Table” Table: Static (relatively) table of values

State Code Table
State Code  State Name
AL          Alabama
AZ          Arizona
CA          California
CO          Colorado
...
WY          Wyoming

Sales Tax Code Table
Sale Range  Sales Tax
.00 - .09   .00
.10 - .24   .01
.25 - .39   .02
.40 - .54   .03
.55 - .69   .04
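A "Table" table such as the Sales Tax Code Table is typically used as a lookup. The sketch below is one possible way to do that in Python, assuming amounts are handled as whole cents; the function name sales_tax_cents and the constant names are hypothetical, and only the five ranges visible on the slide are included.

```python
from bisect import bisect_right

# Lower bound of each sale range (in cents) and the tax for that range (in cents),
# transcribed from the slide's Sales Tax Code Table. Names are hypothetical.
RANGE_STARTS = [0, 10, 25, 40, 55]
TAX_CENTS = [0, 1, 2, 3, 4]
LAST_RANGE_END = 69  # the slide shows ranges only up to .69


def sales_tax_cents(sale_cents):
    """Return the tax in cents for a sale amount that falls in the listed ranges."""
    if not 0 <= sale_cents <= LAST_RANGE_END:
        raise ValueError("sale amount is outside the ranges shown in the table")
    # bisect_right finds how many range starts are <= sale_cents;
    # subtracting 1 gives the position of the range that contains the amount.
    return TAX_CENTS[bisect_right(RANGE_STARTS, sale_cents) - 1]


print(sales_tax_cents(24), sales_tax_cents(25))
```

Because the table changes rarely, keeping it as data (rather than as branching logic) means a rate change only requires editing the table values.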

Temporary Table - created and used briefly OR over an extended period of time to help the information system accomplish its intended purpose
Log Table - contains copies of Master and Transaction table records for audit, statistical, and recovery purposes
Mirror Table - an exact copy of one of the other types of tables used to minimize or eliminate information
Archive Table - a historical copy of a master, transaction, “table”, or log table

DATABASE DESIGN
•Database = one or more related tables (files)
•Folder = Metaphor for holding a database
•Data Structures - another name for records
  •Simplicity
  •Non-redundancy
•Data Structure Modeling:
  •Entity-Relationship Diagrams
  •Object Models:
    •Generalization-Specialization Structure
    •Whole-Part Object Connection w/constraints
    •Object Connection w/constraints

Attribute (field) Types
•Key - used to identify & find one or more records in a table (file)
  •Primary - unique; identifies one specific record; table may need to combine two or more attributes to accomplish this (Examples: customer #, student #, VIN #, UPC #)
  •Secondary - non-unique; may identify multiple records; another way to identify one or more records in a file (Examples: customer name, zip code, city, last name)
  •Foreign - attributes added to a table to associate a record in the table with one or more records in one or more OTHER tables (Example: “Courses Taken” table has a student # in it)
•Descriptor - characteristics that describe the data; some of these attributes are used for Audit & Control purposes, Security purposes, or programmer consistency & control purposes
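The difference between the three key types can be sketched in a few lines of Python. The records, attribute names (student_no, last_name, course_no), and helper functions below are all hypothetical and exist only to illustrate the definitions.

```python
from collections import defaultdict

# Hypothetical sample records; attribute names and values are illustrative only.
STUDENTS = [
    {"student_no": 68372, "last_name": "Adams", "state": "CA"},
    {"student_no": 68373, "last_name": "Kumar", "state": "MI"},
    {"student_no": 68374, "last_name": "Adams", "state": "CA"},
]
COURSES_TAKEN = [
    {"student_no": 68372, "course_no": "MIS-111"},
    {"student_no": 68374, "course_no": "MIS-111"},
    {"student_no": 68372, "course_no": "Bio101"},
]


def build_primary_index(records, key):
    """Primary key: each value must identify exactly one record."""
    index = {}
    for record in records:
        if record[key] in index:
            raise ValueError(f"duplicate primary key: {record[key]!r}")
        index[record[key]] = record
    return index


def build_secondary_index(records, key):
    """Secondary key: a value may identify several records."""
    index = defaultdict(list)
    for record in records:
        index[record[key]].append(record)
    return index


def courses_with_names(courses_taken, students_by_no):
    """Foreign key: follow student_no from one table into the other."""
    return [
        (students_by_no[row["student_no"]]["last_name"], row["course_no"])
        for row in courses_taken
    ]


by_student_no = build_primary_index(STUDENTS, "student_no")
by_last_name = build_secondary_index(STUDENTS, "last_name")
pairs = courses_with_names(COURSES_TAKEN, by_student_no)
print(len(by_last_name["Adams"]), pairs)
```

The primary index refuses duplicates, the secondary index returns a list of matches, and the foreign key is simply a stored primary-key value used to reach a record in another table.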

Key Examples

Primary (unique)
•Student Account Number
•Bank Account Number
•Vehicle ID Number
•Credit Card Number
•University Course Schedule Number
•University Course Number + Section Number

Secondary (non-unique)
•Student Last Name
•Vehicle Type
•State
•Zipcode

Foreign (association)
•Student Account Number -----> Courses Taken
•Vehicle Type -----> Description of this Type
•State -----> Table of State Codes & Descriptions
•City -----> Table of valid zip codes for each city

Key Attribute Examples

Key Attribute Name       Instance (Value or State) Example
Student ID Number        68372
Social Security Number   559-68-0923
Vehicle ID Number        JA3XC52BONY002400
Course Number            MIS-111
VISA Card Number         4128 0022 2048 2552
Checking Account Number  128-0049

Foreign Key Example

Student Information Table*
Student Name  Student ID Number
Adams         371-48-4326
Jones         559-62-0987
Kumar         243-98-7615
Lopez         337-89-6212
Norman        298-88-7643
Smith         557-33-5849
Zumw          558-97-

Course Information Table*
Student ID Number (Foreign Key)  Course Number
557-33-5849                      Bio101
243-98-7615                      Bio101
558-97-8221                      Bio101
371-48-4326                      Eng103
...                              Eng103
...                              MIS111

* Note: Both of these tables would have additional attributes (columns)

Seven Table (file) Types
• Master
• Transaction
• “Table”
• Temporary
• Log
• Mirror
• Archive

These different types of tables have access and organization needs/requirements…next page

Table Access & Organization
Table Access: Method of reading or writing records
•Sequential - first to last, vice versa
•Direct - any record

Table Organization: Method of storing records
•Serial - based on arrival time of data
•Sequential - based on sorted attribute(s)
•Relative or Direct - based on an algorithm
•Indexed - based on maintaining a sorted index of attribute values separate from the data

Serial File Organization
E-Mail InBox File
Based on arrival date & time attributes

#  Date      Time   From       Subject
1  11/28/97  09:12  Dean       New Enroll
2  11/28/97  11:55  President  Discrim. Policy
3  12/01/97  10:16  JSmith     Grade in Class
4  12/01/97  ...    MChen      Research Paper
5  ...       ...    Dean       Faculty
6  ...       ...    KHadd      ...

Sequential File Organization

Table ordered by Student ID Number
Student ID Number  Student Name
102-58-9762        Smith, Fred
204-78-7652        Baker, Jane
371-48-4133        Haddad, Kamal
450-22-9611        Chang, Minder
...                Rice, Jerry

Table ordered by Student (Last) Name
Student ID Number  Student Name
204-78-7652        Baker, Jane
450-22-9611        Chang, Minder
558-56-6749        Favre, Brett
371-48-4133        Haddad, Kamal
...                Rice, Jerry

Student Master Table ordered by Student ID Number
Student ID Number  Student Name
102-58-9762        Smith, Fred
204-78-7652        Baker, Jane
371-48-4133        Haddad, Kamal
450-22-9611        Chang, Minder
...                Rice, Jerry

Insertion of new records in a Sequential

Insert new students:
298-73-0912  Jackson, Janet
557-93-8247  Carey, Mariah

NEW Student Master Table ordered by Student ID Number
Student ID Number  Student Name
102-58-9762        Smith, Fred
204-78-7652        Baker, Jane
298-73-0912        Jackson, Janet
371-48-4133        Haddad, Kamal
450-22-9611        Chang, Minder
...                Rice, Jerry
557-93-8247        Carey, Mariah
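Inserting into a sequentially organized table means each new record has to land in its sorted position rather than at the end. A minimal Python sketch of that idea, using the standard bisect module and an in-memory list as a stand-in for the file, is shown below. The variable names are hypothetical, and the record for Rice, Jerry is left out because its ID is not legible in the source.

```python
from bisect import insort

# Student Master Table kept in Student ID order (values as read from the slide).
master = [
    ("102-58-9762", "Smith, Fred"),
    ("204-78-7652", "Baker, Jane"),
    ("371-48-4133", "Haddad, Kamal"),
    ("450-22-9611", "Chang, Minder"),
]

new_students = [
    ("557-93-8247", "Carey, Mariah"),
    ("298-73-0912", "Jackson, Janet"),
]

# insort places each record at the position that keeps the list sorted.
# Tuples compare on their first element (the ID), and IDs in a fixed
# NNN-NN-NNNN format sort correctly as plain strings.
for record in new_students:
    insort(master, record)

ids = [student_id for student_id, _ in master]
print(ids)
```

With a real sequential file, the same insertion usually requires rewriting the records that follow the insertion point, which is the cost that indexed organization tries to avoid.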

A discussion of the Direct (Relative) Table Organization Method is in the text

Conceptual Model of an Index Table Organization

Student ID # Index
Student ID #  Pointer
102-58-9762   4
204-78-7652   6
298-73-0912   3
371-48-4133   1
450-22-9611   8

Student Master Table
Record  Student ID #  Student Name    Etc...
1       371-48-4133   Haddad, Kamal
2       557-93-8247   Carey, Mariah
3       298-73-0912   Jackson, Janet
4       102-58-9762   Smith, Fred
5       558-56-6749   Favre, Brett
6       ...
7       ...
8       ...

Note: This Table will normally have dozens of attributes.
1. Search Student Index Table to find Student ID Number.
2. Get Pointer Value and access that record in Student Master Table to ...
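The two-step lookup (search the index, then follow the pointer) can be sketched in Python as follows. This is only a conceptual illustration: INDEX, MASTER, and find_student are hypothetical names, the index entries and the first five master records are the values legible on the slide, and a dictionary stands in for direct access by record number.

```python
from bisect import bisect_left

# Index: (student ID, record number) pairs kept sorted by student ID.
INDEX = [
    ("102-58-9762", 4),
    ("204-78-7652", 6),
    ("298-73-0912", 3),
    ("371-48-4133", 1),
    ("450-22-9611", 8),
]

# Master table in arrival order: record number -> (student ID, name).
# Records 6 to 8 are not legible on the slide, so they are omitted here.
MASTER = {
    1: ("371-48-4133", "Haddad, Kamal"),
    2: ("557-93-8247", "Carey, Mariah"),
    3: ("298-73-0912", "Jackson, Janet"),
    4: ("102-58-9762", "Smith, Fred"),
    5: ("558-56-6749", "Favre, Brett"),
}


def find_student(student_id):
    """Step 1: binary-search the index. Step 2: follow the pointer."""
    keys = [key for key, _ in INDEX]
    pos = bisect_left(keys, student_id)
    if pos == len(keys) or keys[pos] != student_id:
        return None  # the ID is not in the index
    pointer = INDEX[pos][1]
    # MASTER.get returns None when the pointed-to record is not in this sketch.
    return pointer, MASTER.get(pointer)


print(find_student("298-73-0912"))
```

The master table itself never has to be sorted; only the much smaller index is kept in order.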

Relational Database Normalization

“The process of simplifying complex data structures so that the resulting data structures will be more easily maintained and more flexible to meet present and future needs of the user.” (Norman, 1996)

Relational Database Normalization
“… data analysis uses a procedure called normalization to simplify entities, eliminate redundancy, and build flexibility into the data model.” (Whitten, 1989)

Why Normalization?
• Find entities (tables)
• Avoid anomalies

Sample Data

Deletion Anomalies
• Deletion anomalies: A value for one attribute is unexpectedly lost when a value for another attribute is deleted.
• E.g. deleting row 3 results in the ‘loss’ of the CS major

Update Anomalies
• Update anomalies: To change the value of a single attribute, multiple rows of a table must be changed.
• E.g. Rows 4-6 must be changed to accommodate a name change for ‘Mary’.

Insert Anomalies
• Insert anomalies: A value for an attribute needs to be stored but cannot be, because the value for another attribute is unknown.
• E.g. cannot add a complete record for ‘Ron’ until he completes a class and receives a grade!
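The three anomalies can be reproduced on a small single-table example. The rows below are hypothetical (they are not the slide's sample data, which is not reproduced in this text), and the helper names are made up for the illustration.

```python
# Hypothetical single-table data: every row repeats the student's major.
FLAT = [
    {"student": "Mary", "major": "MIS", "course": "IDS306", "grade": "A"},
    {"student": "Mary", "major": "MIS", "course": "IDS315", "grade": "B"},
    {"student": "Ann", "major": "CS", "course": "IDS250", "grade": "A"},
]
COLUMNS = ("student", "major", "course", "grade")


def majors(rows):
    """Every major the table currently 'knows about'."""
    return {row["major"] for row in rows}


def delete_enrollment(rows, student, course):
    """Remove one enrollment row and return the remaining rows."""
    return [
        row for row in rows
        if not (row["student"] == student and row["course"] == course)
    ]


def rows_to_update_for_rename(rows, student):
    """How many rows must change if this student's name changes."""
    return sum(1 for row in rows if row["student"] == student)


def can_insert(row):
    """The single table needs a value for every attribute of a row."""
    return all(row.get(column) is not None for column in COLUMNS)


# Deletion anomaly: dropping Ann's only enrollment also loses the CS major.
after_delete = delete_enrollment(FLAT, "Ann", "IDS250")
lost_majors = majors(FLAT) - majors(after_delete)

# Update anomaly: renaming Mary touches every one of her rows.
rename_cost = rows_to_update_for_rename(FLAT, "Mary")

# Insert anomaly: Ron has a major but no course or grade yet.
ron_ok = can_insert({"student": "Ron", "major": "MIS", "course": None, "grade": None})

print(lost_majors, rename_cost, ron_ok)
```

Splitting the data so that each fact is stored once (students and majors in one table, enrollments in another) removes all three effects.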

E. F. Codd
• Each attribute is dependent on the key, the whole key, and nothing but the key, … so help me Codd

ABC Incorporated (blank sales order form)

Order Number:                Order Date:
Customer Number:
Customer Name:
Street Address:
City:                State:                Zip Code:

Lines 1-7, one per product ordered:
Product Number  Product Name  Color  Unit Price  Quantity  Total Price

Come to ABC Incorporated for all your technology needs. Thank you for your patronage. You are a valued customer.

ORDER TOTAL
SALES TAX
SHIPPING
GRAND TOTAL

Relational Database Normalization

Unnormalized Data
  1. Remove attributes that can have multiple values
Data Structure in First Normal Form
  2. Remove non-key attributes that are not fully, functionally dependent on all attributes in the key
Data Structure in Second Normal Form
  3. Remove attributes that are uniquely identified by another non-key attribute
Data Structure in Third Normal Form

Boyce-Codd NF, 4th Normal Form, 5th Normal Form

Sales Order Class with

SalesOrder
  orderNumber (primary key)
  orderDate
  customerNumber
  customerName
  customerAddress
  customerCity
  customerState
  customerZipcode
  For each product ordered (up to 7):
    productNumber
    productName
    productColor
    productUnitPrice
    productQuantity
    productTotalPrice (derived)
  orderTotal (derived)
  orderTax (derived)
  services

SalesOrder and ProductsOrdered Classes with Objects in First N.F.

1. Remove attributes that can have multiple values

SalesOrder
  orderNumber (primary key)
  orderDate
  customerNumber
  customerName
  customerAddress
  customerCity
  customerState
  customerZipcode
  orderTotal (derived)
  orderTax (derived)
  orderDelivery (derived)
  services

ProductsOrdered
  orderNumber (primary key)
  productNumber (primary key)
  productName
  productColor
  productUnitPrice
  productQuantity
  services

Connection: SalesOrder (1) to ProductsOrdered (1,7)

ABC Incorporated (completed sales order form)

Order Number: 34820          Order Date: 12/02/97
Customer Number: 534
Customer Name: Norman Business Systems, Inc.
Street Address: 7150 University Blvd., Suite 218
City: San Diego      State: CA      Zip Code: 92108

   Product Number  Product Name         Color  Unit Price  Quantity  Total Price
1  IC-PENT         Intel Pentium CPU    Bn     $675        1         $675
2  PS-220          220 V. Power Supply  Sl     $150        1         $150
3  KB-102          102-key Keyboard     Tn     $75         1         $75
4  MO-675          ...                  Tn     $65         2         $130
5  HD-550          ...                  Sl     $325        1         $325
6
7

Come to ABC Incorporated for all your technology needs. Thank you for your patronage. You are a valued customer.

ORDER TOTAL  $1,355
SALES TAX    $95
SHIPPING     $25
GRAND TOTAL  $1,475

Sample Objects for SalesOrder and ProductsOrdered

SalesOrder object
  orderNumber (primary key): 34820
  orderDate: 12/02/97
  customerNumber: 534
  customerName: Norman Business Systems
  customerAddress: 7150 University Ave., Suite 218
  customerCity: San Diego
  customerState: CA
  customerZipcode: 92108
  orderTotal (derived): 1355
  orderTax (derived): 95
  orderDelivery (derived): 25

ProductsOrdered objects (1 SalesOrder to 5 ProductsOrdered)
orderNumber    productNumber  productName        productColor  productUnitPrice  productQuantity
(primary key)  (primary key)
34820          IC-PENT        Intel Pentium CPU  Bn            675               1
34820          PS-220         etc...             Sl            150               1
34820          KB-102         etc...             Tn            75                1
34820          MO-675         etc...             Tn            65                2
34820          HD-550         etc...             Sl            325               1
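The derived attributes on the sample order can be recomputed from the stored ones. The short Python sketch below does this for order 34820 using the values on the slides; the variable and function names are hypothetical, and the tax and delivery amounts are simply taken as given rather than calculated from a rate.

```python
# (productNumber, productUnitPrice, productQuantity) for order 34820.
LINES = [
    ("IC-PENT", 675, 1),
    ("PS-220", 150, 1),
    ("KB-102", 75, 1),
    ("MO-675", 65, 2),
    ("HD-550", 325, 1),
]
ORDER_TAX = 95       # taken from the form as given
ORDER_DELIVERY = 25  # taken from the form as given


def line_total(unit_price, quantity):
    """productTotalPrice is derived: unit price times quantity."""
    return unit_price * quantity


# orderTotal is derived: the sum of the product line totals.
order_total = sum(line_total(price, qty) for _, price, qty in LINES)

# The grand total on the form adds tax and delivery to the order total.
grand_total = order_total + ORDER_TAX + ORDER_DELIVERY

print(order_total, grand_total)
```

The results match the ORDER TOTAL ($1,355) and GRAND TOTAL ($1,475) on the completed form, which is why such attributes are marked "(derived)": they can always be recalculated.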

Sample ProductsOrdered Objects for Several SalesOrders

ProductsOrdered
orderNumber    productNumber  productName          productColor  productUnitPrice  productQuantity
(primary key)  (primary key)
34820          IC-PENT        Intel Pentium CPU    Bn            675               1
34820          PS-220         etc...               Sl            150               1
34820          KB-102         etc...               Tn            75                1
34820          MO-675         etc...               Tn            65                2
34820          HD-550         etc...               Sl            325               1
(continued)
34821          IC-80486       Intel 80486 CPU      Bn            325               10
34821          PS-220         220 V. Power Supply  Sl            150               3
34822          KB-102         102-key Keyboard     Tn            75                4
34823          IC-80486       Intel 80486 CPU      Bn            325               2
34823          HD-550         etc...               Sl            325               3

Sales Order Data Structure in Second Normal Form

2. Remove non-key attributes that are not fully, functionally dependent on all attributes in the key

SalesOrder
  orderNumber (primary key)
  orderDate
  customerNumber
  customerName
  customerAddress
  customerCity
  customerState
  customerZipcode
  orderTotal (derived)
  orderTax (derived)
  orderDelivery (derived)
  services

ProductsOrdered
  orderNumber (primary key)
  productNumber (primary key)
  productUnitPrice
  productQuantity
  services

Product
  productNumber (primary key)
  productName
  productColor
  productUnitPrice
  services

Connections:
  SalesOrder (1) to ProductsOrdered (1,7)
  Product (1) to ProductsOrdered (0,m)

Sample Objects For Second Normal Form Sales

SalesOrder
  orderNumber (primary key)
  orderDate
  customerNumber
  customerName
  customerAddress
  customerCity
  customerState
  customerZipcode
  orderTotal (derived)
  orderTax (derived)
  orderDelivery (derived)
  services

Connection: SalesOrder (1) to ProductsOrdered (1,m)

ProductsOrdered
orderNumber    productNumber  productUnitPrice  productQuantity
(primary key)  (primary key)
34820          IC-PENT        675               1
etc. ....

Product
productNumber  productName          productColor  productUnitPrice
(primary key)
IC-PENT        Intel Pentium CPU    Bn            675
PS-220         220 V. Power Supply  Sl            150
KB-102         102-key Keyboard     Tn            75
MO-675         Mouse Serial         Tn            65
HD-550         550 MB HD            Sl            325

Sales Order Data Structure in Third Normal Form

3. Remove attributes that are uniquely identified by another non-key attribute

SalesOrder
  orderNumber (primary key)
  orderDate
  customerNumber
  orderTotal (derived)
  orderTax (derived)
  orderDelivery (derived)
  services

Customer
  customerNumber (primary key)
  customerName
  customerAddress
  customerCity
  customerState
  services

ProductsOrdered
  orderNumber (primary key)
  productNumber (primary key)
  productUnitPrice
  productQuantity
  services

Product
  productNumber (primary key)
  productName
  productColor
  productUnitPrice
  services

Connections:
  Customer (1) to SalesOrder (0,m)
  SalesOrder (1) to ProductsOrdered (1,m)
  Product (1) to ProductsOrdered (0,m)

SalesOrder
OrderNumber  OrderDate  CustomerNumber  OrderDelivery  OrderGrandTotal  OrderTotal (derived)  OrderTax (derived)
34820        12/02/95   534             25             1475             1355                  95
34821        12/02/95   871             15             7719             7200                  504

ProductsOrdered
OrderNumber  ProductNumber  ProductUnitPrice  ProductQuantity  ProductTotalPrice (derived)
34820        IC-PENT        675               1                675
34820        PS-220         150               1                150
34820        KB-102         75                1                75
34820        MO-675         65                2                130
34820        HD-550         325               1                325

Product
ProductNumber  ProductName          ProductColor  ProductUnitPrice
IC-PENT        Intel Pentium CPU    Bn            675
IC-80486       Intel 80486/DX4 CPU  Sl            325
HD-550         550 MB Hard Disk     Sl            325
HD-1GB         Hard Disk 1-GB       Sl            550
KB-102         102-key Keyboard     Tn            ...

Customer
CustomerNumber  CustomerName             CustomerAddress                  CustomerCity  CustState  CustomerZipcode
107             Chips ‘N Bits            824 E. Main Street               Pasadena      CA         92875
290             Computers 4 U            925 W. Broadway Avenue           Tucson        AZ         85721
534             Norman Business Systems  7150 University Ave., Suite 218  ...

Normalization Summary

Conversion to First Normal Form (Remove multi-valued attributes)
Conversion to Second Normal Form (Remove non-key attributes not fully, functionally dependent on all attributes in the key)
Conversion to Third Normal Form (Remove attributes uniquely identified by another non-key attribute (transitive dependencies))

[Diagram: attribute sets built from A, B, C, D, E, F at each conversion step, with primary keys and dependent attributes marked]
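The end result of the three conversions can be sketched as a small Python function that splits flat order rows into the four third-normal-form structures (Customer, SalesOrder, Product, ProductsOrdered). This is an illustrative sketch only: the function name to_third_normal_form and the dictionary layout are assumptions, and the input rows are a subset of the attributes and lines of sample order 34820.

```python
# Flat rows: order, customer, and product facts repeated on every line.
FLAT_ROWS = [
    {"orderNumber": 34820, "orderDate": "12/02/97", "customerNumber": 534,
     "customerName": "Norman Business Systems", "productNumber": "IC-PENT",
     "productName": "Intel Pentium CPU", "productUnitPrice": 675, "productQuantity": 1},
    {"orderNumber": 34820, "orderDate": "12/02/97", "customerNumber": 534,
     "customerName": "Norman Business Systems", "productNumber": "PS-220",
     "productName": "220 V. Power Supply", "productUnitPrice": 150, "productQuantity": 1},
    {"orderNumber": 34820, "orderDate": "12/02/97", "customerNumber": 534,
     "customerName": "Norman Business Systems", "productNumber": "KB-102",
     "productName": "102-key Keyboard", "productUnitPrice": 75, "productQuantity": 1},
]


def to_third_normal_form(rows):
    """Split flat rows into four tables, each keyed by its primary key."""
    customers, orders, products, products_ordered = {}, {}, {}, {}
    for row in rows:
        # Customer facts depend only on customerNumber (step 3).
        customers[row["customerNumber"]] = {"customerName": row["customerName"]}
        # Order facts depend only on orderNumber; the customer is a foreign key.
        orders[row["orderNumber"]] = {
            "orderDate": row["orderDate"],
            "customerNumber": row["customerNumber"],
        }
        # Product facts depend only on productNumber (step 2).
        products[row["productNumber"]] = {
            "productName": row["productName"],
            "productUnitPrice": row["productUnitPrice"],
        }
        # What remains depends on the whole composite key (step 1 split).
        products_ordered[(row["orderNumber"], row["productNumber"])] = {
            "productUnitPrice": row["productUnitPrice"],
            "productQuantity": row["productQuantity"],
        }
    return customers, orders, products, products_ordered


customers, orders, products, products_ordered = to_third_normal_form(FLAT_ROWS)
print(len(customers), len(orders), len(products), len(products_ordered))
```

As on the third-normal-form slides, productUnitPrice is kept in both Product and ProductsOrdered, so the price charged on an order is preserved even if the product's current price later changes.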

Normalization Example
Course Registration Record

Id _________            Year ________
Name __________         Term ______
Address ___________________          Class Level ___
        _____________________        Fees _______

Course Request List
Course  Title  Units  Grade
____________________________
____________________________
____________________________

Why Object-Oriented Database Management Systems?
•OODB supports new types of applications for which no relational, network, or hierarchical database system is well suited.
•Object-oriented languages are rapidly gaining acceptance, and OODB has proven able to support persistent data needs better than the conventional record-based database models (relational, network, and hierarchical).
•The majority of conceptual language-design work from object-oriented programming languages carries over easily to OODB.
•Information systems are becoming more and more rigorous and sophisticated.

Object-Oriented Data Model
Traditional Database Systems
•Persistence
•Sharing
•Query Language
•Transaction Processing

Semantic Data Model
•Aggregation
•Generalization

Object-Oriented Programming
•Complex objects
•Object identity
•Classes & Methods
•Encapsulation
•Inheritance
•Extensibility

Object-Oriented Data Model

Common Characteristics of an Object Data Model
•Supports the representation of complex objects
•Extensibility; allows the definition of new data types as well as operations that act on them
•Encapsulation of data and methods
•Inheritance of data and methods from other objects
•Object identity

The Object-Oriented Database Management System Manifesto Rules
The system must:
1. Support complex objects
2. Support object identity
3. Allow objects to be encapsulated
4. Support types or classes
5. Support inheritance
6. Avoid premature binding
7. Be computationally complete
8. Be extensible
9. Be able to remember data locations
10. Be able to manage very large databases
11. Accept concurrent users
12. Be able to recover from hardware/software failures
13. Support data query in a simple way

Strengths and Weaknesses of an OODB
Strengths
1. Data Modeling
2. Non-homogenous data
3. Variable length and long strings
4. Complex objects
5. Version control
6. Schema evolution
7. Equivalent objects
8. Long transactions
9. User Benefits

Weaknesses
1. New problem solving approach
2. Lack of a common data model with a strong theoretical foundation
3. Limited success stories