
Principles of Database Design

NLM/MBL Medical Informatics

Session Outline
• Why learn this?
• Database Principles and Paradigms
• Principles of Relational Database Design
• System design and building methods
• Exercise: Transforming flat files to tables

Why Learn about Database Design?


• Vendors will sell you on user interfaces, but the power and flexibility are in the data model
• Evaluating and comparing products
• Communicating with vendors and IT support staff
• Building your own databases


What is a Database?
• An organized collection of information
    • Computer-based representation
    • Systematic, automated retrieval
    • Systematic, automated symbol manipulation


Historical Evolution of Databases


• Dedicated files created & maintained by application software (sequential, random access)
• Database Management Systems (DBMSs)


Hierarchical Databases
[Figure: hierarchy diagram with the nodes Lab Results, Serum Na+, 5/30/96, Pt=Smith]

Advantages: efficient storage and I/O, rapid access via predetermined data hierarchies
Disadvantages: difficult to view/retrieve data from other perspectives, hard to modify underlying structure

Information Network Databases


Database as Hypertext

Advantages: can model complex many-to-many relationships as well as hierarchies and simple lists
Disadvantages: difficult to predict & control effects of transitive relationships; recursion; I/O intensive; potential to become incomprehensible

Relational Databases
Rows & Columns with inter-table references

Patient
Pt-UI   Lname    Fname
12345   Smith    Elmer
12346   Jones    Barbara
12347   Clark    Arthur
12348   Jones    Casey
12349   Sample   Steve

Lab_test
Pt-UI   Testname   Date
12345   Serum_Na   5/30/96
42353   CBC        5/30/96
47756   ESR        5/30/96
12348   HBsAg      5/30/96
34523   Amylase    5/30/96

Advantages: understandable, permits a variety of logical aggregations or views of data elements, structure easily modifiable, new elements generally do not break existing programs
Disadvantages: I/O intensive, 1 logical record may = many physical records, relational integrity is a constant concern & must be under software control
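
The inter-table reference on this slide can be tried out directly. The sketch below assumes Python's built-in sqlite3 module and an in-memory database; the lowercase table and column names (pt_ui, lname, ...) are adaptations of the slide's headings, and the join query is an illustration, not part of the original session.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient  (pt_ui INTEGER PRIMARY KEY, lname TEXT, fname TEXT);
CREATE TABLE lab_test (pt_ui INTEGER, testname TEXT, date TEXT);
""")

# Rows copied from the two tables on the slide.
conn.executemany("INSERT INTO patient VALUES (?, ?, ?)", [
    (12345, "Smith", "Elmer"),
    (12346, "Jones", "Barbara"),
    (12347, "Clark", "Arthur"),
    (12348, "Jones", "Casey"),
    (12349, "Sample", "Steve"),
])
conn.executemany("INSERT INTO lab_test VALUES (?, ?, ?)", [
    (12345, "Serum_Na", "5/30/96"),
    (42353, "CBC", "5/30/96"),
    (47756, "ESR", "5/30/96"),
    (12348, "HBsAg", "5/30/96"),
    (34523, "Amylase", "5/30/96"),
])

# One logical record (patient + test) is assembled from two physical rows
# by matching the shared Pt-UI value.
rows = conn.execute("""
    SELECT p.lname, p.fname, t.testname, t.date
    FROM patient AS p
    JOIN lab_test AS t ON t.pt_ui = p.pt_ui
    ORDER BY p.pt_ui
""").fetchall()

for row in rows:
    print(row)
```

Only the tests whose Pt-UI also appears in Patient come back from the join. The other three Lab_test rows refer to patients that are not in the Patient table shown, which is the kind of situation the "relational integrity" disadvantage is about.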

Object-Oriented Databases
• Multiple data types including text, graphics, sound, signals, etc.
• Encapsulation of data & programs
• Interprocess messaging: e.g., "Print Yourself"

Advantages: application programs consist of high-level commands & functions which do not need to know the underlying data organization; modularity, reusability and portability between systems
Disadvantages: early in commercialization; CPU intensive; few standards for query & object sharing

Fundamental Assertions about Systems Design


• The Data Model is the most critical aspect of system design and function
• Data Models should reflect real-world objects and their relationships to ensure durability
• A correct Data Model subserves and outlasts applications, including many not anticipated at system start-up


Object-oriented Systems design: Basic Concepts




• The world contains things, e.g., Collies, Terriers, Bloodhounds
• We develop abstractions of things, called objects, e.g., dog
• We group objects by criteria, which represent the abstract object as an empty table

Dog Name   Breed   Favorite Food   Birthdate


Basic Concepts, cont'd


• Empty tables can be filled in to represent the real-world things from which the object was abstracted

Dog Name   Breed         Favorite Food   Birthdate
Boris      St. Bernard   Canned          Jan 81
Fifi       Poodle        Dry             May 92
Fido       Pomeranian    Canned          Apr 87


Basic Concepts, cont'd


• There are relationships between objects, which are attributes of those objects

Relationship: OWNS
Dog Owner OWNS Dogs

Dog Owner (Owner Name, Address, Phone)
License (Owner Name, Dog Name, Lic. Date)


Objects
• All of the real-world things in the set (the instances) have the same characteristics
• All instances conform to the same rules

So that...

LICENSE
License    Exp. Date   Manufacturer   Model
123 ABC    Jan. 97     Ford           Taurus
691XKY     Mar. 98     Honda          Prelude
12-A-962   Apr. 98     ?              Poodle

...you don't get holes in the table
...you don't get strange values
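
One way to have the database itself refuse holes and strange values is to declare the rules as constraints. The sketch below assumes Python's built-in sqlite3 module. The car_model lookup table and the license number 12-A-963 are hypothetical additions for illustration; the first three license rows come from the LICENSE table on this slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks references only when asked
conn.executescript("""
-- Hypothetical specification table listing the valid car models.
CREATE TABLE car_model (
    manufacturer TEXT NOT NULL,
    model        TEXT NOT NULL,
    PRIMARY KEY (manufacturer, model)
);
CREATE TABLE license (
    license      TEXT PRIMARY KEY,
    exp_date     TEXT NOT NULL,
    manufacturer TEXT NOT NULL,   -- NOT NULL: no holes
    model        TEXT NOT NULL,
    FOREIGN KEY (manufacturer, model)
        REFERENCES car_model (manufacturer, model)   -- no strange values
);
""")
conn.executemany("INSERT INTO car_model VALUES (?, ?)",
                 [("Ford", "Taurus"), ("Honda", "Prelude")])


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO license VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(("123 ABC", "Jan. 97", "Ford", "Taurus")),
    try_insert(("691XKY", "Mar. 98", "Honda", "Prelude")),
    try_insert(("12-A-962", "Apr. 98", None, "Poodle")),     # hole: no manufacturer
    try_insert(("12-A-963", "Apr. 98", "Honda", "Poodle")),  # strange value: not a car model
]
print(results)
```

The two well-formed rows are stored; the row with a missing manufacturer and the row with "Poodle" as a model are both rejected before they reach the table.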

Types of Objects (i.e., types of tables)


• Tangible Things, e.g., book
• Roles, e.g., doctor, patient, supervisor
• Incidents (= events, occurrences), e.g., ordering of a lab test
• Interactions (bind two or more other objects via a transaction), e.g., Purchase relates Buyer to Seller
• Specifications (definition tables of tangible things)

Table Notation
Empty Table form:
Patient_Admissions
Pt_ID   Date_Adm   Time_Adm   Unit   Room

Graphical Form:
Patient_Admissions
* Pt_ID
- Date_Adm
- Time_Adm
- Unit
- Room

Textual Form:
Patient_Admissions (Pt_ID, Date_Adm, Time_Adm, Unit, Room)


Formalisms for Tables


• Rule 1: One instance of an object has exactly one value for each attribute (i.e., only one data element at each row-column intersection; no repeating groups, no true holes in the table)
• Rule 2: Attributes must contain no internal structure

Not OK:
Name    Age-Sex
Smith   38-F
Jones   22-M
Clark   18-M

If Rules 1 and 2 are obeyed, the data model is in First Normal Form
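
Rule 2 can be illustrated by repairing the "Not OK" table. The sketch below is plain Python with no database; it assumes the combined value always has the form age, hyphen, sex, as in the three rows on the slide. The variable names are illustrative.

```python
# The "Not OK" rows from the slide: Age-Sex packs two facts into one column.
flat_rows = [("Smith", "38-F"), ("Jones", "22-M"), ("Clark", "18-M")]

# Split the structured attribute into two atomic attributes: Age and Sex.
normalized = []
for name, age_sex in flat_rows:
    age_text, sex = age_sex.split("-")
    normalized.append((name, int(age_text), sex))

print(normalized)

# With atomic columns, each attribute can be tested on its own.
males_over_20 = [name for name, age, sex in normalized if sex == "M" and age > 20]
print(males_over_20)
```

Against the original Age-Sex column, the same question ("males over 20") would require parsing every value at query time, which is the practical cost of internal structure in an attribute.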

Formalisms for Tables, cont'd


• Rule 3: Every attribute should represent a characteristic of the entire object, not a characteristic of a limited part of the object

Not OK:
Hospital Committee Membership
* Person Name
* Committee Name
- Date committee term expires
- Date first joined hospital staff   (attribute of hospital staff appointment, not committee)

OK:
Hospital Committee Membership
* Person Name
* Committee Name
- Date committee term expires
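
A sketch of where the displaced attribute could go: a separate table keyed by person alone. This assumes Python's built-in sqlite3 module. The staff_appointment table name, the sample person, the committees, and all dates are hypothetical and are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The date a person joined the staff depends on the person only.
CREATE TABLE staff_appointment (
    person_name       TEXT PRIMARY KEY,
    date_joined_staff TEXT NOT NULL
);
-- The term expiry depends on the whole key: person AND committee.
CREATE TABLE committee_membership (
    person_name    TEXT NOT NULL REFERENCES staff_appointment (person_name),
    committee_name TEXT NOT NULL,
    term_expires   TEXT NOT NULL,
    PRIMARY KEY (person_name, committee_name)
);
""")
conn.execute("INSERT INTO staff_appointment VALUES ('Smith', '1990-07-01')")
conn.executemany("INSERT INTO committee_membership VALUES (?, ?, ?)", [
    ("Smith", "Pharmacy", "1997-06-30"),
    ("Smith", "Library", "1998-06-30"),
])

# Correcting the staff date touches one row, however many committees Smith sits on.
cur = conn.execute(
    "UPDATE staff_appointment SET date_joined_staff = '1990-08-01' "
    "WHERE person_name = 'Smith'"
)
rows_changed = cur.rowcount

# The combined view is still available through a join.
report = conn.execute("""
    SELECT m.committee_name, m.term_expires, s.date_joined_staff
    FROM committee_membership AS m
    JOIN staff_appointment AS s USING (person_name)
    ORDER BY m.committee_name
""").fetchall()
print(rows_changed, report)
```

In the "Not OK" layout the same correction would have to be repeated on every membership row for that person, and a missed row would leave two conflicting staff dates.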


Relationships
• A relationship is the abstraction of a set of associations that hold systematically between different kinds of real-world things
    Patient OCCUPIES bed
    Library CONTAINS books
    Specimen IS ASSAYED by Lab Method
• Most relationships may be stated in the inverse also:
    Library LENDS book
    Book IS LENT BY Library

Relationship Types
One-to-One:
State has Governor
Governor governs State

One-to-Many:
Dog Owner owns Dog
Dog is owned by Dog Owner

Many-to-Many:
Author writes Book
Book is written by Author


Modeling Many-to-Many Relationships


DRUG
* generic name
- other attributes

DRUG MANUFACTURER
* manufacturer name
- other attributes

LICENSE
* manufacturer name
* generic name
- date licensed
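
The three tables on this slide can be sketched as follows, assuming Python's built-in sqlite3 module. The drug names, manufacturer names ("Maker A", "Maker B"), and dates are hypothetical sample data, and the two helper functions are illustrative names only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE drug              (generic_name TEXT PRIMARY KEY);
CREATE TABLE drug_manufacturer (manufacturer_name TEXT PRIMARY KEY);
-- LICENSE carries the keys of both parents; its own key is the pair.
CREATE TABLE license (
    manufacturer_name TEXT NOT NULL REFERENCES drug_manufacturer (manufacturer_name),
    generic_name      TEXT NOT NULL REFERENCES drug (generic_name),
    date_licensed     TEXT,
    PRIMARY KEY (manufacturer_name, generic_name)
);
""")
conn.executemany("INSERT INTO drug VALUES (?)", [("aspirin",), ("ibuprofen",)])
conn.executemany("INSERT INTO drug_manufacturer VALUES (?)",
                 [("Maker A",), ("Maker B",)])
conn.executemany("INSERT INTO license VALUES (?, ?, ?)", [
    ("Maker A", "aspirin", "1996-01-15"),
    ("Maker A", "ibuprofen", "1996-02-01"),
    ("Maker B", "aspirin", "1996-03-10"),
])


def makers_of(generic_name):
    """All manufacturers licensed for one drug."""
    query = ("SELECT manufacturer_name FROM license "
             "WHERE generic_name = ? ORDER BY manufacturer_name")
    return [row[0] for row in conn.execute(query, (generic_name,))]


def drugs_from(manufacturer_name):
    """All drugs one manufacturer is licensed for."""
    query = ("SELECT generic_name FROM license "
             "WHERE manufacturer_name = ? ORDER BY generic_name")
    return [row[0] for row in conn.execute(query, (manufacturer_name,))]


aspirin_makers = makers_of("aspirin")
maker_a_drugs = drugs_from("Maker A")
print(aspirin_makers, maker_a_drugs)
```

The many-to-many relationship becomes two one-to-many relationships that meet in LICENSE, and the relationship's own attribute (date licensed) has a natural home there.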


Overall System Design Process


• Build the Entity-Relationship diagram for all defined objects (tables), [including an Object Specification Document]
• [Create a State Transition Model which describes changes to objects based on events or transactions]
• [Create a Data Flow diagram which models the information elements which cause State Transitions]

[ ] = Recommended for multi-programmer projects

Exercise: Devise a Relational Model for MEDLINE citations

Sample MEDLINE citation

UI - 90134185
AU - Greenes RA ; Shortliffe EH
TI - Medical Informatics. An Emerging academic discipline and institutional priority
MH - Hospital Information Systems; Career Choice; Medical Informatics/EDUCATION/*TRENDS
PT - JOURNAL ARTICLE; REVIEW; TUTORIAL
EM - 9005
AB - Information management constitutes a major activity of the health care profession. Currently a number of forces are focusing attention on this function...
AD - Department of Radiology, Brigham and Women's Hosp., Boston, MA 02115
SO - JAMA 1990 Feb 23; 263(8):1114-20
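
One possible starting point for the exercise, not a model answer: the sketch below is plain Python that reads the tagged fields of the sample citation and separates single-valued fields from repeating ones. It assumes each field sits on its own line in "TAG - value" form and that AU, MH and PT are the repeating fields, separated by semicolons. The AB and AD fields are left out for brevity, and all variable names are illustrative.

```python
record = """\
UI - 90134185
AU - Greenes RA ; Shortliffe EH
TI - Medical Informatics. An Emerging academic discipline and institutional priority
MH - Hospital Information Systems; Career Choice; Medical Informatics/EDUCATION/*TRENDS
PT - JOURNAL ARTICLE; REVIEW; TUTORIAL
EM - 9005
SO - JAMA 1990 Feb 23; 263(8):1114-20
"""

# Read each "TAG - value" line into a dictionary.
fields = {}
for line in record.splitlines():
    tag, _, value = line.partition(" - ")
    fields[tag.strip()] = value.strip()

ui = fields["UI"]

# Single-valued fields form one row of a citation table, keyed by UI.
citation_row = (ui, fields["TI"], fields["EM"], fields["SO"])

# Repeating groups (Rule 1) each become rows of their own table,
# pointing back to the citation through UI.
child_rows = {}
for tag in ("AU", "MH", "PT"):
    values = []
    for part in fields[tag].split(";"):
        values.append((ui, part.strip()))
    child_rows[tag] = values

print(citation_row)
for tag, rows in child_rows.items():
    print(tag, rows)
```

Two things are deliberately left open for the exercise. The SO field contains a semicolon but is a single source statement, and it still has internal structure (journal, date, volume, issue, pages). The last MH value combines a heading, a subheading and a starred subheading, so Rule 2 suggests it needs further decomposition.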

The Bottom Line in Database Design


• The Data Model is the most critical aspect of system design and function
• Data Models should reflect real-world objects and their relationships to ensure durability
• A correct Data Model subserves and outlasts applications, including many not anticipated at system start-up


Questions?