
Database Management System
Walkthrough . . . .
Hierarchy of Data
Traditional Approach & Drawbacks
Database Approach
Data Models
Emerging Database Trends
Entity Relationship Modelling
Normalization
Overview of SQL

2
Hierarchy of Data

Bit - A bit represents a circuit that is either on or off (1/0)
Byte - A byte is usually 8 bits, representing a character
Field - A collection of related characters is a field
Record - A collection of related fields is a record
(Ex: 05BS-0045 Amit Kumar Singh)
File - A collection of related records is a file
(05BS-3567 Avinash Singh
05BS-4456 Rajan Khetarpal…)
Database - A collection of integrated and related files is a
database
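The hierarchy can be sketched in a few lines of Python. This is only an illustration: the field names `roll_no` and `name` are assumptions chosen to fit the sample records above.

```python
# Bit -> byte -> field -> record -> file, using the sample data above.
# The field names roll_no and name are assumed for illustration.

char = "A"
byte_value = ord(char)              # one character stored as one byte
bits = format(byte_value, "08b")    # the eight bits of that byte

# A field is a group of related characters; a record groups related fields.
record_1 = {"roll_no": "05BS-0045", "name": "Amit Kumar Singh"}
record_2 = {"roll_no": "05BS-3567", "name": "Avinash Singh"}

# A file is a collection of related records.
student_file = [record_1, record_2]

print(bits)               # 01000001
print(len(student_file))  # 2
```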
3
Traditional Approach
One or more data files created and used for every
application.
Ex : Invoicing program would have files on
customers and inventory items being shipped
Possible to have the same data in several files used
by different applications.

4
Drawbacks of Traditional Approach
Increased Data Redundancy resulting from the
duplication of data
Lack of Data Integrity resulting from the duplication
of data
Program-Data-Dependence
Data developed and organized for one application are
incompatible with data from another application
Difficult to modify and update application programs and data
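A minimal sketch of how duplicated data leads to lost integrity. The two "files" are modelled as Python dictionaries, and every name and value in them is hypothetical.

```python
# Two applications each keep their own copy of the same customer data.
# All identifiers and values here are hypothetical.
invoicing_file = {"C001": {"name": "Amit Kumar Singh", "city": "Delhi"}}
shipping_file = {"C001": {"name": "Amit Kumar Singh", "city": "Delhi"}}

# The customer moves; only the invoicing application is updated.
invoicing_file["C001"]["city"] = "Mumbai"

# The duplicated data now disagrees: redundancy has become inconsistency.
consistent = invoicing_file["C001"]["city"] == shipping_file["C001"]["city"]
print(consistent)  # False
```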

5
Database Approach to Data Management
Separate data files are reorganized into a pool of
related data so that each data element is stored and
maintained only in one location.
Related data is shared by multiple application
programs.
Database Management System (DBMS) is required.

6
Database Management System
A DBMS is a data storage and retrieval system
which permits data to be stored non-redundantly
while making it appear to the user as if the data is
well-integrated
Collection of interrelated data
Set of programs to access the data
DBMS contains information about a particular
enterprise
DBMS provides an environment that is both
convenient and efficient to use

7
Database Management System
[Figure: Applications #1, #2 and #3 each access, through the DBMS, a database containing centralized shared data]
DBMS manages data resources like an operating
system manages hardware resources

8
Basic Concepts
♦File
­ A set of related records
♦Record
­ A collection of data about an individual item
(Ex: Name: Barry Harris, College: Medicine, Tel: 392-5555)
♦Field
­ A single item of data common to all records
(Ex: Name: Barry Harris)

9
An Example of a Database

NAME      GL         PHONE      COLLEGE
Graff     rgraff     392-3900   Pharmacy
Harris    bharris    392-5555   Medicine
Ipswich   zipswich   846-5656   Health Professions

Each row is a record; each column is a field.

10
Tables and Relationships

• Person takes many Surveys
• Surveys consist of many Questions

Demographics: Name, GL, Phone, College
Surveys: Survey, Supervisor, Peers, GL
Questions: Leadership, Likeable, Fair, Survey

11
Basic Design Rules

Unique records
Unique fields
Independent fields
No calculated or derived fields
Data is broken down into smallest
logical parts

12
Benefits of Database Approach
Data can be shared
Reduced redundancy of data (Storage)
Inconsistency can be avoided (Updates)
Improved data integrity (Correctness/Constraints)
Increased security
Conflicting requirements can be balanced
Standards can be enforced
Increased user productivity

13
Drawbacks of Database Approach
Relatively high cost of purchasing
Requires specialized staff to implement and
coordinate the use of the database
Increased vulnerability due to failure of the DBMS
or access to the entire database by an intruder

14
Data Modeling
Content - What data should be collected?
Access - What data should be given to what users?
Logical structure - How will the data be organized to
make sense to a particular user?
Physical organization - Where will the data actually
be located?

15
The 3 Schema architecture for a database system

External Level: Users
Logical Level: Database Administrators => What Data and Relationships
Physical Level => How Data are Stored
16
Levels of Abstraction
Physical/Internal level: describes how a record (e.g.,
customer) is stored.
Logical/Conceptual level: describes the data stored in the
database, and the relationships among the data.
type customer = record
name : string;
street : string;
city : integer;
end;
External/View level: application programs hide details
of data types. Views can also hide information (e.g.,
salary) for security purposes.
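A view that hides salary can be sketched with Python's built-in `sqlite3` module. The `employee` table, its columns, the sample row and the view name `employee_public` are all assumptions made for this illustration.

```python
import sqlite3

# Hypothetical table and data; only the idea of a salary-hiding view
# comes from the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, city TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES ('Rajan', 'Delhi', 50000)")

# External/view level: expose only the non-sensitive columns.
conn.execute("CREATE VIEW employee_public AS SELECT name, city FROM employee")

cursor = conn.execute("SELECT * FROM employee_public")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)  # ['name', 'city']
print(view_rows)     # [('Rajan', 'Delhi')]
```

An application that queries only `employee_public` never sees the salary column.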
17
Data independence
A user of a relational database system should be able to use SQL to
query the database without knowing about how precisely data is stored,
e.g.

SELECT name, address
FROM Student
WHERE name = 'Rajan';

After all, you don't worry much how numbers are stored
when you program some arithmetic or use a computer-based
calculator.

18
More on data independence

Logical data independence protects the user from changes
in the logical structure of the data -- could completely
reorganize the calendar “schema” without changing how I query it.
Physical data independence protects the user from changes
in the physical structure of data: could add an index on
Who without changing how the user would write the query,
but the query would execute faster (query optimization).
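Physical data independence can be tried out with Python's built-in `sqlite3` module. This sketch uses the Student query from the data independence slide rather than the calendar example; the sample rows and the index name `idx_student_name` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [("Rajan", "12 Park Road"), ("Avinash", "7 Lake View")],  # hypothetical rows
)

query = "SELECT name, address FROM Student WHERE name = 'Rajan'"
before = conn.execute(query).fetchall()

# Physical change only: add an index. The query text is untouched.
conn.execute("CREATE INDEX idx_student_name ON Student (name)")
after = conn.execute(query).fetchall()

print(before == after)  # True
```

The result is the same before and after; only the way the DBMS finds the rows may change.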

19
Managing Data Files
Character
A basic unit of data (e.g., a letter or a digit)
Field
A collection of related characters (e.g., customer name)
Record
A collection of related data fields
File
A collection of related records

20
Data Models
A collection of conceptual tools for describing data, data
relationships, etc.
3 categories defined
Object Based Logical Model
Entity Relationship Model
Record Based Logical Model
Relational Model
Network Model
Hierarchical Model
Physical Models
21
Entity-Relationship Modeling
Entity – object, event or agent about which data are
collected.
Attribute – item of data that characterizes an entity
Composite attributes – consist of several sub-attributes
Attributes must be sufficient to uniquely identify every
entity in a database
Key attribute – has unique value for every entity

22
Developing Model Representations for
Entities and Attributes

FIGURE 3.9b
23
Relationships
Relationship – association between entities
Strategy for identifying entity relationships affecting
the logical design of a database
Consider existing and desired information
requirements of users
Evaluate entity pairs to improve attribute
descriptions of entities
Evaluate each entity to identify recursive
relationships among entities

24
Modeling Relationship Types

FIGURE 3.11
25
Entity-Relationship Model

An entity is an object that exists and is distinguishable from other
objects.
Example: specific person, place, company, object,
event, concept (often corresponding to a row in a table)

Entities have attributes: an attribute is a property or characteristic
of an entity type (often corresponding to a field in a table)
Example: people have names and addresses

26
Entity Sets
An entity set is a set of entities of the same type that share the same properties.
(Entity Type: a collection of entities, often corresponding to a table)
Example: set of all persons, companies, trees, holidays
Entity Instance – A single occurrence of an entity type.

Relationship instance – link between entities
Relationship type – category of relationship; link between entity types

A database can be modeled as:
a collection of entities,
relationships among entities

27
Attribute - property or characteristic of an entity type
An entity is represented by a set of attributes, that is, descriptive properties
possessed by all members of an entity set.

Example: customer = (customer-id, customer-name,
customer-street, customer-city)
loan = (loan-number, amount)
Domain – the set of permitted values for each attribute
Attribute types:
Simple and composite attributes.
Single-valued and multi-valued attributes
● E.g. multivalued attribute: phone-numbers
Derived attributes
● Can be computed from other attributes
● E.g. age, given date of birth
Identifier Attributes (Keys)
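The attribute types above can be sketched with a plain Python dictionary. The phone numbers and date of birth are hypothetical, and the helper `age` is an assumed name for this illustration.

```python
from datetime import date

customer = {
    "customer_id": "192-83-7465",               # identifier attribute (key)
    "customer_name": "Johnson",                 # simple, single-valued
    "phone_numbers": ["392-5555", "846-5656"],  # multivalued (hypothetical)
    "date_of_birth": date(1980, 6, 15),         # stored (hypothetical)
}

def age(date_of_birth, today):
    """Derived attribute: computed from date_of_birth instead of being stored."""
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)

print(age(customer["date_of_birth"], date(2024, 6, 14)))  # 43
print(age(customer["date_of_birth"], date(2024, 6, 15)))  # 44
```

Because age is computed on demand, it cannot drift out of step with the stored date of birth.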

28
[Figure: E-R notation legend showing entity symbols, attribute symbols, relationship symbols, and a special entity that is also a relationship]
Relationship degrees specify number of entity types involved
Relationship cardinalities specify how many of each entity type is allowed
29
E-R Diagrams

• Rectangles represent entity sets.
• Diamonds represent relationship sets.
• Lines link attributes to entity sets and entity sets to relationship sets.
• Ellipses represent attributes.
• Double ellipses represent multivalued attributes.
• Dashed ellipses denote derived attributes.
• Underline indicates primary key attributes.
30
Figure 3-4 Inappropriate entities

System user System output

Appropriate entities

31
A composite attribute

An attribute
broken into
component parts

32
Simple key attribute

The key is underlined

33
Composite key attribute

The key is composed of two subparts

34
An attribute that is both multivalued and composite

This is an example of time-stamping

35
Entity with a multivalued attribute (Skill) and derived
attribute (Years_Employed)

What’s wrong with this?

Multivalued: an employee can have more than one skill
Derived: from date employed and current date

36
What Should an Entity Be?
SHOULD BE:
An object that will have many instances in the
database
An object that will be composed of multiple
attributes
An object that we are trying to model
SHOULD NOT BE:
A user of the database system
An output of the database system (e.g. a report)
37
Relational database
Most common
Data is organized in a collection of tables (relations)
Rows – tuples (records)
Columns – attributes (fields)

38
Example of a Relation (EMPLOYEE) and Its Parts

FIGURE 3.10
39
Relational Model
Attributes
Example of tabular data in the relational model
customer-id   customer-name   customer-street   customer-city   account-number
192-83-7465   Johnson         Alma              Palo Alto       A-101
019-28-3746   Smith           North             Rye             A-215
192-83-7465   Johnson         Alma              Palo Alto       A-201
321-12-3123   Jones           Main              Harrison        A-217
019-28-3746   Smith           North             Rye             A-201
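The table above can be loaded into an in-memory SQLite database as a quick experiment. This is a sketch: the table name `customer` is an assumption, and the column names use underscores because hyphens are not valid in unquoted SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id TEXT, customer_name TEXT, customer_street TEXT,
        customer_city TEXT, account_number TEXT)
""")
rows = [
    ("192-83-7465", "Johnson", "Alma", "Palo Alto", "A-101"),
    ("019-28-3746", "Smith", "North", "Rye", "A-215"),
    ("192-83-7465", "Johnson", "Alma", "Palo Alto", "A-201"),
    ("321-12-3123", "Jones", "Main", "Harrison", "A-217"),
    ("019-28-3746", "Smith", "North", "Rye", "A-201"),
]
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?, ?)", rows)

# Each row is a tuple; each column is an attribute.
holders = conn.execute(
    "SELECT customer_name FROM customer "
    "WHERE account_number = 'A-201' ORDER BY customer_name"
).fetchall()
print(holders)  # [('Johnson',), ('Smith',)]
```

Note that Johnson's and Smith's street and city are stored twice, once per account.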

40
A Sample Relational Database

41
The Relational Model
Views entities as two-dimensional tables
Records are rows
Attributes are columns
Tables can be linked
Supports one-to-many, many-to-many, and one-to-one
relationships

42
Associations
Relationships among the entities in the data structures
Three types
One-to-one
One-to-many
Many-to-many
Relationships set by placing primary key from one
table as foreign key in another
Creates “acceptable” redundancy
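A one-to-many relationship set up through a foreign key might look as follows in SQLite. This is a sketch: the table layout follows the Person/Surveys example only loosely, and the survey values `S1` and `S2` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Primary key of demographics (gl) is placed in surveys as a foreign key.
conn.execute("CREATE TABLE demographics (gl TEXT PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE surveys (
        survey TEXT,
        gl TEXT REFERENCES demographics (gl))
""")
conn.execute("INSERT INTO demographics VALUES ('bharris', 'Harris')")
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?)",
    [("S1", "bharris"), ("S2", "bharris")],  # hypothetical surveys
)

# A survey pointing at a person who does not exist is rejected.
try:
    conn.execute("INSERT INTO surveys VALUES ('S3', 'nobody')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute(
    "SELECT COUNT(*) FROM surveys WHERE gl = 'bharris'"
).fetchone()[0]
print(rejected, count)  # True 2
```

The repeated `gl` value in `surveys` is the "acceptable" redundancy: it is the link between the two tables.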

43
Cardinality of Relationships
One-to-One
Each entity in the relationship will have exactly one
related entity
One-to-Many
An entity on one side of the relationship can have
many related entities, but an entity on the other side
will have a maximum of one related entity
Many-to-Many
Entities on both sides of the relationship can have
many related entities on the other side
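The three cardinalities can be told apart by counting related entities on each side. The function `cardinality` below is a hypothetical helper; it classifies the pairs it is given, which is not the same as the rule a designer intends for the schema.

```python
from collections import Counter

def cardinality(pairs):
    """Classify a relationship from observed (left, right) entity pairs."""
    pairs = set(pairs)
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    left_many = max(left_counts.values(), default=0) > 1    # a left entity has many rights
    right_many = max(right_counts.values(), default=0) > 1  # a right entity has many lefts
    if left_many and right_many:
        return "many-to-many"
    if left_many or right_many:
        return "one-to-many"
    return "one-to-one"

# Customer-to-account pairs taken from the relational model example.
deposits = [
    ("192-83-7465", "A-101"), ("019-28-3746", "A-215"),
    ("192-83-7465", "A-201"), ("321-12-3123", "A-217"),
    ("019-28-3746", "A-201"),
]
print(cardinality(deposits))                                # many-to-many
print(cardinality([("bharris", "S1"), ("bharris", "S2")]))  # one-to-many
print(cardinality([("p1", "passport1"), ("p2", "passport2")]))  # one-to-one
```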

44
45
46
Sample E-R Diagram

47
Identifiers (Keys)
Identifier (Key) - An attribute (or combination of
attributes) that uniquely identifies individual
instances of an entity type
Simple Key versus Composite Key
• A super key of an entity set is a set of one or
more attributes whose values uniquely determine each
entity.
• A candidate key of an entity set is a minimal super key
  ◦ customer-id is candidate key of customer
  ◦ account-number is candidate key of account
• Although several candidate keys may exist, one of the
candidate keys is selected to be the primary key.
48
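The difference between a super key and a candidate key can be checked on sample rows. This is a sketch: `is_superkey` and `is_candidate_key` are hypothetical helpers, and the fourth row is invented so that names repeat. Sample data can show that an attribute set is not a key, but whether it is a key in general is a design decision.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if the values of attrs are different for every row."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(rows, attrs):
    """True if attrs is a super key and no proper subset of it is one."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

rows = [
    {"customer_id": "192-83-7465", "customer_name": "Johnson"},
    {"customer_id": "019-28-3746", "customer_name": "Smith"},
    {"customer_id": "321-12-3123", "customer_name": "Jones"},
    {"customer_id": "000-00-0000", "customer_name": "Smith"},  # hypothetical row
]

print(is_superkey(rows, ["customer_id", "customer_name"]))       # True
print(is_candidate_key(rows, ["customer_id", "customer_name"]))  # False (not minimal)
print(is_candidate_key(rows, ["customer_id"]))                   # True
print(is_superkey(rows, ["customer_name"]))                      # False
```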
Normalization

49
Normalization
Process of efficiently organizing data in a
database
Goals of normalization:
Eliminate redundancy
Ensure data dependency makes sense

Advantages:
Complex databases are made simpler
Reduces the amount of space utilized
Ensures data is logically stored.
50
Normalization
Defined as: the process of restructuring a relation
(table) to reduce it to a form where each domain
consists of single, non-composite values.

It is a step-by-step procedure, and the steps are called
‘FORMS’.
Normal forms are numbered from the lowest, 1st,
through 5th.
Normal forms (NFs) are table structures with minimum redundancy.

51
Contd…
The process of decomposing relations with
anomalies to produce smaller, well-structured
relations

General rule of thumb: a table should not
pertain to more than one entity type

52
Well Structured Relation
A relation that contains minimal data redundancy
and allows users to insert, delete, and update rows
without causing data inconsistencies
Goal is to avoid anomalies
Insertion Anomaly – adding new rows forces user to
create duplicate data
Deletion Anomaly – deleting rows may cause a loss of
data that would be needed for other future rows
Modification Anomaly – changing data in a row
forces changes to other rows because of duplication

53
Redundant Data

54
Anomalies in this Table
♦ Insertion – can’t enter a new instructor’s record,
without entering student information.
♦ Deletion – if we remove Hann’s record, we lose
information about the existence of a Database
class
♦ Modification – changing a course instructor
forces us to update multiple records

Why do these anomalies exist?
Because we’ve combined multiple themes (entity
types) into one relation. This results in duplication,
and an unnecessary dependency between the entities
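The deletion and modification anomalies can be reproduced on a small combined table. The table from the slide is not available here, so the rows below are hypothetical apart from the student Hann and the Database class.

```python
# One relation mixing two themes: students and courses.
# Every value except "Hann" and "Database" is hypothetical.
enrolment = [
    {"student": "Hann", "course": "Database", "instructor": "Instructor A"},
    {"student": "Lee", "course": "Networks", "instructor": "Instructor B"},
    {"student": "Kim", "course": "Networks", "instructor": "Instructor B"},
]

# Deletion anomaly: removing Hann's record also removes the only
# trace of the Database class.
remaining = [row for row in enrolment if row["student"] != "Hann"]
courses_left = {row["course"] for row in remaining}
print("Database" in courses_left)  # False

# Modification anomaly: changing the Networks instructor means
# touching every row for that course.
rows_to_update = sum(1 for row in enrolment if row["course"] == "Networks")
print(rows_to_update)  # 2
```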
55
Some Terms to Know :
Functional dependency – Attribute Y is
functionally dependent on attribute X if the
value of X determines the value of Y.
Ex: A student's phone number can be
obtained by looking up the name of the student.
Candidate Key
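A functional dependency can be tested against sample rows. The helper `holds` is hypothetical, and the last row is invented so that one dependency fails. A check like this can refute a dependency, but passing on a sample does not prove it holds in general.

```python
def holds(rows, x, y):
    """True if, within rows, each value of x goes with exactly one value of y."""
    seen = {}
    for row in rows:
        # setdefault records the first y seen for this x, then returns it.
        if seen.setdefault(row[x], row[y]) != row[y]:
            return False
    return True

rows = [
    {"name": "Graff", "phone": "392-3900", "college": "Pharmacy"},
    {"name": "Harris", "phone": "392-5555", "college": "Medicine"},
    {"name": "Ipswich", "phone": "846-5656", "college": "Health Professions"},
    {"name": "Jones", "phone": "392-0000", "college": "Medicine"},  # hypothetical row
]

print(holds(rows, "name", "phone"))    # True: name determines phone
print(holds(rows, "college", "name"))  # False: Medicine has two names
```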

56
First Normal Form
Defn: A table is said to be in 1st NF if and only if all the underlying domains
contain values that are not decomposable any further.
Remove all repeating groups.
Eliminate duplicate columns.
Identify each row with a unique column/set of columns (Primary Key)
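Removing a repeating group can be sketched as follows. The employee data is hypothetical; the idea of an employee with several skills comes from the multivalued attribute example.

```python
# Not in 1NF: the skills column holds a repeating group.
unnormalized = [
    {"emp_id": 1, "name": "Asha", "skills": ["SQL", "Python"]},
    {"emp_id": 2, "name": "Ravi", "skills": ["Java"]},
]

# 1NF: one atomic value per column, one row per (emp_id, skill).
first_nf = [
    {"emp_id": row["emp_id"], "name": row["name"], "skill": skill}
    for row in unnormalized
    for skill in row["skills"]
]

# The set of columns (emp_id, skill) identifies each row.
keys = [(row["emp_id"], row["skill"]) for row in first_nf]
print(len(first_nf))                # 3
print(len(set(keys)) == len(keys))  # True
```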


57
Second Normal Form
Defn: A table is said to be in 2nd NF when it is already
in 1st NF and every non-key field is fully dependent
on the primary key.

Relation should be in 1st NF.
Every non-key field is fully dependent on the PK.
Remove duplicate data, i.e. data that apply to multiple
rows of a table.
Place these data in a separate table.
Create a relationship between the new table and the predecessor
table using an FK.
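A 2NF decomposition can be sketched on a table whose key is (emp_id, skill). All data here is hypothetical. The name depends on emp_id alone, which is only part of the key, so it moves to its own table.

```python
# In 1NF with primary key (emp_id, skill); name depends only on emp_id.
rows = [
    {"emp_id": 1, "skill": "SQL", "name": "Asha"},
    {"emp_id": 1, "skill": "Python", "name": "Asha"},
    {"emp_id": 2, "skill": "Java", "name": "Ravi"},
]

# Separate table for the data that applied to multiple rows.
employee = {row["emp_id"]: row["name"] for row in rows}

# Predecessor table keeps emp_id as a foreign key to employee.
employee_skill = [(row["emp_id"], row["skill"]) for row in rows]

# Joining the two tables on emp_id gives back the original rows.
rejoined = [
    {"emp_id": emp_id, "skill": skill, "name": employee[emp_id]}
    for emp_id, skill in employee_skill
]
print(employee)          # {1: 'Asha', 2: 'Ravi'}
print(rejoined == rows)  # True
```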
58
Third Normal Form
Defn: A table is said to be in 3rd NF when it is in
2nd NF and every field which is not a key is
functionally dependent on just the primary key.

Relation should be in 2NF.
Remove columns not dependent on the PK.
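A 3NF decomposition can be sketched in the same way. The data is hypothetical: dept_city depends on dept, not directly on the key emp_id, so it moves to a department table.

```python
# In 2NF with primary key emp_id, but dept_city depends on dept.
rows = [
    {"emp_id": 1, "name": "Asha", "dept": "Sales", "dept_city": "Delhi"},
    {"emp_id": 2, "name": "Ravi", "dept": "Sales", "dept_city": "Delhi"},
    {"emp_id": 3, "name": "Meena", "dept": "HR", "dept_city": "Pune"},
]

# Keep only columns that depend on the primary key.
employee = [
    {"emp_id": r["emp_id"], "name": r["name"], "dept": r["dept"]} for r in rows
]
# The removed column goes to a table keyed by dept.
department = {r["dept"]: r["dept_city"] for r in rows}

# The city is now changed in one place instead of once per employee.
department["Sales"] = "Mumbai"
cities = {e["emp_id"]: department[e["dept"]] for e in employee}
print(cities)  # {1: 'Mumbai', 2: 'Mumbai', 3: 'Pune'}
```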

Fourth Normal Form
No multivalued dependency.

59
Steps in Normalization

60
Normalized Data

61
Popular Database Management Systems

Worldwide Database Market Share 2001 – Based on percentage of worldwide new licence revenue from DBMSs
62
Summary Slide
The Hierarchy of Data
The Traditional Approach to Data Management
Problems with the Traditional Approach to Data Management
The Database Approach to Data Management
Disadvantages of the Database Approach
How to Perform Data Analysis
Step 1: Define the needed fields
Step 2: Select the required entities
Step 3: Create a Relational Data model
Step 4: Normalize the Data Model
Parameters to Consider when Selecting a DBMS
Emerging Database Trends
Hands-On Microsoft Access 2000
63