
Unit 1: Understand the concept of DBMS & RDBMS


by,
Asmatullah Khan
CL/CP, GIOE, Secunderabad.
Outline

• Database system
• Advantages of database system
• Database abstraction
• Data models
• Instances and schemas
• Data independence
• Data definition language
• Data manipulation languages
• Database manager
• Database administrator and users
• Overall system structure
• E. F. Codd’s rules for RDBMS
• Entity and entity sets
• Relationship and relationship sets
• Super key, candidate key and primary key
• Mapping constraints
• Reducing ER diagrams to tables
• Generalization, specialization and aggregation
• Functional dependencies
• Normalization – 1st NF, 2nd NF, 3rd NF
Database Management System (DBMS)
• A database can be summarily described as a repository for data.
• A database management system (DBMS) is an aggregate of data,
hardware, software, and users that helps an enterprise manage its
operational data.
• DBMS contains information about a particular enterprise
▫ Collection of interrelated data
▫ Set of programs to access the data
▫ An environment that is both convenient and efficient to use.
• Database Applications:
▫ Banking: transactions
▫ Airlines: reservations, schedules
▫ Universities: registration, grades
▫ Sales: customers, products, purchases
▫ Online retailers: order tracking, customized recommendations
▫ Manufacturing: production, inventory, orders, supply chain
▫ Human resources: employee records, salaries, tax deductions
• Databases can be very large.
• Databases touch all aspects of our lives
Drawbacks of using file systems to store data

• Data redundancy and inconsistency


▫ Multiple file formats, duplication of information in
different files
 Since different programmers create the files and application
programs over a long period, the various files are likely to
have different formats and the programs may be written in
several programming languages.
 Moreover, the same information may be duplicated in several
places (files). For example, the address and telephone number
of a particular customer may appear in a file that consists of
savings-account records and in a file that consists of checking-
account records.
 This redundancy leads to higher storage and access cost. In
addition, it may lead to data inconsistency;
 that is, the various copies of the same data may no longer agree.
• Difficulty in accessing data
▫ Need to write a new program to carry out each new task
Drawbacks of using file systems to store data (Cont.)

• Data isolation
▫ Multiple files and formats
 Because data are scattered in various files, and files may be in
different formats, writing new application programs to retrieve
the appropriate data is difficult.
• Integrity problems
▫ Integrity constraints (e.g., account balance > 0) become
“buried” in program code rather than being stated explicitly
▫ Hard to add new constraints or change existing ones
 The data values stored in the database must satisfy certain
types of consistency constraints.
 For example, the balance of a bank account may never fall below a
prescribed amount (say, $25).
 Developers enforce these constraints in the system by adding
appropriate code in the various application programs.
 However, when new constraints are added, it is difficult to
change the programs to enforce them.
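For contrast, a database lets such a constraint be stated once, in the schema. The following is a minimal sketch in Python using the standard sqlite3 module; the table name, column names, helper function and amounts are illustrative assumptions, with the $25 minimum taken from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The rule "balance may never fall below 25" is declared in the schema,
# not repeated inside every application program.
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance NUMERIC NOT NULL CHECK (balance >= 25))"
)
conn.execute("INSERT INTO account VALUES ('A-101', 500)")


def try_withdraw(account_number, amount):
    """Return True if the withdrawal is accepted, False if the CHECK rejects it."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, account_number),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_withdraw("A-101", 100)        # leaves 400, allowed
rejected = try_withdraw("A-101", 390)  # would leave 10, refused by the database
balance = conn.execute("SELECT balance FROM account").fetchone()[0]
```

Changing the minimum later means changing one CHECK clause rather than every program that updates balances.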
Drawbacks of using file systems to store data (Cont.)
• Atomicity of updates
▫ In many applications, it is crucial that, if a failure occurs,
the data be restored to the consistent state that existed
prior to the failure.
 Consider a program to transfer $50 from account A to account
B.
 If a system failure occurs during the execution of the program, it is
possible that the $50 was removed from account A but was not
credited to account B, resulting in an inconsistent database state.
• Concurrent access by multiple users
• Security problems
▫ Hard to provide user access to some, but not all, data.

Database systems offer solutions to all of the above problems;
these solutions are among their major advantages.
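The $50 transfer from the atomicity example can be sketched as follows, in Python with the standard sqlite3 module. The table layout, the starting balances and the fail_midway flag are assumptions made for illustration. A raised exception stands in for a system failure; a real DBMS recovers from actual crashes using its log.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 100)])
conn.commit()


def transfer(src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commit if the block finishes, roll back if it raises
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated failure between debit and credit")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the debit has already been rolled back by the with-block


def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))


transfer("A", "B", 50, fail_midway=True)
after_failure = balances()   # the debit was undone; nothing changed

transfer("A", "B", 50)
after_success = balances()   # both updates applied together
```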
View of Data and its Abstraction

• The need for efficiency has led designers to use complex data
structures to represent data in the database.
• Since many database-system users are not computer trained,
developers hide the complexity from users through several levels
of abstraction.
Levels of Abstraction
• Physical level: How
▫ describes how a record (e.g., instructor) is stored.
 The lowest level of abstraction describes how the data are actually
stored. The physical level describes complex low-level data structures
in detail.
• Logical level: What
▫ describes data stored in database, and the relationships among the
data.
▫ describes the entire database in terms of a small number of relatively
simple structures
type instructor = record
ID : string;
name : string;
dept_name : string;
salary : integer;
end;
• View level: Who
▫ application programs hide details of data types. Views can also hide
information (such as an employee’s salary) for security purposes.
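The view level can be illustrated with an SQL view that leaves out the salary column. This is a minimal sketch in Python with the standard sqlite3 module; the view name instructor_public and the sample row are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full instructor record.
conn.execute(
    "CREATE TABLE instructor ("
    " ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO instructor VALUES ('10101', 'Asha', 'Comp. Sci.', 65000)")

# View level: the same data with the salary hidden.
conn.execute(
    "CREATE VIEW instructor_public AS "
    "SELECT ID, name, dept_name FROM instructor"
)

cur = conn.execute("SELECT * FROM instructor_public")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
```

A user who is only given access to instructor_public can look up instructors but never sees a salary.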
Instances and Schemas
• Similar to variables and types in programming languages
• Logical Schema – the overall logical structure of the
database
▫ Example: The database consists of information about a set of
customers and accounts in a bank and the relationship
between them
▫ Analogous to type information of a variable in a program
• Physical schema – the overall physical structure of the
database
• Instance – the actual content of the database at a
particular point in time
▫ Analogous to the value of a variable
• Physical Data Independence – the ability to modify the
physical schema without changing the logical schema
▫ Applications depend on the logical schema
▫ In general, the interfaces between the various levels and
components should be well defined so that changes in some
parts do not seriously influence others.
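Physical data independence can be sketched by changing the physical schema (here, adding an index) and checking that an application query is unaffected. This is an illustrative example in Python with the standard sqlite3 module; the table, index name and rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id TEXT, customer_name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("c1", "Ravi"), ("c2", "Meena"), ("c3", "Ravi")],
)

# The application is written against the logical schema only.
QUERY = (
    "SELECT customer_id FROM customer "
    "WHERE customer_name = ? ORDER BY customer_id"
)

before = conn.execute(QUERY, ("Ravi",)).fetchall()

# Physical change: add an access structure. The logical schema is untouched.
conn.execute("CREATE INDEX idx_customer_name ON customer (customer_name)")

after = conn.execute(QUERY, ("Ravi",)).fetchall()
```

The query text and its result stay the same; only the way the system can locate the rows has changed.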
Data Models
• A collection of tools for describing
▫ Data
▫ Data relationships
▫ Data semantics
▫ Data constraints
1. Entity-Relationship data model (mainly for database
design)
2. Relational model (mainly for database implementation)
3. Object-based data models (Object-oriented and
Object-relational database implementation)
4. Semistructured data model (XML – for file format
transformations)
5. Other older models:
▫ Network model
▫ Hierarchical model
Entity-Relationship Model
• The entity-relationship (E-R) data model is based on a
perception of a real world that consists of a collection of basic
objects, called entities, and of relationships among these
objects.
• An entity is a “thing” or “object” in the real world that is
distinguishable from other objects.
▫ For example, each person is an entity, and bank accounts can be
considered as entities.
• Entities are described in a database by a set of attributes.
▫ For example, the attributes account-number and balance may
describe one particular account in a bank, and they form
attributes of the account entity set.
• A relationship is an association among several entities.
▫ For example, a depositor relationship associates a customer with
each account that she has.
The set of all entities of the same type and the set of all
relationships of the same type are termed an entity set and
relationship set, respectively.
E-R Diagrams
• The overall logical structure (schema) of a database can
be expressed graphically by an E-R diagram, which is
built up from the following components:
▫ Rectangles, which represent entity sets
▫ Ellipses, which represent attributes
▫ Diamonds, which represent relationships among entity
sets
▫ Lines, which link attributes to entity sets and entity sets to
relationships
• In addition to entities and relationships, the E-R model
represents certain constraints to which the contents of a
database must conform.
• One important constraint is mapping cardinalities,
which express the number of entities to which another
entity can be associated via a relationship set.
▫ For example, if each account must belong to only one
customer, the E-R model can express that constraint.
Sample E-R Diagram
• The E-R diagram indicates that there are two
entity sets, customer and account, with
attributes as outlined earlier.
• The diagram also shows a relationship depositor
between customer and account.
Relational Model
• The relational model revolves around a
fundamental data structure called a table,
which is a formalization of the intuitive
notion of a table (rows and columns).
• Informally, the relational model consists of:
▫ A class of data structures referred to as
tables.
▫ A collection of methods for building new
tables starting from an initial collection of
tables;
 we refer to these methods as relational
algebra operations.
▫ A collection of constraints imposed on the
data contained in tables.
• The relational model uses a collection of
tables to represent both data and the
relationships among those data.
• Each table has multiple columns, and each
column has a unique name.
A Sample Relational Database

The relational model is at a lower level of abstraction than the E-R model. Database
designs are often carried out in the E-R model and then translated to the relational
model.
Database Languages
• A database system provides a data definition language to
specify the database schema and a data manipulation
language to express database queries and updates.
• Two classes of languages
▫ Pure – used for proving properties about computational power and for optimization
 Relational Algebra
 Tuple relational calculus
 Domain relational calculus
▫ Commercial – used in commercial systems
 SQL is the most widely used commercial language
• In practice, the data definition and data manipulation
languages are not two separate languages; instead they simply
form parts of a single database language, such as the widely
used SQL language.
• The commands in the language are classified into different
categories based on their functional implementation
▫ DDL – Data Definition Language
▫ DML – Data Manipulation Language
▫ DCL – Data Control Language
▫ TCL – Transaction Control Language
Data Definition Language (DDL)
• data storage and definition language.
• These statements define the implementation details of the database
schemas, which are usually hidden from the users.
• The data values stored in the database must satisfy certain consistency
constraints.
• Specification notation for defining the database schema
Example: create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2))
• Execution of the above DDL statement creates the instructor table.
▫ In addition, it updates a special set of tables called the data dictionary or data
directory.
▫ Data dictionary contains metadata (i.e., data about data)
 Database schema
 Integrity constraints
 Primary key (ID uniquely identifies instructors)
 Authorization
 Who can access what
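The effect of a DDL statement on the data dictionary can be observed directly. The sketch below uses Python with the standard sqlite3 module; sqlite_master and PRAGMA table_info are SQLite's own catalog interfaces, and other systems expose their data dictionary differently. The PRIMARY KEY clause is added here as an assumption, following the note above that ID uniquely identifies instructors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema of the instructor table.
conn.execute(
    """CREATE TABLE instructor (
           ID        CHAR(5) PRIMARY KEY,
           name      VARCHAR(20),
           dept_name VARCHAR(20),
           salary    NUMERIC(8,2))"""
)

# Metadata ("data about data") recorded by the DDL statement.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# table_info rows are (cid, name, type, notnull, default_value, pk).
info = conn.execute("PRAGMA table_info(instructor)").fetchall()
column_names = [row[1] for row in info]
primary_key = [row[1] for row in info if row[5] > 0]
```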
Data Manipulation Language (DML)
• Data manipulation is
▫ The retrieval of information stored in the database
▫ The insertion of new information into the database
▫ The deletion of information from the database
▫ The modification of information stored in the database
• A data-manipulation language (DML) is a language that enables users to
access or manipulate data as organized by the appropriate data model.
• There are basically two types:
▫ Procedural DMLs require a user to specify what data are needed and how to get those
data.
▫ Declarative DMLs (also referred to as nonprocedural DMLs) require a user to
specify what data are needed without specifying how to get those data.
• The DML component of the SQL language is nonprocedural.
• A query is a statement requesting the retrieval of information. The portion of a
DML that involves information retrieval is called a query language.
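The four kinds of data manipulation listed above can be sketched with SQL statements issued from Python through the standard sqlite3 module. The simplified instructor table and its rows are illustrative assumptions. Each statement says what data is wanted, not how to find it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, salary INTEGER)")

# Insertion of new information
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [("1", "Kiran", 50000), ("2", "Latha", 60000)],
)

# Modification of stored information
conn.execute("UPDATE instructor SET salary = salary + 5000 WHERE ID = '1'")

# Deletion of information
conn.execute("DELETE FROM instructor WHERE ID = '2'")

# Retrieval of information (a query)
remaining = conn.execute("SELECT ID, name, salary FROM instructor").fetchall()
```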
SQL Query
• The most widely used commercial Query
language
• SQL is NOT a Turing machine equivalent
language.
• To be able to compute complex functions SQL
is usually embedded in some higher-level
language
• Application programs generally access
databases through one of
▫ Language extensions to allow embedded SQL
▫ Application program interface (e.g.,
ODBC/JDBC) which allow SQL queries to be
sent to a database
Sample SQL Query
• This query in the SQL language
finds the name of the customer
whose customer-id is 192-83-7465:

select customer.customer-name
from customer
where customer.customer-id = '192-83-7465'

• The query specifies that those rows
from the table customer where the
customer-id is 192-83-7465 must be
retrieved, and the customer-name
attribute of these rows must be
displayed.
What is the result of the following query?
• select account.balance from depositor, account where
depositor.customer-id = '192-83-7465' and
depositor.account-number = account.account-number
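Both sample queries can be tried out with a small script. This is a sketch in Python with the standard sqlite3 module. The rows are made-up sample data, not the contents of the original slide's tables, and the hyphenated names are written with underscores because a hyphen is not allowed in an unquoted SQL identifier. The customer-id is passed as a string; written without quotes, 192-83-7465 would be evaluated as a subtraction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer  (customer_id TEXT PRIMARY KEY, customer_name TEXT);
    CREATE TABLE account   (account_number TEXT PRIMARY KEY, balance INTEGER);
    CREATE TABLE depositor (customer_id TEXT, account_number TEXT);

    -- hypothetical sample rows
    INSERT INTO customer  VALUES ('192-83-7465', 'Johnson'), ('019-28-3746', 'Smith');
    INSERT INTO account   VALUES ('A-101', 500), ('A-201', 900), ('A-215', 700);
    INSERT INTO depositor VALUES ('192-83-7465', 'A-101'), ('192-83-7465', 'A-201'),
                                 ('019-28-3746', 'A-215');
    """
)

# First query: the name of the customer with the given id.
names = conn.execute(
    "SELECT customer.customer_name FROM customer "
    "WHERE customer.customer_id = ?",
    ("192-83-7465",),
).fetchall()

# Second query: the balances of all accounts held by that customer.
balances = sorted(
    conn.execute(
        "SELECT account.balance FROM depositor, account "
        "WHERE depositor.customer_id = ? "
        "AND depositor.account_number = account.account_number",
        ("192-83-7465",),
    ).fetchall()
)
```

With this sample data the second query returns one balance per account the customer holds, because depositor links the customer to two accounts.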
Database Design
The process of designing the general structure of the database:
• Logical Design – Deciding on the database schema.
Database design requires that we find a “good”
collection of relation schemas.
▫ Business decision – What attributes should we record in
the database?
▫ Computer Science decision – What relation schemas
should we have and how should the attributes be
distributed among the various relation schemas?
• Physical Design – Deciding on the physical layout of
the database
Database Design (Cont.)
• Is there any problem with this relation?
Design Approaches
• Need to come up with a methodology to ensure
that each of the relations in the database is
“good”
• Two ways of doing so:
▫ Entity Relationship Model
 Models an enterprise as a collection of entities and
relationships
 Represented diagrammatically by an entity-
relationship diagram.
▫ Normalization Theory
 Formalize what designs are bad, and test for them
Object-Relational Data Models
• Relational model: flat, “atomic” values
• Object Relational Data Models
▫ Extend the relational data model by including object
orientation and constructs to deal with added data
types.
▫ Allow attributes of tuples to have complex types,
including non-atomic values such as nested relations.
▫ Preserve relational foundations, in particular the
declarative access to data, while extending modeling
power.
▫ Provide upward compatibility with existing relational
languages.
XML: Extensible Markup Language
• Defined by the WWW Consortium (W3C)
• Originally intended as a document markup
language not a database language
• The ability to specify new tags, and to create
nested tag structures made XML a great way to
exchange data, not just documents
• XML has become the basis for all new generation
data interchange formats.
• A wide variety of tools is available for parsing,
browsing and querying XML documents/data
Database Manager or Engine

• Storage or Memory manager
• Query processing
• Transaction manager
Storage Management
• Storage manager is a program module that
provides the interface between the low-level data
stored in the database and the application programs
and queries submitted to the system.
• The storage manager is responsible for the
interaction with the file manager.
▫ The storage manager translates the various DML
statements into low-level file-system commands.

• Thus, the storage manager is responsible for storing,
retrieving, and updating data in the database.
Storage manager components
• The storage manager components include:
▫ Authorization and integrity manager, which tests for the satisfaction of
integrity constraints and checks the authority of users to access data.
▫ Transaction manager, which ensures that the database remains in a
consistent (correct) state despite system failures, and that concurrent transaction
executions proceed without conflicting.
▫ File manager, which manages the allocation of space on disk storage and the
data structures used to represent information stored on disk.
▫ Buffer manager, which is responsible for fetching data from disk storage into
main memory, and deciding what data to cache in main memory. The buffer
manager is a critical part of the database system, since it enables the database to
handle data sizes that are much larger than the size of main memory.
• The storage manager implements several data structures as part of the
physical system implementation:
▫ Data files, which store the database itself.
▫ Data dictionary, which stores metadata about the structure of the database, in
particular the schema of the database.
▫ Indices, which provide fast access to data items that hold particular values.
Query Processing
• The query processor components
include
▫ DDL interpreter, which
interprets DDL statements and
records the definitions in the data
dictionary.
▫ DML compiler, which translates
DML statements in a query
language into an evaluation plan
consisting of low-level instructions
that the query evaluation engine
understands.
 The DML compiler also performs
query optimization, that is, it picks
the lowest cost evaluation plan
from among the alternatives.
▫ Query evaluation engine,
which executes low-level
instructions generated by the DML
compiler.
Transaction Management
• Consider following questions pertaining to state of database
• What if the system fails?
• What if more than one user is concurrently updating the same data?
• A transaction is a collection of operations that performs a single
logical function in a database application.
• Each transaction is a unit of both atomicity and consistency. Thus, we
require that transactions do not violate any database-consistency
constraints. Transactions are expected to have the following properties:
▫ Atomicity
▫ Consistency
▫ Isolation
▫ Durability
• Ensuring the atomicity and durability properties is the responsibility
of the database system itself, specifically of the
transaction-management component, which has the following components:
▫ Recovery manager, which detects system failures and restores the database
to the state that existed prior to the occurrence of the failure.
▫ Concurrency-control manager, which controls the interaction among
concurrent transactions to ensure consistency of the database.
Database Users and Administrators
• The users of a DBMS are classified
by their roles and interests in
accessing and managing the
databases.
▫ Once a database is created, it is the job
of the Database Administrator to
decide the nature of the data to be
stored in the database and the access
policies to be enforced, and to monitor
and tune the performance of the
database.
▫ End Users have limited access rights
and need only minimal
technical knowledge of the database.
▫ Application Programmers work
within an existing DBMS
and, using a combination of query
languages and higher-level languages,
create various reports based on the
data contained in the database.
Database Architecture
The architecture of a
database system is
greatly influenced by the
underlying computer
system on which the
database is running:
• Centralized
• Client-server
• Parallel (multi-processor)
• Distributed
History of Database Systems
• 1950s and early 1960s:
▫ Data processing using magnetic tapes for storage
 Tapes provided only sequential access
▫ Punched cards for input
• Late 1960s and 1970s:
▫ Hard disks allowed direct access to data
▫ Network and hierarchical data models in
widespread use
▫ Ted Codd defines the relational data model
 Would win the ACM Turing Award for this work
 IBM Research begins System R prototype
 UC Berkeley begins Ingres prototype
▫ High-performance (for the era) transaction
processing
History (cont.)
• 1980s:
▫ Research relational prototypes evolve into
commercial systems
 SQL becomes industrial standard
▫ Parallel and distributed database systems
▫ Object-oriented database systems
• 1990s:
▫ Large decision support and data-mining
applications
▫ Large multi-terabyte data warehouses
▫ Emergence of Web commerce
• Early 2000s:
▫ XML and XQuery standards
▫ Automated database administration
• Later 2000s:
▫ Giant data storage systems
 Google BigTable, Yahoo PNuts, Amazon, ..
Database design through E-R Model based diagrams
The E/R model uses the notions of entity, relationship,
and attribute.
• Database of the college used for our running example
reflects the following information:
▫ Students: any student who has ever registered at the
college;
▫ Instructors: anyone who has ever taught at the college;
▫ Courses: any course ever taught at the college;
▫ Advising: which instructor currently advises which
student, and
▫ Grades: the grade received by each student in each
course, including the semester and the instructor.
• Individual entities and individual relationships are
grouped into homogeneous sets of entities
(STUDENTS, COURSES, and INSTRUCTORS) and
homogeneous sets of relationships (ADVISING,
GRADES).
▫ STUDENTS represent all the student entities, and
ADVISING, all the individual advising relationships.
• We refer to such sets as entity sets and relationship
sets, respectively.
• The notion of role helps explain the significance of
entities in relationships. Roles appear as labels of the
edges of the E/R diagram.
• Attributes - properties of entities and relationships are
described by attributes.
• Each attribute A has an associated set of values, which we
refer to as the domain of A and denote by Dom(A).
• The set of attributes of a set of entities E is denoted by
Attr(E); similarly, the set of attributes of a set of
relationships R is denoted by Attr(R).
• Domains of attributes consist of atomic values.
▫ Means that the elements of such domains must be “simple”
values such as integers, dates, or strings of characters.
• If s is a student entity, then the values associated to s are
denoted by
stno(s), name(s), addr(s), city(s), state(s), zip(s).

• A DBMS must support attribute domains. Such support
includes validity checks and implementation of operations
specific to the domains, such as:
▫ string concatenation for strings of characters,
▫ various computations involving dates, and
▫ arithmetic operations on numeric domains.
E/R Diagram for College Database
Keys
• In order to talk about a specific student, you have to be
able to identify that student.
• As long as no two students have the same name, one
can use the name attribute as a key.
• A key, an attribute or a set of attributes that uniquely
identifies each entity in a collection, is generally a
necessity for electronic databases.
▫ In the college database, the value of the attribute stno is
sufficient to identify a student entity. Since the set {stno}
has no proper, nonempty subsets, it clearly satisfies the
minimality condition and, therefore, it is a key for the
STUDENTS entity set.
• For the entity set COURSES, are both cno and cname
keys?
Types of Keys
What can be keys for the entity sets Patrons and Books and the relationship
set Loans?
• Any set of attributes that uniquely identifies each entity in an entity set
is a Super Key; the set of all attributes taken together is always one.
• Several different minimal attribute combinations may each uniquely
identify every entity in a set of entities; all these keys are called
Candidate Keys.
• One of these keys is chosen as the primary key; the remaining keys
are alternate keys.
• The primary key of a set of entities E is used by other constituents
of the E/R model to refer to the entities of E, and this primary key
is included in the other constituents as a Foreign Key.
• The identification of the primary key and of the alternate keys is a
semantic statement:
▫ It reflects our understanding of the role played by various
attributes in the real world.
▫ In other words, choosing the primary key from among the
available keys is a choice of the designer.
The definition of keys for sets of relationships is completely parallel to
the definition of keys for sets of entities.

Some Notational Types of Keys
• Composite Primary Key – a primary key that is made up of
more than one attribute.
• Surrogate Primary Key – a system-assigned primary key, generally
numeric and auto-incremented.
• Natural Key – a real-world, generally accepted identifier used
to distinguish real-world objects.
• Candidate Key – a minimal superkey, that is, one that does not contain a
proper subset of attributes that is itself a superkey.
• Foreign Key – an attribute in one table whose values must match
the primary key in another table.
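The difference between a superkey and a candidate key can be sketched in Python. The helper names and the sample COURSES rows below are hypothetical. One caveat: a key is a statement about every possible state of the database, so sample data can only show that an attribute set is not a key; it cannot prove that one is.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, all_attrs):
    """Minimal superkeys of this sample, found by trying smaller sets first."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            # Skip any set that contains an already found (smaller) key.
            contains_key = any(set(k) <= set(attrs) for k in keys)
            if not contains_key and is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical sample of the COURSES entity set (course number, name, credits).
courses = [
    {"cno": "cs110", "cname": "Introduction to Computing", "cr": 4},
    {"cno": "cs210", "cname": "Computer Programming", "cr": 4},
    {"cno": "cs240", "cname": "Computer Architecture", "cr": 3},
]

all_is_superkey = is_superkey(courses, ("cno", "cname", "cr"))
keys = candidate_keys(courses, ["cno", "cname", "cr"])
```

In this sample both cno and cname come out as candidate keys; choosing one of them as the primary key remains the designer's decision.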
Characteristics of a Primary Key
A primary key is an attribute that uniquely identifies the entity that it resides in.
• For a primary key to be useful and functional, it should have several characteristics:
▫ Non-Null values – Primary key attributes cannot be empty, i.e., a primary key must contain some
value and cannot be null.
▫ Unique values – The primary key values must be unique, as they identify each entity of the table.
▫ Nonintelligent – The primary key should be “factless”, i.e., it should not be composed of semantic data.
 For example, in an entity called STUDENT_INFO, a school_ID composed of numbers would be a better choice for a
primary key than first_name or last_name.
▫ No Change Over Time – Avoid semantic data in a primary key because it can change over time.
 If primary keys are changed, then the foreign keys must be updated as well.
 Since the primary key is the identity of the table or entity, it should be permanent and unchangeable.
▫ Single-Attribute – The primary key should preferably be composed of only one attribute, although this is not
required.
 If the primary key is a composite primary key (one made up of multiple attributes), the foreign keys in
other entities that refer to it will need multiple attributes as well.
▫ Preferably Numeric – Primary keys are easier to manage when they are composed of
mostly numeric data.
 This is useful because when new data is entered, the DBMS can employ a counter-style attribute: with
each new entry, the database program generates a number and then increments it by one automatically for the
next entry.
▫ Security Compliant – The selected primary key must not be an attribute that is considered sensitive
information.
 For example, it would not be a good idea to use a person's social security number as a primary key.
Participation Constraints
• The E/R model allows us to impose constraints on the
number of relationships in which an entity is allowed to
participate.
• If (E, u, v, R) is a participation constraint, meaning that each entity of E
participates in at least u and at most v relationships of R, we may add u : v
to whatever other labels may be on the edge joining E to R.
• When there is no upper limit to the number of
relationships in which an entity may participate, we write
u : +.
• If every student must choose an advisor, and an instructor
may not advise more than 7 students, we have the
participation constraints
(STUDENTS, 1, 1, ADVISING)
and
(INSTRUCTORS, 0, 7, ADVISING)
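Checking a participation constraint against a concrete set of relationships can be sketched as follows. The function name, the tuple layout of the ADVISING relationships and the sample identifiers are assumptions made for illustration; passing v=None plays the role of the "+" (no upper limit) notation.

```python
from collections import Counter


def satisfies_participation(entities, relationships, position, u, v=None):
    """Check (E, u, v, R): every entity takes part in between u and v relationships.

    position is the index of E's entity inside each relationship tuple.
    """
    counts = Counter(rel[position] for rel in relationships)
    for entity in entities:
        n = counts[entity]
        if n < u or (v is not None and n > v):
            return False
    return True


students = ["s1", "s2", "s3"]
instructors = ["i1", "i2"]

# Each ADVISING relationship is a (student, instructor) pair.
advising = [("s1", "i1"), ("s2", "i1"), ("s3", "i2")]

# (STUDENTS, 1, 1, ADVISING): every student has exactly one advisor.
students_ok = satisfies_participation(students, advising, 0, 1, 1)

# (INSTRUCTORS, 0, 7, ADVISING): no instructor advises more than 7 students.
instructors_ok = satisfies_participation(instructors, advising, 1, 0, 7)

# Dropping s3's advisor violates the first constraint.
students_missing_advisor = satisfies_participation(students, advising[:2], 0, 1, 1)
```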
Types of Participatory Constraints

Given the participation constraints (U, p, q, R) and (V, m, n, R), the set of
binary relationships R from U to V is:
1. one-to-one if p = 0, q = 1 and m = 0, n = 1;
2. one-to-many if p = 0, q > 1 and m = 0, n = 1;
3. many-to-one if p = 0, q = 1 and m = 0, n > 1;
4. many-to-many if p = 0, q > 1 and m = 0, n > 1.

A recursive relationship is a binary relationship connecting a set of entities to itself.
Weak and Strong Entities and Identity Relationship
• Suppose that we need to expand our database by
adding information about student loans by adding a
set of entities called LOANS.
• The existence of a loan entity in the E/R model of
the college database is conditioned upon the
existence of a student entity corresponding to the
student to whom that loan was awarded; this type of
dependency is called an existence dependency.
▫ An entity (the Weak Entity) is said to be existence dependent
on another entity (the Strong Entity) when its existence
depends solely on the existence of that other entity.
▫ If a student entity is deleted, the LOANS entities that depend
on the student entity should also be removed.
• E is a set of weak entities if the following conditions
are satisfied:
1. The set of entities E does not have a key, and
2. the participation constraint (E, 1, k, R) is satisfied for
some k ≥ 1.
▫ Note that the attributes of the LOANS entity set (source,
amount, year) are not sufficient to identify an entity in this set.
• Weak entity sets are represented in E/R diagrams by
dashed boxes.
• No weak entity can exist in E unless it is involved in
a relationship of R with a strong entity of E′. Such
a relationship is called an identifying relationship;
the primary key of the strong entity set is added
as a foreign key to the existing set of attributes of the
weak entity.
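The existence dependency of LOANS on STUDENTS can be sketched with a foreign key that cascades deletions. This uses Python with the standard sqlite3 module; the column types, the sample rows and the choice of (stno, source, year) as the primary key of loans are assumptions. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute("CREATE TABLE students (stno INTEGER PRIMARY KEY, name TEXT)")

# loans borrows the strong entity's primary key (stno) as a foreign key;
# on its own, (source, amount, year) cannot identify a loan.
conn.execute(
    """CREATE TABLE loans (
           stno   INTEGER NOT NULL REFERENCES students(stno) ON DELETE CASCADE,
           source TEXT,
           amount INTEGER,
           year   INTEGER,
           PRIMARY KEY (stno, source, year))"""
)

conn.execute("INSERT INTO students VALUES (1011, 'Student A'), (2415, 'Student B')")
conn.execute(
    "INSERT INTO loans VALUES (1011, 'BANK', 1000, 2001), (2415, 'BANK', 2000, 2002)"
)

# Deleting the strong entity removes the weak entities that depend on it.
conn.execute("DELETE FROM students WHERE stno = 1011")
remaining_loans = conn.execute("SELECT stno FROM loans").fetchall()
```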
Enhanced E/R features – Specialization
• Refinement from an initial entity set into successive levels of entity subgroupings
represents a top-down design process in which distinctions are made explicit.
• The process of designating subgroupings within an entity set is called
specialization.
• The specialization of an entity set person, with attributes name, street, and city;
allows us to distinguish among persons according to whether they are employees
with employee-id and salary; or customers with customer-id.
• An entity set may be specialized by more than one distinguishing feature.
▫ Example, the distinguishing feature among employee entities can be the job the employee
performs. Another, coexistent, specialization could be based on whether the person is a
temporary (limited-term) employee or a permanent employee, resulting in the entity sets
temporary-employee and permanent-employee.
• In terms of an E-R diagram, specialization is
depicted by a triangle component labeled ISA.
• The label ISA stands for “is a” and represents,
for example, that a customer “is a” person.
• The ISA relationship may also be referred to as
a superclass-subclass relationship.
• Higher- and lower-level entity sets are depicted
as regular entity sets.
Enhanced E/R features – Aggregation
One limitation of the E-R model is that it cannot express
relationships among relationships.
• Consider an E/R diagram of a banking
database system with the ternary relationship
works-on between employee, branch, and job.
• Now, suppose we want to record managers for tasks
performed by an employee at a branch; that is, we want to
record managers for (employee, branch, job)
combinations. Let us assume that there is an entity set
manager.
• One alternative for representing this relationship is to
create a quaternary relationship manages between
employee, branch, job, and manager.
▫ (A quaternary relationship is required: a binary
relationship between manager and employee would
not permit us to represent which (branch, job)
combinations of an employee are managed by which
manager.)
• The best way to model a situation such as the one
just described is to use aggregation.
• Aggregation is an abstraction through which
relationships are treated as higher-level entities.
• Thus, for our example, we regard the relationship
set works-on as a higher-level entity set called
works-on entity.
[Figures: "Before Aggregation" (quaternary relationship) and "Aggregated works-on set"]
E/R Diagram Symbols

Alternate E/R Diagram Symbols
Banking System Database E/R Diagram
Transforming E-R model to Relational Model
Design to Implementation
Review - Concepts

Relational Model is made up of tables
• A row of a table = a relational instance/tuple
• A column of a table = an attribute
• A table = a schema/relation
• Cardinality = number of rows
• Degree = number of columns
Review - Example

A schema/relation with four attributes:

SID    Name   Major   GPA
1234   John   CS      2.8
5678   Mary   EE      3.6

• Each column heading (SID, Name, Major, GPA) is an attribute.
• Each row is a tuple/relational instance.
• Cardinality = 2
• Degree = 4
From ER Model to Relational Model

So… how do we convert an ER diagram into a
table?? Simple!!
Basic Ideas:
 Build a table for each entity set
 Build a table for each relationship set if necessary (more
on this later)
 Make a column in the table for each attribute in the
entity set
 Indivisibility Rule and Ordering Rule
 Primary Key
Example – Strong Entity Set
E-R diagram: Student (SID, Name, Major, GPA), relationship Advisor, Professor (SSN, Name, Dept)

Student table:
SID    Name   Major   GPA
1234   John   CS      2.8
5678   Mary   EE      3.6

Professor table:
SSN    Name    Dept
9999   Smith   Math
8888   Lee     CS
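The two strong entity sets above can be turned into tables as in the following sketch, written in Python with the standard sqlite3 module. The column types are assumptions; the rows are the ones shown in the example. The script also computes the degree and cardinality of the Student relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One table per strong entity set, one column per attribute.
    CREATE TABLE Student   (SID INTEGER PRIMARY KEY, Name TEXT, Major TEXT, GPA REAL);
    CREATE TABLE Professor (SSN INTEGER PRIMARY KEY, Name TEXT, Dept TEXT);

    INSERT INTO Student   VALUES (1234, 'John', 'CS', 2.8), (5678, 'Mary', 'EE', 3.6);
    INSERT INTO Professor VALUES (9999, 'Smith', 'Math'), (8888, 'Lee', 'CS');
    """
)

cur = conn.execute("SELECT * FROM Student")
degree = len(cur.description)      # number of columns (attributes)
cardinality = len(cur.fetchall())  # number of rows (tuples)
```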
Representation of Weak Entity Set
• A weak entity set cannot exist alone
• To build a table/schema for weak entity set
1. Construct a table with one column for each attribute
in the weak entity set, remember to include
discriminator
2. Augment one extra column on the right side of the
table, put in there the primary key of the Strong
Entity Set (the entity set that the weak entity set is depending
on)
3. Primary Key of weak entity set = Discriminator + foreign key
Example – Weak Entity Set
[Figure: E-R diagram. Student (SID, Name, Major, GPA) owns Children (Name, Age)]

Children
Age  Name  Parent_SID
10   Bart  1234
8    Lisa  5678

* Primary key of Children is Parent_SID + Name
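The weak entity set translation can be tried out with a small script. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as the database; the column types and the helper try_insert are illustrative choices, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Student (
    SID   INTEGER PRIMARY KEY,
    Name  TEXT, Major TEXT, GPA REAL
);
-- Weak entity set: discriminator (Name) + foreign key (Parent_SID) form the key.
CREATE TABLE Children (
    Age        INTEGER,
    Name       TEXT,
    Parent_SID INTEGER REFERENCES Student(SID),
    PRIMARY KEY (Parent_SID, Name)
);
""")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?)",
                 [(1234, "John", "CS", 2.8), (5678, "Mary", "EE", 3.6)])
conn.executemany("INSERT INTO Children VALUES (?, ?, ?)",
                 [(10, "Bart", 1234), (8, "Lisa", 5678)])

def try_insert(row):
    """Hypothetical helper: report whether a Children row is accepted."""
    try:
        conn.execute("INSERT INTO Children VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

orphan_ok = try_insert((5, "Maggie", 4321))    # no student 4321 exists
duplicate_ok = try_insert((11, "Bart", 1234))  # repeats (Parent_SID, Name)
same_name_ok = try_insert((9, "Bart", 5678))   # same name, different parent
```

A child row without an existing parent is rejected, and the name only has to be unique per parent, which is what "discriminator + foreign key" expresses.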


Representation of Relationship Set
This is a little more complicated.
• Unary/Binary Relationship set
▫ Depends on the cardinality and participation of the relationship
▫ Two possible approaches
• N-ary (multiple) Relationship set
▫ Primary Key Issue
• Identifying Relationship
▫ No relational model representation necessary
Representing Relationship Set
Unary/Binary Relationship
• For a one-to-one relationship without total participation
▫ Build a table with two columns, one column for each
participating entity set’s primary key.
▫ Add further columns, one for each descriptive
attribute of the relationship set (if any).
• For a one-to-one relationship with one entity set having
total participation
▫ Add one extra column on the right side of the
table of the entity set with total participation, and put in
it the primary key of the entity set without
total participation, as given by the relationship.
Example – One-to-One Relationship Set
[Figure: E-R diagram. Student (SID, Name, Major, GPA) study Major (ID, Code); the relationship has the descriptive attribute Degree]

SID   Maj_ID_Co  S_Degree
9999  07         1234
8888  05         5678

* Primary key can be either SID or Maj_ID_Co


Example – One-to-One Relationship Set
[Figure: E-R diagram. Student (SID, Name, Major, GPA) Have Laptop (S/N #, Brand), a 1:1 relationship with the descriptive attribute Condition]

SID   Name  Major    GPA   LP_S/N   Hav_Cond
9999  Bart  Economy  -4.0  123-456  Own
8888  Lisa  Physics  4.0   567-890  Loan

* Primary key can be either SID or LP_S/N


Representing Relationship Set
Unary/Binary Relationship
• For a one-to-many relationship without total participation
▫ Same as one-to-one without total participation
• For a one-to-many/many-to-one relationship with
one entity set having total participation on the
“many” side
▫ Add one extra column on the right side of the
table of the entity set on the “many” side, and put in
it the primary key of the entity set on the “one”
side, as given by the relationship.
Example – Many-to-One Relationship Set
[Figure: E-R diagram. Student (SID, Name, Major, GPA) and Professor (SSN, Name, Dept) in the N:1 relationship Advisor, which has the descriptive attribute Semester]

SID   Name  Major    GPA   Pro_SSN  Ad_Sem
9999  Bart  Economy  -4.0  123-456  Fall 2006
8888  Lisa  Physics  4.0   567-890  Fall 2005

* Primary key of this table is SID


Representing Relationship Set
Unary/Binary Relationship
• For a many-to-many relationship
▫ Same as a one-to-one relationship without
total participation.
▫ The primary key of this new schema is the union of the
foreign keys of both entity sets.
▫ No augmentation approach is possible.
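A many-to-many relationship therefore always gets its own table. The sketch below assumes SQLite via Python's sqlite3 module; the Course entity set, the Enrolls relationship and its Grade attribute are made-up names for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Student (SID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Course  (CID TEXT PRIMARY KEY, Title TEXT);  -- hypothetical entity set
-- Relationship table: one column per participating primary key,
-- plus the descriptive attribute Grade (also hypothetical).
CREATE TABLE Enrolls (
    SID   INTEGER REFERENCES Student(SID),
    CID   TEXT    REFERENCES Course(CID),
    Grade TEXT,
    PRIMARY KEY (SID, CID)   -- union of the two foreign keys
);
INSERT INTO Student VALUES (1234, 'John'), (5678, 'Mary');
INSERT INTO Course  VALUES ('DB', 'Databases'), ('OS', 'Operating Systems');
INSERT INTO Enrolls VALUES (1234, 'DB', 'A'), (1234, 'OS', 'B'), (5678, 'DB', 'A');
""")

# One student can appear with many courses, and one course with many students.
courses_of_john = [r[0] for r in conn.execute(
    "SELECT CID FROM Enrolls WHERE SID = 1234 ORDER BY CID")]
students_in_db = [r[0] for r in conn.execute(
    "SELECT SID FROM Enrolls WHERE CID = 'DB' ORDER BY SID")]
```

Neither Student nor Course could absorb the relationship as an extra column, because each row would need to hold several values.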
Representing Relationship Set
N-ary Relationship
• Intuitively simple
▫ Build a new table with one column for each attribute in
the union of the primary keys of all
participating entity sets.
▫ Add further columns for descriptive attributes
of the relationship set (if necessary)
▫ The primary key of this table is the union of all
primary keys of entity sets that are on the “many” side
Example – N-ary Relationship Set
[Figure: E-R diagram. E-Set 1 (P-Key1), E-Set 2 (P-Key2), E-Set 3 (P-Key3) and Another Set (A-Key) participate in “A relationship”, which has the descriptive attribute D-Attribute]

P-Key1  P-Key2  P-Key3  A-Key  D-Attribute
9999    8888    7777    6666   Yes
1234    5678    9012    3456   No

* Primary key of this table is P-Key1 + P-Key2 + P-Key3
Representing Relationship Set
Identifying Relationship
• This is what you have to know
▫ You DON’T have to build a table/schema for the
identifying relationship set once you have built a
table/schema for the corresponding weak entity set
▫ Reason:
◦ A special case of one-to-many with total participation
◦ Reduces redundancy
Representing Composite Attribute
• Relational Model Indivisibility Rule applies
• One column for each component attribute
• NO column for the composite attribute itself

[Figure: E-R diagram. Professor (SSN, Name, Address), where Address is composed of Street and City]

SSN   Name       Street      City
9999  Dr. Smith  50 1st St.  Fake City
8888  Dr. Lee    1 B St.     San Jose
Representing Multivalue Attribute
• For each multivalue attribute in an entity
set/relationship set
▫ Build a new relation schema with two columns
▫ One column for the primary keys of the entity
set/relationship set that has the multivalue attribute
▫ Another column for the multivalue attribute. Each
cell of this column holds only one value, so each value
is represented as a unique tuple
▫ The primary key for this schema is the union of all
attributes
Example – Multivalue attribute
[Figure: E-R diagram. Student (SID, Name, Major, GPA) with the multivalue attribute Children]

Student
SID   Name   Major  GPA
1234  John   CS     2.8
5678  Homer  EE     3.6

Children
Stud_SID  Children
1234      Johnson
1234      Mary
5678      Bart
5678      Lisa
5678      Maggie

The primary key for the Children table is Stud_SID + Children,
the union of all attributes
Representing Class Hierarchy
• Two general approaches depending on
disjointness and completeness
▫ For a non-disjoint and/or non-complete class hierarchy:
◦ Create a table for each superclass entity set
according to the normal entity set translation method.
◦ Create a table for each subclass entity set with a
column for each of the attributes of that entity set
plus one for each attribute of the primary key of the
superclass entity set
◦ This primary key from the superclass entity set is also
used as the primary key for this new table
Class Hierarchy Example 1
[Figure: E-R diagram. Person (SSN, Name, Gender) with subclass Student (SID, Status, Major, GPA), connected by ISA]

Person
SSN   Name   Gender
1234  Homer  Male
5678  Marge  Female

Student
SSN   SID   Status  Major  GPA
1234  9999  Full    CS     2.8
5678  8888  Part    EE     3.6
Representing Class Hierarchy
• Two general approaches depending on
disjointness and completeness
▫ For a disjoint AND complete mapping class hierarchy:
◦ DO NOT create a table for the superclass entity set
◦ Create a table for each subclass entity set that includes all
attributes of that subclass entity set and the attributes of the
superclass entity set.
Class Hierarchy Example 2
[Figure: E-R diagram. GIOE people (SSN, Name) with subclasses Student (SID, Major, GPA) and Faculty (Dept), connected by ISA; disjoint and complete mapping]
No table is created for the superclass entity set.

Student
SSN   Name  SID   Major  GPA
1234  John  9999  CS     2.8
5678  Mary  8888  EE     3.6

Faculty
SSN   Name   Dept
1234  Homer  C.S.
5678  Marge  Math
Representing Aggregation
[Figure: E-R diagram. The Advisor relationship between Student and Professor is aggregated and linked to Dept through the relationship member]

member
SID   Code
1234  04
5678  08

SID: primary key of Advisor
Code: primary key of Dept
Eliminating redundant information through restructuring of
database tables.
Normalization Definition
• This is the process that lets you winnow out
(sort through) redundant data within your database.
• This involves restructuring the tables so that they
successively meet higher forms of
normalization.
• A properly normalized database should have the
following characteristics
▫ Scalar values in each field
▫ Absence of redundancy
▫ Minimal use of null values
▫ Minimal loss of information
Levels of Normalization
• Levels of normalization are based on the amount of redundancy in the
database.
• Various levels of normalization are:
▫ First Normal Form (1NF)
▫ Second Normal Form (2NF)
▫ Third Normal Form (3NF)
▫ Boyce-Codd Normal Form (BCNF)
▫ Fourth Normal Form (4NF)
▫ Fifth Normal Form (5NF)
▫ Domain Key Normal Form (DKNF)
[Figure: arrows labelled Redundancy, Number of Tables and Complexity alongside the list of levels]
Most databases should be in 3NF or BCNF in order to avoid database anomalies.
Levels of Normalization
[Figure: nested levels 1NF, 2NF, 3NF, 4NF, 5NF, DKNF]
Each higher level is a subset of the lower level
First Normal Form (1NF)
A table is considered to be in 1NF if all the fields contain
only scalar values (as opposed to lists of values).
Example (Not 1NF)

ISBN           Title         AuName                  AuPhone                                   PubName      PubPhone      Price
0-321-32132-1  Balloon       Sleepy, Snoopy, Grumpy  321-321-1111, 232-234-1234, 665-235-6532  Small House  714-000-0000  $34.00
0-55-123456-9  Main Street   Jones, Smith            123-333-3333, 654-223-3455                Small House  714-000-0000  $22.95
0-123-45678-0  Ulysses       Joyce                   666-666-6666                              Alpha Press  999-999-9999  $34.00
1-22-233700-0  Visual Basic  Roman                   444-444-4444                              Big House    123-456-7890  $25.00

The AuName and AuPhone columns are not scalar
1NF - Decomposition
1. Place all items that appear in the repeating group in a new table.
2. Designate a primary key for each new table produced.
3. Duplicate in the new table the primary key of the table from
which the repeating group was extracted, or vice versa.
Example (1NF)

ISBN           Title         PubName      PubPhone      Price
0-321-32132-1  Balloon       Small House  714-000-0000  $34.00
0-55-123456-9  Main Street   Small House  714-000-0000  $22.95
0-123-45678-0  Ulysses       Alpha Press  999-999-9999  $34.00
1-22-233700-0  Visual Basic  Big House    123-456-7890  $25.00

ISBN           AuName  AuPhone
0-321-32132-1  Sleepy  321-321-1111
0-321-32132-1  Snoopy  232-234-1234
0-321-32132-1  Grumpy  665-235-6532
0-55-123456-9  Jones   123-333-3333
0-55-123456-9  Smith   654-223-3455
0-123-45678-0  Joyce   666-666-6666
1-22-233700-0  Roman   444-444-4444
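The three decomposition steps can be sketched in a few lines of Python. This is only an illustration: the function name to_1nf and the assumption that repeating values are comma-separated and listed in matching order are mine, not the slides'.

```python
# Two rows of the non-1NF table: AuName and AuPhone hold comma-separated lists.
books = [
    ("0-321-32132-1", "Balloon", "Sleepy, Snoopy, Grumpy",
     "321-321-1111, 232-234-1234, 665-235-6532",
     "Small House", "714-000-0000", "$34.00"),
    ("0-55-123456-9", "Main Street", "Jones, Smith",
     "123-333-3333, 654-223-3455",
     "Small House", "714-000-0000", "$22.95"),
]

def to_1nf(rows):
    """Split the repeating author group into its own table keyed by ISBN."""
    book_table, author_table = [], []
    for isbn, title, names, phones, pub, pub_phone, price in rows:
        # The scalar columns stay in the book table.
        book_table.append((isbn, title, pub, pub_phone, price))
        name_list = [n.strip() for n in names.split(",")]
        phone_list = [p.strip() for p in phones.split(",")]
        # Each author becomes one row; the ISBN is duplicated as the link.
        for name, phone in zip(name_list, phone_list):
            author_table.append((isbn, name, phone))
    return book_table, author_table

book_table, author_table = to_1nf(books)
```

Every cell in both result tables now holds a single value, and the ISBN column lets the author rows be joined back to their book.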


Functional Dependencies
If one set of attributes in a table determines another set of
attributes in the table, then the second set of attributes is
said to be functionally dependent on the first set of
attributes.

Example 1
Table Scheme: {ISBN, Title, Price}
Functional Dependencies: {ISBN} → {Title}
{ISBN} → {Price}

ISBN           Title         Price
0-321-32132-1  Balloon       $34.00
0-55-123456-9  Main Street   $22.95
0-123-45678-0  Ulysses       $34.00
1-22-233700-0  Visual Basic  $25.00

Functional Dependencies
Example 2
Table Scheme: {PubID, PubName, PubPhone}
Functional Dependencies: {PubID} → {PubPhone}
{PubID} → {PubName}
{PubName, PubPhone} → {PubID}

PubID  PubName      PubPhone
1      Big House    999-999-9999
2      Small House  123-456-7890
3      Alpha Press  111-111-1111

Example 3
Table Scheme: {AuID, AuName, AuPhone}
Functional Dependencies: {AuID} → {AuPhone}
{AuID} → {AuName}
{AuName, AuPhone} → {AuID}

AuID  AuName  AuPhone
1     Sleepy  321-321-1111
2     Snoopy  232-234-1234
3     Grumpy  665-235-6532
4     Jones   123-333-3333
5     Smith   654-223-3455
6     Joyce   666-666-6666
7     Roman   444-444-4444
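Whether a given set of rows is consistent with a functional dependency can be checked mechanically. The helper fd_holds below is a hypothetical sketch; note that a passing check only shows the sample data does not violate the dependency, while the dependency itself is a statement about the schema.

```python
def fd_holds(rows, lhs, rhs):
    """True if rows that agree on the lhs attributes also agree on the rhs attributes."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True

books = [
    {"ISBN": "0-321-32132-1", "Title": "Balloon", "Price": "$34.00"},
    {"ISBN": "0-55-123456-9", "Title": "Main Street", "Price": "$22.95"},
    {"ISBN": "0-123-45678-0", "Title": "Ulysses", "Price": "$34.00"},
    {"ISBN": "1-22-233700-0", "Title": "Visual Basic", "Price": "$25.00"},
]

isbn_determines_title = fd_holds(books, ["ISBN"], ["Title"])
price_determines_title = fd_holds(books, ["Price"], ["Title"])
```

Here {ISBN} → {Title} is consistent with the data, while {Price} → {Title} is not, because two different titles share the price $34.00.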
FD – Example
Database to track reviews of papers submitted to an academic
conference. Prospective authors submit papers for review and possible
acceptance in the published conference proceedings. Details of the
entities
▫ Author information includes a unique author number, a name, a mailing
address, and a unique (optional) email address.
▫ Paper information includes the primary author, the paper number, the
title, the abstract, and review status (pending, accepted, rejected)
▫ Reviewer information includes the reviewer number, the name, the
mailing address, and a unique (optional) email address
▫ A completed review includes the reviewer number, the date, the paper
number, comments to the authors, comments to the program chairperson,
and ratings (overall, originality, correctness, style, clarity)
FD – Example
Functional Dependencies
▫ AuthNo → AuthName, AuthEmail, AuthAddress
▫ AuthEmail → AuthNo
▫ PaperNo → Primary-AuthNo, Title, Abstract, Status
▫ RevNo → RevName, RevEmail, RevAddress
▫ RevEmail → RevNo
▫ RevNo, PaperNo → AuthComm, Prog-Comm, Date,
Rating1, Rating2, Rating3, Rating4, Rating5
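Given such a list, the attributes determined by any starting set can be computed with the standard attribute-closure procedure. The sketch below encodes a subset of the dependencies above as pairs of Python sets; the function name closure is my own.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left is already known, the right follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [
    ({"AuthNo"}, {"AuthName", "AuthEmail", "AuthAddress"}),
    ({"AuthEmail"}, {"AuthNo"}),
    ({"PaperNo"}, {"Primary-AuthNo", "Title", "Abstract", "Status"}),
    ({"RevNo"}, {"RevName", "RevEmail", "RevAddress"}),
    ({"RevEmail"}, {"RevNo"}),
]

from_email = closure({"AuthEmail"}, fds)
from_paper = closure({"PaperNo"}, fds)
```

Starting from AuthEmail the closure reaches every author attribute, which is why the email address can serve as an alternative key for authors.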
Second Normal Form (2NF)
For a table to be in 2NF, there are two requirements
▫ The table is in first normal form
▫ All non-key attributes in the table must be functionally dependent on the entire
primary key
Note: Remember that we are dealing with non-key attributes

Example 1 (Not 2NF)
Scheme → {Title, PubId, AuId, Price, AuAddress}
1. Key → {Title, PubId, AuId}
2. {Title, PubId, AuId} → {Price}
3. {AuId} → {AuAddress}
4. AuAddress does not belong to a key
5. AuAddress functionally depends on AuId, which is a proper subset of the key
Second Normal Form (2NF)
Example 2 (Not 2NF)
Scheme → {City, Street, HouseNumber, HouseColor, CityPopulation}
1. Key → {City, Street, HouseNumber}
2. {City, Street, HouseNumber} → {HouseColor}
3. {City} → {CityPopulation}
4. CityPopulation does not belong to any key.
5. CityPopulation is functionally dependent on City, which is a proper subset of the
key

Example 3 (Not 2NF)
Scheme → {studio, movie, budget, studio_city}
1. Key → {studio, movie}
2. {studio, movie} → {budget}
3. {studio} → {studio_city}
4. studio_city is not a part of a key
5. studio_city functionally depends on studio, which is a proper subset of the key
2NF - Decomposition
1. If a data item is fully functionally dependent on only a part of the
primary key, move that data item and that part of the primary key to a
new table.
2. If other data items are functionally dependent on the same part of the
key, place them in the new table also.
3. Make the partial primary key copied from the original table the
primary key for the new table.
Example 1 (Convert to 2NF)
Old Scheme → {Title, PubId, AuId, Price, AuAddress}
New Scheme → {Title, PubId, AuId, Price}
New Scheme → {AuId, AuAddress}
2NF - Decomposition
Example 2 (Convert to 2NF)
Old Scheme → {Studio, Movie, Budget, StudioCity}
New Scheme → {Movie, Studio, Budget}
New Scheme → {Studio, StudioCity}

Example 3 (Convert to 2NF)
Old Scheme → {City, Street, HouseNumber, HouseColor, CityPopulation}
New Scheme → {City, Street, HouseNumber, HouseColor}
New Scheme → {City, CityPopulation}
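The 2NF decomposition can be sketched as a small routine. This is an illustrative simplification: partial_dependencies and to_2nf are hypothetical names, and the code only looks at the dependencies listed explicitly rather than everything they imply.

```python
def partial_dependencies(key, fds):
    """Return (lhs, attr) pairs where a non-key attr depends on a proper subset of key."""
    key = frozenset(key)
    found = []
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        if lhs < key:  # '<' on sets means proper subset
            for attr in sorted(set(rhs) - key):
                found.append((tuple(sorted(lhs)), attr))
    return found

def to_2nf(scheme, key, fds):
    """Move partially dependent attributes into new tables keyed by their determinant."""
    moved = {}
    for lhs, attr in partial_dependencies(key, fds):
        moved.setdefault(lhs, set()).add(attr)
    moved_attrs = {a for attrs in moved.values() for a in attrs}
    remaining = set(scheme) - moved_attrs
    new_tables = [set(lhs) | attrs for lhs, attrs in moved.items()]
    return remaining, new_tables

scheme = {"City", "Street", "HouseNumber", "HouseColor", "CityPopulation"}
key = {"City", "Street", "HouseNumber"}
fds = [
    ({"City", "Street", "HouseNumber"}, {"HouseColor"}),
    ({"City"}, {"CityPopulation"}),
]
remaining, new_tables = to_2nf(scheme, key, fds)
```

For the house example this reproduces the two new schemes shown above: the original table keeps HouseColor, and CityPopulation moves to a table keyed by City.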
Third Normal Form (3NF)
This form dictates that all non-key attributes of a table must be functionally
dependent only on a candidate key, i.e. there can be no interdependencies among
non-key attributes.

For a table to be in 3NF, there are two requirements
▫ The table should be in second normal form
▫ No non-key attribute is transitively dependent on the primary key

Example (Not in 3NF)
Scheme → {Title, PubID, PageCount, Price}
1. Key → {Title, PubID}
2. {Title, PubID} → {PageCount}
3. {PageCount} → {Price}
4. Both Price and PageCount depend on the entire key, hence 2NF
5. Transitively, {Title, PubID} → {Price}, hence not in 3NF
Third Normal Form (3NF)
Example 2 (Not in 3NF)
Scheme → {Studio, StudioCity, CityTemp}
1. Primary Key → {Studio}
2. {Studio} → {StudioCity}
3. {StudioCity} → {CityTemp}
4. {Studio} → {CityTemp}
5. Both StudioCity and CityTemp depend on the entire key, hence 2NF
6. CityTemp transitively depends on Studio, hence violates 3NF

Example 3 (Not in 3NF)
Scheme → {BuildingID, Contractor, Fee}
1. Primary Key → {BuildingID}
2. {BuildingID} → {Contractor}
3. {Contractor} → {Fee}
4. {BuildingID} → {Fee}
5. Fee transitively depends on BuildingID
6. Both Contractor and Fee depend on the entire key, hence 2NF

BuildingID  Contractor  Fee
100         Randolph    1200
150         Ingersoll   1100
200         Randolph    1200
250         Pitkin      1100
300         Randolph    1200
3NF - Decomposition
1. Move all items involved in transitive dependencies to a new entity.
2. Identify a primary key for the new entity.
3. Place the primary key for the new entity as a foreign key on the
original entity.

Example 1 (Convert to 3NF)
Old Scheme → {Title, PubID, PageCount, Price}
New Scheme → {PageCount, Price}
New Scheme → {Title, PubID, PageCount}
3NF - Decomposition
Example 2 (Convert to 3NF)
Old Scheme → {Studio, StudioCity, CityTemp}
New Scheme → {Studio, StudioCity}
New Scheme → {StudioCity, CityTemp}

Example 3 (Convert to 3NF)
Old Scheme → {BuildingID, Contractor, Fee}
New Scheme → {BuildingID, Contractor}
New Scheme → {Contractor, Fee}

BuildingID  Contractor
100         Randolph
150         Ingersoll
200         Randolph
250         Pitkin
300         Randolph

Contractor  Fee
Randolph    1200
Ingersoll   1100
Pitkin      1100
Boyce-Codd Normal Form (BCNF)
• BCNF does not allow dependencies between attributes that belong to candidate keys.
• BCNF is a refinement of the third normal form that drops the restriction to non-key
attributes: the dependency rule applies to every attribute, including key attributes.
• Third normal form and BCNF are not the same if the following conditions are true:
▫ The table has two or more candidate keys
▫ At least two of the candidate keys are composed of more than one attribute
▫ The keys are not disjoint, i.e. the composite candidate keys share some attributes

Example 1 - Address (Not in BCNF)
Scheme → {City, Street, ZipCode}
1. Key1 → {City, Street}
2. Key2 → {ZipCode, Street}
3. No non-key attribute, hence 3NF
4. {City, Street} → {ZipCode}
5. {ZipCode} → {City}
6. Dependency between attributes belonging to a key
Boyce-Codd Normal Form (BCNF)
Example 2 - Movie (Not in BCNF)
Scheme → {MovieTitle, MovieID, PersonName, Role, Payment}
1. Key1 → {MovieTitle, PersonName}
2. Key2 → {MovieID, PersonName}
3. Both Role and Payment functionally depend on both candidate keys, thus 3NF
4. {MovieID} → {MovieTitle}
5. Dependency between MovieID and MovieTitle violates BCNF

Example 3 - Consulting (Not in BCNF)
Scheme → {Client, Problem, Consultant}
1. Key1 → {Client, Problem}
2. Key2 → {Client, Consultant}
3. No non-key attribute, hence 3NF
4. {Client, Problem} → {Consultant}
5. {Client, Consultant} → {Problem}
6. Dependency between attributes belonging to keys violates BCNF
BCNF - Decomposition
1. Place the two candidate primary keys in separate entities
2. Place each of the remaining data items in one of the
resulting entities according to its dependency on the
primary key.
Example 1 (Convert to BCNF)
Old Scheme → {City, Street, ZipCode}
New Scheme1 → {ZipCode, Street}
New Scheme2 → {City, Street}
• Loss of relation {ZipCode} → {City}
Alternate New Scheme1 → {ZipCode, Street}
Alternate New Scheme2 → {ZipCode, City}
Decomposition – Loss of Information
1. If a decomposition does not cause any loss of information it is called a
lossless decomposition.
2. If a decomposition does not cause any dependencies to be lost it is
called a dependency-preserving decomposition.
3. Any table scheme can be decomposed in a lossless way into a
collection of smaller schemas that are in BCNF. However,
dependency preservation is not guaranteed.
4. Any table can be decomposed in a lossless way into third normal form
while also preserving the dependencies.
• 3NF may be better than BCNF in some cases

Use your own judgment when decomposing schemas
BCNF - Decomposition
Example 2 (Convert to BCNF)
Old Scheme → {MovieTitle, MovieID, PersonName, Role, Payment}
New Scheme → {MovieID, PersonName, Role, Payment}
New Scheme → {MovieTitle, PersonName}
• Loss of relation {MovieID} → {MovieTitle}
New Scheme → {MovieID, PersonName, Role, Payment}
New Scheme → {MovieID, MovieTitle}
• We got the {MovieID} → {MovieTitle} relationship back

Example 3 (Convert to BCNF)
Old Scheme → {Client, Problem, Consultant}
New Scheme → {Client, Consultant}
New Scheme → {Client, Problem}
Fourth Normal Form (4NF)
• Fourth normal form eliminates independent multi-valued dependencies
between columns.
• To be in Fourth Normal Form,
▫ a relation must first be in Boyce-Codd Normal Form.
▫ a given relation may not contain more than one multi-valued attribute.

Example (Not in 4NF)
Scheme → {MovieName, ScreeningCity, Genre}
Primary Key: {MovieName, ScreeningCity, Genre}
1. All columns are a part of the only candidate key, hence BCNF
2. Many movies can have the same genre
3. Many cities can have the same movie
4. Violates 4NF

Movie             ScreeningCity  Genre
Hard Code         Los Angeles    Comedy
Hard Code         New York       Comedy
Bill Durham       Santa Cruz     Drama
Bill Durham       Durham         Drama
The Code Warrier  New York       Horror
Fourth Normal Form (4NF)
Example 2 (Not in 4NF)
Scheme → {Manager, Child, Employee}
1. Primary Key → {Manager, Child, Employee}
2. Each manager can have more than one child
3. Each manager can supervise more than one employee
4. 4NF violated

Manager  Child  Employee
Jim      Beth   Alice
Mary     Bob    Jane
Mary     NULL   Adam

Example 3 (Not in 4NF)
Scheme → {Employee, Skill, ForeignLanguage}
1. Primary Key → {Employee, Skill, ForeignLanguage}
2. Each employee can speak multiple languages
3. Each employee can have multiple skills
4. Thus violates 4NF

Employee  Skill      Language
1234      Cooking    French
1234      Cooking    German
1453      Carpentry  Spanish
1453      Cooking    Spanish
2345      Cooking    Spanish
4NF - Decomposition
1. Move the two multi-valued relations to separate tables
2. Identify a primary key for each new entity.

Example 1 (Convert to 4NF)
Old Scheme → {MovieName, ScreeningCity, Genre}
New Scheme → {MovieName, ScreeningCity}
New Scheme → {MovieName, Genre}

Movie             Genre
Hard Code         Comedy
Bill Durham       Drama
The Code Warrier  Horror

Movie             ScreeningCity
Hard Code         Los Angeles
Hard Code         New York
Bill Durham       Santa Cruz
Bill Durham       Durham
The Code Warrier  New York
4NF - Decomposition
Example 2 (Convert to 4NF)
Old Scheme → {Manager, Child, Employee}
New Scheme → {Manager, Child}
New Scheme → {Manager, Employee}

Manager  Child
Jim      Beth
Mary     Bob

Manager  Employee
Jim      Alice
Mary     Jane
Mary     Adam

Example 3 (Convert to 4NF)
Old Scheme → {Employee, Skill, ForeignLanguage}
New Scheme → {Employee, Skill}
New Scheme → {Employee, ForeignLanguage}

Employee  Skill
1234      Cooking
1453      Carpentry
1453      Cooking
2345      Cooking

Employee  Language
1234      French
1234      German
1453      Spanish
2345      Spanish
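The effect of the Example 3 decomposition can be checked with plain Python sets. This is a minimal sketch: the rows come from the example table, while the extra "Baking" skill is invented to show why the single table is redundant.

```python
# Rows of the original {Employee, Skill, ForeignLanguage} table.
rows = {
    (1234, "Cooking", "French"),
    (1234, "Cooking", "German"),
    (1453, "Carpentry", "Spanish"),
    (1453, "Cooking", "Spanish"),
    (2345, "Cooking", "Spanish"),
}

# The two 4NF tables are projections of the original.
skills = {(e, s) for e, s, _ in rows}
languages = {(e, l) for e, _, l in rows}

def rejoin(skill_rows, language_rows):
    """Natural join of the two projections on Employee."""
    return {(e, s, l)
            for e, s in skill_rows
            for e2, l in language_rows
            if e == e2}

rejoined = rejoin(skills, languages)

# Adding one skill (hypothetical) is a single row in the 4NF design...
skills_after = skills | {(1234, "Baking")}
# ...but corresponds to one row per language in the single-table design.
rows_after = rejoin(skills_after, languages)
```

The join of the two projections returns exactly the original five rows, so nothing is lost, and recording one new skill for employee 1234 takes one row instead of two.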
Fifth Normal Form (5NF)
• Fifth normal form is satisfied when all tables are broken
into as many tables as possible in order to avoid
redundancy, without losing information. Once a relation is in
fifth normal form it cannot be broken into smaller relations
without changing the facts or the meaning.

Domain Key Normal Form (DKNF)
• A relation is in DKNF when every constraint on it follows from
its domain and key constraints, so that there can be no insertion or
deletion anomalies in the database.
Codd proposed thirteen rules (numbered zero to twelve) and
said that if a Database Management System meets these rules, it
can be called a Relational Database Management System.
These rules are known as Codd's 12 rules. Hardly any commercial
product follows all of them.

Refer to the notes section for a short explanation.


Rule Zero
• The system must qualify as relational, as a
database, and as a management system.
• For a system to qualify as a relational database
management system (RDBMS), that system
must use its relational facilities (exclusively) to
manage the database.
• Rule 1 : The information rule:
▫ All information in the database is to be represented in
one and only one way, namely by values in column
positions within rows of tables.
• Rule 2 : The guaranteed access rule:
▫ All data must be accessible.
▫ This rule is essentially a restatement of the
fundamental requirement for primary keys.
▫ It says that every individual scalar value in the
database must be logically addressable by specifying
the name of the containing table, the name of the
containing column and the primary key value of the
containing row.
• Rule 5 : The comprehensive data sub language rule:
▫ The system must support at least one relational language that
1. Has a linear syntax
2. Can be used both interactively and within application programs,
3. Supports data definition operations (including view definitions),
data manipulation operations (update as well as retrieval), security
and integrity constraints, and transaction management operations
(begin, commit, and rollback).
• Rule 6 : The view updating rule:
▫ All views that are theoretically updatable must also be updatable by the
system.
• Rule 7 : High-level insert, update, and delete:
▫ The system must support set-at-a-time insert, update, and delete operators.
▫ This means that data can be retrieved from a relational database in sets
constructed of data from multiple rows and/or multiple tables.
▫ This rule states that insert, update, and delete operations should be supported
for any retrievable set rather than just for a single row in a single table.
• Rule 8 : Physical data independence:
▫ Changes to the physical level (how the data is stored, whether in arrays
or linked lists etc.) must not require a change to an application based
on the structure.

• Rule 9 : Logical data independence:


▫ Changes to the logical level (tables, columns, rows, and so on) must not
require a change to an application based on the structure.
▫ Logical data independence is more difficult to achieve than physical
data independence.

• Rule 10 : Integrity independence:


▫ Integrity constraints must be specified separately from application
programs and stored in the catalog.
▫ It must be possible to change such constraints as and when appropriate
without unnecessarily affecting existing applications.
• Rule 11 : Distribution independence:
▫ The distribution of portions of the database to various locations should
be invisible to users of the database.
▫ Existing applications should continue to operate successfully:
1. when a distributed version of the DBMS is first introduced; and
2. when existing distributed data are redistributed around the
system.

• Rule 12 : The nonsubversion rule:
▫ If the system provides a low-level (record-at-a-time) interface, then
that interface cannot be used to subvert the system,
◦ for example, by bypassing a relational security or integrity constraint.
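Two of the rules are easy to observe in a small experiment. The sketch below assumes SQLite through Python's sqlite3 module as a stand-in for an RDBMS; the Student table, its CHECK constraint and the sample rows are illustrative, and the code makes no claim about how fully SQLite satisfies Codd's rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Rule 10: the integrity constraint is declared with the table, not in the application.
conn.execute("""
CREATE TABLE Student (
    SID  INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    GPA  REAL CHECK (GPA BETWEEN 0.0 AND 4.0)
)""")
conn.execute("INSERT INTO Student VALUES (1234, 'John', 2.8)")

try:
    conn.execute("INSERT INTO Student VALUES (9999, 'Bart', -4.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the DBMS itself refuses the invalid GPA

# The constraint text is stored in SQLite's catalog table, sqlite_master.
(ddl,) = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'Student'").fetchone()
stored_in_catalog = "CHECK" in ddl

# Rule 7: a set-at-a-time update changes every qualifying row in one statement.
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)",
                 [(5678, "Mary", 3.6), (4321, "Lee", 3.9)])
updated = conn.execute(
    "UPDATE Student SET GPA = GPA - 0.5 WHERE GPA > 3.0").rowcount
```

The invalid row is refused without any checking code in the application, and a single UPDATE statement modifies both qualifying rows.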
