AGENDA
Data: an Introduction
• Data: raw facts.
Data is raw, unorganized facts that need to be processed. Data can seem simple, random, and useless until it is organized.
Do you find yourself entering the same information into multiple
spreadsheets/reports/documents?
When you change one spreadsheet/report/document, are you forced to
make the same change in others?
Do you have a large amount of data that is becoming larger and unmanageable?
Do several people in your organization have the need to view your data at the same time?
Are you tracking related information in several spreadsheets – such as separate sheets for
sales for different departments or different geographical locations?
When viewing your information, are you constantly scrolling on your screen to view it
all? Or do you have a difficult time viewing the specific sets of data that you want?
CHALLENGES WITHOUT A DBMS
DATABASE MANAGEMENT SYSTEM
• DBMS
• A collection of programs that enables you to store, modify, and extract information
from a database
• A piece of software that provides services for accessing a database
• Why DBMS
• A secure and survivable medium for the storage and retrieval of data.
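As a minimal sketch of the store, modify, and extract cycle, the snippet below uses Python's built-in sqlite3 module as a stand-in DBMS. The table name customer, its columns, and the sample values are hypothetical and chosen only for illustration.

```python
import sqlite3

# An in-memory SQLite database stands in for the DBMS (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# Store: insert a row.
conn.execute("INSERT INTO customer (id, name, city) VALUES (?, ?, ?)", (1, "Asha", "Pune"))

# Modify: change the stored value in one place.
conn.execute("UPDATE customer SET city = ? WHERE id = ?", ("Mumbai", 1))
conn.commit()

# Extract: query the data back.
rows = conn.execute("SELECT id, name, city FROM customer").fetchall()
print(rows)
```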
BENEFITS OF A DATABASE
[Figure: several user applications, each linked to a shared data file through JDBC/ODBC connectivity]
The hierarchical model represents data in a tree-like structure. A child record is associated with
a single parent. To maintain order, a sort field keeps sibling nodes in a recorded
sequence. This model was used by early database management systems, such as IMS databases
on mainframes.
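A rough sketch of the idea, assuming a simple in-memory tree: each record has at most one parent, and a sort field keeps siblings in order. The class name Record and the sample record names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Record:
    name: str
    sort_key: int  # the sort field that orders sibling records
    # At most one parent per record; excluded from repr/compare to avoid cycles.
    parent: Optional["Record"] = field(default=None, repr=False, compare=False)
    children: List["Record"] = field(default_factory=list)

    def add_child(self, child: "Record") -> None:
        if child.parent is not None:
            raise ValueError("a child record can have only one parent")
        child.parent = self
        self.children.append(child)
        self.children.sort(key=lambda r: r.sort_key)  # keep siblings ordered

root = Record("Department", 0)
root.add_child(Record("Employee B", 2))
root.add_child(Record("Employee A", 1))
sibling_names = [c.name for c in root.children]
print(sibling_names)
```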
DATA MODELS
• In the network database model, a child can be linked to multiple parents, a feature that is not supported by
the hierarchical data model. The parent nodes are known as owners and the child nodes are called
members.
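The owner/member idea can be sketched with two lookup tables, one in each direction. This is only an illustration: the names members_of, owners_of, and link, and the sample owners and members, are hypothetical.

```python
from collections import defaultdict

# owner -> set of members, and member -> set of owners
members_of = defaultdict(set)
owners_of = defaultdict(set)

def link(owner: str, member: str) -> None:
    """Record that `member` belongs to a set owned by `owner`."""
    members_of[owner].add(member)
    owners_of[member].add(owner)

link("Supplier S1", "Part P1")
link("Project J1", "Part P1")  # the same member gets a second owner

part_owners = sorted(owners_of["Part P1"])
print(part_owners)
```

Unlike the hierarchical sketch, nothing prevents a member from having more than one owner.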
DATABASE MODELS
• Data models define how the logical structure of a database is modeled. They
are the fundamental means of introducing abstraction in a DBMS. Data
models define how data items are connected to each other and how they are processed
and stored inside the system.
• The Entity-Relationship (ER) model is based on the notion of real-world entities
and the relationships among them. It is built from entities, their attributes, and the
relationships among entities.
• Entity: a real-world object having properties called attributes.
Every attribute is defined by its set of permitted values, called its domain.
• Relationship: the logical association among entities.
Relationships are mapped to entities in various ways. Mapping cardinalities define
the number of associations between two entities.
• Entity Cardinality
• 1:1 : Customer -> Aadhaar Card
• 1:m : Customer -> Sales
• m:n : Student <-> Subject
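One common way to map these cardinalities onto tables is sketched below with Python's sqlite3 module: a UNIQUE foreign key for one-to-one, a plain foreign key for one-to-many, and a junction table for many-to-many. All table and column names here are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);

-- 1:1  each customer has at most one card, and each card belongs to one customer
CREATE TABLE aadhaar_card (
    card_no TEXT PRIMARY KEY,
    cust_id INTEGER NOT NULL UNIQUE REFERENCES customer(cust_id)
);

-- 1:m  one customer, many sales
CREATE TABLE sales (
    sale_id INTEGER PRIMARY KEY,
    cust_id INTEGER NOT NULL REFERENCES customer(cust_id)
);

-- m:n  a junction table pairs students with subjects
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE subject (subject_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE enrolment (
    student_id INTEGER REFERENCES student(student_id),
    subject_id INTEGER REFERENCES subject(subject_id),
    PRIMARY KEY (student_id, subject_id)
);
""")
conn.execute("INSERT INTO customer VALUES (1, 'Asha')")
conn.execute("INSERT INTO aadhaar_card VALUES ('card-001', 1)")
conn.execute("INSERT INTO sales VALUES (10, 1)")
conn.execute("INSERT INTO sales VALUES (11, 1)")  # a second sale is fine (1:m)

try:
    conn.execute("INSERT INTO aadhaar_card VALUES ('card-002', 1)")  # a second card
    second_card_allowed = True
except sqlite3.IntegrityError:
    second_card_allowed = False  # UNIQUE on cust_id enforces 1:1

sales_count = conn.execute("SELECT COUNT(*) FROM sales WHERE cust_id = 1").fetchone()[0]
print(second_card_allowed, sales_count)
```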
ER DIAGRAM
DATABASE MODELS
• The relational model is the most popular data model in DBMS. Data is stored in a tabular format and is defined
as an n-ary relation. Each column of the table is an attribute.
• Advent of Databases
• Data was collated and stored in ledgers, in manual mode.
• Data was collated and stored in electronic Excel sheets and files, in computer mode.
• Database: an organized collection of data that is stored and accessed electronically.
• Database Management System: software for creating and managing databases.

File Based                                  DBMS
Access is only physical                     Physical as well as logical access
Predetermined access to data                Flexible access to data (SQL)
Only one user at any point in time          Concurrent users
Redundancy permitted                        Redundancy controlled
Data is isolated                            Unauthorized access restricted
                                            Backup and recovery process
ACID PROPERTIES
Example:
• If you are storing bank accounts that relate to bank
customers, it should not be possible to create an account for
a customer who does not exist, and it should not be possible
to delete a customer from the customers table if there are
still accounts referring to them in the accounts table.
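The bank example can be tried out with a foreign key constraint. The sketch below uses Python's sqlite3 module; the column names and sample values are assumptions, and SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE accounts (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id))""")

conn.execute("INSERT INTO customers VALUES (1, 'Asha')")
conn.execute("INSERT INTO accounts VALUES (100, 1)")

def rejected(sql: str) -> bool:
    """Return True if the DBMS refuses the statement as inconsistent."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# An account for a customer who does not exist.
orphan_account_rejected = rejected("INSERT INTO accounts VALUES (101, 999)")
# Deleting a customer who still has accounts.
delete_customer_rejected = rejected("DELETE FROM customers WHERE id = 1")
print(orphan_account_rejected, delete_customer_rejected)
```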
ISOLATION
• Isolation means that transactions do not affect each other while they are
running. Each transaction should be able to view the world as though it were the
only one reading and altering things. In practice this is not usually the case, but
locks are used to achieve the illusion.
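A small experiment, assuming SQLite in its default journal mode and two connections to the same temporary file: the reader does not see the writer's row until the writer commits. The file name bank.db and the table layout are hypothetical.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bank.db")
    writer = sqlite3.connect(path)
    reader = sqlite3.connect(path)

    writer.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    writer.commit()

    # The writer starts a transaction but has not committed yet.
    writer.execute("INSERT INTO accounts VALUES (1, 500)")
    seen_before_commit = reader.execute(
        "SELECT COUNT(*) FROM accounts").fetchall()[0][0]

    writer.commit()
    seen_after_commit = reader.execute(
        "SELECT COUNT(*) FROM accounts").fetchall()[0][0]

    writer.close()
    reader.close()

print(seen_before_commit, seen_after_commit)
```

Other database systems offer several isolation levels, so the behaviour shown here should be read as one instance of the idea and not as a general rule.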
DURABILITY
• Centralized
• Client-Server
• N-Tier
• Distributed
DATABASE ARCHITECTURE
NORMALIZATION
[Figure: normal forms and their requirements]
• First normal form: functional dependency of nonkey attributes on the primary key; atomic values only
• Second normal form: full functional dependency of nonkey attributes on the primary key
• Third normal form: no transitive dependency between nonkey attributes
• Boyce-Codd and higher: all determinants are candidate keys; single multivalued dependency
FUNCTIONAL DEPENDENCIES
• Functional dependencies (FDs) are used to specify formal measures of the "goodness" of relational designs
• FDs and keys are used to define normal forms for relations
• FDs are constraints that are derived from the meaning and interrelationships of the data attributes
• A set of attributes X functionally determines a set of attributes Y if the value of X determines a
unique value for Y
• X -> Y holds if whenever two tuples have the same value for X, they must have the same value
for Y
If t1[X] = t2[X], then t1[Y] = t2[Y] in any relation instance r(R)
• X -> Y in R specifies a constraint on all relation instances r(R)
• FDs are derived from the real-world constraints on the attributes
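The definition translates directly into a check over one relation instance. The sketch below is illustrative only: the function name fd_holds and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, Sequence, Tuple

Row = Dict[str, str]

def fd_holds(rows: Iterable[Row], x: Sequence[str], y: Sequence[str]) -> bool:
    """Return True if X -> Y holds in this relation instance."""
    seen: Dict[Tuple[str, ...], Tuple[str, ...]] = {}
    for t in rows:
        x_val = tuple(t[a] for a in x)
        y_val = tuple(t[a] for a in y)
        # If t1[X] = t2[X], then t1[Y] must equal t2[Y].
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True

# Hypothetical instance, for illustration only.
r = [
    {"emp_id": "E1", "dept": "Sales", "city": "Pune"},
    {"emp_id": "E2", "dept": "Sales", "city": "Pune"},
    {"emp_id": "E3", "dept": "HR", "city": "Delhi"},
]
dept_determines_city = fd_holds(r, ["dept"], ["city"])
city_determines_emp = fd_holds(r, ["city"], ["emp_id"])
print(dept_determines_city, city_determines_emp)
```

A check like this can only refute an FD. Because an FD is a constraint on all instances of R, passing on one instance does not prove that it holds in general.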
EXAMPLES OF FUNCTIONAL DEPENDENCY
• To be in First Normal Form, a relation must contain only atomic values in
each row and column.
• No repeating groups
• A column or set of columns is called a candidate key when its values uniquely
identify each row in the relation
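A sketch of removing a repeating group, assuming rows held as Python dictionaries. The attribute names, the sample patients, and the function to_first_normal_form are hypothetical.

```python
# Hypothetical un-normalized rows: "phones" holds a repeating group.
unnormalized = [
    {"patient_id": 1, "name": "A. Kumar", "phones": ["555-0101", "555-0102"]},
    {"patient_id": 2, "name": "B. Singh", "phones": ["555-0199"]},
]

def to_first_normal_form(rows):
    """Emit one row per (patient, phone) so every cell holds a single atomic value."""
    flat = []
    for row in rows:
        for phone in row["phones"]:
            flat.append({"patient_id": row["patient_id"],
                         "name": row["name"],
                         "phone": phone})
    return flat

first_nf = to_first_normal_form(unnormalized)

# (patient_id, phone) acts as a candidate key here: its values identify each row.
keys = [(r["patient_id"], r["phone"]) for r in first_nf]
keys_are_unique = len(set(keys)) == len(keys)
print(len(first_nf), keys_are_unique)
```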
FIRST NORMAL FORM
• Insertion: A new patient who has not yet undergone surgery has no surgeon #.
Since surgeon # is part of the key, the patient cannot be inserted.
• Insertion: If a surgeon is newly hired and has not operated yet, there is no
way to include that person in the database.
• Update: If a patient comes in for a new procedure, and has moved, we need to
change multiple address entries.
• Deletion (type 1): Deleting a patient record may also delete all info about a
surgeon.
• Deletion (type 2): When there are functional dependencies (like side effects and
drug) changing one item eliminates other information.
SECOND NORMAL FORM
• Deletion (type 1): If Charles Brown dies, the corresponding tuples from the Patient and
Surgery tables can be deleted without losing information on David Rosen.
• Update: If John White comes in for a third time, and has moved, we only need to change
the Patient table.
2NF ANOMALIES
• Insertion: Cannot enter the fact that a particular drug has a particular side effect
unless it is given to a patient.
• Deletion: If John White receives some other drug because of the penicillin rash,
and a new drug and side effect are entered, we lose the information that
penicillin can cause a rash.
• Update: If drug side effects change (a new formula) we have to update multiple
occurrences of side effects.
THIRD NORMAL FORM
• Insertion: We can now enter the fact that a particular drug has a particular side
effect in the Drug relation.
• Deletion: If John White receives some other drug as a result of the rash from
penicillin, the information on penicillin and rash is maintained.
• Update: The side effects for each drug appear only once.
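These three points can be tried out on a small schema. The sketch below assumes a drug table and a prescription table in SQLite; the table and column names, the second drug ("Drug X"), and the changed side-effect text are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drug (name TEXT PRIMARY KEY, side_effect TEXT)")
conn.execute("CREATE TABLE prescription (patient TEXT, drug TEXT REFERENCES drug(name))")

# Insertion: a drug and its side effect can be recorded before any patient receives it.
conn.execute("INSERT INTO drug VALUES ('Penicillin', 'Rash')")
conn.execute("INSERT INTO drug VALUES ('Drug X', 'hypothetical effect')")

conn.execute("INSERT INTO prescription VALUES ('John White', 'Penicillin')")

# Deletion: switching the patient to another drug leaves the Penicillin row untouched.
conn.execute("UPDATE prescription SET drug = 'Drug X' WHERE patient = 'John White'")
penicillin_before = conn.execute(
    "SELECT side_effect FROM drug WHERE name = 'Penicillin'").fetchone()

# Update: a changed side effect is edited in exactly one row.
updated = conn.execute(
    "UPDATE drug SET side_effect = 'Mild rash' WHERE name = 'Penicillin'").rowcount
penicillin_after = conn.execute(
    "SELECT side_effect FROM drug WHERE name = 'Penicillin'").fetchone()
print(penicillin_before, updated, penicillin_after)
```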
BOYCE-CODD NORMAL FORM
Student     Teacher     Subject
Jhansi      P.Naresh    Database
Ram         K.Das       C
Lakshman    P.Naresh    Database
Jhansi      R.Prasad    C
Mathew      Azhar       Networks
Ram         Azhar       Networks

The table is not in BCNF because, in the FD teacher -> subject, teacher is not a candidate key. The relation
therefore suffers from anomalies: if we delete the student Jhansi, the fact that R.Prasad teaches C is
also lost.

If X -> Y violates BCNF, divide R into R1(X, Y) and R2(R - Y). Here teacher -> subject is the violating
FD, so R is split into R1(Teacher, Subject) and R2(Student, Teacher).

R2:
Student     Teacher
Jhansi      P.Naresh
Ram         K.Das
Lakshman    P.Naresh
Jhansi      R.Prasad
Mathew      Azhar
Ram         Azhar

R1:
Teacher     Subject
P.Naresh    Database
K.Das       C
R.Prasad    C
Azhar       Networks
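The decomposition rule can be checked on the Student/Teacher/Subject data with plain Python sets. The variable names are illustrative; the rows are the ones from the example.

```python
# The Student/Teacher/Subject relation from the example.
R = {
    ("Jhansi", "P.Naresh", "Database"),
    ("Ram", "K.Das", "C"),
    ("Lakshman", "P.Naresh", "Database"),
    ("Jhansi", "R.Prasad", "C"),
    ("Mathew", "Azhar", "Networks"),
    ("Ram", "Azhar", "Networks"),
}

# Teacher -> Subject violates BCNF, so split R into R1(X, Y) and R2(R - Y).
R1 = {(teacher, subject) for _, teacher, subject in R}  # R1(Teacher, Subject)
R2 = {(student, teacher) for student, teacher, _ in R}  # R2(Student, Teacher)

# A natural join on Teacher rebuilds the original relation (lossless decomposition).
rejoined = {(s, t, subj) for s, t in R2 for t2, subj in R1 if t == t2}

# Deleting Jhansi now removes rows from R2 only; (R.Prasad, C) stays in R1.
R2_without_jhansi = {(s, t) for s, t in R2 if s != "Jhansi"}
print(rejoined == R, ("R.Prasad", "C") in R1, len(R1))
```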
CHALLENGES ARISING OUT OF NORMALIZING
Order               Order
Order No            Order No
Date Taken          Date Taken
Date Dispatched     Date Dispatched
Date Invoiced       Date Invoiced
Cust ID             Cust ID
                    Cust Name
UPWARD DENORMALIZATION
Before:             After:

Order               Order
Order No            Order No
Date Taken          Date Taken
Date Dispatched     Date Dispatched
Date Invoiced       Date Invoiced
Cust ID             Cust ID
Cust Name           Cust Name
                    Order Price

Order Item          Order Item
Order No            Order No
Item No             Item No
Item Price          Item Price
Num Ordered         Num Ordered
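A sketch of the trade-off, assuming Order Price is the sum of Item Price times Num Ordered over the order's items (the slide does not state the formula). The table and column names and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_no INTEGER PRIMARY KEY, cust_id INTEGER, order_price REAL);
CREATE TABLE order_item (order_no INTEGER, item_no INTEGER,
                         item_price REAL, num_ordered INTEGER);
INSERT INTO orders VALUES (1, 42, NULL);
INSERT INTO order_item VALUES (1, 1, 10.0, 2);
INSERT INTO order_item VALUES (1, 2, 5.0, 3);
""")

# Normalized read: the total is derived from the child rows every time.
derived = conn.execute(
    "SELECT SUM(item_price * num_ordered) FROM order_item WHERE order_no = 1"
).fetchone()[0]

# Upward denormalization: store the derived total on the parent row.
conn.execute("UPDATE orders SET order_price = ? WHERE order_no = 1", (derived,))
stored = conn.execute("SELECT order_price FROM orders WHERE order_no = 1").fetchone()[0]
print(derived, stored)
```

Reading the stored column avoids the aggregate over Order Item, but the stored value has to be refreshed whenever an item row changes.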
PROS & CONS
Normalized                                  De-Normalized
Smaller tables                              Large table
Current transactional data                  Historical data
Quick insert/update                         Slow insert/update
Reports need multiple joins and take time   Quick reports with fewer joins
Smaller DB size                             Very large databases
OLTP                                        OLAP
Oracle 19C Architecture