0% found this document useful (0 votes)
50 views102 pages

Understanding Database Concepts and Applications

Uploaded by

kdont1472
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
50 views102 pages

Understanding Database Concepts and Applications

Uploaded by

kdont1472
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Definitions of Database

q Def 1: Database is an organized collection of


logically related data
q Def 2: A database is a shared collection of logically
related data that is stored to meet the requirements of
different users of an organization
q Def 3: A database is a self-describing collection of
integrated records
q Def 4: A database models a particular real world
system in the computer in the form of data

Chapter 1
Examples of Database Applications
q Purchases from the supermarket
q Purchases using your credit card
q Booking a holiday at the travel agents
q Using the local library
q Taking out insurance
q Renting a video
q Using the Internet
q Studying at university

Chapter 1
Definitions
q Data: stored representations of meaningful objects and
events or
q Referred to facts concerning objects and events that
could be recorded and stored on computer media
q Structured: numbers, text, dates

q Unstructured: images, video, documents

q Information: data processed to increase knowledge in


the person using the data
q Metadata: data that describes the properties and
context of user data
Chapter 1 3
What is a Database
q Shared collection of logically related data (and a
description of this data), designed to meet the
information needs of an organization.
q System catalog (metadata) provides description of
data to enable program–data independence.
q Logically related data comprises entities,
attributes, and relationships of an organization’s
information.

Chapter 1 4
Figure 1-1a Data in Context

Context helps users understand data


Chapter 1
Descriptions of the properties or characteristics of the data,
including data types, field sizes, allowable values, and data
context

Chapter 1
Components of Database

Chapter 1
A bit of History
q Computer initially used for computational/ engineering
purposes
q Commercial applications introduced File Processing
System

Chapter 1 8
File Processing System
q File system is a method of organising the files with
a hard disk or other medium of storage.
q It is compatible with different file types, such as
mp3, doc, txt, mp4,etc and these are also grouped
into directories.
q Examples are NTFS or the New Technology File
System and EXT, the Extended File System.
q A file system is a more unstructured data store for
storing arbitrary, probably unrelated data.

Chapter 1 9
File Processing Systems

Library Examination Registration


Reg_Number Reg_Number Reg_Number
Name Name Name
Father Name Address Father Name
Books Issued Class Phone
Fine Semester Address
Grade Class

Chapter 1 10
Disadvantages of File Processing
q Program-Data Dependence
q File structure is defined in the program code where change in format
requires change in all the program associated with those files.

q Duplication of Data (Data Redundancy)


q Different systems/programs have separate copies of the same data
¨ Same data is held by different programs.
¨ Wasted space and potentially different values and/or different formats for
the same item.
q Limited Data Sharing
q No centralized control of data
q Programs are written in different languages, and so cannot easily access
each other’s files.

Chapter 1
Disadvantages of File Processing
q Lengthy Development Times
q Programmers must design their own file formats
q Excessive Program Maintenance
q Vulnerable to Inconsistency
q Change in one table need changes in corresponding tables as well
otherwise data will be inconsistent

Chapter 1 12
SOLUTION:
The DATABASE Approach
q Central repository of shared data
q Data is managed by a controlling agent
q Stored in a standardized, convenient form

This requires a
Database and Database Management System (DBMS)

Chapter 1
Library Examination Registration

Library Examination Registration


Applications Applications Applications

Database
Management
System

- Data Sharing - Data Independence


- Controlled Redundancy University - Better Data Integrity
Students
Database
Chapter 1 14
Database Management System
q A software system that is used to create, maintain,
and provide controlled access to users of a database
q (Database) application program: A computer program
that interacts with database by issuing an appropriate
request (SQL statement) to the DBMS

Chapter 1
Database Management System

DBMS manages data resources like an operating system manages hardware resources

Chapter 1
Advantage of DBMS
n Redundancy problem can be solved
n Has a very high security level
n Presence of Data integrity
n Support multiple users
n Avoidance of inconsistency
n Shared data
n Provide backup of data

Chapter 1
Disadvantages of DBMS
n Complexity
n Size
n Performance
n Higher impact of a failure
n Cost of DBMS

Chapter 1
Objective of DBMS
Data Availability:
DBMS ensures that the format of data should be meaningful and at a
reasonable cost so that a wide range of users can access it.

Data Organization and Structuring:


DBMS structures and efficiently organize the data to eliminate data
complexity and provide a framework for relationships between different
data entities.

Data Security and Authorization:


To maintain the authority of data and protect it from unauthorized
access, DBMS validates the user and gives permissions to only
authorized users to view, modify, and delete

Chapter 1
Types of DBMS
Hierarchical database
Relationships between records form a hierarchy or tree like structure.

Network Database
Allows many-to-many relationship among records. That is, the network
model can access a data element by following one of several paths,
because any data element or record can be related to any number of
other data elements.

Relational Database
Data elements within the database are stored in the form of simple
tables. Tables are related if they contain common fields.

Chapter 1
Examples of Database
Management System

n MySQL
n Oracle
n MS-Access
n SQLite

Chapter 1
Hierarchy of Data
Data are the principal resources of an organization. Data stored in computer
systems form a hierarchy extending from a single bit to a database, the major
record-keeping entity of a firm. Each higher rung of this hierarchy is organized
from the components below it.

n Data are logically organized into:

n 1. Bits (characters)

n 2. Fields

n 3. Records

n 4. Files

n 5. Databases
Chapter 1
Chapter 1
n Bit (Character) - a bit is the smallest unit of data representation (value of a
bit may be a 0 or 1). Eight bits make a byte which can represent a character
or a special symbol in a character code.

n Field - a field consists of a grouping of characters. A data field represents


an attribute (a characteristic or quality) of some entity (object, person, place,
or event).

n Record - a record represents a collection of attributes that describe a real-


world entity. A record consists of fields, with each field describing an
attribute of the entity.

n File - a group of related records. Files are frequently classified by the


application for which they are primarily used (employee file). A primary key
in a file is the field (or fields) whose value identifies a record among others
in a data file.

n Database -maintained in a table

Chapter 1
Relational Database (RDBMS)
A relational database is a type of database. It uses a structure that
allows us to identify and access data in relation to another piece of data
in the database. Often, data in a relational database is organized in

Table
Tables are essential objects in a database because they hold all the
information or data to tables.

Chapter 1
Chapter 1
Fields
n Besides the rows, you can see the data is separated into four
columns, or fields. A field is a column in our example of a data table
containing specific information about every record in the table.

Chapter 1
Records
n All four data values make up one record. It corresponds to a row of
the table.

Chapter 1
Data Values
Each of the four elements has a specific meaning. We call each one a data
value.

Chapter 1
Keys in DBMS
n A key in DBMS is an attribute or a set of
attributes that help to uniquely identify a
tuple (or row) in a relation (or table).
n Keys are also used to establish
relationships between the different tables
and columns of a relational database.
n Individual values in a key are called key
values.
Chapter 1
Why are the Keys Required?
n A key is used in the definitions of various kinds of
integrity constraints. A table in a database represents a
collection of records or events for a particular relation.
Now there can be thousands and thousands of such
records, some of which may be duplicated.
n There should be a way to identify each record separately
and uniquely, i.e. no duplicates. Keys allow us to be free
from this hassle.

Chapter 1
Types of Keys in DBMS
n Primary Key
n Candidate Key
n Super Key
n Foreign Key
n Composite Key

Chapter 1
Primary Key
n Primary key is a column of a table or a set
of columns that helps to identify every
record present in that table uniquely.
n There can be only one primary Key in a
table.
n Also, the primary Key cannot have the
null and same values repeating for any
row.
Chapter 1
Chapter 1
Candidate Key
n Candidate keys are those attributes that
uniquely identify rows of a table.
n The Primary Key of a table is selected from one
of the candidate keys.
n So, candidate keys have the same properties as
the primary keys explained above.
n There can be more than one candidate keys in a
table.

Chapter 1
Chapter 1
Super Key
n Super Key is the set of all the keys which help to
identify rows in a table uniquely.
n Various super keys together makes the criteria
to select the candidate keys.
n In a relation, number of super keys are more
than number of primary keys.
n Super key’s attributes can contain NULL values.

Chapter 1
Chapter 1
Foreign Key
n Foreign Key is used to establish
relationships between two tables.
n A foreign key will require each value in a
column or set of columns to match the
Primary Key of the referential table.
n Foreign keys help to maintain data and
referential integrity.

Chapter 1
In this Key in DBMS example, we have
two tables: instructor and department at
a school. However, it is hard to
distinguish which instructor is assigned
to which division.

We can link the two tables in this table


by adding the Foreign Key in Deptcode
Chapter 1 to the Teacher name.
Chapter 1
Chapter 1
Composite Key
n A composite Key is a set of two or more
attributes that help identify each tuple in a
table uniquely.
n The attributes in the set may not be
unique when considered separately.
However, when taken all together, they will
ensure uniqueness.
n The ‘concatenated key’ is another name
for 1a composite key.
Chapter
• StudentID alone does not uniquely identify a record because a student
can register for multiple courses.
• CourseID alone does not uniquely identify a record because a course
can have multiple students registered.
• The combination of StudentID and CourseID forms a composite key that
uniquely identifies each record in the CourseRegistrations table.

Chapter 1
Chapter 1
Entity
n An entity is an object or component of
data. An entity is represented as rectangle
in an ER diagram

For example: In the following ER diagram we have two entities


Student and College and these two entities have many to one
relationship as many students study in a single college

Chapter 1
Attributes
n An attribute describes the property of an
entity. An attribute is represented as Oval
in an ER diagram. There are four types of
attributes

Chapter 1
n There are four types of attributes:
n 1. Key attribute
n 2. Composite attribute
n 3. Multivalued attribute
n 4. Derived attribute

Chapter 1
Rollno is key attribute

Stu_phone is multivalued attribute and age is derived attribute

Chapter 1
Relationship
n A relationship is represented by diamond
shape in ER diagram, it shows the
relationship among entities. There are four
types of cardinal relationships:
n 1. One to One
n 2. One to Many
n 3. Many to One
n 4. Many to Many
Chapter 1
Chapter 1
ER Model
n An Entity–relationship model (ER model)
describes the structure of a database with the
help of a diagram, which is known as Entity
Relationship Diagram (ER Diagram).
n An ER model is a design or blueprint of a
database that can later be implemented as a
database.
n The main components of E-R model are: entity
set and relationship set.

Chapter 1
Facts about ER Diagram Model:
n ER model allows you to draw Database Design
n It is an easy to use graphical tool for modeling data
n Widely used in Database Design
n It is a GUI representation of the logical structure of a Database
n It helps you to identifies the entities which exist in a system and the
relationships between those entities

Why use ER Diagrams?


n Provide a preview of how all your tables should connect, what fields
are going to be on each table
n Helps to describe entities, attributes, relationships
n ER diagrams are translatable into relational tables which allows you
to build databases quickly

Chapter 1
Chapter 1
How to Draw an ER Diagram?
n First, identify all the Entities. Embed all the entities in a
rectangle and label them properly.
n Identify relationships between entities and connect them
using a diamond in the middle, illustrating the
relationship. Do not connect relationships with each
other.
n Connect attributes for entities and label them properly.
n Eradicate any redundant entities or relationships.
n Make sure your ER Diagram supports all the data
provided to design the database.
n Effectively use colors to highlight key areas in your
diagrams.
Chapter 1
Chapter 1
Chapter 1
Chapter 1
nClass WORK:
DRAW ER Diagram of ATM system

Chapter 1
Chapter 1
n A transaction is a unit of program execution
that accesses and possibly updates various
data items.
n E.g., transaction to transfer $50 from account
A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)

n Two main issues to deal with:


¨ Failures of various kinds, such as hardware
failures and system crashes
Chapter¨1Concurrent execution of multiple transactions
COMMIT
n As app program is executing, it is “in a
transaction”
n Program can execute COMMIT
¨ SQL command to finish the transaction
successfully
¨ The next SQL statement will automatically
start a new transaction

Chapter
April 2004
1
ROLLBACK
n If the app gets to a place where it can’t
complete the transaction successfully, it
can execute ROLLBACK
n This causes the system to “abort” the
transaction
¨ The database returns to the state without any
of the previous changes made by activity of
the transaction
Chapter
April 2004
63
1
the ACID properties of transactions provide a mechanism in DBMS to ensure the
consistency and correctness of any database.

n Atomicity. Either all operations of the transaction are


properly reflected in the database or none are.
n Consistency. Execution of a transaction in isolation
preserves the consistency of the database.
n Isolation. Although multiple transactions may
execute concurrently, each transaction must be
unaware of other concurrently executing transactions.
Intermediate transaction results must be hidden from
other concurrently executed transactions.
n Durability. After a transaction completes
successfully, the changes it has made to the database
persist, even if there are system failures.
Chapter 1
Chapter 1
n Active – the initial state; the transaction stays in this
state while it is executing
n Partially committed – after the final statement has
been executed.
n Failed -- after the discovery that normal execution can
no longer proceed.
n Aborted – after the transaction has been rolled back
and the database restored to its state prior to the
start of the transaction. Two options after it has been
aborted:
¨ Restart the transaction
n Can be done only if no internal logical error
¨ Kill the transaction
n Committed
Chapter 1 – after successful completion.
Atomicity
n Two possible outcomes for a transaction
¨ It commits: all the changes are made
¨ It aborts: no changes are made

n That is, transaction’s activities are all or


nothing

Chapter
April 2004
67
1 Transactions by Alan Fekete
Chapter 1
Consistency
n Each transaction can be written on the assumption that
all integrity constraints hold in the data, before the
transaction runs
n It must make sure that its changes leave the integrity
constraints still holding
¨ However, there are allowed to be intermediate states where the
constraints do not hold
n A transaction that does this, is called consistent
n This is an obligation on the programmer
¨ Usually the organization has a testing/checking and sign-off
mechanism before an application program is allowed to get
installed in the production system

Chapter
April 2004
69
1 Transactions by Alan Fekete
n Total after T occurs = 400 + 300 = 700.

n Total before T occurs = 500 + 200 = 700.

n Thus, the given database is consistent.


Here, an inconsistency would occur when
T1 completes, but then the T2 fails. As a
result, the T would remain incomplete.

Chapter 1
Chapter 1
D – Durability
n The durability property states that once the
execution of a transaction is completed, the
modifications and updates on the database gets
written on and stored in the disk.
n These persist even after the occurrence of a
system failure. Such updates become
permanent and get stored in non-volat ile
memory.
n Thus, the effects of this transaction are never
lost.
Chapter 1
Normalization
n This is the process of organizing data in a
database so that it is consistent, efficient, and
easy to manage.
n Data normalization is the process of reorganizing
data within a database so that users can utilize it
for further queries and analysis.
n it is the process of developing clean data. This
includes eliminating redundant and unstructured

Chapter 1
Normalization is important
because it helps to:

n Reduce data redundancy


n Improve data consistency
n Simplify database design
n Improve query performance

Chapter 1
How Does Data Normalization Work
n Step 1: Identify the Problem Identify the problems with the current
database design, such as data redundancy, data inconsistency, and
data anomalies.
n Step 2: Define the Functional Dependencies Identify the functional
dependencies between attributes in a table. A functional dependency
exists between two attributes, X and Y, if the value of Y is
determined by the value of X.
n Step 3: Apply the First Normal Form (1NF) Apply the rules of 1NF
to ensure that each attribute in a table is a single-valued attribute.
This involves eliminating any composite or multi-valued attributes.
n Step 4: Apply the Second Normal Form (2NF) Apply the rules of
2NF to ensure that each non-prime attribute is fully dependent on
each candidate key. This involves eliminating any partial
dependencies.

Chapter 1
n Step 5: Apply the Third Normal Form (3NF) Apply the rules of 3NF
to ensure that there are no transitive dependencies. This involves
eliminating any dependencies between non-prime attributes.

n Step 7: Decompose Tables Decompose tables into smaller tables to


eliminate data redundancy and dependency.

n Step 8: Create Relationships Create relationships between tables


using keys to ensure data consistency and integrity.

n Step 9: Verify and Refine Verify that the normalized database design
meets the requirements and refine it as necessary.

Chapter 1
Normal Forms
n Normals forms are defined structures for relations with
set of constraints that relations must satisfy in order to
detect data redundancy and correct anomalies. There
can be following anomalies while performing a database
operation:

n insert: data is known but can not be inserted


n update: updating data requires modifications in multiple
tuples (rows)
n delete: deleting some data causes some other data to be
lost

Chapter 1
Problems without Normalization in DBMS

Insertion Anomaly in DBMS

•Suppose for a new admission, until and unless a student opts for a branch,
data of the student cannot be inserted, or else we will have to set the branch
information as NULL.
•Also, if we have to insert data for 100 students of the same branch, then the
branch information will be repeated for all those 100 students.
Chapter 1
Deletion Anomaly in DBMS Updation Anomaly in DBMS
•In our Student table, two different n What if Mr. X leaves the college? or

pieces of information are kept Mr. X is no longer the HOD of the


together, the Student computer science department? In
information and the Branch that case, all the student records will
information. have to be updated, and if by
•So if only a single student is mistake we miss any record, it will
enrolled in a branch, and that lead to data inconsistency.
student leaves the college, or for n This is an Updation anomaly
some reason, the entry for the because you need to update all the
student is deleted, we will lose the records in your table just because
branch information too. one piece of information got
•So never in DBMS, we should changed.
keep two different entities together,
which in the above example is
Student and branch,
The solution for all the three anomalies described above is
to keep the student information and the branch
Chapter 1 information in two different tables. And use the
branch_id in the student table to reference the branch.
n 1NF: Removes duplicates and creates separate tables
for groups of related data.
n 2NF: Removes subgroups of data present in multiple
rows of a table and creates new tables, with connections
between them.
n 3NF: Deletes columns that do not depend on the main
key value.

Chapter 1
First Normal Form (1NF)
The first normal form (1NF) is the initial stage of data
normalization. A database is in 1NF if it contains atomic
values. This means that each cell in the database holds a
single value and each record is unique. This stage
eliminates duplicate data and ensures that each entry in
the database has a unique identifier, enhancing data
consistency. OR
n There are only Single Valued Attributes.
n Attribute Domain does not change.
n There is a Unique name for every Attribute/Column.
n The order in which data is stored, does not matter.
n First Normal Form (1NF) does not eliminate redundancy,
but rather, it’s that it eliminates repeating groups.
Chapter 1
Chapter 1
Chapter 1
Second Normal Form (2NF)
n A database reaches the second normal form (2NF) if it is
already in 1NF and all non-key attributes are fully
functionally dependent on the primary key. In other
words, there should be no partial dependencies in the
database. This stage further reduces redundancy and
ensures that each piece of data in the database is
associated with the primary key which uniquely identifies
each record.

Chapter 1
Chapter 1
Chapter 1
Does each column in the product table depend on the product
ID?
n product name: Yes, this is directly related to the product ID.
n usage: This contains values such as Casual or Running, and
it's not dependent on the product ID. It can exist separately to
the product.
n manufacturer: The manufacturer does not depend on the
product ID. A manufacturer, such as Nike, can exist
independently of this Nike Air Max product.
n price: Yes, this is specific to this product.
n size: No, the size can exist separately from the product.
n description: Yes, this is specific to this product.
n number in stock: Yes, this is specific to this product.

Chapter 1
We have several attributes that are not dependent on the
product ID. They should be created as separate tables:
n usage, manufacturer,size
We can create the tables in a similar way to the color and
category tables.

Chapter 1
Another step we need to take as part of second normal form is relating
our tables together. This is done by adding foreign keys to tables, so
that they can relate to the correct record.

Chapter 1
Third Normal Form (3NF)
n The next step is called third normal form. This is a common
place to end the SQL normalization process, because the
design allows for efficient SQL queries and addresses the
issues of databases that aren't normalised.

n A design will meet third normal form if it has no "transitive


functional dependency".
A "transitive functional dependency"
means that every attribute that is not the primary key is
dependent on only the primary key.
Note:To achieve 3NF we have to overcome some more issues
associated with 2NF
Chapter 1
For example, in one table:
n Column A determines Column B
n Column B determines Column C
n This is a "transitive functional
dependency" because Column C depends
on Column B instead of Column A.

Chapter 1
In the relation CANDIDATE given above:
Functional dependency Set:
{CAND_NO -> CAND_NAME, CAND_NO ->CAND_STATE,
CAND_STATE -> CAND_CUNTRY, CAND_NO ->
CAND_AGE}
So, Candidate key here would be:
{CAND_NO}

Chapter 1
n For the relation given here in the table, CAND_NO ->
CAND_STATE and CAND_STATE -> CAND_COUNTRY are
actually true. Thus, CAND_COUNTRY depends transitively on
CAND_NO. This transitive relation violates the rules of being in the
3NF. So, if we want to convert it into the third normal form, then we
have to decompose the relation CANDIDATE (CAND_NO,
CAND_NAME, CAND_STATE, CAND_COUNTRY, CAND_AGE)
as:

n CANDIDATE (CAND_NO, CAND_NAME, CAND_STATE,


CAND_STATE, CAND_AGE)

n STATE_COUNTRY (STATE, COUNTRY)

Chapter 1
Chapter 1
Chapter 1
Centralized Database and Distributed Database
Centralized Database Distributed Database

A type of database in which all data stored on the A type of database that consists of two or more
central device. Central device may be a mobile or a database files located at different places over the
computer etc. network.

Managing, modifying and backup of data is easy As there are multiple database files, in a distributed
because all data stored in one place. database, it requires time to manage data.

Requires time to access data because multiple users It has good speed in accessing data because data files
access the database files. are retrieved from the nearest database.

If the central server fails users are not able to access


If one database fails user still access another database
database

Has more data consistency and it provides the complete May have data replication and there can be some data
user view inconsistency.

Chapter 1
Chapter 1
Chapter 1
What are the benefits of a centralised database?

n Data Integrity: This means that when data is added, updated or


deleted in one place, it will be added, updated or deleted in all
places.
n Simplified Reporting
n Single Point of Failure: A single source of failure also means a
single source of support. If anything goes wrong with your database,
your support team will be able to hone in on the problem much
quicker than in the case of a distributed system.
n Reduced Learning Curve: A reduced learning curve means less
training costs and that employees can be up and running in less
time.
n Improved Security: A centralised database has a single point of
access through a network, meaning there is a single point of access
that needs to be protected.

Chapter 1
database administrator
n A database administrator, or DBA, is responsible for maintaining,
securing, and operating databases and also ensures that data is
correctly stored and retrieved.
n DBAs often work with developers to design and implement new
features and troubleshoot any issues. A DBA must have a strong
understanding of both technical and business needs.

DBA Activity
n Creating and maintaining database standards and policies.
n Supporting database design, creation, and testing activities.
n Managing the database availability and performance, including
incident and problem management.
n Administering database objects to achieve optimum utilization.
Chapter 1
Database Security
n Database security includes a variety of measures used
to secure database management systems from
malicious cyber-attacks
n Data security best practices include data protection
techniques such as data encryption, key management,
data redaction, data subsetting, and data masking, as
well as privileged user access controls and auditing and
monitoring.

Chapter 1
Security Best Practice

Chapter 1

You might also like