Notes

B.Sc. (Computer Science), MCA

Database Management System


DBMS stands for Database Management System. The name breaks down as DBMS = Database
+ Management System. A database is a collection of data, and a management system is a set
of programs to store and retrieve that data. Based on this, we can define a DBMS as follows:
a DBMS is a collection of inter-related data and a set of programs to store and access that
data in an easy and effective manner.

Introduction to DBMS

DBMS Applications

Advantages of DBMS over file processing system

View of Data

Data Abstraction

Instances and Schemas

Data Models in DBMS

E-R Model in DBMS

Relational Model in DBMS

RDBMS concepts

Hierarchical data Model in DBMS

Network Model in DBMS

Database languages

Keys in DBMS

Primary key

Super key

Candidate key

Alternate key

Composite key

Foreign key

Constraints in DBMS

Domain constraints

Mapping constraints

Cardinality in DBMS

Functional dependencies in DBMS

Trivial functional dependency

Non-trivial functional dependency

Multivalued dependency

Transitive dependency

Normalization in DBMS

First Normal Form(1NF)

Second Normal Form(2NF)

Third Normal Form(3NF)

Boyce-Codd Normal Form(BCNF)

Transaction Management in DBMS

ACID Properties

Deadlock

Concurrency Control

Database Applications DBMS


Applications where we use Database Management Systems are:

Telecom: A database keeps track of information about calls made, network usage,
customer details, etc. Without database systems it would be hard to maintain such a
huge amount of data, which is updated every millisecond.

Industry: Whether it is a manufacturing unit, warehouse or distribution centre, each
one needs a database to keep records of what comes in and what goes out. For example, a
distribution centre should keep track of the product units supplied to the centre as well
as the products delivered out of it each day; this is where a DBMS comes into the picture.

Banking System: For storing customer information, tracking day-to-day credit and debit
transactions, generating bank statements, etc. All this work is done with the help of
database management systems.

Education sector: Database systems are frequently used in schools and colleges to
store and retrieve data about students, staff, courses, exams, payroll, attendance,
fees, etc. There is a large amount of inter-related data that needs to be stored and
retrieved efficiently.

Online shopping: Online shopping websites such as Amazon and Flipkart store product
information, your addresses and preferences, and credit details, and provide you with a
relevant list of products based on your query. All this involves a database management
system.

These are only a few applications; a complete list of DBMS applications would be far
longer.

Advantages of DBMS
Drawbacks of File system:
Data Isolation: Because data is scattered across various files, and the files may be in
different formats, writing new application programs to retrieve the appropriate data is
difficult.

Duplication of data: Redundant data.

Dependency on application programs: Changing files would require changes to the
application programs.

Advantage of DBMS over file system:


A database management system has several advantages over a file system. A few of
them are as follows:
No redundant data: Redundancy is reduced by data normalization.
Data consistency and integrity: Data normalization takes care of this too.
Secure: Each user has a different set of access rights.
Privacy: Limited access.
Easy access to data
Easy recovery
Flexible

Disadvantages of DBMS:

Cost: DBMS implementation cost is high compared to a file system.
Complexity: Database systems are complex to understand.
Performance: Database systems are generic, which makes them suitable for various
applications. However, this generality can affect their performance for some applications.

View of Data in DBMS


Abstraction is one of the main features of database systems. Hiding irrelevant details from
the user and providing an abstract view of the data makes user-database interaction easy
and efficient. To understand the view of data, you need a basic knowledge of data
abstraction and of instance and schema.
1. Data abstraction
2. Instance and schema

1. Data Abstraction in DBMS


Database systems are made up of complex data structures. To ease user interaction
with the database, developers hide irrelevant internal details from users. This process of
hiding irrelevant details from the user is called data abstraction.
We have three levels of abstraction:

Physical level: This is the lowest level of data abstraction. It describes how data is actually
stored in the database. The complex data structure details are found at this level.

Logical level: This is the middle level of the 3-level data abstraction architecture. It
describes what data is stored in the database.

View level: This is the highest level of data abstraction. It describes the user's interaction
with the database system.

Example: Let's say we are storing customer information in a customer table. At the physical
level these records can be described as blocks of storage (bytes, gigabytes, terabytes, etc.)
in memory. These details are often hidden from programmers.
At the logical level these records can be described as fields and attributes along with their
data types, and the relationships among them can be logically implemented. Programmers
generally work at this level because they are aware of such things about database systems.
At the view level, users just interact with the system through a GUI and enter details on
the screen; they are not aware of how the data is stored or what data is stored, as such
details are hidden from them.
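
The logical level and the view level can be sketched in a few lines. This is only an illustration, using Python's built-in sqlite3 module as a stand-in database; the table, view and column names are hypothetical and do not come from these notes.

```python
import sqlite3

# All table, view and column names here are hypothetical.
conn = sqlite3.connect(":memory:")

# Logical level: the full customer record with its fields and data types.
conn.execute(
    "CREATE TABLE customer ("
    " cust_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " phone TEXT,"
    " credit_limit REAL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha', '555-0100', 5000.0)")

# View level: one class of user sees contact details only.
conn.execute(
    "CREATE VIEW customer_contact AS SELECT cust_id, name, phone FROM customer"
)

cursor = conn.execute("SELECT * FROM customer_contact")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

The physical level does not appear anywhere in the code: how SQLite lays out pages and records is hidden from both the table definition and the view.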

2. Instance and schema in DBMS


Definition of schema: The design of a database is called its schema. A schema is of three
types: physical schema, logical schema and view schema.
The design of a database at the physical level is called the physical schema; it describes
how the data is stored in blocks of storage.
The design of a database at the logical level is called the logical schema; programmers and
database administrators work at this level. Here data is described as certain types of
data records stored in data structures; however, internal details such as the
implementation of the data structures are hidden at this level (they are available at the
physical level).

The design of a database at the view level is called the view schema. It generally describes
end-user interaction with database systems.

Definition of instance: The data stored in a database at a particular moment in time is
called an instance of the database. The database schema defines the variable declarations
in the tables that belong to a particular database; the values of these variables at a given
moment are called the instance of that database.
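
The difference between schema and instance can be seen by inspecting both before and after an insert. The sketch below assumes SQLite (through Python's sqlite3 module) and a hypothetical student table; `PRAGMA table_info` is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stu_id INTEGER, stu_name TEXT)")

def schema(connection):
    # Column names and declared types: the design of the table.
    return [(row[1], row[2]) for row in connection.execute("PRAGMA table_info(student)")]

def instance(connection):
    # The rows stored at this moment.
    return connection.execute("SELECT * FROM student ORDER BY stu_id").fetchall()

schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO student VALUES (101, 'Steve')")
schema_after, instance_after = schema(conn), instance(conn)

print(schema_before == schema_after)    # the schema is unchanged
print(instance_before, instance_after)  # the instance has changed
```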

Data models in DBMS


A data model is a logical structure of a database. It describes the design of the database
to reflect entities, attributes, relationships among data, constraints, etc.
Types of Data Models:
Object based logical Models: Describe data at the conceptual and view levels.
1. E-R Model
2. Object oriented Model
Record based logical Models: Like object based models, they also describe data at the
conceptual and view levels. These models specify the logical structure of the database with
records, fields and attributes.
1. Relational Model
2. Hierarchical Model
3. Network Model: The network model is the same as the hierarchical model except that it
has a graph-like structure rather than a tree-based structure. Unlike the hierarchical
model, this model allows each record to have more than one parent record.
Physical Data Models: These models describe data at the lowest level of abstraction.

1. E-R model in DBMS


An entity-relationship model (ER model) is a systematic way of describing and defining a
business process. An ER model is typically implemented as a database. The main
components of the E-R model are the entity set and the relationship set.
Here are the geometric shapes and their meanings in an E-R diagram:
Rectangles: Entity sets
Ellipses: Attributes
Diamonds: Relationship sets
Lines: Link attributes to entity sets and entity sets to relationship sets
Double Ellipses: Multivalued attributes
Dashed Ellipses: Derived attributes
Double Rectangles: Weak entity sets
Double Lines: Total participation of an entity in a relationship set

A sample E-R Diagram:

Multivalued Attributes: An attribute that can hold multiple values is known as a multivalued
attribute. We represent it with double ellipses in an E-R diagram. For example, a person can
have more than one phone number, so the phone number attribute is multivalued.
Derived Attribute: A derived attribute is one whose value is dynamic and derived from
another attribute. It is represented by a dashed ellipse in an E-R diagram. For example, a
person's age is a derived attribute, as it changes over time and can be derived from another
attribute (date of birth).

E-R diagram with multivalued and derived attributes:

Total Participation of an Entity set:

Total participation of an entity set means that each entity in the entity set must take part
in at least one relationship in a relationship set. For example, in the diagram below each
college must have at least one associated student.

2. Object oriented Model


a) Relational model in DBMS
In the relational model, data and relationships are represented by a collection of
inter-related tables. Each table is a group of columns and rows, where a column represents
an attribute of an entity and a row represents a record.
Sample relational model: Student table with three columns and four records.

Stu_Id   Stu_Name   Stu_Age
111      Ashish     23
123      Saurav     22
169      Lester     24
234      Lou        26

Course table:

Stu_Id   Course_Id   Course_Name
111      C01         Science
111      C02         DBMS
169      C22         Java
169      C39         Computer Networks

Here Stu_Id, Stu_Name and Stu_Age are attributes of the Student table, and Stu_Id, Course_Id
and Course_Name are attributes of the Course table. The rows with values are the records
(commonly known as tuples).
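
To show how the two tables relate through Stu_Id, the sketch below loads the sample rows into SQLite (via Python's sqlite3 module, used here only as a convenient stand-in) and joins them. The column types are assumptions; the notes give only the values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed for illustration.
conn.execute("CREATE TABLE Student (Stu_Id INTEGER, Stu_Name TEXT, Stu_Age INTEGER)")
conn.execute("CREATE TABLE Course (Stu_Id INTEGER, Course_Id TEXT, Course_Name TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(111, "Ashish", 23), (123, "Saurav", 22), (169, "Lester", 24), (234, "Lou", 26)],
)
conn.executemany(
    "INSERT INTO Course VALUES (?, ?, ?)",
    [(111, "C01", "Science"), (111, "C02", "DBMS"),
     (169, "C22", "Java"), (169, "C39", "Computer Networks")],
)

# The shared Stu_Id column is what relates a course row to a student row.
enrolled = conn.execute(
    "SELECT s.Stu_Name, c.Course_Name"
    " FROM Student s JOIN Course c ON s.Stu_Id = c.Stu_Id"
    " ORDER BY s.Stu_Id, c.Course_Id"
).fetchall()
print(enrolled)
```

Students 123 and 234 have no rows in the Course table, so an inner join leaves them out of the result.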

RDBMS Concepts
RDBMS stands for relational database management system. A relational database has the following major
components: Table, Record / Tuple, Field, and Column / Attribute.

Table:
A table is a collection of data represented in rows and columns. For example, the
following table stores information about students.

Student_Id   Student_Name   Student_Addr       Student_Age
101          Chaitanya      Dayal Bagh, Agra   27
102          Ajeet          Delhi              26
103          Rahul          Gurgaon            24
104          Shubham        Chennai            25

Records / Tuple:
Each row of a table is known as a record or tuple. For example, the row below is a record.

102   Ajeet   Delhi   26

Field:
The above table has four fields: Student_Id, Student_Name, Student_Addr and Student_Age.
Column / Attribute:
Each field together with its set of values is known as a column or attribute. For example, the set of values
of the Student_Id field is one of the four columns of the Student table.

Student_Id
101
102
103
104

What is an attribute in DBMS? Definition and explanation


You will often hear this term when dealing with relational database management systems (RDBMS). In
an RDBMS, a table organizes data in rows and columns. The columns are known as attributes, whereas the rows
are known as records.
Example: A school maintains student data in a table named student. Suppose the data stored in the
table is student id, student name and student age. To do this the table has three columns:
student_id, student_age and student_name. The table looks like this:

student_id   student_age   student_name
101          12            Jon
102          13            Arya
103          12            Sansa

Here student_id, student_age and student_name are the attributes.

b) Hierarchical model in DBMS

In the hierarchical model, data is organized into a tree-like structure, with each record
having one parent record and many children. The main drawback of this model is that it
can only represent one-to-many relationships between nodes.
Sample Hierarchical Model Diagram:
Example of hierarchical data represented as relational tables: The above hierarchical
model can be represented as relational tables like this:

Student Table:

Stu_Id   Stu_Name    Stu_Age
123      Steve       29
367      Chaitanya   27
234      Ajeet       28

Course Table:

Course_Id   Course_Name   Stu_Id
C01         Cobol         123
C21         Java          367
C22         Perl          367
C33         JQuery        234

DBMS languages
Database languages are used to read, update and store data in a database. There are
several such languages; one of them is SQL (Structured Query Language).
Types of DBMS languages:

Data Definition Language (DDL):


DDL is used for specifying the database schema. Let's take SQL as an example and
categorize the statements that come under DDL.

To create a database and its objects (such as tables): CREATE

To alter the structure of the database: ALTER

To drop objects from the database: DROP

To delete all the rows of a table while keeping the table itself: TRUNCATE

To rename an object: RENAME


All these commands specify or update the database schema; that's why they come under
Data Definition Language.
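
The effect of DDL statements on the schema can be sketched as follows. This assumes SQLite through Python's sqlite3 module rather than the systems named in these notes: SQLite has no TRUNCATE statement and expresses renaming as ALTER TABLE ... RENAME TO. The staff table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(connection):
    query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [row[0] for row in connection.execute(query)]

def column_names(connection, table):
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]

# CREATE: define a new table.
conn.execute("CREATE TABLE staff (staff_id INTEGER, staff_name TEXT)")
after_create = column_names(conn, "staff")

# ALTER: change the structure of an existing table.
conn.execute("ALTER TABLE staff ADD COLUMN staff_age INTEGER")
after_alter = column_names(conn, "staff")

# RENAME (SQLite spelling): give the table a new name.
conn.execute("ALTER TABLE staff RENAME TO employee")
after_rename = table_names(conn)

# DROP: remove the table and its definition.
conn.execute("DROP TABLE employee")
after_drop = table_names(conn)

print(after_create, after_alter, after_rename, after_drop)
```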

SQL CREATE DATABASE Statement


The usual approach to storing data is to first create a database and then create tables in it.
Syntax:
We use the CREATE DATABASE statement to create a database. This is how it is used:
CREATE DATABASE DBName;

Here DBName can be any valid identifier that represents the database name.
Example: The statement below creates a database named Employee.
SQL> CREATE DATABASE Employee;

To get the list of all the databases, you can use the SHOW DATABASES statement.
Example
SQL> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| BeginnersBook      |
| AbcTemp            |
| Employee           |
| Customers          |
| Student            |
| Faculty            |
| MyTest             |
| Demo               |
+--------------------+
8 rows in set (0.00 sec)

As you can see, this statement lists all the databases, including the Employee database
that we created above with the CREATE DATABASE statement.

SQL DROP DATABASE statement


The DROP DATABASE statement is used for deleting a database and all of its tables completely.

Syntax:
DROP DATABASE DBName;

Here DBName is the name of the database that you want to delete.
Example: The statement below deletes the database named Student.
SQL> DROP DATABASE Student;

Note: Deleting a database implicitly deletes all of its tables. For example, the above statement
deletes all the tables stored inside the Student database, along with the database itself.
After dropping a database, you can check the database list to verify that it has been
dropped. This is how you can do it.
Before deleting Student Database:
SQL> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| Abc                |
| Xyz                |
| Student            |
| Demo               |
| Test               |
+--------------------+
5 rows in set (0.00 sec)

After deleting Student Database:
SQL> DROP DATABASE Student;

List the databases:
SQL> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| Abc                |
| Xyz                |
| Demo               |
| Test               |
+--------------------+
4 rows in set (0.00 sec)

Data Manipulation Language (DML):


DML is used for accessing and manipulating data in a database.

To read records from table(s): SELECT

To insert record(s) into table(s): INSERT

To update the data in table(s): UPDATE

To delete records from a table (all of them, or only those matching a condition): DELETE

SQL SELECT Query


The SELECT query is used for fetching data from table(s). We have the flexibility to fetch a few columns,
a few rows or the entire table using a SELECT query.
Syntax:
SELECT column_name_1, column_name_2, ... FROM table_name;

For fetching the entire table:
SELECT * FROM table_name;

For fetching certain columns of a table:
Let's say we want to fetch column_a and column_x of a table named ABC. The query for this would be:
SELECT column_a, column_x FROM ABC;

Example:
Let's say we have an EMPLOYEES table with the data below.
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     |      23 |    9000.00 |
| 223 | Peter     |      24 |    2550.00 |
| 388 | Shubham   |      19 |    2444.00 |
| 499 | Chaitanya |      29 |    6588.00 |
| 589 | Apoorv    |      21 |    1400.00 |
| 689 | Rajat     |      24 |    8900.00 |
| 700 | Ajeet     |      20 |   18300.00 |
+-----+-----------+---------+------------+

In order to fetch the SSN and EMP_NAME, we can write the SELECT Query like this:
SELECT SSN, EMP_NAME FROM EMPLOYEES;

The query would produce the result below.
+-----+-----------+
| SSN | EMP_NAME  |
+-----+-----------+
| 101 | Steve     |
| 223 | Peter     |
| 388 | Shubham   |
| 499 | Chaitanya |
| 589 | Apoorv    |
| 689 | Rajat     |
| 700 | Ajeet     |
+-----+-----------+

Similarly, you can fetch any particular column or group of columns using the SELECT Query in SQL.
To fetch the entire EMPLOYEES table:
SELECT * FROM EMPLOYEES;

Result:
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     |      23 |    9000.00 |
| 223 | Peter     |      24 |    2550.00 |
| 388 | Shubham   |      19 |    2444.00 |
| 499 | Chaitanya |      29 |    6588.00 |
| 589 | Apoorv    |      21 |    1400.00 |
| 689 | Rajat     |      24 |    8900.00 |
| 700 | Ajeet     |      20 |   18300.00 |
+-----+-----------+---------+------------+
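
The two SELECT queries above can be tried against the sample data with a short script. It is a sketch that assumes SQLite through Python's sqlite3 module; the column types are guesses, and the ORDER BY clause is added only to make the row order predictable.

```python
import sqlite3

EMPLOYEES = [
    (101, "Steve", 23, 9000.00),
    (223, "Peter", 24, 2550.00),
    (388, "Shubham", 19, 2444.00),
    (499, "Chaitanya", 29, 6588.00),
    (589, "Apoorv", 21, 1400.00),
    (689, "Rajat", 24, 8900.00),
    (700, "Ajeet", 20, 18300.00),
]

conn = sqlite3.connect(":memory:")
# Column types are assumed for illustration.
conn.execute(
    "CREATE TABLE EMPLOYEES (SSN INTEGER, EMP_NAME TEXT, EMP_AGE INTEGER, EMP_SALARY REAL)"
)
conn.executemany("INSERT INTO EMPLOYEES VALUES (?, ?, ?, ?)", EMPLOYEES)

# Fetch two columns only.
two_columns = conn.execute("SELECT SSN, EMP_NAME FROM EMPLOYEES ORDER BY SSN").fetchall()

# Fetch every column.
all_columns = conn.execute("SELECT * FROM EMPLOYEES ORDER BY SSN").fetchall()

print(two_columns[0], len(all_columns))
```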

UPDATE Query in SQL


The UPDATE query is used for updating existing rows (records) in a table. It can update any number of
rows in a table.

Syntax
UPDATE TableName
SET column_name1 = value, column_name2 = value, ...
WHERE condition;

The query updates only those rows that satisfy the condition defined in the WHERE clause. If the
WHERE clause is omitted, every row in the table is updated.
Example

EMPLOYEES table:
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     |      23 |    9000.00 |
| 223 | Peter     |      24 |    2550.00 |
| 388 | Shubham   |      19 |    2444.00 |
| 499 | Chaitanya |      29 |    6588.00 |
| 589 | Apoorv    |      21 |    1400.00 |
| 689 | Rajat     |      24 |    8900.00 |
| 700 | Ajeet     |      20 |   18300.00 |
+-----+-----------+---------+------------+

Update the salary of employees to 10000 if their age is greater than 25.
SQL> UPDATE EMPLOYEES
SET EMP_SALARY = 10000
WHERE EMP_AGE > 25;

The updated EMPLOYEES table would look like this:
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     |      23 |    9000.00 |
| 223 | Peter     |      24 |    2550.00 |
| 388 | Shubham   |      19 |    2444.00 |
| 499 | Chaitanya |      29 |   10000.00 |
| 589 | Apoorv    |      21 |    1400.00 |
| 689 | Rajat     |      24 |    8900.00 |
| 700 | Ajeet     |      20 |   18300.00 |
+-----+-----------+---------+------------+

Only one employee in the table (Chaitanya) is above the age of 25, so only that employee's salary
was updated to 10000.

Update the salary of employee Apoorv to 120000.


SQL> UPDATE EMPLOYEES
SET EMP_SALARY = 120000
WHERE EMP_NAME = 'Apoorv';

Output:
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     |      23 |    9000.00 |
| 223 | Peter     |      24 |    2550.00 |
| 388 | Shubham   |      19 |    2444.00 |
| 499 | Chaitanya |      29 |   10000.00 |
| 589 | Apoorv    |      21 |  120000.00 |
| 689 | Rajat     |      24 |    8900.00 |
| 700 | Ajeet     |      20 |   18300.00 |
+-----+-----------+---------+------------+
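
Both UPDATE statements can be replayed on the sample data to check which rows change. This is a sketch that assumes SQLite through Python's sqlite3 module; the column types are guesses.

```python
import sqlite3

EMPLOYEES = [
    (101, "Steve", 23, 9000.00),
    (223, "Peter", 24, 2550.00),
    (388, "Shubham", 19, 2444.00),
    (499, "Chaitanya", 29, 6588.00),
    (589, "Apoorv", 21, 1400.00),
    (689, "Rajat", 24, 8900.00),
    (700, "Ajeet", 20, 18300.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEES (SSN INTEGER, EMP_NAME TEXT, EMP_AGE INTEGER, EMP_SALARY REAL)"
)
conn.executemany("INSERT INTO EMPLOYEES VALUES (?, ?, ?, ?)", EMPLOYEES)

# rowcount reports how many rows matched the WHERE clause.
changed_by_age = conn.execute(
    "UPDATE EMPLOYEES SET EMP_SALARY = 10000 WHERE EMP_AGE > 25"
).rowcount
changed_by_name = conn.execute(
    "UPDATE EMPLOYEES SET EMP_SALARY = 120000 WHERE EMP_NAME = 'Apoorv'"
).rowcount

salaries = dict(conn.execute("SELECT SSN, EMP_SALARY FROM EMPLOYEES"))
print(changed_by_age, changed_by_name, salaries[499], salaries[589])
```

Rows that do not satisfy either condition, such as SSN 101, keep their original salary.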

DELETE Query in SQL


The DELETE query is used for deleting existing rows (records) from a table. Generally, DELETE is used
along with a WHERE clause to delete the rows that fulfil the specified condition. However, DELETE can
also be used without a WHERE clause; in that case the query deletes all the rows of the
specified table.

Syntax
1) To delete a particular set of rows:
DELETE FROM TableName
WHERE condition;
2) To delete all the rows of a table:
DELETE FROM TableName;

Example:

Let's take a table EMPLOYEES as follows:
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     |      23 |    9000.00 |
| 223 | Peter     |      24 |    2550.00 |
| 388 | Shubham   |      19 |    2444.00 |
| 499 | Chaitanya |      29 |    6588.00 |
| 589 | Apoorv    |      21 |    1400.00 |
| 689 | Rajat     |      24 |    8900.00 |
| 700 | Ajeet     |      20 |   18300.00 |
+-----+-----------+---------+------------+

Delete all the records that have SSN greater than 400:
SQL> DELETE FROM EMPLOYEES
WHERE SSN > 400;

After the successful execution of the above query, the table would have the following records:
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     |      23 |    9000.00 |
| 223 | Peter     |      24 |    2550.00 |
| 388 | Shubham   |      19 |    2444.00 |
+-----+-----------+---------+------------+

From the remaining records, delete the data of employees whose age is greater than or equal to 24:
SQL> DELETE FROM EMPLOYEES
WHERE EMP_AGE >= 24;

Result:
+-----+-----------+---------+------------+
| SSN | EMP_NAME  | EMP_AGE | EMP_SALARY |
+-----+-----------+---------+------------+
| 101 | Steve     |      23 |    9000.00 |
| 388 | Shubham   |      19 |    2444.00 |
+-----+-----------+---------+------------+

Delete all the records of the EMPLOYEES table:
SQL> DELETE FROM EMPLOYEES;
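
The three DELETE statements above build on one another, which is easier to see when they are run in sequence. The sketch below assumes SQLite through Python's sqlite3 module; the helper `remaining_ssns` is a hypothetical name introduced for illustration.

```python
import sqlite3

EMPLOYEES = [
    (101, "Steve", 23, 9000.00),
    (223, "Peter", 24, 2550.00),
    (388, "Shubham", 19, 2444.00),
    (499, "Chaitanya", 29, 6588.00),
    (589, "Apoorv", 21, 1400.00),
    (689, "Rajat", 24, 8900.00),
    (700, "Ajeet", 20, 18300.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEES (SSN INTEGER, EMP_NAME TEXT, EMP_AGE INTEGER, EMP_SALARY REAL)"
)
conn.executemany("INSERT INTO EMPLOYEES VALUES (?, ?, ?, ?)", EMPLOYEES)

def remaining_ssns():
    # Hypothetical helper: SSNs still present in the table.
    return [row[0] for row in conn.execute("SELECT SSN FROM EMPLOYEES ORDER BY SSN")]

# 1) Conditional delete on SSN.
deleted_by_ssn = conn.execute("DELETE FROM EMPLOYEES WHERE SSN > 400").rowcount
after_ssn_delete = remaining_ssns()

# 2) Conditional delete on age, applied to what is left.
deleted_by_age = conn.execute("DELETE FROM EMPLOYEES WHERE EMP_AGE >= 24").rowcount
after_age_delete = remaining_ssns()

# 3) No WHERE clause: every remaining row is removed, the table itself stays.
conn.execute("DELETE FROM EMPLOYEES")
after_delete_all = remaining_ssns()

print(after_ssn_delete, after_age_delete, after_delete_all)
```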

Data Control language (DCL):


DCL is used for granting and revoking user access on a database.

To grant access to a user: GRANT

To revoke access from a user: REVOKE

In practice, the data definition language, data manipulation language and data control
language are not separate languages; rather, they are parts of a single database
language such as SQL.
Keys in DBMS

A key plays an important role in a relational database; it is used for uniquely identifying
rows in a table. It also establishes relationships among tables.

Types of keys in DBMS

Primary Key: A primary key is a column or set of columns in a table that uniquely identifies tuples
(rows) in that table.

Primary key in DBMS


Example:
Student Table

Stu_Id   Stu_Name   Stu_Age
101      Steve      23
102      John       24
103      Robert     28
104      Carl       22

In the above Student table, the Stu_Id column uniquely identifies each row of the table.
Note:

We denote the primary key by underlining the column name.

The value of primary key should be unique for each row of the table. Primary key column cannot
contain duplicate values.

Primary key column should not contain nulls.

A primary key need not be a single column; more than one column can together form the primary
key of a table. For example, {Stu_Id, Stu_Name} collectively can play the role of primary key in the above
table, but that does not make sense because Stu_Id alone is enough to uniquely identify the rows, so there
is no reason to make things more complex. That said, we should choose more than one column as the
primary key only when no single column can play that role.

How to choose a primary key?


There are two ways: either create a column and let the database automatically fill it with increasing
numbers for each row, or choose an existing column yourself, making sure that it contains no duplicates
and no nulls. For example, in the above Student table the Stu_Name column cannot be the primary key because
more than one person can have the same name; similarly, the Stu_Age column cannot be the primary key
because more than one person can have the same age.
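
The first option, a database-generated key, can be sketched as follows. The example assumes SQLite through Python's sqlite3 module, where a column declared INTEGER PRIMARY KEY is numbered automatically when no value is supplied; other systems spell this differently, and the table below is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (stu_id INTEGER PRIMARY KEY, stu_name TEXT NOT NULL)"
)

# Two students share a name, so stu_name could not serve as the key.
first_id = conn.execute("INSERT INTO student (stu_name) VALUES ('Steve')").lastrowid
second_id = conn.execute("INSERT INTO student (stu_name) VALUES ('Steve')").lastrowid

# Reusing an existing key value is rejected by the database.
try:
    conn.execute("INSERT INTO student (stu_id, stu_name) VALUES (?, 'John')", (first_id,))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(first_id, second_id, duplicate_rejected)
```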

Super Key: A super key is a set of one or more columns (attributes) that uniquely identifies
rows in a table.

Super key in DBMS


People often confuse super keys with candidate keys, so we will also discuss candidate keys
briefly here.
How is a candidate key different from a super key?
The answer is simple: candidate keys are selected from the set of super keys. The only thing we take care
of while selecting a candidate key is that it should not have any redundant attributes. That's the reason
candidate keys are also termed minimal super keys.
Let's take an example to understand this: Employee table

Emp_SSN     Emp_Number   Emp_Name
123456789   226          Steve
999999321   227          Ajeet
888997212   228          Chaitanya
777778888   229          Robert

Super keys:
{Emp_SSN}
{Emp_Number}
{Emp_SSN, Emp_Number}
{Emp_SSN, Emp_Name}
{Emp_SSN, Emp_Number, Emp_Name}
{Emp_Number, Emp_Name}
All of the above sets are able to uniquely identify rows of the employee table.

Candidate Keys:
As I stated above, they are the minimal super keys with no redundant attributes.

{Emp_SSN}

{Emp_Number}
Only these two sets are candidate keys, as all the other sets have redundant attributes that are not
necessary for unique identification.

Primary key:
The primary key is selected from the set of candidate keys by the database designer, so
either {Emp_SSN} or {Emp_Number} can be the primary key.
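
The step from super keys to candidate keys is a minimality check, which can be sketched in plain Python. The sketch takes as given, from the text, that Emp_SSN and Emp_Number each identify an employee by design while names may repeat; the function names are hypothetical.

```python
from itertools import combinations

ATTRIBUTES = ("Emp_SSN", "Emp_Number", "Emp_Name")

# Assumption from the text: each of these identifies an employee on its own.
UNIQUE_BY_DESIGN = [{"Emp_SSN"}, {"Emp_Number"}]

def super_keys():
    """Every attribute set that contains at least one identifying set."""
    keys = []
    for size in range(1, len(ATTRIBUTES) + 1):
        for combo in combinations(ATTRIBUTES, size):
            if any(unique.issubset(combo) for unique in UNIQUE_BY_DESIGN):
                keys.append(frozenset(combo))
    return keys

def candidate_keys():
    """Super keys with no redundant attribute (no smaller super key inside)."""
    keys = super_keys()
    return [key for key in keys if not any(key > other for other in keys)]

for key in candidate_keys():
    print(sorted(key))
```

The check `key > other` is Python's proper-superset test, so a super key is discarded as soon as a smaller super key is contained in it.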

Candidate Key: A super key with no redundant attribute is known as a candidate key.

Candidate Key in DBMS


Candidate keys are selected from the set of super keys. The only thing we take care of while selecting
a candidate key is that it should not have any redundant attributes. That's the reason candidate keys
are also termed minimal super keys.
For example:

Emp_Id   Emp_Number   Emp_Name
E01      2264         Steve
E22      2278         Ajeet
E23      2288         Chaitanya
E45      2290         Robert

There are two candidate keys in the above table:
{Emp_Id}
{Emp_Number}
Note: The primary key is selected from the group of candidate keys. That means we can have either
Emp_Id or Emp_Number as the primary key.

Alternate Key: Out of all the candidate keys, only one gets selected as the primary key;
the remaining keys are known as alternate or secondary keys.

Alternate key in DBMS


For example, consider the table below:

Emp_Id   Emp_Number   Emp_Name
E01      2264         Steve
E22      2278         Ajeet
E23      2288         Chaitanya
E45      2290         Robert

There are two candidate keys in the above table:
{Emp_Id}
{Emp_Number}

Since we have selected Emp_Id as the primary key, the remaining key Emp_Number would be called the
alternate or secondary key.

Composite Key: A key that consists of more than one attribute to uniquely identify rows
(also known as records or tuples) in a table is called a composite key.

Composite key in DBMS


It is also known as a compound key.
Example: Table Sales

cust_Id   order_Id   product_code   product_count
C01       O001       P007           23
C02       O123       P007           19
C02       O123       P230           82
C01       O001       P890           42

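Key in above table: {cust_Id, product_code}

This is a composite key as it consists of more than one attribute. Note that {cust_Id, order_Id} cannot
serve as a key for these rows, because the pairs (C01, O001) and (C02, O123) each appear in two rows.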
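
Whether a set of columns can act as a key for the four sample rows can be checked mechanically. The plain-Python sketch below tests every single column and every pair among cust_Id, order_Id and product_code. Uniqueness in sample data is necessary for a key but does not prove it, since a real key is a design decision about all possible rows. The function name `is_unique` is hypothetical.

```python
from itertools import combinations

COLUMNS = ("cust_Id", "order_Id", "product_code")
SALES = [
    ("C01", "O001", "P007"),
    ("C02", "O123", "P007"),
    ("C02", "O123", "P230"),
    ("C01", "O001", "P890"),
]

def is_unique(columns):
    """True if no two sample rows agree on all of the given columns."""
    positions = [COLUMNS.index(name) for name in columns]
    seen = set()
    for row in SALES:
        value = tuple(row[i] for i in positions)
        if value in seen:
            return False
        seen.add(value)
    return True

single_results = {name: is_unique((name,)) for name in COLUMNS}
pair_results = {pair: is_unique(pair) for pair in combinations(COLUMNS, 2)}

for pair, unique in pair_results.items():
    print(pair, "unique" if unique else "has duplicates")
```

No single column is unique in the sample, which is why a composite key is needed at all.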

Foreign Key: Foreign keys are the columns of a table that point to the primary key of
another table. They act as a cross-reference between tables.

Foreign key in DBMS


For example:
In the example below, the Stu_Id column in the Course_enrollment table is a foreign key, as it points
to the primary key of the Student table.

Course_enrollment table:

Course_Id   Stu_Id
C01         101
C02         102
C03         101
C05         102
C06         103
C07         102

Student table:

Stu_Id   Stu_Name    Stu_Age
101      Chaitanya   22
102      Arya        26
103      Bran        25
104      Jon         21

Note: In practice, a foreign key does not have to reference the primary key of another table; if it
points to a unique column (not necessarily the primary key) of another table, it is still a
foreign key. So a more accurate definition would be: foreign keys are the columns of a
table that point to a candidate key of another table.
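
What the cross-reference buys in practice is that the database can refuse an enrollment for a student who does not exist. The sketch below assumes SQLite through Python's sqlite3 module, where foreign key enforcement must be switched on per connection with a PRAGMA. The rejected row ('C99', 999) is a made-up value used only to trigger the check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute(
    "CREATE TABLE Student (Stu_Id INTEGER PRIMARY KEY, Stu_Name TEXT, Stu_Age INTEGER)"
)
conn.execute(
    "CREATE TABLE Course_enrollment ("
    " Course_Id TEXT PRIMARY KEY,"
    " Stu_Id INTEGER NOT NULL REFERENCES Student (Stu_Id))"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(101, "Chaitanya", 22), (102, "Arya", 26), (103, "Bran", 25), (104, "Jon", 21)],
)

# A row that points at an existing student is accepted.
conn.execute("INSERT INTO Course_enrollment VALUES ('C01', 101)")

# A row that points at a missing student is rejected (values are made up).
try:
    conn.execute("INSERT INTO Course_enrollment VALUES ('C99', 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

enrollments = conn.execute("SELECT Course_Id, Stu_Id FROM Course_enrollment").fetchall()
print(orphan_rejected, enrollments)
```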

Constraints in DBMS
Constraints enforce limits on the data or type of data that can be inserted into, updated in
or deleted from a table. The whole purpose of constraints is to maintain data integrity
during an update, delete or insert on a table.

Types of constraints

NOT NULL
UNIQUE
DEFAULT
CHECK
Key Constraints: PRIMARY KEY, FOREIGN KEY
Domain constraints
Mapping constraints

NOT NULL:

The NOT NULL constraint makes sure that a column does not hold a NULL value. When we don't
provide a value for a particular column while inserting a record into a table, it takes the
NULL value by default. By specifying the NOT NULL constraint, we can be sure that a particular
column cannot have NULL values.
Example:
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (235),
PRIMARY KEY (ROLL_NO)
);

NOT NULL Constraint in SQL


How to specify the NOT NULL constraint while creating a table
Here I am creating a table STUDENTS. I have specified the NOT NULL constraint for the columns ROLL_NO,
STU_NAME and STU_AGE, which means you must provide a value for these three fields while
inserting or updating records in this table. It forces these columns not to accept null values.

CREATE TABLE STUDENTS(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (235),
PRIMARY KEY (ROLL_NO)
);

Specify the NOT NULL constraint for an already existing table

In the above section we learnt how to specify the NOT NULL constraint while creating a table. However, we
can also specify this constraint on an already existing table. For this we need to use the ALTER TABLE
statement.

ALTER TABLE STUDENTS
MODIFY STU_ADDRESS VARCHAR (235) NOT NULL;

After this, the STU_ADDRESS column will not accept any null values.
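
The behaviour of NOT NULL at insert time can be sketched as follows, assuming SQLite through Python's sqlite3 module. SQLite accepts the CREATE TABLE statement from this section but has no ALTER TABLE ... MODIFY, so only the table-creation form is exercised here. The inserted values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENTS("
    " ROLL_NO INT NOT NULL,"
    " STU_NAME VARCHAR (35) NOT NULL,"
    " STU_AGE INT NOT NULL,"
    " STU_ADDRESS VARCHAR (235),"
    " PRIMARY KEY (ROLL_NO))"
)

# STU_ADDRESS has no NOT NULL constraint, so it may be left out.
conn.execute("INSERT INTO STUDENTS (ROLL_NO, STU_NAME, STU_AGE) VALUES (1, 'Steve', 23)")
stored_address = conn.execute(
    "SELECT STU_ADDRESS FROM STUDENTS WHERE ROLL_NO = 1"
).fetchone()[0]

# STU_NAME is NOT NULL, so leaving it out is rejected.
try:
    conn.execute("INSERT INTO STUDENTS (ROLL_NO, STU_AGE) VALUES (2, 24)")
    missing_name_rejected = False
except sqlite3.IntegrityError:
    missing_name_rejected = True

print(stored_address, missing_name_rejected)
```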

UNIQUE:

The UNIQUE constraint forces a column or set of columns to have unique values. If a column
has a unique constraint, that column cannot have duplicate values in the table.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO)
);

UNIQUE Constraint in SQL


Set UNIQUE Constraint while creating a table
For SQL Server / MS Access / Oracle:
Syntax:
CREATE TABLE <table_name>
(
<column_name> <data_type> UNIQUE,
<column_name2> <data_type>,
....
....
);

Example:
Here we are setting up the UNIQUE constraint for two columns, STU_NAME and STU_ADDRESS, which means
these two columns cannot have duplicate values.
Note: The STU_NAME column has two constraints set up (both NOT NULL and UNIQUE).

CREATE TABLE STUDENTS(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO)
);

MySQL:
Syntax:
CREATE TABLE <table_name>
(

<column_name> <data_type>,
<column_name2> <data_type>,
....
....
UNIQUE(column_name)
);

Example:
Setting up the constraint on the STU_NAME column.
CREATE TABLE STUDENTS(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35),
UNIQUE(STU_NAME),
PRIMARY KEY (ROLL_NO)
);

Naming of UNIQUE Constraint:

MySQL / SQL Server / MS Access / Oracle:

CREATE TABLE STUDENTS(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35),
CONSTRAINT stu_Info UNIQUE(STU_NAME, STU_ADDRESS),
PRIMARY KEY (ROLL_NO)
);

Set UNIQUE Constraint on already created table


For MySQL / Oracle / SQL Server / MS Access:
For single column and without constraint naming:
Syntax:
ALTER TABLE <table_name>
ADD UNIQUE (<column_name>);

Example:
ALTER TABLE STUDENTS
ADD UNIQUE (STU_NAME);

For multiple columns and with constraint naming:

Syntax:
ALTER TABLE <table_name>
ADD CONSTRAINT <constraint_name> UNIQUE (<column_name1>, <column_name2>,...);

Example:
ALTER TABLE STUDENTS
ADD CONSTRAINT stu_Info UNIQUE (STU_NAME,STU_ADDRESS);

How to drop a UNIQUE Constraint


In MySQL:
Syntax:
ALTER TABLE <table_name>
DROP INDEX <constraint_name>;

Example:
ALTER TABLE STUDENTS
DROP INDEX stu_Info;

In Oracle / SQL Server / MS Access:


Syntax:
ALTER TABLE <table_name>
DROP CONSTRAINT <constraint_name>;

Example:
ALTER TABLE STUDENTS
DROP CONSTRAINT stu_Info;
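
A named UNIQUE constraint over two columns applies to the combination, not to each column separately. The sketch below illustrates this with the stu_Info constraint, assuming SQLite through Python's sqlite3 module; the inserted rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENTS("
    " ROLL_NO INT NOT NULL,"
    " STU_NAME VARCHAR (35) NOT NULL,"
    " STU_AGE INT NOT NULL,"
    " STU_ADDRESS VARCHAR (35),"
    " CONSTRAINT stu_Info UNIQUE(STU_NAME, STU_ADDRESS),"
    " PRIMARY KEY (ROLL_NO))"
)

# Same name with a different address: the combination is still unique.
conn.execute("INSERT INTO STUDENTS VALUES (1, 'Steve', 23, 'Delhi')")
conn.execute("INSERT INTO STUDENTS VALUES (2, 'Steve', 24, 'Agra')")

# Same name and same address: the combination repeats and is rejected.
try:
    conn.execute("INSERT INTO STUDENTS VALUES (3, 'Steve', 25, 'Delhi')")
    repeated_pair_rejected = False
except sqlite3.IntegrityError:
    repeated_pair_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM STUDENTS").fetchone()[0]
print(repeated_pair_rejected, row_count)
```

With a separate UNIQUE on each column, as in the earlier example, the second insert would already fail because the name repeats.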

DEFAULT:

The DEFAULT constraint provides a default value to a column when there is no value
provided while inserting a record into a table.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35) ,
PRIMARY KEY (ROLL_NO)
);

DEFAULT Constraint in SQL


Lets see how to specify this constraint and how it works.

Specify DEFAULT constraint while creating a table


Here we are creating a table STUDENTS, we have a requirement to set the exam fees to 10000 if fee is not
specified while inserting a record (row) into the STUDENTS table. We can do so by using DEFAULT constraint.
As you can see we have set the default value of EXAM_FEE column to 10000 using DEFAULT constraint.

CREATE TABLE STUDENTS(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35) ,
PRIMARY KEY (ROLL_NO)
);
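As a quick check of how the default is applied, the sketch below runs the statement above in an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption made for convenience, and the two inserted students are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine for the demo
conn.execute("""
    CREATE TABLE STUDENTS (
        ROLL_NO     INT NOT NULL,
        STU_NAME    VARCHAR(35) NOT NULL,
        STU_AGE     INT NOT NULL,
        EXAM_FEE    INT DEFAULT 10000,
        STU_ADDRESS VARCHAR(35),
        PRIMARY KEY (ROLL_NO)
    )
""")

# No EXAM_FEE given: the column falls back to its default.
conn.execute(
    "INSERT INTO STUDENTS (ROLL_NO, STU_NAME, STU_AGE) VALUES (1, 'Steve', 23)"
)
# EXAM_FEE given explicitly: the default is not used.
conn.execute(
    "INSERT INTO STUDENTS (ROLL_NO, STU_NAME, STU_AGE, EXAM_FEE) "
    "VALUES (2, 'John', 24, 12000)"
)

# Map each roll number to the fee that was actually stored.
fees = dict(conn.execute("SELECT ROLL_NO, EXAM_FEE FROM STUDENTS"))
print(fees)
```

The default only fills in a missing value; it never overrides a fee that the INSERT statement supplies.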

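Set DEFAULT constraint on already created table

What if we want to set this constraint on an already existing table? For this we can use the ALTER TABLE statement. The MODIFY form shown here is the MySQL / Oracle style; SQL Server instead adds a named default with ADD CONSTRAINT ... DEFAULT ... FOR <column_name>.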
Syntax:
ALTER TABLE <table_name>
MODIFY <column_name> <column_data_type> DEFAULT <default_value>;

Example:
ALTER TABLE STUDENTS
MODIFY EXAM_FEE INT DEFAULT 10000;

This way we can set constraint on already created table.

How to drop DEFAULT Constraint


In the above sections, we have learnt the ways to set the DEFAULT constraint. Here we will see how to drop (delete) it:

Syntax:
ALTER TABLE <table_name>
ALTER COLUMN <column_name> DROP DEFAULT;

Example:
Let's say we want to drop the DEFAULT constraint from the EXAM_FEE column of the STUDENTS table, which we created in the above sections. We can do it like this:

ALTER TABLE STUDENTS
ALTER COLUMN EXAM_FEE DROP DEFAULT;

CHECK:

This constraint restricts the values that a particular column of a table can hold. When it is set on a column, every value stored in that column must satisfy the specified condition, for example by falling in a given range.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL CHECK(ROLL_NO >1000) ,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35) ,
PRIMARY KEY (ROLL_NO)
);

In the above example we have set the CHECK constraint on the ROLL_NO column of the STUDENT table. Now, the ROLL_NO field must hold a value greater than 1000.
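The effect of the CHECK constraint can be tried out with a short sketch. It assumes an in-memory SQLite database (via Python's sqlite3 module) as a stand-in engine, and the roll numbers 1001 and 500 are sample values chosen to sit on either side of the limit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine for the demo
conn.execute("""
    CREATE TABLE STUDENT (
        ROLL_NO     INT NOT NULL CHECK (ROLL_NO > 1000),
        STU_NAME    VARCHAR(35) NOT NULL,
        STU_AGE     INT NOT NULL,
        EXAM_FEE    INT DEFAULT 10000,
        STU_ADDRESS VARCHAR(35),
        PRIMARY KEY (ROLL_NO)
    )
""")

def try_insert(roll_no):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (?, 'Steve', 23)",
            (roll_no,),
        )
        return True
    except sqlite3.IntegrityError:
        return False

accepted_1001 = try_insert(1001)  # satisfies ROLL_NO > 1000
accepted_500 = try_insert(500)    # violates the CHECK condition
print(accepted_1001, accepted_500)
```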

Key constraints:
PRIMARY KEY:

Primary key uniquely identifies each record in a table. It must have unique values and
cannot contain nulls. In the below example the ROLL_NO field is marked as primary key,
that means the ROLL_NO field cannot have duplicate and null values.
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO)
);

Primary key in DBMS

Definition: A primary key is a column or set of columns in a table that uniquely identifies tuples (rows) in that table.
Example:
Student Table
Stu_Id   Stu_Name   Stu_Age
101      Steve      23
102      John       24
103      Robert     28
104      Carl       22

In the above Student table, the Stu_Id column uniquely identifies each row of the table.
Note:

We denote the primary key by underlining the column name.

The value of the primary key should be unique for each row of the table. A primary key column cannot contain duplicate values.

A primary key column should not contain nulls.

A primary key is not necessarily a single column; more than one column can together form the primary key of a table. For e.g. {Stu_Id, Stu_Name} collectively can play the role of primary key in the above table, but that does not make sense because Stu_Id alone is enough to uniquely identify rows in the table, so why make things complex? Having said that, we should choose more than one column as the primary key only when there is no single column that can play the role of primary key.
How to choose a primary key?
There are two ways: either create a column and let the database automatically fill it with numbers in increasing order for each row, or choose a column yourself, making sure that it does not contain duplicates and nulls.
For e.g. in the above Student table, the Stu_Name column cannot be a primary key as more than one person can have the same name. Similarly, the Stu_Age column cannot play the primary key role as more than one person can have the same age.
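The "no duplicates, no nulls" test can be sketched in a few lines of Python. The helper name can_be_primary_key is hypothetical, and the fifth row (Stu_Id 105) is made-up data added to the Student table above so that a repeated name and age actually show up.

```python
def can_be_primary_key(rows, columns):
    """True if the column combination has no NULLs (None) and no duplicates."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if any(value is None for value in key) or key in seen:
            return False
        seen.add(key)
    return True

students = [
    {"Stu_Id": 101, "Stu_Name": "Steve", "Stu_Age": 23},
    {"Stu_Id": 102, "Stu_Name": "John", "Stu_Age": 24},
    {"Stu_Id": 103, "Stu_Name": "Robert", "Stu_Age": 28},
    {"Stu_Id": 104, "Stu_Name": "Carl", "Stu_Age": 22},
    # Hypothetical extra student: same name as 101, same age as 104.
    {"Stu_Id": 105, "Stu_Name": "Steve", "Stu_Age": 22},
]

id_ok = can_be_primary_key(students, ["Stu_Id"])
name_ok = can_be_primary_key(students, ["Stu_Name"])
age_ok = can_be_primary_key(students, ["Stu_Age"])
print(id_ok, name_ok, age_ok)
```

Passing this test on the current rows is necessary but not sufficient: the column must stay unique for every row that could ever be inserted, which is a design decision rather than something the data alone can prove.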

FOREIGN KEY:

Foreign keys are the columns of a table that point to the primary key of another table. They act as a cross-reference between tables.

Foreign key in DBMS


Example:
In the below example the Stu_Id column in the Course_enrollment table is a foreign key as it points to the primary key of the Student table.
Course_enrollment table:
Course_Id   Stu_Id
C01         101
C02         102
C03         101
C05         102
C06         103
C07         102

Student table:
Stu_Id   Stu_Name    Stu_Age
101      Chaitanya   22
102      Arya        26
103      Bran        25
104      Jon         21
Note: Practically, a foreign key does not have to reference the column tagged as primary key in the other table. If it points to any unique column (not necessarily the primary key) of another table, it is still a foreign key. So, a more precise definition would be: foreign keys are the columns of a table that point to a candidate key of another table.
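Here is a minimal sketch of the two tables above with the foreign key enforced. It assumes SQLite (through Python's sqlite3 module) as a stand-in engine; SQLite only checks foreign keys after PRAGMA foreign_keys = ON is issued. The enrollment C09 for student 999 is a made-up row that points at a student who does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

conn.execute(
    "CREATE TABLE Student (Stu_Id INT PRIMARY KEY, Stu_Name VARCHAR(35), Stu_Age INT)"
)
conn.execute("""
    CREATE TABLE Course_enrollment (
        Course_Id VARCHAR(5) PRIMARY KEY,
        Stu_Id    INT,
        FOREIGN KEY (Stu_Id) REFERENCES Student (Stu_Id)
    )
""")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(101, "Chaitanya", 22), (102, "Arya", 26), (103, "Bran", 25), (104, "Jon", 21)],
)

# Stu_Id 101 exists in Student, so this enrollment is accepted.
conn.execute("INSERT INTO Course_enrollment VALUES ('C01', 101)")

try:
    # Hypothetical row: there is no student 999, so the reference dangles.
    conn.execute("INSERT INTO Course_enrollment VALUES ('C09', 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

enrollments = conn.execute("SELECT COUNT(*) FROM Course_enrollment").fetchone()[0]
print(orphan_rejected, enrollments)
```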

Domain constraints:

Each table has a certain set of columns, and each column accepts only one type of data, determined by its data type. The column does not accept values of any other data type.
A domain is a user-defined data type that combines a base data type with constraints, and we can define domain constraints like this:
Domain Constraint = data type + Constraints (NOT NULL / UNIQUE / PRIMARY KEY / FOREIGN KEY / CHECK / DEFAULT)

Domain constraints in DBMS


Example:
For example, say I want to create a table student_info whose stu_id field must have a value greater than 100. I can create a domain and a table like this:
create domain id_value int
constraint id_test
check(value > 100);
create table student_info (
stu_id id_value PRIMARY KEY,
stu_name varchar(30),
stu_age int
);

Another example:
I want to create a table bank_account whose account_type field holds either Checking or Saving:
create domain account_type char(12)
constraint acc_type_test
check(value in ('Checking', 'Saving'));

create table bank_account (
account_nbr int PRIMARY KEY,
account_holder_name varchar(30),
account_type account_type
);

Mapping constraints in DBMS


Mapping constraints can be explained in terms of mapping cardinality:
Mapping Cardinality:
One to One: An entity of entity-set A can be associated with at most one entity of entity-set B and an entity in entity-set B can be associated with at most one entity of entity-set A.
One to Many: An entity of entity-set A can be associated with any number of entities of entity-set B and an entity in entity-set B can be associated with at most one entity of entity-set A.
Many to One: An entity of entity-set A can be associated with at most one entity of entity-set B and an entity in entity-set B can be associated with any number of entities of entity-set A.
Many to Many: An entity of entity-set A can be associated with any number of entities of entity-set B and an entity in entity-set B can be associated with any number of entities of entity-set A.
We can have these constraints in place while creating tables in the database.
Example:
CREATE TABLE Customer (
customer_id int PRIMARY KEY NOT NULL,
first_name varchar(20),
last_name varchar(20)
);
CREATE TABLE Orders (
order_id int PRIMARY KEY NOT NULL,
customer_id int,
order_details varchar(50),
constraint fk_Customers foreign key (customer_id)
references Customer (customer_id)
);

Assuming that a customer can order more than once, the above tables represent a one to many relation: one customer row can be referenced by many order rows. Similarly, we can achieve other mapping constraints based on the requirements.
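The four mapping cardinalities can also be read off a list of associations. The sketch below is one possible way to do that in Python; the function name mapping_cardinality and the (customer_id, order_id) pairs are hypothetical and exist only for illustration.

```python
from collections import defaultdict

def mapping_cardinality(pairs):
    """Classify (a, b) associations between entity-set A and entity-set B."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    # Does some A entity relate to several B entities, and vice versa?
    a_has_many = any(len(bs) > 1 for bs in a_to_b.values())
    b_has_many = any(len(as_) > 1 for as_ in b_to_a.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"

# Hypothetical data: customer 1 placed two orders, customer 2 placed one.
customer_orders = [(1, "O1"), (1, "O2"), (2, "O3")]
result = mapping_cardinality(customer_orders)
print(result)
```

This only describes the associations that happen to be present. A mapping constraint is stronger: it restricts every association that may ever be stored, which is what the foreign key in the tables above enforces.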

Cardinality in DBMS

In DBMS you may hear the term cardinality in two different places, and it has a different meaning in each.
In Context of Data Models:
In terms of data modeling, cardinality refers to the relationship between two tables. It can be of four types:
One to One     A single row of table 1 associates with a single row of table 2
One to Many    A single row of table 1 associates with more than one row of table 2
Many to One    Many rows of table 1 associate with a single row of table 2
Many to Many   Many rows of table 1 associate with many rows of table 2
In Context of Query Optimization:
In terms of queries, cardinality refers to the uniqueness of the values in a column of a table. A column with all unique values has high cardinality and a column with all duplicate values has low cardinality. These cardinality figures help in query optimization.

Functional dependency in DBMS

The attributes of a table are said to be dependent on each other when an attribute of a table uniquely identifies another attribute of the same table.
For example: Suppose we have a student table with attributes Stu_Id, Stu_Name, Stu_Age. Here the Stu_Id attribute uniquely identifies the Stu_Name attribute of the student table, because if we know the student id we can tell the student name associated with it. This is known as functional dependency and can be written as Stu_Id->Stu_Name, or in words: Stu_Name is functionally dependent on Stu_Id.
Formally:
If column A of a table uniquely identifies column B of the same table then it can be represented as A->B (attribute B is functionally dependent on attribute A).
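A functional dependency can be tested against the rows of a table: A->B holds if no two rows agree on A but differ on B. The sketch below is a minimal illustration; the helper name fd_holds is hypothetical, and the row with Stu_Id 105 is made-up data added so that one dependency visibly fails.

```python
def fd_holds(rows, lhs, rhs):
    """True if every combination of lhs values maps to a single rhs combination."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

students = [
    {"Stu_Id": 101, "Stu_Name": "Steve", "Stu_Age": 23},
    {"Stu_Id": 102, "Stu_Name": "John", "Stu_Age": 24},
    {"Stu_Id": 103, "Stu_Name": "Robert", "Stu_Age": 28},
    {"Stu_Id": 104, "Stu_Name": "Carl", "Stu_Age": 22},
    # Hypothetical second student named Steve with a different age.
    {"Stu_Id": 105, "Stu_Name": "Steve", "Stu_Age": 30},
]

id_determines_name = fd_holds(students, ["Stu_Id"], ["Stu_Name"])
name_determines_age = fd_holds(students, ["Stu_Name"], ["Stu_Age"])
print(id_determines_name, name_determines_age)
```

A check like this can only refute a dependency. Whether Stu_Id->Stu_Name is a real rule of the data is decided by the meaning of the attributes, not by one snapshot of the table.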

Types of Functional Dependencies

Trivial functional dependency
Non-trivial functional dependency
Multivalued dependency
Transitive dependency

1. Trivial functional dependency in DBMS

The dependency of an attribute on a set of attributes is known as trivial functional dependency if the set of attributes includes that attribute.
Symbolically: A->B is a trivial functional dependency if B is a subset of A.
The following dependencies are also trivial: A->A & B->B
For example: Consider a table with two columns Student_Id and Student_Name.

{Student_Id, Student_Name} -> Student_Id is a trivial functional dependency as Student_Id is a subset of {Student_Id, Student_Name}. That makes sense because if we know the values of Student_Id and Student_Name then the value of Student_Id can be uniquely determined.

Also, Student_Id -> Student_Id & Student_Name -> Student_Name are trivial dependencies too.

2. Non trivial functional dependency in DBMS

If a functional dependency X->Y holds true where Y is not a subset of X then this dependency is called a non-trivial functional dependency.
For example:
An employee table with three attributes: emp_id, emp_name, emp_address.
The following functional dependencies are non-trivial:
emp_id -> emp_name (emp_name is not a subset of emp_id)
emp_id -> emp_address (emp_address is not a subset of emp_id)

On the other hand, the following dependency is trivial:
{emp_id, emp_name} -> emp_name [emp_name is a subset of {emp_id, emp_name}]
Completely non-trivial FD:
If an FD X->Y holds true where the intersection of X and Y is empty then this dependency is said to be a completely non-trivial functional dependency.
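The three definitions above reduce to two set tests, which the following sketch spells out. The function name classify_fd is hypothetical, and it returns the most specific label that applies (every completely non-trivial dependency is also non-trivial).

```python
def classify_fd(lhs, rhs):
    """Classify X->Y using only the attribute sets X (lhs) and Y (rhs)."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:            # Y is a subset of X
        return "trivial"
    if lhs & rhs:             # Y overlaps X but is not contained in it
        return "non-trivial"
    return "completely non-trivial"   # X and Y share no attribute

a = classify_fd({"emp_id", "emp_name"}, {"emp_name"})
b = classify_fd({"emp_id"}, {"emp_name"})
c = classify_fd({"emp_id", "emp_name"}, {"emp_name", "emp_address"})
print(a, b, c, sep=" | ")
```

Note that this classification looks only at the attribute names; whether the dependency actually holds in the table is a separate question.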

3. Multivalued dependency in DBMS

Multivalued dependency occurs when there is more than one independent multivalued attribute in a table.
For example: Consider a bike manufacturing company, which produces two colors (Black and Red) of each model every year.

bike_model   manuf_year   color
M1001        2007         Black
M1001        2007         Red
M2012        2008         Black
M2012        2008         Red
M2222        2009         Black
M2222        2009         Red

Here the columns manuf_year and color are independent of each other and dependent on bike_model. In this case these two columns are said to be multivalued dependent on bike_model. These dependencies can be represented like this:
bike_model ->> manuf_year
bike_model ->> color
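One way to test a multivalued dependency X ->> Y on actual rows is to check that, for each value of X, the Y values and the values of the remaining attributes appear in every possible combination. The sketch below does this for the bike table; the function name mvd_holds is hypothetical, and the two-row counterexample at the end is made-up data.

```python
from itertools import product

def mvd_holds(rows, x, y, attributes):
    """Check X ->> Y: in each X group, Y values combine freely with the rest (Z)."""
    z = [a for a in attributes if a not in x and a not in y]
    groups = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        y_val = tuple(row[a] for a in y)
        z_val = tuple(row[a] for a in z)
        groups.setdefault(key, set()).add((y_val, z_val))
    for combos in groups.values():
        ys = {y_val for y_val, _ in combos}
        zs = {z_val for _, z_val in combos}
        # Independence means the group is exactly the cross product ys x zs.
        if combos != set(product(ys, zs)):
            return False
    return True

attributes = ["bike_model", "manuf_year", "color"]
bikes = [
    {"bike_model": m, "manuf_year": yr, "color": c}
    for m, yr in [("M1001", 2007), ("M2012", 2008), ("M2222", 2009)]
    for c in ("Black", "Red")
]
color_mvd = mvd_holds(bikes, ["bike_model"], ["color"], attributes)

# Hypothetical counterexample: year and color are tied together, not independent.
tied = [
    {"bike_model": "M1", "manuf_year": 2007, "color": "Black"},
    {"bike_model": "M1", "manuf_year": 2008, "color": "Red"},
]
tied_mvd = mvd_holds(tied, ["bike_model"], ["color"], attributes)
print(color_mvd, tied_mvd)
```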

4. Transitive dependency in DBMS


A functional dependency is said to be transitive if it is indirectly formed by two functional dependencies. For e.g.
X -> Z is a transitive dependency if the following three conditions hold true:

X->Y

Y does not ->X

Y->Z
Note: A transitive dependency can only occur in a relation of three or more attributes. Identifying this dependency helps us normalize the database to 3NF (3rd Normal Form).
Example: Let's take an example to understand it better:
Book                 Author                Author_age
Game of Thrones      George R. R. Martin   66
Harry Potter         J. K. Rowling         49
Dying of the Light   George R. R. Martin   66

{Book} ->{Author} (if we know the book, we know the author name)
{Author} does not ->{Book}
{Author} -> {Author_age}
Therefore, as per the rule of transitive dependency, {Book} -> {Author_age} should hold. That makes sense because if we know the book name we can know the author's age.
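The step from the two given dependencies to {Book} -> {Author_age} can be reproduced by computing an attribute closure: start from a set of attributes and keep adding everything the known dependencies determine. This is a minimal sketch of that standard technique; the function name closure is our own.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we also know the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    ({"Book"}, {"Author"}),
    ({"Author"}, {"Author_age"}),
]

book_closure = closure({"Book"}, fds)
author_closure = closure({"Author"}, fds)
print(sorted(book_closure), sorted(author_closure))
```

Author_age lands in the closure of {Book} only by way of Author, which is exactly the transitive dependency. Book is not in the closure of {Author}, matching the condition that {Author} does not determine {Book}.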

Normalization in DBMS: 1NF, 2NF, 3NF and BCNF in Database

Normalization is a process of organizing the data in a database to avoid data redundancy, insertion anomaly, update anomaly & deletion anomaly. Let's discuss anomalies first, then we will discuss normal forms with examples.

Anomalies in DBMS
There are three types of anomalies that occur when the database is not normalized. These are insertion, update and deletion anomalies. Let's take an example to understand this.
Example: Suppose a manufacturing company stores the employee details in a table named employee that has four attributes: emp_id for storing the employee's id, emp_name for storing the employee's name, emp_address for storing the employee's address and emp_dept for storing the details of the department in which the employee works. At some point of time the table looks like this:
emp_id   emp_name   emp_address   emp_dept
101      Rick       Delhi         D001
101      Rick       Delhi         D002
123      Maggie     Agra          D890
166      Glenn      Chennai       D900
166      Glenn      Chennai       D004

The above table is not normalized. We will see the problems that we face when a table is not normalized.
Update anomaly: In the above table we have two rows for employee Rick as he belongs to two departments of the company. If we want to update the address of Rick then we have to update it in both rows or the data will become inconsistent. If the correct address somehow gets updated in one row but not in the other, then as per the database Rick would have two different addresses, which is not correct and would lead to inconsistent data.
Insert anomaly: Suppose a new employee joins the company who is under training and currently not assigned to any department. We would not be able to insert the data into the table if the emp_dept field doesn't allow nulls.
Delete anomaly: Suppose at some point of time the company closes the department D890. Deleting the rows that have emp_dept as D890 would also delete the information of employee Maggie, since she is assigned only to this department.
To overcome these anomalies we need to normalize the data. In the next section we will discuss normalization.

Normalization
Here are the most commonly used normal forms:

First normal form (1NF)
Second normal form (2NF)
Third normal form (3NF)
Boyce & Codd normal form (BCNF)

First normal form (1NF)


As per the rule of first normal form, an attribute (column) of a table cannot hold multiple
values. It should hold only atomic values.
Example: Suppose a company wants to store the names and contact details of its employees. It creates a table that looks like this:
emp_id   emp_name   emp_address   emp_mobile
101      Herschel   New Delhi     8912312390
102      Jon        Kanpur        8812121212
                                  9900012222
103      Ron        Chennai       7778881212
104      Lester     Bangalore     9990000123
                                  8123450987

Two employees (Jon & Lester) have two mobile numbers each, so the company stored both numbers in the same field, as you can see in the table above.
This table is not in 1NF: the rule says each attribute of a table must have atomic (single) values, and the emp_mobile values for employees Jon & Lester violate that rule.
To make the table comply with 1NF we should have the data like this:
emp_id   emp_name   emp_address   emp_mobile
101      Herschel   New Delhi     8912312390
102      Jon        Kanpur        8812121212
102      Jon        Kanpur        9900012222
103      Ron        Chennai       7778881212
104      Lester     Bangalore     9990000123
104      Lester     Bangalore     8123450987
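The conversion from the first table to the second is mechanical: every value of the multi-valued field gets its own row. Here is a minimal sketch, assuming the non-1NF table is held as Python dictionaries with emp_mobile stored as a list; the function name to_first_normal_form is hypothetical.

```python
def to_first_normal_form(rows, multivalued_column):
    """Emit one row per value of the multi-valued column."""
    flat = []
    for row in rows:
        for value in row[multivalued_column]:
            # Copy the row, replacing the list with a single atomic value.
            flat.append({**row, multivalued_column: value})
    return flat

employees = [
    {"emp_id": 101, "emp_name": "Herschel", "emp_address": "New Delhi",
     "emp_mobile": ["8912312390"]},
    {"emp_id": 102, "emp_name": "Jon", "emp_address": "Kanpur",
     "emp_mobile": ["8812121212", "9900012222"]},
    {"emp_id": 103, "emp_name": "Ron", "emp_address": "Chennai",
     "emp_mobile": ["7778881212"]},
    {"emp_id": 104, "emp_name": "Lester", "emp_address": "Bangalore",
     "emp_mobile": ["9990000123", "8123450987"]},
]

flat = to_first_normal_form(employees, "emp_mobile")
for row in flat:
    print(row["emp_id"], row["emp_name"], row["emp_address"], row["emp_mobile"])
```

The price of this step is repetition: the name and address of Jon and Lester now appear twice, which is the kind of redundancy the later normal forms deal with.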

Second normal form (2NF)


A table is said to be in 2NF if both the following conditions hold:

Table is in 1NF (First normal form)

No non-prime attribute is dependent on a proper subset of any candidate key of the table.

An attribute that is not part of any candidate key is known as a non-prime attribute.
Example: Suppose a school wants to store the data of teachers and the subjects they teach. They create a table that looks like the one below. Since a teacher can teach more than one subject, the table can have multiple rows for the same teacher.
teacher_id   subject     teacher_age
111          Maths       38
111          Physics     38
222          Biology     38
333          Physics     40
333          Chemistry   40

Candidate Keys: {teacher_id, subject}
Non-prime attribute: teacher_age
The table is in 1NF because each attribute has atomic values. However, it is not in 2NF because the non-prime attribute teacher_age is dependent on teacher_id alone, which is a proper subset of the candidate key.
This violates the rule for 2NF, as the rule says no non-prime attribute is dependent on a proper subset of any candidate key of the table.
To make the table comply with 2NF we can break it into two tables like this:
teacher_details table:
teacher_id   teacher_age
111          38
222          38
333          40

teacher_subject table:
teacher_id   subject
111          Maths
111          Physics
222          Biology
333          Physics
333          Chemistry

Now the tables comply with Second normal form (2NF).
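The split into teacher_details and teacher_subject is a pair of projections, and joining them back on teacher_id should give the original rows again. The following sketch checks that; the helper name project is our own, and the join is written by hand as a dictionary lookup.

```python
def project(rows, columns):
    """Keep only the given columns and drop duplicate rows (order preserved)."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(columns, key)))
    return out

teachers = [
    {"teacher_id": 111, "subject": "Maths", "teacher_age": 38},
    {"teacher_id": 111, "subject": "Physics", "teacher_age": 38},
    {"teacher_id": 222, "subject": "Biology", "teacher_age": 38},
    {"teacher_id": 333, "subject": "Physics", "teacher_age": 40},
    {"teacher_id": 333, "subject": "Chemistry", "teacher_age": 40},
]

teacher_details = project(teachers, ["teacher_id", "teacher_age"])
teacher_subject = project(teachers, ["teacher_id", "subject"])

# Join the two tables back together on teacher_id.
age_by_id = {r["teacher_id"]: r["teacher_age"] for r in teacher_details}
rejoined = [{**r, "teacher_age": age_by_id[r["teacher_id"]]} for r in teacher_subject]

print(len(teacher_details), len(teacher_subject), rejoined == teachers)
```

Each teacher's age is now stored once instead of once per subject, and nothing is lost because the join reproduces the original table.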

Third Normal form (3NF)


A table design is said to be in 3NF if both the following conditions hold:

Table must be in 2NF

Transitive functional dependency of a non-prime attribute on any super key should be removed.

An attribute that is not part of any candidate key is known as a non-prime attribute.
In other words, 3NF can be explained like this: A table is in 3NF if it is in 2NF and for each functional dependency X->Y at least one of the following conditions holds:

X is a super key of the table

Y is a prime attribute of the table

An attribute that is a part of one of the candidate keys is known as a prime attribute.
Example: Suppose a company wants to store the complete address of each employee. They create a table named employee_details that looks like this:
emp_id   emp_name   emp_zip   emp_state   emp_city   emp_district
1001     John       282005    UP          Agra       Dayal Bagh
1002     Ajeet      222008    TN          Chennai    M-City
1006     Lora       282007    TN          Chennai    Urrapakkam
1101     Lilly      292008    UK          Pauri      Bhagwan
1201     Steve      222999    MP          Gwalior    Ratan

Super keys: {emp_id}, {emp_id, emp_name}, {emp_id, emp_name, emp_zip} and so on
Candidate Keys: {emp_id}
Non-prime attributes: all attributes except emp_id are non-prime as they are not part of any candidate key.
Here, emp_state, emp_city & emp_district are dependent on emp_zip, and emp_zip is dependent on emp_id. That makes the non-prime attributes (emp_state, emp_city & emp_district) transitively dependent on the super key (emp_id). This violates the rule of 3NF.
To make this table comply with 3NF we have to break it into two tables to remove the transitive dependency:
employee table:
emp_id   emp_name   emp_zip
1001     John       282005
1002     Ajeet      222008
1006     Lora       282007
1101     Lilly      292008
1201     Steve      222999

employee_zip table:
emp_zip   emp_state   emp_city   emp_district
282005    UP          Agra       Dayal Bagh
222008    TN          Chennai    M-City
282007    TN          Chennai    Urrapakkam
292008    UK          Pauri      Bhagwan
222999    MP          Gwalior    Ratan
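The 3NF condition (for each X->Y, either X is a super key or Y is a prime attribute) can be checked mechanically once the functional dependencies are written down. The sketch below is one way to do it; the function names are hypothetical, and the dependency emp_id -> {emp_name, emp_zip} is our reading of the example, since emp_id is its candidate key.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def third_nf_violations(attributes, fds, candidate_keys):
    """List (X, attribute) pairs where X is no super key and the attribute is non-prime."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= set(attributes)
        for attr in sorted(set(rhs) - set(lhs)):
            if not is_superkey and attr not in prime:
                violations.append((tuple(sorted(lhs)), attr))
    return violations

id_fd = ({"emp_id"}, {"emp_name", "emp_zip"})
zip_fd = ({"emp_zip"}, {"emp_state", "emp_city", "emp_district"})

before = third_nf_violations(
    {"emp_id", "emp_name", "emp_zip", "emp_state", "emp_city", "emp_district"},
    [id_fd, zip_fd],
    [{"emp_id"}],
)
employee = third_nf_violations({"emp_id", "emp_name", "emp_zip"}, [id_fd], [{"emp_id"}])
employee_zip = third_nf_violations(
    {"emp_zip", "emp_state", "emp_city", "emp_district"}, [zip_fd], [{"emp_zip"}]
)
print(before, employee, employee_zip)
```

The three violations found in the original table all come from emp_zip, and both tables of the decomposition come back clean.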

Boyce Codd normal form (BCNF)


It is an advanced version of 3NF; that's why it is also referred to as 3.5NF. BCNF is stricter than 3NF. A table complies with BCNF if it is in 3NF and for every non-trivial functional dependency X->Y, X is a super key of the table.
Example: Suppose there is a company wherein employees work in more than one department. They store the data like this:
emp_id   emp_nationality   emp_dept                       dept_type   dept_no_of_emp
1001     Austrian          Production and planning        D001        200
1001     Austrian          stores                         D001        250
1002     American          design and technical support   D134        100
1002     American          Purchasing department          D134        600

Functional dependencies in the table above:
emp_id -> emp_nationality
emp_dept -> {dept_type, dept_no_of_emp}
Candidate key: {emp_id, emp_dept}
The table is not in BCNF as neither emp_id nor emp_dept alone is a key, yet each appears alone on the left side of a functional dependency.
To make the table comply with BCNF we can break it into three tables like this:
emp_nationality table:
emp_id   emp_nationality
1001     Austrian
1002     American

emp_dept table:
emp_dept                       dept_type   dept_no_of_emp
Production and planning        D001        200
stores                         D001        250
design and technical support   D134        100
Purchasing department          D134        600

emp_dept_mapping table:
emp_id   emp_dept
1001     Production and planning
1001     stores
1002     design and technical support
1002     Purchasing department

Functional dependencies:
emp_id -> emp_nationality
emp_dept -> {dept_type, dept_no_of_emp}
Candidate keys:
For first table: emp_id
For second table: emp_dept
For third table: {emp_id, emp_dept}
The tables are now in BCNF, as the left side of each functional dependency is a key of the table in which that dependency holds.
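The BCNF rule can be checked the same way as any other dependency rule: for every non-trivial X->Y, test whether the closure of X covers the whole table. The sketch below applies that test to the original table and to the three tables of the decomposition. The function names are our own, and each decomposed table is checked only against the dependencies whose attributes it contains.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose left side is not a super key."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        is_superkey = closure(lhs, fds) >= set(attributes)
        if not trivial and not is_superkey:
            violations.append((sorted(lhs), sorted(rhs)))
    return violations

nationality_fd = ({"emp_id"}, {"emp_nationality"})
dept_fd = ({"emp_dept"}, {"dept_type", "dept_no_of_emp"})

original = bcnf_violations(
    {"emp_id", "emp_nationality", "emp_dept", "dept_type", "dept_no_of_emp"},
    [nationality_fd, dept_fd],
)
emp_nationality_table = bcnf_violations({"emp_id", "emp_nationality"}, [nationality_fd])
emp_dept_table = bcnf_violations({"emp_dept", "dept_type", "dept_no_of_emp"}, [dept_fd])
emp_dept_mapping_table = bcnf_violations({"emp_id", "emp_dept"}, [])

print(len(original), emp_nationality_table, emp_dept_table, emp_dept_mapping_table)
```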

Transaction Management in DBMS


ACID properties in DBMS
To ensure the integrity of data during a transaction (a transaction is a unit of program that updates various data items), the database system maintains the following properties. These properties are widely known as ACID properties:

Atomicity: This property ensures that either all the operations of a transaction are reflected in the database or none are. Let's take an example of a banking system to understand this: Suppose account A has a balance of 400$ & B has 700$. Account A is transferring 100$ to account B. This is a transaction that has two operations: a) debiting 100$ from A's balance, b) crediting 100$ to B's balance. Let's say the first operation passed successfully while the second failed; in this case A's balance would be 300$ while B would have 700$ instead of 800$. This is unacceptable in a banking system. Either the transaction should fail without executing any of the operations or it should process both the operations. The Atomicity property ensures that.

Consistency: Every transaction must take the database from one consistent state to another, so that the rules the data must obey (for example, a transfer must not change the combined balance of the accounts involved) still hold when it finishes. When transactions run concurrently, preserving consistency also requires that they do not interfere with one another. For example, account A has a balance of 400$ and is transferring 100$ each to accounts B & C, so we have two transactions here. Let's say these transactions run concurrently and both read the 400$ balance; in that case the final balance of A would be 300$ instead of 200$. This is wrong. If the transactions were to run in isolation, the second transaction would have read the correct balance of 300$ (before debiting 100$) once the first transaction had completed successfully.

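Isolation: Transactions that run concurrently must not see each other's intermediate results. The outcome should be the same as if, for every pair of transactions, one started execution only after the other had finished. The example discussed under the Consistency property above shows what goes wrong without isolation.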

Durability: Once a transaction completes successfully, the changes it has made to the database should be permanent even if there is a system failure. The recovery-management component of database systems ensures the durability of transactions.
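The atomicity example (A transfers 100$ to B) can be replayed with a small sketch. It assumes an in-memory SQLite database accessed through Python's sqlite3 module; the account table, the transfer function and the fail_midway switch are all hypothetical names introduced for this demo. Using the connection as a context manager commits the transaction on success and rolls it back if an exception escapes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INT)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 400), ("B", 700)])
conn.commit()

def transfer(conn, source, target, amount, fail_midway=False):
    """Debit and credit inside one transaction; an error undoes both steps."""
    try:
        with conn:  # commit on success, roll back on exception
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
            if fail_midway:
                raise RuntimeError("simulated failure between debit and credit")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
        return True
    except RuntimeError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

failed_ok = transfer(conn, "A", "B", 100, fail_midway=True)
after_failure = balances(conn)   # the debit was rolled back
success_ok = transfer(conn, "A", "B", 100)
after_success = balances(conn)   # both steps were applied
print(after_failure, after_success)
```

After the simulated failure the balances are unchanged, so the half-finished state (A at 300$, B still at 700$) is never visible.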

Deadlock in DBMS

A deadlock is a condition wherein two or more tasks are waiting for each other in order to finish, but none of the tasks is willing to give up the resources that the other tasks need. In this situation no task ever finishes and each stays in the waiting state forever.

Coffman conditions
Coffman stated four conditions for a deadlock occurrence. A deadlock can occur only if all the following conditions hold true.

Mutual exclusion condition: There must be at least one resource that cannot be used by more than one process at a time.
Hold and wait condition: A process that is holding a resource can request additional resources that are being held by other processes in the system.
No preemption condition: A resource cannot be forcibly taken from a process. Only the process itself can release a resource that it is holding.
Circular wait condition: A condition where one process is waiting for a resource that is being held by a second process, the second process is waiting for a third process, and so on, and the last process is waiting for the first process, thus making a circular chain of waiting.

Deadlock Handling
Ignore the deadlock (Ostrich algorithm)
Did that make you laugh? You may be wondering how ignoring a deadlock can come under deadlock handling. But the Windows you are using on your PC uses this approach of deadlock handling, and that is one reason it sometimes hangs and you have to reboot it to get it working. Not only Windows but UNIX also uses this approach.

The question is why? Why do they ignore a deadlock instead of dealing with it, and why is this called the Ostrich algorithm?
Well! Let me answer the second question first. This is known as the Ostrich algorithm because in this approach we ignore the deadlock and pretend that it would never occur, just like the ostrich's supposed behavior of sticking its head in the sand and pretending there is no problem.

Let's discuss why we ignore it: When deadlocks are believed to be very rare and the cost of deadlock handling is higher, ignoring is a better solution than handling it. For example, take the operating system case: if the time required to handle the deadlock is higher than the time required to reboot Windows, then rebooting would be the preferred choice, considering that deadlocks are very rare in Windows.

Deadlock detection
A resource scheduler keeps track of the resources allocated to and requested by processes. Thus, if there is a deadlock, it is known to the resource scheduler. This is how a deadlock is detected.
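One common way to turn the scheduler's bookkeeping into a deadlock test (an assumption here, since the text does not name a method) is a wait-for graph: draw an edge from each process to the processes holding the resources it is waiting for, then look for a cycle. The following sketch does this with a depth-first search; the function name find_deadlock and the process names are hypothetical.

```python
def find_deadlock(wait_for):
    """Return the processes on a cycle of the wait-for graph, or None."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    color = {p: WHITE for p in wait_for}
    for targets in wait_for.values():
        for t in targets:
            color.setdefault(t, WHITE)
    path = []

    def visit(p):
        color[p] = GREY
        path.append(p)
        for q in wait_for.get(p, ()):
            if color[q] == GREY:          # q is already on the path: a cycle
                return path[path.index(q):]
            if color[q] == WHITE:
                cycle = visit(q)
                if cycle:
                    return cycle
        color[p] = BLACK
        path.pop()
        return None

    for p in list(color):
        if color[p] == WHITE:
            cycle = visit(p)
            if cycle:
                return cycle
    return None

# P1 waits for P2, P2 waits for P3, P3 waits for P1: a circular wait.
deadlocked = find_deadlock({"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]})
# Same chain without the last edge: P3 can finish, so no deadlock.
healthy = find_deadlock({"P1": ["P2"], "P2": ["P3"], "P3": []})
print(deadlocked, healthy)
```

The cycle returned is exactly the circular wait from the Coffman conditions, and its members are the candidates for termination or preemption.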

Once a deadlock is detected, it is corrected by the following methods:

Terminating processes involved in deadlock: Terminating all the processes involved in the deadlock, or terminating processes one by one until the deadlock is resolved, can be solutions, but neither approach is good. Terminating all processes is costly, and the partial work done by the processes gets lost. Terminating one by one takes a lot of time, because each time a process is terminated we need to check whether the deadlock is resolved or not. Thus, the best approach is to consider process age and priority while terminating processes during a deadlock condition.
Resource Preemption: Another approach is to preempt resources and allocate them to other processes until the deadlock is resolved.

Deadlock prevention
We have learnt that a deadlock can occur only if all four Coffman conditions hold true, so preventing one or more of them prevents the deadlock.

Removing mutual exclusion: All resources must be sharable, which means more than one process can hold a resource at a time. This approach is practically impossible.
Removing hold and wait condition: This can be removed if a process acquires all the resources it needs before starting out. Another way to remove it is to enforce a rule that a process may request resources only when it holds none.
Preemption of resources: Preemption of resources from a process can result in rollback, and thus this needs to be avoided in order to maintain the consistency and stability of the system.
Avoid circular wait condition: This can be avoided if the resources are maintained in a hierarchy and a process can acquire resources only in increasing order of precedence. Another way of doing this is to enforce a one-resource-per-process rule: a process can request a resource only once it releases the resource it currently holds. This also avoids the circular wait.

Deadlock Avoidance
Deadlock can be avoided if resources are allocated in such a way that a deadlock cannot occur. There are two algorithms for deadlock avoidance.

Wait/Die
Wound/Wait

Here is the table representation of resource allocation for each algorithm. Both of these algorithms take process age into consideration while determining the best possible way of resource allocation for deadlock avoidance.

Situation                                                 Wait/Die               Wound/Wait
Older process needs a resource held by younger process    Older process waits    Younger process dies
Younger process needs a resource held by older process    Younger process dies   Younger process waits
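The table boils down to one comparison of process ages. Here is a minimal sketch, assuming each process carries a start timestamp and that a smaller timestamp means an older process; the function name resolve and the result strings are hypothetical.

```python
def resolve(scheme, requester_ts, holder_ts):
    """Decide what happens when a requester needs a resource the holder has.

    Timestamps are start times, so a smaller timestamp means an older process.
    """
    requester_is_older = requester_ts < holder_ts
    if scheme == "wait-die":
        # Older requesters wait; younger requesters are aborted ("die").
        return "requester waits" if requester_is_older else "requester dies"
    if scheme == "wound-wait":
        # Older requesters abort ("wound") the younger holder; younger ones wait.
        return "holder dies" if requester_is_older else "requester waits"
    raise ValueError("unknown scheme: " + scheme)

for scheme in ("wait-die", "wound-wait"):
    print(scheme, "| older requests:", resolve(scheme, 1, 2),
          "| younger requests:", resolve(scheme, 2, 1))
```

In both schemes it is always the younger process that is aborted, so the oldest process can never be starved of progress.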

One of the famous deadlock avoidance algorithms is the Banker's algorithm.

Exact Numeric Data Types:

bigint
int
smallint
tinyint
bit
decimal
numeric
money
smallmoney

Approximate Numeric Data Types:

float
real

Date and Time Data Types:


DATA TYPE       FROM          TO
datetime        Jan 1, 1753   Dec 31, 9999
smalldatetime   Jan 1, 1900   Jun 6, 2079
date            Stores a date like June 30, 1991
time            Stores a time of day like 12:30 P.M.

Character Strings Data Types:


DATA TYPE      Description
char           Maximum length of 8,000 characters. (Fixed-length non-Unicode characters)
varchar        Maximum of 8,000 characters. (Variable-length non-Unicode data)
varchar(max)   Maximum length of 2^31 characters, variable-length non-Unicode data (SQL Server 2005 only).
text           Variable-length non-Unicode data with a maximum length of 2,147,483,647 characters.

Unicode Character Strings Data Types:


DATA TYPE       Description
nchar           Maximum length of 4,000 characters. (Fixed-length Unicode)
nvarchar        Maximum length of 4,000 characters. (Variable-length Unicode)
nvarchar(max)   Maximum length of 2^31 characters (SQL Server 2005 only). (Variable-length Unicode)
ntext           Maximum length of 1,073,741,823 characters. (Variable-length Unicode)

Binary Data Types:


DATA TYPE        Description
binary           Maximum length of 8,000 bytes. (Fixed-length binary data)
varbinary        Maximum length of 8,000 bytes. (Variable-length binary data)
varbinary(max)   Maximum length of 2^31 bytes (SQL Server 2005 only). (Variable-length binary data)
image            Maximum length of 2,147,483,647 bytes. (Variable-length binary data)

SQL Comparison Operators:


Operator   Description
=          Checks if the values of two operands are equal or not, if yes then condition becomes true.
!=         Checks if the values of two operands are equal or not, if values are not equal then condition becomes true.
<>         Checks if the values of two operands are equal or not, if values are not equal then condition becomes true.
>          Checks if the value of left operand is greater than the value of right operand, if yes then condition becomes true.
<          Checks if the value of left operand is less than the value of right operand, if yes then condition becomes true.
>=         Checks if the value of left operand is greater than or equal to the value of right operand, if yes then condition becomes true.
<=         Checks if the value of left operand is less than or equal to the value of right operand, if yes then condition becomes true.
!<         Checks if the value of left operand is not less than the value of right operand, if yes then condition becomes true.
!>         Checks if the value of left operand is not greater than the value of right operand, if yes then condition becomes true.

SQL Logical Operators:


Operator   Description
ALL        The ALL operator is used to compare a value to all values in another value set.
AND        The AND operator allows the existence of multiple conditions in an SQL statement's WHERE clause.
ANY        The ANY operator is used to compare a value to any applicable value in the list according to the condition.
BETWEEN    The BETWEEN operator is used to search for values that are within a set of values, given the minimum value and the maximum value.
EXISTS     The EXISTS operator is used to search for the presence of a row in a specified table that meets certain criteria.
IN         The IN operator is used to compare a value to a list of literal values that have been specified.
LIKE       The LIKE operator is used to compare a value to similar values using wildcard operators.
NOT        The NOT operator reverses the meaning of the logical operator with which it is used. Eg: NOT EXISTS, NOT BETWEEN, NOT IN, etc. This is a negate operator.
OR         The OR operator is used to combine multiple conditions in an SQL statement's WHERE clause.
IS NULL    The IS NULL operator is used to compare a value with a NULL value.
UNIQUE     The UNIQUE operator searches every row of a specified table for uniqueness (no duplicates).
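A few of these operators can be tried out against a small table. The sketch below assumes an in-memory SQLite database through Python's sqlite3 module as a stand-in engine; the addresses are made-up sample data, and the helper names() is a hypothetical convenience that only ever receives the fixed WHERE clauses written in this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine for the demo
conn.execute(
    "CREATE TABLE STUDENTS (ROLL_NO INT PRIMARY KEY, STU_NAME VARCHAR(35), "
    "STU_AGE INT, STU_ADDRESS VARCHAR(35))"
)
conn.executemany(
    "INSERT INTO STUDENTS VALUES (?, ?, ?, ?)",
    [
        (101, "Steve", 23, "Agra"),
        (102, "John", 24, None),      # address unknown (NULL)
        (103, "Robert", 28, "Delhi"),
        (104, "Carl", 22, "Agra"),
    ],
)

def names(where):
    """Return student names matching a fixed, trusted WHERE clause."""
    sql = "SELECT STU_NAME FROM STUDENTS WHERE " + where + " ORDER BY ROLL_NO"
    return [row[0] for row in conn.execute(sql)]

between = names("STU_AGE BETWEEN 23 AND 24")
in_list = names("STU_ADDRESS IN ('Agra', 'Delhi')")
like = names("STU_NAME LIKE 'S%'")
is_null = names("STU_ADDRESS IS NULL")
combined = names("STU_AGE > 22 AND NOT STU_ADDRESS = 'Agra'")
print(between, in_list, like, is_null, combined)
```

John does not appear in the last result even though his address is not Agra: a comparison with NULL is neither true nor false, which is why IS NULL exists as a separate operator.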