
S.No. | Category | Selection | Projection
1. | Other names | The selection operation is also known as horizontal partitioning. | The projection operation is also known as vertical partitioning.
2. | Use | It is used to choose the subset of tuples from the relation that satisfies the condition given in the selection. | It is used to select certain required attributes, while discarding the other attributes.
3. | Partitioning | It partitions the table horizontally. | It partitions the table vertically.
4. | Which is used first | The selection operation is performed before projection (if they are to be used together). | The projection operation is performed after selection (if they are to be used together).
5. | Operator used | The select operator is used in the selection operation. | The project operator is used in the projection operation.
6. | Operator symbol | The select operator is denoted by the sigma symbol (σ). | The project operator is denoted by the pi symbol (∏).

A----Relational algebra is a query language that processes one or more relations to define
another relation.
The basic operations of relational algebra are as follows:
1. Unary operations:
Operations that involve only one relation are called unary operations.
Examples:
Selection, Projection
2. Binary operations:
Operations that involve pairs of relations are called binary operations.
Examples:
Union, Difference, Cartesian product
A-----While modeling the design of a relational database, we can impose
restrictions such as which values may be inserted into a relation and which
kinds of modifications and deletions are allowed. These restrictions are the
constraints of the relational database.
Models such as the ER model do not have such features.
Constraints in databases can be divided into 3 main categories:
1. Constraints that are inherent in the data model are called implicit
constraints.
2. Constraints that are expressed directly in the schemas of the data model,
by specifying them in the DDL (Data Definition Language), are
called schema-based constraints or explicit constraints.
3. Constraints that cannot be expressed directly in the schemas of the data
model are called application-based or semantic constraints.
So here we will deal with Implicit constraints.
Constraints on the relational database are mainly of 4 types:
1. Domain constraints
2. Key constraints
3. Entity Integrity constraints
4. Referential integrity constraints
Let us discuss each of the above constraints in detail.
1. Domain constraints:
1. Every domain must contain atomic values (the smallest indivisible units),
which means composite and multi-valued attributes are not allowed.
2. A datatype check is performed here: when we assign a data type to a
column, we limit the values that it can contain. E.g., if we declare
the attribute age as int, we cannot give it values of any other
datatype.
Example:

Explanation:
In the above relation, Name is a composite attribute and Phone is a multi-valued
attribute, so the relation violates the domain constraint.
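The datatype check in point 2 can be sketched with Python's built-in sqlite3 module. The person table, its columns and the inserted values are made up for illustration. SQLite does not enforce declared column types on ordinary tables, so this sketch spells the check out as a CHECK constraint; many other database systems would reject the value on the declared type alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: age must be stored as an integer.
conn.execute(
    "CREATE TABLE person ("
    " name TEXT,"
    " age INTEGER CHECK (typeof(age) = 'integer'))"
)

conn.execute("INSERT INTO person VALUES ('Asha', 21)")  # accepted

try:
    conn.execute("INSERT INTO person VALUES ('Ravi', 'twenty')")
    domain_violation_rejected = False
except sqlite3.IntegrityError as exc:
    # The text value does not belong to the integer domain.
    domain_violation_rejected = True
    print("rejected:", exc)
```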
2. Key Constraints or Uniqueness Constraints:
1. These are called uniqueness constraints because they ensure that every
tuple in the relation is unique.
2. A relation can have multiple keys, called candidate keys (minimal
superkeys), out of which we choose one as the primary key. There is
no restriction on which candidate key is chosen as the primary key,
but it is advisable to pick the candidate key with the fewest
attributes.
3. Null values are not allowed in the primary key, hence the Not Null
constraint is also a part of the key constraint.
Example:

Explanation:
In the above table, EID is the primary key, and the first and the last tuples have the
same value in EID, i.e. 01, so the table violates the key constraint.
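A minimal sketch of the key constraint, using Python's sqlite3 module. The employee table and the names in it are hypothetical; only the repeated EID value 01 comes from the explanation above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: eid is the primary key, so it must be unique.
conn.execute("CREATE TABLE employee (eid TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO employee VALUES ('01', 'Asha')")
conn.execute("INSERT INTO employee VALUES ('02', 'Ravi')")

try:
    # Same EID as the first tuple, so the key constraint is violated.
    conn.execute("INSERT INTO employee VALUES ('01', 'Meera')")
    duplicate_rejected = False
except sqlite3.IntegrityError as exc:
    duplicate_rejected = True
    print("rejected:", exc)
```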
3. Entity Integrity Constraints:
1. The entity integrity constraint says that no primary key can take a NULL
value, since we use the primary key to identify each tuple uniquely in a
relation.
Example:

Explanation:
In the above relation, EID is the primary key, and a primary key cannot take
NULL values, but in the third tuple the primary key is NULL, so the relation violates
the entity integrity constraint.
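The entity integrity rule can be tried out with Python's sqlite3 module, as in the sketch below. The table and values are hypothetical. One assumption is worth flagging: SQLite accepts NULL in a non-integer PRIMARY KEY column unless NOT NULL is also declared, so the sketch declares it explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is written explicitly because SQLite would otherwise accept NULL here.
conn.execute("CREATE TABLE employee (eid TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO employee VALUES ('01', 'Asha')")

try:
    # A tuple without a primary key value cannot be identified uniquely.
    conn.execute("INSERT INTO employee VALUES (NULL, 'Ravi')")
    null_key_rejected = False
except sqlite3.IntegrityError as exc:
    null_key_rejected = True
    print("rejected:", exc)
```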
4. Referential Integrity Constraints:
1. The referential integrity constraint is specified between two relations
or tables and is used to maintain consistency among the tuples in the two
relations.
2. This constraint is enforced through a foreign key: when the attributes in
the foreign key of relation R1 have the same domain(s) as the primary
key of relation R2, the foreign key of R1 is said to reference or
refer to the primary key of relation R2.
3. The value of the foreign key in a tuple of relation R1 must either be
the primary key value of some tuple in relation R2 or be NULL;
no other value is allowed.
Example:

Explanation:
In the above, DNO of the first relation is the foreign key, and DNO in the second
relation is the primary key. DNO = 22 in the foreign key of the first table is not
allowed since DNO = 22 is not defined in the primary key of the second relation.
Therefore, the referential integrity constraint is violated here.
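The following sketch reproduces the DNO = 22 situation with Python's sqlite3 module. Apart from the DNO column and the value 22, every table name, column and row is an assumption made for the example. SQLite only checks foreign keys after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.execute("CREATE TABLE department (dno INTEGER PRIMARY KEY, place TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " eid INTEGER PRIMARY KEY,"
    " name TEXT,"
    " dno INTEGER REFERENCES department(dno))"
)
conn.execute("INSERT INTO department VALUES (12, 'Delhi')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 12)")    # matches department 12
conn.execute("INSERT INTO employee VALUES (2, 'Ravi', NULL)")  # a NULL foreign key is allowed

try:
    conn.execute("INSERT INTO employee VALUES (3, 'Meera', 22)")  # there is no department 22
    dangling_reference_rejected = False
except sqlite3.IntegrityError as exc:
    dangling_reference_rejected = True
    print("rejected:", exc)
```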


A=====A data model gives us an idea of how the final system will
look after its complete implementation. It defines the data
elements and the relationships between the data elements. Data
models are used to show how data is stored, connected, accessed
and updated in the database management system. Here, we use a
set of symbols and text to represent the information so that
members of the organisation can communicate and understand it.

Data Models
A data model describes the data, the data semantics, and the consistency
constraints of the data. It provides the conceptual tools for describing the design of a
database at each level of data abstraction. The following four data
models are used for understanding the structure of the database:

1) Relational Data Model: This type of model organizes the data in the form of rows
and columns within a table. Thus, a relational model uses tables for representing data
and the relationships among the data. Tables are also called relations. This model was initially
described by Edgar F. Codd in 1969. The relational data model is a widely used
model which is primarily used by commercial data processing applications.

2) Entity-Relationship Data Model: An ER model is the logical representation of data
as objects and relationships among them. These objects are known as entities, and a
relationship is an association among these entities. This model was designed by Peter
Chen and published in a 1976 paper. It is widely used in database design. A set of
attributes describes each entity. For example, student_name and student_id describe the
'student' entity. A set of entities of the same type is known as an 'entity set', and a
set of relationships of the same type is known as a 'relationship set'.

3) Object-based Data Model: An extension of the ER model with notions of functions,
encapsulation, and object identity as well. This model supports a rich type system that
includes structured and collection types. In the 1980s, various database systems
following the object-oriented approach were developed. Here, an object is simply
data together with its properties.

4) Semistructured Data Model: This type of data model is different from the other
three data models explained above. The semistructured data model allows data
specifications in which individual data items of the same type may have
different sets of attributes. The Extensible Markup Language, also known as XML, is
widely used for representing semistructured data. Although XML was initially
designed for adding markup information to text documents, it gained
importance because of its application in the exchange of data.
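To make "items of the same type with different attribute sets" concrete, here is a small sketch that parses an XML fragment with Python's standard xml.etree.ElementTree module. The student records and the email element are invented for the example.

```python
import xml.etree.ElementTree as ET

# Two items of the same type; only the second one carries an <email> element.
doc = """<students>
  <student><student_id>1</student_id><student_name>Asha</student_name></student>
  <student><student_id>2</student_id><student_name>Ravi</student_name><email>ravi@example.com</email></student>
</students>"""

root = ET.fromstring(doc)
# Each student becomes a dict whose keys are whatever elements it happens to contain.
records = [{child.tag: child.text for child in student} for student in root]

for record in records:
    print(sorted(record))
```

A relational table would need the same columns for every row, whereas here each record simply lists the attributes it has.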

A======Relational Algebra
Relational algebra is a procedural query language. It gives a step-by-step process to
obtain the result of a query. It uses operators to perform queries.

Types of Relational operation

1. Select Operation:
o The select operation selects tuples that satisfy a given predicate.
o It is denoted by sigma (σ).

Notation: σ p(r)
Where:

σ denotes the selection operation
r is the relation
p is the selection predicate, a propositional logic formula which may use connectives like AND, OR and
NOT, and relational operators like =, ≠, ≥, <, >, ≤.

For example: LOAN Relation

BRANCH_NAME   LOAN_NO   AMOUNT
Downtown      L-17      1000
Redwood       L-23      2000
Perryride     L-15      1500
Downtown      L-14      1500
Mianus        L-13      500
Roundhill     L-11      900
Perryride     L-16      1300

Input:

σ BRANCH_NAME="Perryride" (LOAN)

Output:

BRANCH_NAME   LOAN_NO   AMOUNT
Perryride     L-15      1500
Perryride     L-16      1300
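The select operation can be imitated in a few lines of Python. This is only a sketch: the `select` helper and the list-of-dicts representation are assumptions made for illustration, while the rows are the LOAN relation from the example.

```python
# LOAN relation from the example, one dict per tuple.
LOAN = [
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-17", "AMOUNT": 1000},
    {"BRANCH_NAME": "Redwood", "LOAN_NO": "L-23", "AMOUNT": 2000},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-15", "AMOUNT": 1500},
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-14", "AMOUNT": 1500},
    {"BRANCH_NAME": "Mianus", "LOAN_NO": "L-13", "AMOUNT": 500},
    {"BRANCH_NAME": "Roundhill", "LOAN_NO": "L-11", "AMOUNT": 900},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-16", "AMOUNT": 1300},
]


def select(relation, predicate):
    """Sigma: keep only the tuples for which the predicate is true."""
    return [row for row in relation if predicate(row)]


perryride_loans = select(LOAN, lambda row: row["BRANCH_NAME"] == "Perryride")
for row in perryride_loans:
    print(row["BRANCH_NAME"], row["LOAN_NO"], row["AMOUNT"])
```

The predicate is an ordinary function, so conditions combined with `and`, `or` and `not` work the same way.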


2. Project Operation:
o This operation shows the list of those attributes that we wish to appear in the result.
The rest of the attributes are eliminated from the result.
o It is denoted by ∏.


Notation: ∏ A1, A2, ..., An (r)

Where

A1, A2, ..., An are attribute names of relation r.

Example: CUSTOMER RELATION

NAME      STREET    CITY
Jones     Main      Harrison
Smith     North     Rye
Hays      Main      Harrison
Curry     North     Rye
Johnson   Alma      Brooklyn
Brooks    Senator   Brooklyn

Input:

∏ NAME, CITY (CUSTOMER)

Output:

NAME      CITY
Jones     Harrison
Smith     Rye
Hays      Harrison
Curry     Rye
Johnson   Brooklyn
Brooks    Brooklyn
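A comparable sketch for the project operation follows. The `project` helper is hypothetical; the rows are the CUSTOMER relation from the example. Because a relation is a set, the helper also drops duplicate rows that appear once columns are removed.

```python
# CUSTOMER relation from the example, one dict per tuple.
CUSTOMER = [
    {"NAME": "Jones", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Smith", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Hays", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Curry", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Johnson", "STREET": "Alma", "CITY": "Brooklyn"},
    {"NAME": "Brooks", "STREET": "Senator", "CITY": "Brooklyn"},
]


def project(relation, attributes):
    """Pi: keep the named attributes and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        projected = tuple(row[name] for name in attributes)
        if projected not in seen:
            seen.add(projected)
            result.append(dict(zip(attributes, projected)))
    return result


name_city = project(CUSTOMER, ["NAME", "CITY"])
for row in name_city:
    print(row["NAME"], row["CITY"])
```

Projecting on CITY alone shows the duplicate elimination: six customers collapse to three distinct cities.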

3. Union Operation:
o Suppose there are two relations R and S. The union operation contains all the tuples that
are in R, in S, or in both R and S.
o It eliminates duplicate tuples. It is denoted by ∪.

Notation: R ∪ S

A union operation must satisfy the following conditions:

o R and S must have the same number of attributes, with compatible domains.
o Duplicate tuples are eliminated automatically.

Example:
DEPOSITOR RELATION

CUSTOMER_NAME   ACCOUNT_NO
Johnson         A-101
Smith           A-121
Mayes           A-321
Turner          A-176
Johnson         A-273
Jones           A-472
Lindsay         A-284

BORROW RELATION

CUSTOMER_NAME   LOAN_NO
Jones           L-17
Smith           L-23
Hayes           L-15
Jackson         L-14
Curry           L-93
Smith           L-11
Williams        L-17

Input:

∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)

Output:

CUSTOMER_NAME
Johnson
Smith
Hayes
Turner
Jones
Lindsay
Jackson
Curry
Williams
Mayes

4. Set Intersection:
o Suppose there are two relations R and S. The set intersection operation contains all tuples
that are in both R and S.
o It is denoted by ∩.

Notation: R ∩ S

Example: Using the above DEPOSITOR table and BORROW table

Input:

∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)

Output:

CUSTOMER_NAME
Smith
Jones

5. Set Difference:
o Suppose there are two relations R and S. The set difference operation contains all tuples
that are in R but not in S.
o It is denoted by minus (-).

Notation: R - S

Example: Using the above DEPOSITOR table and BORROW table

Input:

∏ CUSTOMER_NAME (BORROW) - ∏ CUSTOMER_NAME (DEPOSITOR)

Output:

CUSTOMER_NAME
Jackson
Hayes
Williams
Curry
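Union, intersection and difference map directly onto Python's built-in set type, which gives a quick way to check the three outputs above. This is a sketch: the tuple representation is an assumption, and the data is the DEPOSITOR and BORROW relations from the example.

```python
# (CUSTOMER_NAME, ACCOUNT_NO) and (CUSTOMER_NAME, LOAN_NO) tuples from the example.
DEPOSITOR = [
    ("Johnson", "A-101"), ("Smith", "A-121"), ("Mayes", "A-321"),
    ("Turner", "A-176"), ("Johnson", "A-273"), ("Jones", "A-472"),
    ("Lindsay", "A-284"),
]
BORROW = [
    ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"),
    ("Jackson", "L-14"), ("Curry", "L-93"), ("Smith", "L-11"),
    ("Williams", "L-17"),
]

# Projection on CUSTOMER_NAME; a set removes duplicates automatically.
depositor_names = {name for name, _ in DEPOSITOR}
borrow_names = {name for name, _ in BORROW}

union = borrow_names | depositor_names         # in either relation
intersection = borrow_names & depositor_names  # in both relations
difference = borrow_names - depositor_names    # in BORROW but not in DEPOSITOR

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```

Both operands are projected onto the same single attribute first, which is what makes them compatible for the set operations.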

6. Cartesian product
o The Cartesian product is used to combine each row in one table with each row in the
other table. It is also known as a cross product.
o It is denoted by ×.

Notation: E × D
Example:
EMPLOYEE

EMP_ID   EMP_NAME   EMP_DEPT
1        Smith      A
2        Harry      C
3        John       B

DEPARTMENT

DEPT_NO   DEPT_NAME
A         Marketing
B         Sales
C         Legal

Input:

EMPLOYEE × DEPARTMENT

Output:

EMP_ID   EMP_NAME   EMP_DEPT   DEPT_NO   DEPT_NAME
1        Smith      A          A         Marketing
1        Smith      A          B         Sales
1        Smith      A          C         Legal
2        Harry      C          A         Marketing
2        Harry      C          B         Sales
2        Harry      C          C         Legal
3        John       B          A         Marketing
3        John       B          B         Sales
3        John       B          C         Legal
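The same nine rows can be produced with `itertools.product` from Python's standard library. The tuple layout is an assumption; the data is the EMPLOYEE and DEPARTMENT example.

```python
from itertools import product

# (EMP_ID, EMP_NAME, EMP_DEPT) and (DEPT_NO, DEPT_NAME) tuples from the example.
EMPLOYEE = [(1, "Smith", "A"), (2, "Harry", "C"), (3, "John", "B")]
DEPARTMENT = [("A", "Marketing"), ("B", "Sales"), ("C", "Legal")]

# Pair every employee tuple with every department tuple and concatenate them.
cross = [employee + department for employee, department in product(EMPLOYEE, DEPARTMENT)]

for row in cross:
    print(*row)
```

The result has 3 × 3 = 9 tuples with 3 + 2 = 5 attributes each, whether or not EMP_DEPT matches DEPT_NO.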

7. Rename Operation:
The rename operation is used to rename the output relation. It is denoted by rho (ρ).

Example: We can use the rename operator to rename the STUDENT relation to STUDENT1.

ρ(STUDENT1, STUDENT)

A=====Difference Between Generalization and Specialization in DBMS

In this post, we will understand the difference between generalization and
specialization in DBMS.

Generalization
• It works using a bottom-up approach.
• The size of the schema is reduced.
• It is generally applied to a group of entities.
• Inheritance is not used in generalization.
• It can be defined as a process in which groupings are created from multiple entity
sets.
• It takes the union of two or more lower-level entity sets and produces a higher-
level entity set.
• Some of the common features are obtained in the resultant higher-level entity
set.
• The differences and similarities between the entities taking part in the union
operation are ignored.

Example:
Pigeon, house sparrow, crow and dove can all be generalized as Birds.
Specialization
• It uses a top-down approach.
• The size of the schema is increased.
• It can be applied to a single entity.
• It can be defined as the process of creating subgroups within an entity set.
• It is the reverse of generalization.
• It takes a subset of a higher-level entity set and forms a lower-level entity set.
• A higher-level entity set is split to form one or more lower-level entity sets.
• Inheritance can be used in this approach.

Example
A person has a name, date of birth, gender, etc. These properties are common to all
persons. But in a company, persons can be identified as employee,
employer, customer, or vendor, based on what role they play in the company.
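One rough way to picture specialization with inheritance is a class hierarchy, sketched here with Python dataclasses. The Person attributes come from the example; the `salary` and `credit_limit` attributes are invented to give each subgroup something of its own.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    """Higher-level entity set: attributes common to every person."""
    name: str
    date_of_birth: str
    gender: str


@dataclass
class Employee(Person):
    """Lower-level entity set: inherits the Person attributes."""
    salary: float  # hypothetical attribute specific to employees


@dataclass
class Customer(Person):
    """Another lower-level entity set formed from Person."""
    credit_limit: float  # hypothetical attribute specific to customers


employee = Employee("Asha", "1990-04-12", "F", 52000.0)
print([field.name for field in fields(employee)])
```

Read top-down (Person to Employee and Customer) this is specialization; read bottom-up it is generalization.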

A======Mapping Constraints
o A mapping constraint is a data constraint that expresses the number of entities to
which another entity can be related via a relationship set.
o It is most useful in describing binary relationship sets, although it can also be applied
to relationship sets that involve more than two entity sets.
o For a binary relationship set R between entity sets A and B, there are four possible mapping
cardinalities. These are as follows:
1. One to one (1:1)
2. One to many (1:M)
3. Many to one (M:1)
4. Many to many (M:M)

One-to-one
In one-to-one mapping, an entity in E1 is associated with at most one entity in E2, and
an entity in E2 is associated with at most one entity in E1.

One-to-many
In one-to-many mapping, an entity in E1 is associated with any number of entities in
E2, and an entity in E2 is associated with at most one entity in E1.

Many-to-one
In many-to-one mapping, an entity in E1 is associated with at most one entity in E2,
and an entity in E2 is associated with any number of entities in E1.


Many-to-many
In many-to-many mapping, an entity in E1 is associated with any number of entities in
E2, and an entity in E2 is associated with any number of entities in E1.
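The four definitions can be turned into a small check. The sketch below takes a relationship instance as a collection of (E1 entity, E2 entity) pairs and reports which mapping cardinality that particular instance exhibits. The function name and the sample pairs are assumptions. A real mapping constraint is declared on the schema, so this only tests whether given data is consistent with it.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship instance given as (e1, e2) pairs."""
    pairs = set(pairs)
    e2_per_e1 = Counter(e1 for e1, _ in pairs)  # number of E2 partners per E1 entity
    e1_per_e2 = Counter(e2 for _, e2 in pairs)  # number of E1 partners per E2 entity
    e1_has_many = any(count > 1 for count in e2_per_e1.values())
    e2_has_many = any(count > 1 for count in e1_per_e2.values())
    if not e1_has_many and not e2_has_many:
        return "one-to-one"
    if e1_has_many and not e2_has_many:
        return "one-to-many"
    if not e1_has_many and e2_has_many:
        return "many-to-one"
    return "many-to-many"


print(mapping_cardinality([("a", 1), ("b", 2)]))
print(mapping_cardinality([("a", 1), ("a", 2)]))
print(mapping_cardinality([("a", 1), ("b", 1)]))
print(mapping_cardinality([("a", 1), ("a", 2), ("b", 1)]))
```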

A====


DML Precompiler

Every DBMS has two basic sets of languages: the Data Definition Language (DDL), which
contains the set of commands needed to define the format of the data that is being
stored, and the Data Manipulation Language (DML), which contains the set of commands that
modify and process data to produce user-definable output. DML statements can also
be written in an application program. The DML precompiler converts DML statements
(such as SELECT...FROM in Structured Query Language (SQL), covered in Block 2)
embedded in an application program into normal procedure calls in the host language.
The precompiler interacts with the query processor in order to produce the appropriate
code.

A=====SQL | DDL, DQL, DML, DCL and TCL Commands
• Last Updated : 30 Sep, 2021
Structured Query Language (SQL) is the database language with
which we can perform operations on an existing database and also
create a database. SQL uses commands like
CREATE, DROP, INSERT, etc. to carry out the required tasks.
These SQL commands are mainly grouped into four categories:
1. DDL – Data Definition Language
2. DQL – Data Query Language
3. DML – Data Manipulation Language
4. DCL – Data Control Language
Many resources also list a fifth category of SQL commands, TCL –
Transaction Control Language, so we will look at TCL in detail as well.
DDL (Data Definition Language):
DDL or Data Definition Language consists of the SQL commands that can
be used to define the database schema. It deals with descriptions of the
database schema and is used to create and modify the structure of database objects
in the database. DDL is a set of SQL commands used to create, modify, and delete
database structures but not data. These commands are normally not used by a
general user, who should be accessing the database via an application.
List of DDL commands:
• CREATE: This command is used to create the database or its objects (like
tables, indexes, functions, views, stored procedures, and triggers).
• DROP: This command is used to delete objects from the database.
• ALTER: This is used to alter the structure of the database.
• TRUNCATE: This is used to remove all records from a table, including all
the space allocated for the records.
• COMMENT: This is used to add comments to the data dictionary.
• RENAME: This is used to rename an object existing in the database.
DQL (Data Query Language):
DQL statements are used for performing queries on the data within schema
objects. The purpose of a DQL command is to obtain a relation based
on the query passed to it. We can define DQL as follows: it is the component of SQL
that allows getting data from the database and imposing order upon it.
It includes the SELECT statement. This command allows getting the data out of the
database to perform operations with it. When a SELECT is fired against a table or
tables, the result is compiled into a temporary result table, which is displayed or
received by the program, i.e. a front end.
List of DQL:
• SELECT: It is used to retrieve data from the database.
DML (Data Manipulation Language):
The SQL commands that deal with the manipulation of data present in the
database belong to DML or Data Manipulation Language, and this includes most of
the SQL statements. It is the component of the SQL statement that controls access
to data and to the database. Basically, DCL statements are grouped with DML
statements.
List of DML commands:
• INSERT: It is used to insert data into a table.
• UPDATE: It is used to update existing data within a table.
• DELETE: It is used to delete records from a database table.
• LOCK: It controls table concurrency.
• CALL: It calls a PL/SQL or Java subprogram.
• EXPLAIN PLAN: It describes the access path to data.
DCL (Data Control Language):
DCL includes commands such as GRANT and REVOKE, which mainly deal with the
rights, permissions, and other controls of the database system.
List of DCL commands:
• GRANT: This command gives users access privileges to the database.
• REVOKE: This command withdraws the user’s access privileges given by
using the GRANT command.
TCL (Transaction Control Language):
TCL commands deal with transactions within the database.
List of TCL commands:
• COMMIT: Commits a transaction.
• ROLLBACK: Rolls back a transaction in case any error occurs.
• SAVEPOINT: Sets a savepoint within a transaction.
• SET TRANSACTION: Specifies characteristics for the transaction.
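As a small illustration of how these categories look in practice, the sketch below runs one statement from each of DDL, DML, DQL and TCL through Python's sqlite3 module. The student table and its contents are hypothetical. DCL is left out because SQLite has no GRANT or REVOKE, and COMMIT and ROLLBACK are issued through the connection's `commit()` and `rollback()` methods rather than as SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of a (hypothetical) table.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: add a row, then make it permanent with a TCL commit.
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Asha')")
conn.commit()

# DML followed by a TCL rollback: the uncommitted update is undone.
conn.execute("UPDATE student SET name = 'Changed' WHERE id = 1")
conn.rollback()

# DQL: read the data back.
names = [row[0] for row in conn.execute("SELECT name FROM student")]
print(names)
```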

A====
The Database Management System (DBMS) is defined as a software system that
allows the user to define, create and maintain the database and provides controlled access
to the data.
It is a collection of programs used for managing data, and it simultaneously supports
different types of users who create, manage, retrieve, update and store information.
Advantages of DBMS
The advantages of the DBMS are explained below −

• Redundancy problem can be solved.
In a file system, duplicate data is created in many places because every program
has its own files, which creates data redundancy and wastes storage. In a
DBMS, all the files are integrated into a single database, which greatly reduces the
chance of duplicate data.
For example: A student record in a library or examination system can contain duplicate values,
but when the records are converted into a single database, the duplicate values are
removed.

• Has a very high security level.
The security level is high because the data is protected from unauthorized access.
Only authorized users are granted access to the database, with the help of
credentials.

• Presence of data integrity.
Data integration unifies many files into a single database. A DBMS supports data
integrity, which makes it easier to reduce data duplication,
redundancy and data inconsistency.

• Support multiple users.
A DBMS allows multiple users to access the same database at the same time without any
conflicts.

• Avoidance of inconsistency.
A DBMS controls data redundancy and also controls data consistency. Data consistency
means that when data is updated in one place, the same data does not have to be
updated again in other files.
In a DBMS, data is stored in a single database, so data is more consistent in
comparison to file processing systems.

• Shared data
Data can be shared between authorized users of the database in a DBMS. All the users
have their own rights to access the database. The admin has complete access to the
database and has the right to assign users access to the database.

• Enforcement of standards
A DBMS has central control of the database, so a DBA can ensure that all the
applications follow some standards, such as the format of data, document standards etc.
These standards help in data migrations or in interchanging the data.

• Any unauthorized access is restricted
Unauthorized persons are not allowed to access the database because of security
credentials.

• Provide backup of data
Data loss is a big problem for all organizations. In a file system, users have to
back up the files at regular intervals, which wastes time and resources.
A DBMS solves this problem by taking backups automatically and supporting recovery of the
database.

• Tunability
Tuning means adjusting something to get better performance. The same applies to a
DBMS, as it provides tunability to improve performance. The DBA adjusts databases to get
effective results.
Disadvantages of DBMS
The disadvantages of DBMS are as follows:

• Complexity
The provision of the functionality that is expected of a good DBMS makes the DBMS
an extremely complex piece of software. Database designers, developers, database
administrators and end-users must understand this functionality to take full advantage
of it.
Failure to understand the system can lead to bad design decisions, which can have
serious consequences for an organization.

• Size
The functionality of a DBMS requires a large piece of software which occupies
megabytes of disk space.

• Performance
A DBMS may not run as fast as desired.

• Higher impact of a failure
The centralization of resources increases the vulnerability of the system. Because all
users and applications rely on the availability of the DBMS, the failure of any component
can bring operations to a halt.
A==========

Three schema Architecture


o The three schema architecture is also called the ANSI/SPARC architecture or three-level
architecture.
o This framework is used to describe the structure of a specific database system.
o The three schema architecture is also used to separate the user applications from the
physical database.
o The three schema architecture contains three levels. It breaks the database down into
three different categories.

The three-schema architecture is as follows:

In the above diagram:

o It shows the DBMS architecture.
o Mapping is used to transform requests and responses between the various
levels of the architecture.
o Mapping is not good for a small DBMS because it takes more time.
o In External / Conceptual mapping, the request is transformed from the external
level to the conceptual schema.
o In Conceptual / Internal mapping, the DBMS transforms the request from the conceptual to the
internal level.

Objectives of Three schema Architecture


The main objective of three level architecture is to enable multiple users to access the
same data with a personalized view while storing the underlying data only once. Thus
it separates the user's view from the physical structure of the database. This separation
is desirable for the following reasons:

o Different users need different views of the same data.
o The way in which a particular user needs to see the data may change over time.
o The users of the database should not have to worry about the physical implementation and
internal workings of the database, such as data compression and encryption
techniques, hashing, optimization of the internal structures etc.
o All users should be able to access the same data according to their requirements.
o DBA should be able to change the conceptual structure of the database without
affecting the user's
o Internal structure of the database should be unaffected by changes to physical aspects
of the storage.

1. Internal Level

o The internal level has an internal schema which describes the physical storage structure
of the database.
o The internal schema is also known as a physical schema.
o It uses the physical data model. It is used to define how the data will be stored in
a block.
o The physical level is used to describe complex low-level data structures in detail.

The internal level is generally concerned with the following activities:

o Storage space allocations.
For example: B-Trees, Hashing etc.
o Access paths.
For example: Specification of primary and secondary keys, indexes, pointers and
sequencing.
o Data compression and encryption techniques.
o Optimization of internal structures.
o Representation of stored fields.

2. Conceptual Level

o The conceptual schema describes the design of a database at the conceptual level.
Conceptual level is also known as logical level.
o The conceptual schema describes the structure of the whole database.
o The conceptual level describes what data are to be stored in the database and also
describes what relationship exists among those data.
o In the conceptual level, internal details such as an implementation of the data structure
are hidden.
o Programmers and database administrators work at this level.

3. External Level

o At the external level, a database contains several schemas, sometimes called
subschemas. A subschema is used to describe a different view of the database.
o An external schema is also known as a view schema.
o Each view schema describes the part of the database that a particular user group is interested
in and hides the remaining database from that user group.
o The view schema describes the end user's interaction with the database system.

Mapping between Views


The three levels of DBMS architecture don't exist independently of each other. There
must be a correspondence between the three levels, i.e. a definition of how they
relate to each other. The DBMS is responsible for the correspondence between the three types of
schema. This correspondence is called Mapping.

There are basically two types of mapping in the database architecture:

o Conceptual/ Internal Mapping


o External / Conceptual Mapping

Conceptual/ Internal Mapping

The Conceptual/ Internal Mapping lies between the conceptual level and the internal
level. Its role is to define the correspondence between the records and fields of the
conceptual level and files and data structures of the internal level.

External/ Conceptual Mapping

The External/ Conceptual Mapping lies between the external level and the conceptual
level. Its role is to define the correspondence between a particular external view and the
conceptual view.
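In SQL systems, an external schema is commonly realized as a view, and the view definition then plays the role of the external/conceptual mapping. The sketch below shows this with Python's sqlite3 module. The employee table, the employee_directory view and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the full logical structure of the (hypothetical) employee data.
conn.execute("CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 52000.0)")

# External level: a view for a user group that should not see salaries.
# The view definition acts as the external/conceptual mapping.
conn.execute("CREATE VIEW employee_directory AS SELECT emp_no, name FROM employee")

cursor = conn.execute("SELECT * FROM employee_directory")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

How the table is laid out on disk (the internal level) is left entirely to SQLite, which is the separation the three levels are meant to provide.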

A=====Recursive Relationships in ER diagrams
• Last Updated : 10 Jun, 2021
Prerequisite – ER Model
A relationship between two entities of the same entity type is called
a recursive relationship. Here the same entity type participates more than once
in a relationship type, with a different role each time. In other words, a
relationship is usually between occurrences of two different entity types.
However, occurrences of the same entity type can also be related to each other.
This is termed a recursive relationship.
Example –
Let us suppose that we have an employee table. A manager supervises a
subordinate. Every employee can have a supervisor except the CEO, and there can
be at most one boss for each employee. One employee may be the boss of more
than one employee. Let’s suppose that REPORTS_TO is a recursive relationship
on the Employee entity type where each Employee plays two roles:
1. Supervisor
2. Subordinate

Supervisor and Subordinate are called “role names”. Here the degree of the
REPORTS_TO relationship is 1, i.e. it is a unary relationship.

• The minimum cardinality of the Supervisor entity is ZERO since the
lowest-level employee may not be a manager of anyone.
• The maximum cardinality of the Supervisor entity is N since an
employee can manage many employees.
• Similarly, the Subordinate entity has a minimum cardinality of ZERO to
account for the case where the CEO can never be a subordinate.
• Its maximum cardinality is ONE since a subordinate employee can have
at most one supervisor.
Note – Here none of the participants has total participation since both
minimum cardinalities are zero. Hence, the relationships are connected by a
single line instead of a double line in the ER diagram.
To implement a recursive relationship, a foreign key holding the employee’s manager
number would be kept in each employee record. A sample table would look
something like this:
Emp_entity(Emp_no, Emp_Fname, Emp_Lname, Emp_DOB, Emp_NI_Number,
Manager_no);
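A cut-down version of this table can be tried with Python's sqlite3 module. In the sketch below only three of the columns are kept, the employee names are invented, and the self-join query is one possible way (not the only one) to list each subordinate with their supervisor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
# manager_no is a foreign key that points back into the same table.
conn.execute(
    "CREATE TABLE emp_entity ("
    " emp_no INTEGER PRIMARY KEY,"
    " emp_fname TEXT NOT NULL,"
    " manager_no INTEGER REFERENCES emp_entity(emp_no))"
)
conn.executemany(
    "INSERT INTO emp_entity VALUES (?, ?, ?)",
    [
        (1, "Chandra", None),  # the CEO has no supervisor
        (2, "Dev", 1),
        (3, "Esha", 1),
    ],
)

# Self-join: the table plays the Subordinate role (sub) and the Supervisor role (sup).
rows = conn.execute(
    "SELECT sub.emp_fname, sup.emp_fname"
    " FROM emp_entity AS sub"
    " LEFT JOIN emp_entity AS sup ON sub.manager_no = sup.emp_no"
    " ORDER BY sub.emp_no"
).fetchall()
print(rows)
```

The nullable manager_no column reflects the zero minimum cardinality, and a single column per employee reflects the maximum of one supervisor.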

What are some of the advantages and disadvantages of allowing null values in a database table?
The “world’s smallest database” I helped to code started out with NULLs, but we
abandoned NULL support, which dramatically reduced our code size and made the
engine much smaller and faster. As our engine was a small-device DB, this was hugely
important. However, it meant we had to work with customers to teach them how to work
without NULLs.

The advantages of using NULLs:

• You don’t have to create an application-managed “don’t care” value.
This is a Big Deal as you often end up with a don’t-care value that
becomes a “real” value. Even though the relational guy in me doesn’t
love NULLs, I have to concede that most people are better off just using
NULL versus trying to manage don’t-care values.
• You can populate tables and otherwise use outer joins without needing
to use IFNULL() to load don’t-care values.
• Depending on your datatype, NULLs may save a bit of space (although
not as much as some may think as NULLs do require at least a bit of
storage and will require a mask in each row to store the bits, and many
db storage engines require a byte for NULLs, not a bit).
The disadvantages:

• Potential bugs in your SQL due to not being aware of NULL behavior in
operators. Most SQLs will happily allow you to write “mycol = NULL”,
but that always evaluates to NULL, which is effectively FALSE in most
implementations. (Most hand-coded SQL avoids this rookie mistake, but
I’ve seen this in program-generated SQL many times.)
• Exposure to weird, inconsistent NULL semantics in SQL, particularly
when it comes to operations that involve ordering and grouping. In
practice, most people can live with these, but the behavior can be
different across db vendors.
• Many database storage engines make storage optimizations if they
know you don’t have NULLs in a column. This *may* make access
slightly slower if you have a lot of NULLs.
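The “mycol = NULL” pitfall from the first disadvantage is easy to reproduce. The sketch below uses Python's sqlite3 module with a made-up one-column table; other database systems are expected to behave the same way for this comparison, but only SQLite is exercised here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (mycol INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (None,), (None,)])

# "= NULL" evaluates to NULL for every row, so the WHERE clause keeps nothing.
eq_null = conn.execute("SELECT COUNT(*) FROM t WHERE mycol = NULL").fetchone()[0]

# "IS NULL" is the comparison that actually finds the NULL rows.
is_null = conn.execute("SELECT COUNT(*) FROM t WHERE mycol IS NULL").fetchone()[0]

# The comparison itself yields NULL, which Python receives as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

print(eq_null, is_null, null_equals_null)
```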
