Module 1 – NORMALIZATION

Introduction

Database normalization, commonly called normalization, is a database concept that
every database developer and database administrator should understand deeply. It is a
process used in data modeling and database creation, in which you organize your data and
tables so that data can be added and updated efficiently. Normalization is part of
successful database design. Without normalization, database systems can be imprecise,
slow, and ineffective, and they might not produce the data you expect. In this chapter,
students will recognize the importance of normalization and the processes involved in
normalizing a database.

Learning Outcomes

After studying this lesson, students should be able to:

1. Understand what normalization is.
2. Discuss the importance of normalization in database design.
3. Discuss the problems related to data redundancy.
4. Apply the normalization process up to third normal form in the design of a
database.

Lesson 1: Database Tables and Normalization

A relational database is a collection of information organized in tables; tables are
also called “relations”. Tables are constructed and associated with each other through
shared fields; these “common” fields are also called “columns” or “attributes”. A set of
attribute values makes up a record, and records are also called “rows” or “tuples”.
Tables are related through common fields designated as primary and foreign keys.
A primary key must be unique and cannot contain a null value, while a foreign key is a
field (or set of fields) in one table that refers to the primary key of another table.

Database normalization is the process of structuring a relational database in
accordance with a series of so-called normal forms in order to reduce data redundancy
and improve data integrity. It was first proposed by Edgar F. Codd as part of his
relational model.

Normalization entails organizing the columns (attributes) and tables (relations)
of a database to ensure that their dependencies are properly enforced by database
integrity constraints. It is accomplished by applying formal rules, either through a
process of synthesis (creating a new database design) or decomposition (improving an
existing database design). In general, it is the process of efficiently organizing data
in a database.

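Two goals of the normalization process:

▪ Eliminating redundant data (for example, the same data stored in more than one table)

▪ Ensuring that data dependencies make sense (only related data is stored in a table)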

The objective is to segregate data so that modifications of a field can be made
in just one table and then propagated through the rest of the database using the
defined relationships.

Lesson 2: The Need for Normalization

Update anomaly

The same information can be expressed on multiple rows; therefore, updates to
the relation may result in logical inconsistencies. For example, each record in an
"Employees' Skills" relation might contain an Employee ID, Employee Address, and
Skill; thus, a change of address for a particular employee may need to be applied to
multiple records (one for each skill). If the update is only partially successful – the
employee's address is updated on some records but not others – then the relation is
left in an inconsistent state. Specifically, the relation provides conflicting answers to the
question of what this particular employee's address is.

Table 1: An Update Anomaly

Employee’s Skill
Employee_ID   Employee_Address          Skill
110           Borongan City             Programmer
110           Borongan City             Programmer
111           Arteche, Eastern Samar    Website Developer
111           Dolores, Eastern Samar    Programmer

Employee 111 is shown as having different addresses on different records.
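
The update anomaly can be reproduced in a few lines. The sketch below is only an illustration: it uses Python's built-in sqlite3 module with an in-memory database, and the table name employee_skill and its column names are assumptions made for the example, not part of the lesson. Both rows start with the same address, and the update reaches only one of them.

```python
import sqlite3

# Hypothetical table modelled on the "Employees' Skills" relation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_skill "
    "(employee_id INTEGER, employee_address TEXT, skill TEXT)"
)
conn.executemany(
    "INSERT INTO employee_skill VALUES (?, ?, ?)",
    [
        (111, "Arteche, Eastern Samar", "Website Developer"),
        (111, "Arteche, Eastern Samar", "Programmer"),
    ],
)

# A partial update: only the 'Programmer' row receives the new address.
conn.execute(
    "UPDATE employee_skill SET employee_address = ? "
    "WHERE employee_id = ? AND skill = ?",
    ("Dolores, Eastern Samar", 111, "Programmer"),
)

# The relation now gives two different answers for the same employee.
addresses = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT employee_address FROM employee_skill "
        "WHERE employee_id = 111 ORDER BY employee_address"
    )
]
print(addresses)
```

If the address were stored once, in a table keyed by employee ID, a single-row update could not leave the data in this state.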

Insertion anomaly

There are situations in which certain facts cannot be recorded at all. For
example, each record in a "Faculty and Their Courses" relation might contain a Faculty
ID, Faculty Name, Faculty Hire Date, and Course Code. Therefore, we can record the
details of any faculty member who teaches at least one course, but we cannot record a
newly hired faculty member who has not yet been assigned to teach any courses,
except by setting the Course Code to null. This phenomenon is known as an insertion
anomaly.

Table 2: An Insertion Anomaly

Faculty and Their Courses

Faculty_ID   Faculty_Name      Faculty_Hire_Date   Course_Code
110          Prof. Borata      10-Feb-2010         COMP-121
111          Prof. Casillano   11-June-2005        COMP-122
111          Prof. Casillano   11-June-2005        COMP-122
120          Prof. Baltazar    10-Mah-2020         ?

Until the new faculty member, Prof. Baltazar, is assigned to teach at least
one course, his details cannot be recorded.

Deletion anomaly

Under certain circumstances, deletion of data representing certain facts
necessitates deletion of data representing completely different facts. The "Faculty and
Their Courses" relation described in the previous example suffers from this type of
anomaly, for if a faculty member temporarily ceases to be assigned to any courses, we
must delete the last of the records on which that faculty member appears, effectively
also deleting the faculty member, unless we set the Course Code to null. This
phenomenon is known as a deletion anomaly.

Table 3: A Deletion Anomaly

Faculty and Their Courses

Faculty_ID   Faculty_Name      Faculty_Hire_Date   Course_Code
110          Prof. Borata      10-Feb-2010         COMP-121      <- DELETE
111          Prof. Casillano   11-June-2005        COMP-122
111          Prof. Casillano   11-June-2005        COMP-122

All information about Prof. Borata is lost if he temporarily ceases to be assigned
to any courses.
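
A short sketch of the deletion anomaly, using Python's built-in sqlite3 module. The table name faculty_course and its column names are assumptions chosen for this illustration. Removing the only course assignment of Prof. Borata also removes every trace of him from the table.

```python
import sqlite3

# Hypothetical single-table design mixing faculty facts and course assignments.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE faculty_course ("
    "faculty_id INTEGER, faculty_name TEXT, "
    "faculty_hire_date TEXT, course_code TEXT)"
)
conn.executemany(
    "INSERT INTO faculty_course VALUES (?, ?, ?, ?)",
    [
        (110, "Prof. Borata", "10-Feb-2010", "COMP-121"),
        (111, "Prof. Casillano", "11-June-2005", "COMP-122"),
    ],
)

# Prof. Borata stops teaching COMP-121, so the assignment row is deleted.
conn.execute(
    "DELETE FROM faculty_course "
    "WHERE faculty_id = 110 AND course_code = 'COMP-121'"
)

# No row is left that records his name or hire date.
remaining = conn.execute(
    "SELECT COUNT(*) FROM faculty_course WHERE faculty_id = 110"
).fetchone()[0]
print(remaining)
```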

Thus, to overcome these anomalies, we need to normalize the data.

Lesson 3: The Normalization Process (1NF, 2NF, 3NF)

This lesson covers the three most commonly used normal forms:

A. First normal form (1NF). As per the rule of first normal form, an attribute
(column) of a table cannot hold multiple values. It should hold only atomic values.
Example: Suppose a company wants to store the names and contact details of its
employees. It creates a table that looks like this:

Table 4: 1NF Example A

Employee_ID   Employee_Name   Employee_Address          Contact_Number
101           Prof. Ramilo    Borongan City             +639357131689
102           Prof. Borata    Can-avid, Eastern Samar   +639759231790
                                                        +639352989168
103           Prof. Nuguit    Arteche, Eastern Samar    +639307131777
104           Prof. Baje      Dolores, Eastern Samar    +639398888888
                                                        +639657777777

Two employees (Prof. Borata and Prof. Baje) have two mobile numbers each, so
the company stored both numbers in the same field, as you can see in the table above.

This table is not in 1NF. The rule says “each attribute of a table must have
atomic (single) values”, and the Contact_Number values for Prof. Borata and Prof. Baje
violate that rule.
To make the table comply with 1NF, we should store the data like this:

Table 5: 1NF Example B

Employee_ID   Employee_Name   Employee_Address          Contact_Number
101           Prof. Ramilo    Borongan City             +639357131689
102           Prof. Borata    Can-avid, Eastern Samar   +639759231790
102           Prof. Borata    Can-avid, Eastern Samar   +639352989168
103           Prof. Nuguit    Arteche, Eastern Samar    +639307131777
104           Prof. Baje      Dolores, Eastern Samar    +639398888888
104           Prof. Baje      Dolores, Eastern Samar    +639657777777
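
The conversion to 1NF can be sketched as a small script. This is an illustration only, written with Python's built-in sqlite3 module; the table name employee_contact, the column names, and the space-separated input format are assumptions made for the example. Each multi-valued field is split so that every stored row holds exactly one contact number.

```python
import sqlite3

# Unnormalized input: one field may hold several contact numbers (assumed format).
raw_rows = [
    (101, "Prof. Ramilo", "+639357131689"),
    (102, "Prof. Borata", "+639759231790 +639352989168"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_contact "
    "(employee_id INTEGER, employee_name TEXT, contact_number TEXT)"
)

# Store one row per (employee, number) pair so every value is atomic.
for emp_id, name, numbers in raw_rows:
    for number in numbers.split():
        conn.execute(
            "INSERT INTO employee_contact VALUES (?, ?, ?)",
            (emp_id, name, number),
        )

# Count the rows stored for each employee.
counts = dict(
    conn.execute(
        "SELECT employee_id, COUNT(*) FROM employee_contact GROUP BY employee_id"
    )
)
print(counts)
```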

B. Second normal form (2NF).

A table is said to be in 2NF if both of the following conditions hold:

▪ The table is in 1NF (first normal form).

▪ No non-prime attribute is dependent on a proper subset of any candidate key
of the table.

An attribute that is not part of any candidate key is known as a non-prime
attribute. Example: Suppose a school wants to store data about teachers and the
subjects they teach. Since a teacher can teach more than one subject, the table can
have multiple rows for the same teacher. The school creates a table that looks like this:

Table 6: 2NF Example A

teacher_id   subject     teacher_age
111          Math        38
111          Physics     38
222          Biology     38
333          Physics     40
333          Chemistry   40

Candidate key: {teacher_id, subject}
Non-prime attribute: teacher_age

The table is in 1NF because each attribute has atomic values. However, it is
not in 2NF because the non-prime attribute teacher_age is dependent on teacher_id alone,
which is a proper subset of the candidate key. This violates the rule for 2NF, which
says “no non-prime attribute is dependent on the proper subset of any candidate key of
the table”. To make the table comply with 2NF, we can break it into two tables like this:

Table 7: 2NF Example B (teacher_details)

teacher_id   teacher_age
111          38
222          38
333          40

Table 8: 2NF Example B (teacher_subject)

teacher_id   subject
111          Math
111          Physics
222          Biology
333          Physics
333          Chemistry
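
The 2NF decomposition can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the table names teacher_details and teacher_subject follow the captions above, and the new age value 39 is invented for the demonstration. Because the age is stored once per teacher, a single UPDATE is seen by every subject row after the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE teacher_details (
        teacher_id  INTEGER PRIMARY KEY,
        teacher_age INTEGER
    );
    CREATE TABLE teacher_subject (
        teacher_id INTEGER REFERENCES teacher_details(teacher_id),
        subject    TEXT,
        PRIMARY KEY (teacher_id, subject)
    );
    INSERT INTO teacher_details VALUES (111, 38), (222, 38), (333, 40);
    INSERT INTO teacher_subject VALUES
        (111, 'Math'), (111, 'Physics'), (222, 'Biology'),
        (333, 'Physics'), (333, 'Chemistry');
    """
)

# The age lives in exactly one row, so one UPDATE changes it everywhere.
conn.execute("UPDATE teacher_details SET teacher_age = 39 WHERE teacher_id = 111")

# Joining the two tables rebuilds the original three-column view.
rows = conn.execute(
    """
    SELECT s.teacher_id, s.subject, d.teacher_age
    FROM teacher_subject AS s
    JOIN teacher_details AS d ON d.teacher_id = s.teacher_id
    WHERE s.teacher_id = 111
    ORDER BY s.subject
    """
).fetchall()
print(rows)
```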

C. Third normal form (3NF). A table design is said to be in 3NF if both of the
following conditions hold:

▪ The table is in 2NF.

▪ No non-prime attribute is transitively dependent on any super key; any such
transitive functional dependency should be removed.

An attribute that is not part of any candidate key is known as a non-prime attribute.

In other words, a table is in 3NF if it is in 2NF and, for each functional
dependency X -> Y, at least one of the following conditions holds:

▪ X is a super key of the table

▪ Y is a prime attribute of the table

An attribute that is part of one of the candidate keys is known as a prime
attribute.

Example: Suppose a company wants to store the complete address of each
employee. It creates a table named Employee_Details that looks like this:

Table 9: 3NF Example A (Employee_details)

Emp_ID   Emp_Name       Emp_Zone   Emp_Brgy    Emp_City     Emp_Municipality
101      Prof. Ramilo   1          Maypangda   Borongan     Eastern Samar
102      Prof. Borata   2          Mercedes    Catbalogan   Western Samar
103      Prof. Nuguit   3          Balud       Calbayog     Northern Samar
104      Prof. Baje     4          Tamoso      Borongan     Eastern Samar

Super keys: {Emp_ID}, {Emp_ID, Emp_Name}, {Emp_ID, Emp_Name, Emp_Zone}, etc.

Candidate keys: {Emp_ID}

Non-prime attributes: all attributes except Emp_ID are non-prime, as they are not
part of any candidate key.

Here, Emp_Brgy, Emp_City and Emp_Municipality are dependent on Emp_Zone, and
Emp_Zone is dependent on Emp_ID. That makes the non-prime attributes (Emp_Brgy,
Emp_City and Emp_Municipality) transitively dependent on the super key (Emp_ID). This
violates the rule of 3NF.

To comply with 3NF, we break the table into two tables to eliminate the transitive
dependency:

Table 10: 3NF Example B (Employee_Table)

Emp_ID   Emp_Name       Emp_Zone
101      Prof. Ramilo   1
102      Prof. Borata   2
103      Prof. Nuguit   3
104      Prof. Baje     4

Table 11: 3NF Example B (Employee_Zip)

Emp_Zone   Emp_Brgy    Emp_City     Emp_Municipality
1          Maypangda   Borongan     Eastern Samar
2          Mercedes    Catbalogan   Western Samar
3          Balud       Calbayog     Northern Samar
4          Tamoso      Borongan     Eastern Samar
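
The 3NF split can be checked with a join. The sketch below uses Python's built-in sqlite3 module; the lowercase table and column names are assumptions for the example, and only the first two employees are loaded to keep it short. The address details are stored once per zone and recovered through Emp_Zone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee_zip (
        emp_zone         INTEGER PRIMARY KEY,
        emp_brgy         TEXT,
        emp_city         TEXT,
        emp_municipality TEXT
    );
    CREATE TABLE employee_table (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT,
        emp_zone INTEGER REFERENCES employee_zip(emp_zone)
    );
    INSERT INTO employee_zip VALUES
        (1, 'Maypangda', 'Borongan', 'Eastern Samar'),
        (2, 'Mercedes', 'Catbalogan', 'Western Samar');
    INSERT INTO employee_table VALUES
        (101, 'Prof. Ramilo', 1),
        (102, 'Prof. Borata', 2);
    """
)

# Joining on emp_zone rebuilds the full address for each employee.
full = conn.execute(
    """
    SELECT e.emp_id, e.emp_name, z.emp_brgy, z.emp_city
    FROM employee_table AS e
    JOIN employee_zip AS z ON z.emp_zone = e.emp_zone
    ORDER BY e.emp_id
    """
).fetchall()
print(full)
```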

Lesson 4: Improving the Design

▪ Table structures cleaned up to eliminate initial partial and transitive dependencies

▪ Normalization cannot, by itself, be relied on to make good designs

▪ It is valuable because its use helps eliminate data redundancies

▪ Issues to address in order to produce a good normalized set of tables:

- Evaluate PK Assignments

- Evaluate Naming Conventions

- Refine Attribute Atomicity

- Identify New Attributes

- Identify New Relationships

- Refine Primary Keys as Required for Data Granularity

- Maintain Historical Accuracy

- Evaluate Using Derived Attributes

Lesson 5: Surrogate Key Considerations

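When a natural primary key is considered unsuitable, designers use surrogate keys.
A surrogate key is a system-assigned identifier, such as a sequential number, that has
no meaning of its own. A surrogate key does not by itself prevent duplicate entries: the
data entries in Table 12 are inappropriate because they duplicate an existing record,
even though each row has its own JOB_CODE.

Table 12: Duplicate Entries in the Job Table

JOB_CODE   JOB_DESCRIPTION     JOB_RATE
112        Database Designer   PHP 80,000
113        Database Designer   PHP 80,000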
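
One common way to guard against the duplicate shown in Table 12 is to add a UNIQUE constraint on the descriptive column next to the surrogate key. The sketch below uses Python's built-in sqlite3 module. The choice to make job_description unique is an assumption for this illustration, not a rule stated in the lesson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE job (
        job_code        INTEGER PRIMARY KEY,   -- surrogate key
        job_description TEXT NOT NULL UNIQUE,  -- assumed business rule
        job_rate        INTEGER
    )
    """
)
conn.execute("INSERT INTO job VALUES (112, 'Database Designer', 80000)")

# A second row with a new surrogate key but the same description is refused.
try:
    conn.execute("INSERT INTO job VALUES (113, 'Database Designer', 80000)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```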

Lesson 6: Higher-Level Normal Forms

Two normal forms beyond 3NF are discussed here:

A. Boyce-Codd normal form (BCNF)

It is an advanced version of 3NF, which is why it is also referred to as 3.5NF. BCNF
is stricter than 3NF. A table complies with BCNF if it is in 3NF and, for every
functional dependency X -> Y, X is a super key of the table.

Example: Suppose there is a company wherein employees work in more than one
department. They store the data like this:

Table 13: BCNF Example A

Emp_Id   Emp_Nationality   Emp_Dept                       Dept_Type   Dept_No_Of_Emp
1001     Austrian          Production and planning        D001        200
1001     Austrian          Stores                         D001        250
1002     American          design and technical support   D134        100
1002     American          Purchasing department          D134        600

Functional dependencies in the table above:

Emp_Id -> Emp_Nationality

Emp_Dept -> {Dept_Type, Dept_No_Of_Emp}

Candidate key: {Emp_Id, Emp_Dept}

NOTE: The table is not in BCNF because neither Emp_Id nor Emp_Dept alone
is a key, yet each one is the left side of a functional dependency.

To make the table comply with BCNF, we can break it into three tables
like this:

Table 14: BCNF Example B (Emp_Nationality Table)

Emp_Id   Emp_Nationality
1001     Austrian
1002     American

Table 15: BCNF Example B (Emp_Dept Table)

Emp_Dept                       Dept_Type   Dept_No_Of_Emp
Production and planning        D001        200
Stores                         D001        250
design and technical support   D134        100
Purchasing department          D134        600

Table 16: BCNF Example B (Emp_Dept_Mapping Table)

Emp_Id   Emp_Dept
1001     Production and planning
1001     Stores
1002     design and technical support
1002     Purchasing department

Functional dependencies:

Emp_Id -> Emp_Nationality

Emp_Dept -> {Dept_Type, Dept_No_Of_Emp}

Candidate keys:

For first table: Emp_Id

For second table: Emp_Dept

For third table: {Emp_Id, Emp_Dept}

The design is now in BCNF because, in both functional dependencies, the left side
is a key of its table.

B. Fourth normal form (4NF)

The 4NF was introduced by Ronald Fagin in 1977. It comes after 1NF,
2NF, 3NF, and BCNF. To be in 4NF, a relation should be in Boyce-Codd Normal
Form and must not contain more than one independent multi-valued attribute.

Table 17: 4NF Example A (Movie Table)

Movie_Name           Shooting_Location   Listing
I love you Goodbye   UK                  Comedy
I love you Goodbye   UK                  Thriller
Can’t You See        Australia           Action
Can’t You See        Australia           Crime
Forever              Philippine          Drama

NOTE: The table above is not in 4NF, since one movie can have several
listings and several shooting locations, and the two are independent of each other.

Let us convert the above table in 4NF:

Table 18: 4NF Example B (Movie_Shooting)

Movie_Name           Shooting_Location
I love you Goodbye   UK
Can’t You See        Australia
Forever              Philippine

Table 19: 4NF Example B (Movie_Listing)

Movie_Name           Listing
I love you Goodbye   Comedy
I love you Goodbye   Thriller
Can’t You See        Action
Can’t You See        Crime
Forever              Drama

Each shooting location is now stored once per movie. The violation is removed and
the tables are in 4NF.
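
The 4NF decomposition can be verified mechanically: projecting the movie table onto the two smaller tables and joining them back should return exactly the original rows. The sketch below uses Python's built-in sqlite3 module; the lowercase table and column names are assumptions made for the example.

```python
import sqlite3

original = [
    ("I love you Goodbye", "UK", "Comedy"),
    ("I love you Goodbye", "UK", "Thriller"),
    ("Can't You See", "Australia", "Action"),
    ("Can't You See", "Australia", "Crime"),
    ("Forever", "Philippine", "Drama"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie (movie_name TEXT, shooting_location TEXT, listing TEXT)"
)
conn.executemany("INSERT INTO movie VALUES (?, ?, ?)", original)

# Project the table onto its two independent facts, without duplicate rows.
conn.executescript(
    """
    CREATE TABLE movie_shooting AS
        SELECT DISTINCT movie_name, shooting_location FROM movie;
    CREATE TABLE movie_listing AS
        SELECT DISTINCT movie_name, listing FROM movie;
    """
)

# Joining the projections on movie_name should give back the original rows.
rejoined = conn.execute(
    """
    SELECT s.movie_name, s.shooting_location, l.listing
    FROM movie_shooting AS s
    JOIN movie_listing AS l ON l.movie_name = s.movie_name
    """
).fetchall()
lossless = sorted(rejoined) == sorted(original)
shooting_rows = conn.execute("SELECT COUNT(*) FROM movie_shooting").fetchone()[0]
print(lossless, shooting_rows)
```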

Lesson 7: Normalization and Database Design

One of the most important steps in designing a database is ensuring that the
data is properly distributed among the tables. Without proper data structures,
illogically or inconsistently stored data can cause a number of problems.
The objective of the normalization process is to ensure that each table conforms to
the concept of well-formed relations, that is, tables that have the following
characteristics:

▪ Each table represents a single subject.

▪ No data item will be unnecessarily stored in more than one table.

▪ All nonprime attributes in a table are dependent on the primary key.

▪ Each table is void of insertion, update, or deletion anomalies.

To accomplish the objectives, the normalization process takes you through the
steps that lead to successively higher normal forms.

Table 20: Normal Forms

Normal Form                     Characteristics
First Normal Form (1NF)         Table format, no repeating groups, and PK identified
Second Normal Form (2NF)        1NF and no partial dependencies
Third Normal Form (3NF)         2NF and no transitive dependencies
Boyce-Codd Normal Form (BCNF)   Every determinant is a candidate key (special case of 3NF)
Fourth Normal Form (4NF)        3NF and no independent multivalued dependencies

Table 21: Unnormalized Table

Student   Advisor   Adv_room   Class 1   Class 2   Class 3
1022      Jones     412        101-07    143-01    159-02
4123      Smith     216        201-01    211-02    214-01

Table 22: First Normal Form: No Repeating Groups

Student   Advisor   Adv_room   Class
1022      Jones     412        101-07
1022      Jones     412        143-01
1022      Jones     412        159-02
4123      Smith     216        201-01
4123      Smith     216        211-02
4123      Smith     216        214-01


Table 23: Second Normal Form: Eliminate Redundant Data

Student:

Student   Advisor   Adv_room
1022      Jones     412
4123      Smith     216

Registration:

Student   Class
1022      101-07
1022      143-01
1022      159-02
4123      201-01
4123      211-02
4123      214-01

Table 24: Third Normal Form: Eliminate Data Not Dependent on Key

Student:

Student   Advisor
1022      Jones
4123      Smith

Faculty:

Name    Room   Depart
Jones   412    42
Smith   216    42

Module 2: INTRODUCTION TO SQL

Introduction

SQL stands for Structured Query Language. SQL is a database computer
language designed for the retrieval and management of data in relational databases. In
this chapter, students will learn the basic concepts related to Structured Query Language
(SQL).

Learning Outcomes

After studying this lesson, students should be able to:

1. Define what Structured Query Language is.
2. Identify the Structured Query Language process.
3. Understand the SQL RDBMS concepts.
4. Understand the broad categories of SQL functions and the basic SQL select
statement.

Lesson 1: SQL Overview

SQL (Structured Query Language) is a computer language for storing,
manipulating and retrieving data stored in a relational database.
SQL is the standard language for relational database systems. All
relational database management systems (RDBMS) such as MySQL, MS Access, Oracle,
Sybase, Informix, Postgres and SQL Server use SQL as their standard database
language.
However, they use different dialects, such as:
▪ MS SQL Server uses T-SQL,
▪ Oracle uses PL/SQL,
▪ the MS Access version of SQL is called JET SQL (native format), etc.

Why SQL?

SQL is widely popular because it offers the following advantages:

▪ Allows users to access data in relational database management systems.
▪ Allows users to describe the data.
▪ Allows users to define the data in a database and manipulate that data.
▪ Allows SQL to be embedded within other languages using SQL modules, libraries and
precompilers.
▪ Allows users to create and drop databases and tables.
▪ Allows users to create views, stored procedures and functions in a database.
▪ Allows users to set permissions on tables, procedures and views.

SQL Process

When you execute an SQL command on any RDBMS, the system
determines the best way to carry out your request, and the SQL engine figures out how to
interpret the task. Various components are included in this process:

▪ Query Dispatcher
▪ Optimization Engines
▪ Classic Query Engine
▪ SQL Query Engine, etc.

A classic query engine handles all the non-SQL queries, but a SQL query
engine won't handle logical files.

[Figure: SQL architecture diagram. Source: www.tutorialspoint.com/sql/sql_tutorial.pdf]

Lesson 2: SQL ─ RDBMS Concepts

What is RDBMS?

RDBMS stands for Relational Database Management System.


RDBMS is the basis for SQL, and for all modern database systems like MS
SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access.
A Relational database management system (RDBMS) is a database
management system (DBMS) that is based on the relational model as
introduced by E. F. Codd.

What is a table?

The data in an RDBMS is stored in database objects called tables. A table is
basically a collection of related data entries, and it consists of numerous columns
and rows. Remember, a table is the most common and simplest form of data storage
in a relational database. The following is an example of a CUSTOMERS table:

[Figure: the CUSTOMERS table. Source: www.tutorialspoint.com/sql/sql_tutorial.pdf]

What is a field?

Every table is broken up into smaller entities called fields. The fields in the
CUSTOMERS table consist of ID, NAME, AGE, ADDRESS and SALARY.

A field is a column in a table that is designed to maintain specific
information about every record in the table.

What is a Record or a Row?

A record, also called a row of data, is each individual entry that exists in
a table. For example, there are 7 records in the above CUSTOMERS table.

Following is a single row of data or record in the CUSTOMERS table:

[Figure: a single row of the CUSTOMERS table. Source: www.tutorialspoint.com/sql/sql_tutorial.pdf]

A record is a horizontal entity in a table.

What is a column?

A column is a vertical entity in a table that contains all
information associated with a specific field in a table.

For example, a column in the CUSTOMERS table is ADDRESS, which
represents the location description.

[Figure: the ADDRESS column of the CUSTOMERS table. Source: www.tutorialspoint.com/sql/sql_tutorial.pdf]

What is a NULL value?

A NULL value in a table is a value in a field that appears to be blank, which
means a field with a NULL value is a field with no value.

It is very important to understand that a NULL value is different from a
zero value or a field that contains spaces. A field with a NULL value is
one that has been left blank during record creation.
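
The difference between NULL, zero and an empty string can be seen directly in a query. The sketch below uses Python's built-in sqlite3 module. The three sample rows are invented for the illustration and only the id, name and salary columns of a CUSTOMERS-like table are modelled. Note that NULL is matched with IS NULL, while a comparison written as = NULL matches nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "A", 0),      # salary is zero: a real value
        (2, "B", None),   # salary is NULL: no value at all
        (3, "", 2000),    # name is an empty string, not NULL
    ],
)


def ids(where):
    """Return the ids of the rows matching the given WHERE condition."""
    sql = "SELECT id FROM customers WHERE " + where + " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


null_salary = ids("salary IS NULL")
zero_salary = ids("salary = 0")
equals_null = ids("salary = NULL")   # never true, even for the NULL row
null_name = ids("name IS NULL")
print(null_salary, zero_salary, equals_null, null_name)
```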

SQL Constraints

Constraints are the rules enforced on the data columns of a table. They
are used to limit the type of data that can go into a table. This ensures the
accuracy and reliability of the data in the database.

Constraints can be either column level or table level. Column level
constraints are applied only to one column, whereas table level constraints
are applied to the entire table.

Following are some of the most commonly used constraints available in SQL:

▪ NOT NULL Constraint: Ensures that a column cannot have a NULL value.

▪ DEFAULT Constraint: Provides a default value for a column when none is
specified.

▪ UNIQUE Constraint: Ensures that all the values in a column are different.

▪ PRIMARY Key: Uniquely identifies each row/record in a database table.

▪ FOREIGN Key: Uniquely identifies a row/record in another database table.

▪ CHECK Constraint: Ensures that all values in a column satisfy certain
conditions.

▪ INDEX: Used to create and retrieve data from the database very quickly.
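
Several of these constraints can be declared together and tested by attempting inserts that break them. The sketch below uses Python's built-in sqlite3 module. The email column, the age limit of 18 and the default salary of 5000.00 are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customers (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        email  TEXT UNIQUE,
        age    INTEGER CHECK (age >= 18),
        salary REAL DEFAULT 5000.00
    )
    """
)
conn.execute(
    "INSERT INTO customers (id, name, email, age) "
    "VALUES (1, 'Ana', 'ana@example.com', 30)"
)

# DEFAULT fills in the salary that the INSERT left out.
default_salary = conn.execute(
    "SELECT salary FROM customers WHERE id = 1"
).fetchone()[0]


def violates(sql):
    """Return True if the statement is refused by a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


rejected = {
    "not_null": violates(
        "INSERT INTO customers (id, name, age) VALUES (2, NULL, 30)"
    ),
    "unique": violates(
        "INSERT INTO customers (id, name, email, age) "
        "VALUES (3, 'Ben', 'ana@example.com', 30)"
    ),
    "check": violates(
        "INSERT INTO customers (id, name, age) VALUES (4, 'Cy', 15)"
    ),
    "primary_key": violates(
        "INSERT INTO customers (id, name, age) VALUES (1, 'Di', 40)"
    ),
}
print(default_salary, rejected)
```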

Data Integrity

The following categories of data integrity exist with each RDBMS:

▪ Entity Integrity: There are no duplicate rows in a table.
▪ Domain Integrity: Enforces valid entries for a given column by restricting the
type, the format, or the range of values.
▪ Referential Integrity: Rows that are used by other records cannot be deleted.
▪ User-Defined Integrity: Enforces some specific business rules that do not fall
into entity, domain or referential integrity.

SQL RDBMS Databases

There are many popular RDBMS available to work with. The following is a
brief overview of a few of the most popular ones, which should help users compare
their basic features.

1. MySQL is an open source SQL database, which was developed by the Swedish
company MySQL AB. MySQL is pronounced "my ess-que-ell," in contrast with SQL,
pronounced "sequel."
MySQL supports many different platforms including Microsoft Windows, the
major Linux distributions, UNIX, and Mac OS X. MySQL has free and paid versions,
depending on its usage (non-commercial/commercial) and features. MySQL comes
with a very fast, multi-threaded, multi-user, and robust SQL database server.

2. MS SQL Server is a relational database management system developed by
Microsoft. Its primary query languages are T-SQL and ANSI SQL.

3. ORACLE is a very large, multi-user relational database management system
developed by Oracle Corporation. Oracle works to efficiently manage its resource,
a database of information, among the multiple clients requesting and sending data
in the network.
It is an excellent database server choice for client/server computing. Oracle
supports all major operating systems for both clients and servers, including MS-DOS,
NetWare, UnixWare, OS/2 and most UNIX flavors.

4. MS ACCESS is one of the most popular Microsoft products. Microsoft Access is
entry-level database management software. An MS Access database is not only
inexpensive but also powerful for small-scale projects.
MS Access uses the Jet database engine, which utilizes a specific SQL language
dialect (sometimes referred to as Jet SQL). MS Access comes with the professional
edition of the MS Office package and has an easy-to-use, intuitive graphical interface.

Lesson 3: Categories of SQL Functions

SQL functions fit into two broad categories:

1. Data Definition Language (DDL)

SQL includes commands to create database objects such as tables, indexes
and views, as well as commands to define access rights to those database
objects.

Command   Description
CREATE    Creates a new table, a view of a table, or other object in the database.
ALTER     Modifies an existing database object, such as a table.
DROP      Deletes an entire table, a view of a table or other object in the database.

Creating a Database

To initialize a new database:

Syntax:

CREATE DATABASE database_name

▪ There are numerous arguments that go along with this command, but they are
database specific.
▪ Only some databases require the database to be created and space to be allocated
prior to the creation of tables.
▪ Some databases provide graphical user interfaces to create databases and
allocate space.
    – Access only allows a database to be created using the user interface.

Creating a Table

Syntax:

CREATE TABLE table_name
(column_name datatype[(size)],
 column_name datatype[(size)])

Example:

CREATE TABLE books
(ISBN char(20),
 Title char(50),
 AuthorID Integer,
 Price float)

Creates a table with four columns.
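
The books example can be run as written and then inspected. The sketch below uses Python's built-in sqlite3 module, chosen only because it needs no server; other systems may treat the declared types differently. PRAGMA table_info is an SQLite-specific command that lists the columns of a table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CREATE TABLE statement from the example, unchanged.
conn.execute(
    """
    CREATE TABLE books
    (ISBN char(20),
     Title char(50),
     AuthorID Integer,
     Price float)
    """
)

# Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
info = conn.execute("PRAGMA table_info(books)").fetchall()
columns = [row[1] for row in info]
declared_types = [row[2] for row in info]
print(columns)
print(declared_types)
```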

Data Types

Following broad categories of data types exist in most databases:

1. String Data

Fixed Length:

▪ Occupies the same length of space in memory no matter how much data is
stored in it.
▪ Syntax:
    o char(n), where n is the length of the string
    o e.g. name char(50)
▪ If the value stored for name is ‘Sanjay’, the remaining 44 characters are padded
with blanks.

Variable Length:

▪ A variable-length string is specified with the maximum number of characters
possible in the string; however, the allocation is sized to the data actually stored.
▪ Syntax:
    o varchar(n), where n is the maximum length of data possible for the type
▪ There may be a restriction on the maximum length of the data that you can
specify in the declaration, which will vary according to the database.
▪ All character data has to be enclosed in single quotes during specification.

2. Numeric Data

▪ Stores purely numeric data.
▪ Some numeric data may also be stored as a character field, e.g. zip codes.
▪ Common numeric types:

Type            Description
Decimal         Exact fixed-point number with a declared precision and scale
Float           Floating point number
Integer(size)   Integer of specified length
Money           A number which contains exactly two digits after the decimal point
Number          A standard number field that can hold floating point data

Note: Different databases name their numeric fields differently and may not
support all numeric types. They may also support additional numeric types.

3. Temporal Data - These represent dates and times.
Three basic types are supported:
a. Dates
b. Times
c. Date-Time Combinations

4. Large Objects - These are used for storing data objects
like files and images. There are two types:
a. Character Large Objects (clobs)
b. Binary Large Objects (blobs)

Specifying Keys- Introduction

▪ The UNIQUE keyword is used to specify keys.
    o This ensures that duplicate rows are not created in the database.
▪ Both primary keys and candidate keys can be specified in the database.
▪ Once a set of columns has been declared unique, any data entered that
duplicates the data in these columns is rejected.
▪ Specifying a single column as unique:

Example:
CREATE TABLE Studios
(studio_id Number,
 name char(20),
 city varchar(50),
 state char(2),
 UNIQUE (name))

▪ Here the name column has been declared as a candidate key.
Specifying Keys- Multiple Columns

▪ Specifying multiple columns as unique:

Example:
CREATE TABLE Studios
(studio_id Number,
 name char(20),
 city varchar(50),
 state char(2),
 UNIQUE (name),
 UNIQUE (city, state))

▪ Here both name and the city/state combination are declared as candidate keys.

Specifying Keys- Primary Key

▪ To specify the primary key, the PRIMARY KEY clause is used.

Example:
CREATE TABLE Studios
(studio_id Number,
 name char(20),
 city varchar(50),
 state char(2),
 PRIMARY KEY (studio_id),
 UNIQUE (name),
 UNIQUE (city, state)
)

Specifying Keys- Foreign Keys

The REFERENCES clause is used to create a relationship between a set of columns
in one table and a candidate key in the table that is being referenced.

Example:
CREATE TABLE Movies
(movie_title varchar(40),
 studio_id Number REFERENCES Studios(studio_id))

Creates a relationship from the Movies table to the Studios table.
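
The effect of a REFERENCES clause can be observed by trying to break the relationship. The sketch below uses Python's built-in sqlite3 module. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, which is a detail of that engine and not of SQL in general. The movie titles are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript(
    """
    CREATE TABLE studios (
        studio_id INTEGER PRIMARY KEY,
        name      TEXT
    );
    CREATE TABLE movies (
        movie_title TEXT,
        studio_id   INTEGER REFERENCES studios(studio_id)
    );
    INSERT INTO studios VALUES (1, 'Giant');
    INSERT INTO movies VALUES ('First Movie', 1);
    """
)


def rejected(sql):
    """Return True if the statement is refused by a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# A movie cannot point at a studio that does not exist.
orphan_insert_rejected = rejected("INSERT INTO movies VALUES ('Orphan Movie', 99)")

# A studio that is still referenced by a movie cannot be deleted.
parent_delete_rejected = rejected("DELETE FROM studios WHERE studio_id = 1")
print(orphan_insert_rejected, parent_delete_rejected)
```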

Modifying Records

Insert Statement

The Insert statement allows you to add new records to a table.

Syntax: INSERT INTO table_name[(column_list)] VALUES (value_list)

Example:
INSERT INTO studios
VALUES (1, 'Giant', 'Los Angeles', 'CA')

INSERT INTO studios
(studio_city, studio_state, studio_name, studio_id)
VALUES ('Burbank', 'CA', 'MPM', 2)

These examples assume that the columns of studios are named studio_id,
studio_name, studio_city and studio_state.

Note 1: If the columns are not specified, as in the first example, the data
goes in the order specified in the table.

Note 2: There are two ways of inserting Null values:

▪ If the field has a default value of Null, you can use an Insert
statement that ignores the column where the value is to be Null.
▪ You can specify the column in the column list specification and
assign a value of Null to the corresponding value field.
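
Both ways of inserting Null values can be compared side by side. The sketch below uses Python's built-in sqlite3 module and assumes a studios table with the columns studio_id, studio_name, studio_city and studio_state, none of which has a default other than Null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE studios "
    "(studio_id INTEGER, studio_name TEXT, studio_city TEXT, studio_state TEXT)"
)

# Way 1: leave the columns out of the column list; they default to NULL.
conn.execute("INSERT INTO studios (studio_id, studio_name) VALUES (1, 'Giant')")

# Way 2: list the columns and assign NULL explicitly.
conn.execute(
    "INSERT INTO studios (studio_id, studio_name, studio_city, studio_state) "
    "VALUES (2, 'MPM', NULL, NULL)"
)

# Python shows SQL NULL as None.
rows = conn.execute(
    "SELECT studio_id, studio_city, studio_state FROM studios ORDER BY studio_id"
).fetchall()
print(rows)
```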

Select & Insert

A select query can be used in the insert statement to supply the values to be
inserted.

Example:
INSERT INTO city_state
SELECT studio_city, studio_state FROM studios

This selects the corresponding fields from the studios table and inserts them
into the city_state table.

Example:
INSERT INTO city_state
SELECT DISTINCT studio_city, studio_state FROM studios

This selects the corresponding fields from the studios table, removes the
duplicate rows and inserts the result into the city_state table. Thus, the final
table has distinct rows.
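
The INSERT ... SELECT DISTINCT form can be tried on a small data set. The sketch below uses Python's built-in sqlite3 module. The third studio row is invented so that one city/state pair appears twice, and the city_state column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE studios (
        studio_id INTEGER, studio_name TEXT,
        studio_city TEXT, studio_state TEXT
    );
    CREATE TABLE city_state (city TEXT, state TEXT);

    INSERT INTO studios VALUES
        (1, 'Giant', 'Los Angeles', 'CA'),
        (2, 'MPM',   'Burbank',     'CA'),
        (3, 'Third', 'Burbank',     'CA');

    -- DISTINCT removes the repeated ('Burbank', 'CA') pair before inserting.
    INSERT INTO city_state
    SELECT DISTINCT studio_city, studio_state FROM studios;
    """
)

rows = conn.execute("SELECT city, state FROM city_state ORDER BY city").fetchall()
print(rows)
```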

Delete Statement:

Used to remove records from a table of the database. The WHERE clause in
the syntax restricts the rows deleted from the table; without it, all the
rows of the table are deleted.

Syntax: DELETE FROM table_name [WHERE Condition]

Example:
DELETE FROM City_State
WHERE state = 'TX'

Deletes all the rows where the state is Texas and keeps all the other rows.
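
The effect of the WHERE clause on a delete can be checked by counting the affected rows. The sketch below uses Python's built-in sqlite3 module; the three city rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city_state (city TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO city_state VALUES (?, ?)",
    [("Burbank", "CA"), ("Austin", "TX"), ("Dallas", "TX")],
)

# Only the rows matching the condition are removed.
deleted = conn.execute("DELETE FROM city_state WHERE state = 'TX'").rowcount

remaining = conn.execute("SELECT city, state FROM city_state").fetchall()
print(deleted, remaining)
```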

Update Statement:

Used to make changes to existing rows of the table. It has three parts.
First, you must specify which table is going to be updated. The second part of
the statement is the SET clause, in which you specify the columns that
will be updated as well as the new values they will receive. Finally, the WHERE
clause is used to specify which rows will be updated.

Syntax:
UPDATE table_name
SET column_name1 = value1,
    column_name2 = value2, ...
[WHERE Condition]

Example:
UPDATE studios
SET studio_city = 'New York',
    studio_state = 'NY'
WHERE studio_id = 1

Note 1: If the condition is dropped, then all the rows are updated.
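
The UPDATE example can be run against a small table to confirm that only the matching row changes. The sketch below uses Python's built-in sqlite3 module and assumes the studios columns are named studio_id, studio_name, studio_city and studio_state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE studios "
    "(studio_id INTEGER, studio_name TEXT, studio_city TEXT, studio_state TEXT)"
)
conn.executemany(
    "INSERT INTO studios VALUES (?, ?, ?, ?)",
    [(1, "Giant", "Los Angeles", "CA"), (2, "MPM", "Burbank", "CA")],
)

# The WHERE clause limits the change to studio 1.
updated = conn.execute(
    "UPDATE studios SET studio_city = 'New York', studio_state = 'NY' "
    "WHERE studio_id = 1"
).rowcount

rows = conn.execute(
    "SELECT studio_id, studio_city, studio_state FROM studios ORDER BY studio_id"
).fetchall()
print(updated, rows)
```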

Truncate Statement:

Used to delete all the rows of a table. Delete can also be used to delete all
the rows from the table. The difference is that Delete performs a delete
operation on each row in the table, and the database performs all attendant
tasks along the way. On the other hand, the Truncate statement simply throws
away all the rows at once and is much quicker. A note of caution: Truncate
does not perform integrity checks along the way, which can lead to
inconsistencies. If there are dependencies requiring integrity
checks, we should use Delete.

Syntax: TRUNCATE TABLE table_name

Example: TRUNCATE TABLE studios

This deletes all the rows of the table studios.

Drop Statement:

Used to remove elements from a database, such as tables, indexes or
even users and databases. The Drop command is used with a variety of keywords
based on the need.

Drop Table Syntax: DROP TABLE table_name

Drop Table Example: DROP TABLE studios

Drop Index Syntax: DROP INDEX index_name

Drop Index Example: DROP INDEX movie_index

2. Data Manipulation Language (DML)

▪ It is a language used for selecting, inserting, deleting and updating data in a
database.
▪ It is used to retrieve and manipulate data in a relational database.
▪ DML covers read-only queries of data (SELECT) as well as statements that
change data (INSERT, UPDATE and DELETE).

Command     Description
SELECT      Retrieves certain records from one or more tables.
INSERT      Creates a record.
UPDATE      Modifies records.
DELETE      Deletes records.

Select Command

▪ SELECT command is used to retrieve data from the database.
▪ This command allows database users to retrieve the specific information they
desire from an operational database.
▪ It returns a result set of records from one or more tables.

The SELECT command has many optional clauses, as stated below:

Clause      Description
WHERE       It specifies which rows to retrieve.
GROUP BY    It is used to arrange the data into groups.
HAVING      It selects among the groups defined by the GROUP BY clause.
ORDER BY    It specifies an order in which to return the rows.

Syntax:
SELECT * FROM <table_name>;

Example:
SELECT * FROM employee;

OR

SELECT * FROM employee WHERE salary >= 10000;
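The four optional clauses listed above can be combined in one query. The following is a minimal sketch using Python's built-in sqlite3 module; the employee columns and rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory SQLite database; the employee layout and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (eid INTEGER, ename TEXT, city TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "ABC", "PUNE", 20000),
        (2, "DEF", "PUNE", 8000),
        (3, "GHI", "MUMBAI", 15000),
        (4, "JKL", "MUMBAI", 12000),
        (5, "MNO", "DELHI", 9000),
    ],
)

query = """
    SELECT city, COUNT(*) AS headcount
    FROM employee
    WHERE salary >= 10000      -- keep only rows with a salary of at least 10000
    GROUP BY city              -- arrange the remaining rows into one group per city
    HAVING COUNT(*) >= 2       -- keep only groups with two or more employees
    ORDER BY city              -- return the surviving groups in alphabetical order
"""
result = conn.execute(query).fetchall()
print(result)
```

WHERE filters individual rows before grouping, while HAVING filters the groups afterwards, so the two clauses are not interchangeable.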


Insert Command
▪ INSERT command is used for inserting data into a table.

▪ Using this command, you can add one or more records to any single table in a
database.

▪ It is also used to add records to an existing table.

Syntax
INSERT INTO <table_name> (column_name1, column_name2, . . . , column_name_n)
VALUES (value1, value2, . . . , value_n);

Example:
INSERT INTO employee (eid, ename, city)
VALUES (1, 'ABC', 'PUNE');
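As a rough illustration, the INSERT form above can be run with Python's built-in sqlite3 module. The data types appear only in CREATE TABLE, never in the INSERT column list; the two extra rows are assumptions added to show a multi-row insert.

```python
import sqlite3

# In-memory SQLite database standing in for a real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (eid INTEGER, ename VARCHAR(20), city VARCHAR(20))")

# Single-row insert: column names in the column list, literal values in VALUES.
conn.execute("INSERT INTO employee (eid, ename, city) VALUES (1, 'ABC', 'PUNE')")

# Several rows at once; the ? placeholders are filled from each tuple (hypothetical rows).
more_rows = [(2, "DEF", "MUMBAI"), (3, "GHI", "DELHI")]
conn.executemany("INSERT INTO employee (eid, ename, city) VALUES (?, ?, ?)", more_rows)

inserted = conn.execute("SELECT eid, ename, city FROM employee ORDER BY eid").fetchall()
print(inserted)
```

String values are written in single quotes and numbers without quotes; backticks, where a database accepts them at all, are for identifiers rather than values.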

Update Command

▪ UPDATE command is used to modify the records present in an existing table.
▪ This command updates existing data within a table.
▪ It changes the data of one or more records in a table.

Syntax:
UPDATE <table_name>
SET <column_name = value>
WHERE condition;

Example:
UPDATE employee
SET salary = 20000
WHERE ename = 'ABC';

Delete Command

▪ DELETE command is used to delete some or all records from an existing
table.
▪ With a WHERE clause it deletes only the matching records; without one it
deletes all the records from the table.

Syntax:
DELETE FROM <table_name> WHERE <condition>;

Example:
DELETE FROM employee
WHERE emp_id = '001';

If users do not write the WHERE condition, then all rows will get deleted.

The SQL SELECT Statement

▪ The SELECT statement is used to select data from a database.
▪ The data returned is stored in a result table, called the result-set.

SELECT Syntax

SELECT column1, column2, ...
FROM table_name;

Here, column1, column2, ... are the field names of the table you want to
select data from. If you want to select all the fields available in the table,
use the following syntax:

SELECT * FROM table_name;

The SQL SELECT DISTINCT Statement

▪ The SELECT DISTINCT statement is used to return only distinct (different)
values.
▪ Inside a table, a column often contains many duplicate values, and sometimes
you only want to list the different (distinct) values.

SELECT DISTINCT Syntax

SELECT DISTINCT column1, column2, ...
FROM table_name;

Example

SELECT DISTINCT Country FROM Customers;

Module 3: ADVANCED DATA DEFINITION COMMANDS

Introduction

In this chapter students will learn the advanced data definition commands. Data
definition commands are used to create, modify and remove database objects
such as schemas, tables, views and indexes. Students will also learn the different
clauses that change the behavior of the SELECT statement, such as displaying the
number of rows in a table, removing duplicate rows, ordering SELECT results
(ascending and descending), and using WHERE in queries, as well as creating views
and joining tables.

Learning Outcomes

After studying this lesson, students should be able to:

1. Understand the usage of ALTER command.

2. Understand the different clauses of SELECT statement.

3. Understand how to join two or more tables in the database.

4. Perform create view.

5. Apply the usage of ALTER, SELECT, and table-joining commands by
performing tasks using an online SQL editor.

Lesson 1: Advanced Data Definition Commands

Changes to the structure of an existing table are made by using the ALTER command.
The MySQL ALTER command is very useful when you want to change the name of your table
or of any table field, or when you want to add or delete a column in a table.

Let's begin with creation of a table called testalter_tbl.


root@host# mysql -u root -p
Enter password:*******
mysql> use TUTORIALS;
Database changed
mysql> create table testalter_tbl(i INT, c CHAR(1));
Query OK, 0 rows affected (0.05 sec)
mysql> SHOW COLUMNS FROM testalter_tbl;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| i     | int(11) | YES  |     | NULL    |       |
| c     | char(1) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)

Dropping, Adding or Repositioning a Column:

Suppose you want to drop an existing column i from above MySQL table then
you will use DROP clause along with ALTER command as follows:

mysql> ALTER TABLE testalter_tbl DROP i;

A DROP will not work if the column is the only one left in the table.

To add a column, use ADD and specify the column definition. The following statement
restores the i column to testalter_tbl:

mysql> ALTER TABLE testalter_tbl ADD i INT;

After issuing this statement, testalter_tbl will contain the same two columns that it had
when you first created the table, but will not have quite the same structure. That's
because new columns are added to the end of the table by default. So even
though i originally was the first column in testalter_tbl, now it is the last one.

mysql> SHOW COLUMNS FROM testalter_tbl;


+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
|c | char(1) | YES | | NULL | |
|i | int(11) | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)
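The "new columns go to the end" behaviour can be reproduced with a short sketch. It uses Python's built-in sqlite3 module rather than MySQL, so SHOW COLUMNS is replaced by SQLite's PRAGMA table_info, and the helper name column_names is an assumption made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server used in the lesson.
conn = sqlite3.connect(":memory:")

# Start from the state after "DROP i": only column c is left.
conn.execute("CREATE TABLE testalter_tbl (c CHAR(1))")


def column_names(table):
    """Return the column names of a table in their stored order (hypothetical helper)."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


before = column_names("testalter_tbl")

# Adding the column back appends it after the existing columns.
conn.execute("ALTER TABLE testalter_tbl ADD COLUMN i INT")
after = column_names("testalter_tbl")

print(before, after)
```

SQLite has no FIRST or AFTER specifiers, so this sketch only demonstrates the default placement; positioning a column is specific to MySQL.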

To indicate that you want a column at a specific position within the table, either use
FIRST to make it the first column or AFTER col_name to indicate that the new
column should be placed after col_name. Try the following ALTER TABLE
statements, using SHOW COLUMNS after each one to see what effect each one
has:

ALTER TABLE testalter_tbl DROP i;


ALTER TABLE testalter_tbl ADD i INT FIRST;
ALTER TABLE testalter_tbl DROP i;
ALTER TABLE testalter_tbl ADD i INT AFTER c;

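The FIRST and AFTER specifiers can be used with the ADD clause, as shown above. In
MySQL they are also accepted by MODIFY and CHANGE, so an existing column can be
repositioned either by dropping it and adding it again at the new position (which
discards its data) or by restating its definition with MODIFY or CHANGE followed by
FIRST or AFTER col_name.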

Changing a Column Definition or Name:

To change a column's definition, use the MODIFY or CHANGE clause along with the
ALTER command. For example, to change column c from CHAR(1) to CHAR(10),
do this:

mysql> ALTER TABLE testalter_tbl MODIFY c CHAR(10);

With CHANGE, the syntax is a bit different. After the CHANGE keyword, you name
the column you want to change, then specify the new definition, which includes the
new name. Try out the following example:

mysql> ALTER TABLE testalter_tbl CHANGE i j BIGINT;

If you now use CHANGE to convert j from BIGINT back to INT without changing
the column name, you still have to give the name twice, once as the old name and
once as the new name:

mysql> ALTER TABLE testalter_tbl CHANGE j j INT;

The Effect of ALTER TABLE on Null and Default Value Attributes:

When you MODIFY or CHANGE a column, you can also specify whether or not
the column can contain NULL values and what its default value is. In fact, if you
don't do this, MySQL automatically assigns values for these attributes.

Here is the example, where NOT NULL column will have value 100 by default.

mysql> ALTER TABLE testalter_tbl
-> MODIFY j BIGINT NOT NULL DEFAULT 100;

If you omit these attributes when you MODIFY or CHANGE a column, MySQL makes the
column nullable with a default value of NULL.

Changing a Column's Default Value:

You can change a default value for any column using ALTER command. Try out the
following example.

mysql> ALTER TABLE testalter_tbl ALTER i SET DEFAULT 1000;


mysql> SHOW COLUMNS FROM testalter_tbl;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
|c | char(1) | YES | | NULL | |
|i | int(11) | YES | | 1000 | |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)

You can remove default constraint from any column by using DROP clause along with
ALTER command.

mysql> ALTER TABLE testalter_tbl ALTER i DROP DEFAULT;


mysql> SHOW COLUMNS FROM testalter_tbl;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
|c | char(1) | YES | | NULL | |
|i | int(11) | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)

Changing a Table Type:

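You can change a table's type (its storage engine) by using the TYPE clause along with
the ALTER command. Try out the following example to change testalter_tbl to the MYISAM
table type. Note that TYPE is the older keyword: MySQL 5.5 and later accept only
ENGINE (for example, ALTER TABLE testalter_tbl ENGINE = MYISAM).

To find out the current type of a table, use the SHOW TABLE STATUS statement.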

mysql> ALTER TABLE testalter_tbl TYPE = MYISAM;


mysql> SHOW TABLE STATUS LIKE 'testalter_tbl'\G
*************************** 1. row ****************
Name: testalter_tbl
Type: MyISAM
Row_format: Fixed
Rows: 0
Avg_row_length: 0
Data_length: 0
Max_data_length: 25769803775
Index_length: 1024
Data_free: 0
Auto_increment: NULL
Create_time: 2007-06-03 08:04:36
Update_time: 2007-06-03 08:04:36
Check_time: NULL
Create_options:
Comment:
1 row in set (0.00 sec)

Renaming a Table:

To rename a table, use the RENAME option of the ALTER TABLE statement.
Try out the following example to rename testalter_tbl to alter_tbl.

mysql> ALTER TABLE testalter_tbl RENAME TO alter_tbl;

Lesson 2: Advanced Select Queries

The SELECT command can be used to retrieve records from your tables. We
can also use many different clauses to change the behavior of the SELECT
statement, such as displaying the number of rows in the table, removing duplicate
rows, ordering SELECT results (ascending and descending), limiting your results,
and using WHERE in your queries.

Display the number of rows in the table

We can use the COUNT() function to return the number of rows that
match a specified criterion. For example, given the table listed below,
SELECT COUNT(*) displays the number of all rows in it:

mysql> SELECT * FROM test_table;
+-------+----------+------------+
| name  | surname  | birth_year |
+-------+----------+------------+
| Sarah | Geronimo |       1991 |
| Angel | Locsin   |       1982 |
| Angel | Aquino   |       1975 |
| Anne  | Curtis   |       1985 |
+-------+----------+------------+
4 rows in set (0.00 sec)

mysql> SELECT COUNT(*) FROM test_table;
+----------+
| COUNT(*) |
+----------+
|        4 |
+----------+
1 row in set (0.00 sec)

Remove duplicate rows

Sometimes, when querying data from a table, you might get duplicate rows. To
remove these duplicate rows, you use the DISTINCT clause in the SELECT statement.
For example, let's say that we want a list of all first names in our table. If we
select just the name column from the table, we will get duplicate results:

mysql> SELECT name FROM test_table;
+-------+
| name  |
+-------+
| Sarah |
| Angel |
| Angel |
| Anne  |
+-------+
4 rows in set (0.00 sec)

Notice how the name Angel appears twice. To remove multiple entries, we can use
the DISTINCT clause:

mysql> SELECT DISTINCT name FROM test_table;
+-------+
| name  |
+-------+
| Sarah |
| Angel |
| Anne  |
+-------+
3 rows in set (0.01 sec)
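The COUNT and DISTINCT queries above can be tried without a MySQL server. This is a minimal sketch using Python's built-in sqlite3 module and the same four sample rows; COUNT(DISTINCT name) is an extra query, not shown in the lesson, that combines the two ideas.

```python
import sqlite3

# In-memory SQLite database loaded with the lesson's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (name TEXT, surname TEXT, birth_year INTEGER)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?)",
    [
        ("Sarah", "Geronimo", 1991),
        ("Angel", "Locsin", 1982),
        ("Angel", "Aquino", 1975),
        ("Anne", "Curtis", 1985),
    ],
)

# Number of rows in the table.
total_rows = conn.execute("SELECT COUNT(*) FROM test_table").fetchone()[0]

# First names with duplicates removed (sorted so the order is predictable).
names = [r[0] for r in conn.execute("SELECT DISTINCT name FROM test_table ORDER BY name")]

# Counting the distinct first names directly.
distinct_count = conn.execute("SELECT COUNT(DISTINCT name) FROM test_table").fetchone()[0]

print(total_rows, names, distinct_count)
```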

Filter records

You can narrow down queries by returning only those where a certain expression is
true. To do that, the WHERE clause is used.

Here is the syntax:

SELECT column_name FROM table_name WHERE column_name operator value

The operator can be a comparison such as = or >, or the BETWEEN operator,
which is useful with integer or date comparisons because it searches for
results between a minimum and maximum value (both inclusive).

Here is an example. Let’s say that we want to select all rows with the value Sarah in
the name column:

mysql> SELECT * FROM test_table;
+-------+----------+------------+
| name  | surname  | birth_year |
+-------+----------+------------+
| Sarah | Geronimo |       1991 |
| Angel | Locsin   |       1982 |
| Angel | Aquino   |       1975 |
| Anne  | Curtis   |       1985 |
+-------+----------+------------+
4 rows in set (0.00 sec)

mysql> SELECT * FROM test_table WHERE name='Sarah';
+-------+----------+------------+
| name  | surname  | birth_year |
+-------+----------+------------+
| Sarah | Geronimo |       1991 |
+-------+----------+------------+
1 row in set (0.01 sec)

Note: Make sure to use the single quotes around text values when working with
strings.

Here is another example. Let’s say that we want to select all rows where the birth_year
column is between 1980 and 1995:

mysql> SELECT * FROM test_table WHERE birth_year BETWEEN 1980 AND 1995;
+-------+----------+------------+
| name  | surname  | birth_year |
+-------+----------+------------+
| Sarah | Geronimo |       1991 |
| Angel | Locsin   |       1982 |
| Anne  | Curtis   |       1985 |
+-------+----------+------------+
3 rows in set (0.00 sec)

Ordering SELECT Results

You can sort the results either in ascending or descending order.


▪ specify your requirements using the ORDER BY clause
▪ sort order ascending (ASC)
▪ sort order descending (DESC)

mysql> SELECT * FROM test_table;
+-------+----------+------------+
| name  | surname  | birth_year |
+-------+----------+------------+
| Sarah | Geronimo |       1991 |
| Angel | Locsin   |       1982 |
| Angel | Aquino   |       1975 |
| Anne  | Curtis   |       1985 |
+-------+----------+------------+
4 rows in set (0.00 sec)

mysql> SELECT * FROM test_table ORDER BY surname asc;
+-------+----------+------------+
| name  | surname  | birth_year |
+-------+----------+------------+
| Angel | Aquino   |       1975 |
| Anne  | Curtis   |       1985 |
| Sarah | Geronimo |       1991 |
| Angel | Locsin   |       1982 |
+-------+----------+------------+
4 rows in set (0.00 sec)

mysql> SELECT * FROM test_table ORDER BY surname desc;
+-------+----------+------------+
| name  | surname  | birth_year |
+-------+----------+------------+
| Angel | Locsin   |       1982 |
| Sarah | Geronimo |       1991 |
| Anne  | Curtis   |       1985 |
| Angel | Aquino   |       1975 |
+-------+----------+------------+
4 rows in set (0.00 sec)
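The filtering and ordering queries of this lesson can be checked together in one short sketch. It assumes Python's built-in sqlite3 module in place of MySQL and uses the same four sample rows.

```python
import sqlite3

# In-memory SQLite database loaded with the lesson's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (name TEXT, surname TEXT, birth_year INTEGER)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?)",
    [
        ("Sarah", "Geronimo", 1991),
        ("Angel", "Locsin", 1982),
        ("Angel", "Aquino", 1975),
        ("Anne", "Curtis", 1985),
    ],
)

# WHERE with an equality test on a text value (note the single quotes).
sarah = conn.execute("SELECT * FROM test_table WHERE name = 'Sarah'").fetchall()

# BETWEEN includes both end points, so 1980 and 1995 themselves would match.
born_80_95 = conn.execute(
    "SELECT surname FROM test_table WHERE birth_year BETWEEN 1980 AND 1995 ORDER BY surname"
).fetchall()

# ORDER BY keeps each row intact and only changes the order in which rows are returned.
ascending = conn.execute("SELECT * FROM test_table ORDER BY surname ASC").fetchall()
descending = conn.execute("SELECT * FROM test_table ORDER BY surname DESC").fetchall()

print(ascending)
print(descending)
```

Each surname keeps its own birth_year in both orderings, and the descending result is simply the ascending result reversed.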

Lesson 3: Creating a View

In SQL, a view is a virtual table based on the result-set of an SQL
statement. A view contains rows and columns, just like a real table. The fields in a
view are fields from one or more real tables in the database.
You can add SQL functions, WHERE, and JOIN clauses to a view and
present the data as if the data were coming from one single table.

CREATE VIEW Syntax

CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;

Note: A view always shows up-to-date data! The database engine recreates the data,
using the view's SQL statement, every time a user queries a view.

Example

CREATE VIEW Artists AS
SELECT name, surname, birth_year
FROM test_table
WHERE birth_year > 1980;

We can query the view above as follows:

SELECT * FROM Artists;
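The note that a view always shows up-to-date data can be verified with a small sketch. It assumes Python's built-in sqlite3 module instead of MySQL; the extra row inserted at the end is a made-up example.

```python
import sqlite3

# In-memory SQLite database loaded with the lesson's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (name TEXT, surname TEXT, birth_year INTEGER)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?)",
    [
        ("Sarah", "Geronimo", 1991),
        ("Angel", "Locsin", 1982),
        ("Angel", "Aquino", 1975),
        ("Anne", "Curtis", 1985),
    ],
)

# The view stores only the query, not a copy of the rows.
conn.execute(
    """
    CREATE VIEW Artists AS
    SELECT name, surname, birth_year
    FROM test_table
    WHERE birth_year > 1980
    """
)
count_before = conn.execute("SELECT COUNT(*) FROM Artists").fetchone()[0]

# Hypothetical new row in the base table; the view picks it up on the next query.
conn.execute("INSERT INTO test_table VALUES ('Sample', 'Person', 1999)")
count_after = conn.execute("SELECT COUNT(*) FROM Artists").fetchone()[0]

print(count_before, count_after)
```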

38
Lesson 4: Joining Database Tables

▪ A SQL join is a Structured Query Language (SQL) instruction to combine data
from two sets of data (i.e. two tables).

▪ Joining tables is the most important distinction between relational databases and
other databases.

▪ A join is performed when data are retrieved from more than one table at a time.
  o It is usually based on an equality comparison between the foreign key and the
    primary key of the related tables.

▪ Join tables by listing the tables in the FROM clause of the SELECT statement.
  o The DBMS creates the Cartesian product of every table listed; the join
    condition then keeps only the matching rows.

Model Tables

suppliers

+-------------+---------------+-------------------+------------------+
| supplier_id | supplier_name | supplier_address  | supplier_contact |
+-------------+---------------+-------------------+------------------+
|           1 | Microsoft     | 1 Microsoft Way   | Bill Gates       |
|           2 | Apple, Inc.   | 1 Infinite Loop   | Steve Jobs       |
|           3 | EasyTech      | 100 Beltway Drive | John Williams    |
|           4 | WildTech      | 100 Hard Drive    | Alan Wilkes      |
+-------------+---------------+-------------------+------------------+

product

+-----------+----------------------------+-----------------------------+-------------+
| prod_code | prod_name                  | prod_desc                   | supplier_id |
+-----------+----------------------------+-----------------------------+-------------+
|         1 | CD-RW Model 4543           | CD Writer                   |           3 |
|         2 | EasyTech Mouse 7632        | Cordless Mouse              |           3 |
|         3 | WildTech 250Gb 1700        | SATA Disk Drive             |           4 |
|         4 | Microsoft 10-20 Keyboard   | Ergonomic Keyboard          |           1 |
|         5 | Apple iPhone 8Gb           | Smart Phone                 |           2 |
+-----------+----------------------------+-----------------------------+-------------+

39
Performing a Cross-Join

Joining tables involves combining rows from two tables. The most basic of
join types is the cross-join. The cross-join simply pairs each row from one table with
every row of the second table. This is of little or no use in real terms, but for the
purposes of completeness, the syntax for a cross-join is as follows:

SELECT column_names FROM table1, table2;

For example, if we were to perform the following command on our sample
tables, we would get the following output:

SELECT prod_name, supplier_name FROM product, suppliers;

+----------------------------+---------------+
| prod_name | supplier_name |
+----------------------------+---------------+
| CD-RW Model 4543 | Microsoft |
| CD-RW Model 4543 | Apple, Inc. |
| CD-RW Model 4543 | EasyTech |
| CD-RW Model 4543 | WildTech |
| EasyTech Mouse 7632 | Microsoft |
| EasyTech Mouse 7632 | Apple, Inc. |
| EasyTech Mouse 7632 | EasyTech |
| EasyTech Mouse 7632 | WildTech |
| WildTech 250Gb 1700 | Microsoft |
| WildTech 250Gb 1700 | Apple, Inc. |
| WildTech 250Gb 1700 | EasyTech |
| WildTech 250Gb 1700 | WildTech |
| Microsoft 10-20 Keyboard | Microsoft |
| Microsoft 10-20 Keyboard | Apple, Inc. |
| Microsoft 10-20 Keyboard | EasyTech |
| Microsoft 10-20 Keyboard | WildTech |
| Apple iPhone 8Gb | Microsoft |
| Apple iPhone 8Gb | Apple, Inc. |
| Apple iPhone 8Gb | EasyTech |
| Apple iPhone 8Gb | WildTech |
+----------------------------+---------------+
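The size of a cross-join result is the product of the two row counts, here 5 products times 4 suppliers. The sketch below checks this with Python's built-in sqlite3 module; only the columns needed for the join are created, which is a simplification of the model tables.

```python
import sqlite3

# In-memory SQLite database with a trimmed-down copy of the model tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, supplier_name TEXT)")
conn.execute(
    "CREATE TABLE product (prod_code INTEGER PRIMARY KEY, prod_name TEXT, supplier_id INTEGER)"
)
conn.executemany(
    "INSERT INTO suppliers VALUES (?, ?)",
    [(1, "Microsoft"), (2, "Apple, Inc."), (3, "EasyTech"), (4, "WildTech")],
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [
        (1, "CD-RW Model 4543", 3),
        (2, "EasyTech Mouse 7632", 3),
        (3, "WildTech 250Gb 1700", 4),
        (4, "Microsoft 10-20 Keyboard", 1),
        (5, "Apple iPhone 8Gb", 2),
    ],
)

# Listing two tables with no join condition pairs every product with every supplier.
cross_rows = conn.execute("SELECT prod_name, supplier_name FROM product, suppliers").fetchall()

print(len(cross_rows))
```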

40
Equi-Join (aka the Inner Join)

The Equi-Join joins rows from two or more tables based on comparisons
between a specific column in each table. The syntax for this approach is as follows:

SELECT column_names FROM table1, table2
WHERE (table1.column = table2.column);

For example, to extract the product name, supplier name and supplier address for
each row in our product table we would use the following command:

SELECT prod_name, supplier_name, supplier_address FROM product, suppliers
WHERE (product.supplier_id = suppliers.supplier_id);

Note that we have to use what is known as the fully qualified name for the
supplier_id column in each table since both tables contain a supplier_id. A fully
qualified column name is defined by specifying the table name followed by a dot (.)
and then the column name.

The result of the above command is a list of products and the name and
address of the supplier for each product:

+--------------------------+---------------+-------------------+
| prod_name | supplier_name | supplier_address |
+--------------------------+---------------+-------------------+
| Microsoft 10-20 Keyboard | Microsoft | 1 Microsoft Way |
| Apple iPhone 8Gb | Apple, Inc. | 1 Infinite Loop |
| CD-RW Model 4543 | EasyTech | 100 Beltway Drive |
| EasyTech Mouse 7632 | EasyTech | 100 Beltway Drive |
| WildTech 250Gb 1700 | WildTech | 100 Hard Drive |
+--------------------------+---------------+-------------------+
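The equi-join written with a WHERE condition returns the same rows as the explicit INNER JOIN ... ON form. The sketch below compares the two using Python's built-in sqlite3 module; the INNER JOIN spelling is standard SQL that the lesson does not show, and the tables are trimmed to the columns the query uses.

```python
import sqlite3

# In-memory SQLite database with a trimmed-down copy of the model tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, supplier_name TEXT, supplier_address TEXT)"
)
conn.execute(
    "CREATE TABLE product (prod_code INTEGER PRIMARY KEY, prod_name TEXT, supplier_id INTEGER)"
)
conn.executemany(
    "INSERT INTO suppliers VALUES (?, ?, ?)",
    [
        (1, "Microsoft", "1 Microsoft Way"),
        (2, "Apple, Inc.", "1 Infinite Loop"),
        (3, "EasyTech", "100 Beltway Drive"),
        (4, "WildTech", "100 Hard Drive"),
    ],
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [
        (1, "CD-RW Model 4543", 3),
        (2, "EasyTech Mouse 7632", 3),
        (3, "WildTech 250Gb 1700", 4),
        (4, "Microsoft 10-20 Keyboard", 1),
        (5, "Apple iPhone 8Gb", 2),
    ],
)

# Equi-join as written in the lesson: comma-separated tables plus a WHERE condition.
where_form = conn.execute(
    """
    SELECT prod_name, supplier_name, supplier_address
    FROM product, suppliers
    WHERE product.supplier_id = suppliers.supplier_id
    ORDER BY prod_name
    """
).fetchall()

# The same join spelled with INNER JOIN ... ON.
join_form = conn.execute(
    """
    SELECT prod_name, supplier_name, supplier_address
    FROM product INNER JOIN suppliers
    ON product.supplier_id = suppliers.supplier_id
    ORDER BY prod_name
    """
).fetchall()

print(where_form == join_form, len(where_form))
```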

Performing a Left Join or a Right Join

Another way to join tables is to use a LEFT JOIN in the SELECT statement. The
join condition is given in an ON clause, so the tables are joined before any
WHERE clause is applied. The syntax for this type of join is:

SELECT column_names FROM table1 LEFT JOIN table2
ON (table1.column = table2.column);

Therefore, we can perform a LEFT JOIN that gives us the same result as our Equi-Join:

SELECT prod_name, supplier_name, supplier_address FROM product LEFT JOIN suppliers
ON (product.supplier_id = suppliers.supplier_id);
+----------------------------+---------------+-------------------+
| prod_name | supplier_name | supplier_address |
+----------------------------+---------------+-------------------+
| CD-RW Model 4543 | EasyTech | 100 Beltway Drive |
| EasyTech Mouse 7632 | EasyTech | 100 Beltway Drive |
| WildTech 250Gb 1700 | WildTech | 100 Hard Drive |
| Microsoft 10-20 Keyboard | Microsoft | 1 Microsoft Way |
| Apple iPhone 8Gb | Apple, Inc. | 1 Infinite Loop |
+----------------------------+---------------+-------------------+

One key difference with the LEFT JOIN is that it will also list rows from the
first table for which there is no match in the second table. For example, suppose
we have a product in our product table for which there is no matching supplier in the
suppliers table. When we run our SELECT statement, the row will still be displayed,
but with NULL values for the supplier columns since no such supplier exists:

+----------------------------+---------------+-------------------+
| prod_name | supplier_name | supplier_address |
+----------------------------+---------------+-------------------+
| CD-RW Model 4543 | EasyTech | 100 Beltway Drive |
| EasyTech Mouse 7632 | EasyTech | 100 Beltway Drive |
| WildTech 250Gb 1700 | WildTech | 100 Hard Drive |
| Microsoft 10-20 Keyboard | Microsoft | 1 Microsoft Way |
| Apple iPhone 8Gb | Apple, Inc. | 1 Infinite Loop |
| Moto Razr | NULL | NULL |
+----------------------------+---------------+-------------------+

The opposite effect can be achieved using a RIGHT JOIN, whereby all the
rows in the second table (i.e. our suppliers table) will be displayed regardless of
whether that supplier has any products in our product table:

SELECT prod_name, supplier_name, supplier_address FROM product RIGHT JOIN suppliers
ON (product.supplier_id = suppliers.supplier_id);
+--------------------------+-----------------+------------------------+
| prod_name | supplier_name | supplier_address |
+--------------------------+-----------------+------------------------+
| Microsoft 10-20 Keyboard | Microsoft | 1 Microsoft Way |
| Apple iPhone 8Gb | Apple, Inc. | 1 Infinite Loop |
| CD-RW Model 4543 | EasyTech | 100 Beltway Drive |
| EasyTech Mouse 7632 | EasyTech | 100 Beltway Drive |
| WildTech 250Gb 1700 | WildTech | 100 Hard Drive |
| NULL | Hewlett Packard | 100 Printer Expressway |
+--------------------------+-----------------+------------------------+
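The unmatched rows produced by LEFT JOIN and RIGHT JOIN can be reproduced with the sketch below, which uses Python's built-in sqlite3 module. Two assumptions are made: the supplier_id values given to Moto Razr (9) and Hewlett Packard (5) are invented, and because older SQLite versions lack RIGHT JOIN, the right join is emulated by swapping the tables in a LEFT JOIN.

```python
import sqlite3

# In-memory SQLite database with a trimmed-down copy of the model tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, supplier_name TEXT)")
conn.execute(
    "CREATE TABLE product (prod_code INTEGER PRIMARY KEY, prod_name TEXT, supplier_id INTEGER)"
)
conn.executemany(
    "INSERT INTO suppliers VALUES (?, ?)",
    [
        (1, "Microsoft"),
        (2, "Apple, Inc."),
        (3, "EasyTech"),
        (4, "WildTech"),
        (5, "Hewlett Packard"),  # supplier with no products (id 5 is an assumption)
    ],
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [
        (1, "CD-RW Model 4543", 3),
        (2, "EasyTech Mouse 7632", 3),
        (3, "WildTech 250Gb 1700", 4),
        (4, "Microsoft 10-20 Keyboard", 1),
        (5, "Apple iPhone 8Gb", 2),
        (6, "Moto Razr", 9),  # product whose supplier_id 9 matches no supplier (assumption)
    ],
)

# LEFT JOIN keeps every product, filling supplier columns with NULL when unmatched.
left_rows = conn.execute(
    """
    SELECT prod_name, supplier_name
    FROM product LEFT JOIN suppliers
    ON product.supplier_id = suppliers.supplier_id
    """
).fetchall()

# "product RIGHT JOIN suppliers" emulated as "suppliers LEFT JOIN product".
right_rows = conn.execute(
    """
    SELECT prod_name, supplier_name
    FROM suppliers LEFT JOIN product
    ON product.supplier_id = suppliers.supplier_id
    """
).fetchall()

print(left_rows)
print(right_rows)
```

Python shows SQL NULL as None, so the unmatched rows appear as ('Moto Razr', None) and (None, 'Hewlett Packard').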

Creating Joins with WHERE and USING

The next step is to incorporate some WHERE clauses into our LEFT and
RIGHT joins. Say, for example, that we wish to list only products supplied by Microsoft:

SELECT prod_name, supplier_name, supplier_address FROM product RIGHT JOIN suppliers
ON (product.supplier_id = suppliers.supplier_id) WHERE supplier_name='Microsoft';
+--------------------------+---------------+------------------+
| prod_name | supplier_name | supplier_address |
+--------------------------+---------------+------------------+
| Microsoft 10-20 Keyboard | Microsoft | 1 Microsoft Way |
+--------------------------+---------------+------------------+
1 row in set (0.00 sec)

The USING clause further simplifies the task of creating joins. The
purpose of USING is to avoid the use of fully qualified names (such as
product.supplier_id and suppliers.supplier_id) when referencing columns that reside
in different tables but have the same name. For example, to perform the same join
as above based on the values of product.supplier_id and suppliers.supplier_id we can
simply use the following syntax:

SELECT prod_name, supplier_name, supplier_address FROM product
LEFT JOIN suppliers USING (supplier_id)
WHERE supplier_name='Microsoft';

Resulting in the following output:

+--------------------------+---------------+------------------+
| prod_name | supplier_name | supplier_address |
+--------------------------+---------------+------------------+
| Microsoft 10-20 Keyboard | Microsoft | 1 Microsoft Way |
+--------------------------+---------------+------------------+
1 row in set (0.00 sec)

ASSESSMENT:

1. Define what normalization is.

___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________

2. Identify the problems that you may encounter without normalization in designing
your database and discuss briefly each modification anomaly.

___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________

3. The table below shows unnormalized data. Improve the data by normalizing it to
first normal form.

Table Product

Product ID    Color           Price
101           red, blue       25
102           blue            57
103           red, green      34
104           Yellow, blue    48
105           red             50

___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________

4. Explain what Structured Query Language is.

_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________

5. Identify and explain briefly the Structured Query Language process.

_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________

6. Explain briefly the categories of data integrity that exist in an RDBMS.

__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________

7. Perform the following exercises using your laptop or personal computer. Make
sure to take screenshots of every task and send them to your instructor's email account.

A. Create a Database with the following tables and field specifications.

Student Table    Book Table

Field            Field
StudentID        BookID
Fname            StudentID
Mname            Book Title
Lname            Publication

▪ Provide at least 5 sample records for each table.

▪ Student_table:

Add/insert the following fields:

o Address

o Contact No.

o Gender

▪ Book_table:

o Insert Author Name

o Select the BookID and Book Title

B. Create the following table.


Table name: my_classmates

Surname      Name     Age    Prog_id
Dela Cruz    Juan     20     1
Locsin       Angel    23     2
Aquino       Jake     33     3
Lname        Jake     25     4

Table name: programs

Prog_id    Description
1          BSIT
2          BSED
3          BEED
4          BSA

▪ Rename the table my_classmates to my_friends

▪ Create a VIEW named best_friends with the following columns:
surname, name, prog_id. Do not forget to query your view and sort in
ascending order according to their age.

▪ Join two tables above using inner join, use the prog_id to join the
tables.

Module 4: ADVANCED SQL FUNCTIONS

Introduction

SQL functions are simply sub-programs, which are commonly used and reused
throughout SQL database applications for processing or manipulating data. All SQL
database systems have DDL (data definition language) and DML (data manipulation
language) tools to support the creation and maintenance of databases.

Learning Outcomes

After studying this lesson, students should be able to:

1. Understand how to use the advanced SQL functions.

2. Understand how to use SQL functions to manipulate dates and time.

3. Understand how to use the syntax of all the date functions.

4. Understand SQL Concatenate character function

5. Understand how to use the SQL LENGTH function to get the number of characters
in a string.

6. Understand how to use the SQL UPPER function to convert a string into uppercase.

7. Understand how to use the LOWER function to convert a string into lowercase.

8. Understand how to use the SQL SUBSTRING function to extract a substring from a
string.

Lesson 1. SQL DATE FUNCTIONS

In SQL, dates are complicated for beginners because, when working with a database, the
format of an input date must match the format of the date column in the table in order
to insert it. In various scenarios datetime (a date together with a time) is used
instead of date.

Date and time functions differ between database systems. We are going to discuss
the most common functions used in the Oracle database.

The function SYSDATE returns 7 bytes of data, which include:

• Century
• Year
• Month
• Day
• Hour
• Minute
• Second

Extract ():

Oracle helps you to extract Year, Month and Day from a date using Extract()
Function.

• Example-1: Extracting Year

SELECT SYSDATE AS CURRENT_DATE_TIME,
       EXTRACT(Year FROM SYSDATE) AS ONLY_CURRENT_YEAR
FROM Dual

Output:

CURRENT_DATE_TIME       ONLY_CURRENT_YEAR
05.Feb.2019 07:29:24    2019

Explanation:

Useful to retrieve only year from the System date/Current date or particular specified
date.

• Example-2: Extracting Month

SELECT SYSDATE AS CURRENT_DATE_TIME,
       EXTRACT(Month FROM SYSDATE) AS ONLY_CURRENT_MONTH
FROM Dual

Output:

CURRENT_DATE_TIME       ONLY_CURRENT_MONTH
05.Feb.2019 07:29:24    2

Explanation:

Useful to retrieve only the month from the system date/current date or a particular
specified date. EXTRACT returns the month as a number, so February is returned as 2.

• Example-3: Extracting Day

SELECT SYSDATE AS CURRENT_DATE_TIME,
       EXTRACT(Day FROM SYSDATE) AS ONLY_CURRENT_DAY
FROM Dual

Output:

CURRENT_DATE_TIME       ONLY_CURRENT_DAY
05.Feb.2019 07:29:24    5

Explanation:

Useful to retrieve only day from the System date/Current date or particular specified
date.

ADD_MONTHS (date, n):

Using this function in PL/SQL you can add as well as subtract a number of months (n)
to or from a date. Here 'n' can be either negative or positive.

• Example-4

SELECT ADD_MONTHS(SYSDATE, -1) AS PREV_MONTH,
       SYSDATE AS CURRENT_DATE,
       ADD_MONTHS(SYSDATE, 1) AS NEXT_MONTH
FROM Dual

Output:

PREV_MONTH              CURRENT_DATE            NEXT_MONTH
02.Jan.2019 09:15:46    02.Feb.2019 09:15:46    02.Mar.2019 09:15:46

Explanation:

The ADD_MONTHS function has two parameters. The first is a date, which can be any
specified date or the system date as the current date. The second is 'n', an
integer value that can be positive to get an upcoming date or negative to get a
previous date.
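The month arithmetic behind ADD_MONTHS can be sketched in Python. This is a rough analogue written for illustration, not Oracle's implementation: it works on whole dates only, and it follows Oracle's documented rule that a month-end input, or a day that does not exist in the target month, yields the last day of the target month.

```python
import calendar
from datetime import date


def add_months(d: date, n: int) -> date:
    """Rough Python analogue of ADD_MONTHS for whole dates (illustrative sketch)."""
    # Count months from year 0 so that a negative n rolls the year back correctly.
    month_index = d.year * 12 + (d.month - 1) + n
    year, month = divmod(month_index, 12)
    month += 1  # back to a 1-based month number

    last_of_source = calendar.monthrange(d.year, d.month)[1]
    last_of_target = calendar.monthrange(year, month)[1]

    # Month-end inputs, and days missing from the target month, map to its last day.
    if d.day == last_of_source or d.day > last_of_target:
        day = last_of_target
    else:
        day = d.day
    return date(year, month, day)


today = date(2019, 2, 2)
prev_month = add_months(today, -1)
next_month = add_months(today, 1)
print(prev_month, today, next_month)
```

With 2 February 2019 as the starting date, the sketch gives 2 January 2019 and 2 March 2019, matching the example output above; add_months(date(2019, 1, 31), 1) gives 28 February 2019.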

LAST_DAY(date):

Using this function in PL/SQL you can get the last day of the month of a specified date.

• Example-5

SELECT SYSDATE AS CURRENT_DATE,
       LAST_DAY(SYSDATE) AS LAST_DAY_OF_MONTH,
       LAST_DAY(SYSDATE)+1 AS FIRST_DAY_OF_NEXT_MONTH
FROM Dual

Output:

CURRENT_DATE            LAST_DAY_OF_MONTH       FIRST_DAY_OF_NEXT_MONTH
02.Feb.2019 09:32:00    28.Feb.2019 09:32:00    01.Mar.2019 09:32:00

Explanation:

In the above example, we get the current date using the SYSDATE function and the last
date of the month using the LAST_DAY function. Adding 1 to that result
gives the first day of the next month.

• Example-6: Number of days left in the month

SELECT SYSDATE AS CURRENT_DATE,
       LAST_DAY(SYSDATE) - SYSDATE AS DAYS_LEFT_IN_MONTH
FROM Dual

Output:

CURRENT_DATE            DAYS_LEFT_IN_MONTH
02.Feb.2019 09:32:00    26
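The two LAST_DAY examples can be mirrored in Python as a quick check of the arithmetic. The helper last_day below is an illustrative stand-in for the Oracle function and ignores the time of day.

```python
import calendar
from datetime import date, timedelta


def last_day(d: date) -> date:
    """Last calendar day of the month that contains d (illustrative stand-in for LAST_DAY)."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])


today = date(2019, 2, 2)
last_day_of_month = last_day(today)
first_day_of_next_month = last_day_of_month + timedelta(days=1)
days_left_in_month = (last_day_of_month - today).days

print(last_day_of_month, first_day_of_next_month, days_left_in_month)
```

Subtracting two dates gives a number of days, which is why LAST_DAY(SYSDATE) - SYSDATE returns 26 on 2 February 2019.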

MONTHS_BETWEEN (date1, date2):

Using this method in PL/SQL you can calculate the number of months between
two entered dates date1 and date2. if date1 is later than date2 then the result
would be positive and if date1 is earlier than date2 then result is negative.
Note:

If a fractional month is calculated, the MONTHS_BETWEEN function calculates the


fraction based on a 31-day month.

SELECT MONTHS_BETWEEN (TO_DATE ('01-07-2003', 'dd-mm-yyyy'),
                       TO_DATE ('14-03-2003', 'dd-mm-yyyy')) AS NUMBER_OF_MONTHS
FROM Dual

Output:

NUMBER_OF_MONTHS
3.58

Explanation:

Here date1 and date2 do not fall on the same day of the month, which is why the
result has a fractional part. Because date1 is later than date2, the result is
positive.

The dates must be supplied as DATE values, which is why the TO_DATE function is used
to convert the strings inside the MONTHS_BETWEEN call.
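The 31-day rule in the note above can be checked by hand with a small Python sketch. The function name months_between is hypothetical, and the sketch ignores the time of day. It assumes Oracle's documented behaviour that the result is a whole number when both dates fall on the same day of the month or are both the last day of their months.

```python
import calendar
from datetime import date


def months_between(date1: date, date2: date) -> float:
    """Approximate MONTHS_BETWEEN(date1, date2) for dates without a time part."""
    whole_months = (date1.year - date2.year) * 12 + (date1.month - date2.month)

    date1_is_last = date1.day == calendar.monthrange(date1.year, date1.month)[1]
    date2_is_last = date2.day == calendar.monthrange(date2.year, date2.month)[1]
    if date1.day == date2.day or (date1_is_last and date2_is_last):
        return float(whole_months)

    # The fractional part is always based on a 31-day month.
    return whole_months + (date1.day - date2.day) / 31


result = months_between(date(2003, 7, 1), date(2003, 3, 14))
rounded = round(result, 2)   # 4 months minus 13/31 of a month
```

For the example above, the whole-month difference is 4 and the day difference is 1 - 14 = -13, so the result is 4 - 13/31, which rounds to 3.58. Swapping the arguments gives the same magnitude with a negative sign.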

Let’s select the number of months an employee has worked for the company.

• Example-8

SELECT MONTHS_BETWEEN (SYSDATE, DATE_OF_HIRE) AS NUMBER_OF_MONTHS
FROM Employees

Input:

SYSTEM_DATE  DATE_OF_HIRE
02-02-2019   31-10-2017
02-02-2019   03-12-2017
02-02-2019   24-09-2018
02-02-2019   22-12-2016
02-02-2019   18-06-2018

Output:

NUMBER_OF_MONTHS
15.064
13.967
4.290
25.354
7.483

NEXT_DAY(date, day_of_week):

It returns the date of the first occurrence of the given weekday that is later than
the entered date. It takes two parameters: the first is a date, which can be the
system date or a specified date; the second is the day of the week, given as a
character string.

• Example-9:

SELECT NEXT_DAY(SYSDATE, 'SUNDAY') AS NEXT_SUNDAY

FROM Dual

Output:

NEXT_SUNDAY

17-FEB-2019

Explanation:

NEXT_DAY returns the next upcoming date that falls on the given day of the week. The
return type is always DATE, regardless of the data type of the date argument. The
second parameter must be a day of the week, given either as a full name or as an
abbreviation.
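The "first weekday strictly later than the date" rule can be sketched in Python as follows. The function name next_day and the English day names are assumptions made for illustration; in Oracle the accepted names depend on the session's date language.

```python
from datetime import date, timedelta

WEEKDAYS = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
            "FRIDAY", "SATURDAY", "SUNDAY"]


def next_day(d: date, day_of_week: str) -> date:
    """Return the first date after d that falls on day_of_week."""
    # Accept full names or three-letter abbreviations such as 'SUN'.
    abbreviations = [name[:3] for name in WEEKDAYS]
    target = abbreviations.index(day_of_week.strip().upper()[:3])

    # Always move forward by 1 to 7 days, never 0: the result is
    # strictly later than the input date.
    days_ahead = (target - d.weekday() - 1) % 7 + 1
    return d + timedelta(days=days_ahead)


next_sunday = next_day(date(2019, 2, 12), "SUNDAY")
```

Starting from Tuesday, 12 February 2019, the next Sunday is 17 February 2019. If the input date is itself a Sunday, the function returns the Sunday one week later.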

Lesson 2. CHARACTER FUNCTIONS

A character or string function is a function that takes one or more characters or
numbers as parameters and typically returns a character value. Basic string functions
offer a number of capabilities and usually return a string value as a result.

Lesson 2.2.1. SQL Concatenate character function

The CONCAT function takes two or more string values and returns those strings
appended together in the order specified. The examples in this section use SQL Server
syntax; Oracle's CONCAT accepts exactly two arguments. The syntax of the function is
as follows:

CONCAT ( string_value1, string_value2 [, string_valueN ] )

The CONCAT function requires at least two parameters and accepts a maximum of 254
parameters.

CONCAT function examples

In this example, we will join the strings Think and green with the CONCAT function:

SELECT CONCAT('Think','green') AS 'FullString'

As we can see clearly in this first example, the CONCAT function joined these two
strings and we obtained the Thinkgreen string.

In this second example, we will join nine strings:

SELECT CONCAT('If', ' you', ' save', ' a', ' tree', ' you', ' save', ' a', ' life') AS 'FullString'

In this second example, the CONCAT function concatenated more than two strings
and the result was If you save a tree you save a life.

In addition, we can concatenate the variables with this function:

DECLARE @Str1 AS VARCHAR(100)='Think'
DECLARE @Str2 AS VARCHAR(100)='-'
DECLARE @Str3 AS VARCHAR(100)='green'

SELECT CONCAT(@Str1,@Str2,@Str3) AS ResultString

Concatenating numerical expressions with CONCAT function in SQL

CONCAT function also has the capability to join the numeric values. In the following
example, we will join three different integer values:

SELECT CONCAT(11,33,99) AS Result

As we can see, we did not use any CAST or CONVERT function to join these
numerical expressions with the function. On the other hand, if we want to
concatenate these expressions with (+) plus sign, we need to convert them to
string data types. Otherwise, the result of the concatenation operation will be
incorrect:

SELECT CAST(11 AS VARCHAR(10)) + CAST(33 AS VARCHAR(10)) + CAST(99 AS VARCHAR(10)) AS TrueResult

The following example concatenates numerical expressions with the plus sign (+)
without any data conversion, so the output is the result of a mathematical addition:

SELECT 11+33+99 AS WrongResult
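The difference between the three queries above can be mirrored in Python, where + also adds numbers but joins strings. This is only an analogy: the concat helper below is hypothetical and simply converts every argument to a string before joining, which is the behaviour the CONCAT example relies on.

```python
def concat(*values) -> str:
    """Join the arguments as text, converting each one to a string first."""
    if len(values) < 2:
        raise TypeError("concat expects at least two arguments")
    return "".join(str(value) for value in values)


joined = concat(11, 33, 99)                     # like CONCAT(11,33,99)
converted = str(11) + str(33) + str(99)         # like CAST(...) + CAST(...) + CAST(...)
added = 11 + 33 + 99                            # like 11+33+99 without conversion
```

Here joined and converted both hold the text 113399, while added holds the number 143.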

Now let's look at what happens behind the scenes when the CONCAT function
concatenates numerical expressions.

Firstly, we will create a test table in order to insert some numerical expressions. The
following script will create a Test_NumericValue table:

DROP TABLE IF EXISTS Test_NumericValue

CREATE TABLE Test_NumericValue (Number_1 INT, Number_2 INT, Number_3 INT)

In the second step, we will insert test data to this table:

INSERT INTO Test_NumericValue VALUES (11,33,99)

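Now we will execute the following SELECT statement in ApexSQL Plan:

SELECT CONCAT(Number_1,Number_2,Number_3) FROM Test_NumericValue

CONCAT implicitly converts each argument to a string type before concatenating,
which is why no explicit CAST or CONVERT is needed.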

How to use line feed (\n) and carriage return (\r) with CONCAT function

The CHAR function converts an ASCII code to its character value. The following
ASCII codes can be used to get a new line with the CHAR function in SQL:

Value  Char  Description
10     LF    Line Feed
13     CR    Carriage Return

We can get a new line when we concatenate the strings with the following CHAR
functions:

• CHAR (10): New Line / Line Feed

• CHAR (13): Carriage Return

For example, the following query uses the CHAR(13) function to return the
concatenated string line by line:

SELECT CONCAT('Make',CHAR(13),'every',CHAR(13),'drop',CHAR(13),'of',CHAR(13),
'water',CHAR(13),'count') AS Result

Now, we will replace the CHAR(13) expression with the CHAR(10) expression and
re-execute the same query:

SELECT CONCAT('Make',CHAR(10),'every',CHAR(10),'drop',CHAR(10),'of',CHAR(10),
'water',CHAR(10),'count') AS Result

We can also use CHAR(13) and CHAR(10) together. This can be a good option when we
want to generate a line break, as the following query demonstrates:

SELECT CONCAT('Make',CHAR(10),CHAR(13),'every' ,CHAR(10),CHAR(13),'drop',
CHAR(10),CHAR(13), 'of',CHAR(10),CHAR(13),'water',CHAR(10),CHAR(13),'count')
AS Result
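The same control characters can be inspected in Python, where chr plays the role of CHAR. This is a small sketch for illustration only. Note that the conventional Windows line break is carriage return followed by line feed (code 13, then code 10), so the sketch joins the words in that order.

```python
LF = chr(10)   # line feed, the same character as "\n"
CR = chr(13)   # carriage return, the same character as "\r"

words = ["Make", "every", "drop", "of", "water", "count"]

joined_lf = LF.join(words)            # one LF between words
joined_crlf = (CR + LF).join(words)   # CR followed by LF between words

# splitlines() treats LF, CR and the CR+LF pair each as a single line break.
lines_from_lf = joined_lf.splitlines()
lines_from_crlf = joined_crlf.splitlines()
```

Both joined strings split back into the original six words, one per line.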

Conclusion

In this lesson, we learned how to use the CONCAT function in SQL through various
examples.

The CONCAT function is a very useful option for concatenating expressions in SQL.

Lesson 2.2.2. SQL Length Function

The SQL LENGTH function returns the number of characters in a string. The
LENGTH function is available in most relational database systems. Some
database systems, such as SQL Server, use the LEN function, which has the same
effect as the LENGTH function.

LENGTH(string)

If the input string is empty, the LENGTH returns 0. It returns NULL if the input string
is NULL.

The number of characters is the same as the number of bytes for the ASCII strings.
For other character sets, they may be different.

In some relational database systems, such as MySQL, the LENGTH function returns the
number of bytes rather than the number of characters. To get the number of
characters in a string in MySQL, you use the CHAR_LENGTH function instead. In
PostgreSQL, LENGTH already counts characters, and CHAR_LENGTH is also available.
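The difference between counting characters and counting bytes is easy to see in Python. The sketch below assumes UTF-8 encoding; the sample strings are made up for illustration.

```python
ascii_text = "SQL"
accented_text = "Zo\u00eb"   # "Zoë": the last letter is outside ASCII

# Number of characters (what a character-counting LENGTH reports).
ascii_chars = len(ascii_text)
accented_chars = len(accented_text)

# Number of bytes in UTF-8 (what a byte-counting LENGTH reports).
ascii_bytes = len(ascii_text.encode("utf-8"))
accented_bytes = len(accented_text.encode("utf-8"))
```

For the ASCII string both counts are 3. For the accented string the character count is still 3, but the byte count is 4 because the last letter takes two bytes in UTF-8.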

SQL LENGTH examples

The following statement uses the LENGTH function to return the number of
characters in the string SQL:

SELECT LENGTH ('SQL');

length
--------
3
(1 row)

See the following employees table in the sample database.

The following statement returns the top five employees with the longest names.

SELECT
employee_id,
CONCAT (first_name, ' ', last_name) AS full_name,
LENGTH (CONCAT (first_name, ' ', last_name)) AS len
FROM
employees
ORDER BY len DESC
LIMIT 5;

How the query works.

• First, use the CONCAT function to construct the full name of the employee by
concatenating the first name, space, and last name.

• Second, apply the LENGTH function to return the number of characters of the
full name of each employee.

• Third, sort the result set by the result of the LENGTH function and get five rows
from the sorted result set.
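The three steps above can be tried end to end with Python's built-in sqlite3 module. This is a sketch under several assumptions: SQLite stands in for the sample database, the six employee rows are invented, the || operator replaces CONCAT (which older SQLite versions lack), and employee_id is added as a tie-breaker so the order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        (1, "Bruce", "Ernst"),
        (2, "Alexander", "Hunold"),
        (3, "Sarah", "Bell"),
        (4, "Jose Manuel", "Urman"),
        (5, "Den", "Li"),
        (6, "Shelley", "Higgins"),
    ],
)

# Build the full name, measure it, sort by length, keep five rows.
rows = conn.execute(
    """
    SELECT employee_id,
           first_name || ' ' || last_name AS full_name,
           LENGTH(first_name || ' ' || last_name) AS len
    FROM employees
    ORDER BY len DESC, employee_id
    LIMIT 5
    """
).fetchall()

for employee_id, full_name, name_length in rows:
    print(employee_id, full_name, name_length)
```

With this sample data the shortest name, Den Li, is the one row left out of the result.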

Lesson 2.2.3. SQL Upper Function

The SQL UPPER function converts all the letters in a string into uppercase. If you
want to convert a string to lowercase, you use the LOWER function instead.

The syntax of the UPPER function is as simple as below.

UPPER(string);

If the input string is NULL, the UPPER function returns NULL, otherwise, it returns a
new string with all letters converted to uppercase.

Besides the UPPER function, some database systems provide an additional function
named UCASE, which is the same as the UPPER function. It is a case of
“there is more than one way to do it”.

UCASE(string);

SQL UPPER function examples

The following statement converts the string sql upper to SQL UPPER:

SELECT UPPER('sql upper');

UPPER
----------------------
SQL UPPER
(1 row)

Let’s take a look at the employees table in the sample database.

The following query uses the UPPER function to convert last names of employees to
uppercase.

SELECT
UPPER(last_name)
FROM
employees
ORDER BY UPPER(last_name);

The query just reads the data from the employees table and converts it on the fly.
The data in the table remains intact. To convert data to uppercase in the database
table, you use the UPDATE statement. For example, the following statement
updates the emails of employees to uppercase.

UPDATE employees
SET email =
UPPER(email);

Querying data case insensitive using the UPPER function

When you query the data using the WHERE clause, the database systems often match
data case sensitively. For example, the literal string Bruce is different from bruce.

The following query returns no result.

SELECT
employee_id,
first_name
FROM
employees
WHERE
first_name = 'BRUCE' ;

To match data case insensitively, you use the UPPER function. For example, the
following query will return a row:

SELECT
employee_id,
first_name
FROM
employees

WHERE
UPPER (first_name) ='BRUCE';

Notice that the query above scans the whole table to find the matching row. In case
the table is big, the query will be very slow.

To overcome this, some database systems provide the function-based index that
allows you to define an index based on a function of one or more columns that
results in a better query performance.
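The case-sensitive mismatch, the UPPER workaround and an index on the function can all be tried with Python's built-in sqlite3 module. This is a sketch, not the sample database: SQLite is used as a stand-in because it compares text case sensitively by default and supports indexes on expressions, the three rows are invented, and the index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, first_name TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, "Bruce"), (2, "Sarah"), (3, "bruce")],
)

# Plain comparison is case sensitive here, so nothing matches 'BRUCE'.
exact = conn.execute(
    "SELECT employee_id FROM employees WHERE first_name = 'BRUCE'"
).fetchall()

case_insensitive_sql = (
    "SELECT employee_id FROM employees "
    "WHERE UPPER(first_name) = 'BRUCE' ORDER BY employee_id"
)

# Folding the column to uppercase matches both spellings.
folded = conn.execute(case_insensitive_sql).fetchall()

# An index on the expression gives the engine a way to avoid a full scan.
conn.execute(
    "CREATE INDEX idx_employees_upper_first_name "
    "ON employees (UPPER(first_name))"
)
indexed = conn.execute(case_insensitive_sql).fetchall()
```

The index changes how the rows can be located, not which rows are returned, so folded and indexed hold the same result.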

Lesson 2.2.4. SQL Lower Function

The SQL LOWER function converts all the characters in a string into lowercase. If
you want to convert all characters in a string into uppercase, you should use the
UPPER function.

The following illustrates the syntax of the LOWER function.

LOWER(string);

The LOWER function returns a string with all characters in the lowercase format. It
returns NULL if the input string is NULL.

Some database systems such as MySQL provide the LCASE function, which is
equivalent to the LOWER function.

LCASE(string);

SQL LOWER examples

The following statement uses the LOWER function to convert a string to lowercase:

SELECT LOWER ('SQL LOWER');

lower
----------------------
sql lower
(1 row)

See the following employees table in the sample database.

The following query uses the LOWER function to return the names of departments in
lowercase.

SELECT
LOWER(department_name)
FROM
departments
ORDER BY LOWER (department_name);

The following statement updates the emails of the employees to lowercase.

UPDATE employees
SET
email = LOWER (email);

Querying data case insensitive

Standard SQL matches data case sensitively. This means that the literal string
Sarah is different from sarah when it is used as input for a query.
To query data case insensitively, you can use the LOWER function.

The following query returns an empty result set because there is no employee whose
first name is sarah.


SELECT
employee_id, first_name, last_name, email
FROM
employees
WHERE
first_name = 'sarah' ;

However, when you use the LOWER function, it returns a row.

SELECT
employee_id,
first_name,
last_name,
email
FROM
employees
WHERE
LOWER (first_name) = 'sarah' ;

Note that this query scans the whole table to get the result. For a big table, it
will be slow.

Some database systems, such as Oracle Database and PostgreSQL, support
function-based indexes, which let you create an index based on a specific function.
If you create a function-based index on LOWER(first_name), the query can use the
index to find the row very fast.

Lesson 2.2.5. SQL Substring Function

The SUBSTRING function extracts a substring that starts at a specified position with a
given length.

The following illustrates the syntax of the SUBSTRING function.

SUBSTRING(source_string, position, length);

The SUBSTRING function accepts three arguments:

• The source_string is the string from which you want to extract the substring.

• The position is the starting position where the substring begins. The first position
of the string is one (1).

• The length is the length of the substring. The length argument is optional.

Most relational database systems implement the SUBSTRING function with the same
functionality.

SQL SUBSTRING function examples

The following example returns a substring starting at position 1 with length 3.

SELECT SUBSTRING('SQLTutorial.org',1,3);

substring
-----------
SQL
(1 row)

The following statement returns a substring starting at position 4 with length 8.

SELECT SUBSTRING ('SQLTutorial.org',4,8);


substring
-----------
Tutorial
(1 row)

The following statement uses the POSITION function to return the position of the dot
character (.) in the string.

The result of the POSITION function is passed to the SUBSTRING function to find the
extension of a domain:

SELECT
SUBSTRING ('SQLTutorial.org' ,
POSITION ('.' IN 'SQLTutorial.org' ));
substring
-----------
.org
(1 row)
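The 1-based positions used by SUBSTRING and POSITION can be mapped onto Python's 0-based slicing. The two helpers below are hypothetical and written only to illustrate the indexing; they assume the position argument is 1 or greater.

```python
from typing import Optional


def substring(source: str, position: int, length: Optional[int] = None) -> str:
    """Return the part of source starting at a 1-based position."""
    start = position - 1              # SQL counts from 1, Python from 0
    if length is None:                # length is optional: take the rest
        return source[start:]
    return source[start:start + length]


def position(needle: str, haystack: str) -> int:
    """Return the 1-based position of needle in haystack, or 0 if absent."""
    return haystack.find(needle) + 1  # find() returns -1 when not found


domain = "SQLTutorial.org"
first_three = substring(domain, 1, 3)
middle = substring(domain, 4, 8)
dot_position = position(".", domain)
extension = substring(domain, dot_position)
```

The dot is the twelfth character of the string, so passing that position to substring without a length returns everything from the dot onward.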

See the following employees table in the sample database.

The following query uses the SUBSTRING function to extract the first character of
each employee's first name (the initial) and groups employees by that initial:

SELECT
SUBSTRING (first_name, 1, 1) initial ,
COUNT (employee_id )
FROM
employees
GROUP BY initial ;
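The grouping query can be tried with Python's built-in sqlite3 module. This is a sketch with assumptions: SQLite stands in for the sample database, the six first names are invented, SUBSTR is used as SQLite's spelling of SUBSTRING, and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, first_name TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        (1, "Alexander"),
        (2, "Adam"),
        (3, "Bruce"),
        (4, "Sarah"),
        (5, "Steven"),
        (6, "Shelley"),
    ],
)

# Take the first character of each first name and count employees per initial.
initial_counts = conn.execute(
    """
    SELECT SUBSTR(first_name, 1, 1) AS initial,
           COUNT(employee_id)
    FROM employees
    GROUP BY initial
    ORDER BY initial
    """
).fetchall()

for initial, count in initial_counts:
    print(initial, count)
```

Each output row pairs an initial with the number of employees whose first name starts with it.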

Assessment:

1. Write a query to display the current date. Label this column Date

2. Display the last name, hire date and day of the week on which the employee
started. Label the column Day. Order the results by the day of the week, starting
with Monday.

3. Write a query to get the job_id and related employee's id.

Sample table: employees

4. Write a code to return the length of the text in the “CustomerName” column, in
bytes.

5. Convert the text in “CustomerName” to upper-case.

6. Convert the text in “CustomerName” to lower-case.

7. Extract a substring from the text in a column (start at position 2, extract 5
characters).
