CS232L-Lab Manual
Objective
The objective of this session is to get a basic introduction to DBMS and a guide to installing
PostgreSQL. We will also cover basic DDL and DML commands.
Instructions
• Open the handout/lab manual in front of a computer in the lab during the session.
• Practice each new command by completing the examples and exercise.
• Turn-in the answers for all the exercise problems as your lab report.
• When answering problems, indicate the commands you entered and the output displayed.
CS232-L – LAB 1
What is DBMS?
A database management system (DBMS) is system software for creating and managing
databases. The DBMS provides users and programmers with a systematic way to create,
retrieve, update and manage data. A DBMS makes it possible for end users to create, read,
update and delete data in a database. The DBMS essentially serves as an interface between the
database and end users or application programs, ensuring that data is consistently organized
and remains easily accessible.
The DBMS manages three important things: the data, the database engine that allows data to
be accessed, locked and modified, and the database schema, which defines the database’s
logical structure. These three foundational elements help provide concurrency, security, data
integrity and uniform administration procedures. Typical database administration tasks
supported by the DBMS include change management, performance monitoring/tuning and
backup and recovery. Many database management systems are also responsible for automated
rollbacks, restarts and recovery as well as the logging and auditing of activity. The DBMS is
perhaps most useful for providing a centralized view of data that can be accessed by multiple
users, from multiple locations, in a controlled manner.
The DBMS can offer both logical and physical data independence. That means it can protect
users and applications from needing to know where data is stored or having to be concerned
about changes to the physical structure of data (storage and hardware). As long as programs
use the application programming interface (API) for the database that is provided by the
DBMS, developers won't have to modify programs just because changes have been made to
the database.
What is PostgreSQL?
History of PostgreSQL
The project was originally named POSTGRES, in reference to the older Ingres database,
which was also developed at Berkeley. The goal of the POSTGRES project was to add the
minimal features needed to support multiple data types.
In 1996, the POSTGRES project was renamed to PostgreSQL to clearly illustrate its support
for SQL. Today, PostgreSQL is commonly abbreviated as Postgres.
Originally, PostgreSQL was designed to run on UNIX-like platforms. It later
evolved to run on various platforms such as Windows, macOS, and Solaris.
LAPP stands for Linux, Apache, PostgreSQL, and PHP (or Python and Perl). PostgreSQL is
primarily used as a robust back-end database that powers many dynamic websites and web
applications.
Large corporations and startups alike use PostgreSQL as primary databases to support their
applications and products.
3) Geospatial database
PostgreSQL with the PostGIS extension supports geospatial databases for geographic
information systems (GIS).
Language support
• Python
• Java
• C#
• C/C++
• Ruby
• JavaScript (Node.js)
• Perl
• Go
• Tcl
PostgreSQL has many advanced features that other enterprise-class database management
systems offer, such as:
• User-defined types
• Table inheritance
• Sophisticated locking mechanism
• Foreign key referential integrity
• Views, rules, subqueries
• Nested transactions (savepoints)
• Multi-version concurrency control (MVCC)
• Asynchronous replication
PostgreSQL is designed to be extensible. PostgreSQL allows you to define your own data
types, index types, functional languages, etc.
If you don’t like any part of the system, you can always develop a custom plugin to enhance it
to meet your requirements e.g., adding a new optimizer.
Many companies have built products and solutions based on PostgreSQL. Some featured
companies are Apple, Fujitsu, Red Hat, Cisco, Juniper Networks, Instagram, etc.
PostgreSQL was developed for UNIX-like platforms; however, it was designed to be portable.
It means that PostgreSQL can also run on other platforms such as macOS, Solaris, and
Windows.
First, you need to go to the download page of PostgreSQL installers on the EnterpriseDB website.
Step 1. Double click on the installer file, an installation wizard will appear and guide you
through multiple steps where you can choose different options that you would like to have in
PostgreSQL.
Step 3. Specify the installation folder: choose your own or keep the default folder suggested by
the PostgreSQL installer, then click the Next button.
For this lab, you don’t need to install Stack Builder, so feel free to uncheck
it and click the Next button to select the data directory:
Step 5. Select the database directory to store the data or accept the default folder, then click
the Next button to go to the next step:
PostgreSQL runs as a service in the background under a service account named postgres. If
you already created a service account with the name postgres, you need to provide the
password of that account in the following window.
After entering the password, you need to retype it to confirm and click the Next button:
Step 7. Enter a port number on which the PostgreSQL database server will listen. The default
port of PostgreSQL is 5432. You need to make sure that no other applications are using this
port.
Step 8. Choose the default locale used by the PostgreSQL database. If you leave it as default
locale, PostgreSQL will use the operating system locale. After that click the Next button.
Step 9. The setup wizard will show the summary information of PostgreSQL. You need to
review it and click the Next button if everything is correct. Otherwise, you need to click the
Back button to change the configuration accordingly.
Now, you’re ready to install PostgreSQL on your computer. Click the Next button to begin
installing PostgreSQL.
Step 10. Click the Finish button to complete the PostgreSQL installation.
The quick way to verify the installation is through the psql program.
First, click the psql application to launch it. The psql command-line program will display.
Second, enter all the necessary information such as the server, database, port, username, and
password. To accept the default, you can press Enter. Note that you should provide the
password that you entered while installing PostgreSQL.
Server [localhost]:
Database [postgres]:
Port [5432]:
Username [postgres]:
Password for user postgres:
psql (12.3)
WARNING: Console code page (437) differs from Windows code page (1252)
8-bit characters might not work correctly. See psql reference
page "Notes for Windows users" for details.
Type "help" for help.
postgres=#
Third, issue the command SELECT version(); you will see the following output:
psql is an interactive terminal program provided by PostgreSQL. It allows you to interact with
the PostgreSQL database server such as executing SQL statements and managing database
objects.
The following steps show you how to connect to the PostgreSQL database server via
the psql program:
First, launch the psql program and connect to the PostgreSQL Database Server using
the postgres user:
Second, enter all the information such as Server, Database, Port, Username, and Password. If
you press Enter, the program will use the default value specified in the square brackets [] and
move the cursor to a new line. For example, localhost is the default database server. In the
step for entering the password for user postgres, you need to enter the password for the
postgres user that you chose during the PostgreSQL installation.
Third, interact with the PostgreSQL Database Server by issuing an SQL statement. The
following statement returns the current version of PostgreSQL:
SELECT version();
Please do not forget to end the statement with a semicolon (;). After pressing Enter, psql will
return the current PostgreSQL version on your system.
The second way to connect to a database is by using a pgAdmin application. The pgAdmin
application allows you to interact with the PostgreSQL database server via an intuitive user
interface.
The following illustrates how to connect to a database using pgAdmin GUI application:
The pgAdmin application will launch on the web browser as shown in the following picture:
Second, right-click the Servers node and select the Create > Server… menu item to create a server.
Third, enter the server name e.g., PostgreSQL and click the Connection tab:
Fourth, enter the host and password for the postgres user and click the Save button:
Fifth, click on the Servers node to expand the server. By default, PostgreSQL has a database
named postgres as shown below:
Sixth, open the query tool by choosing the menu item Tools > Query Tool or by clicking the
lightning icon.
Seventh, enter the query in the Query Editor and click the Execute button. You will see the
result of the query displayed in the Data Output tab:
The DVD rental database represents the business processes of a DVD rental store. The DVD
rental database has many objects, including:
• 15 tables
• 1 trigger
• 7 views
• 8 functions
• 1 domain
• 13 sequences
>psql
Second, enter the account’s information to log in to the PostgreSQL database server. You can
use the default value provided by psql by pressing the Enter key. However, for the
password, you need to enter the one that you provided during PostgreSQL installation.
Server [localhost]:
Database [postgres]:
Port [5432]:
Username [postgres]:
Password for user postgres:
postgres=# exit
After that, use the pg_restore tool to load data into the dvdrental database:
In this command:
• The -U postgres option specifies the postgres user to log in to the PostgreSQL database server.
• The -d dvdrental option specifies the target database to load.
Finally, enter the password for the postgres user and press Enter.
Password:
It takes a few seconds to load data stored in the dvdrental.tar file into the dvdrental database.
The following shows you, step by step, how to use the pgAdmin tool to restore the sample
database from the database file:
First, launch the pgAdmin tool and connect to the PostgreSQL server.
Second, right-click the Databases node and select the Create > Database… menu option:
Third, enter the database name dvdrental and click the Save button:
You’ll see the new empty database created under the Databases node:
Fourth, right-click on the dvdrental database and choose Restore… menu item to restore the
database from the downloaded database file:
Fifth, enter the path to the sample database file e.g., c:\sampledb\dvdrental.tar and click
the Restore button:
Sixth, the restoration process will complete in a few seconds and show the following dialog
once it completes:
Finally, open the dvdrental database from the object browser panel. You will find tables in
the public schema and other database objects as shown in the following picture:
• Boolean
• Character types such as char, varchar, and text.
• Numeric types such as integer and floating-point number.
• Temporal types such as date, time, timestamp, and interval
• UUID for storing Universally Unique Identifiers
• Array for storing arrays of strings, numbers, etc.
• JSON for storing JSON data
• hstore for storing key-value pairs
• Special types such as network address and geometric data.
Boolean
A Boolean data type can hold one of three possible values: true, false or null. You
use boolean or bool keyword to declare a column with the Boolean data type.
When you insert data into a Boolean column, PostgreSQL converts it to a Boolean value.
When you select data from a Boolean column, PostgreSQL converts the values back, e.g., t to
true, f to false and space to null.
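As a minimal sketch of the Boolean behavior described above, the following statements use a hypothetical stock_availability table (the table and column names are assumptions for illustration and are not part of the lab's sample database):

```sql
-- Hypothetical temporary table; it is dropped automatically when the session ends.
CREATE TEMP TABLE stock_availability (
    product_id INT PRIMARY KEY,
    available  BOOLEAN          -- nullable, so NULL is also allowed
);

INSERT INTO stock_availability (product_id, available)
VALUES
    (100, TRUE),
    (200, 'no'),    -- the literal 'no' is converted to false
    (300, NULL);    -- availability unknown

-- A Boolean column can be used directly as a condition.
-- Only product 100 satisfies it; false and NULL rows are filtered out.
SELECT product_id
FROM stock_availability
WHERE available;
```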
Character
PostgreSQL provides three character data types: CHAR(n), VARCHAR(n), and TEXT
• CHAR(n) is the fixed-length, space-padded character type. If you insert a string that is
shorter than the length of the column, PostgreSQL pads it with spaces. If you insert a string
that is longer than the length of the column, PostgreSQL will issue an error.
• VARCHAR(n) is the variable-length character string. With VARCHAR(n), you can store
up to n characters. PostgreSQL does not pad spaces when the stored string is shorter
than the length of the column.
• TEXT is the variable-length character string. Theoretically, text data is a character
string with unlimited length.
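The three character types can be compared side by side. This is only a sketch; the character_demo table and its columns are made up for illustration:

```sql
-- Hypothetical temporary table with one column of each character type.
CREATE TEMP TABLE character_demo (
    code  CHAR(3),       -- always stored as 3 characters, padded with spaces
    label VARCHAR(10),   -- up to 10 characters, no padding
    note  TEXT           -- no declared length limit
);

INSERT INTO character_demo (code, label, note)
VALUES ('A', 'short', 'A note of any length');

-- The next statement would fail because the value is longer than 10 characters:
-- INSERT INTO character_demo (label) VALUES ('much_too_long');

SELECT code, label, note
FROM character_demo;
```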
Numeric
• integers
• floating-point numbers
Integer
• Small integer (SMALLINT) is a 2-byte signed integer that has a range from -32,768 to
32,767.
• Integer (INT) is a 4-byte integer that has a range from -2,147,483,648 to
2,147,483,647.
• Serial is the same as integer except that PostgreSQL will automatically generate and
populate values into the SERIAL column. This is similar
to AUTO_INCREMENT column in MySQL or AUTOINCREMENT column in
SQLite.
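The integer types above can be tried out with a small table. The ticket table and its columns below are hypothetical and serve only to show that a SERIAL column is filled in automatically:

```sql
-- Hypothetical temporary table using the three integer types.
CREATE TEMP TABLE ticket (
    ticket_id   SERIAL PRIMARY KEY,  -- generated automatically
    seats       SMALLINT,
    price_cents INT
);

-- ticket_id is omitted; PostgreSQL assigns the next sequence value.
INSERT INTO ticket (seats, price_cents)
VALUES (2, 1500),
       (4, 3000);

SELECT ticket_id, seats, price_cents
FROM ticket;
```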
Floating-point number
Temporal data types
The temporal data types allow you to store date and/or time data. PostgreSQL has five main
temporal data types:
TIMESTAMPTZ is PostgreSQL’s extension to the SQL standard’s temporal data types.
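The list of temporal types did not survive in this handout. In PostgreSQL the five main ones are DATE, TIME, TIMESTAMP, TIMESTAMPTZ and INTERVAL. The sketch below uses a hypothetical event_demo table with one column of each type:

```sql
-- Hypothetical temporary table; names and values are for illustration only.
CREATE TEMP TABLE event_demo (
    event_date DATE,
    starts_at  TIME,
    created_at TIMESTAMP,     -- date and time without time zone
    created_tz TIMESTAMPTZ,   -- date and time with time zone
    duration   INTERVAL
);

INSERT INTO event_demo (event_date, starts_at, created_at, created_tz, duration)
VALUES ('2024-03-15', '09:30:00', '2024-03-15 09:30:00',
        '2024-03-15 09:30:00+00', '2 hours 30 minutes');

-- Adding an interval to a time yields a time.
SELECT event_date,
       starts_at + duration AS ends_at
FROM event_demo;
```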
Arrays
In PostgreSQL, you can store an array of strings, an array of integers, etc., in array columns.
The array comes in handy in some situations e.g., storing days of the week, months of the
year.
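As an illustration of an array column, the following sketch stores the working days of an employee. The schedule_demo table is an assumption made for this example:

```sql
-- Hypothetical temporary table with a text array column.
CREATE TEMP TABLE schedule_demo (
    employee  TEXT,
    work_days TEXT[]
);

INSERT INTO schedule_demo (employee, work_days)
VALUES ('Alice', ARRAY['Mon', 'Wed', 'Fri']);

-- PostgreSQL arrays are 1-based; ANY tests membership in the array.
SELECT employee,
       work_days[1] AS first_day
FROM schedule_demo
WHERE 'Wed' = ANY (work_days);
```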
JSON
PostgreSQL provides two JSON data types: JSON and JSONB for storing JSON data.
The JSON data type stores plain JSON data that must be reparsed each time it is processed,
while the JSONB data type stores JSON data in a binary format which is faster to process but
slower to insert. In addition, JSONB supports indexing, which can be an advantage.
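A short sketch of storing and querying JSONB data follows. The order_demo table and the keys inside the JSON document are hypothetical:

```sql
-- Hypothetical temporary table with a JSONB column.
CREATE TEMP TABLE order_demo (
    order_id SERIAL PRIMARY KEY,
    info     JSONB
);

INSERT INTO order_demo (info)
VALUES ('{"customer": "Alice", "items": 3}');

-- The ->> operator extracts a field as text; cast it to compare as a number.
SELECT info ->> 'customer' AS customer
FROM order_demo
WHERE (info ->> 'items')::INT > 2;
```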
UUID
The UUID data type allows you to store Universally Unique Identifiers as defined by RFC 4122.
UUID values offer a stronger uniqueness guarantee than SERIAL and can be used to hide sensitive
data exposed to the public, such as id values in a URL.
Besides the primitive data types, PostgreSQL also provides several special data types
related to geometric and network data:
• box – a rectangular box.
• line – a set of points.
• point – a geometric pair of numbers.
• lseg – a line segment.
• polygon – a closed geometric shape.
• inet – an IPv4 or IPv6 address.
• macaddr – a MAC address.
• VIEW
• DROP VIEW
A relational database consists of multiple related tables. A table consists of rows and columns.
Tables allow you to store structured data like customers, products, employees, etc.
To create a new table, you use the CREATE TABLE statement. The following illustrates the
basic syntax of the CREATE TABLE statement:
In this syntax:
• First, specify the name of the table after the CREATE TABLE keywords.
• Second, creating a table that already exists will result in an error. The IF NOT
EXISTS option allows you to create the new table only if it does not exist. When you
use the IF NOT EXISTS option and the table already exists, PostgreSQL issues a
notice instead of the error and skips creating the new table.
• Third, specify a comma-separated list of table columns. Each column consists of the
column name, the kind of data that column stores, the length of data, and the column
constraint. The column constraints specify rules that data stored in the column must
follow. For example, the not-null constraint ensures that values in the column cannot
be NULL. The column constraints include not null, unique, primary key, check, and
foreign key constraints.
• Finally, specify the table constraints including primary key, foreign key, and check
constraints.
Note that some table constraints can also be defined as column constraints, such as primary key,
foreign key, check, and unique constraints.
Constraints
• CHECK – a CHECK constraint ensures the data must satisfy a boolean expression.
• FOREIGN KEY – ensures values in a column or a group of columns from a table
exist in a column or group of columns in another table. Unlike the primary key, a
table can have many foreign keys.
Table constraints are similar to column constraints except that they are applied to more than
one column.
We will create a new table called accounts that has the following columns:
The following statement creates the roles table that consists of two
columns: role_id and role_name:
The following statement creates the account_roles table that has three
columns: user_id, role_id and grant_date.
The primary key of the account_roles table consists of two columns: user_id and role_id,
therefore, we have to define the primary key constraint as a table constraint.
Because the user_id column references the user_id column in the accounts table, we need to
define a foreign key constraint for the user_id column:
The role_id column references the role_id column in the roles table, so we also need to define a
foreign key constraint for the role_id column.
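The three tables described above can be sketched as follows. The columns user_id, role_id, role_name and grant_date come from the text; the remaining accounts columns (username, email, created_on) and all column sizes are assumptions added to make the example complete:

```sql
-- accounts: only user_id is named in the text; the other columns are assumed.
CREATE TABLE IF NOT EXISTS accounts (
    user_id    SERIAL PRIMARY KEY,
    username   VARCHAR(50)  UNIQUE NOT NULL,
    email      VARCHAR(255) UNIQUE NOT NULL,
    created_on TIMESTAMP NOT NULL
);

-- roles: two columns, role_id and role_name.
CREATE TABLE IF NOT EXISTS roles (
    role_id   SERIAL PRIMARY KEY,
    role_name VARCHAR(255) UNIQUE NOT NULL
);

-- account_roles: the composite primary key and both foreign keys
-- are written as table constraints.
CREATE TABLE IF NOT EXISTS account_roles (
    user_id    INT NOT NULL,
    role_id    INT NOT NULL,
    grant_date TIMESTAMP,
    PRIMARY KEY (user_id, role_id),
    FOREIGN KEY (user_id) REFERENCES accounts (user_id),
    FOREIGN KEY (role_id) REFERENCES roles (role_id)
);
```

Because the primary key spans two columns, it cannot be written as a column constraint, which is why it appears after the column list.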
The following shows the relationship between the accounts, roles, and account_roles tables:
The following illustrates the basic syntax of the ALTER TABLE statement:
• Add a column
• Drop a column
• Change the data type of a column
• Rename a column
• Set a default value for the column.
• Add a constraint to a column.
• Rename a table
To add a new column to a table, you use ALTER TABLE ADD COLUMN statement:
To drop a column from a table, you use ALTER TABLE DROP COLUMN statement:
To rename a column, you use the ALTER TABLE RENAME COLUMN TO statement:
To change a default value of the column, you use ALTER TABLE ALTER COLUMN SET
DEFAULT or DROP DEFAULT:
To change the NOT NULL constraint, you use ALTER TABLE ALTER COLUMN statement:
To add a CHECK constraint, you use ALTER TABLE ADD CHECK statement:
Generally, to add a constraint to a table, you use ALTER TABLE ADD CONSTRAINT statement:
The following illustrates the most basic syntax of the INSERT statement:
In this syntax:
• First, specify the name of the table (table_name) that you want to insert data into after
the INSERT INTO keywords and a list of comma-separated columns (column1,
column2, ....).
The INSERT statement returns a command tag with the following form:
OID is an object identifier. PostgreSQL uses the OID internally as a primary key for its
system tables. Typically, the INSERT statement returns OID with value 0. The count is the
number of rows that the INSERT statement inserted successfully.
RETURNING clause
The INSERT statement also has an optional RETURNING clause that returns the information
of the inserted row.
If you want to return the entire inserted row, you use an asterisk (*) after
the RETURNING keyword:
If you want to return just some information of the inserted row, you can specify one or more
columns after the RETURNING clause.
For example, the following statement returns the id of the inserted row:
To rename the returned value, you use the AS keyword followed by the name of the output.
For example:
Note that you will learn how to create a new table in the subsequent tutorial. In this tutorial,
you just need to execute it to create a new table.
The following statement inserts a new row into the links table:
INSERT 0 1
To insert character data, you enclose it in single quotes ('), for example 'PostgreSQL Tutorial'.
If you omit required columns in the INSERT statement, PostgreSQL will issue an error. In
case you omit an optional column, PostgreSQL will use the column default value for insert.
In this example, the description is an optional column because it doesn’t have a NOT
NULL constraint. Therefore, PostgreSQL uses NULL to insert into the description column.
PostgreSQL automatically generates a sequential number for the serial column so you do not
have to supply a value for the serial column in the INSERT statement.
The following SELECT statement shows the contents of the links table:
If you want to insert a string that contains a single quote (') such as O'Reilly Media, you have
to use an additional single quote (') to escape it. For example:
Output:
INSERT 0 1
To insert a date value into a column with the DATE type, you use the date in the
format 'YYYY-MM-DD'.
The following statement inserts a new row with a specified date into the links table:
Output:
INSERT 0 1
4) PostgreSQL INSERT - Getting the last insert id
To get the last insert id from the inserted row, you use the RETURNING clause of
the INSERT statement.
For example, the following statement inserts a new row into the links table and returns the last
insert id:
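The INSERT examples in this section can be sketched end to end as follows. The links table, its description column, the serial id and the sample strings are mentioned in the text; the other column names (url, name, last_update), the column sizes and the example.com URLs are assumptions:

```sql
-- Start from a clean slate so the sketch can be re-run.
DROP TABLE IF EXISTS links;

CREATE TABLE links (
    id          SERIAL PRIMARY KEY,
    url         VARCHAR(255) NOT NULL,
    name        VARCHAR(255) NOT NULL,
    description VARCHAR(255),          -- optional: no NOT NULL constraint
    last_update DATE
);

-- Plain insert; psql answers with the command tag INSERT 0 1.
INSERT INTO links (url, name)
VALUES ('https://www.example.com/tutorial', 'PostgreSQL Tutorial');

-- A single quote inside a string is escaped by doubling it.
INSERT INTO links (url, name)
VALUES ('https://www.example.com/oreilly', 'O''Reilly Media');

-- Dates are written in the 'YYYY-MM-DD' format.
INSERT INTO links (url, name, last_update)
VALUES ('https://www.example.com/search', 'Search', '2013-06-01');

-- RETURNING hands back the generated id; AS renames the output column.
INSERT INTO links (url, name)
VALUES ('https://www.example.com/docs', 'Docs')
RETURNING id AS last_id;

-- description is NULL for every row because it was never supplied.
SELECT * FROM links;
```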
Output:
Let’s create a new table called links for practicing with the ALTER TABLE statement.
To add a new column named active, you use the following statement:
To change the name of the title column to link_title, you use the following statement:
The following statement adds a new column named target to the links table:
To set _blank as the default value for the target column in the links table, you use the following
statement:
If you insert the new row into the links table without specifying a value for the target column,
the target column will take the _blank as the default value. For example:
The following statement adds a CHECK condition to the target column so that the target column
only accepts the following values: _self, _blank, _parent, and _top:
If you attempt to insert a new row that violates the CHECK constraint set for the target column,
PostgreSQL will issue an error as shown in the following example:
The following statement adds a UNIQUE constraint to the url column of the links table:
The following statement attempts to insert the url that already exists:
The following statement changes the name of the links table to urls:
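The ALTER TABLE walkthrough above can be sketched as one script. The column names title, url, active and target come from the text; link_id, the column sizes, the constraint name unique_url and the sample row are assumptions. Statements that are expected to fail are left commented out:

```sql
-- Start from a clean slate so the sketch can be re-run.
DROP TABLE IF EXISTS links;
DROP TABLE IF EXISTS urls;

CREATE TABLE links (
    link_id SERIAL PRIMARY KEY,
    title   VARCHAR(512)  NOT NULL,
    url     VARCHAR(1024) NOT NULL
);

-- Add a column, then rename an existing one.
ALTER TABLE links ADD COLUMN active BOOLEAN;
ALTER TABLE links RENAME COLUMN title TO link_title;

-- Add the target column and give it a default value.
ALTER TABLE links ADD COLUMN target VARCHAR(10);
ALTER TABLE links ALTER COLUMN target SET DEFAULT '_blank';

-- target is not supplied, so it takes the default '_blank'.
INSERT INTO links (link_title, url)
VALUES ('PostgreSQL Tutorial', 'https://www.example.com/tutorial');

-- Restrict target to four allowed values.
ALTER TABLE links
    ADD CHECK (target IN ('_self', '_blank', '_parent', '_top'));

-- Would fail: 'whatever' violates the CHECK constraint.
-- INSERT INTO links (link_title, url, target)
-- VALUES ('Other', 'https://www.example.com/other', 'whatever');

-- Make url unique.
ALTER TABLE links ADD CONSTRAINT unique_url UNIQUE (url);

-- Would fail: this url already exists.
-- INSERT INTO links (link_title, url)
-- VALUES ('Duplicate', 'https://www.example.com/tutorial');

-- Finally, rename the table.
ALTER TABLE links RENAME TO urls;
```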
In this syntax:
• First, specify the name of the table that you want to drop after the DROP
TABLE keywords.
• Second, use the IF EXISTS option to remove the table only if it exists.
If you remove a table that does not exist, PostgreSQL issues an error. To avoid this situation,
you can use the IF EXISTS option.
In case the table that you want to remove is used in other objects such as views, triggers,
functions, and stored procedures, the DROP TABLE cannot remove the table. In this case,
you have two options:
• The CASCADE option allows you to remove the table and its dependent objects.
• The RESTRICT option rejects the removal if any object depends on the table.
The RESTRICT option is the default if you don’t explicitly specify it in the DROP
TABLE statement.
To remove multiple tables at once, you can place a comma-separated list of tables after
the DROP TABLE keywords:
Note that you need to have the roles of the superuser, schema owner, or table owner in order
to drop tables.
Let’s take some examples of using the PostgreSQL DROP TABLE statement
PostgreSQL issues an error because the author table does not exist.
To avoid the error, you can use the IF EXISTS option like this.
As can be seen clearly from the output, PostgreSQL issued a notice instead of an error.
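The two statements discussed here are not reproduced in this handout. Assuming no table named author exists yet, they would look like this:

```sql
-- Without IF EXISTS, dropping a missing table raises an error:
-- DROP TABLE author;

-- With IF EXISTS, PostgreSQL only issues a notice and carries on.
DROP TABLE IF EXISTS author;
```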
The following statement uses the DROP TABLE to drop the author table:
Because the constraint on the page table depends on the author table, PostgreSQL issues an
error message:
In this case, you need to remove all dependent objects first before dropping the author table or
use the CASCADE option as follows:
PostgreSQL removes the author table as well as the constraint in the page table.
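A minimal sketch of this scenario follows. The table names author and page come from the text; the column definitions are assumptions chosen so that page has a foreign key pointing at author:

```sql
-- Start from a clean slate so the sketch can be re-run.
DROP TABLE IF EXISTS page;
DROP TABLE IF EXISTS author;

CREATE TABLE author (
    author_id INT PRIMARY KEY,
    firstname VARCHAR(50),
    lastname  VARCHAR(50)
);

CREATE TABLE page (
    page_id   SERIAL PRIMARY KEY,
    title     VARCHAR(255) NOT NULL,
    author_id INT NOT NULL,
    FOREIGN KEY (author_id) REFERENCES author (author_id)
);

-- Would fail: the foreign key constraint on page depends on author.
-- DROP TABLE author;

-- CASCADE drops author together with the dependent constraint on page.
-- The page table itself and its rows are kept.
DROP TABLE author CASCADE;
```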
If the DROP TABLE statement removes the dependent objects of the table that is being
dropped, it will issue a notice like this:
The following statements create two tables for the demo purposes:
PostgreSQL SELECT
One of the most common tasks, when you work with the database, is to query data from tables
by using the SELECT statement.
The SELECT statement is one of the most complex statements in PostgreSQL. It has many
clauses that you can use to form a flexible query.
Because of its complexity, we will break it down into many shorter and easy-to-understand
tutorials so that you can learn about each clause faster.
Let’s start with the basic form of the SELECT statement that retrieves data from a single table.
SELECT
select_list
FROM
table_name;
• First, specify a select list that can be a column or a list of columns in a table from
which you want to retrieve data. If you specify a list of columns, you need to place a
comma (,) between two columns to separate them. If you want to select data from all
the columns of the table, you can use an asterisk (*) shorthand instead of specifying all
the column names. The select list may also contain expressions or literal values.
• Second, specify the name of the table from which you want to query data after
the FROM keyword.
The FROM clause is optional. If you do not query data from any table, you can omit
the FROM clause in the SELECT statement.
PostgreSQL evaluates the FROM clause before the SELECT clause in the SELECT statement:
Note that the SQL keywords are case-insensitive. It means that SELECT is equivalent
to select or Select. By convention, we will use all the SQL keywords in uppercase to make the
queries easier to read.
We will use the following customer table in the sample database for the demonstration.
1) Using PostgreSQL SELECT statement to query data from one column example
This example uses the SELECT statement to find the first names of all customers from
the customer table:
Notice that we added a semicolon (;) at the end of the SELECT statement. The semicolon is not
a part of the SQL statement. It is used to signal PostgreSQL the end of an SQL statement. The
semicolon is also used to separate two SQL statements.
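The query for this first example is not shown in the handout. Assuming the dvdrental sample database is loaded and you are connected to it, it would be:

```sql
-- Return the first name of every customer.
SELECT first_name
FROM customer;
```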
2) Using PostgreSQL SELECT statement to query data from multiple columns example
Suppose you just want to know the first name, last name and email of customers, you can
specify these column names in the SELECT clause as shown in the following query:
SELECT
first_name,
last_name,
email
FROM
customer;
3) Using PostgreSQL SELECT statement to query data from all columns of a table
example
The following query uses the SELECT statement to select data from all columns of
the customer table:
In this example, we used an asterisk (*) in the SELECT clause, which is a shorthand for all
columns. Instead of listing all columns in the SELECT clause, we just used the asterisk (*) to
save some typing.
However, it is not a good practice to use the asterisk (*) in the SELECT statement when you
embed SQL statements in the application code like Python, Java, Node.js, or PHP due to the
following reasons:
1. Database performance. Suppose you have a table with many columns and a lot of data,
the SELECT statement with the asterisk (*) shorthand will select data from all the
columns of the table, which may not be necessary to the application.
2. Application performance. Retrieving unnecessary data from the database increases the
traffic between the database server and application server. In consequence, your
applications may be slower to respond and less scalable.
Because of these reasons, it is a good practice to explicitly specify the column names in
the SELECT clause whenever possible to get only necessary data from the database.
And you should only use the asterisk (*) shorthand for the ad-hoc queries that examine data
from the database.
The following example uses the SELECT statement to return full names and emails of all
customers:
SELECT
first_name || ' ' || last_name,
email
FROM
customer;
Output:
The following example uses the SELECT statement with an expression. It omits
the FROM clause:
SELECT 5 * 3;
CS232L – Database Management System
Objective
The objective of this session is to
Instructions
• Open the handout/lab manual in front of a computer in the lab during the session.
• Practice each new command by completing the examples and exercise.
• Turn-in the answers for all the exercise problems as your lab report.
• When answering problems, indicate the commands you entered and the output displayed.
CS232-L – LAB 2
The WHERE clause appears right after the FROM clause of the SELECT statement.
The WHERE clause uses the condition to filter the rows returned from the SELECT clause.
The condition must evaluate to true, false, or unknown. It can be a boolean expression or a
combination of boolean expressions using the AND and OR operators.
The query returns only rows that satisfy the condition in the WHERE clause. In other words,
only rows that cause the condition to evaluate to true will be included in the result set.
PostgreSQL evaluates the WHERE clause after the FROM clause and before the SELECT:
If you use column aliases in the SELECT clause, you cannot use them in the WHERE clause.
Besides the SELECT statement, you can use the WHERE clause in
the UPDATE and DELETE statements to specify rows to be updated or deleted.
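A basic WHERE query might look like the sketch below. It assumes the dvdrental sample database is loaded and uses the customer table's first_name and last_name columns; the value 'Jamie' is just an example:

```sql
-- FROM is evaluated first, then WHERE filters the rows,
-- and finally SELECT picks the columns to return.
SELECT last_name,
       first_name
FROM customer
WHERE first_name = 'Jamie';
```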
The following example finds customers whose first name and last name are Jamie and Rice by
using the AND logical operator to combine two Boolean expressions:
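The query itself is missing from this handout; a version consistent with the description, assuming the dvdrental customer table, is:

```sql
-- Both conditions must be true for a row to be returned.
SELECT last_name,
       first_name
FROM customer
WHERE first_name = 'Jamie'
  AND last_name = 'Rice';
```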
Output:
Output:
You can use several conditions in one WHERE clause using the AND and OR operators.
Suppose we have an Employee table. The query for this example returns the
employee number, employee name, job and salary from the employee table where the salary of
the employee is greater than or equal to 1100 and the job title is 'CLERK'.
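The Employee table and the query are not reproduced in this handout. The sketch below builds a small stand-in table; the column names (empno, ename, job, sal) and the sample rows are assumptions made for illustration:

```sql
-- Hypothetical temporary employee table with made-up rows.
CREATE TEMP TABLE employee (
    empno INT PRIMARY KEY,
    ename VARCHAR(20),
    job   VARCHAR(20),
    sal   NUMERIC(7, 2)
);

INSERT INTO employee (empno, ename, job, sal)
VALUES (1, 'SMITH',  'CLERK',    800),
       (2, 'ADAMS',  'CLERK',    1100),
       (3, 'MILLER', 'CLERK',    1300),
       (4, 'ALLEN',  'SALESMAN', 1600);

-- Clerks earning 1100 or more: ADAMS and MILLER in this sample data.
SELECT empno, ename, job, sal
FROM employee
WHERE sal >= 1100
  AND job = 'CLERK';
```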
Output:
Using the WHERE clause with the not equal operator (<>) example
This example finds customers whose first names are Brand or Brandy and last names are
not Motley:
Output:
If you want to match a string with any string in a list, you can use the IN operator.
For example, the following statement returns customers whose first name is Ann, or Anne,
or Annie:
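Assuming the dvdrental customer table, the IN query described here can be written as:

```sql
-- IN matches first_name against each value in the list.
SELECT first_name,
       last_name
FROM customer
WHERE first_name IN ('Ann', 'Anne', 'Annie');
```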
Output:
Output:
The % is called a wildcard that matches any string. The 'Ann%' pattern matches any string
that starts with 'Ann'.
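A query using this pattern, again assuming the dvdrental customer table, might look like:

```sql
-- LIKE with the % wildcard: first names beginning with 'Ann'.
SELECT first_name,
       last_name
FROM customer
WHERE first_name LIKE 'Ann%';
```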
Using the WHERE clause with the BETWEEN operator example
The following example finds customers whose first names start with the letter A and contain
3 to 5 characters by using the BETWEEN operator.
In this example, we used the LENGTH() function to get the number of characters of an input
string.
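The BETWEEN example is not shown in the handout. One way to write it, assuming the dvdrental customer table, is:

```sql
-- First names that start with 'A' and are 3 to 5 characters long.
-- BETWEEN is inclusive at both ends.
SELECT first_name,
       LENGTH(first_name) AS name_length
FROM customer
WHERE first_name LIKE 'A%'
  AND LENGTH(first_name) BETWEEN 3 AND 5
ORDER BY name_length;
```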
Null value
If a row lacks the data value for a particular column, that value is said to be null, or to contain
null. A null value is a value that is unavailable, unassigned, unknown, or inapplicable. A null
value is not the same as zero or a space. Zero is a number and space is a character. If any
column value in an arithmetic expression is null, the result is null. For example, if you attempt
to divide a number by zero, you get an error. However, if you divide a number by null, the
result is null (unknown).
A column alias allows you to assign a column or an expression in the select list of
a SELECT statement a temporary name. The column alias exists temporarily during the
execution of the query.
In this syntax, the column_name is assigned an alias alias_name. The AS keyword is optional
so you can omit it like this:
The following syntax illustrates how to set an alias for an expression in the SELECT clause:
The main purpose of column aliases is to make the headings of the output of a query more
meaningful.
This query assigned the surname as the alias of the last_name column:
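The alias forms discussed above can be sketched as follows, assuming the customer table of the dvdrental sample database. The alias full_name is a name picked for this illustration.

```sql
-- Alias written with the AS keyword
SELECT first_name, last_name AS surname
FROM customer;

-- Same alias with AS omitted
SELECT first_name, last_name surname
FROM customer;

-- Alias for an expression in the select list
SELECT first_name || ' ' || last_name AS full_name
FROM customer;
```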
The DISTINCT clause is used in the SELECT statement to remove duplicate rows from a
result set. The DISTINCT clause keeps one row for each group of duplicates.
The DISTINCT clause can be applied to one or more columns in the select list of
the SELECT statement.
In this statement, the values in the column1 column are used to evaluate the duplicate.
If you specify multiple columns, the DISTINCT clause will evaluate the duplicate based on
the combination of values of these columns.
Let’s create a new table called distinct_demo and insert data into it for practicing
the DISTINCT clause.
First, use the following CREATE TABLE statement to create the distinct_demo table that
consists of three columns: id, bcolor and fcolor.
Second, insert some rows into the distinct_demo table using the following INSERT statement:
Third, query the data from the distinct_demo table using the SELECT statement:
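The three steps could look like the sketch below. The column types and the sample rows are assumptions made for illustration; only the table and column names come from the text.

```sql
-- Step 1: create the table
CREATE TABLE distinct_demo (
    id     SERIAL PRIMARY KEY,
    bcolor VARCHAR(20),
    fcolor VARCHAR(20)
);

-- Step 2: insert rows, including deliberate duplicates
INSERT INTO distinct_demo (bcolor, fcolor)
VALUES ('red', 'red'), ('red', 'red'), ('red', 'green'),
       ('green', 'red'), ('green', 'green'), ('blue', 'blue');

-- Step 3: query all rows
SELECT id, bcolor, fcolor FROM distinct_demo;

-- DISTINCT on one column, then on a combination of columns
SELECT DISTINCT bcolor FROM distinct_demo ORDER BY bcolor;
SELECT DISTINCT bcolor, fcolor FROM distinct_demo ORDER BY bcolor, fcolor;
```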
Output:
In the database world, NULL means missing information or not applicable. NULL is not a
value, so you cannot compare it with other values such as numbers or strings. Comparing
NULL with a value always results in NULL, which means an unknown
result.
In addition, NULL is not equal to NULL, so the following expression returns NULL:
NULL = NULL
Suppose that you have a contacts table that stores the first name, last name, email, and
phone number of contacts. At the time of recording a contact, you may not know the
contact’s phone number.
To deal with this, you define the phone column as a nullable column and insert NULL into
the phone column when you save the contact information.
The following statement inserts two contacts, one has a phone number and the other does not:
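A minimal sketch of such a table and insert is shown here. The column types and the sample contacts are invented for illustration.

```sql
-- phone has no NOT NULL constraint, so it is nullable
CREATE TABLE IF NOT EXISTS contacts (
    id         SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name  VARCHAR(50) NOT NULL,
    email      VARCHAR(255) NOT NULL,
    phone      VARCHAR(15)
);

-- One contact with a phone number, one without
INSERT INTO contacts (first_name, last_name, email, phone)
VALUES ('Lily', 'Bush', 'lily.bush@example.com', '(408)-234-2764'),
       ('John', 'Doe', 'john.doe@example.com', NULL);
```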
To find the contact who does not have a phone number you may come up with the following
statement:
The statement returns no rows. This is because the expression phone = NULL in
the WHERE clause always evaluates to NULL (unknown), and the WHERE clause keeps only the
rows for which the condition is true.
Even though there is a NULL in the phone column, the expression NULL = NULL does not return
true. This is because NULL is not equal to any value, even itself.
To check whether a value is NULL or not, you use the IS NULL operator instead:
value IS NULL
So to get the contact who does not have any phone number stored in the phone column, you
use the following statement instead:
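The two statements can be compared side by side. This sketch assumes a contacts table with the columns shown; the CREATE TABLE IF NOT EXISTS line is included only so the example runs on its own.

```sql
CREATE TABLE IF NOT EXISTS contacts (
    id         SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name  VARCHAR(50) NOT NULL,
    email      VARCHAR(255) NOT NULL,
    phone      VARCHAR(15)
);

-- Returns no rows: phone = NULL is never true
SELECT id, first_name, last_name, phone
FROM contacts
WHERE phone = NULL;

-- Returns the contacts without a phone number
SELECT id, first_name, last_name, phone
FROM contacts
WHERE phone IS NULL;
```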
PostgreSQL UPDATE
The PostgreSQL UPDATE statement allows you to modify data in a table. The following
illustrates the syntax of the UPDATE statement:
UPDATE table_name
SET column1 = value1,
column2 = value2,
...
WHERE condition;
In this syntax:
First, specify the name of the table that you want to update after
the UPDATE keyword.
Second, specify columns and their new values after SET keyword. The columns that
do not appear in the SET clause retain their original values.
Third, determine which rows to update in the condition of the WHERE clause.
The WHERE clause is optional. If you omit the WHERE clause, the UPDATE statement will
update all rows in the table.
When the UPDATE statement is executed successfully, it returns the following command tag:
UPDATE count
The count is the number of rows updated including rows whose values did not change.
PostgreSQL UPDATE examples
Let’s take some examples of using the PostgreSQL UPDATE statement.
The following statements create a table called courses and insert some data into it:
The following statement returns the data from the courses table:
SELECT * FROM courses;
The following statement uses the UPDATE statement to update the course with id 3. It
changes the published_date from NULL to '2020-08-01'.
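The whole example can be sketched as follows. The column names course_id and course_name, the column types and the sample rows are assumptions; only the courses table, the id 3 and the published_date value come from the text.

```sql
CREATE TABLE courses (
    course_id      SERIAL PRIMARY KEY,
    course_name    VARCHAR(255) NOT NULL,
    published_date DATE
);

-- SERIAL assigns the ids 1, 2 and 3 in insertion order
INSERT INTO courses (course_name, published_date)
VALUES ('Course A', '2020-07-13'),
       ('Course B', '2020-07-11'),
       ('Course C', NULL);

-- Only the row with course_id 3 is changed
UPDATE courses
SET published_date = '2020-08-01'
WHERE course_id = 3;
```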
To drop a table from the database, you use the DROP TABLE statement as follows:
In this syntax:
First, specify the name of the table that you want to drop after the DROP
TABLE keywords.
Second, use the IF EXISTS option to remove the table only if it exists.
If you remove a table that does not exist, PostgreSQL issues an error. To avoid this situation,
you can use the IF EXISTS option.
In case the table that you want to remove is used in other objects such as views, triggers,
functions, and stored procedures, the DROP TABLE cannot remove the table. In this case,
you have two options:
The CASCADE option allows you to remove the table and its dependent objects.
The RESTRICT option rejects the removal if any object depends on the table.
The RESTRICT option is the default if you don’t explicitly specify it in the DROP
TABLE statement.
To remove multiple tables at once, you can place a comma-separated list of tables after
the DROP TABLE keywords:
Let’s take some examples of using the PostgreSQL DROP TABLE statement
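The options described above can be tried with the sketch below. All table names are hypothetical and the statements are meant for a scratch database.

```sql
-- General form: DROP TABLE [IF EXISTS] table_name [CASCADE | RESTRICT];

CREATE TABLE authors (
    author_id INTEGER PRIMARY KEY,
    name      VARCHAR(50)
);
CREATE TABLE pages (
    page_id   INTEGER PRIMARY KEY,
    author_id INTEGER REFERENCES authors (author_id)
);
CREATE TABLE tvshows (id INTEGER);
CREATE TABLE animes (id INTEGER);

-- No error even though the table does not exist; PostgreSQL issues a notice
DROP TABLE IF EXISTS no_such_table;

-- pages depends on authors through its foreign key, so a plain DROP TABLE
-- authors is rejected; CASCADE drops authors and the foreign key constraint
-- on pages (the pages table itself remains)
DROP TABLE authors CASCADE;

-- Several tables in one statement
DROP TABLE tvshows, animes;
```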
To remove all data from a table, you use the DELETE statement. However, when you use
the DELETE statement to delete all data from a table that has a lot of data, it is not efficient.
In this case, you need to use the TRUNCATE TABLE statement:
The TRUNCATE TABLE statement deletes all data from a table without scanning it. This is
the reason why it is faster than the DELETE statement.
The following example uses the TRUNCATE TABLE statement to delete all data from
the invoices table:
To remove all data from multiple tables at once, you separate each table by a comma (,) as
follows:
TRUNCATE TABLE
table_name1,
table_name2,
...;
For example, the following statement removes all data from invoices and customers tables:
In practice, the table you want to truncate often has the foreign key references from other
tables that are not listed in the TRUNCATE TABLE statement.
By default, the TRUNCATE TABLE statement does not remove any data from the table that
has foreign key references.
To remove data from a table and from the other tables that reference it through foreign keys, you
use the CASCADE option in the TRUNCATE TABLE statement as follows:
The following example deletes data from the invoices table and other tables that reference
the invoices table via foreign key constraints:
The CASCADE option should be used with care, or you may
delete data from tables that you did not intend to empty.
By default, the TRUNCATE TABLE statement uses the RESTRICT option which prevents
you from truncating the table that has foreign key constraint references.
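The TRUNCATE TABLE variants can be sketched as follows. The invoices and customers tables are named in the text; their columns and the invoice_items table are assumptions added so that a foreign key reference exists.

```sql
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY);
CREATE TABLE invoice_items (
    item_id    INTEGER PRIMARY KEY,
    invoice_id INTEGER REFERENCES invoices (invoice_id)
);

-- A table that nothing references can be truncated directly
TRUNCATE TABLE customers;

-- invoices is referenced by invoice_items, so "TRUNCATE TABLE invoices;"
-- alone is rejected under the default RESTRICT behaviour.
-- CASCADE also empties invoice_items:
TRUNCATE TABLE invoices CASCADE;

-- Listing every involved table in one statement works as well
TRUNCATE TABLE invoices, invoice_items, customers;
```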
CS232L – Database Management System
LAB 03
SQL CLAUSES-
ORDER BY, GROUP BY, HAVING
Objective
The objective of this session is to gain deeper insight into the ORDER BY, GROUP BY and HAVING
clauses and to go over examples of how to use them with the SELECT (DQL) command.
Instructions
• Open the handout/lab manual in front of a computer in the lab during the session.
• Practice each new command by completing the examples and exercise.
• Turn-in the answers for all the exercise problems as your lab report.
• When answering problems, indicate the commands you entered, and the output displayed.
• Practice and revise all the concepts covered in the previous sessions before coming
to the lab to avoid unnecessary confusion.
CS232-L – LAB 3
After grouping the data, you can filter the grouped records using the HAVING clause, which
returns only the groups that match the given condition. You can also sort the
grouped records using ORDER BY, which is written after GROUP BY and can sort on an aggregated
column. Clauses help us filter and analyze data quickly. When large amounts of data are
stored in the database, we use clauses to query exactly the data the user requires.
The GROUP BY clause divides the rows returned from the SELECT statement into groups.
For each group, you can apply an aggregate function e.g., SUM() to calculate the sum of
items or COUNT() to get the number of items in the groups. The following statement
illustrates the basic syntax of the GROUP BY clause:
SELECT
column_1,
column_2,
...,
aggregate_function(column_3)
FROM
table_name
GROUP BY
column_1,
column_2,
...;
In this syntax:
• First, select the columns that you want to group, e.g., column_1 and column_2, and the
column to which you want to apply an aggregate function (column_3).
• Second, list the columns that you want to group in the GROUP BY clause.
3.2.1 GROUP BY clause examples
Let’s take a look at the payment table in the sample database of PostgreSQL i.e. dvdrental
You can use the GROUP BY clause without applying an aggregate function. The following
query gets data from the payment table and groups the result by customer id.
SELECT
customer_id
FROM
payment
GROUP BY
customer_id;
In this case, the GROUP BY works like the DISTINCT clause that removes duplicate rows
from the result set.
The GROUP BY clause is most useful when it is used in conjunction with an aggregate function.
For example, to select the total amount that each customer has paid, you use the GROUP
BY clause to divide the rows in the payment table into groups by customer id. For
each group, you calculate the total amount using the SUM() function.
The following query uses the GROUP BY clause to get the total amount that each customer
has paid:
SELECT
customer_id,
SUM (amount)
FROM
payment
GROUP BY
customer_id;
The GROUP BY clause collects the rows that share the same customer_id into one group, and
SUM() adds up the amounts within each group, so the result set contains one row per customer.
The order of the returned rows is not guaranteed unless you add an ORDER BY clause.
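To get the groups in a defined order, an ORDER BY clause can be added after GROUP BY. The sketch below assumes the payment table of the dvdrental sample database; the alias total_paid is chosen for illustration.

```sql
-- Customers listed from the highest to the lowest total
SELECT customer_id, SUM(amount) AS total_paid
FROM payment
GROUP BY customer_id
ORDER BY total_paid DESC;
```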
To find the number of payment transactions that each staff has processed, you group the rows
in the payment table by the values in the staff_id column and use the COUNT() function to
get the number of transactions:
SELECT
staff_id,
COUNT (payment_id)
FROM
payment
GROUP BY
staff_id;
The GROUP BY clause divides the rows in the payment table into groups by the
value in the staff_id column. For each group, it returns the number of rows by using
the COUNT() function.
SELECT
customer_id,
staff_id,
SUM(amount)
FROM
payment
GROUP BY
staff_id,
customer_id
ORDER BY
customer_id;
In this example, the GROUP BY clause divides the rows in the payment table by the values in
the customer_id and staff_id columns. For each group of (customer_id, staff_id),
the SUM() calculates the total amount.
SELECT
DATE(payment_date) paid_date,
SUM(amount) sum
FROM
payment
GROUP BY
DATE(payment_date);
The HAVING clause is often used with the GROUP BY clause to filter groups or aggregates
based on a specified condition. The following statement illustrates the basic syntax of
the HAVING clause:
SELECT
column1,
aggregate_function (column2)
FROM
table_name
GROUP BY
column1
HAVING
condition;
In this syntax, the GROUP BY clause returns rows grouped by column1. The HAVING clause
specifies a condition to filter the groups. PostgreSQL evaluates the HAVING clause after
the FROM, WHERE and GROUP BY clauses and before the SELECT, DISTINCT, ORDER
BY and LIMIT clauses. Because the HAVING clause is evaluated before the SELECT clause,
the column aliases specified in the SELECT clause are not yet available at that point,
so you cannot use column aliases in the HAVING clause.
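The alias restriction can be seen in a small sketch. It assumes the payment table of the dvdrental sample database; the alias payment_count and the threshold 30 are arbitrary choices for illustration.

```sql
-- The aggregate expression is repeated in HAVING
SELECT customer_id, COUNT(payment_id) AS payment_count
FROM payment
GROUP BY customer_id
HAVING COUNT(payment_id) > 30;

-- Writing "HAVING payment_count > 30" instead is rejected,
-- because the alias is not known when HAVING is evaluated.
```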
The following query uses the GROUP BY clause with the SUM() function to find the total
amount of each customer:
SELECT
customer_id,
SUM (amount)
FROM
payment
GROUP BY
customer_id;
The following statement adds the HAVING clause to select only the customers who have
spent more than 200:
SELECT
customer_id,
SUM (amount)
FROM
payment
GROUP BY
customer_id
HAVING
SUM (amount) > 200;
The following table compares these two clauses. The main
difference is that the WHERE clause filters individual rows before any grouping
is done, while the HAVING clause filters the groups after they have been formed.
HAVING | WHERE
1. The HAVING clause is used in database systems to fetch the data/values from the groups according to the given condition. | 1. The WHERE clause is used in database systems to fetch the data/values from the tables according to the given condition.
2. The HAVING clause is normally used together with the GROUP BY clause. | 2. The WHERE clause can be used without the GROUP BY clause.
3. The HAVING clause can include SQL aggregate functions in a query or statement. | 3. SQL aggregate functions cannot be used in the WHERE clause.
4. The HAVING clause can only be used with the SELECT statement for filtering the records. | 4. The WHERE clause can be used with UPDATE, DELETE, and SELECT statements.
5. The HAVING clause is written after the GROUP BY clause in SQL queries. | 5. The WHERE clause is always written before the GROUP BY clause in SQL queries.
6. It is a post-filter. | 6. It is a pre-filter.
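Both filters can appear in one query, which makes the difference visible. This is a sketch against the payment table of the dvdrental sample database; the staff id 1 and the threshold 100 are arbitrary.

```sql
SELECT customer_id, SUM(amount) AS total_paid
FROM payment
WHERE staff_id = 1            -- pre-filter: rows are removed before grouping
GROUP BY customer_id
HAVING SUM(amount) > 100;     -- post-filter: groups are removed after aggregation
```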
When you query data from a table, the SELECT statement returns rows in an unspecified
order. To sort the rows of the result set, you use the ORDER BY clause in
the SELECT statement.
The ORDER BY clause allows you to sort rows returned by a SELECT statement in ascending or
descending order based on a sort expression. The following illustrates the syntax of
the ORDER BY clause:
SELECT
select_list
FROM
table_name
ORDER BY
sort_expression1 [ASC | DESC],
...
sort_expressionN [ASC | DESC];
In this syntax:
• First, specify a sort expression, which can be a column or an expression, that you want
to sort after the ORDER BY keywords. If you want to sort the result set based on
multiple columns or expressions, you need to place a comma (,) between two columns
or expressions to separate them.
• Second, you use the ASC option to sort rows in ascending order and the DESC option
to sort rows in descending order. If you omit the ASC or DESC option, the ORDER
BY uses ASC by default.
Due to the order of evaluation, if you have a column alias in the SELECT clause, you can use
it in the ORDER BY clause.
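A short sketch of sorting by an alias, assuming the customer table of the dvdrental sample database (the alias len is made up for this example):

```sql
-- len is defined in the SELECT list and reused in ORDER BY
SELECT first_name, LENGTH(first_name) AS len
FROM customer
ORDER BY len DESC;
```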
We will use the customer table in the sample database for the demonstration.
The following query uses the ORDER BY clause to sort customers by their first names in
ascending order:
SELECT
first_name,
last_name
FROM
customer
ORDER BY
first_name ASC;
Since the ASC option is the default, you can omit it in the ORDER BY.
SELECT
first_name,
last_name
FROM
customer
ORDER BY
last_name DESC;
The following statement selects the first name and last name from the customer table and
sorts the rows by the first name in ascending order and last name in descending order:
SELECT
first_name,
last_name
FROM
customer
ORDER BY
first_name ASC,
last_name DESC;
In this example, the ORDER BY clause sorts rows by values in the first name column first.
And then it sorts the sorted rows by values in the last name column. As you can see clearly
from the output, two customers with the same first name Kelly have the last name sorted in
descending order.
Practice Problem:
Dane country Airport officials decided that all the information related to the airline
flight should be organized using a DBMS, and you are hired to design the database.
Your first task is to organize the information about all the airplanes stationed and
maintained at the airport. The following relations keep track of airline flight
information:
Flights (flno, from_loc, to_loc, distance, price) //details about all the flights
Aircraft (aid, aname, cruisingrange) //details of the aircraft including the registration
number assigned, model of the craft and the maximum capacity of the craft in terms of
distance it can travel.
Certified (eid, aid) //Certification details of the pilots for specific crafts.
Employees (eid, ename, salary) //details of employees
Part A:
1. Create the above-mentioned tables and insert data in tables as given below.
-- Written for PostgreSQL: INTEGER replaces the Oracle NUMBER type, and
-- multi-row INSERT replaces the Oracle INSERT ALL ... SELECT * FROM dual form.
CREATE TABLE flight
(
flight_no INTEGER,
from_loc VARCHAR(20),
to_loc VARCHAR(20),
distance INTEGER,
price INTEGER
);

CREATE TABLE aircraft
(
aid INTEGER,
aname VARCHAR(20),
cruisingrange INTEGER
);

CREATE TABLE certified
(
eid INTEGER,
aid INTEGER
);

CREATE TABLE employee
(
eid INTEGER,
ename VARCHAR(20),
salary INTEGER
);

INSERT INTO flight VALUES
(222, 'Perth', 'London', 9000, 50000),
(333, 'Auckland', 'Dubai', 7000, 40000),
(444, 'Dallas', 'Sydeny', 10000, 52000),
(555, 'LA', 'Singapore', 11000, 55000),
(666, 'UK', 'Atlanta', 15000, 60000);

INSERT INTO aircraft VALUES
(111, 'AD Scout', 1000),
(112, 'Airco', 15000),
(113, 'Avis', 9000),
(114, 'Bernard', 8000),
(115, 'Comte', 20000);

INSERT INTO employee VALUES
(100, 'Oliver', 85000),
(101, 'Jack', 50000),
(102, 'Thomas', 89000),
(103, 'George', 10000),
(105, 'James', 90000),
(106, 'Daneil', 100000),
(107, 'Noah', 50000),
(108, 'Joe', 25000),
(109, 'Pheebs', 90000),
(110, 'Ross', 5000);

INSERT INTO certified VALUES
(100, 114),
(102, 113),
(105, 112),
(106, 115),
(107, 111),
(108, 112);
Part B:
1. Calculate total tickets price sold for all flights.
4. For each bonus value, check number of employees given same bonus.
SELECT COUNT(EID) AS NO_OF_EMPLOYEES,BONUS FROM
EMPLOYEE GROUP BY BONUS;
7. Display the sum of salaries of all employees whose names start with ‘J’, provided that the sum of
salaries is above 20000.
SELECT SUM(SALARY) FROM EMPLOYEE WHERE ENAME LIKE 'J%'
HAVING SUM(SALARY) > 20000;
8. Display aircraft data in descending order based upon their cruising range.
SELECT * FROM AIRCRAFT ORDER BY CRUISINGRANGE DESC;
CS232L – Database Management System
LAB#04
POSTGRESQL CONSTRAINTS
Objective
The objective of this session is to
7. Adding a Constraint
8. Dropping a Constraint
9. Disabling a Constraint
Instructions
Open the handout/lab manual in front of a computer in the lab during the session.
Practice each new command by completing the examples and exercise.
Turn-in the answers for all the exercise problems as your lab report.
When answering problems, indicate the commands you entered and the output displayed.
CS232-L – LAB 4
Constraint | Description
NOT NULL | Specifies that this column may not contain a null value
UNIQUE | Specifies a column or combination of columns whose values must be unique for all rows in the table
Constraints Guidelines
All constraints are stored in the data dictionary. Constraints are easy to reference if you give them
a meaningful name. Constraint names must follow the standard object naming rules. Constraints
can be defined at the time of table creation or after the table has been created. You can view the
constraints defined for a specific table by querying the data dictionary. In PostgreSQL this is the
information_schema.table_constraints view (USER_CONSTRAINTS is the corresponding view in Oracle).
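In PostgreSQL, a lookup of this kind can be sketched with the standard information_schema views. The table name employee is only an example value.

```sql
-- Lists the name and type (PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK)
-- of every constraint defined on the given table
SELECT constraint_name, constraint_type
FROM information_schema.table_constraints
WHERE table_name = 'employee';
```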
Syntax:
CREATE TABLE table
(column datatype [column_constraint],
column datatype [column_constraint],
...
[table_constraint] [...]);
Constraints are usually created at the same time as the table. Constraints can be added to a table
after its creation and also temporarily disabled. Constraints can be defined at one of two levels.
Constraint Level | Description
Column | References a single column and is defined within a specification for the owning column: can define any type of integrity constraint
Table | References one or more columns and is defined separately from the definitions of the columns in the table: can define any constraints except NOT NULL
You define primary keys through primary key constraints. Technically, a primary key
constraint is the combination of a not-null constraint and a UNIQUE constraint.
A table can have one and only one primary key. It is a good practice to add a primary
key to every table.
Normally, we add the primary key to a table when we define the table’s structure
using CREATE TABLE statement.
The following statement creates a purchase order (PO) header table with the
name po_headers.
The po_no is the primary key of the po_headers table, which uniquely identifies purchase
order in the po_headers table.
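A minimal sketch of such a table follows. Only po_headers and po_no come from the text; the remaining columns and all data types are assumptions.

```sql
CREATE TABLE po_headers (
    po_no            INTEGER PRIMARY KEY,  -- column-level primary key constraint
    vendor_no        INTEGER,
    description      TEXT,
    shipping_address TEXT
);
```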
In case the primary key consists of two or more columns, you define the primary key
constraint as follows:
For example, the following statement creates the purchase order line items table
whose primary key is a combination of purchase order number ( po_no) and line item
number ( item_no).
If you don’t explicitly specify a name for the primary key constraint, PostgreSQL
assigns a default name of the form tablename_pkey. In this example,
PostgreSQL creates the primary key constraint with the name po_items_pkey for
the po_items table.
In case you want to specify the name of the primary key constraint, you
use CONSTRAINT clause as follows:
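Both variants can be sketched as below. The columns other than po_no and item_no, the data types, the second table name po_items_named and the constraint name pk_po_items are assumptions made for illustration.

```sql
-- Composite primary key with the default name po_items_pkey
CREATE TABLE po_items (
    po_no      INTEGER,
    item_no    INTEGER,
    product_no INTEGER,
    qty        INTEGER,
    net_price  NUMERIC,
    PRIMARY KEY (po_no, item_no)
);

-- Same key, but with an explicitly chosen constraint name
CREATE TABLE po_items_named (
    po_no   INTEGER,
    item_no INTEGER,
    CONSTRAINT pk_po_items PRIMARY KEY (po_no, item_no)
);
```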
It is rare to define a primary key for an existing table. In case you have to do it, you can
use the ALTER TABLE statement to add a primary key constraint.
The following statement creates a table named products without defining any primary
key.
Suppose you want to add a primary key constraint to the products table, you can
execute the following statement:
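A sketch of the two steps, assuming hypothetical columns for the products table:

```sql
-- Table created without a primary key
CREATE TABLE products (
    product_no   INTEGER,
    description  TEXT,
    product_cost NUMERIC
);

-- Primary key added afterwards; product_no must hold unique, non-NULL values
ALTER TABLE products
ADD PRIMARY KEY (product_no);
```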
Suppose, we have a vendors table that does not have any primary key.
And we add few rows to the vendors table using INSERT statement:
To verify the insert operation, we query data from the vendors table using the
following SELECT statement:
SELECT
*
FROM
vendors;
Now, if we want to add a primary key named id into the vendors table and the id
field is auto-incremented by one, we use the following statement:
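The vendors steps could look like the following sketch. The name column is taken from the SELECT statement in the text; its type and the sample rows are assumptions.

```sql
CREATE TABLE vendors (name VARCHAR(255));

INSERT INTO vendors (name)
VALUES ('Vendor A'), ('Vendor B'), ('Vendor C');

-- SERIAL fills the new column for the existing rows with 1, 2, 3, ...
ALTER TABLE vendors
ADD COLUMN id SERIAL PRIMARY KEY;
```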
SELECT
id,name
FROM
vendors;
To remove an existing primary key constraint, you also use the ALTER TABLE statement
with the following syntax:
For example, to remove the primary key constraint of the products table, you use the
following statement:
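As a sketch, the removal relies on the default constraint name tablename_pkey. The table definition is repeated here with assumed columns so that the example runs on its own in a scratch database.

```sql
-- General form: ALTER TABLE table_name DROP CONSTRAINT constraint_name;

CREATE TABLE IF NOT EXISTS products (
    product_no  INTEGER PRIMARY KEY,   -- default constraint name: products_pkey
    description TEXT
);

ALTER TABLE products
DROP CONSTRAINT products_pkey;
```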
In this tutorial, you have learned how to add and remove primary key constraints
using CREATE TABLE and ALTER TABLE statements.
A foreign key is a column or a group of columns in a table that references the primary
key of another table.
The table that contains the foreign key is called the referencing table or child table.
And the table referenced by the foreign key is called the referenced table or parent
table.
A table can have multiple foreign keys depending on its relationships with other
tables.
In PostgreSQL, you define a foreign key using the foreign key constraint. The foreign
key constraint helps maintain the referential integrity of data between the child and
parent tables.
[CONSTRAINT fk_name]
FOREIGN KEY(fk_columns)
REFERENCES parent_table(parent_key_columns)
[ON DELETE delete_action]
[ON UPDATE update_action]
In this syntax:
First, specify the name for the foreign key constraint after
the CONSTRAINT keyword. The CONSTRAINT clause is optional. If you omit it,
PostgreSQL will assign an auto-generated name.
Second, specify one or more foreign key columns in parentheses after
the FOREIGN KEY keywords.
Third, specify the parent table and parent key columns referenced by the
foreign key columns in the REFERENCES clause.
Finally, specify the delete and update actions in the ON DELETE and ON
UPDATE clauses.
The delete and update actions determine the behaviors when the primary key in the
parent table is deleted and updated. Since the primary key is rarely updated, the ON
UPDATE action is not often used in practice. We’ll focus on the ON DELETE action.
SET NULL
SET DEFAULT
RESTRICT
NO ACTION
CASCADE
phone VARCHAR(15),
email VARCHAR(100),
PRIMARY KEY(contact_id),
CONSTRAINT fk_customer
FOREIGN KEY(customer_id)
REFERENCES customers(customer_id)
);
In this example, the customers table is the parent table and the contacts table is the child
table.
Each customer has zero or many contacts and each contact belongs to zero or one
customer.
The customer_id column in the contacts table is the foreign key column that references
the primary key column with the same name in the customers table.
The following foreign key constraint fk_customer in the contacts table defines
the customer_id as the foreign key:
CONSTRAINT fk_customer
FOREIGN KEY(customer_id)
REFERENCES customers(customer_id)
Because the foreign key constraint does not have the ON DELETE and ON UPDATE action,
they default to NO ACTION.
NO ACTION
The following inserts data into the customers and contacts tables:
The following statement deletes the customer id 1 from the customers table:
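The NO ACTION behaviour can be reproduced with the sketch below. The customer_name and contact_name columns and all sample rows are assumptions; the remaining names follow the fragment shown above. Run it in a scratch database.

```sql
CREATE TABLE customers (
    customer_id   SERIAL PRIMARY KEY,
    customer_name VARCHAR(255) NOT NULL
);

CREATE TABLE contacts (
    contact_id   SERIAL,
    customer_id  INTEGER,
    contact_name VARCHAR(255) NOT NULL,
    phone        VARCHAR(15),
    email        VARCHAR(100),
    PRIMARY KEY (contact_id),
    CONSTRAINT fk_customer
        FOREIGN KEY (customer_id)
        REFERENCES customers (customer_id)
);

INSERT INTO customers (customer_name)
VALUES ('Customer A'), ('Customer B');

INSERT INTO contacts (customer_id, contact_name, phone, email)
VALUES (1, 'Contact One', '(408)-111-1234', 'one@example.com'),
       (2, 'Contact Two', '(408)-222-1234', 'two@example.com');

-- Expected to be rejected with a foreign key violation,
-- because a row in contacts still references customer 1
DELETE FROM customers
WHERE customer_id = 1;
```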
CASCADE
The ON DELETE CASCADE automatically deletes all the referencing rows in the child table
when the referenced rows in the parent table are deleted. In practice, the ON DELETE
CASCADE is the most commonly used option.
The following statements recreate the sample tables. However, the delete action of
the fk_customer changes to CASCADE:
DELETE FROM customers
WHERE customer_id = 1;
Because of the ON DELETE CASCADE action, all the referencing rows in the contacts table
are automatically deleted:
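A sketch of the CASCADE variant follows. It drops and recreates the two sample tables, so it should only be run in a scratch database; the customer_name and contact_name columns and the sample rows are assumptions.

```sql
DROP TABLE IF EXISTS contacts;
DROP TABLE IF EXISTS customers;

CREATE TABLE customers (
    customer_id   SERIAL PRIMARY KEY,
    customer_name VARCHAR(255) NOT NULL
);

CREATE TABLE contacts (
    contact_id   SERIAL PRIMARY KEY,
    customer_id  INTEGER,
    contact_name VARCHAR(255) NOT NULL,
    CONSTRAINT fk_customer
        FOREIGN KEY (customer_id)
        REFERENCES customers (customer_id)
        ON DELETE CASCADE
);

INSERT INTO customers (customer_name)
VALUES ('Customer A'), ('Customer B');

INSERT INTO contacts (customer_id, contact_name)
VALUES (1, 'Contact One'), (1, 'Contact Two'), (2, 'Contact Three');

-- Deleting the parent row also deletes the two contacts of customer 1
DELETE FROM customers
WHERE customer_id = 1;

-- Only the contact of customer 2 remains
SELECT contact_id, customer_id, contact_name FROM contacts;
```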
SET DEFAULT
The ON DELETE SET DEFAULT sets the default value to the foreign key column of the
referencing rows in the child table when the referenced rows from the parent table
are deleted.
To add a foreign key constraint to the existing table, you use the following form of
the ALTER TABLE statement:
When you add a foreign key constraint with ON DELETE CASCADE option to an existing
table, you need to follow these steps:
First, add a new foreign key constraint with ON DELETE CASCADE action:
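The ALTER TABLE form can be sketched as follows. The tables parent_table and child_table, their columns and the constraint name fk_child_parent are hypothetical.

```sql
CREATE TABLE parent_table (id INTEGER PRIMARY KEY);
CREATE TABLE child_table (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER
);

-- Adds the foreign key to the existing table, with the CASCADE delete action
ALTER TABLE child_table
ADD CONSTRAINT fk_child_parent
    FOREIGN KEY (parent_id)
    REFERENCES parent_table (id)
    ON DELETE CASCADE;
```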
In this tutorial, you have learned about PostgreSQL foreign keys and how to use the
foreign key constraint to create foreign keys for a table.
The CHECK constraint uses a Boolean expression to evaluate the values before they are
inserted or updated to the column.
If the values pass the check, PostgreSQL will insert or update these values to the
column. Otherwise, PostgreSQL will reject the changes and issue a constraint violation
error.
Typically, you use the CHECK constraint at the time of creating the table using
the CREATE TABLE statement.
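A minimal sketch of a CHECK constraint defined at table creation is given below. The employees table and its salary column are named later in the text; the other columns, the data types and the sample row are assumptions.

```sql
CREATE TABLE employees (
    id         SERIAL PRIMARY KEY,
    first_name VARCHAR(50),
    last_name  VARCHAR(50),
    salary     NUMERIC CHECK (salary > 0)
);

-- Expected to be rejected: the negative salary violates the CHECK constraint
INSERT INTO employees (first_name, last_name, salary)
VALUES ('John', 'Doe', -100000);
```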
The statement attempted to insert a negative salary into the salary column. However,
PostgreSQL returned the following error message:
The insert failed because of the CHECK constraint on the salary column that accepts
only positive values.
By default, PostgreSQL gives the CHECK constraint a name using the following pattern:
{table}_{column}_check
For example, the constraint on the salary column has the following constraint name:
employees_salary_check
However, if you want to assign a CHECK constraint a specific name, you can specify it
after the CONSTRAINT expression as follows:
...
salary numeric CONSTRAINT positive_salary CHECK(salary > 0)
...
Define PostgreSQL CHECK constraints for existing tables
To add CHECK constraints to existing tables, you use the ALTER TABLE statement.
Suppose, you have an existing table in the database named prices_list
Now, you can use ALTER TABLE statement to add the CHECK constraints to
the prices_list table. The price and discount must be greater than zero and the discount
is less than the price. Notice that we use a Boolean expression that contains
the AND operators.
The valid to date ( valid_to) must be greater than or equal to valid from date
( valid_from).
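The rules stated above can be sketched with two ALTER TABLE statements. The id and product_id columns, the data types and the constraint names are assumptions; price, discount, valid_from and valid_to come from the text.

```sql
CREATE TABLE prices_list (
    id         SERIAL PRIMARY KEY,
    product_id INTEGER NOT NULL,
    price      NUMERIC NOT NULL,
    discount   NUMERIC NOT NULL,
    valid_from DATE NOT NULL,
    valid_to   DATE NOT NULL
);

-- Price and discount are positive, and the discount is below the price
ALTER TABLE prices_list
ADD CONSTRAINT price_discount_check
CHECK (price > 0 AND discount > 0 AND price > discount);

-- The validity period must not end before it starts
ALTER TABLE prices_list
ADD CONSTRAINT valid_range_check
CHECK (valid_to >= valid_from);
```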
The CHECK constraints are very useful to place additional logic to restrict values that
the columns can accept at the database layer. By using the CHECK constraint, you can
make sure that data is updated to the database correctly.
In this tutorial, you have learned how to use PostgreSQL CHECK constraint to check
the values of columns based on a Boolean expression.
PostgreSQL provides you with the UNIQUE constraint that maintains the uniqueness of
the data correctly.
When a UNIQUE constraint is in place, every time you insert a new row, it checks if the
value is already in the table. It rejects the change and issues an error if the value
already exists. The same process is carried out for updating existing data.
The following statement creates a new table named person with a UNIQUE constraint
for the email column.
);
Note that the UNIQUE constraint above can be rewritten as a table constraint as shown
in the following query:
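Both ways of declaring the constraint are sketched here. The columns other than email, the data types, the second table name person_alt and the sample row are assumptions.

```sql
-- UNIQUE as a column constraint
CREATE TABLE person (
    id         SERIAL PRIMARY KEY,
    first_name VARCHAR(50),
    last_name  VARCHAR(50),
    email      VARCHAR(50) UNIQUE
);

-- The same rule written as a table constraint
CREATE TABLE person_alt (
    id         SERIAL PRIMARY KEY,
    first_name VARCHAR(50),
    last_name  VARCHAR(50),
    email      VARCHAR(50),
    UNIQUE (email)
);

INSERT INTO person (first_name, last_name, email)
VALUES ('john', 'doe', 'j.doe@example.com');

-- A second row with the same email address would be rejected
-- with a unique constraint violation.
```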
First, insert a new row into the person table using INSERT statement:
PostgreSQL allows you to create a UNIQUE constraint to a group of columns using the
following syntax:
The combination of values in columns c2 and c3 will be unique across the whole table.
The value of column c2 or c3 alone need not be unique.
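A small sketch of a multi-column UNIQUE constraint; the table name unique_demo, the column c1, the data types and the values are made up for illustration.

```sql
CREATE TABLE unique_demo (
    c1 INTEGER,
    c2 INTEGER,
    c3 INTEGER,
    UNIQUE (c2, c3)
);

INSERT INTO unique_demo (c1, c2, c3) VALUES (1, 10, 100);

-- Accepted: c2 repeats, but the pair (10, 200) is new
INSERT INTO unique_demo (c1, c2, c3) VALUES (2, 10, 200);

-- Inserting the pair (10, 100) again would be rejected.
```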
Suppose that you need to insert an email address of a contact into a table. You can
request his or her email address. However, if you don’t know whether the contact has
an email address or not, you can insert NULL into the email address column. In this
case, NULL indicates that the email address is not known at the time of recording.
NULL is very special. It does not equal anything, even itself. The expression NULL =
NULL returns NULL because it makes sense that two unknown values should not be
equal.
To check if a value is NULL or not, you use the IS NULL boolean operator. For example,
the following expression returns true if the value in the email address is NULL.
email_address IS NULL
The IS NOT NULL operator negates the result of the IS NULL operator.
To control whether a column can accept NULL, you use the NOT NULL constraint:
If a column has a NOT NULL constraint, any attempt to insert or update NULL in the
column will result in an error.
The following CREATE TABLE statement creates a new table named invoices with
not-null constraints.
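One possible definition is sketched below. The product_id and qty columns are named in the text; the id and net_price columns and all data types are assumptions. Run it in a scratch database if an invoices table already exists.

```sql
CREATE TABLE invoices (
    id         SERIAL PRIMARY KEY,
    product_id INTEGER NOT NULL,   -- NOT NULL follows the data type
    qty        NUMERIC NOT NULL,
    net_price  NUMERIC             -- nullable by default
);
```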
This example uses the NOT NULL keywords that follow the data type of the product_id
and qty columns to declare NOT NULL constraints.
If you use NULL instead of NOT NULL, the column will accept both NULL and non-NULL
values. If you don’t explicitly specify NULL or NOT NULL, it will accept NULL by default.
To add the NOT NULL constraint to a column of an existing table, you use the following
form of the ALTER TABLE statement:
To add NOT NULL constraints to multiple columns, you use the following
syntax:
Then, to make sure that the qty field is not null, you can add the not-null constraint to
the qty column. However, the column already contains data. If you try to add the not-
null constraint, PostgreSQL will issue an error.
To add the NOT NULL constraint to a column that already contains NULL, you need to
update NULL to non-NULL first, like this:
UPDATE production_orders
SET qty = 1;
The values in the qty column are updated to one. Now, you can add the NOT
NULL constraint to the qty column:
After that, you can update the not-null constraints for material_id, start_date,
and finish_date columns:
UPDATE production_orders
SET material_id = 'ABC',
start_date = '2015-09-01',
finish_date = '2015-09-01';
UPDATE production_orders
SET qty = NULL;
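The complete sequence can be sketched as follows. The id and description columns, the data types and the sample row are assumptions; the other column names and the UPDATE values follow the text.

```sql
CREATE TABLE production_orders (
    id          SERIAL PRIMARY KEY,
    description VARCHAR(40) NOT NULL,
    material_id VARCHAR(16),
    qty         NUMERIC,
    start_date  DATE,
    finish_date DATE
);

-- qty, material_id and the dates are NULL in this row
INSERT INTO production_orders (description)
VALUES ('Sample order');

-- Replace the NULLs first, then add the constraint
UPDATE production_orders SET qty = 1;

ALTER TABLE production_orders
ALTER COLUMN qty SET NOT NULL;

UPDATE production_orders
SET material_id = 'ABC',
    start_date  = '2015-09-01',
    finish_date = '2015-09-01';

-- Several columns can be altered in one statement
ALTER TABLE production_orders
    ALTER COLUMN material_id SET NOT NULL,
    ALTER COLUMN start_date SET NOT NULL,
    ALTER COLUMN finish_date SET NOT NULL;

-- From here on, setting qty to NULL is rejected with a not-null violation.
```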
Besides the NOT NULL constraint, you can use a CHECK constraint to force a column to
accept only non-NULL values. The NOT NULL constraint is equivalent to the
following CHECK constraint:
CHECK(column IS NOT NULL)
This is useful because sometimes you may want either column a or column b to be not null, but
not both.
For example, you may want either the username or the email column of the users table to be not
null or empty. In this case, you can use the CHECK constraint as follows:
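A sketch of such a constraint is shown below. The id column, the data types and the constraint name are assumptions. As written, the check requires that at least one of the two columns holds a non-empty value.

```sql
CREATE TABLE users (
    id       SERIAL PRIMARY KEY,
    username VARCHAR(50),
    email    VARCHAR(50),
    CONSTRAINT username_email_notnull CHECK (
        NOT (
            (username IS NULL OR username = '')
            AND
            (email IS NULL OR email = '')
        )
    )
);

-- Accepted: the username is present
INSERT INTO users (username, email) VALUES ('user1', NULL);

-- Rejected by the constraint: both columns are NULL or empty
-- INSERT INTO users (username, email) VALUES (NULL, '');
```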
To remove the NOT NULL constraint, use the DROP NOT NULL clause along with ALTER TABLE
ALTER COLUMN statement.
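A minimal sketch of removing the constraint; the table drop_not_null_demo and its columns are hypothetical.

```sql
CREATE TABLE drop_not_null_demo (
    id   INTEGER PRIMARY KEY,
    note VARCHAR(50) NOT NULL
);

-- Remove the NOT NULL constraint from the note column
ALTER TABLE drop_not_null_demo
ALTER COLUMN note DROP NOT NULL;

-- NULL is now accepted
INSERT INTO drop_not_null_demo (id, note) VALUES (1, NULL);
```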
CS232L – Database Management System
LAB 05
SQL AGGREGATE FUNCTIONS,
SUB-QUERIES
Objective
The objective of this session is to learn how to use the PostgreSQL aggregate functions such
as AVG(), COUNT(), MIN(), MAX(), and SUM(), to gain deeper insight into the clauses used with
them, and to go over examples of how to put them to use with the SELECT (DQL) command.
Instructions
• Open the handout/lab manual in front of a computer in the lab during the session.
• Practice each new command by completing the examples and exercise.
• Turn-in the answers for all the exercise problems as your lab report.
• When answering problems, indicate the commands you entered, and the output displayed.
• Try to practice and revise all the concepts covered in all previous sessions before coming to
the lab to avoid unnecessary ambiguities.
5.1 PostgreSQL Aggregate Functions
Aggregate functions perform a calculation on a set of rows and return a single row.
PostgreSQL provides all of the standard SQL aggregate functions, including AVG(), COUNT(), MAX(), MIN(), and SUM().
We often use the aggregate functions with the GROUP BY clause in the SELECT statement.
In these cases, the GROUP BY clause divides the result set into groups of rows and the
aggregate functions perform a calculation on each group e.g., maximum, minimum, average,
etc.
You can use aggregate functions as expressions only in the following clauses:
• SELECT clause.
• HAVING clause.
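As an illustration of both clauses, the following sketch assumes the film table used in the examples of this lab and an arbitrary threshold of 20. It averages the replacement cost per rating in the SELECT clause and filters the groups in the HAVING clause:

```sql
SELECT
    rating,
    ROUND(AVG(replacement_cost), 2) AS avg_replacement_cost
FROM
    film
GROUP BY
    rating
HAVING
    AVG(replacement_cost) > 20   -- keeps only groups whose average exceeds the threshold
ORDER BY
    rating;
```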
The following statement uses the AVG() function to calculate the average replacement cost of
all films:
SELECT
ROUND( AVG( replacement_cost ), 2 ) avg_replacement_cost
FROM
film;
To get the number of films, you use the COUNT(*) function as follows:
SELECT
COUNT(*)
FROM
film;
The following statement uses the MAX() function to return the maximum replacement cost of films:
SELECT
MAX(replacement_cost)
FROM
film;
To get the films that have the maximum replacement cost, you use the following query:
SELECT
film_id,
title
FROM
film
WHERE
replacement_cost =(
SELECT
MAX( replacement_cost )
FROM
film
)
ORDER BY
title;
The subquery returned the maximum replacement cost which then was used by the outer
query for retrieving the film’s information.
The following example uses the MIN() function to return the minimum replacement cost of
films:
SELECT
MIN(replacement_cost)
FROM
film;
To get the films which have the minimum replacement cost, you use the following query:
SELECT
film_id,
title
FROM
film
WHERE
replacement_cost =(
SELECT
MIN( replacement_cost )
FROM
film
)
ORDER BY
title;
5.1.5 SUM() function examples
The following statement uses the SUM() function to calculate the total rental duration of films
grouped by rating:
SELECT
rating,
SUM( rental_duration )
FROM
film
GROUP BY
rating
ORDER BY
rating;
The STRING_AGG() function concatenates a list of strings and places a delimiter (separator)
between them. The output string does not have a delimiter at the end. STRING_AGG() is
available from PostgreSQL 9.0 onward. Any string can be used as the separator.
Syntax
STRING_AGG ( expression, separator|delimiter [order_by] )
The examples that follow use a table called "players" with the columns player_name,
team_name, and player_positon (the column name is spelled this way throughout the examples).
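A sketch of a CREATE TABLE statement that produces such a table. The VARCHAR lengths are assumptions, and the third column keeps the spelling player_positon used by the later queries:

```sql
CREATE TABLE "players" (
    "player_name"    VARCHAR(50),
    "team_name"      VARCHAR(50),
    "player_positon" VARCHAR(50)
);
```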
Note: Run the following SELECT query to verify that the table is created with the
desired columns:
SELECT * FROM "players";
Let’s use the INSERT INTO command to add some values to the “players” table now:
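A sketch of the INSERT statement, using the seven rows shown in the output that follows:

```sql
INSERT INTO "players" ("player_name", "team_name", "player_positon")
VALUES
    ('Virat',   'India',       'Batsman'),
    ('Rohit',   'India',       'Batsman'),
    ('Jasprit', 'India',       'Bowler'),
    ('Chris',   'West Indies', 'Batsman'),
    ('Shannon', 'West Indies', 'Bowler'),
    ('Bravo',   'West Indies', 'Batsman'),
    ('James',   'New Zealand', 'All rounder');

-- Check the inserted rows
SELECT * FROM "players";
```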
—OUTPUT—
player_name | team_name | player_positon
-------------+-------------+----------------
Virat | India | Batsman
Rohit | India | Batsman
Jasprit | India | Bowler
Chris | West Indies | Batsman
Shannon | West Indies | Bowler
Bravo | West Indies | Batsman
James | New Zealand | All rounder
(7 rows)
We will use the STRING_AGG() function to produce a list of values separated by commas.
The syntax to create comma-separated values is as follows:
SELECT "team_name",string_agg("player_name", ',' )
FROM "players" GROUP BY "team_name" ;
—OUTPUT—
team_name | string_agg
-------------+---------------------
West Indies | Chris,Shannon,Bravo
India | Virat,Rohit,Jasprit
New Zealand | James
(3 rows)
In the output, the "player_name" values of each team are joined with commas and
displayed alongside the "team_name". The GROUP BY clause divides the rows by
"team_name". The first argument of STRING_AGG() is the expression to concatenate,
and the second argument is the separator, here the comma character ','.
STRING_AGG() also accepts an ORDER BY clause inside the call, which sorts the values
within each list:
SELECT "team_name",
string_agg("player_name", ',' ORDER BY "player_name" ASC) AS players_name,
string_agg("player_positon", ',' ORDER BY "player_positon" ASC) AS players_positions
FROM "players" GROUP BY "team_name";
—OUTPUT—
team_name | players_name | players_positions
-------------+---------------------+------------------------
India | Jasprit,Rohit,Virat | Batsman,Batsman,Bowler
New Zealand | James | All rounder
West Indies | Bravo,Chris,Shannon | Batsman,Batsman,Bowler
(3 rows)
Subquery Syntax:
The subquery (inner query) executes once before the main query (outer query) executes.
The main query (outer query) uses the subquery result.
PostgreSQL Subquery Example:
Using a subquery, list the names of the employees who are paid more than 'Alexander' in the
employees table.
Code:
SELECT first_name,last_name, salary FROM employees
WHERE salary >
(SELECT max(salary) FROM employees
WHERE first_name='Alexander');
Sample Output:
first_name | last_name | salary
-----------+-----------+--------
Steven     | King      | 24000
Neena      | Kochhar   | 17000
Lex        | De Haan   | 17000
Nancy      | Greenberg | 12000
Den        | Raphaely  | 11000
John       | Russell   | 14000
Karen      | Partners  | 13500
Michael    | Hartstein | 13000
Hermann    | Baer      | 10000
Shelley    | Higgins   | 12000
Subqueries: Guidelines
There are some guidelines to consider when using subqueries:
• A subquery must be enclosed in parentheses.
• Use single-row operators with single-row subqueries, and use multiple-row operators
with multiple-row subqueries.
• If a subquery (inner query) returns a null value to the outer query, the outer query will
not return any rows when using certain comparison operators in a WHERE clause.
Types of Subqueries
1. The Subquery as Scalar Operand
2. Comparisons using Subqueries
3. Subqueries with ALL, ANY, IN, or SOME
4. Row Subqueries
5. Subqueries with EXISTS or NOT EXISTS
6. Correlated Subqueries
7. Subqueries in the FROM Clause
5.2.1 PostgreSQL Subquery as Scalar Operand
• A scalar subquery is a subquery that returns exactly one single value.
• It is normally used within a SELECT query.
• It is an error to use a query that returns more than one row or more than one column as
a scalar subquery.
• During a particular execution, if the subquery returns no rows, that is not an error; the
scalar result is taken to be null.
• The subquery can refer to variables from the surrounding query, which will act as
constants during any one evaluation of the subquery.
Example: PostgreSQL Subquery as Scalar Operand
Code:
SELECT employee_id, last_name,
(CASE WHEN department_id=(
SELECT department_id from departments WHERE location_id=2500)
THEN 'Canada' ELSE 'USA' END)
FROM employees;
Sample Output:
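A subquery can also be compared with an outer value using a comparison operator such as
= (equal to) or > (greater than). The following query lists the employees whose salary is
above the average salary: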
Code:
SELECT employee_id,first_name,last_name,salary
FROM employees
WHERE salary >
(SELECT AVG(SALARY) FROM employees);
Sample Output:
employee_id | first_name | last_name  | salary
------------+------------+------------+--------
100         | Steven     | King       | 24000
101         | Neena      | Kochhar    | 17000
102         | Lex        | De Haan    | 17000
103         | Alexander  | Hunold     | 9000
108         | Nancy      | Greenberg  | 12000
109         | Daniel     | Faviet     | 9000
110         | John       | Chen       | 8200
114         | Den        | Raphaely   | 11000
121         | Adam       | Fripp      | 8200
145         | John       | Russell    | 14000
146         | Karen      | Partners   | 13500
176         | Jonathon   | Taylor     | 8600
177         | Jack       | Livingston | 8400
201         | Michael    | Hartstein  | 13000
204         | Hermann    | Baer       | 10000
205         | Shelley    | Higgins    | 12000
206         | William    | Gietz      | 8300
Code:
SELECT department_id, AVG(SALARY)
FROM employees GROUP BY department_id
HAVING AVG(SALARY)>=ALL
(SELECT AVG(SALARY) FROM employees
GROUP BY department_id);
Sample Output:
department_id | avg
--------------+----------
9             | 19333.33
Note: Here we have used the ALL keyword for this subquery because the department selected by the
query must have an average salary greater than or equal to all the average salaries of the other
departments.
Note: With a subquery that looks up the departments at location 1700, the ANY keyword is
appropriate because the subquery is likely to return more than one department. If you use the
ALL keyword instead of the ANY keyword, no data is selected, because no employee works in all
departments at location 1700.
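A sketch of a query of the kind this note describes, assuming the employees and departments tables used in this lab:

```sql
-- Employees who work in any department located at location 1700
SELECT first_name, last_name, department_id
FROM employees
WHERE department_id = ANY
    (SELECT department_id FROM departments
     WHERE location_id = 1700);
```

Note that = ANY (subquery) gives the same result as IN (subquery).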
expression IN (subquery)
The right-hand side is a parenthesized subquery, which must return exactly one column. The
left-hand expression is evaluated and compared to each row of the subquery result.
1. The result of IN is true if any equal subquery row is found.
2. The result is “false” if no equal row is found (including the case where the subquery returns no
rows).
3. If the left-hand expression yields null, or if there are no equal right-hand values and at least one
right-hand row yields null, the result of the IN construct will be null, not false.
Example: PostgreSQL Subquery, IN operator
The following query selects those employees who work in the location 1800. The subquery
finds the department id in the 1800 location, and then the main query selects the employees
who work in any of these departments.
Code:
SELECT first_name, last_name,department_id
FROM employees
WHERE department_id IN
(SELECT DEPARTMENT_ID FROM departments
WHERE location_id=1800);
Sample Output:
first_name | last_name | department_id
-----------+-----------+---------------
Steven     | King      | 9
Pat        | Fay       | 2
William    | Gietz     | 11
Examples
Assume that, given the data above, you want to return the average total for all students. In other
words, the average of Chun's 148 (75+73), Esben's 74 (43+31), etc.
SELECT AVG(sq_sum) FROM (SELECT SUM(score) AS sq_sum FROM student GROUP BY name) AS t;
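To try the query, a student table is needed. The sketch below assumes a table with name and score columns and loads only the two students mentioned above, so the column types and the reduced data set are assumptions:

```sql
CREATE TABLE student (
    name  VARCHAR(50),
    score INT
);

INSERT INTO student (name, score)
VALUES ('Chun', 75), ('Chun', 73), ('Esben', 43), ('Esben', 31);

-- Per-student totals are 148 and 74, so the average total is 111
SELECT AVG(sq_sum)
FROM (SELECT SUM(score) AS sq_sum FROM student GROUP BY name) AS t;
```

The subquery in the FROM clause must be given an alias (here t) so that the outer query can refer to it.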
Practice exercises:
Dane County Airport officials decided that all the information related to the airline flights
should be organized using a DBMS, and you are hired to design the database. Your first task
is to organize the information about all the airplanes stationed and maintained at the airport.
The following relations keep track of airline flight information:
Flights (flno, from_loc, to_loc, distance, price) //details about all the flights
Aircraft (aid, aname, cruisingrange) //details of the aircraft including the registration
number assigned, model of the craft and the maximum capacity of the craft in terms of
distance it can travel.
Certified (eid, aid) //Certification details of the pilots for specific crafts.
Employees (eid, ename, salary) //details of employees
Part A:
1. Display each employee id along with the name of the aircraft for which the employee is certified.
2. Modify the above to display employee names too.
3. Display the name of the employee who is certified for the aircraft with the highest cruising
range.
4. Modify the above to show that employee's salary too.
5. Display the name of an aircraft suitable for the flight from LA (based on the flight distance
and the cruising range of the aircraft).
Solution:
--1
Select result2.EID, ANAME from Aircraft,
(Select Employees.EID, result1.AID from Employees, (Select EID, AID from Certified) result1
where Employees.EID = result1.EID) result2
where Aircraft.AID = result2.AID;
--2
Select result2.EID, result2.Ename, ANAME from Aircraft,
(Select Employees.EID, Employees.Ename, result1.AID from Employees, (Select EID, AID from
Certified) result1 where Employees.EID = result1.EID) result2
where Aircraft.AID = result2.AID;
--3 (PostgreSQL has no ROWNUM; ORDER BY ... LIMIT 1 keeps only the top row)
SELECT * FROM
(Select Ename, result3.cruisingrange from (
Select result2.EID, result2.Ename, ANAME, Cruisingrange from Aircraft,
(Select Employees.EID, Employees.Ename, result1.AID from Employees, (Select EID, AID from
Certified) result1 where Employees.EID = result1.EID) result2
where Aircraft.AID = result2.AID
) result3 ORDER BY result3.cruisingrange DESC LIMIT 1
) result4;
--4
SELECT * FROM
(Select Ename, result3.salary from (
Select result2.EID, result2.salary, result2.Ename, ANAME, Cruisingrange from Aircraft,
(Select Employees.EID, Employees.salary, Employees.Ename, result1.AID from Employees, (Select
EID, AID from Certified) result1 where Employees.EID = result1.EID) result2
where Aircraft.AID = result2.AID
) result3 ORDER BY result3.cruisingrange DESC LIMIT 1
) result4;
--5 (assumes exactly one flight departs from LA, so the subquery returns a single row)
Select ANAME from Aircraft where Cruisingrange >=
(Select Distance from Flights where from_loc='LA');
CS232L – Database Management System
LAB#06
Postgres Joins
Objective
The objective of this session is to cover:
1. An introduction to joins
2. Types of joins
3. Examples
Instructions
• Open the handout/lab manual in front of a computer in the lab during the session.
• Practice each new command by completing the examples and exercise.
• Turn-in the answers for all the exercise problems as your lab report.
• When answering problems, indicate the commands you entered and the output displayed.
CS232-L – LAB 6
POSTGRES JOINS
A PostgreSQL join is used to combine rows from one table (self-join) or from two or more tables based on a
common field (column) between them. These common fields are generally the primary key of the first
table and a foreign key of the other tables.
There are 4 basic types of joins supported by PostgreSQL, namely:
1. Inner Join
2. Left Join
3. Right Join
4. Full Outer Join
Inner Join
Syntax:
SELECT columns
FROM table1
INNER JOIN table2
ON table1.column = table2.column;
EXAMPLE:
Sample table: Customer Sample table: Item Sample table: Invoice
SELECT *
FROM invoice
INNER JOIN item
ON invoice.item_no=item.item_no;
Output:
SELECT *
FROM invoice
INNER JOIN item
ON invoice.item_no=item.item_no
WHERE
item.rate>=10;
Output:
Outer Join
LEFT OUTER JOIN or LEFT JOIN
When joining tables with a left outer join, PostgreSQL first performs a normal join and then scans
the left table. A left join retrieves all rows from the left table and all matching rows from the
right table. For a left-table row with no match, the columns of the right table are filled with NULL values.
Syntax:
Select *
FROM table1
LEFT [ OUTER ] JOIN table2
ON table1.column_name=table2.column_name
The PostgreSQL LEFT OUTER JOIN returns all records from table1 and only those records from table2 that intersect
with table1.
Example:
Code:
SELECT item.item_no,item_descrip,
invoice.invoice_no,invoice.sold_qty
FROM item
LEFT JOIN invoice
ON item.item_no=invoice.item_no;
OR
Code:
SELECT item.item_no,item_descrip,
invoice.invoice_no,invoice.sold_qty
FROM item
LEFT OUTER JOIN invoice
ON item.item_no=invoice.item_no;
Output:
Explanation :
In the above example, item_no I8 of the item table does not exist in the invoice table, so for this row of the
item table the invoice columns in the result are set to NULL.
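RIGHT OUTER JOIN or RIGHT JOIN
When joining tables with a right outer join, PostgreSQL first performs a normal join and then scans
the right table. A right join retrieves all rows from the right table and all matching rows from the
left table. For a right-table row with no match, the columns of the left table are filled with NULL values.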
Syntax:
Select *
FROM table1
RIGHT [ OUTER ] JOIN table2
ON table1.column_name=table2.column_name;
In this visual diagram, the PostgreSQL RIGHT OUTER JOIN returns the shaded area:
The PostgreSQL RIGHT OUTER JOIN returns all records from table2 and only those records from table1 that intersect with table2.
Example:
Code:
SELECT invoice.invoice_no,invoice.sold_qty,
item.item_no,item_descrip
FROM invoice
RIGHT JOIN item
ON item.item_no=invoice.item_no;
OR
Code:
SELECT invoice.invoice_no,invoice.sold_qty,
item.item_no,item_descrip
FROM invoice
RIGHT OUTER JOIN item
ON item.item_no=invoice.item_no;
Output:
In the above example, item_no I8 of the item table does not exist in the invoice table, so for this row of the
item table the invoice columns in the result are set to NULL.
FULL OUTER JOIN
A PostgreSQL full outer join returns all rows from the left table as well as the right table, putting NULL in
the columns of the other table wherever the join condition is not satisfied. PostgreSQL first performs an
inner join and then adds the unmatched rows from both sides. A full outer join is thus the combination of a
left join and a right join.
Syntax:
SELECT * | column_name(s)
FROM table_name1
FULL [OUTER] JOIN table_name2
ON table_name1.column_name=table_name2.column_name
The PostgreSQL FULL OUTER JOIN returns all records from both table1 and table2.
Example:
Code:
SELECT item.item_no,item_descrip,
invoice.invoice_no,invoice.sold_qty
FROM invoice
FULL JOIN item
ON invoice.item_no=item.item_no
ORDER BY item_no;
OR
Code:
SELECT item.item_no,item_descrip,
invoice.invoice_no,invoice.sold_qty
FROM invoice
FULL OUTER JOIN item
ON invoice.item_no=item.item_no
ORDER BY item_no;
Output:
Explanation
In the above example, the matching rows from both the item and invoice tables appear, together with
the unmatched row, i.e. I8 of the item table, which does not exist in the invoice table; for this row
the invoice columns are set to NULL.
CROSS JOIN
A cross join creates a Cartesian product between two sets of data. This type of join does not maintain any
relationship between the sets; instead it returns a result whose row count is the number of rows in the first
table multiplied by the number of rows in the second table. It is called a product because it returns every
possible combination of rows from the two tables.
Syntax:
SELECT [* | column_list]
FROM table1
CROSS JOIN table2;
OR
SELECT [* | column_list]
FROM table1,table2;
EXAMPLE:
Code:
SELECT customer.cust_no, customer.cust_name,
invoice.invoice_no,invoice.cust_no,invoice.item_no,
invoice.sold_qty,invoice.disc_per
FROM customer,invoice;
Output:
Explanation
In the above example, the 'customer' table and the 'invoice' table are joined to return a
Cartesian product. Here the 2 rows of the 'customer' table are combined with the 4
rows of the 'invoice' table to make a product of 2 × 4 = 8 rows.
ANTI-JOINS
Even after learning the principles of inner, outer, left, and right joins, you might still have a nagging question:
how do I find values from one table that are NOT present in another table?
That's where anti joins come in. They can be helpful in a variety of business situations when you're trying to
find something that hasn't happened.
An anti-join returns rows from the first table for which no matching records are found in the second table. It is
the opposite of a semi-join, which returns rows from the first table only when there is a match in the second table.
• A left anti join : This join returns rows in the left table that have no matching rows in the right table.
• A right anti join : This join returns rows in the right table that have no matching rows in the left table.
Syntax:
An anti join doesn't have its own syntax, meaning one actually performs an anti join using a combination of
other SQL constructs. To find all the values from Table_1 that are not in Table_2, you'll need to use a
combination of LEFT JOIN and WHERE.
SELECT * FROM Table_1 t1 LEFT JOIN Table_2 t2 ON t1.id = t2.id WHERE t2.id IS NULL;
The query above returns only the rows that are in Table_1 but not in Table_2.
Example:
PostgreSQL optimizes both LEFT JOIN / IS NULL and NOT EXISTS in the same way.
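A sketch of the NOT EXISTS form, using the placeholder tables Table_1 and Table_2 from the syntax above. It selects the same Table_1 rows as the LEFT JOIN / IS NULL form:

```sql
SELECT t1.*
FROM Table_1 t1
WHERE NOT EXISTS (
    SELECT 1
    FROM Table_2 t2
    WHERE t2.id = t1.id   -- a match here excludes the Table_1 row
);
```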
Semi-join:
A semi-join returns each row of the first table at most once, no matter how many matching records the second
table contains. That is, if the second table has multiple matching entries for a record of the first table, only
one copy of that record is returned. A normal join, in contrast, returns one copy for every match.
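PostgreSQL has no SEMI JOIN keyword; a semi-join is usually written with EXISTS or IN. A minimal sketch, assuming departments and employees tables that share a department_id column (the department_name column is an assumption):

```sql
-- Departments that have at least one employee; each department appears once
SELECT departments.department_id, departments.department_name
FROM departments
WHERE EXISTS (
    SELECT 1
    FROM employees
    WHERE employees.department_id = departments.department_id
)
ORDER BY departments.department_id;
```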
EXAMPLE:
ORDER BY departments.department_id;
Output:
Semi-joins and anti-joins are variants of the physical join methods (hash, merge, and nested loop) that
the optimizer may choose for the EXISTS/IN and NOT EXISTS/NOT IN operators.
Difference between anti-join and semi-join
While a semi-join returns one copy of each row in the first table for which at least one match is found, an anti-
join returns one copy of each row in the first table for which no match is found.
EQUI-JOINS:
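An equi join returns the rows whose column values match in the associated tables. It uses the equality
operator (=) as the comparison operator in the join condition.
An equi join can also be performed by using the JOIN keyword followed by the ON keyword and then specifying
the names of the columns, along with their associated tables, to check for equality.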
Syntax:
SELECT *
FROM table1
JOIN table2
[ON (join_condition)]
What is the difference between Equi Join and Inner Join in SQL?
An equijoin is a join with a join condition containing an equality operator. An equijoin returns only the rows
that have equivalent values for the specified columns.
An inner join is a join of two or more tables that returns only those rows (compared using a comparison
operator) that satisfy the join condition.
SELF JOIN
Self Join is a specific type of Join. In Self Join, a table is joined with itself.
Syntax:
SELECT a.name, b.age, a.SALARY
FROM CUSTOMERS a, CUSTOMERS b
WHERE a.SALARY < b.SALARY;
Output:
Natural Join:
A natural join is a join that creates an implicit join based on the same column names in the joined
tables.
SELECT select_list
FROM T1
NATURAL [INNER, LEFT, RIGHT] JOIN T2;
A natural join can be an inner join, left join, or right join. If you do not specify a join explicitly
e.g., INNER JOIN, LEFT JOIN, RIGHT JOIN, PostgreSQL will use the INNER JOIN by default.
CS232L – Database Management System
LAB 07
PL/PG SQL BASICS – BLOCK
PROCESSING, IF ELSE, CASE
WHEN, LOOPS
Instructions
• Open the handout/lab manual in front of a computer in the lab during the session.
• Practice each new command by completing the examples and exercise.
• Turn-in the answers for all the exercise problems as your lab report.
• When answering problems, indicate the commands you entered, and the output displayed.
• Try to practice and revise all the concepts covered in all previous sessions before coming
to the lab to avoid unnecessary ambiguities.
Overview of PostgreSQL PL/pgSQL
SQL is a query language that allows you to query data from the database easily. However,
PostgreSQL can only execute SQL statements individually.
This means that if you have multiple statements, you need to execute them one by one like this:
• First, send a query to the PostgreSQL database server.
• Next, wait for it to process.
• Then, process the result set.
• After that, do some calculations.
• Finally, send another query to the PostgreSQL database server and repeat this process.
This process incurs interprocess communication and network overhead.
To resolve this issue, PostgreSQL uses PL/pgSQL.
PL/pgSQL wraps multiple statements in an object and stores it on the PostgreSQL database
server.
So instead of sending multiple statements to the server one by one, you can send one
statement to execute the object stored in the server. This allows you to:
• Reduce the number of round trips between the application and the PostgreSQL
database server.
• Avoid transferring the intermediate results between the application and the server.
PL/pgSQL Blocks
[ <<label>> ]
[ declare
declarations ]
begin
statements;
...
end [ label ];
Let’s examine the block structure in more detail:
• Each block has two sections: declaration and body. The declaration section is optional
while the body section is required. A block is ended with a semicolon (;) after
the END keyword.
• A block may have an optional label located at the beginning and at the end. You use
the block label when you want to specify it in the EXIT statement of the block body or
when you want to qualify the names of variables declared in the block.
• The declaration section is where you declare all variables used within the body
section. Each statement in the declaration section is terminated with a semicolon (;).
• The body section is where you place the code. Each statement in the body section is
also terminated with a semicolon (;).
The following example illustrates a very simple block. It is called an anonymous block.
do $$
<<first_block>>
declare
film_count integer := 0;
begin
-- get the number of films
select count(*)
into film_count
from film;
-- display a message
raise notice 'The number of films is %', film_count;
end first_block $$;
To execute a block from pgAdmin, you click the Execute button as shown in the following
picture:
Notice that the DO statement does not belong to the block. It is used to execute an anonymous
block. PostgreSQL introduced the DO statement in version 9.0.
The body of the block is passed to DO as a string constant. We used the dollar-quoted string
constant syntax ($$) rather than single quotes to make it more readable.
In the declaration section, we declared a variable film_count and set its value to zero.
film_count integer := 0;
Inside the body section, we used a select into statement with the count() function to get the
number of films from the film table and assign the result to the film_count variable.
select count(*)
into film_count
from film;
The if statement has three forms:
• if then
• if then else
• if then elsif
The following illustrates the simplest form of the if statement:
if condition then
statements;
end if;
The if statement executes statements if a condition is true. If the condition evaluates to false,
control is passed to the next statement after the end if part. The condition is a boolean
expression that evaluates to true or false. The statements can be one or more statements that
will be executed if the condition is true. It can be any valid statement, even
another if statement. When an if statement is placed inside another if statement, it is called a
nested-if statement.
do $$
declare
selected_film film%rowtype;
input_film_id film.film_id%type := 0;
begin
found is a global variable that is available in the PL/pgSQL procedural language. The select
into statement sets found to true if a row is assigned, or to false if no row is returned.
We used the if statement to check whether the film with id 0 exists and to raise a notice if it does not.
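A complete block matching this description could look like the following sketch. The wording of the notice is an assumption:

```sql
do $$
declare
    selected_film film%rowtype;
    input_film_id film.film_id%type := 0;
begin
    select * into selected_film
    from film
    where film_id = input_film_id;

    -- found is false when the select into returned no row
    if not found then
        raise notice 'The film % could not be found', input_film_id;
    end if;
end $$;
```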
if condition then
statements;
else
alternative-statements;
end if;
The if then else statement executes the statements in the if branch if the condition evaluates to
true; otherwise, it executes the statements in the else branch.
do $$
declare
selected_film film%rowtype;
input_film_id film.film_id%type := 100;
begin
In this example, the film id 100 exists in the film table so that the FOUND variable was set to
true. Therefore, the statement in the else branch executed.
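A sketch of a complete if then else block that behaves as described. The notice texts are assumptions, and the title column is the one used in the film queries of Lab 05:

```sql
do $$
declare
    selected_film film%rowtype;
    input_film_id film.film_id%type := 100;
begin
    select * into selected_film
    from film
    where film_id = input_film_id;

    if not found then
        raise notice 'The film % could not be found', input_film_id;
    else
        -- reached when the select into assigned a row
        raise notice 'The film title is %', selected_film.title;
    end if;
end $$;
```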
if condition_1 then
statement_1;
elsif condition_2 then
statement_2;
...
elsif condition_n then
statement_n;
else
else-statement;
end if;
The if then and if then else statements evaluate one condition. However, the if then elsif statement
evaluates multiple conditions.
For example, if condition_1 is true, the if then elsif statement executes statement_1 and
stops evaluating the other conditions.
If all conditions evaluate to false, the if then elsif statement executes the statements in the else branch.
do $$
declare
v_film film%rowtype;
len_description varchar(100);
begin
How it works:
• First, select the film with id 100. If the film does not exist, raise a notice that the film
is not found.
• Second, use the if then elsif statement to assign the film a description based on the
length of the film.
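A sketch of a block that follows these two steps. It assumes the film table has length and title columns; the length thresholds and the description labels are illustrative assumptions:

```sql
do $$
declare
    v_film film%rowtype;
    len_description varchar(100);
begin
    select * into v_film
    from film
    where film_id = 100;

    if not found then
        raise notice 'Film not found';
    else
        -- thresholds (in minutes) are illustrative assumptions
        if v_film.length > 0 and v_film.length <= 50 then
            len_description := 'Short';
        elsif v_film.length > 50 and v_film.length < 120 then
            len_description := 'Medium';
        elsif v_film.length >= 120 then
            len_description := 'Long';
        else
            len_description := 'N/A';
        end if;

        raise notice 'The % film is %.', v_film.title, len_description;
    end if;
end $$;
```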
In the simple case statement, the search-expression is an expression that evaluates to a result.
The case statement compares the result of the search-expression with the expression in each when
branch using the equal operator (=) from top to bottom. If the case statement finds a match, it
executes the corresponding when section and stops comparing the result of the search-expression
with the remaining expressions. If the case statement cannot find any match, it executes
the else section.
The else section is optional. If the result of the search-expression does not match any
expression in the when sections and the else section does not exist, the case statement will
raise a case_not_found exception. The following is an example of the simple case statement.
do $$
declare
rate film.rental_rate%type;
price_segment varchar(50);
begin
-- get the rental rate
select rental_rate into rate
from film
where film_id = 100;
Output:
NOTICE: High End
This example first selects the film with id 100. Based on the rental rate, it assigns a price
segment to the film that can be mass, mainstream, or high end. In case the price is not 0.99,
2.99 or 4.99, the case statement assigns the film the price segment as unspecified.
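A sketch of a complete block that matches this description. The mapping of each rate to a segment and the label spellings other than High End are assumptions:

```sql
do $$
declare
    rate film.rental_rate%type;
    price_segment varchar(50);
begin
    -- get the rental rate
    select rental_rate into rate
    from film
    where film_id = 100;

    -- assign the price segment with a simple case statement
    if found then
        case rate
            when 0.99 then
                price_segment := 'Mass';
            when 2.99 then
                price_segment := 'Mainstream';
            when 4.99 then
                price_segment := 'High End';
            else
                price_segment := 'Unspecified';
        end case;
        raise notice '%', price_segment;
    end if;
end $$;
```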
In the searched case statement, the boolean expressions of the when branches are evaluated
sequentially from top to bottom until one evaluates to true.
Once it finds an expression that evaluates to true, the case statement executes the
corresponding when section and immediately stops searching the remaining expressions.
If no expression evaluates to true, the case statement executes the else section.
The else section is optional. If you omit the else section and no expression evaluates
to true, the case statement raises the case_not_found exception.
The following example illustrates how to use a searched case statement:
do $$
declare
total_payment numeric;
service_level varchar(25) ;
begin
select sum(amount) into total_payment
from Payment
where customer_id = 100;
-- sum() always returns one row, so test for null rather than found
if total_payment is not null then
case
when total_payment > 200 then
service_level = 'Platinum' ;
when total_payment > 100 then
service_level = 'Gold' ;
else
service_level = 'Silver' ;
end case;
raise notice 'Service Level: %', service_level;
else
raise notice 'Customer not found';
end if;
end; $$
How it works:
First, select the total payment paid by the customer id 100 from the payment table.
Then, assign the service level to the customer based on the total payment.
The while loop statement executes a block of code until a condition evaluates to false.
[ <<label>> ]
while condition loop
statements;
end loop;
In this syntax, PostgreSQL evaluates the condition before executing the statements. If the
condition is true, it executes the statements. After each iteration, the while loop evaluates
the condition again. Inside the body of the while loop, you need to change the values of some
variables to make the condition false or null at some point. Otherwise, you will have an
infinite loop. Because the while loop tests the condition before executing the statements,
the while loop is sometimes referred to as a pretest loop.
The following example uses the while loop statement to display the value of a counter:
do $$
declare
counter integer := 0;
begin
while counter < 5 loop
raise notice 'Counter %', counter;
counter := counter + 1;
end loop;
end$$;
Output:
NOTICE: Counter 0
NOTICE: Counter 1
NOTICE: Counter 2
NOTICE: Counter 3
NOTICE: Counter 4
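How it works:
• First, the counter variable is declared and initialized to 0.
• Then, the while loop displays the current value of counter as long as it is less than 5,
increasing it by one in each iteration.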
The loop defines an unconditional loop that executes a block of code repeatedly until
terminated by an exit or return statement.
<<label>>
loop
statements;
end loop;
Typically, you use an if statement inside the loop to terminate it based on a condition like this:
<<label>>
loop
statements;
if condition then
exit;
end if;
end loop;
It’s possible to place a loop statement inside another loop statement. When a loop statement is
placed inside another loop statement, it is called a nested loop:
<<outer>>
loop
statements;
<<inner>>
loop
/* ... */
exit inner;
end loop;
end loop;
When you have nested loops, you need to use the loop label so that you can specify it in
the exit and continue statement to indicate which loop these statements refer to.
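A runnable sketch of labeled nested loops. The label names outer_loop and inner_loop and the loop bounds are chosen only for illustration:

```sql
do $$
declare
    i integer := 0;
    j integer;
begin
    <<outer_loop>>
    loop
        i := i + 1;
        exit outer_loop when i > 3;          -- leave both loops
        j := 0;
        <<inner_loop>>
        loop
            j := j + 1;
            continue outer_loop when j > 2;  -- skip to the next outer iteration
            raise notice 'i = %, j = %', i, j;
        end loop inner_loop;
    end loop outer_loop;
end $$;
```

The block raises a notice for j = 1 and j = 2 for each value of i from 1 to 3. Without the labels, exit and continue would apply to the innermost loop only.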
The following example shows how to use the loop statement to calculate the nth Fibonacci
number.
do $$
declare
n integer:= 10;
fib integer := 0;
counter integer := 0 ;
i integer := 0 ;
j integer := 1 ;
begin
if (n < 1) then
fib := 0 ;
end if;
loop
exit when counter = n ;
counter := counter + 1 ;
select j, i + j into i, j ;
end loop;
fib := i;
raise notice '%', fib;
end; $$
Output:
NOTICE: 55
By definition, Fibonacci numbers are a sequence of integers starting with 0 and 1, and
each subsequent number is the sum of the two previous numbers, for example, 0, 1, 1 (0+1),
2 (1+1), 3 (2+1), 5 (3+2), 8 (5+3), …
In the declaration section, the counter variable is initialized to zero (0). The loop is
terminated when counter equals n. The following select statement advances the pair of
variables i and j in one step, assigning j to i and i + j to j:
SELECT j, i + j INTO i, j ;
CS232L – Database Management System
LAB#08
PL/pgSQL — SQL Procedural Language
CURSORS, FUNCTIONS,
PROCEDURES
Objective
The objective of this session is to cover:
1. Cursors
2. Functions
3. Procedures
4. Examples
Instructions
• Open the handout/lab manual in front of a computer in the lab during the session.
• Practice each new command by completing the examples and exercise.
• Turn-in the answers for all the exercise problems as your lab report.
• When answering problems, indicate the commands you entered and the output displayed.
CURSORS
WHAT IS CURSOR?
Rather than executing a whole query at once, it is possible to set up a cursor that encapsulates the
query, and process each individual row at a time.
1. Declaring cursors
To access a cursor, you need to declare a cursor variable in the declaration section of a
block. PostgreSQL provides a special type called REFCURSOR for declaring a cursor variable.
declare my_cursor refcursor;
You can also declare a cursor that is bound to a query. In that declaration, you first specify a
variable name for the cursor.
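The general form of a bound cursor declaration, and one concrete instance, can be sketched as follows. The names cur_demo, min_val, and v are hypothetical, and generate_series stands in for a real table so that the block runs without the sample database:

```sql
-- General form (square brackets mark optional parts):
--   cursor_name [ [no] scroll ] cursor [ (name datatype, ...) ] for query;
do $$
declare
    -- bound, scrollable cursor with one parameter
    cur_demo scroll cursor (min_val integer) for
        select g
        from generate_series(1, 5) as g
        where g >= min_val
        order by g;
    v integer;
begin
    open cur_demo(3);                      -- the rows are 3, 4, 5
    fetch cur_demo into v;
    raise notice 'first row: %', v;
    fetch last from cur_demo into v;       -- allowed because of SCROLL
    raise notice 'last row: %', v;
    close cur_demo;
end $$;
```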
Next, you specify whether the cursor can be scrolled backward using the SCROLL keyword. If you use
NO SCROLL, the cursor cannot be scrolled backward.
Then, you put the CURSOR keyword followed by a list of comma-separated arguments ( name datatype)
that defines parameters for the query. These arguments will be substituted by values when the cursor
is opened.
After that, you specify a query following the FOR keyword. You can use any valid SELECT
statement here.
declare
cur_films cursor for
select *
from film;
cur_films2 cursor (year integer) for
select *
from film
where release_year = year;
The cur_films cursor encapsulates all rows in the film table.
The cur_films2 cursor encapsulates the films with a particular release year in the film table.
2. Opening cursors
Cursors must be opened before they can be used to query rows. PostgreSQL provides separate syntax for
opening unbound and bound cursors.
https://www.postgresqltutorial.com/postgresql-plpgsql/plpgsql-cursor/
Because a bound cursor is already bound to a query when it is declared, opening it only requires
passing the arguments to the query, if there are any.
In the following example, we open bound cursors cur_films and cur_films2 that we declared above:
open cur_films;
open cur_films2(year:=2005);
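For an unbound cursor, the query is supplied at the time the cursor is opened. A minimal sketch, with generate_series standing in for a real table and my_cursor and v as illustrative names:

```sql
do $$
declare
    my_cursor refcursor;   -- unbound: no query is attached yet
    v integer;
begin
    -- the query is given in the open statement itself
    open my_cursor for
        select g
        from generate_series(10, 12) as g
        order by g;
    fetch my_cursor into v;
    raise notice 'first row: %', v;
    close my_cursor;
end $$;
```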
3. Fetching rows from cursors
The FETCH statement gets the next row from the cursor and assigns it to a target_variable, which could be a
record, a row variable, or a comma-separated list of variables. If no more rows are found,
the target_variable is set to NULL(s).
By default, a cursor gets the next row if you don’t specify the direction explicitly. The following
directions are valid for a cursor:
• NEXT
• LAST
• PRIOR
• FIRST
• ABSOLUTE count
• RELATIVE count
• FORWARD
• BACKWARD
Note that directions that move the cursor backward, such as PRIOR and BACKWARD, are likely to fail unless
the cursor was declared with the SCROLL option.
4. Closing cursors
close cursor_variable;
The CLOSE statement releases resources or frees up cursor variable to allow it to be opened
again using OPEN statement.
Example:
https://pgdocptbr.sourceforge.io/pg82/plpgsql-cursors.html
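The four steps (declare, open, fetch, close) can be combined as in the sketch below. It assumes the film table of the sample database has title and release_year columns; the parameter name p_year and the year 2006 are illustrative choices:

```sql
do $$
declare
    rec record;
    cur_films cursor (p_year integer) for
        select title, release_year
        from film
        where release_year = p_year;
begin
    open cur_films(2006);
    loop
        fetch cur_films into rec;
        -- FOUND becomes false once fetch returns no row
        exit when not found;
        raise notice '% (%)', rec.title, rec.release_year;
    end loop;
    close cur_films;
end $$;
```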
FUNCTIONS:
The create function statement allows you to define a new user-defined function.
create [or replace] function function_name(param_list)
returns return_type
language plpgsql
as
$$
declare
    -- variable declaration
begin
    -- logic
end;
$$
In this syntax:
• First, specify the name of the function after the create function keywords. If you want to replace the
existing function, you can use the or replace keywords.
• Then, specify the function parameter list surrounded by parentheses after the function name. A function
can have zero or many parameters.
• Next, specify the datatype of the returned value after the returns keyword.
• After that, use the language plpgsql to specify the procedural language of the function. Note that
PostgreSQL supports many procedural languages, not just plpgsql.
• Finally, place a block in the dollar-quoted string constant.
The following statement creates a function that counts the films whose length is between
the len_from and len_to parameters:
create function get_film_count(len_from int, len_to int)
returns int
language plpgsql
as
$$
declare
    film_count integer;
begin
    select count(*)
    into film_count
    from film
    where length between len_from and len_to;
    return film_count;
end;
$$;
The function get_film_count has two main sections: header and body.
• First, the name of the function is get_film_count that follows the create function keywords.
• Second, the get_film_count() function accepts two parameters len_from and len_to with the integer
datatype.
• Third, the get_film_count function returns an integer specified by the returns int clause.
• Finally, the language of the function is plpgsql indicated by the language plpgsql.
• Use the dollar-quoted string constant syntax that starts with $$ and ends with $$. Between these $$, you
can place a block that contains the declaration and logic of the function.
• In the declaration section, declare a variable called film_count that stores the number of films selected
from the film table.
• In the body of the block, use the select into statement to select the number of films whose length is
between len_from and len_to and assign the result to the film_count variable. At the end of the block, use
the return statement to return the film_count.
Finally, you can find the function get_film_count in the Functions list.
To call a function using the positional notation, you need to specify the arguments in the same order
as parameters. For example:
select get_film_count(40,90);
Output:
get_film_count
----------------
325
(1 row)
In this example, the arguments of get_film_count() are 40 and 90, which correspond to
the len_from and len_to parameters.
You call a function using the positional notation when the function has few parameters.
If the function has many parameters, you should call it using the named notation since it makes the
function call more obvious.
The following shows how to call the get_film_count function using the named notation:
select get_film_count(
len_from => 40,
len_to => 90
);
Output:
get_film_count
----------------
325
(1 row)
In the named notation, you use the => to separate the argument’s name and its value.
For backward compatibility, PostgreSQL supports the older syntax based on := as follows:
select get_film_count(
len_from := 40,
len_to := 90
);
The mixed notation is the combination of positional and named notations: the positional arguments come
first and the named arguments follow them.
Note that you cannot use named arguments before positional arguments; PostgreSQL rejects such a call
with an error.
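Both cases can be sketched as follows, assuming the get_film_count function defined earlier in this lab already exists. The invalid call is left commented out so that the script runs:

```sql
-- Mixed notation: positional argument first, named argument after it.
select get_film_count(40, len_to => 90);

-- Not allowed: a named argument placed before a positional one.
-- PostgreSQL rejects the call below with an error.
-- select get_film_count(len_from => 40, 90);
```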
PROCEDURES
So far, you have learned how to define user-defined functions using the create function statement.
A drawback of user-defined functions is that they cannot execute transactions. In other words, inside
a user-defined function, you cannot start a transaction, commit it, or roll it back.
Stored procedures address this limitation: the body of a stored procedure can commit or roll back
transactions. A stored procedure also lets you define a piece of database logic once and reuse it
from any application that needs it, instead of re-implementing that logic for every new use case.
PostgreSQL stored procedures support procedural operations such as variables, conditionals, and
loops, which is helpful when building database applications.
SYNTAX:
To define a new stored procedure, you use the create procedure statement.
The following illustrates the basic syntax of the create procedure statement:
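The general form is sketched in the comment below, followed by a minimal concrete instance. The names say_hello and p_name are hypothetical:

```sql
-- General form (square brackets mark optional parts):
--   create [or replace] procedure procedure_name(parameter_list)
--   language plpgsql
--   as $$
--   declare
--       -- variable declarations
--   begin
--       -- procedure body
--   end; $$;

create or replace procedure say_hello(p_name text)
language plpgsql
as $$
begin
    raise notice 'Hello, %', p_name;
end; $$;

call say_hello('lab');
```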
In this syntax:
• First, specify the name of the stored procedure after the create procedure keywords.
• Second, define parameters for the stored procedure. A stored procedure can accept zero or
more parameters.
• Third, specify plpgsql as the procedural language for the stored procedure. Note that you can
use other procedural languages for the stored procedure such as SQL, C, etc.
• Finally, use the dollar-quoted string constant syntax to define the body of the stored
procedure.
Parameters in stored procedures can have the in and inout modes. Before PostgreSQL 14, they cannot have
the out mode.
A stored procedure does not return a value. You cannot use the return statement with a value
(return expression;) inside a stored procedure.
However, you can use the return statement without an expression (return;) to stop the stored procedure
immediately.
If you want to return a value from a stored procedure, you can use parameters with the inout mode.
EXAMPLES:
The following statement shows the data from the accounts table:
The following example creates a stored procedure named transfer that transfers a specified amount of
money from one account to another.
commit;
end;$$
To call a stored procedure, you use the call statement as follows:
call stored_procedure_name(argument_list);
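Putting the pieces together, a complete version of the transfer example might look like the sketch below. The accounts table definition, its column names, and the sample rows are assumptions made for illustration:

```sql
-- Hypothetical accounts table; the column names are assumptions.
create table if not exists accounts (
    id      int generated by default as identity primary key,
    name    varchar(100) not null,
    balance numeric(15, 2) not null
);

insert into accounts (name, balance)
values ('Bob', 10000), ('Alice', 10000);

create or replace procedure transfer(sender int, receiver int, amount numeric)
language plpgsql
as $$
begin
    -- subtract the amount from the sender's account
    update accounts set balance = balance - amount where id = sender;
    -- add the amount to the receiver's account
    update accounts set balance = balance + amount where id = receiver;
    commit;
end; $$;

-- run outside an explicit transaction block so that commit is allowed
call transfer(1, 2, 1000);

select * from accounts order by id;
```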
CS232L – Database Management System
LAB 09
TRIGGERS, VIEWS, INDEXES
Objective
The objective of this session is to learn about PostgreSQL triggers, views, and indexes: what
they are and how they are used, illustrated with PostgreSQL examples.
Instructions
• Open the handout/lab manual in front of a computer in the lab during the session.
• Practice each new command by completing the examples and exercises.
• Turn in the answers for all the exercise problems as your lab report.
• When answering problems, indicate the commands you entered, and the output displayed.
• Try to practice and revise all the concepts covered in all previous sessions before coming
to the lab to avoid unnecessary ambiguities.
9.1 PostgreSQL CREATE TRIGGER
A trigger function receives data about its calling environment through a special structure
called TriggerData, which contains a set of local variables. For example, OLD and NEW
represent the states of the row in the table before and after the triggering event. Once you
define a trigger function, you can bind it to one or more trigger events such
as INSERT, UPDATE, and DELETE.
The CREATE TRIGGER statement creates a new trigger. The following illustrates the basic
syntax of the CREATE TRIGGER statement:
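The general form is sketched in the comment below, followed by a small runnable instance. The table trigger_demo, the function trigger_demo_notice, and the trigger name are hypothetical:

```sql
-- General form (braces mark alternatives, square brackets optional parts):
--   create trigger trigger_name
--       {before | after} {event}
--       on table_name
--       [for [each] {row | statement}]
--       execute procedure trigger_function();

create table if not exists trigger_demo (id int, note text);

create or replace function trigger_demo_notice()
returns trigger
language plpgsql
as $$
begin
    raise notice 'row % is being inserted', new.id;
    return new;
end; $$;

drop trigger if exists trigger_demo_before_insert on trigger_demo;
create trigger trigger_demo_before_insert
    before insert
    on trigger_demo
    for each row
    execute procedure trigger_demo_notice();

insert into trigger_demo values (1, 'first');
```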
In this syntax:
• First, specify the name of the trigger after the TRIGGER keyword.
• Second, specify the timing that causes the trigger to fire. It can be BEFORE or AFTER an event occurs.
• Third, specify the event that invokes the trigger. The event can be INSERT, DELETE,
UPDATE, or TRUNCATE.
• Fourth, specify the name of the table associated with the trigger after the ON keyword.
• Fifth, specify the type of trigger, which can be a row-level trigger (FOR EACH ROW) or a
statement-level trigger (FOR EACH STATEMENT). If a DELETE statement deletes 100 rows, a row-level
trigger fires 100 times, once for each deleted row. A statement-level trigger, on the other hand,
fires once regardless of how many rows are deleted.
• Finally, specify the name of the trigger function after the EXECUTE PROCEDURE keywords
(PostgreSQL 11 and later also accept EXECUTE FUNCTION).
Suppose that when the name of an employee changes, you want to log the changes in a
separate table called employee_audits :
RETURN NEW;
END;
$$
The function inserts the old last name into the employee_audits table including employee id, last
name, and the time of change if the last name of an employee changes.
The OLD represents the row before update while the NEW represents the new row that will be
updated. The OLD.last_name returns the last name before the update and the NEW.last_name
returns the new last name.
Second, bind the trigger function to the employees table. The trigger name is last_name_changes.
Before the value of the last_name column is updated, the trigger function is automatically
invoked to log the changes.
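The example described above can be sketched end to end as follows. The table definitions, the function name log_last_name_changes, and the sample rows are assumptions chosen to match the description; only last_name_changes, employees, and employee_audits are named in the text:

```sql
create table employees (
    id         int generated always as identity primary key,
    first_name varchar(40) not null,
    last_name  varchar(40) not null
);

create table employee_audits (
    id          int generated always as identity primary key,
    employee_id int not null,
    last_name   varchar(40) not null,
    changed_on  timestamp not null
);

-- trigger function: log the old last name when it changes
create or replace function log_last_name_changes()
returns trigger
language plpgsql
as $$
begin
    if new.last_name <> old.last_name then
        insert into employee_audits (employee_id, last_name, changed_on)
        values (old.id, old.last_name, now());
    end if;
    return new;
end; $$;

-- bind the function to the employees table
create trigger last_name_changes
    before update
    on employees
    for each row
    execute procedure log_last_name_changes();

insert into employees (first_name, last_name)
values ('John', 'Doe'), ('Lily', 'Bush');
```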
Suppose that Lily Bush changes her last name to Lily Brown. Update Lily’s last name to the
new one:
UPDATE employees
SET last_name = 'Brown'
WHERE ID = 2;
After the update, Lily’s last name in the employees table is Brown, and the employee_audits table
holds a row that records her previous last name together with the time of the change.
First, create a function that validates the username of a staff. The username of staff must not
be null and its length must be at least 8.
Second, create a new trigger on the staff table to check the username of a staff. This trigger
will fire whenever you insert or update a row in the staff table (from the sample database):
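A sketch of these two steps follows, assuming the staff table of the sample database has a username column. The function name check_staff_user, the trigger name username_check, and the error messages are illustrative:

```sql
create or replace function check_staff_user()
returns trigger
language plpgsql
as $$
begin
    if new.username is null then
        raise exception 'The username cannot be NULL';
    end if;
    if length(new.username) < 8 then
        raise exception 'The username cannot be less than 8 characters';
    end if;
    return new;
end; $$;

create trigger username_check
    before insert or update
    on staff
    for each row
    execute procedure check_staff_user();
```

Because the trigger fires before the row is written, raising an exception cancels the insert or update.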
A view is a stored query. A view can be accessed as a virtual table in PostgreSQL. In other
words, a PostgreSQL view is a logical table that represents data of one or more underlying
tables through a SELECT statement.
• A view helps simplify the complexity of a query because you can query a view, which
is based on a complex query, using a simple SELECT statement.
• Like a table, you can grant permission to users through a view that contains specific
data that the users are authorized to see.
• A view provides a consistent layer even if the columns of the underlying table change.
To create a view, we use CREATE VIEW statement. The simplest syntax of the CREATE
VIEW statement is as follows:
First, you specify the name of the view after the CREATE VIEW clause, then you put a query
after the AS keyword. A query can be a simple SELECT statement or a
complex SELECT statement with joins.
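The general form is shown in the comment below, followed by a small instance. The table view_demo_items and the view cheap_items are hypothetical names used only for illustration:

```sql
-- General form:
--   create view view_name as query;

create table if not exists view_demo_items (id int, name text, price numeric);

create or replace view cheap_items as
select id, name
from view_demo_items
where price < 10;

select * from cheap_items;
```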
For example, in our sample database i.e. dvd rental we have four tables:
1. customer – stores all customer data
2. address – stores address of customers
3. city – stores city data
4. country– stores country data
If you want to get complete customer data, you normally construct a statement that joins all
four tables. Such a query is quite complex. However, you can wrap it in a view named
customer_master as follows:
CREATE VIEW customer_master AS
SELECT cu.customer_id AS id,
cu.first_name || ' ' || cu.last_name AS name,
a.address,
a.postal_code AS "zip code",
a.phone,
city.city,
country.country,
CASE
WHEN cu.activebool THEN 'active'
ELSE ''
END AS notes,
cu.store_id AS sid
FROM customer cu
INNER JOIN address a USING (address_id)
INNER JOIN city USING (city_id)
INNER JOIN country USING (country_id);
From now on, whenever you need to get complete customer data, you just query it from the
view by executing the following simple SELECT statement:
SELECT
*
FROM
customer_master;
This query produces the same result as the complex one with the joins above.
To change the definition of a view, you use the ALTER VIEW statement. For example, you
can change the name of the view from customer_master to customer_info by using the
following statement:
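A sketch of the rename, assuming the customer_master view created above exists:

```sql
alter view customer_master rename to customer_info;
```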
PostgreSQL also allows you to set a default value for a column, change the view’s schema, and
set or reset options of a view. For detailed information on altering a view’s definition, see
the PostgreSQL ALTER VIEW documentation.
To remove an existing view in PostgreSQL, you use the DROP VIEW statement, specifying the name of
the view that you want to remove after the DROP VIEW clause.
Removing a view that does not exist in the database will result in an error. To avoid this, you
normally add IF EXISTS option to the statement to instruct PostgreSQL to remove the view if
it exists, otherwise, do nothing.
For example, to remove the customer_info view that you have created, you execute the
following query:
DROP VIEW IF EXISTS customer_info;
First, create a new updatable view named usa_cities using the CREATE VIEW statement. This
view contains all cities in the city table that are located in the USA, whose country id is 103.
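A sketch of the view definition, assuming the city table of the sample database with city and country_id columns:

```sql
create view usa_cities as
select city, country_id
from city
where country_id = 103;
```

Because the view selects from a single table without aggregation, PostgreSQL treats it as automatically updatable.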
Next, check the data in the usa_cities view by executing the following SELECT statement:
SELECT
*
FROM
usa_cities;
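A new row can then be inserted through the view. The sketch below assumes the remaining columns of the city table (such as its id and last_update) have defaults, as they do in the sample database:

```sql
insert into usa_cities (city, country_id)
values ('San Jose', 103);
```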
The following query reads the underlying city table directly, most recently updated rows first:
SELECT
city,
country_id
FROM
city
WHERE
country_id = 103
ORDER BY
last_update DESC;
The following statement deletes the San Jose entry through the usa_cities view:
DELETE
FROM
usa_cities
WHERE
city = 'San Jose';
The entry has been deleted from the city table through the usa_cities view.
Indexes are special lookup tables that the database search engine can use to speed up data
retrieval. Simply put, an index is a pointer to data in a table. An index in a database is very
similar to an index in the back of a book.
For example, if you want to reference all pages in a book that discusses a certain topic, you
must first refer to the index, which lists all topics alphabetically and then refer to one or more
specific page numbers.
An index helps to speed up SELECT queries and WHERE clauses; however, it slows down data
modification with INSERT, UPDATE, and DELETE statements, because the index has to be maintained
along with the table. Indexes can be created or dropped with no effect on the data.
Suppose you need to look up John Doe’s phone number in a phone book whose names are in
alphabetical order. To find John Doe’s phone number, you first look for the page where the last
name is Doe, then look for the first name John, and finally get his phone number. If the names
in the phone book were not ordered alphabetically, you would have to go through all pages and
check every name until you found John Doe’s phone number. This is called a sequential scan, in
which you go over all entries until you find the one that you are looking for. Like a phone book,
the data stored in a table should be organized in a particular order to speed up various
searches. This is where indexes come into play. By definition, an index is a separate data
structure that speeds up data retrieval on a table at the cost of additional writes and storage
to maintain the index.
PostgreSQL CREATE INDEX Syntax
The syntax to create an index using the CREATE INDEX statement in PostgreSQL is:
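The general form is sketched in the comment below, followed by a concrete instance. The table index_demo and the index name are hypothetical:

```sql
-- General form (square brackets mark optional parts):
--   create [unique] index [concurrently] index_name
--       on table_name [using method]
--       (index_col1 [asc | desc] [nulls {first | last}],
--        index_col2 [asc | desc] [nulls {first | last}],
--        ...);

create table if not exists index_demo (id int, created_on date, label text);

create index if not exists idx_index_demo_created_label
    on index_demo (created_on desc nulls last, label);
```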
UNIQUE
Optional. The UNIQUE modifier indicates that the combination of values in the
indexed columns must be unique.
CONCURRENTLY
Optional. Builds the index without blocking concurrent inserts, updates, or deletes on the table.
By default, the table is locked against writes (but not reads) while the index is being created.
index_name
The name to assign to the index.
table_name
The name of the table in which to create the index.
index_col1, index_col2, ... index_col_n
The columns to use in the index.
ASC
Optional. The index is sorted in ascending order for that column.
DESC
Optional. The index is sorted in descending order for that column.
If a column contains NULL, you can specify NULLS FIRST or NULLS LAST option. The NULLS
FIRST is the default when DESC is specified and NULLS LAST is the default when DESC is
not specified.
To check if a query uses an index or not, you use the EXPLAIN statement.
We will use the address table from the sample database for the demonstration.
Consider a query that finds the address whose phone number is 223664661973. The database engine
has to scan the whole address table to look for the address because there is no index available
for the phone column. To show the query plan, you use the EXPLAIN statement as follows:
EXPLAIN SELECT *
FROM address
WHERE phone = '223664661973';
To create an index for the values in the phone column of the address table, you use the
following statement:
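A sketch of such a statement, assuming the address table of the sample database; the index name idx_address_phone is an illustrative choice:

```sql
create index idx_address_phone
    on address (phone);
```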
Now, if you execute the query again, you will find that the database engine uses the index for
lookup:
EXPLAIN SELECT *
FROM address
WHERE phone = '223664661973';
B-tree indexes
B-tree is a self-balancing tree that maintains sorted data and allows searches, insertions,
deletions, and sequential access in logarithmic time. The PostgreSQL query planner will consider
using a B-tree index whenever indexed columns are involved in a comparison that uses one of the
operators <, <=, =, >=, or >. In addition, the query planner can use a B-tree index for queries
that involve the pattern-matching operators LIKE and ~ if the pattern is a constant and is
anchored at the beginning of the pattern.
Hash indexes
Hash indexes can handle only simple equality comparison (=). It means that whenever an
indexed column is involved in a comparison using the equal(=) operator, the query planner
will consider using a hash index. To create a hash index, you use the CREATE INDEX
statement with the HASH index type in the USING clause as follows:
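A sketch of the hash index syntax, using the address table of the sample database as an assumed target; the index name is illustrative:

```sql
-- General form:
--   create index index_name on table_name using hash (indexed_column);

create index idx_address_phone_hash
    on address using hash (phone);
```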
GIN indexes
GIN stands for generalized inverted indexes. These are most useful when you have multiple
values stored in a single column, for example, hstore, array, jsonb, and range types.
BRIN
BRIN stands for block range indexes. BRIN is much smaller and less costly to maintain in
comparison with a B-tree index. BRIN allows the use of an index on a very large table that
would previously be impractical using a B-tree without horizontal partitioning. BRIN is often
used on a column that has a linear sort order, for example, the created date column of the sales
order table.
GiST Indexes
GiST stands for Generalized Search Tree. GiST indexes allow the building of general tree
structures. GiST indexes are useful in indexing geometric data types and full-text searches.
SP-GiST Indexes
SP-GiST stands for space-partitioned GiST. SP-GiST supports partitioned search trees that
facilitate the development of a wide range of different non-balanced data structures. SP-GiST
indexes are most useful for data that has a natural clustering element to it and is also not an
equally balanced tree, for example, GIS, multimedia, phone routing, and IP routing.
PostgreSQL UNIQUE index
When you define a UNIQUE index for a column, the column cannot store multiple rows with
the same values. If you define a UNIQUE index for two or more columns, the combined values
in these columns cannot be duplicated in multiple rows. PostgreSQL treats NULLs as distinct
values, therefore, you can have multiple NULL values in a column with a UNIQUE index.
When you define a primary key or a unique constraint for a table, PostgreSQL automatically
creates a corresponding UNIQUE index. Note that only B-tree indexes can be declared as
unique indexes.
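As an illustration, consider a table like the one sketched below. The column types and the first_name and last_name columns are assumptions; the description that follows only relies on employee_id being the primary key and email having a unique constraint. If an employees table from an earlier exercise already exists, drop or rename it first:

```sql
create table employees (
    employee_id serial primary key,
    first_name  varchar(255) not null,
    last_name   varchar(255) not null,
    email       varchar(255) unique
);
```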
In this statement, the employee_id is the primary key column and email column has a unique
constraint, therefore, PostgreSQL created two UNIQUE indexes, one for each column. To show
indexes of the employees table, you use the following statement:
SELECT
tablename,
indexname,
indexdef
FROM
pg_indexes
WHERE
tablename = 'employees';
To demonstrate a UNIQUE index on a single column, add a mobile_phone column to the employees
table and define a UNIQUE index on it, so that mobile phone numbers are distinct for all
employees. After a row with a given mobile phone number has been inserted, an attempt to insert
another row with the same number fails, and PostgreSQL reports a duplicate key error.
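These steps can be sketched as follows, assuming an employees table with first_name, last_name, and email columns already exists. The index name and the sample values are illustrative:

```sql
alter table employees
    add column mobile_phone varchar(20);

create unique index idx_employees_mobile_phone
    on employees (mobile_phone);

-- first row: accepted
insert into employees (first_name, last_name, email, mobile_phone)
values ('John', 'Doe', 'john.doe@example.com', '(408)-555-1234');

-- second row reuses the phone number: expected to fail with a duplicate key error
insert into employees (first_name, last_name, email, mobile_phone)
values ('Mary', 'Jane', 'mary.jane@example.com', '(408)-555-1234');
```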
Next, add two new columns called work_phone and extension to the employees table. Multiple
employees can share the same work phone number. However, they cannot share the same combination
of work phone number and extension. To enforce this rule, you can define a UNIQUE index on both
the work_phone and extension columns.
With that index in place, inserting two employees with the same work phone number but different
extensions works, because the combination of values in the work_phone and extension columns is
unique. An attempt to insert a row whose work_phone and extension values both match a row that
already exists in the employees table fails.
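A sketch of the multi-column case, again assuming an employees table with first_name, last_name, and email columns. The index name and the sample values are illustrative:

```sql
alter table employees
    add column work_phone varchar(20),
    add column extension  varchar(5);

create unique index idx_employees_workphone
    on employees (work_phone, extension);

-- accepted: first use of this phone/extension pair
insert into employees (first_name, last_name, email, work_phone, extension)
values ('Lily', 'Bush', 'lily.bush@example.com', '(408)-333-1234', '1211');

-- accepted: same work phone, different extension
insert into employees (first_name, last_name, email, work_phone, extension)
values ('Joan', 'Doe', 'joan.doe@example.com', '(408)-333-1234', '1212');

-- expected to fail: both work_phone and extension match the first row
insert into employees (first_name, last_name, email, work_phone, extension)
values ('Tommy', 'Stark', 'tommy.stark@example.com', '(408)-333-1234', '1211');
```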
LAB#10
PostgreSQL VS MongoDB
Objective
The objective of this session is to compare PostgreSQL with MongoDB and to work through
equivalent examples in both.
Instructions
• Open the handout/lab manual in front of a computer in the lab during the session.
• Practice each new command by completing the examples and exercise.
• Turn-in the answers for all the exercise problems as your lab report.
• When answering problems, indicate the commands you entered and the output displayed.
Introduction
One of the most important parts of the function of any company is a secure database. With phishing
attacks, malware, and other threats on the rise, it is essential that you make the right choice to keep
your data safe and process it effectively. However, it can be extremely difficult to choose from the
wide variety of database solutions on the market today.
Two commonly used options are MongoDB and PostgreSQL.
Here are the key points from our MongoDB vs. PostgreSQL comparison:
• MongoDB is an open-source non-relational database system that falls under the NoSQL
category.
• PostgreSQL is a relational database management system.
1. PostgreSQL
Relational databases are great at running complex queries and data-based reporting in cases
where the data structure doesn’t change frequently. PostgreSQL is an open-source, highly stable
relational database management system.
PostgreSQL stores data in a tabular format and uses constraints, triggers, roles, stored
procedures, and views as its core components.
Use cases for PostgreSQL include bank systems, risk assessment, multi-app data
repositories, BI (business intelligence), manufacturing, and powering various business applications. It is
ideal for transactional workflows. PostgreSQL also has fail-safes and redundancies that make its
storage particularly reliable, which makes it well suited to industries such as healthcare and
manufacturing.
Architecture:
PostgreSQL has a monolithic architecture, meaning that the components are completely
united. This also means that the database can only scale as much as the machine running it.
2. MongoDB
MongoDB is a schema-free, general-purpose, high-performance document database.
Common use cases for MongoDB include customer analytics, content management,
business transactions, and product data. Thanks to its ability to scale, the database is also ideal for
mobile solutions that need to be scaled to millions of users.
Architecture:
The database features a distributed architecture, meaning that components function across multiple
platforms in collaboration with one another. This also means that MongoDB has nearly unlimited
scalability since it can be scaled across more than one platform as needed. That is one of the many
factors that differentiate MongoDB from a relational database.
Differences
MongoDB | PostgreSQL
Schema-free | SQL-based but supports various NoSQL features
Document database | Relational database
Uses JSON | Uses SQL
Distributed architecture | Monolithic architecture
Uses collections | Uses tables
Uses documents to obtain data | Uses rows to obtain data
Does not support foreign key constraints | Supports foreign key constraints
Uses the aggregation pipeline for running queries | Uses GROUP BY
Uses indexes | Uses joins
PostgreSQL is best when you want a relational database that will run complex SQL queries and
work with lots of existing applications based on a tabular, relational data model.
MongoDB is best when you are at the beginning of a development project and are still working
out your needs and data model through an agile development process. Developers can reshape the
data on their own, when they need to, because MongoDB lets you manage data of any structure,
not just tabular structures defined in advance.
Environment Setup
Complete installation of MongoDB is given in the link below:
https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-windows/
In PostgreSQL and all other SQL databases, the database schema must be
created and data relationships established before the database is populated with data. Related
information may be stored in separate tables, but associated through the use of foreign keys and
JOINs.
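A minimal sketch of this schema-first approach follows. The tables customers_demo and orders_demo and their columns are hypothetical:

```sql
-- The schema and the relationship are defined before any data is loaded.
create table customers_demo (
    id   serial primary key,
    name text not null
);

create table orders_demo (
    id          serial primary key,
    customer_id int not null references customers_demo (id),  -- foreign key
    total       numeric(10, 2) not null
);

-- Related rows are brought back together with a join.
select c.name, o.total
from customers_demo c
join orders_demo o on o.customer_id = c.id;
```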
The challenge of using a relational database is the need to define its structure in advance.
Changing structure after loading data is often very difficult, requiring multiple teams across
development, DBA, and Ops to tightly coordinate changes.
Now in the document database world of MongoDB, the structure of the data doesn’t have
to be planned up front in the database and it is much easier to change. Developers can
decide what’s needed in the application and change it in the database accordingly.
2. Insertion:
3. Retrieving data:
4. Updating a record:
The following table presents the various SQL statements and the corresponding MongoDB
statements. The examples in the table assume the following conditions:
• The SQL examples assume a table named people.
• The MongoDB examples assume a collection named people that contains documents of the
following prototype:
{
_id: ObjectId("509a8fb2f3f4948bd2f983a0"),
user_id: "abc123",
age: 55,
status: 'A'
}
1. Insert
The following table presents the various SQL statements related to
inserting records into tables and the corresponding MongoDB statements.
2. Insert
The following table presents the various SQL statements related to inserting records into tables and
the corresponding MongoDB statements.
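On the SQL side, an insert into the people table can be sketched as follows. The table definition is an assumption that mirrors the fields of the document prototype above, and the MongoDB counterpart is given as a comment for comparison:

```sql
create table people (
    id      serial primary key,
    user_id varchar(30),
    age     int,
    status  char(1)
);

insert into people (user_id, age, status)
values ('bcd001', 45, 'A');
-- MongoDB counterpart:
--   db.people.insertOne({ user_id: "bcd001", age: 45, status: "A" })
```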
The following example inserts three new documents into the inventory collection. If the documents
do not specify an _id field, MongoDB adds the _id field with an ObjectId value to each document.
See Insert Behavior.
db.inventory.insertMany([
{ item: "journal", qty: 25, tags: ["blank", "red"], size: { h: 14, w: 21, uom: "cm" } },
{ item: "mat", qty: 85, tags: ["gray"], size: { h: 27.9, w: 35.5, uom: "cm" } },
{ item: "mousepad", qty: 25, tags: ["gel", "blue"], size: { h: 19, w: 22.85, uom: "cm" } }
])
insertMany() returns a document that includes the _id field values of the newly inserted documents.
To retrieve the inserted documents, query the collection:
db.inventory.find( {} )
Insert Behavior
Collection Creation
If the collection does not currently exist, insert operations will create the collection.
_id Field
In MongoDB, each document stored in a collection requires a unique _id field that acts as a primary
key. If an inserted document omits the _id field, the MongoDB driver automatically generates
an ObjectId for the _id field.
3. Select
The following table presents the various SQL statements related to reading records from tables and
the corresponding MongoDB statements.
SQL:     SELECT * FROM people
MongoDB: db.people.find()

SQL:     SELECT * FROM people WHERE status = 'A'
MongoDB: db.people.find( { status: "A" } )

SQL:     SELECT * FROM people WHERE status != 'A'
MongoDB: db.people.find( { status: { $ne: "A" } } )

SQL:     SELECT * FROM people WHERE status = 'A' AND age = 50
MongoDB: db.people.find( { status: "A", age: 50 } )

SQL:     SELECT * FROM people WHERE age > 25
MongoDB: db.people.find( { age: { $gt: 25 } } )

SQL:     SELECT * FROM people WHERE user_id LIKE '%bc%'
MongoDB: db.people.find( { user_id: /bc/ } )

SQL:     SELECT * FROM people WHERE status = 'A' ORDER BY user_id DESC
MongoDB: db.people.find( { status: "A" } ).sort( { user_id: -1 } )

SQL:     SELECT COUNT(*) FROM people
MongoDB: db.people.find().count()
4. Update Records
The following table presents the various SQL statements related to updating existing records in tables
and the corresponding MongoDB statements.
5. Delete Records
The following table presents the various SQL statements related to deleting records from tables and
the corresponding MongoDB statements.
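The update and delete mappings can be sketched on the SQL side as follows, assuming the people table described earlier in this lab exists. The MongoDB counterparts are given as comments for comparison:

```sql
-- Update: set status to 'C' for everyone older than 25.
update people
set status = 'C'
where age > 25;
-- MongoDB counterpart:
--   db.people.updateMany({ age: { $gt: 25 } }, { $set: { status: "C" } })

-- Delete: remove every record whose status is 'D'.
delete from people
where status = 'D';
-- MongoDB counterpart:
--   db.people.deleteMany({ status: "D" })
```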
The aggregation pipeline allows MongoDB to provide native aggregation capabilities that correspond
to many common data aggregation operations in SQL.
The following table provides an overview of common SQL aggregation terms, functions, and concepts
and the corresponding MongoDB aggregation operators:
SQL term, function, or concept | MongoDB aggregation operator
WHERE | $match
GROUP BY | $group
HAVING | $match
ORDER BY | $sort
SUM() | $sum
COUNT() | $sum, $sortByCount
join | $lookup
Examples
The following table presents a quick reference of SQL aggregation statements and the corresponding
MongoDB statements. The examples in the table assume the following conditions:
• The SQL examples assume two tables, orders and order_lineitem that join by
the order_lineitem.order_id and the orders.id columns.
• The MongoDB examples assume one collection orders that contains documents of the
following prototype:
{
cust_id: "abc123",
ord_date: ISODate("2012-11-02T17:04:11.102Z"),
status: 'A',
price: 50,
items: [ { sku: "xxx", qty: 25, price: 1 },
{ sku: "yyy", qty: 25, price: 1 } ]
}
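Two of the aggregation mappings can be sketched on the SQL side as follows. The orders table definition is an assumption that mirrors the top-level fields of the document prototype, and the MongoDB counterparts are given as comments for comparison:

```sql
-- Hypothetical orders table mirroring the top-level document fields.
create table if not exists orders (
    id       serial primary key,
    cust_id  varchar(30),
    ord_date timestamp,
    status   char(1),
    price    numeric
);

-- Count all orders.
select count(*) as count
from orders;
-- MongoDB counterpart:
--   db.orders.aggregate([ { $group: { _id: null, count: { $sum: 1 } } } ])

-- Total price per customer, keeping only totals above 250.
select cust_id, sum(price) as total
from orders
group by cust_id
having sum(price) > 250;
-- MongoDB counterpart:
--   db.orders.aggregate([
--     { $group: { _id: "$cust_id", total: { $sum: "$price" } } },
--     { $match: { total: { $gt: 250 } } }
--   ])
```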