
INTRODUCTION TO SQL

Contents
Introduction
Structured Query Language
Types of SQL commands
SQL Syntax
  SQL Server syntax
Writing queries in SQL Server
Example database
  Exercise 1
Retrieving data
  Exercise 2
Filtering data
  Where clause
  Operators in the WHERE clause
  Logical operators
    AND operator
    OR operator
    NOT operator
    Combining AND, OR and NOT operators
  Special operators
    BETWEEN
    IS NULL
    LIKE
    IN
Arithmetic operators
Inserting data
  Inserting data into table with an identity column
  Inserting data with a select subquery
Updating data
Deleting data
Sorting data
Grouping data
Aggregate functions
  Examples: Average
  Examples: Count
  Examples: Min
  Examples: Max
  Examples: Sum
  Considerations
Filtering on an aggregate
Retrieving data from more than one table
  Inner join
  Outer join
    Left Join
    Right Join
    Full outer join
  Recursive joins
    Exercise
Data Definition Commands
Stored procedures
    Creating a stored procedure
    Executing a stored procedure
    Advantages
References

Introduction
Up to now we have learnt to design databases and how data looks in the database. Next, we will learn
how to get data into the database, change data in the database, remove data and run queries – to get
information from the data.

Please use this as a companion to Chapter 8 and not a replacement. You need to work through Chapter
8 as well.

Structured Query Language


SQL stands for Structured Query Language and is used to communicate with a database. It is considered
the standard language for relational databases. Data manipulation is achieved through the use of SQL.
Although SQL is the standard, there are different versions of the SQL language. Microsoft’s extension
to SQL is called Transact-SQL or T-SQL and is used by SQL Server.

Types of SQL commands


There are 5 types of SQL commands. For our purposes, we are going to look at two of those, namely
Data Manipulation Language (DML) and Data Definition Language (DDL). DML is used to modify data
and consists of commands such as INSERT, UPDATE and DELETE. DDL is used to make changes to the
structure of the database or database objects such as tables. DDL includes commands such as
CREATE, DROP and ALTER.

SQL Syntax
SQL is (for most engines and platforms) not case sensitive. That means you do not need to use casing
identical to your table schemas or specific casing for SQL keywords or commands. However, as a rule
of thumb in the industry, we use casing in our statements that is identical to our schema. This makes
the statement much more readable.

You will also often see that SQL keywords and commands are uppercase. While these are generally
case-insensitive, the industry standard seems to be uppercase. Personally, I prefer lowercase but my
suggestion would be to make sure what your team and/or company policy or standard is regarding
this.

Some databases also require semi-colons at the end of an SQL statement.

SQL Server syntax


With its default settings, SQL Server is case-insensitive and does not require a semi-colon. It will,
however, accept a semi-colon and treat it as the end of a single SQL statement.

SQL Server also has syntax highlighting to make the statements easy to read. The syntax highlighting
that is relevant for now is as follows:

Feature                                   Colour
SQL commands or keywords                  Blue
SQL functions (to be covered in DB II)    Pink
Text in quotes                            Red
Operators                                 Grey

Writing queries in SQL Server

Figure 1: SQL Server interface for writing queries

Example database
The following database will be used throughout this guide:

Figure 2: ERD for example database

The following data has been inserted into the tables:

Figure 3: Data in Student entity

Figure 4: Data in Module entity

Figure 5: Data in StudentModule entity

Exercise 1
Create the tables as shown in the ERD (Figure 2) and insert the data shown in Figures 3 – 5.

Retrieving data
In order to retrieve data or get data out of a table, we use the SELECT command. The general syntax
for the SELECT command is as follows:

SELECT column1, column2, ...
FROM table_name

This returns all records in the table.

The <column list> is a comma separated list of all columns or fields we want to retrieve from the table.
For example, in order to get the student number and first name of all students in our table we would
use the following query:
SELECT StudentNumber, StudentFirstName
FROM Student

You can have any number of columns in the column list and they can appear in any order. The order
you specify them will be the order that the results are displayed in. The results of the above query will
look as follows:

You can use shorthand to select all the columns in the table by substituting the column list with an
asterisk:

SELECT *
FROM Student

This will give you the following result set:

Note: To optimise our queries and reduce the load on the server and the network,
as a rule we only select the columns we NEED. Less data then has to be sent over
the network. Therefore, if you do not need the email address, do not include the
email address in the select statement.

Exercise 2
1. Replicate the select statements in this section of the guide
2. Select all the data (all columns) from the Module table
3. Select student number, module code and year registered from the StudentModule table

Filtering data
Very often we do not want all the data in a table and of course sticking to the rule of optimisation and
reducing overhead, we only retrieve the data we want. The WHERE clause is used to filter data.

Where clause
As mentioned, the WHERE clause is used to filter records and extract only those records that meet a
specified condition.

SELECT column1, column2, ...
FROM table_name
WHERE condition

The WHERE clause directly follows the FROM in a SELECT statement.

SELECT *
FROM Student
WHERE StudentNumber = '3471293716'

This statement will select only those records that have a student number equal to the number
specified (3471293716).

SELECT StudentNumber
FROM StudentModule
WHERE ModuleCode = 'CSIS2634'

This statement will list all students who have registered for the module CSIS2634. This can return more
than one record.

When using text values such as the student number or module code, you must enclose them in
quotes. Dates must also be enclosed in quotes. Numeric fields do not need quotes.

Operators in the WHERE clause


The following operators can be used in the WHERE clause:

Operator    Description
=           Equal
>           Greater than
<           Less than
>=          Greater than or equal to
<=          Less than or equal to
<>          Not equal. Note: In some versions of SQL this operator may be written as !=

The examples we have done up until now have used the equality sign. This means the field must be
exactly equal to the value specified.

Suppose we wanted to get all the students who passed modules. We could use the following
statement:

SELECT StudentNumber, ModuleCode, ModuleMark
FROM StudentModule
WHERE ModuleMark >= 50

Provided that marks are stored as whole numbers, this statement is equivalent to the following:

SELECT StudentNumber, ModuleCode, ModuleMark
FROM StudentModule
WHERE ModuleMark > 49

Logical operators
The WHERE clause can be combined with AND, OR and NOT operators.

AND operator

The AND operator displays a record if all the conditions separated by AND are TRUE.

SELECT column1, column2, ...
FROM table_name
WHERE condition1
AND condition2
AND condition3 ...

For example, suppose you wanted only a list of students who had passed CSIS1614. You would use
the following statement:

SELECT StudentNumber, ModuleCode, ModuleMark
FROM StudentModule
WHERE ModuleMark >= 50
AND ModuleCode = 'CSIS1614'

In order for a record to be included in the result set, both the stipulated conditions must be true for
the record.

OR operator
The OR operator displays a record if any of the conditions separated by OR is TRUE.

SELECT column1, column2, ...
FROM table_name
WHERE condition1
OR condition2
OR condition3 ...

For example, you might want a list of all students who registered for either CSIS1614 or CSIS1624.
You could use the following statement:

SELECT StudentNumber, ModuleCode
FROM StudentModule
WHERE ModuleCode = 'CSIS1614'
OR ModuleCode = 'CSIS1624'

In order for a record to be included in the result set, either of the stipulated conditions must be true
for the record – module code must be either CSIS1614 or CSIS1624.

NOT operator
The NOT operator displays a record if the condition(s) is NOT TRUE.

SELECT column1, column2, ...
FROM table_name
WHERE NOT condition1

For example, we could select all modules that are not CSIS2634:

SELECT ModuleCode
FROM Module
WHERE NOT ModuleCode = 'CSIS2634'

Combining AND, OR and NOT operators


You can also combine these operators in a single SQL statement. For example, if you would like a list
of all students who received a distinction (a mark of 75 or more) for either CSIS1614 or CSIS1624,
you can use the following statement:

SELECT StudentNumber, ModuleCode
FROM StudentModule
WHERE (ModuleCode = 'CSIS1614'
OR ModuleCode = 'CSIS1624')
AND ModuleMark >= 75

In order to ensure you get the desired result, I would suggest using brackets to group
expressions that belong together. Play around with the previous statement, remove
the brackets and change the order of the conditions in every possible way to see what
the effect is on the result set.

Note that the first clause must be a WHERE and this can then be followed by any combination of
AND and OR clauses.
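
To see why the brackets matter, keep in mind that SQL evaluates AND before OR. The sketch below runs the distinction query against the StudentModule table from this guide, once with and once without brackets. The comments describe the expected behaviour and are not output from an actual run.

```sql
-- With brackets: distinctions (75 or more) in either module
SELECT StudentNumber, ModuleCode, ModuleMark
FROM StudentModule
WHERE (ModuleCode = 'CSIS1614'
   OR ModuleCode = 'CSIS1624')
  AND ModuleMark >= 75

-- Without brackets: AND is evaluated first, so this returns ALL
-- CSIS1614 registrations plus only the CSIS1624 distinctions
SELECT StudentNumber, ModuleCode, ModuleMark
FROM StudentModule
WHERE ModuleCode = 'CSIS1614'
   OR ModuleCode = 'CSIS1624'
  AND ModuleMark >= 75
```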

Special operators
BETWEEN
The BETWEEN operator selects values within a given range. The values can be text, numbers or dates.
The BETWEEN operator is inclusive, meaning the begin and end values are included.

SELECT column_name(s)
FROM table_name
WHERE column_name BETWEEN value1 AND value2;

For example, you could select all registrations for a certain date range:

SELECT StudentNumber, ModuleCode
FROM StudentModule
WHERE YearRegistered BETWEEN 2000 AND 2020

You could also select all students who qualified for a reassessment:

SELECT StudentNumber, ModuleCode
FROM StudentModule
WHERE ModuleMark BETWEEN 45 AND 49

These statements could also be written using greater than and less than operators:

SELECT StudentNumber, ModuleCode
FROM StudentModule
WHERE YearRegistered >= 2000 AND
YearRegistered <= 2020

IS NULL
Recall that NULL means that a field has no value at all. It is not the same as an empty string or a
zero. Therefore, if you want to select records that have a field that is null, you need a special
operator to do so. You cannot use comparison operators such as =, >, <, >=, <=.

You cannot, for example, use the following WHERE clause. With SQL Server's default settings it runs
without an error but returns no records, because a comparison with null is never true:

SELECT StudentNumber, ModuleCode
FROM StudentModule
WHERE ModuleMark = null

You must use the IS NULL operator:

SELECT StudentNumber, ModuleCode
FROM StudentModule
WHERE ModuleMark IS NULL

This will give you a list of all module registrations for which no mark has been recorded.
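
The opposite check is written as IS NOT NULL. As a minimal sketch against the same StudentModule table, the following lists only the registrations that already have a mark:

```sql
-- Registrations for which a mark has been recorded
SELECT StudentNumber, ModuleCode, ModuleMark
FROM StudentModule
WHERE ModuleMark IS NOT NULL
```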

LIKE
The LIKE operator is used to search for a specified pattern in a field. When using LIKE you can use
wildcards to indicate the pattern in conjunction with actual values. The two most commonly used
wildcards are:

• % to indicate any combination and number of characters (this is an asterisk in some databases)
• _ (underscore) to indicate a single character (this is a question mark in some databases)

For example, in order to select all students whose surnames start with the letter B, in other words,
the first letter is B followed by any combination and number of letters, the following statement can
be used:

SELECT *
FROM Student
WHERE StudentLastName LIKE 'B%'

If an underscore instead of a percentage was used in the previous example, the database would return
all records for students whose surnames started with a B, followed by a single letter (there are no such
records in this particular dataset).

Suppose we wanted all students with a gmail email address. We could use the following statement:

SELECT *
FROM Student
WHERE StudentEmail LIKE '%gmail%'

The wildcards can appear anywhere in the search string and can be used in any combination. The
following are all valid search strings with wildcards:

• %ta%
• _rown
• _ta%
• %t%a%
• a__% (note that there are two consecutive underscores)
• _r%

Note that when searching using wildcards you must use the LIKE operator. Using the equal
sign will cause the engine to search for the literal string, in other words, in the example above
it would look for someone with the surname B%.
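
The underscore is useful when the length of the value matters. As a sketch, assuming first-year module codes follow the pattern seen in this guide (CSIS1 followed by three more characters, such as CSIS1614 and CSIS1624), they could be listed as follows:

```sql
-- Module codes that start with CSIS1 followed by exactly three characters
-- (the pattern ends in three consecutive underscores)
SELECT ModuleCode, ModuleName
FROM Module
WHERE ModuleCode LIKE 'CSIS1___'
```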

IN
The IN operator allows you to specify multiple values in a WHERE clause.

SELECT column_name(s)
FROM table_name
WHERE column_name IN (value1, value2, ...);

To select all registrations for any of the modules CSIS1614, CSIS1624 or CSIS2634, you can use
the following query:

SELECT *
FROM StudentModule
WHERE ModuleCode in ('CSIS1614', 'CSIS1624', 'CSIS2634')

The IN operator is an alternative to a WHERE clause with multiple OR conditions. The
previous query is equivalent to the following:

SELECT *
FROM StudentModule
WHERE ModuleCode = 'CSIS1614'
OR ModuleCode = 'CSIS1624'
OR ModuleCode = 'CSIS2634'
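
IN can also be combined with the NOT operator covered earlier. A minimal sketch that returns the registrations for every module except the three listed:

```sql
-- Registrations for all modules other than the three listed
SELECT *
FROM StudentModule
WHERE ModuleCode NOT IN ('CSIS1614', 'CSIS1624', 'CSIS2634')
```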

Arithmetic operators
Arithmetic operators perform mathematical operations on two expressions. The expressions can be of
the same or of different data types, as long as those types are numeric.

The following arithmetic operators are available:

Operator Meaning
+ Addition
- Subtraction
* Multiplication
/ Division
% Modulus

/*
Add, subtract, multiply, divide and calculate modulus for two numbers
*/

select 10 + 19 as 'Addition',
25 - 3 as 'Subtraction',
80 * 63 as 'Multiplication',
100 / 25 as 'Division',
56 % 3 as 'Modulus'

/*
calculate the module mark for each student
if we were to raise the module mark by
10 marks
*/

select StudentNumber,
ModuleCode,
ModuleMark + 10
from StudentModule

/*
calculate the module mark for each student
if we were to raise the module mark by
10%
*/

select StudentNumber,
ModuleCode,
ModuleMark + (ModuleMark * 10 / 100)
from StudentModule
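
One thing to watch for in the 10% calculation: when both operands are integers, SQL Server performs integer division and drops the fraction. The sketch below assumes that ModuleMark is an integer column (the guide does not state its data type) and uses the hypothetical alias RaisedMark.

```sql
select 7 / 2 as 'IntegerDivision',     -- 3: both operands are integers
       7 / 2.0 as 'DecimalDivision'    -- one operand is a decimal, so the fraction is kept

-- assuming ModuleMark is an integer column, multiplying by a decimal
-- constant keeps the fraction that ModuleMark * 10 / 100 would drop
select StudentNumber,
       ModuleCode,
       ModuleMark * 1.1 as 'RaisedMark'
from StudentModule
```

For a mark of 55, for example, ModuleMark * 10 / 100 evaluates to 5 rather than 5.5, so the integer version of the raised mark is 60.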

Inserting data
Up to now you have added records via the SQL Server Management Studio GUI directly into the table.
However, more often you need to add data programmatically via a front end and that requires an SQL
statement to achieve. The INSERT INTO statement is used to add new records to a table:

INSERT INTO table_name (column1, column2, column3,...)
VALUES (value1, value2, value3, ...);

The columns can be in any order; they do not need to be specified in the same order as the table
schema. The value list must be in the same order as the columns.

The following insert statement will create a new record in the Module table for computer literacy:

insert into Module(ModuleCode, ModuleName, ModuleCredits)
values ('CSIL1614', 'Computer Literacy', 8)

Note that any columns left out of the column list will have a null inserted as a value (nulls must be
allowed on this field). For example, to leave the credits field null in the previous example, one could
use the following statement:

insert into Module(ModuleCode, ModuleName)
values ('CSIL1614', 'Computer Literacy')

You can also insert a new record without specifying the column list. In this instance a value must be
given for all fields in the table and the values must be given in the order of the table schema:

insert into Module
values ('CSIL1614', 'Computer Literacy', 8)

When you specify the columns, the insert statement will usually still work even if
the table schema changes (for example, when a column that allows nulls is added).
However, if you do not specify a column list and the table schema changes, the
statement will fail when you next execute it, since the number of values no longer
matches the number of columns. You must then change the statement to reflect the new
schema of the table. Therefore, it is much safer and better programming practice to
specify the column list.

Inserting data into table with an identity column


Recall that an identity column is an auto-incremented field and that you cannot put values into an
identity column. Therefore, when inserting into a table that has an identity field, two conditions apply:

1. You must specify the column list and leave the identity field out
2. You may not give a value to the identity field.

Suppose you have the following table:

Figure 6: Lecturer Entity

To insert a record into the table, you must use a statement like the following:

insert into Lecturer(LecturerName, LecturerSurname, LecturerTelephone)
values ('John', 'Smith', '0514445252')

Note that similar to the previous examples you need not specify all columns. The only restrictions are
the two mentioned above – there must be a column list and you may not insert a value into the identity
field.

Inserting data with a select subquery


The previous examples inserted a single record at a time. It is, however, possible to insert multiple
records at a time by using a select statement. The results of the select statement will then be inserted
into the table:

INSERT INTO table_name (column1, column2, column3,...)
SELECT value1, value2, value3, ...
FROM table_name
WHERE condition;

The number of columns in the select statement must be equal to the number of columns specified in
the column list. The select statement can be any valid select statement and contain any number and
type of conditions.
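
As a sketch of this pattern with the example database, the statement below registers every student whose surname starts with B for the computer literacy module. It assumes that the module CSIL1614 already exists in the Module table and that the remaining StudentModule columns (such as ModuleMark) allow nulls. The year 2020 is only an illustrative value.

```sql
-- One StudentModule record is inserted for every student the select returns.
-- The module code and year are constants; the student number comes from Student.
insert into StudentModule(StudentNumber, ModuleCode, YearRegistered)
select StudentNumber, 'CSIL1614', 2020
from Student
where StudentLastName like 'B%'
```

Note that the select list may mix columns and constant values, as long as the number and order of the values match the column list.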

Updating data
Sometimes you need to change existing data. For this you need to use the UPDATE statement:

UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;

For example, suppose you wanted to change the name of the computer literacy module, you could
use a statement as follows:

UPDATE Module
SET ModuleName = 'Introduction to computing'
WHERE ModuleCode = 'CSIL1614'

You can change the value of multiple columns in a single statement. For example:

UPDATE Student
SET StudentEmail = 'JackMan@gmail.com',
StudentCellphone = '0621234863'
WHERE StudentNumber = '1324567890'

This statement will change both the email address and the cellphone number of the specified student.
You can specify more columns to change if the need arises.

You can also change more than one record at a time. For example, suppose you want to change the
ModuleResult field to reflect a P for PASS for all students who received 50% or more for a module.
You can use the following statement:

UPDATE StudentModule
SET ModuleResult = 'P'
WHERE ModuleMark >= 50

This will change a number of records simultaneously and once again you can also change a number of
columns simultaneously.

The condition can contain any number of conditions and any combination of AND and OR
clauses, similar to a SELECT statement. Beware however that neglecting to specify any
condition will update the entire table (which is fine if that is your intention but will result
in loss of data integrity if that was not your intention).
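
One cautious habit, offered here as a suggestion rather than a requirement, is to run a SELECT with the same condition first to see which records would be affected:

```sql
-- Step 1: preview the records that the condition selects
SELECT StudentNumber, ModuleCode, ModuleMark, ModuleResult
FROM StudentModule
WHERE ModuleMark >= 50

-- Step 2: once the list looks right, reuse exactly the same condition
UPDATE StudentModule
SET ModuleResult = 'P'
WHERE ModuleMark >= 50
```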

Deleting data
You can remove data from a table using a DELETE statement:

DELETE FROM table_name
WHERE condition;

For example, to delete a student from the table, you can use the following statement:

DELETE FROM Student
WHERE StudentNumber = '4679322130'

This will delete a single record from the table. Similar to the update statement, you can delete multiple
records at the same time by adjusting your WHERE clause:

DELETE FROM Student
WHERE StudentLastName LIKE 'B%'

This will delete all students whose surnames start with the letter B.

As with UPDATE, the condition can contain any number of conditions and any combination of
AND and OR clauses, similar to a SELECT statement. Beware however that neglecting to
specify any condition will delete every record in the table.

Sorting data
Up to now we have been selecting data from entities, either all the data or a filtered set of data. Have
you noticed how the data has been sorted – in what order does the engine show the results? Here is
an example data set, showing all the data from the student table:

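What do you notice about the order of the data? The data appears in ascending order of the primary
key. Keep in mind that the student number is of type varchar, not numeric, and that the data type
plays a role in the sorting of data. This default order is not guaranteed, however. The only way to be
sure of the order is to specify it yourself.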

You might like to specify a different order for your data. For instance, as a lecturer I might like a class
list to be sorted alphabetically and not according to student numbers. The ORDER BY keyword can be
used to do this:

SELECT column1, column2, ...
FROM table_name
ORDER BY column1, column2, ... ASC|DESC;

To sort the list of modules by module name, you could use the following statement:

select ModuleCode, ModuleName
from Module
order by ModuleName

The result set should look as follows:

Figure 7: Ordered module list

Notice that the list is now alphabetical according to the module name and no longer sorted according
to the module code (the primary key). By default the sort order is ascending (smallest to largest),
whether or not you add the asc keyword. You can also sort the result set in descending order
(largest to smallest) by using the desc keyword.

select ModuleCode, ModuleName
from Module
order by ModuleName desc

You can also order by more than one column at a time. Simply list the columns in the order that the
sorting must occur and separate them with commas. To sort the students alphabetically, you would
do the following:

select StudentFirstName, StudentLastName
from Student
order by StudentLastName, StudentFirstName

This statement will order students first by last name, then by first name, both ascending. You can also
have different sort orders for each column:

select StudentFirstName, StudentLastName
from Student
order by StudentLastName desc, StudentFirstName asc
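
Sorting can be combined with filtering, in which case ORDER BY comes after the WHERE clause. A minimal sketch that lists the marks for one module from highest to lowest:

```sql
-- Marks for one module, highest mark first;
-- students with equal marks are listed by student number
select StudentNumber, ModuleMark
from StudentModule
where ModuleCode = 'CSIS1614'
order by ModuleMark desc, StudentNumber asc
```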

Grouping data
The GROUP BY clause groups rows that have the same values into summary rows, usually for the
purpose of performing one or more aggregations on each group (see next section). The SELECT
statement returns one row per group.

SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
ORDER BY column_name(s);
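
As a sketch of how the clauses fit together, the query below counts the registrations per module for one year. It assumes that YearRegistered holds a number, as in the BETWEEN example earlier in this guide. The alias Registrations is only an illustrative name.

```sql
-- WHERE filters the rows first, GROUP BY then forms one group per module,
-- and ORDER BY sorts the summary rows
select ModuleCode, count(*) as 'Registrations'
from StudentModule
where YearRegistered = 2020
group by ModuleCode
order by ModuleCode
```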

Aggregate functions
An aggregate function performs a calculation on a set of values, and returns a single value.
Aggregates are most often used with a GROUP BY clause but this is not a requirement. There
are a number of aggregates. For our purposes, you need to be familiar with the following:

Aggregate    Calculation
AVG          Calculates the average of a set of values
COUNT        Counts rows in a specified table or view
MIN          Gets the minimum value in a set of values
MAX          Gets the maximum value in a set of values
SUM          Calculates the sum of values

Examples: Average
/*
To calculate the average mark for all students and all modules
(not a sensible calculation but used only for illustration purposes with current
data):
*/

select avg(ModuleMark)
from StudentModule

/*
To calculate the average mark for each module on its own
*/

select avg(ModuleMark), ModuleCode
from StudentModule
group by ModuleCode

/*
To calculate the average mark for each module on its own and for each year group
*/

select avg(ModuleMark), ModuleCode, YearRegistered
from StudentModule
group by ModuleCode, YearRegistered

Note that each GROUP BY clause produces different summary rows and therefore different results.

For the remainder of the aggregates, the SQL statements will be shown without result sets.
You should practice the various SQL statements yourself and pay close attention to the result
set that you get for each and make sure you understand what the effect is.

Examples: Count
/*
To count all students
*/

select count(StudentNumber)
from Student

/*
To count all students registered per module
*/

select count(StudentNumber), ModuleCode
from StudentModule
group by ModuleCode

/*
To count students for each module on its own and for each year group
*/

select count(StudentNumber), ModuleCode, YearRegistered
from StudentModule
group by ModuleCode, YearRegistered

/*
To count students for a specific module
*/

select count(StudentNumber), ModuleCode, YearRegistered
from StudentModule
where ModuleCode = 'CSIS1614'
group by ModuleCode, YearRegistered

Examples: Min
/*
To get the minimum mark for all modules
*/

select min(ModuleMark)
from StudentModule

/*
To get minimum mark for all students registered per module
*/

select min(ModuleMark), ModuleCode
from StudentModule
group by ModuleCode

/*
To get minimum mark for each module on its own and for each year group
*/

select min(ModuleMark), ModuleCode, YearRegistered
from StudentModule
group by ModuleCode, YearRegistered

/*
To get minimum mark for a specific module
*/

select min(ModuleMark), ModuleCode, YearRegistered
from StudentModule
where ModuleCode = 'CSIS1614'
group by ModuleCode, YearRegistered

Examples: Max
/*
To get the maximum mark for all modules
*/

select max(ModuleMark)
from StudentModule

/*
To get maximum mark for all students registered per module
*/

select max(ModuleMark), ModuleCode
from StudentModule
group by ModuleCode

/*
To get maximum mark for each module on its own and for each year group
*/

select max(ModuleMark), ModuleCode, YearRegistered
from StudentModule
group by ModuleCode, YearRegistered

/*
To get maximum mark for a specific module
*/

select max(ModuleMark), ModuleCode, YearRegistered
from StudentModule
where ModuleCode = 'CSIS1614'
group by ModuleCode, YearRegistered

Examples: Sum
/*
suppose I have a table that has a product code and a product price
I can sum the product price to determine the value on hand of my products
(ProductCode is left out of the select: listing it next to the aggregate
would require a group by ProductCode)
*/

select sum(ProductPrice)
from Product

/*
if I have a quantity, then I can even calculate the value of the products
currently on the shelf
*/
select sum(ProductPrice * ProductQuantity)
from Product
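
Since the guide does not define the Product table, the two sums can be tried out on a small stand-in. The following is a minimal sketch only: the table variable @Product, its data types and its two rows are assumptions made for illustration, and the column aliases are hypothetical names.

```sql
-- Hypothetical stand-in for the Product table (data types are assumed)
declare @Product table (
    ProductCode     varchar(10),
    ProductPrice    decimal(10, 2),
    ProductQuantity int
);

-- Made-up sample rows
insert into @Product values
    ('P001', 10.00, 3),
    ('P002',  2.50, 4);

select sum(ProductPrice)                   as 'SumOfPrices', -- 10.00 + 2.50
       sum(ProductPrice * ProductQuantity) as 'StockValue'   -- 10.00 * 3 + 2.50 * 4
from @Product;
```

With these two rows the query returns a single summary row: 12.50 for SumOfPrices and 40.00 for StockValue.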

Considerations
• Except for COUNT(*), aggregate functions ignore null values. COUNT(column) also skips nulls
• Note that SQL Server gives a generic name to the calculated field. You can give a
“friendlier” name by using a column alias - sum(ProductPrice) as 'TotalStock'
• Every column in the select statement (apart from that in the aggregate) must be
listed in the group by clause.
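
The considerations above can be seen together in one small query. This is a sketch under stated assumptions: the table variable @StudentModule only imitates the real StudentModule table, the data types are guesses, and the rows and alias names are invented for the illustration.

```sql
-- Hypothetical stand-in for StudentModule (data types are assumed)
declare @StudentModule table (
    StudentNumber  varchar(10),
    ModuleCode     varchar(8),
    YearRegistered varchar(4),
    ModuleMark     int null      -- null means no mark has been captured yet
);

-- Made-up sample rows
insert into @StudentModule values
    ('1000000001', 'CSIS1614', '2020', 60),
    ('1000000002', 'CSIS1614', '2020', 80),
    ('1000000003', 'CSIS1614', '2020', null),
    ('1000000001', 'CSIS2634', '2020', 70);

select ModuleCode,                          -- not aggregated, so it is in the group by
       count(*)          as 'Registered',   -- counts every row in the group
       count(ModuleMark) as 'WithMark',     -- skips the row with the null mark
       avg(ModuleMark)   as 'AverageMark'   -- the null mark is ignored
from @StudentModule
group by ModuleCode
order by ModuleCode;
```

For CSIS1614 this gives 3 registered, 2 with a mark and an average of 70. For CSIS2634 it gives 1, 1 and 70. SQL Server may also print a warning that a null value was eliminated by an aggregate, which is expected here.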

Filtering on an aggregate
The HAVING clause was added to SQL because the WHERE keyword could not be used with aggregate
functions. The HAVING clause enables you to specify conditions that filter which group results appear
in the results.
The WHERE clause places conditions on the selected columns, whereas the HAVING clause places
conditions on groups created by the GROUP BY clause.

SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
HAVING condition
ORDER BY column_name(s);

For example, if you wanted to list all students who had an average mark higher than 75% you could
do the following:

select StudentNumber, avg(ModuleMark) as 'Avg Degree Mark'
from StudentModule
group by StudentNumber
having avg(ModuleMark) > 75

If we want a list of “large” classes, that is modules that have more than 100 students registered, we
could do the following:

select ModuleCode, count(*)
from StudentModule
where YearRegistered = '2020'
group by ModuleCode
having count(*) > 100

The following conditions apply when using the HAVING clause:

• HAVING filters the summarized results produced by GROUP BY.
• HAVING applies to summarized group records, whereas WHERE applies to individual records.
• Only the groups that meet the HAVING criteria will be returned.
• HAVING is normally used together with a GROUP BY clause. Without one, SQL Server treats the
whole result set as a single group.
• WHERE and HAVING can be in the same query.

Note that the where filters the individual records before the aggregation is calculated
and before the group by is applied. The having filters after the aggregation and group
by and filters on the aggregate column itself.

If you put an aggregate function directly in a WHERE clause (for example, where count(*) > 100),
you are making a mistake! This syntax is not allowed; use HAVING instead.
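
To see the order in which WHERE and HAVING are applied, the "large classes" query can be run on a handful of rows. This is a minimal sketch: the table variable @StudentModule, its data types and its rows are assumptions, and the threshold is lowered from 100 to 1 so that the sample data stays small.

```sql
-- Hypothetical stand-in for StudentModule (data types are assumed)
declare @StudentModule table (
    StudentNumber  varchar(10),
    ModuleCode     varchar(8),
    YearRegistered varchar(4)
);

-- Made-up sample rows: CSIS1614 has three registrations, one of them in 2019
insert into @StudentModule values
    ('1000000001', 'CSIS1614', '2020'),
    ('1000000002', 'CSIS1614', '2020'),
    ('1000000003', 'CSIS1614', '2019'),
    ('1000000001', 'CSIS2634', '2020');

select ModuleCode, count(*) as 'Students'
from @StudentModule
where YearRegistered = '2020'   -- removes the 2019 row before any grouping
group by ModuleCode
having count(*) > 1;            -- keeps only groups with more than one row left
```

The result is one row, CSIS1614 with a count of 2. The 2019 registration was removed by the WHERE before counting, and CSIS2634 was removed by the HAVING because its group holds only one row.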

Retrieving data from more than one table


Up until now, we have been selecting data from one table at a time only. We however know that the
tables are related to one another and sometimes we require data from more than one table at a time.
For instance, instead of drawing a class list for CSIS2634 showing only student numbers, I would like
to see the names of my students as well. A JOIN clause is used to combine rows from two or more
tables, based on a related column between them.

You must use this notation to join multiple tables. Do not use the join notation (with commas) that
is shown in Chapter 8.

These are the different types of joins we find in SQL:

• (INNER) JOIN: Returns records that have matching values in both tables
• LEFT (OUTER) JOIN: Returns all records from the left table, and the matched records from the
right table
• RIGHT (OUTER) JOIN: Returns all records from the right table, and the matched records from
the left table
• FULL (OUTER) JOIN: Returns all records from both tables, matched where possible

Inner join
The inner join (most common) selects records that have matching values in both tables (see figure
below).

Figure 8: Inner join illustration

The syntax of an inner join is as follows:

SELECT column_name(s)
FROM table1
INNER JOIN table2
ON table1.column_name = table2.column_name;

Take note of the following:

• The keyword INNER is optional and rarely used.
• You can join as many tables as you need; simply repeat the join syntax for each set of
tables that must be joined.
• The keyword ON must follow directly after the table named in the JOIN.
• Recall that tables are related via referential integrity, hence primary and foreign keys. The
JOIN condition is table1.PrimaryKey = table2.ForeignKey
• The JOIN will return all records that relate to one another, where PK = FK.
• Within the ON clause you can have any number of conditions, combined with ANDs and ORs. For an
inner join, conditions in the ON clause filter the result in the same way as the WHERE clause we
have used up until now.
• The table1 and table2 prefixes you see in the query tell the SQL engine which table to use.
Since both tables have a column with the same name, you must specify a table so that the
engine knows which table column to use.
• When selecting fields from more than one table, fields that have the same name must also
have a table prefix.

Let’s look at some examples:

/*
We would like a class list for CSIS2634 for 2020
It should show student number, student name, student surname
Registered students are in StudentModule
Student demographics are in Student
Hence we join on the related field, namely StudentNumber
*/

--table prefix is required, since both Student and
--StudentModule have a column called StudentNumber
select Student.StudentNumber,
StudentFirstName,
StudentLastName
from Student
join StudentModule
on Student.StudentNumber = StudentModule.StudentNumber
and ModuleCode = 'CSIS2634'
and YearRegistered = '2020'

This query will give the following result:

Suppose I now want a list of all modules that a particular student has registered for and completed –
effectively a study record. The following query will achieve this:

/*
We would like a study record for student 1324567890
It should show student number, student name, student surname,
Module Code, Module Mark and Module Name
Only modules the student has completed (has a mark for)
must be listed
*/

--table prefix is required, since both Student and
--StudentModule have a column called StudentNumber
select Student.StudentNumber,
StudentFirstName,
StudentLastName,
StudentModule.ModuleCode, --ModuleCode could be selected from Module table as well
StudentModule.ModuleMark,
Module.ModuleName
from Student
join StudentModule
on Student.StudentNumber = StudentModule.StudentNumber
and Student.StudentNumber = '1324567890'
and StudentModule.ModuleMark is not null
join Module
on Module.ModuleCode = StudentModule.ModuleCode
order by Module.ModuleCode
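
The class list join can be tried without the example database by using stand-in tables. The sketch below is an illustration only: the table variables, their data types, the student names and the short table aliases s and sm are all assumptions. Aliases are needed here because a table variable cannot be used as a column prefix. The two queries show that, for an inner join, the extra conditions give the same result in the ON clause as in a WHERE clause.

```sql
-- Hypothetical stand-ins for Student and StudentModule (data types are assumed)
declare @Student table (
    StudentNumber    varchar(10),
    StudentFirstName varchar(50),
    StudentLastName  varchar(50)
);
declare @StudentModule table (
    StudentNumber  varchar(10),
    ModuleCode     varchar(8),
    YearRegistered varchar(4)
);

-- Made-up sample rows
insert into @Student values
    ('1000000001', 'Thabo', 'Mokoena'),
    ('1000000002', 'Anna',  'Botha'),
    ('1000000003', 'Sipho', 'Nkosi');

insert into @StudentModule values
    ('1000000001', 'CSIS2634', '2020'),
    ('1000000002', 'CSIS2634', '2019'),
    ('1000000002', 'CSIS1614', '2020');

-- Extra conditions inside the ON clause
select s.StudentNumber, s.StudentFirstName, s.StudentLastName
from @Student s
join @StudentModule sm
    on s.StudentNumber = sm.StudentNumber
    and sm.ModuleCode = 'CSIS2634'
    and sm.YearRegistered = '2020';

-- The same conditions in a WHERE clause
select s.StudentNumber, s.StudentFirstName, s.StudentLastName
from @Student s
join @StudentModule sm
    on s.StudentNumber = sm.StudentNumber
where sm.ModuleCode = 'CSIS2634'
  and sm.YearRegistered = '2020';
```

Both queries return the same single row, student 1000000001. The student with no registration at all never appears, because an inner join only returns records that match in both tables.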

Outer join
For the inner join, there must be matching records in both tables. The outer join allows you to select
records that are not necessarily in both tables. There are 3 types of outer joins, namely the left, right
and full outer join.

Left Join
The LEFT JOIN keyword returns all records from the left table (table1), and the matched records from
the right table (table2). The result is NULL from the right side, if there is no match.

Figure 9: Left join illustration

SELECT column_name(s)
FROM table1
LEFT JOIN table2
ON table1.column_name = table2.column_name;

For example, suppose we want a list of all students at the university, whether they registered for
modules or not. This means they can appear in StudentModule, but they don’t have to. Clearly, an
inner join will not work, since they must appear in both tables. A left join will achieve this, since it will
return all records in the left table (Student), whether they appear in the right table (StudentModule)
or not.

--table prefix is required, since both Student and
--StudentModule have a column called StudentNumber
select Student.StudentNumber,
StudentFirstName,
StudentLastName
from Student
left join StudentModule
on Student.StudentNumber = StudentModule.StudentNumber

Retype and run this query. Notice that all students in Student table are returned, whether they appear
in StudentModule or not. Also notice the duplicate records – this is because some students appear
multiple times in StudentModule. Let’s refine our query to firstly remove all duplicates:

--table prefix is required, since both Student and
--StudentModule have a column called StudentNumber
select Student.StudentNumber,
StudentFirstName,
StudentLastName
from Student
left join StudentModule
on Student.StudentNumber = StudentModule.StudentNumber
group by Student.StudentNumber,
StudentFirstName,
StudentLastName

Notice here we have used the group by without an aggregate – perfectly allowable SQL syntax. Instead
we are using the group by to still group or summarise records in order to get rid of the duplicate
records.

Let’s go back to duplicate records and instead also list the ModuleCode the student has registered for:

--table prefix is required, since both Student and
--StudentModule have a column called StudentNumber
select Student.StudentNumber,
StudentFirstName,
StudentLastName,
ModuleCode
from Student
left join StudentModule
on Student.StudentNumber = StudentModule.StudentNumber
order by Student.StudentNumber, ModuleCode

What do you notice about the result set now (apart from the duplicates)? Specifically on the “right”
of the result set? There are NULL values in the ModuleCode column. Why would the ModuleCode be
null? These are the students that have not registered for any modules as yet.

Suppose now you only want a list of students that have not registered for any modules. That would
be the records in the previous result set that had NULL values for the module code. Therefore, to get
only those students, use the following query:

--table prefix is required, since both Student and
--StudentModule have a column called StudentNumber
select Student.StudentNumber,
StudentFirstName,
StudentLastName,
ModuleCode
from Student
left join StudentModule
on Student.StudentNumber = StudentModule.StudentNumber
where StudentModule.ModuleCode is null

In summary, when doing an outer join, you can use the syntax to list

1. Every record that appears in the left table, with the corresponding record from the right table
or NULLs for that value (without the is null condition)
2. Only records that appear in the left table and that do not have any corresponding records in
the right table (with the is null condition)

Notice that when using the is null condition, we used a field that is part of the primary
key. We know that a primary key cannot be null. However, since there is no corresponding
record in the right table – that is the value does not appear there – the field is null as a
consequence of the outer join.

Similar to the inner join, you can use any number of left joins, or a combination of left, right
and inner joins. Just make sure that you put them in the correct order, since the order of
tables makes a difference in an outer join (but not in an inner join).
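
The two uses of the left join summarised above can be compared side by side on a small data set. This is a sketch under assumptions: the table variables, data types, sample rows and the aliases s and sm are invented for the illustration. The third query is an extra that combines the left join with count, relying on the fact that count(column) skips null values.

```sql
-- Hypothetical stand-ins for Student and StudentModule (data types are assumed)
declare @Student table (
    StudentNumber   varchar(10),
    StudentLastName varchar(50)
);
declare @StudentModule table (
    StudentNumber varchar(10),
    ModuleCode    varchar(8)
);

-- Made-up sample rows: the second student has no registrations
insert into @Student values
    ('1000000001', 'Mokoena'),
    ('1000000002', 'Botha');

insert into @StudentModule values
    ('1000000001', 'CSIS1614'),
    ('1000000001', 'CSIS2634');

-- 1. Every student, with NULL in ModuleCode where there is no registration
select s.StudentNumber, s.StudentLastName, sm.ModuleCode
from @Student s
left join @StudentModule sm
    on s.StudentNumber = sm.StudentNumber
order by s.StudentNumber, sm.ModuleCode;

-- 2. Only the students without any registration
select s.StudentNumber, s.StudentLastName
from @Student s
left join @StudentModule sm
    on s.StudentNumber = sm.StudentNumber
where sm.ModuleCode is null;

-- 3. Number of modules per student, including students with none
select s.StudentNumber, count(sm.ModuleCode) as 'Modules'
from @Student s
left join @StudentModule sm
    on s.StudentNumber = sm.StudentNumber
group by s.StudentNumber
order by s.StudentNumber;
```

The first query returns three rows, the last of which has NULL in ModuleCode. The second returns only student 1000000002. The third returns 2 for the first student and 0 for the second.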

Right Join
The RIGHT JOIN keyword returns all records from the right table (table2), and the matched records
from the left table (table1). The result is NULL from the left side, when there is no match.

Figure 10: Right join illustration

SELECT column_name(s)
FROM table1
RIGHT JOIN table2
ON table1.column_name = table2.column_name;

The left and right join can be used interchangeably; simply swop the tables around. As an example,
let’s redo the previous queries, but use right joins instead of left joins. Note that the order of the tables
now changes as well.

--table prefix is required, since both Student and
--StudentModule have a column called StudentNumber
select Student.StudentNumber,
StudentFirstName,
StudentLastName
from StudentModule
right join Student
on Student.StudentNumber = StudentModule.StudentNumber

select Student.StudentNumber,
StudentFirstName,
StudentLastName
from StudentModule
right join Student
on Student.StudentNumber = StudentModule.StudentNumber
group by Student.StudentNumber,
StudentFirstName,
StudentLastName

select Student.StudentNumber,
StudentFirstName,
StudentLastName,
ModuleCode
from StudentModule
right join Student
on Student.StudentNumber = StudentModule.StudentNumber
order by Student.StudentNumber, ModuleCode

select Student.StudentNumber,
StudentFirstName,
StudentLastName,
ModuleCode
from StudentModule
right join Student
on Student.StudentNumber = StudentModule.StudentNumber
where StudentModule.ModuleCode is null

To get a list of modules that have never been taken by any students (in other words, they never appear
in StudentModule), use the following query:

select *
from StudentModule
right join Module
on Module.ModuleCode = StudentModule.ModuleCode
where StudentModule.ModuleCode is null

Now try and rewrite this query using a left join.

Full outer join


The FULL OUTER JOIN keyword returns all records from both the left (table1) and right (table2)
tables, whether or not there is a match in the other table.

Figure 11: Full outer join

SELECT column_name(s)
FROM table1
FULL OUTER JOIN table2
ON table1.column_name = table2.column_name
WHERE condition;

The FULL OUTER JOIN keyword returns all records from both tables, whether the other table
matches or not. So, if there are rows in "Student" that do not have matches in "StudentModule", or if
there are rows in "StudentModule" that do not have matches in "Student", those rows will be listed
as well.

It is very rare that you would need to execute a FULL OUTER JOIN, since if your referential integrity is
set up correctly, only a left or right outer join would ever be necessary.
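
To see what a full outer join returns, we need data where referential integrity is not enforced. The sketch below assumes exactly that: the table variables have no keys, the data types and rows are invented, and one registration deliberately points to a student number that does not exist. The aliases s and sm and the column aliases are hypothetical names.

```sql
-- Hypothetical stand-ins without any primary or foreign keys (data types are assumed)
declare @Student table (
    StudentNumber   varchar(10),
    StudentLastName varchar(50)
);
declare @StudentModule table (
    StudentNumber varchar(10),
    ModuleCode    varchar(8)
);

-- Made-up sample rows
insert into @Student values
    ('1000000001', 'Mokoena'),
    ('1000000002', 'Botha');      -- has no registration

insert into @StudentModule values
    ('1000000001', 'CSIS1614'),
    ('9999999999', 'CSIS2634');   -- no such student

select s.StudentNumber  as 'Student side',
       s.StudentLastName,
       sm.StudentNumber as 'Registration side',
       sm.ModuleCode
from @Student s
full outer join @StudentModule sm
    on s.StudentNumber = sm.StudentNumber;
```

Three rows come back: the matched pair for 1000000001, student 1000000002 with NULLs on the registration side, and the registration for 9999999999 with NULLs on the student side. With a foreign key in place the third kind of row could not exist, which is why a left or right join is normally enough.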

Recursive joins
Recursive joins are more commonly referred to as self joins. This happens when we have a unary (or
recursive) relationship on a table and need information from both sides of that relationship.

SELECT column_name(s)
FROM table1 T1
JOIN table1 T2
ON condition;

Note that in this instance we must provide an alias for each reference to the table. Since we now
have multiple references to the same table, the engine can only distinguish between them through
the aliases, which we then use for the remainder of the query. The table alias follows directly
after the table reference in the FROM and JOIN syntax.

For example, suppose that every student can have a mentor. That mentor is themselves a student. A
mentor can mentor more than one student. We therefore have a recursive relationship on student as
follows:

Figure 12: Recursive relationship on Student

Suppose we now want a list of all students, together with their mentor name and surname. We need
to join Student on Student – a self join.

select stud.StudentFirstName as 'Student First Name',
stud.StudentLastName as 'Student Last Name',
mentor.StudentFirstName as 'Mentor First Name',
mentor.StudentLastName as 'Mentor Last Name'
from Student stud
join Student mentor
on stud.MentorId = mentor.StudentNumber --use the alias stud, not the table name Student

Take note of the following in the above query:

1. Table aliases are specified directly after the table reference
2. These table aliases are then used throughout the query
3. The query is an inner join (you can of course also do outer joins)
4. Column aliases are also specified to distinguish easily between the columns
5. Carefully note the condition, make sure you know which side is “primary key” and which side
is the “foreign key”
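
Because the self join above is an inner join, a student without a mentor is left out of the list. The following minimal sketch shows the outer join variant mentioned in point 3. The table variable @Student, its data types, the nullable MentorId column and the sample names are assumptions made for the illustration.

```sql
-- Hypothetical stand-in for Student with the recursive MentorId column
declare @Student table (
    StudentNumber    varchar(10),
    StudentFirstName varchar(50),
    StudentLastName  varchar(50),
    MentorId         varchar(10) null   -- null means the student has no mentor
);

-- Made-up sample rows: the first student mentors the other two
insert into @Student values
    ('1000000001', 'Thabo', 'Mokoena', null),
    ('1000000002', 'Anna',  'Botha',   '1000000001'),
    ('1000000003', 'Sipho', 'Nkosi',   '1000000001');

select stud.StudentFirstName   as 'Student First Name',
       stud.StudentLastName    as 'Student Last Name',
       mentor.StudentFirstName as 'Mentor First Name',
       mentor.StudentLastName  as 'Mentor Last Name'
from @Student stud
left join @Student mentor
    on stud.MentorId = mentor.StudentNumber   -- foreign key side = primary key side
order by stud.StudentNumber;
```

All three students are listed. The first row has NULL in both mentor columns, and the other two rows show the first student as mentor. Changing left join to join would drop the first row.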

Exercise
1. Create the following table:

EMPLOYEE(EMPLOYEE_ID, EMPLOYEE_NAME, EMPLOYEE_SURNAME, MANAGER_ID)

Manager_Id is a foreign key referencing Employee_Id in employee.

2. Write a query to list all employees (name and surname) with their manager details (name
and surname)

Data Definition Commands


Data Definition Language (DDL) commands change the structure of the database or database
objects. Up until now, we have been using the GUI to create tables, relationships, and databases. We
also use the GUI to make changes to the database objects. All of this can be done with DDL instead of
the GUI. This guide will not cover DDL; instead, please refer to the textbook and make sure you are
comfortable with using DDL.

Stored procedures
A stored procedure is prepared SQL code that you can save, so the code can be reused over and over
again. So if you have an SQL query that you write over and over again, save it as a stored procedure,
and then just call it.

Creating a stored procedure


Use the following syntax to create a stored procedure:

create procedure ProcedureName
as
begin
--insert SQL query here
end

Note the following:

1. The name of the procedure must be unique within the whole database
2. Any SQL command/query can be inserted into a stored procedure
3. When you execute this query, it will not run the embedded SQL command. Instead, it will run
the whole query (Create procedure) and a stored procedure will be created in your database

Figure 13: Stored procedure node

For example, let’s take our class list query and rather create a stored procedure that will generate a
class list for us.

create procedure ClassList
as
begin

select Student.StudentNumber,
StudentFirstName,
StudentLastName
from Student
join StudentModule
on Student.StudentNumber = StudentModule.StudentNumber
and ModuleCode = 'CSIS2634'
and YearRegistered = '2020'
end

Of course, we would prefer not to hard code and instead make our procedure more flexible and
enable ourselves to request any class list using the same stored procedure. For this we can add
parameters to the stored procedure.

create procedure ProcedureName
--List parameters here
as
begin
--insert SQL query here
end

Let’s add parameters to our stored procedure:

alter procedure ClassList
/*
Parameters must have a parameter name and data type
Parameter names must be preceded by the @ sign to allow
SQL Server to identify it as a variable
*/
@ModuleCode varchar(8),
@Year varchar(4)
as
begin

--table prefix is required, since both Student and
--StudentModule have a column called StudentNumber
select Student.StudentNumber,
StudentFirstName,
StudentLastName
from Student
join StudentModule
on Student.StudentNumber = StudentModule.StudentNumber
and ModuleCode = @ModuleCode --instead of hard-coding the value, use the parameter
and YearRegistered = @Year --instead of hard-coding the value, use the parameter
end

Note that when making changes to an existing stored procedure, we use the ALTER command and
not the CREATE – that is only used when initially creating the stored procedure.

Executing a stored procedure


In order to execute a stored procedure, that is run the SQL query contained within the body of the
procedure, use the following SQL command:

exec ProcedureName

To execute our initial version of our stored procedure, we would use the following:

exec ClassList

If the stored procedure has parameters, these must be sent through to the procedure as a comma
separated list directly after the procedure name:

exec ProcedureName --Parameter list

To execute the altered version of our stored procedure, we would use the following:

exec ClassList 'CSIS2634', '2020'

All parameters must be passed to the stored procedure, unless a parameter was declared with a
default value.
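
SQL Server also allows a parameter to be declared with a default value, and allows parameters to be passed by name instead of by position. The sketch below illustrates both under stated assumptions: the procedure name ClassListSketch, the temporary tables #Student and #StudentModule, their data types and the sample rows are all invented for the illustration, so that the real tables are not touched. The word go is not SQL but the batch separator used by SQL Server Management Studio; it is needed because create procedure must be the first statement in its batch.

```sql
-- Hypothetical temporary stand-ins for Student and StudentModule
create table #Student (
    StudentNumber    varchar(10),
    StudentFirstName varchar(50),
    StudentLastName  varchar(50)
);
create table #StudentModule (
    StudentNumber  varchar(10),
    ModuleCode     varchar(8),
    YearRegistered varchar(4)
);

-- Made-up sample rows
insert into #Student values
    ('1000000001', 'Thabo', 'Mokoena'),
    ('1000000002', 'Anna',  'Botha');

insert into #StudentModule values
    ('1000000001', 'CSIS2634', '2020'),
    ('1000000002', 'CSIS2634', '2019');
go

-- ClassListSketch is a hypothetical name; it must not already exist
create procedure ClassListSketch
    @ModuleCode varchar(8),
    @Year varchar(4) = '2020'   -- default value: the caller may leave @Year out
as
begin
    select s.StudentNumber, s.StudentFirstName, s.StudentLastName
    from #Student s
    join #StudentModule sm
        on s.StudentNumber = sm.StudentNumber
    where sm.ModuleCode = @ModuleCode
      and sm.YearRegistered = @Year;
end
go

exec ClassListSketch 'CSIS2634', '2019';         -- parameters passed by position
exec ClassListSketch @ModuleCode = 'CSIS2634';   -- passed by name, @Year falls back to '2020'
go

-- Clean up the sketch objects
drop procedure ClassListSketch;
drop table #StudentModule;
drop table #Student;
```

With the sample rows, the first call lists student 1000000002 (registered in 2019) and the second call lists student 1000000001 (registered in 2020).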

Advantages
• Since SQL Server caches the compiled execution plan of a stored procedure, repeated calls to
the procedure typically respond quickly.
• You can group all the required SQL statements in a procedure and execute them at once.
• Since procedures are stored and run on the database server, complicated queries execute close
to the data, which is usually faster than sending each statement from the client.
• Using procedures, you can avoid repetition of code. Procedures can also use additional SQL
functionality, such as calling stored functions.
• Once you have created a stored procedure, you can use it in any number of applications. If any
changes are needed, you can change just the procedure without touching the application
code.
• You can call SQL stored procedures from any application, such as a Windows or web
application.

