
SQL SELECT INTO

The SQL SELECT INTO statement is used to select data from a SQL database table and to
insert it into a different table at the same time.

The general SQL SELECT INTO syntax looks like this:

SELECT Column1, Column2, Column3
INTO Table2
FROM Table1

The list of column names after the SQL SELECT command determines which columns will be
copied, and the table name after the SQL INTO keyword specifies to which table to copy
those rows.

If we want to make an exact copy of the data in our Customers table, we need the
following SQL SELECT INTO statement:

SELECT *
INTO Customers_copy
FROM Customers
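
Not every database supports SELECT ... INTO for creating a table; SQLite, for instance, does not. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory SQLite database (neither is part of the original tutorial), that makes the same copy with CREATE TABLE ... AS SELECT:

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the tutorial's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (FirstName TEXT, LastName TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("John", "Smith"), ("Paula", "Brown")],
)

# SQLite has no SELECT ... INTO; CREATE TABLE ... AS SELECT creates the
# new table and fills it with the selected rows in one statement.
conn.execute("CREATE TABLE Customers_copy AS SELECT * FROM Customers")

copied = conn.execute("SELECT FirstName, LastName FROM Customers_copy").fetchall()
print(sorted(copied))
```

In both variants the copy receives the data only; whether constraints and indexes are carried over depends on the database, so check before relying on the copy as a full replacement.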

SQL DISTINCT

The SQL DISTINCT clause is used together with the SQL SELECT keyword to return a
dataset with unique entries for a certain database table column.

We will use our Customers database table to illustrate the usage of SQL DISTINCT.

FirstName LastName Email DOB Phone


John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222
Steven Goldfish goldfish@fishhere.net 4/4/1974 323 455-4545
Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
James Smith jim@supergig.co.uk 20/10/1980 416 323-8888

For example, if we want to select all distinct surnames from our Customers table, we will
use the following SQL DISTINCT statement:

SELECT DISTINCT LastName
FROM Customers

The result of the SQL DISTINCT expression above will look like this:

LastName
Smith
Goldfish
Brown
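
To see the effect of DISTINCT side by side with a plain SELECT, here is a small sketch that assumes Python's built-in sqlite3 module and an in-memory database (an illustration only; the tutorial itself does not use Python):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (FirstName TEXT, LastName TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("John", "Smith"), ("Steven", "Goldfish"), ("Paula", "Brown"), ("James", "Smith")],
)

# Without DISTINCT every row contributes a value, so 'Smith' appears twice.
all_surnames = [row[0] for row in conn.execute("SELECT LastName FROM Customers")]

# With DISTINCT each surname is returned once.
distinct_surnames = [
    row[0] for row in conn.execute("SELECT DISTINCT LastName FROM Customers")
]

print(len(all_surnames), sorted(distinct_surnames))
```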

SQL WHERE

The SQL WHERE clause is used to select data conditionally, by adding it to an already
existing SQL SELECT query. We are going to use the Customers table from the previous
chapter to illustrate the use of the SQL WHERE command.

Table: Customers

FirstName LastName Email DOB Phone


John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222
Steven Goldfish goldfish@fishhere.net 4/4/1974 323 455-4545
Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
James Smith jim@supergig.co.uk 20/10/1980 416 323-8888

If we want to select all customers with the last name 'Smith' from our database table, we
need to use the following SQL syntax:

SELECT *
FROM Customers
WHERE LastName = 'Smith'

The result of the SQL expression above will be the following:

FirstName LastName Email DOB Phone


John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222
James Smith jim@supergig.co.uk 20/10/1980 416 323-8888

In this simple SQL query we used the "=" (Equal) operator in our WHERE criteria:

LastName = 'Smith'

But we can use any of the following comparison operators in conjunction with the SQL
WHERE clause:

<> (Not Equal)

SELECT *
FROM Customers
WHERE LastName <> 'Smith'

> (Greater than)

SELECT *
FROM Customers
WHERE DOB > '1/1/1970'

>= (Greater or Equal)

SELECT *
FROM Customers
WHERE DOB >= '1/1/1970'

< (Less than)

SELECT *
FROM Customers
WHERE DOB < '1/1/1970'

<= (Less or Equal)

SELECT *
FROM Customers
WHERE DOB <= '1/1/1970'

LIKE (similar to)

SELECT *
FROM Customers
WHERE Phone LIKE '626%'

Note that the LIKE syntax differs between RDBMSs (SQL Server syntax is used above).
Check the SQL LIKE article for more details.

BETWEEN (Defines a range)

SELECT *
FROM Customers
WHERE DOB BETWEEN '1/1/1970' AND '1/1/1975'
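
How a date literal such as '1/1/1970' is compared depends on the database and on the column type. The sketch below assumes SQLite through Python's built-in sqlite3 module and stores DOB as ISO-8601 text (YYYY-MM-DD), an assumption made here so that plain string comparison orders the dates correctly. The helper first_names is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (FirstName TEXT, LastName TEXT, DOB TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        ("John", "Smith", "1968-02-04"),
        ("Steven", "Goldfish", "1974-04-04"),
        ("Paula", "Brown", "1978-05-24"),
        ("James", "Smith", "1980-10-20"),
    ],
)

def first_names(where_clause, params=()):
    """Return FirstName values matching a WHERE clause, in insertion order.

    Only fixed SQL fragments are interpolated; the values are bound with ?.
    """
    sql = f"SELECT FirstName FROM Customers WHERE {where_clause} ORDER BY rowid"
    return [row[0] for row in conn.execute(sql, params)]

not_smith = first_names("LastName <> ?", ("Smith",))
born_after_1970 = first_names("DOB > ?", ("1970-01-01",))
born_up_to_1970 = first_names("DOB <= ?", ("1970-01-01",))
early_seventies = first_names("DOB BETWEEN ? AND ?", ("1970-01-01", "1975-01-01"))

print(not_smith, born_after_1970, born_up_to_1970, early_seventies)
```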

SQL LIKE

We will use the Customers table to illustrate the SQL LIKE clause usage:

FirstName LastName Email DOB Phone


John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222
Steven Goldfish goldfish@fishhere.net 4/4/1974 323 455-4545
Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
James Smith jim@supergig.co.uk 20/10/1980 416 323-8888

The SQL LIKE clause is very useful when you want to specify a search condition within
your SQL WHERE clause based on part of a column's contents. For example, if you want to
select all customers whose FirstName starts with 'J', you need to use the following SQL
statement:

SELECT *
FROM Customers
WHERE FirstName LIKE 'J%'

Here is the result of the SQL statement above:

FirstName LastName Email DOB Phone


John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222
James Smith jim@supergig.co.uk 20/10/1980 416 323-8888

If you want to select all customers with phone numbers starting with '416', you will use
this SQL expression:

SELECT *
FROM Customers
WHERE Phone LIKE '416%'

The '%' is a so-called wildcard character and represents any string (including an empty
one) in our pattern. You can put the wildcard anywhere in the string following the SQL
LIKE clause, and you can use as many wildcards as you like.

Note that different databases use different characters as wildcard characters, for example
'%' is a wildcard character for MS SQL Server representing any string, and '*' is the
corresponding wildcard character used in MS Access.

Another wildcard character is '_' representing any single character.

In MS SQL Server, '[]' matches any single character within the specified range or set.
Have a look at the following SQL statement:

SELECT *
FROM Customers
WHERE Phone LIKE '[4-6]_6%'

This SQL expression will return all customers satisfying the following conditions:

• The Phone column starts with a digit between 4 and 6 ([4-6])
• The second character in the Phone column can be anything (_)
• The third character in the Phone column is 6 (6)
• The remainder of the Phone column can be any character string (%)

Here is the result of this SQL expression:

FirstName LastName Email DOB Phone


John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222
Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
James Smith jim@supergig.co.uk 20/10/1980 416 323-8888
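
The wildcards can be tried out with the sketch below, which assumes SQLite via Python's built-in sqlite3 module. SQLite's LIKE supports '%' and '_' but not the '[]' range; its GLOB operator does support ranges, using '*' and '?' in place of '%' and '_'. The helper names is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (FirstName TEXT, Phone TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("John", "626 222-2222"),
        ("Steven", "323 455-4545"),
        ("Paula", "416 323-3232"),
        ("James", "416 323-8888"),
    ],
)

def names(condition):
    """Return FirstName values for a fixed WHERE condition, in insertion order."""
    sql = f"SELECT FirstName FROM Customers WHERE {condition} ORDER BY rowid"
    return [row[0] for row in conn.execute(sql)]

starts_with_416 = names("Phone LIKE '416%'")   # % matches any remainder
four_letter_j = names("FirstName LIKE 'J___'")  # each _ matches one character
digit_range = names("Phone GLOB '[4-6]?6*'")    # SQLite counterpart of '[4-6]_6%'

print(starts_with_416, four_letter_j, digit_range)
```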

SQL INSERT INTO

The SQL INSERT INTO syntax has 2 main forms, and the result of either of them is a new
row added to the database table.

The first syntax form of the INSERT INTO SQL clause doesn't specify the column names
where the data will be inserted, only their values:

INSERT INTO Table1
VALUES (value1, value2, value3…)

The second form of the SQL INSERT INTO command specifies both the columns and the
values to be inserted in them:

INSERT INTO Table1 (Column1, Column2, Column3…)
VALUES (Value1, Value2, Value3…)

As you might already have guessed, the number of columns in the second INSERT INTO
syntax form must match the number of values in the SQL statement, otherwise you will
get an error.

If we want to insert a new row into our Customers table, we are going to use one of the
following 2 SQL statements:

INSERT INTO Customers
VALUES ('Peter', 'Hunt', 'peter.hunt@tgmail.net', '1/1/1974', '626 888-8888')

INSERT INTO Customers (FirstName, LastName, Email, DOB, Phone)
VALUES ('Peter', 'Hunt', 'peter.hunt@tgmail.net', '1/1/1974', '626 888-8888')

The result of the execution of either of the 2 INSERT INTO SQL statements will be a new
row added to our Customers database table:

FirstName LastName Email DOB Phone


John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222
Steven Goldfish goldfish@fishhere.net 4/4/1974 323 455-4545
Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
James Smith jim@supergig.co.uk 20/10/1980 416 323-8888
Peter Hunt peter.hunt@tgmail.net 1/1/1974 626 888-8888

If you want to enter data for just a few of the table columns, you’ll have to use the second
syntax form of the SQL INSERT INTO clause, because the first form will produce an error
if you haven’t supplied values for all columns.

To insert only the FirstName and LastName columns, execute the following SQL statement:

INSERT INTO Customers (FirstName, LastName)
VALUES ('Peter', 'Hunt')
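
What happens to the columns that are left out can be checked with the following sketch, which assumes SQLite via Python's built-in sqlite3 module and a Customers table whose columns all accept NULL (no NOT NULL constraints or defaults are defined here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers "
    "(FirstName TEXT, LastName TEXT, Email TEXT, DOB TEXT, Phone TEXT)"
)

# Second form: only the named columns receive values; the others become NULL.
conn.execute(
    "INSERT INTO Customers (FirstName, LastName) VALUES (?, ?)",
    ("Peter", "Hunt"),
)
inserted = conn.execute("SELECT * FROM Customers").fetchone()
print(inserted)

# First form with too few values is rejected by the database.
try:
    conn.execute("INSERT INTO Customers VALUES ('Peter', 'Hunt')")
    short_insert_failed = False
except sqlite3.Error as exc:
    short_insert_failed = True
    print("rejected:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]
```

The ? placeholders bind the values separately from the SQL text, which is generally safer than building the statement by string concatenation.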

SQL UPDATE

The SQL UPDATE general syntax looks like this:

UPDATE Table1
SET Column1 = Value1, Column2 = Value2
WHERE Some_Column = Some_Value

The SQL UPDATE clause changes the data in already existing database row(s), and usually
we need to add a conditional SQL WHERE clause to our SQL UPDATE statement in order to
specify which row(s) we intend to update.

If we want to update Mr. Steven Goldfish's date of birth to '5/10/1974' in our
Customers database table

FirstName LastName Email DOB Phone


John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222
Steven Goldfish goldfish@fishhere.net 4/4/1974 323 455-4545
Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
James Smith jim@supergig.co.uk 20/10/1980 416 323-8888

we need the following SQL UPDATE statement:

UPDATE Customers
SET DOB = '5/10/1974'
WHERE LastName = 'Goldfish' AND FirstName = 'Steven'

If we don’t specify a WHERE clause in the SQL expression above, all customers' DOB will be
updated to '5/10/1974', so be careful with the SQL UPDATE command usage.

We can update several database table rows at once, by using the SQL WHERE clause in our
UPDATE statement. For example if we want to change the phone number for all customers
with last name Smith (we have 2 in our example Customers table), we need to use the
following SQL UPDATE statement:

UPDATE Customers
SET Phone = '626 555-5555'
WHERE LastName = 'Smith'

After the execution of the UPDATE SQL expression above, the Customers table will look as
follows:

FirstName LastName Email DOB Phone


John Smith John.Smith@yahoo.com 2/4/1968 626 555-5555
Steven Goldfish goldfish@fishhere.net 4/4/1974 323 455-4545
Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
James Smith jim@supergig.co.uk 20/10/1980 626 555-5555
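
One way to guard against updating more rows than intended is to check how many rows a statement touched. The sketch below assumes SQLite via Python's built-in sqlite3 module, whose cursor exposes that number as rowcount:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (FirstName TEXT, LastName TEXT, Phone TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        ("John", "Smith", "626 222-2222"),
        ("Steven", "Goldfish", "323 455-4545"),
        ("Paula", "Brown", "416 323-3232"),
        ("James", "Smith", "416 323-8888"),
    ],
)

# The WHERE clause limits the update to the two customers named Smith.
cursor = conn.execute(
    "UPDATE Customers SET Phone = ? WHERE LastName = ?",
    ("626 555-5555", "Smith"),
)
updated_rows = cursor.rowcount

# Map each first name to its phone number after the update.
phones = dict(conn.execute("SELECT FirstName, Phone FROM Customers"))
print(updated_rows, phones)
```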

SQL DELETE

So far we’ve learnt how to select data from a database table and how to insert and update
data into a database table. Now it’s time to learn how to remove data from a database.
Here comes the SQL DELETE statement!

The SQL DELETE command has the following generic SQL syntax:

DELETE FROM Table1
WHERE Some_Column = Some_Value

If you skip the SQL WHERE clause when executing a SQL DELETE expression, then all the
data in the specified table will be deleted. The following SQL statement will delete all the
data from our Customers table and we’ll end up with a completely empty table:

DELETE FROM Customers


If you specify a WHERE clause in your SQL DELETE statement, only the table rows
satisfying the WHERE criteria will be deleted:

DELETE FROM Customers
WHERE LastName = 'Smith'

The SQL query above will delete all database rows having LastName 'Smith' and will leave
the Customers table in the following state:

FirstName LastName Email DOB Phone


Steven Goldfish goldfish@fishhere.net 4/4/1974 323 455-4545
Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
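
Because a DELETE cannot easily be undone, it can help to count the matching rows with the same WHERE clause first. A minimal sketch, assuming SQLite via Python's built-in sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (FirstName TEXT, LastName TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("John", "Smith"), ("Steven", "Goldfish"), ("Paula", "Brown"), ("James", "Smith")],
)

# Dry run: how many rows would the DELETE remove?
to_delete = conn.execute(
    "SELECT COUNT(*) FROM Customers WHERE LastName = ?", ("Smith",)
).fetchone()[0]

# The actual DELETE uses exactly the same condition.
cursor = conn.execute("DELETE FROM Customers WHERE LastName = ?", ("Smith",))
deleted = cursor.rowcount

remaining = [
    row[0] for row in conn.execute("SELECT LastName FROM Customers ORDER BY rowid")
]
print(to_delete, deleted, remaining)
```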

SQL ORDER BY

The SQL ORDER BY clause comes in handy when you want to sort your SQL result sets by
some column(s). For example if you want to select all the persons from the already familiar
Customers table and order the result by date of birth, you will use the following statement:

SELECT * FROM Customers
ORDER BY DOB

The result of the above SQL expression will be the following:

FirstName LastName Email DOB Phone


John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222
Steven Goldfish goldfish@fishhere.net 4/4/1974 323 455-4545
Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
James Smith jim@supergig.co.uk 20/10/1980 416 323-8888

As you can see, the rows are sorted in ascending order by the DOB column, but what if you
want to sort them in descending order? To do that you will have to add the DESC SQL
keyword after your SQL ORDER BY clause:

SELECT * FROM Customers
ORDER BY DOB DESC

The result of the SQL query above will look like this:

FirstName LastName Email DOB Phone


James Smith jim@supergig.co.uk 20/10/1980 416 323-8888
Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
Steven Goldfish goldfish@fishhere.net 4/4/1974 323 455-4545
John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222

If you don't specify how to order your rows, ascending or descending, then the result set
is sorted in ascending order, hence the following two SQL expressions produce the same
result:

SELECT * FROM Customers
ORDER BY DOB

SELECT * FROM Customers
ORDER BY DOB ASC

You can sort your result set by more than one column by specifying those columns in the
SQL ORDER BY list. The following SQL expression will order by DOB first and then, for
rows with the same DOB, by LastName:

SELECT * FROM Customers
ORDER BY DOB, LastName
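
The sketch below, assuming SQLite via Python's built-in sqlite3 module and DOB stored as ISO-8601 text (an assumption made so that text sorting matches date order), shows ascending, descending and multi-column sorting. Each column in the ORDER BY list can carry its own ASC or DESC.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (FirstName TEXT, LastName TEXT, DOB TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        ("John", "Smith", "1968-02-04"),
        ("Steven", "Goldfish", "1974-04-04"),
        ("Paula", "Brown", "1978-05-24"),
        ("James", "Smith", "1980-10-20"),
    ],
)

def ordered(order_by):
    """Return FirstName values sorted by a fixed ORDER BY list."""
    sql = f"SELECT FirstName FROM Customers ORDER BY {order_by}"
    return [row[0] for row in conn.execute(sql)]

oldest_first = ordered("DOB")          # ASC is the default direction
youngest_first = ordered("DOB DESC")
# Surnames A to Z; within the same surname, the youngest customer first.
by_name_then_age = ordered("LastName ASC, DOB DESC")

print(oldest_first, youngest_first, by_name_then_age)
```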

SQL AND & OR

The SQL AND clause is used when you want to specify more than one condition in your
SQL WHERE clause, and at the same time you want all conditions to be true.

For example, if you want to select all customers with FirstName "John" and LastName
"Smith", you will use the following SQL expression:

SELECT * FROM Customers
WHERE FirstName = 'John' AND LastName = 'Smith'

The result of the SQL query above is:

FirstName LastName Email DOB Phone


John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222

The following row in our Customers table satisfies the second of the conditions (LastName =
'Smith'), but not the first one (FirstName = 'John'), and that's why it's not returned by our
SQL query:

FirstName LastName Email DOB Phone


James Smith jim@supergig.co.uk 20/10/1980 416 323-8888

The SQL OR operator is used in a similar fashion; the major difference compared to
SQL AND is that OR returns all rows satisfying any of the conditions listed in the
WHERE clause.

If we want to select all customers having FirstName 'James' or FirstName 'Paula' we need
to use the following SQL statement:

SELECT * FROM Customers
WHERE FirstName = 'James' OR FirstName = 'Paula'

The result of this query will be the following:

FirstName LastName Email DOB Phone


Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
James Smith jim@supergig.co.uk 20/10/1980 416 323-8888

You can combine AND and OR clauses any way you want, and you can use parentheses to
define your logical expressions.

Here is an example of such a SQL query, selecting all customers with LastName 'Brown' and
FirstName either 'James' or 'Paula':

SELECT * FROM Customers
WHERE (FirstName = 'James' OR FirstName = 'Paula') AND LastName = 'Brown'

The result of the SQL expression above will be:

FirstName LastName Email DOB Phone


Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
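
The parentheses matter because AND is evaluated before OR. The following sketch, assuming SQLite via Python's built-in sqlite3 module, runs the same conditions with and without parentheses:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (FirstName TEXT, LastName TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("John", "Smith"), ("Steven", "Goldfish"), ("Paula", "Brown"), ("James", "Smith")],
)

def names(condition):
    """Return FirstName values for a fixed WHERE condition, in insertion order."""
    sql = f"SELECT FirstName FROM Customers WHERE {condition} ORDER BY rowid"
    return [row[0] for row in conn.execute(sql)]

# Parentheses force the OR to be evaluated first.
with_parens = names(
    "(FirstName = 'James' OR FirstName = 'Paula') AND LastName = 'Brown'"
)

# Without them this reads as: James OR (Paula AND Brown),
# so James Smith is returned as well.
without_parens = names(
    "FirstName = 'James' OR FirstName = 'Paula' AND LastName = 'Brown'"
)

print(with_parens, without_parens)
```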

SQL IN

The SQL IN clause allows you to specify discrete values in your SQL WHERE search criteria.

The SQL IN syntax looks like this:

SELECT Column1, Column2, Column3, …
FROM Table1
WHERE Column1 IN (Value1, Value2, …)

Let's use the EmployeeHours table to illustrate how SQL IN works:

Employee Date Hours


John Smith 5/6/2004 8
Allan Babel 5/6/2004 8
Tina Crown 5/6/2004 8
John Smith 5/7/2004 9
Allan Babel 5/7/2004 8
Tina Crown 5/7/2004 10
John Smith 5/8/2004 8
Allan Babel 5/8/2004 8
Tina Crown 5/8/2004 9

Consider the following SQL query using the SQL IN clause:

SELECT *
FROM EmployeeHours
WHERE Date IN ('5/6/2004', '5/7/2004')

This SQL expression will select only the entries where the column Date has value of
'5/6/2004' or '5/7/2004', and you can see the result below:

Employee Date Hours


John Smith 5/6/2004 8
Allan Babel 5/6/2004 8
Tina Crown 5/6/2004 8
John Smith 5/7/2004 9
Allan Babel 5/7/2004 8
Tina Crown 5/7/2004 10

We can use the SQL IN statement with another column in our EmployeeHours table:

SELECT *
FROM EmployeeHours
WHERE Hours IN (9, 10)

The result of the SQL query above will be:

Employee Date Hours


John Smith 5/7/2004 9
Tina Crown 5/7/2004 10
Tina Crown 5/8/2004 9
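
When the list of values comes from a program rather than being typed into the query, one placeholder per value is needed. A sketch assuming SQLite via Python's built-in sqlite3 module; the date column is named WorkDate here (an assumption, chosen to avoid clashes with the DATE keyword in some databases) and holds ISO-8601 text:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeHours (Employee TEXT, WorkDate TEXT, Hours INTEGER)")
conn.executemany(
    "INSERT INTO EmployeeHours VALUES (?, ?, ?)",
    [
        ("John Smith", "2004-05-06", 8), ("Allan Babel", "2004-05-06", 8),
        ("Tina Crown", "2004-05-06", 8), ("John Smith", "2004-05-07", 9),
        ("Allan Babel", "2004-05-07", 8), ("Tina Crown", "2004-05-07", 10),
        ("John Smith", "2004-05-08", 8), ("Allan Babel", "2004-05-08", 8),
        ("Tina Crown", "2004-05-08", 9),
    ],
)

wanted_hours = [9, 10]
# Build "?, ?" so that each value in the list is bound as a parameter.
placeholders = ", ".join("?" for _ in wanted_hours)
sql = (
    "SELECT Employee, WorkDate, Hours FROM EmployeeHours "
    f"WHERE Hours IN ({placeholders}) ORDER BY rowid"
)
long_days = conn.execute(sql, wanted_hours).fetchall()
print(long_days)
```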

SQL BETWEEN

The SQL BETWEEN and AND keywords define a range of data between 2 values.

The SQL BETWEEN syntax looks like this:


SELECT Column1, Column2, Column3, …
FROM Table1
WHERE Column1 BETWEEN Value1 AND Value2

The 2 values defining the range for SQL BETWEEN clause can be dates, numbers or just
text.

In contrast with the SQL IN keyword, which allows you to specify discrete values in your
SQL WHERE criteria, the SQL BETWEEN gives you the ability to specify a range in your
search criteria.

We are going to use the familiar Customers table to show how SQL BETWEEN works:

FirstName LastName Email DOB Phone


John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222
Steven Goldfish goldfish@fishhere.net 4/4/1974 323 455-4545
Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
James Smith jim@supergig.co.uk 20/10/1980 416 323-8888

Consider the following SQL BETWEEN statement:

SELECT *
FROM Customers
WHERE DOB BETWEEN '1/1/1975' AND '1/1/2004'

The SQL BETWEEN statement above will select all customers whose DOB falls between
'1/1/1975' and '1/1/2004'; both boundary values are included in the range. Here is the
result of this SQL expression:

FirstName LastName Email DOB Phone


Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
James Smith jim@supergig.co.uk 20/10/1980 416 323-8888
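
BETWEEN is shorthand for a pair of >= and <= comparisons, so rows equal to either boundary are returned. The sketch below assumes SQLite via Python's built-in sqlite3 module and ISO-8601 text dates (an assumption so that the comparison follows date order); the boundaries are deliberately set to two actual birth dates:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (FirstName TEXT, DOB TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("John", "1968-02-04"),
        ("Steven", "1974-04-04"),
        ("Paula", "1978-05-24"),
        ("James", "1980-10-20"),
    ],
)

def names(condition):
    """Return FirstName values for a fixed WHERE condition, in insertion order."""
    sql = f"SELECT FirstName FROM Customers WHERE {condition} ORDER BY rowid"
    return [row[0] for row in conn.execute(sql)]

# Both boundaries are birth dates that exist in the table.
with_between = names("DOB BETWEEN '1974-04-04' AND '1978-05-24'")
with_comparisons = names("DOB >= '1974-04-04' AND DOB <= '1978-05-24'")

print(with_between, with_comparisons)
```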

SQL aliases

SQL aliases can be used with database tables and with database table columns, depending
on the task you are performing.

SQL column aliases are used to make the output of your SQL queries easy to read and
more meaningful:

SELECT Employee, SUM(Hours) AS SumHoursPerEmployee
FROM EmployeeHours
GROUP BY Employee

In the example above we created SQL alias SumHoursPerEmployee and the result of this
SQL query will be the following:

Employee SumHoursPerEmployee
John Smith 25
Allan Babel 24
Tina Crown 27

Consider the following SQL statement, showing how to use SQL table aliases:

SELECT Emp.Employee
FROM EmployeeHours AS Emp

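Because this query has no DISTINCT or GROUP BY, it returns one row for every row in the
EmployeeHours table. Here is the result of the SQL expression above:

Employee
John Smith
Allan Babel
Tina Crown
John Smith
Allan Babel
Tina Crown
John Smith
Allan Babel
Tina Crown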

The SQL table aliases are very useful when you select data from multiple tables.

SQL COUNT

The SQL COUNT aggregate function is used to count the number of rows in a database
table. COUNT(Column1) counts only the rows where Column1 is not NULL, while COUNT(*)
counts all rows.

The SQL COUNT syntax is simple and looks like this:


SELECT COUNT(Column1)
FROM Table1

If we want to count the number of customers in our Customers table, we will use the
following SQL COUNT statement:

SELECT COUNT(LastName) AS NumberOfCustomers
FROM Customers

The result of this SQL COUNT query will be:

NumberOfCustomers
4
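
The count depends on what is put inside the parentheses. In the sketch below (SQLite via Python's built-in sqlite3 module), one customer's email is set to NULL purely for illustration; that missing value is hypothetical and not part of the tutorial's data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (LastName TEXT, Email TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("Smith", "John.Smith@yahoo.com"),
        ("Goldfish", "goldfish@fishhere.net"),
        ("Brown", None),  # hypothetical missing email, stored as NULL
        ("Smith", "jim@supergig.co.uk"),
    ],
)

all_rows, with_email, distinct_surnames = conn.execute(
    "SELECT COUNT(*), COUNT(Email), COUNT(DISTINCT LastName) FROM Customers"
).fetchone()

# COUNT(*) counts rows, COUNT(Email) skips NULLs,
# COUNT(DISTINCT LastName) counts each surname once.
print(all_rows, with_email, distinct_surnames)
```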

SQL MAX

The SQL MAX aggregate function allows us to select the highest (maximum) value for a
certain column.

The SQL MAX function syntax is very simple and it looks like this:

SELECT MAX(Column1)
FROM Table1

If we use the Customers table from our previous chapters, we can select the latest
(highest) date of birth with the following SQL MAX expression:

SELECT MAX(DOB) AS MaxDOB
FROM Customers

SQL MIN

The SQL MIN aggregate function allows us to select the lowest (minimum) value for a
certain column.

The SQL MIN function syntax is very simple and it looks like this:

SELECT MIN(Column1)
FROM Table1

If we use the Customers table from our previous chapters, we can select the earliest
(lowest) date of birth with the following SQL MIN expression:

SELECT MIN(DOB) AS MinDOB
FROM Customers
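
MIN and MAX can be requested in a single query. The sketch below assumes SQLite via Python's built-in sqlite3 module and DOB stored as ISO-8601 text, an assumption that makes the smallest and largest strings coincide with the earliest and latest dates:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (FirstName TEXT, DOB TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("John", "1968-02-04"),
        ("Steven", "1974-04-04"),
        ("Paula", "1978-05-24"),
        ("James", "1980-10-20"),
    ],
)

# Both aggregates are computed over the same set of rows.
min_dob, max_dob = conn.execute(
    "SELECT MIN(DOB) AS MinDOB, MAX(DOB) AS MaxDOB FROM Customers"
).fetchone()
print(min_dob, max_dob)
```

With dates kept as text in a format like '20/10/1980', the same query would compare the strings character by character and could return a misleading result.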

SQL AVG

The SQL AVG aggregate function selects the average value for a certain table column.

Have a look at the SQL AVG syntax:

SELECT AVG(Column1)
FROM Table1

If we want to find out the average SaleAmount in the Sales table, we will use the
following SQL AVG statement:

SELECT AVG(SaleAmount) AS AvgSaleAmount
FROM Sales

which will result in the following dataset:

AvgSaleAmount
$195.73

SQL SUM

The SQL SUM aggregate function allows selecting the total for a numeric column.

The SQL SUM syntax is displayed below:

SELECT SUM(Column1)
FROM Table1

We are going to use the Sales table to illustrate the use of the SQL SUM function:

Sales:

CustomerID Date SaleAmount


2 5/6/2004 $100.22
1 5/7/2004 $99.95
3 5/7/2004 $122.95
3 5/13/2004 $100.00
4 5/22/2004 $555.55

Consider the following SQL SUM statement:

SELECT SUM(SaleAmount)
FROM Sales
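
The totals for the Sales table above can be checked with this sketch, which assumes SQLite via Python's built-in sqlite3 module. The amounts are stored as whole cents in a hypothetical AmountCents column, a choice made here to avoid floating-point rounding when adding currency values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (CustomerID INTEGER, AmountCents INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    [(2, 10022), (1, 9995), (3, 12295), (3, 10000), (4, 55555)],
)

total_cents, average_cents = conn.execute(
    "SELECT SUM(AmountCents), AVG(AmountCents) FROM Sales"
).fetchone()

# Convert back to dollars only for display.
print(f"total: ${total_cents / 100:.2f}, average: ${average_cents / 100:.2f}")
```

The total is $978.67, and dividing it by the 5 sales gives the $195.73 average shown in the SQL AVG section.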

SQL GROUP BY

The SQL GROUP BY statement is used along with SQL aggregate functions like SUM to
provide a means of grouping the result dataset by certain database table column(s).

The best way to explain how and when to use the SQL GROUP BY statement is by
example, and that’s what we are going to do.

Consider the following database table called EmployeeHours, storing the daily hours for
each employee of a fictitious company:

Employee Date Hours


John Smith 5/6/2004 8
Allan Babel 5/6/2004 8
Tina Crown 5/6/2004 8
John Smith 5/7/2004 9
Allan Babel 5/7/2004 8
Tina Crown 5/7/2004 10
John Smith 5/8/2004 8
Allan Babel 5/8/2004 8
Tina Crown 5/8/2004 9

If the manager of the company wants to get the simple sum of all hours worked by all
employees, he needs to execute the following SQL statement:

SELECT SUM(Hours)
FROM EmployeeHours

But what if the manager wants to get the sum of all hours for each of his employees?
To do that he needs to modify his SQL query and use the SQL GROUP BY statement:

SELECT Employee, SUM(Hours)
FROM EmployeeHours
GROUP BY Employee

The result of the SQL expression above will be the following:

Employee Hours
John Smith 25
Allan Babel 24
Tina Crown 27

As you can see, we have only one entry for each employee, because we are grouping by the
Employee column.

The SQL GROUP BY clause can be used with other SQL aggregate functions, for example
SQL AVG:

SELECT Employee, AVG(Hours)
FROM EmployeeHours
GROUP BY Employee

The result of the SQL statement above will be:

Employee Hours
John Smith 8.33
Allan Babel 8
Tina Crown 9

In our EmployeeHours table we can group by the Date column too, to find out the total
number of hours worked on each of the dates in the table:

SELECT Date, SUM(Hours)
FROM EmployeeHours
GROUP BY Date

Here is the result of the above SQL expression:

Date Hours
5/6/2004 24
5/7/2004 27
5/8/2004 25
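
Both groupings can be reproduced with the sketch below, assuming SQLite via Python's built-in sqlite3 module. The date column is named WorkDate here (an assumption, chosen to avoid clashes with the DATE keyword in some databases) and holds ISO-8601 text:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeHours (Employee TEXT, WorkDate TEXT, Hours INTEGER)")
conn.executemany(
    "INSERT INTO EmployeeHours VALUES (?, ?, ?)",
    [
        ("John Smith", "2004-05-06", 8), ("Allan Babel", "2004-05-06", 8),
        ("Tina Crown", "2004-05-06", 8), ("John Smith", "2004-05-07", 9),
        ("Allan Babel", "2004-05-07", 8), ("Tina Crown", "2004-05-07", 10),
        ("John Smith", "2004-05-08", 8), ("Allan Babel", "2004-05-08", 8),
        ("Tina Crown", "2004-05-08", 9),
    ],
)

# One output row per employee, with that employee's hours added up.
hours_per_employee = dict(
    conn.execute("SELECT Employee, SUM(Hours) FROM EmployeeHours GROUP BY Employee")
)

# One output row per date, with all employees' hours for that date added up.
hours_per_date = dict(
    conn.execute("SELECT WorkDate, SUM(Hours) FROM EmployeeHours GROUP BY WorkDate")
)

print(hours_per_employee)
print(hours_per_date)
```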

SQL HAVING

The SQL HAVING clause is used to conditionally restrict the output of a SQL statement,
based on a SQL aggregate function used in your SELECT list of columns.

You can't specify criteria in a SQL WHERE clause against a column in the SELECT list for
which a SQL aggregate function is used. For example, the following SQL statement will
generate an error:

SELECT Employee, SUM(Hours)
FROM EmployeeHours
WHERE SUM(Hours) > 24
GROUP BY Employee

The SQL HAVING clause is used to do exactly this, to specify a condition for an aggregate
function which is used in your query:

SELECT Employee, SUM(Hours)
FROM EmployeeHours
GROUP BY Employee
HAVING SUM(Hours) > 24

The above SQL statement will select all employees and the sum of their respective hours,
as long as this sum is greater than 24. The result of the SQL HAVING clause can be seen
below:

Employee Hours
John Smith 25
Tina Crown 27
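
WHERE filters individual rows before grouping, while HAVING filters the groups afterwards. The sketch below, assuming SQLite via Python's built-in sqlite3 module, runs both variants; the exact error raised for the WHERE version differs from database to database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeHours (Employee TEXT, Hours INTEGER)")
conn.executemany(
    "INSERT INTO EmployeeHours VALUES (?, ?)",
    [
        ("John Smith", 8), ("Allan Babel", 8), ("Tina Crown", 8),
        ("John Smith", 9), ("Allan Babel", 8), ("Tina Crown", 10),
        ("John Smith", 8), ("Allan Babel", 8), ("Tina Crown", 9),
    ],
)

# An aggregate in WHERE is rejected, because WHERE runs before grouping.
try:
    conn.execute(
        "SELECT Employee, SUM(Hours) FROM EmployeeHours "
        "WHERE SUM(Hours) > 24 GROUP BY Employee"
    )
    where_rejected = False
except sqlite3.Error as exc:
    where_rejected = True
    print("rejected:", exc)

# HAVING applies the condition to each group's total instead.
over_24 = dict(
    conn.execute(
        "SELECT Employee, SUM(Hours) FROM EmployeeHours "
        "GROUP BY Employee HAVING SUM(Hours) > 24"
    )
)
print(over_24)
```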

SQL JOIN

The SQL JOIN clause is used whenever we have to select data from 2 or more tables.

To be able to use SQL JOIN clause to extract data from 2 (or more) tables, we need a
relationship between certain columns in these tables.

We are going to illustrate our SQL JOIN example with the following 2 tables:

Customers:

CustomerID FirstName LastName Email DOB Phone

1 John Smith John.Smith@yahoo.com 2/4/1968 626 222-2222
2 Steven Goldfish goldfish@fishhere.net 4/4/1974 323 455-4545
3 Paula Brown pb@herowndomain.org 5/24/1978 416 323-3232
4 James Smith jim@supergig.co.uk 20/10/1980 416 323-8888

Sales:

CustomerID Date SaleAmount


2 5/6/2004 $100.22
1 5/7/2004 $99.95
3 5/7/2004 $122.95
3 5/13/2004 $100.00
4 5/22/2004 $555.55

As you can see, those 2 tables have a common field called CustomerID, and thanks to that
we can extract information from both tables by matching their CustomerID columns.

Consider the following SQL statement:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers, Sales
WHERE Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName

The SQL expression above will select all distinct customers (their first and last names) and
the total amount of dollars each of them has spent.

The SQL JOIN condition has been specified in the SQL WHERE clause and says that the
2 tables have to be matched by their respective CustomerID columns.

Here is the result of this SQL statement:

FirstName LastName SalesPerCustomer


John Smith $99.95
Steven Goldfish $100.22
Paula Brown $222.95
James Smith $555.55

The SQL statement above can be re-written using the SQL JOIN clause like this:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName

There are 2 types of SQL JOINs: INNER JOINs and OUTER JOINs. If you don't put the
INNER or OUTER keyword in front of the SQL JOIN keyword, then INNER JOIN is used.
In short, "INNER JOIN" = "JOIN" (note that different databases have different syntax for
their JOIN clauses).

The INNER JOIN will select all rows from both tables as long as there is a match between
the columns we are matching on. In case we have a customer in the Customers table,
which still hasn't made any orders (there are no entries for this customer in the Sales
table), this customer will not be listed in the result of our SQL query above.

If the Sales table has the following rows:

CustomerID Date SaleAmount


2 5/6/2004 $100.22
1 5/6/2004 $99.95

And we use the same SQL JOIN statement from above:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName

We'll get the following result:

FirstName LastName SalesPerCustomer


John Smith $99.95
Steven Goldfish $100.22

Even though Paula and James are listed as customers in the Customers table, they won't be
displayed because they haven't purchased anything yet.

But what if you want to display all the customers and their sales, no matter if they have
ordered something or not? We’ll do that with the help of the SQL OUTER JOIN clause.

The second type of SQL JOIN is called SQL OUTER JOIN. The 2 sub-types covered here are
LEFT OUTER JOIN and RIGHT OUTER JOIN (many databases also offer FULL OUTER JOIN).

The LEFT OUTER JOIN or simply LEFT JOIN (you can omit the OUTER keyword in most
databases), selects all the rows from the first table listed after the FROM clause, no matter
if they have matches in the second table.

If we slightly modify our last SQL statement to:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers LEFT JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName

and the Sales table still has the following rows:

CustomerID Date SaleAmount


2 5/6/2004 $100.22
1 5/6/2004 $99.95

The result will be the following:

FirstName LastName SalesPerCustomer


John Smith $99.95
Steven Goldfish $100.22
Paula Brown NULL
James Smith NULL

As you can see we have selected everything from the Customers (first table). For all rows
from Customers, which don’t have a match in the Sales (second table), the
SalesPerCustomer column has amount NULL (NULL means a column contains nothing).

The RIGHT OUTER JOIN or just RIGHT JOIN behaves exactly as SQL LEFT JOIN,
except that it returns all rows from the second table (the right table in our SQL JOIN
statement).
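
The difference between the inner and the left join can be reproduced with the sketch below, assuming SQLite via Python's built-in sqlite3 module and the two-row Sales table from the example. The table aliases c and s and the grouping by CustomerID are choices made for this sketch; grouping by the ID keeps two customers who happen to share a name from being merged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE Sales (CustomerID INTEGER, SaleAmount REAL);
INSERT INTO Customers VALUES (1, 'John', 'Smith'), (2, 'Steven', 'Goldfish'),
                             (3, 'Paula', 'Brown'), (4, 'James', 'Smith');
INSERT INTO Sales VALUES (2, 100.22), (1, 99.95);
""")

query = """
SELECT c.FirstName, SUM(s.SaleAmount) AS SalesPerCustomer
FROM Customers AS c {join} Sales AS s ON c.CustomerID = s.CustomerID
GROUP BY c.CustomerID, c.FirstName
ORDER BY c.CustomerID
"""

# Inner join: only customers with at least one matching sale.
inner_rows = conn.execute(query.format(join="JOIN")).fetchall()

# Left join: every customer; the total is NULL (None in Python) without sales.
left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```

If a zero is preferred to NULL in the output, COALESCE(SUM(s.SaleAmount), 0) is a common way to substitute it.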

IT professionals and students from all over the world have many options for SQL training
nowadays. They can learn SQL by going to an instructor-led SQL course, they can buy a SQL
book, they can take an online SQL training course, or they can use one of the many SQL
training resources online. The first difference between the above SQL training options is the
price tag. Instructor-led courses usually last 2 to 5 days and can cost up to several
thousand dollars. Online SQL training courses are usually less expensive, but they cost
hundreds of dollars most of the time. Another SQL training option is buying SQL training
DVDs. Again the price may vary from $50 to $1,000. Buying a SQL book is the least
expensive paid option for SQL preparation (books usually cost between $30 and $100). The
last option is to use free online resources like SQL-Tutorial.net or SQL Training.

Each of the SQL training alternatives has its pros and cons. For example, the instructor-led
courses have the advantage of real-time communication with the instructor and hands-on
SQL training. On the other hand, they are very expensive and not everybody will be willing
to invest thousands of dollars in SQL education. If you can get your company to pay for
such a SQL course, don’t miss the opportunity.

You can buy SQL training DVDs, but the content won’t be interactive most of the time,
which is a drawback. The advantage of a SQL DVD is that it is less expensive.

If you buy a SQL training book, make sure that the book has good reviews; otherwise you
will be wasting your money.

Finally, practice, practice and practice again the SQL skills you have learned, no matter
which SQL training avenue you choose.

What is ETL?

ETL stands for Extract, Transform and Load, which is a process used to collect data from
various sources, transform the data depending on business rules/needs and load the data
into a destination database. The need to use ETL arises from the fact that in modern
computing business data resides in multiple locations and in many incompatible formats.
For example, business data might be stored on the file system in various formats (Word
docs, PDF, spreadsheets, plain text, etc.), or can be stored as email files, or can be kept in
various database servers like MS SQL Server, Oracle and MySQL. Handling all
this business information efficiently is a great challenge, and ETL plays an important role in
solving this problem.

Extract, Transform and Load


The ETL process has 3 main steps, which are Extract, Transform and Load.

Extract – The first step in the ETL process is extracting the data from various sources.
Each of the source systems may store its data in completely different format from the rest.
The sources are usually flat files or RDBMS, but almost any data storage can be used as a
source for an ETL process.

Transform – Once the data has been extracted and converted into the expected format, it’s
time for the next step in the ETL process, which is transforming the data according to a set
of business rules. The data transformation may include various operations including but not
limited to filtering, sorting, aggregating, joining data, cleaning data, generating calculated
data based on existing values, validating data, etc.

Load – The final ETL step involves loading the transformed data into the destination target,
which might be a database or data warehouse.
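
The three steps can be sketched in a few lines of Python using only the standard library. Everything specific in this example is an assumption made for illustration: the CSV text stands in for a source system, the business rules (drop rows without an amount, convert dates to ISO format, store amounts as cents) are hypothetical, and an in-memory SQLite database plays the role of the destination.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Extract: read rows from a hypothetical CSV export of a source system.
raw = io.StringIO(
    "CustomerID,Date,SaleAmount\n"
    "2,5/6/2004,$100.22\n"
    "1,5/7/2004,$99.95\n"
    "3,5/7/2004,\n"  # missing amount: dropped by the validation rule below
)
extracted = list(csv.DictReader(raw))

# Transform: validate rows, normalise dates, convert amounts to whole cents.
transformed = []
for row in extracted:
    if not row["SaleAmount"]:
        continue  # validation rule: skip rows without an amount
    sale_date = datetime.strptime(row["Date"], "%m/%d/%Y").date().isoformat()
    cents = round(float(row["SaleAmount"].lstrip("$")) * 100)
    transformed.append((int(row["CustomerID"]), sale_date, cents))

# Load: write the cleaned rows into the destination table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (CustomerID INTEGER, SaleDate TEXT, AmountCents INTEGER)")
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?)", transformed)

loaded = conn.execute("SELECT * FROM Sales ORDER BY rowid").fetchall()
print(loaded)
```

Dedicated ETL tools add scheduling, logging, error handling and connectors for many source types on top of this basic pattern.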

ETL Tools

Many of the biggest software players produce ETL tools, including IBM (IBM InfoSphere
DataStage), Oracle (Oracle Warehouse Builder) and of course Microsoft with their SQL
Server Integration Services (SSIS) included in certain editions of Microsoft SQL Server
2005 and 2008.
