
Using Joins in DB2

Pam Odden

Staff Development Session 2002.21

1 12/11/2002

Objectives
What is an inner join?
How is an outer join different?
Which one should I use?
Joining a table to itself
Coding considerations for joining multiple tables


What is an Inner Join?

The process of forming pairs of rows by matching the contents of related columns is called joining the tables. The resulting table, which contains data from both of the original tables, is called a join between the two tables.

An inner join returns a row for every pair of rows that are matched by the related columns. For example, to see a list of students and their emergency contact numbers for a particular class, we can join the astu_student table of student information with the aemg_contact table of students' emergency contact information (see the Inner Join Example that follows).

Inner Join Example


    SELECT A.PERMNUM, A.FIRSTNAME, C.TELEPHONE1
      FROM SSASIDB1.ASTU_STUDENT A
     INNER JOIN SSASIDB1.AEMG_CONTACT C
        ON A.SCHOOLNUM = C.SCHOOLNUM
       AND A.PERMNUM   = C.PERMNUM
     WHERE A.SCHOOLNUM = '235'
       AND A.TRK       = '5'
       AND A.GRADE     = ' K';

    ---------+---------+---------+---------+---------+--
    PERMNUM  FIRSTNAME  TELEPHONE1
    ---------+---------+---------+---------+---------+--
    294286   EVAN       (702) 604-4708
    295027   CAMERON    (702) 656-5636
    326104   PRESTON    (702) 646-1653
    332914   CHASEN     (702) 496-0047
    351658   SAMANTHA   (   ) 648-4429
    351672   HUNTER     (   ) 286-6999
    . . .
    400437   MARCUS     (   ) 655-9361
    400439   TAYLOR     (   ) 373-5672
    . . .
    464301   DEIRDRE    (   ) 453-3709
    465775   JOHNATHEN  (702) 315-8291
    DSNE610I NUMBER OF ROWS DISPLAYED IS 37


Inner Join Notation

The select statement on the prior slide shows an inner join with SQL2 standard notation, which specifically uses the words INNER JOIN and puts the join conditions in the FROM clause. Before the SQL2 standard, inner joins were expressed more like a single-table select statement, with the tables in a list separated by commas and the join conditions included in the WHERE clause. The following query has the same result set as the one on the prior slide:

    SELECT A.PERMNUM, A.FIRSTNAME, C.TELEPHONE1
      FROM SSASIDB1.ASTU_STUDENT A,
           SSASIDB1.AEMG_CONTACT C
     WHERE A.SCHOOLNUM = C.SCHOOLNUM
       AND A.PERMNUM   = C.PERMNUM
       AND A.SCHOOLNUM = '235'
       AND A.TRK       = '5'
       AND A.GRADE     = ' K';

The SQL2 standard also includes other optional notation for inner joins which is not supported by DB2.
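The equivalence of the two notations can be checked on a small scale. The sketch below is only an illustration: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for DB2, trimmed-down versions of the two tables, and made-up sample rows.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE astu_student (schoolnum TEXT, permnum TEXT, firstname TEXT);
    CREATE TABLE aemg_contact (schoolnum TEXT, permnum TEXT, telephone1 TEXT);
    -- Made-up sample rows: student 1002 has no emergency contact.
    INSERT INTO astu_student VALUES
        ('235', '1001', 'ANNA'), ('235', '1002', 'BEN'), ('235', '1003', 'CARL');
    INSERT INTO aemg_contact VALUES
        ('235', '1001', '555-0101'), ('235', '1003', '555-0103');
""")

# SQL2 notation: join conditions in the FROM clause.
sql2_notation = """
    SELECT a.permnum, a.firstname, c.telephone1
      FROM astu_student a
     INNER JOIN aemg_contact c
        ON a.schoolnum = c.schoolnum
       AND a.permnum   = c.permnum
     WHERE a.schoolnum = '235'
     ORDER BY a.permnum
"""

# Older notation: comma-separated tables, join conditions in the WHERE clause.
older_notation = """
    SELECT a.permnum, a.firstname, c.telephone1
      FROM astu_student a, aemg_contact c
     WHERE a.schoolnum = c.schoolnum
       AND a.permnum   = c.permnum
       AND a.schoolnum = '235'
     ORDER BY a.permnum
"""

rows_sql2 = conn.execute(sql2_notation).fetchall()
rows_older = conn.execute(older_notation).fetchall()

print(rows_sql2)
print("same result set:", rows_sql2 == rows_older)
```

Both queries return only the two students that have a matching contact row.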


When an Inner Join is not Sufficient

The inner join in the previous example is fine if we want only students with emergency contacts on our list. However, what if we wanted the list to include ALL students in the class, whether they have an emergency contact or not? Many students in the class are not listed in our inner join, because they are not paired with a matching emergency contact row.

    SELECT COUNT(*)
      FROM SSASIDB1.ASTU_STUDENT A
     WHERE A.SCHOOLNUM = '235'
       AND A.TRK       = '5'
       AND A.GRADE     = ' K';

    ---------+---------+---------+
            113


What is an Outer Join?

An outer join extends the standard inner join by retaining unmatched rows of one or both of the joined tables in the query results, and using null values for data from the other table.

Which table's unmatched rows are kept depends on the keywords LEFT OUTER JOIN or RIGHT OUTER JOIN. LEFT and RIGHT literally refer to the table whose name would be on the left or right side of the join keywords if the FROM clause were written out all on one line.

When the keywords FULL OUTER JOIN are used, unmatched rows from both tables are kept, with nulls in the columns of the table without a matching row. Full outer joins are not efficient in DB2 v5, so they are not used at CCSD.
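To make the meaning of LEFT concrete, here is a minimal sketch using Python's sqlite3 module in place of DB2 (an assumption for illustration). The table names, columns, and rows are made up: one student has no contact, and one contact has no student. Swapping which table is written on the left changes which unmatched rows survive.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (permnum TEXT, firstname TEXT);
    CREATE TABLE contact (permnum TEXT, telephone1 TEXT);
    -- Student 1002 has no contact row; contact 1004 has no student row.
    INSERT INTO student VALUES ('1001', 'ANNA'), ('1002', 'BEN'), ('1003', 'CARL');
    INSERT INTO contact VALUES
        ('1001', '555-0101'), ('1003', '555-0103'), ('1004', '555-0104');
""")


def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Inner join: only matched pairs.
inner_rows = run("""
    SELECT s.permnum, c.telephone1
      FROM student s INNER JOIN contact c ON s.permnum = c.permnum
     ORDER BY s.permnum
""")

# student is on the left: every student is kept, contact columns may be NULL.
students_kept = run("""
    SELECT s.permnum, c.telephone1
      FROM student s LEFT OUTER JOIN contact c ON s.permnum = c.permnum
     ORDER BY s.permnum
""")

# contact is on the left: every contact is kept, student columns may be NULL.
contacts_kept = run("""
    SELECT c.permnum, s.firstname
      FROM contact c LEFT OUTER JOIN student s ON s.permnum = c.permnum
     ORDER BY c.permnum
""")

print(inner_rows)
print(students_kept)   # NULL arrives in Python as None
print(contacts_kept)
```

The sketch sticks to LEFT OUTER JOIN because older SQLite releases do not accept RIGHT or FULL OUTER JOIN.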

Outer Join Example


    SELECT A.PERMNUM, A.FIRSTNAME, C.TELEPHONE1
      FROM SSASIDB1.ASTU_STUDENT A
      LEFT OUTER JOIN SSASIDB1.AEMG_CONTACT C
        ON A.SCHOOLNUM = C.SCHOOLNUM
       AND A.PERMNUM   = C.PERMNUM
     WHERE A.SCHOOLNUM = '235'
       AND A.TRK       = '5'
       AND A.GRADE     = ' K';

    ---------+---------+---------+---------+---------+-
    PERMNUM  FIRSTNAME  TELEPHONE1
    ---------+---------+---------+---------+---------+-
    294286   EVAN       (702) 604-4708
    295027   CAMERON    (702) 656-5636
    326104   PRESTON    (702) 646-1653
    332914   CHASEN     (702) 496-0047
    351658   SAMANTHA   (   ) 648-4429
    351672   HUNTER     (   ) 286-6999
    . . .
    400437   MARCUS     (   ) 655-9361
    400438   MARTIN     ---------------
    400439   TAYLOR     (   ) 373-5672
    400440   CHAYLA     ---------------
    . . .
    464301   DEIRDRE    (   ) 453-3709
    465775   JOHNATHEN  (702) 315-8291
    DSNE610I NUMBER OF ROWS DISPLAYED IS 113


Outer Join Example using Coalesce


    SELECT A.PERMNUM, A.FIRSTNAME,
           COALESCE(C.TELEPHONE1, 'NO CONTACT INFO!')
      FROM SSASIDB1.ASTU_STUDENT A
      LEFT OUTER JOIN SSASIDB1.AEMG_CONTACT C
        ON A.SCHOOLNUM = C.SCHOOLNUM
       AND A.PERMNUM   = C.PERMNUM
     WHERE A.SCHOOLNUM = '235'
       AND A.TRK       = '5'
       AND A.GRADE     = ' K';

    ---------+---------+---------+---------+---------+-
    PERMNUM  FIRSTNAME  TELEPHONE1
    ---------+---------+---------+---------+---------+-
    294286   EVAN       (702) 604-4708
    295027   CAMERON    (702) 656-5636
    326104   PRESTON    (702) 646-1653
    332914   CHASEN     (702) 496-0047
    351658   SAMANTHA   (   ) 648-4429
    351672   HUNTER     (   ) 286-6999
    . . .
    400437   MARCUS     (   ) 655-9361
    400438   MARTIN     NO CONTACT INFO!
    400439   TAYLOR     (   ) 373-5672
    400440   CHAYLA     NO CONTACT INFO!
    . . .
    464301   DEIRDRE    (   ) 453-3709
    465775   JOHNATHEN  (702) 315-8291
    DSNE610I NUMBER OF ROWS DISPLAYED IS 113
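The effect of COALESCE on the unmatched rows can be reproduced in a few lines. This is a sketch only, assuming Python's sqlite3 module as a stand-in for DB2 and using invented tables and rows. The AS clause that names the result column is an addition for readability and is not part of the query above.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (permnum TEXT, firstname TEXT);
    CREATE TABLE contact (permnum TEXT, telephone1 TEXT);
    -- Made-up rows: BEN has no contact row.
    INSERT INTO student VALUES ('1001', 'ANNA'), ('1002', 'BEN');
    INSERT INTO contact VALUES ('1001', '555-0101');
""")

# COALESCE returns its first non-null argument, so the literal is used
# only for students whose contact columns came back null from the join.
rows = conn.execute("""
    SELECT s.permnum,
           s.firstname,
           COALESCE(c.telephone1, 'NO CONTACT INFO!') AS telephone1
      FROM student s
      LEFT OUTER JOIN contact c
        ON s.permnum = c.permnum
     ORDER BY s.permnum
""").fetchall()

for permnum, firstname, telephone1 in rows:
    print(permnum, firstname, telephone1)
```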


Self Joins
Some multi-table queries involve a relationship that a table has with itself. For example, suppose an Employee table has the employee's manager's employee number as one of its columns. The manager's name and other information are in a row of the table just like any other employee's. If we wanted to list the employees with the names of their managers, we would match each employee's row with the manager's row in the same table:

    SSN          NAME         MGR_SSN
    111-11-1111  PAM ODDEN    222-22-2222
    222-22-2222  KATHY JONES  333-33-3333
    333-33-3333  PHIL BRODY   ???-??-????

    SELECT EMPS.NAME, MGRS.NAME
      FROM MHRMSDB1.EMPLOYEE EMPS
     INNER JOIN MHRMSDB1.EMPLOYEE MGRS
        ON EMPS.MGR_SSN = MGRS.SSN
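The self join can be tried with the three sample employees from the table above. The sketch assumes Python's sqlite3 module in place of DB2, and it stores the unknown manager number (shown as ???-??-????) as NULL, which is an assumption made here for the example.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT, mgr_ssn TEXT);
    -- PHIL BRODY's manager is unknown; stored as NULL for this sketch.
    INSERT INTO employee VALUES
        ('111-11-1111', 'PAM ODDEN',   '222-22-2222'),
        ('222-22-2222', 'KATHY JONES', '333-33-3333'),
        ('333-33-3333', 'PHIL BRODY',  NULL);
""")

# The same table appears twice; the aliases emps and mgrs tell the two roles apart.
pairs = conn.execute("""
    SELECT emps.name, mgrs.name
      FROM employee emps
     INNER JOIN employee mgrs
        ON emps.mgr_ssn = mgrs.ssn
     ORDER BY emps.ssn
""").fetchall()

# A left outer self join also keeps the employee with no matching manager row.
pairs_with_unmatched = conn.execute("""
    SELECT emps.name, mgrs.name
      FROM employee emps
      LEFT OUTER JOIN employee mgrs
        ON emps.mgr_ssn = mgrs.ssn
     ORDER BY emps.ssn
""").fetchall()

print(pairs)
print(pairs_with_unmatched)
```

With the inner join, the employee whose manager cannot be matched drops out of the list; the left outer variant keeps that row with a null manager name.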


SQL Considerations for Joins

Qualified column names are needed to eliminate ambiguous column references. Note that in the Inner Join Example, all column names are prefaced (or qualified) by A. or C. This is only required for columns that appear by the same name in both tables, here SCHOOLNUM and PERMNUM. However, it is good practice to qualify all column names.

Table aliases can be used in the FROM clause to simplify qualifying column names, and are needed when joining a table with itself.

SELECT * has a special meaning for multi-table queries. When used as we use it for a single-table query, it selects all columns of all tables in the query. It can also be used with a qualifier to mean all the columns of one table. The following query selects two columns from table A and all columns from table C.

    SELECT A.PERMNUM, A.FIRSTNAME, C.*
      FROM SSASIDB1.ASTU_STUDENT A
     INNER JOIN SSASIDB1.AEMG_CONTACT C
        ON . . .


SQL Considerations, cont.

Checking for nulls is necessary even for columns that don't allow nulls in the database. For any table that is not the first or major table in an outer join, columns will be null when there was no matching row in that table. In COBOL, a null indicator must be defined and checked, just as for any other possible null value.

Join using SQL instead of program logic. The DB2 optimizer can usually perform a join faster than a programming language can process separate cursors for each table.

Use joins instead of subqueries when possible. Even when only columns from one of the tables are needed, it is usually more efficient to use a join.

Join on clustered or indexed columns when possible, for better efficiency.

Use caution when using ORDER BY. If the columns in the ORDER BY clause are all from one table, DB2 might avoid a sort.

Provide as many search criteria as possible in addition to the join criteria. Additional criteria in the WHERE clause, preferably for each of the tables, provide DB2 with the best opportunity to rank the tables for joining in the most efficient manner.
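As an illustration of rewriting a subquery as a join, the sketch below lists students that have at least one emergency contact, first with an IN subquery and then with a join. It uses Python's sqlite3 module rather than DB2 and invented rows, so it shows only that the two forms can return the same rows; it does not measure the performance difference, which depends on the DB2 optimizer. Note the DISTINCT: without it, a student with two contact rows appears twice in the join.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (permnum TEXT, firstname TEXT);
    CREATE TABLE contact (permnum TEXT, telephone1 TEXT);
    -- Made-up rows: ANNA has two contacts, BEN has none.
    INSERT INTO student VALUES ('1001', 'ANNA'), ('1002', 'BEN'), ('1003', 'CARL');
    INSERT INTO contact VALUES
        ('1001', '555-0101'), ('1001', '555-0111'), ('1003', '555-0103');
""")

# Subquery form: only student columns are needed in the result.
subquery_rows = conn.execute("""
    SELECT s.permnum, s.firstname
      FROM student s
     WHERE s.permnum IN (SELECT c.permnum FROM contact c)
     ORDER BY s.permnum
""").fetchall()

# Join form: DISTINCT removes the repeat caused by ANNA's second contact row.
join_rows = conn.execute("""
    SELECT DISTINCT s.permnum, s.firstname
      FROM student s
     INNER JOIN contact c
        ON s.permnum = c.permnum
     ORDER BY s.permnum
""").fetchall()

# Same join without DISTINCT, to show the duplicate it would otherwise return.
join_rows_with_repeats = conn.execute("""
    SELECT s.permnum, s.firstname
      FROM student s
     INNER JOIN contact c
        ON s.permnum = c.permnum
     ORDER BY s.permnum
""").fetchall()

print(subquery_rows)
print(join_rows)
print(len(join_rows_with_repeats), "rows without DISTINCT")
```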

Multitable Joins

When multiple tables are joined using the SQL2 notation, the first join occurs, producing a result set. Then the next table is joined with the result set from the first join, and so on. Join expressions may be enclosed in parentheses, and the resulting table used in another join expression.

The processing specified in the FROM clause occurs first, including all joins. Then the WHERE clause conditions are applied to the resulting table. In the WHERE clause, consideration must be given to the fact that some columns may be null in the result set even though those columns do not allow nulls in the database.

Multitable Join Example


    SELECT A.PERMNUM, A.FIRSTNAME, C.TELEPHONE1, P.FIRSTNAME
      FROM SSASIDB1.ASTU_STUDENT A
      LEFT OUTER JOIN SSASIDB1.AEMG_CONTACT C
        ON A.SCHOOLNUM = C.SCHOOLNUM
       AND A.PERMNUM   = C.PERMNUM
     INNER JOIN SSASIDB1.APRN_PARENT P
        ON A.SCHOOLNUM = P.SCHOOLNUM
       AND A.PERMNUM   = P.PERMNUM
     WHERE A.SCHOOLNUM = '235'
       AND A.TRK       = '5'
       AND A.GRADE     = ' K'
       AND P.CCSD_SEQUENCE = 1;

    ---------+---------+---------+---------+---------+---------+----
    PERMNUM  FIRSTNAME  TELEPHONE1       FIRSTNAME
    ---------+---------+---------+---------+---------+---------+----
    294286   EVAN       (702) 604-4708   SHANNON
    295027   CAMERON    (702) 656-5636   CHRISTOPHER
    326104   PRESTON    (702) 646-1653   JANA
    . . .
    400382   HALEY      ---------------  BRADLEY
    400389   TIANNA     (   ) 360-4277   LAWSON
    400400   KELSEY     ---------------  JACK
    . . .
    459439   ISAIAH     (   ) 360-6255   DANILO
    500507   GRAYDON    (   ) 656-2589   JESSE
    DSNE610I NUMBER OF ROWS DISPLAYED IS 40

This is not the number of rows we were expecting!
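Why rows go missing can be seen on a tiny data set. The sketch below assumes Python's sqlite3 module in place of DB2, with simplified tables and invented rows. The left outer join keeps every student, but the inner join to the parent table that follows it discards any student without a parent row.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (permnum TEXT, firstname TEXT);
    CREATE TABLE contact (permnum TEXT, telephone1 TEXT);
    CREATE TABLE parent  (permnum TEXT, firstname TEXT, ccsd_sequence INTEGER);
    -- Made-up rows: only ANNA has a contact; CARL has no parent row.
    INSERT INTO student VALUES ('1001', 'ANNA'), ('1002', 'BEN'), ('1003', 'CARL');
    INSERT INTO contact VALUES ('1001', '555-0101');
    INSERT INTO parent  VALUES ('1001', 'MARIA', 1), ('1002', 'LEE', 1);
""")

# The first join keeps all three students; the second (inner) join then
# requires a matching parent row, so CARL is dropped from the result.
rows = conn.execute("""
    SELECT s.permnum, c.telephone1, p.firstname
      FROM student s
      LEFT OUTER JOIN contact c
        ON s.permnum = c.permnum
     INNER JOIN parent p
        ON s.permnum = p.permnum
     WHERE p.ccsd_sequence = 1
     ORDER BY s.permnum
""").fetchall()

all_students = [r[0] for r in conn.execute("SELECT permnum FROM student ORDER BY permnum")]
returned = {r[0] for r in rows}
missing = [permnum for permnum in all_students if permnum not in returned]

print(rows)
print("students missing from the result:", missing)
```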


Multitable Join Example, cont.


    SELECT A.PERMNUM, A.FIRSTNAME, C.TELEPHONE1, P.FIRSTNAME
      FROM SSASIDB1.ASTU_STUDENT A
      LEFT OUTER JOIN SSASIDB1.AEMG_CONTACT C
        ON A.SCHOOLNUM = C.SCHOOLNUM
       AND A.PERMNUM   = C.PERMNUM
      LEFT OUTER JOIN SSASIDB1.APRN_PARENT P
        ON A.SCHOOLNUM = P.SCHOOLNUM
       AND A.PERMNUM   = P.PERMNUM
     WHERE A.SCHOOLNUM = '235'
       AND A.TRK       = '5'
       AND A.GRADE     = ' K'
       AND (P.CCSD_SEQUENCE = 1 OR P.CCSD_SEQUENCE IS NULL);

    ---------+---------+---------+---------+---------+---------+--------
    PERMNUM  FIRSTNAME  TELEPHONE1       FIRSTNAME
    ---------+---------+---------+---------+---------+---------+--------
    294286   EVAN       (702) 604-4708   SHANNON
    295027   CAMERON    (702) 656-5636   CHRISTOPHER
    400497   TROY       (   ) 233-2830   -------------------
    400500   BOSTYN     ---------------  -------------------
    . . .
    400506   SARAH      (   ) 645-0003   RICHARD
    400514   SAVANNAH   ---------------  -------------------
    . . .
    407217   ZOIE       (   ) 646-6156   SHANNON
    413048   ALEX       ---------------  AMANDA
    413111   RACHEL     (   ) 645-8081   DEBRA
    499390   SPENCER    ---------------  -------------------
    500507   GRAYDON    (   ) 656-2589   JESSE
    DSNE610I NUMBER OF ROWS DISPLAYED IS 113

Note we need to add the last line to the query, to include rows where there was no APRN_PARENT row and the P.CCSD_SEQUENCE is null.
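One case the IS NULL test does not cover is a student who has parent rows, but none with sequence 1: the joined row has a non-null sequence, so the WHERE clause removes it. The sketch below shows this, along with one possible alternative that moves the sequence test into the ON clause. It assumes Python's sqlite3 module in place of DB2 and uses invented rows. The ON-clause variant is an illustration, not something taken from the slides, and whether a given DB2 release accepts that predicate in the ON clause is not confirmed here.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (permnum TEXT, firstname TEXT);
    CREATE TABLE parent  (permnum TEXT, firstname TEXT, ccsd_sequence INTEGER);
    -- Made-up rows: ANNA has parents 1 and 2, BEN has none,
    -- CARL has only a sequence-2 parent.
    INSERT INTO student VALUES ('1001', 'ANNA'), ('1002', 'BEN'), ('1003', 'CARL');
    INSERT INTO parent  VALUES
        ('1001', 'MARIA', 1), ('1001', 'JOSE', 2), ('1003', 'LEE', 2);
""")

# Test applied in WHERE, after the join: CARL's only joined row has
# sequence 2, which is neither 1 nor NULL, so CARL is filtered out.
where_rows = conn.execute("""
    SELECT s.permnum, p.firstname
      FROM student s
      LEFT OUTER JOIN parent p
        ON s.permnum = p.permnum
     WHERE (p.ccsd_sequence = 1 OR p.ccsd_sequence IS NULL)
     ORDER BY s.permnum
""").fetchall()

# Test applied in ON, during the join: only sequence-1 rows count as
# matches, and every student without one is kept with nulls.
on_rows = conn.execute("""
    SELECT s.permnum, p.firstname
      FROM student s
      LEFT OUTER JOIN parent p
        ON s.permnum = p.permnum
       AND p.ccsd_sequence = 1
     ORDER BY s.permnum
""").fetchall()

print(where_rows)
print(on_rows)
```

If every student with parent rows always has a sequence-1 row, the two forms return the same result.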


Cross Joins and Union Joins in SQL2

Here are two new types of joins defined in the SQL2 standard, but not yet adopted by DB2 (as of v.7).

Cross join: a Cartesian product of two tables.

    SELECT * FROM TABLEA CROSS JOIN TABLEB

Union join: all the rows of the first table, null-extended with columns of the second table, plus all the rows of the second table, null-extended with columns of the first table.

    SELECT * FROM TABLEA UNION JOIN TABLEB
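The shape of both results can be sketched with Python's sqlite3 module, used here as a stand-in because SQLite accepts CROSS JOIN. SQLite has no UNION JOIN, so the second query only emulates the described result with UNION ALL and explicit NULL columns. The tables, columns, and rows are invented for the example.

```python
import sqlite3

# In-memory SQLite database used purely for illustration (not DB2).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (a_id INTEGER);
    CREATE TABLE tableb (b_id TEXT);
    INSERT INTO tablea VALUES (1), (2);
    INSERT INTO tableb VALUES ('x'), ('y'), ('z');
""")

# Cross join: every row of tablea paired with every row of tableb (2 x 3 rows).
cross_rows = conn.execute("""
    SELECT a_id, b_id
      FROM tablea CROSS JOIN tableb
     ORDER BY a_id, b_id
""").fetchall()

# Emulated union join: rows of tablea null-extended with tableb's column,
# followed by rows of tableb null-extended with tablea's column (2 + 3 rows).
union_join_rows = conn.execute("""
    SELECT a_id, NULL AS b_id FROM tablea
    UNION ALL
    SELECT NULL, b_id FROM tableb
""").fetchall()

print(len(cross_rows), cross_rows)
print(len(union_join_rows), union_join_rows)
```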


Summary

In a multi-table query (a join), the tables containing the data are named in the FROM clause.

Join criteria (called the join predicate) may be in the FROM clause (SQL2 notation) or included in the WHERE clause.

The selection criteria (called the local predicate) are named in the WHERE clause, and are applied to the results of the join.

Outer joins extend the inner join by retaining unmatched rows of one or both of the joined tables in the query results, and using null values for data from the other table.

A table may be joined to itself. Self-joins require the use of a table alias.
