You are on page 1of 42

Sub-queries

IMPROVING QUERY PERFORMANCE IN SQL SERVER

Dean Smith
Founder, Atamai Analytics
How do sub-queries look?

IMPROVING QUERY PERFORMANCE IN SQL SERVER


How do sub-queries look?

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Sub-query with FROM

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Sub-query with FROM
SELECT OrderID, OrderID CustomerID NumDays
CustomerID,
NumDays 10380 HUNGO 35
FROM
10427 PICCO 35
(SELECT *,
DATEDIFF(DAY,OrderDate,ShippedDate) AS NumDays 10545 LAZYK 35
FROM Orders) AS o
10593 LEHMS 35
WHERE NumDays >= 35;
10660 HUNGC 37
10777 GOURL 37
10924 BERGS 35

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Sub-query with WHERE

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Sub-query with WHERE
SELECT CustomerID CustomerID CompanyName
,CompanyName
FROM Customers QUEEN Queen Cozinha
WHERE CustomerID
QUICK QUICK-Stop
IN (SELECT CustomerID
FROM Orders SAVEA Save-a-lot Markets
WHERE Freight > 800);

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Sub-query with SELECT

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Sub-query with SELECT
SELECT CustomerID, CustomerID CompanyName AvgFreight
CompanyName,
(SELECT AVG(Freight) Alfreds
ALFKI 37.6
FROM Orders o Fu erkiste
WHERE c.CustomerID = o.CustomerID) AS AvgFreight
FROM Customers c; Ana Trujillo
ANATR Emparedados y 24.4
helados
Antonio Moreno
ANTON 38.4
Taquería
... ... ...

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Types of sub-queries
Uncorrelated sub-query Correlated sub-query

SELECT CustomerID SELECT CustomerID,


,CompanyName CompanyName,
FROM Customers (SELECT AVG(Freight)
WHERE CustomerID IN FROM Orders o
(SELECT CustomerID WHERE c.CustomerID = o.CustomerID) AS AvgFreight
FROM Orders FROM Customers c;
WHERE Freight > 800);

Sub-query contains a reference to the outer


Sub-query does not contain a reference to
query
the outer query
Sub-query cannot run independently of the
Sub-query can run independently of the
outer query
outer query
Used with WHERE and SELECT
Used with WHERE and FROM

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Sub-query performance
Correlated

Sub-query executes for each row in the outer query

Uncorrelated

Sub-query executes only once and returns the results to the outer query

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Sub-query vs. INNER JOIN
Correlated sub-query INNER JOIN

SELECT CustomerID, SELECT c.CustomerID,


CompanyName, c.CompanyName,
(SELECT AVG(Freight) AVG(o.Freight)
FROM Orders o FROM Customers c
WHERE c.CustomerID = o.CustomerID) AS AvgFreight INNER JOIN Orders o
FROM Customers c; ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerID,
c.CompanyName;

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Let's practice!
IMPROVING QUERY PERFORMANCE IN SQL SERVER
Presence and
absence
IMPROVING QUERY PERFORMANCE IN SQL SERVER

Dean Smith
Founder, Atamai Analytics
Venn diagram - presence
Data present in both tables.

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Venn diagram - absence
Data present in the le table but absent in the right table.

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Customer Orders database

IMPROVING QUERY PERFORMANCE IN SQL SERVER


INTERSECT
SELECT CustomerID
FROM Customers

SELECT CustomerID
FROM Orders;

IMPROVING QUERY PERFORMANCE IN SQL SERVER


INTERSECT
SELECT CustomerID CustomerID
FROM Customers
ALFKI
INTERSECT LAUGB

SELECT CustomerID
QUICK
FROM Orders; REGGC
SPLIR
CHOPS
...

IMPROVING QUERY PERFORMANCE IN SQL SERVER


EXCEPT
SELECT CustomerID
FROM Customers

SELECT CustomerID
FROM Orders;

IMPROVING QUERY PERFORMANCE IN SQL SERVER


EXCEPT
SELECT CustomerID CustomerID
FROM Customers
FISSA
EXCEPT PARIS

SELECT CustomerID
FROM Orders;

IMPROVING QUERY PERFORMANCE IN SQL SERVER


INTERSECT and EXCEPT
Advantages

Great for data interrogation

Remove duplicates from the returned results

Disadvantages

The number and order of columns in the SELECT statement must be the same between
queries

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Let's practice!
IMPROVING QUERY PERFORMANCE IN SQL SERVER
Alternative methods
1
IMPROVING QUERY PERFORMANCE IN SQL SERVER

Dean Smith
Founder, Atamai Analytics
EXISTS
SELECT CustomerID, CustomerID CompanyName ContactName
CompanyName,
Alfreds
ContactName ALFKI Maria Anders
Fu erkiste
FROM Customers c
Laughing
WHERE EXISTS Yoshi
LAUGB Bacchus Wine
(SELECT 1 Tannamuri
Cellars
FROM Orders o
WHERE c.CustomerID = o.CustomerID);
QUICK QUICK-Stop Horst Kloss
... ... ...

IMPROVING QUERY PERFORMANCE IN SQL SERVER


IN
SELECT CustomerID, CustomerID CompanyName ContactName
CompanyName, Alfreds
ALFKI Maria Anders
ContactName Fu erkiste
FROM Customers
Laughing
WHERE CustomerID IN Yoshi
LAUGB Bacchus Wine
Tannamuri
(SELECT CustomerID Cellars
FROM Orders);
QUICK QUICK-Stop Horst Kloss
... ... ...

IMPROVING QUERY PERFORMANCE IN SQL SERVER


EXISTS vs. IN

EXISTS will stop searching the sub-query when the condition is TRUE

IN collects all the results from a sub-query before passing to the outer query

Consider using EXISTS instead of IN with a sub-query

IMPROVING QUERY PERFORMANCE IN SQL SERVER


NOT EXISTS
SELECT CustomerID, CustomerID CompanyName ContactName
CompanyName,
FISSA Fabrica
ContactName FISSA Inter. Diego Roel
FROM Customers c Salchichas S.A.
WHERE NOT EXISTS
Marie
(SELECT 1 PARIS Paris spécialités
Bertrand
FROM Orders o
WHERE c.CustomerID = o.CustomerID);

IMPROVING QUERY PERFORMANCE IN SQL SERVER


NOT IN
SELECT CustomerID, CustomeID CompanyName ContactName
CompanyName,
FISSA Fabrica
ContactName FISSA Inter. Diego Roel
FROM Customers Salchichas S.A.
WHERE CustomerID NOT IN
Marie
(SELECT CustomerID PARIS Paris spécialités
Bertrand
FROM Orders);

IMPROVING QUERY PERFORMANCE IN SQL SERVER


NOT IN and NULLs
SELECT UNStatisticalRegion AS UN_Region UN_Region CountryName Capital
,CountryName
,Capital
FROM Nations
WHERE Capital NOT IN
(SELECT NearestPop
FROM Earthquakes);

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Handling NOT IN NULLs
SELECT UNStatisticalRegion AS UN_Region UN_Region CountryName Capital
,CountryName
New
,Capital South Asia India
Delhi
FROM Nations
East Asia and
WHERE Capital NOT IN Indonesia Jakarta
Paci c
(SELECT NearestPop
FROM Earthquakes East Asia and
East Timor Dili
WHERE NearestPop IS NOT NULL);
Paci c
Sahara Africa Comoros Moroni
... ... ...

IMPROVING QUERY PERFORMANCE IN SQL SERVER


EXISTS, NOT EXISTS, IN and NOT IN
Advantages

Results can contain any column from the outer query, and in any order

Disadvantages

The way NOT IN handles NULL values in the sub-query

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Let's practice!
IMPROVING QUERY PERFORMANCE IN SQL SERVER
Alternative methods
2
IMPROVING QUERY PERFORMANCE IN SQL SERVER

Dean Smith
Founder, Atamai Analytics
INNER JOIN
SELECT c.CustomerID CustomerID CompanyName OrderID ...
,c.CompanyName Vins et alcools
VINET 10248 ...
,o.OrderID Chevalier
,o.OrderDate
HANAR Hanari Carnes 10250 ...
,o.ShippedDate
,o.Freight Victuailles en
VICTE 10251 ...
stock
FROM Customers c
INNER JOIN Orders o Suprêmes
SUPRD 10252 ...
ON c.CustomerID = o.CustomerID délices
... ... ... ...

IMPROVING QUERY PERFORMANCE IN SQL SERVER


LEFT OUTER JOIN
Inclusive LEFT OUTER JOIN Exclusive LEFT OUTER JOIN

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Exclusive LEFT OUTER JOIN
SELECT c.CustomerID CustomerID CompanyName OrderID ...
,c.CompanyName FISSA Fabrica
,o.OrderID FISSA Inter. Salchichas NULL ...
,o.OrderDate S.A.
,o.ShippedDate PARIS Paris spécialités NULL ...
,o.Freight
FROM Customers c
LEFT OUTER JOIN Orders o
ON c.CustomerID = o.CustomerID
WHERE o.CustomerID IS NULL

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Review: INTERSECT and EXCEPT
INTERSECT : checks for presence

EXCEPT : checks for absence

Advantages

Great for data interrogation

Remove duplicates from the returned results

Disadvantage

The number and order of columns in the SELECT statement must be the same between
queries

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Review: EXISTS and NOT EXISTS
EXISTS : checks for presence

NOT EXISTS : checks for absence

Advantages

The sub-query will stop searching as soon as it evaluates to TRUE

Results can contain any column from the outer query, and in any order

Disadvantage

Results can only contain columns from the outer query

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Review: IN and NOT IN
IN : checks for presence

NOT IN : checks for absence

Advantage

Results can contain any column from the outer query, and in any order

Disadvantages

Results can only contain columns from the outer query

No results returned because of the way NOT IN handles nulls in the sub-query

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Review: INNER JOIN and exclusive L.O.J
INNER JOIN : checks for presence

exclusive LEFT OUTER JOIN : checks for absence

Advantage

Results can contain any column, from all joined queries, in any order

Disadvantage

Requirement to add the IS NULL WHERE lter condition with the exclusive
LEFT OUTER JOIN

IMPROVING QUERY PERFORMANCE IN SQL SERVER


Let's practice!
IMPROVING QUERY PERFORMANCE IN SQL SERVER

You might also like