You are on page 1of 58

SQL

:

The Query
Language

CS 186, Spring 2006,
Lectures 11&12
R &G - Chapter 5

Life is just a bowl of queries.
-Anon

Administrivia
• Midterm1 was a bit easier than I wanted it to be.
– Mean was 80
– Three people got 100(!)
– I’m actually quite pleased.
– But, I do plan to “kick it up a notch” for the future
exams.
• Be sure to register your name with your cs186 login if you
haven’t already --- else, you risk not getting grades.
• Homework 2 is being released today.
– Today and Tuesday’s lectures provide background.
– Hw 2 is due Tuesday 3/14
– It’s more involved than HW 1.

Relational Query Languages
• A major strength of the relational model: supports
simple, powerful querying of data.
• Two sublanguages:
• DDL – Data Defn Language
– define and modify schema (at all 3 levels)
• DML – Data Manipulation Language
– Queries can be written intuitively.
• The DBMS is responsible for efficient evaluation.
– The key: precise semantics for relational queries.
– Allows the optimizer to extensively re-order
operations, and still ensure that the answer does
not change.
– Internal cost model drives use of indexes and choice
of access paths and physical operators.

The SQL Query Language
• The most widely used relational query
language.
• Originally IBM, then ANSI in 1986
• Current standard is SQL-2003
• Introduced XML features, window functions,
sequences, auto-generated IDs.
• Not fully supported yet

• SQL-1999 Introduced “Object-Relational”
concepts. Also not fully suppored yet.
• SQL92 is a basic subset
• Most systems support a medium

• PostgreSQL has some “unique” aspects (as
do most systems).

The SQL DML
• Single-table queries are
straightforward.
• To find all 18 year old students,
we can write:
SELECT *
FROM Students S
WHERE S.age=18

QuickTime™ and a
TIFF (LZW) decompressor
are needed to see this picture.

• To find just names and logins, replace

SELECT S.name, S.login

E. E.cid result = S.name Jones History105 Note: obviously no referential integrity constraints have been used .sid=E.name.Querying Multiple Relations • Can specify a join over two tables as follows: SELECT S.cid FROM Students S. Enrolled E WHERE S.grade=‘B' QuickTime™ and a TIFF (LZW) decompressor are needed to see this picture.sid AND E.

SELECT [DISTINCT] target-list Basic SQL Query FROM relation- list • relation-list : A list WHERE of relation names qualification – possibly with a range-variable after each name • target-list : A list of attributes of tables in relation-list • qualification : Comparisons combined using AND. – Comparisons are Attr op const or Attr1 op Attr2. – In SQL SELECT. OR and NOT. the default is that duplicates are not eliminated! (Result is called a “multiset”) . where op is one of =≠<>≤≥ • DISTINCT: optional keyword indicating that the answer should not contain duplicates.

.Query Semantics • Semantics of an SQL query are defined in terms of the following conceptual evaluation strategy: 1.g. do SELECT clause: Delete unwanted fields. (i. do FROM clause: compute cross-product of tables (e. “projection”). Probably the least efficient way to compute a query! – An optimizer will find more efficient strategies to get the same answer.e. Students and Enrolled). If DISTINCT specified. . 2. “selection”).. do WHERE clause: Check conditions..e. (i. 3. discard tuples that fail. eliminate duplicate rows. 4.

name. .grade QuickTime™ and a TIFF (LZW) decompressor are needed to see this picture. QuickTime™ and a TIFF (LZW) decompressor are needed to see this picture.sid AND E.SELECT Cross Product S. Enrolled E WHERE S. QuickTime™ and a TIFF (LZW) decompressor are needed to see this picture.cid FROM Students S.sid=E. E.

Step 2) Discard tuples that fail predicate QuickTime™ and a TIFF (LZW) decompressor are needed to see this picture.sid AND E.sid=E.grade=‘B' . E. SELECT S.cid FROM Students S. Enrolled E WHERE S.name.

SELECT S. Enrolled E WHERE S. E.sid AND E.cid FROM Students S.name.sid=E.grade=‘B' .Step 3) Discard Unwanted Columns QuickTime™ and a TIFF (LZW) decompressor are needed to see this picture.

0 Lubber 8 55. sid 22 31 95 bid day 101 10/10/96 103 11/12/96 sname rating age Dustin 7 45.5 bid 101 102 103 104 bname Interlake Interlake Clipper Marine color blue red green red .5 Bob 3 63.Reserves sid 22 95 Now the Details Sailors We will use these instances of relations in Boats our examples.

Example Schemas (in SQL DDL) CREATE TABLE Sailors (sid INTEGER. FOREIGN KEY sid REFERENCES Sailors. bname CHAR (20). PRIMARY KEY (sid. FOREIGN KEY bid REFERENCES Boats) bid .rating INTEGER. INTEGER. day DATE. date). bid. PRIMARY KEY sid) age CREATE TABLE Boats (bid INTEGER. REAL. color CHAR(10) PRIMARY KEY bid) CREATE TABLE Reserves (sid INTEGER. sname CHAR(20).

5 22 101 10/10/ 96 95 Bob 3 63.sid (sid) sname rating age (sid) bid day AND bid=103 22 dustin 7 45. Reserves SELECT FROM WHERE Sailors.5 22 101 10/10/ 96 31 lubber 8 55.0 22 101 10/10/ 96 22 dustin 7 45.sid=Reserves.Another Join Query sname Sailors.5 95 103 11/12/ 96 .5 58 103 11/12/ 96 95 Bob 3 63.0 58 103 11/12/ 96 31 lubber 8 55.

sid AND bid=103 Can be SELECT S. if same table used multiple times in same FROM (called a “selfjoin”) SELECT sname FROM Sailors. – saves writing.sid=R.Reserves WHERE Sailors.sid AND bid=103 . – for example. Reserves R range variables WHERE as: S.sname rewritten using FROM Sailors S.sid=Reserves. makes queries easier to understand • Needed when ambiguity could arise.Some Notes on Range Variables • Can associate “range variables” with the tables in the FROM clause.

sname.age.age > 20 . Sailors y WHERE x.age FROM Sailors x.sname. y. y. x.More Notes • Here’s an example where range variables are required (self-join example): SELECT x.age • Note that target list can be replaced by “*” if you don’t want to do a projection: SELECT * FROM Sailors x WHERE x.age > y.

sname in the SELECT clause? – Would adding DISTINCT to this variant of the query make a difference? .sid FROM Sailors S. Reserves R WHERE S.sid by S.Find sailors who’ve reserved at least one boat SELECT S.sid • Would adding DISTINCT to this query make a difference? • What is the effect of replacing S.sid=R.

age.sname AS name2 FROM Sailors S1.age-5column AS age1.Expressions • Can use arithmetic expressions in SELECT clause (plus other operations we’ll discuss later) •SELECT Use ASS.sname AS name1.age AS age2 FROM Sailors S WHERE S. Sailors S2 WHERE 2*S1.sname = ‘dustin’ • Can also have expressions in WHERE clause: SELECT S1. S2. to provide names S. 2*S.rating = S2.rating .1 .

sname LIKE ‘B_%B’ `_’ stands for any one character and `%’ stands for 0 or more arbitrary characters.String operations •SQL also supports some string operatio •“LIKE” is used for string matching.age-5. 2*S.age. SELECT S. .age AS age2 FROM Sailors S WHERE S. age1=S.

color=‘green’) (note: SELECT R.Find sid’s of sailors who’ve reserved a red or a green boat • UNION: Can be used to compute the union of any two union-compatible sets of tuples (which are themselves the result of SQL queries).color=‘red’OR B.sid FROM Boats B. Vs.color=‘green’ .bid=B.sid by default.bid AND UNION ALL) B.bid=B.bid AND (B. Reserves R eliminates WHERE R.bid AND B. Reserves R Override w/ WHERE R.color=‘red’ duplicates UNION SELECT R. FROM Boats B.Reserves R WHERE R.bid=B. SELECT DISTINCT R.sid UNION FROM Boats B.

sid (B.color=‘red’ AND B.sid=R2.color=‘green’) AND R1. FROM Boats Boats B.bid AND R2.Reserves R B2.sid B1. could use a self-join: SELECT R1.bid AND (B1.Find sid’s of sailors who’ve reserved a red and a green boat • If we simply replace OR by AND in the previous query. Reserves R1. Reserves R2 WHERE R.bid=B.color=‘green’) .sid SELECT FROM BoatsR. (Why?) • Instead. we get the wrong answer.color=‘red’ AND B2.bid=B2.bid AND WHERE R1.bid=B1.

sid FROM Sailors S. Can be used to compute the intersection of any two union-compatible sets of tuples. but many systems don’t support them.sid FROM Sailors S.color=‘red’ INTERSECT SELECT S.AND Continued… • INTERSECT:discussed in book.color=‘green’ . Boats B.sid=R. Reserves R WHERE S.bid AND B.sid AND R.bid=B.sid=R.bid AND B.bid=B. Reserves R WHERE S.sid AND R. Boats B. • Also in text: EXCEPT (sometimes called MINUS) • Included in the SQL/92 standard. Key field! SELECT S.

• To understand semantics of nested queries: – think of a nested loops evaluation: For each Sailors tuple.Nested Queries • Powerful feature of SQL: WHERE clause can itself contain an SQL query! – Actually.sid IN (SELECT R. Names of sailors who’ve reserved boat # • To find sailors who’ve not reserved #103. SELECT S.sname FROM Sailors S WHERE S. use NOT IN.bid=103) . so can FROM and HAVING clauses.sid FROM Reserves R WHERE R. check the qualification by computing the subquery.

s • EXISTS is another set comparison operator. and * is replaced by R. • Subquery must be recomputed for each Sailors tuple. • Can also specify NOT EXISTS • If UNIQUE is used.Nested Queries with Correlation Find names of sailors who’ve reserved SELECT S. – UNIQUE checks for duplicate tuples in a subquery.bid. like IN. – Think of subquery as a function call that runs a query! .sname FROM Sailors S WHERE EXISTS (SELECT * FROM Reserves R WHERE R. finds sailors with at most one reservation for boat #103.sid=R.bid=103 AND S.

op ALL • Find sailors whose rating is greater than that of some sailor called Horatio: SELECT * FROM Sailors S WHERE S.rating > ANY (SELECT S2.More on Set-Comparison Operators • We’ve already seen IN. Can also use NOT IN.rating FROM Sailors S2 WHERE S2.sname=‘Ho . EXISTS and UNIQUE. NOT EXISTS and NOT UNIQUE. • Also available: op ANY.

bid AND B. EXCEPT queries re-written using NOT IN.bid AND B2.color=‘green • Similarly. • How would you change this to find names (not sid’s) of Sailors who’ve reserved both red and green boats? .sid FROM Boats B2.color=‘red’ AND R.sid IN (SELECT R2.sid FROM Boats B.Rewriting INTERSECT Queries Using IN Find sid’s of sailors who’ve reserved both a red and a green boat: SELECT R.bid=B. Reserves R WHERE R.bid=B2. Rese WHERE R2.

FROM Sailors S WHERE NOT EXISTS (SELECT B.. not using EXCEPT: SELECT S.. .bid there is no FROM Boats boat B WHERE NOT EX that doesn’t a Reserves tuple showing S reserved B have ..Division in SQL Find names of sailors who’ve reserved all b • Example in book.snameSailors S such that ..

etc. . • Typically. set operators. • SQL provides functionality close to that of the basic relational model.Summary • An advantage of the relational model is its well-defined query semantics. • Lots more functionality beyond these basic features. null values. – some differences in duplicate handling. many ways to write a query – the system is responsible for figuring a fast way to actually execute a query regardless of how it is written.Basic SQL Queries .

rating) .sname=‘Bob’ S.COUNT (*) COUNT ( [DISTINCT] A) SUM ( [DISTINCT] A) AVG ( [DISTINCT] A) MAX (A) MIN (A) Aggregate Operators • Significant extension of relational algebra. SELECT COUNT (*) FROM Sailors S single column SELECT AVG (S.rating=10 SELECT COUNT (DISTINCT FROM Sailors S WHERE S.age) FROM Sailors S WHERE S.

COUNT (*) COUNT ( [DISTINCT] A SUM ( [DISTINCT] A) AVG ( [DISTINCT] A) MAX (A) MIN (A) Aggregate Operators (continued) single colum SELECT S.sname FROM Sailors S WHERE S.rating= (SELECT MAX(S2.rating) FROM Sa .

SELECT S. S.a SELECT S. MAX FROM Sailors S (S.age FROM Sailors S WHERE S.sname.age . but not supported in some systems.age = (SELECT MA FROM Sai SELECT S.ag FROM Sa = S.Find name and age of the oldest sailor(s) • The first query is incorrect! • Third query equivalent to second query – allowed in SQL/92 standard. S.sname.age FROM Sailors S WHERE (SELECT MAX (S2.sname.

FROM . and what the rating values for these levels are! – Suppose we know that rating values go from 1 to 10.. we don’t know how many rating levels exist. . 10: Sailors S WHERE S.GROUP BY and HAVING • So far.rating = i .. we’ve applied aggregate operators to all (qualifying) tuples. we want to apply them to each of several groups of tuples. – In general. • Consider: Find the age of the youngest sailor for each rating level. 2.age) For i = 1. – Sometimes. we can write 10 queries that look like this (!): SELECT MIN (S.

MIN (S.Queries With GROUP BY • To generate values for a column based on groups of rows. use aggregate functions in SELECT statements with the GROUP BY clause SELECT [DISTINCT] target-list FROM relation-list [WHERE qualification] GROUP BY grouping-list The target-list contains (i) list of column names & (ii) terms with aggregate operations (e..g. – column name list (i) can contain only attributes from the grouping-list. .age)).

rating.rating AVG (S. FROM Sailors S GROUP BY S.age) For each rating find the age of the youn sailor with age  18 SELECT S.rating. MIN FROM Sailors S WHERE S.age >= 18 GROUP BY S.Group By Examples For each rating.rating (S. find the average age of SELECT S.age) .

`unnecessary’ fields are deleted. and the remaining tuples are partitioned into groups by the value of attributes in groupinglist. • One answer tuple is generated per qualifying group. .Conceptual Evaluation • The cross-product of relation-list is computed. tuples that fail qualification are discarded.

0 .0 3. rows.0 35.age >= 18 GROUP BY S. Delete unneeded columns.0 35. form groups age 33.5 16.rating. Perform Aggregation rating 1 7 8 10 1.0 55.0 33.5 35.0 55.0 35.0 55. (S.rating sname rating age dustin lubber zorba horatio brutus rusty 7 8 10 7 1 10 45.0 MIN Answer Table rating 1 7 7 8 10 age 33.0 35. Form cross product 2.0 45.0 35.age) FROM Sailors S WHERE S.SELECT sid 22 31 71 64 29 58 S.

.bid=B.bid • Grouping over a join of two relations.color=‘red’ GROUP BY B.bid AND B. COUNT(*)AS scount FROM Boats B. Reserves R WHERE R. SELECT B.bid.Find the number of reservations for each red boat.

color=‘red’ GROUP BY B.bid 101 102 103 104 101 102 103 104 b. Reserves R WHERE R.bid AND B.bid b.bid.bid 101 101 101 101 102 102 102 102 b.bid r. COUNT (*) AS scount FROM Boats B.bid 102 red 102 2 b.color r.bid=B.color blue red green red blue red green red 1 b.bid scount 102 1 answer .SELECT B.

Queries With GROUP BY and HAVING SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list HAVING groupqualification • Use the HAVING clause with the GROUP BY clause to restrict which group-rows are returned in the result set .

– Expressions in group-qualification must have a single value per group! – That is. • The group-qualification is then applied to eliminate some groups. (SQL does not exploit primary key semantics here!) • One answer tuple is generated per qualifying group. attributes in group-qualification must be arguments of an aggregate op or must also appear in the grouping-list. .Conceptual Evaluation • Form groups as before.

0 2 TIFF (LZW) decompressor are needed to see this picture. 8 55.0 10 35.0 rating age 1 33.0 1 7 35.5 8 55.rating.5 WHERE S.rating 64 horatio 7 35.age >= 18 71 zorba 10 16. MIN (S.0 1 Answer relati .0 rating m­age count 7 45.age) 22 dustin 7 45.0 58 rusty 10 35.0 HAVING COUNT (*) > 1 29 brutus 1 33.0 GROUP BY S. for each rating with at least 2 such sailors sid sname rating age SELECT S.0 FROM Sailors S 31 lubber 8 55.0 QuickTime™ and a 7 35.Find the age of the youngest sailor with age  18.0 1 33.0 1 10 35.

not using EXCEPT: SELECT S...sname FROM Sailors S Sailors S such that ... WHERE a Reserves tuple showing S reserved B NOT EX .bid there is no boat B FROM Boats without . WHERE NOT EXISTS (SELECT B.Find names of sailors who’ve reserved all • Example in book.

bi ( Select Note: must have both sid and name in the G clause.sid = R. reserves R WHERE S.name FROM Sailors S.sid GROUP BY S.Find names of sailors who’ve reserved • Can you do this using Group By and Having? SELECT S. S.sid HAVING COUNT(DISTINCT R. Why? .name.

sid GROUP BY S.name Dustin Bob SELECT S.name Apply having clause to groups s.sid .sid FROM Sailors S.bid Select COUNT (*) FR 22 22 101 Boats 31 22 101 QuickTime™ and a TIFF (LZW) decompressor 95 22 101 are needed to see this picture. S.sid = r.name Dustin Lubber Bob Dustin Lubber Bob s.bid) = s.s. reserves R WHERE S.sid HAVING COUNT(DISTINCT R.sid Count (*) from boats = 4 bcount 22 95 1 1 s.name. 22 95 102 31 95 102 95 95 102 s. S.sid r.name.sid r.

sid = 22. (must be type compatible) . ‘Clipper’. ‘purple’) INSERT INTO Boats (bid.INSERT INSERT [INTO] table_name [(column_list)] VALUES ( value_list) INSERT [INTO] table_name [(column_list)] INSERT INTO Boats VALUES ( 105.bid FROM Reserves R WHERE r. ‘yellow’) <select statement> You can also do a “bulk insert” of values from one table into another: INSERT INTO TEMP(bid) SELECT r. color) VALUES (99.

. bid = (SELECT r.sid = 22) Can also modify tuples using UPDATE statement.DELETE & UPDATE DELETE [FROM] table_name [WHERE qualification] DELETE FROM Boats WHERE color = ‘red’ DELETE FROM Boats b WHERE b. UPDATE Boats SET Color = “green” WHERE bid = 103.bid FROM Reserves R WHERE r.

) – New operators (in particular.: – Special operators needed to check if value is/is not null. false and unknown). OR and NOT connectives? – We need a 3-valued logic (true. .g. E. – Is rating>8 true or false when rating is equal to null? What about AND.g..g. outer joins) possible/needed. – SQL provides a special value null for such situations. WHERE clause eliminates rows that don’t evaluate to true.g.. a rating has not been assigned) or inapplicable (e. • The presence of null complicates many issues.Null Values • Field values in a tuple are sometimes unknown (e. (e. – Meaning of constructs must be defined carefully.. no spouse’s name).

Joins SELECT (column_list) FROM table_name [INNER | {LEFT |RIGHT | FULL } OUTER] JOIN table_name ON qualification_list WHERE … Explicit join semantics needed unless it is an INNER join (INNER is default) .

sid = r. s.bid FROM Sailors s INNER JOIN Reserves r ON s.sid Returns only those sailors who have reserved boats SQL-92 also allows: SELECT s.bid FROM Sailors s NATURAL JOIN Reserves r “NATURAL” means equi-join for each pair of attributes with the same name (may need to rename with “AS”) .name.Inner Join Only the rows that match the search conditions are returned.sid. r. s. SELECT s.sid.name. r.

sid = r. s.sid sid sname rating age 22 31 95 Dustin Lubber Bob s. r.0 55.sid 7 8 3 45.SELECT s.5 sid bid day 22 101 10/10/96 95 103 11/12/96 s.name r.sid.bid 22 Dustin 101 95 Bob 103 .name.5 63.bid FROM Sailors s INNER JOIN Reserves r ON s.

name.Left Outer Join Left Outer Join returns all matched rows. plus all unmatched rows from the table on the left of the join clause (use nulls in fields of non-matching tuples) SELECT s.sid = r. s.sid Returns all sailors & information on whether they have reserved boats .sid. r.bid FROM Sailors s LEFT OUTER JOIN Reserves r ON s.

sid. r.name.sid = r. s.bid 22 Dustin 101 95 Bob 103 31 Lubber .sid sid sname rating age 22 31 95 Dustin Lubber Bob s.0 55.bid FROM Sailors s LEFT OUTER JOIN Reserves r ON s.5 63.sid 7 8 3 45.name r.SELECT s.5 sid bid day 22 101 10/10/96 95 103 11/12/96 s.

. b.bid = b.sid.name FROM Reserves r RIGHT OUTER JOIN Boats b ON r.bid. plus all unmatched rows from the table on the right of the join clause SELECT r.Right Outer Join Right Outer Join returns all matched rows. b.bid Returns all boats & information on which ones are reserved.

bid sid bid day 22 101 10/10/96 95 103 11/12/96 r. b.bid = b.sid.name FROM Reserves r RIGHT OUTER JOIN Boats b ON r.bid 22 95 color blue red green red . b.SELECT r.bid.name Interlake Interlake Clipper Marine b.sid bid 101 102 103 104 bname Interlake Interlake Clipper Marine 101 102 103 104 b.

bid = b. b.name FROM Reserves r FULL OUTER JOIN Boats b ON r. b.Full Outer Join Full Outer Join returns all (matched or unmatched) rows from the tables on both sides of the join clause SELECT r.bid.bid Returns all boats & all information on reservations .sid.

so all reser have a corresponding tuple in boats.bid bid bname color sid bid day 22 101 10/10/96 95 103 11/12/96 r.bid. b.bid 22 95 blue red green red Note: in this case it is the same as the ROJ b bid is a foreign key in reserves.name Interlake Interlake Clipper Marine b.bid = b. .sid. b.sid 101 102 103 104 Interlake Interlake Clipper Marine 101 102 103 104 b.SELECT r.name FROM Reserves r FULL OUTER JOIN Boats b ON r.

COUNT (*) FROM Boats B.bid.Views CREATE VIEW view_name AS select_statement Makes development simpler Often used for security Not instantiated .color=‘red’ GROUP BY B.bid AND B.makes updates tricky CREATE VIEW Reds AS SELECT B.bid=B. Reserves WHERE R.bid AS scount R .

bid b. Reserves WHERE R. COUNT (*) FROM Boats B.bid=B.bid.color=‘red’ .bid AND GROUP BY B.CREATE VIEW Reds AS SELECT B.bid scount 102 1 Reds AS scount R B.