
SQL: Queries, Constraints, Triggers, Null

February 18, 2014

Administrivia
- Announcements
  - none
- Reading assignment
  - Chapter 5
- Today
  - SQL: queries, constraints, triggers, null
- Acknowledgement
  - Some slide content courtesy of Ramakrishnan and Gehrke and Ullman

Example schema
Sailors (sid: integer, sname: string, rating: integer, age: real)
Boats (bid: integer, bname: string, color: string)
Reserves (sid: integer, bid: integer, day: date)

Example instances
- We will use these instances of the Sailors and Reserves relations in our examples
- If the key for the Reserves relation contained only the attributes sid and bid, how would the semantics differ?

R1
sid  bid  day
22   101  2/10/14
58   103  2/12/14

S1
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0

Basic SQL query


SELECT [DISTINCT] target-list
FROM relation-list
WHERE qualification

- relation-list: a list of relation names (possibly with a range variable after each name)
- target-list: a list of attributes of relations in relation-list
- qualification: comparisons using relational operators (<, >, =, ≤, ≥, ≠) combined with logical operators (AND, OR, NOT)
- DISTINCT is an optional keyword indicating that the answer should not contain duplicates. Default includes duplicates!
- Reminder:
  - Relational algebra: sets
  - SQL: bags (multisets)

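Relational algebra <==> SQL

SELECT L FROM R WHERE C  =  $\pi_L(\sigma_C(R))$
- $\pi_L$ is projection, $\sigma_C$ is selection
- Unfortunate choice of words: the SQL SELECT clause corresponds to projection, not to selection!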

Conceptual evaluation strategy


- Semantics of an SQL query defined in terms of the following conceptual evaluation strategy:
  1. Compute the cross-product of relations in relation-list
  2. Discard resulting tuples if they fail qualification
  3. Delete attributes that are not in target-list
  4. If DISTINCT is specified, eliminate duplicate rows
- This strategy is probably the least efficient way to compute a query!
  - An optimizer will find more efficient strategies to compute the same answers
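
The four steps can be sketched directly in Python. This is only an illustration of the conceptual strategy, not how a DBMS evaluates queries; the tuples are modeled on the example instances and the function name `evaluate` is made up for this sketch.

```python
from itertools import product

# Sample instances modeled on the Sailors/Reserves examples (illustrative data).
sailors = [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)]
reserves = [(22, 101, "2/10/14"), (58, 103, "2/12/14")]

def evaluate(distinct=False):
    # 1. Cross-product of the relations in relation-list.
    rows = product(sailors, reserves)
    # 2. Discard tuples that fail the qualification S.sid = R.sid AND R.bid = 103.
    rows = [(s, r) for s, r in rows if s[0] == r[0] and r[1] == 103]
    # 3. Keep only the target-list attribute S.sname.
    result = [s[1] for s, r in rows]
    # 4. Eliminate duplicates only if DISTINCT was requested (order-preserving).
    if distinct:
        result = list(dict.fromkeys(result))
    return result

answer = evaluate()
print(answer)
```

The sketch hard-codes one query (names of sailors who reserved boat 103) to keep each step visible.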

Example of conceptual evaluation


SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND R.bid=103

(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  22     101  10/10/12
22     dustin  7       45.0  58     103  11/12/12
31     lubber  8       55.5  22     101  10/10/12
31     lubber  8       55.5  58     103  11/12/12
58     rusty   10      35.0  22     101  10/10/12
58     rusty   10      35.0  58     103  11/12/12

A note on range variables


- Really needed only if the same relation appears twice or more in the FROM clause. The previous query can also be written as:

SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND bid=103

OR

SELECT sname
FROM Sailors, Reserves
WHERE Sailors.sid=Reserves.sid AND bid=103

- It is good style to use range variables!

Find sailors who've reserved at least one boat.

SELECT S.sid
FROM Sailors S, Reserves R
WHERE S.sid=R.sid

- Would adding DISTINCT to this query make a difference?
- Suppose we replace S.sid by S.sname in the SELECT clause.
  - Would adding DISTINCT make a difference?
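
One way to explore both questions is to run the query in SQLite from Python. The rows below are hypothetical, chosen so that one sailor holds two reservations and two different sailors share a name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    -- Hypothetical rows: sailor 22 holds two reservations; 64 and 74 share a name.
    INSERT INTO Sailors VALUES (22, 'dustin', 7, 45.0), (31, 'lubber', 8, 55.5),
                               (64, 'horatio', 7, 35.0), (74, 'horatio', 9, 35.0);
    INSERT INTO Reserves VALUES (22, 101, '2014-02-10'), (22, 103, '2014-02-12'),
                                (64, 101, '2014-02-13'), (74, 103, '2014-02-14');
""")
join = "FROM Sailors S, Reserves R WHERE S.sid = R.sid"
# Bag semantics: sid 22 appears once per reservation.
sids = [r[0] for r in conn.execute(f"SELECT S.sid {join} ORDER BY S.sid")]
# DISTINCT removes the repeated sid.
distinct_sids = [r[0] for r in conn.execute(f"SELECT DISTINCT S.sid {join} ORDER BY S.sid")]
# DISTINCT on names also merges the two different sailors called horatio.
distinct_names = [r[0] for r in conn.execute(f"SELECT DISTINCT S.sname {join} ORDER BY S.sname")]
print(sids, distinct_sids, distinct_names)
```

With names in the SELECT clause, DISTINCT can hide the fact that two distinct sailors qualify, because sname is not a key.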

Expressions and strings


SELECT S.age, age1=S.age-5, 2*S.age AS age2
FROM Sailors S
WHERE S.sname LIKE 'B_%B'

- Illustrates use of arithmetic expressions and string pattern matching: Find triples (of ages of sailors and two fields defined by expressions) for sailors whose names begin and end with B and contain at least three characters.
- AS and = are two ways to name fields in the result
- LIKE is used for string matching. _ stands for any one character and % stands for 0 or more arbitrary characters.
  - cf. NOT LIKE
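
A small SQLite run of the pattern, as a sketch. Assumptions: the names are made-up uppercase strings (SQLite's LIKE is case-insensitive for ASCII by default, so uppercase data sidesteps that dialect difference), sname is added to the target list for readability, and only the AS form of naming is used because SQLite does not accept the `age1=` form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
# Hypothetical sailors: only BOB and BARB begin and end with B and have >= 3 characters.
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (1, "BOB", 3, 63.5), (2, "BB", 5, 30.0), (3, "BARB", 8, 25.5), (4, "RUSTY", 10, 35.0)])
rows = conn.execute("""
    SELECT S.sname, S.age, S.age - 5 AS age1, 2 * S.age AS age2
    FROM Sailors S
    WHERE S.sname LIKE 'B_%B'
    ORDER BY S.sname
""").fetchall()
print(rows)
```

BB is rejected because the pattern needs at least one character between the two Bs.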

Find sids of sailors who've reserved a red or a green boat.

SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid
  AND (B.color='red' OR B.color='green')

Another solution:

SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
UNION
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='green'

- UNION: can be used to compute the union of any two union-compatible sets of tuples (which are themselves the result of SQL queries)
- If we replace OR by AND in the first version, what do we get?
- Also available: EXCEPT (What do we get if we replace UNION by EXCEPT?)

Find sids of sailors who've reserved a red and a green boat.

SELECT S.sid
FROM Sailors S, Boats B1, Reserves R1,
     Boats B2, Reserves R2
WHERE S.sid=R1.sid AND R1.bid=B1.bid
  AND S.sid=R2.sid AND R2.bid=B2.bid
  AND (B1.color='red' AND B2.color='green')

Better solution (note that S.sid is a key field!):

SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
INTERSECT
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='green'

- INTERSECT: can be used to compute the intersection of any two union-compatible sets of tuples
- Included in the SQL/92 standard, but some systems don't support it
- Contrast symmetry of the UNION and INTERSECT queries with how much the other versions differ
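
The three set operators can be compared side by side in SQLite. This sketch uses hypothetical rows and, as a simplification, drops Sailors from the joins because sid is already available in Reserves; the helper names `sids_for` and `run` are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    -- Hypothetical rows: 22 reserved red and green, 31 only red, 58 only green.
    INSERT INTO Boats VALUES (101, 'Interlake', 'red'), (103, 'Clipper', 'green');
    INSERT INTO Reserves VALUES (22, 101, 'd1'), (22, 103, 'd2'), (31, 101, 'd3'), (58, 103, 'd4');
""")

def sids_for(color):
    # sids of sailors who reserved a boat of the given color
    return f"SELECT R.sid FROM Boats B, Reserves R WHERE R.bid = B.bid AND B.color = '{color}'"

def run(op):
    # Combine the red and green subqueries with the given set operator.
    sql = f"{sids_for('red')} {op} {sids_for('green')} ORDER BY 1"
    return [row[0] for row in conn.execute(sql)]

red_or_green = run("UNION")
red_and_green = run("INTERSECT")
red_not_green = run("EXCEPT")
print(red_or_green, red_and_green, red_not_green)
```

Only the operator changes between the three queries, which is the symmetry the slide points out.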

Set semantics by default for set ops


- Motivation: efficiency
- When doing projection, it is easier to avoid eliminating duplicates
  - Just work a-tuple-at-a-time
- For intersection or difference, it is most efficient to sort the relations first
  - At that point you may as well eliminate the duplicates

Nested queries
Find names of sailors who've reserved boat #103.

SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid
                FROM Reserves R
                WHERE R.bid=103)

- A very powerful feature of SQL: a WHERE clause can itself contain an SQL query! (Actually, so can FROM and HAVING clauses)
- To find sailors who've not reserved #103, use NOT IN
- To understand semantics of nested queries, think of a nested loops evaluation: For each Sailors tuple, check the qualification by (re)computing the subquery

The IN operator
- <tuple> IN (<subquery>) is true iff the tuple is a member of the relation produced by the subquery
  - Opposite: <tuple> NOT IN (<subquery>)
- IN-expressions can appear in WHERE clauses

What do these do?


SELECT a
FROM R, S
WHERE R.b = S.b;

SELECT a
FROM R
WHERE b IN (SELECT b FROM S);

This query pairs tuples from R, S


SELECT a
FROM R, S
WHERE R.b = S.b;

R
a  b
1  2
3  4

S
b  c
2  5
2  6

- Double loop, over the tuples of R and S
- (1,2) with (2,5) and (1,2) with (2,6) both satisfy the condition; 1 is output twice

IN is a predicate about R's tuples

SELECT a
FROM R
WHERE b IN (SELECT b FROM S);

R
a  b
1  2
3  4

S
b  c
2  5
2  6

- The subquery returns two 2's
- One loop, over the tuples of R
- (1,2) satisfies the condition; 1 is output once
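
Both queries can be run against the tiny R and S instances from the slides. A minimal SQLite sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (a INTEGER, b INTEGER);
    CREATE TABLE S (b INTEGER, c INTEGER);
    INSERT INTO R VALUES (1, 2), (3, 4);
    INSERT INTO S VALUES (2, 5), (2, 6);
""")
# The join pairs (1,2) with both S tuples, so a = 1 is produced twice.
join_result = [row[0] for row in conn.execute("SELECT a FROM R, S WHERE R.b = S.b")]
# IN tests each R tuple once, so a = 1 is produced once.
in_result = [row[0] for row in conn.execute("SELECT a FROM R WHERE b IN (SELECT b FROM S)")]
print(join_result, in_result)
```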

Nested queries with correlation


Find names of sailors who've reserved boat #103.

SELECT S.sname
FROM Sailors S
WHERE EXISTS (SELECT *
              FROM Reserves R
              WHERE R.bid=103 AND S.sid=R.sid)

- The condition S.sid=R.sid in the subquery is the correlation
- EXISTS is another set comparison operator, meaning "is it nonempty?"
- Correlation requires that the subquery be re-computed for each Sailors tuple
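
A short SQLite sketch of the correlated query, using the S1 and R1 instances from the slides. Running the same subquery under NOT EXISTS is an addition here to show the complementary answer; the `template` string is just a convenience for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    INSERT INTO Sailors VALUES (22, 'dustin', 7, 45.0), (31, 'lubber', 8, 55.5), (58, 'rusty', 10, 35.0);
    INSERT INTO Reserves VALUES (22, 101, '2/10/14'), (58, 103, '2/12/14');
""")
# The subquery mentions S.sid, so it is conceptually re-evaluated per Sailors tuple.
template = """
    SELECT S.sname FROM Sailors S
    WHERE {test} (SELECT * FROM Reserves R WHERE R.bid = 103 AND S.sid = R.sid)
    ORDER BY S.sname
"""
reserved_103 = [r[0] for r in conn.execute(template.format(test="EXISTS"))]
not_reserved_103 = [r[0] for r in conn.execute(template.format(test="NOT EXISTS"))]
print(reserved_103, not_reserved_103)
```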

Nested queries with correlation (cont.)


Find sailors with at most one reservation for boat #103.

SELECT S.sname
FROM Sailors S
WHERE UNIQUE (SELECT R.bid
              FROM Reserves R
              WHERE R.bid=103 AND S.sid=R.sid)

- UNIQUE checks for duplicate tuples
- What would be the difference of using * vs. R.bid in the subquery?

More on set-comparison operators


- We've already seen IN, EXISTS and UNIQUE
  - Can also use NOT IN, NOT EXISTS and NOT UNIQUE
- Also available: op ANY, op ALL, where op is one of {>, <, =, ≥, ≤, <>}
  - ANY = SOME (the two keywords are synonyms)
- Find sailors whose rating is greater than that of some sailor called Rusty:

SELECT *
FROM Sailors S
WHERE S.rating > ANY (SELECT S2.rating
                      FROM Sailors S2
                      WHERE S2.sname='Rusty')

Aggregate operators


Aggregate ops
- Significant extension of relational algebra

COUNT (*)
COUNT ( [DISTINCT] A)
SUM ( [DISTINCT] A)
AVG ( [DISTINCT] A)
MAX (A)
MIN (A)
(A is a single column)

SELECT COUNT (*)
FROM Sailors S

SELECT AVG (S.age)
FROM Sailors S
WHERE S.rating=10

SELECT COUNT (DISTINCT S.rating)
FROM Sailors S
WHERE S.sname='Bob'

SELECT S.sname
FROM Sailors S
WHERE S.rating = (SELECT MAX(S2.rating)
                  FROM Sailors S2)

SELECT SUM (DISTINCT S.age)
FROM Sailors S
WHERE S.rating=10

Find name and age of the oldest sailor(s).


SELECT S.sname, MAX (S.age)
FROM Sailors S

- The first query is illegal!
  - If the SELECT clause uses an aggregate operator, it must use only aggregate operators unless the query contains a GROUP BY clause!
- We can use a nested query to compute the desired answer though.

SELECT S.sname, S.age
FROM Sailors S
WHERE S.age = (SELECT MAX (S2.age)
               FROM Sailors S2)

- The = test compares a value with a relation?! This is allowed because the subquery returns a single value
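
The nested query can be tried in SQLite as follows. The sailor with sid 99 is invented for this sketch so that two sailors tie for the maximum age.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0), (99, "popeye", 9, 55.5)])  # sid 99 is made up to force a tie
# The scalar subquery yields the single maximum age; the outer query returns every sailor with it.
oldest = conn.execute("""
    SELECT S.sname, S.age
    FROM Sailors S
    WHERE S.age = (SELECT MAX(S2.age) FROM Sailors S2)
    ORDER BY S.sname
""").fetchall()
print(oldest)
```

Note that some systems, SQLite among them, accept the illegal first form as a nonstandard extension, so it should not be relied on for portable SQL.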

Motivation for grouping


- So far, we've applied aggregate operators to all (qualifying) tuples
  - We may want to apply them to each of several groups of tuples
- Consider: Find the age of the youngest sailor for each rating level.
  - In general, we don't know in advance how many rating levels exist, and what the rating values for these levels are!
  - Suppose we know that rating values go from 1 to 10; we can write 10 queries that look like this:

For i = 1, 2, ... , 10:

SELECT MIN (S.age)
FROM Sailors S
WHERE S.rating = i

- Solution: extension to the basic SQL query form: grouping
  - Grouping cannot be expressed in relational algebra

Queries with GROUP BY and HAVING


SELECT [DISTINCT] attribute-list
FROM relation-list
WHERE qualification
GROUP BY grouping-list
HAVING group-qualification

- The attribute-list contains (i) attribute names and (ii) terms with aggregate operators (e.g., MIN (S.age))
  - The attribute names in (i) must be a subset of grouping-list
  - Intuitively, each answer tuple corresponds to a group, and these attributes must have a single value per group. (A group is a set of tuples that have the same value for all attributes in grouping-list)
  - If a column appears in list (i), but not in grouping-list, there can be multiple rows within a group that have different values in this column, and it is not clear what value should be assigned to this column in an answer row.

Conceptual evaluation strategy


- The cross-product of relation-list is computed; tuples that fail qualification are discarded, unnecessary fields are deleted; and the remaining tuples are partitioned into groups by the value of attributes in grouping-list
- Necessary fields are the ones appearing in the SELECT clause, GROUP BY clause, and HAVING clause
- The group-qualification is then applied to eliminate some groups. Expressions in group-qualification must have a single value per group!
- One answer tuple is generated per qualifying group

Find age of the youngest sailor with age ≥ 18, for each rating with at least 2 such sailors.

SELECT S.rating, MIN (S.age) AS minage
FROM Sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT (*) > 1

Sailors instance:
sid  sname    rating  age
22   dustin   7       45.0
29   brutus   1       33.0
31   lubber   8       55.5
32   andy     8       25.5
58   rusty    10      35.0
64   horatio  7       35.0
71   zorba    10      16.0
74   horatio  9       35.0
85   art      3       25.5
95   bob      3       63.5
96   frodo    3       25.5

Answer relation:
rating  minage
3       25.5
7       35.0
8       25.5

Find age of the youngest sailor with age ≥ 18, for each rating with at least 2 such sailors.

Necessary fields only:
rating  age
7       45.0
1       33.0
8       55.5
8       25.5
10      35.0
7       35.0
10      16.0
9       35.0
3       25.5
3       63.5
3       25.5

After the WHERE clause, partitioned by rating:
rating  age
1       33.0
3       25.5
3       63.5
3       25.5
7       45.0
7       35.0
8       55.5
8       25.5
9       35.0
10      35.0

Answer:
rating  minage
3       25.5
7       35.0
8       25.5

Find age of the youngest sailor with age ≥ 18, for each rating with at least 2 such sailors and with every sailor under 60.

HAVING COUNT (*) > 1 AND EVERY (S.age <=60)

Necessary fields only:
rating  age
7       45.0
1       33.0
8       55.5
8       25.5
10      35.0
7       35.0
10      16.0
9       35.0
3       25.5
3       63.5
3       25.5

After the WHERE clause, partitioned by rating:
rating  age
1       33.0
3       25.5
3       63.5
3       25.5
7       45.0
7       35.0
8       55.5
8       25.5
9       35.0
10      35.0

Answer:
rating  minage
7       35.0
8       25.5

- What is the result of changing EVERY to ANY?

Find age of the youngest sailor with age ≥ 18, for each rating with at least 2 sailors between 18 and 60.

SELECT S.rating, MIN (S.age) AS minage
FROM Sailors S
WHERE S.age >= 18 AND S.age <= 60
GROUP BY S.rating
HAVING COUNT (*) > 1

Sailors instance:
sid  sname    rating  age
22   dustin   7       45.0
29   brutus   1       33.0
31   lubber   8       55.5
32   andy     8       25.5
58   rusty    10      35.0
64   horatio  7       35.0
71   zorba    10      16.0
74   horatio  9       35.0
85   art      3       25.5
95   bob      3       63.5
96   frodo    3       25.5

Answer relation:
rating  minage
3       25.5
7       35.0
8       25.5
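
The three grouping queries can be checked against the Sailors instance with SQLite. One substitution is needed: SQLite has no EVERY aggregate, so this sketch expresses "every sailor in the group is at most 60" as `MAX(S.age) <= 60`, which is an assumed equivalent for this data rather than the slide's syntax. The helper `grouped` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (22, "dustin", 7, 45.0), (29, "brutus", 1, 33.0), (31, "lubber", 8, 55.5),
    (32, "andy", 8, 25.5), (58, "rusty", 10, 35.0), (64, "horatio", 7, 35.0),
    (71, "zorba", 10, 16.0), (74, "horatio", 9, 35.0), (85, "art", 3, 25.5),
    (95, "bob", 3, 63.5), (96, "frodo", 3, 25.5)])

def grouped(where, having):
    # Youngest qualifying sailor per rating, for the given WHERE and HAVING conditions.
    return conn.execute(f"""
        SELECT S.rating, MIN(S.age) AS minage
        FROM Sailors S
        WHERE {where}
        GROUP BY S.rating
        HAVING {having}
        ORDER BY S.rating
    """).fetchall()

at_least_two = grouped("S.age >= 18", "COUNT(*) > 1")
# Stand-in for EVERY (S.age <= 60): the oldest member of the group is at most 60.
every_under_60 = grouped("S.age >= 18", "COUNT(*) > 1 AND MAX(S.age) <= 60")
between_18_and_60 = grouped("S.age >= 18 AND S.age <= 60", "COUNT(*) > 1")
print(at_least_two, every_under_60, between_18_and_60)
```

Moving the age limit from HAVING to WHERE changes the answer: rating 3 is dropped as a whole group in the second query, but survives in the third because bob is filtered out before grouping.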

Null values


Another example schema


Beers(name, manf)
Bars(name, addr, license)
Drinkers(name, addr, phone)
Likes(drinker, beer)
Sells(bar, beer, price)
Frequents(drinker, bar)

NULL values
- Tuples in SQL relations can have NULL as a value for one or more components
- Meaning depends on context. Two common cases:
  - Missing value: e.g., we know Joe's Bar has some address, but we don't know what it is (at least not yet)
  - Inapplicable: e.g., the value of attribute spouse for an unmarried person

Rules on comparing NULLs to values


- The logic of conditions in SQL is really 3-valued logic: TRUE, FALSE, or UNKNOWN
- Comparing any value (including NULL itself) with NULL yields UNKNOWN
- A tuple is in a query answer iff the WHERE clause is TRUE (not FALSE or UNKNOWN)

Three-valued logic
- To understand how AND, OR, and NOT work in 3-valued logic, think of
  TRUE = 1, FALSE = 0, UNKNOWN = 1/2, AND = MIN, OR = MAX, and NOT(x) = 1-x
- Example: TRUE AND (FALSE OR NOT(UNKNOWN)) = MIN(1, MAX(0, (1 - 1/2))) = MIN(1, MAX(0, 1/2)) = MIN(1, 1/2) = 1/2 = UNKNOWN
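
The numeric encoding translates almost literally into Python. In this sketch the function names `and3`, `or3` and `not3` are made up, and exact fractions are used so that 1/2 compares cleanly.

```python
from fractions import Fraction

# Encoding from the slide: TRUE = 1, FALSE = 0, UNKNOWN = 1/2.
TRUE, FALSE, UNKNOWN = Fraction(1), Fraction(0), Fraction(1, 2)

def and3(x, y):
    return min(x, y)   # AND = MIN

def or3(x, y):
    return max(x, y)   # OR = MAX

def not3(x):
    return 1 - x       # NOT(x) = 1 - x

# TRUE AND (FALSE OR NOT(UNKNOWN))
example = and3(TRUE, or3(FALSE, not3(UNKNOWN)))
# p OR NOT p with p = UNKNOWN
excluded_middle = or3(UNKNOWN, not3(UNKNOWN))
print(example == UNKNOWN, excluded_middle == UNKNOWN)
```

The second expression shows that p OR NOT p does not evaluate to TRUE when p is UNKNOWN.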

Surprising example
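- From the following Sells relation:

bar        beer  price
Joe's Bar  Bud   NULL

SELECT bar
FROM Sells
WHERE price < 2.00 OR price >= 2.00;

- Joe's Bar is not in the answer: both comparisons with NULL yield UNKNOWN, so the WHERE clause is UNKNOWN rather than TRUE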
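
The behavior is easy to reproduce in SQLite. The second query, using the standard IS NULL test, is an addition to the slide's example showing how such a row can still be found.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.execute("INSERT INTO Sells VALUES ('Joe''s Bar', 'Bud', NULL)")
# Both comparisons are UNKNOWN for a NULL price, so the row is not selected.
either_side = conn.execute(
    "SELECT bar FROM Sells WHERE price < 2.00 OR price >= 2.00").fetchall()
# IS NULL is a two-valued test and does select the row.
is_null = conn.execute("SELECT bar FROM Sells WHERE price IS NULL").fetchall()
print(either_side, is_null)
```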

Reason: 2-valued laws != 3-valued laws


- Some common laws, like commutativity of AND, hold in 3-valued logic
- But not others, e.g., the law of the excluded middle: p OR NOT p = TRUE does not hold
  - When p = UNKNOWN, (the left side) = MAX(1/2, (1 - 1/2)) = 1/2 != 1 = TRUE

Truth table for 3-valued logic


x  y  x AND y  x OR y  NOT x
T  T  T        T       F
T  U  U        T       F
T  F  F        T       F
U  T  U        T       U
U  U  U        U       U
U  F  F        U       U
F  T  F        T       T
F  U  F        U       T
F  F  F        F       T

NULL is not part of the relational model; it's added as an extension in SQL

Constraints and triggers


Constraints and triggers


- A constraint is a relationship among data elements that the DBMS is required to enforce
  - Example: key constraints
- A trigger is only executed when a specified condition occurs, e.g., insertion of a tuple
  - Easier to implement than complex constraints

Kinds of constraints
1. Keys
2. Foreign-keys (or referential-integrity)
3. Value-based constraints
   - Constrain values of a particular attribute
4. Tuple-based constraints
   - Relationship among components
5. Assertions: any SQL boolean expression

- (We have already discussed keys and foreign keys)

3. Attribute-based checks
- Constraints on the value of a particular attribute
- Add CHECK(<condition>) to the declaration for the attribute
- The condition may use the name of the attribute, but any other relation or attribute name must be in a subquery

Example: Attribute-based check


CREATE TABLE Sells (
    bar   CHAR(20),
    beer  CHAR(20) CHECK (beer IN (SELECT name FROM Beers)),
    price REAL CHECK (price <= 5.00)
);

Timing of checks
- Attribute-based checks are performed only when a value for that attribute is inserted or updated
  - Example: CHECK (price <= 5.00) checks every new price and rejects the modification (for that tuple) if the price is more than $5
  - Example: CHECK (beer IN (SELECT name FROM Beers)) is not checked if a beer is deleted from Beers (unlike foreign-keys)

4. Tuple-based checks
- CHECK (<condition>) may be added as a relation-schema element
- The condition may refer to any attribute of the relation
  - But other attributes or relations require a subquery
- Checked on insert or update only

Example: Tuple-based check


- Only Joe's Bar can sell beer for more than $5:

CREATE TABLE Sells (
    bar   CHAR(20),
    beer  CHAR(20),
    price REAL,
    CHECK (bar = 'Joe''s Bar' OR price <= 5.00)
);

- p → q is equivalent to ¬p ∨ q (here: price > 5.00 → bar = 'Joe''s Bar')
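
SQLite enforces tuple-based checks of this subquery-free kind, so the constraint can be exercised directly. In this sketch 'Sue's Bar' and the prices are made-up test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Sells (
        bar   CHAR(20),
        beer  CHAR(20),
        price REAL,
        CHECK (bar = 'Joe''s Bar' OR price <= 5.00)
    )
""")
conn.execute("INSERT INTO Sells VALUES ('Joe''s Bar', 'Bud', 7.50)")   # allowed: it is Joe's Bar
conn.execute("INSERT INTO Sells VALUES ('Sue''s Bar', 'Bud', 4.00)")   # allowed: price <= 5.00
try:
    # Violates the check: another bar charging more than $5.
    conn.execute("INSERT INTO Sells VALUES ('Sue''s Bar', 'Miller', 6.00)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
row_count = conn.execute("SELECT COUNT(*) FROM Sells").fetchone()[0]
print(rejected, row_count)
```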

5. Assertions
- These are database-schema elements, like relations or views
- Defined by: CREATE ASSERTION <name> CHECK (<condition>);
- Condition may refer to any relation or attribute in the database schema

Example: Assertion
- In Sells(bar, beer, price), no bar may charge an average of more than $5.

CREATE ASSERTION NoRipoffBars CHECK (
    NOT EXISTS (
        -- bars with an average price above $5
        SELECT bar
        FROM Sells
        GROUP BY bar
        HAVING 5.00 < AVG(price)
    )
);

Example: Assertion
- In Drinkers(name, addr, phone) and Bars(name, addr, license), there cannot be more bars than drinkers

CREATE ASSERTION FewBar CHECK (
    (SELECT COUNT(*) FROM Bars) <=
    (SELECT COUNT(*) FROM Drinkers)
);

Timing of assertion checks


- In principle, we must check every assertion after every modification to any relation of the database
- A clever system can observe that only certain changes could cause a given assertion to be violated
  - Example: No change to Beers can affect FewBar. Neither can an insertion to Drinkers

Triggers: motivation
- Assertions are powerful, but the DBMS often can't tell when they need to be checked, which makes them costly
- Attribute- and tuple-based checks are checked at known times, but are not powerful
- Triggers let the user decide when to check for any user-decided condition
- Active database

Event-Condition-Action rules
- Another name for trigger is ECA rule, or event-condition-action rule
- Event: typically a type of database modification, e.g., insert on Sells
- Condition: Any SQL boolean-valued expression including a query
- Action: Any SQL statements

Preliminary example: A trigger


- Instead of using a foreign-key constraint and rejecting insertions into Sells(bar, beer, price) with unknown beers, a trigger can add that beer to Beers, with a NULL manufacturer

Example: Trigger definition


CREATE TRIGGER BeerTrig
    AFTER INSERT ON Sells                  -- the event
    REFERENCING NEW ROW AS NewTuple
    FOR EACH ROW
    WHEN (NewTuple.beer NOT IN
          (SELECT name FROM Beers))        -- the condition
    INSERT INTO Beers(name)
        VALUES(NewTuple.beer);             -- the action

Also see trigger.sql in the lecture notes area.
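
For experimenting, the trigger can be adapted to SQLite. This is a dialect-specific sketch, not the standard syntax shown above: SQLite has no REFERENCING clause (the inserted row is always called NEW) and requires the action to be wrapped in BEGIN ... END. The sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Beers (name TEXT, manf TEXT);
    CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL);
    INSERT INTO Beers VALUES ('Bud', 'Anheuser-Busch');

    -- SQLite dialect: no REFERENCING clause; the inserted row is always called NEW.
    CREATE TRIGGER BeerTrig
    AFTER INSERT ON Sells
    FOR EACH ROW
    WHEN NEW.beer NOT IN (SELECT name FROM Beers)
    BEGIN
        INSERT INTO Beers(name) VALUES (NEW.beer);
    END;

    INSERT INTO Sells VALUES ('Joe''s Bar', 'Bud', 2.50);     -- known beer: trigger does nothing
    INSERT INTO Sells VALUES ('Joe''s Bar', 'Miller', 3.00);  -- unknown beer: trigger fires
""")
beers = conn.execute("SELECT name, manf FROM Beers ORDER BY name").fetchall()
print(beers)
```

After the second insertion, Miller appears in Beers with a NULL manufacturer (None in Python).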

Options: CREATE TRIGGER


- CREATE TRIGGER <name>
- Or: CREATE OR REPLACE TRIGGER <name>
  - Useful if there is a trigger with that name and you want to modify the trigger

Options: The Event


- AFTER can be BEFORE
  - Also, INSTEAD OF, if the relation is a view
    - A clever way to execute view modifications: have triggers translate them to appropriate modifications on the base tables
- INSERT can be DELETE or UPDATE
  - And UPDATE can be UPDATE OF ... ON ..., which restricts the event to updates of a particular attribute (as in UPDATE OF price ON Sells)

Options: FOR EACH ROW


- Triggers are either row-level or statement-level
- FOR EACH ROW indicates row-level; its absence indicates statement-level
- Row-level triggers: execute once for each modified tuple
- Statement-level triggers: execute once for a SQL statement, regardless of how many tuples are modified

Options: REFERENCING
- INSERT statements imply a new tuple (for row-level) or new table (for statement-level)
  - The table is the set of inserted tuples
- DELETE implies an old tuple or table
- UPDATE implies both
- Refer to these by {NEW | OLD} {ROW | TABLE} AS <name>

Options: The Condition


- Any boolean-valued condition
- Evaluated on the database as it would exist before or after the triggering event, depending on whether BEFORE or AFTER is used
  - But always before the changes take effect
- Access the new/old row/table through the names in the REFERENCING clause

Options: The Action


- There can be more than one SQL statement in the action
  - Surround by BEGIN . . . END if there is more than one
- But queries make no sense in an action, so we are really limited to modifications

Another Example
- Using Sells(bar, beer, price) and a unary relation RipoffBars(bar), maintain a list of bars that raise the price of any beer by more than $1.

Example: Trigger definition


CREATE TRIGGER PriceTrig
    AFTER UPDATE OF price ON Sells         -- the event: only changes to prices
    REFERENCING                            -- updates let us talk about old and new tuples
        OLD ROW AS ooo
        NEW ROW AS nnn
    FOR EACH ROW                           -- we need to consider each price change
    WHEN (nnn.price > ooo.price + 1.00)    -- condition: a raise in price > $1
    INSERT INTO RipoffBars                 -- when the price change is great enough,
        VALUES(nnn.bar);                   -- add the bar to RipoffBars
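
A SQLite adaptation of this trigger, again as a dialect-specific sketch: the built-in names OLD and NEW take the place of ooo and nnn, and the action sits inside BEGIN ... END. The bars and prices are made-up test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL);
    CREATE TABLE RipoffBars (bar TEXT);
    INSERT INTO Sells VALUES ('Joe''s Bar', 'Bud', 2.50), ('Sue''s Bar', 'Bud', 2.50);

    -- SQLite dialect: OLD and NEW play the roles of ooo and nnn.
    CREATE TRIGGER PriceTrig
    AFTER UPDATE OF price ON Sells
    FOR EACH ROW
    WHEN NEW.price > OLD.price + 1.00
    BEGIN
        INSERT INTO RipoffBars VALUES (NEW.bar);
    END;

    UPDATE Sells SET price = price + 2.00 WHERE bar = 'Joe''s Bar';  -- raise of $2: recorded
    UPDATE Sells SET price = price + 0.50 WHERE bar = 'Sue''s Bar';  -- raise of $0.50: ignored
""")
ripoff_bars = [row[0] for row in conn.execute("SELECT bar FROM RipoffBars")]
print(ripoff_bars)
```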

Example2: Trigger definition


MovieExec(name, addr, cert#, netWorth)

CREATE TRIGGER NetWorthTrigger
AFTER UPDATE OF netWorth ON MovieExec
REFERENCING
    OLD ROW AS OldTuple,
    NEW ROW AS NewTuple
FOR EACH ROW
WHEN (OldTuple.netWorth > NewTuple.netWorth)
    UPDATE MovieExec
    SET netWorth = OldTuple.netWorth
    WHERE cert# = NewTuple.cert#;

Summary
- SQL was an important factor in the early acceptance of the relational model
  - More natural than earlier, procedural query languages
- Relationally complete; in fact, significantly more powerful than relational algebra
- Even queries that can be expressed in RA can often be expressed more naturally in SQL
- Many alternative ways to write a query; optimizer should look for most efficient evaluation plan
  - In practice, users need to be aware of how queries are optimized and evaluated for best results

Summary (cont.)
- NULL for unknown field values brings many complications
- SQL allows specification of rich integrity constraints
- Triggers respond to changes in the database
  - Active databases