Professional Documents
Culture Documents
3. Execution
The number of
attributes is the arity of
the relation
Hacettepe University Computer Engineering Department 8
The Relational Model: Data
Student
sid name gpa
Two conventions:
1. We call relational database instances as simply databases
2. We assume all instances are valid, i.e., satisfy the domain constraints
Relational
SQL Optimized
Algebra (RA) Execution
Query RA Plan
Plan
Relational
SQL Optimized
Algebra (RA) Execution
Query RA Plan
Plan
P Name,Salary (Employee)
Name Salary
John 200000
John 600000
SELECT DISTINCT
sname, Π45$67,"#$ (𝜎"#$&'.) (𝑆𝑡𝑢𝑑𝑒𝑛𝑡𝑠))
gpa
FROM Students
WHERE gpa > 3.5;
𝜎"#$&'.) (Π45$67,"#$ ( 𝑆𝑡𝑢𝑑𝑒𝑛𝑡𝑠))
How do we represent
this query in RA? Are these logically equivalent?
Hacettepe University Computer Engineering Department 25
Cross-Product (×) Students(sid,sname,gpa)
People(ssn,pname,address)
×
1234545 John 216 Rosse e
𝑆𝑡𝑢𝑑𝑒𝑛𝑡𝑠 × 𝑃𝑒𝑜𝑝𝑙𝑒
𝜌4BCDED,5$67,"F$D7GBHI" (𝑆𝑡𝑢𝑑𝑒𝑛𝑡𝑠)
Students
studId name gradePtAvg
• Notation: R1 ⋈ R2 SQL:
SELECT DISTINCT
• Joins R1 and R2 on equality of all shared ssid, S.name, gpa,
attributes ssn, address
• If R1 has attribute set A, and R2 has attribute set FROM
B, and they share attributes A⋂B = C, can also be
written: R1 ⋈ 𝐶 R2 Students S,
People P
• Our first example of a derived RA operator: WHERE S.name = P.name;
• Meaning: R1 ⋈ R2 = PA U B(sC=D(𝜌M→O (R1) ´ R2))
• Where:
• The rename 𝜌M→O renames the shared attributes in
one of the relations
• The selection sC=D checks equality of the shared
attributes RA:
• The projection PA U B eliminates the duplicate
common attributes 𝑆𝑡𝑢𝑑𝑒𝑛𝑡𝑠 ⋈ 𝑃𝑒𝑜𝑝𝑙𝑒
Hacettepe University Computer Engineering Department 30
Another example:
Students S People P
sid S.name gpa ssn P.name address
001 John 3.4 ⋈ 1234545 John 216 Rosse
002 Bob 1.3 5423341 Bob 217 Rosse
𝑆𝑡𝑢𝑑𝑒𝑛𝑡𝑠 ⋈ 𝑃𝑒𝑜𝑝𝑙𝑒
SELECT DISTINCT
gpa,
address
FROM Students S, Π"#$,$DDF744 (𝜎"#$&'.) (𝑆 ⋈ 𝑃))
People P
WHERE gpa > 3.5 AND
S.sname = P.sname;
How do we represent
this query in RA?
Hacettepe University Computer Engineering Department 33
Division
• Notation: r ÷ s
• It has nothing to do with arithmetic division.
• Let r and s be relations on schemas R and S respectively where
• R = (A1, …, Am , B1, …, Bn )
• S = (B1, …, Bn)
The result of r ÷ s is a relation on schema
R – S = (A1, …, Am)
r ÷ s = { t | t Î Õ R-S (r) Ù " u Î s ( tu Î r ) }
Where tu means the concatenation of tuples t and u to produce a single tuple
Database System Concepts, Silberschatz, Korth and Sudarshan
Hacettepe University Computer Engineering Department 34
Division Operation - Example
■ Relations r, s
A B B
a 1 1
a 2 2
a 3
b 1 s
g 1
d 1
d 3 ■ r÷s A
d 4
Î 6 a
Î 1 b
b 2
r
a a a a 1 a 1
a a g a 1 b 1
a a g b 1 s
b a g a 1
b a g b 3
■ r÷s
g a g a 1
A B C
g a g b 1
g a b b 1 a a g
r g a g
Relational
SQL Optimized
Algebra (RA) Execution
Query RA Plan
Plan
Relational
SQL Optimized
Algebra (RA) Execution
Query RA Plan
Plan
Relational
SQL Optimized
Algebra (RA) Execution
Query RA Plan
Plan
1. Set Operations in RA
2. Fancier RA
• R1 – R2
• Example:
• AllEmployees – RetiredEmployees R1 R2
SQL:
• A join that involves a predicate
SELECT *
• R1 ⋈q R2 = s q (R1 ´ R2) FROM
Students,People
• Here q can be any condition WHERE q;
• R ⋉ S = P A1,…,An (R ⋈ S) SQL:
• Where A1, …, An are the attributes in R SELECT DISTINCT
sid,sname,gpa
• Example: FROM
Students,People
• Employee ⋉ Dependents WHERE
sname = pname;
RA:
𝑆𝑡𝑢𝑑𝑒𝑛𝑡𝑠 ⋉ 𝑃𝑒𝑜𝑝𝑙𝑒
Hacettepe University Computer Engineering Department 49
Semijoins in Distributed Databases
• Semijoins are often used to compute natural joins in distributed databases
Dependents
Employee
SSN Dname Age
SSN Name ... ...
network
... ...
pid=pid
seller-ssn=ssn
P ssn P pid
sname=fred sname=gizmo
• Find the loan number for each loan of an amount greater than $1200
• Find the names of all customers who have a loan and an account at
bank.
Õcustomer_name (borrower) Ç Õcustomer_name (depositor)
• Find the names of all customers who have a loan at the Perryridge
branch but do not have an account at any branch of the bank.
Õcustomer_name (sbranch_name = “Perryridge”
● Query 2
Õcustomer_name(sloan.loan_number = borrower.loan_number (
(sbranch_name = “Perryridge” (loan)) x borrower))