IS312 - L04 - QP-Projection and Set Operation

Database Systems II
Query Optimization Techniques
Dr. Noha Nagy

What is the Problem?
• Problem: efficiently answering a given Query
[Figure: Query → Query Processing → Result]
Relational Algebra: Project Operator
• Produces a table containing a subset of the columns of the argument table
π attribute list (relation)
• Example:
π Name,Hobby (Person)
Select Name, Hobby from Person

Person
Id    Name  Address    Hobby
1123  John  123 Main   stamps
1133  John  123 Main   coins
5556  Mary  7 Lake Dr  hiking
9876  Bart  5 Pine St  stamps

π Name,Hobby (Person)
Name  Hobby
John  stamps
John  coins
Mary  hiking
Bart  stamps
Project Operator
• Example:
π Name,Address (Person)

Person
Id    Name  Address    Hobby
1123  John  123 Main   stamps
1133  John  123 Main   coins
5556  Mary  7 Lake Dr  hiking
9876  Bart  5 Pine St  stamps

π Name,Address (Person)
Name  Address
John  123 Main
Mary  7 Lake Dr
Bart  5 Pine St

• Result is a table (no duplicates)
• Tuples are unique
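
The projection example can be sketched in a few lines of Python. This is only an illustration: the function name `project` and the dictionary representation of rows are assumptions, and duplicates are removed with a hash set here rather than by sorting.

```python
from typing import Iterable


def project(rows: Iterable[dict], attrs: list[str]) -> list[tuple]:
    """Keep only `attrs` from each row and drop duplicate result tuples."""
    seen = set()      # tuples already emitted (hash-based duplicate check)
    result = []
    for row in rows:
        t = tuple(row[a] for a in attrs)   # remove unwanted columns
        if t not in seen:                  # keep the first occurrence only
            seen.add(t)
            result.append(t)
    return result


# The Person table from the slide.
person = [
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "stamps"},
    {"Id": 1133, "Name": "John", "Address": "123 Main", "Hobby": "coins"},
    {"Id": 5556, "Name": "Mary", "Address": "7 Lake Dr", "Hobby": "hiking"},
    {"Id": 9876, "Name": "Bart", "Address": "5 Pine St", "Hobby": "stamps"},
]

print(project(person, ["Name", "Address"]))
# [('John', '123 Main'), ('Mary', '7 Lake Dr'), ('Bart', '5 Pine St')]
```

Projecting on Name and Hobby instead returns all four rows, because no two rows agree on both attributes.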
Projection
• Consider the projection
SELECT DISTINCT ID, name FROM R
[Figure: R(ID, name, sal, DOB) projected onto (ID, name)]
• The implementation requires the following
– Remove unwanted columns (on-the-fly)
– Eliminate any duplicate tuples produced
• This step is the difficult one!
• We will describe sorting to cope with duplicate elimination
π Algorithms
• π <attribute list>(R)
– Just remove unwanted attributes!
• Duplicate rows:
– No DISTINCT in query
  • No need to remove duplicates
– DISTINCT in query
  • Key ∈ attribute list ➔ No duplicates
  • Key ∉ attribute list ➔ Duplicate elimination (Sort → Eliminate)
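
The decision rule above can be written as a small predicate. This is a sketch under assumptions: the function name is hypothetical, attribute lists and keys are modelled as Python sets, and ID is assumed to be the key of R purely for the example.

```python
def needs_duplicate_elimination(distinct: bool, attrs: set[str], key: set[str]) -> bool:
    """Return True only when the projection must remove duplicates explicitly."""
    if not distinct:
        return False          # no DISTINCT: duplicates are simply left in place
    # If every key attribute survives the projection, rows stay unique.
    return not key <= attrs


# Assumption for illustration: ID is the key of R.
print(needs_duplicate_elimination(True, {"ID", "name"}, {"ID"}))   # False
print(needs_duplicate_elimination(True, {"name"}, {"ID"}))         # True
```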
Sort → Eliminate
• Sort rows
• Remove repeated ones
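
The two steps above (sort, then remove repeats) can be sketched with two positions `i` and `j`, where `i` marks the current distinct row and `j` scans forward past its copies. The function name and the in-memory list are assumptions; a real system would sort on disk. The rows are the ones used in the slide example.

```python
def sort_eliminate(rows):
    """Sort rows on all attributes, then keep one copy of each run of equal rows."""
    ordered = sorted(rows)            # equal rows become adjacent
    result = []
    i = 0
    while i < len(ordered):
        result.append(ordered[i])     # first row of a run is kept
        j = i + 1
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1                    # skip repeated copies of row i
        i = j                         # continue with the next distinct row
    return result


# (DeptName, Salary, Address, EmpName)
rows = [
    ("OR", 3000, "Cairo", "Randa"),
    ("CS", 4000, "Giza", "Nemin"),
    ("CS", 4000, "Giza", "Nemin"),
    ("CS", 4000, "6 October", "Nemin"),
    ("IS", 2000, "Giza", "Randa"),
    ("IS", 2000, "Cairo", "Nemin"),
    ("CS", 4000, "Giza", "Nemin"),
]

for row in sort_eliminate(rows):
    print(row)
```

The seven input rows reduce to five distinct rows. Python orders the tuples on every attribute, so the output order reflects a full lexicographic sort.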
Sort → Eliminate
• Sort tuples on all remaining attributes

T’ (projected tuples, before sorting)
DeptName  Salary  Address    EmpName
OR        3000    Cairo      Randa
CS        4000    Giza       Nemin
CS        4000    Giza       Nemin
CS        4000    6 October  Nemin
IS        2000    Giza       Randa
IS        2000    Cairo      Nemin
CS        4000    Giza       Nemin

T’ after sorting (equal tuples are now adjacent)
#  DeptName  Salary  Address    EmpName
1  CS        4000    Giza       Nemin
2  CS        4000    Giza       Nemin
3  CS        4000    Giza       Nemin
4  CS        4000    6 October  Nemin
5  IS        2000    Giza       Randa
6  IS        2000    Cairo      Nemin
7  OR        3000    Cairo      Randa

• Scan T’ with two positions, i and j, and write each distinct tuple to the result T once
– i:1, j:2: tuple 2 equals tuple 1 → CS 4000 Giza Nemin is written to T once; tuple 2 is dropped
– i:1, j:3: tuple 3 equals tuple 1 → tuple 3 is dropped
– i:1, j:4: tuple 4 differs from tuple 1 → i moves to 4
– i:4, j:5: tuple 5 differs from tuple 4 → CS 4000 6 October Nemin is written to T
– i:5, j:6: tuple 6 differs from tuple 5 → IS 2000 Giza Randa is written to T
– And so on…
SET Operations
• Algorithm for SET operations
• Set operations:
• UNION, INTERSECTION, SET DIFFERENCE and CARTESIAN PRODUCT
• The CARTESIAN PRODUCT of relations R and S includes all possible combinations of records from R and S. The attributes of the result include all attributes of R and S.
• Cost analysis of CARTESIAN PRODUCT
• If R has n records and j attributes and S has m records and k attributes, the result relation will have n*m records and j+k attributes.
• The CARTESIAN PRODUCT operation is very expensive and should be avoided if possible.
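
A small sketch makes the n*m and j+k counts concrete. The relation contents below are made up for illustration only, and rows are modelled as plain tuples.

```python
from itertools import product


def cartesian_product(r, s):
    """Pair every row of r with every row of s and concatenate their attributes."""
    return [a + b for a, b in product(r, s)]


# Hypothetical data: R has n=3 records with j=2 attributes,
# S has m=2 records with k=3 attributes.
R = [(1, "a"), (2, "b"), (3, "c")]
S = [("x", 10, True), ("y", 20, False)]

result = cartesian_product(R, S)
print(len(result))      # 6 records  (n * m = 3 * 2)
print(len(result[0]))   # 5 attributes (j + k = 2 + 3)
```

The result grows multiplicatively with the input sizes, which is why the operation is costly on large relations.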
SET Operations
• Apply to type compatible relations
• Same number of attributes
• Names of attributes are the same in both
• Same attribute domains
• Tables:
Person (SSN, Name, Address, Hobby)
Professor (Id, Name, Office, Phone)
are not union compatible.
π Name (Person) and π Name (Professor)
are union compatible and
π Name (Person) - π Name (Professor)
makes sense.
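
The three compatibility conditions can be checked mechanically. In this sketch a schema is a list of (attribute name, domain) pairs; the function names and the domain labels "int" and "str" are assumptions made for the example, since the slides do not state the domains.

```python
def union_compatible(schema_a, schema_b):
    """Same number of attributes, same names, and same domains, position by position."""
    if len(schema_a) != len(schema_b):
        return False
    return all(a == b for a, b in zip(schema_a, schema_b))


def project_schema(schema, attrs):
    """Schema of a projection: keep only the listed attributes, in schema order."""
    return [(name, dom) for name, dom in schema if name in attrs]


# Domains below are assumed for illustration.
person = [("SSN", "int"), ("Name", "str"), ("Address", "str"), ("Hobby", "str")]
professor = [("Id", "int"), ("Name", "str"), ("Office", "str"), ("Phone", "str")]

print(union_compatible(person, professor))                      # False
print(union_compatible(project_schema(person, {"Name"}),
                       project_schema(professor, {"Name"})))    # True
```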
Algorithms for SET Operations (Sorting-based)
• UNION
• Sort the two relations on the same attributes.
• Scan and merge both sorted files concurrently; whenever the same tuple exists in both relations, only one copy is kept in the merged result.
• INTERSECTION
• Sort the two relations on the same attributes.
• Scan and merge both sorted files concurrently; keep in the merged result only those tuples that appear in both relations.
• SET DIFFERENCE R-S
• Sort the two relations on the same attributes.
• Scan and merge both sorted files concurrently; keep in the merged result only those tuples that appear in relation R but not in relation S.
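
The UNION steps can be sketched as a merge of two sorted lists. The function name is an assumption, and tuples are written as (DeptID, DeptName) so that Python's default tuple ordering sorts on DeptID first. The rows are the department rows used in the union example of these slides.

```python
def sort_merge_union(r, s):
    """Sort both inputs, then merge them, keeping one copy of each tuple."""
    r, s = sorted(r), sorted(s)
    out = []

    def emit(t):
        if not out or out[-1] != t:   # also drops duplicates within one input
            out.append(t)

    i = j = 0
    while i < len(r) and j < len(s):
        if r[i] < s[j]:
            emit(r[i]); i += 1
        elif r[i] > s[j]:
            emit(s[j]); j += 1
        else:                         # same tuple in both: keep only one
            emit(r[i]); i += 1; j += 1
    for t in r[i:]:                   # leftovers from whichever input remains
        emit(t)
    for t in s[j:]:
        emit(t)
    return out


R = [(1, "Accounting"), (2, "Research"), (4, "HR"), (9, "Admin")]
S = [(5, "Management"), (7, "Logistics"), (1, "Accounting")]

print(sort_merge_union(R, S))
# [(1, 'Accounting'), (2, 'Research'), (4, 'HR'),
#  (5, 'Management'), (7, 'Logistics'), (9, 'Admin')]
```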
Sorting-based ∪ Algorithm

R (sorted on DeptID)
DeptName    DeptID
Accounting  1
Research    2
HR          4
Admin       9

S (sorted on DeptID)
DeptName    DeptID
Accounting  1
Management  5
Logistics   7

• Merge steps (R∪S grows one tuple at a time)
– Accounting 1 appears in both R and S → kept once
– Research 2 (from R)
– HR 4 (from R)
– Management 5 (from S)
– Logistics 7 (from S)
– Admin 9 (from R)

R∪S
DeptName    DeptID
Accounting  1
Research    2
HR          4
Management  5
Logistics   7
Admin       9
Sorting-based ∩ and - Algorithms
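
The slide gives only the title, so the following is a sketch based on the earlier descriptions of INTERSECTION and SET DIFFERENCE, not the lecturer's own code. The function names are assumptions, and tuples are written as (DeptID, DeptName) so that default tuple ordering sorts on DeptID first.

```python
def sort_merge_intersection(r, s):
    """Keep only tuples that appear in both sorted inputs."""
    r, s = sorted(r), sorted(s)
    out, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        if r[i] < s[j]:
            i += 1
        elif r[i] > s[j]:
            j += 1
        else:
            if not out or out[-1] != r[i]:
                out.append(r[i])
            i += 1
            j += 1
    return out


def sort_merge_difference(r, s):
    """Keep only tuples that appear in r but not in s (R - S)."""
    r, s = sorted(r), sorted(s)
    out, i, j = [], 0, 0
    while i < len(r):
        if j < len(s) and s[j] < r[i]:
            j += 1                    # s is behind: catch up
        elif j < len(s) and s[j] == r[i]:
            i += 1                    # r[i] also occurs in s: drop it
        else:
            if not out or out[-1] != r[i]:
                out.append(r[i])      # no match in s: keep it
            i += 1
    return out


R = [(1, "Accounting"), (2, "Research"), (4, "HR"), (9, "Admin")]
S = [(1, "Accounting"), (5, "Management"), (7, "Logistics")]

print(sort_merge_intersection(R, S))  # [(1, 'Accounting')]
print(sort_merge_difference(R, S))    # [(2, 'Research'), (4, 'HR'), (9, 'Admin')]
```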
Summary
• Relational Algebra operators can be classified into three groups
[Selection, Projection, Grouping, Set operation, Rename]
• Tuple at a Time Unary Operators
• Selection and Projection
• No need to bring entire relation into memory at one time
• Full Relation Unary Operators
• Duplicate elimination and grouping
• Require seeing all or most of the tuples in memory at once
• Full Relation Binary Operators
• Set operators such as union, intersection, and difference, plus join and Cartesian product
• Require seeing the tuples of both relations in memory
Aggregate Operators
• Functions that operate on sets:
– COUNT, SUM, AVG, MAX, MIN
• Produce numbers (not tables)
• Not part of relational algebra
• You don't need the whole row

SELECT MAX(Salary)
FROM EMPLOYEE;
• Index available
  • Index scan
• No index, and table sorted on the attribute we want
  • The MIN and MAX are readily available
• No index, and table not sorted
  • Table scan
Aggregate Operators with B-Tree
• MAX, MIN, SUM, AVERAGE, COUNT
[Figure: B-tree index. Root key: 26. Second-level keys: 6 12 | 42 51 62. Leaf keys: 1 2 4 7 8 13 15 18 25 27 29 45 46 48 53 55 60 64 70 90]
Aggregate Operators with B-Tree
• MAX
• Traverse till the right-most index key
• MIN
• Traverse till the left-most index key
• SUM
• Visit all keys
• AVERAGE
• Visit all keys
• COUNT
• Is usually part of the catalog so it can be found directly
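
The traversals can be sketched on a tiny in-memory tree. The `Node` class and function names are assumptions, not a real index implementation. The keys come from the B-tree figure in these slides, but the way the leaf keys are split across leaf nodes is an assumption chosen to be consistent with the separator keys.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list[int]
    children: list["Node"] = field(default_factory=list)  # empty for a leaf


def btree_min(node):
    while node.children:          # follow the left-most pointer down to a leaf
        node = node.children[0]
    return node.keys[0]


def btree_max(node):
    while node.children:          # follow the right-most pointer down to a leaf
        node = node.children[-1]
    return node.keys[-1]


def btree_keys(node):
    """Yield every key in sorted order (SUM and AVERAGE must visit them all)."""
    if not node.children:
        yield from node.keys
        return
    for i, child in enumerate(node.children):
        yield from btree_keys(child)
        if i < len(node.keys):
            yield node.keys[i]


# Leaf grouping below is assumed; keys are those in the figure.
root = Node([26], [
    Node([6, 12], [Node([1, 2, 4]), Node([7, 8]), Node([13, 15, 18, 25])]),
    Node([42, 51, 62], [Node([27, 29]), Node([45, 46, 48]),
                        Node([53, 55, 60]), Node([64, 70, 90])]),
])

total = sum(btree_keys(root))
count = sum(1 for _ in btree_keys(root))   # counted here only to compute the average
print(btree_min(root), btree_max(root), total, total / count)
```

MIN and MAX touch one node per level, while SUM and AVERAGE visit every key.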
GROUP BY
SELECT Dno, AVG (Salary)
FROM EMPLOYEE
GROUP BY Dno;
• Sorting
• Sort the relation on the grouping attributes
• Partition the relation into groups by the grouping attributes
• Apply the aggregate operators on the groups
• Computation is complex
• Clustering index → Already grouped
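
The three sorting-based steps can be sketched with `itertools.groupby`, which forms groups from adjacent rows with equal keys and therefore needs sorted input. The function name is an assumption. The (Dno, Salary) rows are the values from the GROUP BY example in these slides.

```python
from itertools import groupby
from operator import itemgetter


def group_avg(rows):
    """Sort on Dno, partition into groups, then apply AVG(Salary) to each group."""
    ordered = sorted(rows, key=itemgetter(0))              # step 1: sort on Dno
    result = []
    for dno, group in groupby(ordered, key=itemgetter(0)):  # step 2: partition
        salaries = [salary for _, salary in group]
        result.append((dno, sum(salaries) / len(salaries)))  # step 3: aggregate
    return result


# (Dno, Salary)
employee = [(1, 3000), (5, 4000), (3, 4000), (3, 4000),
            (5, 2000), (3, 2000), (1, 4000)]

for dno, avg in group_avg(employee):
    print(dno, round(avg))
# 1 3500
# 3 3333
# 5 3000
```

If the table is already clustered on Dno, the sort step is unnecessary because equal Dno values are already adjacent.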
GROUP BY
• Example: sort on Dno, partition into groups, then apply AVG(Salary)

EMPLOYEE (unsorted)
Dno  Salary
1    3000
5    4000
3    4000
3    4000
5    2000
3    2000
1    4000

Sorted on Dno
Dno  Salary
1    3000
1    4000
3    4000
3    4000
3    2000
5    2000
5    4000

Result
Dno  AVG(Salary)
1    3500
3    3333
5    3000
