Professional Documents
Culture Documents
Overview
Measures of Query Cost
Selection Operation
Sorting
Join Operation
Other Operations
Evaluation of Expressions
Database System Concepts - 7th Edition 15.2 ©Silberschatz, Korth and Sudarshan
Basic Steps in Query Processing
Database System Concepts - 7th Edition 15.3 ©Silberschatz, Korth and Sudarshan
Basic Steps in Query Processing (Cont.)
Database System Concepts - 7th Edition 15.4 ©Silberschatz, Korth and Sudarshan
Basic Steps in Query Processing:
Optimization
A relational algebra expression may have many equivalent
expressions
• E.g., salary75000(salary(instructor)) is equivalent to
salary(salary75000(instructor))
Each relational algebra operation can be evaluated using
one of several different algorithms
• Correspondingly, a relational-algebra expression can be
evaluated in many ways.
Annotated expression specifying detailed evaluation
strategy is called an evaluation-plan. E.g.,:
• Use an index on salary to find instructors with salary <
75000,
• Or perform complete relation scan and discard
instructors with salary 75000
Database System Concepts - 7th Edition 15.5 ©Silberschatz, Korth and Sudarshan
Basic Steps: Optimization (Cont.)
Database System Concepts - 7th Edition 15.6 ©Silberschatz, Korth and Sudarshan
Measures of Query Cost
Database System Concepts - 7th Edition 15.7 ©Silberschatz, Korth and Sudarshan
Measures of Query Cost
Database System Concepts - 7th Edition 15.8 ©Silberschatz, Korth and Sudarshan
Measures of Query Cost (Cont.)
Database System Concepts - 7th Edition 15.9 ©Silberschatz, Korth and Sudarshan
Selection Operation
File scan
Algorithm A1 (linear search). Scan each file block and test
all records to see whether they satisfy the selection
condition.
• Cost estimate = br block transfers + 1 seek
br denotes number of blocks containing records
from relation r
• If selection is on a key attribute, can stop on finding
record
cost = (br /2) block transfers + 1 seek
• Linear search can be applied regardless of
selection condition or
ordering of records in the file, or
availability of indices
Note: binary search generally does not make sense since
data is not stored consecutively
• except when there is an index available,
• and binary search requires more seeks than index
search
Database System Concepts - 7th Edition 15.10 ©Silberschatz, Korth and Sudarshan
Selections Using Indices
Database System Concepts - 7th Edition 15.11 ©Silberschatz, Korth and Sudarshan
Selections Using Indices
Database System Concepts - 7th Edition 15.12 ©Silberschatz, Korth and Sudarshan
Selections Involving Comparisons
Database System Concepts - 7th Edition 15.13 ©Silberschatz, Korth and Sudarshan
Implementation of Complex Selections
Database System Concepts - 7th Edition 15.14 ©Silberschatz, Korth and Sudarshan
Algorithms for Complex Selections
Disjunction:1 2 . . . n (r).
A10 (disjunctive selection by union of identifiers).
• Applicable if all conditions have available indices.
Otherwise use linear scan.
• Use corresponding index for each condition, and take
union of all the obtained sets of record pointers.
• Then fetch records from file
Negation: (r)
• Use linear scan on file
• If very few records satisfy , and an index is applicable
to
Find satisfying records using index and fetch from
file
Database System Concepts - 7th Edition 15.15 ©Silberschatz, Korth and Sudarshan
Bitmap Index Scan
Database System Concepts - 7th Edition 15.16 ©Silberschatz, Korth and Sudarshan
Sorting
Database System Concepts - 7th Edition 15.17 ©Silberschatz, Korth and Sudarshan
Example: External Sorting Using Sort-
Merge
Database System Concepts - 7th Edition 15.18 ©Silberschatz, Korth and Sudarshan
External Sort-Merge
Database System Concepts - 7th Edition 15.19 ©Silberschatz, Korth and Sudarshan
External Sort-Merge (Cont.)
Database System Concepts - 7th Edition 15.20 ©Silberschatz, Korth and Sudarshan
External Sort-Merge (Cont.)
Database System Concepts - 7th Edition 15.21 ©Silberschatz, Korth and Sudarshan
External Merge Sort (Cont.)
Cost analysis:
• 1 block per run leads to too many seeks during merge
Instead use bb buffer blocks per run
read/write bb blocks at a time
Can merge M/bb–1 runs in one pass
• Total number of merge passes required: log
M/bb–1(br/M).
Database System Concepts - 7th Edition 15.22 ©Silberschatz, Korth and Sudarshan
External Merge Sort (Cont.)
Cost of seeks
• During run generation: one seek to read each run
and one seek to write each run
2 br / M
• During the merge phase
Need 2 br / bb seeks for each merge pass
• except the final one which does not require a
write
Total number of seeks:
2 br / M + br / bb (2 logM/bb–1(br / M) -1)
Database System Concepts - 7th Edition 15.23 ©Silberschatz, Korth and Sudarshan
Join Operation
Database System Concepts - 7th Edition 15.24 ©Silberschatz, Korth and Sudarshan
Nested-Loop Join
Database System Concepts - 7th Edition 15.25 ©Silberschatz, Korth and Sudarshan
Nested-Loop Join (Cont.)
Database System Concepts - 7th Edition 15.26 ©Silberschatz, Korth and Sudarshan
Block Nested-Loop Join
Database System Concepts - 7th Edition 15.27 ©Silberschatz, Korth and Sudarshan
Indexed Nested-Loop Join
Database System Concepts - 7th Edition 15.29 ©Silberschatz, Korth and Sudarshan
Merge-Join
Database System Concepts - 7th Edition 15.31 ©Silberschatz, Korth and Sudarshan
Merge-Join (Cont.)
Database System Concepts - 7th Edition 15.32 ©Silberschatz, Korth and Sudarshan
Hash-Join
Database System Concepts - 7th Edition 15.33 ©Silberschatz, Korth and Sudarshan
Hash-Join (Cont.)
Database System Concepts - 7th Edition 15.34 ©Silberschatz, Korth and Sudarshan
Hash-Join Algorithm
The hash-join of r and s is computed as
follows.
1. Partition the relation s using hashing function h. When
partitioning a relation, one block of memory is reserved
as the output buffer for each partition.
2. Partition r similarly.
3. For each i:
(a) Load si into memory and build an in-memory
hash index on it using the join attribute. This hash
index uses a different hash function than the earlier
one h.
(b) Read the tuples in ri from the disk one by
one. For each tuple tr locate each matching tuple ts
Relationinssis
i using the
called in-memory
the hash
build input andindex. Output
r is called the
the probe
input. concatenation of their attributes.
Database System Concepts - 7th Edition 15.36 ©Silberschatz, Korth and Sudarshan
Hash-Join algorithm (Cont.)
Database System Concepts - 7th Edition 15.37 ©Silberschatz, Korth and Sudarshan
Handling of Overflows
Database System Concepts - 7th Edition 15.39 ©Silberschatz, Korth and Sudarshan
Hybrid Hash–Join
Database System Concepts - 7th Edition 15.41 ©Silberschatz, Korth and Sudarshan
Complex Joins
Database System Concepts - 7th Edition 15.42 ©Silberschatz, Korth and Sudarshan
Joins over Spatial Data
Database System Concepts - 7th Edition 15.43 ©Silberschatz, Korth and Sudarshan
Other Operations
Database System Concepts - 7th Edition 15.44 ©Silberschatz, Korth and Sudarshan
Other Operations : Aggregation
Database System Concepts - 7th Edition 15.45 ©Silberschatz, Korth and Sudarshan
Other Operations : Set Operations
Database System Concepts - 7th Edition 15.46 ©Silberschatz, Korth and Sudarshan
Other Operations : Set Operations
Database System Concepts - 7th Edition 15.47 ©Silberschatz, Korth and Sudarshan
Answering Keyword Queries
Database System Concepts - 7th Edition 15.48 ©Silberschatz, Korth and Sudarshan
Other Operations : Outer Join
Database System Concepts - 7th Edition 15.49 ©Silberschatz, Korth and Sudarshan
Other Operations : Outer Join
Database System Concepts - 7th Edition 15.50 ©Silberschatz, Korth and Sudarshan
Evaluation of Expressions
Database System Concepts - 7th Edition 15.51 ©Silberschatz, Korth and Sudarshan
Materialization
Database System Concepts - 7th Edition 15.52 ©Silberschatz, Korth and Sudarshan
Materialization (Cont.)
Database System Concepts - 7th Edition 15.53 ©Silberschatz, Korth and Sudarshan
Pipelining
Database System Concepts - 7th Edition 15.54 ©Silberschatz, Korth and Sudarshan
Pipelining (Cont.)
Database System Concepts - 7th Edition 15.55 ©Silberschatz, Korth and Sudarshan
Pipelining (Cont.)
Database System Concepts - 7th Edition 15.56 ©Silberschatz, Korth and Sudarshan
Blocking Operations
Database System Concepts - 7th Edition 15.57 ©Silberschatz, Korth and Sudarshan
Pipeline Stages
Pipeline stages:
• All operations in a stage run concurrently
• A stage can start only after preceding stages have
completed execution
Database System Concepts - 7th Edition 15.58 ©Silberschatz, Korth and Sudarshan
Evaluation Algorithms for Pipelining
Data streams
• Data entering database in a continuous manner
• E.g., Sensor networks, user clicks, …
Continuous queries
• Results get updated as streaming data enters the
database
• Aggregation on windows is often used
E.g., tumbling windows divide time into units, e.g.,
hours, minutes
Need to use pipelined processing algorithms
• Punctuations used to infer when all data for a window
has been received
Database System Concepts - 7th Edition 15.60 ©Silberschatz, Korth and Sudarshan
Query Processing in Memory
Database System Concepts - 7th Edition 15.61 ©Silberschatz, Korth and Sudarshan
Cache Conscious Algorithms
Database System Concepts - 7th Edition 15.63 ©Silberschatz, Korth and Sudarshan