
ICS 611 Spring Semester, 2008 L Gottschalk

Distributed Data Base Systems [1]

Chapter 9: Optimized Distributed Queries, page 228
This chapter deals with the last TWO steps of the four steps below:
Calculus Query on Distributed Relations
        v   Query Decomposition (control site; uses the Global Schema)
Algebraic Query on Distributed Relations
        v   Data Localization (control site; uses the Fragment Schema)
Fragment Query
        v   Global Optimization (control site; uses Statistics on Fragments)
Optimized Fragment Query with Communication Operations
        v   Local Optimization (local sites; use the Local Schemas)
Optimized Local Queries

The first two steps are dealt with in Chapter 8.

8.1 Query Decomposition, page 204


The steps of chapter eight (steps 1 and 2 above) must be carried out for local
databases that are fragmented, as well as for fragmented distributed databases.

The third step is finding the best permutation (ordering) of operations in order to
minimize cost, that is, to optimize performance.

The calculation to find the truly optimal (minimum-cost) permutation is too expensive.

Therefore, the goal is a near-optimal (near-minimal) plan. Or, more realistically, the
strategy is simply to avoid plainly bad execution plans.

To find the (near-)optimal or (near-)minimal strategy, we need to predict execution
costs:
• I/O
• CPU
• Communication.
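
A common way to combine these cost components is a weighted sum in the style of the
textbook's cost model; the unit-cost values below are invented placeholders, not
figures from these notes:

# Hypothetical unit costs; real values depend on the system and the network.
T_CPU = 0.000001   # time per CPU instruction
T_IO  = 0.01       # time per disk I/O
T_MSG = 0.1        # fixed time per message
T_TR  = 0.0001     # time per byte transmitted

def total_cost(n_insts, n_ios, n_msgs, n_bytes):
    # Total time of a plan as a weighted sum of CPU, I/O, and communication work.
    return T_CPU * n_insts + T_IO * n_ios + T_MSG * n_msgs + T_TR * n_bytes

# Example: 1M instructions, 200 I/Os, 10 messages, 50 KB shipped between sites.
print(total_cost(1_000_000, 200, 10, 50_000))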

[1] Principles of Distributed Database Systems, second edition, Ozsu and Valduriez


So we need
• fragment statistics
• forecast of the cardinalities of intermediate result tables.

To simplify, we focus mostly on ordering of joins.

Extending the discussion to semijoins and unions should be straightforward.

Organization of this chapter:


9.1: main components of query optimization, including
• the search space
• the search strategy
• the cost model.

9.2 Centralized query optimization

9.3 join ordering in fragment queries

9.4 examples of the techniques in Ingres, R*, and SDD-1.

9.1 Query Optimization, page 229


The input to query optimization is a query expressed in relational algebra (not
calculus).

Query optimization: the process of producing a query execution plan (QEP) that
optimizes (minimizes) the chosen cost function.

The search space: all possible execution plans. They all must produce the same
result.

The cost model: the anticipated cost of a particular execution plan.

The search strategy: this explores the search space and selects the best plan. The
strategy defines the order in which plans are examined.

9.1.1 Search space, page 229

The number of alternative join trees that can be produced for N relations is O(N!).

Therefore, we need to reduce the number of alternatives.

Heuristic 1: perform selection and projection when first accessing a base relation.

Heuristic 2: avoid Cartesian products that are not required by the query.


Heuristic 3: if only linear trees are considered, the search space is reduced to O(2^N).


However, bushy trees are useful for exploiting parallelism.
(Linear tree: at every join node a new base relation is introduced.)
(Bushy tree: base relations are introduced only at the leaves;
the operations above them work on intermediate result tables.)
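
A minimal way to picture the difference, with nested pairs standing for join nodes
and the relation names used only as placeholders:

# Linear (left-deep) tree: each join adds exactly one new base relation.
linear_tree = ((("EMP", "ASG"), "PROJ"), "PAY")

# Bushy tree: base relations appear only at the leaves; an inner join may
# combine two intermediate results, which is what enables parallelism.
bushy_tree = (("EMP", "ASG"), ("PROJ", "PAY"))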

9.1.2 Search Strategy, page 232


The most popular search strategy is dynamic programming.
It is deterministic.
It builds all possible plans.
As soon as a partial plan is determined to be non-optimal, it is abandoned (not completed).

Another deterministic strategy, the “greedy algorithm”, builds only one plan, depth
first.

Dynamic programming has acceptable cost only when the number of relations is small.

Therefore, randomized strategies are used: they start with a plan built by a greedy
algorithm, then try to improve it by visiting its neighbors (nearly identical plans).
A neighbor is found by randomly swapping two steps in the plan.

For more than a few relations, randomized strategies do better than deterministic ones.
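
A sketch of the randomized idea in Python; the cost function is a crude stand-in for
the cost model of 9.1.3, and the cardinalities are invented:

import random

def plan_cost(order, card):
    # Toy cost: sum of guessed intermediate cardinalities (stand-in only).
    size, cost = card[order[0]], 0
    for rel in order[1:]:
        size = size * card[rel] // 10   # crude join-size guess
        cost += size
    return cost

def randomized_opt(relations, card, tries=1000):
    best = list(relations)              # a greedy optimizer would supply this start plan
    for _ in range(tries):
        neighbor = best[:]
        i, j = random.sample(range(len(neighbor)), 2)
        neighbor[i], neighbor[j] = neighbor[j], neighbor[i]   # randomly swap two steps
        if plan_cost(neighbor, card) < plan_cost(best, card):
            best = neighbor             # keep the cheaper neighboring plan
    return best

card = {"EMP": 400, "ASG": 1000, "PROJ": 100}
print(randomized_opt(["EMP", "ASG", "PROJ"], card))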

9.1.3 Distributed Cost Model, page 233


The cost model deals either in
• total time (a cost analysis), or
• response time (a performance analysis).

Each element of total time can be given a cost, and therefore cost can be
minimized.
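
A small sketch of the difference, assuming a plan is a sequence of stages whose steps
within a stage run in parallel; the step timings are invented:

# Each inner list holds steps that run in parallel; the outer list is sequential.
plan = [
    [3.0, 5.0],   # e.g., two selections executed concurrently at two sites
    [2.0],        # shipping one intermediate result
    [4.0],        # final join at the result site
]

total_time    = sum(t for stage in plan for t in stage)    # pay for all work done
response_time = sum(max(stage) for stage in plan)          # only the critical path

print(total_time, response_time)    # 14.0 versus 11.0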

Database Statistics, page 235


The size of intermediate tables is key to accuracy in predicting costs.

There is a direct trade-off between precision of database statistics and the cost of
maintaining them.

There are usually two simplifying assumptions made at this point:


• the distribution of attribute values in a relation is assumed to be uniform
• all attributes are independent.
These assumptions are often wrong.

Using these simplifying assumptions, very simple rules of thumb are given for
estimating the result cardinalities of
• selection
• projection
• Cartesian product
• join
• semijoin
• union, and
• difference.
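
Under the two assumptions above (uniform values, independent attributes), the usual
rules of thumb look roughly like the sketch below; these are standard textbook-style
estimates, not formulas derived in these notes:

def card_selection(card_R, selectivity):
    # card(select_p(R)) = SF(p) * card(R); e.g. SF(A = value) = 1 / distinct(A).
    return selectivity * card_R

def card_projection(card_R):
    # Projection without duplicate elimination keeps the cardinality.
    return card_R

def card_cartesian(card_R, card_S):
    return card_R * card_S

def card_join(card_R, card_S, distinct_A_R, distinct_A_S):
    # Equijoin on attribute A, assuming uniform and independent values.
    return card_R * card_S / max(distinct_A_R, distinct_A_S)

def card_semijoin(card_R, semijoin_selectivity):
    # card(R semijoin S) = SF * card(R), where SF depends on S's join column.
    return semijoin_selectivity * card_R

def card_union_upper(card_R, card_S):
    return card_R + card_S      # upper bound; the number of duplicates is unknown

def card_difference_upper(card_R):
    return card_R               # R - S has at most card(R) tuples

# Example: Emp (400 rows) joined with Assignmts (1000 rows) on EmpNo (400 distinct values).
print(card_join(400, 1000, 400, 400))    # -> 1000.0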

9.2 Centralized Query Optimization, page 239

Since centralized query optimization was discussed in chapter 8, what is left is to
show examples in commercial systems.

9.2.1 INGRES algorithm, page 239


Ingres employs a dynamic optimization algorithm.

Ingres breaks up the calculus query into smaller pieces, recursively.

The break up is into queries on relations having a common variable.

Then, a selection (a unary operation) is applied to each single relation, using the
where clause.

Then, the pieces of the calculus query are executed in turn. The intermediate
result of the first piece is consumed (used) by the second piece. (This is a linear
join tree rather than a bushy join tree strategy.)
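
A sketch of the “consume the previous piece” idea: each piece runs in turn and its
result feeds the next one, forming a linear pipeline. The piece functions and the
sample rows are placeholders, not actual INGRES code:

def run_pieces(pieces, relations):
    # Run the query pieces in order; each piece consumes the previous result.
    intermediate = None
    for piece in pieces:
        intermediate = piece(intermediate, relations)
    return intermediate

emp = [{"EmpNo": 1, "Name": "fred"}, {"EmpNo": 2, "Name": "ann"}]
asg = [{"EmpNo": 1, "ProjNo": 7}, {"EmpNo": 2, "ProjNo": 9}]

# Piece 1: one-relation selection on Emp; piece 2: join its result with Assignmts.
piece1 = lambda inter, rels: [t for t in rels["EMP"] if t["Name"] == "fred"]
piece2 = lambda inter, rels: [dict(e, **a) for e in inter
                              for a in rels["ASG"] if a["EmpNo"] == e["EmpNo"]]

print(run_pieces([piece1, piece2], {"EMP": emp, "ASG": asg}))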

9.2.2 System R Algorithm, page 243


System R uses a static optimization algorithm based on an exhaustive search of the
solution space. It uses statistics about the database.

Instead of systematically doing select operations before joins, System R only does
that if it leads to a better strategy. Every candidate tree is given a cost (total time),
and the lowest cost one is retained.

The candidate trees are derived by permuting the join ordering.

The number of alternative trees is limited by using dynamic programming, which prunes
partial plans that are unlikely to be optimal. Also, plans that include Cartesian
products are pruned immediately.
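
A compact sketch of that dynamic-programming idea: keep only the cheapest plan per set
of joined relations and never consider plans that require a Cartesian product. The
size and cost estimates are crude stand-ins for System R's real cost model:

from itertools import combinations

def dp_join_order(cards, edges):
    # cards: {relation: cardinality}; edges: join predicates as frozenset pairs.
    best = {frozenset([r]): (0, r) for r in cards}      # cheapest plan per subset
    size = {frozenset([r]): cards[r] for r in cards}
    rels = list(cards)
    for k in range(2, len(rels) + 1):
        for subset in map(frozenset, combinations(rels, k)):
            for r in subset:
                rest = subset - {r}
                if rest not in best:
                    continue
                if not any(frozenset({r, s}) in edges for s in rest):
                    continue                             # would need a Cartesian product: prune
                est = size[rest] * cards[r] // 10        # crude join-size guess
                cost = best[rest][0] + est
                if subset not in best or cost < best[subset][0]:
                    best[subset] = (cost, (best[rest][1], r))
                    size[subset] = est
    return best

cards = {"EMP": 400, "ASG": 1000, "PROJ": 100}
edges = {frozenset({"EMP", "ASG"}), frozenset({"ASG", "PROJ"})}
print(dp_join_order(cards, edges)[frozenset(cards)])     # cheapest full plan kept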

9.3 Join Ordering in Fragment Queries, page 247


Ordering joins is the principal way to optimize a query.

It can dramatically reduce communication time between servers.


Two approaches:
• Optimize the ordering of joins,
• Replace joins by combinations of semijoins (to minimize communication
costs).

9.3.1 Join Ordering, page 247


INGRES and R* use joins rather than semijoins.

Simplifying assumptions:

In this discussion, “relation” will mean a fragment stored at a particular site.


It doesn’t matter whether it is to be combined with other fragments of the
same relation, or of another relation, since we are only interested in the
processing cost. It is a given that the algorithm will produce the right result
(all alternative plans will do that.)

Since we are focusing on join processing (and join ordering), we ignore local
processing time for selection and projection.

We only consider join queries of relations on different servers.

We assume transfer is relation at a time, not tuple at a time, and therefore
ignore time for producing the data at the result site.

First, if there are two relations to be joined, R and S, we want to send the smaller
one to the site of the larger.
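
A minimal sketch of that rule, assuming the relation sizes (in bytes) are known from
the database statistics:

def transfer_direction(size_R_bytes, size_S_bytes):
    # Ship the smaller operand to the site of the larger one, then join there.
    if size_R_bytes <= size_S_bytes:
        return "send R to S's site and join at S's site"
    return "send S to R's site and join at R's site"

print(transfer_direction(200_000, 1_500_000))    # R is smaller, so R is shipped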

If there are three relations, Emp, Assignmts, and Proj, we must estimate what will
be the resulting cardinalities of Emp⋈Assignmts, Emp⋈Proj, and Assignmts⋈Proj.
Then we can calculate the lowest cost ordering.
Explanation: Emp⋈Assignmts must be combined with Proj, and so again there
will be network transmission. To calculate that cost, we need to guess
the size of Emp⋈Assignmts.

Estimating this way is more costly as the number of relations grows.

System R* does it this way.

A common way to estimate intermediate table sizes (e.g., Emp⋈Assignmts) is to
assume a Cartesian product.
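
A sketch of the three-relation case: estimate each pairwise intermediate size, then
compare the communication cost of the candidate orderings. All of the figures below
are invented for illustration:

# Base cardinalities and guessed intermediate cardinalities (illustrative only).
size = {
    "Emp": 400, "Assignmts": 1000, "Proj": 100,
    frozenset({"Emp", "Assignmts"}): 1000,
    frozenset({"Assignmts", "Proj"}): 1000,
    frozenset({"Emp", "Proj"}): 40_000,    # no join predicate: close to a Cartesian product
}

def ordering_cost(first, second, third):
    # Ship `first` to `second`'s site, join, then ship that result to `third`'s
    # site for the last join; `third` itself never moves. Cost = tuples shipped.
    return size[first] + size[frozenset({first, second})]

for ordering in [("Emp", "Assignmts", "Proj"),
                 ("Assignmts", "Proj", "Emp"),
                 ("Emp", "Proj", "Assignmts")]:
    print(ordering, ordering_cost(*ordering))
# The ordering with the cheapest estimated transfers would be chosen.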

9.3.2 Semijoin Based Algorithms, page 249


If one were to do the join of Emp, Assignmts, and Proj using semijoins, then one
could:


1) Send the EmpNo column of Emp to the location of relation Assignmts.
2) Do a semijoin at that server.
3) Send the ProjNo column of the resulting (reduced) Assignmts to the location of Proj.
4) Do a semijoin there.
5) Send the resulting EmpNo column of the Emp⋈Assignmts⋈Proj semijoin to the
locations of the Emp and Assignmts tables, asking for the corresponding rows to be sent
to the location of the Proj table.
6) Do a three-way join of the REDUCED number of rows of Assignmts and Emp with
the Proj table.
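
A sketch of steps 1 and 2 above (and, symmetrically, steps 3 and 4), with relations as
plain lists of rows; it only illustrates the “ship one column, keep the matching rows”
pattern:

def project(rel, attr):
    # Steps 1/3: the single column that is shipped instead of the whole relation.
    return {t[attr] for t in rel}

def semijoin(rel, attr, shipped_values):
    # Steps 2/4: keep only the rows whose join value appears in the shipped column.
    return [t for t in rel if t[attr] in shipped_values]

emp = [{"EmpNo": 1, "Name": "fred"}, {"EmpNo": 2, "Name": "ann"}]
asg = [{"EmpNo": 1, "ProjNo": 7}, {"EmpNo": 3, "ProjNo": 9}]

shipped = project(emp, "EmpNo")          # ship only the EmpNo values to Assignmts' site
print(semijoin(asg, "EmpNo", shipped))   # only the Assignmts rows that will actually join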

Why does this work?
1) It is much less costly to send just one column of a table than the full table.
2) It is much less costly to send just the needed rows of a table than to send the
full table.

When does this work well?
1) When only a subset of the rows match (for example, when a selection restricts the
rows of interest).

Example:
Select * from Emp, Assignments, Proj
where (Emp.EmpNo = Assignments.EmpNo and Assignments.ProjNo = Proj.ProjNo)
and Emp.EmpNo = 'fred';

THIS WILL WORK WELL AS A SEMIJOIN STRATEGY.

Example:
Select * from Emp, Assignments, Proj
where (Emp.EmpNo = Assignments.EmpNo and Assignments.ProjNo = Proj.ProjNo);

THIS WILL NOT WORK AS WELL AS A SEMIJOIN STRATEGY (with no restricting selection,
nearly every row participates, so the semijoins do little reducing).

“NEITHER APPROACH IS SYSTEMATICALLY THE BEST; THEY SHOULD BE
CONSIDERED AS COMPLEMENTARY.” page 250.

Disadvantage of the semijoin strategy: the number of possible algorithms for
processing a query is huge. Finding the minimal execution plan can be very
expensive.

9.3.3 Join versus Semijoin, page 252


There are many more steps to a semijoin plan than a join plan.

Using semijoins may not be a good idea if local processing is significant compared
to communication (network) costs.

9.4 Distributed Query Optimization Algorithms, page 254


We look at three different products: INGRES, System R*, and SDD-1.


Some differences to note about the three:


1) The optimization process is dynamic for INGRES and static for the other two.

2) Objective of SDD-1 and R* is to minimize total time (cost).
INGRES aims at decreasing a combination of response time and total time.

3) SDD-1 uses semijoins. Distributed INGRES and R* use methods similar to
their non-distributed versions (i.e., joins).

4) All three products use statistical information about the data.

5) INGRES can handle horizontal fragments.

To summarize these differences about the three products:

Product   Method    Goal                          Semijoins?   Handles fragments?
INGRES    Dynamic   Response time & total cost    No           Horizontal
R*        Static    Total cost                    No           No
SDD-1     Static    Total cost                    Yes          No

9.4.1 Distributed INGRES Algorithm, page 254


Repeated from Centralized INGRES (section 9.2.1):
Ingres breaks up the calculus query into smaller pieces, recursively.

The break up is into queries on relations having a common variable.

Then, a selection (a unary operation) is applied to each single relation, using the
where clause.

Then, the pieces of the calculus query are executed in turn. The intermediate
result of the first piece is consumed (used) by the second piece. (This is
a linear join tree rather than a bushy join tree strategy.)

As the processing of the query proceeds, “a simple choice for the next subquery is
to take the next one having no predecessor and involving the smaller fragments.”

“For example, if a query q has the subqueries q1, q2, and q3,
with dependencies q1 >>> q3 and q2 >>> q3, and
if fragments referred to by q1 are smaller than those referred to by q2,
then q1 is selected.”
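
A sketch of that choice rule; the subqueries, dependency map, and fragment sizes below
are made-up placeholders:

def next_subquery(subqueries, deps, fragment_size):
    # Pick a subquery with no unexecuted predecessor that touches the smallest fragments.
    ready = [q for q in subqueries if not deps.get(q)]
    return min(ready, key=fragment_size.get)

subqueries = ["q1", "q2", "q3"]
deps = {"q3": ["q1", "q2"]}                 # q1 >>> q3 and q2 >>> q3
fragment_size = {"q1": 500, "q2": 2000, "q3": 800}
print(next_subquery(subqueries, deps, fragment_size))    # -> "q1"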

“The algorithm of Distributed INGRES is characterized by a limited search of the
solution space, where an optimization decision is taken for each step without
concerning itself with the consequences of that decision on global optimization.
However, the algorithm is able to correct a local decision that proves to be
incorrect.”

“An alternative to the limited search is the exhaustive search used by R*, where all
possible strategies are evaluated to find the best one. In [one study], the two
approaches are compared on the basis of the size of data transfers. An important
conclusion of this study is that
- exhaustive search significantly outperforms limited search as soon as the query
accesses more than three relations,
- dynamic optimization is beneficial because the exact sizes of the intermediate
results are known.”

9.4.2 R* Algorithm, page 259


An exhaustive search of all possible strategies is performed.

This is costly.

The overhead is rapidly amortized if the query is executed frequently.

R* does not deal with fragments.

The goal is total time.

9.4.3 SDD-1 Algorithm, page 263


The SDD-1 algorithm is derived from the “hill-climbing” algorithm:
refinements of an initial feasible solution are recursively computed until no more
improvements can be made.

The hill-climbing algorithm could pursue either total time or response time, but
SDD-1 pursues only total cost.

The initial feasible solution is found by computing the cost of all solutions that
transfer all the required relations to a single candidate site, and then choosing the
least costly of these.
A cost that is ignored is sending the final result to the site where the result is
needed.

Then the cost of joining each possible pair of tables is computed (which would
require sending one table to the other’s site), and then sending the result to the
single candidate site.

If one of these is less costly, then it becomes the new chosen plan (at least for now).

This is done recursively until no more improvements are found.
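
A skeleton of the hill-climbing shape of the algorithm. The cost model, the
“pre-join one pair first” refinement, and all of the figures are placeholders;
SDD-1's real refinements operate on semijoin programs:

from itertools import permutations

# Relations, their sites, sizes, and guessed join-result sizes (all illustrative).
location = {"EMP": "site1", "ASG": "site2", "PROJ": "site3"}
size = {"EMP": 3000, "ASG": 1000, "PROJ": 100}
join_size = {frozenset({"EMP", "ASG"}): 200,
             frozenset({"ASG", "PROJ"}): 150,
             frozenset({"EMP", "PROJ"}): 40_000}

def cost(plan):
    # plan = (pair_or_None, assembly_site); cost = tuples shipped.
    # Shipping the final answer onward to the user is ignored, as in the notes.
    pair, site = plan
    if pair is None:                    # ship every remote relation directly
        return sum(size[r] for r, s in location.items() if s != site)
    r, s = pair                         # ship r to s's site and join there first
    c = size[r]
    if location[s] != site:             # then ship the (reduced) join result on
        c += join_size[frozenset(pair)]
    c += sum(size[x] for x in location if x not in pair and location[x] != site)
    return c

def neighbors(plan):
    # Refinements: perform one pairwise join before assembling everything.
    _, site = plan
    for r, s in permutations(location, 2):
        yield ((r, s), site)

def hill_climb(solution):
    improved = True
    while improved:
        improved = False
        for candidate in neighbors(solution):
            if cost(candidate) < cost(solution):
                solution, improved = candidate, True
                break
    return solution

# Initial feasible solution: the cheapest single site at which to assemble everything.
initial = (None, min(set(location.values()), key=lambda st: cost((None, st))))
print(initial, cost(initial))           # (None, 'site1') 1100
best = hill_climb(initial)
print(best, cost(best))                 # (('PROJ', 'ASG'), 'site1') 250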

The hill-climbing algorithm is of the class of greedy algorithms.


Greedy algorithms start with an initial feasible solution and then iteratively
improve upon it.

The main problem is that strategies with a higher initial cost are eliminated early,
even though they might end up having the best overall cost.
