The query optimization is a critical performance issue in centralized databasemanagement system, because the main difficulty is to choose one execution strategy thatminimizes the consumption of computing resources. Another objective of queryoptimization is to reduce the total execution time of the query which is the sum of theexecution times of all individual operations that make up the query. There are two maintechniques for query optimization in centralized DBMS, although the two methods areusually combined in practice. The first approach uses
that order theoperations in a query while the second approach compares different strategies based ontheir relative costs and selects one which requires minimum computing resources. In adistributed system, several additional factors further complicate the process of queryexecution. In general, the relations are fragmented in a distributed database, therefore, thedistributed query processor transform a high-level query on a logical distributed database(entire relation) into a sequence of database operations (relational algebra) on relationfragments. The data accessed by the query in a distributed database must be localized sothat the database operations can be performed on local data or relation fragments. Finally,the query on fragments must be extended with communication operations, andoptimization should be done with respect to a cost function that minimizes the use of computing resources such as disk I/Os, CPUs and communication networks.
Let us consider the following two relations which are stored in a centralized DBMS
Employee (empid, ename, salary, designation, deptno)Department (deptno, dname, location)
and the following query:
“Retrieve the names of all employees whose department location is ‘inside’ thecampus”
are primary key for the relation Employee and Departmentrespectively and
is a foreign key of the relation Employee.Using SQL, the above query can be expressed as:
Select ename from Employee, Department where Employee.deptno =Department.deptno and location = “inside”.
Two equivalent relational algebraic expressions that correspond to the above SQLstatement are as follows:(i)∏
(location = ‘inside’)
(Employee.deptno = Department.deptno)
Employee.deptno = Department.deptno
location = ‘inside’