DATABASE MANAGEMENT
SYSTEMS (DBMS II)
CHAPTER 02
QUERY PROCESSING
& OPTIMIZATION
DATABASE MANAGEMENT SYSTEMS (DBMS II)
✓2.1. Query Processing Steps in DBMS.
✓2.2. Query Decomposition in DBMS.
✓2.3. Optimization Process in DBMS.
✓2.4. Approaches to Query Optimization.
✓2.5. Transformation Rules.
✓2.6. Implementing Relational Operators.
✓2.7. Pipelining.
►What is Data Processing in DBMS?
Definition:
►Data processing refers to the steps of the information processing
cycle in which data is acquired, entered, validated, processed,
stored, and output in response to QUERIES.
Processing is the step that organizes the information to form the
desired output.
►Queries against a database can be fast or slow, depending on many
factors: the size of the table, the amount of data the query
requests, etc.
Introduction to QPO
►For any given query, there are many plans that a database management
system (DBMS) can follow to process it and produce its answer. All plans
are equivalent in terms of their final output, but they vary in their cost,
that is, in the amount of time they need to run.
Which plan needs the least amount of time? Answering this question is the
job of query optimization, which is absolutely necessary in a DBMS.
►What is a Query processor?
Definition 1: A query processor is the part of a DBMS (the DBMS
compiler) responsible for translating declarative statements written
in a given query language into a sequence of physical operations to
be executed by the lower levels of the system to generate outputs.
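►Example: a minimal sketch of how a declarative statement maps to physical operations, using Python's standard `sqlite3` module. The `employee` table, its rows, and the index name `idx_emp_profession` are assumptions made for illustration. SQLite's `EXPLAIN QUERY PLAN` reports the chosen plan; its exact wording differs between SQLite versions, so no output is shown here.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the plan steps SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan step.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, profession TEXT)"
)
conn.executemany(
    "INSERT INTO employee (name, profession) VALUES (?, ?)",
    [("Ann", "ICT"), ("Bob", "Finance"), ("Cleo", "ICT")],
)

sql = "SELECT * FROM employee WHERE profession = 'ICT'"

# Without an index, the only available plan is to read the whole table.
plan_without_index = query_plan(conn, sql)
rows_without_index = sorted(conn.execute(sql).fetchall())

# With an index on the filtered column, a different plan becomes available.
conn.execute("CREATE INDEX idx_emp_profession ON employee (profession)")
plan_with_index = query_plan(conn, sql)
rows_with_index = sorted(conn.execute(sql).fetchall())

print(plan_without_index)
print(plan_with_index)
```

The same SQL text returns the same rows in both cases; only the physical plan changes.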
►What is Query Processing?
Definition:
All database systems must be able to respond to requests for
information from users, that is, to process queries. Obtaining the
desired information from a database system in a predictable and
reliable fashion is called Query Processing.
►What is Query Optimization?
Definition: Getting the desired information (the output) back from a
database system in a timely manner, using the techniques the query
requires, is called Query Optimization. The number of alternative
plans can be enormous.
►How well a query can be optimized also depends on how you write it,
including clauses and operators such as AND, OR, LIKE, ORDER BY, etc.
►What is Query Processing and Optimization (QPO)?
Definition 1:
► Basic idea of QPO:
In SQL, queries are expressed in a high-level declarative format.
QPO translates an SQL query into an execution plan over the
physical data model, built from optimized operations on
file structures, indices, etc.
► An ideal execution plan should take as little time as possible.
Definition 2 of QPO (very important):
►Query processing is the set of activities involved in getting the
result of a query expressed in a high-level language. These
activities include parsing the query, translating it into
expressions that can be implemented at the physical level of the
file system, optimizing it where possible, and evaluating it.
❖ The cost of processing a query is dominated by the speed of
disk access.
❖ For a given query, there are several possible processing
strategies, especially when the query is complex. The difference
between a good strategy and a bad one can be orders of
magnitude. Therefore, it is worthwhile for the system to spend
some time selecting a good strategy for processing the query.
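►Example: a small sketch of two strategies that give the same answer at very different cost. The tables, sizes, and the `nested_loop_join` helper are hypothetical; cost is counted here as the number of row pairs compared, which is a simplification of real disk-based cost.

```python
def nested_loop_join(left, right, key):
    """Join two lists of dicts on `key`, counting the row pairs compared."""
    comparisons = 0
    joined = []
    for left_row in left:
        for right_row in right:
            comparisons += 1
            if left_row[key] == right_row[key]:
                joined.append({**left_row, **right_row})
    return joined, comparisons


# 500 employees, of which 5 work in ICT; 1000 orders, 2 per employee.
employees = [
    {"employee_id": i, "profession": "ICT" if i % 100 == 0 else "Other"}
    for i in range(500)
]
orders = [{"employee_id": i % 500, "product": f"P{i}"} for i in range(1000)]

# Strategy 1: join everything first, then keep only ICT rows.
all_joined, cost_join_first = nested_loop_join(employees, orders, "employee_id")
result_1 = [row for row in all_joined if row["profession"] == "ICT"]

# Strategy 2: keep only ICT employees first, then join the small result.
ict_only = [row for row in employees if row["profession"] == "ICT"]
result_2, cost_filter_first = nested_loop_join(ict_only, orders, "employee_id")

print(cost_join_first, cost_filter_first)
```

Both strategies return the same rows, but the second compares far fewer row pairs because the selection shrinks one input before the join.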

►Why learn about QPO in a DBMS?
1. To assess how a query performs on the available data and get
the best results.
2. To help speed up the processing of a query.
Query Processing Steps in DBMS
There are three phases that a query passes through during the
DBMS's processing of that query:
1. Parsing or translation: by the DBMS.
2. Optimization: by the DBMS and sometimes the user.
3. Evaluation: by the DBMS, which executes the plan; the user may then check the results.
►What do we mean by Query Parsing or translation in DBMS?
1. Parsing and translation:
►Most queries submitted to a DBMS are in a high-level language such as SQL.
During the parsing and translation stage, the human-readable form of
the query is translated into forms usable by the DBMS. These forms
can be a relational algebra expression, a query tree, or a query
graph. The DBMS must therefore translate the SQL query.
Consider the following:
SQL query:
select * from Registration where FirstName = 'Ford'
select * from Employee where (Salary >= 50000) and (Profession = 'ICT');
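►Example: a minimal sketch of the kind of internal form a parser might produce for the Employee query above, here a tiny relational algebra tree with one selection node over one relation. The `Relation` and `Select` classes, the sample rows, and the `Name` column are assumptions for illustration, not the internal format of any particular DBMS.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Relation:
    """Leaf of the query tree: a stored table."""
    name: str
    rows: list


@dataclass
class Select:
    """Selection node: keeps the child rows that satisfy the predicate."""
    predicate: Callable[[dict], bool]
    label: str
    child: object


def evaluate(node):
    """Evaluate a query tree bottom-up and return the resulting rows."""
    if isinstance(node, Relation):
        return list(node.rows)
    if isinstance(node, Select):
        return [row for row in evaluate(node.child) if node.predicate(row)]
    raise TypeError(f"unknown node type: {type(node).__name__}")


def describe(node):
    """Render the tree as a relational-algebra-like expression."""
    if isinstance(node, Relation):
        return node.name
    return f"SELECT[{node.label}]({describe(node.child)})"


employee = Relation("Employee", [
    {"Name": "Ann", "Salary": 62000, "Profession": "ICT"},
    {"Name": "Bob", "Salary": 48000, "Profession": "ICT"},
    {"Name": "Cleo", "Salary": 75000, "Profession": "Finance"},
])

tree = Select(
    predicate=lambda row: row["Salary"] >= 50000 and row["Profession"] == "ICT",
    label="Salary >= 50000 AND Profession = 'ICT'",
    child=employee,
)

result = evaluate(tree)
print(describe(tree))
print(result)
```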
2. Optimization:
►What do we mean by Query optimization in DBMS?
►In this stage, the query processor applies rules to the internal data structures
of the query to transform them into equivalent but more efficient
representations. Optimization matters most for complex queries.
►The rules can be based on mathematical models of the relational
algebra expression, on cost estimates of the different algorithms
applied to operations, or on the semantics within the query and the
relations it involves.

►Selecting the proper rules to apply, when to apply them, and how
they are applied is the function of the query optimization engine.

►What is the primary job of Query optimization in DBMS?
1. Produce equivalent but more efficient query representations (it checks
how the blocks of statements are logically used).
2. Verify the validity of the relational algebra derived from the original query string.
►What is the query optimizer in DBMS?
Definition:
►The query optimizer is the component of a database management
system that attempts to determine the most efficient way to execute a
query (it optimizes the logical blocks of data and the relational
operators used).

► Cost estimates predict the runtime cost of evaluating the query, in
terms of the number of I/O operations required, the CPU requirements,
and other factors determined from the data dictionary.
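►Example: a minimal sketch of cost-based plan selection. The candidate plans, their estimates, and the weights `IO_WEIGHT` and `CPU_WEIGHT` are made-up numbers for illustration; a real optimizer derives its estimates from statistics in the data dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanEstimate:
    name: str
    io_operations: int   # estimated block reads and writes
    cpu_operations: int  # estimated tuple comparisons


# Hypothetical weights: one I/O is treated as far more expensive than one CPU step.
IO_WEIGHT = 1.0
CPU_WEIGHT = 0.001


def estimated_cost(plan):
    """Combine the I/O and CPU estimates into a single cost figure."""
    return IO_WEIGHT * plan.io_operations + CPU_WEIGHT * plan.cpu_operations


def choose_plan(plans):
    """Pick the candidate plan with the lowest estimated cost."""
    return min(plans, key=estimated_cost)


candidates = [
    PlanEstimate("full table scan", io_operations=10_000, cpu_operations=1_000_000),
    PlanEstimate("index lookup", io_operations=40, cpu_operations=2_000),
    PlanEstimate("sort then scan", io_operations=30_000, cpu_operations=5_000_000),
]

best = choose_plan(candidates)
print(best.name, estimated_cost(best))
```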
3. Evaluation:
►What do we mean by Query EVALUATION in DBMS?
► The final step in processing a query is the evaluation phase, which
involves:
1. Selecting and executing the best evaluation plan generated by the
optimization engine of the DBMS.
2. Checking whether the output of the query meets the required
criteria, so that the query can be adjusted if it does not, as with
complex JOIN, SELECT, and other statements.
►What do we mean by Query Cost Optimization?
►How can we optimize the cost of a query?
QUERY METRICS: COST
1. ► The execution time of a query depends on the resources needed to
execute its operations, such as storage disks, CPUs, RAM and, in parallel
and distributed systems, communication between threads and processes.
2. ► So, to optimize the cost of queries, we need to optimize our operational
resources.
► Data transfer to and from disks is substantially slower than memory-based
transfer, and making it faster can be costly for a typical organization
because of the high cost of the equipment needed.
► Disk accesses usually represent an overwhelming majority of the total cost,
particularly for very large databases that cannot be pre-loaded into memory.
► With today's computers, the CPU cost can also be insignificant compared to
disk access for many operations.
► The cost of accessing a disk is usually measured in terms of the number of
blocks transferred from and to the disk.
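►Example: a rough sketch of counting cost in block transfers. The file size and blocking factor are assumed values, and the two helper functions are hypothetical. The binary-search figure is the worst-case number of probes over the blocks of a file sorted on the search key.

```python
import math


def blocks_for_full_scan(num_records, records_per_block):
    """Blocks transferred when every record in the file must be read."""
    return math.ceil(num_records / records_per_block)


def max_blocks_for_binary_search(num_records, records_per_block):
    """Worst-case blocks probed by binary search over a file sorted on the key."""
    blocks = blocks_for_full_scan(num_records, records_per_block)
    # For b >= 1 blocks, binary search needs at most floor(log2(b)) + 1 probes,
    # which equals the bit length of b.
    return blocks.bit_length()


# Assumed file: 1,000,000 records stored 100 per block.
scan_cost = blocks_for_full_scan(1_000_000, 100)
search_cost = max_blocks_for_binary_search(1_000_000, 100)
print(scan_cost, search_cost)
```

Under these assumptions, a full scan transfers 10,000 blocks while a binary search on the sorted file touches at most 14.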
► THE ROLE OF INDEXES:
Q: What is the role of indexes in query optimization? Explain with an
example.
►Using indexes can dramatically reduce the execution time of
various complex and large query operations such as SELECT and JOIN.
Let us review one type of index file structure and the role it plays in
reducing execution time:
►Primary Index:
The data file is ordered by the attribute that is also the search key in the
index file. It is also referred to as an Index-Sequential File. For
scanning through a relation's records in sequential order by a key
value, this is one of the fastest and most efficient structures for
locating a record.
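►Example: a minimal sketch of a lookup through a primary index, assuming a sparse index that stores the first key of each block. The block size, the keys, and the helper names are hypothetical; blocks are modeled as in-memory lists rather than disk pages.

```python
from bisect import bisect_right

RECORDS_PER_BLOCK = 4  # assumed blocking factor

# Data file ordered by key (10, 20, ..., 200) and split into blocks.
records = [(key, f"name-{key}") for key in range(10, 210, 10)]
blocks = [
    records[i:i + RECORDS_PER_BLOCK]
    for i in range(0, len(records), RECORDS_PER_BLOCK)
]

# Sparse primary index: the first key stored in each block.
index_keys = [block[0][0] for block in blocks]


def index_lookup(key):
    """Find a record via the index. Returns (record, data_blocks_read)."""
    position = bisect_right(index_keys, key) - 1
    if position < 0:
        return None, 0  # key is smaller than every key in the file
    for record in blocks[position]:
        if record[0] == key:
            return record, 1
    return None, 1


def scan_lookup(key):
    """Find a record by reading blocks in order. Returns (record, data_blocks_read)."""
    blocks_read = 0
    for block in blocks:
        blocks_read += 1
        for record in block:
            if record[0] == key:
                return record, blocks_read
    return None, blocks_read


print(index_lookup(160))
print(scan_lookup(160))
```

The index narrows the search to one data block, while the sequential scan reads every block up to the one holding the key.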
► Why do we sometimes need a structured algorithm to
form an optimized query?
►Queries are ultimately reduced to a number of smaller
operations on the underlying physical table structures.
For each relational operation, there can be several different
access paths to the particular records needed. Choosing
suitable algorithms helps reduce the work a query does and
increase its efficiency.
►The query execution engine can have a multitude of specialized algorithms
designed to process particular relational operation and access path
combinations. We will look at some examples of algorithms for both the
select and join operations.
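►Example: a sketch of two well-known join algorithms, nested-loop join and hash join, applied to the same inputs. The sample rows and function names are assumptions for illustration; both functions work on in-memory lists and ignore disk blocks.

```python
from collections import defaultdict


def nested_loop_join(left, right, key):
    """Compare every left row with every right row."""
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in right
        if left_row[key] == right_row[key]
    ]


def hash_join(left, right, key):
    """Build a hash table on the left input, then probe it once per right row."""
    buckets = defaultdict(list)
    for left_row in left:
        buckets[left_row[key]].append(left_row)
    joined = []
    for right_row in right:
        for left_row in buckets.get(right_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


def as_pairs(rows):
    """Order-independent view of a join result, used to compare algorithms."""
    return sorted((row["name"], row["product"]) for row in rows)


employees = [
    {"employee_id": 1, "name": "Ann"},
    {"employee_id": 2, "name": "Bob"},
    {"employee_id": 3, "name": "Cleo"},
]
orders = [
    {"employee_id": 1, "product": "Laptop"},
    {"employee_id": 3, "product": "Phone"},
    {"employee_id": 1, "product": "Monitor"},
]

nested_result = nested_loop_join(employees, orders, "employee_id")
hash_result = hash_join(employees, orders, "employee_id")
print(as_pairs(nested_result))
```

Both algorithms produce the same set of joined rows; they differ in how much work they do and in the order of the output.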
►What do we mean by Query Decomposition in DBMS?
Definition:
Query decomposition separates the elements of a query so that each part
can be processed and optimized easily by the database. Using the required
database schema, the query then executes faster and more reliably.
►What is a database schema?
Definition:
►A database schema is a logical grouping of database objects such as
tables, fields, and queries.
►What is a database table?
►A table is a logical subunit of information that is organized into rows and columns.

►Queries against a database can be fast or slow, depending on many factors:
the size of the table, the amount of data the query requests, etc.
►What do we mean by Approaches to Query Decomposition
in DBMS?
1. Managing a large quantity of structured queries.
2. Sharing data: multiple users use and manipulate data.
3. Controlling access to data and maintaining data integrity.
4. Data independence: the separation of the physical and
logical data.
5. The mapping between the logical and physical data should
be provided by the database query schema.
►What are the Basic types of query files?
1. Linear query files
➢ A sequence of records with a fixed format, usually stored in
a single file.
➢ Limitation: single file.
➢ Example queries:
✓ select * from Salesperson
✓ select "Last-Name", "First-Name" from Salesperson
✓ insert into persons values ('hetland', 'camilla', 'hagabakka 24', 'sandnes')

2. Hierarchical structure
This type of query works on records that have a hierarchical structure.
➢ Limitations:
A. It may require duplicating records from different tables.
B. Updates become problematic if the number of fields
becomes too large.
C. Retrieval requires knowing the structure of the data (limited
data independence).

➢ Example query:
select employees.name, orders.product
from employees
inner join orders on (employees.employee_id = orders.employee_id)
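►Example: a small sketch that runs the join above with Python's standard `sqlite3` module. The table definitions and sample rows are assumptions; the `ORDER BY` clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, employee_id INTEGER, product TEXT);
INSERT INTO employees VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1, 'Laptop'), (11, 1, 'Monitor');
""")

rows = conn.execute("""
    SELECT employees.name, orders.product
    FROM employees
    INNER JOIN orders ON (employees.employee_id = orders.employee_id)
    ORDER BY orders.order_id
""").fetchall()

print(rows)
```

In this sample, the employee name repeats once per matching order, and the employee with no orders does not appear at all.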
3. Tree structure:
➢ This type of query contains records that may or may not be
retrieved, depending on the conditions used in the branches of the query.
➢ Limitation:
➢ The user must check the conditions carefully to
retrieve the required information.

➢ Example query:
select * from persons
where (firstname='tove' or firstname='stephen')
and lastname='svendson'
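►Example: a sketch of why the conditions need care, using Python's standard `sqlite3` module. The two-column `persons` table and its rows are assumptions. In SQL, AND binds more tightly than OR, so dropping the parentheses changes which rows are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (firstname TEXT, lastname TEXT)")
conn.executemany(
    "INSERT INTO persons VALUES (?, ?)",
    [
        ("tove", "svendson"),
        ("stephen", "svendson"),
        ("tove", "hansen"),
        ("ola", "svendson"),
    ],
)

# As written above: (tove OR stephen) AND svendson.
with_parentheses = conn.execute("""
    SELECT firstname, lastname FROM persons
    WHERE (firstname='tove' OR firstname='stephen') AND lastname='svendson'
    ORDER BY firstname, lastname
""").fetchall()

# Without parentheses this is read as: tove OR (stephen AND svendson).
without_parentheses = conn.execute("""
    SELECT firstname, lastname FROM persons
    WHERE firstname='tove' OR firstname='stephen' AND lastname='svendson'
    ORDER BY firstname, lastname
""").fetchall()

print(with_parentheses)
print(without_parentheses)
```

Without the parentheses, the row for tove hansen is also returned, because the lastname condition then applies only to stephen.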
►What are the Transformation rules used in query
optimization?
1. Query normalization:
This achieves the following goals:
A. Isolate the different types of operators at different levels
in the expression.
B. Eliminate operators where possible.
C. Perform the selection operations.

2. Representative rules:
A. Follow the SQL command syntax.
B. Keep the data separate from the database schema.
3. Simplification rules:
Simplifying the query is critical, so we have to keep the query
simple in format.
A. Don't use relational algebra operations when they are not necessary.
B. Don't use a join when it is not necessary.
C. Other simplifications.

4. Semantic optimization:
This means using only as many conditions and blocks of data
as are necessary to obtain the outputs.
►What is a parallel database?
►Parallel database systems:
Used to improve performance by parallelizing
various operations such as loading data and executing
queries. Data may be distributed, but purely for
performance reasons.
►What is a distributed database?
►Distributed database systems:
Data is physically stored across various sites, each
of which runs a DBMS and can function
independently. Data distribution is determined by
local ownership and availability, in addition to
performance.

Parallelism is natural to DBMS processing.

➢ What is pipeline parallelism?
►Pipeline parallelism: many machines, each doing one
step in a multi-step process.

➢ What is partition parallelism?
►Partition parallelism: many machines doing the
same thing to different pieces of data.
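►Example: a single-machine sketch of both ideas. Python generators stand in for pipeline stages that pass rows along one at a time, and worker threads stand in for the machines of partition parallelism. The operator names, the sample data, and the four-way round-robin split are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


# Pipelining: each stage hands rows to the next one at a time,
# so no intermediate result is stored in full.
def scan(rows):
    for row in rows:
        yield row


def select_ict(rows):
    for row in rows:
        if row["profession"] == "ICT":
            yield row


def project_salary(rows):
    for row in rows:
        yield row["salary"]


# Partitioning: split the data, run the same pipeline on each piece,
# then combine the partial results.
def partition(rows, parts):
    """Round-robin split of rows into `parts` pieces."""
    return [rows[i::parts] for i in range(parts)]


def partial_sum(rows):
    return sum(project_salary(select_ict(scan(rows))))


employees = [
    {"profession": "ICT" if i % 2 == 0 else "Finance", "salary": 1000 + i}
    for i in range(100)
]

pipelined_total = sum(project_salary(select_ict(scan(employees))))

with ThreadPoolExecutor(max_workers=4) as pool:
    partitioned_total = sum(pool.map(partial_sum, partition(employees, 4)))

print(pipelined_total, partitioned_total)
```

Both approaches compute the same total; the partitioned version simply splits the work into four independent pieces.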
►Speed-Up
More resources means proportionally less
time for a given amount of data: the problem size
is constant and the system grows.
►Scale-Up
If resources are increased in proportion to the
increase in data size, time is constant: the
problem size and the system both grow.
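►Example: the two measures are commonly written as ratios of elapsed times. The symbols below are notation chosen here, with $T$ standing for elapsed time.

```latex
\text{speed-up} =
  \frac{T_{\text{small system}}}{T_{\text{large system}}}
  \quad \text{(same problem size)}

\text{scale-up} =
  \frac{T_{\text{small system, small problem}}}{T_{\text{large system, large problem}}}
```

Speed-up is linear when a system with $N$ times the resources gives a speed-up of $N$. Scale-up is linear when the ratio stays at 1 as the problem and the system grow together.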
DATABASE MANAGEMENT
SYSTEMS (DBMS II)

END of CHAPTER 02
QUERY PROCESSING
&
OPTIMIZATION
THANK YOU FOR YOUR TIME
