
Defence Institute of Advanced Technology

Advanced database management system


(Subject code: CSMSCD506)

Parallel databases
Deekshita S Iyer
(23-60-02)
Interoperation Parallelism
Different operations in a query expression are executed in parallel

Pipelined Parallelism

Consider a join of four relations:

[Figure: pipeline stages P1, P2, P3]

● Useful with a small number of processors, but does not scale up well.

● Pipeline chains do not attain sufficient length to provide a high degree of parallelism.

● Not all relational operators can be pipelined (for example, set difference).
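
A minimal sketch of the pipelined idea, assuming four hypothetical relations r1(a), r2(a, b), r3(b, c), and r4(c, d) stored as lists of dicts. Each stage (labelled P1 to P3 here as an assumption) runs on its own thread and passes every joined tuple to the next stage as soon as it is produced. Python threads only illustrate the structure; they do not give true CPU parallelism.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a tuple stream


def join_stage(inbox, outbox, relation, key):
    """Join each incoming tuple with `relation` on `key` and emit matches at once."""
    index = {}
    for row in relation:
        index.setdefault(row[key], []).append(row)
    while True:
        row = inbox.get()
        if row is DONE:
            outbox.put(DONE)  # pass the end-of-stream marker downstream
            return
        for match in index.get(row[key], []):
            outbox.put({**row, **match})


def pipelined_join(r1, r2, r3, r4):
    """Compute r1 join r2 join r3 join r4 with one thread per join (hypothetical schema)."""
    q1, q2, q3, out = (queue.Queue() for _ in range(4))
    stages = [
        threading.Thread(target=join_stage, args=(q1, q2, r2, "a")),   # P1
        threading.Thread(target=join_stage, args=(q2, q3, r3, "b")),   # P2
        threading.Thread(target=join_stage, args=(q3, out, r4, "c")),  # P3
    ]
    for t in stages:
        t.start()
    for row in r1:  # feed the first relation into the pipeline
        q1.put(row)
    q1.put(DONE)
    results = []
    while (row := out.get()) is not DONE:
        results.append(row)
    for t in stages:
        t.join()
    return results
```

The pipeline has only three stages for a four-way join, which reflects the limit noted above: the chain length, and so the degree of parallelism, is bounded by the number of operations.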

Independent Parallelism

Operations in a query expression that do not depend on one another can be executed in parallel.

temp1 ←

in parallel with

temp2 ←
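
A minimal sketch of independent parallelism. The relation names, schemas, and join keys are assumptions made for illustration: temp1 is taken to be the join of r1(a, b) with r2(a, x), and temp2 the join of r3(b, c) with r4(c, d). Neither join needs the other's result, so both can be submitted at the same time and combined afterwards.

```python
from concurrent.futures import ThreadPoolExecutor


def join(left, right, key):
    """Simple in-memory hash join of two lists of dicts on `key`."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def independent_join(r1, r2, r3, r4):
    """Compute temp1 and temp2 concurrently, then join the two results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(join, r1, r2, "a")  # temp1 (assumed: r1 join r2)
        f2 = pool.submit(join, r3, r4, "c")  # temp2 (assumed: r3 join r4)
        temp1, temp2 = f1.result(), f2.result()
    # The final join depends on both temporaries, so it runs afterwards.
    return join(temp1, temp2, "b")
```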

Query Optimization
The process of selecting an efficient execution plan for processing a query is known as query optimization.

It involves:

● Analyzing the query, data, schema, and available resources
● Generating different execution plans
● Selecting the plan with the least estimated cost
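
The three steps above can be sketched with a toy cost model. Everything here is an assumption for illustration: plans are left-deep join orders, the cost of a plan is the sum of its estimated intermediate result sizes, and every join uses the same fixed selectivity. Real optimizers use far richer statistics and search strategies.

```python
from itertools import permutations


def estimate_cost(order, cardinality, selectivity=0.01):
    """Toy estimate: sum of intermediate result sizes for a left-deep join order."""
    size = cardinality[order[0]]
    cost = 0.0
    for name in order[1:]:
        size = size * cardinality[name] * selectivity  # estimated join output size
        cost += size
    return cost


def choose_plan(cardinality):
    """Generate every join order and return the one with the least estimated cost."""
    return min(permutations(cardinality),
               key=lambda order: estimate_cost(order, cardinality))
```

For example, with hypothetical cardinalities `{"r1": 1000, "r2": 10, "r3": 100}`, the cheapest order under this model joins the two small relations first and the large one last.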

Making Choices for Parallel Systems

● How to parallelize each operation?

● How many processors to use for it?

● What operations to pipeline across different processors?

● What operations to execute independently in parallel, and what operations to execute sequentially, one after the other?


● Allocating Resources: deciding how many processors and disks, and how much memory, each task needs.

● Balancing Parallelism: it is sometimes better not to execute certain operations in parallel.

● Avoiding Long Pipelines: too many tasks lined up one after the other can lead to inefficiencies.
Choosing Plans

● Consider only evaluation plans that parallelize every operation across all processors.

● Choose the most efficient sequential evaluation plan, and then parallelize the operations in that plan.

Design of parallel systems
A large parallel database system must also address these availability issues:

Problem: Resilience to failure of some processors or disks.

Solution: Data are replicated across at least two processors.

Problem: Online reorganization of data and schema changes

Solution: Operations such as insertions, deletions, and updates continue to be processed while the reorganization is in progress.
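
A minimal sketch of the replication solution to the first problem, assuming a simple placement rule that is not taken from the slides: partition p keeps its primary copy on node p mod n and a backup on the next node. The function names are hypothetical. With every partition on two distinct nodes, the failure of any single node leaves all data reachable.

```python
def place_replicas(num_partitions, num_nodes):
    """Map each partition to a (primary, backup) pair of distinct nodes."""
    if num_nodes < 2:
        raise ValueError("replication needs at least two nodes")
    return {p: (p % num_nodes, (p + 1) % num_nodes)
            for p in range(num_partitions)}


def available_partitions(placement, failed_nodes):
    """Return the partitions that still have at least one live copy."""
    return {p for p, nodes in placement.items()
            if any(n not in failed_nodes for n in nodes)}
```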

New parallel database products: Netezza, DATAllegro (which was acquired by Microsoft), Greenplum, and Aster Data.


Parallelism versus Raw Speed

Moore's law: the number of transistors in an integrated circuit (IC) doubles every 18-24 months.

Fast processors are power inefficient.

Modern processors typically are not one single processor but rather consist of several processors on one chip.

The term core is used for an on-chip processor.


Cache Memory and Multithreading

Threads
Threads are virtual sequences of instructions given to a CPU.

Multithreading is a form of parallelization or dividing up work for simultaneous processing.
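
A minimal sketch of dividing work among threads, assuming a hypothetical task of counting the rows that satisfy a predicate. The rows are split into one chunk per thread, each thread counts its own chunk, and the partial counts are added up. In CPython the global interpreter lock limits real CPU parallelism, so this shows the structure rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def count_matching(rows, predicate, num_threads=4):
    """Count rows satisfying `predicate`, splitting the scan across threads."""
    # Round-robin split: thread i gets rows i, i + num_threads, ...
    chunks = [rows[i::num_threads] for i in range(num_threads)]

    def count_chunk(chunk):
        return sum(1 for row in chunk if predicate(row))

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return sum(pool.map(count_chunk, chunks))
```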

Adapting Database System Design for Modern Architectures

● Increasing concurrency -> more data held in cache -> more cache misses.

● Concurrent transactions need some form of concurrency control, which can cause waiting or loss of work due to transaction aborts. To avoid this, we want the actions of transactions to conflict as little as possible, but this can increase the amount of data needed in cache, resulting in more cache misses.

● A large number of concurrent transactions may not fully utilize modern processors.

● To optimize, enable multiple cores to collaborate on a single transaction.

● The database query processor must parallelize queries without overloading cache.

● Achieve this by creating pipelines of database operations.

● Also, seek opportunities to parallelize individual database tasks.

THANK YOU
