
Database Systems

CSE 414
Lecture 18: Parallel Databases
(Ch. 20.1)



Announcements
• HW4 is due tomorrow at 11pm



Why compute in parallel?
• Multi-cores:
– Most processors have multiple cores
– This trend will increase in the future

• Big data: too large to fit in main memory
– Distributed query processing on 100-1000 servers
– Widely available now using cloud services



Big Data
• Companies, organizations, and scientists have data that is too big (and sometimes too complex) to manage without changing tools and processes

• Complex data processing:
– Decision support queries (SQL w/ aggregates)
– Machine learning (adds linear algebra and iteration)



Two Kinds of Parallel Data Processing
• Parallel databases, developed starting in the 1980s (this lecture)
– OLTP (Online Transaction Processing)
– OLAP (Online Analytic Processing, or Decision Support)
• General-purpose distributed processing: MapReduce, Spark
– Mostly for Decision Support Queries



Performance Metrics for Parallel DBMSs
P = the number of nodes (processors, computers)
• Speedup:
– More nodes, same data ⇒ higher speed
• Scaleup:
– More nodes, more data ⇒ same speed

• OLTP: “Speed” = transactions per second (TPS)
• Decision Support: “Speed” = query time
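
Written as ratios of running times (a standard formalization, not on the slide; T(P, D) denotes the time to run the workload on P nodes over data of size D):

```latex
\mathrm{speedup}(P) = \frac{T(1,\,D)}{T(P,\,D)}         % linear (ideal) speedup: = P
\mathrm{scaleup}(P) = \frac{T(1,\,D)}{T(P,\,P \cdot D)} % linear (ideal) scaleup: = 1
```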



Linear vs. Non-linear Speedup
[Figure: y-axis = speedup, x-axis = # nodes (=P), ticks ×1, ×5, ×10, ×15; linear (ideal) speedup is a straight line, while non-linear speedup falls below it as P grows]


Linear vs. Non-linear Scaleup
[Figure: y-axis = batch scaleup, x-axis = # nodes (=P) AND data size, ticks ×1, ×5, ×10, ×15; ideal scaleup stays flat, while non-linear scaleup degrades as P and the data grow together]
Challenges to Linear Speedup and Scaleup
• Startup cost
– Cost of starting an operation on many nodes

• Interference
– Contention for resources between nodes

• Stragglers
– Slowest node becomes the bottleneck
Architectures for Parallel Databases
• Shared memory

• Shared disk

• Shared nothing



Shared Memory
[Diagram: processors P, P, P connect through an interconnection network to a global shared memory, which connects to disks D, D, D]
Shared Disk
[Diagram: each processor P has its own memory M; all processors connect through an interconnection network to shared disks D, D, D]
Shared Nothing

[Diagram: processors P, P, P communicate only through the interconnection network; each processor has its own memory M and its own disk D]
A Professional Picture…

From: Greenplum (now EMC) Database Whitepaper

SAN = “Storage Area Network”



Shared Memory
• Nodes share both RAM and disk
• Dozens to hundreds of processors

Example: SQL Server runs on a single machine and can leverage many threads to get a query to run faster (see query plans)

• Easy to program and use

• But very expensive to scale: these machines are the last remaining cash cows in the hardware industry



Shared Disk
• All nodes access the same disks
• Found in the largest "single-box" (non-cluster) multiprocessors

Oracle dominates this class of systems.

Characteristics:
• Also hard to scale past a certain point:
existing deployments typically have fewer
than 10 machines



Shared Nothing
• Cluster of machines on high-speed network
• Each machine has its own memory and disk:
– lowest contention

NOTE: Because all machines today have many cores and many disks, shared-nothing systems typically run many "nodes" on a single physical machine.

Characteristics:
• Today, this is the most scalable architecture.
• Most difficult to administer and tune.

We discuss only Shared Nothing in class


Approaches to Parallel Query Evaluation

• Inter-query parallelism
– Transaction per node
– OLTP

• Inter-operator parallelism
– Operator per node
– Both OLTP and Decision Support

• Intra-operator parallelism
– Operator on multiple nodes
– Decision Support

[Figure: the example plan Product ⋈pid=pid Purchase ⋈cid=cid Customer, drawn once per approach: several complete plans side by side (inter-query), one plan with each operator on its own node (inter-operator), and one plan with a single operator spread over several nodes (intra-operator)]

We study only intra-operator parallelism: most scalable
Single Node Query Processing (Review)
Given relations R(A, B) and S(B, C), no indexes:

• Selection: σA=123(R)
– Scan file R, select records with A=123

• Group-by: γA, sum(B)(R)
– Scan file R, insert into a hash table using attr. A as key
– When a new key is equal to an existing one, add B to the value

• Join: R ⋈ S
– Scan file S, insert into a hash table using attr. B as key
– Scan file R, probe the hash table using attr. B
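
A minimal Python sketch of these two hash-based operators (illustrative; the tuple layouts are assumptions, not the textbook's code):

```python
from collections import defaultdict

# Group-by: one scan of R, aggregating into a hash table keyed on A.
def groupby_sum(R):                       # R: iterable of (a, b) pairs
    sums = defaultdict(int)
    for a, b in R:
        sums[a] += b                      # key already present: add B to the value
    return dict(sums)

# Join: build a hash table on S.B, then probe it with every R tuple.
def hash_join(R, S):                      # R: (a, b) pairs, S: (b, c) pairs
    table = defaultdict(list)
    for b, c in S:                        # build phase: scan S
        table[b].append(c)
    return [(a, b, c) for a, b in R       # probe phase: scan R
                      for c in table[b]]
```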



Distributed Query Processing
• Data is horizontally partitioned across
many servers

• Operators may require data reshuffling
– not all the needed data is in one place



Horizontal Data Partitioning

[Figure: a table with columns K, A, B is split row-by-row across servers 1, 2, …, P; each server stores one chunk of the tuples]

Which tuples go to what server?



Horizontal Data Partitioning
• Block Partition:
– Partition tuples arbitrarily s.t. size(R1) ≈ … ≈ size(RP)

• Hash partitioned on attribute A:
– Tuple t goes to chunk i, where i = h(t.A) mod P + 1

• Range partitioned on attribute A:
– Partition the range of A into -∞ = v0 < v1 < … < vP = ∞
– Tuple t goes to chunk i, if vi-1 < t.A ≤ vi
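
The three schemes above as Python functions (a sketch; tuples are reduced to their partitioning attribute, and h is Python's built-in hash):

```python
import bisect
from itertools import cycle

def block_partition(R, P):
    # Any assignment with roughly equal chunk sizes works; round-robin is simplest.
    chunks = [[] for _ in range(P)]
    for t, i in zip(R, cycle(range(P))):
        chunks[i].append(t)
    return chunks

def hash_chunk(a, P):
    # Tuple with t.A = a goes to chunk i = h(a) mod P + 1
    return hash(a) % P + 1

def range_chunk(a, boundaries):
    # boundaries = [v1, ..., v_{P-1}]; chunk i gets tuples with v_{i-1} < a <= v_i
    return bisect.bisect_left(boundaries, a) + 1
```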



Parallel GroupBy
Data: R(K, A, B, C)
Query: γA, sum(C)(R)

How can we compute it in each case?
• R is hash-partitioned on A (easy case!)
• R is block-partitioned
• R is hash-partitioned on K



Parallel GroupBy

Data: R(K, A, B, C)
Query: γA, sum(C)(R)
• R is block-partitioned or hash-partitioned on K
[Figure: chunks R1, R2, …, RP are reshuffled on attribute A into chunks R1', R2', …, RP'; each server then computes the group-by locally on its new chunk]
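
A sketch of this plan in Python, simulating the servers as lists (groupby_sum is the single-node operator from the review slide; routing uses hash(a) mod P as on the partitioning slide):

```python
def parallel_groupby_sum(partitions):     # partitions[i] = list of (a, c) pairs on server i
    P = len(partitions)
    # Phase 1: reshuffle on the grouping attribute A, so that all tuples
    # with the same value of A land on the same server.
    shuffled = [[] for _ in range(P)]
    for part in partitions:
        for a, c in part:
            shuffled[hash(a) % P].append((a, c))
    # Phase 2: every server aggregates its own chunk locally, in parallel.
    return [groupby_sum(chunk) for chunk in shuffled]
```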
Parallel Join
• Data: R(K1, A, B), S(K2, B, C)
• Query: R(K1, A, B) ⋈ S(K2, B, C)
Initially, both R and S are horizontally partitioned on K1 and K2

[Figure: server i initially holds chunks Ri and Si; R is reshuffled on R.B and S on S.B, producing chunks R'i and S'i; each server then computes the join locally]
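
The same two-phase pattern for the join, again as a Python simulation (hash_join is the single-node operator from the review slide):

```python
def parallel_join(R_parts, S_parts):      # R tuples: (a, b); S tuples: (b, c)
    P = len(R_parts)
    R_shuf = [[] for _ in range(P)]
    S_shuf = [[] for _ in range(P)]
    # Phase 1: reshuffle both relations on the join attribute B.
    for part in R_parts:
        for a, b in part:
            R_shuf[hash(b) % P].append((a, b))
    for part in S_parts:
        for b, c in part:
            S_shuf[hash(b) % P].append((b, c))
    # Phase 2: matching tuples are now co-located; each server joins locally.
    return [hash_join(R_shuf[i], S_shuf[i]) for i in range(P)]
```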


Data: R(K1, A, B), S(K2, B, C)
Query: R(K1, A, B) ⋈ S(K2, B, C)
Partition (before the shuffle):
R1 (on M1): {(K1=1, B=20), (K1=2, B=50)}
S1 (on M1): {(K2=101, B=50), (K2=102, B=50)}
R2 (on M2): {(K1=3, B=20), (K1=4, B=20)}
S2 (on M2): {(K2=201, B=20), (K2=202, B=50)}

Shuffle on B (here h sends B=20 to M1 and B=50 to M2):
R1' (on M1): {(1, 20), (3, 20), (4, 20)}
S1' (on M1): {(201, 20)}
R2' (on M2): {(2, 50)}
S2' (on M2): {(101, 50), (102, 50), (202, 50)}

Local join: each Mi joins Ri' with Si' on B.


Speedup and Scaleup
• Consider:
– Query: γA, sum(C)(R)
– Runtime: dominated by reading chunks from disk
• If we double the number of nodes P, what is
the new running time?
– Half (each server holds ½ as many chunks)
• If we double both P and the size of R, what is
the new running time?
– Same (each server holds the same # of chunks)
Uniform Data vs. Skewed Data
• Let R(K, A, B, C); which of the following partition methods may result in skewed partitions?

• Block partition: uniform

• Hash-partition:
– on the key K: uniform, assuming a good hash function
– on the attribute A: may be skewed; e.g., when all records have the same value of attribute A, all records end up in the same partition
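
A quick way to see the skewed case (hypothetical values):

```python
# If every record has the same value of A, then h(t.A) mod P + 1 is the same
# for all of them, so a single server receives the entire relation:
P = 8
targets = {hash('same value') % P + 1 for _ in range(1000)}
assert len(targets) == 1   # maximal skew: one partition holds everything
```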
Loading Data into a Parallel DBMS
Example using Teradata System

[Figure: a row being loaded must be routed to the server that owns it ("need to figure out where it belongs…")]

AMP = “Access Module Processor” = unit of parallelism



Order(oid, pid, date), Product(pid, …)

Example Parallel Query Execution

Find all orders from today, along with the items ordered

SELECT *
FROM Order o, Product p
WHERE o.pid = p.pid
  AND o.date = today()

[Logical plan: ⋈ o.pid = p.pid at the root, with Product p as one input and σ date = today() over Order o as the other]


Order(oid, pid, date), Product(pid, …)

Example Parallel Query Execution

[Figure: step 1, the Order side of the plan (⋈ o.pid = p.pid over σ date = today() over scan Order o). Each of AMP 1, AMP 2, AMP 3 scans its chunk of Order o, applies select date = today(), hashes each surviving tuple on h(o.pid), and ships it to the matching AMP]


Order(oid, pid, date), Product(pid, …)

Example Parallel Query Execution

[Figure: step 2, the Product side of the plan. Each of AMP 1, AMP 2, AMP 3 scans its chunk of Product p, hashes each tuple on h(p.pid), and ships it to the matching AMP]


Order(oid, pid, date), Product(pid, …)

Example Parallel Query Execution

[Figure: step 3, the local joins o.pid = p.pid. After the shuffles, AMP i contains all orders and all products with hash(pid) = i, so each AMP computes its share of the join locally]
