Professional Documents
Culture Documents
1
Course Content
High performance database systems: client-server databases, parallel and distributed
databases, cloud databases; Transaction oriented computing: transaction models, flat
transactions, nested transactions, distributed transactions, long-lived transactions,
transaction processing monitors; Concurrency control: isolation theorems, locking,
nested transaction locking, scheduling and deadlock, deadlock detection and
management; Failure and recovery; Replica management, Transactional and tuple
oriented life system; Transaction and database performance benchmarks; NoSQL
systems: data models, system architecture, transactions, elasticity, and optimizations.
Reference Book:
Principles of Distributed Database Systems - M. Tamer Özsu,
Patrick Valduriez
2
Introduction
What is a distributed DBMS
History
Distributed DBMS promises
Design issues
Distributed DBMS architecture
3
Distributed Computing
4
Current Distribution – Geographically
Distributed Data Centers
5
What is a Distributed Database System?
6
What is not a DDBS?
7
Distributed DBMS Environment
8
Implicit Assumptions
9
Important Point
Logically integrated
but
Physically distributed
10
Outline
Introduction
What is a distributed DBMS
History
Distributed DBMS promises
Design issues
Distributed DBMS architecture
11
History – File Systems
12
History – Database Management
13
History – Early Distribution
Peer-to-Peer (P2P)
14
History – Client/Server
15
History – Data Integration
16
History – Cloud Computing
17
Data Delivery Alternatives
Delivery modes
Pull-only
Push-only
Hybrid
Frequency
Periodic
Conditional
Ad-hoc or irregular
Communication Methods
Unicast
One-to-many
Note: not all combinations make sense
18
Outline
Introduction
What is a distributed DBMS
History
Distributed DBMS promises
Design issues
Distributed DBMS architecture
19
Distributed DBMS Promises
Improved performance
22
Transparent Access
Tokyo
SELECT ENAME,SAL
Boston Paris
FROM EMP,ASG,PAY
WHERE DUR > 12 Paris projects
Paris employees
AND EMP.ENO = ASG.ENO Communication Paris assignments
Network Boston employees
AND PAY.TITLE = EMP.TITLE
Boston projects
Boston employees
Boston assignments
Montreal
New
Montreal projects
York Paris projects
Boston projects New York projects
New York employees with budget > 200000
New York projects Montreal employees
New York assignments Montreal assignments
23
Distributed Database - User View
Distributed Database
24
Distributed DBMS - Reality
User
Query
User
DBMS
Application
Software
DBMS
Software
DBMS Communication
Software Subsystem
User
DBMS User Application
Software Query
DBMS
Software
User
Query
25
Types of Transparency
Data independence
Network transparency (or distribution transparency)
Location transparency
Fragmentation transparency
Fragmentation transparency
Replication transparency
26
Reliability Through Transactions
Data replication
Great for read-intensive workloads, problematic for updates
Replication protocols
27
Potentially Improved Performance
Parallelism in execution
Inter-query parallelism
Intra-query parallelism
28
Scalability
29
Outline
Introduction
What is a distributed DBMS
History
Distributed DBMS promises
Design issues
Distributed DBMS architecture
30
Distributed DBMS Issues
31
Distributed DBMS Issues
32
Distributed DBMS Issues
Replication
Mutual consistency
Freshness of copies
Eager vs lazy
Centralized vs distributed
Parallel DBMS
Objectives: high scalability and performance
Not geo-distributed
Cluster computing
33
Related Issues
34
Outline
Introduction
What is a distributed DBMS
History
Distributed DBMS promises
Design issues
Distributed DBMS architecture
35
DBMS Implementation Alternatives
36
Dimensions of the Problem
Distribution
Whether the components of the system are located on the same machine or not
Heterogeneity
Various levels (hardware, communications, operating system)
DBMS important one
data model, query language,transaction management algorithms
Autonomy
Not well understood and most troublesome
Various versions
Design autonomy: Ability of a component DBMS to decide on issues related to its own
design.
Communication autonomy: Ability of a component DBMS to decide whether and how to
communicate with other DBMSs.
Execution autonomy: Ability of a component DBMS to execute local operations in any
manner it wants to.
37
Client/Server Architecture
38
Advantages of Client-Server
Architectures
More efficient division of labor
Horizontal and vertical scaling of resources
Better price/performance on client machines
Ability to use familiar tools on client machines
Client access to remote data (via standards)
Full DBMS functionality provided to client workstations
Overall better system price/performance
39
Database Server
40
Distributed Database Servers
41
Peer-to-Peer Component Architecture
42
MDBS Components & Execution
43
Mediator/Wrapper Architecture
44
Cloud Computing
PaaS – Platform-as-a-Service
SaaS – Software-as-a-Service
DaaS – Database-as-a-Service
45
Simplified Cloud Architecture
46