Client-Server Architectures
Nasrullah Memon
File Systems:
MOTIVATION
[Figure: file processing. Programs 1, 2, and 3 each carry their own data description and access Files 1, 2, and 3.]
[Figure: database management. Application programs (with data semantics) share one database through common description, manipulation, and control.]
[Figure: database technology (integration) and computer networks (distribution) combine into distributed database systems (integration)]
Distributed Computing
• Synonymous terms
– distributed data processing
– multiprocessors/multicomputers
– satellite processing
– backend processing
– dedicated/special purpose
computers
– timeshared systems
– functionally modular systems
– Peer to Peer Systems
• Processing logic
• Functions
• Data
• Control
[Figure: Sites 1 through 5 connected by a communication network]
Implicit Assumptions
• Data is stored at a number of sites; each site
logically consists of a single processor.
• Processors at different sites are
interconnected by a computer network, so
multiprocessors are excluded
– those are parallel database systems
• A distributed database is a database, not a
collection of files: the data is logically related, as
exhibited in the users’ access patterns
– relational data model
• A D-DBMS is a full-fledged DBMS
– not a remote file system, not a TP system
F7S/MIT – Database Systems Page 12
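Shared-Memory Architecture
[Figure: processors P1 ... Pn share a single memory M and disk D]
Shared-Nothing Architecture
[Figure: processors P1 ... Pn, each with its own memory M1 ... Mn and its own disk D1 ... Dn]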
– Replication transparency
– Fragmentation transparency
• horizontal fragmentation: selection
• vertical fragmentation: projection
• hybrid
Example
EMP
ENO  ENAME      TITLE
E1   J. Doe     Elect. Eng.
E2   M. Smith   Syst. Anal.
E3   A. Lee     Mech. Eng.
E4   J. Miller  Programmer
E5   B. Casey   Syst. Anal.
E6   L. Chu     Elect. Eng.
E7   R. Davis   Mech. Eng.
E8   J. Jones   Syst. Anal.
ASG
ENO  PNO  RESP        DUR
E1   P1   Manager     12
E2   P1   Analyst     24
E2   P2   Analyst     6
E3   P3   Consultant  10
E3   P4   Engineer    48
E4   P2   Programmer  18
E5   P2   Manager     24
E6   P4   Manager     48
E7   P3   Engineer    36
E7   P5   Engineer    23
E8   P3   Manager     40
PROJ
PNO  PNAME              BUDGET
P1   Instrumentation    150000
P2   Database Develop.  135000
P3   CAD/CAM            250000
P4   Maintenance        310000
PAY
TITLE        SAL
Elect. Eng.  40000
Syst. Anal.  34000
Mech. Eng.   27000
Programmer   24000
SELECT ENAME, SAL
FROM EMP, ASG, PAY
WHERE DUR > 12
AND EMP.ENO = ASG.ENO
AND PAY.TITLE = EMP.TITLE
[Figure: Tokyo, Boston, Paris, Montreal, and New York connected by a communication network; projects, employees, and assignments data are fragmented by site, and some fragments (for example, Boston employees and Paris projects) appear at more than one site]
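The query above can be tried against the sample EMP, ASG, and PAY tuples from the example. The following is a minimal sketch using Python's built-in sqlite3 module on a single in-memory database. It only illustrates the answer a user would see under full transparency and says nothing about how a distributed DBMS would place or ship the data. The function name run_example_query is an illustrative choice, not something from the slides.

```python
import sqlite3

# Sample tuples copied from the EMP, ASG, and PAY tables of the example.
EMP = [("E1", "J. Doe", "Elect. Eng."), ("E2", "M. Smith", "Syst. Anal."),
       ("E3", "A. Lee", "Mech. Eng."), ("E4", "J. Miller", "Programmer"),
       ("E5", "B. Casey", "Syst. Anal."), ("E6", "L. Chu", "Elect. Eng."),
       ("E7", "R. Davis", "Mech. Eng."), ("E8", "J. Jones", "Syst. Anal.")]
ASG = [("E1", "P1", "Manager", 12), ("E2", "P1", "Analyst", 24),
       ("E2", "P2", "Analyst", 6), ("E3", "P3", "Consultant", 10),
       ("E3", "P4", "Engineer", 48), ("E4", "P2", "Programmer", 18),
       ("E5", "P2", "Manager", 24), ("E6", "P4", "Manager", 48),
       ("E7", "P3", "Engineer", 36), ("E7", "P5", "Engineer", 23),
       ("E8", "P3", "Manager", 40)]
PAY = [("Elect. Eng.", 40000), ("Syst. Anal.", 34000),
       ("Mech. Eng.", 27000), ("Programmer", 24000)]


def run_example_query():
    """Load the sample data into one in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE EMP (ENO TEXT, ENAME TEXT, TITLE TEXT)")
    cur.execute("CREATE TABLE ASG (ENO TEXT, PNO TEXT, RESP TEXT, DUR INTEGER)")
    cur.execute("CREATE TABLE PAY (TITLE TEXT, SAL INTEGER)")
    cur.executemany("INSERT INTO EMP VALUES (?, ?, ?)", EMP)
    cur.executemany("INSERT INTO ASG VALUES (?, ?, ?, ?)", ASG)
    cur.executemany("INSERT INTO PAY VALUES (?, ?)", PAY)
    rows = cur.execute(
        """
        SELECT ENAME, SAL
        FROM EMP, ASG, PAY
        WHERE DUR > 12
          AND EMP.ENO = ASG.ENO
          AND PAY.TITLE = EMP.TITLE
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for ename, sal in run_example_query():
        print(ename, sal)
```

Because the query does not eliminate duplicates, an employee with several qualifying assignments shows up once per assignment.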
Distributed Database
[Figure: user queries and user applications reach DBMS software at each site; the sites cooperate through a communication subsystem]
• Parallelism in execution
– Inter-query parallelism
– Intra-query parallelism
• Communication overhead
Query Processing
– convert user transactions to data manipulation
instructions
– optimization problem
– min{cost = data transmission + local processing}
– general formulation is NP-hard
Distributed DBMS Issues
• Concurrency Control
– Synchronization of concurrent accesses
– Consistency and isolation of transactions' effects
– Deadlock management
• Reliability
– How to make the system resilient to failures
– Atomicity and durability
• Privacy/Security
– Keep database access private
– Protect against malicious activities
[Figure: relationships among the issues: distribution design, query processing, reliability, concurrency control, and deadlock management]
Related Issues
• Operating System Support
– operating system with proper support for
database operations
– dichotomy between general purpose
processing requirements and database
processing requirements
• Open Systems and Interoperability
– Distributed Multidatabase Systems
– More probable scenario
– Parallel issues
• Network Behavior
Architecture of a Database System
[Figure: architecture diagram; the surviving labels are the conceptual view and the conceptual schema]
Standardization
Reference Model
– A conceptual framework whose purpose is to divide
standardization work into manageable pieces and to show at a
general level how these pieces are related to one another.
Approaches
– Component-based
• Components of the system are defined together with the
interrelationships between components.
• Good for design and implementation of the system.
– Function-based
• Classes of users are identified together with the functionality
that the system will provide for each class.
• The objectives of the system are clearly identified. But how do
you achieve these objectives?
– Data-based
• Identify the different types of describing data and specify the
functional units that will realize and/or use data according to
these views.
Conceptual Schema Definition
RELATION EMP [
KEY = {ENO}
ATTRIBUTES = {
ENO : CHARACTER(9)
ENAME : CHARACTER(15)
TITLE : CHARACTER(10)
}
]
RELATION PAY [
KEY = {TITLE}
ATTRIBUTES = {
TITLE : CHARACTER(10)
SAL : NUMERIC(6)
}
]
⇓
INTERNAL_REL EMPL [
INDEX ON E# CALL EMINX
FIELD = {
HEADER: BYTE(1)
E# : BYTE(9)
ENAME : BYTE(15)
TIT : BYTE(10)
}
]
[Figure: DBMS implementation alternatives along the autonomy and heterogeneity dimensions, including client/server, federated DBMS, and multi-DBMS]
[Figure: client/server over a LAN. The client sends high-level requests and receives filtered data only; the server provides communications and DBMS services on top of the database.]
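Task Distribution
[Figure: division of work between client and server.
Client: Application; QL Interface ... Programmatic Interface; Communications Manager.
Between them: the SQL query travels to the server and the result table travels back.
Server: Communications Manager; Query Optimizer; Lock Manager; Storage Manager; Page & Cache Manager; Database.]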
Server-to-Server
[Figure: clients run applications and client services (SQL interface, programmatic interface, other application support environments) over communications software and a LAN; each server runs communications software on top of its own database, and servers also communicate with one another. Remaining labels: GS, GRM, GQO.]
Directory Issues
[Figure: directory alternatives by type and location, ranging from global & central & replicated (?) to local & distributed & replicated]
Distribution Design
• Top-down
– mostly in designing systems from scratch
• Bottom-up
– when the databases already exist at a number
of sites
• How to fragment?
• How to allocate?
• Information requirements?
Fragmentation
• Can't we just distribute relations?
• What is a reasonable unit of distribution?
– relation
• views are subsets of relations
• extra communication
– fragments of relations (sub-relations)
• concurrent execution of a number of transactions that
access different portions of a relation
• views that cannot be defined on a single fragment will
require extra processing
• semantic data control (especially integrity enforcement)
more difficult
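Fragmentation Alternatives – Horizontal
PROJ1: projects with budgets less than $200,000
PROJ2: projects with budgets greater than or equal to $200,000
PROJ
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston
PROJ1
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York
PROJ2
PNO  PNAME              BUDGET  LOC
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston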
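Horizontal fragmentation is a selection over whole tuples, so the split above can be sketched in a few lines of Python. This is a minimal illustration, assuming each tuple is modelled as a dict; the helper name horizontal_fragment is hypothetical.

```python
# PROJ tuples from the slide, one dict per tuple.
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "Database Develop.", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000, "LOC": "Paris"},
    {"PNO": "P5", "PNAME": "CAD/CAM", "BUDGET": 500000, "LOC": "Boston"},
]


def horizontal_fragment(relation, predicate):
    """Selection: keep the complete tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# The two fragments use complementary predicates on BUDGET.
PROJ1 = horizontal_fragment(PROJ, lambda row: row["BUDGET"] < 200000)
PROJ2 = horizontal_fragment(PROJ, lambda row: row["BUDGET"] >= 200000)

if __name__ == "__main__":
    print([row["PNO"] for row in PROJ1])
    print([row["PNO"] for row in PROJ2])
```

Because the two predicates are complementary, every tuple lands in exactly one fragment and the union of the fragments gives back PROJ.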
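Fragmentation Alternatives – Vertical
PROJ1: information about project budgets
PROJ2: information about project names and locations
PROJ
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston
PROJ1
PNO  BUDGET
P1   150000
P2   135000
P3   250000
P4   310000
P5   500000
PROJ2
PNO  PNAME              LOC
P1   Instrumentation    Montreal
P2   Database Develop.  New York
P3   CAD/CAM            New York
P4   Maintenance        Paris
P5   CAD/CAM            Boston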
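Vertical fragmentation is a projection, and the original relation is recovered by joining the fragments on the key. The sketch below assumes that the key PNO is kept in both fragments so that the join is possible; the helper names vertical_fragment and join_on_key are hypothetical.

```python
# PROJ tuples from the slide, one dict per tuple.
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "Database Develop.", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000, "LOC": "Paris"},
    {"PNO": "P5", "PNAME": "CAD/CAM", "BUDGET": 500000, "LOC": "Boston"},
]


def vertical_fragment(relation, attributes):
    """Projection: keep only the listed attributes of every tuple."""
    return [{name: row[name] for name in attributes} for row in relation]


def join_on_key(left, right, key):
    """Equi-join of two fragments on a shared key attribute."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


# Assumption: the key PNO is repeated in both fragments.
PROJ1 = vertical_fragment(PROJ, ["PNO", "BUDGET"])
PROJ2 = vertical_fragment(PROJ, ["PNO", "PNAME", "LOC"])
REBUILT = join_on_key(PROJ1, PROJ2, "PNO")

if __name__ == "__main__":
    print(REBUILT == PROJ)
```

Without the shared key there would be no way to match a budget to its project name, which is why the key is repeated in both fragments here.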
[Figure: degree of fragmentation, ranging from individual tuples or attributes up to whole relations]
Correctness of Fragmentation
• Completeness
– Decomposition of relation R into fragments R1, R2, ..., Rn is
complete if and only if each data item in R can also be found
in some Ri
• Reconstruction
– If relation R is decomposed into fragments R1, R2, ..., Rn,
then there should exist some relational operator ∇ such that
R = ∇_{1≤i≤n} R_i
• Disjointness
– If relation R is decomposed into fragments R1, R2, ..., Rn,
and data item di is in Rj, then di should not be in any other
fragment Rk (k ≠ j ).
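The three rules can be checked mechanically for a horizontal fragmentation, where the reconstruction operator ∇ is union. The sketch below is one possible way to do this, assuming tuples are hashable Python tuples and relations have set semantics; the function names are hypothetical.

```python
def is_complete(relation, fragments):
    """Completeness: every tuple of R appears in at least one fragment."""
    return all(any(row in fragment for fragment in fragments) for row in relation)


def is_disjoint(fragments):
    """Disjointness: no tuple appears in more than one fragment."""
    seen = set()
    for fragment in fragments:
        for row in fragment:
            if row in seen:
                return False
            seen.add(row)
    return True


def reconstructs_by_union(relation, fragments):
    """Reconstruction with union as the operator: the union of fragments equals R."""
    union = set()
    for fragment in fragments:
        union.update(fragment)
    return union == set(relation)


# (PNO, BUDGET) pairs taken from the PROJ example.
R = [("P1", 150000), ("P2", 135000), ("P3", 250000), ("P4", 310000), ("P5", 500000)]
GOOD = [R[:2], R[2:]]          # split at budget 200000
LOSSY = [R[:2], R[3:]]         # P3 is missing from every fragment
OVERLAPPING = [R[:3], R[2:]]   # P3 is stored in both fragments

if __name__ == "__main__":
    for name, frags in [("good", GOOD), ("lossy", LOSSY), ("overlapping", OVERLAPPING)]:
        print(name, is_complete(R, frags), is_disjoint(frags), reconstructs_by_union(R, frags))
```

For a vertical fragmentation the same idea applies with join in place of union, and disjointness is then checked on the non-key attributes only.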
Query Processing
[Figure: a high-level user query is handed to the query processor]
• Query optimization
– How do we determine the “best” execution
plan?
Selecting Alternatives
SELECT ENAME
FROM EMP,ASG
WHERE EMP.ENO = ASG.ENO
AND DUR > 37
Notation: Π project, σ select, × Cartesian product, ⋈ join
Strategy 1
ΠENAME(σDUR>37∧EMP.ENO=ASG.ENO (EMP × ASG))
Strategy 2
ΠENAME(EMP ⋈ENO (σDUR>37 (ASG)))
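Both strategies return the same names, but they build intermediate results of very different sizes. The following sketch evaluates the two expressions naively on the sample EMP and ASG tuples and reports the size of the first intermediate result. It is an illustration only; the function names and the use of plain Python lists are assumptions, not a description of how a real query processor works.

```python
# Sample tuples: EMP(ENO, ENAME, TITLE) and ASG(ENO, PNO, RESP, DUR).
EMP = [("E1", "J. Doe", "Elect. Eng."), ("E2", "M. Smith", "Syst. Anal."),
       ("E3", "A. Lee", "Mech. Eng."), ("E4", "J. Miller", "Programmer"),
       ("E5", "B. Casey", "Syst. Anal."), ("E6", "L. Chu", "Elect. Eng."),
       ("E7", "R. Davis", "Mech. Eng."), ("E8", "J. Jones", "Syst. Anal.")]
ASG = [("E1", "P1", "Manager", 12), ("E2", "P1", "Analyst", 24),
       ("E2", "P2", "Analyst", 6), ("E3", "P3", "Consultant", 10),
       ("E3", "P4", "Engineer", 48), ("E4", "P2", "Programmer", 18),
       ("E5", "P2", "Manager", 24), ("E6", "P4", "Manager", 48),
       ("E7", "P3", "Engineer", 36), ("E7", "P5", "Engineer", 23),
       ("E8", "P3", "Manager", 40)]


def strategy_1(emp, asg):
    """Cartesian product first, then one selection, then the projection."""
    product = [(e, a) for e in emp for a in asg]               # EMP x ASG
    selected = [(e, a) for e, a in product
                if a[3] > 37 and e[0] == a[0]]                 # DUR > 37 and ENO match
    names = sorted(e[1] for e, _ in selected)                  # project ENAME
    return names, len(product)


def strategy_2(emp, asg):
    """Selection on ASG first, then the join with EMP, then the projection."""
    long_assignments = [a for a in asg if a[3] > 37]           # DUR > 37
    emp_by_eno = {e[0]: e for e in emp}
    joined = [(emp_by_eno[a[0]], a) for a in long_assignments
              if a[0] in emp_by_eno]                           # join on ENO
    names = sorted(e[1] for e, _ in joined)                    # project ENAME
    return names, len(long_assignments)


if __name__ == "__main__":
    print(strategy_1(EMP, ASG))
    print(strategy_2(EMP, ASG))
```

On this data Strategy 1 materializes all 8 × 11 pairs before filtering, while Strategy 2 joins only the assignments that survive the selection.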
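[Figure: two ways to execute the query when ASG1 is stored at Site 1, ASG2 at Site 2, EMP1 at Site 3, EMP2 at Site 4, and the result is needed at Site 5]
Distributed plan
Site 1: ASG1’ = σDUR>37(ASG1), sent to Site 3
Site 2: ASG2’ = σDUR>37(ASG2), sent to Site 4
Site 3: EMP1’ = EMP1 ⋈ENO ASG1’, sent to Site 5
Site 4: EMP2’ = EMP2 ⋈ENO ASG2’, sent to Site 5
Site 5: result = EMP1’ ∪ EMP2’
Centralized plan
Sites 1 to 4 send ASG1, ASG2, EMP1, and EMP2 to Site 5
Site 5: result2 = (EMP1 ∪ EMP2) ⋈ENO σDUR>37(ASG1 ∪ ASG2)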
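A rough cost estimate shows why the choice between the two site-level plans matters. The sketch below is a back-of-the-envelope model only. Every number in it is an assumption chosen for illustration and does not come from the slides: 400 EMP tuples, 1000 ASG tuples, 20 ASG tuples that satisfy the selection, 1 unit per tuple access, and 10 units per tuple transfer. It also assumes that the local sites can reach qualifying tuples directly (for example through indexes), while the central site has to scan what it receives.

```python
# All figures below are illustrative assumptions, not values from the slides.
TUPLE_ACCESS = 1        # cost units to read one tuple
TUPLE_TRANSFER = 10     # cost units to ship one tuple between sites
EMP_SIZE = 400          # tuples in EMP1 and EMP2 together
ASG_SIZE = 1000         # tuples in ASG1 and ASG2 together
QUALIFYING_ASG = 20     # ASG tuples that satisfy the selection


def cost_distributed():
    """Select at the ASG sites, join at the EMP sites, union at the result site."""
    select_asg = QUALIFYING_ASG * TUPLE_ACCESS        # assumed index access at Sites 1 and 2
    ship_asg = QUALIFYING_ASG * TUPLE_TRANSFER        # selected tuples to Sites 3 and 4
    join_emp = QUALIFYING_ASG * 2 * TUPLE_ACCESS      # assumed: one ASG' read plus one EMP probe
    ship_result = QUALIFYING_ASG * TUPLE_TRANSFER     # joined tuples to Site 5
    return select_asg + ship_asg + join_emp + ship_result


def cost_centralized():
    """Ship every fragment to the result site and do all the work there."""
    ship_emp = EMP_SIZE * TUPLE_TRANSFER
    ship_asg = ASG_SIZE * TUPLE_TRANSFER
    select_asg = ASG_SIZE * TUPLE_ACCESS                     # full scan of shipped ASG
    join = EMP_SIZE * QUALIFYING_ASG * TUPLE_ACCESS          # assumed nested-loop join
    return ship_emp + ship_asg + select_asg + join


if __name__ == "__main__":
    print("distributed:", cost_distributed())
    print("centralized:", cost_centralized())
```

Under these assumptions the plan that filters and joins close to the data is cheaper by a wide margin, mostly because far fewer tuples cross the network.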
Select, Project (without duplicate elimination): O(n)
• Relation
– cardinality
– size of a tuple
– fraction of tuples participating in a join with
another relation
• Attribute
– cardinality of domain
– actual number of distinct values
• Common assumptions
– independence between different attribute
values
– uniform distribution of attribute values within
their domain
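The two common assumptions lead to a simple, widely used estimate: under uniform distribution an equality predicate on an attribute keeps about 1 / (number of distinct values) of the tuples, and under independence the selectivities of several predicates multiply. The sketch below applies that estimate; the function names and the statistics in the usage example are hypothetical.

```python
def equality_selectivity(distinct_values):
    """Uniformity assumption: each distinct value is equally likely."""
    return 1.0 / distinct_values


def estimated_cardinality(relation_cardinality, selectivities):
    """Independence assumption: selectivities of separate predicates multiply."""
    estimate = float(relation_cardinality)
    for selectivity in selectivities:
        estimate *= selectivity
    return estimate


if __name__ == "__main__":
    # Hypothetical statistics: 1000 tuples, 5 distinct RESP values, 50 distinct PNO values.
    card = estimated_cardinality(
        1000,
        [equality_selectivity(5), equality_selectivity(50)],
    )
    print(card)  # roughly 4 tuples expected to match both predicates
```

When attribute values are skewed or correlated, these estimates can be far off, which is one reason the assumptions are stated explicitly.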
Distributed Query Processing Methodology
[Figure: layered methodology. The input is a calculus query on distributed relations. Query decomposition uses the global schema. Global optimization takes a fragment query together with statistics on fragments and produces optimized local queries.]
Restructuring
• Convert relational calculus to relational algebra
• Make use of query trees
• Example
Find the names of employees other than J. Doe who worked on the
CAD/CAM project for either 1 or 2 years.
SELECT ENAME
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND ASG.PNO = PROJ.PNO
AND ENAME≠“J. Doe”
AND PROJ.PNAME=“CAD/CAM”
AND (DUR=12 OR DUR=24)
[Figure: query tree, from root to leaves: ΠENAME (project); σDUR=12 OR DUR=24, σPNAME=“CAD/CAM”, σENAME≠“J. DOE” (select); ⋈PNO and ⋈ENO (join) over PROJ, ASG, and EMP]
Restructuring –Transformation
Rules
• Commutativity of binary operations
– R × S ⇔ S × R
– R ⋈ S ⇔ S ⋈ R
– R ∪ S ⇔ S ∪ R
• Associativity of binary operations
– (R × S) × T ⇔ R × (S × T)
– (R ⋈ S) ⋈ T ⇔ R ⋈ (S ⋈ T)
• Idempotence of unary operations
– ΠA'(ΠA''(R)) ⇔ ΠA'(R)
– σp1(A1)(σp2(A2)(R)) = σp1(A1) ∧ p2(A2)(R)
where R[A] and A' ⊆ A, A'' ⊆ A and A' ⊆ A''
• Commuting selection with projection
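Some of these rules can be spot-checked on a small relation. The sketch below uses a few ASG tuples from the earlier example and verifies the cascade of selections, the idempotence of projection, and the commutativity of union. It is a check on one data set, not a proof, and the helper names select, project, and as_set are hypothetical.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection without duplicate elimination."""
    return [{name: row[name] for name in attributes} for row in relation]


def as_set(relation):
    """Order-insensitive view of a relation, for comparing unions."""
    return {tuple(sorted(row.items())) for row in relation}


# A few ASG tuples from the example.
ASG = [
    {"ENO": "E1", "PNO": "P1", "RESP": "Manager", "DUR": 12},
    {"ENO": "E2", "PNO": "P1", "RESP": "Analyst", "DUR": 24},
    {"ENO": "E5", "PNO": "P2", "RESP": "Manager", "DUR": 24},
    {"ENO": "E6", "PNO": "P4", "RESP": "Manager", "DUR": 48},
]


def long_duration(row):
    return row["DUR"] > 12


def is_manager(row):
    return row["RESP"] == "Manager"


# Cascade of selections equals one selection with the conjunction.
CASCADE = select(select(ASG, is_manager), long_duration)
COMBINED = select(ASG, lambda row: long_duration(row) and is_manager(row))

# Projection on A' after projection on A'' (with A' a subset of A'') equals projection on A'.
NESTED = project(project(ASG, ["ENO", "PNO", "DUR"]), ["ENO", "DUR"])
DIRECT = project(ASG, ["ENO", "DUR"])

# Union is commutative when relations are compared as sets.
R, S = ASG[:2], ASG[2:]
UNION_RS = as_set(R + S)
UNION_SR = as_set(S + R)

if __name__ == "__main__":
    print(CASCADE == COMBINED, NESTED == DIRECT, UNION_RS == UNION_SR)
```

A query optimizer relies on such equivalences to rewrite a query tree into a cheaper one without changing its result.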
Example
Recall the previous example:
Find the names of employees other than J. Doe who worked
on the CAD/CAM project for either one or two years.
SELECT ENAME
FROM PROJ, ASG, EMP
WHERE ASG.ENO=EMP.ENO
AND ASG.PNO=PROJ.PNO
AND ENAME≠“J. Doe”
AND PROJ.PNAME=“CAD/CAM”
AND (DUR=12 OR DUR=24)
[Figure: query tree, from root to leaves: ΠENAME (project); σDUR=12 OR DUR=24, σPNAME=“CAD/CAM”, σENAME≠“J. DOE” (select); ⋈PNO and ⋈ENO (join)]
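Restructuring
[Figure: restructured query tree with ΠENAME at the root, a join on PNO, an intermediate projection ΠPNO,ENAME, and a join on ENO]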