
Chapter 12

Distributed Database
Management Systems

Database Systems:
Design, Implementation, and Management,
Seventh Edition, Rob and Coronel

In this chapter, you will learn:

• What a distributed database management
system (DDBMS) is and what its components are
• How database implementation is affected by
different levels of data and process distribution
• How transactions are managed in a distributed
database environment
• How database design is affected by the
distributed database environment

Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 2
The Evolution of Distributed Database
Management Systems
• Distributed database management system
(DDBMS)
– Governs storage and processing of logically
related data over interconnected computer
systems in which both data and processing
functions are distributed among several sites

The Evolution of Distributed Database
Management Systems (continued)
• Centralized database required that corporate
data be stored in a single central site
• Dynamic business environment and
centralized database’s shortcomings
spawned a demand for applications based on
data access from different sources at multiple
locations


DDBMS Advantages and Disadvantages

• Advantages include:
– Data are located near “greatest demand” site
– Faster data access
– Faster data processing
– Growth facilitation
– Improved communications

DDBMS Advantages and Disadvantages
(continued)
• Advantages include (continued):
– Reduced operating costs
– User-friendly interface
– Less danger of a single-point failure
– Processor independence

DDBMS Advantages and Disadvantages
(continued)
• Disadvantages include:
– Complexity of management and control
– Security
– Lack of standards
– Increased storage requirements
– Increased training cost

Characteristics of Distributed Database
Management Systems
• Application interface
• Validation
• Transformation
• Query optimization
• Mapping
• I/O interface

Characteristics of Distributed Database
Management Systems (continued)
• Formatting
• Security
• Backup and recovery
• DB administration
• Concurrency control
• Transaction management

Characteristics of Distributed Database
Management Systems (continued)
• Must perform all the functions of centralized
DBMS
• Must handle all necessary functions imposed
by distribution of data and processing
– Must perform these additional functions
transparently to the end user


DDBMS Components

• Must include (at least) the following
components:
– Computer workstations
– Network hardware and software
– Communications media
– Transaction processor (application processor,
transaction manager)
• Software component found in each computer
that requests data


DDBMS Components (continued)

• Must include (at least) the following
components (continued):
– Data processor or data manager
• Software component residing on each
computer that stores and retrieves data located
at the site
• May be a centralized DBMS


Levels of Data and Process Distribution

Single-Site Processing,
Single-Site Data (SPSD)
• All processing is done on single CPU or host
computer (mainframe, midrange, or PC)
• All data are stored on host computer’s local
disk
• Processing cannot be done on end user’s
side of system

Single-Site Processing,
Single-Site Data (SPSD) (continued)
• Typical of most mainframe and midrange
computer DBMSs
• DBMS is located on host computer, which is
accessed by dumb terminals connected to it
• Also typical of first generation of single-user
microcomputer databases

Multiple-Site Processing,
Single-Site Data (MPSD)
• Multiple processes run on different computers
sharing single data repository
• MPSD scenario requires network file server
running conventional applications that are
accessed through LAN
• Many multiuser accounting applications,
running under personal computer network, fit
such a description

Multiple-Site Processing,
Multiple-Site Data (MPMD)
• Fully distributed database management
system with support for multiple data
processors and transaction processors at
multiple sites
• Classified as either homogeneous or
heterogeneous
• Homogeneous DDBMSs
– Integrate only one type of centralized DBMS
over a network
Multiple-Site Processing,
Multiple-Site Data (MPMD) (continued)
• Heterogeneous DDBMSs
– Integrate different types of centralized DBMSs
over a network
• Fully heterogeneous DDBMS
– Support different DBMSs that may even
support different data models (relational,
hierarchical, or network) running under
different computer systems, such as
mainframes and microcomputers
Distributed Database
Transparency Features
• Allow end user to feel like database’s only
user
• Features include:
– Distribution transparency
– Transaction transparency
– Failure transparency
– Performance transparency
– Heterogeneity transparency


Distribution Transparency

• Allows management of physically dispersed
database as though it were a centralized
database
• Following three levels of distribution
transparency are recognized:
– Fragmentation transparency
– Location transparency
– Local mapping transparency


Transaction Transparency

• Ensures database transactions will maintain
distributed database’s integrity and
consistency

Distributed Requests and Distributed
Transactions
• Distributed transaction
– Can update or request data from several
different remote sites on network
• Remote request
– Lets single SQL statement access data to be
processed by single remote database
processor
• Remote transaction
– Composed of several requests; accesses data
at single remote site
Distributed Requests and Distributed
Transactions (continued)
• Distributed transaction
– Allows transaction to reference several
different (local or remote) DP sites
• Distributed request
– Lets single SQL statement reference data
located at several different local or remote DP
sites
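
The four cases above can be told apart by two questions: how many DP sites does each SQL statement reference, and how many sites does the transaction reference overall? The sketch below is one way to encode that rule. The function name `classify`, the site labels, and the idea of passing each statement as a set of site names are assumptions made for illustration; they do not come from the text.

```python
def classify(requests):
    """Classify a unit of work by the DP sites it references.

    `requests` is a list with one entry per SQL statement; each entry is
    the set of DP site names that statement references (hypothetical input
    format, chosen only for this sketch).
    """
    if not requests or any(not sites for sites in requests):
        raise ValueError("every statement must reference at least one site")

    # A single statement that spans several sites is a distributed request.
    if any(len(sites) > 1 for sites in requests):
        return "distributed request"

    # Each statement touches one site, but the statements differ in site.
    all_sites = set().union(*requests)
    if len(all_sites) > 1:
        return "distributed transaction"

    # Everything goes to one remote site: one statement or several.
    return "remote request" if len(requests) == 1 else "remote transaction"


if __name__ == "__main__":
    print(classify([{"B"}]))            # one statement, one remote site
    print(classify([{"B"}, {"B"}]))     # several statements, one remote site
    print(classify([{"B"}, {"C"}]))     # several statements, several sites
    print(classify([{"B", "C"}]))       # one statement, several sites
```

The sketch assumes every referenced site is remote; handling a purely local statement would need the local site name as an extra argument.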


Distributed Concurrency Control

• Multisite, multiple-process operations are
much more likely to create data
inconsistencies and deadlocked transactions
than are single-site systems


Two-Phase Commit Protocol

• Distributed databases make it possible for
transaction to access data at several sites
• Final COMMIT must not be issued until all
sites have committed their parts of
transaction
• Two-phase commit protocol requires each
individual DP’s transaction log entry be
written before database fragment is actually
updated
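
A minimal in-memory sketch of the idea follows, assuming a coordinator that first asks every DP to prepare and only then tells them to commit. The class `DataProcessor`, the function `two_phase_commit`, and the log record labels are hypothetical names for this illustration; real DDBMSs add durable logs, timeouts, and recovery, all of which are omitted here.

```python
class DataProcessor:
    """Toy DP holding one database fragment and a transaction log."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # simulate a site that cannot prepare
        self.log = []                   # transaction log entries, in order
        self.data = {}                  # the database fragment itself
        self._pending = None

    def prepare(self, updates):
        """Phase 1: write the log entry BEFORE touching the fragment."""
        if not self.can_prepare:
            self.log.append(("NOT PREPARED", None))
            return False
        self.log.append(("PREPARED", dict(updates)))
        self._pending = dict(updates)
        return True

    def commit(self):
        """Phase 2: apply the logged updates to the fragment."""
        self.data.update(self._pending)
        self.log.append(("COMMITTED", None))
        self._pending = None

    def abort(self):
        """Phase 2 (failure path): discard pending updates."""
        self._pending = None
        self.log.append(("ABORTED", None))


def two_phase_commit(work):
    """Coordinator: `work` is a list of (DataProcessor, updates) pairs."""
    votes = [dp.prepare(updates) for dp, updates in work]
    if all(votes):
        for dp, _ in work:
            dp.commit()
        return "COMMIT"
    for dp, _ in work:
        dp.abort()
    return "ABORT"


if __name__ == "__main__":
    a, b = DataProcessor("A"), DataProcessor("B", can_prepare=False)
    print(two_phase_commit([(a, {"x": 1}), (b, {"y": 2})]))  # ABORT
    print(a.data, b.data)  # both fragments stay unchanged
```

Because no fragment is updated until every site has voted, a single site that cannot prepare leaves all fragments untouched.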
Performance Transparency
and Query Optimization
• Objective of query optimization routine is to
minimize total cost associated with execution
of request
• Costs associated with request are function of:
– Access time (I/O) cost
– Communication cost
– CPU time cost
• Must provide distribution transparency as well
as replica transparency
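
To make the cost idea concrete, the sketch below compares two candidate execution plans. It assumes the total cost is a plain sum of the three components; the slide only says cost is a function of them, so the additive model, the `Plan` class, the plan names, and the numbers are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """One candidate way to execute a request (hypothetical cost units)."""
    name: str
    io_cost: float    # access time (I/O) cost
    comm_cost: float  # communication cost between sites
    cpu_cost: float   # CPU time cost

    def total(self):
        # Assumed model: the three components simply add up.
        return self.io_cost + self.comm_cost + self.cpu_cost


def cheapest(plans):
    """Return the plan with the lowest total cost."""
    return min(plans, key=Plan.total)


if __name__ == "__main__":
    plans = [
        Plan("ship_all_rows", io_cost=5, comm_cost=40, cpu_cost=2),
        Plan("filter_remotely", io_cost=8, comm_cost=6, cpu_cost=3),
    ]
    best = cheapest(plans)
    print(best.name, best.total())
```

In this made-up example the plan that filters at the remote site wins because communication dominates the total, which is often the deciding factor in a distributed setting.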
Performance Transparency
and Query Optimization (continued)
• Replica transparency
– DDBMS’s ability to hide existence of multiple
copies of data from user
• Query optimization techniques include:
– Manual or automatic
– Static or dynamic
– Statistically based or rule-based algorithms


Distributed Database Design

• Data fragmentation
– How to partition database into fragments
• Data replication
– Which fragments to replicate
• Data allocation
– Where to locate those fragments and replicas


Data Fragmentation

• Breaks single object into two or more
segments or fragments
• Each fragment can be stored at any site over
computer network
• Information about data fragmentation is
stored in distributed data catalog (DDC), from
which it is accessed by TP to process user
requests


Data Fragmentation (continued)

• Strategies
– Horizontal fragmentation
• Division of a relation into subsets (fragments) of
tuples (rows)
– Vertical fragmentation
• Division of a relation into attribute (column)
subsets
– Mixed fragmentation
• Combination of horizontal and vertical
strategies
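
The three strategies can be sketched on a small in-memory relation. The table contents, attribute names, and helper function names below are made up for illustration. Note that each vertical fragment keeps the key attribute so the original rows can be rebuilt by joining the fragments.

```python
# Hypothetical CUSTOMER relation: one dict per tuple (row).
CUSTOMER = [
    {"cus_num": 10, "cus_name": "Sinex", "cus_state": "TN", "cus_balance": 700.0},
    {"cus_num": 11, "cus_name": "Martin", "cus_state": "FL", "cus_balance": 1200.0},
    {"cus_num": 12, "cus_name": "Mynux", "cus_state": "TN", "cus_balance": 0.0},
]


def horizontal_fragment(rows, predicate):
    """Subset of tuples: keep whole rows that satisfy the predicate."""
    return [dict(row) for row in rows if predicate(row)]


def vertical_fragment(rows, key, attributes):
    """Subset of attributes: keep the key plus the listed columns."""
    return [{name: row[name] for name in (key, *attributes)} for row in rows]


if __name__ == "__main__":
    # Horizontal: one fragment per state.
    tn_rows = horizontal_fragment(CUSTOMER, lambda r: r["cus_state"] == "TN")

    # Vertical: billing columns only, with the key repeated.
    billing = vertical_fragment(CUSTOMER, "cus_num", ["cus_balance"])

    # Mixed: horizontal first, then vertical on the result.
    tn_billing = vertical_fragment(tn_rows, "cus_num", ["cus_balance"])

    print(tn_rows)
    print(billing)
    print(tn_billing)
```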


Data Replication

• Storage of data copies at multiple sites
served by computer network
• Fragment copies can be stored at several
sites to serve specific information
requirements
– Can enhance data availability and response
time
– Can help to reduce communication and total
query costs


Data Replication (continued)

• Replication scenarios
– Fully replicated database
• Stores multiple copies of each database
fragment at multiple sites
• Can be impractical due to amount of overhead
– Partially replicated database
• Stores multiple copies of some database
fragments at multiple sites
• Most DDBMSs are able to handle the partially
replicated database well


Data Replication (continued)

• Replication scenarios (continued)
– Unreplicated database
• Stores each database fragment at single site
• No duplicate database fragments
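
The three scenarios differ only in how many sites hold each fragment, so a placement map is enough to tell them apart. The sketch below assumes such a map from fragment name to the set of sites storing a copy; the function name, fragment names, and site names are hypothetical.

```python
def replication_scenario(placement):
    """Classify a placement map of fragment name -> set of site names."""
    copies = [len(sites) for sites in placement.values()]
    if not copies or min(copies) < 1:
        raise ValueError("every fragment must be stored at one site or more")

    if all(count == 1 for count in copies):
        return "unreplicated"          # each fragment at a single site
    if all(count > 1 for count in copies):
        return "fully replicated"      # every fragment has multiple copies
    return "partially replicated"      # only some fragments have copies


if __name__ == "__main__":
    print(replication_scenario({"E1": {"NY"}, "E2": {"ATL"}}))
    print(replication_scenario({"E1": {"NY", "ATL"}, "E2": {"ATL"}}))
    print(replication_scenario({"E1": {"NY", "ATL"}, "E2": {"NY", "ATL"}}))
```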


Data Allocation

• Deciding where to locate data
• Allocation strategies
– Centralized data allocation
• Entire database is stored at one site
– Partitioned data allocation
• Database is divided into several disjoint parts
(fragments) and stored at several sites


Data Allocation (continued)

• Allocation strategies (continued)
– Replicated data allocation
• Copies of one or more database fragments are
stored at several sites

• Data distribution over computer network is
achieved through data partition, data
replication, or combination of both


Client/Server vs. DDBMS

• Client/server architecture: way in which
computers interact to form a system
• Features user of resources, or client, and
provider of resources, or server
• Can be used to implement a DBMS in which
client is the TP and server is the DP


Client/Server vs. DDBMS (continued)

• Client/server advantages
– Less expensive than alternate minicomputer
or mainframe solutions
– Allow end user to use microcomputer’s GUI,
thereby improving functionality and simplicity
– More people in job market have PC skills than
mainframe skills
– PC is well established in workplace


Client/Server vs. DDBMS (continued)

• Client/server advantages (continued)
– Numerous data analysis and query tools exist
to facilitate interaction with DBMSs available in
PC market
– Considerable cost advantage to offloading
applications development from mainframe to
powerful PCs


Client/Server vs. DDBMS (continued)

• Client/server disadvantages
– Creates more complex environment
• Different platforms (LANs, operating systems,
and so on) are often difficult to manage
– An increase in number of users and
processing sites often paves the way for
security problems


Client/Server vs. DDBMS (continued)

• Client/server disadvantages (continued)
– Possible to spread data access to much wider
circle of users
• Increases demand for people with broad
knowledge of computers and software
• Increases burden of training and cost of
maintaining the environment

C. J. Date’s Twelve Commandments for
Distributed Databases
• Local site independence
• Central site independence
• Failure independence
• Location transparency
• Fragmentation transparency
• Replication transparency

C. J. Date’s Twelve Commandments for
Distributed Databases (continued)
• Distributed query processing
• Distributed transaction processing
• Hardware independence
• Operating system independence
• Network independence
• Database independence


Summary

• Distributed database stores logically related data in
two or more physically independent sites connected
via computer network
• Distributed processing is division of logical database
processing among two or more network nodes
• Distributed databases require distributed processing
• Main components of DDBMS are transaction
processor and data processor


Summary (continued)

• Current database systems can be classified
by extent to which they support processing
and data distribution
• Homogeneous distributed database system
integrates only one particular type of DBMS
over computer network
• Heterogeneous distributed database system
integrates several different types of DBMSs
over computer network

Summary (continued)

• DDBMS characteristics are best described as set of
transparencies
• Transaction is formed by one or more database
requests
• Distributed concurrency control is required in network
of distributed databases
• Distributed DBMS evaluates every data request to
find optimum access path in distributed database


Summary (continued)

• The design of distributed database must
consider fragmentation and replication of data
• Database can be replicated over several
different sites on computer network
• Client/server architecture refers to way in
which two computers interact over computer
network to form a system
