
Chapter 1

Introduction
Outline
v What is a distributed database system?
v Promises of DDBSs
v Complicating Factors
v Problem Areas
From File System to DBMS
[Figure: With a file system, Program 1 (Deposit/Withdraw), Program 2 (Transfer), Program 3 (Print stmt) and Program 4 (Customer Information) work directly on File 1 (current accounts), File 2 (saving accounts) and File 3 (customers). With a DBMS, the same four programs go through the DBMS to reach a single Bank Database. The DBMS handles data description, storage, manipulation and control.]
Example
v Multinational manufacturing company:
w headquarters in New York
w manufacturing plants in Chicago and Montreal
w warehouses in Phoenix and Edmonton
w R&D facilities in San Francisco
v Data and Information:
w employee records (working location)
w projects (R&D)
w engineering data (manufacturing plants, R&D)
w inventory (manufacturing, warehouse)
Features

v Data are distributed over sites (e.g. employee, inventory).
v Queries (e.g. “get employees who are younger than 45”) involve more than one site.
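The example query can be sketched in Python as follows. This is only an illustration under stated assumptions: the employee records, the `SITES` dictionary and the function names are hypothetical, and in-memory lists stand in for the databases at each site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    name: str
    age: int
    location: str


# Hypothetical fragments: each site stores only the employees who work there.
SITES = {
    "New York": [Employee("Avery", 52, "New York"), Employee("Blake", 38, "New York")],
    "Chicago": [Employee("Casey", 44, "Chicago")],
    "Montreal": [Employee("Devon", 47, "Montreal"), Employee("Emery", 29, "Montreal")],
}


def local_query(site, max_age):
    """Runs at one site: select the local employees younger than max_age."""
    return [e for e in SITES[site] if e.age < max_age]


def global_query(max_age):
    """Runs at a coordinator: send the subquery to every site and union the results."""
    result = []
    for site in SITES:
        result.extend(local_query(site, max_age))
    return sorted(result, key=lambda e: e.name)


young = global_query(45)
print([e.name for e in young])  # ['Blake', 'Casey', 'Emery']
```

No single site can answer the query alone, because each one holds only its own fragment of the employee data.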
Distributed Database System Technology

[Figure: Database Technology (integration) and Computer Networks (distributed computing) combine to form the Distributed Database System.]
v The key is integration, not centralization
v Distributed database technology attempts to achieve integration
without centralization.
DDBS = Database + Network
v Distributed database system technology is the union
of what appear to be opposed approaches to data
processing
w DDBS = database + network
w A database integrates the operational data of an enterprise to
provide centralized, controlled access to that data.
w A computer network promotes a mode of work that goes
against all centralization efforts and facilitates distributed
computing.
Distributed Computing
v A distributed computing system consists of
w a number of autonomous processing elements (not
necessarily homogeneous), which
– are interconnected by a computer network;
– cooperate in performing their assigned tasks.

v What is distributed?
w Processing Logic
w Function
w Data
w Control
v All these are necessary and important for distributed
database technology.
What is a Distributed Database System?
v A distributed database (DDB) is a collection of
multiple, logically interrelated databases, distributed
over a computer network
w i.e., storing data on multiple computers (nodes) over the
network
v A distributed database management system
(DDBMS) is the software that
w manages the DDB;
w provides an access mechanism that makes this distribution
transparent to the users.
v Distributed database system (DDBS):
w DDBS = DDB + DDBMS
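A rough sketch of what "transparent to the users" can mean in practice is given below. The `Site` and `DDBMS` classes, the catalog layout and the sample rows are all assumptions made for illustration; they do not describe any particular product.

```python
class Site:
    """A node that stores some relations locally (an in-memory stand-in)."""

    def __init__(self, name):
        self.name = name
        self.relations = {}

    def scan(self, relation):
        return list(self.relations.get(relation, []))


class DDBMS:
    """Hypothetical facade: users name relations, never sites."""

    def __init__(self, sites):
        self.sites = sites
        # Global catalog: relation name -> names of the sites holding a fragment.
        self.catalog = {}
        for site in sites.values():
            for relation in site.relations:
                self.catalog.setdefault(relation, []).append(site.name)

    def select(self, relation, predicate=lambda row: True):
        """Collect matching rows from every site that stores the relation."""
        rows = []
        for site_name in self.catalog.get(relation, []):
            rows.extend(r for r in self.sites[site_name].scan(relation) if predicate(r))
        return rows


phoenix = Site("Phoenix")
phoenix.relations["inventory"] = [{"part": "bolt", "qty": 500}]
edmonton = Site("Edmonton")
edmonton.relations["inventory"] = [{"part": "gear", "qty": 40}]
san_francisco = Site("San Francisco")
san_francisco.relations["projects"] = [{"project": "R&D-1"}]

ddbms = DDBMS({s.name: s for s in (phoenix, edmonton, san_francisco)})
low_stock = ddbms.select("inventory", lambda row: row["qty"] < 100)
print(low_stock)  # [{'part': 'gear', 'qty': 40}]
```

The caller asks for `inventory` without knowing that it is split between two warehouses; the catalog lookup inside `select` hides the distribution.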
What is not a DDBS?
v A timesharing computing system
v A loosely or tightly coupled multiprocessor system
v A database system which resides at one of the
nodes of a network of computers
w This is a centralized database on a network node
Centralized DBMS on a Network
[Figure: six sites (Site 1 to Site 6) connected by a communication network; the database resides at a single site.]

v Data resides at only one node.
v Database management is the same as in a
centralized DBMS.
v Remote processing: single server, multiple clients
Distributed DBMS Environment

[Figure: six sites (Site 1 to Site 6) connected by a communication network.]
Applications of DDBMS
v Manufacturing – especially multi-plant manufacturing
v Military command and control
v Airlines
v Hotel chains
v Any organization with a decentralized
organizational structure
Reasons for Data Distribution
v Several factors led to the development of DDBSs
w Distributed nature of some database applications
w Increased reliability and availability
w Allowing data sharing while maintaining some measure of
local control
w Improved performance
Implicit Assumptions
v Data stored at a number of sites.
w Each site has processing power
v Processors at different sites are interconnected by a
computer network
v Distributed database is a database, not a collection
of files
w Data logically related as exhibited in the users’ access
patterns (e.g., relational data model)
v DDBMS is a fully-fledged DBMS
w Not remote file systems
Design Issues of Distributed Systems
v Must be transparent
v Provide flexibility
v Be reliable
w Design should not require the simultaneous functioning of a substantial
number of critical components
w More redundancy gives greater availability, but also a greater risk of inconsistency
w Fault tolerance, the ability to mask failures from the user
v Good performance
w Important (the rest are useless without this)
w Balance number of messages and grain size of distributed
computations
v Scalable
w A maxim for developing distributed systems:
w Avoid centralized components, tables, and algorithms
w Only decentralized algorithms should be used
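The link between redundancy and availability in the reliability point above can be made concrete with a small calculation. It is a sketch that assumes sites fail independently and that a data item is available whenever at least one copy is reachable; the 0.9 figure is an arbitrary example value.

```python
def availability(p_site_up, replicas):
    """Probability that at least one of `replicas` independent copies is reachable."""
    return 1 - (1 - p_site_up) ** replicas


for n in (1, 2, 3):
    print(n, round(availability(0.9, n), 4))
# 1 0.9
# 2 0.99
# 3 0.999
```

Each extra copy raises availability, but every update must then reach every copy, which is where the risk of inconsistency comes from.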
Characteristics of Decentralized Algorithms
v No machine has complete information about the
state of the system.
v Machines make decisions based only on locally
available information.
v Failure of one machine does not ruin the algorithm.
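One well-known algorithm with the first two properties is pairwise gossip averaging, sketched below as a single-process simulation. The function name, the ring topology and the starting values are assumptions chosen for illustration, and the sketch does not model machine failures.

```python
import random


def gossip_average(values, neighbors, rounds, seed=0):
    """Simulate pairwise gossip: each step uses only two neighbors' local values."""
    rng = random.Random(seed)
    state = dict(values)
    nodes = sorted(state)
    for _ in range(rounds):
        node = rng.choice(nodes)             # some machine wakes up
        peer = rng.choice(neighbors[node])   # it contacts one of its own neighbors
        mean = (state[node] + state[peer]) / 2
        state[node] = state[peer] = mean     # both keep the pairwise average
    return state


# Hypothetical four-machine ring; no machine ever sees all four values.
values = {"A": 10.0, "B": 20.0, "C": 30.0, "D": 40.0}
ring = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}

estimates = gossip_average(values, ring, rounds=200)
print(estimates)
```

Every exchange preserves the sum of the two values involved, so on a connected network the local estimates drift toward the global mean (25.0 here) even though no machine holds the complete state.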
Promises of DDBSs
v Transparent management of distributed,
fragmented, and replicated data
v Improved reliability and availability through
distributed transactions
v Improved performance
v Easier and more economical system expansion
