Chapter 1: An Overview
• Introduction.
• A DDB supports global (distributed) applications, which access data at more than one site.
(E.g., a transfer of funds from an account at one bank branch to an account at another
branch, which updates the DB at both branches.)
• Each site
✔ Has autonomous processing capability.
✔ Can perform local applications.
✔ Participates in the execution of at least one global application, which requires accessing
data at several sites using a communication subsystem.
Centralized
• Cons:
- A bottleneck may occur.
- Single point of failure.
Distributed
• Emphasizes:
1. Distribution: the data is not at the same site (processor).
2. Logical correlation: some properties tie the data together.
Centralized vs Distributed DB Features
• Centralized Control
• Data independence
• Reduction of redundancy
• In DDB
• There is a hierarchical control structure based on a global DBA, who has central
responsibility for the whole DB.
• Local DBAs have responsibility for their respective local DBs.
✔ They may have a high degree of autonomy (site autonomy), up to the
point that a global DBA is not required and inter-site coordination is
performed by the local administrators themselves.
Data Independence
• Programs are written against a conceptual view of the data (the conceptual schema) and are
unaffected by changes in the physical organization of the data.
• In DDB
• Data independence has the same importance as in a traditional DB.
• A DDB introduces distribution transparency:
• Programs can be written as if the database were not distributed.
• The correctness of programs is unaffected by moving data from one site to
another, although their speed of execution may be affected.
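Distribution transparency can be illustrated with a minimal sketch. The `DistributedCatalog` class and its methods are hypothetical names introduced here, and the "sites" are plain in-memory dictionaries, not a real DDBMS. The point is only that a program reads a relation by its logical name, so moving the relation to another site does not change the result.

```python
class DistributedCatalog:
    """Hypothetical catalog mapping each logical relation to the site that stores it."""

    def __init__(self):
        self._location = {}  # relation name -> site name
        self._sites = {}     # site name -> {relation name: rows}

    def store(self, relation, site, rows):
        # If the relation already exists somewhere, drop the old copy first.
        if relation in self._location:
            self._sites[self._location[relation]].pop(relation)
        self._sites.setdefault(site, {})[relation] = list(rows)
        self._location[relation] = site

    def move(self, relation, new_site):
        # Physical relocation: only the catalog entry and the storage site change.
        old_site = self._location[relation]
        rows = self._sites[old_site].pop(relation)
        self._sites.setdefault(new_site, {})[relation] = rows
        self._location[relation] = new_site

    def site_of(self, relation):
        return self._location[relation]

    def read(self, relation):
        # The caller never names a site; the catalog resolves it.
        site = self._location[relation]
        return list(self._sites[site][relation])
```

A program that calls `read("ACCOUNT")` gets the same rows before and after `move("ACCOUNT", "site_2")`. In a real system only the access time would differ.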
Reduction of Redundancy
• In Traditional DB
• Redundancy was reduced for two reasons
1. There is only one copy of data shared by several applications –
inconsistency can be avoided
2. Storage space saved
• Redundancy was reduced by data sharing – by allowing several applications
to access the same files and records.
• In DDB
• Data redundancy is desirable because:
1. The availability of data can be increased if it is replicated at all sites.
2. The failure of one site does not stop execution, because
replicated data is present at other sites.
• The convenience of replicating a data item increases with the
ratio of retrieval accesses (served by any copy) to update accesses (applied to all copies)
performed by applications on it. This is a tradeoff: data can be retrieved from any copy, but
updates must be performed consistently on all copies.
[If retrievals dominate, more replication is desirable.]
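The tradeoff above can be made concrete with a small cost sketch. The cost model and every name in it (`access_cost`, `best_copy_count`, `local_cost`, `remote_cost`) are assumptions made for illustration and do not come from the chapter: requests are assumed to originate uniformly from all sites, a retrieval reads one copy (a local one when available), and an update is applied to every copy.

```python
def access_cost(retrievals, updates, copies, sites, local_cost=1, remote_cost=10):
    """Hypothetical total cost of a workload against one replicated data item."""
    if not 1 <= copies <= sites:
        raise ValueError("copies must be between 1 and sites")
    # A retrieval reads any single copy: local if the requesting site holds one.
    local_fraction = copies / sites
    retrieval_cost = retrievals * (
        local_fraction * local_cost + (1 - local_fraction) * remote_cost
    )
    # An update must reach every copy: one local write plus remote writes
    # (simplification: the updating site is assumed to hold a copy).
    update_cost = updates * (local_cost + (copies - 1) * remote_cost)
    return retrieval_cost + update_cost


def best_copy_count(retrievals, updates, sites, **costs):
    """Return the number of copies with the lowest cost under this model."""
    return min(
        range(1, sites + 1),
        key=lambda c: access_cost(retrievals, updates, c, sites, **costs),
    )
```

Under these assumptions, a read-heavy workload such as `best_copy_count(1000, 1, 4)` favors a copy at every site, while an update-heavy workload such as `best_copy_count(1, 1000, 4)` favors a single copy.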
Complex physical structures & efficient access
• In Traditional DB
• Secondary indexes, interfile chains, and similar structures.
• Used to obtain efficient access to data.
• In DDB
• Efficient access cannot be provided by such physical structures because:
1. They are very difficult to build and maintain across sites.
2. It is not convenient to navigate at the record level in a DDB.
• Instead, a distributed access plan can be produced by an optimizer.
✔ Global optimization: determines which data must be accessed at which
sites and which files must consequently be transmitted between sites (main parameter:
communication cost).
✔ Local optimization: decides how to perform the local DB accesses at each site.
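A global optimization decision can be sketched as follows, assuming communication cost is simply proportional to the number of bytes transmitted. The function `choose_execution_site` and its input format are hypothetical, and real optimizers consider many more factors. The sketch picks the site to which the fewest bytes must be shipped so that all needed files are in one place.

```python
def choose_execution_site(files):
    """Pick the site that minimizes the bytes shipped to bring all files together.

    `files` maps a file name to a (site, size_in_bytes) pair.
    Returns (site, transfers, cost), where each transfer is
    (file_name, from_site, to_site).
    """
    candidate_sites = sorted({site for site, _ in files.values()})
    best = None
    for candidate in candidate_sites:
        # Every file not already at the candidate site must be transmitted to it.
        transfers = [
            (name, site, candidate)
            for name, (site, _) in sorted(files.items())
            if site != candidate
        ]
        cost = sum(files[name][1] for name, _, _ in transfers)
        if best is None or cost < best[2]:
            best = (candidate, transfers, cost)
    return best
```

For example, with a 5000-byte `ACCOUNT` file at `site_A` and a 200-byte `BRANCH` file at `site_B`, the sketch ships `BRANCH` to `site_A`. How each site then reads its files is left to local optimization.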
Optimizers’ Design problems
Categories
• Failures
• Concurrency
Integrity, recovery and concurrency control
• DB integrity
• Transaction atomicity ensures DB integrity: either all the actions that take the
DB from one consistent state to another are performed, or the initial consistent
state is preserved.
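The all-or-nothing property can be sketched with the funds-transfer example from the introduction. This is only an illustration under strong assumptions: each branch database is a plain in-memory dictionary, the function name `transfer` is hypothetical, and no real distributed commit protocol, logging, or concurrency control is modeled.

```python
def transfer(source_db, target_db, source_acct, target_acct, amount):
    """Move `amount` between accounts held at two different branch databases.

    Either both the debit and the credit take effect, or neither does.
    """
    if amount <= 0:
        raise ValueError("amount must be positive")
    if source_db[source_acct] < amount:
        raise ValueError("insufficient funds")

    original = source_db[source_acct]
    source_db[source_acct] = original - amount  # action at the first site
    try:
        # Action at the second site; raises KeyError if the account is missing.
        target_db[target_acct] += amount
    except Exception:
        # Undo the debit so the initial consistent state is preserved.
        source_db[source_acct] = original
        raise
```

If the credit fails, the debit is undone before the error propagates, so the total amount of money across both branches is unchanged.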
• In DDB
• Local DBAs face the same problems as DBAs in a traditional DB.
• In a DDB with a very high degree of site autonomy, local DBAs are better protected,
because they enforce their own protection instead of relying on a central DBA.
• Communication networks are vulnerable to attacks, so security problems are
intrinsic to a DDB.
Why Distributed Databases?
• Why has DDB development just begun?
• Support for database administration and control: this feature includes tools
for monitoring the DB, gathering information about DB utilization, and providing a
global view of the data files existing at the various sites.
• Heterogeneous DDBMS:
• Uses at least two different DBMSs.
• Adds the problem of translating between the different data models of the
local DBMSs.
• Used when integrating preexisting DBs.
• Some systems support communication between different data communication (DC)
components (mainly developed for compatibility reasons in centralized systems),
as in DBMSs produced to run on IBM computers.
End of Chapter 1