
Distributed Data Dictionary Management

Data Dictionary or Metadata Repository

The system catalog constitutes the data dictionary.
The classical definition of metadata is data
about data.
One of the major issues surrounding metadata
is whether metadata should be centralized or
decentralized.

Centralized Approach
The centralized architecture is the traditional approach to building a
metadata repository. It offers efficient access to information,
adaptability to additional data stores, scalability to capture
additional metadata, and high performance.
Under this approach, the system catalog is maintained at one of the
participating sites in the distributed database.
This site acts as the central coordinator of the distributed database
management system.
However, like any other centralized architecture, a centralized
metadata repository is a single point of failure. It requires
continuous synchronization with the participants of the data
environment, may become a performance bottleneck, and may
negatively affect the quality of the metadata. Indeed, the need to copy
information from various applications and data stores into the
central repository may compromise data quality if proper data
validation procedures are not part of the data acquisition process.
Moreover, people who need to use and change local metadata do
not want a central authority controlling their usage of metadata.
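
The centralized arrangement can be illustrated with a minimal sketch. The class name `CentralCatalog`, the site names, and the `available` flag are hypothetical and chosen only for illustration; the sketch assumes that every lookup must reach the single coordinator site, so the catalog becomes unusable when that site is down.

```python
class CentralCatalog:
    """Hypothetical system catalog kept at one coordinator site."""

    def __init__(self, coordinator_site):
        self.coordinator_site = coordinator_site
        self.available = True   # assumption: models whether the site is reachable
        self._entries = {}      # object name -> site that stores the object

    def _check_available(self):
        # Every operation depends on the one coordinator site.
        if not self.available:
            raise ConnectionError(
                f"catalog site {self.coordinator_site} is unreachable"
            )

    def register(self, object_name, site):
        self._check_available()
        self._entries[object_name] = site

    def locate(self, object_name):
        self._check_available()
        return self._entries[object_name]


catalog = CentralCatalog("site_A")
catalog.register("EMPLOYEE", "site_B")
print(catalog.locate("EMPLOYEE"))  # site_B
```

Setting `catalog.available = False` makes every later call fail, which is the single point of failure described above.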

Distributed Approach
A distributed architecture avoids the concerns and
potential errors of maintaining copies of the source
metadata by accessing up-to-date metadata from all
systems' metadata repositories in real time.
Distributed metadata repositories offer superior metadata
quality since the users see the most current information
about the data.
However, since a distributed architecture requires real-time
availability of all participating systems, a single system
failure may potentially bring the metadata repository
down.
Also, as source system configurations change, or as new
systems become available, a distributed architecture needs
to adapt rapidly to the new environment, and this degree
of flexibility may require a temporary shutdown of the
repository.
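
As a rough sketch of the trade-off above, the function below assembles a metadata view by reading each system's own repository at query time. The function name `build_metadata_view` and the dictionary layout of `sites` are assumptions made for this example. The view is always current because nothing is copied, but one unavailable system is enough to make the whole request fail.

```python
def build_metadata_view(sites):
    """Collect object locations from every site's own metadata, live.

    `sites` maps a site name to a dict with an "up" flag and a list of
    "objects"; this layout is hypothetical.
    """
    view = {}
    for site_name, site in sites.items():
        if not site["up"]:
            # No copies are kept, so a missing site cannot be worked around.
            raise ConnectionError(
                f"{site_name} is unavailable; metadata view cannot be built"
            )
        for object_name in site["objects"]:
            view[object_name] = site_name
    return view


sites = {
    "site_A": {"up": True, "objects": ["EMPLOYEE"]},
    "site_B": {"up": True, "objects": ["DEPARTMENT"]},
}
print(build_metadata_view(sites))
# {'EMPLOYEE': 'site_A', 'DEPARTMENT': 'site_B'}
```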

Advantages of Distributed Approach


Each architectural entity has its own metadata.
Autonomy of processing.
Automatic update, as each change is made locally.
The technical support for local metadata is built into each product.
Local metadata can be changed on an as-needed basis, with no
interference from a centralized authority.

Disadvantages of Distributed Approach


There is a need to exchange metadata between sites.
There is no centrally agreed standard.
Once local metadata has been passed to another location, there is
the issue of keeping that copy up to date.

Distributed Approach

The distributed approach has two variants:
Full Replication Approach
Partial Replication Approach

Full Replication Approach


Under this approach, the complete catalog is maintained at every
site. Because the system catalog is locally available, each site
can process requests locally and has a greater degree of autonomy.
One drawback is the storage overhead caused by the greater
redundancy.
The other drawback is the consistency problem, that is, how to
keep the replicated copies of the system catalog at the various
sites synchronized.
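
The following minimal sketch illustrates both drawbacks. The class `FullyReplicatedCatalog` and its method names are hypothetical, and the sketch assumes that every update is written to every copy immediately; a real system would need a protocol to handle sites that are unreachable during an update.

```python
class FullyReplicatedCatalog:
    """Hypothetical catalog in which every site holds the complete copy."""

    def __init__(self, site_names):
        # One full copy of the catalog per site.
        self.copies = {name: {} for name in site_names}

    def update(self, object_name, location):
        # Consistency cost: a single change must reach every copy.
        for copy in self.copies.values():
            copy[object_name] = location

    def lookup(self, at_site, object_name):
        # Autonomy: the lookup touches only the local copy.
        return self.copies[at_site][object_name]

    def stored_entries(self):
        # Storage cost: entries are multiplied by the number of sites.
        return sum(len(copy) for copy in self.copies.values())


catalog = FullyReplicatedCatalog(["site_A", "site_B", "site_C"])
catalog.update("EMPLOYEE", "site_B")
print(catalog.lookup("site_C", "EMPLOYEE"))  # site_B
print(catalog.stored_entries())              # 3
```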

Partial Replication Approach


Under this approach, each site maintains a local catalog that stores
information about the database objects for which that site is the
birth site.
It also holds information about the replicas, and each site
maintains a set of links to the database objects at the other sites.
Whenever a site submits a query, the query is handled using the
local catalog if possible; otherwise the links are evaluated.
If the information is not available in the set of links, then a
hunt for the database object is made and the set of links is
updated accordingly.
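
The lookup order described above (local catalog, then links, then a hunt) can be sketched as follows. The `Site` class, the `locate` function, and the idea of hunting by scanning every other site in turn are assumptions made for illustration; replica information is left out to keep the sketch short.

```python
class Site:
    """Hypothetical site with a local catalog and a set of links."""

    def __init__(self, name):
        self.name = name
        self.local_catalog = set()  # objects for which this is the birth site
        self.links = {}             # object name -> site known to hold it


def locate(site, object_name, all_sites):
    """Return the name of the site that holds `object_name`."""
    # 1. Try the local catalog first.
    if object_name in site.local_catalog:
        return site.name
    # 2. Otherwise evaluate the links.
    if object_name in site.links:
        return site.links[object_name]
    # 3. Otherwise hunt for the object and record a new link.
    for other in all_sites:
        if object_name in other.local_catalog:
            site.links[object_name] = other.name
            return other.name
    raise KeyError(f"{object_name} not found at any site")


a, b = Site("site_A"), Site("site_B")
b.local_catalog.add("EMPLOYEE")
print(locate(a, "EMPLOYEE", [a, b]))  # site_B, found by hunting
print(a.links)                        # {'EMPLOYEE': 'site_B'}
```

After the first hunt, later lookups of the same object from `site_A` are answered from its links without contacting the other sites.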
