
56 ACTA ELECTROTEHNICA

Distributed Database Design – Top-Down Design
Ileana ŞTEFAN and Maricel POPA
Abstract: The design of distributed databases is an optimization problem requiring solutions to several
interrelated problems: data fragmentation, allocation, and local optimization. Each problem can be
solved with several different approaches, which makes distributed database design a very difficult
task.

1. Introduction

A distributed database system should reflect the desired organizational structure, permit data access from all units, and store data close to the sites that will use it most. To achieve this, its design has to improve the partitioning of data and offer easy and fast access to all users, wherever they are located.

Distributed systems must help resolve the problem of data islands. Databases are sometimes described as electronic islands: distinct and inaccessible places, similar to far-off islands. This can be the result of geographical division, incompatible architectures, incompatible communication protocols, etc.

Those who work with RDBMS software might ask why they should worry about database design, given that most RDBMS suites come with sample databases that can be copied or modified as needed. Tables can be taken from the sample databases and used in a user-defined database. Some programs provide tools that help users create and define tables. Nevertheless, these tools do not actually contribute to the design of the database; they merely help with the physical creation of the tables that will be part of it.

Databases must be well designed because the design plays an important role in maintaining the integrity and accuracy of the stored data. If the database is not carefully designed, it becomes hard to retrieve all the required information, and some of it might be erroneous. Corrupt data is the worst outcome of a badly designed database.

Design methods. The strategies for designing a distributed database, as in the case of centralized databases, are:
• Top-down design, which aims for an optimal distribution of data. Homogeneous systems are usually the result of this design;
• Bottom-up design, which is used when existing local databases need to be integrated into a unified system;
• Mixed design, which uses both methods. This design suits practical cases best.

2. Top-down design

Top-down design is a process of creating data models that contain high-level entities and relationships, to which successive refinements are applied in order to identify the corresponding low-level entities, relationships and attributes. The top-down approach is illustrated using the concepts of the entity-relationship model.

The top-down approach consists of the following steps:
Volume 48, Number 1, 2007 57

• Requirements analysis;
• View integration and conceptual design;
• Data distribution design;
• Local physical schema design.

Requirements analysis is the stage in which the database users are defined, together with the work environment and the information processing requirements. Starting from these requirements, the view design and the conceptual schema are created next. View design aims to define the interfaces for the end users. Conceptual design consists of defining the entity types and the relationships between them (entity analysis), followed by identifying the activities that the organization undertakes (functional analysis). Next comes the data distribution stage, whose output is the set of local conceptual schemas obtained by distributing the entities over the sites of the distributed system. The distribution of relations is preceded by their fragmentation. As a result, data distribution design consists of two steps: fragmentation of the relations of the global conceptual schema, and allocation of the resulting fragments.

The fragmentation process consists of splitting a global relation into a number of sub-relations, called fragments, which can be stored optimally in the distributed database. Fragmentation must maintain the semantic coherence of the database, must split the relation without losing or duplicating information, and must allow the initial relation to be rebuilt from its fragments. There are three types of fragmentation: horizontal, vertical, and mixed. The fragments are allocated to one or more sites, and they can be replicated in order to obtain high availability and performance.

Fragment definition and allocation should depend on how the database will be used, which implies an analysis of the applications. Practical experience shows that the most frequently used 20% of queries account for 80% of the total data accesses, and this rule can be used as a guideline when the analysis is performed.

The design process should rely on both qualitative and quantitative information. Qualitative information is used in the fragmentation process, whereas quantitative information is used in the allocation process. Qualitative information refers to the transactions run by the application (including the relations, attributes and tuples they access), the access type (read or write) and the predicates of read operations. Quantitative information refers to the frequency with which an application is used, the site from which the application is run, and the performance criteria for transactions and applications.

The degree of fragmentation affects query execution performance. It ranges from one extreme, no fragmentation at all, to the other, fragmentation at the level of individual tuples or attributes. In practice, a reasonable intermediate degree of fragmentation should be used.

After the database is fragmented, the fragments are allocated to the network sites. The database designer must decide whether the fragments will be replicated, and to what degree. Fragments can be replicated or kept as a single copy. Data replication increases the fault tolerance of the system and the speed of data retrieval.

The top-down design methodology is illustrated in Fig. 1.

3. Bottom-up design

This approach starts from the fundamental level of attributes, which are grouped into relations that represent entity types and the associations between them, as a result of analyzing the associations between attributes. The normalization process is an example of a bottom-up approach. Normalization involves identifying attributes and placing them in normalized tables, based on the functional dependencies between attributes.
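As a minimal sketch of how functional dependencies drive normalization, the snippet below computes the closure of an attribute set and uses it to test whether that set is a superkey. The relation EMP_PROJ, its attributes and its dependencies are hypothetical, chosen only for illustration; they do not come from the article.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by `attrs`.

    `fds` is a list of (lhs, rhs) pairs of attribute sets, read as lhs -> rhs.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Once the whole left side is known, the right side follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, all_attrs, fds):
    """A set of attributes is a superkey if its closure covers the relation."""
    return closure(attrs, fds) == set(all_attrs)


# Hypothetical relation EMP_PROJ(emp_id, emp_name, proj_id, proj_name, hours).
ALL_ATTRS = {"emp_id", "emp_name", "proj_id", "proj_name", "hours"}
FDS = [
    ({"emp_id"}, {"emp_name"}),
    ({"proj_id"}, {"proj_name"}),
    ({"emp_id", "proj_id"}, {"hours"}),
]

print(sorted(closure({"emp_id"}, FDS)))
print(is_superkey({"emp_id", "proj_id"}, ALL_ATTRS, FDS))
```

In this example emp_name depends on only part of the key (emp_id, proj_id). This is the kind of partial dependency that normalization removes, here by placing emp_id and emp_name in a table of their own.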

[Figure: flow diagram with the stages Requirement Collection; View Analysis and Integration; Distribution Design (Data Acquisition, Partitioning, Allocation & Replication, Local Optimization); Physical Database Design for Each Local Database; Operational Database. Partitioning is Vertical, Horizontal, or Mixed.]

Fig. 1. Distributed Database Design Methodology.
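The partitioning step of Fig. 1 can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the EMPLOYEE relation, its attributes, the site names and all function names are hypothetical. It shows horizontal fragmentation (selection by predicate), vertical fragmentation (projection that keeps the key), and checks that each split can be reversed without losing or duplicating tuples.

```python
# Hypothetical global relation EMPLOYEE; the key is emp_id.
EMPLOYEE = [
    {"emp_id": 1, "name": "Ana", "site": "Cluj", "salary": 3000},
    {"emp_id": 2, "name": "Dan", "site": "Iasi", "salary": 2500},
    {"emp_id": 3, "name": "Ioana", "site": "Cluj", "salary": 4100},
]


def horizontal_fragments(relation, predicates):
    """Split tuples by selection predicates (one fragment per predicate)."""
    return {name: [t for t in relation if pred(t)]
            for name, pred in predicates.items()}


def vertical_fragment(relation, key, attrs):
    """Project onto `attrs`, always keeping the key so a join can rebuild."""
    return [{a: t[a] for a in (key, *attrs)} for t in relation]


def rebuild_horizontal(fragments):
    """Reconstruct the relation as the union of its horizontal fragments."""
    return [t for frag in fragments.values() for t in frag]


def rebuild_vertical(left, right, key):
    """Reconstruct the relation by joining two vertical fragments on the key."""
    by_key = {t[key]: t for t in right}
    return [{**t, **by_key[t[key]]} for t in left]


def same_tuples(a, b):
    """Compare two relations tuple by tuple, ignoring order."""
    def canon(rel):
        return sorted(sorted(t.items()) for t in rel)
    return canon(a) == canon(b)


by_site = horizontal_fragments(EMPLOYEE, {
    "cluj": lambda t: t["site"] == "Cluj",
    "iasi": lambda t: t["site"] == "Iasi",
})
# Taken together, these two checks show that no tuple was lost or duplicated.
assert sum(len(f) for f in by_site.values()) == len(EMPLOYEE)
assert same_tuples(rebuild_horizontal(by_site), EMPLOYEE)

public = vertical_fragment(EMPLOYEE, "emp_id", ["name", "site"])
payroll = vertical_fragment(EMPLOYEE, "emp_id", ["salary"])
# The key is repeated in both fragments so that the join can rebuild EMPLOYEE.
assert same_tuples(rebuild_vertical(public, payroll, "emp_id"), EMPLOYEE)
```

Mixed fragmentation would apply one kind of split to the output of the other, for example projecting each site's horizontal fragment into a public part and a payroll part.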

Bottom-up modeling is suited to systems where databases already exist but have to be integrated into a single database. The process consists of integrating the local conceptual schemas into the global conceptual schema. This amounts to correlating the common data descriptions and resolving any conflicts that appear. If the existing databases were built on different database management systems (DBMS), they may continue to use different DBMSs after integration. As a result, a heterogeneous system adds the need to translate between different data representations, on top of the complexity of data integration.

The bottom-up approach consists of the following steps:
• Choosing a common data model for describing the global schema;
• Translating each local schema into the common data model;
• Integrating the local schemas into a global schema.
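The three bottom-up steps can be sketched as follows. This is a simplified illustration, not a real integration tool: the common data model (a dictionary of entities and typed attributes), the type mapping, the two local schemas and the rename tables are all assumptions made for the example.

```python
# Step 1: the common data model assumed in this sketch is
# {entity_name: {attribute_name: type_name}}.

# Hypothetical mapping from DBMS-specific type names to common type names.
TYPE_MAP = {"VARCHAR2": "string", "TEXT": "string",
            "NUMBER": "number", "INT": "number"}


def translate(local_schema, renames):
    """Step 2: express one local schema in the common data model.

    `renames` maps local entity and attribute names to the shared vocabulary.
    """
    common = {}
    for entity, attrs in local_schema.items():
        target = common.setdefault(renames.get(entity, entity), {})
        for attr, local_type in attrs.items():
            target[renames.get(attr, attr)] = TYPE_MAP[local_type]
    return common


def integrate(schemas):
    """Step 3: merge the translated schemas and report type conflicts."""
    merged, conflicts = {}, []
    for schema in schemas:
        for entity, attrs in schema.items():
            target = merged.setdefault(entity, {})
            for attr, type_name in attrs.items():
                # Keep the first type seen; record any later disagreement.
                if target.setdefault(attr, type_name) != type_name:
                    conflicts.append((entity, attr, target[attr], type_name))
    return merged, conflicts


# Two hypothetical local schemas describing the same real-world entity.
site_a = {"CUSTOMER": {"CUST_ID": "NUMBER", "NAME": "VARCHAR2"}}
site_b = {"client": {"id": "TEXT", "name": "TEXT", "city": "TEXT"}}

a = translate(site_a, {"CUSTOMER": "customer", "CUST_ID": "id", "NAME": "name"})
b = translate(site_b, {"client": "customer"})
merged, conflicts = integrate([a, b])
print(merged)
print(conflicts)
```

The sketch only detects conflicts; deciding how to resolve them (here, whether the customer id should be a number or a string) remains a design decision for the person integrating the schemas.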

Ileana ŞTEFAN
“Petru Maior” University of Târgu Mureş

Maricel POPA
OMEGA-Tehnoton Iaşi
